A geometric viewpoint of manifold learning
- Binbin Lin^{1} (corresponding author),
- Xiaofei He^{2} and
- Jieping Ye^{1}
https://doi.org/10.1186/s40535-015-0006-6
© Lin et al.; licensee Springer. 2015
Received: 23 July 2014
Accepted: 22 January 2015
Published: 12 March 2015
Abstract
In many data analysis tasks, one is often confronted with very high dimensional data. The manifold assumption, which states that the data is sampled from a submanifold embedded in a much higher dimensional Euclidean space, has been widely adopted by many researchers. In the last 15 years, a large number of manifold learning algorithms have been proposed. Many of them rely on the evaluation of the geometrical and topological properties of the data manifold. In this paper, we present a review of these methods from a novel geometric perspective. We categorize these methods into three main groups: Laplacian-based, Hessian-based, and parallel field-based methods. We show the connections and differences between these three groups in both their continuous and discrete forms. The discussion is focused on the problems of dimensionality reduction and semi-supervised learning.
Review
Introduction
In many data analysis tasks, one is often confronted with very high dimensional data. There is a strong intuition that the data may have a lower dimensional intrinsic representation. Various researchers have considered the case when the data is sampled from a submanifold embedded in a much higher dimensional Euclidean space. Consequently, estimating and extracting the low-dimensional manifold structure, or specifically the intrinsic topological and geometrical properties of the data manifold, becomes a crucial problem. These problems are often referred to as manifold learning (Belkin and Niyogi 2007).
The most natural technique to extract low-dimensional manifold structure from given finite samples is dimensionality reduction. Early work on dimensionality reduction includes principal component analysis (PCA; Jolliffe 1989), multidimensional scaling (MDS; Cox and Cox 1994), and linear discriminant analysis (LDA; Duda et al. 2000). PCA is probably the most popular dimensionality reduction method. Given a data set, PCA finds the directions along which the data has maximum variance. However, these linear methods may fail to recover the intrinsic manifold structure when the data manifold is not a low-dimensional subspace or an affine manifold.
There has been a variety of work on nonlinear dimensionality reduction in the last decade. Typical examples include Isomap (Tenenbaum et al. 2000), locally linear embedding (LLE; Roweis and Saul 2000), Laplacian eigenmaps (LE; Belkin and Niyogi 2001), Hessian eigenmaps (HLLE; Donoho and Grimes 2003), and diffusion maps (Coifman and Lafon 2006; Lafon and Lee 2006; Nadler et al. 2006). Isomap generalizes MDS to the nonlinear manifold case by trying to preserve pairwise geodesic distances on the data manifold. Diffusion maps try to preserve another meaningful distance, namely the diffusion distance on the manifold. The Laplacian operator and the Hessian operator are two of the most important differential operators in manifold learning. Intuitively, the Laplacian measures the smoothness of functions, while the Hessian measures how a function changes the metric of the manifold.
One natural nonlinear extension of PCA is kernel principal component analysis (kernel PCA; Schölkopf et al. 1998). Interestingly, Ham et al. (2004) showed that Isomap, LLE, and LE are all special cases of kernel PCA with specific kernels. More recently, maximum variance unfolding (MVU; Weinberger et al. 2004) was proposed to learn a kernel matrix that preserves pairwise distances on the manifold.
Tangent space-based methods have also received considerable interest recently, such as local tangent space alignment (LTSA; Zhang and Zha 2005), manifold charting (Brand 2003), Riemannian manifold learning (RML; Lin and Zha 2008), and locally smooth manifold learning (LSML; Dollár et al. 2007). These methods try to find coordinate representations for curved manifolds. LTSA tries to construct a global coordinate system via local tangent space alignment. Manifold charting has a similar strategy, which tries to expand the manifold by splicing local charts. RML uses normal coordinates to unfold the manifold, aiming to preserve the metric of the manifold. LSML tries to learn smooth tangent spaces of the manifold by proposing a smoothness regularization term on tangent spaces. Vector diffusion maps (VDM; Singer and Wu 2012) and parallel field embedding (PFE; Lin et al. 2013) are more recent works which employ vector fields to study the metric of the manifold.
In many of these methods, the core ideas for learning the manifold structure are based on differential operators. In this paper, we discuss differential operators defined on functions and on vector fields. The former include the Laplacian and Hessian operators, and the latter include the covariant derivative and the connection Laplacian operator. Since many geometric concepts are involved, we first introduce the background of the relevant geometric concepts. Then, we discuss the problems of dimensionality reduction and semi-supervised learning by using these differential operators. The discussion focuses not only on their continuous forms but also on their discrete approximations. We try to give a rigorous derivation of these methods and provide some new insights for future work.
Background
In this section, we introduce the most relevant concepts.
Tangent spaces and vector fields
Let \(\mathcal {M}\) be a d-dimensional Riemannian manifold. As the manifold is locally a Euclidean space, the key tool for studying the manifold will be the idea of linear approximation. The fundamental linear structure of the manifold is the tangent space.
Definition 2.1 (Tangent space; Lee 2003).
Clearly, this operation is linear and it satisfies the derivation rule. Therefore, we might write the directional derivative of f in the direction of Y as Y f=Y(f)=D_{Y} f=∇_{Y} f, where ∇ denotes the covariant derivative on the manifold. Next, we show what a tangent space is on the manifold by using local coordinates. Let {x^{i}|i=1,…,d} denote a local coordinate chart around p. Then, it can be easily verified by definition that \(\partial _{i}|_{p}:=\frac {\partial }{\partial x^{i}}|_{p}\) is a tangent vector at p. Moreover, these coordinate vectors ∂_{1}|_{p},…,∂_{d}|_{p} form a basis for \(T_{p}\mathcal {M}\) (Lee 2003). Therefore, the dimension of the tangent space is exactly the same as the dimension of the manifold. For example, consider a two-dimensional sphere embedded in \(\mathbb {R}^{3}\); given any point of the sphere, the tangent space of the sphere at that point is just a two-dimensional plane.
For any smooth manifold \(\mathcal {M}\), we define the tangent bundle of \(\mathcal {M}\), denoted by \(T\mathcal {M}\), to be the disjoint union of the tangent spaces at all points of \(\mathcal {M}\): \(T\mathcal {M} = \coprod _{p\in \mathcal {M}}T_{p}\mathcal {M}.\) Now, we define the vector field.
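The sphere example can be checked numerically. The sketch below assumes the unit sphere in \(\mathbb {R}^{3}\) with the induced metric, so that the tangent plane at p is the orthogonal complement of p; the helper name `sphere_tangent_basis` is hypothetical and chosen only for illustration.

```python
import numpy as np

def sphere_tangent_basis(p):
    """Orthonormal basis of the tangent plane of the unit sphere at p.

    Assumption: the sphere is centered at the origin, so the tangent
    plane at p is the set of vectors orthogonal to p.
    """
    p = np.asarray(p, dtype=float)
    p = p / np.linalg.norm(p)
    # Full SVD of the 1x3 matrix [p]: the first right singular vector is
    # +/- p, the remaining two span its orthogonal complement.
    _, _, vt = np.linalg.svd(p.reshape(1, -1))
    return vt[1:].T  # shape (3, 2): two tangent directions as columns

p = np.array([0.0, 0.0, 1.0])      # north pole
basis = sphere_tangent_basis(p)    # two columns, matching d = 2
```

The number of columns of `basis` equals the dimension of the manifold, in line with the statement that the coordinate vectors form a basis of the tangent space.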
Definition 2.2 (Vector field; Lee 2003).
A vector field is a continuous map \(X:\mathcal {M} \rightarrow T\mathcal {M}\), usually written as p↦X _{ p }, with the property that for each \(p\in \mathcal {M}\), X _{ p } is an element of \(T_{p}\mathcal {M}\).
Intuitively, a vector field is nothing but a collection of tangent vectors subject to a continuity constraint. Since a tangent vector at each point is a derivation, a vector field can be viewed as a directional derivative on the whole manifold. It is worth noting that each vector X_{p} of a vector field X must lie in the corresponding tangent space \(T_{p}\mathcal {M}\). Let X be a vector field on the manifold. We can represent the vector field locally using the coordinate basis as \(X = \sum _{i=1}^{d} a^{i}\partial _{i}\), where each a^{i} is a function, often called a coefficient of X. For convenience, we will use the Einstein summation convention: when an index variable appears twice in a single term, it implies summation of that term over all the values of the index, i.e., we might simply write a^{i}∂_{i} instead of \(\sum _{i=1}^{d} a^{i}\partial _{i}\).
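As a small illustration of the summation convention, the sketch below evaluates X f = a^{i}∂_{i} f at a single point of the flat plane \(\mathbb {R}^{2}\). The function f(x, y) = x²y, the coefficients, and the evaluation point are assumptions made up for the example.

```python
import numpy as np

def grad_f(x):
    """Partial derivatives (d/dx, d/dy) of the example f(x, y) = x**2 * y."""
    return np.array([2.0 * x[0] * x[1], x[0] ** 2])

a = np.array([1.0, 2.0])        # coefficients a^i of X at the point
point = np.array([1.0, 3.0])    # evaluation point

# Repeated index i is summed over: X f = a^i * d_i f
Xf = np.einsum("i,i->", a, grad_f(point))
```

Here the partial derivatives at (1, 3) are (6, 1), so the contraction gives 1·6 + 2·1 = 8.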
Riemannian metric
Next, we discuss the metric tensor of the manifold. Let \((\mathcal {M},g)\) be a d-dimensional Riemannian manifold embedded in a much higher dimensional Euclidean space \(\mathbb {R}^{m}\), where g is a Riemannian metric on \(\mathcal {M}\). A Riemannian metric is a Euclidean inner product g_{p} on each tangent space \(T_{p}\mathcal {M}\), where p is a point on the manifold \(\mathcal {M}\). In addition, we assume that g_{p} varies smoothly (Petersen 1998). This means that for any two smooth vector fields X,Y, the inner product g_{p}(X_{p},Y_{p}) should be a smooth function of p. The subscript p will be suppressed when it is not needed. Thus, we might write g(X,Y) or g_{p}(X,Y) with the understanding that this is to be evaluated at each p where X and Y are defined. Generally, we use the induced metric for \(\mathcal {M}\). That is, the inner product defined in the tangent space of \(\mathcal {M}\) is the same as that in the ambient space \(\mathbb {R}^{m}\), i.e., g(u,v)=〈u,v〉, where 〈·,·〉 denotes the canonical inner product in \(\mathbb {R}^{m}\).
The functions g(∂_{i},∂_{j}) are denoted by g_{ij}, i.e., g_{ij}:=g(∂_{i},∂_{j}). This gives us a representation of g in local coordinates as a positive definite symmetric matrix with entries parameterized over the coordinate neighborhood U.
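To make the matrix g_{ij} concrete, the sketch below computes the induced metric of the unit sphere in spherical coordinates (θ, φ). The parameterization and the helper name `sphere_jacobian` are assumptions for illustration; with the induced metric, g_{ij} is the ambient inner product of the coordinate vectors, i.e., G = JᵀJ where the columns of J are ∂_{θ} and ∂_{φ}.

```python
import numpy as np

def sphere_jacobian(theta, phi):
    """Columns are the coordinate vectors d_theta and d_phi in R^3 for
    (theta, phi) -> (sin(theta)cos(phi), sin(theta)sin(phi), cos(theta))."""
    return np.array([
        [np.cos(theta) * np.cos(phi), -np.sin(theta) * np.sin(phi)],
        [np.cos(theta) * np.sin(phi),  np.sin(theta) * np.cos(phi)],
        [-np.sin(theta),               0.0],
    ])

theta, phi = 0.7, 1.9
J = sphere_jacobian(theta, phi)
G = J.T @ J   # g_ij = <d_i, d_j>; equals diag(1, sin(theta)**2) here
```

Away from the poles, G is symmetric and positive definite, as the text requires of a metric in local coordinates.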
Covariant derivative
Therefore, we measure the change in X by measuring how the coefficients of X change. However, this definition relies on the fact that each coordinate vector field ∂_{i} is a constant vector field. In other words, ∇_{Y}∂_{i}=0 for any vector field Y. General coordinate vector fields are not always constant. Therefore, we should give a coordinate-free definition of the covariant derivative.
Theorem 2.1 (The fundamental theorem of Riemannian geometry; Petersen 1998).
- 1. Y→∇_{Y}X is a (1,1)-tensor: $$ \nabla_{\alpha v + \beta w} X = \alpha \nabla_{v} X + \beta \nabla_{w} X. $$
- 2. X→∇_{Y}X is a derivation: $$\begin{aligned} \nabla_{Y}(X_{1} + X_{2}) &= \nabla_{Y} X_{1} + \nabla_{Y} X_{2},\\ \nabla_{Y}(fX) &= (Yf)X + f\nabla_{Y} X \end{aligned} $$ for functions \(f:\mathcal {M} \rightarrow \mathbb {R}\).
- 3. Covariant differentiation is torsion free: $$ \nabla_{X} Y - \nabla_{Y} X = [ X, Y ]. $$
- 4. Covariant differentiation is metric: $$ Zg(X, Y) = g(\nabla_{Z} X, Y) + g(X, \nabla_{Z} Y), $$ where Z is a vector field.
Here, [·,·] denotes the Lie bracket of vector fields, defined as [X,Y]=X Y−Y X. Any assignment on a manifold that satisfies rules 1 to 4 is called a Riemannian connection. This connection is uniquely determined by these four rules.
The second equality holds due to the first rule of the connection and the third equality holds due to the second rule of the connection. Since \(\nabla _{\partial _{i}} \partial _{j}\) is still a vector field on the manifold, we can further represent it as \(\nabla _{\partial _{i}} \partial _{j} = \Gamma ^{k}_{\textit {ij}}\partial _{k}\), where \(\Gamma ^{k}_{\textit {ij}}\) are called Christoffel symbols (Petersen 1998). The Christoffel symbols can be represented in terms of the metric.
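For reference, a local-coordinate expansion of this kind can be sketched as follows. The coefficient names are assumptions: we write the direction as Y=b^{i}∂_{i} and the differentiated field as X=a^{j}∂_{j}. The second equality uses rule 1 (tensoriality in the direction argument) and the third uses rule 2 (the derivation property):

$$\begin{aligned} \nabla_{Y} X &= \nabla_{b^{i}\partial_{i}} \left(a^{j}\partial_{j}\right) = b^{i}\,\nabla_{\partial_{i}} \left(a^{j}\partial_{j}\right) = b^{i}\,\partial_{i}(a^{j})\,\partial_{j} + b^{i}a^{j}\,\nabla_{\partial_{i}}\partial_{j} \\ &= \left( b^{i}\,\partial_{i}(a^{k}) + b^{i}a^{j}\,\Gamma^{k}_{ij} \right)\partial_{k}. \end{aligned} $$

For the Riemannian connection, the standard expression of the Christoffel symbols in terms of the metric is

$$ \Gamma^{k}_{ij} = \frac{1}{2}\, g^{kl} \left( \partial_{i} g_{jl} + \partial_{j} g_{il} - \partial_{l} g_{ij} \right), $$

where g^{kl} denotes the entries of the inverse of the matrix (g_{ij}).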
Laplacian operator and Hessian operator
Let f be a function on the manifold. The one-form \(df: T\mathcal {M} \rightarrow \mathbb {R}\) measures the change of the function. In local coordinates, we have d f=∂_{i}(f)d x^{i}. Note that df is independent of the metric of the manifold. However, the gradient field grad f=∇f depends on the metric of the manifold.
Definition 2.3 (Gradient field; Petersen 1998).
It might be worth noting that we also have d f(X)=X f. In local coordinates, we have ∇f=g ^{ i j } ∂ _{ i } f ∂ _{ j }.
Definition 2.4 (Laplacian; Petersen 1998).
Definition 2.5 (Hessian; Petersen 1998).
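For reference, the three operators named in Definitions 2.3 to 2.5 can be written in their standard textbook forms as below. These are the usual expressions, stated with the sign convention Δf=−div(∇f) that the text uses later; they are a sketch consistent with the surrounding identities rather than a quotation of the definitions.

$$\begin{aligned} g(\nabla f, X) &= df(X) = Xf \quad \text{for all vector fields } X, \\ \Delta f &= -\operatorname{div}(\nabla f) = -\frac{1}{\sqrt{\det g}}\, \partial_{i}\!\left( \sqrt{\det g}\; g^{ij}\, \partial_{j} f \right), \\ \operatorname{Hess} f (X, Y) &= g(\nabla_{X} \nabla f, Y) = X(Yf) - (\nabla_{X} Y) f, \\ \operatorname{Hess} f (\partial_{i}, \partial_{j}) &= \partial_{i}\partial_{j} f - \Gamma^{k}_{ij}\, \partial_{k} f. \end{aligned} $$

In flat Euclidean coordinates, where g_{ij}=δ_{ij} and all Christoffel symbols vanish, these reduce to the ordinary gradient, the negative of the sum of second partial derivatives, and the matrix of second partial derivatives.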
A geometric viewpoint of dimensionality reduction
In the problem of dimensionality reduction, one tries to find a smooth map \(F:\mathcal {M}\rightarrow \mathbb {R}^{d}\) which preserves the topological and geometrical properties of \(\mathcal {M}\).
However, for some kinds of manifolds, it is impossible to preserve all the geometrical and topological properties. For example, for a two-dimensional sphere, there is no map that sends the sphere to a plane without breaking the topology of the sphere. Thus, some assumptions on the data manifold are needed. Most papers adopt the relatively general assumption that the manifold is diffeomorphic to an open subset of the Euclidean space \(\mathbb {R}^{d}\). In other words, we assume that there exists a topology-preserving map from \(\mathcal {M}\) to \(\mathbb {R}^{d}\).
Variational principles
Since the target space is the Euclidean space \(\mathbb {R}^{d}\), we can represent F by its components F=(f _{1},…,f _{ d }), where each \(f_{i}:\mathcal {M} \rightarrow \mathbb {R}\) is a function on the manifold.
Since a cannot be zero, the constraint becomes \(\int _{\mathcal {M}} \mathbf {x} dx = 0\). If we approximate the integral by discrete summations over data points, then Equation 3 becomes the objective function of PCA. The solution can be obtained by singular value decomposition (SVD).
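The discrete PCA case can be sketched in a few lines. The sketch below assumes the data points are the rows of an array `X`; subtracting the sample mean plays the role of the zero-mean constraint, and the principal directions are read off from the SVD. The function name `pca` and the synthetic data are illustrative choices, not taken from the paper.

```python
import numpy as np

def pca(X, d):
    """Project the rows of X onto the top-d principal directions via SVD."""
    Xc = X - X.mean(axis=0)                  # discrete zero-mean constraint
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    components = vt[:d]                      # rows: directions of max variance
    return Xc @ components.T, components

rng = np.random.default_rng(0)
# Synthetic data with very different variances along the three axes
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.1])
Y, W = pca(X, d=2)
```

Because singular values are returned in decreasing order, the first column of `Y` carries at least as much variance as the second.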
In the discrete case, one often uses the graph Laplacian (Chung 1997) to approximate the Laplacian operator. Some theoretical results (Belkin and Niyogi 2005; Hein et al. 2005) also show the consistency of the approximation. One of the most important features of the graph Laplacian is that it is coordinate free. That is, it does not depend on any special coordinate system. The representative methods include Laplacian eigenmaps (LE; Belkin and Niyogi 2001) and locality preserving projections (LPP; He and Niyogi 2003). Note that Belkin and Niyogi (2001) showed that the objective function of LLE is, under certain conditions, approximately equivalent to minimizing 〈L^{2} f,f〉. If we replace L by L^{2} in the last equation, we will get LLE. Therefore, LLE also belongs to this category. Generally, we can replace the Laplacian operator by any compact self-adjoint operator.
The norm ∥·∥ represents the standard tensor norm. In matrix form, this norm is equivalent to the Frobenius norm. In this case, it is much harder to get the optimality condition. Since the Hessian operator is second order, the optimality condition will be a fourth-order equation. However, we can simply study the null space of the objective function. In other words, we would like to study functions satisfying Hess f=0.
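A minimal sketch of a graph-Laplacian embedding in the spirit of Laplacian eigenmaps is given below. It assumes a k-nearest-neighbor graph with heat-kernel weights, symmetrized by taking the maximum; the parameters `k` and `t`, the function name `laplacian_eigenmaps`, and the helix test data are illustrative assumptions. The generalized problem L f=λ D f is solved through the equivalent symmetric problem for D^{-1/2} L D^{-1/2}.

```python
import numpy as np

def laplacian_eigenmaps(X, d=2, k=8, t=1.0):
    """Embed the rows of X into R^d using the graph Laplacian L = D - W."""
    n = X.shape[0]
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # k nearest neighbors of each point (column 0 is the point itself)
    idx = np.argsort(sq, axis=1)[:, 1:k + 1]
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-sq[rows, cols] / t)   # heat-kernel weights
    W = np.maximum(W, W.T)                        # symmetrize the graph
    deg = W.sum(axis=1)
    L = np.diag(deg) - W
    # L f = lambda D f  <=>  (D^-1/2 L D^-1/2) u = lambda u,  f = D^-1/2 u
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)               # ascending eigenvalues
    # Skip the first (trivial, eigenvalue-zero) eigenvector
    Y = d_inv_sqrt[:, None] * vecs[:, 1:d + 1]
    return Y, L

s = np.linspace(0.0, 3.0 * np.pi, 150)
X = np.column_stack([np.cos(s), np.sin(s), 0.3 * s])   # helix in R^3
Y, L = laplacian_eigenmaps(X, d=2, k=8, t=1.0)
```

Note that the construction uses only pairwise distances between data points, which reflects the coordinate-free nature of the graph Laplacian mentioned above.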
Proposition 3.1 (Petersen 1998).
A function satisfying Hess f≡0 is said to be linear on the manifold. Proposition 3.1 tells us that a linear function on the manifold varies linearly along the geodesics on the manifold. As pointed out by Goldberg et al. (2008), the final embedding may not be an isometry because of normalization. The representative methods are Hessian-based methods including HLLE.
Note that Hess f=∇∇f. If V=∇f, then the objective function in Equation 6 becomes the objective function of HLLE. Moreover, Δ f=−div(V), since Δ f=−div(∇f).
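The null space of the Hessian can be illustrated in the simplest setting. The sketch below assumes the flat manifold \(\mathbb {R}^{2}\), where geodesics are straight lines and Hess f is the ordinary matrix of second partial derivatives, approximated here by central finite differences. The helper `hessian_fd` and the two test functions are hypothetical.

```python
import numpy as np

def hessian_fd(f, x, h=1e-3):
    """Central finite-difference Hessian of f: R^n -> R at the point x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        ei = np.zeros(n)
        ei[i] = h
        for j in range(n):
            ej = np.zeros(n)
            ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return H

def affine(x):
    return 2.0 * x[0] - 3.0 * x[1] + 1.0    # varies linearly along lines

def quadratic(x):
    return x[0] ** 2 + x[1] ** 2            # not in the null space

x0 = np.array([0.5, -1.0])
norm_affine = np.linalg.norm(hessian_fd(affine, x0))        # Frobenius norm
norm_quadratic = np.linalg.norm(hessian_fd(quadratic, x0))  # Frobenius norm
```

The affine function has a numerically vanishing Hessian norm, while the quadratic has Hessian 2I with Frobenius norm √8.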
Manifold regularization
Besides dimensionality reduction, the functionals introduced in the last section have been widely employed in semi-supervised learning. In semi-supervised learning, one is given a set of labeled points and aims to learn the labels of the unlabeled points. It is well known that in order to make semi-supervised learning work, there should be some assumptions on the dependency between the prediction function and the marginal distribution of the data (Zhu 2006). In the last decade, the manifold assumption has been widely adopted in semi-supervised learning; it states that the prediction function lives on a low-dimensional manifold of the marginal distribution. Under the manifold assumption, previous studies focus on using differential operators on the manifold to construct a regularization term on the unlabeled data. These methods can be roughly classified into three categories: Laplacian regularization, Hessian regularization, and parallel field regularization.
R _{1} is often written as a functional norm associated with a differential operator, i.e., \(R_{1}(f)=\int _{\mathcal {M}}\|Df\|^{2}\) where D is a differential operator.
Laplacian regularization
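When the differential operator D is the gradient, the regularizer \(R_{1}(f)=\int _{\mathcal {M}}\|\nabla f\|^{2}\) is commonly discretized as fᵀLf with a graph Laplacian L. The sketch below is a simplified illustration under stated assumptions: squared loss on the labeled points, no additional ambient-norm term, and a small path graph as the data graph. The function name `laplacian_regularized_fit` and the weight `lam` are hypothetical.

```python
import numpy as np

def laplacian_regularized_fit(L, labeled_idx, y_labeled, lam=1.0):
    """Minimize sum over labeled i of (f_i - y_i)^2 + lam * f^T L f.

    Setting the gradient to zero gives the linear system
    (J + lam * L) f = J y, where J is the diagonal indicator matrix of
    the labeled points. The system is nonsingular for a connected graph
    with at least one labeled point.
    """
    n = L.shape[0]
    J = np.zeros((n, n))
    J[labeled_idx, labeled_idx] = 1.0
    y = np.zeros(n)
    y[labeled_idx] = y_labeled
    return np.linalg.solve(J + lam * L, J @ y)

# Path graph on 5 nodes; only the two endpoints carry labels.
W = np.diag(np.ones(4), 1)
W = W + W.T
L = np.diag(W.sum(axis=1)) - W
f = laplacian_regularized_fit(L, np.array([0, 4]), np.array([0.0, 1.0]), lam=1.0)
```

On the unlabeled interior nodes, the solution is harmonic with respect to L, so it interpolates linearly between the (slightly shrunk) endpoint values.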
Hessian regularization
Hessian-based methods in unsupervised learning were first proposed in Hessian eigenmaps (HLLE; Donoho and Grimes 2003). The important feature of the Hessian is that it preserves second-order smoothness, i.e., it preserves distance or linearity. Steinke and Hein (2008) extend the Hessian regularizer to the Eells energy, which is applied to the problem of regression between manifolds. Kim et al. (2009) further propose to employ the Hessian regularizer in semi-supervised regression, using an alternative implementation for approximating the Hessian operator in HLLE.
The theoretical analysis by Lafferty and Wasserman (2008) shows that using the Laplacian regularizer in semi-supervised regression does not lead to faster minimax rates of convergence. They further propose to use the Hessian regularizer when the Hessian of the prediction function is consistent with the Hessian of the marginal distribution. A more recent work (Nadler et al. 2009) shows that when there are infinitely many unlabeled data points but only finitely many labeled ones, the prediction function learned by using the Laplacian regularizer can be globally smooth but locally bumpy, which is meaningless for learning. These results indicate that the smoothness measured by the Laplacian, i.e., first-order smoothness, is too general for semi-supervised regression.
Parallel field regularization
where R _{1}(f,V)=∥∇f−V∥^{2} and \(R_{2}(V) = \| \nabla V \|^{2}_{\text {HS}}\).
Discrete approximation of differential operators
Operator | Differential operator | Discrete approximation |
---|---|---|
Gradient operator | ∇ | C |
Divergence operator | div | −C ^{ T } |
Connection Laplacian | ∇^{∗}∇ | B |
Metric tensor | g | G |
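The pairing of the gradient with C and the divergence with −Cᵀ can be illustrated on a graph. The sketch below uses a stand-in for C, namely the signed edge-node incidence matrix of a small unweighted graph; this is an assumption made for illustration and is not the paper's construction of C. With this choice, −div∘grad corresponds to CᵀC, which coincides with the graph Laplacian D−W, mirroring Δ f=−div(∇f).

```python
import numpy as np

# Small example graph (edge list is an arbitrary illustrative choice)
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
n = 4

# Discrete gradient: one row per edge, (C f)[e] = f[j] - f[i]
C = np.zeros((len(edges), n))
for e, (i, j) in enumerate(edges):
    C[e, i] = -1.0
    C[e, j] = 1.0

# Graph Laplacian built independently from the adjacency matrix
W = np.zeros((n, n))
for i, j in edges:
    W[i, j] = W[j, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

# -div(grad f) corresponds to -(-C^T)(C f) = C^T C f
laplacian_from_C = C.T @ C
```

The quadratic form fᵀCᵀCf then equals the sum of squared differences of f across edges, the discrete analogue of \(\int \|\nabla f\|^{2}\).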
Conclusions
In this paper, we discussed differential operators defined on functions and on vector fields. These differential operators include the Laplacian, the Hessian, the covariant derivative, and the connection Laplacian. We introduced the background of the relevant geometric concepts. Then, we discussed the problems of dimensionality reduction and semi-supervised learning by using these differential operators. The discussion focused not only on their continuous forms but also on their discrete approximations.
Vector field-based methods have been developed recently and have proved effective in many applications, including multi-task learning (Lin et al. 2012), manifold alignment (Mao et al. 2013), and ranking (Ji et al. 2012). However, there are still many open problems worth exploring. The first is the convergence of the approximation of the differential operators. The second is the theoretical explanation of vector field regularization. Preliminary work indicates that there is a deep connection between vector field regularization and heat flows on vector fields. The study of the heat equation on vector fields and its relation to machine learning would be an interesting topic.
Declarations
Acknowledgements
This work was supported in part by the National Program for Special Support of Top-Notch Young Professionals, in part by the National Natural Science Foundation of China under Grant 61233011, Grant 61125203, and in part by the National Basic Research Program of China (973 Program) under Grant 2012CB316400.
References
- Belkin, M, Niyogi P (2001) Laplacian eigenmaps and spectral techniques for embedding and clustering. In: Advances in Neural Information Processing Systems 14, 585–591. MIT Press, Cambridge, MA.
- Belkin, M, Niyogi P (2005) Towards a theoretical foundation for Laplacian-based manifold methods. In: COLT, 486–500. Curran Associates, Inc.
- Belkin, M, Niyogi P (2007) Convergence of Laplacian eigenmaps. In: Advances in Neural Information Processing Systems 19, 129–136. Curran Associates, Inc.
- Belkin, M, Matveeva I, Niyogi P (2004) Regularization and semi-supervised learning on large graphs. In: COLT, 624–638. Curran Associates, Inc.
- Belkin, M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from examples. J Machine Learning Res 7: 2399–2434.
- Brand, M (2003) Charting a manifold. In: Advances in Neural Information Processing Systems 16. Curran Associates, Inc.
- Chung, FRK (1997) Spectral graph theory. Regional Conference Series in Mathematics, Vol. 92. AMS.
- Coifman, RR, Lafon S (2006) Diffusion maps. Appl Comput Harmonic Anal 21(1): 5–30. Diffusion Maps and Wavelets.
- Cox, T, Cox M (1994) Multidimensional scaling. Chapman & Hall, London.
- Dollár, P, Rabaud V, Belongie S (2007) Non-isometric manifold learning: analysis and an algorithm. In: ICML ’07: Proceedings of the 24th International Conference on Machine Learning, 241–248. Curran Associates, Inc.
- Donoho, DL, Grimes CE (2003) Hessian eigenmaps: locally linear embedding techniques for high-dimensional data. Proc Nat Acad Sci USA 100(10): 5591–5596.
- Duda, RO, Hart PE, Stork DG (2000) Pattern classification. 2nd edn. Wiley-Interscience, Hoboken, NJ.
- Goldberg, Y, Zakai A, Kushnir D, Ritov Y (2008) Manifold learning: the price of normalization. J Machine Learning Res 9: 1909–1939.
- Ham, J, Lee DD, Mika S, Schölkopf B (2004) A kernel view of the dimensionality reduction of manifolds. In: Proceedings of the Twenty-first International Conference on Machine Learning, 47. Curran Associates, Inc.
- He, X, Niyogi P (2003) Locality preserving projections. In: Advances in Neural Information Processing Systems 16. Curran Associates, Inc.
- Hein, M, Audibert J, Luxburg UV (2005) From graphs to manifolds - weak and strong pointwise consistency of graph Laplacians. In: COLT, 470–485. Springer.
- Ji, M, Lin B, He X, Cai D, Han J (2012) Parallel field ranking. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’12, 723–731.
- Jolliffe, IT (1989) Principal component analysis. Springer, New York.
- Kim, KI, Steinke F, Hein M (2009) Semi-supervised regression using Hessian energy with an application to semi-supervised dimensionality reduction. In: Advances in Neural Information Processing Systems 22, 979–987. Curran Associates, Inc.
- Lafferty, J, Wasserman L (2008) Statistical analysis of semi-supervised regression. In: Platt JC, Koller D, Singer Y, Roweis S (eds) Advances in Neural Information Processing Systems 20, 801–808. MIT Press, Cambridge, MA.
- Lafon, S, Lee AB (2006) Diffusion maps and coarse-graining: a unified framework for dimensionality reduction, graph partitioning, and data set parameterization. IEEE Trans Pattern Anal Machine Intelligence 28: 1393–1403.
- Lee, JM (2003) Introduction to smooth manifolds. 2nd edn. Springer, New York.
- Lin, T, Zha H (2008) Riemannian manifold learning. IEEE Trans Pattern Anal Machine Intelligence 30(5): 796–809.
- Lin, B, He X, Zhang C, Ji M (2013) Parallel vector field embedding. J Machine Learning Res 14(1): 2945–2977.
- Lin, B, Yang S, Zhang C, Ye J, He X (2012) Multi-task vector field learning. In: Advances in Neural Information Processing Systems 25, 296–304. Curran Associates, Inc.
- Lin, B, Zhang C, He X (2011) Semi-supervised regression via parallel field regularization. In: Advances in Neural Information Processing Systems, 433–441. Curran Associates, Inc.
- Mao, X, Lin B, Cai D, He X, Pei J (2013) Parallel field alignment for cross media retrieval. In: Proceedings of the 21st ACM International Conference on Multimedia, 897–906. ACM.
- Nadler, B, Lafon S, Coifman R, Kevrekidis I (2006) Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators. In: Advances in Neural Information Processing Systems 18, 955–962. Curran Associates, Inc.
- Nadler, B, Srebro N, Zhou X (2009) Statistical analysis of semi-supervised learning: the limit of infinite unlabelled data. In: Bengio Y, Schuurmans D, Lafferty J, Williams CKI, Culotta A (eds) Advances in Neural Information Processing Systems 22, 1330–1338. Curran Associates, Inc.
- Petersen, P (1998) Riemannian geometry. Springer, New York.
- Roweis, S, Saul L (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500): 2323–2326.
- Schölkopf, B, Smola A, Müller K-R (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 10(5): 1299–1319.
- Sindhwani, V, Niyogi P, Belkin M (2005) Beyond the point cloud: from transductive to semi-supervised learning. In: ICML, 824–831. Curran Associates, Inc.
- Singer, A, Wu H-T (2012) Vector diffusion maps and the connection Laplacian. Commun Pure Appl Math 65(8): 1067–1144.
- Steinke, F, Hein M (2008) Non-parametric regression between manifolds. In: Koller D, Schuurmans D, Bengio Y, Bottou L (eds) Advances in Neural Information Processing Systems 21, 1561–1568. Curran Associates, Inc.
- Tenenbaum, J, de Silva V, Langford J (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290(5500): 2319–2323.
- Weinberger, KQ, Sha F, Saul LK (2004) Learning a kernel matrix for nonlinear dimensionality reduction. In: ICML ’04: Proceedings of the Twenty-first International Conference on Machine Learning. Curran Associates, Inc.
- Zhang, Z, Zha H (2005) Principal manifolds and nonlinear dimension reduction via local tangent space alignment. SIAM J Sci Comput 26(1): 313–338.
- Zhou, D, Bousquet O, Lal TN, Weston J, Schölkopf B (2003) Learning with local and global consistency. In: Advances in Neural Information Processing Systems 16. Curran Associates, Inc.
- Zhu, X (2006) Semi-supervised learning literature survey. Comput Sci, University of Wisconsin-Madison 2: 3.
- Zhu, X, Ghahramani Z, Lafferty J (2003) Semi-supervised learning using Gaussian fields and harmonic functions. In: Proc. of the Twentieth International Conference on Machine Learning. Curran Associates, Inc.
Copyright
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.