Learning orthogonal projections for Isomap

Neurocomputing 103 (2013) 149–154
Yali Zheng a, Bin Fang a,*, Yuan Yan Tang b, Taiping Zhang a, Ruizong Liu a
a School of Computer Science, Chongqing University, Chongqing, China
b University of Macau, Macau, China
* Corresponding author. Tel.: +86 23 65112784. E-mail address: fb@cqu.edu.cn (B. Fang).
Article history: Received 14 April 2012; received in revised form 11 September 2012; accepted 16 September 2012; available online 23 October 2012. Communicated by Liang Wang.

Abstract
We propose a dimensionality reduction technique named Orthogonal Isometric Projection (OIP). In contrast with Isomap, which learns the low-dimensional embedding by solving the problem under the classic Multidimensional Scaling (MDS) framework, we learn an explicit linear projection that captures the geodesic distance. Such a projection handles new data straightforwardly and leads to a standard eigenvalue problem. We constrain the projection to be orthogonal, analyze the properties of orthogonal projections, and demonstrate their benefits: the Euclidean distance and the angle between each pair of points in the high-dimensional space are preserved in the low-dimensional space, so that both the global and the local geometric structure are preserved. Numerical experiments on different datasets compare the performance of OIP with a few competing methods.
© 2012 Elsevier B.V. All rights reserved.
Keywords:
Manifold learning
Isomap
Orthogonal projection
1. Introduction
Researchers are interested in reducing the dimensionality of digital data because such data contain considerable redundancy. Extracting efficient features from data can improve object classification and recognition and simplify the visualization of data. One main category of dimensionality reduction techniques is data embedding, which seeks a low-dimensional representation of the data that benefits subsequent analysis; Multidimensional Scaling (MDS) [1] and Isomap [4] are examples. MDS is a classic embedding technique that preserves pairwise distances to obtain the low-dimensional configuration. The other category is projection methods, which explicitly learn linear or nonlinear projections between the high-dimensional and low-dimensional spaces on a training set, and reduce the dimensionality of new data by applying the learned projections. Principal component analysis (PCA) can be used as a projection method: it learns a linear projection by maximizing the variance of the data in the low-dimensional space. PCA is identical to classical Multidimensional Scaling when the Euclidean distance is used [1], but PCA learns an explicit projection. Linear Discriminant Analysis (LDA) likewise determines an explicit projection, by maximizing the ratio of the between-class variance to the within-class variance.
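For illustration of the projection viewpoint described above (this example is not part of the paper), the sketch below fits a PCA projection on a training set and applies it directly to unseen points; the data and the target dimensionality are arbitrary assumptions.

```python
import numpy as np

def fit_pca_projection(X_train, p):
    """Learn an m x p orthonormal projection by maximizing variance (PCA)."""
    mean = X_train.mean(axis=0)
    Xc = X_train - mean                       # center the training data
    # Principal directions = right singular vectors of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:p].T, mean                     # V has shape (m, p)

def project(X, V, mean):
    """Apply the learned linear projection to (possibly new) data."""
    return (X - mean) @ V

# Hypothetical usage: 100 training points in R^20 mapped to R^2.
rng = np.random.default_rng(0)
X_train, X_new = rng.normal(size=(100, 20)), rng.normal(size=(5, 20))
V, mean = fit_pca_projection(X_train, p=2)
Y_new = project(X_new, V, mean)               # new data handled directly
```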
From the contemporary point of view, Isomap and Locally Linear Embedding (LLE) [3] are pioneers of manifold learning methods. Manifold learning theory was introduced into the dimensionality reduction field in the early 2000s; it assumes that a low-dimensional manifold is embedded in the high-dimensional data. Roweis and Saul [3] proposed a nonlinear dimensionality reduction method, LLE, which
aimed to preserve in the low-dimensional space the same local configuration of each neighborhood as in the high-dimensional space. He and Niyogi [2] found optimal linear approximations to the eigenfunctions of the nonlinear Laplacian Eigenmap, called Locality Preserving Projections (LPP), and justified the approach using graph Laplacian theory. Cai et al. [5] proposed a variation of Laplacianfaces (LPP), Orthogonal Laplacianfaces (OLPP), which iteratively computes orthogonal eigenvectors to compose the projection matrix. Kokiopoulou and Saad [6] analyzed and compared LPP and OLPP, and proposed Orthogonal Neighborhood Preserving Projections (ONPP), which can be thought of as an orthogonal version of LLE in which the projections are learned explicitly by solving a standard eigenvalue problem.
However, on one hand, manifold learning has been questioned on the grounds of topological stability ever since it was introduced into dimensionality reduction [15]. On the other hand, researchers have devoted much attention to discovering and developing a variety of manifold learning methods to address these problems. Geodesic distance computation suffers from loops when the neighborhood graph is constructed, so Lee and Verleysen [12] designed a nonlinear dimensionality reduction technique that breaks loops on the manifold. Choi and Choi [13] related Isomap to Mercer kernel machines to study its projection property, and eliminated critical outliers to improve topological stability. Dimensionality reduction techniques are reviewed in [16].
This paper is an extension of our earlier work [8]. Motivated by Isomap [4] and Isometric Projection [9], we propose a linear projection method, called Orthogonal Isometric Projection, which is a variation of Isometric Projection. Cai's Isometric Projection [9] addresses the same goal as ours. However, in this paper we constrain the projection to be orthogonal, which differs from Cai's method, and we solve a standard eigenvalue problem; we also analyze the properties of orthogonal projections in manifold learning and demonstrate why orthogonal projection works better. The main difference between our method and Cai's orthogonal version of Isometric Projection is that we intuitively build a reasonable objective function and solve the optimization as a standard eigenvalue problem, whereas Cai solved a generalized eigenvalue problem. We further extend our method to the supervised case and to sparse Orthogonal Isometric Projection, and we test it on different datasets. In the following section we briefly review Isomap and Cai's Isometric Projection to highlight the advantages of our algorithm.
2. A brief review of Isomap and Isometric Projection
2.1. Isomap
Isomap was proposed by Tenenbaum et al. [4] and is one of the most popular manifold learning techniques. It aims to obtain a Euclidean embedding of the points such that the Euclidean distance between each pair of embedded points is close to the geodesic distance between the corresponding points in the high-dimensional space. The mathematical formulation is

$\min \sum_{i,j} \big( d_G(x_i, x_j) - d_E(y_i, y_j) \big)^2$   (1)

where $d_G$ is the geodesic distance and $d_E$ is the Euclidean distance. In matrix form,

$\min \| \tau(D_G) - \tau(D_E) \|^2$   (2)

where $D_G$ is the geodesic distance matrix, $D_E$ is the Euclidean distance matrix, and $\tau$ is the operator that converts a distance matrix into an inner-product form. The problem is solved under the MDS framework. Isomap assumes that a manifold is embedded in the high-dimensional space and applies the geodesic distance to measure the similarity of each point pair. However, if insufficient samples are given or the data are noisy, the intrinsic geometry of the data is difficult to capture by constructing the neighborhood graph.

2.2. Isometric Projection

Cai et al. [9] extended the Isomap algorithm to learn a linear projection by solving a spectral graph optimization problem. Supposing that $Y = V^T X$, they minimized the objective function

$\min_V \| \tau(D_G) - X^T V V^T X \|^2$   (3)

To make the problem tractable, they imposed the constraint $V^T X X^T V = I$ and rewrote the minimization problem as

$\arg\max_V \; \mathrm{tr}(V^T X \tau(D_G) X^T V) \quad \text{s.t.} \quad V^T X X^T V = I$   (4)

which is equivalent to a generalized eigenvalue problem

$X \tau(D_G) X^T V = \lambda X X^T V$

To reduce the computational cost, Cai also applied regression over $Y$ and $X$, called spectral regression (SR) [9,14]. $Y$, the matrix of eigenvectors of $\tau(D_G)$, is computed first, and then

$a = \arg\min_a \sum_{i=1}^{m} (a^T x_i - y_i)^2 + \alpha \|a\|^2$

The condition $V^T X X^T V = I$ constrains the low-dimensional embedding of the points to be orthogonal, namely $Y^T Y = I$.

3. Orthogonal Isometric Projection

The main idea of Orthogonal Isometric Projection is to seek an orthogonal mapping over the training dataset that best preserves the geodesic distances on a neighborhood graph, and to learn a linear projection under the general Isomap framework, but with a different and reasonable constraint: the projections are orthogonal. We will demonstrate why orthogonal projections work better in the next section.

3.1. The objective function of OIP

Under the Isomap framework, OIP minimizes the objective function in Eq. (2). Here $\tau(D_G) = -CWC/2$, where $C$ is the centering matrix defined by $C = I_n - \frac{1}{N} e_N e_N^T$ with $e_N = [1, \ldots, 1]_N^T$, and $W$ is the Dijkstra distance matrix computed on a $K$-nearest-neighbor graph over all data points. Let $f(V) = \| \tau(D_G) - X^T V V^T X \|^2$; we seek a linear projection

$\min_V f(V)$   (5)

Let $S = \tau(D_G)$, which is known once the neighborhood graph has been constructed from the given dataset. We have

$f(V) = \mathrm{tr}\big( (S - X^T V V^T X)^T (S - X^T V V^T X) \big)$
$\quad = \mathrm{tr}\big( S^T S - S^T (X^T V V^T X) - (X^T V V^T X)^T S + (X^T V V^T X)^T X^T V V^T X \big)$
$\quad = \mathrm{tr}(S^T S) + \mathrm{tr}\big( (X^T V V^T X)^T (X^T V V^T X) - 2 S^T (X^T V V^T X) \big)$
$\quad = \mathrm{tr}\big( ((X^T V V^T X)^T - 2 S^T)(X^T V V^T X) \big) + \mathrm{tr}(S^T S)$

So the objective function in Eq. (5) is equivalent to

$\min \; \mathrm{tr}\big( V^T X ((X^T V V^T X)^T - 2 S^T) X^T V \big) \quad \text{s.t.} \quad V^T V = I$   (6)

Let $M = X (X^T X - 2 S^T) X^T$; then the problem becomes

$\min \; \mathrm{tr}(V^T M V) \quad \text{s.t.} \quad V^T V = I$

which leads to a standard eigenvalue problem

$M V = \lambda V$   (7)

$V$ is determined by the eigenvectors of $M$ corresponding to the $p$ smallest eigenvalues.

3.2. The algorithm of OIP

First we need to construct the neighborhood graph $G$. There are two common options: connect $x_i$ and $x_j$ if $x_j \in N(x_i)$, where $N(x_i)$ is the set of $k$ nearest neighbors of $x_i$; or connect them if $\|x_i - x_j\|^2 \le \epsilon$, where $\epsilon$ is a small threshold. The edge between $i$ and $j$ is weighted by a Gaussian kernel, defined as $e^{-\|x_i - x_j\|^2}$. We summarize the algorithm in Algorithm 1.

Algorithm 1. Orthogonal Isometric Projection.
1: Construct a neighborhood graph $G$ over the data points. Compute the Dijkstra distance matrix $W(i,j) = d_G(i,j)$ over the graph for every point pair, where $d_G(i,j)$ is the length of the shortest path between $i$ and $j$; if no path exists, $d_G(i,j) = \infty$.
2: Compute $\tau(D_G) = -CWC/2$, where $C$ is the centering matrix $C = I_n - \frac{1}{N} e_N e_N^T$.
3: Compute the eigenvectors of $M = X (X^T X - 2 \tau(D_G)^T) X^T$. $V$ is determined by the eigenvectors of $M$ associated with the $p$ smallest eigenvalues.
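For illustration, the following is a minimal Python sketch of Algorithm 1 (our own reading of the steps above, not the authors' code). It assumes a k-nearest-neighbor graph, SciPy's shortest-path routine for the Dijkstra distances, and a NumPy eigendecomposition; names such as `n_neighbors` and `n_components` are illustrative choices, and the graph is assumed to be connected.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph

def oip_fit(X, n_neighbors=5, n_components=2):
    """Sketch of Algorithm 1 (Orthogonal Isometric Projection).

    X : (n, m) array with one sample per row.
    Returns V, an (m, n_components) matrix with orthonormal columns.
    """
    n = X.shape[0]

    # Step 1: k-NN graph and Dijkstra shortest-path (geodesic) distances W.
    knn = kneighbors_graph(X, n_neighbors, mode='distance')
    W = shortest_path(knn, method='D', directed=False)

    # Step 2: tau(D_G) = -C W C / 2 with C = I - (1/n) e e^T, as stated in the paper
    # (classical MDS usually applies this centering to squared distances).
    C = np.eye(n) - np.ones((n, n)) / n
    S = -C @ W @ C / 2.0

    # Step 3: M = X (X^T X - 2 S^T) X^T, where the paper's X is m x n
    # (columns are samples), i.e. the transpose of the row-wise X used here.
    Xc = X.T                          # m x n
    G = Xc.T @ Xc                     # n x n Gram matrix X^T X
    M = Xc @ (G - 2.0 * S.T) @ Xc.T   # m x m, symmetric because S is symmetric

    # Eigenvectors for the p smallest eigenvalues give the projection V.
    _, eigvecs = np.linalg.eigh(M)    # eigenvalues returned in ascending order
    return eigvecs[:, :n_components]

# New points are projected directly: Y = X_new @ V (i.e. y = V^T x per sample).
```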
4. Why orthogonal projections work better

Cai et al. proposed OLPP [5] after LPP [7] by imposing orthogonality constraints, and Kokiopoulou and Saad extended LLE [3] to ONPP [6], showing the success of orthogonal projections over various datasets. It seems that orthogonal extensions always work better than the original methods. However, the existing algorithms add orthogonality constraints either to make the optimization tractable or to obtain an explicit linear projection; they do not analyze the properties of orthogonal projections themselves, nor do they identify the reasons for the better results. We present two properties of orthogonal projections that explain why orthogonal algorithms work reasonably and better.

Lemma 1. Suppose that $x_i \in R^m$ is a point in the high-dimensional space, $V$ is an $m \times p$ projection matrix with $V V^T = I$, and $y_i \in R^p$ is the linear projection of $x_i$ into the low-dimensional space, $y_i = V^T x_i$. Then

$\|y_i - y_j\|^2 = \|x_i - x_j\|^2$

Proof. The Euclidean distance between $y_i$ and $y_j$ is

$\|y_i - y_j\|^2 = (y_i - y_j)^T (y_i - y_j)$
$\quad = (V^T x_i - V^T x_j)^T (V^T x_i - V^T x_j)$
$\quad = (V^T (x_i - x_j))^T (V^T (x_i - x_j))$
$\quad = (x_i - x_j)^T V V^T (x_i - x_j)$
$\quad = \|x_i - x_j\|^2$   □   (8)

Lemma 2. Suppose that $x_i \in R^m$ is a point in the high-dimensional space, $V$ is an $m \times p$ projection matrix with $V V^T = I$, and $y_i \in R^p$ is the linear projection of $x_i$ into the low-dimensional space, $y_i = V^T x_i$. Then

$\mathrm{angle}(y_i, y_j) = \mathrm{angle}(x_i, x_j)$

Proof. The inner product of $y_i$ and $y_j$ equals that of $x_i$ and $x_j$:

$\langle y_i, y_j \rangle = y_i^T y_j = (V^T x_i)^T (V^T x_j) = x_i^T V V^T x_j = \langle x_i, x_j \rangle$   (9)

The angle between the vectors $y_i$ and $y_j$ is

$\mathrm{angle}(y_i, y_j) = \arccos \dfrac{\langle y_i, y_j \rangle}{\|y_i - 0_y\| \, \|y_j - 0_y\|} = \arccos \dfrac{\langle x_i, x_j \rangle}{\|x_i - 0_x\| \, \|x_j - 0_x\|} = \mathrm{angle}(x_i, x_j)$

where $0_y$ denotes the $p$-dimensional zero vector, $0_x$ denotes the $m$-dimensional zero vector, and $y_i$, $y_j$, $x_i$, $x_j$ are non-zero vectors.   □

We can see that the Euclidean distance and the angle between each data pair in the high-dimensional space are preserved in the low-dimensional space when orthogonal projections are applied. We believe that the more microscopic characteristics of the data points are kept during the orthogonal projection, the more macroscopic properties of the data will be preserved, and the less information will be lost by the dimensionality reduction.

5. Extensions of OIP

5.1. Supervised case

We can extend OIP to the supervised case, in which class labels are available. OIP is modified appropriately so as to produce a projection that carries discriminative as well as geometric information. We modify the objective function as follows, which differs from the original Isomap objective (Eq. (1)):

$\min \sum_{i,j} \big( \omega_{ij} \, d_M(x_i, x_j) - d_E(y_i, y_j) \big)^2$

Here $\omega_{ij}$ is a weight that compacts $x_i$ and $x_j$ if they share the same label and expands them if they belong to different classes. In other words, if $x_i$ and $x_j$ come from the same class, we would like their projections to be closer in the low-dimensional space, compacting them by setting $0 \le \omega_{ij} \le 1$; if $x_i$ and $x_j$ are from different classes, we would like to push their projections further apart by setting $\omega_{ij} \ge 1$. In matrix form, we minimize the objective function

$\min \| A \circ \tau(D_G) - \tau(D_Y) \|^2$

where $A \circ \tau(D_G)$ denotes the entrywise product $[A]_{ij} [\tau(D_G)]_{ij}$. Obviously, we can solve the problem in the same way as in the unsupervised case by simply taking $S = A \circ \tau(D_G)$ instead of $\tau(D_G)$ in Eq. (5).

5.2. Sparse OIP

Our problem essentially solves the standard eigenvalue problem in Eq. (7), which can also be incorporated into a regression framework in the manner of Sparse PCA [11]. The optimal $V$ consists of the eigenvectors corresponding to the maximum eigenvalues of Eq. (7). Since $M$ is a real symmetric matrix, it can be decomposed as $M = \tilde{X} \tilde{X}^T$. Suppose the rank of $\tilde{X}$ is $r$, and the SVD of $\tilde{X}$ is

$\tilde{X} = \tilde{U} \tilde{\Sigma} \tilde{V}^T$

It is easy to verify that the column vectors of $\tilde{U}$ are the eigenvectors of $\tilde{X} \tilde{X}^T$. Let $Y = [y_1, y_2, \ldots, y_r]_{n \times r}$, where each row vector is a sample vector in the $r$-dimensional subspace, and let $V = [v_1, v_2, \ldots, v_r]_{m \times r}$. The projective functions of OIP are then obtained by solving the linear systems

$X^T v_p = y_p, \quad p = 1, 2, \ldots, r$

where $v_p$ is the solution of the regression problem

$v_p = \arg\min_v \sum_{i=1}^{n} (v^T x_i - y_{ip})^2$

where $y_{ip}$ is the $i$-th element of $y_p$. Similarly to Zou et al. [11], we can obtain sparse solutions by adding an L1 regularization term:

$v_p = \arg\min_v \sum_{i=1}^{n} (v^T x_i - y_{ip})^2 + \alpha \|v\|_1$

where $v_p$ is the $p$-th column of $V$. The regression can be solved efficiently using the LARS algorithm.
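As a rough illustration of the regression step above (our own sketch, not the authors' implementation), the code below computes one sparse direction $v_p$ from the L1-regularized least-squares problem. It uses scikit-learn's LassoLars, which implements a LARS-based Lasso solver; note that scikit-learn scales the squared-loss term by $1/(2n)$, so its `alpha` is only proportional to the $\alpha$ in the text, and the value used here is an arbitrary assumption.

```python
import numpy as np
from sklearn.linear_model import LassoLars

def sparse_direction(X, y_p, alpha=0.01):
    """Approximate v_p = argmin_v sum_i (v^T x_i - y_ip)^2 + alpha * ||v||_1.

    X   : (n, m) samples, one per row.
    y_p : (n,) target coordinate of the r-dimensional embedding.
    Returns a sparse m-dimensional direction vector.
    """
    # fit_intercept=False because the regression problem has no bias term.
    reg = LassoLars(alpha=alpha, fit_intercept=False)
    reg.fit(X, y_p)
    return reg.coef_

# Stacking sparse_direction(X, Y[:, p]) for p = 0, ..., r-1 (the columns of Y)
# yields a sparse projection matrix in the spirit of sparse OIP.
```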
6. Experiments
We evaluate our algorithm on Reuters-21578 and USPS, which were downloaded from a public website,1 and compare it with LDA, IP, IP+SR, OLPP and ONPP, reporting the average accuracy and average error rates in this section. We randomly sample from each dataset 25 times to form training sets whose size varies from 20% to 80% of the data, and the rest is used for testing.
1 http://www.zjucadcg.cn/dengcai/Data/TextData.html
Table 1
Comparison on Reuters-21578 (classification accuracy %, mean ± standard deviation).

Training  kNN   LDA          IP           OLPP         ONPP         OIP
20%       k=1   79.53±0.78   79.75±1.03   72.65±1.26   73.60±2.09   81.48±0.32
          k=5   79.57±0.79   78.24±1.22   74.71±1.40   69.89±2.17   80.87±0.39
30%       k=1   80.39±1.06   80.11±0.69   73.41±1.47   76.94±0.82   82.96±0.63
          k=5   80.32±1.12   80.01±0.79   75.44±1.09   73.07±1.35   82.13±0.61
40%       k=1   79.93±1.59   82.07±0.94   73.40±1.13   78.52±1.18   83.76±0.34
          k=5   79.87±1.60   81.16±1.25   75.29±0.95   75.08±2.19   83.29±0.34
50%       k=1   79.69±2.12   83.04±1.23   73.48±0.81   80.11±0.84   83.83±0.35
          k=5   79.65±2.13   82.59±1.26   75.32±0.65   76.72±1.01   83.96±0.70
60%       k=1   80.06±1.84   78.79±1.34   73.61±0.93   80.63±0.4    84.62±0.63
          k=5   80.09±1.86   78.05±1.66   75.04±0.70   77.35±0.54   84.51±0.44
70%       k=1   77.87±3.48   79.45±0.71   72.77±0.85   81.71±0.84   84.94±0.96
          k=5   77.99±3.44   77.84±0.56   73.50±1.10   77.68±1.31   84.73±0.84
80%       k=1   79.14±1.57   80.16±0.63   71.70±1.88   81.94±1.53   85.09±0.71
          k=5   79.23±1.65   78.88±1.18   72.15±1.36   77.28±1.37   85.38±0.78
Fig. 1. Dimensions vs. average classification error on the Reuters-21578 dataset (curves: LDA, IP, ONPP, OLPP, OIP; y-axis: average error rate; x-axis: dimensions, 20–100).

Fig. 2. USPS example.
We map the points in the test sets using the projections learned from the training sets, and apply the nearest neighbor rule (k=1) and the k-nearest-neighbor rule (k=5) to assign category labels. In kNN, the label of a test sample is decided by a vote among its k nearest neighbors. We also report the average error rate versus the number of dimensions for the half-training, half-testing split. Let $l_i$ be the ground-truth label, $b_i$ the label assigned after dimensionality reduction, and $N_{test}$ the number of test samples. Then

$\mathrm{Acc} = \dfrac{1}{N} \sum_{j=1}^{N} \dfrac{\sum_{i=1}^{N_{test}} \delta(l_i, b_i)}{N_{test}}$

where $N = 25$ in our experiments and $\delta$ is the Kronecker delta (1 if the two labels agree and 0 otherwise).
The original Reuters-21578 corpus contains 135 categories with multiple labels; documents with multiple labels are ignored. The remaining data used in our experiment consist of 8293 samples with 18,933 features from 65 categories. Table 1 compares our method with LDA, IP, OLPP and ONPP on Reuters-21578. The boldface values are from our method, and ± denotes the standard deviation. As expected, the classification accuracy increases with the number of training samples, and Fig. 1 shows that the average classification error decreases as the number of dimensions increases.
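To make the evaluation protocol concrete, here is a small sketch of one random split (our own illustration, not the paper's code): it learns a projection on the training portion, projects the test points, classifies them with the k-nearest-neighbor rule, and computes the accuracy defined above. The function `learn_projection` stands for any of the compared methods, e.g. the OIP sketch given earlier.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def evaluate_split(X, labels, learn_projection, train_ratio=0.5, k=1, seed=0):
    """One random train/test split: project, classify with kNN, return accuracy."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    n_train = int(train_ratio * len(idx))
    tr, te = idx[:n_train], idx[n_train:]

    V = learn_projection(X[tr])              # projection learned on training data only
    Y_tr, Y_te = X[tr] @ V, X[te] @ V        # y = V^T x for each sample

    clf = KNeighborsClassifier(n_neighbors=k).fit(Y_tr, labels[tr])
    pred = clf.predict(Y_te)
    return np.mean(pred == labels[te])       # sum_i delta(l_i, b_i) / N_test

# Averaging evaluate_split over 25 random seeds reproduces the Acc statistic above.
```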
USPS is a well-known handwritten digit corpus from the US Postal Service. It contains normalized gray-scale images of size 16×16, 9298 samples in total with 256 features. Fig. 2 shows some examples from the USPS dataset. An estimated human error rate of 2.37% indicates that classification on USPS is a hard task [10]. Table 2 compares the classification accuracy of our method with LDA, IP, IP+SR and ONPP; our method outperforms all the other methods. Fig. 3 shows the average error as a function of the number of dimensions on the USPS dataset.
Table 2
Comparison on USPS (classification accuracy %, mean ± standard deviation).

Training  kNN   LDA          IP           IP+SR        ONPP         OIP
20%       k=1   89.13±0.34   92.11±0.39   93.90±0.50   91.30±0.45   95.10±0.32
          k=5   90.49±0.48   90.65±0.54   93.55±0.48   90.77±0.56   94.02±0.20
30%       k=1   90.39±0.32   93.61±0.33   94.96±0.36   92.79±0.48   95.93±0.22
          k=5   91.73±0.31   95.02±0.39   94.93±0.31   92.16±0.42   94.85±0.17
40%       k=1   91.18±0.36   94.48±0.28   95.69±0.36   93.20±0.51   96.40±0.20
          k=5   92.01±0.31   94.01±0.28   95.51±0.29   92.73±0.50   95.49±0.34
50%       k=1   91.54±0.32   94.85±0.42   95.91±0.34   93.33±0.57   96.65±0.26
          k=5   92.71±0.33   94.77±0.37   96.01±0.25   93.03±0.55   95.96±0.25
60%       k=1   91.97±0.32   95.21±0.31   96.14±0.35   93.93±0.48   97.01±0.25
          k=5   92.90±0.37   94.92±0.25   96.13±0.34   93.68±0.58   96.16±0.28
70%       k=1   92.02±0.49   95.61±0.35   96.59±0.32   93.98±0.56   97.17±0.23
          k=5   93.29±0.42   95.15±0.42   96.34±0.37   93.50±0.61   96.42±0.25
80%       k=1   92.19±0.58   95.97±0.40   96.74±0.40   94.26±0.57   97.35±0.34
          k=5   93.22±0.62   95.83±0.29   96.79±0.37   93.81±0.72   96.49±0.33

Fig. 3. Dimensions vs. average classification error on the USPS dataset (curves: LDA, IP, IP+SR, ONPP, OIP; y-axis: average error rate; x-axis: dimensions, 20–100).

7. Conclusion

In this paper, we propose Orthogonal Isometric Projection, obtained by solving a standard eigenvalue problem. We establish two properties of orthogonal projections: microscopic characteristics of the data (Euclidean distances and angles) are kept through the projection, which explains why orthogonal techniques work better than the original ones. Our method can also be extended to supervised cases and to a regression framework. However, dimensionality reduction remains a challenging problem. Tuning the parameters of manifold learning algorithms to reach their best performance is troublesome, and manifold learning essentially needs a large and unbiased sample to build the neighborhood graph. If enough samples are not available, will manifold learning still work? Can we generate new samples from the insufficient data? These problems are left for future work.
Acknowledgments

This work is supported by the National Science Foundation of China (61075045, 60873092, 90820306, 61173129) and the Fundamental Research Funds for the Central Universities (ZYGX2009X013).
References

[1] W.S. Torgerson, Multidimensional scaling: I. Theory and method, Psychometrika 17 (4) (1952) 401–419.
[2] X. He, P. Niyogi, Locality preserving projections, in: Advances in Neural Information Processing Systems, 2003, pp. 153–160.
[3] S.T. Roweis, L.K. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science 290 (2000) 2323–2326.
[4] J.B. Tenenbaum, V. Silva, J.C. Langford, A global geometric framework for nonlinear dimensionality reduction, Science 290 (2000) 2319–2323.
[5] D. Cai, X. He, J. Han, H.J. Zhang, Orthogonal Laplacianfaces for face recognition, IEEE Trans. Image Process. 15 (11) (2006) 3608–3614.
[6] E. Kokiopoulou, Y. Saad, Orthogonal neighborhood preserving projections: a projection-based dimensionality reduction technique, IEEE Trans. Pattern Anal. Mach. Intell. 29 (12) (2007) 2143–2156.
[7] X. He, S. Yan, Y. Hu, P. Niyogi, H.J. Zhang, Face recognition using Laplacianfaces, IEEE Trans. Pattern Anal. Mach. Intell. 27 (3) (2005) 328–340.
[8] Y. Zheng, Y.Y. Tang, B. Fang, T. Zhang, Orthogonal Isometric Projection, in: Proceedings of the IEEE International Conference on Pattern Recognition, 2012.
[9] D. Cai, X. He, J. Han, Isometric projection, in: Proceedings of the National Conference on Artificial Intelligence, 2007, pp. 528–533.
[10] I. Chaaban, M. Scheessele, Human Performance on the USPS Database, Technical Report, Indiana University South Bend, 2007.
[11] H. Zou, T. Hastie, R. Tibshirani, Sparse principal component analysis, J. Comput. Graphical Stat. 15 (2) (2006) 265–286.
[12] J.A. Lee, M. Verleysen, Nonlinear dimensionality reduction of data manifolds with essential loops, Neurocomputing 67 (2005) 29–53.
[13] H. Choi, S. Choi, Robust kernel Isomap, Pattern Recognition 40 (3) (2007) 853–862.
[14] D. Cai, X. He, J. Han, SRDA: an efficient algorithm for large-scale discriminant analysis, IEEE Trans. Knowl. Data Eng. 20 (1) (2008) 1–12.
[15] M. Balasubramanian, E.L. Schwartz, The Isomap algorithm and topological stability, Science 295 (2002) 7.
[16] L. Maaten, E. Postma, J. Herik, Dimensionality reduction: a comparative review, Comput. Inf. Sci. 10 (2) (2007) 1–35.
Yali Zheng is a Ph.D. candidate in the School of Computer Science, Chongqing University, China. She was a visiting student at the University of Pittsburgh and Carnegie Mellon University in the USA (10/2008–09/2011). She is interested in dimensionality reduction, computer vision, and computational photography. She is also an IEEE student member.

Bin Fang received the B.S. degree in Electrical Engineering from Xi'an Jiaotong University, Xi'an, China, the M.S. degree in Electrical Engineering from Sichuan University, Chengdu, China, and the Ph.D. degree in Electrical Engineering from the University of Hong Kong, Hong Kong, China. He is currently a Professor in the School of Computer Science, Chongqing University, China. His research interests include computer vision, pattern recognition, medical image processing, biometrics applications, and document analysis. He has published more than 100 technical papers and is an Associate Editor of the International Journal of Pattern Recognition and Artificial Intelligence. He is a Senior Member of the IEEE.
Yuan Yan Tang received the B.S. degree in Electrical and Computer Engineering from Chongqing University, Chongqing, China, the M.S. degree in Electrical Engineering from the Graduate School of Post and Telecommunications, Beijing, China, and the Ph.D. degree in Computer Science from Concordia University, Montreal, Canada.
He is currently a Professor in the School of Computer Science at Chongqing University, a Chair Professor at the Faculty of Science and Technology of the University of Macau, and an Adjunct Professor in Computer Science at Concordia University. He is an Honorary Professor at Hong Kong Baptist University and an Advisory Professor at many institutes in China. His current interests include wavelet theory and applications, pattern recognition, image processing, document processing, artificial intelligence, parallel processing, Chinese computing, and VLSI architecture. He has published more than 300 technical papers and is the author or co-author of 23 books and book chapters on subjects ranging from electrical engineering to computer science.
He has served as a General Chair, Program Chair, and Committee Member for many international conferences, and was the General Chair of the 18th International Conference on Pattern Recognition (ICPR'06). He is the Founder and Editor-in-Chief of the International Journal on Wavelets, Multiresolution, and Information Processing (IJWMIP) and an Associate Editor of several international journals related to Pattern Recognition and Artificial Intelligence. He is an IAPR Fellow and an IEEE Fellow, and chairs the pattern recognition committee of IEEE SMC.
Taiping Zhang received the B.S. and M.S. degrees in Computational Mathematics and the Ph.D. degree from the School of Computer Science, Chongqing University, Chongqing, China, in 1999, 2001, and 2010, respectively. He is currently an Associate Professor at the School of Computer Science, Chongqing University, and a visiting research fellow at the Faculty of Science and Technology, University of Macau. His research interests include pattern recognition, image processing, machine learning, and computational mathematics. He has published extensively in IEEE Transactions on Image Processing, IEEE Transactions on Systems, Man, and Cybernetics, Part B (TSMC-B), Pattern Recognition, Neurocomputing, etc. He is a member of the IEEE.
Ruizong Liu received his B.S. degree in project management from Nanjing University of Technology, Nanjing, China, in 2004, and his Ph.D. degree from the School of Computer Science, Chongqing University. He is a postdoctoral fellow at the Faculty of Science and Technology, University of Macau, supervised by Yuan Yan Tang. His research interests include machine learning, pattern recognition, computer vision, and cognitive science.