Xu, Weiping (2012) Non-Euclidean Dissimilarity Data in Pattern Recognition. PhD thesis, University of York.
Abstract
This thesis addresses problems in dissimilarity (proximity) learning, particularly focusing
on identifying the sources and rectifying the non-Euclidean dissimilarity in pattern recog-
nition. We aim to develop a framework for analyzing the non-Euclidean dissimilarity by
combining the methods from differential geometry and manifold learning theory. The
algorithms are applied to objects represented by the dissimilarity measures.
In Chapter 3 we describe how to reveal the origins of the non-Euclidean behaviors
of the dissimilarity matrix for the purpose of rectifying the dissimilarities. We com-
mence by developing a new measure which gauges the extent to which individual data
give rises to departures from metricity in a set of dissimilarity data. This allows us to as-
sess whether the non-Euclidean artifacts in a dataset can be attributed to individual objects
or are distributed uniformly. The second novel contribution of Chapter 3 is to provide sim-
ple empirical tests that can be used to determine the sources of the negative dissimilarity
eigenvalues. We consider three sources of the negative dissimilarity eigenvalues, namely
a) that the data resides on a manifold, b) that the objects may be extended and c) that there
is Gaussian noise. We experiment with the algorithms on a set of public dissimilarities
used in various applications available from the EU SIMBAD project.
In Chapter 4, we propose a framework for rectifying the dissimilarities using Ricci
flow on the manifolds so that the non-Euclidean artifacts are eliminated, as the second
main contribution of this thesis. We consider the objects of interest to be represented by
points on a manifold consisting of local patches with constant curvatures, and the given
dissimilarities to be the geodesic distances on the manifold between these points. In dif-
ferential geometry, Ricci flow changes the metric of a Riemannian manifold according to
the curvature of the manifold. We seek to flatten the curved manifold so that a corrected
set of Euclidean distances are obtained. We achieve this by deforming the manifold usingRicci flow. In the first technique, we consider each edge as a local patch and apply Ricci
flow independently to flatten each patch. In this way, the local structure of the manifold
is ignored, as Ricci flow is applied independently on each edge. To overcome this prob-
lem, we propose a second technique, where add a curvature regularization process before
evolving the manifold. Specifically we use the heat kernel to smooth out the curvatures
on the edges. The results show both improved numerical stability and lower classification
error in the embedded space.
To reduce the reliance on the piecewise embedding and its effects on individual edges,
we extend the previous two techniques and develop a third means of correcting non-
Euclidean dissimilarity data as the first contribution of Chapter 5. This is done by using
a tangent space reprojection to inflate the local hyperspherical patches and align the local
patches with the shortest edge-connected path. These three Ricci-flow-based techniques
proposed through this thesis are investigated as a means of correcting the dissimilarities
so that the the non-Euclidean artefacts are eliminated. We experiment on two datasets
represented by dissimilarities, namely the CoilYork and the Chickenpieces datasets.
In the framework for correcting the non-Euclidean dissimilarities using the Ricci flow
process, estimating the curvatures of the embedded manifold is an important component
prior deforming the manifold. The second contribution of Chapter 5 is the investigation
of the effects of the piecewise embedding methods (the kernel embedding and the Isomap
embedding) on the curvatures computation and the introduction of a new way of com-
puting the curvatures from a set of dissimilarities. We consider each local patch on a
hypersphere, and deduce the enclosed volume of the points in terms of the curvature. We
estimate the curvature by fitting the volume. We illustrate the utility of this method for
estimating curvatures on the artificial dataset (2-sphere dataset).
Metadata
Awarding institution: | University of York |
---|---|
Academic Units: | The University of York > Computer Science (York) |
Identification Number/EthosID: | uk.bl.ethos.572365 |
Depositing User: | Weiping Xu |
Date Deposited: | 08 May 2013 13:43 |
Last Modified: | 08 Sep 2016 13:02 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:3848 |
Download
t(1)
Filename: t(1).pdf
Licence:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License
Export
Statistics
You do not need to contact us to get a copy of this thesis. Please use the 'Download' link(s) above to get a copy.
You can contact us about this thesis. If you need to make a general enquiry, please see the Contact us page.