ELEN E 6880 Statistical Pattern Recognition
(V. Castelli, M. Brodie, I. Rish, D. Oblinger)
Lecture 8: Nearest-Neighbor Classifiers, by Vittorio Castelli
Relevant Book Sections
-
Chapters 4.4, 4.5, and 4.6 of Duda, Hart, and Stork, "Pattern Classification".
-
Chapters 5, 6, 7, 11, and 26 of Devroye, Gyorfi, and Lugosi, "A Probabilistic
Theory of Pattern Recognition" (on the class reading list) are devoted
to the problem at hand and to its variations. Some of this material
is very advanced.
-
Chapters 2.3 and 13 of Hastie, Tibshirani, and Friedman, "The Elements
of Statistical Learning" (on the class reading list) deal with nearest-neighbor
and related methods. This book is oriented more to the practitioner
than the previous book.
Material For The Lecture
Material covered in class
The lecture was prepared using a wide variety of material from the textbooks
and from the additional material listed below.
The writeup, in PDF format, of the material covered in the lecture
can be found here.
Variable-metric Nearest-Neighbor Classifiers
You might be asking yourself what a good metric for nearest-neighbor
classifiers is. Although it is known that asymptotically the metric does not
matter, it is also known that an appropriate choice of metric
can improve the classifier error rate for finite training sample sizes.
Metric selection has been an active area of research in the past. More
recently, however, researchers have started questioning the principle that a
single distance metric should be used for the entire feature space, and are
working on adaptive metrics (namely, on "distance" functions that vary
depending on the query point).
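The effect of the metric at finite sample size can be illustrated with a small sketch. It is not taken from the lecture or from the papers cited below: the toy data, the function names `weighted_euclidean` and `knn_classify`, and the hand-picked feature weights are all assumptions made for illustration. The metric here is a single global weighted Euclidean distance, not a query-dependent one.

```python
import math
from collections import Counter

def weighted_euclidean(x, y, weights):
    """Euclidean distance with a per-feature weight (hypothetical helper)."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))

def knn_classify(query, points, labels, k, weights):
    """Majority vote among the k training points closest to the query."""
    order = sorted(range(len(points)),
                   key=lambda i: weighted_euclidean(query, points[i], weights))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data (assumed): feature 0 separates the classes,
# feature 1 is uninformative and varies on a larger scale.
points = [(0.0, 0.0), (0.2, 5.0), (1.0, 0.5), (1.2, 4.0)]
labels = ["a", "a", "b", "b"]
query = (0.4, 4.1)

# Unweighted metric: the noisy feature dominates the distance.
plain = knn_classify(query, points, labels, k=1, weights=(1.0, 1.0))
# Down-weighting the noisy feature changes which neighbor is nearest.
scaled = knn_classify(query, points, labels, k=1, weights=(1.0, 0.01))

print(plain, scaled)
```

With only four training points, the two metrics pick different nearest neighbors for the same query, so the predicted label changes. An adaptive metric would go one step further and choose the weights as a function of the query point.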
Early work on this topic was done by Jerome Friedman, at Stanford.
His seminal paper "Flexible
Metric Nearest Neighbor Classification" is available in compressed
postscript form by following the link.
A researcher who has worked on the topic more recently is Carlotta
Domeniconi. The following citations might be of interest to you:
-
C. Domeniconi, D. Gunopulos, "Efficient
Local Flexible Nearest Neighbor Classification", to appear in the Proceedings
of the Second SIAM Intl. Conference on Data Mining, 2002.
-
C. Domeniconi, D. Gunopulos, "Adaptive
Nearest Neighbor Classification using Support Vector Machines", Advances
in Neural Information Processing Systems 14, MIT Press (NIPS-2001).
-
C. Domeniconi, J. Peng, D. Gunopulos, "Adaptive
Metric Nearest Neighbor Classification", in Proceedings of IEEE Conference
on Computer Vision and Pattern Recognition, June 13-15, 2000, Hilton Head
Island, South Carolina.
-
C. Domeniconi, J. Peng, D. Gunopulos, "Locally
Adaptive Metric Nearest Neighbor Classification", Technical Report
UCR-CSE-00-02, August 10, 2000.
Where to Find Additional Material
There is an enormous literature on Nearest-Neighbor Methods.
-
A collection of seminal papers was published by the IEEE: B.
Dasarathy, "Nearest Neighbor Pattern Classification Techniques", IEEE Computer
Society Press, 1990.
-
The IEEE and ACM digital libraries will return a large number of hits in
response to queries on nearest-neighbor methods.
-
As we mentioned in class, nearest-neighbor methods are computationally
intensive. The computational cost can be reduced using indexing structures.
A recent survey of multidimensional indexing methods supporting nearest-neighbor
queries can be found in this IBM technical
report. Since the material in the report has been published in
a book, please refrain from distributing it.
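To make the remark about indexing structures concrete, here is a minimal sketch of a k-d tree, one common multidimensional index that supports nearest-neighbor queries. It is an illustration only and is not drawn from the technical report mentioned above; the function names `build_kdtree` and `nearest`, the dictionary-based node layout, and the random test data are assumptions.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_kdtree(points, depth=0):
    """Recursively split the points at the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return the stored point closest to the query, pruning far subtrees."""
    if node is None:
        return best
    point, axis = node["point"], node["axis"]
    if best is None or sq_dist(query, point) < sq_dist(query, best):
        best = point
    diff = query[axis] - point[axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the
    # best match found so far; otherwise no point there can improve on it.
    if diff ** 2 < sq_dist(query, best):
        best = nearest(far, query, best)
    return best

random.seed(0)
data = [(random.random(), random.random()) for _ in range(200)]
tree = build_kdtree(data)
q = (0.5, 0.5)

# Brute-force search scans every point; the tree search skips most of them.
brute = min(data, key=lambda p: sq_dist(q, p))
indexed = nearest(tree, q)
print(sq_dist(q, indexed) == sq_dist(q, brute))
```

The brute-force search costs one distance computation per training point for every query, while the tree search can discard whole subtrees. In low dimensions this typically visits far fewer points, although the advantage shrinks as the dimension grows.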