1) Decision surface for two squares with Gaussian and nearest-neighbor classifiers (30 pts)
We observe four data points each for two pattern classes y=1 and y=2 in 2D (x1,
x2):
C1: (0, 0), (0, 1), (1, 0), (1, 1)
C2: (2, 3), (2, 2), (3, 2), (3, 3)
1.1) (10 pts) Assume that the two classes both have Gaussian probability
density functions and equal priors, i.e., P(y=1) = P(y=2) = 1/2.
Compute the means and covariance matrices of the two class-conditional distributions
p(x|y=1) and p(x|y=2).
Obtain the equation of the Bayes decision boundary between class 1 and class
2, and sketch it on the x1-x2 plane.
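To sanity-check your hand computation for 1.1, here is a minimal MATLAB sketch (plain base MATLAB; note that cov(X, 1) gives the maximum-likelihood 1/N estimate, while cov(X) divides by N-1):

    X1 = [0 0; 0 1; 1 0; 1 1];           % class 1 samples
    X2 = [2 3; 2 2; 3 2; 3 3];           % class 2 samples
    mu1 = mean(X1);  mu2 = mean(X2);     % sample means
    S1 = cov(X1, 1); S2 = cov(X2, 1);    % ML (1/N) covariance estimates
    % equal covariances make the Bayes boundary linear; with equal priors
    % it is the perpendicular bisector of the segment joining mu1 and mu2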
1.2) (10 pts) Does the decision boundary change if P(y=1) = 0.6, or P(y=1) = 0.2? If yes, sketch or plot the respective new boundary (boundaries).
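A hint for thinking about 1.2: with a shared isotropic covariance sigma^2*I the boundary stays linear and only slides along the line joining the two means. A sketch using the standard shared-covariance formula (reusing mu1, mu2 from the snippet above):

    p1 = 0.6;  p2 = 1 - p1;              % repeat with p1 = 0.2
    sigma2 = 0.25;                       % shared isotropic variance from 1.1
    w  = mu1 - mu2;                      % normal vector of the linear boundary
    x0 = (mu1 + mu2)/2 - sigma2/norm(w)^2 * log(p1/p2) * w;
    % boundary: w*(x - x0)' = 0; unequal priors shift the boundary along w,
    % away from the mean of the more probable class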
1.3) (10 pts) Assume we use a nearest-neighbor classifier on these eight points. Draw the Voronoi diagram on the x1-x2 plane, and find the corresponding decision boundary.
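MATLAB's built-in voronoi plot can serve as a check on your hand-drawn diagram for 1.3:

    X = [X1; X2];                        % the eight points from 1.1
    voronoi(X(:,1), X(:,2));             % built-in Voronoi plot
    axis equal; xlabel('x1'); ylabel('x2');
    % the 1-NN decision boundary is the union of Voronoi edges whose two
    % neighboring cells belong to different classes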
2) Recognizing hand-written digits (40 pts + 15 pt bonus)
Download a copy of the MNIST
dataset here. Download the example scripts either
as one .zip, or one-by-one as listed below.
Read, understand, and run the classification example hw6_example.m; it involves four implemented steps and two more steps left to you:
%% 1. load the data %%%%%%%%%%%%
%% 2. specify the training and testing data involved in this task
%    .. we're working with four digits [1 3 5 7] only
%% 3. classification with minimum distance classifiers
%    this needs mdist_learn.m and mdist_classify.m for learning and
%    applying the minimum distance classifier
%    .. this step should finish in a second or two and yield an error rate ~17%
%% 4. 1-NN classification with Euclidean metric %%%%%%%%%%
%    this needs NN_euclidean.m for performing 1-NN classification
%    .. this step should finish in about 2.5 minutes and yield an error rate ~1.5%
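Before opening mdist_learn.m and mdist_classify.m, it may help to see the idea of step 3 in miniature. The sketch below is illustrative only (Xtr, ytr, Xte are assumed variable names, not those used in the provided scripts): it assigns each test point the label of the nearest class mean.

    % Assumed names: Xtr (Ntr x D training images), ytr (labels), Xte (test images)
    classes = unique(ytr);
    M = zeros(numel(classes), size(Xtr, 2));
    for c = 1:numel(classes)
        M(c, :) = mean(Xtr(ytr == classes(c), :), 1);   % per-class mean image
    end
    D2 = zeros(size(Xte, 1), numel(classes));
    for c = 1:numel(classes)
        dif = Xte - repmat(M(c, :), size(Xte, 1), 1);
        D2(:, c) = sum(dif.^2, 2);       % squared distance to each class mean
    end
    [dmin, idx] = min(D2, [], 2);
    ypred = classes(idx);                % label of the nearest mean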
Can we do better in classifying digits than (a) the simple minimum distance classifier, or (b) the nearest neighbor?
2.1) (10 pts) Change the 1-nearest-neighbor algorithm into k-nearest-neighbor
with the L3 norm (defined here).
Run the classifier with k=3 and k=5, and report the classification error rates.
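One possible way to adapt the inner loop (again with assumed names Xtr, ytr, Xte): replace the Euclidean distance with the Minkowski p=3 distance and take a majority vote over the k nearest training points.

    k = 3;                               % also run with k = 5
    ypred = zeros(size(Xte, 1), 1);
    for i = 1:size(Xte, 1)
        dif = abs(Xtr - repmat(Xte(i, :), size(Xtr, 1), 1));
        d = sum(dif.^3, 2);              % L3 distance cubed; the cube root is
                                         % monotone, so the ranking is unchanged
        [ds, ord] = sort(d);
        ypred(i) = mode(ytr(ord(1:k)));  % majority vote (mode breaks ties low)
    end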
2.2) (10 pts) Find out which digits are misclassified by 1-NN and k-NN -- display
their images and submit them in your writeup. Are the errors reasonable? Do
the neighbor votes in k-NN correspond to the confidence about a digit?
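One way to pull up the misclassified test images, assuming ypred holds the predicted labels, yte the true labels, and each row of Xte is a 28x28 MNIST image flattened to 784 values:

    bad = find(ypred ~= yte);            % indices of misclassified digits
    for i = 1:numel(bad)
        imagesc(reshape(Xte(bad(i), :), 28, 28)');  % transpose if images look sideways
        colormap gray; axis image;
        title(sprintf('true %d, predicted %d', yte(bad(i)), ypred(bad(i))));
        pause;                           % press a key to step to the next one
    end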
2.3) (5 pts) Implement one of the following three options and report the
classification error rate: (1) PCA/KL transform on the vector, followed by
k-NN; (2) linear perceptron (with netlab); (3) SVM classifier (with libSVM).
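For option (1), a minimal PCA-then-k-NN sketch (d, the number of retained components, is a tuning choice; variable names are again illustrative):

    d = 40;                                      % number of principal components
    mu = mean(Xtr, 1);
    Xc = Xtr - repmat(mu, size(Xtr, 1), 1);      % center the training data
    [U, S, V] = svd(Xc, 'econ');                 % columns of V span the principal directions
    P = V(:, 1:d);                               % keep the top-d directions
    Ztr = Xc * P;                                % reduced training features
    Zte = (Xte - repmat(mu, size(Xte, 1), 1)) * P;
    % now run the k-NN classifier from 2.1 on Ztr/Zte instead of Xtr/Xte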
2.4) (15 pts) Discuss other possible ways of improving performance: list at
least three approaches from what has been covered in class so far, in books,
or in reference papers and websites. For each proposed approach, briefly describe
how to realize it technically and why it may be useful for this task.
2.5) (10 bonus points) Implement one of the solutions you proposed in 2.4 and
see if it indeed improves the result.
2.6) 5 more bonus points for the lowest classification error rate obtained among all submissions of question 2.5.
3) Source codes for letters and words (30 pts)
In this problem we look at a language written with half the English alphabet. The designated 13 letters appear with the following probabilities:
Letter      |  a   |  b   |  c   |  e   |  g   |  i   |  l   |  m   |  o   |  r   |  s   |  t   |  u
Probability | 0.10 | 0.05 | 0.05 | 0.15 | 0.05 | 0.10 | 0.05 | 0.05 | 0.08 | 0.08 | 0.08 | 0.08 | 0.08
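The entropy of this letter distribution lower-bounds the average length of any lossless code for it, so it is a useful number to have on hand (and a quick check that the probabilities sum to 1):

    p = [0.10 0.05 0.05 0.15 0.05 0.10 0.05 0.05 0.08 0.08 0.08 0.08 0.08];
    assert(abs(sum(p) - 1) < 1e-12);     % the 13 probabilities form a distribution
    H = -sum(p .* log2(p));              % source entropy in bits per letter
    fprintf('entropy H = %.4f bits/letter\n', H);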
Prepared by Lexing Xie < xlx at ee dot columbia dot edu >, 2008-04-21