3. Neural Networks

GMM estimates of PDFs are only one of many available classification schemes. As a comparison, we now solve the same task with a neural network (NN), specifically a multi-layer perceptron with a single hidden layer. An NN of this kind is essentially a complex nonlinear mapping whose parameters are optimized to match training data by gradient descent (the so-called back-propagation algorithm). By training it to predict the 1/0 singing label, its output comes to approximate the posterior probability that a given frame contains singing. Netlab again makes the training very easy for us:
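To make the "complex nonlinear mapping" concrete, here is a minimal NumPy sketch of the forward pass of a one-hidden-layer MLP like the one Netlab builds (tanh hidden units, logistic output, so the output lies in (0,1) and can be read as Pr(singing)). The weights here are random placeholders, not trained values; `mlp_forward` is a hypothetical helper name, not part of Netlab.

```python
import numpy as np

def logistic(a):
    """Logistic sigmoid, squashing activations into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(X, W1, b1, W2, b2):
    """One-hidden-layer MLP forward pass: tanh hidden units,
    logistic output (the rough shape of Netlab's mlp(ndim, nhid, 1, 'logistic')).
    X: (n, ndim) feature rows; returns (n, 1) outputs in (0, 1)."""
    h = np.tanh(X @ W1 + b1)        # hidden activations, (n, nhid)
    return logistic(h @ W2 + b2)    # output ~ Pr(singing | x)

# Tiny example with random (untrained) weights: ndim=2, nhid=5
rng = np.random.default_rng(0)
W1 = rng.standard_normal((2, 5)); b1 = rng.standard_normal(5)
W2 = rng.standard_normal((5, 1)); b2 = rng.standard_normal(1)
p = mlp_forward(rng.standard_normal((4, 2)), W1, b1, W2, b2)
print(p.shape)   # (4, 1)
```

Training (back-propagation) just adjusts `W1, b1, W2, b2` to minimize the error between these outputs and the 1/0 labels, which is what `netopt` does below.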

% Set up the training parameters
options = zeros(1,18);
options(9) = 1;      % Check the gradient calculations
options(14) = 10;    % Number of training cycles
nhid = 5;            % Hidden units in network - analogous to Gauss components
nout = 1;            % Single output is Pr(singing)
alpha = 0.2;         % Controls learning rate - some experimentation needed
ndim = 2;
net = mlp(ndim, nhid, nout, 'logistic', alpha);
% Training is via a generalized optimization routine
net = netopt(net, options, ftrs(:,1:2), labs, 'quasinew');

Because we're still only classifying on two dimensions, we can again sample the network output over a range of values and see what we get. We can reuse the grid defined for the GMMs:

% Run the net 'forward' on the grid points
nno = mlpfwd(net, [x(:),y(:)]);
nno = reshape(nno, 100, 100);
subplot(221)
imagesc(xx,yy,nno)
axis xy
% Notice how MLP outputs are soft planar intersections
% Compare to GMM likelihood ratio
subplot(222)
imagesc(xx,yy,log(ppS./ppM))
axis xy
% Plot the actual decision regions
subplot(223)
imagesc(xx,yy,nno>0.5);
axis xy
subplot(224)
imagesc(xx,yy,log(ppS./ppM)>0)
axis xy

We can calculate the overall accuracy on the training data as before:

% Run the net on the training data
nnd = mlpfwd(net, ftrs(:,[1 2]));
% How well does it agree with the labels?
mean( (nnd>0.5) == labs )
  ans = 0.6500       % Pretty close to simple GMMs

Finally, we can again wrap all this up in a neat parameterized function, trainnns:

% Try again with 2 dimensions and 5 hidden units, trained for 10 iterations
net = trainnns(ftrs(:,[1 2]), labs, 5, 10);
  Accuracy on training data = 65.5%
  Elapsed time = 87.5088 secs
% There's a random element in the training, so results will vary from run to run

Try to find neural networks that parallel the complexity (e.g. training time) of the GMMs you investigated before. How do they compare in terms of accuracy?


Last updated: 2003/07/02 15:40:30

Dan Ellis <dpwe@ee.columbia.edu>