November 12, 2014
Speaker: Dr. Andrew Senior, Google
Our recent work has shown that deep Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) give improved accuracy over deep neural networks for large vocabulary continuous speech recognition. I will describe our task of recognizing speech from Google Now in dozens of languages and give an overview of LSTM-RNNs for acoustic modelling. I will describe distributed Asynchronous Stochastic Gradient Descent training of LSTMs on clusters of hundreds of machines, and show improved results through sequence-discriminative training.
Andrew Senior received his PhD from the University of Cambridge, having worked on speech recognition at LIMSI at the University of Paris XI. He joined IBM Research in 1994, where he worked in the areas of handwriting, audio-visual speech, face and fingerprint recognition, as well as video privacy and visual tracking. In 2008 he taught at Columbia University before joining Google Research. He has coauthored the book "Guide to Biometrics" and over seventy scientific papers, and holds thirty-five patents. His research interests range across speech and pattern recognition, computer vision and visual art. He is an internationally exhibited multimedia artist.
Hosted by Colin Raffel.