Former PhD student Yi Luo and Professor Nima Mesgarani are the recipients of the 2021 IEEE Signal Processing Society Best Paper Award. The award honors the authors of a paper of exceptional merit dealing with a subject related to the Society’s technical scope. Papers are judged on general quality, originality, subject matter, and timeliness.
They received the award for their study “Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation,” published in 2019 in IEEE/ACM Transactions on Audio, Speech, and Language Processing.
This highly cited paper proposed a novel method for the challenging task of automatic speech separation, also known as the cocktail party problem. While humans are extremely good at picking out and focusing on one voice in a crowded acoustic scene, the problem remains difficult for machines. The study is among the first to formulate speech separation in the time domain: it first encodes the waveform of the mixed audio, then separates the individual talkers in that encoded representation, and finally converts them back into separate waveforms. Notably, on standard benchmarks this time-domain formulation outperformed what had been assumed to be the upper bound imposed by time-frequency decomposition of the audio signal. In addition to its superior accuracy, the method removed several limitations of previous approaches and enabled real-time, low-latency implementation. This made it possible to use speech separation in applications such as hearing technologies and brain-controlled hearing aids.
Yi Luo received his PhD from the Electrical Engineering department at Columbia in 2021, where he worked with Dr. Nima Mesgarani in Mesgarani's Neural Acoustic Processing Lab.
"Receiving this prestigious award is very humbling and exciting at the same time, and it reaffirms the insights we have had on how this problem should be approached," Dr. Mesgarani said.