I am a PhD student at Columbia University, working with Prof. Dan Ellis and Prof. Nima Mesgarani. I am interested in machine learning algorithms with applications in auditory source separation, speech enhancement, and automatic speech recognition. You can find my CV here
Together with Yi Luo and Prof. Nima Mesgarani, I developed a generalized version of deep clustering that allows direct end-to-end optimization for multi-speaker separation. See demo here
Together with Yi Luo, Nima Mesgarani, Jonathan Le Roux and John Hershey, I significantly improved the state-of-the-art performance in music separation with our model, the Chimera network, which achieved the best performance in the MIREX 2016 singing separation track. See demo here
Together with James O'Sullivan, Sameer Sheth, Guy McKhann, Ashesh Mehta, and Nima Mesgarani, I created this revolutionary device, which allows the patient to directly separate out target audio sources with high quality, using their attention as the guiding cue. It is the next step for the hearing aid industry
Together with Tasha Nagamine and Nima Mesgarani, I created an unsupervised model for neural network adaptation, which greatly increases the robustness of ASR systems in noisy environments
Together with Shixiong Zhang, Yong Zhao, Jinyu Li and Yifan Gong, I created an end-to-end speaker verification model, the first deep-learning-based model that can be used for both text-dependent and text-independent speaker verification.
Together with several friends of mine, I founded this company, which specializes in speech enhancement and audio source separation. I left the company in 2016
Together with John Hershey, Jonathan Le Roux, Shinji Watanabe and Yusuf Isik, I invented this revolutionary technique. It improved on the previous state-of-the-art performance by a factor of three, and it is the first to achieve high-quality separation of overlapping speech from unknown speakers (and an unknown number of speakers). See demo here
Together with researchers at MERL and SRI, I achieved the 2nd best performance in the 3rd CHiME challenge, a world-level challenge for automatic speech recognition in noisy environments
Together with researchers at MERL and BBN, I won the IARPA ASpIRE challenge, a world-level ASR evaluation in highly corrupted environments
PhD Candidate, Laboratory for the Recognition and Organization of Speech and Audio (LabROSA), GPA: 3.9
Focused on automatic speech recognition, speech enhancement, and source separation using deep learning technology and Bayesian statistical models.
M.S. in Electrical Engineering, GPA: 3.65.
Focused heavily on advanced signal processing techniques and on the foundations of statistical modeling and optimization tools for signal processing.
B.S. in Electrical Engineering, GPA: 3.4 | Minor in Economics, GPA: 3.2
Focused on the foundations of signal processing and the optimization of the power grid.
Research Intern | Bellevue, USA
FFS Team Member | Seattle, USA
Intern Researcher | Boston, USA
Research Assistant | Berkeley, USA