
CRAC

Consistent & Reliable Acoustic Cues
for sound analysis

Program

n.b. You can download all the papers along with an index in a single zip file.

Sunday September 2nd 2001

Session 1 09:00-10:30

PERCEPTION & AUDITORY MODELS

Overview Phil Green
Oral Johannes Nix & Volker Hohmann
Enhancing Sound Sources by use of Binaural Spatial Cues
Alain de Cheveigné
Generalized Correlation Network model of auditory processing
Posters Toshio Irino, R. D. Patterson, and H. Kawahara
Sound resynthesis from Auditory Mellin Image using STRAIGHT
Olivier Crouzet & W.A. Ainsworth
On the various influences of envelope information on the perception of speech in adverse conditions: An analysis of between-channel envelope correlation
Philip J.B. Jackson
Acoustic cues of voiced and voiceless plosives for determining place of articulation
Matti Karjalainen
Auditory Interpretation and Application of Warped Linear Prediction
Shuangyu Chang, Lokendra Shastri & Steven Greenberg
Robust Phonetic Feature Extraction Under a Wide Range of Noise Backgrounds and Signal-to-Noise Ratios

 

Session 2 11:00-12:30

MUSIC & GENERAL AUDIO ANALYSIS

Overview Dan Ellis
Oral Anssi Klapuri, Tuomas Virtanen, Antti Eronen & Jarno Seppänen
Automatic transcription of musical recordings
Michael Casey
Reduced-Rank Spectra and Entropic Priors as Consistent and Reliable Cues for Generalized Sound Recognition
Posters Silvia Allegro & Stefan Launer
Sound Classification in Hearing Instruments by means of Auditory Scene Analysis
Shoko Araki, Shoji Makino, Ryo Mukai & Hiroshi Saruwatari
Equivalence between Frequency Domain Blind Source Separation and Frequency Domain Adaptive Beamformers
Tomoya Narita & Masahide Sugiyama
Fast Music Retrieval using Spectrum and Power Information
Shin'ichi Takeuchi, Masaki Yamashita, Takayuki Uchida, Masahide Sugiyama
Optimization of Voice/Music Detection in Sound Data
Masataka Goto
A Predominant-F0 Estimation Method for Real-world Musical Audio Signals: MAP Estimation for Incorporating Prior Knowledge about F0s and Tone Models
Daniel P.W. Ellis
Detecting alarm sounds

 

Session 3 14:00-15:30

MISSING-DATA SPEECH RECOGNITION

Overview Martin Cooke
Oral Jon Barker, Martin Cooke & Dan Ellis
Integrating bottom-up and top-down constraints to achieve robust ASR: The multisource decoder
Philippe Renevey & Andrzej Drygajlo
Detection of Reliable Features for Speech Recognition in Noisy Conditions Using a Statistical Criterion
Posters Kalle Palomäki, Guy Brown & DeLiang Wang
A Binaural Model for Missing Data Speech Recognition in Noisy and Reverberant Conditions
Juha Häkkinen & Hemmo Haverinen
On the Use of Missing Feature Theory with Cepstral Features
A. C. Morris
Data Utility Modelling for Mismatch Reduction
Hervé Glotin
Robust multi-stream speech recognition based on the combined reliabilities of the speech signal (voicing cue) and phonemes estimates using a bias prediction
Stéphane Dupont & Christophe Ris
Multiband with contaminated training data
Bhiksha Raj, Michael L. Seltzer & Richard M. Stern
Robust Speech Recognition using Missing Features: the Case for Restoring Missing Input Features

 

Session 4 16:00-17:30

APPROACHES TO HANDLING NOISY SPEECH

Overview Hiroshi G Okuno
Oral Emmanuel Tessier & Frédéric Berthommier
Speech enhancement and segregation based on the localisation cue for cocktail-party processing
Ikuyo Masuda-Katsuse & Yoshimori Sugano
Speech estimation biased by phonemic expectation in the presence of non-stationary and unpredictable noise
Posters Laurens van de Werff, Johan de Veth, Bert Cranen & Louis Boves
Analysis of Disturbed Acoustic Features in terms of Emission Cost
Yuichi ISHIMOTO, Masashi UNOKI & Masato AKAGI
A fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency
Joan Marí, José Manuel Ferrer & Fritz Class
Evaluation of Robust Feature Extraction and Acoustic Modelling algorithms/systems by interfacing ASR systems
Hiroshi G. Okuno, Kazuhiro Nakadai & Hiroaki Kitano
Effects of increasing modalities in understanding three simultaneous speeches with two microphones
Yasuhiro Minami, Erik McDermott, Atsushi Nakamura & Shigeru Katagiri
A recognition method using synthesis-based scoring that incorporates direct relations between static and dynamic feature vector time series


last update: Mon Sep 24 19:15:56 EDT 2001