CRAC: One-day workshop on Consistent and Reliable Acoustic Cues

Program

Consistent & Reliable Acoustic Cues
for sound analysis

n.b. You can download all the papers along with an index in a single zip file.

Sunday September 2nd 2001

Session 1 09:00-10:30

PERCEPTION & AUDITORY MODELS

Overview Phil Green

Oral Johannes Nix & Volker Hohmann
Enhancing Sound Sources by use of Binaural Spatial Cues

Alain de Cheveigné
Generalized Correlation Network model of auditory processing

Posters Toshio Irino, R. D. Patterson, and H. Kawahara
Sound resynthesis from Auditory Mellin Image using STRAIGHT

Olivier Crouzet & W.A. Ainsworth
On the various influences of envelope information on the perception of speech in adverse conditions: An analysis of between-channel envelope correlation

Philip J.B. Jackson
Acoustic cues of voiced and voiceless plosives for determining place of articulation

Matti Karjalainen
Auditory Interpretation and Application of Warped Linear Prediction

Shuangyu Chang, Lokendra Shastri & Steven Greenberg
Robust Phonetic Feature Extraction Under a Wide Range of Noise Backgrounds and Signal-to-Noise Ratios

Session 2 11:00-12:30

MUSIC & GENERAL AUDIO ANALYSIS

Overview Dan Ellis

Oral Anssi Klapuri, Tuomas Virtanen, Antti Eronen & Jarno Seppänen
Automatic transcription of musical recordings

Michael Casey
Reduced-Rank Spectra and Entropic Priors as Consistent and Reliable Cues for Generalized Sound Recognition

Posters Silvia Allegro & Stefan Launer
Sound Classification in Hearing Instruments by means of Auditory Scene Analysis

Shoko Araki, Shoji Makino, Ryo Mukai & Hiroshi Saruwatari
Equivalence between Frequency Domain Blind Source Separation and Frequency Domain Adaptive Beamformers

Tomoya Narita & Masahide Sugiyama
Fast Music Retrieval using Spectrum and Power Information

Shin'ichi Takeuchi, Masaki Yamashita, Takayuki Uchida, Masahide Sugiyama
Optimization of Voice/Music Detection in Sound Data

Masataka Goto
A Predominant-F0 Estimation Method for Real-world Musical Audio Signals: MAP Estimation for Incorporating Prior Knowledge about F0s and Tone Models

Daniel P.W. Ellis
Detecting alarm sounds

Session 3 14:00-15:30

MISSING-DATA SPEECH RECOGNITION

Overview Martin Cooke

Oral Jon Barker, Martin Cooke & Dan Ellis
Integrating bottom-up and top-down constraints to achieve robust ASR: The multisource decoder

Philippe Renevey & Andrzej Drygajlo
Detection of Reliable Features for Speech Recognition in Noisy Conditions Using a Statistical Criterion

Posters Kalle Palomäki, Guy Brown & DeLiang Wang
A Binaural Model for Missing Data Speech Recognition in Noisy and Reverberant Conditions

Juha Häkkinen & Hemmo Haverinen
On the Use of Missing Feature Theory with Cepstral Features

A. C. Morris
Data Utility Modelling for Mismatch Reduction

Hervé Glotin
Robust multi-stream speech recognition based on the combined reliabilities of the speech signal (voicing cue) and phonemes estimates using a bias prediction

Stéphane Dupont & Christophe Ris
Multiband with contaminated training data

Bhiksha Raj, Michael L. Seltzer & Richard M. Stern
Robust Speech Recognition using Missing Features: the Case for Restoring Missing Input Features

Session 4 16:00-17:30

APPROACHES TO HANDLING NOISY SPEECH

Overview Hiroshi G Okuno

Oral Emmanuel Tessier & Frédéric Berthommier
Speech enhancement and segregation based on the localisation cue for cocktail-party processing

Ikuyo Masuda-Katsuse & Yoshimori Sugano
Speech estimation biased by phonemic expectation in the presence of non-stationary and unpredictable noise

Posters Laurens van de Werff, Johan de Veth, Bert Cranen & Louis Boves
Analysis of Disturbed Acoustic Features in terms of Emission Cost

Yuichi ISHIMOTO, Masashi UNOKI & Masato AKAGI
A fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency

Joan Marí, José Manuel Ferrer & Fritz Class
Evaluation of Robust Feature Extraction and Acoustic Modelling algorithms/systems by interfacing ASR systems

Hiroshi G. Okuno, Kazuhiro Nakadai & Hiroaki Kitano
Effects of increasing modalities in understanding three simultaneous speeches with two microphones

Yasuhiro Minami, Erik McDermott, Atsushi Nakamura & Shigeru Katagiri
A recognition method using synthesis-based scoring that incorporates direct relations between static and dynamic feature vector time series


last update: Mon Sep 24 19:15:56 EDT 2001
