Dan Ellis: Research
Projects:
Alarm Sound Detection
Alarms are a very important class of sounds that we encounter in everyday
life. They have been specially designed to be distinctive, to attract attention,
and to be easily identifiable as alarms even to listeners who have never
heard them before.
The problem of automatically detecting alarm sounds is interesting for
a number of reasons. Firstly, it will involve a study of the alarm sounds
themselves, which could lead to a better understanding of what makes a good
alarm sound. Secondly, it could lead to valuable applications, for instance
a portable device that could alert a hearing-impaired person to an alarm
that might otherwise go unnoticed. Thirdly, it is a relatively tractable
example of a much larger class of problems that we would like to tackle:
the separation and identification of particular sound sources in real-world
environment mixtures (i.e. the problem domain often referred to as
computational auditory scene analysis, or CASA).
Because alarms are meant to be easily heard, we can be optimistic that,
in favorable circumstances, we can develop automatic algorithms to detect
them. The more significant question concerns the kind of discrimination
that can be achieved between weaker alarm sounds in noisy backgrounds, and
false alarms arising from the noise. In order to be useful, any kind of
automatic device would need to approach the performance of a normal-hearing
listener, which is a high standard.
Some preliminary investigations into this problem have been conducted, as
illustrated in the figure above, which shows a telephone ring being detected
against a jazz recording playing in the background. However, a more careful
and thorough investigation is required. The project will involve:
- Corpus collection: Making recordings of a large variety of alarm
sounds in a large variety of the real-world scenarios where we might want
to detect them. Our hypothesis is that many alarms have significant features
in common that prevent them from being confused with other sounds; we would
like to identify these properties in order to build a detector that recognizes
alarms generically, rather than being limited to a previously-known set
(although on detection each alarm sound could be classified according to
some known models). Part of this stage will be defining a list of alarm
sounds that we wish to include, and then making real environmental recordings
of the alarms in action. To be useful for training the detector, this database
will need to be labelled, probably by hand, to indicate the particular
instants of alarm sounds, and perhaps to identify each alarm for later
classifier training.
- Detection algorithm: Given a set of examples (and counterexamples
arising from the gaps between the examples), we can develop algorithms
to detect the alarm instances. Beyond applying standard classification and
statistical learning techniques, the core of this task is finding the
representation (or representations) that most effectively reveals the features
that make alarms resemble one another and differ from other sound classes.
- Evaluation: Given an algorithm that works on some development
examples, the scientifically correct next stage is to make a quantitative
evaluation of its performance on a separate set of test examples, to get
an idea of how the algorithm would behave if taken into the 'real world'.
This may reveal certain weaknesses in the representation that will lead
to further development of the detection algorithm. In order to justify
any task-specific aspects of the detection algorithm, its performance should
be compared to a 'neutral' baseline, e.g. a simple classifier (perhaps a
neural net) trained on a popular acoustic feature representation such as
Mel-frequency cepstral coefficients.
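As a rough illustration of how the hand-labelled corpus could feed the
quantitative evaluation, the sketch below turns labelled alarm intervals
into per-frame targets (the gaps between alarms become the counterexamples)
and scores a detector's per-frame decisions by hit rate and false-alarm
rate. This is only one possible scoring scheme, not a description of the
project's actual method: the function names, the fixed frame hop, and the
choice of frame-level scoring are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def intervals_to_frames(intervals: Sequence[Tuple[float, float]],
                        duration: float,
                        hop: float = 0.01) -> List[int]:
    """Convert labelled (start, end) alarm intervals, in seconds, into
    per-frame 0/1 targets. `hop` is an assumed frame step in seconds."""
    n_frames = int(round(duration / hop))
    labels = [0] * n_frames
    for start, end in intervals:
        first = max(0, int(round(start / hop)))
        last = min(n_frames, int(round(end / hop)))
        for i in range(first, last):
            labels[i] = 1  # frame lies inside a labelled alarm
    return labels


def detection_rates(truth: Sequence[int],
                    predicted: Sequence[int]) -> Tuple[float, float]:
    """Return (hit rate, false-alarm rate) for per-frame 0/1 decisions."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    hits = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    false_alarms = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    n_alarm = sum(truth)
    n_background = len(truth) - n_alarm
    hit_rate = hits / n_alarm if n_alarm else 0.0
    false_alarm_rate = false_alarms / n_background if n_background else 0.0
    return hit_rate, false_alarm_rate


# Toy example: a 1-second recording with one alarm from 0.2 s to 0.5 s,
# scored at an (unrealistically coarse) 0.1 s hop.
truth = intervals_to_frames([(0.2, 0.5)], duration=1.0, hop=0.1)
predicted = [0, 0, 1, 1, 0, 0, 0, 1, 0, 0]  # made-up detector output
hit_rate, false_alarm_rate = detection_rates(truth, predicted)
```

Scoring the task-specific detector and the 'neutral' baseline with the same
two numbers on the same held-out test recordings would make the comparison
direct; sweeping a detection threshold would trace out the trade-off between
missed alarms and false alarms.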
This is a neat little project. Assuming this part works out, and we end
up with a useful alarm detection system, there are numerous possible follow-ons.
I am most interested in generalizing the techniques used to isolate alarm
sounds from their background to work for other sound sources in real acoustic
environments.
Last updated: $Date: 2000/12/11 17:17:41 $
Dan Ellis <[email protected]>