A Bayesian network for time-frequency speech modeling and recognition

Khalid Daoudi, Dominique Fohr, and Christophe Antoine (2001)
In: International Conference on Artificial Intelligence and Soft Computing, 5 pages

Abstract

In this paper, we propose a new speech model: a Bayesian network (BN) built in the time-frequency domain. Unlike HMMs, this BN models the frequency dynamics well, particularly the asynchrony between sub-bands. Our experiments show that, as a result, speech is modeled with higher fidelity. Moreover, the new model makes it possible to perform multi-band speech recognition without all the drawbacks of the usual multi-band approach, in which each sub-band is modeled independently by an HMM. This makes our model well suited to cases where speech is corrupted by band-limited noise. We present experiments on an isolated digit recognition task, in clean and noisy conditions. The results show that the BN framework is very promising for speech modeling and recognition.

BibTeX Reference

@inproceedings{daoudi:inria-00100524,
 abstract = {In this paper, we propose a new speech model: a Bayesian network (BN) built in the time-frequency domain. Unlike HMMs, this BN models the frequency dynamics well, particularly the asynchrony between sub-bands. Our experiments show that, as a result, speech is modeled with higher fidelity. Moreover, the new model makes it possible to perform multi-band speech recognition without {\it all} the drawbacks of the usual multi-band approach, in which each sub-band is modeled independently by an HMM. This makes our model well suited to cases where speech is corrupted by band-limited noise. We present experiments on an isolated digit recognition task, in clean and noisy conditions. The results show that the BN framework is very promising for speech modeling and recognition.},
 address = {Cancun, Mexico},
 author = {Daoudi, Khalid and Fohr, Dominique and Antoine, Christophe},
 booktitle = {{International Conference on Artificial Intelligence and Soft Computing}},
 hal_id = {inria-00100524},
 hal_local_reference = {A01-R-197 || daoudi01a},
 hal_version = {v1},
 keywords = {bayesian networks ; reconnaissance de la parole ; speech recognition ; r{\'e}seaux bay{\'e}siens},
 month = {May},
 note = {International conference with proceedings and peer review.},
 pages = {5 p},
 title = {{A Bayesian network for time-frequency speech modeling and recognition}},
 url = {https://hal.inria.fr/inria-00100524},
 year = {2001}
}