Dynamic Bayesian Networks for multi-band automatic speech recognition

Khalid Daoudi, Dominique Fohr, and Christophe Antoine (2003)
in: Computer Speech and Language, 17:2-3 (263-285)

Abstract

This paper presents a new approach to multi-band automatic speech recognition that overcomes many limitations of classical multi-band systems. The principle of this approach is to build a speech model in the time-frequency domain using the formalism of dynamic Bayesian networks. In contrast to classical multi-band modeling, this formalism leads to a probabilistic speech model that allows communication between the different sub-bands; consequently, no recombination step is required during recognition. We develop efficient learning and decoding algorithms for both isolated and continuous speech recognition. We present illustrative experiments on isolated and connected digit recognition tasks. These experiments show that this new approach is very promising for noisy speech recognition.

BibTeX Reference

@article{daoudi:inria-00099530,
 abstract = {This paper presents a new approach to multi-band automatic speech recognition that overcomes many limitations of classical multi-band systems. The principle of this approach is to build a speech model in the time-frequency domain using the formalism of dynamic Bayesian networks. In contrast to classical multi-band modeling, this formalism leads to a probabilistic speech model that allows communication between the different sub-bands; consequently, no recombination step is required during recognition. We develop efficient learning and decoding algorithms for both isolated and continuous speech recognition. We present illustrative experiments on isolated and connected digit recognition tasks. These experiments show that this new approach is very promising for noisy speech recognition.},
 author = {Daoudi, Khalid and Fohr, Dominique and Antoine, Christophe},
 doi = {10.1016/S0885-2308(03)00011-1},
 hal_id = {inria-00099530},
 hal_local_reference = {A02-R-278 || daoudi02b},
 hal_version = {v1},
 journal = {{Computer Speech and Language}},
 keywords = {Bayesian networks ; Speech recognition ; Reconnaissance de la parole ; R{\'e}seaux bay{\'e}siens},
 note = {Article in a peer-reviewed scientific journal.},
 number = {2-3},
 pages = {263--285},
 pdf = {https://hal.inria.fr/inria-00099530/file/00099530.pdf},
 publisher = {{Elsevier}},
 title = {{Dynamic Bayesian Networks for multi-band automatic speech recognition}},
 url = {https://hal.inria.fr/inria-00099530},
 volume = {17},
 year = {2003}
}