Automatic Detection of Disfluency Boundaries in Spontaneous Speech of Children Using Audio-Visual Information
Date
2009
Publisher
Ieee-Inst Electrical Electronics Engineers Inc
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
The presence of disfluencies in spontaneous speech, while posing a challenge for robust automatic recognition, also offers a means of gaining additional insight into a speaker's communicative and cognitive state. This paper analyzes disfluencies in children's spontaneous speech in the context of spoken-dialog-based computer game play and addresses the automatic detection of disfluency boundaries. Although several approaches have been proposed to detect disfluencies in speech, relatively little work has used visual information to improve the performance and robustness of disfluency detection systems. This paper describes the use of visual information along with prosodic and language information to detect the presence of disfluencies in a child's computer-directed speech and shows how these information sources can be integrated to increase the overall information available for disfluency detection. The experimental results on our children's multimodal dialog corpus indicate that a disfluency detection accuracy of over 80% can be obtained by utilizing audio-visual information. Specifically, the results show that adding visual information to prosody and language features yields relative improvements in disfluency detection error rates of 3.6% and 6.3% for information fusion at the feature level and decision level, respectively.
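The abstract contrasts information fusion at the feature level with fusion at the decision level. The sketch below illustrates only that general distinction; it is not the paper's method. The logistic scorer, the function names, and every feature value and weight are hypothetical placeholders chosen for illustration, since the record does not state which classifiers or features the authors used.

```python
# Illustrative sketch only: the scorer, names, weights, and feature values
# are hypothetical and are not taken from the paper.
from math import exp


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def linear_score(features, weights, bias=0.0):
    """Toy logistic scorer standing in for any trained classifier."""
    return sigmoid(sum(f * w for f, w in zip(features, weights)) + bias)


def feature_level_fusion(prosodic, lexical, visual, weights, bias=0.0):
    """Concatenate the modality features, then apply a single classifier."""
    joint = list(prosodic) + list(lexical) + list(visual)
    return linear_score(joint, weights, bias)


def decision_level_fusion(posteriors, modality_weights):
    """Combine per-modality posteriors with a normalized weighted average."""
    total = sum(modality_weights)
    return sum(p * w for p, w in zip(posteriors, modality_weights)) / total


# Hypothetical features for one candidate word boundary.
prosodic = [0.8, 0.1]   # e.g. pause length, pitch change (assumed)
lexical = [1.0]         # e.g. repeated-word indicator (assumed)
visual = [0.6, 0.3]     # e.g. head or gaze movement cues (assumed)

# Feature level: one classifier sees all five features at once.
p_feature = feature_level_fusion(
    prosodic, lexical, visual,
    weights=[1.2, -0.4, 0.9, 0.7, 0.2], bias=-1.0,
)

# Decision level: one classifier per modality, outputs combined afterwards.
p_prosody = linear_score(prosodic, [1.2, -0.4], bias=-0.3)
p_lexical = linear_score(lexical, [0.9], bias=-0.3)
p_visual = linear_score(visual, [0.7, 0.2], bias=-0.3)
p_decision = decision_level_fusion(
    [p_prosody, p_lexical, p_visual], modality_weights=[0.4, 0.4, 0.2]
)

# A boundary is flagged when the fused probability exceeds a threshold.
is_boundary_feature = p_feature > 0.5
is_boundary_decision = p_decision > 0.5
```

In this sketch, feature-level fusion lets a single model weigh cross-modal combinations directly, while decision-level fusion keeps the modality models separate and only merges their outputs.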
Keywords
Disfluency detection, feature selection, information fusion, spontaneous children speech, spoken language processing
Source
IEEE Transactions on Audio, Speech, and Language Processing
WoS Q Value
Q1
Scopus Q Value
N/A
Volume
17
Issue
1