Automatic Detection of Disfluency Boundaries in Spontaneous Speech of Children Using Audio-Visual Information

Yükleniyor...
Küçük Resim

Tarih

2009

Dergi Başlığı

Dergi ISSN

Cilt Başlığı

Yayıncı

Ieee-Inst Electrical Electronics Engineers Inc

Erişim Hakkı

info:eu-repo/semantics/closedAccess

Özet

The presence of disfluencies in spontaneous speech, while poses a challenge for robust automatic recognition, also offers means for gaining additional insights into understanding a speaker's communicative and cognitive state. This paper analyzes disfluencies in children's spontaneous speech, in the context of spoken dialog based computer game play, and addresses the automatic detection of disfluency boundaries. Although several approaches have been proposed to detect disfluencies in speech, relatively little work has been done to utilize visual information to improve the performance and robustness of the disfluency detection system. This paper describes the use of visual information along with prosodic and language information to detect the presence of disfluencies in a child's computer-directed speech and shows how these information sources can be integrated to increase the overall information available for disfluency detection. The experimental results on our children's multimodal dialog corpus indicate that disfluency detection accuracy of over 80% can be obtained by utilizing audio-visual information. Specifically, results showed that the addition of visual information to prosody and language features yield relative improvements in disfluency detection error rates of 3.6% and 6.3%, respectively, for information fusion at the feature level and decision level.

Açıklama

Anahtar Kelimeler

Disfluency detection, feature selection, information fusion, spontaneous children speech, spoken language processing

Kaynak

Ieee Transactions on Audio Speech and Language Processing

WoS Q Değeri

Q1

Scopus Q Değeri

N/A

Cilt

17

Sayı

1

Künye