Automatic Detection of Disfluency Boundaries in Spontaneous Speech of Children Using Audio-Visual Information

dc.authoridYILDIRIM, Serdar/0000-0003-3151-9916
dc.contributor.authorYildirim, Serdar
dc.contributor.authorNarayanan, Shrikanth
dc.date.accessioned2024-09-18T20:15:15Z
dc.date.available2024-09-18T20:15:15Z
dc.date.issued2009
dc.departmentHatay Mustafa Kemal Üniversitesien_US
dc.description.abstractThe presence of disfluencies in spontaneous speech, while posing a challenge for robust automatic recognition, also offers a means of gaining additional insight into a speaker's communicative and cognitive state. This paper analyzes disfluencies in children's spontaneous speech, in the context of spoken-dialog-based computer game play, and addresses the automatic detection of disfluency boundaries. Although several approaches have been proposed to detect disfluencies in speech, relatively little work has been done to utilize visual information to improve the performance and robustness of the disfluency detection system. This paper describes the use of visual information along with prosodic and language information to detect the presence of disfluencies in a child's computer-directed speech and shows how these information sources can be integrated to increase the overall information available for disfluency detection. The experimental results on our children's multimodal dialog corpus indicate that disfluency detection accuracy of over 80% can be obtained by utilizing audio-visual information. Specifically, results showed that the addition of visual information to prosody and language features yields relative improvements in disfluency detection error rates of 3.6% and 6.3%, respectively, for information fusion at the feature level and decision level.en_US
dc.description.sponsorshipNational Science Foundation (NSF) [EEC-9529152]; Integrated Media Systems Center; CAREER; Department of the Army [DAAD 19-99-D-0046]; USC; Direct For Computer & Info Scie & Enginr; Div Of Information & Intelligent Systems [0803565] Funding Source: National Science Foundationen_US
dc.description.sponsorshipManuscript received July 13, 2007; revised October 15, 2008. Current version published December 11, 2008. This work was supported in part by the National Science Foundation (NSF) through the Integrated Media Systems Center, an NSF Engineering Research Center, Cooperative Agreement under Contract EEC-9529152, a CAREER award, and the Department of the Army under Contract DAAD 19-99-D-0046. The work of S. Yildirim was supported in part by the National Science Foundation, a USC Zumberge Interdisciplinary Research award, and a USC Annenberg Communications Critical Pathway Fellowship. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Helen Meng.en_US
dc.identifier.doi10.1109/TASL.2008.2006728
dc.identifier.endpage12en_US
dc.identifier.issn1558-7916
dc.identifier.issn1558-7924
dc.identifier.issue1en_US
dc.identifier.scopus2-s2.0-70350442414en_US
dc.identifier.scopusqualityN/Aen_US
dc.identifier.startpage2en_US
dc.identifier.urihttps://doi.org/10.1109/TASL.2008.2006728
dc.identifier.urihttps://hdl.handle.net/20.500.12483/9534
dc.identifier.volume17en_US
dc.identifier.wosWOS:000262327000002en_US
dc.identifier.wosqualityQ1en_US
dc.indekslendigikaynakWeb of Scienceen_US
dc.indekslendigikaynakScopusen_US
dc.language.isoenen_US
dc.publisherIEEE-Inst Electrical Electronics Engineers Incen_US
dc.relation.ispartofIEEE Transactions on Audio, Speech, and Language Processingen_US
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Memberen_US
dc.rightsinfo:eu-repo/semantics/closedAccessen_US
dc.subjectDisfluency detectionen_US
dc.subjectfeature selectionen_US
dc.subjectinformation fusionen_US
dc.subjectspontaneous children speechen_US
dc.subjectspoken language processingen_US
dc.titleAutomatic Detection of Disfluency Boundaries in Spontaneous Speech of Children Using Audio-Visual Informationen_US
dc.typeArticleen_US

Files

Original bundle
Name: Tam Metin / Full Text
Size: 599.63 KB
Format: Adobe Portable Document Format