Automatic Detection of Disfluency Boundaries in Spontaneous Speech of Children Using Audio-Visual Information
dc.authorid | YILDIRIM, Serdar/0000-0003-3151-9916 | |
dc.contributor.author | Yildirim, Serdar | |
dc.contributor.author | Narayanan, Shrikanth | |
dc.date.accessioned | 2024-09-18T20:15:15Z | |
dc.date.available | 2024-09-18T20:15:15Z | |
dc.date.issued | 2009 | |
dc.department | Hatay Mustafa Kemal Üniversitesi | en_US |
dc.description.abstract | The presence of disfluencies in spontaneous speech, while posing a challenge for robust automatic recognition, also offers a means of gaining additional insight into a speaker's communicative and cognitive state. This paper analyzes disfluencies in children's spontaneous speech, in the context of spoken-dialog-based computer game play, and addresses the automatic detection of disfluency boundaries. Although several approaches have been proposed to detect disfluencies in speech, relatively little work has been done to utilize visual information to improve the performance and robustness of the disfluency detection system. This paper describes the use of visual information along with prosodic and language information to detect the presence of disfluencies in a child's computer-directed speech and shows how these information sources can be integrated to increase the overall information available for disfluency detection. The experimental results on our children's multimodal dialog corpus indicate that disfluency detection accuracy of over 80% can be obtained by utilizing audio-visual information. Specifically, results showed that the addition of visual information to prosody and language features yields relative improvements in disfluency detection error rates of 3.6% and 6.3%, respectively, for information fusion at the feature level and decision level. | en_US |
dc.description.sponsorship | National Science Foundation (NSF) [EEC-9529152]; Integrated Media Systems Center; CAREER; Department of the Army [DAAD 19-99-D-0046]; USC; Directorate for Computer & Information Science & Engineering; Division of Information & Intelligent Systems [0803565] Funding Source: National Science Foundation | en_US |
dc.description.sponsorship | Manuscript received July 13, 2007; revised October 15, 2008. Current version published December 11, 2008. This work was supported in part by the National Science Foundation (NSF) through the Integrated Media Systems Center, an NSF Engineering Research Center, Cooperative Agreement under Contract EEC-9529152, a CAREER award, and the Department of the Army under Contract DAAD 19-99-D-0046. The work of S. Yildirim was supported in part by the National Science Foundation, a USC Zumberge Interdisciplinary Research award, and a USC Annenberg Communications Critical Pathway Fellowship. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Helen Meng. | en_US |
dc.identifier.doi | 10.1109/TASL.2008.2006728 | |
dc.identifier.endpage | 12 | en_US |
dc.identifier.issn | 1558-7916 | |
dc.identifier.issn | 1558-7924 | |
dc.identifier.issue | 1 | en_US |
dc.identifier.scopus | 2-s2.0-70350442414 | en_US |
dc.identifier.scopusquality | N/A | en_US |
dc.identifier.startpage | 2 | en_US |
dc.identifier.uri | https://doi.org/10.1109/TASL.2008.2006728 | |
dc.identifier.uri | https://hdl.handle.net/20.500.12483/9534 | |
dc.identifier.volume | 17 | en_US |
dc.identifier.wos | WOS:000262327000002 | en_US |
dc.identifier.wosquality | Q1 | en_US |
dc.indekslendigikaynak | Web of Science | en_US |
dc.indekslendigikaynak | Scopus | en_US |
dc.language.iso | en | en_US |
dc.publisher | IEEE-Inst Electrical Electronics Engineers Inc | en_US |
dc.relation.ispartof | IEEE Transactions on Audio, Speech, and Language Processing | en_US |
dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | Disfluency detection | en_US |
dc.subject | feature selection | en_US |
dc.subject | information fusion | en_US |
dc.subject | spontaneous children speech | en_US |
dc.subject | spoken language processing | en_US |
dc.title | Automatic Detection of Disfluency Boundaries in Spontaneous Speech of Children Using Audio-Visual Information | en_US |
dc.type | Article | en_US |