Who Said What? An Automated Approach to Analyzing Speech in Preschool Classrooms
CoRR (2024)
Abstract
Young children spend substantial portions of their waking hours in noisy
preschool classrooms. In these environments, children's vocal interactions with
teachers are critical contributors to their language outcomes, but manually
transcribing these interactions is prohibitive. Using audio from child- and
teacher-worn recorders, we propose an automated framework that uses open source
software both to classify speakers (ALICE) and to transcribe their utterances
(Whisper). We compare results from our framework to those from a human expert
for 110 minutes of classroom recordings, including 85 minutes from child-worn
microphones (n=4 children) and 25 minutes from teacher-worn microphones (n=2
teachers). The overall proportion of agreement, that is, the proportion of
correctly classified teacher and child utterances, was .76, with an
error-corrected kappa of .50 and a weighted F1 of .76. The word error rate for
both teacher and child transcriptions was .15, meaning that 15% of words would
need to be deleted, added, or changed to equate the Whisper and expert
transcriptions. Moreover, speech features such as the mean length of utterances
in words, the proportion of teacher and child utterances that were questions,
and the proportion of utterances that were responded to within 2.5 seconds were
similar when calculated separately from expert and automated transcriptions.
The results suggest substantial progress in analyzing classroom speech that may
support children's language development. Future research using natural language
processing is underway to improve speaker classification and to analyze results
from the application of the automated framework to a larger dataset
containing classroom recordings from 13 children and 4 teachers observed on 17
occasions over one year.
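
The agreement and transcription metrics reported above can be illustrated with a minimal sketch. It assumes utterances are already aligned one-to-one between the expert and the automated output, and it uses simple lowercase whitespace tokenization. The function names and toy inputs are hypothetical and are not taken from the paper's pipeline, which may compute these measures differently.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # Dynamic-programming edit distance over words, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def proportion_agreement(expert: list, automated: list) -> float:
    """Share of utterances given the same speaker label by both sources."""
    if not expert or len(expert) != len(automated):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(e == a for e, a in zip(expert, automated)) / len(expert)


def cohens_kappa(expert: list, automated: list) -> float:
    """Agreement corrected for the agreement expected by chance."""
    n = len(expert)
    observed = proportion_agreement(expert, automated)
    expert_counts, auto_counts = Counter(expert), Counter(automated)
    labels = set(expert) | set(automated)
    expected = sum(expert_counts[l] * auto_counts[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy data (invented for illustration only).
expert_labels = ["teacher", "teacher", "child", "child"]
auto_labels = ["teacher", "child", "child", "child"]

wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
agreement = proportion_agreement(expert_labels, auto_labels)
kappa = cohens_kappa(expert_labels, auto_labels)
```

On the toy data, one substituted word out of six reference words gives a word error rate of about .17, and three matching labels out of four give an agreement of .75 with a kappa of .50.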