The Clore Center for Biological Physics
Structure in Prosody
Lunch at 12:45
Prosody is, broadly, the variation in pitch, timing, and loudness that gives speech its musical quality. It is pivotal in human communication, yet its structure and meaning remain subjects of ongoing research. I will describe a data-driven model of English prosody based on large-scale analysis of spontaneous conversations. As a first step, we identified approximately 200 discernible prosodic patterns, i.e., pitch contours typically spanning 1-4 words that we view as the building blocks of a prosodic vocabulary, and outlined their properties and communicative meanings. Next, we revealed a Markovian logic, akin to a syntax, that affects how these elementary building blocks concatenate into coherent utterances. We further identified distinct compound functions associated with pairs of consecutive patterns and demonstrated that this Markovian structure is significantly more prevalent in spontaneous prosody than in scripted speech. These findings offer insights into the mechanisms underlying conversational prosody, empirically informing and refining existing theoretical concepts in linguistics. The methodology, which combines unsupervised clustering of large speech datasets with careful manual annotation, could guide future research aimed at refining our model and extending it to other languages.
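
As a rough illustration of what a "Markovian" structure over prosodic patterns means, the sketch below estimates first-order transition probabilities from sequences of pattern labels. It is not the speaker's method: the function name `transition_probabilities`, the labels "A", "B", "C", and the toy utterances are all hypothetical, and it assumes each utterance has already been segmented into a sequence of discrete pattern labels.

```python
from collections import Counter, defaultdict


def transition_probabilities(utterances):
    """Estimate P(next pattern | previous pattern) from label sequences.

    `utterances` is a list of sequences of pattern labels (hypothetical
    input format; one sequence per utterance).
    """
    counts = defaultdict(Counter)
    for seq in utterances:
        # Count each pair of consecutive patterns within an utterance.
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1

    probs = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        # Normalize the counts for each preceding pattern into probabilities.
        probs[prev] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Made-up label sequences, for illustration only.
utterances = [
    ["A", "B", "C"],
    ["A", "B", "A"],
    ["B", "C"],
]
probs = transition_probabilities(utterances)
```

In this toy input, "A" is always followed by "B", while "B" is followed by "C" in two of three cases. If the next pattern depends on the previous one, as the abstract describes, rows of such a table will differ from each pattern's overall frequency.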