The Role of Prosody in Linguistic Communication

This lecture examines the role of prosody in paralinguistic communication: conveying emotion, revealing health conditions such as depression through speech patterns, and identifying speakers. Prosodic features such as speech rate, pitch, and energy carry this information alongside the words themselves.

  • Prosody
  • Linguistic Communication
  • Emotions
  • Speaker Identification

Presentation Transcript


  1. Prosody Lecture 20: Paralinguistic Prosody. Nigel G. Ward, University of Texas at El Paso; Gina-Anne Levow, University of Washington. Tutorial presented at ACL 2021.

  2. Linguistic Communication / Paralinguistic Communication (image adapted from freesvg.org)

  3. Linguistic Communication / Paralinguistic Communication: audio communication without language; prosody has a big role. (Images: adapted from freesvg.org; Charles Darwin 1872, via Wikimedia Commons)

  4. Emotion Correlations
     Feature        Anger             Happy             Sad                Fear              Disgust
     Speech Rate    Slightly faster   Faster or slower  Slightly slower    Much faster       Very much slower
     Pitch Average  Very much higher  Much higher       Slightly lower     Very much higher  Very much lower
     Pitch Range    Much wider        Much wider        Slightly narrower  Much wider        Slightly wider
     Energy         Higher            Higher            Lower              Normal            Lower
     NB: these prosodic features are whole-utterance averages.

  5. Emotion Correlations (source): Murray, I.R. and Arnott, J.L., 1995. Implementation and testing of a system for producing emotion-by-rule in synthetic speech. Speech Communication, 16(4), pp.369-390.
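
The note that these prosodic features are whole-utterance averages can be illustrated with a minimal sketch. It assumes that frame-level pitch and energy values have already been extracted by some tracker, and that a syllable count and a duration are available. The function name `utterance_features`, the input layout, and the convention that 0 marks an unvoiced frame are all hypothetical, and pitch range is taken here as the max-min spread over voiced frames, which is only one of several possible definitions.

```python
from statistics import mean

def utterance_features(f0_hz, energy, n_syllables, duration_s):
    """Collapse frame-level tracks into whole-utterance prosodic features.

    f0_hz: per-frame pitch in Hz, with 0 marking unvoiced frames (assumed convention).
    energy: per-frame energy values, in any consistent unit.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced or not energy or duration_s <= 0:
        raise ValueError("need voiced frames, energy frames, and a positive duration")
    return {
        "pitch_average": mean(voiced),             # Hz, voiced frames only
        "pitch_range": max(voiced) - min(voiced),  # Hz, simple max-min spread
        "energy": mean(energy),
        "speech_rate": n_syllables / duration_s,   # syllables per second
    }

# Toy input, invented purely for illustration.
feats = utterance_features(
    f0_hz=[0, 180, 200, 220, 0, 190],
    energy=[0.1, 0.6, 0.8, 0.7, 0.1, 0.5],
    n_syllables=4,
    duration_s=1.6,
)
print(feats)
```

Because each feature is a single number per utterance, any local pitch movement or timing pattern inside the utterance is averaged away, which is the limitation the NB points at.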

  6. Why Prosody is Informative. Speech production chain (diagram): intention -> prosodic specification, phoneme sequence -> muscle control -> glottis, articulators, lungs -> sound

  7. Depression
     Prosodic observables: low, monotone, long pauses before responding, highly variable pause lengths.
     General cause: psychomotor retardation, or slowing: decreased intensity, irregular timing (Yang, Fairbairn, Cohn 2016).
     Speech production chain (diagram): intention -> prosodic specification, phoneme sequence -> muscle control -> glottis, articulators, lungs -> sound

  8. Speaker Identification. Speech production chain (diagram), with * marking where individuals differ: * intention -> * prosodic specification, phoneme sequence -> * muscle control -> * glottis, articulators, lungs -> sound

  9. (Diagram: items placed along an axis from stable to fleeting) individual identity, age, surprised, respiratory infection, gender, angry, fatigue, personality, sad, intoxication, uncertain, depression, confident, dominant, thinking of a word, Alzheimer's, team leader, intending to continue, Parkinson's, autism, group affiliations (regional, ethnic)

  11. Paralinguistic Inference
      Midlevel features+ (more informative): linear regression, decision trees, neural networks
      Frame-level features (better performing*): recurrent networks, deep networks with attention, etc.
      Pretrained models: linear regression
      * often as well as humans
      + seldom unit-linked, except roughly for speaker identification

  12. Realms of Prosody
      Phonological: linked to language, mental, language-specific, controllable, symbolic
      Paralinguistic: independent, bodily, universal, uncontrollable, iconic/gradient

  13. Contents: Introduction; Production, Perception; Classic Linguistic Prosody; 20. Paralinguistics; 21. Pragmatics; . . . ; Three Realms; Technology and Techniques; Paralinguistics, Pragmatics; Speech Synthesis and Dialog; Perspectives

  15. Paralinguistics: emotional states, health conditions, momentary states, individual identity, traits, social identity, . . .

  16. Paralinguistics and Applications
      Emotional states: emotion recognition
      Health conditions: diagnosis, screening
      Momentary states: language modeling
      Individual identity: speaker identification
      Traits: . . .
      Social identity: . . .

  18. Traits (just a few samples)
      o Autism: monotone, high, loud, nasal
      o Depression: low, monotone, long switching pauses (latencies), highly variable switching pauses
        From a psychopathology perspective, one would expect depression to be associated with decreased intensity, irregular timing, and decreased F0 variability. These features are conceptually related to what is referred to as psychomotor retardation, or slowing, insensitivity to positive and negative stimuli, and the attenuated interest in other people that are common in depression. (Yang, Fairbairn, Cohn 2016)
      o Parkinson's, aging, etc.
      o History of alcohol abuse: aprosodia (production, also perception)
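
The switching-pause observables listed for depression (long latencies, highly variable latencies) can be sketched as follows. This is a minimal illustration, not the method of the cited study: it assumes a dialog is already segmented into speaker turns with start and end times, the function name `switching_pauses` and the speaker labels are hypothetical, and overlapping turns are simply skipped.

```python
from statistics import mean, pstdev

def switching_pauses(turns, responder):
    """Return the silent gaps (seconds) before each turn by `responder`
    that directly follows a turn by a different speaker.

    turns: list of (speaker, start_s, end_s), sorted by start time.
    Overlaps (negative gaps) are skipped in this sketch.
    """
    gaps = []
    for (prev_spk, _, prev_end), (spk, start, _) in zip(turns, turns[1:]):
        if spk == responder and prev_spk != responder:
            gap = start - prev_end
            if gap >= 0:
                gaps.append(gap)
    return gaps

# Toy dialog, invented purely for illustration.
turns = [
    ("interviewer", 0.0, 2.0),
    ("patient", 3.5, 5.0),
    ("interviewer", 5.5, 7.0),
    ("patient", 7.5, 9.0),
]
gaps = switching_pauses(turns, "patient")
summary = {"mean_s": mean(gaps), "sd_s": pstdev(gaps)}  # length and variability
print(gaps, summary)
```

The mean captures how long the speaker takes to respond, and the standard deviation captures how variable those latencies are, matching the two observables on the slide.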

  19. More Biological Codes
      Production code (Gussenhoven 2002):
        End of breath group -> lower air pressure -> lower intensity, pitch
        Beginnings: louder, higher pitch; endings: quieter, lower pitch
        "Declination": gradual reduction across a phrase
      Effort code (Gussenhoven 2002):
        Greater articulatory effort/precision -> greater importance, emphasis
        Wider pitch range, more precise pitch movements, louder speech
        Associated with surprise, anger, etc.
      Perhaps also mention iconicity, as in "biiig fish".
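
Declination, the gradual reduction across a phrase, can be estimated roughly by fitting a straight line to the pitch contour. The sketch below is a deliberate simplification under stated assumptions: it fits in Hz against time with ordinary least squares, treats frames with a pitch of 0 as unvoiced, and ignores local accents. The function name `declination_slope` is hypothetical.

```python
def declination_slope(times_s, f0_hz):
    """Least-squares slope (Hz per second) of pitch over the voiced frames.

    A negative slope indicates a downward drift across the phrase.
    Frames with f0 <= 0 are treated as unvoiced and skipped (assumed convention).
    """
    pts = [(t, f) for t, f in zip(times_s, f0_hz) if f > 0]
    if len(pts) < 2:
        raise ValueError("need at least two voiced frames")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_f = sum(f for _, f in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    if sxx == 0:
        raise ValueError("voiced frames must span more than one time point")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in pts)
    return sxy / sxx

# Toy contour, invented purely for illustration; the middle frame is unvoiced.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
f0 = [220.0, 210.0, 0.0, 190.0, 180.0]
slope = declination_slope(times, f0)
print(f"declination slope: {slope:.1f} Hz/s")
```

The same fit applied to a per-frame intensity track would give a comparable estimate of the loudness drop that the production code predicts toward the end of a breath group.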

  20. Speech Health Analysis
      Condition                    Task                    Metric       Conventional  State-of-art
      Fatigue                      Regression (0-10)       Correlation  0.343         0.383
      Intoxication                 2-class classification  UAR          0.659         0.710
      Alzheimer's                  2-class classification  Accuracy     0.625         0.708
      Depression                   Regression (0-24)       RMSE         8.19          6.80
      Parkinson's disease          Regression              Correlation  0.390         0.649
      Child Development Disorders  4-class classification  UAR          0.671         0.694
      Results from ComParE, AVEC and ADReSS challenges (compiled by Nicholas Cummins)
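
The metrics named on the Speech Health Analysis slide (UAR, accuracy, RMSE, correlation) can be sketched in a few lines. This is an illustrative implementation under assumptions, not the challenges' official scoring code: UAR is taken as the unweighted mean of per-class recalls, and correlation is computed as Pearson's coefficient, although a given challenge may specify a different one. All function names and the toy labels are hypothetical.

```python
from math import sqrt

def accuracy(y_true, y_pred):
    """Fraction of items labeled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def uar(y_true, y_pred):
    """Unweighted average recall: mean of the per-class recalls."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def rmse(y_true, y_pred):
    """Root mean squared error for regression tasks."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson(x, y):
    """Pearson correlation coefficient (assumes neither input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Toy imbalanced 2-class example, invented purely for illustration.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
acc, bal = accuracy(y_true, y_pred), uar(y_true, y_pred)
print(f"accuracy={acc:.3f} UAR={bal:.3f}")
print(f"RMSE={rmse([3, 5], [4, 7]):.3f} r={pearson([1, 2, 3], [2, 4, 6]):.3f}")
```

On imbalanced data the two classification scores diverge, since UAR gives each class equal weight regardless of how many examples it has.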

  21. Beyond Classic Emotion: more subtle states; more communicative states; more transient states, e.g. confidence, thinking of a word, turn-taking intentions. Cf. language modeling.

  22. Personality. Marking social identity: as gay, or as a stoner, a businessman, a free spirit, or a professor. Useful for parodies and mocking.
