Exploratory Study on Automatic Speech Recognition in Conference Interpreting


Explore how Automatic Speech Recognition (ASR) technology is impacting conference interpreting, including benefits and challenges faced by interpreters. Learn about ASR systems, experimental research on ASR-assisted interpreting, and the potential of ASR for enhancing interpreter services.

  • Speech Recognition
  • Conference Interpreting
  • ASR Systems
  • Interpreter-Machine Interaction
  • Technology


Presentation Transcript


  1. Automatic Speech Recognition in Conference Interpreting: An exploratory study on Consecutive Interpreting assisted by Sight-Terp

  2. PART I: Automatic Speech Recognition systems. PART II: Experimental study. PART III: Results & discussion.

  3. PART I Automatic Speech Recognition systems

  4. Technologies for CI: Sim-consec (Orlando 2010); Tablet consecutive (Goldsmith 2018, Altieri 2020); Automatic speech recognition (Wang/Wang 2019, Chen/Kruger 2023, Ünlü 2023)

  5. CAI tool for CI: "A technology with potentially much greater impact on interpreting services is voice recognition, or automatic speech recognition, used as a key component of machine interpreting and applied in speech-to-text interpreting for deaf or hard-of-hearing audiences" (Kalina/Ziegler 2015: 411).

  6. Automatic Speech Recognition: ASR refers to the conversion of speech into text (Filippidou/Moussiades 2020: 73). Since the advent of new computational approaches (e.g. neural networks, big data), ASR technologies can generate up to 30 percent better results than the best statistical systems (Seligman/Waibel 2019: 240), thus potentially achieving an error rate comparable to that of human listeners (Cavallo/Ortiz 2018: 23).

  7. Automatic Speech Recognition: sound quality, background noise, speech rate, linguistic variation, non-verbal communication (body language), emotions, metaphors, intonation, irony.

  8. ASR for interpreters: 1) speaker-independent; 2) reliable (low word error rate, WER); 3) low latency (Fantinuoli 2017: 28-29).
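
The reliability requirement is expressed as word error rate (WER). The sketch below shows the standard definition, word-level edit distance divided by the number of reference words. It is an illustration only, not the scoring procedure used in the study; the function name and the German sentence are hypothetical, with only the misrecognitions (Niassa as "nasser", Mosambik as "Osnabrück") taken from the transcript errors reported on the slides.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentence: two of five words are misrecognised.
reference = "die Provinz Niassa in Mosambik"
hypothesis = "die Provinz nasser in Osnabrück"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}")
```

Tokenising on whitespace is a simplification; real scoring usually normalises case, punctuation and number formats first, which matters for figures such as "168 Milliarden".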

  9. PART II Experimental research

  10. Experimental study: interpreter-machine interaction and impact of ASR on the interpreters' rendition. 1) More concentration during phase 1; 2) more time needed in the delivery phase; 3) ASR useful for figures and named entities, but the transcript is possibly not easy to read (see repetitions, digressions, false starts, etc.).

  11. Design: six students of conference interpreting; two speeches (DE > IT); Sight-Terp vs. paper notepad.

      Tab. 1 Features of the speeches

                          Speech A   Speech B
      Length (words)      621        610
      Duration (mins)     5'35"      6'02"
      Speech rate         110        101
      Figures             13         10
      Named entities      9          27
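
The speech rates in Tab. 1 can be cross-checked from the word counts and durations. The sketch below assumes the rate is expressed in words per minute, which the slide does not state; the function name is hypothetical.

```python
def speech_rate_wpm(words: int, minutes: int, seconds: int) -> float:
    """Words per minute for a speech of the given length and duration."""
    duration_min = minutes + seconds / 60
    return words / duration_min


# Values from Tab. 1.
rate_a = speech_rate_wpm(621, 5, 35)
rate_b = speech_rate_wpm(610, 6, 2)
print(f"Speech A: {rate_a:.1f} wpm, Speech B: {rate_b:.1f} wpm")
```

Under that assumption the results are about 111 and 101 words per minute, in line with the 110 and 101 reported on the slide.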

  12. Methodology: data triangulation; preparation; data collection in randomised order; focus group discussion; questionnaire.

  13. Sight-Terp (https://www.sightterp.net/): CAI tool for ASR-assisted consecutive interpreting by Cihan Ünlü; speaker-independent; ASR transcript + MT output; NER (named entity recognition) function; digital notepad.

  14. ASR systems: "[...] multilingual language models still pose difficulties because formal grammars as well as n-grams (here meaning strings of n words that are used to predict the probability that a certain word will be the next to follow in a sequence already recognised) are not capable of handling the freer word order that is typical of spoken language" (my emphasis, Jekat 2015: 240).
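
To make the n-gram idea in the quotation concrete, here is a minimal bigram (n = 2) sketch: it counts which word follows which in a toy corpus and turns the counts into next-word probabilities. The corpus and function names are invented for illustration; this is not how Sight-Terp or any specific ASR engine is implemented.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for previous, following in zip(tokens, tokens[1:]):
            counts[previous][following] += 1
    return counts


def next_word_probability(counts, previous, candidate):
    """Estimate P(candidate | previous) from the bigram counts."""
    followers = counts.get(previous, Counter())
    total = sum(followers.values())
    if total == 0:
        return 0.0  # the previous word was never seen with a follower
    return followers[candidate] / total


# Toy corpus, invented for this example.
corpus = [
    "the interpreter takes notes",
    "the interpreter reads the transcript",
    "the speaker reads the figures",
]
model = train_bigrams(corpus)
p_interpreter = next_word_probability(model, "the", "interpreter")
p_reads = next_word_probability(model, "interpreter", "reads")
print(p_interpreter, p_reads)
```

Because the model only knows word sequences it has already counted, a sentence with an unusual word order gets low or zero probabilities, which is the limitation for spoken language that the quotation points to.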

  15. PART III Results & discussion

  16. Results: modes of use. Int. 1 and 5: note taking + look up figures and NE; rendition based on both supports. Int. 3 and 4: note taking; rendition based on transcript by Sight-Terp. Int. 2 and 6: no note taking; rendition based on sight translation.

  17. Results: interpreters' renditions

      Tab. 2 & 3 Results of the interpreters' renditions (ST = Sight-Terp, CONSEC = paper notepad)

      Speech A (original speech: 5'35")
      Interpreter   Rendition   Mode     Time diff.
      Int. 1        5'13"       ST       -0'22"
      Int. 2        5'16"       CONSEC   -0'19"
      Int. 3        4'20"       CONSEC   -1'15"
      Int. 4        4'22"       ST       -1'13"
      Int. 5        4'48"       CONSEC   -0'48"
      Int. 6        7'07"       ST       +1'32"
      Median: 0'34"

      Speech B (original speech: 6'02")
      Interpreter   Rendition   Mode     Time diff.
      Int. 1        5'42"       CONSEC   -0'20"
      Int. 2        6'22"       ST       +0'20"
      Int. 3        5'58"       ST       -0'04"
      Int. 4        4'45"       CONSEC   -1'08"
      Int. 5        5'14"       ST       -0'48"
      Int. 6        4'52"       CONSEC   -1'10"
      Median: 0'40"
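
The time differences for Speech A can be recomputed from the rendition durations as rendition minus original speech. The sketch below does that and takes the median of the signed differences; whether the slide's median was computed this way is an assumption, and the helper names are hypothetical.

```python
from statistics import median


def to_seconds(minutes: int, seconds: int) -> int:
    return minutes * 60 + seconds


def format_diff(diff_s: int) -> str:
    """Render a signed difference in seconds as e.g. -1'15"."""
    sign = "+" if diff_s > 0 else "-" if diff_s < 0 else ""
    mins, secs = divmod(abs(diff_s), 60)
    return f"{sign}{mins}'{secs:02d}\""


# Speech A: original speech 5'35", renditions of Int. 1 to Int. 6.
original_a = to_seconds(5, 35)
renditions_a = [(5, 13), (5, 16), (4, 20), (4, 22), (4, 48), (7, 7)]

diffs_a = [to_seconds(m, s) - original_a for m, s in renditions_a]
median_a = median(diffs_a)

for number, diff in enumerate(diffs_a, start=1):
    print(f"Int. {number}: {format_diff(diff)}")
print(f"Median of signed differences: {median_a} s")
```

The recomputed values match the slide except for Int. 5, where the arithmetic gives 47 seconds against the reported 48, possibly a rounding effect. The median of the signed differences is -34.5 seconds, close to the 0'34" shown for Speech A.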

  18. Results: performance of Sight-Terp

      Tab. 4 Results produced by Sight-Terp

                               Speech A   Speech B
      WER                      13.6%      8.2%
      Figures                  9/13       8/10
      NE                       10/18      23/28
      Incorrect segmentation   9          12

      Examples of recognition errors (source > transcript):
      168 Milliarden > Einhundertachtundsechzig Milliarden
      100.000 Dollar > 100 00 $0
      50.000 Dollar > 50 00 0E
      Mosambik > Osnabrück
      Gafam > Gafar, Gafan
      Niassa > erster, nasser, Iassa

  19. Listening and comprehension: the transcript facilitates the retrieval of figures but may inadvertently disrupt the listening effort, thus affecting comprehension and rendition.

  20. Speech delivery. CONSEC: 6 renditions shorter than the original speech. SIGHT-TERP: 2 renditions longer than the original speech; 4 renditions shorter than the original speech, though longer if compared to CONSEC.

  21. Readability of the transcript: the tool produces quite accurate transcripts (4 int.; 2 int.). Issues: erroneous word recognition*; incorrect segmentation; linear transcription. *NB: easy to fix with upstream preparation.

  22. Conclusions: 3 modes of interaction with ASR support; usefulness of ASR support for figures; risk of overreliance; more studies needed to further explore assisted CI and develop strategic approaches.

  23. Future prospects: practitioners instead of students; cognitive studies; usability studies; quality analysis of the interpreter's rendition; development of strategies for assisted CI; other language combinations (also into B-language).

  24. Bibliography

      Altieri M. (2020) "Tablet interpreting: étude expérimentale de l'interprétation consécutive sur tablette", The Interpreters' Newsletter 25/2020, 19-35.

      Cavallo P. / Ortiz Schild L.E. (2018) "Computer-assisted interpreting tools (CAI) and options for automation with Automatic Speech Recognition", TradTerm, 32/2018, 9-31, <https://www.researchgate.net/publication/330207613_Computer-Assisted_Interpreting_Tools_CAI_and_options_for_automation_with_Automatic_Speech_Recognition> (28 October 2024).

      Chen S. / Kruger J. (2023) "The effectiveness of computer-assisted interpreting. A preliminary study based on English-Chinese consecutive interpreting", Translation and Interpreting Studies, 18/3, 399-420.

      Fantinuoli C. (2017) "Speech Recognition in the Interpreter Workstation", in J. Esteves-Ferreira / J.M. Macan / R. Mitkov / O.M. Stefanov (eds.) Proceedings of the 39th Conference Translating and the Computer, Geneva, Tradulex, 25-34, <https://www.asling.org/tc39/wp-content/uploads/TC39-proceedings-final-1Nov-4.20pm.pdf> (28 October 2024).

      Filippidou F. / Moussiades L. (2020) "A Benchmarking of IBM, Google and Wit Automatic Speech Recognition Systems", in I. Maglogiannis / L. Iliadis / E. Pimenidis (eds.) Artificial Intelligence Applications and Innovations, Springer, 73-82.

      Goldsmith J. (2018) "Tablet interpreting. Consecutive interpreting 2.0", in N.K. Pokorn / C.D. Mellinger (eds.) Community Interpreting, Translation, and Technology, Special Issue of Translation and Interpreting Studies 3, 342-365.

      Jekat S. (2015) "Machine interpreting", in F. Pöchhacker / N. Grbić / P. Mead / R. Setton (eds.) Routledge Encyclopedia of Interpreting Studies, New York, Routledge, 239-242.

      Kalina S. / Ziegler K. (2015) "Technology", in F. Pöchhacker / N. Grbić / P. Mead / R. Setton (eds.) Routledge Encyclopedia of Interpreting Studies, New York, Routledge, 410-412.

      Orlando M. (2010) "Digital pen technology and consecutive interpreting: another dimension in note-taking training and assessment", The Interpreters' Newsletter 15/2010, 71-86.

      Ünlü C. (2023) "Automatic Speech recognition in consecutive interpreter workstation: computer-aided interpreting tool sight-terp", unpublished MA Thesis, University of Ankara.

      Wang C. / Wang X. (2019) "Can computer-assisted interpreting tools assist interpreting?", Transletters. International Journal of Translation and Interpreting, 3, 109-139.

