Exploratory Study on Automatic Speech Recognition in Conference Interpreting

Explore how Automatic Speech Recognition (ASR) technology is impacting conference interpreting, including benefits and challenges faced by interpreters. Learn about ASR systems, experimental research on ASR-assisted interpreting, and the potential of ASR for enhancing interpreter services.

Uploaded by emir_186 on Mar 18, 2025

Presentation Transcript

Automatic Speech Recognition in Conference Interpreting An exploratory study on Consecutive Interpreting assisted by Sight-Terp

PART I Automatic Speech Recognition systems PART II Experimental study PART III Results & discussion

PART I Automatic Speech Recognition systems

Technologies for CI Sim-consec (Orlando 2010); tablet consecutive (Goldsmith 2018, Altieri 2020); automatic speech recognition (Wang/Wang 2019, Chen/Kruger 2023, Ünlü 2023)

CAI tool for CI A technology with potentially much greater impact on interpreting services is voice recognition, or automatic speech recognition, used as a key component of machine interpreting and applied in speech-to-text interpreting for deaf or hard-of-hearing audiences (Kalina/Ziegler 2015: 411).

Automatic Speech Recognition ASR refers to the conversion of speech into text (Filippidou/Moussiades 2020: 73). Since the advent of new computational approaches (e.g. neural networks, big data), ASR technologies can generate up to 30 percent better results than the best statistical systems (Seligman/Waibel 2019: 240), thus potentially achieving an error rate comparable to that of human listeners (Cavallo/Ortiz 2018: 23).

Automatic Speech Recognition Sound quality, background noise, speech rate, linguistic variation, non-verbal communication (body language), emotions, metaphors, intonation, irony.

ASR for interpreters 1) speaker-independent 2) reliable (low word error rate, WER) 3) low latency (Fantinuoli 2017: 28-29)
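The reliability criterion above is usually quantified as word error rate: the number of word substitutions, deletions and insertions needed to turn the ASR output into the reference transcript, divided by the number of words in the reference. The following is a minimal Python sketch of that calculation, assuming whitespace tokenisation and case-insensitive matching. The function name word_error_rate and the sample sentences are illustrative only; they are not taken from the study or from Sight-Terp, although the two misrecognised words echo errors reported later in the slides.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Invented example sentence, not from the experimental speeches.
reference = "die Provinz Niassa liegt in Mosambik"
hypothesis = "die Provinz nasser liegt in Osnabrück"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")  # WER: 33.3%
```

Two of the six reference words are substituted, hence 2/6. Real evaluations typically also normalise punctuation and number formats before scoring, which this sketch does not attempt.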

PART II Experimental research

Experimental study Interpreter-machine interaction and impact of ASR on interpreters' rendition: 1) more concentration during phase 1; 2) more time needed in delivery phase; 3) ASR useful for figures and named entities, but possibly not easy to read (see repetitions, digressions, false starts, etc.)

Design Six students of conf. interpreting; two speeches DE > IT; Sight-Terp vs. paper notepad
Tab. 1 Features of the speeches
           Length (words)  Duration (min:sec)  Speech rate  Figures  Named entities
Speech A   621             5:35                110          13       9
Speech B   610             6:02                101          10       27

Methodology Data triangulation; preparation; data collection in randomised order; focus group discussion; questionnaire

Sight-Terp https://www.sightterp.net/ CAI tool for ASR-assisted Consecutive Interpreting by Cihan Ünlü: speaker-independent; ASR transcript + MT output; NER (named entity recognition) function; digital notepad

ASR systems […] multilingual language models still pose difficulties because formal grammars as well as n-grams (here meaning strings of n words that are used to predict the probability that a certain word will be the next to follow in a sequence already recognised) are not capable of handling the freer word order that is typical of spoken language (my emphasis, Jekat 2015: 240).
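The n-gram idea in the parenthesis above can be illustrated with bigrams (n = 2): the probability of the next word is estimated from how often it followed the previous word in training text. The sketch below is a toy illustration in Python, not a description of how Sight-Terp or any particular ASR engine works. The three-sentence corpus and the function names train_bigrams and next_word_probability are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev_word, next_word in zip(words, words[1:]):
            counts[prev_word][next_word] += 1
    return counts


def next_word_probability(counts, prev_word, next_word):
    """Maximum-likelihood estimate of P(next_word | prev_word)."""
    total = sum(counts[prev_word].values())
    return counts[prev_word][next_word] / total if total else 0.0


# Invented toy corpus, for illustration only.
corpus = [
    "the interpreter takes notes",
    "the interpreter reads the transcript",
    "the speaker reads the speech",
]
model = train_bigrams(corpus)

# "the" is followed by "interpreter" in 2 of its 5 occurrences.
print(next_word_probability(model, "the", "interpreter"))  # 0.4
# A word order never seen in training receives probability zero.
print(next_word_probability(model, "interpreter", "speaker"))  # 0.0
```

The second call hints at the limitation the quotation describes: without smoothing, a sequence that never occurred in the training text is treated as impossible, which sits badly with the freer word order of spoken language.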

PART III Results & discussion

Results Modes of use Int. 1 and 5: note taking + look-up of figures and NE; rendition based on both supports. Int. 3 and 4: note taking; rendition based on transcript by Sight-Terp. Int. 2 and 6: no note taking; rendition based on sight translation.

Results Interpreters' renditions
Tab. 2 & 3 Results of the interpreters' renditions
Speech A (or. speech 5:35; median time diff. 0:34)
Interpreter  Rendition  Mode    Time diff.
Int. 1       5:13       ST      -0:22
Int. 2       5:16       CONSEC  -0:19
Int. 3       4:20       CONSEC  -1:15
Int. 4       4:22       ST      -1:13
Int. 5       4:48       CONSEC  -0:48
Int. 6       7:07       ST      +1:32
Speech B (or. speech 6:02; median time diff. 0:40)
Interpreter  Rendition  Mode    Time diff.
Int. 1       5:42       CONSEC  -0:20
Int. 2       6:22       ST      +0:20
Int. 3       5:58       ST      -0:04
Int. 4       4:45       CONSEC  -1:08
Int. 5       5:14       ST      -0:48
Int. 6       4:52       CONSEC  -1:10

Results Performance of Sight-Terp
Tab. 4 Results produced by Sight-Terp
           WER    Figures  NE     Incorrect segmentation
Speech A   13.6%  9/13     10/18  9
Speech B   8.2%   8/10     23/28  12
Examples of erroneous recognition: 168 Milliarden > Einhundertachtundsechzig Milliarden; 100.000 Dollar > 100 00 $0; 50.000 Dollar > 50 00 0E; Mosambik > Osnabrück; Gafam > Gafar, Gafan; Niassa > erster, nasser, Iassa

Listening and comprehension The transcript facilitates the retrieval of figures but may inadvertently disrupt the listening effort, thus affecting comprehension and rendition

Speech delivery CONSEC: 6 renditions shorter than or. speech. SIGHT-TERP: 2 renditions longer than or. speech; 4 renditions shorter than or. speech, though longer if compared to CONSEC.

Readability of the transcript Sight-Terp produces quite accurate transcripts (4 int. 2 int.); erroneous word recognition*; incorrect segmentation; linear transcription. *NB: easy to fix with upstream preparation

Conclusions 3 modes of interaction with ASR support; usefulness of ASR support for figures; risk of overreliance; more studies needed to further explore assisted CI and develop strategic approaches

Future prospects Practitioners instead of students; cognitive studies; usability studies; quality analysis of the interpreter's rendition; development of strategies for assisted CI; other language combinations (also into B-language)

Bibliography
Altieri M. (2020) "Tablet interpreting: étude expérimentale de l'interprétation consécutive sur tablette", The Interpreters' Newsletter 25/2020, 19-35.
Cavallo P. / Ortiz Schild L.E. (2018) "Computer-assisted interpreting tools (CAI) and options for automation with Automatic Speech Recognition", TradTerm, 32/2018, 9-31, <https://www.researchgate.net/publication/330207613_Computer-Assisted_Interpreting_Tools_CAI_and_options_for_automation_with_Automatic_Speech_Recognition> (28 October 2024).
Chen S. / Kruger J. (2023) "The effectiveness of computer-assisted interpreting. A preliminary study based on English-Chinese consecutive interpreting", Translation and Interpreting Studies, 18/3, 399-420.
Fantinuoli C. (2017) "Speech Recognition in the Interpreter Workstation", in J. Esteves-Ferreira / J.M. Macan / R. Mitkov / O.M. Stefanov (eds.) Proceedings of the 39th Conference Translating and the Computer, Geneva, Tradulex, 25-34, <https://www.asling.org/tc39/wp-content/uploads/TC39-proceedings-final-1Nov-4.20pm.pdf> (28 October 2024).
Filippidou F. / Moussiades L. (2020) "A Benchmarking of IBM, Google and Wit Automatic Speech Recognition Systems", in I. Maglogiannis / L. Iliadis / E. Pimenidis (eds.) Artificial Intelligence Applications and Innovations, Springer, 73-82.
Goldsmith J. (2018) "Tablet interpreting. Consecutive interpreting 2.0", in N.K. Pokorn / C.D. Mellinger (eds.) Community Interpreting, Translation, and Technology, Special Issue of Translation and Interpreting Studies 3, 342-365.
Jekat S. (2015) "Machine interpreting", in F. Pöchhacker / N. Grbić / P. Mead / R. Setton (eds.) Routledge Encyclopedia of Interpreting Studies, New York, Routledge, 239-242.
Kalina S. / Ziegler K. (2015) "Technology", in F. Pöchhacker / N. Grbić / P. Mead / R. Setton (eds.) Routledge Encyclopedia of Interpreting Studies, New York, Routledge, 410-412.
Orlando M. (2010) "Digital pen technology and consecutive interpreting: another dimension in note-taking training and assessment", The Interpreters' Newsletter 15/2010, 71-86.
Ünlü C. (2023) "Automatic Speech recognition in consecutive interpreter workstation: computer-aided interpreting tool sight-terp", unpublished MA Thesis, University of Ankara.
Wang C. / Wang X. (2019) "Can computer-assisted interpreting tools assist interpreting?", Transletters. International Journal of Translation and Interpreting, 3, 109-139.
