Voice-Enabled Beamlines at NSLS-II: Enhancing User Operations with AI Agents

Explore how voice-enabled beamlines at NSLS-II are revolutionizing user operations using AI agents, high-level speech commands, and fine-tuning techniques. Overcoming challenges with specialized terminology, this innovative solution aims to make beamline operations easier and more efficient.

Uploaded by slty667 on Jun 20, 2025

Presentation Transcript

Voice-Enabled Beamlines at NSLS-II
Shray Mathur, Esther Tsai, Kevin Yager

Goal
Make user operation easier and more efficient at the beamline using AI agents.
High-level speech commands -> beamline executable code
Speech command: "Move sample x to absolute 15 mm and use quick align."
Executable code: sam.xabs(15) sam.quick_align()
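As a rough illustration of the mapping on this slide, the sketch below turns the example sentence into the two calls shown. It is a toy, rule-based stand-in, not the AI agent the presentation describes: the `FakeSample` class, the `translate` function, and the regular expression are hypothetical, and only `sam.xabs(15)` and `sam.quick_align()` come from the slide.

```python
import re


class FakeSample:
    """Hypothetical stand-in for the beamline sample object; it only logs calls."""

    def __init__(self):
        self.log = []

    def xabs(self, position_mm):
        # Record the call instead of moving any hardware.
        self.log.append(f"sam.xabs({position_mm:g})")

    def quick_align(self):
        self.log.append("sam.quick_align()")


def translate(command, sam):
    """Map a transcribed command to calls on `sam` using two toy patterns."""
    text = command.lower()
    move = re.search(r"move sample x to absolute (-?\d+(?:\.\d+)?)\s*mm", text)
    if move:
        sam.xabs(float(move.group(1)))
    if "quick align" in text:
        sam.quick_align()
    return sam.log


sam = FakeSample()
calls = translate("Move sample x to absolute 15 mm and use quick align.", sam)
print(calls)
```

A pattern-based translator like this breaks as soon as the phrasing varies, which is presumably part of the motivation for using an AI agent instead.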

Challenges
Out-of-the-box pre-trained Speech-to-Text (STT) models are not familiar with beamline-specific terminology (SAXS, WAXS, GISAXS, GIWAXS, etc.) or commands.
Example pre-trained STT output: "This process uses feedback from in-situ sacks wax measurements"

Solution: Fine-tuning
Further challenges:
Data availability: Where do we find reliable beamline-specific audio-text pairs?
Data requirements: Fine-tuning requires high-quality audio-text pairs to learn effectively.
Solution: Use Text-to-Speech (TTS) models to generate synthetic audio from beamline proposal documents.
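The data-generation idea can be sketched as follows: split proposal text into short chunks, then pair each chunk with audio synthesized from it. This is a minimal sketch under stated assumptions. The slides do not name a TTS model or a chunk size, so `fake_tts` is a placeholder that returns dummy bytes, `max_words` is an arbitrary choice, and the first sentence of the sample proposal text is made up for illustration.

```python
import re


def chunk_text(text, max_words=30):
    """Group whole sentences into chunks of roughly `max_words` words.

    A single sentence longer than `max_words` is kept whole.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], []
    for sentence in sentences:
        words = sentence.split()
        if current and len(current) + len(words) > max_words:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


def build_pairs(text, synthesize, max_words=30):
    """Return (audio, transcript) pairs; `synthesize` is any text -> audio callable."""
    return [(synthesize(chunk), chunk) for chunk in chunk_text(text, max_words)]


def fake_tts(chunk):
    """Placeholder for a real TTS model: returns dummy bytes, not audio."""
    return chunk.encode("utf-8")


# Made-up proposal excerpt; the second sentence is the example from the slides.
proposal = (
    "We will collect GISAXS and GIWAXS data during annealing. "
    "This process uses feedback from in-situ SAXS/WAXS measurements."
)
pairs = build_pairs(proposal, fake_tts, max_words=12)
for audio, transcript in pairs:
    print(len(audio), transcript)
```

Because the transcript is the text that was fed to the TTS model, every pair is correctly labeled by construction, including the specialized terms.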

Fine-tuning pipeline: TTS + LoRA
Proposal documents -> chunk -> TTS model -> synthetic audio-text pairs -> LoRA fine-tuning (FT) of the pre-trained STT -> fine-tuned STT
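For readers unfamiliar with LoRA (low-rank adaptation): the pre-trained weight matrix W stays frozen, and training only updates two small matrices A and B whose product, scaled by alpha / r, is added to W. The sketch below shows that arithmetic in plain Python. The matrix sizes, rank, and alpha are made-up values for illustration; the slides do not state the settings used for the STT model.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / r) * B (A x), with W frozen and A, B trainable."""
    r = len(A)  # rank = number of rows in A
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + (alpha / r) * u for b, u in zip(base, update)]


def trainable_fraction(d_out, d_in, r):
    """Parameters in A and B relative to the full d_out x d_in matrix."""
    return r * (d_in + d_out) / (d_in * d_out)


W = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]  # frozen weights, shape 2 x 3
A = [[0.5, -0.5, 1.0]]  # trainable, shape r x d_in with r = 1
B = [[0.0], [0.0]]  # trainable, shape d_out x r, zero before training
x = [1.0, 1.0, 1.0]

# With B at zero the adapted layer reproduces the pre-trained output.
print(lora_forward(W, A, B, x, alpha=8))
print(trainable_fraction(d_out=1024, d_in=1024, r=8))
```

Starting B at zero means fine-tuning begins from the pre-trained model's behavior, and for a 1024 x 1024 layer at rank 8 only about 1.6% as many parameters are trained as in the full matrix.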

Fine-tuned Model
Pre-trained STT: "This process uses feedback from in-situ sacks wax measurements"
Fine-tuned STT: "This process uses feedback from in-situ SAXS/WAXS measurements"
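One common way to quantify the difference between these two outputs is word error rate (WER): the word-level edit distance to a reference transcript, divided by the number of reference words. The sketch below computes it for the slide's example. It assumes the intended sentence is the SAXS/WAXS version and splits on whitespace only; the slides do not report a metric, so this is illustrative rather than the authors' evaluation.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dist[i][j] = edits needed to turn the first i ref words into the first j hyp words
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,  # deletion
                dist[i][j - 1] + 1,  # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
    return dist[-1][-1] / len(ref)


reference = "This process uses feedback from in-situ SAXS/WAXS measurements"
pretrained = "This process uses feedback from in-situ sacks wax measurements"
finetuned = "This process uses feedback from in-situ SAXS/WAXS measurements"

print(word_error_rate(reference, pretrained))  # 0.25: one substitution, one insertion
print(word_error_rate(reference, finetuned))  # 0.0
```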

Key Takeaways
The fine-tuning pipeline is simple, effective, and scalable.
It requires about 8-10 minutes of audio-text pairs to teach the STT model a new word.
This work is part of a larger project, Exocortex: Yager, Kevin G. "Towards a Science Exocortex." Digital Discovery (2024).
