
Engine Fault Sound Event Detection using Multimodal Signals
Explore a novel approach for fine-grained engine fault sound event detection using multimodal signals. The research focuses on building a dataset, proposing a fusion SED model, and introducing a pretraining scheme for accurate detection of engine faults through sound and vibration analysis.
Presentation Transcript
FINE-GRAINED ENGINE FAULT SOUND EVENT DETECTION USING MULTIMODAL SIGNALS
ICASSP 2024
Dennis Fedorishin, Srirangaraj Setlur, Venu Govindaraju, University at Buffalo, Center for Unified Biometrics and Sensors
Livio Forte III, Philip Schneider, ACV Auctions
Introduction
Expert mechanics often use sound and vibration to diagnose vehicle engines, because engine faults often emit distinctive sound and vibration characteristics. The authors extend engine fault detection to sound event detection (SED), the task of detecting when sound events occur in time, including their onset and offset times.
Contributions:
1) Collect a strongly labeled dataset of ten fine-grained engine fault sound events across a wide variety of vehicles.
2) Propose a multimodal fusion SED model that predicts engine fault sound events using audio and accelerometer-recorded vibration.
3) Introduce a pretraining scheme
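Strong labels of the kind described in the introduction (an onset and an offset time per event) are commonly turned into frame-level targets before training an SED model. The following is a minimal sketch of that conversion, not the authors' code: the function name, the tuple layout, the 0.5-second hop, and the example events are all hypothetical and chosen only for illustration.

```python
import math


def events_to_frame_targets(events, n_frames, hop_s, n_classes):
    """Convert strongly labeled events to a frame-by-class 0/1 grid.

    events: list of (class_index, onset_s, offset_s) tuples (hypothetical layout).
    Frame i is assumed to cover the interval [i * hop_s, (i + 1) * hop_s).
    """
    targets = [[0] * n_classes for _ in range(n_frames)]
    for class_index, onset_s, offset_s in events:
        first = max(0, math.floor(onset_s / hop_s))
        last = min(n_frames, math.ceil(offset_s / hop_s))  # exclusive bound
        for frame in range(first, last):
            targets[frame][class_index] = 1
    return targets


# Illustrative events only: class 0 active from 1.0 s to 2.0 s,
# class 3 active from 1.5 s to 2.5 s, with ten classes as in the dataset.
example_events = [(0, 1.0, 2.0), (3, 1.5, 2.5)]
targets = events_to_frame_targets(example_events, n_frames=6, hop_s=0.5, n_classes=10)
```

Because each class has its own column, overlapping events (here, both classes are active in the frame starting at 1.5 s) are represented without conflict.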
Method
Input: 30-second signals
Audio: 44.1 kHz, 128 Mel bins, frame size 2048 samples, hop length 1024 samples
Vibration: 416 Hz, linearly scaled, frame size 256 samples, hop length 32 samples
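To get a feel for what these settings imply, the sketch below works out the number of frames and the time step per frame for each modality. It assumes a signal of exactly 30 seconds and framing without padding (frames = 1 + (samples - frame size) // hop length); the slides do not state the padding convention, so the frame counts are illustrative. The function name `frame_stats` is hypothetical.

```python
def frame_stats(sample_rate_hz, duration_s, frame_size, hop_length):
    """Frame count and timing for a framed signal (assumes no padding)."""
    n_samples = int(round(sample_rate_hz * duration_s))
    n_frames = 1 + (n_samples - frame_size) // hop_length
    return {
        "n_samples": n_samples,
        "n_frames": n_frames,
        "hop_ms": 1000.0 * hop_length / sample_rate_hz,
        "frame_ms": 1000.0 * frame_size / sample_rate_hz,
    }


# Parameters taken from the Method slide.
audio = frame_stats(44100, 30, frame_size=2048, hop_length=1024)
vibration = frame_stats(416, 30, frame_size=256, hop_length=32)

print(audio)
print(vibration)
```

Under these assumptions the audio stream advances about 23 ms per frame and the vibration stream about 77 ms per frame, so the two representations have different frame rates and would need to be aligned in time before or during fusion. How the model does this is not described in the transcript.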
Dataset
The dataset covers a large variety of common vehicles used in the United States. Each sample is a 25-35 second audio and vibration recording captured with a professional-grade microphone and a tri-axial accelerometer. Table 1 shows the labeled dataset across the ten engine faults: 5,000 sound events across 2,643 audio-vibration samples, spread across 232 unique vehicle models.
Split: train (70%), validation (15%), test (15%)
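A 70/15/15 split over the 2,643 samples can be sketched as follows. This is an assumption-laden illustration: the transcript does not say whether the split was made per recording, per vehicle, or stratified by fault type, so the sketch simply shuffles sample indices with a fixed seed. The function name `split_ids` and the seed are hypothetical.

```python
import random


def split_ids(sample_ids, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle IDs reproducibly and cut them into train/validation/test lists."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(train_frac * len(ids))
    n_val = int(val_frac * len(ids))
    # The test set takes the remainder so that every sample is used exactly once.
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


# 2,643 is the sample count reported on the Dataset slide.
train, val, test = split_ids(range(2643))
print(len(train), len(val), len(test))
```

With these fractions the subsets contain 1,850, 396, and 397 samples. If recordings of the same vehicle model should not appear in more than one subset, the same function could be applied to vehicle identifiers instead of sample indices.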