
Configuring Acoustic Scenes with “Semantic Hearing”

Anyone who has used noise-canceling headphones knows how important it is to hear the right audio at the right moment. Someone working indoors might prefer to block out car horns, but not while strolling down a crowded street. Yet people still have little control over which sounds their headphones block out.

A team led by researchers at the University of Washington has developed deep-learning algorithms that let users pick which sounds filter through their headphones in real-time. Pictured is co-author Malek Itani demonstrating the system. Image Credit: University of Washington

Researchers at the University of Washington are leading a team that has created deep-learning algorithms that let users select, in real-time, which sounds filter through their headphones. The team refers to the system as “semantic hearing.” The headphones stream captured audio to a connected smartphone, which cancels all environmental sounds.

Headphone users can choose from 20 types of sounds, including sirens, infant cries, speech, vacuum cleaners, and bird chirps, using voice commands or a smartphone app. The headphones will only play the sounds that the users have chosen.
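Conceptually, the selection step amounts to keeping only the sound classes the user picked and mixing them back into the output. A minimal sketch of that idea follows; the class names, data, and function are illustrative stand-ins, not the system's actual API, which operates on raw audio with a neural network:

```python
# Minimal sketch of the selection step: given per-class source estimates
# (fixed-length float lists standing in for waveforms), keep only the
# classes the user chose and sum them back together sample by sample.
# All names and values here are illustrative, not the real system.

SOUND_CLASSES = ["siren", "baby_cry", "speech", "vacuum", "bird_chirp"]

def mix_selected(separated: dict[str, list[float]],
                 selected: set[str]) -> list[float]:
    """Sum the waveforms of the selected classes; drop the rest."""
    length = len(next(iter(separated.values())))
    out = [0.0] * length
    for name, wave in separated.items():
        if name in selected:
            out = [a + b for a, b in zip(out, wave)]
    return out

# Example: keep sirens and bird chirps, silence everything else.
separated = {name: [0.1 * i] * 4 for i, name in enumerate(SOUND_CLASSES)}
playback = mix_selected(separated, {"siren", "bird_chirp"})
```

The hard part, of course, is producing the per-class estimates from a live binaural recording in the first place; that is what the team's neural network does.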

The team presented its findings at UIST ‘23 in San Francisco on November 1st, 2023. The researchers intend to release a commercial version of the technology in the future.

Understanding what a bird sounds like and extracting it from all other sounds in an environment requires real-time intelligence that today’s noise-canceling headphones haven’t achieved. The challenge is that the sounds headphone wearers hear need to sync with their visual senses. You can’t be hearing someone’s voice two seconds after they talk to you. This means the neural algorithms must process sounds in under a hundredth of a second.

Shyam Gollakota, Study Senior Author and Professor, Paul G. Allen School of Computer Science and Engineering, University of Washington
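That deadline translates into a tight bound on how much audio the system can buffer before acting. A back-of-the-envelope sketch, assuming a common 44.1 kHz sample rate (the rate is an assumption, not stated in the article):

```python
# Rough latency budget implied by the quote: if audio must be processed
# in under a hundredth of a second, the network can only buffer a few
# hundred samples per chunk. The sample rate is an assumed typical value.
SAMPLE_RATE_HZ = 44_100   # common audio sample rate (assumption)
DEADLINE_S = 0.01         # "under a hundredth of a second"

max_chunk_samples = int(SAMPLE_RATE_HZ * DEADLINE_S)
chunk_duration_ms = max_chunk_samples / SAMPLE_RATE_HZ * 1000
# i.e. chunks of at most ~441 samples, about 10 ms of audio each
```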

Because of this time constraint, the semantic hearing system must process sounds on a device such as the connected smartphone rather than on more powerful cloud servers. Furthermore, since sounds from different directions arrive in people’s ears at different times, the system must preserve these delays, along with other spatial cues, so that people can still perceive sounds in their environment meaningfully.
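The spatial-cue requirement can be illustrated with a toy example: if filtering does not mix the two ear channels together, the interaural delay that the brain uses to localize a sound survives. The filter below is a trivial stand-in (an identity function), and the signals are toy values, not real recordings:

```python
# Sketch of why per-ear processing matters: filtering the left and right
# channels without cross-mixing preserves the interaural delay (here a
# 3-sample shift) that the brain uses to localize a sound.

def keep_target(channel: list[float]) -> list[float]:
    # Stand-in for the per-channel filter: identity here, since the
    # point is only that no cross-channel mixing happens.
    return list(channel)

delay = 3
right = [1.0, 0.5, 0.25] + [0.0] * delay  # sound reaches right ear first
left = [0.0] * delay + [1.0, 0.5, 0.25]   # same sound, 3 samples later

out_left, out_right = keep_target(left), keep_target(right)

# The relative shift between ears is unchanged, so the filtered sound
# still appears to come from the listener's right side.
assert out_left[delay:] == out_right[:len(right) - delay]
```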

The system was tested in a variety of settings, including streets, parks, and offices. It was able to isolate target sounds, such as sirens and bird chirps, while eliminating background noise. Twenty-two participants rated the audio output of the system for the target sound, and they reported that, on average, the quality was better than the original recording.

The system occasionally had trouble recognizing sounds that have a lot in common, like vocal music and human speech. The models could produce better results if they were trained on more real-world data, according to the researchers.

Additional study co-authors included Takuya Yoshioka, head of research at AssemblyAI, Justin Chan, who performed this research as a doctoral student in the Allen School and is currently at Carnegie Mellon University, and Bandhav Veluri and Malek Itani, both UW PhD students in the Allen School.

Semantic hearing: Future of intelligent hearables

Video Credit: University of Washington

Journal Reference

Veluri, B., et al. (2023) Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology. doi:10.1145/3586183.3606779.
