New headphones being developed at the University of Washington’s Mobile Intelligence Lab use machine learning to selectively eliminate irritating noises while leaving desired sounds untouched, offering wearers a previously impossible degree of control over their auditory environment.
Professor Shyam Gollakota, who leads the lab, describes the technology — termed “semantic hearing” — as a system that allows users to choose which sounds they want to hear and which they want to block, processing audio in real time. Traditional noise-cancelling headphones work by generating an inverse soundwave to cancel ambient noise, but they struggle to differentiate between wanted and unwanted sounds. This new system overcomes that limitation through sophisticated deep-learning algorithms and neural networks trained to recognise and classify roughly 20 categories of environmental sound, from birdsong and sirens to crying babies, vacuum cleaners and human speech.
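The inverse-soundwave idea, and why it cannot tell wanted sounds from unwanted ones, can be shown with a toy example. This is an idealised sketch and not how any real headphone is implemented: the sample values and the names `anti_noise` and `heard` are invented for illustration, and perfect cancellation is assumed.

```python
from typing import List


def anti_noise(ambient: List[float]) -> List[float]:
    """Return the phase-inverted copy of the ambient signal."""
    return [-sample for sample in ambient]


def heard(ambient: List[float], cancel: List[float]) -> List[float]:
    """What reaches the ear: the ambient sound plus the cancelling wave."""
    return [a + c for a, c in zip(ambient, cancel)]


# Made-up sample values: an unwanted drone and a wanted birdsong.
drone = [0.5, -0.5, 0.5, -0.5]
birdsong = [0.25, 0.0, -0.25, 0.0]

# The microphone only ever sees the sum of the two.
ambient = [d + b for d, b in zip(drone, birdsong)]

# Inverting the whole mixture removes the birdsong along with the drone.
residual = heard(ambient, anti_noise(ambient))
print(residual)
```

Because the inversion is applied to the whole mixture, the birdsong disappears along with the drone, which is the limitation semantic hearing is meant to address.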
How AI Targets and Filters Specific Sounds
The core of the technology lies in its ability to identify and isolate individual sounds within a complex audio landscape. The deep-learning models are trained on vast datasets of labelled sounds, enabling them to distinguish between, say, a car horn and a dog bark, or a lawnmower and a conversation. When a user selects a sound to filter out — for example, the drone of a leaf blower — the system identifies that specific acoustic signature and suppresses it, while allowing other sounds, such as birdsong or traffic alerts, to pass through.
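The selection step described above can be sketched as a simple remix, assuming the hard part has already been done: a model has split the incoming audio into one track per sound category. The article does not describe the system's internals, so the function `remix`, the category labels and the sample values below are hypothetical.

```python
from typing import Dict, List, Set


def remix(sources: Dict[str, List[float]], muted: Set[str]) -> List[float]:
    """Sum every per-category track except the ones the user has muted.

    `sources` maps a category label to its separated audio samples.
    Producing those tracks is assumed to happen upstream, in a trained model.
    """
    length = max((len(track) for track in sources.values()), default=0)
    output = [0.0] * length
    for label, track in sources.items():
        if label in muted:
            continue  # suppress this category entirely
        for i, sample in enumerate(track):
            output[i] += sample
    return output


# Hypothetical separated tracks for a short stretch of audio.
sources = {
    "leaf_blower": [0.5, 0.5, 0.5],
    "birdsong": [0.25, -0.25, 0.0],
}

filtered = remix(sources, muted={"leaf_blower"})
print(filtered)
```

With the leaf blower muted, only the birdsong track reaches the output. Passing an empty set for `muted` returns the full mixture unchanged.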
A critical technical challenge is speed. To avoid jarring delays that would desynchronise audio from visual cues, the system must process sound in under a hundredth of a second. To achieve this, the headphones offload the heavy computational work to a connected smartphone, which runs the AI models and returns the filtered audio to the earpieces nearly instantaneously. Users control their soundscape via voice commands or a companion smartphone app, selecting which categories of sound to amplify, which to mute, and which to leave untouched.
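A hundredth of a second is 10 milliseconds, and that budget has to cover everything between the microphone and the ear. The rough arithmetic below shows how little room that leaves. Only the 10 ms figure comes from the article; the 44.1 kHz sample rate, the stage names and the stage timings are assumptions chosen for illustration.

```python
from typing import Mapping

BUDGET_MS = 10  # "under a hundredth of a second"


def samples_in_budget(sample_rate_hz: int, budget_ms: int = BUDGET_MS) -> int:
    """Whole audio samples that elapse during the latency budget."""
    return sample_rate_hz * budget_ms // 1000


def total_latency_ms(stages_ms: Mapping[str, float]) -> float:
    """Add up the time spent in each stage of the round trip."""
    return sum(stages_ms.values())


def within_budget(stages_ms: Mapping[str, float], budget_ms: int = BUDGET_MS) -> bool:
    """True if the whole round trip finishes inside the budget."""
    return total_latency_ms(stages_ms) < budget_ms


# Hypothetical timings, in milliseconds, for one chunk of audio.
round_trip = {
    "buffer_audio_chunk": 4,
    "send_to_phone_and_back": 2,
    "run_model_on_phone": 3,
}

print(samples_in_budget(44_100))  # samples available at an assumed 44.1 kHz
print(within_budget(round_trip))
```

At the assumed sample rate the budget covers only a few hundred samples, so the system has to work on very short chunks of audio and leave time for the trip to the phone and back.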
One limitation the researchers have identified is difficulty distinguishing between sounds with very similar acoustic properties, such as vocal music and human speech. Gollakota’s team believes that training the models on a wider range of real-world data will improve accuracy. The University of Washington has collaborated with Microsoft on developing the semantic hearing concept, and a commercial version of the system is planned.
From Misophonia to Personal Soundscapes: Potential Applications
The most immediate and profound application may be for people living with misophonia, a condition in which specific sounds — often oral noises such as chewing or slurping, or repetitive sounds like pen clicking — trigger intense negative emotional and physiological responses. Misophonia is not yet officially classified as a diagnosable condition in major diagnostic manuals such as the DSM-5-TR or ICD-11, though a consensus definition was published in 2022. Research into its brain mechanisms is ongoing, and the disorder can severely affect social interactions, work and daily life, often leading to avoidance behaviours and anticipatory anxiety.
Early evidence supports the technology’s potential for this group. Timothy Wunrow at Mississippi State University demonstrated that a basic selective noise-cancelling algorithm using a convolutional neural network significantly reduced misophonic reactions when applied to trigger sounds. Gollakota’s system goes further by allowing users to target their own specific triggers — the sound of a partner chewing, for example, can be muted while the partner’s voice remains audible.
Beyond clinical applications, the technology offers more nuanced control for any listener. Gollakota points to the example of sitting on a park bench: a wearer could block the sound of loud talkers nearby while still hearing birdsong. The implications extend to situational awareness — a user might mute construction noise while preserving the sound of traffic or a bicycle bell. The system also has potential as an assistive technology for hearing impairments, and Gollakota is a co-founder of Hearvana AI, a startup developing AI-enhanced hearing technologies.
One study has shown a correlation between noise exposure levels and aggression. Another, covering the area around Frankfurt airport, found that a 1 decibel increase in average noise levels raised the violent crime rate by 1.6%. Last month, British Airways began allowing calls on flights, raising the unwelcome prospect of being trapped in a pressurised tube with other passengers’ work and personal conversations. Semantic hearing headphones could filter out those calls, leaving the soothing brown noise of aircraft cabin sound, a soundscape some people already stream on Spotify.
The true power of the system, however, lies in the ability to programme one’s own personal triggers. A wearer might choose to silence the yapping of a neighbour’s chihuahua while keeping the television on; filter out the tinny melody of a child’s electronic toy without muting the child’s voice; block the thud of footsteps from the flat upstairs while preserving the song of a blackbird outside the window; or suppress the sound of a partner chewing but still hear them ask if anyone wants an ice-cream. The system functions as a user-directed, forensic sound sniper, curating the auditory world one noise at a time.
