Close your eyes. You are already twice as good at detecting lies as you were a moment ago. The reason lies in how your brain processes sound alone, stripped of the misleading clutter of facial expressions, gestures and posture. When people lie, the rhythm and intonation of their speech change – and human beings are remarkably good at spotting that distortion when they can only hear it, not see it.
The advantage is striking. Dora Giorgianni, a researcher at the University of Portsmouth’s International Centre for Research, found that participants who only listened to an audio recording of a mock suspect being interviewed achieved substantially higher overall accuracy in assessing lies – 61.7% – than those who watched a video with sound, who managed just 35%. The reason, Giorgianni explains, is cognitive overload. “When too much information is presented at once – visual details, facial expressions, body movements, tone of voice and the actual content of what is being said – the cognitive system must continually select what to focus on and what to ignore, which increases the risk of making inaccurate judgments,” she says.
This fits with a counterintuitive finding from the same university’s research into juries during the pandemic: wearing face masks actually improved jurors’ ability to differentiate truth from lies. By obscuring facial expressions – which are often unreliable indicators of deceit – masks forced greater attention onto vocal cues. Professor Aldert Vrij, a leading expert in lie detection, notes that while non-verbal behaviour is often unhelpful, focusing on speech content can lead to more accurate results.
Giorgianni points out that from an intuitive or evolutionary perspective one might expect that seeing facial expressions, gestures and posture should help humans detect deception. “However,” she says, “modern investigative settings differ from ancestral environments. The cues that matter for survival are not the same as those that distinguish a practised liar from a truthful witness in an investigative interview.”
What voices reveal
Our voices give away a vast amount of information with every sentence. “Voices are an instrument and they reflect our physical nature,” says Professor Sophie Scott, director of the Institute of Cognitive Neuroscience at University College London. Just as a ukulele, guitar and violin are defined by material, strings and playing technique, the voice is shaped by the body. Taller people have longer vocal tracts and therefore produce lower vocal tract resonances. A man’s voice is typically about an octave lower than a woman’s. As we age, the cartilage of the larynx may harden, making a voice hoarser or weaker – interestingly, a woman’s voice may become lower while a man’s may become higher. Research has even shown that women’s voices get higher in the days leading up to and during ovulation, because the larynx reacts to the amount of oestrogen in their bodies.
Smiling changes the shape of the mouth, producing a warmer, brighter and slightly higher-pitched tone. “We’re very good at telling if someone is ill from their voice, for example,” says Scott. “The vocal folds get inflamed and vibrate differently.” Beyond physical cues, we also assess accents, socioeconomic status and even television viewing habits – the low-frequency “vocal fry” of a Kardashian-style drawl is a giveaway. The late Queen’s voice changed significantly over her lifetime. “Voices are aspirational,” Scott adds. “We had a charismatic senior person working here and everyone suddenly started talking like her. You change your voice depending on who you’re talking to.”
Accents themselves are shifting. They used to change roughly every 25 miles across the UK, but modern distinctions are much less marked. Scott warns against setting too much store by them: “People project a lot on to voices. Your reaction will often tell you more about your bias than about the other person.”
The brain processes these cues astonishingly fast. Professor Silke Paulmann, executive dean of the Faculty of Science and Health at the University of Essex, says: “When we hear someone speak, our brain starts evaluating voice cues within an eyeblink, or 200 milliseconds. Before we’ve fully processed the words or meaning, the brain has already started analysis. Within an eyeblink, we can hear if someone is warm or cold, calm or stressed, positive or negative.”
This ability is the product of millions of years of evolution. The process of speaking and listening – one of the key transitions from ape to Homo sapiens – required vocal structures, ears and brains to develop in tandem. Our ancestors began to understand the difference between vowel sounds about 27 million years ago. The hyoid bone in the throat, crucial to more sophisticated vocalisations, took on its human-like form roughly 500,000 to 700,000 years ago, according to research. The descent of the larynx in humans compared to other primates contributed to a wider range and clarity of sound. Interestingly, human vocal folds have simplified compared to other primates, which may have led to greater vocal stability and control. Meanwhile, the auricular muscles that allow ears to move – still seen in cats and dogs – remain as vestigial structures in humans; we lost the ability to swivel our ears about 25 million years ago. Recent research suggests these muscles still activate when we strain to listen in noisy environments, potentially serving as an objective measure of listening effort.
The challenges of lie detection
Despite this sophisticated machinery, spotting a liar remains notoriously difficult. “There is no single verbal cue that ‘gives away’ lying in a strong or reliable way,” says Giorgianni. “Common beliefs about nonverbal indicators of deception are frequently inaccurate and a clear, reliable ‘Pinocchio’s nose’ simply does not exist.” The clues we have been taught to expect – talking faster, voice rising – appear in some people but not others, and they are also indicators of stress, which can occur without lying.
Harriet Tyce, a novelist and recent contestant on the reality show The Traitors, understands this difficulty well. “What’s most surprising about the difficulties of spotting a liar on The Traitors is that one goes into it knowing that everyone could be – and in fact pretty much is – lying about something, which means that it should in theory be almost impossible not to spot it,” she says. “But I think we are hardwired as humans to trust, and trying to override that instinct is nearly impossible.”
Technology promises to help. Several companies offer AI-driven analysis that tracks voice, facial muscle movements, eye tracking and brain activity. Some systems claim laboratory accuracies of 75-79%, but Dr Frederika Holmes, a consultant specialising in the forensic analysis of speech and language samples who frequently acts as an expert witness, urges caution. “Voices aren’t like DNA, which doesn’t change over the course of your life and can be directly compared from one sample to the next,” she says. “Voices are plastic and they change depending on circumstances, so we can’t say with absolute certainty. We assess the points of similarity and difference and reach a conclusion regarding the strength of the evidence.” A 75% accuracy rate still means one error in every four judgments – with severe consequences in real-world applications – and the technology can be biased by its training data or misinterpret stress as deception.
Ultimately, if you listen closely enough to a voice, it will tell you some of its secrets. But it still won’t tell you everything.
