
Why does your recorded or objective voice sound different to what you hear in your own head?





When speaking, I hear my own voice very differently from how others do and from what it really is. The sound differs in tone, pitch, volume, etc. For example, recordings of my singing or speaking in foreign languages sounded a lot worse than what I heard in my own head.

Is this a common phenomenon? Or is it a "syndrome", and which parts of sensation and perception are unusual? Lastly, is it modifiable?


I think this is not a psychological syndrome but simply a reflection of physical processes. As such it might not be on-topic for this site. That said, here is a quick answer.

When you hear your own voice while speaking, the sound source is in a different place than when you hear a recording of your voice through a loudspeaker. In addition, when you hear your own voice while speaking, you not only hear the sound that is "in the air"; you also get vibrations from within your body, which act as an additional sound source. Others don't have that additional source. These two factors alone make for a very different sound quality.

Besides, not every recording of your voice sounds the same. I am sure there are also a lot of psychological processes at work when you hear a recording of your own voice; unfortunately I don't know about these. But, for example, the position of the microphone will greatly affect the sound of the recording. Any musician who has tried to record himself will attest to that.

I wouldn't be so sure, though, that the recording sounds worse than your "real voice". That's probably just your subjective perspective. I also found it interesting that you talk about how your voice "really is". The sound of your voice that you hear while speaking and the sound that others hear when they listen to you are both instances of your voice as it really is, just from different perspectives, whereas on a recording you could actually change the sound of your voice with any number of effects.


Jens' answer is pretty much spot on, but misses one fact I remember from my undergraduate lectures: your ears actually partially 'turn off' when you speak (or chew), in what's called the stapedius reflex (Wikipedia).

The most common reference I've seen for this is Møller (2000), which unfortunately is a book, but I'm sure more information could be found with a little digging.

Reference

Møller, Aage (2000). Hearing: Its Physiology and Pathophysiology (illustrated ed.). Academic Press. pp. 181-190.


There are two sound pathways by which we hear: air conduction and bone conduction.

The air conduction pathway involves vibrations in the air being transmitted via the ear drum and through the bones of the middle ear, which act as a lever, to our fluid-filled inner ear. The lever acts as an impedance matcher between the air and the fluid-filled inner ear. It effectively provides about 60 dB of gain, although there is a strong frequency dependence, and it is therefore the normal conduction pathway for hearing.

The bone conduction pathway involves vibrations in our skull, typically caused by vibrations in the air, being transmitted directly to our fluid-filled inner ear. For pressure waves in the air there is a large impedance mismatch between the air and the fluid-filled inner ear, making this an inefficient pathway. When we speak, however, the vibrations in our skull are driven not by vibrations in the air but by vibrations in the vocal tract. This means the impedance mismatch is substantially reduced, and the normal conduction pathway while we are speaking is bone conduction.
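To put a rough number on that impedance mismatch, here is a back-of-the-envelope sketch in Python. It uses textbook values for the characteristic acoustic impedance of air and of water (a common stand-in for cochlear fluid) and the standard plane-wave transmission formula; these are general physics, not figures from the answer above, and sources quote somewhat different values for the middle ear's effective gain.

```python
# Back-of-the-envelope: how much acoustic power would cross an
# air/fluid boundary directly, with no middle-ear impedance matching?
# Textbook values; cochlear fluid is approximated as water.
import math

Z_AIR = 415.0      # characteristic impedance of air, Pa*s/m (rayl)
Z_FLUID = 1.48e6   # characteristic impedance of water, Pa*s/m (rayl)

# Power transmission coefficient for a plane wave at normal incidence
# crossing a boundary between media with impedances Z1 and Z2.
T = 4 * Z_AIR * Z_FLUID / (Z_AIR + Z_FLUID) ** 2

print(f"Transmitted fraction of power: {T:.4%}")           # ~0.11%
print(f"Transmission loss: {-10 * math.log10(T):.1f} dB")  # ~30 dB
```

Roughly 99.9% of the incident power is reflected, a loss of about 30 dB, which is why the middle ear's lever action matters so much for air conduction, and why vocal-tract-driven skull vibration, which bypasses that boundary, competes so well when you speak.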

Things are a little more complicated because our vocal tract is not a rigid system, but rather has muscles and soft tissue. This changes the filtering characteristics of the bone conduction pathway by adding more attenuation to the lower frequencies than is typically seen when measuring bone conduction with external sources. Further, our vocal system and hearing systems are linked and there are feedback mechanisms, including the stapedius reflex, that change our perception of our speech.


Speaking as a musician and one-time music teacher, this is not just true of the voice. When an instrumentalist records and plays back a performance, it can be very disconcerting. This is especially true of beginners who often imagine they sound much better than they really do.

I suspect that the action of performing somehow suppresses the ability to listen. Musicians are often exhorted by their teachers to "Listen to what you are playing!", for example:

Listen to what you are playing. Many times, we play through our music and couldn't even begin to say anything about it when we have finished!

Let's Practise: Be a Better Musician, by Susan Whykes

https://goo.gl/mEDQhu

The reason could be as simple as the brain being too occupied with producing a sound to attend to it properly. Only the most accomplished musicians can 'stand aside' from their performance to simultaneously monitor it.


Sony WH-1000XM4 vs Shure AONIC 50: Sound quality

The Shure AONIC 50 amplifies upper-bass notes, which helps fundamental vocal frequencies stand out. This was measured with firmware 0.4.9.

The Shure AONIC 50 have a neutral-leaning frequency response, and the slight boost between 70-300 Hz helps vocals stand out. The small dip between 2-4 kHz is intentional, and it helps reduce unwanted resonances within your ears. The 0.4.9 firmware version made the Shure AONIC 50's frequency response more neutral in the sub-bass region, allowing for a more accurate sound signature.

The flat bass response allows for more clarity in the lows even if they’re not as strong.

The Sony WH-1000XM4 also have a neutral-leaning frequency response, allowing for accurate sound reproduction and generally clear, detailed audio. Just like the Shure AONIC 50, the dip at around 2 kHz is intentional for combating unpleasing harmonic resonances that occur in the ear canal when a seal is formed.

Both the AONIC 50 and the XM4 have a sound signature that can be customized in their respective apps’ EQ settings, but as I mentioned before, you cannot carry over your EQ settings for the Shure AONIC 50 to playback outside of the app. For this reason and our objective scoring, the Sony WH-1000XM4 wins this round, but not by a significant margin.


4pAAa12 – Hearing voices in the high frequencies: What your cell phone isn’t telling you – Brian B. Monson

Ever noticed, or wondered why, people sound different on your cell phone than in person? You might already know the reason: a cell phone doesn't transmit all of the sounds that the human voice creates. Specifically, cell phones don't transmit very low-frequency sounds (below about 300 Hz) or high-frequency sounds (above about 3,400 Hz). The voice can and typically does make sounds at very high frequencies in the "treble" audio range (from about 6,000 Hz up to 20,000 Hz) in the form of vocal overtones and noise from consonants. Your cell phone cuts all of this out, leaving it up to your brain to "fill in" if you need it.

Figure 1. A spectrogram showing acoustical energy up to 20,000 Hz (on a logarithmic axis) created by a male human voice. The current cell phone bandwidth (dotted line) only transmits sounds between about 300 and 3400 Hz. High-frequency energy (HFE) above 6000 Hz (solid line) has information potentially useful to the brain when perceiving singing and speech.
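As an illustration of that band-limiting, here is a minimal sketch of the "telephone effect" using SciPy. The 300-3,400 Hz band is from the article; the filename, filter order, and filter type are illustrative assumptions.

```python
# Simulate narrowband telephone audio: keep only ~300-3400 Hz.
# "voice.wav" is a hypothetical mono 16-bit recording.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfilt

fs, voice = wavfile.read("voice.wav")
voice = voice.astype(np.float64)

# 6th-order Butterworth band-pass as second-order sections.
sos = butter(6, [300, 3400], btype="bandpass", fs=fs, output="sos")
narrowband = sosfilt(sos, voice)

wavfile.write("voice_telephone.wav", fs, narrowband.astype(np.int16))
```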

What are you missing out on? One way to answer this question is to have individuals listen to only the high frequencies and report what they hear. We can do this using conventional signal processing methods: cut out everything below 6,000 Hz, thereby transmitting only sounds above 6,000 Hz to the listener's ear. When we do this, some listeners hear only chirps and whistles, but most normal-hearing listeners report hearing voices in the high frequencies. Strangely, some voices are very easy to make out in the high frequencies, while others are quite difficult; the reason for this difference is not yet clear. You might experience this phenomenon if you listen to the following clips of high frequencies from several different voices. (You'll need a good set of high-fidelity headphones or speakers to ensure you're getting the high frequencies.)
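The "high frequencies only" condition the authors describe can be approximated the same way. Below is a hedged sketch reusing the hypothetical voice.wav; the filter order is again an arbitrary choice.

```python
# Isolate high-frequency energy (HFE): remove everything below ~6 kHz.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfilt

fs, voice = wavfile.read("voice.wav")  # hypothetical input file
sos = butter(8, 6000, btype="highpass", fs=fs, output="sos")
hfe_only = sosfilt(sos, voice.astype(np.float64))
wavfile.write("voice_hfe.wav", fs, hfe_only.astype(np.int16))
```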

Until recently, these treble frequencies were thought to affect only some aspects of voice quality or timbre. If you try playing with the treble knob on your sound system you'll probably notice the change in quality. We now know, however, that it's more than just quality (see Monson et al., 2014). In fact, the high frequencies carry a surprising amount of information about a vocal sound. For example, could you tell the gender of the voices you heard in the examples? Could you tell whether they were talking or singing? Could you tell what they were saying or singing? (Hint: the words are lyrics to a familiar song.) Most of our listeners could accurately report all of these things, even when we added noise to the recordings.

Figure 2. A frequency spectrum (on a linear axis) showing the energy in the high frequencies combined with speech-shaped low-frequency noise.

[Insert noise clip here: MonsonM1singnoise.wav]

What does this all mean? Cell phone and hearing aid technology is now attempting to include transmission of the high frequencies. It is tempting to speculate how inclusion of the high frequencies in cell phones, hearing aids, and even cochlear implants might benefit listeners. Lack of high-frequency information might be why we sometimes experience difficulty understanding someone on our phones, especially when sitting on a noisy bus or at a cocktail party. High frequencies might be of most benefit to children who tend to have better high-frequency hearing than adults. And what about quality? High frequencies certainly play a role in determining voice quality, which means vocalists and sound engineers might want to know the optimal amount of high-frequency energy for the right aesthetic. Some voices naturally produce higher amounts of high-frequency energy, and this might contribute to how well you like that voice. These possibilities give rise to many research questions we hope to pursue in our study of the high frequencies.

Monson, B. B., Hunter, E. J., Lotto, A. J., and Story, B. H. (2014). “The perceptual significance of high-frequency energy in the human voice,” Frontiers in Psychology, 5, 587, doi: 10.3389/fpsyg.2014.00587.



Why we hate the sound of our own voices

As a surgeon who specialises in treating patients with voice problems, I routinely record my patients speaking. For me, these recordings are incredibly valuable. They allow me to track slight changes in their voices from visit to visit, and they help confirm whether surgery or voice therapy led to improvements.

Yet I'm surprised by how difficult these sessions can be for my patients. Many become visibly uncomfortable upon hearing their voice played back to them.


"Do I really sound like that?" they wonder, wincing.

Some become so unsettled they refuse outright to listen to the recording – much less go over the subtle changes I want to highlight.

The discomfort we have over hearing our voices in audio recordings is probably due to a mix of physiology and psychology.

For one, the sound from an audio recording is transmitted differently to your brain than the sound generated when you speak.

When listening to a recording of your voice, the sound travels through the air and into your ears – what's referred to as "air conduction". The sound energy vibrates the ear drum and small ear bones. These bones then transmit the sound vibrations to the cochlea, which stimulates nerve axons that send the auditory signal to the brain.

However, when you speak, the sound from your voice reaches the inner ear in a different way. While some of the sound is transmitted through air conduction, much of the sound is internally conducted directly through your skull bones.

When you hear your own voice when you speak, it's due to a blend of both external and internal conduction, and internal bone conduction appears to boost the lower frequencies.


For this reason, people generally perceive their voice as deeper and richer when they speak. The recorded voice, in comparison, can sound thinner and higher pitched, which many find cringeworthy.
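One way to get a feel for this is to boost the lows of a recording of your own voice and listen back. The sketch below is purely illustrative: the roughly +6 dB boost below ~500 Hz is an arbitrary guess, not a measured bone-conduction transfer function, and voice.wav is again a hypothetical input file.

```python
# Crude illustration: add low-frequency energy to a recording to mimic
# the bass boost bone conduction gives your voice "in your head".
# The ~500 Hz cutoff and ~+6 dB boost are illustrative guesses.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfilt

fs, voice = wavfile.read("voice.wav")  # hypothetical input file
voice = voice.astype(np.float64)

# Adding a low-passed copy to the original approximates a low shelf:
# ~+6 dB below the cutoff, essentially unchanged above it.
lows = sosfilt(butter(2, 500, btype="lowpass", fs=fs, output="sos"), voice)
boosted = voice + lows

# Renormalize to avoid clipping when writing back to 16-bit.
boosted *= np.max(np.abs(voice)) / np.max(np.abs(boosted))
wavfile.write("voice_in_head.wav", fs, boosted.astype(np.int16))
```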

There's a second reason hearing a recording of your voice can be so disconcerting. It really is a new voice – one that exposes a difference between your self-perception and reality.

Because your voice is unique and an important component of self-identity, this mismatch can be jarring. Suddenly you realise other people have been hearing something else all along.

Even though we may actually sound more like our recorded voice to others, I think the reason so many of us squirm upon hearing it is not that the recorded voice is necessarily worse than our perceived voice. Instead, we're simply more used to hearing ourselves sound a certain way.

A study published in 2005 had patients with voice problems rate the quality of their own voices when presented with recordings of them, and had clinicians rate the same voices. The researchers found that patients, across the board, rated the quality of their recorded voices more negatively than the clinicians' objective assessments did.

So if the voice in your head castigates the voice coming out of a recording device, it's probably your inner critic overreacting – and you're judging yourself a bit too harshly.


Neel Bhatt is assistant professor of otolaryngology at UW Medicine, University of Washington

This article is republished from The Conversation under a Creative Commons license. Read the original article.


Bone conduction

Bone conduction is one reason why a person's voice sounds different to them when it is recorded and played back. Because the skull conducts lower frequencies better than air, people perceive their own voices to be lower and fuller than others do, and a recording of one's own voice frequently sounds higher than one expects. [1] [2]

Musicians may use bone conduction via a tuning fork while tuning stringed instruments. After the fork starts vibrating, placing it in the mouth with the stem between the back teeth ensures that one continues to hear the note via bone conduction, leaving both hands free to do the tuning. [3] Ludwig van Beethoven was famously rumored to have used bone conduction after losing most of his hearing, by placing one end of a rod in his mouth and resting the other end on the rim of his piano. [4]

It has also been observed that some animals can perceive sound and even communicate by sending and receiving vibration through bone. [5]

Comparison of hearing sensitivity through bone conduction and directly through the ear canal can aid audiologists in identifying pathologies of the middle ear—the area between the tympanic membrane (ear drum) and the cochlea (inner ear). If hearing is markedly better through bone conduction than through the ear canal (air-bone gap), [6] problems with the ear canal (e.g. ear wax accumulation), the tympanic membrane or ossicles can be suspected. [7]
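The arithmetic behind that comparison is simple. Below is a toy sketch of it; all the thresholds are hypothetical, and the 15 dB cutoff is one commonly cited flag (clinics typically look for gaps of about 10-15 dB or more, alongside other findings).

```python
# Toy air-bone gap calculation: hearing thresholds (dB HL) measured by
# air conduction vs. bone conduction. All numbers are hypothetical.
air_db_hl  = {250: 40, 500: 45, 1000: 40, 2000: 35, 4000: 30}
bone_db_hl = {250: 10, 500: 10, 1000: 15, 2000: 15, 4000: 10}

GAP_CUTOFF_DB = 15  # illustrative; clinics often use 10-15 dB

for freq in sorted(air_db_hl):
    gap = air_db_hl[freq] - bone_db_hl[freq]
    note = "possible conductive problem" if gap >= GAP_CUTOFF_DB else "ok"
    print(f"{freq:>5} Hz: air-bone gap = {gap:>2} dB ({note})")
```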

Some hearing aids employ bone conduction, achieving an effect equivalent to hearing directly by means of the ears. A headset is ergonomically positioned on the temple and cheek and the electromechanical transducer, which converts electric signals into mechanical vibrations, sends sound to the internal ear through the cranial bones. Likewise, a microphone can be used to record spoken sounds via bone conduction. The first description, in 1923, of a bone conduction hearing aid was Hugo Gernsback's "Osophone", [8] which he later elaborated on with his "Phonosone". [9]

After the discovery of osseointegration around 1950 and its application to dentistry around 1965, it was noticed that implanted teeth conducted vibrations to the ear. As a result, bone-anchored hearing aids were developed and implanted from 1977 on.

Bone conduction products are usually categorized into three groups:

  • Ordinary products, such as hands-free headsets or headphones
  • Hearing aids and assistive listening devices
  • Specialized communication products (e.g. for underwater or high-noise environments)

One example of a specialized communication product is a bone conduction speaker that is used by scuba divers. The device is a rubber over-moulded, piezoelectric flexing disc that is approximately 40 millimetres (1.6 in) across and 6 millimetres (0.24 in) thick. A connecting cable is molded into the disc, resulting in a tough, waterproof assembly. In use, the speaker is strapped against one of the dome-shaped bone protrusions behind the ear and the sound, which can be surprisingly clear and crisp, seems to come from inside the user's head. [10]

Because bone conduction headphones transmit sound to the inner ear through the bones of the skull, users can consume audio content while maintaining situational awareness. [11]

The Google Glass device employs bone conduction technology for the relay of information to the user through a transducer that sits beside the user's ear. The use of bone conduction means that any vocal content that is received by the Glass user is nearly inaudible to outsiders. [12]

German broadcaster Sky Deutschland and advertising agency BBDO Germany collaborated on an advertising campaign using bone conduction, which premiered in Cannes, France, at the International Festival of Creativity in June 2013. The "Talking Window" concept uses bone conduction to transmit advertising to public transport passengers who lean their heads against train windows. Academics from Australia's Macquarie University suggested that, apart from not touching the window, passengers would need a dampening device made of material that would not transmit the vibration from the window in order to not hear the sound. [13] [14]

Land Rover BAR employed 'military' bone conduction technology, designed by BAE Systems, within their helmets for use within the 2017 America's Cup. [15] The helmets allowed the crews to communicate effectively with each other under race conditions and within the harsh, noisy environment whilst maintaining situational awareness due to their ears being uncovered. [16]

In March 2019, at the National Maritime Museum, London, British composer Hollie Harding premiered the use of bone conduction headphones as part of a musical performance. [17] The technology allowed the audience to listen to a pre-recorded musical track on the headsets while a live orchestra performed a separate but related track. This multilayered effect meant that electronic and digitally edited sounds could be heard in conjunction with live music without the use of loudspeakers for the first time, and that the source of sounds could appear to be close to, far from, or all around the listener.

Bone conduction technology has found many other uses in the corporate world, particularly among technology companies trying to cultivate a collaborative team environment. It allows computer programmers and development teams to communicate and collaborate in person while still listening to music or audiobooks, or joining teleconference calls.


The real reason the sound of your own voice makes you cringe

Most of us have shuddered on hearing the sound of our own voice. In fact, not liking the sound of your own voice is so common that there’s a term for it: voice confrontation.

But why is voice confrontation so frequent, while barely a thought is given to the voices of others?

A common explanation often found in popular media is that because we normally hear our own voice while talking, we receive both sound transferred to our ears externally by air conduction and sound transferred internally through our bones. This bone conduction of sound delivers rich low frequencies that are not included in air-conducted vocal sound. So when you hear your recorded voice without these frequencies, it sounds higher – and different. Basically, the reasoning is that because our recorded voice does not sound how we expect it to, we don’t like it.

Dr Silke Paulmann, a psychologist at the University of Essex, says: "I would speculate that the fact that we sound more high-pitched than we think we should leads us to cringe, as it doesn't meet our internal expectations; our voice plays a massive role in forming our identity and I guess no one likes to realise that you're not really who you think you are."

Indeed, a realisation that we sound more like Mickey Mouse than we care to can lead to disappointment.

Yet some studies have shown that this might only be a partial explanation.

For example, a 2013 study asked participants to rate the attractiveness of different recorded voice samples. When their own voice was secretly mixed in with these samples, participants gave significantly higher ratings to their voice when they did not recognise it as their own.

What's more, a fuller explanation can be found in a series of early studies published years before the plenitude of reports offering the sound-frequency-and-expectancy explanation.

Through their experiments, the late psychologists Phil Holzman and Clyde Rousey concluded in 1966 that voice confrontation arises not only from a difference in expected frequency, but also from a striking revelation of all that your voice conveys. Not only does it sound different than you expect; through what are called "extra-linguistic cues", it also reveals aspects of your personality that you can only fully perceive upon hearing it from a recording. These include your anxiety level, indecision, sadness, anger, and so on.

To quote them, “The disruption and defensive experience are a response to a sudden confrontation with expressive qualities in the voice which the subject had not intended to express and which, until that moment, [s]he was not aware [s]he had expressed.”

Their follow-up study showed that bilinguals who learned a second language after the age of 16 showed more discomfort when hearing their recorded voices in their first language – a fact not easily explained by a lack of bone-conducted sound frequencies.

The complexity of vocal coordination is enormous, and we simply don't have complete, conscious, "online" control of it. Indeed, the larynx contains the highest ratio of nerve to muscle fibres in the human body. Moreover, when hearing a recording, we have none of the control over our speaking that we usually do; it's as though our voices are running wild.

Marc Pell, a neuroscientist at McGill University, specialises in the communication of emotion. He stands by the Holzman and Rousey studies, saying: "when we hear our isolated voice which is disembodied from the rest of our behaviour, we may go through the automatic process of evaluating our own voice in the way we routinely do with other people's voices … I think we then compare our own impressions of the voice to how other people must evaluate us socially, leading many people to be upset or dissatisfied with the way they sound because the impressions formed do not fit with social traits they wish to project."

So, even though we may be surprised by the “Mickey Mouse” quality of what we actually sound like, it is the extralinguistic content of what our voices may reveal that could be more disconcerting. Yet it is unlikely that others are similarly surprised by a high-pitched aspect of your voice, and moreover others probably aren’t making the same evaluations about your voice that you might. We tend not to be critical of other people’s voices, so the chances are you’re the only person thinking about your own.




You’re not necessarily stuck with your voice forever

If you're really disturbed by the sound of your voice, you have options, Birchall says. First, you can see a properly trained voice therapist, who is different from a speech therapist. Voice therapists work with patients to improve their cadence and the rhythms of their pitch with specific exercises, like working on breathing patterns by getting them to blow bubbles through a straw. "It's like physiotherapy, but for the voice," he says.

If voice therapy is unsuccessful, people can seek specialist psychological support. It's also possible to make a person's pitch higher or lower through surgery, which is a common part of gender reassignment surgery.










You hear your own voice differently from others.

We hear our voices all day, from our inner thoughts to all the times we've gabbed with friends and co-workers. So if anyone knows what your voice sounds like, you'd think it'd be you, right? Wrong. You hear your own voice differently due to the way sound travels out of your mouth and back into your ears.

When your friends are listening to you, your voice travels through the air and hits their ear drums, sending vibrations into the inner ear. However, when you're hearing your own voice, the sound is coming from inside your body, which means it's filtered through your ears differently. These vibrations reach your inner ear in two ways: externally, from your mouth through the air, and internally, through the bones in your head and neck. The combination of these two pathways is what creates the unique voice you hear in your head – one that your friends, who only hear the airborne version of your voice, can't hear.

"When we talk, it's like everyone hears the sound through speakers, but we're hearing it through a cave complex inside our own heads," Martin Birchall, professor of laryngology at University College London, told Time. "The sound is going around our sinuses, all the empty spaces in our heads and the middle part of our ears, which changes the way we hear sounds compared to what other people hear."

So, in other words, the bones in your face make it impossible to accurately hear your true voice.