The physics of sound: psychoacoustics


This section deals with the physical and psychoacoustic properties of sound, its impact on the hearing mechanism and the acoustic content of the speech spectrum.

This is one of the recurring themes in the audiometry units of competency that are aligned to the Diploma Course. The units of competency that include this theme are:
HLTAU1A – Conduct screening hearing tests for children
HLTAU2A – Conduct screening hearing tests for adults
HLTAU3A – Conduct hearing assessments
HLTAU4A – Dispense hearing aids

The understanding of how sound is produced and perceived is part of the required knowledge that underpins the development of competence. This knowledge will help you to understand the complexity of hearing impairment and the management of clients with hearing loss.

In your activities and assessments your teacher can reasonably ask you to:
• Describe the physical properties of sound
• Differentiate between the various decibel scales that relate to the hearing mechanism
• Define intensity and loudness
• Define frequency and pitch
• Outline the concepts of binaural advantage and occlusion
• Describe how the different parts of the auditory pathway affect the incoming sound
• Identify the acoustic content of the speech spectrum
• Explain the frequency content of vowels and consonants in relation to the speech spectrum
• Describe the effect of environmental noise on the speech spectrum.

This resource is designed to complement your class or individual learning activities. You should use this resource as a guide to identify areas of learning. This learning guide does not provide enough information on its own for you to gain the knowledge you require.

There is not a required textbook for this topic but there are many that can help you. The following textbooks are all relevant and you may decide to refer to them as you study this topic. There are hundreds of internet sites that you may find useful. The sites that are listed below were accessed in December 2003. As internet sites and the information in them change, you may wish to perform your own search.

Relevant texts
TITLE The Speech Chain – The Physics and Biology of Spoken Language
AUTHOR Denes, P.B. and Pinson, E.N.
PUB DATE 2nd edition, 1993
PUBLISHER W.H. Freeman and Company, New York, New York
ISBN 0716723441

TITLE Bases of Hearing Science
AUTHOR Durrant, J.D. and Lovrinic, J.H.
PUB DATE 3rd edition, 1995
PUBLISHER Lippincott, Williams & Wilkins, Baltimore
ISBN 0683027379

TITLE Clinical Audiology – An Introduction
AUTHOR Stach, B.A.
PUBLISHER Singular Publishing Group Inc, San Diego
ISBN 156593346X

TITLE Handbook of Clinical Audiology
AUTHOR Katz, J. et al.
PUB DATE 4th Edition, 1994
PUBLISHER Williams & Wilkins, Baltimore. Md.
ISBN 0683006207

TITLE Audiology: the Fundamentals.
AUTHOR Bess, F.H. & Humes, L.E.
PUB DATE 2nd Edition, 1995
PUBLISHER Williams & Wilkins, Baltimore. Md.
ISBN 0683006207

TITLE Introduction to Audiology
AUTHOR Martin, F.N.
PUB DATE 6th Edition, 1996
PUBLISHER Allyn & Bacon, USA
ISBN 0205195709

TITLE Audiologists Desk Reference Volume I – Diagnostic Audiology: Principles, Procedures and Protocols
AUTHOR Hall, J.W. & Mueller, G.H.
PUBLISHER Singular Publishing Group, UK
ISBN 156593269

TITLE Basic Principles of Audiology Assessment
AUTHOR Hannely, M.
PUBLISHER Prentice-Hall, USA
ISBN 0205135528

TITLE Audiology
AUTHOR Newby, Hayes.
PUBLISHER Prentice-Hall, New York
ISBN 0130519219

The physical properties of sound

What is sound?
The ear’s anatomy allows us to hear sounds but there must be sound before the ear can perform its function. Normally when we talk about sound we talk about music, noise or speech. These are all sounds but difficult to describe unless you know about their features. How do you describe sound? Would you say “Sound is something that we hear”?
Then, if you were asked “What do we hear?” would you say “We hear sound”? Neither of these comments is particularly useful or descriptive. To really describe sound we have to look at how sound is produced and what its features are.

This topic is concerned with how sound can be defined and how that relates to the sense of hearing. What must exist before sound can be generated? To generate sound there must be a sound source that vibrates and a medium for the sound to travel through.
The production of sound by vibration

Vibration occurs because a force has caused particles to move. There are many different ways of producing vibrations. For example, to produce a vibration you could: beat a drum; pluck a string; blow across the top of a bottle; tap a glass with a pencil; ring a bell; play a musical instrument; or speak. Sound is generated by a source that vibrates.
This vibrating sound source then moves the molecules (or particles) of the medium. The medium is usually air; there will be more discussion about the medium in the next section. The molecules are pushed into the ones nearby and then pushed back; they rebound, overshoot their original position and then return to their resting place. Each individual molecule does not move far from its original position. NB: Although air molecules are constantly moving in a random way (the phenomenon known as Brownian motion), we can ignore this for the purposes of describing sound.


It is the change in pressure through the molecules that causes the sound wave. Another way of saying this is that it is the energy that is caused by the vibrating sound source that produces a sound.

There are 2 terms you need to learn to describe what is happening to the molecules. These are compression and rarefaction. Occasionally, you will see the term “condensation”. This term is used instead of the term “compression” and means the same thing.
Compression means pushing together. When we talk about sound, we mean that the molecules of the medium are being pushed together at a particular point. As these molecules move away from their resting point to go into the area of compression it leaves an area where there are fewer molecules. This is the area of rarefaction.

Rarefied air is “thin” air. For example, the higher the altitude the thinner the air. However, when we talk about rarefaction in the transmission of sound we are talking about a temporary condition where part of the medium is “thin” until the molecules move back to their resting position.
Therefore, sound waves are a series of compressions and rarefactions that cause a change in pressure. Or you could say that the change in pressure causes a sound wave: the pressure rises to a maximum at compression and then falls to a minimum at rarefaction.


A: a force is applied to the medium which starts it vibrating
B: the molecules are moving closer together in the direction of the force applied
C: the molecules are at their maximum point of compression, ie they are close together, and at maximum pressure
D: the molecules begin to move apart, ie move into rarefaction
E: the molecules have reached their resting point but due to the force they continue to rebound
F: the molecules are rebounding and are in the area of rarefaction
G: the maximum point of rebound has occurred, ie the molecules are at their furthest, this is the point of minimum pressure
H: the molecules move back towards their resting place
I: the molecules are now back to their resting place
A to I: one full cycle has been completed, ie one wavelength. If one second has elapsed, then this sound has a frequency of 1Hz, a very low frequency below the range of human hearing.

Sound cannot exist in a vacuum

Sound needs a medium in which to travel, ie, it needs something to work in.
In the previous section it was stated that sound is produced by the pressure changes occurring in the movement of molecules caused by a vibrating sound source. Another way of saying this is that sound cannot exist unless there are some molecules to transmit the vibration.

Most of the sound we hear is transmitted by air, ie, even though we can’t see them the molecules in air are being compressed and rarefied to produce sound.
Any medium will transmit sound as long as the molecules in it have some elasticity, ie, they are free to move. Therefore you can hear sound under water, in the ground and through the air. It is the elasticity of the medium that allows the molecules to return to their original position.

Remember it is the energy wave that is transmitted and sound is produced by the transmission of energy.

The speed of sound

What do you think affects the speed of sound?
The speed or velocity of sound is affected by temperature, humidity and the density of the medium.

Temperature (in degrees Celsius)	Speed of sound (in metres per second)
0	330
20	343
100	386
The speed of sound in dry air at a temperature of 0 degrees Celsius (which is the same as 32 degrees Fahrenheit and is the freezing point of water) is 330 metres per second (or 1088 feet per second). If the temperature is warmer the speed of sound gets faster. This is because the molecules move faster if they are heated up. Therefore, at 20 degrees Celsius (or 68 degrees Fahrenheit) the velocity of sound is 343 m/sec (1130 ft/sec). At 100 degrees Celsius (or 212 degrees Fahrenheit, the boiling point of water) it travels at 386 metres per second (or 1,266 feet per second).
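The figures above follow the physical rule that the speed of sound in a gas rises with the square root of the absolute temperature. A minimal sketch in Python (the constant 331.3 m/s at 0 degrees Celsius is a commonly quoted reference value, which the text rounds to 330):

```python
import math

def speed_of_sound(celsius):
    """Approximate speed of sound in dry air, in metres per second.

    Speed scales with the square root of absolute temperature;
    331.3 m/s at 0 degrees Celsius is the reference value assumed here.
    """
    return 331.3 * math.sqrt((celsius + 273.15) / 273.15)

print(round(speed_of_sound(0)))   # 331 m/s (the text rounds this to 330)
print(round(speed_of_sound(20)))  # 343 m/s, matching the figure above
```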

Sound travels slightly faster in moist air than in dry air. So sound travels slightly faster as the humidity rises.

The most important factor in determining the speed of sound is the density of the medium. Sound travels faster through water than through air, and faster through the ground than through water. The speed of sound in water is about 1463 metres (4800 feet) per second. This is because the molecules are packed closer together as the medium becomes more dense (or more solid) and therefore have greater elasticity. Air is gaseous and therefore its molecules are further apart. So putting your ear to the ground is not just something they made up for the Westerns; it actually works.

A table in the original resource listed some other approximate speeds of sound (in metres per second) in media such as salt water.

Diffuse and free fields

So far the discussion has been about sound as if it were in a free field. A free field is an area where there are no surfaces or barriers for the sound to interact with. That is, the sound waves do not encounter any obstruction. This is compared to a diffuse field. A diffuse field is where sound encounters surfaces or barriers. Most of the sound we listen to is in a diffuse field.

In the following diagram, the area shown by ‘B’ is a free field and the area shown by ‘A’ is a diffuse field.
Sound waves are affected by what is around them. That is, when sound waves meet a barrier they may be absorbed or reflected by that barrier, or they may go through or around the barrier. Sound waves may also be affected by other sound waves.

The terms we use for these phenomena are transmission, diffraction, reflection, refraction, reverberation and absorption. Sound waves may be affected by one of these or, as is more usual, a combination of them.

Transmission occurs when sound goes through a barrier. It is very common for some sound to be transmitted through a barrier. For example, if you are in a row of offices with thin walls, you will often hear some sounds from the other offices. Occasionally you may hear as much as you would if you were sitting in the other office, but more commonly you will hear only some of the sounds. The sound that is not transmitted through the barrier (ie, the thin walls) is reflected off it or absorbed by it.

The amount of sound that is transmitted and therefore the amount of sound that is reflected depends on how similar the barrier is to the medium in which the sound is travelling.

If the sound meets a barrier that is very different from the one in which it was travelling more sound will be reflected, and conversely, less sound will be transmitted through the barrier. For example, if sound travelling in air meets water a lot of the sound will be reflected off the surface of the water. This is called an impedance mismatch.

It may be a good thing to have an impedance mismatch when you are trying to have a private conversation with your doctor. There are times, however, when you lose too much energy because of the impedance mismatch. Recall that the function of the middle ear is to compensate for the impedance mismatch between the sound energy that causes the eardrum to vibrate and the sound energy that reaches the cochlea. If the middle ear did not perform this function our hearing would not be as sensitive as it is.

Soundproof rooms are built to reduce the amount of sound energy being transmitted from the outside. To achieve total soundproofing is very difficult and in fact it is more correct to say that the rooms we use for hearing testing are sound treated.
Diffraction occurs when the sound waves are scattered. For example, the sound waves could pass through a hole in a wall or go around a filing cabinet. Diffraction is the property of sound that allows it to go around corners. So even though you can’t see around corners you can hear around corners.
Reflection and reverberation relate to the sound waves being “bounced off” a barrier.
Reverberation is where the sound wave continues even though the sound source has stopped and is a result of multiple reflections. An echo is a type of reverberation.
A room with no reflection or reverberation is called an anechoic room. Most people find walking into an anechoic room quite strange, as the sound appears to be “dead”, and prefer a room with some reverberation. The average living room has a reverberation time of about 0.5 seconds, compared to about 1 second for the average bathroom.
Anechoic rooms are very expensive to build so rooms that are used to test hearing are usually lined with acoustically absorbent materials to reduce the effects of reflection and reverberation.
Refraction occurs when the sound waves meet a barrier and the sound wave bends. This changes the wavelength and therefore changes the way the sound will be perceived.
Absorption occurs when the sound wave gets “caught” within a barrier. Materials that are soft, porous and have rough surfaces absorb sound energy better than materials with hard, smooth surfaces. Even air absorbs sound.
So if you want to create a quiet room you can use soft furnishings to absorb sound. Cloth covered chairs and carpet will help absorb sound.

NB: High frequencies are absorbed more than low frequencies (see topic 4.4 for information about frequencies).

The Decibel

The decibel is the term we use when talking about the intensity and loudness of sound.
This and the next topic, which deals with intensity and loudness, need to be looked at together. We can’t discuss the decibel without referring to intensity and loudness and we can’t discuss intensity and loudness without referring to the decibel.

What is the decibel (dB)?
The decibel describes how intense or loud a sound is. It is a ratio on a logarithmic scale.
The decibel is a ratio. In other words, a decibel is not an absolute intensity; it expresses how much more intense one sound is than another. This means that every time you talk about the decibel you have to state the reference point of the ratio. Decibels are stated in terms of their reference point, eg, dBSPL.
Decibels are logarithmic to make it easier to deal with very large numbers AND because the ear responds to increases in intensity/loudness in a logarithmic fashion.

The logarithms used in dBSPL are base 10 (a logarithm can have any base, but sound measurement uses base 10). An easy way of working with base 10 logarithms of round numbers is to count the zeros, eg, the base 10 logarithm of 1000 is 3.

Humans with normal hearing can tolerate an extremely large range of sound intensities. The loudest tolerable sound is about 100,000,000,000,000 times more intense than the softest sound able to be detected. Expressed logarithmically, this ratio is 140dB. You can see it is much easier to write 140 than 100,000,000,000,000.
A hundredfold increase in sound intensity is 20dB of gain, a sound 1,000 times more intense is 30dB of gain, and so on. A sound that is twice as intense as another is 3dB greater, eg, 60dB + 60dB = 63dB.
Whenever the decibel is used in relation to a level of pressure it is that level in comparison to a reference point. That is, 0dB does not mean there is no sound or that there is no pressure change caused by the vibration of a sound source, it means that the level of pressure is the same amount of pressure as the reference point.
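The arithmetic behind these figures is the decibel formula: dB = 10 × log10(intensity ratio). A short sketch in Python:

```python
import math

def intensity_ratio_to_db(ratio):
    """Convert a ratio of two sound intensities into decibels."""
    return 10 * math.log10(ratio)

print(intensity_ratio_to_db(100))   # 20.0 dB for a hundredfold increase
print(intensity_ratio_to_db(1e14))  # 140.0 dB: the full range of hearing

# Two equal sounds combine to double the intensity, which adds about 3 dB:
print(round(60 + intensity_ratio_to_db(2), 1))  # 63.0 (60 dB + 60 dB = 63 dB)
```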

Decibel Sound Pressure Level (dBSPL)
When a decibel is measured, the measure corresponds to an amount of pressure in relation to another point of pressure. By stating the decibel in Sound Pressure Level as dBSPL you are referring to a sound in relation to 0dBSPL.

There are many different ways to state the reference point 0dBSPL. All of these are the same amount of pressure and include:
10⁻¹⁶ watts per square centimetre (W/cm²)
0.0002 dynes per square centimetre (dyn/cm²)
2 × 10⁻⁵ newtons per square metre (N/m²)
20 micropascals (μPa)

Decibel Hearing Level or Hearing Threshold Level (dBHL/dBHTL)
When we test an individual’s hearing we relate the measure of intensity needed to dBHTL or dBHL. These are the same and are Decibels Hearing Threshold Level or Decibels Hearing Level. These are in turn related to dBSPL.
Decibels Hearing Threshold Level is the amount of sound in dBSPL that is needed by a majority of young people with no history of ear problems (otologically normal) to just hear a particular frequency (more on frequency in topic 4.4). That level of dBSPL becomes 0dBHTL, which is called audiometric zero.
A sound of 0dBHTL does not mean there is no sound . It is an average level at which sound is just audible to people with normal hearing in ideal listening conditions. The level of dBSPL needed to be just heard by listeners with normal hearing will be affected by the frequency of the sound.

For example at 1000Hz the majority needed 7.5dBSPL to just hear that sound, so at 1000Hz 0dBHTL=7.5dBSPL. At 250Hz the majority needed 26.5dBSPL to just hear the sound. We need more pressure/intensity to hear the lower frequencies and the higher frequencies. In other words, our ears are more sensitive at the mid range. The comparison between dBHTL and dBSPL is as follows:
(A table in the original resource compared dBHTL with the corresponding dBSPL values at each audiometric frequency.)
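Because 0dBHTL simply shifts the dBSPL scale by a frequency-dependent amount, converting between the two scales is addition. A sketch using only the two reference values quoted above (the full tables are published in the audiometric calibration standards):

```python
# dBSPL required for 0 dBHTL at two frequencies, as quoted in the text.
AUDIOMETRIC_ZERO_SPL = {250: 26.5, 1000: 7.5}

def dbhl_to_dbspl(level_dbhl, frequency_hz):
    """Convert a hearing level to the equivalent sound pressure level."""
    return level_dbhl + AUDIOMETRIC_ZERO_SPL[frequency_hz]

print(dbhl_to_dbspl(0, 1000))  # 7.5 -- audiometric zero at 1 kHz
print(dbhl_to_dbspl(40, 250))  # 66.5 -- a 40 dBHTL tone at 250 Hz
```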

Decibel Sensation Level (dBSL)
If you see a decibel value noted as SL, it means sensation level: how many dB above that listener's threshold the sound is.
For example,
You want to compare responses of 2 people to uncomfortable levels of sound:
Person A can just hear a sound at 40 dBSPL but at 110 dBSPL it is uncomfortably loud.
Person B can just hear a sound at 90 dBSPL that is uncomfortably loud at 130 dBSPL.

Comparing 110dBSPL and 130dBSPL directly is not meaningful because of the different levels at which the 2 people can just hear a sound. Because their thresholds differ, it is better to say that uncomfortable levels occurred at 70dBSL and 40dBSL respectively. So you know that even though one person does not hear as well as the other, that person also cannot tolerate as great an increase in sound.
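The sensation-level figures in this example are simple subtractions of each listener's own threshold:

```python
def sensation_level(presentation_db, threshold_db):
    """dBSL: how far a sound is above that listener's own threshold."""
    return presentation_db - threshold_db

print(sensation_level(110, 40))  # 70 dBSL for person A
print(sensation_level(130, 90))  # 40 dBSL for person B
```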

Measuring Decibels

Sound Level Meters (SLM) measure the intensity of sound in decibels. Most SLMs will measure intensity using different settings, eg, dBSPL, dBA. Many SLMs will perform extremely complicated measures. If you have access to an SLM, have a look at it and its manual, to see the range of procedures it can perform.
One of the purposes of a SLM is to measure ambient noise, ie background noise. This is particularly useful when deciding whether a particular room can be used for testing. You should always check a room when you are testing away from your permanent site.

Decibel measurement with weightings (dBA, dBB, dBC)
The original reason for using decibel weighting was to approximate the response of the ear to sound at different levels.

The ear responds to sounds in a different way depending on how intense they are. Therefore, because the ear responds differently to different levels of sound, different decibel weighting was used.

The original purpose for setting these weightings was to use the A weighting for levels below 55dB, the B weighting for levels between 55 and 85dB and the C weighting for levels above 85dB.

It is now more common to use weightings for specific purposes. The A weighting is most commonly used in work related to noisy working conditions and industrial audiology. The B and C weightings are rarely used.

You can find a pictorial representation of the relationship between dBA, dBB, dBC in most introductory audiology textbooks. You will see the main difference is in the low frequencies. That is, if the dBC filter is used to measure louder sound, more of the low frequencies will be included.

Intensity and loudness

Intensity is related to the objective measurement of how intense a sound is and loudness is the subjective experience of the perception of how loud a sound is.
Intensity relates to how far the molecules move away from their original position; the further they move, the greater the intensity. This is the amplitude of the sound. A more intense sound is louder because the molecules move with greater force and therefore strike the eardrum with greater force.


Sound wave ‘A’ is louder than sound wave ‘B’. The amplitude is greater for sound wave ‘A’ than it is for sound wave ‘B’, because the molecules have moved further from their resting place.

The decibel relates intensity to the subjective experience of loudness.
The ear responds to a wide range of sounds. Normal conversational speech occurs at about 65dBSPL. Sounds become physically painful at about 140dBSPL; this is true for the great majority of people, even when they have a hearing loss.
The perception of loudness is not just related to its decibel level. It is also affected by the duration of the sound. The sound is judged as being louder if its duration is longer.
The frequency of the sound will also affect the perception of loudness as the ear is more sensitive at some frequencies. You will need to obtain a copy of a diagram showing equal loudness curves. You should be able to find a diagram of these curves in most introductory audiology textbooks.
Equal loudness curves are also called phon curves. Listeners are asked to indicate when a sound is equally loud to a reference sound at 1kHz. For example, against a 1kHz reference at 60dBSPL, a 100Hz sound is judged to be equally loud at about 71dBSPL.

You must understand the relationship between loudness and the way the ear responds to it differently so that later on you will be able to understand some of the issues to do with rehabilitation of the hearing impaired.

If you are listening in a diffuse field, which you normally do, the loudness of the sound may be affected by reflection, refraction and reverberation. The sound may be louder or softer.

Sounds may be affected by other sounds. For example, by interference. Interference can also affect the intensity and loudness of a sound. Interference occurs when sound waves cross other sound waves. Depending on the property of the waves the sound will be affected in various ways. Waves interact so that when combined they can make a louder sound, a softer sound, a different quality of sound or no sound at all.

When a sound wave is in phase with another sound they move in the same time frame. That is, they reach the peaks and dips at the same time.
Waves may be out of phase. That is, one wave will reach its peak and the other wave will reach its trough at different times. If waves are 180° (ie, 180 degrees) out of phase they are exactly opposite. That is, one wave will reach its peak and the other wave will reach its trough at exactly the same time.
Waves that are in phase make a more intense sound that we hear as being louder. When two waves have the same intensity but are 180° out of phase, they cancel each other so there is no sound at all. That is, you would hear nothing; this is sometimes called a dead spot (the pattern of fixed cancellation points produced by reflected waves is a standing wave). If the waves are slightly different then you would hear a beat, or pulsations of different loudness.

That is, if the interaction is constructive (adding together) there is an increase in amplitude but if the interaction is destructive (taking away from each other) the amplitude is decreased.


In this diagram waveform 1 and waveform 2 are in phase at points A and C, but are 180° out of phase at B and D. Waveform 3 shows the point-by-point addition of waveforms 1 and 2.
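The point-by-point addition described above can be sketched numerically with two unit-amplitude sine waves (a hypothetical 100 Hz pair, sampled over one cycle):

```python
import math

def sine(frequency_hz, phase_deg, t):
    """Sample a unit-amplitude sine wave at time t (in seconds)."""
    return math.sin(2 * math.pi * frequency_hz * t + math.radians(phase_deg))

times = [i / 1000.0 for i in range(10)]  # one cycle of a 100 Hz wave
in_phase = [sine(100, 0, t) + sine(100, 0, t) for t in times]
out_of_phase = [sine(100, 0, t) + sine(100, 180, t) for t in times]

print(max(in_phase) > 1.8)                       # True: constructive, louder
print(max(abs(x) for x in out_of_phase) < 1e-9)  # True: complete cancellation
```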

The effect of distance on sound

If you are trying to listen to something interesting on TV and couldn’t turn the volume up, would you sit closer or further away?
You would sit closer. This is because sound becomes softer with distance.

The way distance affects sound is described by the inverse square law.
If you were listening in a free field the inverse square law would apply. That is, as the distance from the sound source is doubled, the sound intensity falls to a quarter and the sound pressure is halved.
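In decibel terms, halving the sound pressure per doubling of distance is a drop of about 6dB. A minimal sketch:

```python
import math

def free_field_drop_db(distance_ratio):
    """Drop in sound pressure level when moving away from a source in a
    free field: 20 * log10(far distance / near distance)."""
    return 20 * math.log10(distance_ratio)

print(round(free_field_drop_db(2), 1))  # 6.0 dB per doubling of distance
print(round(free_field_drop_db(4), 1))  # 12.0 dB at four times the distance
```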

Frequency and pitch

Frequency is the physical or acoustic property of sound and pitch is its psychoacoustic parallel.
Frequency means how often something has happened. When talking about sound, frequency is the number of times the molecules move from their position of rest and return to it in one second. That is, the number of completed cycles of compression and rarefaction in one second.

The description of frequency is made in relation to cycles per second, or hertz, in honour of Heinrich Hertz, a nineteenth-century German physicist. The higher the number of cycles per second, or hertz, the higher the frequency of the sound. The abbreviation for hertz is Hz. Another abbreviation used is kHz for kilohertz, ie, 1kHz = 1000Hz.
The wavelength of the sound is related to the frequency. The shorter the wavelength the higher the frequency.


Wavelength a is shorter than wavelength b.
Sound A is a higher frequency than sound B.
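Wavelength and frequency are linked through the speed of sound: wavelength = speed ÷ frequency. A sketch assuming air at about 20 degrees Celsius (343 m/s):

```python
def wavelength_m(frequency_hz, speed_m_per_s=343.0):
    """Wavelength in metres: speed of sound divided by frequency."""
    return speed_m_per_s / frequency_hz

print(round(wavelength_m(250), 2))   # 1.37 m for a low audiometric frequency
print(round(wavelength_m(8000), 3))  # 0.043 m for a high audiometric frequency
```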

We often talk about frequency in relation to Middle C. You would think this was fairly straightforward, but there are in fact 3 figures associated with Middle C. On the concert scale the frequency of Middle C is 261.6Hz; for scientific purposes it is 256Hz; and for audiometry it is rounded down to 250Hz.


If the frequency is doubled it is subjectively perceived as a one octave increase in pitch.
In audiometry the octave frequencies are 125Hz, 250Hz, 500Hz, 1000Hz, 2000Hz, 4000Hz, 8000Hz. These can also be written as .125kHz, .25kHz, .5kHz, 1kHz, 2kHz, 4kHz, 8kHz.
Occasionally in audiometry you will see a reference to half octave frequencies. These are 750Hz, 1500Hz, 3000Hz and 6000Hz.

Ultrasonic frequencies are beyond the range of frequencies detectable by humans, ie above 20kHz. Infrasonic frequencies are those below the range of frequencies detectable by humans, ie below 20Hz. Ultra-high frequencies are those between 8kHz and 20kHz.

Pure tones

In audiometric testing the sounds used are called pure tones. Pure tones are also called sine waves and are sounds of a single frequency. That is, pure tones are one frequency of vibration.
A sine wave is the result of sinusoidal motion or simple harmonic motion, ie simple back and forward movement.
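A pure tone is easy to generate digitally: each sample is just the sine function evaluated at successive instants. A minimal sketch (the sample rate is chosen arbitrarily for illustration):

```python
import math

def pure_tone(frequency_hz, duration_s, sample_rate=8000):
    """Samples of a unit-amplitude sine wave of a single frequency."""
    n = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * frequency_hz * i / sample_rate)
            for i in range(n)]

tone = pure_tone(1000, 0.01)  # 10 milliseconds of a 1 kHz pure tone
print(len(tone))              # 80 samples at 8000 samples per second
```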

Fourier Analysis

All sounds are a combination of sine waves. The technique of breaking a sound down into its component sine waves is called Fourier analysis.
Fourier analysis is a mathematical technique by which any sound, no matter how complex, can be described by the sine waves that interact to produce the complex wave. It was named after the mathematician Baron Jean-Baptiste Joseph Fourier, who described the technique in the nineteenth century.
For example, in the following diagram the sound wave at the top is a complex wave. It can be broken down into the 12 pure tones shown underneath it.
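The reverse process, Fourier synthesis, builds a complex wave by adding sine waves together. A sketch that sums a few odd harmonics (the amplitudes 1, 1/3 and 1/5 are illustrative; this combination is the classic approximation to a square wave):

```python
import math

def complex_wave(harmonic_amplitudes, fundamental_hz, t):
    """Add sine-wave components whose frequencies are whole-number
    multiples (harmonics) of the fundamental frequency."""
    return sum(a * math.sin(2 * math.pi * (k + 1) * fundamental_hz * t)
               for k, a in enumerate(harmonic_amplitudes))

# Odd harmonics of 100 Hz with amplitudes 1, 1/3, 1/5, sampled at one instant:
value = complex_wave([1.0, 0.0, 1 / 3, 0.0, 1 / 5], 100, 0.0025)
print(round(value, 2))  # 0.87 -- the three components sum to 1 - 1/3 + 1/5
```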


Resonant frequency

Resonance is sympathetic vibration. The resonant frequency of an object is the frequency at which it vibrates most easily. That is, it is the natural frequency an oscillating object tends to settle into if it is not disturbed, and conversely it is the frequency that most easily sets an object into vibration. At the resonant frequency the magnitude of the vibration is greatest and decays most slowly.

Doppler effect

The Doppler effect occurs when a sound source moves past a listener. The pitch of the sound seems to change as the source moves past: it seems higher as the source approaches and then lower as it moves away.
This is named after Christian Johann Doppler, who described the effect in 1842.

How is the quality of sound described?
As you have seen sound is described subjectively by pitch and loudness, or objectively by frequency and intensity. However, there is one more aspect to fully describe a sound. This is related to the quality or timbre of the sound. It is not known exactly what affects timbre but one of the aspects is related to harmonics.

If you listened to a number of musical instruments playing the same note equally loudly, they would have the same frequency and intensity but their quality would be totally different. This is because the harmonic structure of the notes is different. Harmonics are the additional components in the wave that vibrate at whole-number multiples of the base (or fundamental) frequency.
If you limit the number of harmonics the sound will be less rich.
Men’s voices are often described as being richer than women’s voices. This is because the fundamental frequency of a man’s voice is generally lower and therefore, there are more harmonics within our range of hearing.
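You can see why a lower fundamental gives more audible harmonics by counting the whole-number multiples that fall below 20kHz (the fundamentals used below are illustrative only; speaking voices vary widely):

```python
def audible_harmonics(fundamental_hz, upper_limit_hz=20000):
    """Count the harmonics (whole-number multiples of the fundamental)
    that fall within the range of human hearing."""
    return upper_limit_hz // fundamental_hz

print(audible_harmonics(125))  # 160 harmonics for a lower-pitched voice
print(audible_harmonics(220))  # 90 harmonics for a higher-pitched voice
```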


Noise is defined as an aperiodic sound because it does not repeat itself and its intensity varies. That is, it is a complex sound made up of many different frequencies that are not harmoniously related; a sound with random amplitude that lacks identifiable pitch.
White noise is the term used for sound that contains all audible frequencies at the same intensity across the frequency range.

Upward spread of masking

The upward spread of masking means that low frequencies mask higher ones, but the reverse is not the case. This is often what happens in a crowd or at a party, and most people have trouble hearing in this type of situation.

Binaural advantage

Binaural means two ears. Binaural advantage relates to the benefit we receive from having two ears. These benefits are better hearing, the ability to locate the direction of the sound and hear better in background noise.
Binaural summation occurs when a sound heard by both ears at the same intensity is perceived as louder. If you compare the detection of sound using one ear to using both ears, there is an intensity benefit of 2 to 3dB binaurally.

Localisation of sound is possible because the ears work together. There are two cues for locating the direction of a sound source. The first relies on interaural time differences. Interaural means between the ears. So this cue relies on the sound reaching one ear before the other. This is the cue needed for low frequency sounds. The second cue relates to the effect of the head. The head casts a sound shadow that affects the high frequencies so the intensity of the sound reaching each ear is slightly different. This is called the interaural intensity difference.

The binaural squelch effect occurs when the two ears operate together in background noise. This benefit of the ears working together means that it is easier to hear in background noise.


The occlusion effect

The occlusion effect occurs when the external ear canal is closed, eg, with a finger. It happens because there is an increase in sound pressure level in the ear canal. The lower the frequency the greater the effect; there is essentially no effect above 1000 Hz.

An activity to help you understand the concepts of binaural advantage and occlusion is as follows:
Obtain a pair of earplugs or use Blu Tack to plug your ears. Insert an earplug carefully into each ear. Don’t push them so far in that you can’t get them out easily!

Speak to yourself. Notice how your voice seems to change. It will sound like it’s all inside your head. This is the occlusion effect.
Now remove one earplug. It doesn’t matter which one.
Create a sound source. For example, turn on the radio. Close your eyes and turn around slowly and take note of where you think the sound is coming from. If you have someone who can help, get them to put the radio in various places around the room and try to guess where the radio is. Of course, you have to keep your eyes shut.
Now take the earplug out and see if it is easier to locate the sound. It should be!

The effect of the auditory pathway on sound

Hearing is important for keeping in contact with what is going on around us, however sound needs to be perceived and interpreted before we can use our sense of hearing. The auditory pathway affects the incoming sound in different ways to maximise perception and interpretation.

Audibility range of frequencies

Sound can be produced at any frequency but this does not mean that we can hear it. It is possible that a sound is generated that is beyond the range of our hearing.
The cochlea is where sound begins to be interpreted. The cochlea responds to sounds tonotopically. That means, the cochlea responds in different locations to different frequencies. This is related to the movement of the basilar membrane in response to the movement of the stapes in the oval window.
The audible range of frequencies for human hearing is 20 to 20,000 Hertz (cycles per second). We talk about this range in relation to young adults with otologically normal ears, that is, they have no hearing loss or ear disease.

What about other animals? Do you know any animals that seem to have better hearing than humans?
Perhaps you said a dog. The audible range of frequencies for a dog is 15 to 50,000 Hertz. That is why there are some sounds that will send dogs into a frenzy that we can’t even hear.

The audible range of hearing for:
the cat is 60 to 65,000 Hertz
the porpoise 150 to 150,000 Hertz
The human ear is most sensitive between 500Hz and 8kHz. However, as we age we lose the ability to hear the very high frequencies.

The ear’s response to the intensity of sound

The ear responds differently to the frequency of the sound depending on the intensity level of the sound.
This is represented by phon curves. These are also called equal loudness contours because the points along each line or contour are judged as being equally loud compared to a reference tone at 1000Hz.
At low levels of intensity, the ear is less sensitive to the very low and very high frequencies.
As the tones become more intense, louder, the ear is more sensitive to the low frequencies.

Head shadow

Low frequency sounds have wavelengths longer than the head and diffract around it, but higher frequencies do not. That is, some sounds do not carry across to the other side of the head because the head is creating a barrier. This is called the head shadow effect. The average human head creates a shadow for sounds of frequencies above 1000Hz: these sounds have wavelengths short enough that they are easily blocked by the head. So, the left ear will not easily hear high frequency sounds coming from the right side of the head.
This effect assists with the localisation of sounds with frequencies above 1000Hz.
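The wavelength argument can be checked with the formula λ = c/f. The sketch below assumes a sound speed of 343 m/s and a head width of about 0.175 m, both assumed typical figures:

```python
SPEED_OF_SOUND = 343   # m/s, assumed
HEAD_DIAMETER = 0.175  # m, an assumed typical head width

def wavelength(frequency_hz):
    """Wavelength in metres: lambda = speed of sound / frequency."""
    return SPEED_OF_SOUND / frequency_hz

# At 250 Hz the wavelength (about 1.37 m) is much larger than the
# head, so the sound diffracts around it. At 4000 Hz the wavelength
# (about 0.09 m) is smaller than the head, so the head casts a shadow.
for f in (250, 1000, 4000):
    print(f, round(wavelength(f), 3))
```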

Body baffle

If a microphone were placed on the body, there would be a boost in the low frequencies caused by the body itself. This body baffle effect is greatest around 500Hz, with a boost of 5 to 10dB.

Localisation of sound

Detection of direction, or the location of a sound source, depends on intensity differences in the high frequencies and time differences in the low frequencies. To be able to localise sounds we need to be able to use both ears.

Ear canal resonance

The natural resonance of the ear canal allows us to hear better at certain frequencies. The external auditory meatus enhances frequencies between 2000 and 5500Hz.
Each person will have their own unique ear canal resonance.

Middle ear advantage

To hear sounds at a reasonable intensity level, the impedance of sound travelling in air must be matched to that of the fluid in the inner ear. The middle ear is arranged so that sound is boosted by about 25dB.
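The approximate 25 dB figure can be reconstructed from two commonly quoted mechanisms: the eardrum's effective area is about 17 times that of the oval window, and the ossicular lever adds a factor of about 1.3. Both numbers are textbook approximations, not exact values:

```python
import math

# A rough sketch of the middle ear's pressure gain, using commonly
# quoted (assumed) figures: an eardrum-to-oval-window area ratio of
# about 17:1 and an ossicular lever ratio of about 1.3:1.
AREA_RATIO = 17.0
LEVER_RATIO = 1.3

pressure_gain = AREA_RATIO * LEVER_RATIO
# Pressure ratios are converted to decibels with 20 * log10.
gain_db = 20 * math.log10(pressure_gain)
print(round(gain_db))  # roughly 27 dB, close to the ~25 dB quoted above
```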


Noise

Sound is important when there is a purpose attached to it. Noise is often described as unwanted sound; this is a subjective description. Noise can also be defined in an objective manner as aperiodic sound. This is a purely physical description of noise and does not relate to the concept of unwanted sound.
Consider the situation where you are trying to study (like now) in the lounge room while the rest of the family are watching TV. The TV and the family’s conversation become noise. If you did not need to concentrate on studying, you would probably enjoy the TV and the conversation.

To account for these types of situations, the terms ‘jammer’ and ‘target’ are sometimes used. A jammer is something that interferes with the target. To go back to the previous example, while you are studying the TV is a jammer, but when you are enjoying a show the TV is a target.
Noise is not just an irritation: it can also damage the ear and affect hearing, depending on the intensity of the sound and the length of exposure.

Lombard effect

When we talk in a situation where there is competing sound (or noise) we raise the level of our voice so that we can monitor our own voice. As the level of competing sound is increased we keep raising the level of our voice.


Perception

Perception is the process by which individuals select, organise and evaluate their sensory impressions in order to give meaning to their environment. It is a three-part process: (1) sensing the stimulus, (2) organising it in the brain, and (3) evaluating it and making sense of it. Therefore, no perception can occur unless the person perceiving the sound is able to understand it. The peripheral hearing mechanism can operate perfectly, but unless the pathways to the brain are intact no perception occurs.

The ultimate organ of hearing is the brain. The brain interprets sound and organises the reaction to it.
This reaction could be an action response, eg,
Mother: “Bridget, come here”, Bridget responds by walking over to her mother.
The reaction could be emotional. For example, you are listening to music and it can make you happy or sad.
The reaction could be verbal, eg, you answer a question.

Sound and speech

We are aware of our surroundings through our senses, primarily hearing and sight. Hearing is often referred to as a distance sense. This is related to the properties of sound that allow it to go through or around barriers.
Hearing keeps us in contact with our environment and with our fellow human beings. We are aware of the siren even when we can’t see the fire engine, the baby crying in another room or the knock at the door. However, the primary purpose of the hearing mechanism is to access spoken communication.
The purpose of this topic is to relate the physical properties of sound to the communication process involved in speech.

Speech is only one mode of communication but is often used synonymously with communication. Speech is the verbal mode of communication. Other modes may involve writing, drawing, gesture or signs. The discussion here will be limited to speech, not because the other modes of communication are less important, but because your major interest as an audiometrist will be with the sense of hearing and how it impacts on the communication process.

Linguistics is the study of the way languages are structured and acquired. Language is made up of sounds put together in characteristic ways that are understood by other speakers of that language. Competent users of a language are often unaware of how they developed the skills involved and therefore may not realise why they are experiencing certain difficulties.

There are many building blocks to communication and each has its area of study. The areas of study include:
  • Phonology: the study of the sound system of a language
  • Phonetics: the study of how speech sounds are produced
  • Morphology: the study of the minimal meaningful units of language
  • Syntax: the study of how words combine into grammatical sentences
  • Semantics: the study of the meaning of the words used in a language
  • Pragmatics: the study of how language is used in context

The acoustic content of the speech spectrum

Speech is made up of sounds that have particular acoustic features. These acoustic features are transmitted to the ear, the ear’s anatomy converts the air-borne pressure waves to neural stimuli that are interpreted in the cortical area of the brain. Once the brain has interpreted these stimuli, communication has occurred.
To understand spoken communication you must, firstly, be able to detect the physical changes in the air molecules. To do this your ears must function.

Do you remember the range of human hearing?

The range of human hearing is 20Hz to 20kHz. This is a greater range than is actually necessary for an adequate understanding of speech. Humans produce frequencies of 85Hz to 11000Hz. The frequencies most important for understanding speech are in the range where the ear is most sensitive, 500Hz to 4kHz. The other frequencies add extra information.

What is the acoustic content of speech?
The short answer to this is speech consists of frequencies between 85Hz and 11000Hz. There is, of course, a little more you need to know.

How is speech produced?
When you look at someone speaking you see them move their mouth. This is just part of the process by which speech is produced.
Speech occurs because the brain sends nerve impulses to the vocal tract, which creates the acoustic waveforms that leave the mouth. Air from the lungs passes through the vocal folds in the larynx and is altered by the throat, nose and mouth in characteristic ways, producing the sounds we perceive.
Often speech is defined in terms of segmental and suprasegmental features.
Segmental features are the sounds that make up the utterance. For example, the word ‘dog’ has 3 segments: ‘d’, ‘o’ and ‘g’.
Suprasegmental features include stress and intonation. These features are those that give us information like whether the utterance is a question, a command or a statement, whether it is a man or a woman speaking, etc.

Can you tell if it is a man, woman or child speaking without seeing them?
Usually you can. Do you know why? This ability is related to fundamental frequency.

Men’s, women’s and children’s voices are different essentially because of the size of the larynx. Men normally have a larger larynx than women and children have the smallest. We perceive men’s voices as being deeper than women’s and children’s voices. Men have a lower fundamental frequency. The vocal folds determine the fundamental frequency and therefore our perception of the pitch of the voice.

The larynx sits at the top of the trachea, the tube in the throat that goes to the lungs. The vocal folds can close off the larynx when they come together. When we speak, the vocal folds open and close to produce sound. The vocal folds are elastic and can be tightened.
This description might suggest that fundamental frequency is static. It is not: changes in the fundamental frequency produce intonation patterns that identify the utterance as a question or a statement, give stress to certain words, and so on.

It is often said that men’s voices are easier to hear. This is probably because of the harmonics generated. Harmonics are integral multiples of the fundamental frequency. For example, if the fundamental frequency is 70Hz, harmonics will occur at 140Hz, 210Hz, 280Hz, 350Hz, etc. This means that if the fundamental frequency is lower then more harmonics will be in our audible range of hearing.
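The harmonic arithmetic is easy to sketch. The fundamental frequencies of 70Hz and 220Hz below are illustrative values for a deep and a higher voice, not measured averages:

```python
def harmonics(f0, limit_hz=8000):
    """Integer multiples of the fundamental frequency f0, up to limit_hz."""
    return [f0 * n for n in range(1, limit_hz // f0 + 1)]

# The example from the text: a 70 Hz fundamental has harmonics at
# 140, 210, 280, 350 Hz, and so on.
print(harmonics(70, 400))  # -> [70, 140, 210, 280, 350]

# A lower fundamental packs more harmonics into the same range:
low_voice = harmonics(70)    # a deep voice (illustrative)
high_voice = harmonics(220)  # a higher voice (illustrative)
print(len(low_voice), len(high_voice))  # 114 vs 36 harmonics below 8000 Hz
```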

Vowels and consonants

Phonemes are the sounds of speech. Phonemes are classified by the way in which they are produced. Consonants and vowels are formed by different shapes of the vocal tract, ie from the larynx to the lips and nostrils.
Consonants are usually thought of as being predominantly high frequency sounds and vowels are essentially low frequency sounds.

Sounds may also be classified as voiced or voiceless. Voiced phonemes are produced with the vocal folds vibrating; for voiceless phonemes, the vocal folds are open but not vibrating. Normally only consonants are specifically referred to as voiced or voiceless, as all vowels are voiced.
You can feel the difference in voiced and voiceless sounds easily. If you produce a ‘z’ sound and put your hand on your throat you will feel the vocal folds vibrating. If you produce an ‘s’ sound and do the same you will not feel the vocal folds vibrating. The ‘z’ sound is voiced and the ‘s’ sound is unvoiced.

Nasals are another category of sounds that are not specified as voiced or unvoiced because they are always voiced. The nasals are ‘m’, ‘n’ and ‘ng’ (the sound at the end of the word ‘sing’). Nasals are low frequency speech sounds.
Nasal sounds go through the nose. That is, the air stream goes through the nose rather than the mouth. When you have a cold and your nose is blocked the sound cannot go through the nose and although it is common to say that people sound nasal when they have a cold, in fact they are denasal.

Other categories of sounds include the fricatives, the plosives and the affricatives.
Fricatives are produced by forcing air through a narrow constriction in the vocal tract, creating turbulent noise. These sounds are ‘f’, ‘v’, ‘th’ (voiced and voiceless), ‘s’, ‘z’ and ‘sh’ (voiced and voiceless).
Plosives stop the flow of air at some point in the vocal tract, so that the acoustic energy is decreased or stopped. They are also referred to as stops. These sounds are ‘p’, ‘t’, ‘k’, ‘b’, ‘d’ and ‘g’.
Affricatives combine a stop and a fricative: the airflow is stopped at some point and then released with turbulence. These sounds occur at the beginning of the words ‘cheat’ and ‘jump’.
This classification might imply that vowels and consonants have the same acoustic content in all situations. This is not true: sounds are affected by the other sounds around them.

Formants are peaks in the spectral energy at particular frequencies. The first three formant frequencies are the most important in the perception of vowels. The perception of consonants relies on formant transitions, ie how the sounds move from the vowels to the position of the consonant.
Although English is one language there are many variants of English. For example, there are native speakers of English in America, Canada, Britain, South Africa, New Zealand and Australia. If you meet people who are native English speakers and you are too, you will usually have no trouble understanding them. You may be able to know from their voice alone where they grew up.

We relate to acoustic input with cultural ears. As language and speech develop, we relate what we hear to the language or languages we are exposed to. All babies, including deaf babies, and no matter which language is spoken around them, begin to babble using similar sounds and will often make sounds that are not in their native tongue. At a certain stage (at about 6 to 9 months) they will start to babble using only the sounds they have heard.

The effect of environmental noise

People often complain that they can't hear well in background noise. You will have noticed it yourself that when you are trying to listen at a party or in some other noisy place you often can't hear everything that is spoken. This is sometimes called the cocktail party effect. There is even a tape available of cocktail party noise. This tape has recordings of many voices talking at the same time.

When you are at a party or in some other noisy place what techniques do you use to understand what is being said?
Think about what you can actually hear?

Sometimes you may hear a lot of suprasegmental information. That is, changes in the pitch and a lot of low frequency sound but not much else. This is related to the upward spread of masking, which is explained later in this topic.
In every language there are redundancies. Redundancy is where there is more information provided to understand communication than is actually needed. That is, you could have understood what was said with less information. Keep reading and you’ll find out more about it.
In a difficult listening environment you may also use speechreading; there is more on this later in this topic.

Listening in quiet

Although people prefer to listen in quiet, they do not like to listen in an anechoic room, which they find quite strange because the sound appears to be “dead”. Most people prefer a room with some reverberation; even a sound treated room appears strange the first time you walk into it. The preference is for a reverberation time of about 0.5 seconds, which is what you would find in an average living room.
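Reverberation time can be estimated with Sabine's formula, RT60 = 0.161 × V / A, where V is the room volume and A the total absorption. The room dimensions and absorption figure below are assumed, illustrative values:

```python
def rt60_sabine(volume_m3, absorption_m2):
    """Sabine's formula: RT60 = 0.161 * V / A, where V is the room
    volume in cubic metres and A the total sound absorption in
    square-metre sabins."""
    return 0.161 * volume_m3 / absorption_m2

# An assumed living room: 5 m x 4 m x 2.4 m, with carpet, curtains
# and soft furnishings providing roughly 15 m^2 of absorption.
rt = rt60_sabine(5 * 4 * 2.4, 15)
print(round(rt, 2))  # about 0.52 s, near the preferred 0.5 seconds
```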

Most people are able to focus on the sound they wish to pay attention to, the target, and ignore the rest, the jammers. This becomes more difficult as the jammers become more intrusive until a point is reached where the jammers totally overwhelm the target.
In other words, most people are able to listen selectively. They can switch off the background noise and concentrate on what they want to hear until the noise becomes overwhelming.

Most people with a hearing loss have difficulty in doing this. This is one of the most common problems mentioned by hearing impaired people.
It is even more difficult when the hearing is asymmetrical. That is, a person’s hearing in one ear is worse than the other.

Upward spread of masking

This is a phenomenon where low frequencies mask higher ones but the reverse does not happen. That is, low frequency sounds interfere with speech understanding more than high frequencies. In addition to this phenomenon is the fact that the high frequency components are more important for the understanding of speech.

The upward spread of masking is often what happens in a crowd or at a party. Most people have trouble hearing in this type of situation. Hearing impaired people have more trouble in background noise partly because the upward spread of masking has a greater effect in a damaged cochlea.


Redundancy

Redundancy in communication allows a message to be predictable. These redundancies may be verbal or non-verbal.
Consider, for example, that someone yelled out something you detected as being “prosexe”. Do you know what this means? Now suppose this word was yelled out to you on a busy street in Athens as you looked right and stepped from the footpath into the busy traffic. What do you think the word means now? It is a Greek word and it means “take care”.

Given the context, the situation you were in, you may have been able to react appropriately to the intention of the communication without actually understanding the word used. In this situation the physical changes in the air molecules were transmitted to your auditory cortex without being understood as language. Communication occurred because you understood the contextual cue, not the actual utterance.

Contextual cues are particularly important in noise. It is possible to understand what is said without hearing each word. This is particularly so where there is high predictability. You may know the meaning of an utterance without actually hearing all parts of the sentence. For example, fill in the missing word: “I live in a block of ............ on the other side of the street”. There are 2 words that fit: units and flats. Either word is acceptable and both words provide the same communicative intention.

If you are in a situation that is familiar with familiar speakers and a familiar topic you will often know what is said without actually hearing what is said. For example, a work colleague says something that you couldn’t hear because a very loud truck went past your window. They give you a piece of paper. You look at the paper and realise they need your signature to authorise a purchase. The communication process proceeds uninterrupted even though your ear was unable to perceive the changes in the pressure of the air molecules.


Speechreading

Speechreading is another cue that is useful when auditory communication is difficult. Lipreading is one aspect of speechreading.

Many of us are unaware that we are using lipreading cues. It is only when the lipreading pattern is not consistent with what our ears are hearing, for example in a foreign movie that has been dubbed, that we realise how much we subconsciously integrate visual and auditory information.

Lipreading is one part of visual perception to do with speech. When we are talking to another person we are also receiving visual information that helps us to understand what is said. We often know the topic of conversation from gestures people use. They may point to what they are talking about and never actually refer to it. For example, “My brother had one just like that only in blue. He never had much use for it so he gave it away. I wish he had given it to me.” This conversation could be about a jacket, a car, a lounge or many other things. We also know a little about the way the person is feeling about what happened. You may be able to see, for example, if the person was angry with their brother.

Therefore, although the term lipreading is more commonly used, speechreading is the more appropriate term.


Review questions

What must exist before sound can be generated?
Describe how sound waves are produced.
Describe the way in which sound is propagated.
Describe compression and rarefaction.
What affects the speed of sound?
Does sound travel faster in water or in air?
Explain the difference between diffuse and free fields.
Define the terms: diffuse field; free field; transmission; diffraction; reverberation; reflection; refraction; absorption.
What is an echo?
What types of materials are good absorbers of sound energy?
What is an anechoic room?
Which has the greater reverberation time, a bathroom or a living room?
Are high frequencies absorbed more than low frequencies?
What is the term decibel used to describe?
Why is the decibel used to describe sound?
What does it mean when we say the decibel is a ratio?
Why are decibels logarithmic?
If a sound has 3dB more gain than another sound, how much louder is it?
What is dBHTL?
What does 0dB mean?
How does dBSPL relate to dBHL?
What is dBSL?
What is an SLM and what is it for?
What weighting is used to measure sound for industrial audiometry?
Is intensity a subjective or an objective term?
Why is one sound louder than another?
What is the level of normal conversational speech?
At what level are sounds painfully loud?
How would you define intensity?
What is amplitude?
What affects the perception of loudness?
What is interference?
What happens when waves are 180 degrees out of phase?
What are standing waves?
How does distance affect sound?
What is the inverse square law?
What are the terms frequency and pitch used for?
How do you define frequency?
How is frequency measured?
How does wavelength relate to frequency?
What does it mean when we talk about sound frequency?
What are pure tones?
How is Middle C described?
What is the frequency range of human hearing?
What are the octave frequencies in audiometry from 250Hz to 8000Hz?
What is the principle behind Fourier analysis?
What is resonant frequency?
What is the Doppler effect?
What are ultrasonic frequencies?
What are infrasonic frequencies?
What are ultra-high frequencies?
What does binaural mean?
What is binaural advantage?
How are we able to localise sound?
What is the binaural squelch effect?
What is the benefit of having two ears that hear symmetrically?
What is binaural summation?
What is the interaural time difference?
What is the interaural intensity difference?
How is the quality of sound described?
Why are men’s voices easier to understand than women’s voices?
What is the physical description of noise?
What is the psychoacoustic description of noise?
What is the upward spread of masking?
What is occlusion?
What is the occlusion effect?
Why does the occlusion effect occur?
How do you know if occlusion is occurring when a person is wearing a hearing aid?
What is the audible range of frequencies for humans?
What effect does the head have on sound?
What is ear canal resonance?
What effect does the middle ear have on sound?
What is the Lombard effect?
What does it mean to say that the cochlea responds to sounds tonotopically?
What do phon curves represent?
What is the head shadow effect?
What is the body baffle effect?
How does the EAM affect the incoming sound?
How does the middle ear affect the incoming sound?
What is a jammer and a target?
What is perception?
Why is the brain the ultimate organ of hearing?
What is the 3-part process of perception?
What is the range of frequencies produced by humans?
Which frequencies are most important for understanding speech?
How can you tell if a sound is voiced or voiceless?
Are vowels voiced or voiceless sounds?
What is the upward spread of masking?
What is redundancy in language?
What is speechreading?
Why is hearing described as our distance sense?
Describe the modes of communication?
At what frequencies is the ear most sensitive?
What is the acoustic content of speech?
What are the segmental features of speech?
What are the suprasegmental features of speech?
What is fundamental frequency?
How can the sounds of speech be described?
How do you know if a sound is voiced?
What role does context play in the communication process?
What are visual cues?