CD v MP3

[Figure: CD, the complete waveform (left); MP3, note the missing high frequencies (right)]

Digital audio has opened a new world of sound and a plethora of choices in the field of recorded music. Never before in human history has recorded music been so omnipresent, with near-instant distribution possible on a global scale. What does all this new freedom of expression and wide range of choice mean for the therapeutic application of sound and music?

Most people, including many musicians, have only a vague idea of how digital music works. Because it is a complex technology based on the mathematics of the sampling theorem, myths still persist about digital music relative to analog recording. In the early days of recording, music was preserved on a wax cylinder, almost like an exact plaster cast of a sound wave. This technology evolved into the disc record, first shellac and later vinyl, that defined popular music from the 1920s to the 1980s. Analog recording was a direct storage and replication of the original sound; on magnetic tape, for example, sound waves were converted to magnetic impulses, like the ripples in sand left by waves on a beach.

Compact disc redefined recorded music in the 1980s. It offered a more portable format and cleaner sound without the noise floor of a needle in a groove or the tape hiss from a cassette. Compact disc required a whole new technology to convert sound from a physical waveform to a pattern of binary code. The mathematical theory that allows sound to be transformed into digits is called sampling theory. By sampling a sound wave often enough, at more than twice the highest frequency it contains, the wave can be stored and replayed in its entirety with no loss of signal. The original issues with CD sound were not so much in the storage medium as in the equipment that converted the sound wave into digits and back into a sound wave. This critical interface was called an analog/digital converter, and early consumer players had inferior conversion that tended to give CD a brittle, overly bright sound when compared to high-quality tape or record player systems. This is where the myth was born that digital sound is a kind of step process that leaves pieces of the sound out of the picture. The truth is, audio recorded at the CD standard of 16 bits at 44.1 kHz provides an exact replication of sound in the frequency range of human hearing, 20 Hz to 20 kHz. Higher sample rates don’t mean more density of sound. They simply raise the upper frequency limit, so that a sample rate of 88.2 kHz would increase the upper frequency range to more than 40 kHz, well beyond the range of human hearing.
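For readers who like to see the arithmetic, here is a minimal sketch of the sample-rate point. It assumes only the standard rule of sampling theory that the highest frequency a recording can hold is half its sample rate; the function name is made up for illustration.

```python
def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent: half the rate."""
    return sample_rate_hz / 2.0


# The CD rate and the doubled rate mentioned in the text.
for rate in (44_100, 88_200):
    limit = nyquist_limit_hz(rate)
    print(f"{rate} Hz sampling -> frequencies up to {limit:.0f} Hz")
```

The CD rate reaches 22,050 Hz, just past the 20 kHz ceiling of hearing, while doubling the rate only moves that ceiling higher. It does not add detail below it.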

The CD has reigned for more than 25 years; billions have been sold. It is a durable and reliable storage medium and one that should not be discarded out of hand simply because we have a new, more convenient option in MP3. While magnetic tape and compact disc work on the principle of preserving an audio event in its entirety, MP3 works on an entirely different concept based on psycho-acoustics. MP3 exploits the unique architecture of the human hearing mechanism by utilizing perceptual coding.

OK, here is where it gets complicated!  Please hang in there with me as I give you an overview.  I am compressing about 2000 pages of audio engineering into the next few paragraphs, so forgive me if the next section is dry and complex.

‘The ear perceives only a portion of the information in an audio signal…A perceptual music codec does not attempt to model the original source…instead, the music signal is tailored according to the receiver, the human ear, using a psycho-acoustic model to identify irrelevant and redundant content in the audio signal.’ (Principles of Digital Audio, p. 319)

In order to understand MP3 we have to look at how we hear and process sound.

Human hearing is a complex system with a bandwidth of around ten octaves (20 Hz to 20 kHz). By comparison, our sense of sight has a bandwidth of one octave. Hearing transforms mechanical pressure waves in air into electrical impulses in the brain. In the process the sound is collated into frequency, volume and directional data, all of which is decoded and given meaning by the left and right hemispheres of the brain. The ears are cross-wired. The right ear feeds the left brain, and the left ear feeds the right brain. The left cerebral hemisphere processes verbal language and rhythm, while the right temporal lobe processes melodic and harmonic information. We are enormously sensitive to subtle differences in the pressure waves that constitute sound moving through our environment. At the same time, our ear is quite tolerant of tonal variances in music, recognizing and adjusting for inaccuracies in tuning of up to 5-7%.

Still with me?  Here’s more!

The range of dynamic sensitivity from a whisper to a loud rock band is in the range of 1 to 1,000,000,000,000. A delicately played note in the top octave of the piano will displace the eardrum by approximately one hundredth the diameter of a hydrogen molecule. (Principles of Digital Audio, p. 317)
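A ratio of one to a trillion is hard to picture, which is why audio engineers use decibels. The sketch below converts the ratio to decibels, assuming the figure refers to sound intensity (power); that reading is my assumption, as is the function name.

```python
import math


def intensity_ratio_to_db(ratio: float) -> float:
    """Convert a sound-intensity (power) ratio to decibels."""
    return 10.0 * math.log10(ratio)


# Whisper-to-rock-band ratio quoted in the text: 1 to 1,000,000,000,000.
print(f"{intensity_ratio_to_db(1e12):.0f} dB")
```

Under that assumption the range works out to about 120 dB.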

The outer ear, known as the pinna, funnels sound waves into the ear canal. The pressure waves displace the eardrum, which converts acoustic energy into mechanical energy. Behind the eardrum are three tiny bones: the hammer, anvil, and stirrup. These bones effectively amplify the sound energy by a factor of 16 in order to convey the signal through the denser liquid medium of the inner ear. Coiled inside the cochlea lies the basilar membrane, which decodes the frequency and loudness of sound, converting the information to electrical impulses that are sent to the brain and processed as neural information. The part of the hearing mechanism that is crucial to the understanding of perceptual coding is the basilar membrane.

The basilar membrane has approximately 30,000 hair cells arrayed in rows along its length. The membrane is wider at one end and narrower at the other, allowing for the sorting and coding of sound into its constituent frequencies.

At lower frequencies… ‘tones a few Hz apart can be distinguished, however at high frequencies they must differ by hundreds of Hz…hair cells respond to the strongest stimulation in their local region; this is known as the critical band…’ (Principles of Digital Audio, p. 320)

Critical bands divide the basilar membrane into strips of approximately 1300 hair cells. ‘Because hair cells tend to vibrate at the frequency of the strongest stimulation, they will convey that frequency in a critical band, ignoring lesser stimulation.’ (Principles of Digital Audio, p. 323) The weaker information within a critical band, drowned out by the strongest signal, is said to be ‘masked’.

So here’s where the rubber meets the road….

Not all frequencies are treated equally by the ear. The basilar membrane is less sensitive at high and low frequencies. To reach the threshold of hearing, a bass note on the piano would have to be approximately 70 dB louder than a note that is 6 octaves higher. Perceptual coding compares the signal to the hearing threshold of the human ear and discards sound that falls below the threshold. It may also place quantization noise, the error introduced by encoding with fewer bits, below this threshold. Masking is the essential phenomenon that allows perceptual coding to deviate from replicating the original signal and jettison significant amounts of information to make digital audio files small enough for rapid downloading with minimal storage space.
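To make the threshold idea concrete, here is a toy sketch of the "discard what falls below the threshold" step. It is not how an MP3 encoder is actually built: a real codec also models masking between neighboring sounds. The threshold curve is Terhardt's widely used approximation of the threshold of hearing in quiet, which I am bringing in as an assumption; it does not come from the book quoted above. The tone list and function names are invented for illustration.

```python
import math


def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate threshold of hearing in dB SPL (Terhardt's formula, assumed here)."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def audible_components(components):
    """Keep only (frequency_hz, level_db) pairs at or above the threshold."""
    return [(f, level) for f, level in components
            if level >= threshold_in_quiet_db(f)]


# Hypothetical tones, invented for illustration: (frequency in Hz, level in dB SPL).
tones = [(50, 30.0), (1000, 20.0), (3300, 0.0), (16000, 40.0)]
print(audible_components(tones))

# Threshold gap between the lowest piano A (27.5 Hz) and the A six octaves up (1760 Hz).
gap_db = threshold_in_quiet_db(27.5) - threshold_in_quiet_db(1760.0)
print(f"{gap_db:.1f} dB")
```

In this toy run the 50 Hz and 16 kHz tones are dropped even though they are louder than the 3.3 kHz tone that survives, because the ear is far less sensitive at the extremes. Under this approximation the piano gap comes out a little over 60 dB, in the same ballpark as the figure quoted above.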

If the issue were merely storage, the MP3 approach would be very practical, but the use of this technology as a primary source of recorded music presents some problems. It assumes that sound is perceived by the ear only and ignores subtle subconscious spatial and harmonic information that is an integral part of the healing response to music. These subtle elements are ephemeral and require a large allocation of bits to encode compared to the relative strength of their signal, but are essential to a deeper resonant response to music. Only by replicating the waveform in its entirety can the finer psycho-acoustic elements be conveyed and experienced. This is in the realm of felt-sense experience. The higher bitrate is important to your body and healing.
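To put the bitrate gap in numbers, here is a rough back-of-envelope sketch. The CD figures follow from 16 bits at 44.1 kHz in stereo. The 128 kbps MP3 rate is my assumption of a common setting, not a figure from the text; higher MP3 rates narrow the gap.

```python
CD_SAMPLE_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2            # stereo
MP3_BITRATE_BPS = 128_000  # assumed common MP3 setting, not from the text

# Bits per second needed to store the complete CD waveform.
cd_bitrate_bps = CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * CD_CHANNELS

print(f"CD:  {cd_bitrate_bps} bits per second")
print(f"MP3: {MP3_BITRATE_BPS} bits per second (assumed)")
print(f"CD carries about {cd_bitrate_bps / MP3_BITRATE_BPS:.0f} times more data")
```

At that assumed setting, roughly ten of every eleven bits on the CD are not carried over to the MP3.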

MP3 or streaming audio reduces music to its surface elements and is a very practical application if music is considered simply a disposable form of entertainment. MP3 is a fine solution in noisy environments such as shops, restaurants or cars, where listening is not critical. In a therapy waiting room, or in open clinics and exercise areas, streaming audio can offer a cost-effective soundtrack solution. In an enclosed clinical situation, more care and attention are needed to provide the very best listening environment. A relaxed listener is open on every level of their being, and music has a far more profound effect in such cases, even though, if programmed correctly (like a good soundtrack to a movie), it may be barely noticed by the conscious mind.

If you stayed with me to this point, I thank you for your long attention span and award you an A in Healing Music 101!!