kmingk

Just Me.

Audio encoding for BBB

Background

Basic PCM (Pulse-Code Modulation) is the digital representation of a sampled analog signal. For audio, this raw data is typically stored in a WAV file, which holds the samples without any compression. The signal takes the form shown below:

Pulse-Code Modulation Waveform for 4-bit data. 
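
As a rough illustration of what the 4-bit waveform represents, here is a minimal Python sketch that samples one period of a sine wave and quantizes each sample to a 4-bit code. The function name `quantize_pcm` and the choice of 16 samples per period are assumptions made for this example, not something taken from the figure.

```python
import math

def quantize_pcm(num_samples=16, bits=4):
    """Sample one sine period and quantize each sample to `bits` bits."""
    levels = 2 ** bits  # 16 levels for 4-bit PCM
    codes = []
    for n in range(num_samples):
        x = math.sin(2 * math.pi * n / num_samples)  # analog value in [-1, 1]
        # Map [-1, 1] onto the integer codes 0 .. levels-1
        code = min(int((x + 1.0) / 2.0 * levels), levels - 1)
        codes.append(code)
    return codes

if __name__ == "__main__":
    for n, code in enumerate(quantize_pcm()):
        print(f"sample {n:2d}: {code:2d} ({code:04b})")
```

Each printed code is what a raw PCM file would store for that sample; a DAC turns the codes back into voltage steps.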

These raw signals are easy to handle, since no additional processing is needed before sending them to a DAC. However, raw data files are extremely space-inefficient. This website discusses in detail what a normal-sized audio file would typically cost in terms of size for a particular sample rate and bit depth. It shows that storing the raw audio of a song would take ~30 Mbytes, while an equivalent MP3 would be about 2.82 Mbytes.

For speech signals, the sample rate is typically 8 kHz (2x the speech bandwidth). This is lower than the sample rate used for songs: because the full human hearing range extends up to 20 kHz, music is sampled at 44.1 kHz or more. Consider storing a 3-minute speech track as a raw WAV file without compression. At 16 bits per sample, it will have the following size.

Size
  = (#bits/sample) * (sample freq) * (length in seconds)
  = 16 * 8000 * (3 * 60)
  = 23,040,000 bits
  = 2.88 Mbytes
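
The same arithmetic can be written as a small Python helper. This is only a sketch: the function name `raw_pcm_size_bytes` and the `channels` parameter are assumptions added for illustration (the calculation above is for a single channel).

```python
def raw_pcm_size_bytes(bits_per_sample, sample_rate_hz, duration_s, channels=1):
    """Size of uncompressed PCM audio in bytes (file header not included)."""
    total_bits = bits_per_sample * sample_rate_hz * duration_s * channels
    return total_bits // 8

if __name__ == "__main__":
    # 3-minute speech track: 16 bits per sample at 8 kHz, mono
    print(raw_pcm_size_bytes(16, 8000, 3 * 60) / 1e6, "Mbytes")
    # 3-minute song: 16 bits per sample at 44.1 kHz, stereo
    print(raw_pcm_size_bytes(16, 44100, 3 * 60, channels=2) / 1e6, "Mbytes")
```

With the song parameters assumed here, the result is about 31.75 Mbytes, which is in line with the ~30 Mbytes figure quoted earlier.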

 

Although this seems small, encoding the same file as an MP3 saves a lot more space.

Mu-Law Encoding.  

There are multiple methods to encode audio; one that is very simple to implement is mu-law. This type of encoding takes advantage of the non-linear response of the human ear, which resolves fine differences in quiet sounds but not in loud ones. For example, someone listening to a rock concert won't be able to hear the whisper of the person next to them.

Utilizing this non-linearity, the mu-law algorithm discards the less significant bits of each sample, and discards more of them the louder the sample is. This article discusses mu-law in more detail. Using mu-law, a sample that would typically require 9 to 16 bits (2 bytes) now requires 8 bits (1 byte). For 16-bit audio, this reduces the size of the file by 50%.
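
To get a feel for the non-linearity, the sketch below evaluates the continuous mu-law compression curve, F(x) = sgn(x) * ln(1 + mu*|x|) / ln(1 + mu), with mu = 255. This formula is the standard textbook definition rather than something taken from this post; the bit-level steps later in the post are a piecewise approximation of it. The function name `mulaw_compress` is an assumption for this example.

```python
import math

MU = 255  # value used for 8-bit mu-law

def mulaw_compress(x):
    """Map a sample x in [-1, 1] to [-1, 1] along the continuous mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

if __name__ == "__main__":
    for x in (0.001, 0.01, 0.1, 0.5, 1.0):
        print(f"input {x:5.3f} -> output {mulaw_compress(x):.3f}")
```

A sample at 1% of full scale already maps to roughly 23% of the output range, so quiet sounds get a much larger share of the 8-bit codes than loud ones.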

Implementation Algorithm

The following are the steps to perform mu-law encoding. Implementation details will be added at a later time.

  1. Save Sign Bit
  2. Clip value to ensure no overflow when bias is added
  3. Add a bias to ensure there is a 1 within the Exponent Region (the 8 bits immediately to the right of the sign bit)
  4. Determine the position of the most significant "1" within the Exponent Region and encode it in 3 bits (most significant = position 7, least significant = position 0)
  5. Determine the Mantissa Region (the 4 bits immediately to the right of that most significant "1")
  6. Form the 8-bit encoding SPPPMMMM (S = sign bit, P = binary encoding of the position of the most significant "1", M = mantissa)
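
The steps above can be sketched in Python as follows. This is a minimal illustration, not a finished implementation: the function name `mulaw_encode` is hypothetical, the input is assumed to be a signed 16-bit sample, the bias is assumed to be 132, and the clip limit is assumed to be 2^15 - 1 - 132 = 32635 so that the biased magnitude still fits in 15 bits. Some mu-law variants also invert the output bits; that is not done here because the steps above do not mention it.

```python
BIAS = 132     # assumed bias; guarantees a 1 inside the exponent region
CLIP = 32635   # assumed limit: 2**15 - 1 - BIAS

def mulaw_encode(sample):
    """Encode a signed 16-bit PCM sample as an 8-bit SPPPMMMM code."""
    # 1. Save the sign bit
    sign = 0x80 if sample < 0 else 0x00
    # 2. Clip the magnitude so adding the bias cannot overflow 15 bits
    magnitude = min(abs(sample), CLIP)
    # 3. Add the bias
    magnitude += BIAS
    # 4. Position (0..7) of the most significant 1 in bits 14..7
    exponent_region = (magnitude >> 7) & 0xFF
    position = exponent_region.bit_length() - 1
    # 5. Mantissa: the 4 bits just to the right of that 1
    mantissa = (magnitude >> (position + 3)) & 0x0F
    # 6. Pack as S PPP MMMM
    return sign | (position << 4) | mantissa

if __name__ == "__main__":
    print(f"{mulaw_encode(11931):08b}")
```

Because the bias always sets at least bit 7, `exponent_region` is never zero and `position` is always in the range 0 to 7.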

Example: 11931 (0010 1110 1001 1011)

  1. Save Sign Bit: (S) = 0
  2. Clip Amplitude: Clip to 2^(N-1) - 1 - Bias = 32767 - 132 = 32635 (No need to clip in this case)
  3. Add Bias: 11931 + 132 = 12063 (0010 1111 0001 1111)
  4. Determine Exp Region: the 8 bits after the sign bit of 0010 1111 0001 1111 are 0101 1110 --> Most significant 1 is in position 6 (110) of Exp Region
  5. Determine Mantissa Region: the 4 bits to the right of that most significant 1 --> Mantissa = 0111
  6. Mu-Law Encoding Value: SPPPMMMM = 01100111
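
To see how much precision the encoding gives up, here is a hedged sketch of a matching decoder. Decoding is not covered in this post, so everything below is an assumption: the function name `mulaw_decode`, the SPPPMMMM layout without bit inversion, a bias of 132, and the half-step offset (the 0x04 inside 0x84) that places the result in the middle of the quantization interval.

```python
BIAS = 132  # 0x84: restores the leading 1 (0x80) plus half a mantissa step (0x04)

def mulaw_decode(code):
    """Decode an 8-bit SPPPMMMM code back to an approximate 16-bit sample."""
    negative = bool(code & 0x80)
    position = (code >> 4) & 0x07
    mantissa = code & 0x0F
    # Rebuild the leading 1, the mantissa and a half-step offset, then shift
    # them back to where the encoder found them and remove the bias.
    magnitude = (((mantissa << 3) + BIAS) << position) - BIAS
    return -magnitude if negative else magnitude

if __name__ == "__main__":
    print(mulaw_decode(0b01100111))
```

Under these assumptions, the code 01100111 from the example decodes to 11900, which is 31 away from the original value of 11931.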

 

tags: Audio Encoding
Thursday 11.07.13
Posted by Kei-Ming Kwong