Background
Basic PCM (Pulse Code Modulation) is the digital representation of a sampled analog signal. For audio purposes, this raw data is typically stored in a WAV file, which holds the samples without any compression techniques. This is in the form as shown below:
These raw signals are very easy to manage, since no additional processing needs to be done to output them to a DAC. However, the raw data files are extremely space-inefficient. This website discusses in detail how large a normal audio file would typically be for a particular sample/bit rate. It shows that storing the raw audio of a song would take ~30 Mbytes, while an equivalent MP3 would be about 2.82 Mbytes.
For speech signals, the typical sample rate is 8 kHz (2x the speech bandwidth). This is lower than the sample rate for songs, which is 44.1 kHz or more because the full human hearing range goes up to 20 kHz. Consider a 3 minute speech track stored as a raw WAV file with 16 bits per sample and no compression; it will have the following size.
Size
= (#bits/sample) * (sample freq) * (length in seconds)
= 16 * 8000 * (3 * 60)
= 23,040,000 bits
= 2.88 Mbytes
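The same arithmetic can be written as a small helper. This is only a sketch: the function name `raw_pcm_size_bytes` and the `channels` parameter are assumptions added for illustration, and the WAV header is ignored.

```python
def raw_pcm_size_bytes(bits_per_sample, sample_rate_hz, duration_s, channels=1):
    """Size in bytes of uncompressed PCM sample data (file header not counted)."""
    total_bits = bits_per_sample * sample_rate_hz * duration_s * channels
    return total_bits // 8  # 8 bits per byte


# 3 minute speech track, 16 bits per sample, 8 kHz, mono
speech_bytes = raw_pcm_size_bytes(16, 8000, 3 * 60)
print(speech_bytes)        # 2880000
print(speech_bytes / 1e6)  # 2.88 (Mbytes, taking 1 Mbyte = 10**6 bytes)
```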
Although this seems small, encoding the same audio as an MP3 would save a lot more space.
Mu-Law Encoding
There are multiple methods to encode audio; one that is very simple to implement is mu-law. This type of encoding takes advantage of the non-linear (roughly logarithmic) response of the human ear: small differences in amplitude are much harder to perceive when the signal is loud. For example, someone listening to a rock concert won't be able to hear the whisper of the person next to them.
Utilizing this non-linearity, the mu-law algorithm disregards the less significant bits of each sample, dropping more of them as the amplitude grows. This article discusses mu-law in a more detailed fashion. Using mu-law, a sample which would typically require 9 to 16 bits (2 bytes per sample) would now require 8 bits (1 byte per sample). This effectively reduces the size of the file by 50%.
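The non-linearity is usually described by the continuous mu-law compression curve, F(x) = sgn(x) * ln(1 + mu*|x|) / ln(1 + mu), which the bit-level procedure approximates. The sketch below assumes mu = 255 (the value normally paired with 8 bit mu-law) and an input normalized to [-1, 1]; the function name is illustrative only.

```python
import math

MU = 255  # assumed value of mu for 8 bit mu-law


def mu_law_compress(x, mu=MU):
    """Map a normalized sample x in [-1, 1] onto [-1, 1] logarithmically."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


# Quiet inputs are spread over a large share of the output range,
# loud inputs are squeezed together near the top.
for x in (0.01, 0.1, 0.5, 1.0):
    print(f"{x:4.2f} -> {mu_law_compress(x):.3f}")
```

An input at 1% of full scale already lands above 20% of the output range, which is why quiet sounds keep most of their resolution after encoding.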
Implementation Algorithm
The following are the steps to perform mu-law encoding. Implementation details will be added at a later time.
- Save Sign Bit
- Clip value to ensure no overflow when bias is added
- Add a bias to ensure there is a 1 within the Exponent Region (most significant 8 bits to the right of the sign bit)
- Determine the Mantissa Region (The next most significant 4 bits to the right of the most significant "1" within the exponent region)
- Encode the position of the most significant "1" within the exponent region using 3 bits (Most significant = position 7, least significant = position 0)
- Assemble the 8 bit encoding SPPPMMMM (S = Sign Bit, P = binary encoding of the position of the most significant 1, M = Mantissa)
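The steps above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name is hypothetical, the bias of 132 is an assumed value, and the clip limit is assumed to be 32767 - 132 = 32635 so that the biased magnitude still fits in the 15 bits to the right of the sign bit. Standard G.711 mu-law additionally inverts all bits of the result, which is not done here because the steps above do not include it.

```python
BIAS = 132    # assumed bias value
CLIP = 32635  # assumed limit: 32767 - BIAS, so adding the bias cannot overflow


def mu_law_encode(sample):
    """Encode one signed 16 bit PCM sample as an 8 bit SPPPMMMM value."""
    sign = 1 if sample < 0 else 0        # save sign bit
    magnitude = min(abs(sample), CLIP)   # clip
    magnitude += BIAS                    # add bias: guarantees a 1 in bits 14..7
    # Exponent region is bits 14..7; find the position (0..7) of its leading 1.
    position = (magnitude >> 7).bit_length() - 1
    # Mantissa is the 4 bits directly to the right of that leading 1.
    mantissa = (magnitude >> (position + 3)) & 0x0F
    return (sign << 7) | (position << 4) | mantissa


print(format(mu_law_encode(11931), "08b"))  # 01100111
```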
Example 11931 (0010 1110 1001 1011)
- Save Sign Bit: (S) = 0
- Clip Amplitude: Clip to 2^(N-1) - 1 - Bias = 32767 - 132 = 32635 (No need to clip in this case)
- Add Bias: 11931 + 132 = 12063 (0010 1111 0001 1111)
- Determine Exp Region: 0010 1111 0001 1111 --> Exp Region = 0101 1110, most significant 1 is in position 6 (110)
- Determine Mantissa Region: the 4 bits after the most significant 1 in 0101 1110 --> Mantissa = 0111
- Mu-Law Encoding Value: SPPPMMMM = 01100111
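The example can be traced bit by bit with string slicing, which mirrors how the regions are read off by hand. This sketch only covers this positive sample (the clip step is skipped because it is not needed here), and the variable names are illustrative.

```python
sample = 11931
BIAS = 132

biased = sample + BIAS                  # 12063
bits = format(biased, "016b")           # 16 bit binary string
sign_bit = bits[0]                      # leftmost bit
exp_region = bits[1:9]                  # 8 bits to the right of the sign bit
lead = exp_region.index("1")            # offset of the most significant 1
position = 7 - lead                     # leftmost bit of the region is position 7
mantissa = bits[lead + 2 : lead + 6]    # 4 bits right after that leading 1
code = sign_bit + format(position, "03b") + mantissa

print(bits)        # 0010111100011111
print(exp_region)  # 01011110
print(position)    # 6
print(mantissa)    # 0111
print(code)        # 01100111
```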