Exercise 3.4: Different Voice Codecs
The development of the GSM standard after 1990 was accompanied by the standardization of various voice codecs:
- The first "Full-Rate Codec" $\rm (FR)$ from 1991 reduced the data rate to $13 \ \rm kbit/s$, low enough to transmit a voice signal over a single traffic channel.
- In 1994, the "Half-Rate Codec" $\rm (HR)$ with a bit rate of $5.6 \ \rm kbit/s$ was developed with the aim of transmitting two calls simultaneously in one traffic channel if required. However, its quality does not quite reach that of the full-rate codec.
- The "Enhanced Full-Rate Codec" $\rm (EFR)$ from 1995 was a significant further development based on the data reduction method $\rm ACELP$ ("Algebraic Code Excited Linear Prediction"). The EFR codec delivers a data rate of $12.2 \ \rm kbit/s$ and is regarded as the common quality standard in mobile communications today.
- In 1999, ETSI standardized the "Adaptive Multi-Rate Codec" $\rm (AMR)$ for GSM. It provides eight different modes with data rates between $4.75 \ \rm kbit/s$ and $12.2 \ \rm kbit/s$. Like the EFR codec, the AMR codec uses the ACELP method.
- The "Wideband AMR" $\text{(WB-AMR)}$ is a further development of the original AMR. It was standardized by the 3GPP consortium in 2001 and by ITU-T in 2002 and uses the frequency range from $50 \ \rm Hz$ to $7 \ \rm kHz$. This corresponds to a "wideband signal".
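To get a feel for these rates, the following minimal sketch converts each net data rate into the number of bits per speech frame. The frame duration of 20 ms is an assumption that is not stated in this exercise (it is the usual GSM speech frame length); the names `CODEC_RATES_KBIT_S` and `bits_per_frame` are chosen here for illustration only.

```python
# Net speech data rates in kbit/s, taken from the list above.
CODEC_RATES_KBIT_S = {"FR": 13.0, "HR": 5.6, "EFR": 12.2}

# Assumption (not stated in this exercise): one speech frame lasts 20 ms.
FRAME_DURATION_MS = 20


def bits_per_frame(rate_kbit_s, frame_ms=FRAME_DURATION_MS):
    """Return the number of bits per speech frame (kbit/s times ms gives bit)."""
    return round(rate_kbit_s * frame_ms)


frame_bits = {name: bits_per_frame(rate) for name, rate in CODEC_RATES_KBIT_S.items()}

for name, bits in frame_bits.items():
    print(f"{name}: {bits} bits per {FRAME_DURATION_MS} ms frame")
```

Under the 20 ms assumption, the sketch yields 260, 112 and 244 bits per frame for FR, HR and EFR, respectively.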
Notes:
- This exercise belongs to the chapter "Similarities between GSM and UMTS".
- The graph shows the magnitude spectrum of an audio signal and defines the characteristics "narrowband" and "wideband".
- We refer you to the interactive SWF applet "Qualität verschiedener Sprachcodecs" ⇒ "Quality of different voice codecs" (based on Shockwave Flash, German language).
Questionnaire
Solution
(1) Answers 1 and 3 are correct:
- The required data rate is reduced by removing redundancy and irrelevance from the data signal.
- The coined word "codec" (from "coder" and "decoder") indicates that the same functional unit is used for both encoding and decoding.
(2) Answers 2 and 3 are correct:
- The EFR codec from 1995 is a significant further development of the "Full-Rate Codec" from 1991. Among other things, its speech quality is less impaired by background noise.
- Like the AMR, the EFR codec is based on the data reduction method ACELP (Algebraic Code Excited Linear Prediction).
- The first proposed solution is wrong. Like the FR and AMR codecs, the EFR codec is designed only for the telephone channel ($300 \ \rm Hz$ to $3.4 \ \rm kHz$).
- For better intelligibility and to avoid a dull sound, there is also a mid-range boost and a low-frequency cut.
(3) Only answer 2 is correct:
- The advantage of the AMR codec over the EFR is its greater flexibility.
- If the channel quality deteriorates significantly, it is possible to switch smoothly to a low-rate mode where transmission errors are less disturbing.
- In addition, as with the "Half-Rate Codec", two conversations can be carried in one traffic channel.
- The highest mode at $\rm 12.2 \ kbit/s$, and not the lowest, is identical to the EFR codec. It follows that AMR cannot provide better voice quality than EFR.
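The switching between modes described above can be illustrated with a small sketch. Only the lowest and highest rates ($4.75$ and $12.2 \ \rm kbit/s$) and the number of modes (eight) come from this exercise; the intermediate rates are those of the narrowband AMR specification. The quality measure in dB and all threshold values are purely hypothetical and serve only to show the principle: the worse the channel, the lower the selected source rate. Real link adaptation in AMR is considerably more involved.

```python
import bisect

# The eight narrowband AMR modes in kbit/s (lowest and highest as given in the exercise).
AMR_MODES_KBIT_S = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]

# Hypothetical switching thresholds for a hypothetical channel quality measure in dB.
# Seven thresholds separate the eight modes; the values are illustrative only.
HYPOTHETICAL_THRESHOLDS_DB = [4, 6, 8, 10, 12, 14, 16]


def select_amr_mode(channel_quality_db):
    """Pick a mode: the better the channel, the higher the source data rate."""
    index = bisect.bisect_right(HYPOTHETICAL_THRESHOLDS_DB, channel_quality_db)
    return AMR_MODES_KBIT_S[index]


for quality in (2, 9, 20):
    print(f"quality {quality} dB -> {select_amr_mode(quality)} kbit/s")
```

With a very good channel the sketch selects the $12.2 \ \rm kbit/s$ mode, which corresponds to the EFR codec; with a poor channel it falls back to $4.75 \ \rm kbit/s$.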
(4) All answers are correct:
- Wideband AMR provides nine modes, but only five of them are used for mobile communications, namely those with data rates of $6.60$, $8.85$, $12.65$, $15.85$, and $\text{23.65 kbit/s}$.
- The modes up to $\text{12.65 kbit/s}$ have the advantage that a voice signal encoded in this way can be accommodated in a single GSM traffic channel. For the higher rate modes, GSM/EDGE or UMTS is required.
- The higher rate modes $(15.85$ and $\text{23.65 kbit/s})$ provide only a slight improvement for speech, but due to the larger frequency range, they provide a noticeable improvement for the transmission of music.
- Both the wideband AMR and the higher modes of narrowband AMR show weaknesses with music signals. An even lower data rate gives extremely poor results for music.
- At a comparable data rate $\text{(12.65 kbit/s)}$, the WB-AMR offers better voice quality than the NB-AMR. Due to the greater bandwidth, speech sounds more natural, and sibilants such as "s", "f" and "sch" become more intelligible.
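The statement about which WB-AMR modes fit into a single GSM traffic channel can be summarized in a short sketch. The five rates and the limit of $\text{12.65 kbit/s}$ are taken from the solution above; the function name `required_network` and the returned strings are illustrative choices, not part of any standard.

```python
# The five WB-AMR modes used for mobile communications, in kbit/s (from the solution above).
WB_AMR_MOBILE_MODES_KBIT_S = [6.60, 8.85, 12.65, 15.85, 23.65]

# Highest rate that still fits into a single GSM traffic channel (from the solution above).
GSM_SINGLE_CHANNEL_LIMIT_KBIT_S = 12.65


def required_network(rate_kbit_s):
    """Return which kind of network the given WB-AMR mode needs."""
    if rate_kbit_s <= GSM_SINGLE_CHANNEL_LIMIT_KBIT_S:
        return "single GSM traffic channel"
    return "GSM/EDGE or UMTS"


for rate in WB_AMR_MOBILE_MODES_KBIT_S:
    print(f"{rate:5.2f} kbit/s -> {required_network(rate)}")
```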