
Difference between revisions of "Aufgaben:Exercise 3.4Z: GSM Full-Rate Voice Codec"

From LNTwww
 
(11 intermediate revisions by 2 users not shown)

Latest revision as of 13:25, 23 January 2023


LPC, LTP and RPE parameters in the GSM Full Rate Vocoder

The "GSM Full Rate Vocoder", standardized for the GSM system in 1991, is a codec, i.e. a joint realization of coder and decoder. It combines three methods for the compression of speech signals:

  • Linear Predictive Coding  (LPC),
  • Long Term Prediction  (LTP), and
  • Regular Pulse Excitation  (RPE).


The numbers in the graphic indicate how many bits each of the three units of this Full Rate speech codec generates per frame of 20 ms duration.

It should be noted that LTP and RPE, unlike LPC, do not work frame by frame, but with sub-blocks of  5  milliseconds.  However, this has no influence on solving the task.

The input signal in the above graphic is the digitized speech signal  $s_{\rm R}(n)$.

It is obtained from the analog speech signal  $s(t)$  by

  • a suitable limitation to the bandwidth  $B$,
  • sampling at the sampling rate  $f_{\rm A} = 8 \ \rm kHz$,
  • quantization with  $13 \ \rm bit$,
  • subsequent segmentation into blocks of  $20 \ \rm ms$  each.


The further tasks of preprocessing will not be discussed in detail here.



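Notes:

  • The exercise belongs to the chapter  "Similarities between GSM and UMTS".
  • Reference is also made to the chapter  "Speech Coding"  of the book "Examples of Communication Systems".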



Questionnaire

1

To which bandwidth B  must the speech signal be limited?

B = 

 kHz

2

How many samples  $(N_{\rm R})$  does a speech frame consist of?  How large is the input data rate  $R_{\rm In}$?

NR= 

 samples
RIn= 

 kbit/s

3

What is the output data rate  $R_{\rm Out}$  of the GSM Full Rate codec?

ROut = 

 kbit/s

4

Which statements apply to the block "LPC"?

LPC makes a short-term prediction over one millisecond.
The  36  LPC bits specify coefficients that the receiver uses to undo the LPC filtering.
The filter for short-term prediction is recursive.
The LPC output signal is identical to the input signal  $s_{\rm R}(n)$.

5

Which statements regarding the block "LTP" are true?

LTP removes periodic structures of the speech signal.
The long-term prediction is performed once per frame.
The memory of the LTP predictor is up to  15 ms.

6

Which statements apply to the block "RPE"?

RPE delivers fewer bits than LPC and LTP.
RPE removes parts that are unimportant for the subjective impression.
RPE subdivides each sub-block into four sub-sequences.
RPE selects the sub-sequence with the minimum energy.


Solution

(1)  To satisfy the sampling theorem, the bandwidth  $B$  must not exceed  $f_{\rm A}/2 = 4 \ \rm kHz$.


(2)  The given sampling rate  $f_{\rm A} = 8 \ \rm kHz$  results in a spacing between individual samples of  $T_{\rm A} = 0.125 \ \rm ms$.

  • Thus a speech frame of  $20 \ \rm ms$  consists of  $N_{\rm R} = 20/0.125 = 160$  samples, each quantized with  $13 \ \rm bit$.
  • The data rate is thus
$$R_{\rm In} = \frac{160 \cdot 13 \ \rm bit}{20 \ \rm ms} = 104 \ \rm kbit/s.$$
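The arithmetic of subtask (2) can be checked with a few lines of Python. This is only a minimal sketch; the constant and function names are chosen for illustration and do not come from the exercise.

```python
# Values taken from the exercise text; names are illustrative assumptions.
SAMPLING_RATE_KHZ = 8    # f_A
FRAME_DURATION_MS = 20   # duration of one speech frame
BITS_PER_SAMPLE = 13     # quantization word length


def samples_per_frame():
    """Number of samples N_R in one frame (kHz times ms gives samples)."""
    return SAMPLING_RATE_KHZ * FRAME_DURATION_MS


def input_rate_kbps():
    """Input data rate R_In in kbit/s (bit per ms equals kbit/s)."""
    return samples_per_frame() * BITS_PER_SAMPLE / FRAME_DURATION_MS


print(samples_per_frame(), input_rate_kbps())  # prints "160 104.0"
```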


(3)  The graph shows that per speech frame  $36 \ {\rm (LPC)} + 36 \ {\rm (LTP)} + 188 \ {\rm (RPE)} = 260 \ \rm bit$  are output.

  • From this the output data rate is calculated as
$$R_{\rm Out} = \frac{260 \ \rm bit}{20 \ \rm ms} = 13 \ \rm kbit/s.$$
  • The compression factor achieved by the Full Rate speech codec is thus  $104/13 = 8$.
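The same bookkeeping for subtask (3), again only as a small sketch with illustrative names. The bit counts are those stated in the exercise; the input rate of 104 kbit/s is the result of subtask (2).

```python
# Bits per 20 ms frame as given in the exercise; names are illustrative.
BITS_PER_FRAME = {"LPC": 36, "LTP": 36, "RPE": 188}
FRAME_DURATION_MS = 20


def output_rate_kbps():
    """Output data rate R_Out in kbit/s (bit per ms equals kbit/s)."""
    return sum(BITS_PER_FRAME.values()) / FRAME_DURATION_MS


def compression_factor(input_rate_kbps):
    """Ratio of input data rate to output data rate."""
    return input_rate_kbps / output_rate_kbps()


print(output_rate_kbps(), compression_factor(104.0))  # prints "13.0 8.0"
```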


(4)  The first two statements are true:

  • The 36 LPC bits describe a total of eight filter coefficients of a non-recursive filter (eight sample intervals of  $0.125 \ \rm ms$  each, i.e. a prediction over  $1 \ \rm ms$).  Eight ACF values are determined from the short-term analysis and converted into the reflection factors  $r_k$  by the so-called "Schur recursion".
  • From these, the eight LAR coefficients are calculated according to the function  ${\rm ln}\big[(1 - r_{k})/(1 + r_{k})\big]$, quantized with different numbers of bits, and sent to the receiver.
  • The LPC output signal has a significantly lower amplitude than its input  $s_{\rm R}(n)$, a significantly reduced dynamic range, and a flatter spectrum.
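The LAR mapping quoted above and its inverse can be sketched as follows. The function names are hypothetical, and the sketch uses the formula exactly as stated in this solution. It does not cover the Schur recursion or the quantization step of the codec.

```python
import math


def reflection_to_lar(r):
    """LAR value ln[(1 - r)/(1 + r)] for a reflection factor with |r| < 1."""
    if not -1.0 < r < 1.0:
        raise ValueError("reflection factor must lie strictly between -1 and 1")
    return math.log((1.0 - r) / (1.0 + r))


def lar_to_reflection(lar):
    """Inverse mapping: r = (1 - e^LAR)/(1 + e^LAR) = -tanh(LAR/2)."""
    return -math.tanh(lar / 2.0)


# Round trip for a few illustrative reflection factors (not codec data).
for r in (-0.9, -0.3, 0.0, 0.5, 0.95):
    assert abs(lar_to_reflection(reflection_to_lar(r)) - r) < 1e-12
```

The logarithmic mapping stretches the range near  $|r_k| = 1$, where the filter is most sensitive to coefficient errors, which is one plausible reason for quantizing LAR values rather than the reflection factors themselves.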


(5)  Statements 1 and 3 are correct, but not the second:

  • The LTP analysis and filtering is done blockwise every  $5 \ \rm ms$  (40 samples), i.e. four times per speech frame.
  • The cross correlation function  (CCF)  between the current sub-block and the three previous sub-blocks is formed.
  • For each sub-block, an LTP delay and an LTP gain are determined which best match the sub-block.
  • A correction signal from the subsequent "RPE" component is also taken into account.
  • As with the LPC, the output of the long-term prediction has less redundancy than its input.
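The search for LTP delay and LTP gain described in these bullets can be illustrated with a simplified sketch. Everything here is an assumption made for illustration: the function name, the lag range of 40 to 120 samples (the three previous sub-blocks, i.e. a memory of up to 15 ms), and the unquantized gain. The actual codec works on the short-term residual and quantizes both parameters.

```python
def ltp_search(sub_block, history, min_lag=40, max_lag=120):
    """Return (lag, gain) maximizing the cross-correlation with the past.

    sub_block: current 40 samples; history: past samples, most recent last.
    Assumes min_lag >= len(sub_block) and len(history) >= max_lag.
    """
    n = len(sub_block)
    best_lag, best_ccf = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        start = len(history) - lag
        segment = history[start:start + n]
        ccf = sum(a * b for a, b in zip(sub_block, segment))
        if ccf > best_ccf:
            best_lag, best_ccf = lag, ccf
    start = len(history) - best_lag
    segment = history[start:start + n]
    energy = sum(b * b for b in segment)
    gain = best_ccf / energy if energy > 0 else 0.0
    return best_lag, gain


# Illustrative test signal: a pattern placed 60 samples back, halved now.
pattern = [(i * 7) % 11 - 5 for i in range(40)]
history = [0] * 60 + pattern + [0] * 20          # 120 past samples
sub_block = [0.5 * x for x in pattern]
print(ltp_search(sub_block, history))            # prints "(60, 0.5)"
```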


(6)  Statements 2 and 3 are correct:

  • That statement 1 is wrong can be seen from the graphic in the exercise description, because  188  of the  260  output bits come from the RPE.  Speech would remain intelligible with RPE alone (without LPC and LTP).
  • Regarding the last statement:  The RPE of course looks for the subsequence with the maximum energy.  The RPE pulses form a subsequence  (13  of  40  samples); each pulse is coded with three bits per subframe of  $5 \ \rm ms$  and accordingly with  12  bits per  $20 \ \rm ms$  frame.
  • The "RPE pulses" thus occupy  $13 \cdot 12 = 156$  of the  260  output bits.
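The subsequence selection and the bit count of subtask (6) can be sketched as follows. The function name and the test data are illustrative assumptions. The sketch decimates each 40-sample sub-block by a factor of three with grid offsets 0 to 3 and keeps the candidate with the largest energy; it leaves out the filtering and quantization steps of the actual codec.

```python
def rpe_select(sub_block):
    """Return (grid_offset, pulses) for the max-energy subsequence.

    sub_block must hold 40 samples; each of the four candidates takes
    every third sample, starting at offset 0, 1, 2 or 3 (13 samples each).
    """
    if len(sub_block) != 40:
        raise ValueError("a sub-block consists of 40 samples")
    candidates = [sub_block[m:m + 37:3] for m in range(4)]
    energies = [sum(x * x for x in c) for c in candidates]
    best = max(range(4), key=lambda m: energies[m])
    return best, candidates[best]


# Illustrative sub-block: the energy is concentrated on grid offset 1.
block = [0.0] * 40
block[1], block[4], block[7] = 3.0, -2.0, 1.0
offset, pulses = rpe_select(block)
print(offset, len(pulses))          # prints "1 13"

# Bit count from the solution: 13 pulses, 3 bits each, 4 subframes.
PULSE_BITS_PER_FRAME = 13 * 3 * 4   # 156 of the 188 RPE bits
```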


More details about the RPE block can be found on the page  RPE coding  of the book  "Examples of Communication Systems".