Difference between revisions of "Aufgaben:Exercise 3.6: Adaptive Multi Rate Codec"

From LNTwww
 
(11 intermediate revisions by 3 users not shown)
Line 1: Line 1:
  
{{quiz-Header|Buchseite=Beispiele von Nachrichtensystemen/Sprachcodierung
+
{{quiz-Header|Buchseite=Examples_of_Communication_Systems/Voice_Coding
 
}}  
 
}}  
  
[[File:P_ID1233__Bei_A_3_6.png|right|frame|Spuren des AMR–Codecs '''Korrektur''']]
+
[[File:En_Bei_A_3_6.png|right|frame|Tracks of the AMR codec]]
Ende der 1990er Jahre wurde mit dem AMR–Codec ein sehr flexibler, adaptiver Sprachcodec entwickelt und standardisiert. Dieser stellt insgesamt acht verschiedene Modi mit Datenraten zwischen  $4.75 \ \rm kbit/s$  und  $12.2 \ \rm kbit/s$  zur Verfügung.
+
In the late 1990s,  a very flexible,  adaptive speech codec was developed and standardized in the form of  $\rm AMR$  codec.  This provides a total of eight different modes with data rates between  $4.75 \ \rm kbit/s$  and  $12.2 \ \rm kbit/s$.
  
Der AMR-Codec beinhaltet wie der in  [[Aufgaben:Aufgabe_3.5:_GSM–Vollraten–Sprachcodec|Aufgabe 3.5]]  behandelte Vollraten–Codec (FRC)  sowohl eine Kurzzeitprädiktion (LPC) als auch eine Langzeitprädiktion (LTP). Allerdings sind diese beiden Komponenten anders realisiert als beim FRC.
+
The AMR codec,  like the full rate codec  $\rm (FRC)$  discussed in  [[Aufgaben:Exercise_3.5:_GSM_Full_Rate_Vocoder|$\text{Exercise 3.5}$]],  includes both a short-term prediction  $\rm (LPC)$  and a long-term prediction  $\rm (LTP)$.  However,  these two components are realized differently from FRC.
  
Der wesentliche Unterschied von AMR gegenüber FRC stellt die Codierung des Restsignals (nach LPC und LTP) dar:  
+
The main difference between AMR and FRC is the encoding of the residual signal  $($after LPC and LTP$)$:  
*Anstelle von „Regular Pulse Excitation” (RPE) wird beim AMR–Code das Verfahren „Algebraic Code Excitation Linear Prediction” (ACELP) angewendet.  
+
#Instead of  "Regular Pulse Excitation"  $\rm (RPE)$,  here the  "Algebraic Code Excitation Linear Prediction"  $\rm (ACELP)$  is used.  
*Aus dem festen Codebuch (FCB) wird für jeden Unterrahmen von  $5 \ \rm ms$  Dauer derjenige FCB–Puls und diejenige FCB–Verstärkung ausgewählt, die am besten zum Restsignal passen (für die der mittlere quadratische Fehler des Differenzsignals minimal wird).
+
#From the fixed code book  $\rm (FCB)$,  for each subframe of  $5 \ \rm ms$  duration,  the  "FCB pulse"  and the  "FCB gain"  that best match the residual signal  $($for which the mean square error of the difference signal becomes minimum$)$  is selected.
  
  
Jeder Eintrag im festen Codebuch kennzeichnet einen Puls, bei dem genau  $10$  der  $40$  Positionen mit  $\pm1$  belegt sind. Hierzu ist anzumerken:
+
Each entry in the fixed code book identifies a pulse where exactly  $10$  of  $40$  positions are occupied by  $\pm1$. 
*Der Puls ist in fünf Spuren mit jeweils acht möglichen Positionen aufgeteilt, wobei die Spur  $1$  die Positionen  $1,\ 6,\ 11$, ... , $36$  des Unterrahmens und Spur  $5$  die Positionen  $5,\ 10,\ 15$, ... , $40$  beschreibt.
 
*In jeder Spur sind genau zwei Werte  $\pm1$, während alle anderen sechs Werte  $0$  sind. Die beiden  $±1$–Positionen werden mit je drei Bit – also mit  $000$, ... ,  $111$ – codiert.
 
*Für das Vorzeichen des erstgenannten Pulses wird ein weiteres Bit verwendet, wobei eine "$1$" ein positives Vorzeichen und eine "$0$" ein negatives  Vorzeichen kennzeichnet.
 
*Ist die Pulsposition des zweiten Impulses größer als die des ersten Impulses, so hat der zweite Impuls das gleiche Vorzeichen wie der erste, ansonsten das umgekehrte.
 
*Zum Empfänger werden somit pro Spur sieben Bit übertragen, außerdem noch fünf Bit für die so genannte  ''FCB–Verstärkung''.
 
  
 +
In this regard it should be noted:
 +
*The pulse is divided into five tracks with eight possible positions each, where track  $1$  contains the positions  $1,\ 6,\ 11$, ... , $36$  of the subframe and track  $5$   describes the positions  $5,\ 10,\ 15$, ... , $40$.
  
In der Grafik sind die  $35$  Bit zur Beschreibung eines FCB–Pulses beispielhaft angegeben.
+
*In each track there are exactly two values  $\pm1$,  while all the other six values are  zero. 
 +
 
 +
*The two  $±1$-positions are each assigned three bits –   i.e. encoded with  "$000$", ... ,  "$111$".
 +
 
 +
*Another bit is used for the  "sign of the first-mentioned pulse",  where a  "$1$"  indicates a positive sign and a  "$0$"  a negative sign.
 +
 
 +
*If the pulse position of the second pulse is greater than that of the first pulse,  the second pulse has the same sign as the first,  otherwise the opposite.
 +
 
 +
*Thus,  seven bits per track are transmitted to the receiver,  plus five bits for the so-called  "FCB amplification''.
 +
 
 +
 
 +
In the diagram,  the  $35$  bits describing an FCB pulse are given as an example:
 
   
 
   
'''Spur 1''' beinhaltet
+
⇒   '''Track 1'''  includes
*einen positiven Impuls  $({\rm VZ} = 1)$  bei  $1$  (erste mögliche Position für Spur 1)  $\hspace{0.2cm}\text{plus}\hspace{0.2cm}0$ (Bitangabe für " 000") $= 1$,
+
#a positive pulse  $({\rm sign} = 1)$  at position  $\big [1$  (first possible position for track 1)  $\hspace{0.02cm}\text{plus}\hspace{0.2cm}0$  (bit specification for  "000") $= 1\big]$,
*einen weiteren positiven Impuls (da $110 > 000$) bei der Position $1 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5$ (Pulsabstand in jeder Spur) $\hspace{0.2cm}\text{mal}\hspace{0.2cm}6$ (Bitangabe für " 110") = $31\hspace{0.05cm}.$
+
#another positive pulse  $($since $110 > 000)$  at position  $\big [1 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5$ (pulse spacing in each track) $\hspace{0.02cm}\text{times}\hspace{0.2cm}6$  (bit specification for " 110") = $31\hspace{0.05cm}\big].$
  
  
'''Spur 2''' beinhaltet
+
'''Track 2''' includes.
*einen negativen Impuls (${\rm VZ} = 0$) bei $2$ (erste mögliche Position für Spur 2) $\hspace{0.2cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{mal}\hspace{0.2cm}4$ (Bitangabe für " 100")  = $22\hspace{0.05cm},$
+
#a negative pulse (${\rm sign} = 0$)  at position  $\big [2$ (first possible position for track 2)  $\hspace{0.02cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{times}\hspace{0.2cm}4$  (bit specification for  " 100")  $=22\hspace{0.05cm}\big],$
*einen positiven Impuls (Vorzeichenumkehr wegen  $011 > 100$) bei der Position $2 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{mal}\hspace{0.2cm}3$ (Bitangabe für " 011")  = $17\hspace{0.05cm}.$
+
#a positive pulse  $($sign reversal due to  $011 > 100)$  at position  $\big [2 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{times}\hspace{0.2cm}3$  (bit specification for  " 011")  $=17\hspace{0.05cm}\big].$
  
  
Line 38: Line 46:
  
  
''Hinweise:''
+
<u>Hint:</u>
  
*Diese Aufgabe gehört zum Kapitel&nbsp; [[Examples_of_Communication_Systems/Sprachcodierung|Sprachcodierung]].
+
*This exercise belongs to the chapter&nbsp; [[Examples_of_Communication_Systems/Voice_Coding|"Speech Coding"]].
 
   
 
   
*Bei der Eingabe der Pulspositionen bezeichnet&nbsp; $N_{1}$&nbsp; das erste Bit–Tripel und&nbsp; $N_{2}$&nbsp; das zweite.
+
*When entering the pulse positions&nbsp; $N_{1}$&nbsp; denotes the first triple of bits and&nbsp; $N_{2}$&nbsp; the second.
*Man müsste zum Beispiel für Spur&nbsp; $2$&nbsp; die Werte&nbsp; $N_{1}=-22$&nbsp; und&nbsp; $N_{2}=+17$&nbsp; eintragen.   
+
 
 +
*For example,&nbsp; for track&nbsp; $2$&nbsp; one would have to enter the values&nbsp; $N_{1}=-22$&nbsp; and&nbsp; $N_{2}=+17$.   
  
  
  
===Fragebogen===
+
===Questions===
  
 
<quiz display=simple>
 
<quiz display=simple>
  
{Wie viele Bit beschreiben einen Sprachrahmen $($der Dauer&nbsp; $20 \ \rm ms)$&nbsp; im&nbsp; $12.2 \ \rm kbit/s$–Modus?
+
{How many bits describe a speech frame $($of duration&nbsp; $20 \ \rm ms)$&nbsp; in&nbsp; $12.2 \ \rm kbit/s$ mode?
 
|type="{}"}
 
|type="{}"}
$N_{12.2} \ = \ $ { 244 3% } $ \ \rm Bit$
+
$N_{12.2} \ = \ $ { 244 3% } $ \ \rm bits$
  
{Wie viele Bit werden für FCB–Puls und –Verstärkung pro Rahmen benötigt?
+
{How many bits are needed for FCB pulse and gain per frame?
 
|type="{}"}
 
|type="{}"}
$N_{\rm FCB} \ = \ $ { 160 3% } $ \ \rm Bit$
+
$N_{\rm FCB} \ = \ $ { 160 3% } $ \ \rm bits$
  
{Wie viele Bit verbleiben somit für LPC und LTP?
+
{ How many bits are left for LPC and LTP?
 
|type="{}"}
 
|type="{}"}
$N_{\rm LPC/LTP} \ = \ $ { 84 3% } $ \ \rm Bit$
+
$N_{\rm LPC/LTP} \ = \ $ { 84 3% } $ \ \rm bits$
  
{Welche Impulspositionen des Unterrahmens und Vorzeichen beschreibt die Spur&nbsp; $3$? <br>Beachten Sie die Hinweise zur Eingabe auf der Angabenseite.
+
{What subframe pulse positions and signs does track&nbsp; $3$&nbsp; describe?&nbsp; Follow the instructions for input on the information page.
 
|type="{}"}
 
|type="{}"}
 
$N_{1} \ = \ $ { -8.24--7.76 }  
 
$N_{1} \ = \ $ { -8.24--7.76 }  
 
$N_{2} \ = \ $ { -18.54--17.46 }  
 
$N_{2} \ = \ $ { -18.54--17.46 }  
  
{Welche Impulspositionen inklusive Vorzeichen beschreiben die Spur&nbsp; $4$?
+
{What pulse positions including sign describe the track&nbsp; $4$?
 
|type="{}"}
 
|type="{}"}
 
$N_{1} \ = \ $ { 39 3% }  
 
$N_{1} \ = \ $ { 39 3% }  
 
$N_{2} \ = \ $ { -14.42--13.58 }  
 
$N_{2} \ = \ $ { -14.42--13.58 }  
  
{Welche Impulspositionen inklusive Vorzeichen beschreiben die Spur&nbsp; $5$?
+
{What pulse positions including sign describe the track&nbsp; $5$?
 
|type="{}"}
 
|type="{}"}
 
$N_{1} \ = \ $ { -30.9--29.1 }  
 
$N_{1} \ = \ $ { -30.9--29.1 }  
Line 79: Line 88:
 
</quiz>
 
</quiz>
  
===Musterlösung===
+
===Solution===
 
{{ML-Kopf}}
 
{{ML-Kopf}}
  
'''(1)'''&nbsp; Mit der Datenrate $12.2 \ \rm kbit/s$ ergeben sich innerhalb von $20 \ \rm ms$ genau $\underline{244 \ \rm Bit}$, während zum Beispiel im  $4.75 \ \rm kbit/s$–Modus nur $95 \ \rm Bit$ übertragen werden.
+
'''(1)'''&nbsp; With the data rate&nbsp; $R_{\rm C} = 12.2 \ \rm kbit/s$,&nbsp; exactly&nbsp; $\underline{244 \ \rm bits}$&nbsp; results within&nbsp; $20 \ \rm ms$,&nbsp; while e.g. in&nbsp; $4.75 \ \rm kbit/s$&nbsp; mode only&nbsp; $95 \ \rm bits$&nbsp; are transmitted.
 +
 
  
 +
'''(2)'''&nbsp; In each subframe,&nbsp; the FCB pulse requires&nbsp; $35 \ \rm bits$&nbsp; (five tracks of seven bits each)&nbsp; and the FCB gain requires five bits.
  
'''(2)'''&nbsp; In jedem Unterrahmen benötigt der FCB–Puls $35 \ \rm Bit$ (fünf Spuren zu je sieben Bit) und die FCB–Verstärkung fünf Bit.
+
*With four subframes,&nbsp; this gives $N_{\rm FCB} \hspace{0.15cm}\underline{= 160 \ \rm bits}$.
*Bei vier Unterrahmen kommt man so auf $N_{\rm FCB} \underline{= 160 \ \rm Bit}$.
 
  
  
  
'''(3)'''&nbsp; Hierfür verbleiben die Differenz aus (1) und (2), also $N_{\rm LPC/LTP}\underline{ = 84 \ \rm Bit}$.
+
'''(3)'''&nbsp; This leaves the difference from&nbsp; '''(1)'''&nbsp; and&nbsp; '''(2)''',&nbsp; i.e. $N_{\rm LPC/LTP}\hspace{0.15cm} \underline{ = 84\ \rm bits}$.
  
  
'''(4)'''&nbsp; Das Vorzeichenbit "$0$" deutet auf einen negativen ersten Impuls hin.  
+
'''(4)'''&nbsp; The sign bit&nbsp; "$0$"&nbsp; indicates a negative first pulse.  
*Wegen $001 < 011$ hat der zweite Impuls das gleiche Vorzeichen.  
+
*Because&nbsp; $001 < 011$,&nbsp; the second pulse has the same sign.
*Die beiden Beträge ergeben sich zu
+
:$$|N_1| \ = \ 3 \hspace{0.1cm}{\rm(da \hspace{0.1cm} Spur \hspace{0.1cm}3)} + 5\cdot 1 \hspace{0.1cm} {\rm(Bitangabe \hspace{0.1cm} 001)} = 8\hspace{0.05cm}, $$
+
*The two magnitudes result in
:$$ |N_2| \ = \ 3 \hspace{0.1cm}{\rm(da \hspace{0.1cm} Spur \hspace{0.1cm}3)} + 5\cdot 3 \hspace{0.1cm} {\rm(Bitangabe \hspace{0.1cm} 011)} = 18\hspace{0.05cm}.$$
+
:$$|N_1| \ = \ 3 \hspace{0.1cm}{\rm(since \hspace{0.1cm} track \hspace{0.1cm}3)} + 5\cdot 1 \hspace{0.1cm} {\rm(bit\:specification \hspace{0.1cm} 001)} = 8\hspace{0.05cm}, $$
*Einzugeben sind deshalb für die dritte Spur  $N_{1} \underline{ = -8}$ und $N_{2} \underline{ = -18}.$
+
:$$ |N_2| \ = \ 3 \hspace{0.1cm}{\rm(since \hspace{0.1cm} track \hspace{0.1cm}3)} + 5\cdot 3 \hspace{0.1cm} {\rm(bit\:specification \hspace{0.1cm} 011)} = 18\hspace{0.05cm}.$$
 +
*Therefore,&nbsp; to be entered for the third track is&nbsp; $N_{1}\hspace{0.15cm} \underline{ = -8}$&nbsp; and&nbsp; $N_{2} \hspace{0.15cm}\underline{ = -18}.$
  
  
'''(5)'''&nbsp; In analoger Weise erhält man für die Spur $4$ die Werte&nbsp; $N_{1} \underline{ = +39}$&nbsp; und&nbsp; $N_{2} \underline{ = -14}$.
+
'''(5)'''&nbsp; In an analogous way,&nbsp; for track&nbsp; $4$&nbsp; we obtain the values&nbsp; $N_{1}\hspace{0.15cm} \underline{ = +39}$&nbsp; and&nbsp; $N_{2}\hspace{0.15cm} \underline{ = -14}$.
  
  
'''(6)'''&nbsp; Die fünfte Spur liefert&nbsp; $N_{1} \underline{ =-30}$&nbsp; und&nbsp; $N_{2} \underline{ = +5}$
+
'''(6)'''&nbsp; The fifth track provides&nbsp; $N_{1}\hspace{0.15cm} \underline{ =-30}$&nbsp; and&nbsp; $N_{2}\hspace{0.15cm} \underline{ = +5}$
  
 
{{ML-Fuß}}
 
{{ML-Fuß}}
Line 110: Line 121:
  
  
[[Category:Examples of Communication Systems: Exercises|^3.3 Voice Coding^]]
+
[[Category:Examples of Communication Systems: Exercises|^3.3 Speech Coding^]]

Latest revision as of 13:58, 25 January 2023

Tracks of the AMR codec

In the late 1990s, a very flexible, adaptive speech codec was developed and standardized: the $\rm AMR$ codec. It provides a total of eight different modes with data rates between $4.75 \ \rm kbit/s$ and $12.2 \ \rm kbit/s$.

The AMR codec, like the full rate codec $\rm (FRC)$ discussed in $\text{Exercise 3.5}$, includes both a short-term prediction $\rm (LPC)$ and a long-term prediction $\rm (LTP)$. However, these two components are realized differently than in the FRC.

The main difference between AMR and FRC is the encoding of the residual signal  $($after LPC and LTP$)$:

  1. Instead of "Regular Pulse Excitation" $\rm (RPE)$, "Algebraic Code Excited Linear Prediction" $\rm (ACELP)$ is used here.
  2. From the fixed code book $\rm (FCB)$, for each subframe of $5 \ \rm ms$ duration, the "FCB pulse" and the "FCB gain" are selected that best match the residual signal $($i.e., for which the mean square error of the difference signal becomes minimal$)$.


Each entry in the fixed code book identifies a pulse in which exactly $10$ of the $40$ positions are occupied by $\pm1$.

In this regard it should be noted:

  • The pulse is divided into five tracks with eight possible positions each, where track $1$ contains the positions $1,\ 6,\ 11$, ... , $36$ of the subframe and track $5$ contains the positions $5,\ 10,\ 15$, ... , $40$.
  • In each track, exactly two values are $\pm1$, while the other six values are zero.
  • The two $\pm1$ positions are each encoded with three bits, i.e. with "$000$", ... , "$111$".
  • Another bit is used for the sign of the first-mentioned pulse, where a "$1$" indicates a positive sign and a "$0$" a negative sign.
  • If the position of the second pulse is greater than that of the first pulse, the second pulse has the same sign as the first; otherwise it has the opposite sign.
  • Thus, seven bits per track are transmitted to the receiver, plus five bits per subframe for the so-called "FCB gain".


In the diagram,  the  $35$  bits describing an FCB pulse are given as an example:

⇒   Track 1  includes

  1. a positive pulse  $({\rm sign} = 1)$  at position  $\big [1$  (first possible position for track 1)  $\hspace{0.02cm}\text{plus}\hspace{0.2cm}0$  (bit specification for  "000") $= 1\big]$,
  2. another positive pulse  $($since $110 > 000)$  at position  $\big [1 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5$ (pulse spacing in each track) $\hspace{0.02cm}\text{times}\hspace{0.2cm}6$  (bit specification for " 110") = $31\hspace{0.05cm}\big].$


⇒   Track 2  includes

  1. a negative pulse (${\rm sign} = 0$)  at position  $\big [2$ (first possible position for track 2)  $\hspace{0.02cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{times}\hspace{0.2cm}4$  (bit specification for  " 100")  $=22\hspace{0.05cm}\big],$
  2. a positive pulse  $($sign reversal due to  $011 < 100)$  at position  $\big [2 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{times}\hspace{0.2cm}3$  (bit specification for  "011")  $=17\hspace{0.05cm}\big].$
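The decoding rule for a single track can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `decode_track` and its argument layout (sign bit and the two bit triples passed separately) are hypothetical, since the text does not specify the order of the seven bits within a track.

```python
def decode_track(track, sign_bit, first_bits, second_bits):
    """Return the signed positions (N1, N2) of the two pulses in one track.

    track       -- track number 1..5
    sign_bit    -- 1 for a positive first pulse, 0 for a negative one
    first_bits  -- 3-bit string for the first pulse position, e.g. "100"
    second_bits -- 3-bit string for the second pulse position
    """
    if track not in range(1, 6):
        raise ValueError("track must be in 1..5")
    i1 = int(first_bits, 2)     # index 0..7 within the track
    i2 = int(second_bits, 2)
    pos1 = track + 5 * i1       # positions within a track are spaced 5 apart
    pos2 = track + 5 * i2
    sign1 = 1 if sign_bit == 1 else -1
    # Same sign if the second position is greater than the first, else opposite.
    sign2 = sign1 if i2 > i1 else -sign1
    return sign1 * pos1, sign2 * pos2


print(decode_track(1, 1, "000", "110"))   # track 1 from the text: (1, 31)
print(decode_track(2, 0, "100", "011"))   # track 2 from the text: (-22, 17)
```

The two calls reproduce the values given above for tracks 1 and 2, using the same sign convention as the input hint below (negative numbers for negative pulses).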




Hint:

  • When entering the pulse positions, $N_{1}$ denotes the position given by the first bit triple and $N_{2}$ the position given by the second.
  • For example,  for track  $2$  one would have to enter the values  $N_{1}=-22$  and  $N_{2}=+17$.


Questions

1

How many bits describe a speech frame $($of duration  $20 \ \rm ms)$  in  $12.2 \ \rm kbit/s$ mode?

$N_{12.2} \ = \ $

$ \ \rm bits$

2

How many bits are needed for FCB pulse and gain per frame?

$N_{\rm FCB} \ = \ $

$ \ \rm bits$

3

How many bits are left for LPC and LTP?

$N_{\rm LPC/LTP} \ = \ $

$ \ \rm bits$

4

What subframe pulse positions and signs does track  $3$  describe?  Follow the instructions for input on the information page.

$N_{1} \ = \ $

$N_{2} \ = \ $

5

Which pulse positions, including signs, does track $4$ describe?

$N_{1} \ = \ $

$N_{2} \ = \ $

6

Which pulse positions, including signs, does track $5$ describe?

$N_{1} \ = \ $

$N_{2} \ = \ $


Solution

(1)  With the data rate $R_{\rm C} = 12.2 \ \rm kbit/s$, exactly $\underline{244 \ \rm bits}$ result within $20 \ \rm ms$, whereas e.g. in the $4.75 \ \rm kbit/s$ mode only $95 \ \rm bits$ are transmitted.
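As a quick check of this arithmetic, a minimal sketch (the helper name `bits_per_frame` is hypothetical; the rate is given in bit/s so that integer arithmetic can be used):

```python
def bits_per_frame(rate_bit_per_s, frame_ms=20):
    """Number of bits in one speech frame of the given duration."""
    return rate_bit_per_s * frame_ms // 1000


print(bits_per_frame(12_200))   # 12.2 kbit/s mode -> 244
print(bits_per_frame(4_750))    # 4.75 kbit/s mode -> 95
```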


(2)  In each subframe,  the FCB pulse requires  $35 \ \rm bits$  (five tracks of seven bits each)  and the FCB gain requires five bits.

  • With four subframes,  this gives $N_{\rm FCB} \hspace{0.15cm}\underline{= 160 \ \rm bits}$.


(3)  What remains is the difference between  (1)  and  (2),  i.e. $N_{\rm LPC/LTP}\hspace{0.15cm} \underline{ = 84\ \rm bits}$.
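The bit budget of subtasks (2) and (3) can be retraced with a short sketch. The constant names are chosen here for illustration only; the numbers are those given in the exercise.

```python
TRACKS_PER_SUBFRAME = 5
BITS_PER_TRACK = 7           # 3 + 3 position bits plus 1 sign bit
GAIN_BITS = 5                # FCB gain, once per subframe
SUBFRAMES_PER_FRAME = 4      # 20 ms frame / 5 ms subframe
FRAME_BITS = 244             # 12.2 kbit/s mode, result of (1)

pulse_bits = TRACKS_PER_SUBFRAME * BITS_PER_TRACK            # FCB pulse per subframe
fcb_bits = SUBFRAMES_PER_FRAME * (pulse_bits + GAIN_BITS)    # FCB pulse and gain per frame
lpc_ltp_bits = FRAME_BITS - fcb_bits                         # remainder for LPC and LTP

print(pulse_bits, fcb_bits, lpc_ltp_bits)   # 35 160 84
```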


(4)  The sign bit  "$0$"  indicates a negative first pulse.

  • Because  $001 < 011$,  the second pulse has the same sign.
  • The two magnitudes result in
$$|N_1| \ = \ 3 \hspace{0.1cm}{\rm(since \hspace{0.1cm} track \hspace{0.1cm}3)} + 5\cdot 1 \hspace{0.1cm} {\rm(bit\:specification \hspace{0.1cm} 001)} = 8\hspace{0.05cm}, $$
$$ |N_2| \ = \ 3 \hspace{0.1cm}{\rm(since \hspace{0.1cm} track \hspace{0.1cm}3)} + 5\cdot 3 \hspace{0.1cm} {\rm(bit\:specification \hspace{0.1cm} 011)} = 18\hspace{0.05cm}.$$
  • Therefore, the values to be entered for the third track are $N_{1}\hspace{0.15cm} \underline{ = -8}$ and $N_{2} \hspace{0.15cm}\underline{ = -18}.$


(5)  In an analogous way,  for track  $4$  we obtain the values  $N_{1}\hspace{0.15cm} \underline{ = +39}$  and  $N_{2}\hspace{0.15cm} \underline{ = -14}$.
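
The intermediate steps can be reconstructed as follows. Note that the bit values used here are inferred from the stated result and not read from the diagram: under the rules above, $N_1 = +39$ and $N_2 = -14$ are only consistent with the sign bit "$1$" and the bit triples "$111$" and "$010$".

$$|N_1| \ = \ 4 \hspace{0.1cm}{\rm(since \hspace{0.1cm} track \hspace{0.1cm}4)} + 5\cdot 7 \hspace{0.1cm} {\rm(bit\:specification \hspace{0.1cm} 111)} = 39\hspace{0.05cm}, $$
$$|N_2| \ = \ 4 \hspace{0.1cm}{\rm(since \hspace{0.1cm} track \hspace{0.1cm}4)} + 5\cdot 2 \hspace{0.1cm} {\rm(bit\:specification \hspace{0.1cm} 010)} = 14\hspace{0.05cm}.$$

Because $010 < 111$, the second pulse has the opposite sign of the (positive) first pulse.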


(6)  The fifth track provides $N_{1}\hspace{0.15cm} \underline{ =-30}$ and $N_{2}\hspace{0.15cm} \underline{ = +5}$.
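
Putting the five tracks together gives the complete FCB pulse of the example subframe. The following sketch (the helper name `build_pulse` is hypothetical) places the signed positions from the problem statement and the solutions above into a vector of $40$ samples and confirms that exactly $10$ positions are occupied:

```python
def build_pulse(signed_positions, length=40):
    """Place +1/-1 at each signed 1-based position; all other samples are 0."""
    pulse = [0] * length
    for n in signed_positions:
        pulse[abs(n) - 1] = 1 if n > 0 else -1
    return pulse


# Signed positions (N1, N2) of tracks 1..5 as stated in the text and the solution.
tracks = [(1, 31), (-22, 17), (-8, -18), (39, -14), (-30, 5)]
pulse = build_pulse([n for pair in tracks for n in pair])

print(pulse)
print(sum(1 for v in pulse if v != 0))   # number of occupied positions: 10
```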