Difference between revisions of "Aufgaben:Exercise 3.6: Adaptive Multi Rate Codec"

Latest revision as of 13:58, 25 January 2023

Tracks of the AMR codec

In the late 1990s, a very flexible, adaptive speech codec was developed and standardized in the form of the $\rm AMR$ codec. It provides a total of eight different modes with data rates between $4.75 \ \rm kbit/s$ and $12.2 \ \rm kbit/s$.

The AMR codec, like the full rate codec $\rm (FRC)$ discussed in $\text{Exercise 3.5}$, includes both a short-term prediction $\rm (LPC)$ and a long-term prediction $\rm (LTP)$. However, both components are realized differently than in the FRC.

The main difference between AMR and FRC is the encoding of the residual signal $($after LPC and LTP$)$:

Instead of "Regular Pulse Excitation" $\rm (RPE)$, the AMR codec uses "Algebraic Code Excited Linear Prediction" $\rm (ACELP)$.
For each subframe of $5 \ \rm ms$ duration, the "FCB pulse" and the "FCB gain" that best match the residual signal $($i.e., those for which the mean square error of the difference signal is minimal$)$ are selected from the fixed code book $\rm (FCB)$.

Each entry in the fixed code book identifies a pulse where exactly $10$ of $40$ positions are occupied by $\pm1$.

In this regard it should be noted:

The pulse is divided into five tracks with eight possible positions each, where track $1$ contains the positions $1,\ 6,\ 11$, ... , $36$ of the subframe and track $5$ describes the positions $5,\ 10,\ 15$, ... , $40$.

In each track there are exactly two values $\pm1$, while all the other six values are zero.

The two $±1$-positions are each assigned three bits – i.e. encoded with "$000$", ... , "$111$".

Another bit is used for the "sign of the first-mentioned pulse", where a "$1$" indicates a positive sign and a "$0$" a negative sign.

If the pulse position of the second pulse is greater than that of the first pulse, the second pulse has the same sign as the first, otherwise the opposite.

Thus, seven bits per track are transmitted to the receiver, plus five bits for the so-called "FCB gain".

In the diagram, the $35$ bits describing an FCB pulse are given as an example:

⇒ Track 1 includes

a positive pulse $({\rm sign} = 1)$ at position $\big [1$ (first possible position for track 1) $\hspace{0.02cm}\text{plus}\hspace{0.2cm}0$ (bit specification for "000") $= 1\big]$,
another positive pulse $($since $110 > 000)$ at position $\big [1 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5$ (pulse spacing in each track) $\hspace{0.02cm}\text{times}\hspace{0.2cm}6$ (bit specification for " 110") = $31\hspace{0.05cm}\big].$

Track 2 includes

a negative pulse $({\rm sign} = 0)$ at position $\big [2$ (first possible position for track 2) $\hspace{0.02cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{times}\hspace{0.2cm}4$ (bit specification for "100") $=22\hspace{0.05cm}\big],$
a positive pulse $($sign reversal because $011 < 100)$ at position $\big [2 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{times}\hspace{0.2cm}3$ (bit specification for "011") $=17\hspace{0.05cm}\big].$

Hint:

This exercise belongs to the chapter "Speech Coding".

When entering the pulse positions, $N_{1}$ denotes the signed position given by the first triple of bits and $N_{2}$ the one given by the second.

For example, for track $2$ one would have to enter the values $N_{1}=-22$ and $N_{2}=+17$.
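The track rules and the signed input convention described above can be sketched in a few lines of Python. The function name `decode_track` and its argument layout are assumptions made for illustration only; in particular, the sign bit and the two 3-bit position codes are passed separately, because the text does not state their order within the seven bits of a track.

```python
def decode_track(track, sign_bit, code1, code2):
    """Return the signed pulse positions (N1, N2) of one track.

    track    : track number 1..5 (equal to its first possible position)
    sign_bit : 1 = first pulse positive, 0 = first pulse negative
    code1    : 3-bit position code (0..7) of the first pulse
    code2    : 3-bit position code (0..7) of the second pulse
    """
    if track not in range(1, 6):
        raise ValueError("track must be in 1..5")
    if sign_bit not in (0, 1):
        raise ValueError("sign_bit must be 0 or 1")
    if not (0 <= code1 <= 7 and 0 <= code2 <= 7):
        raise ValueError("position codes must be in 0..7")

    spacing = 5  # pulse spacing within each track
    pos1 = track + spacing * code1
    pos2 = track + spacing * code2

    sign1 = 1 if sign_bit == 1 else -1
    # The second pulse keeps the sign only if its position is greater
    # than that of the first pulse; otherwise the sign is reversed.
    sign2 = sign1 if pos2 > pos1 else -sign1
    return sign1 * pos1, sign2 * pos2


print(decode_track(1, 1, 0b000, 0b110))  # track 1 of the example
print(decode_track(2, 0, 0b100, 0b011))  # track 2 of the example
```

The two calls reproduce the values stated above for tracks 1 and 2, namely `(1, 31)` and `(-22, 17)`.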

Questions

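(1) How many bits describe a speech frame $($of duration $20 \ \rm ms)$ in the $12.2 \ \rm kbit/s$ mode?

$N_{12.2} \ = \ $

$ \ \rm bits$

(2) How many bits are needed for the FCB pulse and the FCB gain per frame?

$N_{\rm FCB} \ = \ $

$ \ \rm bits$

(3) How many bits are left for LPC and LTP?

$N_{\rm LPC/LTP} \ = \ $

$ \ \rm bits$

(4) Which subframe pulse positions and signs does track $3$ describe? Note the input instructions in the hint above.

$N_{1} \ = \ $

$N_{2} \ = \ $

(5) Which pulse positions, including signs, does track $4$ describe?

$N_{1} \ = \ $

$N_{2} \ = \ $

(6) Which pulse positions, including signs, does track $5$ describe?

$N_{1} \ = \ $

$N_{2} \ = \ $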

Solution

(1) With the data rate $R_{\rm C} = 12.2 \ \rm kbit/s$, exactly $\underline{244 \ \rm bits}$ are transmitted within $20 \ \rm ms$, whereas e.g. in the $4.75 \ \rm kbit/s$ mode only $95 \ \rm bits$ are transmitted.

(2) In each subframe, the FCB pulse requires $35 \ \rm bits$ (five tracks of seven bits each) and the FCB gain requires five bits.

With four subframes, this gives $N_{\rm FCB} \hspace{0.15cm}\underline{= 160 \ \rm bits}$.

(3) The remaining bits are the difference between (1) and (2), i.e. $N_{\rm LPC/LTP}\hspace{0.15cm} \underline{ = 84\ \rm bits}$.
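The bit budget of (1) to (3) can be checked with a short calculation. The following is a minimal sketch; the constant and function names are chosen here for illustration and do not come from the exercise.

```python
FRAME_MS = 20        # duration of one speech frame in ms
SUBFRAMES = 4        # four subframes of 5 ms each
TRACKS = 5           # tracks per FCB pulse
BITS_PER_TRACK = 7   # 2 x 3 position bits + 1 sign bit
GAIN_BITS = 5        # FCB gain bits per subframe


def bits_per_frame(rate_bit_per_s, frame_ms=FRAME_MS):
    """Number of bits transmitted in one frame at the given data rate."""
    return rate_bit_per_s * frame_ms // 1000


n_frame = bits_per_frame(12200)                               # (1)
n_fcb = SUBFRAMES * (TRACKS * BITS_PER_TRACK + GAIN_BITS)     # (2)
n_lpc_ltp = n_frame - n_fcb                                   # (3)

print(n_frame, n_fcb, n_lpc_ltp)   # 244 160 84
print(bits_per_frame(4750))        # 95
```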

(4) The sign bit "$0$" indicates a negative first pulse.

Because $001 < 011$, the second pulse has the same sign.

The two magnitudes are

$$|N_1| \ = \ 3 \hspace{0.1cm}{\rm(since \hspace{0.1cm} track \hspace{0.1cm}3)} + 5\cdot 1 \hspace{0.1cm} {\rm(bit\:specification \hspace{0.1cm} 001)} = 8\hspace{0.05cm}, $$

$$ |N_2| \ = \ 3 \hspace{0.1cm}{\rm(since \hspace{0.1cm} track \hspace{0.1cm}3)} + 5\cdot 3 \hspace{0.1cm} {\rm(bit\:specification \hspace{0.1cm} 011)} = 18\hspace{0.05cm}.$$

Therefore, the values to be entered for the third track are $N_{1}\hspace{0.15cm} \underline{ = -8}$ and $N_{2} \hspace{0.15cm}\underline{ = -18}.$

(5) In an analogous way, for track $4$ we obtain the values $N_{1}\hspace{0.15cm} \underline{ = +39}$ and $N_{2}\hspace{0.15cm} \underline{ = -14}$.

(6) The fifth track provides $N_{1}\hspace{0.15cm} \underline{ =-30}$ and $N_{2}\hspace{0.15cm} \underline{ = +5}$.
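As a cross-check, the signed positions of all five tracks can be assembled into the 40-position FCB pulse. The sketch below is illustrative only: `build_pulse` is a hypothetical helper name, and the pairs are the values from the example (tracks 1 and 2) and from the answers (4) to (6).

```python
SUBFRAME_LEN = 40   # positions per subframe
SPACING = 5         # pulse spacing within each track

# Signed positions (N1, N2) for tracks 1..5, taken from the text above.
signed_positions = [(1, 31), (-22, 17), (-8, -18), (39, -14), (-30, 5)]


def build_pulse(pairs, length=SUBFRAME_LEN):
    """Place +1 / -1 at the given 1-based signed positions, 0 elsewhere."""
    pulse = [0] * length
    for track, pair in enumerate(pairs, start=1):
        for n in pair:
            pos = abs(n)
            # Each position must lie on the grid of its own track.
            if not 1 <= pos <= length or (pos - track) % SPACING != 0:
                raise ValueError(f"position {pos} not valid for track {track}")
            pulse[pos - 1] = 1 if n > 0 else -1
    return pulse


pulse = build_pulse(signed_positions)
print(sum(1 for v in pulse if v != 0))  # 10 occupied positions out of 40
```

The count of ten non-zero entries agrees with the statement that exactly $10$ of the $40$ positions are occupied by $\pm1$.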

@@ Line 1: / Line 1: @@
-{{quiz-Header|Buchseite=Beispiele von Nachrichtensystemen/Sprachcodierung
+{{quiz-Header|Buchseite=Examples_of_Communication_Systems/Voice_Coding
 }}
-[[File:P_ID1233__Bei_A_3_6.png|right|frame|Spuren des AMR&ndash;Codecs]]
+[[File:En_Bei_A_3_6.png|right|frame|Tracks of the AMR codec]]
-Ende der 1990er Jahre wurde mit dem AMR–Codec ein sehr flexibler, adaptiver Sprachcodec entwickelt und standardisiert. Dieser stellt insgesamt acht verschiedene Modi mit Datenraten zwischen&nbsp; $4.75 \ \rm kbit/s$&nbsp; und&nbsp; $12.2 \ \rm kbit/s$&nbsp; zur Verfügung.
+In the late 1990s,&nbsp; a very flexible,&nbsp; adaptive speech codec was developed and standardized in the form of&nbsp; $\rm AMR$&nbsp; codec.&nbsp; This provides a total of eight different modes with data rates between&nbsp; $4.75 \ \rm kbit/s$&nbsp; and&nbsp; $12.2 \ \rm kbit/s$.
-Der AMR-Codec beinhaltet wie der in&nbsp; [[Aufgaben:Aufgabe_3.5:_GSM–Vollraten–Sprachcodec|Aufgabe 3.5]]&nbsp; behandelte Vollraten–Codec (FRC)  sowohl eine Kurzzeitprädiktion (LPC) als auch eine Langzeitprädiktion (LTP). Allerdings sind diese beiden Komponenten anders realisiert als beim FRC.
+The AMR codec,&nbsp; like the full rate codec&nbsp; $\rm (FRC)$&nbsp; discussed in&nbsp; [[Aufgaben:Exercise_3.5:_GSM_Full_Rate_Vocoder|$\text{Exercise 3.5}$]],&nbsp; includes both a short-term prediction&nbsp; $\rm (LPC)$&nbsp; and a long-term prediction&nbsp; $\rm (LTP)$.&nbsp; However,&nbsp; these two components are realized differently from FRC.
-Der wesentliche Unterschied von AMR gegenüber FRC stellt die Codierung des Restsignals (nach LPC und LTP) dar:
+The main difference between AMR and FRC is the encoding of the residual signal&nbsp; $($after LPC and LTP$)$:
-*Anstelle von „Regular Pulse Excitation” (RPE) wird beim AMR–Code das Verfahren „Algebraic Code Excitation Linear Prediction” (ACELP) angewendet.
+#Instead of&nbsp; "Regular Pulse Excitation"&nbsp; $\rm (RPE)$,&nbsp; the AMR codec uses&nbsp; "Algebraic Code Excited Linear Prediction"&nbsp; $\rm (ACELP)$.
-*Aus dem festen Codebuch (FCB) wird für jeden Unterrahmen von&nbsp; $5 \ \rm ms$&nbsp; Dauer derjenige FCB–Puls und diejenige FCB–Verstärkung ausgewählt, die am besten zum Restsignal passen (für die der mittlere quadratische Fehler des Differenzsignals minimal wird).
+#For each subframe of&nbsp; $5 \ \rm ms$&nbsp; duration,&nbsp; the&nbsp; "FCB pulse"&nbsp; and the&nbsp; "FCB gain"&nbsp; that best match the residual signal&nbsp; $($i.e.,&nbsp; those for which the mean square error of the difference signal is minimal$)$&nbsp; are selected from the fixed code book&nbsp; $\rm (FCB)$.
-Jeder Eintrag im festen Codebuch kennzeichnet einen Puls, bei dem genau&nbsp; $10$&nbsp; der&nbsp; $40$&nbsp; Positionen mit&nbsp; $\pm1$&nbsp; belegt sind. Hierzu ist anzumerken:
+Each entry in the fixed code book identifies a pulse where exactly&nbsp; $10$&nbsp; of&nbsp; $40$&nbsp; positions are occupied by&nbsp; $\pm1$.&nbsp;
-*Der Puls ist in fünf Spuren mit jeweils acht möglichen Positionen aufgeteilt, wobei die Spur&nbsp; $1$&nbsp; die Positionen&nbsp; $1,\ 6,\ 11$, ... , $36$&nbsp; des Unterrahmens und Spur&nbsp; $5$&nbsp; die Positionen&nbsp; $5,\ 10,\ 15$, ... , $40$&nbsp; beschreibt.
-*In jeder Spur sind genau zwei Werte&nbsp; $\pm1$, während alle anderen sechs Werte&nbsp; $0$&nbsp; sind. Die beiden&nbsp; $±1$–Positionen werden mit je drei Bit – also mit&nbsp; $000$, ... ,&nbsp; $111$ – codiert.
-*Für das Vorzeichen des erstgenannten Pulses wird ein weiteres Bit verwendet, wobei eine "$1$" ein positives Vorzeichen und eine "$0$" ein negatives  Vorzeichen kennzeichnet.
-*Ist die Pulsposition des zweiten Impulses größer als die des ersten Impulses, so hat der zweite Impuls das gleiche Vorzeichen wie der erste, ansonsten das umgekehrte.
-*Zum Empfänger werden somit pro Spur sieben Bit übertragen, außerdem noch fünf Bit für die so genannte&nbsp; ''FCB–Verstärkung''.
+In this regard it should be noted:
+*The pulse is divided into five tracks with eight possible positions each, where track&nbsp; $1$&nbsp; contains the positions&nbsp; $1,\ 6,\ 11$, ... , $36$&nbsp; of the subframe and track&nbsp; $5$&nbsp;  describes the positions&nbsp; $5,\ 10,\ 15$, ... , $40$.
-In der Grafik sind die&nbsp; $35$&nbsp; Bit zur Beschreibung eines FCB–Pulses beispielhaft angegeben.
+*In each track there are exactly two values&nbsp; $\pm1$,&nbsp; while all the other six values are&nbsp; zero.&nbsp;
+*The two&nbsp; $±1$-positions are each assigned three bits &ndash; &nbsp; i.e. encoded with&nbsp; "$000$", ... ,&nbsp; "$111$".
+*Another bit is used for the&nbsp; "sign of the first-mentioned pulse",&nbsp; where a&nbsp; "$1$"&nbsp; indicates a positive sign and a&nbsp; "$0$"&nbsp; a negative sign.
+*If the pulse position of the second pulse is greater than that of the first pulse,&nbsp; the second pulse has the same sign as the first,&nbsp; otherwise the opposite.
+*Thus,&nbsp; seven bits per track are transmitted to the receiver,&nbsp; plus five bits for the so-called&nbsp; "FCB gain".
+In the diagram,&nbsp; the&nbsp; $35$&nbsp; bits describing an FCB pulse are given as an example:
-'''Spur 1''' beinhaltet
+&rArr; &nbsp; '''Track 1'''&nbsp; includes
-*einen positiven Impuls&nbsp; $({\rm VZ} = 1)$&nbsp; bei&nbsp; $1$&nbsp; (erste mögliche Position für Spur 1)&nbsp;  $\hspace{0.2cm}\text{plus}\hspace{0.2cm}0$ (Bitangabe für " 000") $= 1$,
+#a positive pulse&nbsp; $({\rm sign} = 1)$&nbsp; at position&nbsp; $\big [1$&nbsp; (first possible position for track 1)&nbsp; $\hspace{0.02cm}\text{plus}\hspace{0.2cm}0$&nbsp; (bit specification for&nbsp; "000") $= 1\big]$,
-*einen weiteren positiven Impuls (da $110 > 000$) bei der Position $1 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5$ (Pulsabstand in jeder Spur)  $\hspace{0.2cm}\text{mal}\hspace{0.2cm}6$ (Bitangabe für " 110")  = $31\hspace{0.05cm}.$
+#another positive pulse&nbsp; $($since $110 > 000)$&nbsp; at position&nbsp; $\big [1 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5$ (pulse spacing in each track) $\hspace{0.02cm}\text{times}\hspace{0.2cm}6$&nbsp; (bit specification for " 110") = $31\hspace{0.05cm}\big].$
-'''Spur 2''' beinhaltet
+'''Track 2''' includes
-*einen negativen Impuls (${\rm VZ} = 0$) bei $2$ (erste mögliche Position für Spur 2)  $\hspace{0.2cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{mal}\hspace{0.2cm}4$ (Bitangabe für " 100")  = $22\hspace{0.05cm},$
+#a negative pulse (${\rm sign} = 0$)&nbsp; at position&nbsp; $\big [2$ (first possible position for track 2)&nbsp; $\hspace{0.02cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{times}\hspace{0.2cm}4$&nbsp; (bit specification for&nbsp; " 100")&nbsp;  $=22\hspace{0.05cm}\big],$
-*einen positiven Impuls (Vorzeichenumkehr wegen  $011 > 100$) bei der Position $2 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{mal}\hspace{0.2cm}3$ (Bitangabe für " 011")  = $17\hspace{0.05cm}.$
+#a positive pulse&nbsp; $($sign reversal because&nbsp; $011 < 100)$&nbsp; at position&nbsp; $\big [2 \hspace{0.2cm}\text{plus}\hspace{0.2cm}5\hspace{0.2cm}\text{times}\hspace{0.2cm}3$&nbsp; (bit specification for&nbsp; " 011")&nbsp;  $=17\hspace{0.05cm}\big].$
@@ Line 38: / Line 46: @@
-''Hinweise:''
+<u>Hint:</u>
-*Diese Aufgabe gehört zum Kapitel&nbsp; [[Examples_of_Communication_Systems/Sprachcodierung|Sprachcodierung]].
+*This exercise belongs to the chapter&nbsp; [[Examples_of_Communication_Systems/Voice_Coding|"Speech Coding"]].
-*Bei der Eingabe der Pulspositionen bezeichnet&nbsp; $N_{1}$&nbsp; das erste Bit–Tripel und&nbsp; $N_{2}$&nbsp; das zweite.
+*When entering the pulse positions&nbsp; $N_{1}$&nbsp; denotes the first triple of bits and&nbsp; $N_{2}$&nbsp; the second.
-*Man müsste zum Beispiel für Spur&nbsp; $2$&nbsp; die Werte&nbsp; $N_{1}=-22$&nbsp;  und&nbsp; $N_{2}=+17$&nbsp; eintragen.
+*For example,&nbsp; for track&nbsp; $2$&nbsp; one would have to enter the values&nbsp; $N_{1}=-22$&nbsp; and&nbsp; $N_{2}=+17$.
-===Fragebogen===
+===Questions===
 <quiz display=simple>
-{Wie viele Bit beschreiben einen Sprachrahmen $($der Dauer&nbsp; $20 \ \rm ms)$&nbsp; im&nbsp; $12.2 \ \rm kbit/s$–Modus?
+{How many bits describe a speech frame $($of duration&nbsp; $20 \ \rm ms)$&nbsp; in&nbsp; $12.2 \ \rm kbit/s$ mode?
 |type="{}"}
-$N_{12.2} \ = \ $ { 244 3% } $ \ \rm Bit$
+$N_{12.2} \ = \ $ { 244 3% } $ \ \rm bits$
-{Wie viele Bit werden für FCB–Puls und –Verstärkung pro Rahmen benötigt?
+{How many bits are needed for FCB pulse and gain per frame?
 |type="{}"}
-$N_{\rm FCB} \ = \ $ { 160 3% } $ \ \rm Bit$
+$N_{\rm FCB} \ = \ $ { 160 3% } $ \ \rm bits$
-{Wie viele Bit verbleiben somit für LPC und LTP?
+{ How many bits are left for LPC and LTP?
 |type="{}"}
-$N_{\rm LPC/LTP} \ = \ $ { 84 3% } $ \ \rm Bit$
+$N_{\rm LPC/LTP} \ = \ $ { 84 3% } $ \ \rm bits$
-{Welche Impulspositionen des Unterrahmens und Vorzeichen beschreibt die Spur&nbsp; $3$? <br>Beachten Sie die Hinweise zur Eingabe auf der Angabenseite.
+{What subframe pulse positions and signs does track&nbsp; $3$&nbsp; describe?&nbsp; Follow the instructions for input on the information page.
 |type="{}"}
 $N_{1} \ = \ $ { -8.24--7.76 }
 $N_{2} \ = \ $ { -18.54--17.46 }
-{Welche Impulspositionen inklusive Vorzeichen beschreiben die Spur&nbsp; $4$?
+{What pulse positions including sign describe the track&nbsp; $4$?
 |type="{}"}
 $N_{1} \ = \ $ { 39 3% }
 $N_{2} \ = \ $ { -14.42--13.58 }
-{Welche Impulspositionen inklusive Vorzeichen beschreiben die Spur&nbsp; $5$?
+{What pulse positions including sign describe the track&nbsp; $5$?
 |type="{}"}
 $N_{1} \ = \ $ { -30.9--29.1 }
@@ Line 79: / Line 88: @@
 </quiz>
-===Musterlösung===
+===Solution===
 {{ML-Kopf}}
-'''(1)'''&nbsp; Mit der Datenrate $12.2 \ \rm kbit/s$ ergeben sich innerhalb von $20 \ \rm ms$ genau $\underline{244 \ \rm Bit}$, während zum Beispiel im  $4.75 \ \rm kbit/s$–Modus nur $95 \ \rm Bit$ übertragen werden.
+'''(1)'''&nbsp; With the data rate&nbsp; $R_{\rm C} = 12.2 \ \rm kbit/s$,&nbsp; exactly&nbsp; $\underline{244 \ \rm bits}$&nbsp; are transmitted within&nbsp; $20 \ \rm ms$,&nbsp; whereas e.g. in the&nbsp; $4.75 \ \rm kbit/s$&nbsp; mode only&nbsp; $95 \ \rm bits$&nbsp; are transmitted.
+'''(2)'''&nbsp; In each subframe,&nbsp; the FCB pulse requires&nbsp; $35 \ \rm bits$&nbsp; (five tracks of seven bits each)&nbsp; and the FCB gain requires five bits.
-'''(2)'''&nbsp;  In jedem Unterrahmen benötigt der FCB–Puls $35 \ \rm Bit$ (fünf Spuren zu je sieben Bit) und die FCB–Verstärkung fünf Bit.
+*With four subframes,&nbsp; this gives $N_{\rm FCB} \hspace{0.15cm}\underline{= 160 \ \rm bits}$.
-*Bei vier Unterrahmen kommt man so auf $N_{\rm FCB} \underline{= 160 \ \rm Bit}$.
-'''(3)'''&nbsp; Hierfür verbleiben die Differenz aus (1) und (2), also $N_{\rm LPC/LTP}\underline{ = 84 \ \rm Bit}$.
+'''(3)'''&nbsp; This leaves the difference from&nbsp; '''(1)'''&nbsp; and&nbsp; '''(2)''',&nbsp; i.e. $N_{\rm LPC/LTP}\hspace{0.15cm} \underline{ = 84\  \rm bits}$.
-'''(4)'''&nbsp;  Das Vorzeichenbit "$0$" deutet auf einen negativen ersten Impuls hin.
+'''(4)'''&nbsp; The sign bit&nbsp; "$0$"&nbsp; indicates a negative first pulse.
-*Wegen $001 < 011$ hat der zweite Impuls das gleiche Vorzeichen.
+*Because&nbsp; $001 < 011$,&nbsp; the second pulse has the same sign.
-*Die beiden Beträge ergeben sich zu
-:$$|N_1| \ = \ 3 \hspace{0.1cm}{\rm(da \hspace{0.1cm} Spur \hspace{0.1cm}3)} + 5\cdot 1 \hspace{0.1cm} {\rm(Bitangabe \hspace{0.1cm} 001)} = 8\hspace{0.05cm}, $$
+*The two magnitudes result in
-:$$ |N_2| \ = \ 3 \hspace{0.1cm}{\rm(da \hspace{0.1cm} Spur \hspace{0.1cm}3)} + 5\cdot 3 \hspace{0.1cm} {\rm(Bitangabe \hspace{0.1cm} 011)} = 18\hspace{0.05cm}.$$
+:$$|N_1| \ = \ 3 \hspace{0.1cm}{\rm(since \hspace{0.1cm} track \hspace{0.1cm}3)} + 5\cdot 1 \hspace{0.1cm} {\rm(bit\:specification \hspace{0.1cm} 001)} = 8\hspace{0.05cm}, $$
-*Einzugeben sind deshalb für die dritte Spur  $N_{1} \underline{ = -8}$ und $N_{2} \underline{ = -18}.$
+:$$ |N_2| \ = \ 3 \hspace{0.1cm}{\rm(since \hspace{0.1cm} track \hspace{0.1cm}3)} + 5\cdot 3 \hspace{0.1cm} {\rm(bit\:specification \hspace{0.1cm} 011)} = 18\hspace{0.05cm}.$$
+*Therefore,&nbsp; the values to be entered for the third track are&nbsp; $N_{1}\hspace{0.15cm} \underline{ = -8}$&nbsp; and&nbsp; $N_{2} \hspace{0.15cm}\underline{ = -18}.$
-'''(5)'''&nbsp;  In analoger Weise erhält man für die Spur $4$ die Werte&nbsp; $N_{1} \underline{ = +39}$&nbsp; und&nbsp; $N_{2} \underline{ = -14}$.
+'''(5)'''&nbsp; In an analogous way,&nbsp; for track&nbsp; $4$&nbsp; we obtain the values&nbsp; $N_{1}\hspace{0.15cm} \underline{ = +39}$&nbsp; and&nbsp; $N_{2}\hspace{0.15cm} \underline{ = -14}$.
-'''(6)'''&nbsp; Die fünfte Spur liefert&nbsp; $N_{1} \underline{ =-30}$&nbsp; und&nbsp; $N_{2} \underline{ = +5}$
+'''(6)'''&nbsp; The fifth track provides&nbsp; $N_{1}\hspace{0.15cm} \underline{ =-30}$&nbsp; and&nbsp; $N_{2}\hspace{0.15cm} \underline{ = +5}$
 {{ML-Fuß}}
@@ Line 110: / Line 121: @@
-[[Category:Examples of Communication Systems: Exercises|^3.3 Sprachcodierung^]]
+[[Category:Examples of Communication Systems: Exercises|^3.3 Speech Coding^]]