Difference between revisions of "Channel Coding/Decoding of Convolutional Codes"

From LNTwww
 
(127 intermediate revisions by 7 users not shown)
Line 1: Line 1:
 
   
 
   
 
{{Header
 
{{Header
|Untermenü=Faltungscodierung und geeignete Decoder
+
|Untermenü=Convolutional Codes and Their Decoding
|Vorherige Seite=Codebeschreibung mit Zustands– und Trellisdiagramm
+
|Vorherige Seite=Code Description with State and Trellis Diagram
|Nächste Seite=Distanzeigenschaften und Fehlerwahrscheinlichkeitsschranken
+
|Nächste Seite=Distance Characteristics and Error Probability Bounds
 
}}
 
}}
  
== Blockschaltbild und Voraussetzungen ==
+
== Block diagram and requirements ==
 
<br>
 
<br>
Ein wesentlicher Vorteil der Faltungscodierung ist, dass es hierfür mit dem Viterbi&ndash;Algorithmus ein sehr effizientes Decodierverfahren gibt. Dieser von Andrew J. Viterbi entwickelte Algorithmus wurde bereits im [http://en.lntwww.de/Digitalsignal%C3%BCbertragung/Viterbi%E2%80%93Empf%C3%A4nger#Blockschaltbild_und_Voraussetzungen_f.C3.BCr_Kapitel_3.8_.281.29 Kapitel 3.8] des Buches &bdquo;Digitalsignalübertragung&rdquo; im Hinblick auf seinen Einsatz zur Entzerrung im Detail beschrieben. Für seinen Einsatz als Faltungsdecodierer gehen wir von folgendem Blockschaltbild und den folgenden Voraussetzungen aus:<br>
+
A significant advantage of convolutional coding is that there is a very efficient decoding method for this in the form of the&nbsp; "Viterbi algorithm".&nbsp; This algorithm,&nbsp; developed by&nbsp; [https://en.wikipedia.org/wiki/Andrew_Viterbi $\text{Andrew James Viterbi}$]&nbsp; has already been described in the chapter&nbsp; [[Digital_Signal_Transmission/Viterbi_Receiver| "Viterbi receiver"]]&nbsp; of the book "Digital Signal Transmission" with regard to its use for equalization.  
  
[[File:P ID2651 KC T 3 4 S1 v1.png|Systemmodell zur Beschreibung der Decodierung von Faltungscodes|class=fit]]<br>
+
For its use as a convolutional decoder we assume the block diagram on the right and the following prerequisites:<br>
*Die Informationssequenz <u><i>u</i></u> = (<i>u</i><sub>1</sub>, <i>u</i><sub>2</sub>, ...) ist hier im Gegensatz zur Beschreibung der linearen Blockcodes &#8658; [http://en.lntwww.de/Kanalcodierung/Decodierung_von_Faltungscodes#Blockschaltbild_und_Voraussetzungen Kapitel 1.5] oder von Reed&ndash;Solomon&ndash;Codes &#8658; [http://en.lntwww.de/Kanalcodierung/Reed%E2%80%93Solomon%E2%80%93Decodierung_beim_Ausl%C3%B6schungskanal#Blockschaltbild_und_Voraussetzungen_zu_Kapitel_2.4 Kapitel 2.4] im allgemeinen unendlich lang (<i>&bdquo;semi&ndash;infinite&rdquo;</i>). Für die Informationssymbole gilt stets <i>u<sub>i</sub></i> &#8712; {0, 1}.<br>
 
  
*Die Codesequenz <u><i>x</i></u> = (<i>x</i><sub>1</sub>, <i>x</i><sub>2</sub>, ...) mit <i>x<sub>i</sub></i> &#8712; {0, 1} hängt außer von <u><i>u</i></u> auch noch von der Coderate <i>R</i> = 1/<i>n</i>, dem Gedächtnis <i>m</i> und der Übertragungsfunktionsmatrix <b>G</b>(<i>D</i>) ab. Bei endlicher Anzahl <i>L</i> an Informationsbits sollte der Faltungscode durch Anfügen von <i>m</i> Nullen terminiert werden:
+
[[File:EN_KC_T_3_4_S1.png|right|frame|System model for the decoding of convolutional codes|class=fit]]
  
::<math>\underline{u}= (u_1,\hspace{0.05cm} u_2,\hspace{0.05cm} ...  \hspace{0.1cm}, u_L, \hspace{0.05cm} 0 \hspace{0.05cm},\hspace{0.05cm} ...  \hspace{0.1cm}, 0 ) \hspace{0.3cm}\Rightarrow \hspace{0.3cm}
+
*The information sequence&nbsp; $\underline{u} = (u_1, \ u_2, \ \text{... } \ )$&nbsp; is here in contrast to the description of linear block codes &nbsp; &#8658; &nbsp; [[Channel_Coding/Decoding_of_Linear_Block_Codes#Block_diagram_and_requirements| "first main chapter"]]&nbsp; or of Reed&ndash;Solomon codes &nbsp; &#8658; &nbsp; [[Channel_Coding/Reed-Solomon_Decoding_for_the_Erasure_Channel#Block_diagram_and_requirements_for_Reed-Solomon_error_detection| "second main chapter"]]&nbsp; generally infinitely long&nbsp; ("semi&ndash;infinite").&nbsp; For the information symbols always applies&nbsp; $u_i &#8712; \{0, 1\}$.<br>
\underline{x}= (x_1,\hspace{0.05cm} x_2,\hspace{0.05cm} ... \hspace{0.1cm}, x_{2L}, \hspace{0.05cm} x_{2L+1} ,\hspace{0.05cm} ...  \hspace{0.1cm}, \hspace{0.05cm} x_{2L+2m} ) \hspace{0.05cm}.</math>
 
  
*Die Empfangssequenz <u><i>y</i></u> = (<i>y</i><sub>1</sub>, <i>y</i><sub>2</sub>, ...) ergibt sich entsprechend dem angenommenen Kanalmodell. Bei einem digitalen Modell wie dem [http://en.lntwww.de/Kanalcodierung/Klassifizierung_von_Signalen#Binary_Symmetric_Channel_.E2.80.93_BSC Binary Symmetric Channel] (BSC) gilt <i>y<sub>i</sub></i> &#8712; {0, 1}, so dass die Verfälschung von <u><i>x</i></u> auf <u><i>y</i></u> mit der [http://en.lntwww.de/Kanalcodierung/Zielsetzung_der_Kanalcodierung#Einige_wichtige_Definitionen_zur_Blockcodierung Hamming&ndash;Distanz] <i>d</i><sub>H</sub> (<u><i>x</i></u>, <u><i>y</i></u>) quantifiziert werden kann. Die erforderlichen Modifikationen für den [http://en.lntwww.de/Kanalcodierung/Klassifizierung_von_Signalen#AWGN.E2.80.93Kanal_bei_bin.C3.A4rem_Eingang AWGN&ndash;Kanal] folgen auf [http://en.lntwww.de/Kanalcodierung/Decodierung_von_Faltungscodes#Viterbi.E2.80.93Algorithmus.2C_basierend_auf_Korrelation_und_Metriken_.281.29 Seite 6] dieses Kapitels.
+
*The encoded sequence&nbsp; $\underline{x} = (x_1, \ x_2, \ \text{... })$&nbsp; with&nbsp; $x_i &#8712; \{0, 1\}$&nbsp; depends not only on &nbsp; $\underline{u}$ &nbsp; but also on the code rate&nbsp; $R = 1/n$, the memory&nbsp; $m$&nbsp; and the transfer function matrix&nbsp; $\mathbf{G}(D)$&nbsp; . For finite number&nbsp; $L$&nbsp; of information bits,&nbsp; the convolutional code should be terminated by appending&nbsp; $m$&nbsp; zeros:
  
*Der nachfolgend beschriebene Viterbi&ndash;Algorithmus liefert eine Schätzung <u><i>z</i></u> für die Codesequenz <u><i>x</i></u> und eine weitere Schätzung <u><i>&upsilon;</i></u> für die Informationssequenz <u><i>u</i></u>. Dabei gilt:
+
::<math>\underline{u}= (u_1,\hspace{0.05cm} u_2,\hspace{0.05cm} \text{...} \hspace{0.1cm}, u_L, \hspace{0.05cm} 0 \hspace{0.05cm},\hspace{0.05cm} \text{...}  \hspace{0.1cm}, 0 ) \hspace{0.3cm}\Rightarrow \hspace{0.3cm}
 +
\underline{x}= (x_1,\hspace{0.05cm} x_2,\hspace{0.05cm} \text{...}  \hspace{0.1cm}, x_{2L}, \hspace{0.05cm} x_{2L+1} ,\hspace{0.05cm} \text{...}  \hspace{0.1cm}, \hspace{0.05cm} x_{2L+2m} ) \hspace{0.05cm}.</math>
 +
 
 +
*The received sequence&nbsp; $\underline{y} = (y_1, \ y_2, \ \text{...} )$&nbsp; results according to the assumed channel model. For a digital model like the&nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#Binary_Symmetric_Channel_.E2.80.93_BSC|$\text{Binary Symmetric Channel}$]]&nbsp; $\rm (BSC)$&nbsp; holds &nbsp; $y_i &#8712; \{0, 1\}$,&nbsp; so the falsification from&nbsp; $\underline{x}$&nbsp; to&nbsp; $\underline{y}$ &nbsp; can be quantified with the&nbsp; [[Channel_Coding/Objective_of_Channel_Coding#Important_definitions_for_block_coding|$\text{Hamming distance}$]]&nbsp; $d_{\rm H}(\underline{x}, \underline{y})$.
 +
 
 +
*The required modifications for the&nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#AWGN_channel_at_binary_input|$\text{AWGN channel}$]]&nbsp; follow in the section&nbsp; [[Channel_Coding/Decoding_of_Convolutional_Codes#Viterbi_algorithm_based_on_correlation_and_metrics| "Viterbi algorithm based on correlation and metrics"]].
 +
 
 +
*The Viterbi algorithm provides an estimate&nbsp; $\underline{z}$&nbsp; for the encoded sequence&nbsp; $\underline{x}$&nbsp; and another estimate&nbsp; $\underline{v}$&nbsp; for the information sequence&nbsp; $\underline{u}$.&nbsp; Thereby holds:
  
 
::<math>{\rm Pr}(\underline{z} \ne \underline{x})\stackrel{!}{=}{\rm Minimum}
 
::<math>{\rm Pr}(\underline{z} \ne \underline{x})\stackrel{!}{=}{\rm Minimum}
Line 27: Line 32:
 
\hspace{0.05cm}.</math>
 
\hspace{0.05cm}.</math>
  
'''Zusammenfassung :''' Der Viterbi&ndash;Algorithmus sucht bei einem digitalen Kanalmodell (zum Beispiel BSC) von allen möglichen Codesequenzen <u><i>x</i></u>' diejenige Sequenz <u><i>z</i></u> mit der minimalen Hamming&ndash;Distanz <i>d</i><sub>H</sub>(<u><i>x</i></u>', <u><i>y</i></u>) zur Empfangssequenz <u><i>y</i></u>:
+
{{BlaueBox|TEXT= 
 +
$\text{Conclusion:}$&nbsp; Given a digital channel model &nbsp; $($for example, &nbsp; the BSC model$)$, &nbsp; the Viterbi algorithm searches from all possible encoded sequences&nbsp; $\underline{x}\hspace{0.05cm}'$&nbsp; the sequence&nbsp; $\underline{z}$&nbsp; with the minimum Hamming distance &nbsp; $d_{\rm H}(\underline{x}\hspace{0.05cm}', \underline{y})$ &nbsp; to the received sequence&nbsp; $\underline{y}$:
  
:<math>\underline{z} = {\rm arg} \min_{\underline{x}' \in \hspace{0.05cm} \mathcal{C}} \hspace{0.1cm} d_{\rm H}( \underline{x}'\hspace{0.02cm},\hspace{0.02cm} \underline{y}  )  
+
:<math>\underline{z} = {\rm arg} \min_{\underline{x}\hspace{0.05cm}' \in \hspace{0.05cm} \mathcal{C} } \hspace{0.1cm} d_{\rm H}( \underline{x}\hspace{0.05cm}'\hspace{0.02cm},\hspace{0.02cm} \underline{y}  )  
= {\rm arg} \max_{\underline{x}' \in \hspace{0.05cm} \mathcal{C}} \hspace{0.1cm} {\rm Pr}( \underline{y}  \hspace{0.05cm}|\hspace{0.05cm} \underline{x}')\hspace{0.05cm}.</math>
+
= {\rm arg} \max_{\underline{x}' \in \hspace{0.05cm} \mathcal{C} } \hspace{0.1cm} {\rm Pr}( \underline{y}  \hspace{0.05cm} \vert \hspace{0.05cm} \underline{x}')\hspace{0.05cm}.</math>
  
Das bedeutet auch: Der Viterbi&ndash;Algorithmus erfüllt das [http://en.lntwww.de/Kanalcodierung/Klassifizierung_von_Signalen#MAP.E2.80.93_und_ML.E2.80.93Kriterium_.281.29 Maximum&ndash;Likelihood&ndash;Kriterium.]<br>
+
*This also means: &nbsp; The Viterbi algorithm satisfies the&nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#Criteria_.C2.BBMaximum-a-posteriori.C2.AB_and_.C2.BBMaximum-Likelihood.C2.AB|$\text{maximum likelihood criterion}$]].}}<br>
  
== Vorbemerkungen zu den nachfolgenden Decodierbeispielen (1) ==
+
== Preliminary remarks on the following decoding examples ==
 
<br>
 
<br>
In den Beispielen dieses Kapitels wird stets von unserem Standard&ndash;Faltungscodierer der Rate <i>R</i>&nbsp;=&nbsp;1/2, mit dem Gedächtnis <i>m</i> = 2 sowie der Übertragungsfunktionsmatrix <b>G</b>(<i>D</i>) = (1 + <i>D</i> + <i>D</i><sup>2</sup>, 1 + <i>D</i><sup>2</sup>) ausgegangen. Die Länge der Informationssequenz sei <i>L</i> = 5 und unter Berücksichtigung der Terminierung <i>L&prime;</i> = 7. Die betrachteten Codesequenzen <u><i>x</i></u> und auch die Empfangssequenzen <u><i>y</i></u> setzen sich somit jeweils aus 14 Bit zusammen. Durch Aufteilung entsprechend <u><i>y</i></u> = ( <u><i>y</i></u><sub>1</sub>,  <u><i>y</i></u><sub>2</sub>, ... ,  <u><i>y</i></u><sub>7</sub>) ergeben sich die Bitpaare  <u><i>y</i></u><sub><i>i</i></Sub>&nbsp;&#8712;&nbsp;{00,&nbsp;01,&nbsp;10,&nbsp;11}.<br>
+
[[File:P ID2652 KC T 3 4 S2 v1.png|right|frame|Trellis for decoding the received sequence&nbsp; $\underline{y}$|class=fit]]
 +
The following&nbsp; &raquo;'''prerequisites'''&laquo; &nbsp; apply to all examples in this chapter:
  
[[File:P ID2652 KC T 3 4 S2 v1.png|Trellis zur Decodierung der Empfangssequenz <u><i>y</i></u>|class=fit]]<br>
+
*Standard convolutional encoder: &nbsp; Rate $R = 1/2$,&nbsp; memory&nbsp; $m = 2$;
 +
   
 +
*transfer function matrix: &nbsp; $\mathbf{G}(D) = (1 + D + D^2, 1 + D^2)$;
  
Die Viterbi&ndash;Decodierung erfolgt mit Hilfe des dargestellten Trellisdiagramms. Ein roter Pfeil steht für die Hypothese <i>u<sub>i</sub></i> = 0, ein blauer für <i>u<sub>i</sub></i> = 1. Um Verwechslungen zu vermeiden, versehen wir hypothetische Größen mit Apostroph. Auch dann, wenn wir sicher wissen, dass das Informationsbit <i>u<sub>i</sub></i>&nbsp;=&nbsp;1 ist, gilt <i>u<sub>i</sub></i>'&nbsp;&#8712;&nbsp;{0,&nbsp;1}. Neben den Pfeilen steht die jeweilige hypothetische Codesequenz <i><u>x</u><sub>i</sub></i>' &#8712; {00, 01, 10, 11}.<br>
+
*length of information sequence: &nbsp; $L = 5$;
  
Wir gehen auf dieser und den nachfolgenden Seiten davon aus, dass die Viterbi&ndash;Decodierung auf der [http://en.lntwww.de/Kanalcodierung/Zielsetzung_der_Kanalcodierung#Einige_wichtige_Definitionen_zur_Blockcodierung Hamming&ndash;Distanz] <i>d</i><sub>H</sub>(<u><i>x</i></u><sub><i>i</i></sub>', <u><i>y</i></u><sub><i>i</i></sub>) zwischen dem Empfangswort <u><i>y</i></u><sub><i>i</i></sub> und den vier möglichen Codeworten <i>x</i><sub>i</sub>'&nbsp;&#8712;&nbsp;{00,&nbsp;01,&nbsp;10,&nbsp;11}  basiert. Dann gehen wir wie folgt vor:
+
*consideration of termination: &nbsp; $L\hspace{0.05cm}' = 7$;
  
*In den noch leeren Kreisen werden die Fehlergrößen <i>&Gamma;</i><sub><i>i</i></sub>(<i>S<sub>&mu;</sub></i>) der Zustände <i>S<sub>&mu;</sub></i> (0 &#8804; <i>&mu;</i> &#8804; 3) zu den Zeitpunkten <i>i</i> eingetragen. Der Startwert ist stets <i>&Gamma;</i><sub>0</sub>(<i>S</i><sub>0</sub>) = 0.
+
*length of sequences&nbsp; $\underline{x}$&nbsp; and&nbsp; $\underline{y}$&nbsp;: &nbsp; $14$&nbsp; bits each;
  
*Die Fehlergrößen für <i>i</i> = 1 und <i>i</i> = 2 ergeben sich zu
+
*allocation according to&nbsp; $\underline{y} = (\underline{y}_1, \ \underline{y}_2, \ \text{...} \ , \ \underline{y}_7)$ <br>&rArr; &nbsp; bit pairs&nbsp; $\underline{y}_i &#8712; \{00, 01, 10, 11\}$;
 +
*Viterbi decoding using trellis diagram:
 +
::red arrow &nbsp; &rArr; &nbsp; hypothesis&nbsp; $u_i = 0$,
 +
::blue arrow &nbsp; &rArr; &nbsp; hypothesis&nbsp; $u_i = 1$;
  
::<math>{\it \Gamma}_1(S_0) \hspace{-0.15cm}  =  \hspace{-0.15cm} d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_1 \big )  \hspace{0.05cm}, \hspace{2.13cm}{\it \Gamma}_1(S_1) = d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_1 \big )  \hspace{0.05cm},</math>
+
*hypothetical encoded sequence&nbsp; $\underline{x}_i\hspace{0.01cm}' &#8712; \{00, 01, 10, 11\}$;
::<math>{\it \Gamma}_2(S_0) \hspace{-0.15cm} \hspace{-0.15cm}{\it \Gamma}_1(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big )\hspace{0.05cm}, \hspace{0.4cm}{\it \Gamma}_2(S_1) = {\it \Gamma}_1(S_0)+ d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big )  \hspace{0.05cm},</math>
 
::<math>{\it \Gamma}_2(S_2) \hspace{-0.15cm}  =  \hspace{-0.15cm}{\it \Gamma}_1(S_1) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big )\hspace{0.05cm}, \hspace{0.4cm}{\it \Gamma}_2(S_3) = {\it \Gamma}_1(S_1)+ d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big )  \hspace{0.05cm}.</math>
 
  
Die Beschreibung wird auf der nächsten Seite fortgesetzt.<br>
+
*all hypothetical quantities with apostrophe.
 +
<br clear=all>
 +
We always assume that the Viterbi decoding is based at the&nbsp; [[Channel_Coding/Objective_of_Channel_Coding#Important_definitions_for_block_coding|$\text{Hamming distance}$]]&nbsp; $d_{\rm H}(\underline{x}_i\hspace{0.01cm}', \ \underline{y}_i)$&nbsp; between the received word&nbsp; $\underline{y}_i$&nbsp; and the four possible code words&nbsp; $x_i\hspace{0.01cm}' &#8712; \{00, 01, 10, 11\}$.&nbsp;  We then proceed as follows:
  
== Vorbemerkungen zu den nachfolgenden Decodierbeispielen (2) ==
+
*In the still empty circles the error value&nbsp; ${\it \Gamma}_i(S_{\mu})$&nbsp; of states&nbsp; $S_{\mu} (0 &#8804; \mu &#8804; 3)$&nbsp; at time points&nbsp; $i$&nbsp; are entered.&nbsp; The initial value is always&nbsp; ${\it \Gamma}_0(S_0) = 0$.
<br>
+
 
<b>Fortsetzung der allgemeinen Viterbi&ndash;Beschreibung</b>:<br>
+
*The error values for&nbsp; $i = 1$&nbsp; and&nbsp; $i = 2$&nbsp; are given by
 +
 
 +
::<math>{\it \Gamma}_1(S_0) =d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_1 \big )  \hspace{0.05cm}, \hspace{2.38cm}{\it \Gamma}_1(S_1) = d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_1 \big )  \hspace{0.05cm},</math>
 +
::<math>{\it \Gamma}_2(S_0) ={\it \Gamma}_1(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big )\hspace{0.05cm}, \hspace{0.6cm}{\it \Gamma}_2(S_1) = {\it \Gamma}_1(S_0)+ d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big )  \hspace{0.05cm},\hspace{0.6cm}{\it \Gamma}_2(S_2) ={\it \Gamma}_1(S_1) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big )\hspace{0.05cm}, \hspace{0.6cm}{\it \Gamma}_2(S_3) = {\it \Gamma}_1(S_1)+ d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big )  \hspace{0.05cm}.</math>
  
[[File:P ID2664 KC T 3 4 S2 v1.png|Trellis zur Decodierung der Empfangssequenz  <u><i>y</i></u>|class=fit]]<br>
+
*From&nbsp; $i = 3$&nbsp; the trellis has reached its basic form, and to compute all&nbsp; ${\it \Gamma}_i(S_{\mu})$&nbsp; the minimum between two sums must be determined in each case:
  
*Ab dem Zeitpunkt <i>i</i> = 3 hat das Trellisdiagramm seine Grundform erreicht, und zur Berechnung aller <i>&Gamma;</i><sub><i>i</i></sub>(<i>S<sub>&mu;</sub></i>) muss jeweils das Minimum zwischen zwei Summen ermittelt werden:
+
::<math>{\it \Gamma}_i(S_0) ={\rm Min} \left [{\it \Gamma}_{i-1}(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_2) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm},</math>
 +
::<math>{\it \Gamma}_i(S_1)={\rm Min} \left [{\it \Gamma}_{i-1}(S_0) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_2) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm},</math>
 +
::<math>{\it \Gamma}_i(S_2) ={\rm Min} \left [{\it \Gamma}_{i-1}(S_1) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_3) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm},</math>
 +
::<math>{\it \Gamma}_i(S_3) ={\rm Min} \left [{\it \Gamma}_{i-1}(S_1) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_3) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm}.</math>
  
::<math>{\it \Gamma}_i(S_0) \hspace{-0.15cm}  =  \hspace{-0.15cm}{\rm Min} \left [{\it \Gamma}_{i-1}(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_2) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm},</math>
+
*Of the two branches arriving at a node&nbsp; ${\it \Gamma}_i(S_{\mu})$&nbsp; the worse one&nbsp; $($which would have led to a larger&nbsp; ${\it \Gamma}_i(S_{\mu})$&nbsp; is eliminated.&nbsp; Only one branch then leads to each node.<br>
::<math>{\it \Gamma}_i(S_1) \hspace{-0.15cm}  =  \hspace{-0.15cm}{\rm Min} \left [{\it \Gamma}_{i-1}(S_0) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_2) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm},</math>
 
::<math>{\it \Gamma}_i(S_2) \hspace{-0.15cm}  =  \hspace{-0.15cm}{\rm Min} \left [{\it \Gamma}_{i-1}(S_1) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_3) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm},</math>
 
::<math>{\it \Gamma}_i(S_3) \hspace{-0.15cm}  =  \hspace{-0.15cm}{\rm Min} \left [{\it \Gamma}_{i-1}(S_1) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_3) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm}.</math>
 
  
*Von den zwei an einem Knoten <i>&Gamma;</i><sub><i>i</i></sub>(<i>S<sub>&mu;</sub></i>) ankommenden Zweigen wird der schlechtere (der zu einem größeren <i>&Gamma;</i><sub><i>i</i></sub>(<i>S<sub>&mu;</sub></i>) geführt hätte) eliminiert. Zu jedem Knoten führt dann nur noch ein einziger Zweig.<br>
+
*Once all error values up to and including&nbsp; $i = 7$&nbsp; have been determined,&nbsp; the Viterbi algotithm can be completed by searching the&nbsp; "connected path"&nbsp; from the end of the trellis &nbsp; &#8658; &nbsp; ${\it \Gamma}_7(S_0)$&nbsp; to the beginning &nbsp; &#8658; &nbsp; ${\it \Gamma}_0(S_0)$&nbsp;.
  
*Sind alle Fehlergrößen bis einschließlich <i>i</i> = 7 ermittelt, so kann der Viterbi&ndash;Algotithmus mit der Suche das zusammenhängenden Pfades vom Ende des Trellis &#8658; <i>&Gamma;</i><sub>7</sub>(<i>S</i><sub>0</sub>) bis zum Anfang &#8658; <i>&Gamma;</i><sub>0</sub>(<i>S</i><sub>0</sub>) abgeschlossen werden.<br>
+
*Through this path,&nbsp; the most likely encoded sequence&nbsp; $\underline{z}$&nbsp; and the most likely information sequence&nbsp; $\underline{v}$&nbsp; are then fixed.
 +
 +
*Not all received sequences are transmitted error-free&nbsp; $(\underline{y} =\underline{x})$,&nbsp; however often holds with Viterbis decoding: &nbsp; $\underline{z} = \underline{x}$&nbsp; and&nbsp; $\underline{v} = \underline{u}$.  
  
*Durch diesen Pfad liegen dann die  am wahrscheinlichsten erscheinende Codesequenz <u><i>z</i></u> und die wahrscheinlichste Informationssequenz <u><i>&upsilon;</i></u> fest. Nicht für alle Empfangssequenzen  <u><i>y</i></u> gilt aber <u><i>z</i></u> = <u><i>x</i></u> und <u><i>&upsilon;</i></u>&nbsp;=&nbsp;<u><i>u</i></u>. Das heißt: Bei zu vielen Übertragungsfehlern versagt auch der Viterbi&ndash;Algorithmus.<br>
+
*'''But if there are too many transmission errors,&nbsp; the Viterbi algorithm also fails'''.<br>
  
== Decodierbeispiel für den fehlerfreien Fall (1) ==
+
== Creating the trellis in the error-free case &nbsp;&ndash;&nbsp; Acumulated error value calculation==
 
<br>
 
<br>
Zunächst gehen wir von der Empfangssequenz <u><i>y</i></u> = (11, 01, 01, 11, 11, 10, 11) aus, die hier &ndash; wegen der Codewortlänge <i>n</i> = 2 &ndash; bereits in Bitpaare <u><i>y</i></u><sub>1</sub>, ... , <u><i>y</i></u><sub>7</sub> unterteilt ist.<br>
+
First,&nbsp; we assume the received sequence&nbsp; $\underline{y} = (11, 01, 01, 11, 11, 10, 11)$&nbsp; which is here  already subdivided into bit pairs:&nbsp;  
 +
:$$\underline{y}_1, \hspace{0.05cm} \text{...} \hspace{0.05cm} , \ \underline{y}_7.$$
 +
 
 +
The numerical values entered in the trellis and the different types of strokes are explained in the following text.<br>
 +
 
 +
[[File:KC_T_3_4_S3a_neu.png|right|frame|Viterbi scheme for the received vector&nbsp; $\underline{y} = (11, 01, 01, 11, 11, 10, 11)$|class=fit]]
 +
 
 +
*Starting from the initial value&nbsp; ${\it \Gamma}_0(S_0) = 0$&nbsp; we get&nbsp; $\underline{y}_1 = (11)$&nbsp; by adding the Hamming distances
 +
:$$d_{\rm H}((00), \ \underline{y}_1) = 2\hspace{0.6cm} \text{or} \hspace{0.6cm}d_{\rm H}((11), \ \underline{y}_1) = 0$$
  
[[File:P ID2653 KC T 3 4 S2 v95.png|Viterbi–Schema für <u><i>y</i></u> = (11, 01, 01, 11, 11, 10, 11)|class=fit]]<br>
+
:to the&nbsp; "$($acumulated$)$ error values" &nbsp; ${\it \Gamma}_1(S_0) = 2, \ {\it \Gamma}_1(S_1) = 0$.<br>
  
Die eingetragenen Zahlenwerte und die verschiedenen Stricharten werden im folgenden Text erklärt:
+
*In the second decoding step there are error values for all four states: &nbsp; With&nbsp; $\underline{y}_2 = (01)$&nbsp; one obtains:
*Ausgehend vom Initialwert <i>&Gamma;</i><sub>0</sub>(<i>S</i><sub>0</sub>) = 0 kommt man mit <u><i>y</i></u><sub>1</sub> = (11) durch Addition der Hamming-Distanzen <i>d</i><sub>H</sub> ((00), <u><i>y</i></u><sub>1</sub>) = 2 bzw. <i>d</i><sub>H</sub> ((11), <u><i>y</i></u><sub>1</sub>) = 0 zu den Fehlergrößen <i>&Gamma;</i><sub>1</sub>(<i>S</i><sub>0</sub>) = 2, <i>&Gamma;</i><sub>1</sub>(<i>S</i><sub>1</sub>) = 0.<br>
 
  
*Im zweiten Decodierschritt gibt es Fehlergrößen für alle vier Zustände: Mit <u><i>y</i></u><sub>2</sub> = (01) erhält man:
+
::<math>{\it \Gamma}_2(S_0) = {\it \Gamma}_1(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} (01) \big )  = 2+1 = 3 \hspace{0.05cm},</math>
 +
::<math>{\it \Gamma}_2(S_1) ={\it \Gamma}_1(S_0) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} (01) \big )  = 2+1 = 3 \hspace{0.05cm},</math>
 +
::<math>{\it \Gamma}_2(S_2) ={\it \Gamma}_1(S_1) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} (01) \big )  = 0+2=2 \hspace{0.05cm},</math>
 +
::<math>{\it \Gamma}_2(S_3) = {\it \Gamma}_1(S_1) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} (01) \big )  = 0+0=0 \hspace{0.05cm}.</math>
  
::<math>{\it \Gamma}_2(S_0) \hspace{-0.15cm}  =  \hspace{-0.15cm} {\it \Gamma}_1(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} (01) \big )  = 2+1 = 3 \hspace{0.05cm},</math>
+
*In all further decoding steps,&nbsp; two values must be compared in each case,&nbsp; whereby the node&nbsp; ${\it \Gamma}_i(S_{\mu})$&nbsp; is always assigned the smaller value.  
::<math>{\it \Gamma}_2(S_1) \hspace{-0.15cm}  =  \hspace{-0.15cm} {\it \Gamma}_1(S_0) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} (01) \big )  = 2+1 = 3 \hspace{0.05cm},</math>
 
::<math>{\it \Gamma}_2(S_2) \hspace{-0.15cm}  =  \hspace{-0.15cm} {\it \Gamma}_1(S_1) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} (01) \big )  = 0+2=2 \hspace{0.05cm},</math>
 
::<math>{\it \Gamma}_2(S_3) \hspace{-0.15cm}  =  \hspace{-0.15cm} {\it \Gamma}_1(S_1) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} (01) \big )  = 0+0=0 \hspace{0.05cm}.</math>
 
  
*In allen weiteren Decodierschritten müssen jeweils zwei Werte verglichen werden, wobei dem Knoten <i>&Gamma;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>) stets der kleinere Wert zugewiesen wird. Beispielsweise gilt für <i>i</i> = 3 mit <u><i>y</i></u><sub>3</sub> = (01):
+
*For example,&nbsp; for&nbsp; $i = 3$&nbsp; with&nbsp; $\underline{y}_3 = (01)$:
  
::<math>{\it \Gamma}_3(S_0) \hspace{-0.15cm}  = \hspace{-0.15cm}{\rm min} \left [{\it \Gamma}_{2}(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} (01) \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{2}(S_2) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} (01) \big ) \right ] =</math>
+
::<math>{\it \Gamma}_3(S_0) ={\rm min} \left [{\it \Gamma}_{2}(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} (01) \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{2}(S_2) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} (01) \big ) \right ] ={\rm min} \big [ 3+1\hspace{0.05cm},\hspace{0.05cm} 2+1 \big ] = 3\hspace{0.05cm},</math>
::<math> \hspace{1.2cm} =  \hspace{-0.15cm}{\rm min} \left [ 3+1\hspace{0.05cm},\hspace{0.05cm} 2+1 \right ] = 3\hspace{0.05cm},</math>
+
::<math>{\it \Gamma}_3(S_3) ={\rm min} \left [{\it \Gamma}_{2}(S_1) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} (01) \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{2}(S_3) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} (01) \big ) \right ] ={\rm min} \big [ 3+0\hspace{0.05cm},\hspace{0.05cm} 0+2 \big ] = 2\hspace{0.05cm}.</math>
::<math>{\it \Gamma}_3(S_3) \hspace{-0.15cm}  = \hspace{-0.15cm}{\rm min} \left [{\it \Gamma}_{2}(S_1) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} (01) \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{2}(S_3) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} (01) \big ) \right ] =</math>
 
::<math> \hspace{1.2cm}  =  \hspace{-0.15cm}{\rm min} \left [ 3+0\hspace{0.05cm},\hspace{0.05cm} 0+2 \right ] = 2\hspace{0.05cm}.</math>
 
  
*Ab <i>i</i> = 6 wird im betrachteten Beispiel die Terminierung des Faltungscodes wirksam. Hier sind nur noch zwei Vergleiche zur Bestimmung von <i>&Gamma;</i><sub>6</sub>(<i>S</i><sub>0</sub>) und <i>&Gamma;</i><sub>6</sub>(<i>S</i><sub>2</sub>) anzustellen, und für <i>i</i> = 7 nur noch ein Vergleich mit dem Endergebnis <i>&Gamma;</i><sub>7</sub>(<i>S</i><sub>0</sub>).<br>
+
* In the considered example,&nbsp; from&nbsp; $i = 6$&nbsp; the termination of the convolutional code becomes effective.&nbsp; Here,&nbsp; only two comparisons are left to determine&nbsp; ${\it \Gamma}_6(S_0) = 3$&nbsp; and&nbsp; ${\it \Gamma}_6(S_2)= 0$&nbsp; and for&nbsp; $i = 7$&nbsp; only one comparison with the final error value&nbsp; ${\it \Gamma}_7(S_0) = 0$.<br>
  
*Von den jeweils zwei an einem Knoten ankommenden Zweigen wird stets nur derjenige bei der abschließenden Pfadsuche herangezogen, der zur minimalen Fehlergröße <i>&Gamma;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>) geführt hat. Die &bdquo;schlechten&rdquo; Zweige werden verworfen. Sie sind in obiger Grafik jeweils punktiert dargestellt.<br><br>
 
  
Die Beschreibung wird auf der nächsten Seite fortgesetzt.
+
The description of the Viterbi decoding process continues in the next section.
  
== Decodierbeispiel für den fehlerfreien Fall (2) ==
+
== Evaluating the trellis in the error-free case &nbsp;&ndash;&nbsp; Path search==
 
<br>
 
<br>
Nachdem alle Fehlergrößen <i>&Gamma;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>) &ndash; in dem vorliegenden Beispiel für 1 &#8804; <i>i</i> &#8804; 7 und 0 &#8804; <i>&mu;</i> &#8804; 3 &ndash; ermittelt wurden, kann der Viterbi&ndash;Decoder mit der Pfadsuche beginnen.<br>
+
After all error values&nbsp; ${\it \Gamma}_i(S_{\mu})$&nbsp; have been determined&nbsp; $($ in the present example for&nbsp; $1 &#8804; i &#8804; 7$&nbsp; and&nbsp; $0 &#8804; \mu &#8804; 3)$,&nbsp; the Viterbi decoder can start the path search:<br>
 +
 
 +
# &nbsp; The following graph shows the trellis after the error value calculation.&nbsp; All circles are assigned numerical values.
 +
# &nbsp; However,&nbsp; the most probable path already drawn in the graphic is not yet known.
 +
# &nbsp;  In the following,&nbsp; of course,&nbsp; no use is made of the&nbsp; "error-free case"&nbsp; information already contained in the heading.
 +
# &nbsp; Of the two branches arriving at a node,&nbsp; only the one that led to the minimum error value&nbsp; ${\it \Gamma}_i(S_{\mu})$&nbsp; is used for the final path search.
 +
# &nbsp; The&nbsp; "bad"&nbsp; branches are discarded.&nbsp; They are each shown dotted in the above graph.
  
[[File:P ID2654 KC T 3 4 S3b v1.png|Viterbi–Pfadsuche für <u><i>y</i></u> = (11, 01, 01, 11, 11, 10, 11)|class=fit]]<br>
+
[[File:P ID2654 KC T 3 4 S3b v1.png|right|frame|Viterbi path search for for the received vector&nbsp; $\underline{y} = (11, 01, 01, 11, 11, 10, 11)$|class=fit]]
  
Die Pfadsuche läuft wie folgt ab:
 
*Ausgehend vom Endwert <i>&Gamma;</i><sub>7</sub>(<i>S</i><sub>0</sub>) wird in Rückwärtsrichtung ein zusammenhängender Pfad bis zum Startwert  <i>&Gamma;</i><sub>0</sub>(<i>S</i><sub>0</sub>) gesucht. Erlaubt sind nur die durchgezogenen Zweige. Punktierte Linien können nicht Teil des ausgewählten Pfades sein.<br>
 
  
*Der ausgewählte Pfad ist in der Grafik grau markiert. Er durchläuft von rechts nach links die Zustände <i>S</i><sub>0</sub> &#8592; <i>S</i><sub>2</sub> &#8592; <i>S</i><sub>1</sub> &#8592; <i>S</i><sub>0</sub> &#8592; <i>S</i><sub>2</sub> &#8592; <i>S</i><sub>3</sub> &#8592; <i>S</i><sub>1</sub> &#8592; <i>S</i><sub>0</sub>. Es gibt keinen zweiten durchgehenden Pfad von  <i>&Gamma;</i><sub>7</sub>(<i>S</i><sub>0</sub>) zu  <i>&Gamma;</i><sub>0</sub>(<i>S</i><sub>0</sub>). Das bedeutet: Das Decodierergebnis ist eindeutig.<br>
+
The path search runs as follows:
 +
*Starting from the end value&nbsp; ${\it \Gamma}_7(S_0)$&nbsp; a continuous path is searched in backward direction to the start value&nbsp; ${\it \Gamma}_0(S_0)$.&nbsp; Only the solid branches are allowed.&nbsp; Dotted lines cannot be part of the selected&nbsp; $($best$)$&nbsp; path.<br>
  
*Das Ergebnis  <u><i>&upsilon;</i></u> = (1, 1, 0, 0, 1, 0, 0) des Viterbi&ndash;Decoders hinsichtlich der Informationssequenz erhält man, wenn man für den durchgehenden Pfad &ndash; nun aber in Vorwärtsrichtung von links nach rechts &ndash; die Farben der einzelnen Zweige auswertet (rot entspricht einer 0, blau einer 1).<br><br>
+
*The selected path&nbsp; $($grayed out in the graph$)$&nbsp; traverses from right to left in the sketch the states is&nbsp;
 +
::$$S_0 &#8592; S_2 &#8592; S_1 &#8592; S_0 &#8592; S_2 &#8592; S_3 &#8592; S_1 &#8592; S_0.$$
 +
:There is no second continuous path from&nbsp; ${\it \Gamma}_7(S_0)$&nbsp; to&nbsp; ${\it \Gamma}_0(S_0)$. This means: &nbsp; The decoding result is unique.<br>
  
Aus dem Endwert <i>&Gamma;</i><sub>7</sub>(<i>S</i><sub>0</sub>) = 0 erkennt man, dass  in diesem ersten Beispiel keine Übertragungsfehler vorlagen. Das Decodierergebnis <u><i>z</i></u> stimmt also mit dem Empfangsvektor <u><i>y</i></u> = (11, 01, 01, 11, 11, 10, 11) und der tatsächlichen Codesequenz <u><i>x</i></u> überein. Außerdem ist <u><i>&upsilon;</i></u> nicht nur die nach dem ML&ndash;Kriterium wahrscheinlichste Informationssequenz  <u><i>u</i></u>, sondern es gilt auch hier die Identität: <u><i>&upsilon;</i></u> =  <u><i>u</i></u>.<br>
+
*The result&nbsp; $\underline{v} = (1, 1, 0, 0, 1, 0, 0)$&nbsp; of the Viterbi decoder with respect to the information sequence is obtained if for the continuous path&nbsp; $($but now in forward direction from left to right$)$&nbsp; the colors of the individual branches are evaluated&nbsp; $($red &nbsp; &rArr; &nbsp; "$0$", &nbsp; blue &nbsp; &rArr; &nbsp; $1)$.<br><br>
  
<i>Anmerkung:</i> Bei der beschriebenen Decodierung wurde von der bereits in der Überschrift enthaltenen Information &bdquo;Fehlerfreier Fall&rdquo; natürlich kein Gebrauch gemacht wurde.<br>
+
From the final value &nbsp; ${\it \Gamma}_7(S_0) = 0$ &nbsp; it can be seen that there were no transmission errors in this first example:  
 +
*The decoding result&nbsp; $\underline{z}$&nbsp; thus matches the received vector &nbsp;$\underline{y} = (11, 01, 01, 11, 11, 10, 11)$&nbsp; and the actual encoded sequence&nbsp; $\underline{x}$.  
  
== Decodierbeispiele für den fehlerbehafteten Fall (1) ==
+
*With error-free transmission,&nbsp; $ \underline{v}$&nbsp; is not only the most probable info sequence&nbsp; $\underline{u}$&nbsp; according to the maximum likelihood  criterion,&nbsp; but both are even identical: &nbsp; $\underline{v} \equiv \underline{u}$.<br>
 +
 
 +
 
 +
 
 +
== Decoding examples for the erroneous case  ==
 
<br>
 
<br>
Es folgen zwei weitere Beispiele zur Viterbi&ndash;Decodierung. Die Berechnung der Fehlergrößen <i>&Gamma;</i><sub><i>i</i></sub>(<i>S<sub>&mu;</sub></i>) geschieht wie auf [http://en.lntwww.de/Kanalcodierung/Decodierung_von_Faltungscodes#Vorbemerkungen_zu_den_nachfolgenden_Decodierbeispielen_.281.29 Seite 2] beschrieben und auf der letzten Seite für den fehlerfreien Fall demonstriert.<br>
+
Now follow three examples of Viterbi decoding for the erroneous case. <br>
 +
 
 +
{{GraueBox|TEXT= 
 +
$\text{Example 1:}$&nbspWe assume here the received vector &nbsp;$\underline{y} = \big (11\hspace{0.05cm}, 11\hspace{0.05cm}, 10\hspace{0.05cm}, 00\hspace{0.05cm}, 01\hspace{0.05cm}, 01\hspace{0.05cm}, 11 \hspace{0.05cm} \hspace{0.05cm} \big ) $&nbsp; which does not represent a valid encoded sequence&nbsp; $\underline{x}$&nbsp;. The calculation of error values&nbsp; ${\it \Gamma}_i(S_{\mu})$&nbsp; and the path search is done as described in section&nbsp; [[Channel_Coding/Code_Description_with_State_and_Trellis_Diagram#Definition_of_the_free_distance| "Preliminaries"]]&nbsp; and demonstrated in the last two sections for the error-free case.<br>
 +
 
 +
[[File:P ID2655 KC T 3 4 S4a v1.png|right|frame|Decoding example with two bit errors at the beginning|class=fit]]
  
[[File:P ID2655 KC T 3 4 S4a v1.png|Decodierbeispiel mit zwei Bitfehlern|class=fit]]<br>
+
As summary of this first example,&nbsp; it should be noted:
 +
*Also with this trellis,&nbsp; a unique path&nbsp; $($with dark gray background$)$&nbsp; can be traced,&nbsp; leading to the following results&nbsp; $($recognizable by the labels or the colors of this path$)$:
  
Als Resümee dieses ersten Beispiels mit obigem Trellis ist festzuhalten:
+
::<math>\underline{z} = \big (00\hspace{0.05cm}, 11\hspace{0.05cm}, 10\hspace{0.05cm}, 00\hspace{0.05cm}, 01\hspace{0.05cm}, 01\hspace{0.05cm}, 11 \hspace{0.05cm} \big ) \hspace{0.05cm},</math>
*Auch hier lässt sich ein eindeutiger Pfad (dunkelgraue Markierung) zurückverfolgen, der zu den folgenden Ergebnissen führt (erkennbar an den Beschriftungen bzw. den Farben dieses Pfades):
+
::<math> \underline{\upsilon} =\big (0\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0 \hspace{0.05cm} \big ) \hspace{0.05cm}.</math>
  
::<math>\underline{z} \hspace{-0.15cm} =  \hspace{-0.15cm} \big (00\hspace{0.05cm}, 11\hspace{0.05cm}, 10\hspace{0.05cm}, 00\hspace{0.05cm}, 01\hspace{0.05cm}, 01\hspace{0.05cm}, 11 \hspace{0.05cm} \big ) \hspace{0.05cm},</math>
+
*Comparing the most likely transmitted encoded sequence &nbsp;$\underline{z}$&nbsp; with the received vector &nbsp;$\underline{y}$&nbsp; shows that there were two bit errors directly at the beginning.&nbsp; But since the used convolutional code has the [[Channel_Coding/Code_Description_with_State_and_Trellis_Diagram#Definition_of_the_free_distance| $\text{free distance}$]]&nbsp; $d_{\rm F} = 5$,&nbsp; two transmission errors do not yet lead to a wrong decoding result.<br>
::<math> \underline{\upsilon} \hspace{-0.15cm}  = \hspace{-0.15cm} \big (0\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0 \hspace{0.05cm} \big ) \hspace{0.05cm}.</math>
 
  
*Der Vergleich der am wahrscheinlichsten gesendeten Codesequenz <u><i>z</i></u> mit dem Empfangsvektor <u><i>y</i></u> zeigt, dass hier zwei Bitfehler (am Beginn) vorlagen. Da aber der verwendete Faltungscode die [http://en.lntwww.de/Kanalcodierung/Codebeschreibung_mit_Zustands%E2%80%93_und_Trellisdiagramm#Definition_der_freien_Distanz_.281.29 freie Distanz <i>d</i><sub>F</sub> = 5] aufweist, führen zwei Fehler noch nicht zu einem falschen Decodierergebnis.<br>
+
*There are other paths such as the lighter highlighted path
 +
:$$S_0 &#8594; S_1 &#8594; S_3 &#8594; S_3 &#8594; S_3 &#8594; S_2 &#8594; S_0 &#8594; S_0$$
 +
:that initially appear to be promising.&nbsp; Only in the last decoding step&nbsp; $(i = 7)$&nbsp; can this light gray path finally be discarded.<br>
  
*Es gibt andere Pfade wie zum Beispiel den heller markierten Pfad (<i>S</i><sub>0</sub> &#8594; <i>S</i><sub>1</sub> &#8594; <i>S</i><sub>3</sub> &#8594; <i>S</i><sub>3</sub> &#8594; <i>S</i><sub>3</sub> &#8594; <i>S</i><sub>2</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub>), die zunächst als vielversprechend erscheinen. Erst im letzten Decodierschritt (<i>i</i>&nbsp;=&nbsp;7) kann dieser hellgraue Pfad endgültig verworfen werden.<br>
+
<u>Further remarks:</u>
 +
# The example shows that a too early decision is often not purposeful.&nbsp;
 +
# One can also see the expediency of termination: &nbsp; With final decision at&nbsp; $i = 5$&nbsp; $($end of information sequence$)$,&nbsp; the sequences &nbsp;$(0, 1, 0, 1, 1)$&nbsp; and &nbsp;$(1, 1, 1, 1, 0)$&nbsp; would still have been considered equally likely.
 +
# In the calculation of&nbsp; ${\it \Gamma}_5(S_0) = 3$&nbsp; and&nbsp; ${\it \Gamma}_5(S_1) = 3$&nbsp; here in each case the two comparison branches lead to exactly the same minimum error value. In the graph these two special cases are marked by dash dots.&nbsp; In this example,&nbsp; this special case has no effect on the path search.
 +
# Nevertheless,&nbsp; the algorithm always expects a decision between two competing branches.&nbsp; In practice,&nbsp; one helps by randomly selecting one of the two paths if they are equal.}}<br>
  
*Dieses Beispiel zeigt, dass eine zu frühe Entscheidung oft zu einem Decodierfehler führt, und man erkennt auch die Zweckmäßigkeit der Terminierung: Bei endgültiger Entscheidung zum Zeitpunkt <i>i</i> = 5 (dem Ende der eigentlichen Informationssequenz) wären die Sequenzen (0, 1, 0, 1, 1) und (1, 1, 1, 1, 0) noch als gleichwahrscheinlich angesehen worden.<br><br>
+
{{GraueBox|TEXT= 
 +
$\text{Example 2:}$&nbsp; 
 +
In this example,&nbsp; we assume the following assumptions regarding source and encoder:
 +
[[File:P ID2700 KC T 3 4 S4b v1.png|right|frame|Decoding example with three bit errors|class=fit]]
 +
:$$\underline{u} = \big (1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0\hspace{0.05cm}, 1 \hspace{0.05cm}, 0\hspace{0.05cm}, 0  \big )$$
 +
:$$\Rightarrow \hspace{0.3cm}
 +
\underline{x} = \big (11\hspace{0.05cm}, 01\hspace{0.05cm}, 01\hspace{0.05cm}, 11\hspace{0.05cm}, 11\hspace{0.05cm}, 10\hspace{0.05cm}, 11 \hspace{0.05cm} \hspace{0.05cm} \big ) \hspace{0.05cm}.$$
  
<i>Anmerkung:</i> Bei der Berechnung von <i>&Gamma;</i><sub>5</sub>(<i>S</i><sub>0</sub>) = 3 und <i>&Gamma;</i><sub>5</sub>(<i>S</i><sub>1</sub>) = 3 führen hier jeweils die beiden Vergleichszweige zur gleichen minimalen Fehlergröße. In der Grafik sind diese beiden Sonderfälle durch Strichpunktierung markiert.<br>
+
From the graph you can see here that the decoder decides for the correct path&nbsp; $($dark background$)$&nbsp; despite three bit errors.
 +
*So there is not always a wrong decision,&nbsp; if more than&nbsp; $d_{\rm F}/2$&nbsp; bit errors occurred.
 +
 +
*But with statistical distribution of the three bit errors,&nbsp; wrong decision would be more frequent than right.}}<br>
  
In diesem Beispiel hat dieser Sonderfall keine Auswirkung auf die Pfadsuche. Der Algorithmus erwartet trotzdem stets eine Entscheidung zwischen zwei konkurrierenden Zweigen. In der Praxis hilft man sich dadurch, dass man bei Gleichheit einen der beiden Pfade zufällig auswählt.<br>
+
{{GraueBox|TEXT= 
 +
$\text{Example 3:}$&nbsp;  Here also applies&nbsp;
 +
[[File:P ID2704 KC T 3 4 S4c v1.png|right|frame|Decoding example with four bit errors|class=fit]]
 +
:$$\underline{u} = \big (1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0\hspace{0.05cm}, 1 \hspace{0.05cm}, 0\hspace{0.05cm}, 0 \big )$$
 +
:$$\Rightarrow \hspace{0.3cm}
 +
\underline{x} = \big (11\hspace{0.05cm}, 01\hspace{0.05cm}, 01\hspace{0.05cm}, 11\hspace{0.05cm}, 11\hspace{0.05cm}, 10\hspace{0.05cm}, 11 \hspace{0.05cm} \hspace{0.05cm} \big ) \hspace{0.05cm}.$$
  
== Decodierbeispiele für den fehlerbehafteten Fall (2) ==
+
Unlike the last example,&nbsp; a fourth bit error is added:&nbsp; $\underline{y}_7 = (01).$
<br>
 
Im letzten Beispiel gehen wir stets von folgenden Voraussetzungen bezüglich Quelle und Coder aus:
 
  
:<math>\underline{u} = \big (1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0\hspace{0.05cm}, 1 \hspace{0.05cm}, 0\hspace{0.05cm}, 0  \big )\hspace{0.3cm}\Rightarrow \hspace{0.3cm}
+
*Now both branches in step&nbsp; $i = 7$&nbsp; lead to the minimum error value&nbsp; ${\it \Gamma}_7(S_0) = 4$,&nbsp; recognizable by the dash-dotted transitions.  
\underline{x} = \big (11\hspace{0.05cm}, 01\hspace{0.05cm}, 01\hspace{0.05cm}, 11\hspace{0.05cm}, 11\hspace{0.05cm}, 10\hspace{0.05cm}, 11 \hspace{0.05cm} \hspace{0.05cm} \big ) \hspace{0.05cm}.</math>
 
  
Aus der ersten der beiden Grafiken erkennt man, dass sich hier der Decoder trotz dreier Bitfehler für den richtigen Pfad (dunkle Hinterlegung) entscheidet. Es gibt also nicht immer eine Fehlentscheidung, wenn mehr als <i>d</i><sub>F</sub>/2 Bitfehler aufgetreten sind.<br>
+
*If one decides in the then required lottery procedure for the path with dark background,&nbsp; the correct decision is still made even with four bit errors:  &nbsp; $\underline{v} = \big (1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0\hspace{0.05cm}, 1 \hspace{0.05cm}, 0\hspace{0.05cm}, 0  \big )$. <br>
  
[[File:P ID2700 KC T 3 4 S4b v1.png|Decodierbeispiel mit drei Bitfehlern|class=fit]]<br>
+
*Otherwise,&nbsp; a wrong decision is made. Depending on the outcome of the dice roll in step&nbsp; $i =6$&nbsp; between the two dash-dotted competitors,&nbsp; you choose either the purple or the light gray path.&nbsp; 
  
In der unteren Grafik ist noch ein vierter Bitfehler in Form von <i><u>y</u></i><sub>7</sub> = (01) hinzugefügt:
+
*Both have little in common with the correct path.}}
*Nun führen beide Zweige im Schritt  <i>i</i>&nbsp;=&nbsp;7 zur minimalen Fehlergröße <i>&Gamma;</i><sub>7</sub>(<i>S</i><sub>0</sub>) = 4, erkennbar an den strichpunktierten Übergängen. Entscheidet man sich im dann erforderlichen Losverfahren für den dunkel hinterlegten Pfad, so wird auch bei vier Bitfehlern  noch die richtige Entscheidung getroffen.<br>
 
  
*Andernfalls kommt es zu einer Fehlentscheidung. Je nachdem, wie das Auswürfeln im Schritt <i>i</i>&nbsp;=&nbsp;6 zwischen den beiden strichpunktierten Konkurrenten ausgeht, entscheidet man sich entweder für den violetten oder den hellgrauen Pfad.  Mit dem richtigen Pfad haben beide wenig gemein.<br>
 
  
:[[File:P ID2704 KC T 3 4 S4c v1.png|Decodierbeispiel mit vier Bitfehlern|class=fit]]<br>
 
  
== Zusammenhang zwischen Hamming–Distanz und Korrelation ==
+
== Relationship between Hamming distance and correlation ==
 
<br>
 
<br>
Insbesondere beim [http://en.lntwww.de/Kanalcodierung/Klassifizierung_von_Signalen#Binary_Symmetric_Channel_.E2.80.93_BSC BSC&ndash;Modell] &ndash; aber auch bei jedem anderen Binärkanal &nbsp;&#8658;&nbsp; Eingang <i>x<sub>i</sub></i> &#8712; {0, 1}, Ausgang <i>y<sub>i</sub></i> &#8712; {0, 1} wie zum Beispiel dem [http://en.lntwww.de/Digitalsignal%C3%BCbertragung/B%C3%BCndelfehlerkan%C3%A4le#Kanalmodell_nach_Gilbert.E2.80.93Elliott_.281.29 Gilbert&ndash;Elliott&ndash;Modell] &ndash; liefert die Hamming&ndash;Distanz <i>d</i><sub>H</sub>(<u><i>x</i></u>,&nbsp;<u><i>y</i></u>) genau die gleiche Information über die Ähnlichkeit der Eingangsfolge <u><i>x</i></u> und der Ausgangsfolge <u><i>y</i></u> wie das [http://en.lntwww.de/Digitalsignal%C3%BCbertragung/Signale,_Basisfunktionen_und_Vektorr%C3%A4ume#Zur_Nomenklatur_im_vierten_Kapitel_.282.29 innere Produkt.] Nimmt man an, dass die beiden Folgen in bipolarer Darstellung vorliegen (gekennzeichnet durch Tilden) und dass die Folgenlänge jeweils <i>L</i> ist, so gilt für das innere Produkt:
+
Especially for the&nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#Binary_Symmetric_Channel_.E2.80.93_BSC|$\text{BSC model}$]]&nbsp; $($but also for any other binary channel&nbsp; &#8658; &nbsp; input&nbsp; $x_i &#8712; \{0,1\}$,&nbsp; output $y_i &#8712; \{0,1\}$&nbsp; such as the&nbsp; [[Digital_Signal_Transmission/Burst_Error_Channels#Channel_model_according_to_Gilbert-Elliott|$\text{Gilbert&ndash;Elliott model}$]]$)$&nbsp; provides
 +
*the Hamming distance&nbsp; $d_{\rm H}(\underline{x}, \ \underline{y})$&nbsp; exactly the same information about the similarity of the input sequence&nbsp; $\underline{x}$&nbsp; and the output sequence&nbsp; $\underline{y}$&nbsp;
  
:<math><\hspace{-0.1cm}\underline{\tilde{x}}, \hspace{0.05cm}\underline{\tilde{y}} \hspace{-0.1cm}> \hspace{0.15cm}
+
*as the&nbsp; [[Digital_Signal_Transmission/Signals,_Basis_Functions_and_Vector_Spaces#Nomenclature_in_the_fourth_chapter| $\text{inner product}$]].&nbsp; Assuming that the sequences are in bipolar form&nbsp; $($denoted by tildes$)$&nbsp; and that the sequence length is&nbsp; $L$&nbsp; in each case,&nbsp; the inner product is:
= \sum_{i = 1}^{L} \tilde{x}_i \cdot \tilde{y}_i \hspace{0.3cm}{\rm mit } \hspace{0.2cm} \tilde{x}_i = 1 - 2 \cdot x_i  \hspace{0.05cm},\hspace{0.2cm} \tilde{y}_i = 1 - 2 \cdot y_i \hspace{0.05cm},\hspace{0.2cm} \tilde{x}_i, \hspace{0.05cm}\tilde{y}_i \in \hspace{0.1cm}\{ -1, +1\} \hspace{0.05cm}.</math>
 
  
Wir bezeichnen dieses innere Produkt manchmal auch als &bdquo;Korrelationswert&rdquo;. Die Anführungszeichen sollen darauf hinweisen, dass der Wertebereich eines [http://en.lntwww.de/Stochastische_Signaltheorie/Zweidimensionale_Zufallsgr%C3%B6%C3%9Fen#Korrelationskoeffizient Korrelationskoeffizienten] eigentlich &plusmn;1 ist.<br>
+
::<math><\hspace{-0.1cm}\underline{\tilde{x}}, \hspace{0.05cm}\underline{\tilde{y}} \hspace{-0.1cm}> \hspace{0.15cm}
 +
= \sum_{i = 1}^{L} \tilde{x}_i \cdot \tilde{y}_i \hspace{0.3cm}{\rm with } \hspace{0.2cm} \tilde{x}_i = 1 - 2 \cdot x_i  \hspace{0.05cm},\hspace{0.2cm} \tilde{y}_i = 1 - 2 \cdot y_i \hspace{0.05cm},\hspace{0.2cm} \tilde{x}_i, \hspace{0.05cm}\tilde{y}_i \in \hspace{0.1cm}\{ -1, +1\} \hspace{0.05cm}.</math>
  
{{Beispiel}}''':''' Wir betrachten hier zwei Binärfolgen der Länge <i>L</i> = 10:<br>
+
We sometimes refer to this inner product as the&nbsp; &raquo;'''correlation value'''&laquo;.&nbsp; Unlike the &nbsp; [[Theory_of_Stochastic_Signals/Two-Dimensional_Random_Variables#Correlation_coefficient| $\text{correlation coefficient}$]]&nbsp; the&nbsp; "correlation value"&nbsp; may well exceed the range of values&nbsp; $&plusmn;1$.
  
[[File:P ID2662 KC T 3 4 S5 v1.png|Zusammenhang zwischen Haming–Distanz und „Korrelation” |class=fit]]<br>
+
{{GraueBox|TEXT= 
 +
$\text{Example 4:}$&nbsp;  We consider here two binary sequences of length&nbsp; $L = 10$.&nbsp; Shown on the left are the&nbsp; &raquo;'''unipolar'''&laquo;&nbsp; sequences&nbsp; $\underline{x}$&nbsp; and&nbsp; $\underline{y}$&nbsp; and the product&nbsp; $\underline{x} \cdot \underline{y}$.
 +
[[File:KC_T_3_4_S5_neu.png|right|frame|Relationship between Haming distance and correlation value |class=fit]]
 +
 +
*You can see the Hamming distance&nbsp; $d_{\rm H}(\underline{x}, \ \underline{y}) = 6$ &nbsp; &#8658; &nbsp; six bit errors at the arrow positions.
 +
 +
*The inner product &nbsp; $ < \underline{x} \cdot \underline{y} > \hspace{0.15cm} = \hspace{0.15cm}0$ &nbsp; has no significance here.&nbsp; For example,&nbsp; $< \underline{0} \cdot \underline{y} >&nbsp;$ is always zero regardless of&nbsp; $\underline{y}$.
  
*Links dargestellt sind die unipolaren Folgen <u><i>x</i></u> und <u><i>y</i></u> sowie das Produkt <u><i>x</i></u> &middot; <u><i>y</i></u>. Man erkennt die Hamming&ndash;Distanz <i>d</i><sub>H</sub>(<u><i>x</i></u>, <u><i>y</i></u>) = 6 &nbsp;&#8658;&nbsp; sechs Bitfehler an den Pfeilpositionen. Das innere Produkt &#9001;<u><i>x</i></u> &middot; <u><i>y</i></u>&#9002; = 0 hat hier keine Aussagekraft. Zum Beispiel ist &#9001;<u>0</u> &middot; <u><i>y</i></u>&#9002; unabhängig von <u><i>y</i></u> stets Null.
 
  
*Die Hamming&ndash;Distanz <i>d</i><sub>H</sub> = 6 erkennt man auch aus der bipolaren (antipodalen) Darstellung der rechten Grafik. Die &bdquo;Korrelationswert&rdquo; hat aber nun den richtigen Wert 4 &middot; (+1) + 6 &middot; (&ndash;1) = &ndash;2. Es gilt der deterministische Zusammenhang zwischen den beiden Größen mit der Folgenlänge <i>L</i>:
+
The Hamming distance&nbsp; $d_{\rm H} = 6$&nbsp; can also be seen from the&nbsp; &raquo;'''bipolar'''&laquo;&nbsp; $($antipodal$)$ plot in the right graph.  
 +
*The&nbsp; "correlation value"&nbsp; has now the correct value:
 +
:$$4 \cdot (+1) + 6 \cdot (-1) = \, -2.$$
 +
*For the deterministic relationship between the&nbsp; "correlation value"&nbsp; and the&nbsp; "Hamming distance"&nbsp; holds with the sequence length&nbsp; $L$:
  
::<math><\hspace{-0.1cm}\underline{\tilde{x}}, \hspace{0.05cm}\underline{\tilde{y}} \hspace{-0.1cm}> \hspace{0.15cm}
+
:$$ < \underline{ \tilde{x} } \cdot \underline{\tilde{y} } > \hspace{0.15cm} = \hspace{0.15cm} L - 2 \cdot d_{\rm H} (\underline{\tilde{x} }, \hspace{0.05cm}\underline{\tilde{y} })\hspace{0.05cm}. $$}}
= L - 2 \cdot d_{\rm H} (\underline{\tilde{x}}, \hspace{0.05cm}\underline{\tilde{y}})\hspace{0.05cm}.</math>{{end}}<br>
+
<br clear=all>
  
Interpretieren wir nun diese Gleichung für einige Sonderfälle:
+
{{BlaueBox|TEXT= 
*Identische Folgen: Die Hamming&ndash;Distanz ist gleich 0 und der &bdquo;Korrelationswert&rdquo; gleich <i>L</i>.<br>
+
$\text{Conclusion:}$&nbsp;  Let us now interpret this last equation for some special cases:
 +
*&raquo;'''Identical sequences'''&laquo;: &nbsp; The Hamming distance is equal to&nbsp; $0$&nbsp; and the&nbsp; correlation value is equal to&nbsp; $L$.<br>
  
*Invertierte: Folgen: Die Hamming&ndash;Distanz ist gleich <i>L</i> und der &bdquo;Korrelationswert&rdquo; gleich &ndash;<i>L</i>.<br>
+
*&raquo;'''Inverted sequences'''&laquo;: &nbsp; The Hamming distance is equal to&nbsp; $L$&nbsp; and the&nbsp; correlation value&nbsp; is equal to&nbsp; $-L$.<br>
  
*Unkorrelierte Folgen: Die Hamming&ndash;Distanz ist gleich <i>L</i>/2, der &bdquo;Korrelationswert&rdquo; gleich 0.<br><br>
+
*&raquo;'''Uncorrelated sequences'''&laquo;: &nbsp; The Hamming distance is equal to&nbsp; $L/2$&nbsp; and the&nbsp; correlation value&nbsp; is equal to&nbsp; $0$.}}
  
== Viterbi–Algorithmus, basierend auf Korrelation und Metriken (1) ==
+
== Viterbi algorithm based on correlation and metrics ==
 
<br>
 
<br>
Mit den Erkenntnissen der letzten Seite lässt sich der Viterbi&ndash;Algorithmus auch wie folgt charakterisieren. Der Algorithmus sucht von allen möglichen Codesequenzen <u><i>x</i></u>&prime; &#8712; <i>C</i> die Sequenz <u><i>z</i></u> mit der maximalen Korrelation zur Empfangssequenz <u><i>y</i></u>:
+
Using the insights of the last section,&nbsp; the Viterbi algorithm can also be characterized as follows.
  
:<math>\underline{z} = {\rm arg} \max_{\underline{x}' \in \hspace{0.05cm} \mathcal{C}} \hspace{0.1cm} \left\langle \tilde{\underline{x}}'\hspace{0.05cm} ,\hspace{0.05cm}  \tilde{\underline{y}} \right\rangle
+
{{BlaueBox|TEXT= 
  \hspace{0.4cm}{\rm mit }\hspace{0.4cm}\tilde{\underline{x}}'= 1 - 2 \cdot \underline{x}'\hspace{0.05cm}, \hspace{0.2cm}
+
$\text{Alternative description:}$&nbsp; 
  \tilde{\underline{y}}= 1 - 2 \cdot \underline{y}
+
*The Viterbi algorithm searches from all possible encoded sequences&nbsp; $\underline{x}' &#8712; \mathcal{C}$&nbsp; the sequence&nbsp; $\underline{z}$&nbsp; with the&nbsp; &raquo;'''maximum correlation value'''&laquo;&nbsp; to the received sequence&nbsp; $\underline{y}$:
 +
 
 +
::<math>\underline{z} = {\rm arg} \max_{\underline{x}' \in \hspace{0.05cm} \mathcal{C} } \hspace{0.1cm} \left\langle \tilde{\underline{x} }'\hspace{0.05cm} ,\hspace{0.05cm}  \tilde{\underline{y} } \right\rangle
 +
  \hspace{0.4cm}{\rm with }\hspace{0.4cm}\tilde{\underline{x} }\hspace{0.05cm}'= 1 - 2 \cdot \underline{x}'\hspace{0.05cm}, \hspace{0.2cm}
 +
  \tilde{\underline{y} }= 1 - 2 \cdot \underline{y}
 
\hspace{0.05cm}.</math>
 
\hspace{0.05cm}.</math>
  
&#9001; ... &#9002; bezeichnet einen &bdquo;Korrelationswert&rdquo; entsprechend den Aussagen auf der letzten Seite. Die Tilden weisen wieder auf die bipolare (antipodale) Darstellung hin.<br>
+
*Here,&nbsp; $&#9001;\ \text{ ...} \  &#9002;$&nbsp; denotes a&nbsp; "correlation value"&nbsp; according to the statements in the last section.&nbsp; The tildes again indicate the bipolar representation.}}<br>
 +
 
 +
The graphic shows the corresponding trellis evaluation.&nbsp;
 +
*As for the&nbsp; [[Channel_Coding/Decoding_of_Convolutional_Codes#Decoding_examples_for_the_erroneous_case|$\text{Trellis evaluation according to Example 1}$]]&nbsp; based on the minimum Hamming distance and the error values&nbsp; ${\it \Gamma}_i(S_{\mu})$
 +
 
 +
*the input sequence and the encoded sequence are
 +
:$$\underline{u} = \big (0\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0 \hspace{0.05cm} \big ) \hspace{0.3cm} &rArr; \hspace{0.3cm} \underline{x} = \big (00, 11, 10, 00, 01, 01, 11 \big ) \hspace{0.05cm}.$$
 +
[[File:P ID2663 KC T 3 4 S6 v1.png|right|frame|Viterbi decoding based on correlation and metrics|class=fit]]
  
Die Grafik zeigt die diesbezügliche Trellisauswertung. Zugrunde liegt
+
Further are assumed:
*der gleiche Faltungscode mit <i>R</i> = 1/2, <i>m</i> = 2 und <b>G</b>(<i>D</i>) = (1 + <i>D</i> + <i>D</i><sup>2</sup>, 1 + <i>D</i><sup>2</sup>), und<br>
+
* Standard convolutional encoder: &nbsp; rate&nbsp; $R = 1/2$,&nbsp; memory&nbsp; $m = 2$;
  
*der gleiche Empfangsvektor <u><i>y</i></u> = (11, 11, 10, 00, 01, 01, 11)<br><br>
+
* the transfer function matrix: &nbsp; $\mathbf{G}(D) = (1 + D + D^2, 1 + D^2)$;
  
wie für das [http://en.lntwww.de/Kanalcodierung/Decodierung_von_Faltungscodes#Decodierbeispiel_f.C3.BCr_den_fehlerfreien_Fall_.281.29 Trellisdiagramm] auf Seite 3a dieses Kapitels, basierend auf der minimalen Hamming&ndash;Distanz und den Fehlergrößen <i>&Gamma;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>). Beide Darstellungen ähneln sich sehr.<br>
+
* length of the information sequence: &nbsp; $L = 5$;
  
[[File:P ID2663 KC T 3 4 S6 v1.png|Viterbi–Decodierung, basierend auf Korrelation und Metrik|class=fit]]<br>
+
* consideration of termination: &nbsp;$L' = 7$;
  
Ebenso wie die Suche nach der Sequenz mit der minimalen Hamming&ndash;Distanz geschieht auch die <i>Suche nach dem maximalen Korrelationswert</i> schrittweise:
+
* received vector &nbsp; $\underline{y} = (11, 11, 10, 00, 01, 01, 11)$ &nbsp; &rArr; &nbsp; two bit errors;  
*Die Knoten bezeichnet man hier als die Metriken <i>&Lambda;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>). Die englische Bezeichnung hierfür ist <i>Cumulative Metric</i>, während  <i>Branch Metric</i> den Metrikzuwachs angibt.<br>
 
  
*Der Endwert <i>&Lambda;</i><sub>7</sub>(<i>S</i><sub>0</sub>) = 10 gibt den &bdquo;Korrelationswert&rdquo; zwischen der ausgewählten Folge <u><i>z</i></u> und dem Empfangsvektor <u><i>y</i></u> an. Im fehlerfreien Fall ergäbe sich <i>&Lambda;</i><sub>7</sub>(<i>S</i><sub>0</sub>) = 14.<br><br>
+
* Viterbi decoding using trellis diagram:
 +
:*red arrow &rArr; &nbsp; hypothesis $u_i = 0$,
 +
:*blue arrow &rArr; &nbsp; hypothesis $u_i = 1$.
  
Die Trellisbeschreibung wird auf der nächsten Seite fortgesetzt.<br>
 
  
== Viterbi–Algorithmus, basierend auf Korrelation und Metriken (2) ==
+
Adjacent trellis and the&nbsp; [[Channel_Coding/Decoding_of_Convolutional_Codes#Decoding_examples_for_the_erroneous_case|$\text{Example 1 trellis}$ ]]&nbsp; are very similar.&nbsp; Just like the search for the sequence with the&nbsp; "minimum Hamming distance",&nbsp; the&nbsp; "search for the maximum correlation value"&nbsp; is also done step by step:
<br>
+
# &nbsp; The nodes here are called the&nbsp; "cumulative metrics"&nbsp; ${\it \Lambda}_i(S_{\mu})$.
Die nachfolgende Beschreibung bezieht sich auf die Trellisauswertung entsprechend der letzten Seite, basierend auf [http://en.lntwww.de/Kanalcodierung/Decodierung_von_Faltungscodes#Viterbi.E2.80.93Algorithmus.2C_basierend_auf_Korrelation_und_Metriken_.281.29 &bdquo;Korrelationswerten&rdquo; und den Metriken <i>&Lambda;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>).] Zum Vergleich zeigt die Grafik auf Seite 3a Trellisauswertung, die auf [http://en.lntwww.de/Kanalcodierung/Decodierung_von_Faltungscodes#Decodierbeispiele_f.C3.BCr_den_fehlerbehafteten_Fall_.281.29 Hamming&ndash;Distanzen und den Fehlergrößen <i>&Gamma;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>)] basieren.<br>
+
# &nbsp; The&nbsp; "branch metrics"&nbsp; specify the&nbsp;  "metric increments".<br>
 +
# &nbsp; The final value&nbsp; ${\it \Lambda}_7(S_0) = 10$&nbsp; indicates the&nbsp;  "end correlation value"&nbsp; between the selected sequence&nbsp; $\underline{z}$&nbsp; and the received vector&nbsp; $\underline{y}$.  
 +
# &nbsp; In the error-free case,&nbsp; the result would be&nbsp; ${\it \Lambda}_7(S_0) = 14$.<br><br>
 +
{{GraueBox|TEXT= 
 +
$\text{Example 5:}$&nbsp; The following detailed description of the trellis evaluation refer to the above trellis:
  
*Die Metriken zum Zeitpunkt <i>i</i> = 1 ergeben sich mit <u><i>y</i></u><sub>1</sub> = (11) zu
+
*The acumulated metrics at time&nbsp; $i = 1$&nbsp; result with&nbsp; $\underline{y}_1 = (11)$&nbsp; to
  
 
::<math>{\it \Lambda}_1(S_0) \hspace{0.15cm}  =  \hspace{0.15cm} <\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm}
 
::<math>{\it \Lambda}_1(S_0) \hspace{0.15cm}  =  \hspace{0.15cm} <\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm}
Line 226: Line 303:
 
= \hspace{0.1cm} +2  \hspace{0.05cm}.</math>
 
= \hspace{0.1cm} +2  \hspace{0.05cm}.</math>
  
*Entsprechend gilt zum Zeitpunkt <i>i</i> = 2 mit <u><i>y</i></u><sub>2</sub> = (11):
+
*Accordingly,&nbsp; at time&nbsp; $i = 2$&nbsp; with&nbsp; $\underline{y}_2 = (11)$:
  
::<math>{\it \Lambda}_2(S_0) \hspace{-0.15cm}  \hspace{-0.15cm} {\it \Lambda}_1(S_0) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm}
+
::<math>{\it \Lambda}_2(S_0) =  {\it \Lambda}_1(S_0) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm}
 
= \hspace{0.1cm} -2-2 = -4  \hspace{0.05cm},</math>
 
= \hspace{0.1cm} -2-2 = -4  \hspace{0.05cm},</math>
::<math>{\it \Lambda}_2(S_1) \hspace{-0.15cm}  = \hspace{-0.15cm} {\it \Lambda}_1(S_0) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(11)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm}
+
::<math>{\it \Lambda}_2(S_1) = {\it \Lambda}_1(S_0) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(11)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm}
 
= \hspace{0.1cm} -2+2 = 0  \hspace{0.05cm},</math>
 
= \hspace{0.1cm} -2+2 = 0  \hspace{0.05cm},</math>
::<math>{\it \Lambda}_2(S_2) \hspace{-0.15cm}  = \hspace{-0.15cm} {\it \Lambda}_1(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(10)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm}
+
::<math>{\it \Lambda}_2(S_2)= {\it \Lambda}_1(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(10)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm}
= \hspace{0.1cm} 2+0 = 2  \hspace{0.05cm},</math>
+
= \hspace{0.1cm} +2+0 = +2  \hspace{0.05cm},</math>
::<math>{\it \Lambda}_2(S_3) \hspace{-0.15cm}  = \hspace{-0.15cm} {\it \Lambda}_1(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(01)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm}
+
::<math>{\it \Lambda}_2(S_3)= {\it \Lambda}_1(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(01)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm}
= \hspace{0.1cm} 2+0 = 2  \hspace{0.05cm}.</math>
+
= \hspace{0.1cm} +2+0 = +2  \hspace{0.05cm}.</math>
  
*Ab dem Zeitpunkt <i>i</i> = 3 muss eine Entscheidung zwischen zwei Metriken getroffen werden. Beispielsweise erhält man mit <u><i>y</i></u><sub>3</sub> = (10) für die oberste und die unterste Metrik im Trellis:
+
*From time&nbsp; $i =3$&nbsp; a decision must be made between two acumulated metrics.&nbsp; For example,&nbsp; $\underline{y}_3 = (10)$&nbsp; is obtained for the top and bottom metrics in the trellis:
  
::<math>{\it \Lambda}_3(S_0) \hspace{-0.15cm}  = \hspace{-0.15cm}{\rm max} \left [{\it \Lambda}_{2}(S_0) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm} \hspace{0.05cm}, \hspace{0.2cm}{\it \Lambda}_{2}(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}> \right ] =</math>
+
::<math>{\it \Lambda}_3(S_0)={\rm max} \left [{\it \Lambda}_{2}(S_0) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm} \hspace{0.05cm}, \hspace{0.2cm}{\it \Lambda}_{2}(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}> \right ] = {\rm max} \left [ -4+0\hspace{0.05cm},\hspace{0.05cm} +2+0 \right ] = +2\hspace{0.05cm},</math>
::<math>\hspace{1.25cm}  =  \hspace{-0.15cm} {\rm max} \left [ -4+0\hspace{0.05cm},\hspace{0.05cm} 2+0 \right ] = 2\hspace{0.05cm},</math>
+
::<math>{\it \Lambda}_3(S_3) ={\rm max} \left [{\it \Lambda}_{2}(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(01)\hspace{0.05cm}, \hspace{0.05cm}(10) \hspace{-0.05cm}>\hspace{0.2cm} \hspace{0.05cm}, \hspace{0.2cm}{\it \Lambda}_{2}(S_3) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(10)\hspace{0.05cm}, \hspace{0.05cm}(10) \hspace{-0.05cm}> \right ] = {\rm max} \left [ 0+0\hspace{0.05cm},\hspace{0.05cm} +2+2 \right ] = +4\hspace{0.05cm}.</math>
::<math>{\it \Lambda}_3(S_3) \hspace{-0.15cm}  = \hspace{-0.15cm}{\rm max} \left [{\it \Lambda}_{2}(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(01)\hspace{0.05cm}, \hspace{0.05cm}(10) \hspace{-0.05cm}>\hspace{0.2cm} \hspace{0.05cm}, \hspace{0.2cm}{\it \Lambda}_{2}(S_3) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(10)\hspace{0.05cm}, \hspace{0.05cm}(10) \hspace{-0.05cm}> \right ] =</math>
 
::<math>\hspace{1.25cm}  =  \hspace{-0.15cm} {\rm max} \left [ 0+0\hspace{0.05cm},\hspace{0.05cm} 2+2 \right ] = 4\hspace{0.05cm}.</math>
 
  
Vergleicht man die zu minimierenden Fehlergrößen <i>&Gamma;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>) mit den zu maximierenden Metriken <i>&Lambda;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>), so erkennt man den folgenden deterministischen Zusammenhang:
+
*Comparing the&nbsp; &raquo;'''accumulated correlation values'''&laquo;&nbsp; ${\it \Lambda}_i(S_{\mu})$&nbsp;  to be maximized with the &nbsp; &raquo;'''accumulated error values'''&laquo;&nbsp; ${\it \Gamma}_i(S_{\mu})$&nbsp; to be minimized&nbsp; according to the&nbsp; [[Channel_Coding/Decoding_of_Convolutional_Codes#Decoding_examples_for_the_erroneous_case| $\text{$\text{Example 1}$}$]],&nbsp; one sees the following deterministic relationship:
  
:<math>{\it \Lambda}_i(S_{\mu}) = 2 \cdot  \big [ i -  {\it \Gamma}_i(S_{\mu}) \big ] \hspace{0.05cm}.</math>
+
::<math>{\it \Lambda}_i(S_{\mu}) = 2 \cdot  \big [ i -  {\it \Gamma}_i(S_{\mu}) \big ] \hspace{0.05cm}.</math>
  
Die Auswahl der zu den einzelnen Decodierschritten überlebenden Zweige ist bei beiden Verfahren identisch, und auch die Pfadsuche liefert das gleiche Ergebnis.<br>
+
*The selection of surviving  branches to each decoding step is identical for both methods,&nbsp;  and the path search also gives the same result.<br>}}
  
<b>Zusammenfassung:</b>
 
*Beim Binärkanal &ndash; zum Beispiel dem BSC&ndash;Modell &ndash; führen die beiden beschriebenen Viterbi&ndash;Varianten <i>Fehlergrößenminimierung</i> und <i>Metrikmaximierung</i> zum gleichen Ergebnis.<br>
 
  
*Beim AWGN&ndash;Kanal ist die Fehlergrößenminimierung nicht anwendbar, da keine Hamming&ndash;Distanz zwischen dem binären Eingang <u><i>x</i></u> und dem analogen Ausgang <u><i>y</i></u> angegeben werden kann.<br>
+
{{BlaueBox|TEXT= 
 +
$\text{Conclusions:}$&nbsp; 
 +
# &nbsp; &nbsp; In the binary channel &ndash; for example according to the BSC model &ndash; '''the two described Viterbi variants'''&nbsp; "error value minimization"&nbsp; and &nbsp; "correlation value maximization"&nbsp; '''lead to the same result'''.<br>
 +
# &nbsp; In the AWGN channel,&nbsp; on the other hand,&nbsp; "error value minimization"&nbsp; is not applicable because no Hamming distance can be specified between the binary input&nbsp; $\underline{x}$&nbsp; and the analog output&nbsp; $\underline{y}$.<br>
 +
# &nbsp; For the AWGN channel,&nbsp; the&nbsp; "correlation value maximization"&nbsp; is rather identical to the minimization of the&nbsp; [https://en.wikipedia.org/wiki/Euclidean_distance $\text{Euclidean distance}$]&nbsp; &ndash; see&nbsp; [[Aufgaben:Exercise_3.10Z:_Maximum_Likelihood_Decoding_of_Convolutional_Codes|"Exercise 3.10Z"]].<br>
 +
# &nbsp; Another advantage of the&nbsp; "correlation value maximization"&nbsp; is that a reliability information about the received values&nbsp; $\underline{y}$&nbsp; can be considered in a simple way.}}<br>
  
*Die Metrikmaximierung ist beim AWGN&ndash;Kanal vielmehr identisch mit der Minimierung der [http://en.lntwww.de/Kanalcodierung/Klassifizierung_von_Signalen#ML.E2.80.93Entscheidung_beim_AWGN.E2.80.93Kanal Euklidischen Distanz] &ndash; siehe Aufgabe Z3.10.<br>
+
== Viterbi decision for non-terminated convolutional codes==
 +
<br>
 +
So far,&nbsp; a terminated convolutional code of length&nbsp; $L\hspace{0.05cm}' = L + m$&nbsp; has always been considered,&nbsp; and the result of the Viterbi decoder was the continuous trellis path from the start time&nbsp; $(i = 0)$&nbsp; to the end&nbsp; $(i = L\hspace{0.05cm}')$.<br>
 +
*For non&ndash;terminated convolutional codes&nbsp; $(L\hspace{0.05cm}' &#8594; &#8734;)$&nbsp; this decision strategy is not applicable.
 +
 +
*Here,&nbsp; the algorithm must be modified to provide a best estimate&nbsp; $($according to maximum likelihood$)$&nbsp; of the incoming bits of the encoded sequence in finite time.<br>
  
*Ein weiterer Vorteil der Metrikmaximierung ist, dass eine Zuverlässigkeitsinformation über die Empfangswerte <u><i>y</i></u> in einfacher Weise berücksichtigt werden kann.<br>
+
[[File:P ID2676 KC T 3 4 S7 v1.png|right|frame|Exemplary trellis and surviving paths|class=fit]]
  
== Viterbi–Entscheidung bei nicht–terminierten Faltungscodes (1) ==
 
<br>
 
Bisher wurde stets ein terminierter Faltungscode der Länge <i>L</i>&prime; = <i>L</i> + <i>m</i> betrachtet, und das Ergebnis des Viterbi&ndash;Decoders war der durchgehende Trellispfad vom Startzeitpunkt (<i>i</i> = 0) bis zum Ende (<i>i</i> = <i>L</i>&prime;).<br>
 
  
Bei nicht&ndash;terminierten Faltungscodes (<i>L</i>&prime; &#8594; &#8734;) ist diese Entscheidungsstrategie nicht anwendbar. Hier muss der Algorithmus abgewandelt werden, um in endlicher Zeit eine bestmögliche Schätzung (gemäß Maximum&ndash;Likelihood) der einlaufenden Bits der Codesequenz liefern zu können.<br>
+
The graphic shows in the upper part an exemplary trellis for
 +
*"our standard encoder"&nbsp; $(R = 1/2, \ m = 2)$
 +
:$$ {\rm G}(D) = (1 + D + D^2, \ 1 + D^2),$$
  
[[File:P ID2676 KC T 3 4 S7 v1.png|Beispielhaftes Trellis und überlebende Pfade|class=fit]]<br>
+
*the zero input sequence &nbsp; &#8658; &nbsp; $\underline{u} = \underline{0} = (0, 0, 0, \ \text{...})$;&nbsp; output:  
 +
:$$\underline{x} = \underline{0} = (00, 00, 00, \ \text{...}),$$
  
Die obere Grafik zeigt ein beispielhaftes Trellis für
+
*in each case,&nbsp; transmission errors at&nbsp; $i = 4$&nbsp; and&nbsp; $i = 5$.
*unseren Standard&ndash;Codierer &nbsp;&nbsp;&#8658;&nbsp;&nbsp; <i>R</i> = 1/2, <i>m</i> = 2, G(<i>D</i>) = (1 + <i>D</i> + <i>D</i><sup>2</sup>, 1 + <i>D</i><sup>2</sup>),<br>
 
  
*die Nullfolge &#8658; <u><i>u</i></u> = <u>0</u> = (0, 0, 0, ...) &nbsp;&nbsp;&#8658;&nbsp;&nbsp; <i>x</i> = <u>0</u> = (00, 00, 00, ...),<br>
 
  
*jeweils einen Übertragungsfehler bei <i>i</i> = 4 und <i>i</i> = 5.<br><br>
+
&rArr; &nbsp; Based on the stroke types,&nbsp; one can recognize allowed&nbsp; $($solid$)$&nbsp; and forbidden&nbsp; $($dotted$)$&nbsp; arrows in red&nbsp; $(u_i = 0)$&nbsp; and blue&nbsp; $(u_i = 1)$.  
  
Anhand der Stricharten erkennt man erlaubte (durchgezogene) und verbotene (punktierte) Pfeile in rot (<i>u<sub>i</sub></i> = 0) und blau (<i>u<sub>i</sub></i> = 1). Punktierte Linien haben einen Vergleich gegen einen Konkurrenten verloren und können nicht Teil des ausgewählten Pfades sein.<br>
+
Dotted lines have lost a comparison against a competitor and cannot be part of the selected path.<br>
  
Die untere Grafik zeigt die 2<sup><i>m</i></sup> überlebenden Pfade <i>&Phi;<sub>i</sub></i> (<i>S<sub>&mu;</sub></i>) für den Zeitpunkt <i>i</i> = 9. Man findet diese Pfade am einfachsten von rechts nach links. Die folgende Angabe zeigt die durchlaufenen Zustände <i>S<sub>&mu;</sub></i> in Vorwärtsrichtung:<br><br>
+
&rArr; &nbsp; The lower part of the graph shows the&nbsp; $2^m = 4$&nbsp; surviving paths&nbsp; ${\it \Phi}_9(S_{\mu})$&nbsp; at time&nbsp; $i = 9$.  
 +
*It is easiest to find these paths from right to left&nbsp;  $($"backward"$)$.
 +
 +
*The following specification shows the  traversed states&nbsp; $S_{\mu}$&nbsp; but in the forward direction:<br>
 +
:$${\it \Phi}_9(S_0) \text{:} \hspace{0.4cm} S_0 &#8594; S_0 &#8594; S_0 &#8594; S_0 &#8594; S_0 &#8594; S_0 &#8594; S_0 &#8594; S_0 &#8594; S_0 &#8594; S_0,$$
 +
:$${\it \Phi}_9(S_1) \text{:} \hspace{0.4cm}  S_0 &#8594; S_0 &#8594; S_0 &#8594; S_0 &#8594; S_1 &#8594; S_2 &#8594; S_1 &#8594; S_3 &#8594; S_2 &#8594; S_1,$$
 +
:$${\it \Phi}_9(S_2) \text{:} \hspace{0.4cm}  S_0 &#8594; S_0 &#8594; S_0 &#8594; S_0 &#8594; S_1 &#8594; S_2 &#8594; S_1 &#8594; S_2 &#8594; S_1 &#8594; S_2,$$
 +
:$${\it \Phi}_9(S_3) \text{:} \hspace{0.4cm}  S_0 &#8594; S_0 &#8594; S_0 &#8594; S_0 &#8594; S_1 &#8594; S_2 &#8594; S_1 &#8594; S_2 &#8594; S_1 &#8594; S_3.$$
  
:<i>&Phi;</i><sub>9</sub>(<i>S</i><sub>0</sub>) :  <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub>,<br>
+
At earlier times&nbsp; $(i<9)$&nbsp; other survivors&nbsp; ${\it \Phi}_i(S_{\mu})$&nbsp; would result. Therefore, we define:
  
:<i>&Phi;</i><sub>9</sub>(<i>S</i><sub>1</sub>) : <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>1</sub> &#8594; <i>S</i><sub>2</sub> &#8594; <i>S</i><sub>1</sub> &#8594; <i>S</i><sub>3</sub> &#8594; <i>S</i><sub>2</sub> &#8594; <i>S</i><sub>1</sub>,<br>
+
{{BlaueBox|TEXT= 
 +
$\text{Definition:}$&nbspThe &nbsp; &raquo;<b>survivor</b>&laquo; &nbsp; ${\it \Phi}_i(S_{\mu})$&nbsp; is the continuous path from the start&nbsp; $S_0$&nbsp; $($at&nbsp; $i = 0)$&nbsp; to the node&nbsp; $S_{\mu}$ at time&nbsp; $i$.
  
:<i>&Phi;</i><sub>9</sub>(<i>S</i><sub>2</sub>) :  <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>1</sub> &#8594; <i>S</i><sub>2</sub> &#8594; <i>S</i><sub>1</sub> &#8594; <i>S</i><sub>2</sub> &#8594; <i>S</i><sub>1</sub> &#8594; <i>S</i><sub>2</sub>,<br>
+
*It is recommended to search for paths in the backward direction.}}<br>
  
:<i>&Phi;</i><sub>9</sub>(<i>S</i><sub>3</sub>) :  <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>0</sub> &#8594; <i>S</i><sub>1</sub> &#8594; <i>S</i><sub>2</sub> &#8594; <i>S</i><sub>1</sub> &#8594; <i>S</i><sub>2</sub> &#8594; <i>S</i><sub>1</sub> &#8594; <i>S</i><sub>3</sub>.<br><br>
+
The following graph shows the surviving paths for time points&nbsp; $i = 6$&nbsp; to&nbsp; $i = 9$.&nbsp; In addition,&nbsp; the respective metrics&nbsp; $($accumulated correlaltion values$)$&nbsp; ${\it \Lambda}_i(S_{\mu})$&nbsp; for all four states are given.  
  
Die Beschreibung wird auf der nächsten Seite fortgesetzt.<br>
+
[[File:EN_KC_T_3_4_S7b.png|right|frame|The survivors&nbsp; ${\it \Phi}_6, \ \text{...} \ , \ {\it \Phi}_9$|class=fit]]
  
== Viterbi–Entscheidung bei nicht–terminierten Faltungscodes (2) ==
+
This graph is to be interpreted as follows:
 +
#At time&nbsp; $i = 9$&nbsp; no final maximum  likelihood decision can yet be made about the first nine bits of the information sequence.
 +
#But it is already certain that the most probable bit sequence is represented by one of the paths&nbsp; ${\it \Phi}_9(S_0), \ \text{...} \ , \ {\it \Phi}_9(S_3)$.<br>
 +
#Since all four paths up to&nbsp; $i = 3$&nbsp; are identical,&nbsp; the decision&nbsp; "$v_1 = 0,\ v_2 = 0, \ v_3 = 0$"&nbsp; is the the most probable.&nbsp; Also at a later time no other decision would be made.&nbsp;
 +
#Regarding the bits&nbsp; $v_4, \ v_5, \ \text{...}$&nbsp; one should not decide at this early stage.&nbsp; Only the first two zeros are safe,&nbsp; not&nbsp; $v_3 = 0$.
 +
#If one had to make a constraint decision at time&nbsp; $i = 9$&nbsp; one would choose&nbsp; ${\it \Phi}_9(S_0)$ &nbsp; &#8658; &nbsp; $\underline{v} = (0, 0, \text{. ..} \ , 0)$ since the metric&nbsp; ${\it \Lambda}_9(S_0) = 14$&nbsp; is larger than the comparison metrics.<br>
 +
#The forced decision at time&nbsp; $i = 9$&nbsp; leads to the correct result in this example,&nbsp; see lower sketch.
 +
#A decision at time&nbsp; $i = 6$&nbsp;  would have would have led to the wrong result&nbsp; &#8658; &nbsp; $\underline{v} = (0, 0, 0, 1, 0, 1)$,&nbsp; see upper sketch.
 +
# At time&nbsp; $i = 7$&nbsp; resp.&nbsp; $i = 8$,&nbsp;  a forced decision would not have been clear,&nbsp; as shown in the two middle diagrams<br>
 +
<br clear=all>
 +
== Other decoding methods for convolutional codes ==
 
<br>
 
<br>
{{Definition}}''':''' Der <b>überlebende Pfad</b> (englisch: <i>Survivor</i>) <i>&Phi;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>) ist der durchgehende Pfad vom Start <i>S</i><sub>0</sub> (bei <i>i</i> = 0) zum Knoten <i>S<sub>&mu;</sub></i> zum Zeitpunkt <i>i</i>. Empfehlenswert ist die Pfadsuche in Rückwärtsrichtung.{{end}}<br>
+
We have so far dealt only with the Viterbi algorithm in the form presented in 1967 by Andrew J. Viterbi in&nbsp; [Vit67]<ref name'Vit67'>Viterbi, A.J.:&nbsp; Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm.&nbsp; In: IEEE Transactions on Information Theory, vol. IT-13, pp. 260-269, April 1967.</ref>.&nbsp; It was not until 1974 that&nbsp; [https://en.wikipedia.org/wiki/Dave_Forney $\text{George David Forney}$]&nbsp; proved that this algorithm performs maximum likelihood decoding of convolutional codes.<br>
  
[[File:P ID2677 KC T 3 4 S7b v1.png|Die überlebenden Pfade <i>Φ</i><sub>6</sub>, ... , <i>Φ</i><sub>9</sub> |class=fit]]<br>
+
But even in the years before,&nbsp; many scientists were very eager to provide efficient decoding methods for the convolutional codes first described by&nbsp; [https://en.wikipedia.org/wiki/Peter_Elias $\text{Peter Elias}$]&nbsp; in 1955.&nbsp; Among others,&nbsp; more detailed descriptions can be found for example in&nbsp; [Bos99]<ref name='Bos99'>Bossert, M.:&nbsp; Channel Coding for Telecommunications.&nbsp; Wiley & Sons, 1999.</ref>.  
 +
*[http://ieeexplore.ieee.org/document/1057663/ $\text{Sequential Decoding}$&nbsp;]&nbsp; by J. M. Wozencraft and B. Reiffen from 1961,<br>
  
Die Grafik zeigt die überlebenden Pfade für die Zeitpunkte <i>i</i> = 6 bis <i>i</i> = 9. Zusätzlich sind die jeweiligen Metriken <i>&Lambda;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>) für alle vier Zustände angegeben.Die Grafik ist wie folgt zu interpretieren:
+
*the proposal of&nbsp; [https://en.wikipedia.org/wiki/Robert_Fano $\text{Robert Mario Fano}$]&nbsp; (1963),&nbsp; which became known as the&nbsp; "Fano algorithm",<br>
*Zum Zeitpunkt <i>i</i> = 9 kann noch keine endgültige ML&ndash;Entscheidung über die ersten neun Bit der Informationssequenz getroffen werden. Allerdings ist bereits sicher, dass die wahrscheinlichste Bitfolge durch einen der Pfade <i>&Phi;</i><sub>9</sub>(<i>S</i><sub>0</sub>), ... , <i>&Phi;</i><sub>9</sub>(<i>S</i><sub>3</sub>) richtig wiedergegeben wird.<br>
 
  
*Da alle vier Pfade bei <i>i</i> = 3 zusammenlaufen, ist die Entscheidung &bdquo;<i>&upsilon;</i><sub>1</sub> = 0, <i>&upsilon;</i><sub>2</sub> = 0, <i>&upsilon;</i><sub>3</sub> = 0&rdquo; die bestmögliche (hellgraue Hinterlegung). Auch zu einem späteren Zeitpunkt würde keine andere Entscheidung getroffen werden. Hinsichtlich der Bits <i>&upsilon;</i><sub>4</sub>, <i>&upsilon;</i><sub>5</sub>, ... sollte man sich noch nicht festlegen.<br>
+
*the work of Kamil Zigangirov&nbsp; (1966)&nbsp; and&nbsp; [https://en.wikipedia.org/wiki/Frederick_Jelinek $\text{Frederick Jelinek}$]&nbsp; (1969),&nbsp; whose decoding method is also referred to as the&nbsp; "stack algorithm".<br><br>
 
 
*Müsste man zum Zeitpunkt <i>i</i> = 9 eine Zwangsentscheidung treffen, so würde man sich für <i>&Phi;</i><sub>9</sub>(<i>S</i><sub>0</sub>) &nbsp;&#8658;&nbsp; <u><i>&upsilon;</i></u> = (0, 0, ... , 0) entscheiden, da die Metrik <i>&Lambda;</i><sub>9</sub>(<i>S</i><sub>0</Sub>) = 14 größer ist als die Vergleichsmetriken.<br>
 
 
 
*Die Zwangsentscheidung zum Zeitpunkt <i>i</i> = 9 führt in diesem Beispiel zum richtigen Ergebnis. Zum Zeitpunkt <i>i</i> = 6 wäre ein solcher Zwangsentscheid  falsch gewesen &#8658; <u><i>&upsilon;</i></u> = (0, 0, 0, 1, 0, 1), und zu den Zeitpunten <i>i</i> = 7 bzw. <i>i</i> = 8 nicht eindeutig.<br>
 
 
 
== Weitere Decodierverfahren für Faltungscodes ==
 
<br>
 
Wir haben uns bisher in diesem Kapitel nur mit dem Viterbi&ndash;Algorithmus beschäftigt, der 1967 von A. J. Viterbi veröffentlicht wurde. Erst 1974 hat G. D. Forney nachgewiesen, dass dieser Algorithmus eine Maximum&ndash;Likelihood&ndash;Decodierung von Faltungscodes durchführt.<br>
 
  
Aber schon in den Jahren zuvor waren viele Wissenschaftler sehr bemüht, effiziente Decodierverfahren für die 1955 erstmals von Peter Elias beschriebenen Faltungscodes bereitzustellen. Zu nennen sind hier unter Anderem &ndash; genauere Beschreibungen findet man beispielsweise in [Bos99]<ref>Bossert, M.: ''Channel Coding for Telecommunications.'' Wiley & Sons, 1999.</ref>.  
+
All of these decoding schemes,&nbsp; as well as the Viterbi algorithm as described so far,&nbsp; provide&nbsp; "hard decided"&nbsp; output values &nbsp; &#8658; &nbsp; $v_i &#8712; \{0, 1\}$.&nbsp; Often,&nbsp; however,&nbsp; information about the reliability of the decisions made would be desirable,&nbsp; especially when there is a concatenated coding scheme with an outer and an inner code.<br>
*<i>Sequential Decoding</i> von J. M. Wozencraft und B. Reifen aus dem Jahre 1961,<br>
 
  
*der Vorschlag von R. M. Fano (1963), der als <i>Fano&ndash;Algorithmus</i> bekannt wurde,<br>
+
If the reliability of the bits decided by the inner decoder is known at least roughly,&nbsp; this information can be used to&nbsp; (significantly)&nbsp; reduce the bit error probability of the outer decoder.&nbsp; The&nbsp; "soft output Viterbi algorithm"&nbsp; $\rm(SOVA)$&nbsp; proposed by&nbsp; [[Biographies_and_Bibliographies/Lehrstuhlinhaber_des_LNT#Prof._Dr.-Ing._Dr.-Ing._E.h._Joachim_Hagenauer_.281993-2006.29|$\text{Joachim Hagenauer}$]]&nbsp; in&nbsp; [Hag90]<ref name='Hag90'>Hagenauer, J.:&nbsp; Soft Output Viterbi Decoder.&nbsp; In: Technischer Report, Deutsche Forschungsanstalt für Luft- und Raumfahrt (DLR), 1990.</ref>&nbsp; allows to specify a reliability measure in each case in addition to the decided symbols.
  
*die Arbeiten von K. Zigangirov (1966) und F. Jelinek (1969), deren Decodierverfahren häufig als <i>Stack&ndash;Algorithmus</i> bezeichnet wird.<br><br>
+
Finally, we go into some detail about the&nbsp; &raquo;<b>BCJR algorithm</b>&laquo;&nbsp; named after its inventors L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv&nbsp; [BCJR74]<ref name='BCJR74'>Bahl, L.R.; Cocke, J.; Jelinek, F.; Raviv, J.:&nbsp; Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate.&nbsp; In: IEEE Transactions on Information Theory, Vol. IT-20, pp. 284-287, 1974.</ref>.  
 +
:While the Viterbi algorithm only estimates the total sequence &nbsp; &#8658; &nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#Definitions_of_the_different_optimal_receivers|$\text{block-wise ML}$]],&nbsp; the BCJR algorithm estimates a single bit considering the entire received sequence.&nbsp; So this is a&nbsp; "bit-wise maximum a-posteriori decoding" &nbsp; &#8658; &nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#Definitions_of_the_different_optimal_receivers|$\text{bit-wise MAP}$]].<br>
  
Alle diese Decodierverfahren und auch der Viterbi&ndash;Algorithmus in seiner bisher beschriebenen Form liefern &bdquo;hart&rdquo; entschiedene Ausgangswerte &#8658; <i>&upsilon;<sub>i</sub></i> &#8712; {0, 1}. Oftmals wären jedoch Informationen über die Zuverlässigkeit der getroffenen Entscheidungen wünschenswert, insbesondere dann, wenn ein verkettetes Codierschema mit einem äußeren und einem inneren Code vorliegt.<br>
 
  
Kennt man die Zuverlässigkeit der vom inneren Decoder entschiedenen Bits zumindest grob, so kann durch diese Information die Bitfehlerwahrscheinlichkeit des äußeren Decoders (signifikant) herabgesetzt werden. Der von J. Hagenauer in [Hag90]<ref>Hagenauer, J.: ''Soft Output Viterbi Decoder.'' In: Technischer Report, Deutsche Forschungsanstalt für Luft- und Raumfahrt (DLR), 1990.</ref> vorgeschlagene <i>Soft&ndash;Output&ndash;Viterbi&ndash;Algorithmus</i> (SOVA) erlaubt es, zusätzlich zu den entschiedenen Symbolen auch jeweils ein Zuverlässigkeitsmaß anzugeben.<br>
+
{{BlaueBox|TEXT= 
 +
$\text{Conclusion:}$&nbsp;  
  
Abschließend gehen wir noch etwas genauer auf den <i>BCJR&ndash;Algorithmus</i> ein, benannt nach dessen Erfinder L. R. Bahl, J. Cocke, F. Jelinek  und J. Raviv [BCJR74]<ref>Bahl, L.R.; Cocke, J.; Jelinek, F.; Raviv, J.: ''Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate.'' In: IEEE Transactions on Information Theory, Vol. IT-20, S. 284-287, 1974.</ref>. Während der Viterbi&ndash;Algorithmus nur eine Schätzung der Gesamtsequenz vornimmt &nbsp;&#8658;&nbsp; block&ndash;wise ML, schätzt der BCJR&ndash;Algorithmus ein einzelnes Symbol (Bit) unter Berücksichtigung der gesamten empfangenen Codesequenz. Es handelt sich hierbei also um eine <i>symbolweise Maximum&ndash;Aposteriori&ndash;Decodierung</i> &nbsp;&#8658;&nbsp; bit&ndash;wise MAP.<br>
+
The difference between Viterbi&ndash;algorithm and BCJR algorithm shall be&nbsp; &ndash; greatly simplified &ndash;&nbsp; illustrated by the example of a terminated convolutional code:
 +
*The &nbsp; &raquo;<b>Viterbi algorithm</b> &laquo;&nbsp; processes the trellis in only one direction&nbsp; &ndash; the forward direction&nbsp; &ndash; and computes the metrics&nbsp; ${\it \Lambda}_i(S_{\mu})$&nbsp; for each node.&nbsp; After reaching the terminal node,&nbsp; the surviving path is searched,&nbsp; which identifies the most probable encoded sequence.<br>
  
Der Unterschied zwischen Viterbi&ndash;Algorithmus und BCJR&ndash;Algorithmus soll &ndash; stark vereinfacht &ndash; am Beispiel eines terminierten Faltungscodes dargestellt werden:
+
*In the &nbsp; &raquo;<b>BCJR algorithm </b>&laquo;&nbsp; the trellis is processed twice,&nbsp; once in forward direction and then in backward direction.&nbsp; Two metrics can then be specified for each node,&nbsp; from which the a-posterori probability can be determined for each bit}}.<br>
*Der <b>Viterbi&ndash;Algorithmus</b> arbeitet das Trellis nur in einer Richtung &ndash; der  <i>Vorwärtsrichtung</i> &ndash; ab und berechnet für jeden Knoten die Metriken <i>&Lambda;<sub>i</sub></i>(<i>S<sub>&mu;</sub></i>). Nach Erreichen des Endknotens wird der  überlebende Pfad gesucht, der die wahrscheinlichste Codesequenz kennzeichnet.<br>
 
  
*Beim <b>BCJR&ndash;Algorithmus</b> wird das Trellis zweimal abgearbeitet, einmal in Vorwärtsrichtung und anschließend in <i>Rückwärtsrichtung</i>. Für jeden Knoten sind dann zwei Metriken angebbar,  aus denen für jedes Bit die Aposterori&ndash;Wahrscheinlichkeit bestimmt werden kann.<br><br>
+
<u>Notes:</u>
 +
# &nbsp; This short summary is based on the textbook&nbsp;   [Bos99]<ref name='Bos99'>Bossert, M.:&nbsp; Channel Coding for Telecommunications.&nbsp; Wiley & Sons, 1999.</ref>.  
 +
# &nbsp; A description of the BCJR&ndash;algorithm follows also in section&nbsp; [[Channel_Coding/Soft–in_Soft–out_Decoder#Hard_Decision_vs._Soft_Decision| "Hard Decision vs. Soft Decision"]]&nbsp; [https://en.lntwww.de/Channel_Coding "in the fourth main chapter"]&nbsp; "Iterative Decoding Methods".<br>
  
<b>Hinweis:</b> Diese Kurzzusammenfassung basiert auf dem Lehrbuch [Bos98]<ref>Bossert, M.: ''Kanalcodierung.'' Stuttgart: B. G. Teubner, 1998.</ref>. Eine etwas ausführlichere Beschreibung des BCJR&ndash;Algorithmus' folgt im [http://en.lntwww.de/Kanalcodierung/Soft%E2%80%93in_Soft%E2%80%93out_Decoder#Hard_Decision_vs._Soft_Decision_.281.29 Kapitel 4.1.]<br>
 
  
== Aufgaben ==
+
== Exercises for the chapter ==
 
<br>
 
<br>
[[Aufgaben:3.9 Viterbi–Algorithmus: Grundlegendes|A3.9 Viterbi–Algorithmus: Grundlegendes]]
+
[[Aufgaben:Exercise_3.09:_Basics_of_the_Viterbi_Algorithm|Exercise 3.09: Basics of the Viterbi Algorithm]]
  
[[Zusatzaufgaben:3.9 Nochmals Viterbi–Algorithmus]]
+
[[Aufgaben:Exercise_3.09Z:_Viterbi_Algorithm_again|Exercise 3.09Z: Viterbi Algorithm again]]
  
[[Aufgaben:3.10 Fehlergrößenberechnung|A3.10 Fehlergrößenberechnung]]
+
[[Aufgaben:Exercise_3.10:_Metric_Calculation|Exercise 3.10: Metric Calculation]]
  
[[Zusatzaufgaben:3.10 ML–Decodierung von Faltungscodes]]
+
[[Aufgaben:Exercise_3.10Z:_Maximum_Likelihood_Decoding_of_Convolutional_Codes|Exercise 3.10Z: Maximum Likelihood Decoding of Convolutional Codes]]
  
[[Aufgaben:3.11 Viterbi–Pfadsuche|A3.11 Viterbi–Pfadsuche]]
+
[[Aufgaben:Exercise_3.11:_Viterbi_Path_Finding|Exercise 3.11: Viterbi Path Finding]]
  
==Quellenverzeichnis==
+
==References==
 
<references/>
 
<references/>
  
 
{{Display}}
 
{{Display}}

Latest revision as of 16:53, 23 January 2023

Block diagram and requirements


A significant advantage of convolutional coding is that there is a very efficient decoding method for this in the form of the  "Viterbi algorithm".  This algorithm,  developed by  $\text{Andrew James Viterbi}$  has already been described in the chapter  "Viterbi receiver"  of the book "Digital Signal Transmission" with regard to its use for equalization.

For its use as a convolutional decoder we assume the block diagram on the right and the following prerequisites:

System model for the decoding of convolutional codes
  • The information sequence  $\underline{u} = (u_1, \ u_2, \ \text{... } \ )$  is here in contrast to the description of linear block codes   ⇒   "first main chapter"  or of Reed–Solomon codes   ⇒   "second main chapter"  generally infinitely long  ("semi–infinite").  For the information symbols always applies  $u_i ∈ \{0, 1\}$.
  • The encoded sequence  $\underline{x} = (x_1, \ x_2, \ \text{... })$  with  $x_i ∈ \{0, 1\}$  depends not only on   $\underline{u}$   but also on the code rate  $R = 1/n$, the memory  $m$  and the transfer function matrix  $\mathbf{G}(D)$  . For finite number  $L$  of information bits,  the convolutional code should be terminated by appending  $m$  zeros:
\[\underline{u}= (u_1,\hspace{0.05cm} u_2,\hspace{0.05cm} \text{...} \hspace{0.1cm}, u_L, \hspace{0.05cm} 0 \hspace{0.05cm},\hspace{0.05cm} \text{...} \hspace{0.1cm}, 0 ) \hspace{0.3cm}\Rightarrow \hspace{0.3cm} \underline{x}= (x_1,\hspace{0.05cm} x_2,\hspace{0.05cm} \text{...} \hspace{0.1cm}, x_{2L}, \hspace{0.05cm} x_{2L+1} ,\hspace{0.05cm} \text{...} \hspace{0.1cm}, \hspace{0.05cm} x_{2L+2m} ) \hspace{0.05cm}.\]
  • The received sequence  $\underline{y} = (y_1, \ y_2, \ \text{...} )$  results according to the assumed channel model. For a digital model like the  $\text{Binary Symmetric Channel}$  $\rm (BSC)$  holds   $y_i ∈ \{0, 1\}$,  so the falsification from  $\underline{x}$  to  $\underline{y}$   can be quantified with the  $\text{Hamming distance}$  $d_{\rm H}(\underline{x}, \underline{y})$.
  • The Viterbi algorithm provides an estimate  $\underline{z}$  for the encoded sequence  $\underline{x}$  and another estimate  $\underline{v}$  for the information sequence  $\underline{u}$.  Thereby holds:
\[{\rm Pr}(\underline{z} \ne \underline{x})\stackrel{!}{=}{\rm Minimum} \hspace{0.25cm}\Rightarrow \hspace{0.25cm} {\rm Pr}(\underline{\upsilon} \ne \underline{u})\stackrel{!}{=}{\rm Minimum} \hspace{0.05cm}.\]

$\text{Conclusion:}$  Given a digital channel model   $($for example,   the BSC model$)$,   the Viterbi algorithm searches from all possible encoded sequences  $\underline{x}\hspace{0.05cm}'$  the sequence  $\underline{z}$  with the minimum Hamming distance   $d_{\rm H}(\underline{x}\hspace{0.05cm}', \underline{y})$   to the received sequence  $\underline{y}$:

\[\underline{z} = {\rm arg} \min_{\underline{x}\hspace{0.05cm}' \in \hspace{0.05cm} \mathcal{C} } \hspace{0.1cm} d_{\rm H}( \underline{x}\hspace{0.05cm}'\hspace{0.02cm},\hspace{0.02cm} \underline{y} ) = {\rm arg} \max_{\underline{x}' \in \hspace{0.05cm} \mathcal{C} } \hspace{0.1cm} {\rm Pr}( \underline{y} \hspace{0.05cm} \vert \hspace{0.05cm} \underline{x}')\hspace{0.05cm}.\]


Preliminary remarks on the following decoding examples


Trellis for decoding the received sequence  $\underline{y}$

The following  »prerequisites«   apply to all examples in this chapter:

  • Standard convolutional encoder:   Rate $R = 1/2$,  memory  $m = 2$;
  • transfer function matrix:   $\mathbf{G}(D) = (1 + D + D^2, 1 + D^2)$;
  • length of information sequence:   $L = 5$;
  • consideration of termination:   $L\hspace{0.05cm}' = 7$;
  • length of sequences  $\underline{x}$  and  $\underline{y}$ :   $14$  bits each;
  • allocation according to  $\underline{y} = (\underline{y}_1, \ \underline{y}_2, \ \text{...} \ , \ \underline{y}_7)$
    ⇒   bit pairs  $\underline{y}_i ∈ \{00, 01, 10, 11\}$;
  • Viterbi decoding using trellis diagram:
red arrow   ⇒   hypothesis  $u_i = 0$,
blue arrow   ⇒   hypothesis  $u_i = 1$;
  • hypothetical encoded sequence  $\underline{x}_i\hspace{0.01cm}' ∈ \{00, 01, 10, 11\}$;
  • all hypothetical quantities with apostrophe.


We always assume that the Viterbi decoding is based at the  $\text{Hamming distance}$  $d_{\rm H}(\underline{x}_i\hspace{0.01cm}', \ \underline{y}_i)$  between the received word  $\underline{y}_i$  and the four possible code words  $x_i\hspace{0.01cm}' ∈ \{00, 01, 10, 11\}$.  We then proceed as follows:

  • In the still empty circles the error value  ${\it \Gamma}_i(S_{\mu})$  of states  $S_{\mu} (0 ≤ \mu ≤ 3)$  at time points  $i$  are entered.  The initial value is always  ${\it \Gamma}_0(S_0) = 0$.
  • The error values for  $i = 1$  and  $i = 2$  are given by
\[{\it \Gamma}_1(S_0) =d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_1 \big ) \hspace{0.05cm}, \hspace{2.38cm}{\it \Gamma}_1(S_1) = d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_1 \big ) \hspace{0.05cm},\]
\[{\it \Gamma}_2(S_0) ={\it \Gamma}_1(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big )\hspace{0.05cm}, \hspace{0.6cm}{\it \Gamma}_2(S_1) = {\it \Gamma}_1(S_0)+ d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big ) \hspace{0.05cm},\hspace{0.6cm}{\it \Gamma}_2(S_2) ={\it \Gamma}_1(S_1) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big )\hspace{0.05cm}, \hspace{0.6cm}{\it \Gamma}_2(S_3) = {\it \Gamma}_1(S_1)+ d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_2 \big ) \hspace{0.05cm}.\]
  • From  $i = 3$  the trellis has reached its basic form, and to compute all  ${\it \Gamma}_i(S_{\mu})$  the minimum between two sums must be determined in each case:
\[{\it \Gamma}_i(S_0) ={\rm Min} \left [{\it \Gamma}_{i-1}(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_2) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm},\]
\[{\it \Gamma}_i(S_1)={\rm Min} \left [{\it \Gamma}_{i-1}(S_0) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_2) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm},\]
\[{\it \Gamma}_i(S_2) ={\rm Min} \left [{\it \Gamma}_{i-1}(S_1) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_3) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm},\]
\[{\it \Gamma}_i(S_3) ={\rm Min} \left [{\it \Gamma}_{i-1}(S_1) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{i-1}(S_3) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} \underline{y}_i \big ) \right ] \hspace{0.05cm}.\]
  • Of the two branches arriving at a node  ${\it \Gamma}_i(S_{\mu})$  the worse one  $($which would have led to a larger  ${\it \Gamma}_i(S_{\mu})$  is eliminated.  Only one branch then leads to each node.
  • Once all error values up to and including  $i = 7$  have been determined,  the Viterbi algotithm can be completed by searching the  "connected path"  from the end of the trellis   ⇒   ${\it \Gamma}_7(S_0)$  to the beginning   ⇒   ${\it \Gamma}_0(S_0)$ .
  • Through this path,  the most likely encoded sequence  $\underline{z}$  and the most likely information sequence  $\underline{v}$  are then fixed.
  • Not all received sequences are transmitted error-free  $(\underline{y} =\underline{x})$,  however often holds with Viterbis decoding:   $\underline{z} = \underline{x}$  and  $\underline{v} = \underline{u}$.
  • But if there are too many transmission errors,  the Viterbi algorithm also fails.

Creating the trellis in the error-free case  –  Acumulated error value calculation


First,  we assume the received sequence  $\underline{y} = (11, 01, 01, 11, 11, 10, 11)$  which is here already subdivided into bit pairs: 

$$\underline{y}_1, \hspace{0.05cm} \text{...} \hspace{0.05cm} , \ \underline{y}_7.$$

The numerical values entered in the trellis and the different types of strokes are explained in the following text.

Viterbi scheme for the received vector  $\underline{y} = (11, 01, 01, 11, 11, 10, 11)$
  • Starting from the initial value  ${\it \Gamma}_0(S_0) = 0$  we get  $\underline{y}_1 = (11)$  by adding the Hamming distances
$$d_{\rm H}((00), \ \underline{y}_1) = 2\hspace{0.6cm} \text{or} \hspace{0.6cm}d_{\rm H}((11), \ \underline{y}_1) = 0$$
to the  "$($acumulated$)$ error values"   ${\it \Gamma}_1(S_0) = 2, \ {\it \Gamma}_1(S_1) = 0$.
  • In the second decoding step there are error values for all four states:   With  $\underline{y}_2 = (01)$  one obtains:
\[{\it \Gamma}_2(S_0) = {\it \Gamma}_1(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} (01) \big ) = 2+1 = 3 \hspace{0.05cm},\]
\[{\it \Gamma}_2(S_1) ={\it \Gamma}_1(S_0) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} (01) \big ) = 2+1 = 3 \hspace{0.05cm},\]
\[{\it \Gamma}_2(S_2) ={\it \Gamma}_1(S_1) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} (01) \big ) = 0+2=2 \hspace{0.05cm},\]
\[{\it \Gamma}_2(S_3) = {\it \Gamma}_1(S_1) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} (01) \big ) = 0+0=0 \hspace{0.05cm}.\]
  • In all further decoding steps,  two values must be compared in each case,  whereby the node  ${\it \Gamma}_i(S_{\mu})$  is always assigned the smaller value.
  • For example,  for  $i = 3$  with  $\underline{y}_3 = (01)$:
\[{\it \Gamma}_3(S_0) ={\rm min} \left [{\it \Gamma}_{2}(S_0) + d_{\rm H} \big ((00)\hspace{0.05cm},\hspace{0.05cm} (01) \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{2}(S_2) + d_{\rm H} \big ((11)\hspace{0.05cm},\hspace{0.05cm} (01) \big ) \right ] ={\rm min} \big [ 3+1\hspace{0.05cm},\hspace{0.05cm} 2+1 \big ] = 3\hspace{0.05cm},\]
\[{\it \Gamma}_3(S_3) ={\rm min} \left [{\it \Gamma}_{2}(S_1) + d_{\rm H} \big ((01)\hspace{0.05cm},\hspace{0.05cm} (01) \big )\hspace{0.05cm}, \hspace{0.2cm}{\it \Gamma}_{2}(S_3) + d_{\rm H} \big ((10)\hspace{0.05cm},\hspace{0.05cm} (01) \big ) \right ] ={\rm min} \big [ 3+0\hspace{0.05cm},\hspace{0.05cm} 0+2 \big ] = 2\hspace{0.05cm}.\]
  • In the considered example,  from  $i = 6$  the termination of the convolutional code becomes effective.  Here,  only two comparisons are left to determine  ${\it \Gamma}_6(S_0) = 3$  and  ${\it \Gamma}_6(S_2)= 0$  and for  $i = 7$  only one comparison with the final error value  ${\it \Gamma}_7(S_0) = 0$.


The description of the Viterbi decoding process continues in the next section.

Evaluating the trellis in the error-free case  –  Path search


After all error values  ${\it \Gamma}_i(S_{\mu})$  have been determined  $($ in the present example for  $1 ≤ i ≤ 7$  and  $0 ≤ \mu ≤ 3)$,  the Viterbi decoder can start the path search:

  1.   The following graph shows the trellis after the error value calculation.  All circles are assigned numerical values.
  2.   However,  the most probable path already drawn in the graphic is not yet known.
  3.   In the following,  of course,  no use is made of the  "error-free case"  information already contained in the heading.
  4.   Of the two branches arriving at a node,  only the one that led to the minimum error value  ${\it \Gamma}_i(S_{\mu})$  is used for the final path search.
  5.   The  "bad"  branches are discarded.  They are each shown dotted in the above graph.
Viterbi path search for for the received vector  $\underline{y} = (11, 01, 01, 11, 11, 10, 11)$


The path search runs as follows:

  • Starting from the end value  ${\it \Gamma}_7(S_0)$  a continuous path is searched in backward direction to the start value  ${\it \Gamma}_0(S_0)$.  Only the solid branches are allowed.  Dotted lines cannot be part of the selected  $($best$)$  path.
  • The selected path  $($grayed out in the graph$)$  traverses from right to left in the sketch the states is 
$$S_0 ← S_2 ← S_1 ← S_0 ← S_2 ← S_3 ← S_1 ← S_0.$$
There is no second continuous path from  ${\it \Gamma}_7(S_0)$  to  ${\it \Gamma}_0(S_0)$. This means:   The decoding result is unique.
  • The result  $\underline{v} = (1, 1, 0, 0, 1, 0, 0)$  of the Viterbi decoder with respect to the information sequence is obtained if for the continuous path  $($but now in forward direction from left to right$)$  the colors of the individual branches are evaluated  $($red   ⇒   "$0$",   blue   ⇒   $1)$.

From the final value   ${\it \Gamma}_7(S_0) = 0$   it can be seen that there were no transmission errors in this first example:

  • The decoding result  $\underline{z}$  thus matches the received vector  $\underline{y} = (11, 01, 01, 11, 11, 10, 11)$  and the actual encoded sequence  $\underline{x}$.
  • With error-free transmission,  $ \underline{v}$  is not only the most probable info sequence  $\underline{u}$  according to the maximum likelihood criterion,  but both are even identical:   $\underline{v} \equiv \underline{u}$.


Decoding examples for the erroneous case


Now follow three examples of Viterbi decoding for the erroneous case.

$\text{Example 1:}$  We assume here the received vector  $\underline{y} = \big (11\hspace{0.05cm}, 11\hspace{0.05cm}, 10\hspace{0.05cm}, 00\hspace{0.05cm}, 01\hspace{0.05cm}, 01\hspace{0.05cm}, 11 \hspace{0.05cm} \hspace{0.05cm} \big ) $  which does not represent a valid encoded sequence  $\underline{x}$ . The calculation of error values  ${\it \Gamma}_i(S_{\mu})$  and the path search is done as described in section  "Preliminaries"  and demonstrated in the last two sections for the error-free case.

Decoding example with two bit errors at the beginning

As summary of this first example,  it should be noted:

  • Also with this trellis,  a unique path  $($with dark gray background$)$  can be traced,  leading to the following results  $($recognizable by the labels or the colors of this path$)$:
\[\underline{z} = \big (00\hspace{0.05cm}, 11\hspace{0.05cm}, 10\hspace{0.05cm}, 00\hspace{0.05cm}, 01\hspace{0.05cm}, 01\hspace{0.05cm}, 11 \hspace{0.05cm} \big ) \hspace{0.05cm},\]
\[ \underline{\upsilon} =\big (0\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0 \hspace{0.05cm} \big ) \hspace{0.05cm}.\]
  • Comparing the most likely transmitted encoded sequence  $\underline{z}$  with the received vector  $\underline{y}$  shows that there were two bit errors directly at the beginning.  But since the used convolutional code has the $\text{free distance}$  $d_{\rm F} = 5$,  two transmission errors do not yet lead to a wrong decoding result.
  • There are other paths such as the lighter highlighted path
$$S_0 → S_1 → S_3 → S_3 → S_3 → S_2 → S_0 → S_0$$
that initially appear to be promising.  Only in the last decoding step  $(i = 7)$  can this light gray path finally be discarded.

Further remarks:

  1. The example shows that a too early decision is often not purposeful. 
  2. One can also see the expediency of termination:   With final decision at  $i = 5$  $($end of information sequence$)$,  the sequences  $(0, 1, 0, 1, 1)$  and  $(1, 1, 1, 1, 0)$  would still have been considered equally likely.
  3. In the calculation of  ${\it \Gamma}_5(S_0) = 3$  and  ${\it \Gamma}_5(S_1) = 3$  here in each case the two comparison branches lead to exactly the same minimum error value. In the graph these two special cases are marked by dash dots.  In this example,  this special case has no effect on the path search.
  4. Nevertheless,  the algorithm always expects a decision between two competing branches.  In practice,  one helps by randomly selecting one of the two paths if they are equal.


$\text{Example 2:}$  In this example,  we assume the following assumptions regarding source and encoder:

Decoding example with three bit errors
$$\underline{u} = \big (1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0\hspace{0.05cm}, 1 \hspace{0.05cm}, 0\hspace{0.05cm}, 0 \big )$$
$$\Rightarrow \hspace{0.3cm} \underline{x} = \big (11\hspace{0.05cm}, 01\hspace{0.05cm}, 01\hspace{0.05cm}, 11\hspace{0.05cm}, 11\hspace{0.05cm}, 10\hspace{0.05cm}, 11 \hspace{0.05cm} \hspace{0.05cm} \big ) \hspace{0.05cm}.$$

From the graph you can see here that the decoder decides for the correct path  $($dark background$)$  despite three bit errors.

  • So there is not always a wrong decision,  if more than  $d_{\rm F}/2$  bit errors occurred.
  • But with statistical distribution of the three bit errors,  wrong decision would be more frequent than right.


$\text{Example 3:}$  Here also applies 

Decoding example with four bit errors
$$\underline{u} = \big (1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0\hspace{0.05cm}, 1 \hspace{0.05cm}, 0\hspace{0.05cm}, 0 \big )$$
$$\Rightarrow \hspace{0.3cm} \underline{x} = \big (11\hspace{0.05cm}, 01\hspace{0.05cm}, 01\hspace{0.05cm}, 11\hspace{0.05cm}, 11\hspace{0.05cm}, 10\hspace{0.05cm}, 11 \hspace{0.05cm} \hspace{0.05cm} \big ) \hspace{0.05cm}.$$

Unlike the last example,  a fourth bit error is added:  $\underline{y}_7 = (01).$

  • Now both branches in step  $i = 7$  lead to the minimum error value  ${\it \Gamma}_7(S_0) = 4$,  recognizable by the dash-dotted transitions.
  • If one decides in the then required lottery procedure for the path with dark background,  the correct decision is still made even with four bit errors:   $\underline{v} = \big (1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0\hspace{0.05cm}, 1 \hspace{0.05cm}, 0\hspace{0.05cm}, 0 \big )$.
  • Otherwise,  a wrong decision is made. Depending on the outcome of the dice roll in step  $i =6$  between the two dash-dotted competitors,  you choose either the purple or the light gray path. 
  • Both have little in common with the correct path.


Relationship between Hamming distance and correlation


Especially for the  $\text{BSC model}$  $($but also for any other binary channel  ⇒   input  $x_i ∈ \{0,1\}$,  output $y_i ∈ \{0,1\}$  such as the  $\text{Gilbert–Elliott model}$$)$  provides

  • the Hamming distance  $d_{\rm H}(\underline{x}, \ \underline{y})$  exactly the same information about the similarity of the input sequence  $\underline{x}$  and the output sequence  $\underline{y}$ 
  • as the  $\text{inner product}$.  Assuming that the sequences are in bipolar form  $($denoted by tildes$)$  and that the sequence length is  $L$  in each case,  the inner product is:
\[<\hspace{-0.1cm}\underline{\tilde{x}}, \hspace{0.05cm}\underline{\tilde{y}} \hspace{-0.1cm}> \hspace{0.15cm} = \sum_{i = 1}^{L} \tilde{x}_i \cdot \tilde{y}_i \hspace{0.3cm}{\rm with } \hspace{0.2cm} \tilde{x}_i = 1 - 2 \cdot x_i \hspace{0.05cm},\hspace{0.2cm} \tilde{y}_i = 1 - 2 \cdot y_i \hspace{0.05cm},\hspace{0.2cm} \tilde{x}_i, \hspace{0.05cm}\tilde{y}_i \in \hspace{0.1cm}\{ -1, +1\} \hspace{0.05cm}.\]

We sometimes refer to this inner product as the  »correlation value«.  Unlike the   $\text{correlation coefficient}$  the  "correlation value"  may well exceed the range of values  $±1$.

$\text{Example 4:}$  We consider here two binary sequences of length  $L = 10$.  Shown on the left are the  »unipolar«  sequences  $\underline{x}$  and  $\underline{y}$  and the product  $\underline{x} \cdot \underline{y}$.

Relationship between Haming distance and correlation value
  • You can see the Hamming distance  $d_{\rm H}(\underline{x}, \ \underline{y}) = 6$   ⇒   six bit errors at the arrow positions.
  • The inner product   $ < \underline{x} \cdot \underline{y} > \hspace{0.15cm} = \hspace{0.15cm}0$   has no significance here.  For example,  $< \underline{0} \cdot \underline{y} > $ is always zero regardless of  $\underline{y}$.


The Hamming distance  $d_{\rm H} = 6$  can also be seen from the  »bipolar«  $($antipodal$)$ plot in the right graph.

  • The  "correlation value"  has now the correct value:
$$4 \cdot (+1) + 6 \cdot (-1) = \, -2.$$
  • For the deterministic relationship between the  "correlation value"  and the  "Hamming distance"  holds with the sequence length  $L$:
$$ < \underline{ \tilde{x} } \cdot \underline{\tilde{y} } > \hspace{0.15cm} = \hspace{0.15cm} L - 2 \cdot d_{\rm H} (\underline{\tilde{x} }, \hspace{0.05cm}\underline{\tilde{y} })\hspace{0.05cm}. $$


$\text{Conclusion:}$  Let us now interpret this last equation for some special cases:

  • »Identical sequences«:   The Hamming distance is equal to  $0$  and the  correlation value is equal to  $L$.
  • »Inverted sequences«:   The Hamming distance is equal to  $L$  and the  correlation value  is equal to  $-L$.
  • »Uncorrelated sequences«:   The Hamming distance is equal to  $L/2$  and the  correlation value  is equal to  $0$.

Viterbi algorithm based on correlation and metrics


Using the insights of the last section,  the Viterbi algorithm can also be characterized as follows.

$\text{Alternative description:}$ 

  • The Viterbi algorithm searches from all possible encoded sequences  $\underline{x}' ∈ \mathcal{C}$  the sequence  $\underline{z}$  with the  »maximum correlation value«  to the received sequence  $\underline{y}$:
\[\underline{z} = {\rm arg} \max_{\underline{x}' \in \hspace{0.05cm} \mathcal{C} } \hspace{0.1cm} \left\langle \tilde{\underline{x} }'\hspace{0.05cm} ,\hspace{0.05cm} \tilde{\underline{y} } \right\rangle \hspace{0.4cm}{\rm with }\hspace{0.4cm}\tilde{\underline{x} }\hspace{0.05cm}'= 1 - 2 \cdot \underline{x}'\hspace{0.05cm}, \hspace{0.2cm} \tilde{\underline{y} }= 1 - 2 \cdot \underline{y} \hspace{0.05cm}.\]
  • Here,  $〈\ \text{ ...} \ 〉$  denotes a  "correlation value"  according to the statements in the last section.  The tildes again indicate the bipolar representation.


The graphic shows the corresponding trellis evaluation. 

  • the input sequence and the encoded sequence are
$$\underline{u} = \big (0\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 1\hspace{0.05cm}, 1\hspace{0.05cm}, 0\hspace{0.05cm}, 0 \hspace{0.05cm} \big ) \hspace{0.3cm} ⇒ \hspace{0.3cm} \underline{x} = \big (00, 11, 10, 00, 01, 01, 11 \big ) \hspace{0.05cm}.$$
Viterbi decoding based on correlation and metrics

Further are assumed:

  • Standard convolutional encoder:   rate  $R = 1/2$,  memory  $m = 2$;
  • the transfer function matrix:   $\mathbf{G}(D) = (1 + D + D^2, 1 + D^2)$;
  • length of the information sequence:   $L = 5$;
  • consideration of termination:  $L' = 7$;
  • received vector   $\underline{y} = (11, 11, 10, 00, 01, 01, 11)$   ⇒   two bit errors;
  • Viterbi decoding using trellis diagram:
  • red arrow ⇒   hypothesis $u_i = 0$,
  • blue arrow ⇒   hypothesis $u_i = 1$.


Adjacent trellis and the  $\text{Example 1 trellis}$   are very similar.  Just like the search for the sequence with the  "minimum Hamming distance",  the  "search for the maximum correlation value"  is also done step by step:

  1.   The nodes here are called the  "cumulative metrics"  ${\it \Lambda}_i(S_{\mu})$.
  2.   The  "branch metrics"  specify the  "metric increments".
  3.   The final value  ${\it \Lambda}_7(S_0) = 10$  indicates the  "end correlation value"  between the selected sequence  $\underline{z}$  and the received vector  $\underline{y}$.
  4.   In the error-free case,  the result would be  ${\it \Lambda}_7(S_0) = 14$.

$\text{Example 5:}$  The following detailed description of the trellis evaluation refer to the above trellis:

  • The acumulated metrics at time  $i = 1$  result with  $\underline{y}_1 = (11)$  to
\[{\it \Lambda}_1(S_0) \hspace{0.15cm} = \hspace{0.15cm} <\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm} = \hspace{0.2cm}<(+1,\hspace{0.05cm} +1)\hspace{0.05cm}, \hspace{0.05cm}(-1,\hspace{0.05cm} -1) >\hspace{0.1cm} = \hspace{0.1cm} -2 \hspace{0.05cm},\]
\[{\it \Lambda}_1(S_1) \hspace{0.15cm} = \hspace{0.15cm} <\hspace{-0.05cm}(11)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm} = \hspace{0.2cm}<(-1,\hspace{0.05cm} -1)\hspace{0.05cm}, \hspace{0.05cm}(-1,\hspace{0.05cm} -1) >\hspace{0.1cm} = \hspace{0.1cm} +2 \hspace{0.05cm}.\]
  • Accordingly,  at time  $i = 2$  with  $\underline{y}_2 = (11)$:
\[{\it \Lambda}_2(S_0) = {\it \Lambda}_1(S_0) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm} = \hspace{0.1cm} -2-2 = -4 \hspace{0.05cm},\]
\[{\it \Lambda}_2(S_1) = {\it \Lambda}_1(S_0) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(11)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm} = \hspace{0.1cm} -2+2 = 0 \hspace{0.05cm},\]
\[{\it \Lambda}_2(S_2)= {\it \Lambda}_1(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(10)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm} = \hspace{0.1cm} +2+0 = +2 \hspace{0.05cm},\]
\[{\it \Lambda}_2(S_3)= {\it \Lambda}_1(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(01)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm} = \hspace{0.1cm} +2+0 = +2 \hspace{0.05cm}.\]
  • From time  $i =3$  a decision must be made between two acumulated metrics.  For example,  $\underline{y}_3 = (10)$  is obtained for the top and bottom metrics in the trellis:
\[{\it \Lambda}_3(S_0)={\rm max} \left [{\it \Lambda}_{2}(S_0) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}>\hspace{0.2cm} \hspace{0.05cm}, \hspace{0.2cm}{\it \Lambda}_{2}(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(00)\hspace{0.05cm}, \hspace{0.05cm}(11) \hspace{-0.05cm}> \right ] = {\rm max} \left [ -4+0\hspace{0.05cm},\hspace{0.05cm} +2+0 \right ] = +2\hspace{0.05cm},\]
\[{\it \Lambda}_3(S_3) ={\rm max} \left [{\it \Lambda}_{2}(S_1) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(01)\hspace{0.05cm}, \hspace{0.05cm}(10) \hspace{-0.05cm}>\hspace{0.2cm} \hspace{0.05cm}, \hspace{0.2cm}{\it \Lambda}_{2}(S_3) \hspace{0.2cm}+ \hspace{0.1cm}<\hspace{-0.05cm}(10)\hspace{0.05cm}, \hspace{0.05cm}(10) \hspace{-0.05cm}> \right ] = {\rm max} \left [ 0+0\hspace{0.05cm},\hspace{0.05cm} +2+2 \right ] = +4\hspace{0.05cm}.\]
  • Comparing the  »accumulated correlation values«  ${\it \Lambda}_i(S_{\mu})$  to be maximized with the   »accumulated error values«  ${\it \Gamma}_i(S_{\mu})$  to be minimized  according to the  $\text{$\text{Example 1}$}$,  one sees the following deterministic relationship:
\[{\it \Lambda}_i(S_{\mu}) = 2 \cdot \big [ i - {\it \Gamma}_i(S_{\mu}) \big ] \hspace{0.05cm}.\]
  • The selection of surviving branches to each decoding step is identical for both methods,  and the path search also gives the same result.


$\text{Conclusions:}$ 

  1.     In the binary channel – for example according to the BSC model – the two described Viterbi variants  "error value minimization"  and   "correlation value maximization"  lead to the same result.
  2.   In the AWGN channel,  on the other hand,  "error value minimization"  is not applicable because no Hamming distance can be specified between the binary input  $\underline{x}$  and the analog output  $\underline{y}$.
  3.   For the AWGN channel,  the  "correlation value maximization"  is rather identical to the minimization of the  $\text{Euclidean distance}$  – see  "Exercise 3.10Z".
  4.   Another advantage of the  "correlation value maximization"  is that a reliability information about the received values  $\underline{y}$  can be considered in a simple way.


Viterbi decision for non-terminated convolutional codes


So far,  a terminated convolutional code of length  $L\hspace{0.05cm}' = L + m$  has always been considered,  and the result of the Viterbi decoder was the continuous trellis path from the start time  $(i = 0)$  to the end  $(i = L\hspace{0.05cm}')$.

  • For non–terminated convolutional codes  $(L\hspace{0.05cm}' → ∞)$  this decision strategy is not applicable.
  • Here,  the algorithm must be modified to provide a best estimate  $($according to maximum likelihood$)$  of the incoming bits of the encoded sequence in finite time.
Exemplary trellis and surviving paths


The graphic shows in the upper part an exemplary trellis for

  • "our standard encoder"  $(R = 1/2, \ m = 2)$
$$ {\rm G}(D) = (1 + D + D^2, \ 1 + D^2),$$
  • the zero input sequence   ⇒   $\underline{u} = \underline{0} = (0, 0, 0, \ \text{...})$;  output:
$$\underline{x} = \underline{0} = (00, 00, 00, \ \text{...}),$$
  • in each case,  transmission errors at  $i = 4$  and  $i = 5$.


⇒   Based on the stroke types,  one can recognize allowed  $($solid$)$  and forbidden  $($dotted$)$  arrows in red  $(u_i = 0)$  and blue  $(u_i = 1)$.

Dotted lines have lost a comparison against a competitor and cannot be part of the selected path.

⇒   The lower part of the graph shows the  $2^m = 4$  surviving paths  ${\it \Phi}_9(S_{\mu})$  at time  $i = 9$.

  • It is easiest to find these paths from right to left  $($"backward"$)$.
  • The following specification shows the traversed states  $S_{\mu}$  but in the forward direction:
$${\it \Phi}_9(S_0) \text{:} \hspace{0.4cm} S_0 → S_0 → S_0 → S_0 → S_0 → S_0 → S_0 → S_0 → S_0 → S_0,$$
$${\it \Phi}_9(S_1) \text{:} \hspace{0.4cm} S_0 → S_0 → S_0 → S_0 → S_1 → S_2 → S_1 → S_3 → S_2 → S_1,$$
$${\it \Phi}_9(S_2) \text{:} \hspace{0.4cm} S_0 → S_0 → S_0 → S_0 → S_1 → S_2 → S_1 → S_2 → S_1 → S_2,$$
$${\it \Phi}_9(S_3) \text{:} \hspace{0.4cm} S_0 → S_0 → S_0 → S_0 → S_1 → S_2 → S_1 → S_2 → S_1 → S_3.$$

At earlier times  $(i<9)$  other survivors  ${\it \Phi}_i(S_{\mu})$  would result. Therefore, we define:

$\text{Definition:}$  The   »survivor«   ${\it \Phi}_i(S_{\mu})$  is the continuous path from the start  $S_0$  $($at  $i = 0)$  to the node  $S_{\mu}$ at time  $i$.

  • It is recommended to search for paths in the backward direction.


The following graph shows the surviving paths for time points  $i = 6$  to  $i = 9$.  In addition,  the respective metrics  $($accumulated correlaltion values$)$  ${\it \Lambda}_i(S_{\mu})$  for all four states are given.

The survivors  ${\it \Phi}_6, \ \text{...} \ , \ {\it \Phi}_9$

This graph is to be interpreted as follows:

  1. At time  $i = 9$  no final maximum likelihood decision can yet be made about the first nine bits of the information sequence.
  2. But it is already certain that the most probable bit sequence is represented by one of the paths  ${\it \Phi}_9(S_0), \ \text{...} \ , \ {\it \Phi}_9(S_3)$.
  3. Since all four paths up to  $i = 3$  are identical,  the decision  "$v_1 = 0,\ v_2 = 0, \ v_3 = 0$"  is the the most probable.  Also at a later time no other decision would be made. 
  4. Regarding the bits  $v_4, \ v_5, \ \text{...}$  one should not decide at this early stage.  Only the first two zeros are safe,  not  $v_3 = 0$.
  5. If one had to make a constraint decision at time  $i = 9$  one would choose  ${\it \Phi}_9(S_0)$   ⇒   $\underline{v} = (0, 0, \text{. ..} \ , 0)$ since the metric  ${\it \Lambda}_9(S_0) = 14$  is larger than the comparison metrics.
  6. The forced decision at time  $i = 9$  leads to the correct result in this example,  see lower sketch.
  7. A decision at time  $i = 6$  would have would have led to the wrong result  ⇒   $\underline{v} = (0, 0, 0, 1, 0, 1)$,  see upper sketch.
  8. At time  $i = 7$  resp.  $i = 8$,  a forced decision would not have been clear,  as shown in the two middle diagrams


Other decoding methods for convolutional codes


We have so far dealt only with the Viterbi algorithm in the form presented in 1967 by Andrew J. Viterbi in  [Vit67][1].  It was not until 1974 that  $\text{George David Forney}$  proved that this algorithm performs maximum likelihood decoding of convolutional codes.

But even in the years before,  many scientists were very eager to provide efficient decoding methods for the convolutional codes first described by  $\text{Peter Elias}$  in 1955.  Among others,  more detailed descriptions can be found for example in  [Bos99][2].

  • the work of Kamil Zigangirov  (1966)  and  $\text{Frederick Jelinek}$  (1969),  whose decoding method is also referred to as the  "stack algorithm".

All of these decoding schemes,  as well as the Viterbi algorithm as described so far,  provide  "hard decided"  output values   ⇒   $v_i ∈ \{0, 1\}$.  Often,  however,  information about the reliability of the decisions made would be desirable,  especially when there is a concatenated coding scheme with an outer and an inner code.

If the reliability of the bits decided by the inner decoder is known at least roughly,  this information can be used to  (significantly)  reduce the bit error probability of the outer decoder.  The  "soft output Viterbi algorithm"  $\rm(SOVA)$  proposed by  $\text{Joachim Hagenauer}$  in  [Hag90][3]  allows to specify a reliability measure in each case in addition to the decided symbols.

Finally, we go into some detail about the  »BCJR algorithm«  named after its inventors L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv  [BCJR74][4].

While the Viterbi algorithm only estimates the total sequence   ⇒   $\text{block-wise ML}$,  the BCJR algorithm estimates a single bit considering the entire received sequence.  So this is a  "bit-wise maximum a-posteriori decoding"   ⇒   $\text{bit-wise MAP}$.


$\text{Conclusion:}$ 

The difference between Viterbi–algorithm and BCJR algorithm shall be  – greatly simplified –  illustrated by the example of a terminated convolutional code:

  • The   »Viterbi algorithm «  processes the trellis in only one direction  – the forward direction  – and computes the metrics  ${\it \Lambda}_i(S_{\mu})$  for each node.  After reaching the terminal node,  the surviving path is searched,  which identifies the most probable encoded sequence.
  • In the   »BCJR algorithm «  the trellis is processed twice,  once in forward direction and then in backward direction.  Two metrics can then be specified for each node,  from which the a-posterori probability can be determined for each bit

.

Notes:

  1.   This short summary is based on the textbook  [Bos99][2].
  2.   A description of the BCJR–algorithm follows also in section  "Hard Decision vs. Soft Decision"  "in the fourth main chapter"  "Iterative Decoding Methods".


Exercises for the chapter


Exercise 3.09: Basics of the Viterbi Algorithm

Exercise 3.09Z: Viterbi Algorithm again

Exercise 3.10: Metric Calculation

Exercise 3.10Z: Maximum Likelihood Decoding of Convolutional Codes

Exercise 3.11: Viterbi Path Finding

References

  1. Viterbi, A.J.:  Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm.  In: IEEE Transactions on Information Theory, vol. IT-13, pp. 260-269, April 1967.
  2. 2.0 2.1 Bossert, M.:  Channel Coding for Telecommunications.  Wiley & Sons, 1999.
  3. Hagenauer, J.:  Soft Output Viterbi Decoder.  In: Technischer Report, Deutsche Forschungsanstalt für Luft- und Raumfahrt (DLR), 1990.
  4. Bahl, L.R.; Cocke, J.; Jelinek, F.; Raviv, J.:  Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate.  In: IEEE Transactions on Information Theory, Vol. IT-20, pp. 284-287, 1974.