Loading [MathJax]/jax/element/mml/optable/BasicLatin.js

Difference between revisions of "Channel Coding/Soft-in Soft-Out Decoder"

From LNTwww
Line 132: Line 132:
 
  L(y = -1\hspace{0.05cm}\vert\hspace{0.05cm}x) = -2.197\hspace{0.05cm}.</math>
 
  L(y = -1\hspace{0.05cm}\vert\hspace{0.05cm}x) = -2.197\hspace{0.05cm}.</math>
  
*This example shows that the so-called&nbsp; &raquo;LLR algebra&laquo;  can also be applied to conditional probabilities. In &nbsp;[[Aufgaben:Exercise_4.1Z:_Log_Likelihood_Ratio_at_the_BEC_Model|$\text{Exercise 4.1Z}$]],&nbsp; the BEC&ndash;model is described in a similar way.}}<br>
+
*This example shows that the so-called&nbsp; &raquo;LLR algebra&laquo;  can also be applied to conditional probabilities. In &nbsp;[[Aufgaben:Exercise_4.1Z:_Log_Likelihood_Ratio_at_the_BEC_Model|"Exercise 4.1Z"]],&nbsp; the BEC&ndash;model is described in a similar way.}}<br>
  
 
{{GraueBox|TEXT=  
 
{{GraueBox|TEXT=  
Line 181: Line 181:
 
[[File:EN_KC_T_4_1_S4.png|right|frame|Model of bit-wise soft-in soft-out decoding|class=fit]]
 
[[File:EN_KC_T_4_1_S4.png|right|frame|Model of bit-wise soft-in soft-out decoding|class=fit]]
 
   
 
   
*For long codes,&nbsp; a&nbsp; "maximum-a-posteriori block-level decision"&nbsp; (for short: &nbsp;<b>[[Channel_Coding/Channel_Models_and_Decision_Structures#Criteria_.C2.BBMaximum-a-posteriori.C2.AB_and_.C2.BBMaximum-Likelihood.C2.AB|$\text{ "block-wise MAP"}$]]</b>&nbsp; &ndash; very elaborate.
+
*For long codes,&nbsp; a&nbsp; "maximum-a-posteriori block-level decision"&nbsp; (for short: &nbsp;<b>[[Channel_Coding/Channel_Models_and_Decision_Structures#Criteria_.C2.BBMaximum-a-posteriori.C2.AB_and_.C2.BBMaximum-Likelihood.C2.AB| block-wise MAP]]</b>&nbsp; &ndash; very elaborate.
 
   
 
   
 
*One would have to find among the&nbsp; 2^k&nbsp; allowable code words&nbsp; \underline{x}_j &#8712; \mathcal{C}&nbsp; the one with the greatest a-posteriori probability&nbsp; (APP).
 
*One would have to find among the&nbsp; 2^k&nbsp; allowable code words&nbsp; \underline{x}_j &#8712; \mathcal{C}&nbsp; the one with the greatest a-posteriori probability&nbsp; (APP).
Line 290: Line 290:
 
  \hspace{0.05cm}.</math>
 
  \hspace{0.05cm}.</math>
  
*As will be shown in the&nbsp; [[Aufgaben:Aufgabe_4.4:_Extrinsische_L–Werte_beim_SPC|$\text{Exercise 4.4}$]],&nbsp; it is also possible to write for this:
+
*As will be shown in the&nbsp; [[Aufgaben:Aufgabe_4.4:_Extrinsische_L–Werte_beim_SPC|"Exercise 4.4"]],&nbsp; it is also possible to write for this:
  
 
::<math>L_{\rm E}(i) = 2 \cdot {\rm tanh}^{-1} \hspace{0.1cm} \left [  \prod\limits_{j \ne i}^{n} \hspace{0.15cm}{\rm tanh}(L_j/2) \right ]
 
::<math>L_{\rm E}(i) = 2 \cdot {\rm tanh}^{-1} \hspace{0.1cm} \left [  \prod\limits_{j \ne i}^{n} \hspace{0.15cm}{\rm tanh}(L_j/2) \right ]
Line 335: Line 335:
 
<br>
 
<br>
 
An example of iterative decoding of convolutional codes is the&nbsp; &raquo;<b>BCJR algorithm</b>&laquo;,&nbsp; named after its inventors L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv &nbsp; &#8658; &nbsp; [BCJR74]<ref name='BCJR74'>Bahl, L.R.; Cocke, J.; Jelinek, F.; Raviv, J.:&nbsp; Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate.&nbsp; In: IEEE Transactions on Information Theory, Vol. IT-20, pp. 284-287, 1974.</ref>.&nbsp; The algorithm has many parallels to the seven-year older Viterbi decoding,&nbsp; but also some significant differences:
 
An example of iterative decoding of convolutional codes is the&nbsp; &raquo;<b>BCJR algorithm</b>&laquo;,&nbsp; named after its inventors L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv &nbsp; &#8658; &nbsp; [BCJR74]<ref name='BCJR74'>Bahl, L.R.; Cocke, J.; Jelinek, F.; Raviv, J.:&nbsp; Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate.&nbsp; In: IEEE Transactions on Information Theory, Vol. IT-20, pp. 284-287, 1974.</ref>.&nbsp; The algorithm has many parallels to the seven-year older Viterbi decoding,&nbsp; but also some significant differences:
*While Viterbi estimates the total sequence &nbsp; &#8658; &nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#Definitions_of_the_different_optimal_receivers| "block&ndash;wise maximum likelihood"]],&nbsp; the BCJR&ndash; Algorithm minimizes the bit error probability &nbsp; &#8658; &nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#Definitions_of_the_different_optimal_receivers|"bit&ndash;wise MAP"]].<br>
+
*While Viterbi estimates the total sequence &nbsp; &#8658; &nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#Definitions_of_the_different_optimal_receivers|$\text{block&ndash;wise maximum likelihood}$]],&nbsp; the BCJR&ndash; Algorithm minimizes the bit error probability &nbsp; &#8658; &nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#Definitions_of_the_different_optimal_receivers|$\text{bit&ndash;wise MAP}$]].<br>
  
 
*The Viterbi algorithm&nbsp; (in its original form)&nbsp; cannot process soft information.&nbsp; In contrast,&nbsp; the BCJR algorithm specifies a reliability value for each individual bit at each iteration,&nbsp; which is taken into account in subsequent iterations.<br>
 
*The Viterbi algorithm&nbsp; (in its original form)&nbsp; cannot process soft information.&nbsp; In contrast,&nbsp; the BCJR algorithm specifies a reliability value for each individual bit at each iteration,&nbsp; which is taken into account in subsequent iterations.<br>
Line 343: Line 343:
 
[[File:EN_KC_T_4_1_S5.png|right|frame|Comparison of Viterbi and BCJR algorithm|class=fit]]
 
[[File:EN_KC_T_4_1_S5.png|right|frame|Comparison of Viterbi and BCJR algorithm|class=fit]]
  
*Viterbi searches and finds the most likely path from&nbsp; Γ0(S0)&nbsp; to&nbsp; Γ5(S0), namely&nbsp; S_0 &#8594; S_1 &#8594; S_0 &#8594; S_0 &#8594; S_1&#8594; S_0. We refer to the sample solution to&nbsp; [[Aufgaben:Exercise_3.09Z:_Viterbi_Algorithm_again|$\text{Exercise 3.9Z}$]].<br>
+
*Viterbi searches and finds the most likely path from&nbsp; Γ0(S0)&nbsp; to&nbsp; Γ5(S0), namely&nbsp; S_0 &#8594; S_1 &#8594; S_0 &#8594; S_0 &#8594; S_1&#8594; S_0. We refer to the sample solution to&nbsp; [[Aufgaben:Exercise_3.09Z:_Viterbi_Algorithm_again|"Exercise 3.9Z"]].<br>
  
 
*The sketch for the BCJR algorithm illustrates the extraction of the extrinsic&nbsp; L&ndash;value for the third symbol &nbsp; &#8658; &nbsp; LE(3).&nbsp; The relevant area in the trellis is shaded:
 
*The sketch for the BCJR algorithm illustrates the extraction of the extrinsic&nbsp; L&ndash;value for the third symbol &nbsp; &#8658; &nbsp; LE(3).&nbsp; The relevant area in the trellis is shaded:

Revision as of 22:45, 1 January 2023

# OVERVIEW OF THE FOURTH MAIN CHAPTER #


The last main chapter of the channel coding book describes  »iterative decoding techniques«  as used in most of today's  (2017)  communication systems.  This is due to the following reasons:

  • To approach the channel capacity,  one needs very long codes.
  • But for long codes,  "blockwise maximum likelihood decoding"  is almost impossible.


The decoder complexity can be significantly reduced with almost the same quality if two  (or more)  short channel codes are linked together and the newly acquired  (soft)  information is exchanged between the decoders at the receiver in several steps   ⇒   "iteratively".

The breakthrough in the field came in the early 1990s with the invention of the  "turbo codes"  by  Claude Berrou  and shortly thereafter with the rediscovery of  "low-density parity-check codes"  by  David J. C. MacKay  and  Radford M. Neal,  after the LDPC codes developed as early as 1961 by  Robert G. Gallager  had been forgotten in the meantime.

Specifically,  the fourth main chapter discusses:

  • a comparison of  »hard decision«  and  »soft decision«,
  • the quantification of  »reliability information«  by  »log likelihood ratios«,
  • the principle of symbol-wise  »soft-in soft-out  (SISO)«  decoding,
  • the definition of  »a-priori  L–value«,  »a-posteriori  L–value«  and  »extrinsic  L–value«,
  • the basic structure of  »serially concatenated«  resp.  »parallel concatenated«  coding systems,
  • the characteristics of  »product codes«  and their  »hard decision decoding«,
  • the basic structure,  decoding algorithm,  and performance of  »turbo codes«,
  • basic information on the  »low-density parity-check codes«  and their applications.



Hard Decision vs. Soft Decision


To introduce the topic discussed here, let us consider the following digital transmission system with coding.

  • In the following, all symbols are given in bipolar representation:
Considered digital transmission system with coding
Comparison of  "hard decision" and  "soft decision";   for all columns of this table it is assumed:
(1)   the information block   u_=(0, 1),  bipolar representable as  (+1,1),
(2)   the encoded block  x_=(0, 1, 1)   ⇒   in bipolar representation:  (+1,1,1).


"0"  →  "+1",  and 
"1"  →  "1".
  • The symbol sequence  u_=(u1, u2)  of the digital source is assigned to the encoded sequence 
x_=(x1, x2, x3)=(u1, u2, p)
where for the parity bit   p=u1u2   holds   ⇒    SPC (3, 2, 2).
  • The  AWGN channel  changes the symbols  xi{+1, 1}  to real-valued output values  yi,  for example according to  channel 4  in the table below:  
    • x1=+1   ⇒   y1=+0.9,
    • x2=1   ⇒   y1=+0.1,
    • x3=1   ⇒   y1=+0.1,
  • The given block diagram corresponds to  MLHD.  Here,  only the signs of the AWGN output values   ⇒   yHD, i=sign[ySD, i]  are evaluated for decoding.
  • With  "soft decision"  one omits the shaded block and directly evaluates the continuous value input variables  ySD, i .


Thus,  the four columns in the table differ only by different AWGN realizations.

Definitions:  From the example table you can see:

  • Hard Decision:   The sink symbol sequence  v_HD  results from the  "hard decided channel values"  y_HD  (blue background).
        In our example,  only the constellations according to  channel 1  and  channel 2  are decoded without errors.
  • Soft Decision:   The sink symbol sequence  v_SD  results from the  "soft channel output values"  y_SD  (green background).
        Now,  in this example,  it is also correctly decided at  channel 3


The entries in the example table above are to be interpreted as follows:

  1. For ideal channel   ⇒   channel 1   ⇒   x_=y_SD=y_HD  there is no difference between the (blue) conventional hard decision variant  (MLHD)  and the (green) soft decision variant  (MLSD).
  2. The setting according to  channel 2  demonstrates low signal distortions.  Because of  y_HD=x_  (which means that the channel does not distort the signs)  also  MLHD  gives the correct result  v_HD=u_.
  3. At  channel 3  there is  y_HDx_  and there is also no  SPC (3,2)  assignment  u_   ⇒   y_HD.  The maximum Likelihood decoder reports here by outputting  v_HD=(E, E) that it failed in decoding this block.  "E" stands for an  Erasure .
  4. Also the  soft decision decoder recognizes that decoding based on the signs does not work.  Based on the  y_SD  values,  however,  it recognizes that with high probability the second bit has been falsified and decides to use the correct symbol sequence  v_SD=(+1,1)=u_.
  5. In  channel 4,  both the signs of bit 2 and bit 3 are changed by the AWGN channel,  leading to the result   v_HD=(+1,+1)u_(+1,1)   ⇒   a block error and a bit error at the same time.  Also the soft decision decoder gives the same wrong result here.

The decoding variant  MLSD  also offers the advantage over  MLHD  that it is relatively easy to assign a reliability value to each decoding result  (but in the above table this is not specified).  This reliability value would

  • have its maximum value for  channel 1 ,
  • be significantly smaller for  channel 2 ,
  • be close to zero for  channel 3  and  channel 4.

Reliability information - Log likelihood ratio


Let be  x{+1,1}  a binary random variable with probabilities  Pr(x=+1)  and  Pr(x=1).  For coding theory,  it proves convenient in terms of computation times to use the natural logarithm of the quotient instead of the probabilities  Pr(x=±1)  .

Definition:  The  log likelihood ratio  (LLR)  of the random variable  x{+1,1}  is:

L(x)=lnPr(x=+1)Pr(x=1).
  • For unipolar representation  (+10   and   11)  applies accordingly with  ξ{0,1}:
L(ξ)=lnPr(ξ=0)Pr(ξ=1).


Probability and log likelihood ratio

The table gives the nonlinear relationship between  Pr(x=±1)  and  L(x).  Replacing  Pr(x=+1)  with  Pr(ξ=0),  the middle row gives the  L–value of the unipolar random variable  ξ.

You can see:

  1. The more likely random value of  x{+1,1}  is given by the  sign   ⇒   sign L(x).
  2. In contrast,  the  absolute value    ⇒   |L(x)|  indicates the reliability for the result  "sign(L(x))".

Given BSC model

Example 1:  We consider the outlined  BSC model  with bipolar representation. 

  • With the falsification probability  ε=0.1  and the two random variables  x{+1,1}  and  y{+1,1}  at the input and output of the channel:
L(y|x)=lnPr(y|x=+1)Pr(y|x=1)={ln[(1ε)/ε]ln[ε/(1ε)]fory=+1,fory=1.
  • For example,  ε=0.1  results in the following numerical values  (compare the upper table):
L(y=+1|x)=ln0.90.1=+2.197,L(y=1|x)=2.197.
  • This example shows that the so-called  »LLR algebra« can also be applied to conditional probabilities. In  "Exercise 4.1Z",  the BEC–model is described in a similar way.


Conditional AWGN probability density functions

Example 2:  In another example,  we consider the  AWGN   channel with the conditional probability density functions

fy|x=+1(y|x=+1)=12πσe(y1)2/(2σ2),
fy|x=1(y|x=1)=12πσe(y+1)2/(2σ2).

⇒   In the graph,  two exemplary Gaussian functions are shown as blue and red curves, respectively. 

  • The total probability density function  (PDF)  of the output signal y  is obtained from the  (equally)  weighted sum:
fy(y)=1/2[fy|x=+1(y|x=+1)+fy|x=1(y|x=1)].
  • We now calculate the probability that the received value  y  lies in a  (very)  narrow interval of width  Δ  around  y0=0.25.  One obtains approximately
Pr(|yy0|Δ/2|x=+1)Δ2πσe(y01)2/(2σ2),
Pr(|yy0|Δ/2|x=1)Δ2πσe(y0+1)2/(2σ2).
The slightly larger vertical lines denote the conditions,  the smaller ones the absolute value.
  • The  log likelihood ratio  (LLR)  of the conditional probability in the forward direction  (meaning:   output  y  for a given input  x)  is thus obtained as the logarithm of the quotient of both expressions:
L(y=y0|x)=ln[e(y01)2/(2σ2)e(y0+1)2/(2σ2)]=ln[e[(y01)2+(y0+1)2]/(2σ2)]=(y0+1)2(y01)22σ2=2y0σ2.
  • If we now replace the auxiliary  y0  with the general random  y  at the AWGN output,  the final result is:
L(y|x)=2y/σ2=KLy.
Here  KL=2/σ2  is a constant that depends solely on the standard deviation of the Gaussian noise component.


Bit-wise soft-in soft-out decoding


We now assume an  (n, k) block code where the code word   x_=(x1, ... , xn)   is falsified by the channel into the received word  y_=(y1, ... , yn).

Model of bit-wise soft-in soft-out decoding
  • One would have to find among the  2k allowable code words  x_jC  the one with the greatest a-posteriori probability  (APP).
  • The code word to output  z_  in this case would be 
z_=argmax


A second possibility is decoding on bit level.  The represented bit-wise  \text{soft-in soft out decoder}  has the task to decode all code word bits  x_i ∈ \{0, \, 1\}  according to maximum a-posteriori probability  {\rm Pr}(x_i | \underline{y}).  With the control variable  i = 1, \text{...} , \ n  holds in this case:

  • The corresponding  log likelihood ratio  for the  ith  code bit is:
L_{\rm APP} (i) = L(x_i\hspace{0.05cm}|\hspace{0.05cm}\underline{y}) = {\rm ln} \hspace{0.15cm} \frac{{\rm Pr}(x_i = 0\hspace{0.05cm}|\hspace{0.05cm}\underline{y})}{{\rm Pr}(x_i = 1\hspace{0.05cm}|\hspace{0.05cm}\underline{y})}\hspace{0.05cm} .
  • The decoder works iteratively.  At initialization  (indicated in the diagram by the parameter  I = 0)  is  L_{\rm APP}(i) = L_{\rm K}(i),  where the channel  (German:  "Kanal"   ⇒   "K")  log likelihood ratio  L_{\rm K}(i)  is given by the received value  y_i.
  • In addition,  the extrinsic  log likelihood ratio  L_{\rm E}(i)  is calculated,  which quantifies the total information provided by all other bits   (j ≠ i)   due to the code properties about the considered  ith bit.
  • At the next iteration  (I = 1)  L_{\rm E}(i)  is taken into account in the computation of  L_{\rm APP}(i)  as a-priori information  L_{\rm A}(i).  Thus,  for the new a-posteriori log likelihood ratio in iteration  I + 1  holds:
L_{\hspace{0.1cm}\rm APP}^{(I+1)} (i) = L_{\hspace{0.1cm}\rm APP}^{(I)} (i) + L_{\hspace{0.1cm}\rm A}^{(I+1)} (i) = L_{\hspace{0.1cm}\rm APP}^{(I)} (i) + L_{\hspace{0.1cm}\rm E}^{(I)} (i)\hspace{0.05cm} .
  • The iterations continue until all absolute values  |L_{\rm APP}(i)|  are greater than a given value.  The most likely code word  \underline{z}  is then obtained from the signs of all  L_{\rm APP}(i),  with  i = 1, \ \text{...} , \ n.
  • For a  \text{systematic code}  the first  k  bits of  \underline{z}  indicate the information word  \underline{\upsilon}  being searched for,  which will most likely match the message  \underline{u}  being sent.

This SISO decoder descriptione from  [Bos99][1]  is intended at this point primarily to clarify the different  L–values.  One recognizes the large potential of the bit-wise decoding only in connection with  \text{concatenated coding systems}.

Calculation of extrinsic log likelihood ratios


The difficulty in bit-wise  (or symbol-wise)  iterative decoding is generally the provision of the extrinsic  log likelihood ratios   L_{\rm E}(i).  For a code of length  n,  the control variable is   i = 1, \ \text{...} , \ n.

\text{Definition:}  The   extrinsic log likelihood ratio   is a measure of the information that the other symbols  (j ≠ i)  of the code word provide about the  i–th encoded symbol,  expressed as a log likelihood ratio.  We denote this characteristic value by  L_{\rm E}(i).


We now calculate the extrinsic log likelihood ratios  L_{\rm E}(i)  for two example codes.

\text{(A)  Repetition Code}   ⇒   {\rm RC} \ (n, 1, n)

A repetition code  \rm (RC) is characterized by the fact that all  n  encoded symbols  x_i ∈ \{0, \, 1\}  are identical.

  • Here,  the extrinsic log likelihood ratio for the  i–th symbol is very easy to specify:
L_{\rm E}(i) = \hspace{0.05cm}\sum_{j \ne i} \hspace{0.1cm} L_j \hspace{0.3cm}{\rm with}\hspace{0.3cm}L_j = L_{\rm APP}(j) \hspace{0.05cm}.
  • If the sum over all  L_{j ≠ i}  is positive   ⇒   this implies from the point of view of the other log likelihood ratio values a preference for the decision   x_i = 0.
  • On the other hand,  if the sum is negative   ⇒   x_i = 1  is more likely.
  • L_{\rm E}(i) = 0  does not allow a prediction.


\text{Example 5:}  We consider the decoding of the repetition code  \text{RC (4, 1,4)}.  Thereby,  we make three different assumptions for the log likelihood ratio  \underline{L}_{\rm APP}^{(I=0)}=\underline{L}_{\rm K}.

Decoding example  \rm (5a)  for the  {\rm RC} \ (4, 1, 4)

\text{Decoding example (5a):}

\underline{L}_{\rm APP}^{(I=0)} = (+1, -1, +3, -1)\text{:}
L_{\rm E}(1) = -1+3-1 = +1\hspace{0.05cm},
L_{\rm E}(2) = +1+3-1 = +3\hspace{0.05cm},
L_{\rm E}(3) = +1-1 -1= -1\hspace{0.05cm},
L_{\rm E}(4) = +1-1 +3 = +3\hspace{0.05cm}
\hspace{0.3cm}\Rightarrow\hspace{0.3cm}\underline{L}_{\rm E}^{(I=0)}= (+1, +3, -1, +3) \hspace{0.3cm}\Rightarrow\hspace{0.3cm}\underline{L}_{\rm APP}^{(I=1)}=\underline{L}_{\rm APP}^{(I=0)}+ \underline{L}_{\rm E}^{(I=0)}= (+2, +2, +2, +2)
  • At the beginning  (I=0)  the positive  L_{\rm E}  values indicate  x_1 = x_2 = x_4 = 0  while  x_3 =1  is more likely   (because of negative  L_{\rm E}) .
  • Already after one iteration  (I=1)  all  L_{\rm APP}  values are positive   ⇒   the information bit is decoded as  u = 0.
  • Further iterations yield nothing.


\text{Decoding example (5b):}

Decoding example  \rm (5b)  for the  {\rm RC} \ (4, 1, 4)
\underline{L}_{\rm APP}^{(I=0)} = (+1, +1, -4, +1)\text{:}
\underline{L}_{\rm E}^{(I=0)} = \ (-2, -2, +3, -2)
  • Although three signs were wrong at the beginning,  after two iterations all  L_{\rm APP}  values are negative.
  • The information bit is decoded as  u = 1.


Decoding example  \rm (5c)  for the  {\rm RC} \ (4, 1, 4)

\text{Decoding example (5c):}

\underline{L}_{\rm APP}^{(I=0)} = (+1, +1, -3, +1)\text{:}
\underline{L}_{\rm E}^{(I=0)} = (-1, -1, +3, -1)
  • All  L_{\rm APP}  values are zero already after one iteration.
  • The information bit cannot be decoded,  although the initial situation was not much worse than with  \rm (5b).
  • Further iterations bring nothing here.


\text{(B)   Single parity–check code}   ⇒   {\rm SPC} \ (n, \ n \, -1, \ 2)

For each single parity–check code,  the number of  "ones"  in each code word is even.  Or,  put another way:   For each code word  \underline{x}  the  \text{Hamming weight}  w_{\rm H}(\underline{x})  is even.

\text{Definition:}  Let the code word  \underline{x}^{(–i)}  contain all symbols except  x_i   ⇒   vector of length  n–1.  Thus,  the   "extrinsic log likelihood ratio"  with respect to the  i–th symbol is,log likelihood ratiowhen  \underline{x}  was received: KORREKTUR: Thus,  the   "extrinsic log likelihood ratio"  will be with respect to the  i–th symbol, when  \underline{x}  has been received:

L_{\rm E}(i) = {\rm ln} \hspace{0.15cm}\frac{ {\rm Pr} \left [w_{\rm H}(\underline{x}^{(-i)})\hspace{0.15cm}{\rm is \hspace{0.15cm} even} \hspace{0.05cm} \vert\hspace{0.05cm}\underline{y} \hspace{0.05cm}\right ]}{ {\rm Pr} \left [w_{\rm H}(\underline{x}^{(-i)})\hspace{0.15cm}{\rm is \hspace{0.15cm} odd} \hspace{0.05cm} \vert \hspace{0.05cm}\underline{y} \hspace{0.05cm}\right ]} \hspace{0.05cm}.
  • As will be shown in the  "Exercise 4.4",  it is also possible to write for this:
L_{\rm E}(i) = 2 \cdot {\rm tanh}^{-1} \hspace{0.1cm} \left [ \prod\limits_{j \ne i}^{n} \hspace{0.15cm}{\rm tanh}(L_j/2) \right ] \hspace{0.3cm}{\rm with}\hspace{0.3cm}L_j = L_{\rm APP}(j) \hspace{0.05cm}.


\text{Example 6:}  We assume a  "Single parity–check code"  with  n = 3, \ k = 2   ⇒   briefly  {\rm SPC} \ (3, \ 2, \ 2).

  • The   2^k = 4   valid code words  \underline{x} = \{x_1,\ x_2,\ x_3\}  are for bipolar description   ⇒   x_i ∈ \{±1\}:
Decoding example for the  {\rm SPC} \ (3, 2, 2)
\underline{x}_0 \hspace{-0.05cm}=\hspace{-0.05cm} (+1\hspace{-0.03cm},\hspace{-0.05cm}+1\hspace{-0.03cm},\hspace{-0.05cm}+1)\hspace{-0.05cm},
\underline{x}_1 \hspace{-0.05cm}=\hspace{-0.05cm} (+1\hspace{-0.03cm},\hspace{-0.05cm} -1\hspace{-0.03cm},\hspace{-0.05cm} -1)\hspace{-0.05cm},
\underline{x}_2 \hspace{-0.05cm}=\hspace{-0.05cm} (-1\hspace{-0.03cm},\hspace{-0.05cm} +1\hspace{-0.03cm},\hspace{-0.05cm} -1)\hspace{-0.05cm},
\underline{x}_3 \hspace{-0.05cm}=\hspace{-0.05cm} (-1\hspace{-0.03cm},\hspace{-0.05cm} -1\hspace{-0.03cm},\hspace{-0.05cm} +1)\hspace{-0.05cm}.
So, the product   x_1 \cdot x_2 \cdot x_3   is always positive.
  • The table shows the decoding process for  \underline{L}_{\rm APP} = (+2.0, +0.4, \, –1.6).
  • The hard decision according to the signs of  L_{\rm APP}(i)  would yield here  (+1, +1, \, -1)   ⇒   no valid code word of  {\rm SP}(3, \ 2, \ 2).
  • On the right side of the table, the corresponding extrinsic log likelihood ratios are entered:
L_{\rm E}(1) = 2 \cdot {\rm tanh}^{-1} \hspace{0.05cm} \left [ {\rm tanh} (0.2) \cdot {\rm tanh} (-0.8)\hspace{0.05cm}\right ] = -0.131\hspace{0.05cm},
L_{\rm E}(2) =2 \cdot {\rm tanh}^{-1} \hspace{0.05cm} \left [ {\rm tanh} (1.0) \cdot {\rm tanh} (-0.8)\hspace{0.05cm}\right ] = -0.518\hspace{0.05cm},
L_{\rm E}(3) =2 \cdot {\rm tanh}^{-1} \hspace{0.05cm} \left [ {\rm tanh} (1.0) \cdot {\rm tanh} (0.2)\hspace{0.05cm}\right ] = +0.151\hspace{0.05cm}.

The second equation can be interpreted as follows:

  1. L_{\rm APP}(1) = +2.0  and  L_{\rm APP}(3) = \, -1.6  state that the first bit is more likely to be  "+1"  than  "-1"  and the third bit is more likely to be  "-1"  than  "+1".
  2. The reliability  (the amount)  is slightly greater for the first bit than for the third.
  3. The extrinsic information   L_{\rm E}(2) = \, -0.518  considers only the information of bit 1 and bit 3 about bit 2.
  4. From their point of view,  the second bit is  "-1"  with reliability  0.518.
  5. The log likelihood ratio  L_{\rm APP}^{I=0}(2) = +0.4  derived from the received value  y_2  has suggested  "+1"  for the second bit.
  6. But the discrepancy is resolved here already in the iteration  I = 1.  The decision falls here for the code word  \underline{x}_1 \hspace{-0.05cm}=\hspace{-0.05cm} (+1\hspace{-0.03cm},\hspace{-0.05cm} -1\hspace{-0.03cm},\hspace{-0.05cm} -1)\hspace{-0.05cm},.
  7. For   0.518 < L_{\rm APP}(2) < 1.6   the result  \underline{x}_1  would only be available after several iterations.  For  L_{\rm APP}(2) > 1.6  on the other hand,  the decoder returns  \underline{x}_0.


BCJR decoding: Forward-backward algorithm


An example of iterative decoding of convolutional codes is the  »BCJR algorithm«,  named after its inventors L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv   ⇒   [BCJR74][2].  The algorithm has many parallels to the seven-year older Viterbi decoding,  but also some significant differences:

  • The Viterbi algorithm  (in its original form)  cannot process soft information.  In contrast,  the BCJR algorithm specifies a reliability value for each individual bit at each iteration,  which is taken into account in subsequent iterations.


The figure is intended to illustrate  (in an almost inadmissibly simplified way)  the different approach of Viterbi and BCJR algorithm.  This is based on a convolutional code with memory  m = 1  and length  L = 4   ⇒   total length (with termination)  L' = 5.

Comparison of Viterbi and BCJR algorithm
  • Viterbi searches and finds the most likely path from  {\it \Gamma}_0(S_0)  to  {\it \Gamma}_5(S_0), namely  S_0 → S_1 → S_0 → S_0 → S_1→ S_0 . We refer to the sample solution to  "Exercise 3.9Z".
  • The sketch for the BCJR algorithm illustrates the extraction of the extrinsic  L–value for the third symbol   ⇒   L_{\rm E}(3).  The relevant area in the trellis is shaded:
  • Working through the trellis diagram in the forward direction,  one obtains in the same way as Viterbi the metrics  \alpha_1, \ \alpha_2, \ \text{...}\hspace{0.05cm} , \ \alpha_5.  To calculate  L_{\rm E}(3)  one needs from this  \alpha_2.
  • Then we traverse the trellis diagram backwards  (i.e. from right to left)  and thus obtain the metrics  \beta_4, \ \beta_3, \ \text{...}\hspace{0.05cm} , \ \beta_0  corresponding to the sketch below.
  • The extrinsic log likelihood ratio  L_{\rm E}(3)  is obtained from the forward direction metric  \alpha_2  and the backward direction metric  \beta_3  and the a-priori information  \gamma_3  about the symbol  i = 3.

Basic structure of concatenated coding systems


The most important communication systems of the last years use two different channel codes.  One speaks then of  »concatenated codes«.  Even with relatively short component codes  \mathcal{C}_1  and  \mathcal{C}_2  the concatenated code  \mathcal{C}  results in a sufficiently large code word length  n,  which is known to be necessary to approach the channel capacity.

To begin with,  here are a few examples from mobile communications:

  • In  \rm GSM  ("Global System for Mobile Communications",  the second generation of mobile communication systems),  the data bit rate is first increased from  9. 6 \ \rm kbit/s  to  12 \ \rm kbit/s  in order to enable error detection in circuit-switched networks as well.  This is followed by a punctured convolutional code with the output bit rate  22.8 \ \rm kbit/s.  The total code rate is thus about 0.421.
  • In 4G mobile communication systems  \rm LTE  ("Long Term Evolution"),  a convolutional code is used for short control signals,  turbo code for the longer payload data.


The graph shows the basic structure of a parallel concatenated coding system.  All vectors consist of  n  elements: 

\underline{L} = (L(1), \ \text{...}\hspace{0.05cm} , \ L(n)).
Parallel concatenated codes

So the calculation of all  L–values is done on symbol level.  Not shown here is the  \rm interleaver,  which is mandatory e.g. for turbo codes.

  1. The sequences  \underline{x}_1  and  \underline{x}_2  are combined to form the vector  \underline{x}  for joint transmission over the channel by a multiplexer.
  2. At the receiver, the sequence  \underline{y}  is again decomposed into the parts  \underline{y}_1  and  \underline{y}_2.  From this the channel L–values  \underline{L}_{\rm K,\hspace{0.05cm}1}  and  \underline{L}_{\rm K,\hspace{0.05cm}2}  are formed.
  3. The symbol-wise decoder determines the extrinsic log likelihood ratios   \underline{L}_{\rm E,\hspace{0.05cm} 1}  and  \underline{L}_{\rm E,\hspace{0.05cm} 2}, which are also the a-priori information  \underline{L}_{\rm A,\hspace{0.05cm} 2}  and  \underline{L}_{\rm A,\hspace{0.05cm} 1}  for the other decoder.
  4. After a sufficient number of iterations  (i.e. when a termination criterion is met),  the vector  \underline{L}_{\rm APP}  of a-posteriori values is available at the decoder output.
  5. In the example,  the value in the upper branch is taken arbitrarily. 
  6. However,  the lower log likelihood ratio would also be possible.

The above model also applies in particular to the decoding of turbo codes according to the chapter  "Basic structure of a turbo code".

Exercises for the chapter


Exercise 4.1: Log Likelihood Ratio

Exercise 4.1Z: Log Likelihood Ratio at the BEC Model

Exercise 4.2: Channel Log Likelihood Ratio at AWGN

Exercise 4.3: Iterative Decoding at the BSC

Exercise 4.3Z: Conversions of L-value and S-value

Exercise 4.4: Extrinsic L-values at SPC

Exercise 4.4Z: Supplement to Exercise 4.4

Exercise 4.5: On the Extrinsic L-values again

Exercise 4.5Z: Tangent Hyperbolic and Inverse

References

  1. Bossert, M.:  Channel Coding for Telecommunications.  Chichester:  Wiley,  1999.  ISBN:  978-0-471-98277-7.
  2. Bahl, L.R.; Cocke, J.; Jelinek, F.; Raviv, J.:  Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate.  In: IEEE Transactions on Information Theory, Vol. IT-20, pp. 284-287, 1974.