Channel Coding/Information Theoretical Limits of Channel Coding

From LNTwww
 
{{Header
|Untermenü=Binary Block Codes for Channel Coding
|Vorherige Seite=Limits for Block Error Probability
|Nächste Seite=Some Basics of Algebra
}}
  
== Channel coding theorem and channel capacity ==
<br>
We further consider a binary block code with&nbsp;
*$k$&nbsp; information bits per block,
*code words of length&nbsp; $n$,  
*resulting in the code rate&nbsp; $R=k/n$&nbsp; with the unit&nbsp; "information bit/code symbol".<br>
  
As early as 1948,&nbsp; the ingenious information theorist&nbsp; [https://en.wikipedia.org/wiki/Claude_Shannon $\text{Claude E. Shannon}$]&nbsp; dealt intensively with the correctability of such codes and gave for each channel a limit which results from information-theoretical considerations alone.&nbsp; To this day,&nbsp; no code has been found which exceeds this limit,&nbsp; and this will remain so.<br>
  
{{BlaueBox|TEXT= 
$\text{Shannon's channel coding theorem:}$&nbsp;  For each channel with channel capacity&nbsp; $C > 0$,&nbsp; there always exists&nbsp; (at least)&nbsp; one code whose error probability approaches zero,&nbsp; as long as the code rate&nbsp; $R$&nbsp; is smaller than the channel capacity&nbsp; $C$.&nbsp; The prerequisite is that the block length of this code satisfies
:$$n \to \infty.$$}}<br>
  
Notes:&nbsp;  
*The statement&nbsp; "the error probability approaches zero"&nbsp; is not identical with the statement&nbsp; "the transmission is error-free".&nbsp; Example: &nbsp; even in an infinitely long sequence,&nbsp; finitely many symbols may be falsified.
*For some channels,&nbsp; the error probability goes towards zero even with&nbsp; $R=C$&nbsp; (but not for all).
  
The inverse of the channel coding theorem is also true and states:<br>

{{BlaueBox|TEXT= 
$\text{Inverse:}$&nbsp;  If the code rate&nbsp; $R$&nbsp; is larger than the channel capacity&nbsp; $C$,&nbsp; then an arbitrarily small error probability cannot be achieved in any case.}}<br>

To derive and calculate the channel capacity,&nbsp; we first assume a digital channel with&nbsp; $M_x$&nbsp; possible input values&nbsp; $x_i$&nbsp; and&nbsp; $M_y$&nbsp; possible output values&nbsp; $y_j$.&nbsp; Then the mean mutual information&nbsp; &ndash; in short,&nbsp; the&nbsp; [[Information_Theory/Different_Entropy_Measures_of_Two-Dimensional_Random_Variables#Mutual_information_between_two_random_variables|$\text{mutual information}$]]&nbsp; &ndash;&nbsp; between the random variable&nbsp; $x$&nbsp; at the channel input and the random variable&nbsp; $y$&nbsp; at its output is:

::<math>I(x; y) =\sum_{i= 1 }^{M_X} \hspace{0.15cm}\sum_{j= 1 }^{M_Y} \hspace{0.15cm}{\rm Pr}(x_i, y_j) \cdot {\rm log_2 } \hspace{0.15cm}\frac{{\rm Pr}(y_j \hspace{0.05cm}| \hspace{0.05cm}x_i)}{{\rm Pr}(y_j)} =  \sum_{i= 1 }^{M_X} \hspace{0.15cm}\sum_{j= 1 }^{M_Y}\hspace{0.15cm}{\rm Pr}(y_j \hspace{0.05cm}| \hspace{0.05cm}x_i) \cdot {\rm Pr}(x_i) \cdot {\rm log_2 } \hspace{0.15cm}\frac{{\rm Pr}(y_j \hspace{0.05cm}| \hspace{0.05cm}x_i)}{\sum_{k= 1}^{M_X} {\rm Pr}(y_j \hspace{0.05cm}| \hspace{0.05cm}x_k) \cdot {\rm Pr}(x_k)}
  \hspace{0.05cm}.</math>
  
In the transition from the first to the second equation,&nbsp; [[Theory_of_Stochastic_Signals/Statistical_Dependence_and_Independence#Conditional_Probability|$\text{Bayes' theorem}$]]&nbsp; and the&nbsp; [[Theory_of_Stochastic_Signals/Statistical_Dependence_and_Independence#Inference_probability|$\text{theorem of total probability}$]]&nbsp; were used.

Further,&nbsp; it should be noted:
*The&nbsp; "binary logarithm"&nbsp; is denoted here by&nbsp; "log<sub>2</sub>".&nbsp; In places,&nbsp; we also use&nbsp; "ld"&nbsp; ("logarithm dualis")&nbsp; for this in our learning tutorial.
*In contrast to the book&nbsp; [[Information_Theory|$\text{Information Theory}$]],&nbsp; in the following we do not distinguish between the random variable&nbsp; $($upper case letters&nbsp; $X$&nbsp; resp.&nbsp; $Y)$&nbsp; and its realizations&nbsp; $($lower case letters&nbsp; $x$&nbsp; resp.&nbsp; $y)$.<br>
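The double sum above can be evaluated numerically for any discrete channel.&nbsp; The following Python sketch&nbsp; (the function name and argument layout are our own,&nbsp; not from LNTwww)&nbsp; implements the second form of the equation,&nbsp; with the total probability&nbsp; ${\rm Pr}(y_j)$&nbsp; in the denominator:

```python
import math

def mutual_information(p_x, p_y_given_x):
    """I(x;y) in bit for a discrete channel.

    p_x:         list of input probabilities Pr(x_i)
    p_y_given_x: matrix with p_y_given_x[i][j] = Pr(y_j | x_i)
    """
    m_x, m_y = len(p_x), len(p_y_given_x[0])
    # theorem of total probability: Pr(y_j) = sum_k Pr(y_j|x_k) * Pr(x_k)
    p_y = [sum(p_y_given_x[k][j] * p_x[k] for k in range(m_x)) for j in range(m_y)]
    i_xy = 0.0
    for i in range(m_x):
        for j in range(m_y):
            if p_x[i] > 0 and p_y_given_x[i][j] > 0:
                i_xy += p_y_given_x[i][j] * p_x[i] * math.log2(p_y_given_x[i][j] / p_y[j])
    return i_xy

# BSC with falsification probability eps = 0.1 and equally probable inputs:
eps = 0.1
I = mutual_information([0.5, 0.5], [[1 - eps, eps], [eps, 1 - eps]])
print(round(I, 3))   # 0.531
```

For a binary symmetric channel with&nbsp; $\varepsilon = 0.1$&nbsp; and equally probable inputs,&nbsp; this reproduces the capacity value&nbsp; $C(0.1) = 0.531$&nbsp; derived in the next section.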
  
{{BlaueBox|TEXT= 
$\text{Definition:}$&nbsp;  The&nbsp; &raquo;'''channel capacity'''&laquo;&nbsp; introduced by Shannon gives the maximum mutual information&nbsp; $I(x; y)$&nbsp; between the input variable&nbsp; $x$&nbsp; and the output variable&nbsp; $y$:
::<math>C = \max_{{{\rm Pr}(x_i)}} \hspace{0.1cm} I(X; Y) \hspace{0.05cm}.</math>
*The pseudo&ndash;unit&nbsp; "bit/channel use"&nbsp; must be added.}}<br>
  
:Since the maximization of the mutual information&nbsp; $I(x; y)$&nbsp; must be done over all possible&nbsp; (discrete)&nbsp; input distributions&nbsp; ${\rm Pr}(x_i)$,
::&raquo;'''the channel capacity is independent of the input and is thus a pure channel parameter'''&laquo;.<br>
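The maximization over all input distributions&nbsp; ${\rm Pr}(x_i)$&nbsp; can be illustrated by a crude grid search.&nbsp; As an example we take a binary symmetric channel with falsification probability&nbsp; $\varepsilon = 0.1$,&nbsp; which is treated in detail in the next section&nbsp; (a sketch;&nbsp; all names are our own):

```python
import math

def mi_bsc(p0, eps):
    """Mutual information I(x;y) of a BSC with Pr(x=0) = p0 and error prob. eps."""
    p = [p0, 1 - p0]
    w = [[1 - eps, eps], [eps, 1 - eps]]      # transition matrix Pr(y_j | x_i)
    # total probability Pr(y_j):
    py = [sum(w[i][j] * p[i] for i in range(2)) for j in range(2)]
    return sum(w[i][j] * p[i] * math.log2(w[i][j] / py[j])
               for i in range(2) for j in range(2) if p[i] > 0 and w[i][j] > 0)

# crude maximization over the input distribution Pr(x=0):
eps = 0.1
best = max(mi_bsc(p0 / 1000, eps) for p0 in range(1, 1000))
print(round(best, 3))   # 0.531 -> maximum is attained at Pr(x=0) = 0.5
```

The search confirms that for this symmetric channel the maximum lies at equally probable input symbols.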
  
== Channel capacity of the BSC model ==
<br>
We now apply these definitions to the&nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#Binary_Symmetric_Channel_.E2.80.93_BSC|$\text{BSC model}$]]&nbsp; ("Binary Symmetric Channel"):
  
::<math>I(x; y) = {\rm Pr}(y = 0 \hspace{0.05cm}| \hspace{0.05cm}x = 0) \cdot {\rm Pr}(x = 0)  
\cdot {\rm log_2 } \hspace{0.15cm}\frac{{\rm Pr}(y = 0 \hspace{0.05cm}| \hspace{0.05cm}x = 0)}{{\rm Pr}(y = 0)}
 + {\rm Pr}(y = 1 \hspace{0.05cm}| \hspace{0.05cm}x = 0) \cdot {\rm Pr}(x = 0)  
\cdot {\rm log_2 } \hspace{0.15cm}\frac{{\rm Pr}(y = 1 \hspace{0.05cm}| \hspace{0.05cm}x = 0)}{{\rm Pr}(y = 1)}
</math>
::<math>\hspace{1.45cm}  +  \hspace{0.15cm}{\rm Pr}(y = 0 \hspace{0.05cm}| \hspace{0.05cm}x = 1) \cdot {\rm Pr}(x = 1)  
\cdot {\rm  log_2  } \hspace{0.15cm}\frac{{\rm Pr}(y = 0 \hspace{0.05cm}| \hspace{0.05cm}x = 1)}{{\rm Pr}(y = 0)}
 + {\rm Pr}(y = 1 \hspace{0.05cm}| \hspace{0.05cm}x = 1) \cdot {\rm Pr}(x = 1)  
\cdot {\rm log_2 } \hspace{0.15cm}\frac{{\rm Pr}(y = 1 \hspace{0.05cm}| \hspace{0.05cm}x = 1)}{{\rm Pr}(y = 1)}
 \hspace{0.05cm}.</math>
  
The channel capacity is arrived at by the following considerations:
[[File:KC_T_1_7_Zusatz_version2.png|right|frame|Binary symmetric channel|class=fit]]
*Maximization with respect to the input distribution leads to equally probable symbols:
::<math>{\rm Pr}(x = 0) = {\rm Pr}(x = 1) = 1/2
  \hspace{0.05cm}.</math>
  
*Due to the symmetry evident from the model,&nbsp; the following then holds simultaneously:
::<math>{\rm Pr}(y = 0) = {\rm Pr}(y = 1) = 1/2
  \hspace{0.05cm}.</math>
  
*We also take into account the BSC transition probabilities:
::<math>{\rm Pr}(y = 1 \hspace{0.05cm}| \hspace{0.05cm}x = 0) = {\rm Pr}(y = 0 \hspace{0.05cm}| \hspace{0.05cm}x = 1) = \varepsilon \hspace{0.05cm},</math>
::<math>{\rm Pr}(y = 0 \hspace{0.05cm}| \hspace{0.05cm}x = 0) = {\rm Pr}(y = 1 \hspace{0.05cm}| \hspace{0.05cm}x = 1) = 1-\varepsilon  \hspace{0.05cm}.</math>
  
*After combining two terms each,&nbsp; we thus obtain:
::<math>C  =  2 \cdot 1/2 \cdot \varepsilon \cdot {\rm log_2 } \hspace{0.15cm}\frac{\varepsilon}{1/2 }+  
2 \cdot 1/2 \cdot (1- \varepsilon)  \cdot {\rm log_2 } \hspace{0.15cm}\frac{1- \varepsilon}{1/2 }
= \varepsilon \cdot {\rm log_2 } \hspace{0.15cm}2  - \varepsilon \cdot {\rm log_2 } \hspace{0.15cm} \frac{1}{\varepsilon }+  (1- \varepsilon) \cdot {\rm log_2 } \hspace{0.15cm} 2 - (1- \varepsilon) \cdot {\rm log_2 } \hspace{0.15cm}\frac{1}{1- \varepsilon}</math>
::<math>\Rightarrow \hspace{0.3cm} C  =  1 - H_{\rm bin}(\varepsilon). </math>
  
*Here,&nbsp; the&nbsp; "binary entropy function"&nbsp; is used:
::<math>H_{\rm bin}(\varepsilon) =  \varepsilon \cdot {\rm log_2 } \hspace{0.15cm}\frac{1}{\varepsilon}+  
(1- \varepsilon)  \cdot {\rm log_2 } \hspace{0.15cm}\frac{1}{1- \varepsilon}\hspace{0.05cm}.</math>
  
The right graph shows the BSC channel capacity depending on the falsification probability&nbsp; $\varepsilon$.&nbsp; On the left,&nbsp; for comparison,&nbsp; the binary entropy function is shown,&nbsp; which has already been defined in the chapter&nbsp; [[Information_Theory/Discrete_Memoryless_Sources#Binary_entropy_function|$\text{Discrete Memoryless Sources}$]]&nbsp; of the book&nbsp; "Information Theory".<br>
[[File:P ID2379 KC T 1 7 S2 v2.png|right|frame|Channel capacity of the BSC model|class=fit]]
One can see from this representation:
*The falsification probability&nbsp; $\varepsilon$&nbsp; leads to the channel capacity&nbsp; $C(\varepsilon)$.&nbsp; According to the&nbsp; [[Channel_Coding/Information_Theoretical_Limits_of_Channel_Coding#Channel_coding_theorem_and_channel_capacity|$\text{channel coding theorem}$]],&nbsp; error-free decoding is only possible if the code rate satisfies&nbsp; $R \le C(\varepsilon)$.
  
*With&nbsp; $\varepsilon = 10\%$,&nbsp; error-free decoding is impossible if the code rate&nbsp; $R > 0.531$&nbsp; $($because&nbsp; $C(0.1) = 0.531)$.
*With&nbsp; $\varepsilon = 50\%$,&nbsp; error-free decoding is impossible even if the code rate is arbitrarily small&nbsp; $($because&nbsp; $C(0.5) = 0)$.<br>
  
*From an information-theoretical point of view,&nbsp; $\varepsilon = 1$&nbsp; $($inversion of all bits$)$&nbsp; is equivalent to&nbsp; $\varepsilon = 0$&nbsp; $($error-free transmission$)$.&nbsp; Similarly,&nbsp; $\varepsilon = 0.9$&nbsp; is equivalent to&nbsp; $\varepsilon = 0.1$.
*Error-free decoding is achieved here by swapping the zeros and ones,&nbsp; i.e. by a so-called&nbsp; "mapping".<br>
<br clear=all>
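The observations above are easy to check numerically.&nbsp; A short Python sketch&nbsp; (function names are our own):

```python
import math

def h_bin(eps):
    """Binary entropy function H_bin(eps) in bit."""
    if eps in (0.0, 1.0):
        return 0.0
    return eps * math.log2(1 / eps) + (1 - eps) * math.log2(1 / (1 - eps))

def bsc_capacity(eps):
    """BSC channel capacity C = 1 - H_bin(eps) in bit/channel use."""
    return 1.0 - h_bin(eps)

print(round(bsc_capacity(0.1), 3))                          # 0.531
print(bsc_capacity(0.5))                                    # 0.0
print(abs(bsc_capacity(0.9) - bsc_capacity(0.1)) < 1e-12)   # True (mapping)
```

The last line confirms that&nbsp; $\varepsilon$&nbsp; and&nbsp; $1-\varepsilon$&nbsp; lead to the same channel capacity.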
== Channel capacity of the AWGN model ==
<br>
We now consider the&nbsp; [[Digital_Signal_Transmission/System_Components_of_a_Baseband_Transmission_System#Transmission_channel_and_interference|$\text{AWGN channel}$]]&nbsp; ("Additive White Gaussian Noise").
[[File:P ID2372 KC T 1 7 S3a.png|right|frame|AWGN channel model|class=fit]]
  
Here,&nbsp; the output signal is&nbsp; $y = x + n$,&nbsp; where&nbsp; $n$&nbsp; describes a&nbsp; [[Theory_of_Stochastic_Signals/Gaussian_Distributed_Random_Variables|$\text{Gaussian distributed random variable}$]]&nbsp; whose expected values&nbsp; ("moments")&nbsp; are:
#&nbsp;${\rm E}[n] = 0,$
#&nbsp;${\rm E}[n^2] = P_n.$
 
  
Damit ergibt sich unabhängig vom Eingangssignal <i>x</i> ein kontinuierliches Ausgangssignal <i>y</i>, und in der [http://en.lntwww.de/Kanalcodierung/Informationstheoretische_Grenzen_der_Kanalcodierung#Kanalcodierungstheorem_und_Kanalkapazit.C3.A4t Gleichung] für die Transinformation ist <i>M<sub>y</sub></i> &#8594; &#8734; einzusetzen.<br>
+
Thus,&nbsp; regardless of the input signal&nbsp; $x$&nbsp; $($analog or digital$)$,&nbsp; there is always a continuous-valued output signal&nbsp; $y$,&nbsp; and in the&nbsp; [[Channel_Coding/Information_Theoretical_Limits_of_Channel_Coding#Channel_coding_theorem_and_channel_capacity|$\text{equation for mutual information}$]]&nbsp; is to be used:&nbsp;  
 +
:$$M_y \to\infty.$$
  
The calculation of the AWGN channel capacity is outlined here only in keywords.&nbsp; The exact derivation can be found in the fourth main chapter&nbsp; "Discrete Value Information Theory"&nbsp; of the book&nbsp; [[Information_Theory|$\text{Information Theory}$]].
*The input quantity&nbsp; $x$&nbsp; optimized with respect to maximum mutual information will certainly be continuous-valued,&nbsp; that is,&nbsp; for the AWGN channel not only&nbsp; $M_y \to\infty$&nbsp; but also&nbsp; $M_x \to\infty$&nbsp; holds.
  
*While for a discrete-valued input the optimization is to be done over all probabilities&nbsp; ${\rm Pr}(x_i)$,&nbsp; it is now done using the&nbsp; [[Theory_of_Stochastic_Signals/Gaussian_Distributed_Random_Variables#Probability_density_function_.E2.80.93_Cumulative_density_function|$\text{probability density function&nbsp; $\rm (PDF)$}$]]&nbsp; $f_x(x)$&nbsp; of the input signal under the constraint of&nbsp; [[Digital_Signal_Transmission/Optimization_of_Baseband_Transmission_Systems#Power_and_peak_limitation|$\text{power limitation}$]]:
::<math>C = \max_{f_x(x)} \hspace{0.1cm} I(x; y)\hspace{0.05cm},\hspace{0.3cm}\text{where the following must hold:}\hspace{0.15cm} {\rm E} \left [ x^2 \right ] \le P_x  \hspace{0.05cm}.</math>
  
*The optimization also yields a Gaussian distribution for the input PDF&nbsp; &#8658;&nbsp; $x$,&nbsp; $n$&nbsp; and&nbsp; $y$&nbsp; are Gaussian distributed according to the density functions&nbsp; $f_x(x)$,&nbsp; $f_n(n)$&nbsp; and&nbsp; $f_y(y)$.&nbsp; We denote the corresponding powers by&nbsp; $P_x$,&nbsp; $P_n$&nbsp; and&nbsp; $P_y$.<br>
  
*After a longer calculation,&nbsp; one obtains for the channel capacity,&nbsp; using the&nbsp; "binary logarithm"&nbsp; $\log_2(\cdot)$&nbsp; &ndash;&nbsp; again with the pseudo-unit&nbsp; "bit/channel use":
::<math>C = {\rm log_2 } \hspace{0.15cm} \sqrt{\frac{P_y}{P_n }} = {\rm log_2 } \hspace{0.15cm} \sqrt{\frac{P_x + P_n}{P_n }} =  {1}/{2 } \cdot {\rm log_2 } \hspace{0.05cm}\left ( 1 + \frac{P_x}{P_n } \right )\hspace{0.05cm}.</math>
  
*If&nbsp; $x$&nbsp; describes a discrete-time signal with symbol rate&nbsp; $1/T_{\rm S}$,&nbsp; it must be band-limited to&nbsp; $B = 1/(2T_{\rm S})$,&nbsp; and the same bandwidth&nbsp; $B$&nbsp; must be applied to the noise signal&nbsp; $n$&nbsp; &#8658;&nbsp; "noise bandwidth":
::<math>P_x  = \frac{E_{\rm S}}{T_{\rm S} } \hspace{0.05cm}, \hspace{0.4cm} P_n  = \frac{N_0}{2T_{\rm S} }\hspace{0.05cm}. </math>
  
*Thus,&nbsp; the AWGN channel capacity can also be expressed by the&nbsp; &raquo;'''transmitted energy per symbol'''&laquo;&nbsp; $(E_{\rm S})$&nbsp; and the&nbsp; &raquo;'''noise power density'''&laquo;&nbsp; $(N_0)$:
::<math>C =  {1}/{2 } \cdot {\rm log_2 } \hspace{0.05cm}\left ( 1 + {2 E_{\rm S}}/{N_0 } \right )\hspace{0.05cm}, \hspace{1.9cm}
\text {unit:}\hspace{0.3cm} \frac{\rm bit}{\rm channel\:use}\hspace{0.05cm}.</math>
*The following equation gives the channel capacity per unit time&nbsp; $($denoted by&nbsp; $^{\star})$:
::<math>C^{\star} = \frac{C}{T_{\rm S} } =    B \cdot {\rm log_2 } \hspace{0.05cm}\left ( 1 + {2  E_{\rm S}}/{N_0 } \right )\hspace{0.05cm}, \hspace{0.8cm}
\text {unit:} \hspace{0.3cm} \frac{\rm bit}{\rm time\:unit}\hspace{0.05cm}.</math><br>
  
{{GraueBox|TEXT= 
$\text{Example 1:}$&nbsp; 
*For&nbsp; $E_{\rm S}/N_0  = 7.5$&nbsp; &#8658;&nbsp; $10 \cdot \lg \, E_{\rm S}/N_0 = 8.75 \, \rm dB$,&nbsp; the channel capacity is
:$$C = {1}/{2 } \cdot {\rm log_2 } \hspace{0.05cm} (16) = 2 \, \rm  bit/channel\:use.$$
*For a channel with the&nbsp; (physical)&nbsp; bandwidth&nbsp; $B = 4 \, \rm kHz$,&nbsp; which corresponds to the sampling rate&nbsp; $f_{\rm A} = 8\, \rm kHz$,&nbsp; this also gives
:$$C^\star = 16 \, \rm kbit/s.$$}}
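The numbers of Example 1 can be verified with a few lines of Python&nbsp; (a sketch;&nbsp; function and variable names are our own):

```python
import math

def awgn_capacity(es_n0):
    """AWGN channel capacity C = 1/2 * log2(1 + 2*Es/N0) in bit/channel use."""
    return 0.5 * math.log2(1 + 2 * es_n0)

es_n0 = 7.5               # corresponds to 10*lg(7.5) = 8.75 dB
C = awgn_capacity(es_n0)  # 0.5 * log2(16)
f_A = 8000.0              # sampling rate 1/T_S for bandwidth B = 4 kHz
C_star = C * f_A          # capacity per unit time, C* = C/T_S, in bit/s
print(C)                  # 2.0
print(C_star)             # 16000.0
```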
  
*However,&nbsp; a comparison of different coding methods at constant&nbsp; "energy per transmitted symbol"&nbsp; &rArr;&nbsp; $E_{\rm S}$&nbsp; is not fair.
*Rather,&nbsp; for this comparison the&nbsp; "energy per source bit"&nbsp; &rArr;&nbsp; $E_{\rm B}$&nbsp; should be fixed.&nbsp; The following relationship applies:
::<math>E_{\rm S} = R \cdot E_{\rm B}  \hspace{0.3cm} \Rightarrow \hspace{0.3cm}
E_{\rm B} = E_{\rm S} / R \hspace{0.05cm}. </math>
  
{{BlaueBox|TEXT= 
$\text{Channel coding theorem for the AWGN channel:}$&nbsp;
*Error-free decoding&nbsp; $($for infinitely long blocks&nbsp; &#8658;&nbsp; $n \to \infty)$&nbsp; is always possible if the code rate&nbsp; $R$&nbsp; is smaller than the channel capacity&nbsp; $C$:
::<math>R < C = {1}/{2 } \cdot {\rm log_2 } \hspace{0.05cm}\left ( 1 +2 \cdot R\cdot E_{\rm B}/{N_0 } \right )\hspace{0.05cm}.</math>
*For each code rate&nbsp; $R$,&nbsp; the required&nbsp; $E_{\rm B}/N_0$&nbsp; of the AWGN channel can thus be determined so that error-free decoding is just possible.
*One obtains for the limiting case&nbsp; $R = C$:
::<math>{E_{\rm B} }/{N_0} >  \frac{2^{2R}-1}{2R } \hspace{0.05cm}.</math>}}
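This bound is easy to evaluate numerically.&nbsp; A Python sketch&nbsp; (function names are our own);&nbsp; for&nbsp; $R \to 0$&nbsp; the bound tends to&nbsp; $\ln 2$,&nbsp; i.e.&nbsp; $10 \cdot \lg \, (\ln 2) \approx -1.59 \, \rm dB$:

```python
import math

def min_eb_n0(R):
    """Minimum Eb/N0 (linear) at the limit R = C: (2^(2R) - 1) / (2R)."""
    return (2**(2 * R) - 1) / (2 * R)

def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10 * math.log10(x)

print(round(to_db(min_eb_n0(0.5)), 2))   # 0.0  -> rate 1/2 requires E_B > N_0
# For R -> 0 the bound tends to ln(2):
print(round(to_db(min_eb_n0(1e-9)), 2))  # -1.59
```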
  
The graph summarizes the result,&nbsp; with the ordinate&nbsp; $R$&nbsp; plotted on a linear scale and the abscissa&nbsp; $E_{\rm B}/{N_0 }$&nbsp; plotted logarithmically.
[[File:EN_KC_T_1_7_S3b_v2.png|right|frame|AWGN channel capacity|class=fit]]
*Error-free coding is not possible outside the blue area.
*The blue limit curve indicates the AWGN channel capacity&nbsp; $C$.<br>
  
From this graph and the above equation,&nbsp; the following can be deduced:
#Channel capacity&nbsp; $C$&nbsp; increases somewhat less than linearly with&nbsp; $10 \cdot \lg \, E_{\rm B}/N_0 $.&nbsp; In the graph,&nbsp; some selected function values are indicated as blue crosses.<br>
#If&nbsp; $10 \cdot \lg \, E_{\rm B}/N_0 < -1.59 \, \rm dB$,&nbsp; error-free decoding is impossible in principle.&nbsp; For the code rate&nbsp; $R = 0.5$,&nbsp; $10 \cdot \lg \, E_{\rm B}/N_0 > 0 \, \rm dB$&nbsp; is required &nbsp;&#8658;&nbsp; $E_{\rm B} > N_0$.<br>
#For all binary codes,&nbsp; $0 < R &#8804; 1$&nbsp; holds per se.&nbsp; Rates&nbsp; $R > 1$&nbsp; are only possible with non-binary codes.&nbsp; For example,&nbsp; the maximum possible code rate of a quaternary code is&nbsp; $R = \log_2 \, M_x = \log_2 \, 4 = 2$.<br>
#All one-dimensional modulation types &ndash; i.e. those methods that use only the in-phase or only the quadrature component, such as&nbsp; [[Digital_Signal_Transmission/Carrier_Frequency_Systems_with_Coherent_Demodulation#On.E2.80.93off_keying_.282.E2.80.93ASK.29|$\text{2&ndash;ASK}$]],&nbsp;  [[Digital_Signal_Transmission/Carrier_Frequency_Systems_with_Coherent_Demodulation#Binary_phase_shift_keying_.28BPSK.29 |$\text{BPSK}$]]&nbsp; and&nbsp; [[Digital_Signal_Transmission/Carrier_Frequency_Systems_with_Non-Coherent_Demodulation#Non-coherent_demodulation_of_binary_FSK_.282.E2.80.93FSK.29|$\text{2&ndash;FSK}$]]&nbsp; &ndash; must be in the blue area of the present graphic.<br>
#As shown in the chapter&nbsp; [[Information_Theory/AWGN_Channel_Capacity_for_Discrete_Input#Maximum_code_rate_for_QAM_structures|"Maximum code rate for QAM structures"]],&nbsp;  there is a&nbsp; "friendlier"&nbsp; limit curve for two-dimensional modulation types such as the&nbsp; [[Modulation_Methods/Quadrature_Amplitude_Modulation|"Quadrature Amplitude Modulation"]]. 
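The limit curve can also be reproduced numerically. The following Python sketch (the function name is chosen freely and is not part of the lecture) evaluates the bound&nbsp; $E_{\rm B}/N_0 > (2^{2R}-1)/(2R)$&nbsp; in dB for a few rates; for&nbsp; $R \to 0$&nbsp; it approaches the absolute Shannon limit&nbsp; $10 \cdot \lg(\ln 2) \approx -1.59 \, \rm dB$:

```python
import math

def min_ebn0_db(rate: float) -> float:
    """Minimum 10*lg(E_B/N_0) in dB for error-free decoding at code rate R
    (limiting case R = C of the AWGN channel coding theorem)."""
    return 10 * math.log10((2 ** (2 * rate) - 1) / (2 * rate))

# R -> 0 reproduces the absolute Shannon limit 10*lg(ln 2) = -1.59 dB;
# R = 0.5 requires exactly 0 dB (E_B = N_0); higher rates require more.
for r in (1e-6, 0.5, 1.0, 2.0):
    print(f"R = {r}: 10*lg(E_B/N_0) > {min_ebn0_db(r):.2f} dB")
```

The printed values correspond to points on the blue limit curve of the graph.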
 
== AWGN channel capacity for binary input signals ==
 
<br>
 
In this book we restrict ourselves mainly to binary codes,&nbsp; that is,&nbsp; to the Galois field&nbsp; ${\rm GF}(2^n)$.&nbsp; With this
[[File:P ID2374 KC T 1 7 S4a.png|right|frame|Conditional PDFs for AWGN channel and binary input]]
  
*firstly,&nbsp; the code rate is limited to the range&nbsp; $R &#8804; 1$,<br>
  
*secondly,&nbsp; also for&nbsp; $R &#8804; 1$&nbsp; not the whole blue region is available&nbsp; (see previous section).
  
*The now valid region results from the&nbsp; [[Channel_Coding/Information_Theoretical_Limits_of_Channel_Coding#Channel_coding_theorem_and_channel_capacity|$\text{general equation}$]]&nbsp; of mutual information by
#the parameters&nbsp; $M_x = 2$&nbsp; and&nbsp; $M_y \to \infty$,<br>
#bipolar signaling &nbsp; &#8658; &nbsp; $x=0$ &nbsp; &#8594; &nbsp; $\tilde{x} = +1$&nbsp; and&nbsp; $x=1$ &nbsp; &#8594; &nbsp; $\tilde{x} = -1$,<br>
#the transition from conditional probabilities&nbsp; ${\rm Pr}(\tilde{x}_i)$&nbsp; to conditional probability density functions,<br>
#replacing the sum by an integration.<br><br>
  
*The optimization of the source leads to equally probable symbols:
  
::<math>{\rm Pr}(\tilde{x} = +1) = {\rm Pr}(\tilde{x} = -1) = 1/2  \hspace{0.05cm}. </math>
  
*This gives for the maximum of the mutual information,&nbsp; i.e. for the channel capacity:  
  
::<math>C = {1}/{2} \cdot \int_{-\infty }^{+ \infty}
\left [ f_{y\hspace{0.05cm} |\hspace{0.05cm}\tilde{x} = +1}(y)\cdot {\rm log_2 } \hspace{0.15cm}\frac {f_{y\hspace{0.05cm} |\hspace{0.05cm}\tilde{x} = +1}(y)}{f_y(y)} +
f_{y\hspace{0.05cm} |\hspace{0.05cm}\tilde{x} = -1}(y)\cdot {\rm log_2 } \hspace{0.15cm}\frac {f_{y\hspace{0.05cm} |\hspace{0.05cm}\tilde{x} = -1}(y)}{f_y(y)}
\right ] \hspace{0.1cm}{\rm d}y
\hspace{0.05cm}.</math>
  
The integral cannot be solved in closed mathematical form;&nbsp; it can only be evaluated numerically.
[[File:EN_KC_T_1_7_S4b_v2.png|right|frame|AWGN channel capacity for binary input signals |class=fit]]
 
   
#The green curve shows the result.
#The blue curve gives for comparison the channel capacity for Gaussian distributed input signals,&nbsp; derived in the last section.
   
  
It can be seen:
*For&nbsp; $10 \cdot \lg \, E_{\rm B}/N_0 < 0 \, \rm dB$&nbsp; the two capacity curves differ only slightly.

*So,&nbsp; for binary bipolar input,&nbsp; compared to the optimum&nbsp; (Gaussian)&nbsp; input,&nbsp; the characteristic&nbsp; $10 \cdot \lg \, E_{\rm B}/N_0$&nbsp; only needs to be increased by about&nbsp; $0.1 \, \rm dB$&nbsp; to also allow the code rate&nbsp; $R = 0.5$.<br>
  
*From&nbsp; $10 \cdot \lg \, E_{\rm B}/N_0 \approx 6 \, \rm dB$&nbsp; on,&nbsp; the capacity&nbsp; $C = 1 \, \rm bit/channel\:use$&nbsp; of the AWGN channel for binary input is almost reached.

*In between,&nbsp; the green limit curve rises approximately exponentially.
<br clear=all>
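The capacity integral for bipolar input can be evaluated with very simple numerical means. The following Python sketch (names and step sizes chosen freely here; $\sigma$ denotes the noise standard deviation relative to the amplitudes $\pm 1$) approximates the integral by a Riemann sum:

```python
import math

def bpsk_awgn_capacity(sigma: float, y_max: float = 12.0, steps: int = 48000) -> float:
    """Numerical evaluation of the capacity integral for bipolar input
    (x~ = +-1, equally probable) and Gaussian noise with std deviation sigma."""
    def gauss(y: float, mean: float) -> float:
        # conditional PDF f_{y|x~=mean}(y)
        return math.exp(-(y - mean) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

    dy = 2 * y_max / steps
    total = 0.0
    for k in range(steps + 1):
        y = -y_max + k * dy
        f_p, f_m = gauss(y, +1.0), gauss(y, -1.0)
        f_y = 0.5 * (f_p + f_m)          # total PDF of y for Pr(+1) = Pr(-1) = 1/2
        if f_p > 0.0:                    # terms with vanishing PDF contribute nothing
            total += 0.5 * f_p * math.log2(f_p / f_y) * dy
        if f_m > 0.0:
            total += 0.5 * f_m * math.log2(f_m / f_y) * dy
    return total

# Small sigma (high E_S/N_0): C approaches 1 bit/channel use;
# large sigma: C tends towards zero.
print(bpsk_awgn_capacity(0.1), bpsk_awgn_capacity(1.0), bpsk_awgn_capacity(3.0))
```

With&nbsp; $E_{\rm S}/N_0 = 1/(2\sigma^2)$,&nbsp; such a routine reproduces the green curve point by point.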
== Common channel codes versus channel capacity ==
 
<br>
 
Now it shall be shown to what extent established channel codes approach the BPSK channel capacity&nbsp; (green curve).&nbsp; In the following graph the rate&nbsp; $R=k/n$&nbsp; of these codes or the capacity&nbsp; $C$&nbsp; (with the additional pseudo&ndash;unit&nbsp; "bit/channel use")&nbsp; is plotted as ordinate.
[[File:EN_KC_T_1_7_S5a.png|right|frame|Rates and required &nbsp;$E_{\rm B}/{N_0}$&nbsp; of different channel codes]]
 
 
Further,&nbsp; it is assumed:
 
*the AWGN channel,&nbsp; characterized by&nbsp; $10 \cdot \lg \, E_{\rm B}/N_0$&nbsp; in dB,&nbsp; and<br>
*a bit error rate of&nbsp; $\rm BER=10^{-5}$&nbsp; for all codes marked by crosses.<br>
  
$\text{Please note:}$&nbsp;
#The channel capacity curves always apply to&nbsp; $n \to \infty$&nbsp; and&nbsp; $\rm BER \to 0$.
#If this strict requirement&nbsp; "error-free"&nbsp; were also applied to the considered channel codes of finite code length&nbsp; $n$,&nbsp; this would always require&nbsp; $10 \cdot \lg \, E_{\rm B}/N_0 \to \infty$.
#However,&nbsp; this is an academic problem of little practical significance.&nbsp; For&nbsp; $\rm BER = 10^{-10}$,&nbsp; a qualitatively and also quantitatively similar graph would result.<br>
#For convolutional codes,&nbsp; the third identifier parameter has a different meaning than for block codes.&nbsp; For example,&nbsp; $\text{CC (2, 1, 32)}$&nbsp; indicates the memory&nbsp; $m = 32$.<br>
<br clear=all>
The following are some&nbsp; &raquo;'''explanations of the data'''&laquo;,&nbsp; taken from the lecture notes [Liv10]<ref name='Liv10'>Liva, G.: Channel Coding. Lecture manuscript, Chair of Communications Engineering, TU Munich and DLR Oberpfaffenhofen, 2010.</ref>:
#The points&nbsp; $\rm A$,&nbsp; $\rm B$&nbsp; and&nbsp; $\rm C$&nbsp; mark&nbsp; [[Channel_Coding/Examples_of_Binary_Block_Codes#Hamming_Codes|$\text{Hamming codes}$]]&nbsp; of different rates.&nbsp; For&nbsp; $\rm BER = 10^{-5}$,&nbsp; they all require&nbsp; $10 \cdot \lg \, E_{\rm B}/N_0 > 8 \, \rm dB$.
#$\rm D$&nbsp; denotes the binary&nbsp; [https://en.wikipedia.org/wiki/Binary_Golay_code $\text{Golay code}$]&nbsp; with rate&nbsp; $1/2$,&nbsp; and&nbsp; $\rm E$&nbsp; a&nbsp; [https://en.wikipedia.org/wiki/Reed-Muller_code $\text{Reed&ndash;Muller code}$].&nbsp; This very low rate code was used in 1971 on the Mariner 9 spacecraft.
#Marked by&nbsp; $\rm F$&nbsp; is a high rate&nbsp; [[Channel_Coding/Definition_and_Properties_of_Reed-Solomon_Codes|$\text{Reed&ndash;Solomon code}$]] &nbsp; $(R = 223/255 \approx 0.87)$ &nbsp;with a required&nbsp; $10 \cdot \lg \, E_{\rm B}/N_0 < 6 \, \rm dB$.
#The markers&nbsp; $\rm G$&nbsp; and&nbsp; $\rm H$&nbsp; denote exemplary&nbsp; [[Channel_Coding/Basics_of_Convolutional_Coding|$\text{convolutional codes}$]]&nbsp; of medium rate.&nbsp; The code&nbsp; $\rm G$&nbsp; was used as early as 1972 on the Pioneer 10 mission.
#The channel coding of the Voyager mission in the late 1970s is marked by&nbsp; $\rm I$.&nbsp; It is the concatenation of a&nbsp; $\text{CC (2, 1, 7)}$&nbsp; with a&nbsp; [[Channel_Coding/Definition_and_Properties_of_Reed-Solomon_Codes|$\text{Reed&ndash;Solomon code}$]].<br><br>
  
Much better results can be achieved with&nbsp;  &raquo;'''iterative decoding methods'''&laquo;&nbsp; (see fourth main chapter),&nbsp; as the second graph shows.
[[File:EN_KC_T_1_7_S5b.png|right|frame|Rates and required&nbsp; $E_{\rm B}/N_0$&nbsp; for iterative coding methods |class=fit]]
*This means: &nbsp;The individual marker points are much closer to the capacity curve&nbsp; $C_{\rm BPSK}$&nbsp; for digital input.<br>
  
*The solid blue Gaussian capacity curve is labeled&nbsp; $C_{\rm Gaussian}$.  
  
Here are some more explanations about this graph:
#Red crosses mark so-called&nbsp; [[Channel_Coding/The_Basics_of_Turbo_Codes|$\text{turbo codes}$]]&nbsp; according to&nbsp; [https://en.wikipedia.org/wiki/Consultative_Committee_for_Space_Data_Systems $\text{CCSDS}$]&nbsp; ("Consultative Committee for Space Data Systems"),&nbsp; each with&nbsp; $k = 6920$&nbsp; information bits and different code lengths&nbsp; $n$.
#These codes,&nbsp; invented by&nbsp; [http://perso.telecom-bretagne.eu/claudeberrou/ $\text{Claude Berrou}$]&nbsp; around 1990,&nbsp; can be decoded iteratively.&nbsp; The&nbsp; (red)&nbsp; marks are each less than&nbsp; $1 \, \rm dB$&nbsp; from the Shannon bound.<br>
#Similar behavior is shown by the&nbsp; [[Channel_Coding/The_Basics_of_Low-Density_Parity_Check_Codes|$\text{LDPC codes}$]]&nbsp; ("Low Density Parity-check Codes")&nbsp; marked by white rectangles,&nbsp; which have been used since 2006 in&nbsp; [https://en.wikipedia.org/wiki/DVB-S2 $\text{DVB&ndash;S2}$]&nbsp; ("Digital Video Broadcast over Satellite").
#These are well suited for iterative decoding using&nbsp; "factor graphs"&nbsp; and&nbsp; "exit charts"&nbsp; due to the sparse occupancy of the parity-check matrix&nbsp; $\boldsymbol {\rm H}$&nbsp; (with "ones").&nbsp; See [Hag02]<ref name='Hag02'>Hagenauer, J.: The Turbo Principle in Mobile Communications.&nbsp; In: Int. Symp. on Information Theory and Its Applications, Oct. 2010,&nbsp;  [http://wwwmayr.in.tum.de/konferenzen/Jass05/courses/4/papers/prof_hagenauer.pdf $\text{PDF document}$.]</ref><br>
#The black dots mark the&nbsp; [[Channel_Coding/The_Basics_of_Low-Density_Parity_Check_Codes|$\text{LDPC codes}$]]&nbsp; specified by CCSDS,&nbsp; all of which assume a constant number of information bits&nbsp; $(k = 16384)$.
#In contrast,&nbsp; for all white rectangles,&nbsp; the code length&nbsp; $n = 64800$&nbsp; is constant,&nbsp; while the number&nbsp; $k$&nbsp; of information bits changes according to the rate&nbsp; $R = k/n$.<br>
#Around the year 2000,&nbsp; many researchers had the ambition to approach the Shannon bound to within fractions of a&nbsp; $\rm dB$.&nbsp; The yellow cross marks such a result from [CFRU01]<ref name='CFRU01'>Chung, S.Y.; Forney Jr., G.D.; Richardson, T.J.; Urbanke, R.:&nbsp; On the Design of Low-Density Parity-Check Codes within 0.0045 dB of the Shannon Limit. – In: IEEE Communications Letters, vol. 5, no. 2 (2001), pp. 58–60.</ref>:&nbsp; an irregular rate $1/2$ LDPC code of code length&nbsp; $n = 2 \cdot10^6$&nbsp; was used.<br><br>
  
{{BlaueBox|TEXT= 
$\text{Conclusion:}$&nbsp; It should be noted that Shannon recognized and proved as early as 1948 that no one-dimensional modulation method can lie to the left of the solid AWGN limit curve&nbsp; "Gaussian (real)".
*For two-dimensional methods such as QAM and multilevel PSK,&nbsp; on the other hand,&nbsp; the&nbsp; "Gaussian (complex)"&nbsp; limit curve applies,&nbsp; which is drawn here as a dashed line and always lies to the left of the solid curve.

*For more details,&nbsp; see the section&nbsp; [[Information_Theory/AWGN_Channel_Capacity_for_Discrete_Input#Maximum_code_rate_for_QAM_structures|$\text{Maximum code rate for QAM structures}$]]&nbsp; in the book&nbsp; "Information Theory".<br>

*Also this limit curve has meanwhile been nearly reached with QAM methods and very long channel codes,&nbsp; without ever being exceeded.}}<br>
  
== Exercises for the chapter ==
 
<br>
 
[[Aufgaben:Exercise_1.17:_About_the_Channel_Coding_Theorem|Exercise 1.17: About the Channel Coding Theorem]]
  
[[Aufgaben:Exercise_1.17Z:_BPSK_Channel_Capacity|Exercise 1.17Z: BPSK Channel Capacity]]
  
==References==
 
<references/>
 
  
 
{{Display}}
 

Latest revision as of 12:12, 22 November 2022

Channel coding theorem and channel capacity


We further consider a binary block code with 

  • $k$  information bits per block,
  • code words of length  $n$,
  • resulting in the code rate  $R=k/n$  with the unit  "information bit/code symbol".


The ingenious information theorist  $\text{Claude E. Shannon}$  has dealt very intensively with the correctability of such codes already in 1948 and has given for this a limit for each channel which results from information-theoretical considerations alone.  Up to this day,  no code has been found which exceeds this limit,  and this will remain so.

$\text{Shannon's channel coding theorem:}$  For each channel with channel capacity  $C > 0$  there always exists  (at least)  one code whose error probability approaches zero as long as the code rate  $R$  is smaller than the channel capacity  $C$.  The prerequisite for this is that the following holds for the block length of this code:  

$$n \to \infty.$$


Notes: 

  • The statement  "The error probability approaches zero"  is not identical with the statement  "The transmission is error-free".  Example:   Even in an infinitely long sequence,  finitely many symbols may be falsified.
  • For some channels,  even with  $R=C$  the error probability still goes towards zero  (but not for all).


The inverse of the channel coding theorem is also true and states:

$\text{Inverse:}$  If the code rate  $R$  is larger than the channel capacity  $C$,  then an arbitrarily small error probability cannot be achieved in any case.


To derive and calculate the channel capacity,  we first assume a digital channel with  $M_x$  possible input values  $x$  and  $M_y$  possible output values  $y$.  Then the following holds for the mean mutual information  – in short,  the  $\text{mutual information}$   –  between the random variable  $x$  at the channel input and the random variable  $y$  at its output:

\[I(x; y) =\sum_{i= 1 }^{M_X} \hspace{0.15cm}\sum_{j= 1 }^{M_Y} \hspace{0.15cm}{\rm Pr}(x_i, y_j) \cdot {\rm log_2 } \hspace{0.15cm}\frac{{\rm Pr}(y_j \hspace{0.05cm}| \hspace{0.05cm}x_i)}{{\rm Pr}(y_j)} = \sum_{i= 1 }^{M_X} \hspace{0.15cm}\sum_{j= 1 }^{M_Y}\hspace{0.15cm}{\rm Pr}(y_j \hspace{0.05cm}| \hspace{0.05cm}x_i) \cdot {\rm Pr}(x_i) \cdot {\rm log_2 } \hspace{0.15cm}\frac{{\rm Pr}(y_j \hspace{0.05cm}| \hspace{0.05cm}x_i)}{\sum_{k= 1}^{M_X} {\rm Pr}(y_j \hspace{0.05cm}| \hspace{0.05cm}x_k) \cdot {\rm Pr}(x_k)} \hspace{0.05cm}.\]

In the transition from the first to the second equation, the   $\text{Theorem of Bayes}$   and the   $\text{Theorem of Total Probability}$   were applied.

Further,  it should be noted:

  • The  "binary logarithm"  is denoted here with  "log2".  Partially,  we also use  "ld"  ("logarithm dualis")  for this in our learning tutorial.
  • In contrast to the book  $\text{Information Theory}$  we do not distinguish in the following between the random variable  $($upper case letters  $X$  resp.  $Y)$  and the realizations $($lower case letters  $x$  resp.  $y)$.
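The double sum above can be translated directly into code. A minimal Python sketch (function and variable names are chosen here for illustration and are not from the lecture) computes  $I(x; y)$  from an input distribution and a transition matrix:

```python
import math

def mutual_information(p_x, p_trans):
    """I(x; y) in bit, given input probabilities Pr(x_i) and transition
    probabilities Pr(y_j | x_i) as a nested list (row i belongs to x_i)."""
    m_x, m_y = len(p_x), len(p_trans[0])
    # theorem of total probability: Pr(y_j) = sum_k Pr(y_j | x_k) * Pr(x_k)
    p_y = [sum(p_trans[k][j] * p_x[k] for k in range(m_x)) for j in range(m_y)]
    info = 0.0
    for i in range(m_x):
        for j in range(m_y):
            joint = p_x[i] * p_trans[i][j]     # Pr(x_i, y_j)
            if joint > 0.0:                    # zero-probability terms contribute nothing
                info += joint * math.log2(p_trans[i][j] / p_y[j])
    return info

# BSC with falsification probability 0.1 and equally probable input symbols:
bsc = [[0.9, 0.1], [0.1, 0.9]]
print(mutual_information([0.5, 0.5], bsc))     # ~0.531 bit/channel use
```

Maximizing this function over all input distributions  ${\rm Pr}(x_i)$  would yield the channel capacity.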


$\text{Definition:}$  The  »channel capacity«  introduced by Shannon gives the maximum mutual information  $I(x; y)$  between the input variable  $x$  and output variable  $y$:

\[C = \max_{{{\rm Pr}(x_i)}} \hspace{0.1cm} I(X; Y) \hspace{0.05cm}.\]
  • The pseudo–unit  "bit/channel use"  must be added.


Since the maximization of the mutual information  $I(x; y)$  must be done over all possible  (discrete)  input distributions  ${\rm Pr}(x_i)$,
»the channel capacity is independent of the input and thus a pure channel parameter«.

Channel capacity of the BSC model


We now apply these definitions to the  $\text{BSC model}$  ("Binary Symmetric Channel"):

\[I(x; y) = {\rm Pr}(y = 0 \hspace{0.03cm}| \hspace{0.03cm}x = 0) \cdot {\rm Pr}(x = 0) \cdot {\rm log_2 } \hspace{0.15cm}\frac{{\rm Pr}(y = 0 \hspace{0.03cm}| \hspace{0.03cm}x = 0)}{{\rm Pr}(y = 0)} + {\rm Pr}(y = 1 \hspace{0.03cm}| \hspace{0.03cm}x = 0) \cdot {\rm Pr}(x = 0) \cdot {\rm log_2 } \hspace{0.15cm}\frac{{\rm Pr}(y = 1 \hspace{0.03cm}| \hspace{0.03cm}x = 0)}{{\rm Pr}(y = 1)} + \]
\[\hspace{1.45cm} + \hspace{0.15cm}{\rm Pr}(y = 0 \hspace{0.05cm}| \hspace{0.05cm}x = 1) \cdot {\rm Pr}(x = 1) \cdot {\rm log_2 } \hspace{0.15cm}\frac{{\rm Pr}(y = 0 \hspace{0.05cm}| \hspace{0.05cm}x = 1)}{{\rm Pr}(y = 0)} + {\rm Pr}(y = 1 \hspace{0.05cm}| \hspace{0.05cm}x = 1) \cdot {\rm Pr}(x = 1) \cdot {\rm log_2 } \hspace{0.15cm}\frac{{\rm Pr}(y = 1 \hspace{0.05cm}| \hspace{0.05cm}x = 1)}{{\rm Pr}(y = 1)} \hspace{0.05cm}.\]

The channel capacity is arrived at by the following considerations:

Binary symmetric channel
  • Maximization with respect to the input distribution leads to equally probable symbols:
\[{\rm Pr}(x = 0) = {\rm Pr}(x = 1) = 1/2 \hspace{0.05cm}.\]
  • Due to the symmetry evident from the model,  the following then holds simultaneously:
\[{\rm Pr}(y = 0) = {\rm Pr}(y = 1) = 1/2 \hspace{0.05cm}.\]
  • We also consider the BSC transmission probabilities:
\[{\rm Pr}(y = 1 \hspace{0.05cm}| \hspace{0.05cm}x = 0) = {\rm Pr}(y = 0 \hspace{0.05cm}| \hspace{0.05cm}x = 1) = \varepsilon \hspace{0.05cm},\]
\[{\rm Pr}(y = 0 \hspace{0.05cm}| \hspace{0.05cm}x = 0) = {\rm Pr}(y = 1 \hspace{0.05cm}| \hspace{0.05cm}x = 1) = 1-\varepsilon \hspace{0.05cm}.\]
  • After combining two terms each,  we thus obtain:
\[C \hspace{0.15cm} = \hspace{0.15cm} 2 \cdot 1/2 \cdot \varepsilon \cdot {\rm log_2 } \hspace{0.15cm}\frac{\varepsilon}{1/2 }+ 2 \cdot 1/2 \cdot (1- \varepsilon) \cdot {\rm log_2 } \hspace{0.15cm}\frac{1- \varepsilon}{1/2 } = \varepsilon \cdot {\rm log_2 } \hspace{0.15cm}2 - \varepsilon \cdot {\rm log_2 } \hspace{0.15cm} \frac{1}{\varepsilon }+ (1- \varepsilon) \cdot {\rm log_2 } \hspace{0.15cm} 2 - (1- \varepsilon) \cdot {\rm log_2 } \hspace{0.15cm}\frac{1}{1- \varepsilon}\]
\[\Rightarrow \hspace{0.3cm} C \hspace{0.15cm} = \hspace{0.15cm} 1 - H_{\rm bin}(\varepsilon). \]
  • Here,  the  "binary entropy function"  is used:
\[H_{\rm bin}(\varepsilon) = \varepsilon \cdot {\rm log_2 } \hspace{0.15cm}\frac{1}{\varepsilon}+ (1- \varepsilon) \cdot {\rm log_2 } \hspace{0.15cm}\frac{1}{1- \varepsilon}\hspace{0.05cm}.\]

The right graph shows the BSC channel capacity depending on the falsification probability  $\varepsilon$.  On the left,  for comparison,  the binary entropy function is shown,  which has already been defined in the chapter  $\text{Discrete Memoryless Sources}$  of the book  "Information Theory". 

Channel capacity of the BSC model

One can see from this representation:

  • The falsification probability  $\varepsilon$  leads to the channel capacity  $C(\varepsilon)$.  According to the  $\text{channel coding theorem}$  error-free decoding is only possible if the code rate  $R \le C(\varepsilon)$.
  • With  $\varepsilon = 10\%$,  error-free decoding is impossible  if the code rate  $R > 0.531$   $($because:  $C(0.1) = 0.531)$.
  • With  $\varepsilon = 50\%$,  error-free decoding is impossible even if the code rate is arbitrarily small   $($because:  $C(0.5) = 0)$.
  • From an information theory point of view  $\varepsilon = 1$  $($inversion of all bits$)$  is equivalent to  $\varepsilon = 0$  $($"error-free transmission"$)$.
  • Error-free decoding is achieved here by swapping the zeros and ones, i.e. by a so-called  "mapping".
  • Similarly  $\varepsilon = 0.9$  is equivalent to  $\varepsilon = 0.1$. 
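These numerical values follow directly from  $C(\varepsilon) = 1 - H_{\rm bin}(\varepsilon)$.  A short check in Python (the helper names are chosen freely for this sketch):

```python
import math

def h_bin(eps: float) -> float:
    """Binary entropy function in bit."""
    if eps in (0.0, 1.0):
        return 0.0
    return eps * math.log2(1 / eps) + (1 - eps) * math.log2(1 / (1 - eps))

def c_bsc(eps: float) -> float:
    """Capacity of the binary symmetric channel, falsification probability eps."""
    return 1 - h_bin(eps)

print(round(c_bsc(0.1), 3))      # 0.531
print(c_bsc(0.5))                # 0.0 -> no error-free decoding possible
print(c_bsc(0.9), c_bsc(0.1))    # equal: eps and 1 - eps are equivalent
```

The symmetry  $C(\varepsilon) = C(1-\varepsilon)$  reflects the "mapping" argument above: swapping zeros and ones turns one channel into the other.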


Channel capacity of the AWGN model


We now consider the  $\text{AWGN channel}$  ("Additive White Gaussian Noise").

AWGN channel model

Here the output signal is  $y = x + n$,  where  $n$  describes a  $\text{Gaussian distributed random variable}$  whose expected values  ("moments")  satisfy:

  1.  ${\rm E}[n] = 0,$
  2.  ${\rm E}[n^2] = P_n.$


Thus,  regardless of the input signal  $x$  $($analog or digital$)$,  there is always a continuous-valued output signal  $y$,  and in the  $\text{equation for mutual information}$  one has to use: 

$$M_y \to\infty.$$

The calculation of the AWGN channel capacity is given here only in keywords.  The exact derivation can be found in the fourth main chapter  "Discrete Value Information Theory"  of the textbook  $\text{Information Theory}$.

  • The input quantity  $x$  optimized with respect to maximum mutual information will certainly be continuous-valued;  that is,  for the AWGN channel,  in addition to   $M_y \to\infty$,   also   $M_x \to\infty$   holds.
\[C = \max_{f_x(x)} \hspace{0.1cm} I(x; y)\hspace{0.05cm},\hspace{0.3cm}{\rm with \hspace{0.15cm} the \hspace{0.15cm} constraint} \text{:}\hspace{0.15cm} {\rm E} \left [ x^2 \right ] \le P_x \hspace{0.05cm}.\]
  • The optimization also yields a Gaussian distribution for the input PDF   ⇒   $x$,  $n$  and  $y$  are Gaussian distributed according to the probability density functions  $f_x(x)$,   $f_n(n)$   and   $f_y(y)$.  We designate the corresponding powers  $P_x$,  $P_n$  and  $P_y$.
  • After longer calculation one gets for the channel capacity using the  "binary logarithm"  $\log_2(\cdot)$  –  again with the pseudo-unit  "bit/channel use":
\[C = {\rm log_2 } \hspace{0.15cm} \sqrt{\frac{P_y}{P_n }} = {\rm log_2 } \hspace{0.15cm} \sqrt{\frac{P_x + P_n}{P_n }} = {1}/{2 } \cdot {\rm log_2 } \hspace{0.05cm}\left ( 1 + \frac{P_x}{P_n } \right )\hspace{0.05cm}.\]
  • If  $x$  describes a  discrete-time signal with symbol rate  $1/T_{\rm S}$,  it must be bandlimited to   $B = 1/(2T_{\rm S})$,   and the same bandwidth  $B$  must be applied to the noise signal  $n$  ⇒   "noise bandwidth":
\[P_x = \frac{E_{\rm S}}{T_{\rm S} } \hspace{0.05cm}, \hspace{0.4cm} P_n = \frac{N_0}{2T_{\rm S} }\hspace{0.05cm}. \]
  • Thus, the AWGN channel capacity can also be expressed by the  »transmitted energy per symbol«  $(E_{\rm S})$  and the  »noise power density«  $(N_0)$:
\[C = {1}/{2 } \cdot {\rm log_2 } \hspace{0.05cm}\left ( 1 + {2 E_{\rm S}}/{N_0 } \right )\hspace{0.05cm}, \hspace{1.9cm} \text {unit:}\hspace{0.3cm} \frac{\rm bit}{\rm channel\:use}\hspace{0.05cm}.\]
  • The following equation gives the channel capacity per unit time  $($denoted by $^{\star})$:
\[C^{\star} = \frac{C}{T_{\rm S} } = B \cdot {\rm log_2 } \hspace{0.05cm}\left ( 1 + {2 E_{\rm S}}/{N_0 } \right )\hspace{0.05cm}, \hspace{0.8cm} \text {unit:} \hspace{0.3cm} \frac{\rm bit}{\rm time\:unit}\hspace{0.05cm}.\]

$\text{Example 1:}$ 

  • For  $E_{\rm S}/N_0 = 7.5$   ⇒   $10 \cdot \lg \, E_{\rm S}/N_0 = 8.75 \, \rm dB$  the channel capacity is 
$$C = {1}/{2 } \cdot {\rm log_2 } \hspace{0.05cm} (16) = 2 \, \rm bit/channel\:use.$$
  • For a channel with the  (physical)  bandwidth  $B = 4 \, \rm kHz$,  which corresponds to the sampling rate  $f_{\rm A} = 8\, \rm kHz$: 
$$C^\star = 16 \, \rm kbit/s.$$
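These numbers can be reproduced with a few lines of Python (a minimal sketch; the function names are our own, not from the text):

```python
import math

def awgn_capacity(es_n0):
    """C = 1/2 * log2(1 + 2*Es/N0) in bit/channel use (Es/N0 as linear ratio)."""
    return 0.5 * math.log2(1.0 + 2.0 * es_n0)

def awgn_capacity_per_time(es_n0, bandwidth_hz):
    """C* = B * log2(1 + 2*Es/N0) in bit/s, since the symbol rate is 1/Ts = 2B."""
    return bandwidth_hz * math.log2(1.0 + 2.0 * es_n0)

print(awgn_capacity(7.5))                 # → 2.0  (bit/channel use)
print(awgn_capacity_per_time(7.5, 4000))  # → 16000.0  (bit/s, i.e. 16 kbit/s)
```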


  • However,  a comparison of different encoding methods at constant  "energy per transmitted symbol"   ⇒   $E_{\rm S}$  is not fair. 
  • Rather,  for this comparison,  the  "energy per source bit"   ⇒   $E_{\rm B}$  should be fixed.  The following relationships apply:
\[E_{\rm S} = R \cdot E_{\rm B} \hspace{0.3cm} \Rightarrow \hspace{0.3cm} E_{\rm B} = E_{\rm S} / R \hspace{0.05cm}. \]
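The conversion is conveniently done in dB; a minimal sketch (the helper name `eb_n0_db` is hypothetical, not from the text):

```python
import math

def eb_n0_db(es_n0_db, rate):
    """Convert Es/N0 [dB] to Eb/N0 [dB] for code rate R, using Eb = Es / R."""
    return es_n0_db - 10.0 * math.log10(rate)

# A rate-1/2 code at Es/N0 = 0 dB corresponds to Eb/N0 of about 3.01 dB:
print(eb_n0_db(0.0, 0.5))
```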

$\text{Channel Coding Theorem for the AWGN channel:}$ 

  • Error-free decoding $($at infinitely long blocks   ⇒   $n \to \infty)$  is always possible if the code rate  $R$  is smaller than the channel capacity  $C$:
\[R < C = {1}/{2 } \cdot {\rm log_2 } \hspace{0.05cm}\left ( 1 +2 \cdot R\cdot E_{\rm B}/{N_0 } \right )\hspace{0.05cm}.\]
  • For each code rate  $R$  the required  $E_{\rm B}/N_0$  of the AWGN channel can thus be determined so that error-free decoding is just possible.
  • One obtains for the limiting case  $R = C$:
\[{E_{\rm B} }/{N_0} > \frac{2^{2R}-1}{2R } \hspace{0.05cm}.\]
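The limiting case can be checked numerically; a minimal sketch, assuming the bound above (helper names are ours):

```python
import math

def eb_n0_min(rate):
    """Minimum Eb/N0 (linear) for error-free decoding at code rate R."""
    return (2.0 ** (2.0 * rate) - 1.0) / (2.0 * rate)

def to_db(x):
    return 10.0 * math.log10(x)

print(to_db(eb_n0_min(0.5)))    # → 0.0 dB:  R = 1/2 requires Eb > N0
print(to_db(eb_n0_min(1e-6)))   # ≈ -1.59 dB:  the ultimate Shannon limit ln(2)
```

For R → 0 the bound approaches ln(2), i.e. 10 · lg ln(2) ≈ −1.59 dB, consistent with the graph discussed next.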


The graph summarizes the result,  with the ordinate  $R$  plotted on a linear scale and the abscissa  $E_{\rm B}/{N_0 }$  plotted logarithmically.

AWGN channel capacity
  • Error-free coding is not possible outside the blue area.
  • The blue limit curve indicates the AWGN channel capacity  $C$.


From this graph and the above equation,  the following can be deduced:

  1. Channel capacity  $C$  increases with  $10 \cdot \lg \, E_{\rm B}/N_0 $  somewhat less than linearly.  In the graph,  some selected function values are indicated as blue crosses.
  2. If  $10 \cdot \lg \, E_{\rm B}/N_0 < -1.59 \, \rm dB$,  error-free decoding is impossible in principle.  If the code rate is  $R = 0.5$,  then  $10 \cdot \lg \, E_{\rm B}/N_0 > 0 \, \rm dB$  must hold   ⇒   $E_{\rm B} > N_0$.
  3. For all binary codes,  $0 < R ≤ 1$  holds per se.  Rates  $R > 1$  are possible only with non-binary codes.  For example,  the maximum possible code rate of a quaternary code is  $R = \log_2 \, M_y = \log_2 \, 4 = 2$.
  4. All one-dimensional modulation types – i.e.,  methods that use only the in-phase or only the quadrature component,  such as  $\text{2–ASK}$,  $\text{BPSK}$  and  $\text{2–FSK}$  – must lie within the blue region of the present graphic.
  5. As shown in the chapter  "Maximum code rate for QAM structures",  there is a  "friendlier"  limit curve for two-dimensional modulation types such as the  "Quadrature Amplitude Modulation".

== AWGN channel capacity for binary input signals ==


In this book we restrict ourselves mainly to binary codes,  that is,  to the Galois field  ${\rm GF}(2)$.  With this restriction,

Conditional PDFs for AWGN channel and binary input
  • firstly,  the code rate is limited to the range  $R ≤ 1$;
  • secondly,  even for  $R ≤ 1$,  not the whole blue region is available  (see the previous section).


The channel capacity for binary input follows from the general equation for the mutual information with
  1. the parameters  $M_x = 2$  and  $M_y \to \infty$,
  2. bipolar signaling   ⇒   $x=0$   →   $\tilde{x} = +1$  and  $x=1$   →   $\tilde{x} = -1$,
  3. the transition from the conditional probabilities  ${\rm Pr}(y\hspace{0.05cm} |\hspace{0.05cm}\tilde{x})$  to the conditional probability density functions  $f_{y\hspace{0.05cm} |\hspace{0.05cm}\tilde{x}}(y)$,
  4. replacing the sum with an integration.

  • The optimization of the source leads to equally probable symbols:
\[{\rm Pr}(\tilde{x} = +1) = {\rm Pr}(\tilde{x} = -1) = 1/2 \hspace{0.05cm}. \]
  • This gives for the maximum of the mutual information,  i.e. for the channel capacity:
\[C \hspace{-0.15cm} = {1}/{2} \cdot \int_{-\infty }^{+ \infty} \left [ f_{y\hspace{0.05cm} |\hspace{0.05cm}\tilde{x} = +1}(y)\cdot {\rm log_2 } \hspace{0.15cm}\frac {f_{y\hspace{0.05cm} |\hspace{0.05cm}\tilde{x} = +1}(y)}{f_y(y)} + f_{y\hspace{0.05cm} |\hspace{0.05cm}\tilde{x} = -1}(y)\cdot {\rm log_2 } \hspace{0.15cm}\frac {f_{y\hspace{0.05cm} |\hspace{0.05cm}\tilde{x} = -1}(y)}{f_y(y)} \right ]\hspace{0.1cm}{\rm d}y \hspace{0.05cm}.\]

The integral cannot be solved in closed mathematical form;  it can only be evaluated numerically.
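Such a numerical evaluation can be sketched as follows, assuming bipolar signaling with $E_{\rm S} = 1$ (the discretization parameters are our own choice):

```python
import numpy as np

def bpsk_awgn_capacity(es_n0):
    """Numerically evaluate the capacity integral for bipolar input
    x~ = ±1 (Es = 1) and Gaussian noise with variance N0/2."""
    sigma = np.sqrt(1.0 / (2.0 * es_n0))                 # noise standard deviation
    y = np.linspace(-1.0 - 8.0 * sigma, 1.0 + 8.0 * sigma, 200001)
    norm = 1.0 / (np.sqrt(2.0 * np.pi) * sigma)
    f_pos = norm * np.exp(-(y - 1.0) ** 2 / (2.0 * sigma ** 2))  # f_{y|x~=+1}
    f_neg = norm * np.exp(-(y + 1.0) ** 2 / (2.0 * sigma ** 2))  # f_{y|x~=-1}
    f_y = 0.5 * (f_pos + f_neg)                          # total output PDF
    tiny = 1e-300                                        # guard against log2(0)
    integrand = 0.5 * (f_pos * np.log2((f_pos + tiny) / (f_y + tiny))
                       + f_neg * np.log2((f_neg + tiny) / (f_y + tiny)))
    return float(np.sum(integrand) * (y[1] - y[0]))      # Riemann sum

# The capacity approaches 1 bit/channel use for large Es/N0:
print(bpsk_awgn_capacity(10.0))
```

For small $E_{\rm S}/N_0$ the result stays close to the Gaussian-input capacity, for large $E_{\rm S}/N_0$ it saturates at one bit per channel use, as the green curve below shows.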

AWGN channel capacity for binary input signals
  1. The green curve shows the result.
  2. The blue curve gives for comparison the channel capacity for Gaussian distributed input signals derived in the last section.


It can be seen:

  • For  $10 \cdot \lg \, E_{\rm B}/N_0 < 0 \, \rm dB$  the two capacity curves differ only slightly.
  • So,  for binary bipolar input,  compared to the optimum (Gaussian) input,  the characteristic  $10 \cdot \lg \, E_{\rm B}/N_0$  only needs to be increased by about  $0.1 \, \rm dB$  to also allow the code rate  $R = 0.5$.
  • From  $10 \cdot \lg \, E_{\rm B}/N_0 \approx 6 \, \rm dB$  the capacity  $C = 1 \, \rm bit/channel\:use$  of the AWGN channel for binary input is almost reached.
  • In between, the limit curve is approximately exponentially increasing.


== Common channel codes versus channel capacity ==


Now it shall be shown to what extent established channel codes approximate the BPSK channel capacity  (green curve).   In the following graph the rate   $R=k/n$   of these codes or the capacity   $C$   (with the additional pseudo–unit "bit/channel use")  is plotted as ordinate.

Rates and required  $E_{\rm B}/{N_0}$  of different channel codes

Further,  it is assumed:

  • the AWGN channel,  characterized by  $10 \cdot \lg \, E_{\rm B}/N_0$  in dB,  and
  • a bit error rate of  $\rm BER=10^{-5}$  for all codes marked by crosses.


$\text{Please note:}$ 

  1. The channel capacity curves always apply to  $n \to \infty$  and  $\rm BER \to 0$.
  2. If one were to apply this strict requirement  "error-free"  also to the considered channel codes of finite code length  $n$,  this would always require  $10 \cdot \lg \, E_{\rm B}/N_0 \to \infty$.
  3. But this is an academic problem of little practical significance.  For  $\rm BER = 10^{-10}$  a qualitatively and also quantitatively similar graph would result.
  4. For convolutional codes,  the third identifier parameter has a different meaning than for block codes.  For example,  $\text{CC (2, 1, 32)}$  indicates the memory  $m = 32$.


The following are some  »explanations of the data taken from the lecture [Liv10][1]«:

  1. The points  $\rm A$,  $\rm B$  and  $\rm C$  mark  $\text{Hamming codes}$  of different rates.  For  $\rm BER = 10^{-5}$  they all require more than  $10 \cdot \lg \, E_{\rm B}/N_0 = 8 \, \rm dB$.
  2. $\rm D$  denotes the binary  $\text{Golay code}$  with rate  $1/2$,  and  $\rm E$  denotes a  $\text{Reed–Muller code}$.  This very low rate code was used in 1971 on the Mariner 9 spacecraft.
  3. Marked by  $\rm F$  is a high rate  $\text{Reed–Solomon code}$   $(R = 223/255 > 0.9)$  with a required  $10 \cdot \lg \, E_{\rm B}/N_0 < 6 \, \rm dB$.
  4. The markers  $\rm G$  and  $\rm H$  denote exemplary  $\text{convolutional codes}$  of medium rate.  The code  $\rm G$  was used as early as 1972 on the Pioneer 10 mission.
  5. The channel coding of the Voyager–mission in the late 1970s is marked by  $\rm I$.  It is the concatenation of a  $\text{CC (2, 1, 7)}$  with a  $\text{Reed–Solomon code}$.


Much better results can be achieved with  »iterative decoding methods«  (see fourth main chapter),  as the second graph shows.

Rates and required  $E_{\rm B}/N_0$  for iterative coding methods
  • This means:  The individual marker points are much closer to the capacity curve  $C_{\rm BPSK}$  for digital input.
  • The solid blue Gaussian capacity curve is labeled  $C_{\rm Gaussian}$.

Here are some more explanations about this graph:

  1. Red crosses mark so-called  $\text{turbo codes}$  according to  $\text{CCSDS}$  ("Consultative Committee for Space Data Systems"),  each with  $k = 6920$  information bits and different code lengths  $n$.
  2. These codes,  invented by  $\text{Claude Berrou}$  around 1990,  can be decoded iteratively.  The  (red)  marks are each less than  $1 \, \rm dB$  from the Shannon bound.
  3. Similar behavior is shown by  $\text{LDPC codes}$  ("Low Density Parity–check Codes")  marked by white rectangles,  which have been used since 2006 for  $\text{DVB–S(2)}$  ("Digital Video Broadcast over Satellite").
  4. These are well suited for iterative decoding using  "factor–graphs"  and  "exit charts"  due to the sparse occupancy of the parity-check matrix  $\boldsymbol {\rm H}$  (with "ones").  See [Hag02][2].
  5. The black dots mark the CCSDS specified  $\text{LDPC codes}$,  all of which assume a constant number of information bits  $(k = 16384)$. 
  6. In contrast,  for all white rectangles,  the code length  $n = 64800$  is constant,  while the number  $k$  of information bits changes according to the rate  $R = k/n$.
  7. Around the year 2000,  many researchers had the ambition to approach the Shannon bound to within fractions of a  $\rm dB$.  The yellow cross marks such a result from [CFRU01][3],  which used an irregular rate $1/2$ LDPC code of length  $n = 2 \cdot 10^6$.

$\text{Conclusion:}$  It should be noted that Shannon recognized and proved as early as 1948 that no one-dimensional modulation method can lie to the left of the solid AWGN limit curve  "Gaussian (real)".

  • For two-dimensional methods such as QAM and multilevel PSK, on the other hand, the "Gaussian (complex)" limit curve applies, which is drawn here as a dashed line and always lies to the left of the solid curve.
  • For more details, see the  $\text{Maximum code rate for QAM structures}$  section of the "Information Theory" book.
  • Also, this limit curve has now been nearly reached with QAM methods and very long channel codes, without ever being exceeded.


== Exercises for the chapter ==


Exercise 1.17: About the Channel Coding Theorem

Exercise 1.17Z: BPSK Channel Capacity

== References ==

  1. Liva, G.: Channel Coding. Lecture manuscript, Chair of Communications Engineering, TU Munich and DLR Oberpfaffenhofen, 2010.
  2. Hagenauer, J.: The Turbo Principle in Mobile Communications.  In: Int. Symp. on Information Theory and Its Applications, Oct. 2002,  $\text{PDF document}$.
  3. Chung, S.-Y.; Forney Jr., G.D.; Richardson, T.J.; Urbanke, R.:  On the Design of Low-Density Parity-Check Codes within 0.0045 dB of the Shannon Limit.  In: IEEE Communications Letters, vol. 5, no. 2 (2001), pp. 58–60.