Information Theory/Entropy Coding According to Huffman


The Huffman algorithm


We now assume that the source symbols  $q_\nu$  originate from an alphabet  $\{q_μ\} = \{$ $\rm A$,  $\rm B$,  $\rm C$, ...$\}$  with the symbol set size  $M$  and are statistically independent of each other.  For example, for the symbol set size  $M = 8$:

$$\{ \hspace{0.05cm}q_{\mu} \} = \{ \boldsymbol{\rm A} \hspace{0.05cm},\hspace{0.05cm} \boldsymbol{\rm B}\hspace{0.05cm},\hspace{0.05cm} \boldsymbol{\rm C}\hspace{0.05cm},\hspace{0.05cm} \boldsymbol{\rm D}\hspace{0.05cm},\hspace{0.05cm} \boldsymbol{\rm E}\hspace{0.05cm},\hspace{0.05cm} \boldsymbol{\rm F}\hspace{0.05cm},\hspace{0.05cm} \boldsymbol{\rm G}\hspace{0.05cm},\hspace{0.05cm} \boldsymbol{\rm H}\hspace{0.05cm} \}\hspace{0.05cm}.$$

In 1952, shortly after Shannon's groundbreaking publications, David A. Huffman presented an algorithm for the construction of optimal prefix-free codes.  This  »Huffman algorithm«  is given here without derivation and proof, whereby we restrict ourselves to binary codes.  That is:   for the code symbols,  $c_ν ∈ \{0, 1\}$  always holds.

Here is the  "recipe":

  1.   Order the symbols according to decreasing probabilities of occurrence.
  2.   Combine the two most improbable symbols into a new symbol.
  3.   Repeat  (1)  and  (2), until only two  (combined)  symbols remain.
  4.   Encode the more probable set of symbols with  1  and the other set with  0.
  5.   In the opposite direction  (i.e. from bottom to top),  append  1  or  0  to the respective binary codes of the split subsets according to the probabilities.
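The recipe can be put into a few lines of code.  The following minimal Python sketch (an illustration with freely chosen names) keeps the current subsets in a priority queue, repeatedly merges the two least probable ones, and prefixes their codewords with  0  and  1.  For the probabilities of  $\text{Example 1}$  below it reproduces the assignment derived there; for other probability sets, equally good but different assignments may result.

    import heapq
    from itertools import count

    def huffman_code(probabilities):
        """Binary Huffman code for a dict {symbol: probability} -> {symbol: codeword}."""
        tiebreak = count()                                       # keeps heap entries comparable
        heap = [(p, next(tiebreak), {s: ""}) for s, p in probabilities.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p0, _, sub0 = heapq.heappop(heap)                    # least probable subset
            p1, _, sub1 = heapq.heappop(heap)                    # second least probable subset
            merged = {s: "0" + c for s, c in sub0.items()}       # less probable subset gets "0"
            merged.update({s: "1" + c for s, c in sub1.items()}) # more probable subset gets "1"
            heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
        return heap[0][2]                                        # the root carries the full code table

    p = {"A": 0.30, "B": 0.24, "C": 0.20, "D": 0.12, "E": 0.10, "F": 0.04}
    print(sorted(huffman_code(p).items()))
    # [('A', '11'), ('B', '01'), ('C', '00'), ('D', '100'), ('E', '1011'), ('F', '1010')]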


$\text{Example 1:}$  Without loss of generality, we assume that the  $M = 6$  symbols  $\rm A$, ... , $\rm F$  are already ordered according to their probabilities:

$$p_{\rm A} = 0.30 \hspace{0.05cm},\hspace{0.2cm}p_{\rm B} = 0.24 \hspace{0.05cm},\hspace{0.2cm}p_{\rm C} = 0.20 \hspace{0.05cm},\hspace{0.2cm} p_{\rm D} = 0.12 \hspace{0.05cm},\hspace{0.2cm}p_{\rm E} = 0.10 \hspace{0.05cm},\hspace{0.2cm}p_{\rm F} = 0.04 \hspace{0.05cm}.$$

By pairwise combination and subsequent sorting, the following symbol combinations are obtained in five steps  (resulting probabilities in brackets):

1.   $\rm A$  (0.30), $\rm B$  (0.24), $\rm C$  (0.20), $\rm EF$  (0.14), $\rm D$  (0.12),
2.   $\rm A$  (0.30), $\rm EFD$  (0.26), $\rm B$  (0.24), $\rm C$  (0.20),
3.   $\rm BC$  (0.44), $\rm A$  (0.30), $\rm EFD$  (0.26),
4.   $\rm AEFD$  (0.56), $\rm BC$  (0.44),
5.   Root  $\rm AEFDBC$  (1.00).

Backwards – i.e. according to steps  (5)  to  (1)  – the assignment to binary symbols then takes place.   An  "x"  indicates that bits still have to be added in the next steps:

5.   $\rm AEFD$   →   1x,     $\rm BC$   →   0x,
4.   $\underline{\rm A}$   →   11,     $\rm EFD$   →   10x,
3.   $\underline{\rm B}$   →   01,     $\underline{\rm C}$   →   00,
2.   $\rm EF$   →   101x,     $\underline{\rm D}$   →   100,
1.   $\underline{\rm E}$   →   1011,     $\underline{\rm F}$   →   1010.

The underlines mark the final binary coding.


On the term  "Entropy Coding"


We continue to assume the probabilities and assignments of the last example:

$$p_{\rm A} = 0.30 \hspace{0.05cm},\hspace{0.2cm}p_{\rm B} = 0.24 \hspace{0.05cm},\hspace{0.2cm}p_{\rm C} = 0.20 \hspace{0.05cm},\hspace{0.2cm} p_{\rm D} = 0.12 \hspace{0.05cm},\hspace{0.2cm}p_{\rm E} = 0.10 \hspace{0.05cm},\hspace{0.2cm}p_{\rm F} = 0.04 \hspace{0.05cm};$$
$$\boldsymbol{\rm A} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 11} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm B} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 01} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm C} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 00} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm D} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 100} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm E} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 1011} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm F} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 1010} \hspace{0.05cm}.$$

Thus, of the six source symbols, three are encoded with two bits each, one with three bits and two symbols  $(\rm E$  and  $\rm F)$  with four bits each.

The average code word length thus results in

$$L_{\rm M} = (0.30 \hspace{-0.05cm}+ \hspace{-0.05cm}0.24 \hspace{-0.05cm}+ \hspace{-0.05cm} 0.20) \cdot 2 + 0.12 \cdot 3 + (0.10 \hspace{-0.05cm}+ \hspace{-0.05cm} 0.04 ) \cdot 4 = 2.4 \,{\rm bit/source\hspace{0.15cm}symbol} \hspace{0.05cm}.$$

From the comparison with the source entropy  $H = 2.365 \ \rm bit/source\hspace{0.15cm}symbol$  one can see the efficiency of the Huffman coding.
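Both numbers are easy to verify.  A short Python check (for illustration; the code table is the one derived in  $\text{Example 1}$):

    from math import log2

    p = {"A": 0.30, "B": 0.24, "C": 0.20, "D": 0.12, "E": 0.10, "F": 0.04}
    code = {"A": "11", "B": "01", "C": "00", "D": "100", "E": "1011", "F": "1010"}

    L_M = sum(p[s] * len(code[s]) for s in p)          # average code word length
    H = sum(p[s] * log2(1 / p[s]) for s in p)          # source entropy
    print(f"L_M = {L_M:.3f} bit/source symbol")        # 2.400
    print(f"H   = {H:.3f} bit/source symbol")          # 2.365
    print(f"efficiency H/L_M = {H / L_M:.3f}")         # about 0.985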

$\text{Note:}$  There is no prefix-free  (⇒  immediately decodable)  code that, by exploiting the occurrence probabilities alone, leads to a smaller average code word length  $L_{\rm M}$  than the Huffman code.   In this sense, the Huffman code is optimal.


$\text{Example 2:}$  If the symbol probabilities were

$$p_{\rm A} = p_{\rm B} = p_{\rm C} = 1/4 \hspace{0.05cm},\hspace{0.2cm} p_{\rm D} = 1/8 \hspace{0.05cm},\hspace{0.2cm}p_{\rm E} = p_{\rm F} = 1/16 \hspace{0.05cm},$$

then the entropy and the average code word length would have the same value:

$$H = 3 \cdot 1/4 \cdot {\rm log_2}\hspace{0.1cm}(4) + 1/8 \cdot {\rm log_2}\hspace{0.1cm}(8) + 2 \cdot 1/16 \cdot {\rm log_2}\hspace{0.1cm}(16) = 2.375 \,{\rm bit/source\hspace{0.15cm} symbol}\hspace{0.05cm},$$
$$L_{\rm M} = 3 \cdot 1/4 \cdot 2 + 1/8 \cdot 3 + 2 \cdot 1/16 \cdot 4 = 2.375 \,{\rm bit/source\hspace{0.15cm} symbol} \hspace{0.05cm}.$$


This property,  $L_{\rm M} = H +\varepsilon$  with positive  $\varepsilon \to 0$  for suitable occurrence probabilities, explains the term  »Entropy Coding«:

In this form of source coding, one tries to adapt the length  $L_μ$  of the binary sequence  (consisting of zeros and ones)  for the symbol  $q_μ$  to its occurrence probability  $p_μ$  according to the entropy calculation, as follows:

$$L_{\mu} = {\rm log}_2\hspace{0.1cm}(1/p_{\mu} ) \hspace{0.05cm}.$$

Of course, this does not always succeed, but only if all symbol probabilities  $p_μ$  can be represented in the form  $2^{–k}$  with  $k = 1, \ 2, \ 3,$ ...

  • In this special case - and only in this case - the average code word length  $L_{\rm M}$  coincides exactly with the source entropy  $H$    $(\varepsilon = 0$,  see  $\text{Example 2})$.
  • According to the  $\text{Source Coding Theorem}$  there is no  (decodable)  code that gets by with fewer binary characters per source symbol on average.
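For the dyadic probabilities of  $\text{Example 2}$  this special case can be checked explicitly.  The sketch below (illustrative only) computes the ideal lengths  $L_μ = \log_2(1/p_μ)$,  confirms that they are all integers, and shows that  $L_{\rm M}$  and  $H$  coincide:

    from math import log2

    p = {"A": 1/4, "B": 1/4, "C": 1/4, "D": 1/8, "E": 1/16, "F": 1/16}

    ideal = {s: log2(1 / ps) for s, ps in p.items()}      # L_mu = log2(1/p_mu)
    assert all(L.is_integer() for L in ideal.values())    # dyadic case: every length is an integer
    L_M = sum(ps * ideal[s] for s, ps in p.items())       # average code word length
    H = sum(ps * log2(1 / ps) for s, ps in p.items())     # source entropy
    print(L_M, H)                                         # 2.375 2.375  (epsilon = 0)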


Representation of the Huffman code as a tree diagram


A  »tree structure«  is often used to construct the Huffman code.  For the example considered so far, this is shown in the following diagram:

Tree representation of the Huffman coding for  $\text{Example 1}$

It can be seen:

  • At each step of the Huffman algorithm, the two branches with the smallest respective probabilities are combined.
  • The node in the first step combines the two symbols  $\rm E$  and  $\rm F$  with the currently smallest probabilities.  This new node is labelled  $p_{\rm E} + p_{\rm F} = 0.14$.
  • The branch running from the symbol with the smaller probability  $($here  $\rm F)$  to the sum node is drawn in blue, the other branch  $($here  $\rm E)$  in red.
  • After five steps, we arrive at the root of the tree with the total probability  $1.00$.


If you now trace the course from the root  (in the graphic with yellow filling)  back to the individual symbols, you can read off the symbol assignment from the colours of the individual branches.

For example, with the assignments  "red"   →   1  and  "blue"   →   0, the following codes result when reading from the root to each symbol:

  • $\rm A$:   red, red   →   11,
  • $\rm B$:   blue, red   →   01,
  • $\rm C$:   blue, blue   →   00,
  • $\rm D$:   red, blue, blue   →   100,
  • $\rm E$:   red, blue, red, red   →   1011,
  • $\rm F$:   red, blue, red, blue   →   1010.


The  (uniform)  assignment  "red"   →   0  and  "blue"   →   1  would also lead to an optimal prefix-free Huffman code.
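Reading the codewords off the tree also shows how decoding works: because the code is prefix-free, the decoder simply collects bits until they form a valid codeword, emits the corresponding symbol and starts over.  A minimal sketch of this idea (illustrative, with the code table derived above):

    def decode(bits, code):
        """Decode a bit string with a prefix-free code table {symbol: codeword}."""
        inverse = {cw: s for s, cw in code.items()}   # codeword -> symbol
        symbols, buffer = [], ""
        for b in bits:
            buffer += b
            if buffer in inverse:                     # a complete codeword has been collected
                symbols.append(inverse[buffer])
                buffer = ""
        return "".join(symbols)

    code = {"A": "11", "B": "01", "C": "00", "D": "100", "E": "1011", "F": "1010"}
    print(decode("1110110110100000", code))           # AEBFCC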

$\text{Example 3:}$  The following graph shows the Huffman coding of  $49$  symbols  $q_ν ∈ \{$ $\rm A$,  $\rm B$,  $\rm C$,  $\rm D$,  $\rm E$,  $\rm F$ $\}$  with the assignment derived in the last section.  The different colours serve only for better orientation.

  • The binary encoded sequence has the average code word length 
$$L_{\rm M} = 125/49 = 2.551.$$
  • Due to the short sequence  $(N = 49)$,  the frequencies  $h_{\rm A}$, ... ,  $h_{\rm F}$  of the simulated sequence deviate significantly from the given probabilities  $p_{\rm A}$, ... ,  $p_{\rm F}$:
Example sequences for Huffman coding
$$p_{\rm A} = 0.30 \hspace{0.05cm} \Rightarrow \hspace{0.05cm} h_{\rm A} = 16/49 \approx 0.326 \hspace{0.05cm},$$
$$p_{\rm B} = 0.24 \hspace{0.05cm} \Rightarrow \hspace{0.05cm} h_{\rm B} = 7/49 \approx 0.143 \hspace{0.05cm},$$
$$p_{\rm C} = 0.20 \hspace{0.05cm} \Rightarrow \hspace{0.05cm} h_{\rm C} = 9/49 \approx 0.184 \hspace{0.05cm},$$
$$p_{\rm D} = 0.12 \hspace{0.05cm} \Rightarrow \hspace{0.05cm} h_{\rm D} = 7/49 \approx 0.143 \hspace{0.05cm},$$
$$p_{\rm E} = 0.10 \hspace{0.05cm} \Rightarrow \hspace{0.05cm} h_{\rm E} = 5/49 \approx 0.102 \hspace{0.05cm},$$
$$p_{\rm F} = 0.04 \hspace{0.05cm} \Rightarrow \hspace{0.05cm} h_{\rm F} = 5/49 \approx 0.102 \hspace{0.05cm}.$$
  • This results in a slightly larger entropy value:
$$H ({\rm regarding}\hspace{0.15cm}p_{\mu}) = 2.365 \ {\rm bit/source\hspace{0.15cm} symbol}\hspace{0.3cm} \Rightarrow \hspace{0.3cm} H ({\rm regarding}\hspace{0.15cm}h_{\mu}) = 2.451 \ {\rm bit/source\hspace{0.15cm} symbol} \hspace{0.05cm}.$$

If one were to form the Huffman code with these  "new"  probabilities  $h_{\rm A}$, ... , $h_{\rm F}$,  the following assignments would result:

$$\boldsymbol{\rm A} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 11} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm B} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 100} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm C} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 00} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm D} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 101} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm E} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 010} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm F} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 011} \hspace{0.05cm}.$$

Now only  $\rm A$  and  $\rm C$  would be represented by two bits, the other four symbols by three bits each.

  • The encoded sequence would then have a length of  $(16 + 9) · 2 + (7 + 7 + 5 + 5) · 3 = 122$  bits, i.e. three bits shorter than with the previous coding.
  • The average code word length would then be  $L_{\rm M} = 122/49 ≈ 2.49$  bit/source symbol  instead of  $L_{\rm M}≈ 2.55$  bit/source symbol.
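This comparison can be reproduced with the  huffman_code  sketch from the first section: feed it the empirical frequencies  $h_μ$  instead of the design probabilities  $p_μ$  and count the bits needed for the  $N = 49$  symbols (illustrative sketch):

    counts = {"A": 16, "B": 7, "C": 9, "D": 7, "E": 5, "F": 5}   # occurrences in the N = 49 sequence
    h = {s: n / 49 for s, n in counts.items()}                   # empirical frequencies

    code_p = {"A": "11", "B": "01", "C": "00", "D": "100", "E": "1011", "F": "1010"}
    code_h = huffman_code(h)                                     # code matched to the frequencies

    bits_p = sum(counts[s] * len(code_p[s]) for s in counts)     # 125 bits with the original code
    bits_h = sum(counts[s] * len(code_h[s]) for s in counts)     # 122 bits with the adapted code
    print(bits_p, bits_h, round(bits_h / 49, 2))                 # 125 122 2.49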


The (German language) interactive applet  "Huffman- und Shannon-Fano-Codierung"  (SWF version)  illustrates the procedure for two variants of entropy coding.

$\text{This example can be interpreted as follows:}$ 

  • Huffman coding relies on  (exact)  knowledge of the symbol probabilities.  If these are known to both the transmitter and the receiver, the average code word length  $L_{\rm M}$  is often only insignificantly larger than the source entropy  $H$.
  • Especially with small files, there may be deviations between the  (expected)  symbol probabilities  $p_μ$  and the  (actual)  frequencies  $h_μ$.  It would be better here to generate a separate Huffman code for each file based on the actual circumstances  $(h_μ)$.
  • In this case, however, the specific Huffman code must also be communicated to the decoder.  This leads to a certain overhead, which can only be neglected for longer files.  With small files, this effort is not worthwhile.


Influence of transmission errors on decoding


Due to the property  "prefix-free",  the Huffman code is lossless.

  • This means that the source symbol sequence can be completely reconstructed from the binary encoded sequence.
  • However, if an error occurs during transmission  (a  0  becomes a  1,  or a  1  becomes a  0),  the sink symbol sequence  $〈v_ν〉$  naturally no longer matches the source symbol sequence  $〈q_ν〉$.


The following two examples show that a single transmission error can sometimes result in a multitude of errors regarding the source text.

$\text{Example 4:}$  We consider the same source symbol sequence and the same Huffman code as in the previous section.

On the influence of transmission errors in Huffman coding
  • The upper diagram shows that with error-free transmission, the original source sequence  $\rm AEBFCC$...  can be reconstructed from the encoded binary sequence  111011...
  • However, if for example bit 6 is falsified  (from  1  to  0,  red marking in the middle diagram),  the source symbol  $q_2 = \rm E$  becomes the sink symbol  $v_2 =\rm F$.
  • A falsification of bit 13  (from  0  to  1,  red marking in the lower diagram)  even leads to a falsification of four source symbols:   $\rm CCEC$  →  $\rm DBBD$.
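This error propagation can be retraced with the  decode  sketch from the tree-diagram section: encode the sequence, flip a single bit and decode again (illustrative; the helper functions are freely named):

    code = {"A": "11", "B": "01", "C": "00", "D": "100", "E": "1011", "F": "1010"}

    def encode(symbols, code):
        return "".join(code[s] for s in symbols)

    def flip(bits, k):
        """Flip bit number k (counting from 1)."""
        i = k - 1
        return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]

    tx = encode("AEBFCC", code)             # '1110110110100000'
    print(decode(tx, code))                 # AEBFCC   (error-free transmission)
    print(decode(flip(tx, 6), code))        # AFBFCC   (bit 6 flipped: E turns into F)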


$\text{Example 5:}$  Let a second source with symbol set size  $M = 6$  be characterized by the following symbol probabilities:

$$p_{\rm A} = 0.50 \hspace{0.05cm},\hspace{0.2cm}p_{\rm B} = 0.19 \hspace{0.05cm},\hspace{0.2cm}p_{\rm C} = 0.11 \hspace{0.05cm},\hspace{0.2cm} p_{\rm D} = 0.09 \hspace{0.05cm},\hspace{0.2cm}p_{\rm E} = 0.06 \hspace{0.05cm},\hspace{0.2cm}p_{\rm F} = 0.05 \hspace{0.05cm}.$$

Here the Huffman algorithm leads to the following assignment:

$$\boldsymbol{\rm A} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 0} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm B} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 111} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm C} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 101} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm D} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 100} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm E} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 1101} \hspace{0.05cm},\hspace{0.2cm} \boldsymbol{\rm F} \hspace{0.05cm} \rightarrow \hspace{0.05cm} \boldsymbol{\rm 1100} \hspace{0.05cm}.$$
On the error propagation of the Huffman coding

The source symbol sequence  $\rm ADABD$...  (upper diagram)  is thus represented by the encoded sequence  0'100'0'111'100' ...   The apostrophes are only for the reader's orientation.

During transmission, the first bit is falsified:  Instead of  01000111100...  one receives  11000111100...

  • After decoding, the first two source symbols  $\rm AD$  →  0100  become the sink symbol  $\rm F$  →  1100.
  • The further symbols are then detected correctly again, but now no longer starting at position  $ν = 3$, but at  $ν = 2$.


Depending on the application, the effects are different:

  • If the source is a  "natural text"  and the sink is a  "human",  most of the text remains comprehensible to the reader.
  • If, however, the sink is an automaton that successively compares all  $v_ν$  with the corresponding  $q_ν$,  the result is an error frequency of well over  $50\%$.
  • Only the blue symbols of the sink symbol sequence  $〈v_ν〉$  then  (coincidentally)  match the source symbols  $q_ν$,  while red symbols indicate errors.
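The loss of synchronisation described above can be checked in the same way: with this second code table, an error in the very first bit merges the first two source symbols into a single sink symbol, after which the decoder runs one position ahead (illustrative sketch, reusing the  decode  and  flip  helpers from the previous examples):

    code5 = {"A": "0", "B": "111", "C": "101", "D": "100", "E": "1101", "F": "1100"}

    tx = "".join(code5[s] for s in "ADABD")   # '01000111100'
    print(decode(tx, code5))                  # ADABD   (error-free transmission)
    print(decode(flip(tx, 1), code5))         # FABD    (AD merges into F, the rest is shifted by one)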


Application of Huffman coding to  $k$–tuples


The Huffman algorithm in its basic form delivers unsatisfactory results when

  • there is a binary source  $(M = 2)$ , for example with the symbol set  $\{$ $\rm X$,  $\rm Y \}$,
  • there are statistical dependencies between adjacent symbols of the input sequence,
  • the probability of the most frequent symbol is clearly greater than  $50\%$.


The remedy in these cases is

  • to group several symbols together, and
  • to apply the Huffman algorithm to the new symbol set  $\{$ $\rm A$,  $\rm B$,  $\rm C$,  $\rm D$, ... $\}$.


If  $k$–tuples are formed, the symbol set size  $M$  increases to  $M\hspace{-0.01cm}′ = M^k$.

In the following example, we will illustrate the procedure using a binary source.  Further examples can be found in Exercise 2.7, Exercise 2.7Z and Exercise 2.8 (listed at the end of this chapter).


$\text{Example 6:}$  Let a memoryless binary source $(M = 2)$ with the symbols  $\{$ $\rm X$,  $\rm Y \}$ be given:

  • Let the symbol probabilities be  $p_{\rm X} = 0.8$  and  $p_{\rm Y} = 0.2$.  Thus the source entropy is  $H = 0.722$  bit/source symbol.
  • We consider the symbol sequence   $\{\rm XXXYXXXXXXXXYYXXXXXYYXXYXYXXYX\ \text{...} \}$  with only a few  $\rm Y$ symbols at positions 4, 13, 14, ...


Applied directly to this source, the Huffman algorithm brings no gain: without further measures, one needs one bit for each binary source symbol anyway. But: if one combines two binary symbols at a time into a two-tuple  $(k = 2)$  according to   $\rm XX$   →   $\rm A$,   $\rm XY$   →   $\rm B$,   $\rm YX$   →   $\rm C$,   $\rm YY$   →   $\rm D$,  one can apply  "Huffman"  to the resulting sequence   $\rm ABAACADAABCBBAC$ ...  with the new symbol set size  $M\hspace{-0.01cm}′ = 4$.  Because of 

$$p_{\rm A}= 0.8^2 = 0.64 \hspace{0.05cm}, \hspace{0.2cm}p_{\rm B}= 0.8 \cdot 0.2 = 0.16 = p_{\rm C} \hspace{0.05cm}, \hspace{0.2cm} p_{\rm D}= 0.2^2 = 0.04,$$

we get   $\rm A$   →   1,   $\rm B$   →   00,   $\rm C$   →   011,   $\rm D$   →   010   as well as

$$L_{\rm M}\hspace{0.03cm}' = 0.64 \cdot 1 + 0.16 \cdot 2 + 0.16 \cdot 3 + 0.04 \cdot 3 =1.56\,\text{bit/two-tuple}$$
$$\Rightarrow\hspace{0.3cm}L_{\rm M} = {L_{\rm M}\hspace{0.03cm}'}/{2} = 0.78\ {\rm bit/source\hspace{0.15cm}symbol}\hspace{0.05cm}.$$

Now we form tuples of three  $(k = 3)$,  corresponding to

      $\rm XXX$   →   $\rm A$,   $\rm XXY$   →   $\rm B$,   $\rm XYX$   →   $\rm C$,   $\rm XYY$   →   $\rm D$,   $\rm YXX$   →   $\rm E$,   $\rm YXY$   →   $\rm F$,   $\rm YYX$   →   $\rm G$,   $\rm YYY$   →   $\rm H$.
  • For the input sequence given above, one arrives at the equivalent output sequence   $\rm AEBAGADBCC$ ...  (based on the new symbol set size $M\hspace{-0.01cm}′ = 8$)  and at the following probabilities:
$$p_{\rm A}= 0.8^3 = 0.512 \hspace{0.05cm}, \hspace{0.5cm}p_{\rm B}= p_{\rm C}= p_{\rm E} = 0.8^2 \cdot 0.2 = 0.128\hspace{0.05cm},\hspace{0.5cm} p_{\rm D}= p_{\rm F}= p_{\rm G} = 0.8 \cdot 0.2^2 = 0.032 \hspace{0.05cm}, \hspace{0.5cm}p_{\rm H}= 0.2^3 = 0.008\hspace{0.05cm}.$$
  • The Huffman coding is thus:
      $\rm A$   →   1,   $\rm B$   →   011,   $\rm C$   →   010,   $\rm D$   →   00011,   $\rm E$   →   001,   $\rm F$   →   00010,   $\rm G$   →   00001,   $\rm H$   →   00000.
  • Thus, for the average code word length we obtain:
$$L_{\rm M}\hspace{0.03cm}' = 0.512 \cdot 1 + 3 \cdot 0.128 \cdot 3 + (3 \cdot 0.032 + 0.008) \cdot 5 =2.184 \,{\text{bit/three-tuple} } $$
$$\Rightarrow\hspace{0.3cm}L_{\rm M} = {L_{\rm M}\hspace{0.03cm}'}/{3} = 0.728\ {\rm bit/source\hspace{0.15cm} symbol}\hspace{0.05cm}.$$
  • In this example, therefore, the source entropy  $H = 0.722$  bit/source symbol is almost reached already with  $k = 3$.
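The numbers of  $\text{Example 6}$  follow directly from the tuple probabilities of a memoryless source.  The sketch below (illustrative, reusing the  huffman_code  function from the first section) builds the tuple alphabet for  $k = 1, 2, 3$  and prints the average code word length per source symbol:

    from itertools import product
    from math import log2, prod

    p = {"X": 0.8, "Y": 0.2}

    for k in (1, 2, 3):
        # probability of a k-tuple of a memoryless source = product of its symbol probabilities
        tuples = {"".join(t): prod(p[s] for s in t) for t in product(p, repeat=k)}
        code = huffman_code(tuples)                                 # Huffman code on the tuple alphabet
        L_tuple = sum(tuples[t] * len(code[t]) for t in tuples)     # bit per k-tuple
        print(k, round(L_tuple / k, 3))                             # 1.0, 0.78, 0.728 bit/source symbol

    H = sum(ps * log2(1 / ps) for ps in p.values())
    print(round(H, 3))                                              # 0.722 bit/source symbol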


Exercises for the chapter


Exercise 2.6: About the Huffman Coding

Exercise 2.6Z: Again on the Huffman Code

Exercise 2.7: Huffman Application for Binary Two-Tuples

Exercise 2.7Z: Huffman Coding for Two-Tuples of a Ternary Source

Exercise 2.8: Huffman Application for a Markov Source

Exercise 2.9: Huffman Decoding after Errors