Difference between revisions of "Aufgaben:Exercise 2.8: Huffman Application for a Markov Source"

From LNTwww
 
(41 intermediate revisions by 4 users not shown)
Line 1: Line 1:
  
{{quiz-Header|Buchseite=Informationstheorie/Entropiecodierung nach Huffman
+
{{quiz-Header|Buchseite=Information_Theory/Entropy_Coding_According_to_Huffman
 
}}
 
}}
  
[[File:P_ID2460__Inf_A_2_8.png|right|]]
+
[[File:EN_Inf_A_2_8.png|right|frame|Binary symmetric Markov source]]
Wir betrachten hier die binäre symmetrische Markovquelle entsprechend nebenstehender Grafik, die durch den einzigen Parameter
+
We consider the symmetric Markov source according to the graph, which is completely given by the single parameter
 
:$$q = {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) =  
 
:$$q = {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) =  
{\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y})$$
+
{\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y}).$$
vollständig beschrieben wird. Die angegebenen Quellensymbolfolgen gelten für <i>q</i> = 0.2 bzw. <i>q</i>&nbsp;=&nbsp;0.8. In der Teilaufgabe (a) ist zu klären, welche Symbolfolge mit <i>q</i> = 0.2 und welche mit <i>q</i> = 0.8 generiert wurde.
 
  
Die Eigenschaften von Markovquellen werden im Kapitel 1.2 ausführlich beschrieben. Aufgrund der hier vorausgesetzten Symmetrie bezüglich der binären Symbole <b>X</b> und <b>Y</b> ergeben sich einige gravierende Vereinfachungen, wie in Aufgabe Z1.5 hergeleitet wird:
+
*The given source symbol sequences apply to the conditional probabilities &nbsp;$q = 0.2$&nbsp; and &nbsp;$q = 0.8$, respectively.
 +
*In subtask&nbsp; '''(1)'''&nbsp; it has to be clarified which symbol sequence &ndash; the red or the blue one &ndash; was generated with &nbsp;$q = 0.2$&nbsp; and which with &nbsp;$q = 0.8$&nbsp;.
  
:* Die Symbole <b>X</b> und <b>Y</b> sind gleichwahrscheinlich, das heißt. es ist <i>p</i><sub>X</sub> = <i>p</i><sub>Y</sub> = 0.5. Damit lautet die erste Entropienäherung:
 
:$$H_1 = 1\,\,{\rm bit/Quellensymbol}\hspace{0.05cm}. $$
 
  
:* Die Entropie der Markovquelle ergibt sich zu
+
The properties of Markov sources are described in detail in the chapter &nbsp;[[Information_Theory/Discrete_Sources_with_Memory|Discrete Sources with Memory]].&nbsp; Due to the symmetry assumed here with regard to the binary symbols &nbsp;$\rm X$&nbsp; and &nbsp;$\rm Y$,&nbsp; some serious simplifications result, as is derived in&nbsp; [[Aufgaben:1.5Z_Symmetrische_Markovquelle|Exercise  1.5Z]]:
:$$H = q \cdot {\rm ld}\hspace{0.15cm}\frac{1}{q} + (1-q) \cdot {\rm ld}\hspace{0.15cm}\frac{1}{1-q}  
+
* The symbols &nbsp;$\rm X$&nbsp; and &nbsp;$\rm Y$&nbsp; are equally probable, that is,&nbsp; $p_{\rm X} = p_{\rm Y}  = 0.5$ holds.&nbsp; <br>Thus the first entropy approximation is&nbsp; $H_1 = 1\,\,{\rm bit/source\hspace{0.15cm}symbol}\hspace{0.05cm}. $
= 0.722\,\,{\rm bit/Quellensymbol}
+
* The entropy of the Markov source for &nbsp;$q = 0.2$&nbsp; as well as for &nbsp;$q = 0.8$&nbsp; results in
\hspace{0.05cm}.$$
+
:$$H = q \cdot {\rm log_2}\hspace{0.15cm}\frac{1}{q} + (1-q) \cdot {\rm log_2}\hspace{0.15cm}\frac{1}{1-q}  
: Der Zahlenwert gilt nur für <i>q</i> = 0.2 sowie für <i>q</i> = 0.8.
+
= 0.722\,\,{\rm bit/source\hspace{0.15cm}symbol}\hspace{0.05cm}.$$
 +
* For Markov sources, all entropy approximations&nbsp; $H_k$&nbsp; with order&nbsp; $k \ge 2$&nbsp; are determined by&nbsp; $H_1$&nbsp;  and&nbsp; $H = H_{k \to \infty}$:
 +
:$$H_k = {1}/{k}\cdot \big [ H_1 + (k-1) \cdot H \big ] \hspace{0.05cm}.$$
 +
*The following numerical values again apply equally to &nbsp;$q = 0.2$&nbsp; and &nbsp;$q = 0.8$&nbsp;:
 +
:$$H_2 = {1}/{2}\cdot \big [ H_1 + H \big ] = 0.861\,\,{\rm bit/source\hspace{0.15cm}symbol}\hspace{0.05cm},$$
 +
:$$H_3 = {1}/{3} \cdot \big [ H_1 + 2H \big ] = 0.815\,\,{\rm bit/source\hspace{0.15cm}symbol}\hspace{0.05cm}.$$
  
:* Bei Markovquellen sind alle Entropienäherungen höherer Ordnung durch <i>H</i><sub>1</sub> und <i>H</i> bestimmt. Die folgenden Zahlenwerte gelten wieder für <i>q</i> = 0.2 und <i>q</i> = 0.8 gleichermaßen:
+
In this exercise, the Huffman algorithm is to be applied to&nbsp; $k$&ndash;tuples, where we restrict ourselves to&nbsp; $k = 2$&nbsp; and&nbsp; $k = 3$.
:$$H_2 \hspace{0.2cm} = \hspace{0.2cm} \frac{1}{2}\cdot \big [ H_1 + H \big ] = 0.861\,\,{\rm bit/Quellensymbol}\hspace{0.05cm},\\
 
H_3 \hspace{0.2cm} =  \hspace{0.2cm} \frac{1}{3} \cdot \big [ H_1 + 2H \big ] = 0.815\,\,{\rm bit/Quellensymbol}\hspace{0.05cm}.$$
 
  
Wie auf der letzten Theorieseite dieses Kapitels 2.3 soll hier der Huffman&ndash;Algorithmus auf <i>k</i>&ndash;Tupel angewandt werden, wobei wir uns auf <i>k</i> = 2 und <i>k</i> = 3 beschränken.
 
  
<b>Hinweis:</b> Die Aufgabe gehört zum Themengebiet von Kapitel 2.3. Nützliche Informationen finden Sie auch in Aufgabe A2.7 und Aufgabe Z2.7. Für die Huffman&ndash;Codierung können Sie das folgende Interaktionsmodul benutzen:
 
  
Shannon&ndash;Fano&ndash; und Huffman&ndash;Codierung
 
  
  
===Fragebogen===
+
<u>Hints:</u>
 +
*The exercise belongs to the chapter&nbsp; [[Information_Theory/Entropiecodierung_nach_Huffman|Entropy Coding according to Huffman]].
 +
*In particular, reference is made to the page&nbsp; [[Information_Theory/Entropiecodierung_nach_Huffman#Application_of_Huffman_coding_to_.7F.27.22.60UNIQ-MathJax168-QINU.60.22.27.7F.E2.80.93tuples|Application of Huffman coding to&nbsp; $k$-tuples]].
 +
*Useful information can also be found in the specification sheets for    &nbsp;[[Aufgaben:Exercise_2.7:_Huffman_Application_for_Binary_Two-Tuples|Exercise 2.7]]&nbsp; and  &nbsp;[[Aufgaben:Exercise_2.7Z:_Huffman_Coding_for_Two-Tuples_of_a_Ternary_Source|Exercise 2.7Z]].
 +
*To check your results, please refer to the (German language) SWF module&nbsp; [[Applets:Huffman_Shannon_Fano|Coding according to Huffman and Shannon/Fano]].
 +
 +
 
 +
 
 +
 
 +
===Questions===
  
 
<quiz display=simple>
 
<quiz display=simple>
{Welche der vorne angegebenen Beispielfolgen gilt für <i>q</i> = 0.8?
+
{Which of the example sequences given at the front is true for&nbsp; $q = 0.8$?
|type="[]"}
+
|type="()"}
- Quellensymbolfolge 1,
+
- The red source symbol sequence&nbsp; '''1''',
+ Quellensymbolfolge 2,
+
+ the blue source symbol sequence&nbsp; '''2'''.
  
  
{Welche der folgenden Aussagen treffen zu?
+
{Which of the following statements are true?
 
|type="[]"}
 
|type="[]"}
- Auch die direkte Anwendung von Huffman ist hier sinnvoll.
+
- The direct application of Huffman is also useful here.
+ Huffman macht bei Bildung von Zweiertupeln (<i>k</i> = 2) Sinn.
+
+ Huffman makes sense when forming two-tuples&nbsp; $(k = 2)$.
+ Huffman macht bei Bildung von Dreiertupeln (<i>k</i> = 3) Sinn.
+
+ Huffman makes sense when forming tuples of three&nbsp; $(k = 3)$.
  
  
{Wie lauten die Wahrscheinlichkeiten der Zweiertupel (<i>k</i> = 2) für <i>q</i> = 0.8?
+
{What are the probabilities of <u>two-tuples</u>&nbsp; $(k = 2)$&nbsp; for &nbsp;$\underline{q = 0.8}$?
 
|type="{}"}
 
|type="{}"}
$q = 0.8; k = 2:\ p_A = Pr(XX)$ = { 0.4 3% }
+
$p_{\rm A} = \rm Pr(XX)\ = \ $ { 0.4 3% }
$p_B = Pr(XY)$ = { 0.1 3% }
+
$p_{\rm B} = \rm Pr(XY)\ = \ $ { 0.1 3% }
$p_C = Pr(YX)$ = { 0.1 3% }
+
$p_{\rm C} = \rm Pr(YX)\ = \ $ { 0.1 3% }
$p_D = Pr(YY)$ = { 0.4 3% }
+
$p_{\rm D} = \rm Pr(YY)\ = \ $ { 0.4 3% }
  
  
{Ermitteln Sie mit dem angegebenen Flash&ndash;Modul den Huffman&ndash;Code für <i>k</i> = 2. Wie groß ist in diesem Fall die mittlere Codewortlänge?
+
{Find the Huffman code for&nbsp; $\underline{k = 2}$.&nbsp; What is the average code word length in this case?
 
|type="{}"}
 
|type="{}"}
$L_M$ = { 0.9 3% } bit/Quellensymbol
+
$L_{\rm M} \ = \ $ { 0.9 3% } $\ \rm bit/source\hspace{0.15cm}symbol$
  
  
{Welche Schranke ergibt sich für die mittlere Codewortlänge, wenn Zweiertupel gebildet werden (<i>k</i> = 2)? Interpretation.
+
{What is the bound on the average code word length when <u>two-tuples</u> are formed&nbsp; $(k = 2)$? Interpretation.
|type="[]"}
+
|type="()"}
- <i>L</i><sub>M</sub> &#8805; <i>H</i><sub>1</sub> = 1 bit/Quellensymbol,
+
- $L_{\rm M} \ge H_1 =  1.000$&nbsp; $\ \rm bit/source\hspace{0.15cm}symbol$
+ <i>L</i><sub>M</sub> &#8805; <i>H</i><sub>2</sub> &asymp; 0.861 bit/Quellensymbol,
+
+ $L_{\rm M} \ge H_2 \approx  0.861$&nbsp; $\ \rm bit/source\hspace{0.15cm}symbol$
- <i>L</i><sub>M</sub> &#8805; <i>H</i><sub>3</sub> &asymp; 0.815 bit/Quellensymbol,
+
- $L_{\rm M} \ge H_3 \approx  0.815$&nbsp; $\ \rm bit/source\hspace{0.15cm}symbol$
- <i>L</i><sub>M</sub> &#8805; <i>H</i> &asymp; 0.722 bit/Quellensymbol,
+
- $L_{\rm M} \ge H_{k \to \infty} \approx  0.722$&nbsp; $\ \rm bit/source\hspace{0.15cm}symbol$
- <i>L</i><sub>M</sub> &#8805; 0.5 bit/Quellensymbol.
+
- $L_{\rm M} \ge 0.5$&nbsp; $\ \rm bit/source\hspace{0.15cm}symbol$
  
  
{Berechnen Sie die Wahrscheinlichkeiten für <i>k</i> = 3.
+
{Calculate the probabilities of the <u>three-tuple</u>&nbsp; $(k = 3)$&nbsp; for &nbsp;$\underline{q = 0.8}$?
 
|type="{}"}
 
|type="{}"}
$q = 0.8; k = 3:\ p_A = Pr(XXX)$ = { 0.32 3% }
+
$p_{\rm A} = \rm Pr(XXX)\ = \ $ { 0.32 3% }
$p_B = Pr(XXY)$ = { 0.08 3% }
+
$p_{\rm B} = \rm Pr(XXY)\ = \ $ { 0.08 3% }
$p_C = Pr(XYX)$ = { 0.02 3% }
+
$p_{\rm C} = \rm Pr(XYX)\ = \ $ { 0.02 3% }
$p_D = Pr(XYY)$ = { 0.08 3% }
+
$p_{\rm D} = \rm Pr(XYY)\ = \ $ { 0.08 3% }
$p_E = Pr(YXX)$ = { 0.08 3% }
+
$p_{\rm E} = \rm Pr(YXX)\ = \ $ { 0.08 3% }
$p_F = Pr(YXY)$ = { 0.02 3% }
+
$p_{\rm F} = \rm Pr(YXY)\ = \ $ { 0.02 3% }
$p_G = Pr(YYX)$ = { 0.08 3% }
+
$p_{\rm G} = \rm Pr(YYX)\ = \ $ { 0.08 3% }
$p_H = Pr(YYY)$ = { 0.32 3% }
+
$p_{\rm H} = \rm Pr(YYY)\ = \ $ { 0.32 3% }
  
  
{Ermitteln Sie mit dem genannten Flash&ndash;Modul den Huffman&ndash;Code für <i>k</i> = 3. Wie groß ist in diesem Fall die mittlere Codewortlänge?
+
{Find the Huffman code for $\underline{k = 3}$.&nbsp; What is the average code word length in this case?
 
|type="{}"}
 
|type="{}"}
$L_M$ = { 0.84 3% } bit/Quellensymbol
+
$L_{\rm M} \ = \ $ { 0.84 3% } $\ \rm bit/source\hspace{0.15cm}symbol$
  
  
Line 89: Line 96:
 
</quiz>
 
</quiz>
  
===Musterlösung===
+
===Solution===
 
{{ML-Kopf}}
 
{{ML-Kopf}}
<b>1.</b>&nbsp;&nbsp;Bei der Quellensymbolfolge 2 erkennt man sehr viel weniger Symbolwechsel als in der roten Folge. Die blaue Symbolfolge 2 wurde mit dem Parameter
+
'''(1)'''&nbsp; Correct is the  <u>solution suggestion 2</u>:
:$$q = {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) =  
+
*In the blue source symbol sequence&nbsp; '''2'''&nbsp; one recognizes much less symbol changes than in the red sequence.
{\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y}) = 0.8$$
+
*The symbol sequence&nbsp; '''2'''&nbsp; was generated with the parameter&nbsp; $q = {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) =  
erzeugt und die rote Symbolfolge 1 mit <i>q</i> = 0.2 &nbsp;&nbsp;&#8658;&nbsp;&nbsp; Richtig ist der <u>Lösungsvorschlag 2</u>.
+
{\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y}) = 0.8$&nbsp;  and the red symbol sequence&nbsp; '''1'''&nbsp; with&nbsp; $q = 0.2$.
 +
 +
 
 +
 
 +
'''(2)'''&nbsp;<u>Answers 2 and 3</u> are correct:
 +
*Since here the source symbols&nbsp; $\rm X$&nbsp; and&nbsp; $\rm Y$&nbsp; were assumed to be equally probable, the direct application of Huffman makes no sense.
 +
*In contrast, one can exploit the inner statistical dependencies of the Markov source for data compression by forming&nbsp; $k$&ndash;tuples &nbsp; $(k &#8805; 2)$.
 +
*The larger&nbsp; $k$&nbsp; is, the more the average code word length&nbsp; $L_{\rm M}$&nbsp; approaches the entropy&nbsp; $H$.
 +
 
 +
 
 +
 
 +
'''(3)'''&nbsp; The symbol probabilities are&nbsp; $p_{\rm X} = p_{\rm Y}  = 0.5$, which gives us for the two-tuples:&nbsp;
 +
:$$p_{\rm A} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XX}) = p_{\rm X} \cdot {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) = 0.5 \cdot q = 0.5 \cdot 0.8 \hspace{0.15cm}\underline{  = 0.4}  \hspace{0.05cm},$$
 +
:$$p_{\rm B} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XY}) = p_{\rm X} \cdot {\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) = 0.5 \cdot (1-q)= 0.5 \cdot 0.2 \hspace{0.15cm}\underline{  = 0.1}  \hspace{0.05cm},$$
 +
:$$p_{\rm C} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm YX}) = p_{\rm Y} \cdot {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y}) = 0.5 \cdot (1-q)= 0.5 \cdot 0.2 \hspace{0.15cm}\underline{  = 0.1}  \hspace{0.05cm},$$
 +
:$$p_{\rm D} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm YY}) = p_{\rm Y} \cdot {\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y}) = 0.5 \cdot q = 0.5 \cdot 0.8\hspace{0.15cm}\underline{  = 0.4}  \hspace{0.05cm}.$$
 +
 
 +
 
 +
[[File:P_ID2462__Inf_A_2_8d.png|right|frame|For Huffman coding for $k = 2$]]
 +
'''(4)'''&nbsp; Opposite screen capture of the (earlier) SWF applet&nbsp; [[Applets:Huffman_Shannon_Fano|Coding according to Huffman and Shannon/Fano]]&nbsp; shows the construction of the Huffman code for&nbsp; $k = 2$&nbsp; with the probabilities just calculated.
 +
*Thus, the average code word length is:
 +
:$$L_{\rm M}\hspace{0.01cm}' = 0.4 \cdot 1 + 0.4 \cdot 2 + (0.1 + 0.1) \cdot 3 = 1.8\,\,\text { bit/two-tuple}$$
 +
:$$\Rightarrow\hspace{0.3cm}L_{\rm M} = {L_{\rm M}\hspace{0.01cm}'}/{2}\hspace{0.15cm}\underline{  = 0.9\,\text{ bit/source symbol}}\hspace{0.05cm}.$$
 +
 
  
<b>2.</b>&nbsp;&nbsp;Da hier die Quellensymbole <b>X</b> und <b>Y</b> gleichwahrscheinlich angenommen wurden, macht die direkte Anwendung von Huffman keinen Sinn. Dagegen kann man die inneren statistischen Bindungen  der Markovquelle zur Datenkomprimierung nutzen, wenn man <i>k</i>&ndash;Tupel bildet (<i>k</i> &#8805; 2). Richtig sind demnach die <u>Antworten 2 und 3</u>. Je größer <i>k</i> ist, desto mehr nähert sich die Codewortlänge <i>L</i><sub>M</sub> der Entropie <i>H</i>.
+
'''(5)'''&nbsp; Correct is the <u>suggested solution 2</u>:
 +
*According to the source coding theorem&nbsp; $L_{\rm M} &#8805; H$ holds.
 +
*However, if we apply Huffman coding and disregard ties between non-adjacent symbols&nbsp; $(k = 2)$, the lower bound of the code word length is not&nbsp; $H = 0.722$, but&nbsp; $H_2 = 0.861$&nbsp; (the addition&nbsp; "bit/source symbol"&nbsp; is omitted for the rest of the task).
 +
*The result of subtask&nbsp; '''(4)'''&nbsp; was&nbsp; $L_{\rm M} = 0.9.$
 +
*If an asymmetrical Markov chain were present and in such a way that for the probabilities&nbsp; $p_{\rm A}$, ... , $p_{\rm D}$&nbsp; the values&nbsp; $50\%$,&nbsp; $25\%$&nbsp; and twice&nbsp; $12.5\%$&nbsp; would result, then one would come to the average code word length&nbsp; $L_{\rm M}  = 0.875$.
 +
*How the exact parameters of this asymmetrical Markov source look, however, is not known even to the task creator (G. Söder).
 +
*Nor how the value&nbsp; $0.875$&nbsp; could be reduced to&nbsp; $0.861$.&nbsp; In any case, the Huffman algorithm is unsuitable for this.
  
<b>3.</b>&nbsp;&nbsp;Die Symbolwahrscheinlichkeiten <i>p</i><sub>X</sub> und <i>p</i><sub>Y</sub> sind jeweils 0.5. Damit erhält man für die Zweiertupel:
 
:$$p_{\rm A} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XX}) = p_{\rm X} \cdot {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) = 0.5 \cdot q = 0.5 \cdot 0.8 \hspace{0.15cm}\underline{  = 0.4}  \hspace{0.05cm},\\
 
p_{\rm B} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XY}) = p_{\rm X} \cdot {\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) = 0.5 \cdot (1-q)= 0.5 \cdot 0.2 \hspace{0.15cm}\underline{  = 0.1}  \hspace{0.05cm},\\
 
p_{\rm C} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm YX}) = p_{\rm Y} \cdot {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y}) = 0.5 \cdot (1-q)= 0.5 \cdot 0.2 \hspace{0.15cm}\underline{  = 0.1}  \hspace{0.05cm},\\
 
p_{\rm D} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm YY}) = p_{\rm Y} \cdot {\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y}) = 0.5 \cdot q = 0.5 \cdot 0.8\hspace{0.15cm}\underline{  = 0.4}  \hspace{0.05cm}.$$
 
[[File:P_ID2462__Inf_A_2_8d.png|right|]]
 
  
<b>4.</b>&nbsp;&nbsp;Nebenstehender Bildschirmabzug des Programms Shannon&ndash;Fano&ndash; und Huffman&ndash;Codierung zeigt die Konstruktion des Huffman&ndash;Codes für <i>k</i> = 2 mit den soeben berechneten Wahrscheinlichkeiten. Damit gilt für die mittlere Codewortlänge:
+
'''(6)'''&nbsp; With&nbsp; $q = 0.8$&nbsp; and&nbsp; $1 - q = 0.2$&nbsp; we get:
:$$L_{\rm M}' \hspace{0.2cm} =  \hspace{0.2cm} 0.4 \cdot 1 + 0.4 \cdot 2 + (0.1 + 0.1) \cdot 3 = \\  
+
:$$p_{\rm A} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XXX})  = 0.5 \cdot q^2 \hspace{0.15cm}\underline{  = 0.32} = p_{\rm H} = {\rm Pr}(\boldsymbol{\rm YYY})\hspace{0.05cm},$$
\hspace{0.2cm} =  1.8\,\,{\rm bit/Zweiertupel}$$
+
:$$p_{\rm B} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XXY}) = 0.5 \cdot q \cdot (1-q) \hspace{0.15cm}\underline{  = 0.08}= p_{\rm G} = {\rm Pr}(\boldsymbol{\rm YYX}) \hspace{0.05cm},$$
:$$\Rightarrow\hspace{0.3cm}L_{\rm M} = \frac{L_{\rm M}'}{2}\hspace{0.15cm}\underline{  = 0.9\,{\rm bit/Quellensymbol}}\hspace{0.05cm}.$$
+
:$$p_{\rm C} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XYX}) = 0.5 \cdot (1-q)^2\hspace{0.15cm}\underline{  = 0.02} = p_{\rm F}= {\rm Pr}(\boldsymbol{\rm YXY}) \hspace{0.05cm},$$
 +
:$$p_{\rm D} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XYY}) = 0.5 \cdot (1-q) \cdot q \hspace{0.15cm}\underline{  = 0.08} = p_{\rm E}  = {\rm Pr}(\boldsymbol{\rm YXX})\hspace{0.05cm}.$$
  
<br><br><b>5.</b>&nbsp;&nbsp;Nach dem Quellencodierungstheorem gilt <i>L</i><sub>M</sub> &#8805; <i>H</i>. Wendet man aber Huffman&ndash;Codierung an und lässt dabei Bindungen zwischen nicht benachbarten Symbolen außer Betracht (<i>k</i> = 2), so gilt als unterste Grenze der Codewortlänge nicht <i>H</i> = 0.722, sondern <i>H</i><sub>2</sub> = 0.861 (auf den Zusatz bit/Quellensymbol wird für den Rest der Aufgabe verzichtet) &nbsp;&#8658;&nbsp;<u>Lösungsvorschlag 2</u>.
 
  
Das Ergebnis der Teilaufgabe 4) war <i>L</i><sub>M</sub> = 0.9. Würde eine unsymmetrische Markovkette vorliegen und zwar derart, dass sich für die Wahrscheinlichkeiten <i>p</i><sub>A</sub>, ... , <i>p</i><sub>D</Sub> die Werte 50%, 25% und zweimal 12.5% ergeben würden, so käme man auf die mittlere Codewortlänge <i>L</i><sub>M</sub> = 0.875. Wie die genauen Parameter dieser unsymmetrischen Markovquelle aussehen, weiß ich (G. Söder) nicht. Auch nicht, wie sich der Wert 0.875 auf 0.861 senken ließe. Der Huffman&ndash;Algorithmus ist hierfür jedenfalls ungeeignet.
+
[[File:P_ID2463__Inf_A_2_8g.png|right|frame|On the Huffman coding for&nbsp; $k = 3$]]
 +
'''(7)'''&nbsp; The screen capture of the (earlier) SWF applet&nbsp; [[Applets:Huffman_Shannon_Fano|Coding according to Huffman and Shannon/Fano]]&nbsp; illustrates the construction of the Huffman code for&nbsp; $k = 3$.&nbsp;
  
<b>6.</b>&nbsp;&nbsp;Mit <i>q</i> = 0.8 und 1 &ndash; <i>q</i> = 0.2 erhält man:
+
This gives us for the average code word length:
:$$p_{\rm A} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XXX})  = 0.5 \cdot q^2 \hspace{0.15cm}\underline{  = 0.32} = p_{\rm H} = {\rm Pr}(\boldsymbol{\rm YYY})\hspace{0.05cm},\\
+
:$$L_{\rm M}\hspace{0.01cm}' =  0.64 \cdot 2 + 0.24 \cdot 3 + 0.08 \cdot 4 + 0.04 \cdot 5 = 2.52\,\,\text{ bit/three-tuple}$$
p_{\rm B} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XXY}) = 0.5 \cdot q \cdot (1-q) \hspace{0.15cm}\underline{  = 0.08}= p_{\rm G} = {\rm Pr}(\boldsymbol{\rm YYX}) \hspace{0.05cm},\\
+
:$$\Rightarrow\hspace{0.3cm}L_{\rm M} = {L_{\rm M}\hspace{0.01cm}'}/{3}\hspace{0.15cm}\underline{  = 0.84\,{\rm bit/source\:symbol}}\hspace{0.05cm}.$$
p_{\rm C} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XYX}) = 0.5 \cdot (1-q)^2\hspace{0.15cm}\underline{  = 0.02} = p_{\rm F}= {\rm Pr}(\boldsymbol{\rm YXY}) \hspace{0.05cm},\\
 
p_{\rm D} \hspace{0.2cm} =  \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XYY}) = 0.5 \cdot (1-q) \cdot q \hspace{0.15cm}\underline{  = 0.08} = p_{\rm E}  = {\rm Pr}(\boldsymbol{\rm YXX})\hspace{0.05cm}.$$
 
  
<b>7.</b>&nbsp;&nbsp;Der Bildschirmabzug des Flash&ndash;Moduls verdeutlicht die Konstellation des Huffman&ndash;Codes für <i>k</i> = 3. Damit erhält man für die mittlere Codewortlänge:
+
*One can see the improvement over subtask&nbsp; '''(4)'''.
:$$L_{\rm M}' =  0.64 \cdot 2 + 0.24 \cdot 3 + 0.04 \cdot 5 =  2.52\,\,{\rm bit/Dreiertupel}$$
+
*The bound&nbsp; $H_2 = 0.861$&nbsp; valid for&nbsp; $k = 2$&nbsp; is now undercut by the average code word length&nbsp; $L_{\rm M}$.
:$$\Rightarrow\hspace{0.3cm}L_{\rm M} = {L_{\rm M}'}/{3}\hspace{0.15cm}\underline{ = 0.84\,{\rm bit/Quellensymbol}}\hspace{0.05cm}.$$
+
*The new bound for&nbsp; $k = 3$&nbsp; is&nbsp; $H_3 = 0.815$.  
[[File:P_ID2463__Inf_A_2_8g.png|center|]]
+
*However, to reach the source entropy&nbsp; $H = 0.722$&nbsp;&nbsp; (or better:&nbsp; to come close to this final value up to an&nbsp; $&epsilon;$&nbsp;), one would have to form infinitely long tuples&nbsp; $(k &#8594; &#8734;)$.
Man erkennt die Verbesserung gegenüber (4). Die für <i>k</i> = 2 gültige informationstheoretische Schranke <i>H</i><sub>2</sub> = 0.861 wird nun unterschritten (<i>L</i><sub>M</sub>). Die neue Schranke für <i>k</i> = 3 ist  <i>H</i><sub>3</sub> = 0.815. Um die Quellenentropie <i>H</i>&nbsp;=&nbsp;0.722 zu erreichen (besser gesagt: diesem Endwert bis auf ein &epsilon; näher zu kommen), müsste man allerdings unendlich lange Tupel bilden (<i>k</i> &#8594; &#8734;).
 
  
 
{{ML-Fuß}}
 
{{ML-Fuß}}
Line 130: Line 158:
  
  
[[Category:Aufgaben zu Informationstheorie|^2.3 Entropiecodierung nach Huffman^]]
+
[[Category:Information Theory: Exercises|^2.3 Entropy Coding according to Huffman^]]

Latest revision as of 17:04, 1 November 2022

Binary symmetric Markov source

We consider the binary symmetric Markov source shown in the graph, which is completely described by the single parameter

$$q = {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) = {\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y}).$$
  • The given source symbol sequences apply to the conditional probabilities  $q = 0.2$  and  $q = 0.8$, respectively.
  • In subtask  (1)  you are to determine which symbol sequence – the red or the blue one – was generated with  $q = 0.2$  and which with  $q = 0.8$.


The properties of Markov sources are described in detail in the chapter  Discrete Sources with Memory.  Due to the symmetry assumed here with regard to the binary symbols  $\rm X$  and  $\rm Y$,  some significant simplifications result, as derived in  Exercise 1.5Z:

  • The symbols  $\rm X$  and  $\rm Y$  are equally probable, that is,  $p_{\rm X} = p_{\rm Y} = 0.5$ holds. 
    Thus the first entropy approximation is  $H_1 = 1\,\,{\rm bit/source\hspace{0.15cm}symbol}\hspace{0.05cm}. $
  • The entropy of the Markov source for  $q = 0.2$  as well as for  $q = 0.8$  results in
$$H = q \cdot {\rm log_2}\hspace{0.15cm}\frac{1}{q} + (1-q) \cdot {\rm log_2}\hspace{0.15cm}\frac{1}{1-q} = 0.722\,\,{\rm bit/source\hspace{0.15cm}symbol}\hspace{0.05cm}.$$
  • For Markov sources, all entropy approximations  $H_k$  with order  $k \ge 2$  are determined by  $H_1$  and  $H = H_{k \to \infty}$:
$$H_k = {1}/{k}\cdot \big [ H_1 + (k-1) \cdot H \big ] \hspace{0.05cm}.$$
  • The following numerical values again apply equally to  $q = 0.2$  and  $q = 0.8$:
$$H_2 = {1}/{2}\cdot \big [ H_1 + H \big ] = 0.861\,\,{\rm bit/source\hspace{0.15cm}symbol}\hspace{0.05cm},$$
$$H_3 = {1}/{3} \cdot \big [ H_1 + 2H \big ] = 0.815\,\,{\rm bit/source\hspace{0.15cm}symbol}\hspace{0.05cm}.$$
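
The numerical values above can be checked with a few lines of Python. This is only a minimal sketch: the function names `markov_entropy` and `entropy_approximation` are chosen here for illustration, and the general form $H_k = [H_1 + (k-1) \cdot H]/k$ is the one implied by the expressions given for $H_2$ and $H_3$.

```python
from math import log2


def markov_entropy(q: float) -> float:
    """Entropy H of the binary symmetric Markov source in bit/source symbol."""
    return q * log2(1 / q) + (1 - q) * log2(1 / (1 - q))


def entropy_approximation(k: int, q: float, h1: float = 1.0) -> float:
    """k-th order entropy approximation H_k = [H_1 + (k - 1) * H] / k."""
    return (h1 + (k - 1) * markov_entropy(q)) / k


for q in (0.2, 0.8):
    approximations = [round(entropy_approximation(k, q), 3) for k in (1, 2, 3)]
    print(f"q = {q}: H = {markov_entropy(q):.3f}, H_1..H_3 = {approximations}")
```

Because the entropy expression is symmetric in $q$ and $1-q$, both parameter values yield the same results.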

In this exercise, the Huffman algorithm is to be applied to  $k$–tuples, where we restrict ourselves to  $k = 2$  and  $k = 3$.



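Hints:

  • The exercise belongs to the chapter  Entropy Coding according to Huffman.
  • In particular, reference is made to the page  Application of Huffman coding to  $k$-tuples.
  • Useful information can also be found in the specification sheets for  Exercise 2.7  and  Exercise 2.7Z.
  • To check your results, please refer to the (German language) SWF module  Coding according to Huffman and Shannon/Fano.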



Questions

1

Which of the example sequences shown above was generated with  $q = 0.8$?

The red source symbol sequence  1,
the blue source symbol sequence  2.

2

Which of the following statements are true?

The direct application of Huffman is also useful here.
Huffman makes sense when forming two-tuples  $(k = 2)$.
Huffman makes sense when forming three-tuples  $(k = 3)$.

3

What are the probabilities of the two-tuples  $(k = 2)$  for  $\underline{q = 0.8}$?

$p_{\rm A} = \rm Pr(XX)\ = \ $

$p_{\rm B} = \rm Pr(XY)\ = \ $

$p_{\rm C} = \rm Pr(YX)\ = \ $

$p_{\rm D} = \rm Pr(YY)\ = \ $

4

Find the Huffman code for  $\underline{k = 2}$.  What is the average code word length in this case?

$L_{\rm M} \ = \ $

$\ \rm bit/source\hspace{0.15cm}symbol$

5

What is the bound on the average code word length when two-tuples are formed  $(k = 2)$? Interpretation.

$L_{\rm M} \ge H_1 = 1.000$  $\ \rm bit/source\hspace{0.15cm}symbol$
$L_{\rm M} \ge H_2 \approx 0.861$  $\ \rm bit/source\hspace{0.15cm}symbol$
$L_{\rm M} \ge H_3 \approx 0.815$  $\ \rm bit/source\hspace{0.15cm}symbol$
$L_{\rm M} \ge H_{k \to \infty} \approx 0.722$  $\ \rm bit/source\hspace{0.15cm}symbol$
$L_{\rm M} \ge 0.5$  $\ \rm bit/source\hspace{0.15cm}symbol$

6

Calculate the probabilities of the three-tuples  $(k = 3)$  for  $\underline{q = 0.8}$.

$p_{\rm A} = \rm Pr(XXX)\ = \ $

$p_{\rm B} = \rm Pr(XXY)\ = \ $

$p_{\rm C} = \rm Pr(XYX)\ = \ $

$p_{\rm D} = \rm Pr(XYY)\ = \ $

$p_{\rm E} = \rm Pr(YXX)\ = \ $

$p_{\rm F} = \rm Pr(YXY)\ = \ $

$p_{\rm G} = \rm Pr(YYX)\ = \ $

$p_{\rm H} = \rm Pr(YYY)\ = \ $

7

Find the Huffman code for $\underline{k = 3}$.  What is the average code word length in this case?

$L_{\rm M} \ = \ $

$\ \rm bit/source\hspace{0.15cm}symbol$


Solution

(1)  Suggested solution 2 is correct:

  • The blue source symbol sequence  2  shows far fewer symbol changes than the red sequence.
  • The blue symbol sequence  2  was therefore generated with the parameter  $q = {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) = {\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y}) = 0.8$  and the red symbol sequence  1  with  $q = 0.2$.
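
The link between $q$ and the number of symbol changes can be illustrated by simulation. The following sketch is an illustration only and does not reproduce the red and blue sequences of the graph. The function names, the sequence length and the fixed seed are assumptions made here.

```python
import random


def generate_sequence(q: float, length: int, seed: int = 0) -> str:
    """Draw a sequence from the binary symmetric Markov source with parameter q."""
    rng = random.Random(seed)
    symbols = [rng.choice("XY")]  # both symbols are equally probable at the start
    for _ in range(length - 1):
        previous = symbols[-1]
        if rng.random() < q:
            symbols.append(previous)  # symbol is repeated with probability q
        else:
            symbols.append("Y" if previous == "X" else "X")  # symbol change
    return "".join(symbols)


def change_rate(sequence: str) -> float:
    """Relative frequency of symbol changes between neighbouring positions."""
    changes = sum(a != b for a, b in zip(sequence, sequence[1:]))
    return changes / (len(sequence) - 1)


for q in (0.2, 0.8):
    sequence = generate_sequence(q, 10_000)
    print(f"q = {q}: {sequence[:40]} ... change rate = {change_rate(sequence):.3f}")
```

For long sequences the change rate should settle near $1 - q$, that is, near $0.2$ for $q = 0.8$ and near $0.8$ for $q = 0.2$.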


(2) Answers 2 and 3 are correct:

  • Since here the source symbols  $\rm X$  and  $\rm Y$  were assumed to be equally probable, the direct application of Huffman makes no sense.
  • In contrast, one can exploit the inner statistical dependencies of the Markov source for data compression by forming  $k$–tuples   $(k ≥ 2)$.
  • The larger  $k$  is, the more the average code word length  $L_{\rm M}$  approaches the entropy  $H$.


(3)  The symbol probabilities are  $p_{\rm X} = p_{\rm Y} = 0.5$, which gives us for the two-tuples: 

$$p_{\rm A} \hspace{0.2cm} = \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XX}) = p_{\rm X} \cdot {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) = 0.5 \cdot q = 0.5 \cdot 0.8 \hspace{0.15cm}\underline{ = 0.4} \hspace{0.05cm},$$
$$p_{\rm B} \hspace{0.2cm} = \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XY}) = p_{\rm X} \cdot {\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm X}) = 0.5 \cdot (1-q)= 0.5 \cdot 0.2 \hspace{0.15cm}\underline{ = 0.1} \hspace{0.05cm},$$
$$p_{\rm C} \hspace{0.2cm} = \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm YX}) = p_{\rm Y} \cdot {\rm Pr}(\boldsymbol{\rm X}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y}) = 0.5 \cdot (1-q)= 0.5 \cdot 0.2 \hspace{0.15cm}\underline{ = 0.1} \hspace{0.05cm},$$
$$p_{\rm D} \hspace{0.2cm} = \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm YY}) = p_{\rm Y} \cdot {\rm Pr}(\boldsymbol{\rm Y}\hspace{0.05cm}|\hspace{0.05cm}\boldsymbol{\rm Y}) = 0.5 \cdot q = 0.5 \cdot 0.8\hspace{0.15cm}\underline{ = 0.4} \hspace{0.05cm}.$$


On the Huffman coding for  $k = 2$

(4)  The adjacent screen capture of the (earlier) SWF applet  Coding according to Huffman and Shannon/Fano  shows the construction of the Huffman code for  $k = 2$  with the probabilities just calculated.

  • Thus, the average code word length is:
$$L_{\rm M}\hspace{0.01cm}' = 0.4 \cdot 1 + 0.4 \cdot 2 + (0.1 + 0.1) \cdot 3 = 1.8\,\,\text { bit/two-tuple}$$
$$\Rightarrow\hspace{0.3cm}L_{\rm M} = {L_{\rm M}\hspace{0.01cm}'}/{2}\hspace{0.15cm}\underline{ = 0.9\,\text{ bit/source symbol}}\hspace{0.05cm}.$$
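
Without the applet, the code word lengths can also be obtained with a small priority-queue implementation of the Huffman algorithm. The sketch below is one possible implementation, not the applet's code. The function name `huffman_lengths` is an assumption made here, and ties are broken by insertion order, so which of the two tuples with probability $0.4$ receives the one-bit code word is arbitrary.

```python
import heapq
from itertools import count


def huffman_lengths(probabilities: dict[str, float]) -> dict[str, int]:
    """Return the Huffman code word length of every symbol."""
    tie_breaker = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tie_breaker), [name]) for name, p in probabilities.items()]
    heapq.heapify(heap)
    lengths = {name: 0 for name in probabilities}
    while len(heap) > 1:
        p1, _, names1 = heapq.heappop(heap)
        p2, _, names2 = heapq.heappop(heap)
        for name in names1 + names2:
            lengths[name] += 1  # every merge adds one bit to the symbols below it
        heapq.heappush(heap, (p1 + p2, next(tie_breaker), names1 + names2))
    return lengths


two_tuples = {"XX": 0.4, "XY": 0.1, "YX": 0.1, "YY": 0.4}
lengths = huffman_lengths(two_tuples)
bits_per_tuple = sum(two_tuples[name] * lengths[name] for name in two_tuples)
print(lengths, f"L_M = {bits_per_tuple / 2:.3f} bit/source symbol")
```

The resulting lengths are 1, 2, 3 and 3 bit, which gives $1.8$ bit per two-tuple and thus $0.9$ bit per source symbol.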


(5)  Suggested solution 2 is correct:

  • According to the source coding theorem,  $L_{\rm M} ≥ H$  holds.
  • However, if we apply Huffman coding to two-tuples  $(k = 2)$  and thereby disregard the statistical dependencies between non-adjacent symbols, the lower bound of the average code word length is not  $H = 0.722$, but  $H_2 = 0.861$  (the addition  "bit/source symbol"  is omitted for the rest of the task).
  • The result of subtask  (4)  was  $L_{\rm M} = 0.9$.
  • If an asymmetrical Markov chain were present such that the probabilities  $p_{\rm A}$, ... , $p_{\rm D}$  took the values  $50\%$,  $25\%$  and twice  $12.5\%$, the average code word length would be  $L_{\rm M} = 0.875$.
  • However, the exact parameters of such an asymmetrical Markov source are not known even to the task creator (G. Söder).
  • Nor is it known how the value  $0.875$  could be reduced to  $0.861$.  In any case, the Huffman algorithm is unsuitable for this.
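
As a brief check of the value  $0.875$  for the hypothetical tuple probabilities  $50\%$,  $25\%$  and twice  $12.5\%$: a Huffman code assigns the code word lengths 1, 2, 3 and 3 bit to these probabilities, so that

$$L_{\rm M}\hspace{0.01cm}' = 0.5 \cdot 1 + 0.25 \cdot 2 + (0.125 + 0.125) \cdot 3 = 1.75\,\,\text{ bit/two-tuple} \hspace{0.3cm}\Rightarrow\hspace{0.3cm} L_{\rm M} = {L_{\rm M}\hspace{0.01cm}'}/{2} = 0.875\,\text{ bit/source symbol}\hspace{0.05cm}.$$

Since all four probabilities are powers of  $1/2$, each code word length equals  ${\rm log_2}\hspace{0.1cm}(1/p)$. The value  $1.75$  bit is therefore also the entropy of this two-tuple distribution.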


(6)  With  $q = 0.8$  and  $1 - q = 0.2$  we get:

$$p_{\rm A} \hspace{0.2cm} = \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XXX}) = 0.5 \cdot q^2 \hspace{0.15cm}\underline{ = 0.32} = p_{\rm H} = {\rm Pr}(\boldsymbol{\rm YYY})\hspace{0.05cm},$$
$$p_{\rm B} \hspace{0.2cm} = \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XXY}) = 0.5 \cdot q \cdot (1-q) \hspace{0.15cm}\underline{ = 0.08}= p_{\rm G} = {\rm Pr}(\boldsymbol{\rm YYX}) \hspace{0.05cm},$$
$$p_{\rm C} \hspace{0.2cm} = \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XYX}) = 0.5 \cdot (1-q)^2\hspace{0.15cm}\underline{ = 0.02} = p_{\rm F}= {\rm Pr}(\boldsymbol{\rm YXY}) \hspace{0.05cm},$$
$$p_{\rm D} \hspace{0.2cm} = \hspace{0.2cm} {\rm Pr}(\boldsymbol{\rm XYY}) = 0.5 \cdot (1-q) \cdot q \hspace{0.15cm}\underline{ = 0.08} = p_{\rm E} = {\rm Pr}(\boldsymbol{\rm YXX})\hspace{0.05cm}.$$
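
The same calculation can be written for arbitrary tuple length  $k$: the first symbol has probability  $0.5$, and each further symbol contributes a factor  $q$  (symbol repeated) or  $1-q$  (symbol change). The following sketch uses the function name `tuple_probabilities`, which is an assumption made here.

```python
from itertools import product


def tuple_probabilities(k: int, q: float) -> dict[str, float]:
    """Probabilities of all k-tuples of the binary symmetric Markov source."""
    probabilities = {}
    for symbols in product("XY", repeat=k):
        p = 0.5  # probability of the first symbol
        for previous, current in zip(symbols, symbols[1:]):
            p *= q if current == previous else 1 - q  # transition probability
        probabilities["".join(symbols)] = p
    return probabilities


for name, p in tuple_probabilities(3, 0.8).items():
    print(name, round(p, 4))
```

Calling `tuple_probabilities(2, 0.8)` gives the two-tuple probabilities of subtask  (3)  in the same way.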


On the Huffman coding for  $k = 3$

(7)  The screen capture of the (earlier) SWF applet  Coding according to Huffman and Shannon/Fano  illustrates the construction of the Huffman code for  $k = 3$. 

This gives us for the average code word length:

$$L_{\rm M}\hspace{0.01cm}' = 0.64 \cdot 2 + 0.24 \cdot 3 + 0.08 \cdot 4 + 0.04 \cdot 5 = 2.52\,\,\text{ bit/three-tuple}$$
$$\Rightarrow\hspace{0.3cm}L_{\rm M} = {L_{\rm M}\hspace{0.01cm}'}/{3}\hspace{0.15cm}\underline{ = 0.84\,{\rm bit/source\:symbol}}\hspace{0.05cm}.$$
  • One can see the improvement over subtask  (4).
  • The bound  $H_2 = 0.861$  valid for  $k = 2$  is now undercut by the average code word length  $L_{\rm M}$.
  • The new bound for  $k = 3$  is  $H_3 = 0.815$.
  • However, to reach the source entropy  $H = 0.722$   (or better:  to come close to this final value up to an  $ε$ ), one would have to form infinitely long tuples  $(k → ∞)$.
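
To see how the average code word length compares with the bound  $H_k$  for tuple lengths beyond  $k = 3$, the following sketch may help. It relies on the fact that the average Huffman code word length equals the sum of the probabilities of all merged nodes. The function names and the range of  $k$  are assumptions made here for illustration.

```python
import heapq
from itertools import product
from math import log2


def huffman_bits_per_symbol(k: int, q: float) -> float:
    """Average Huffman code word length per source symbol when coding k-tuples."""
    weights = []
    for symbols in product("XY", repeat=k):
        changes = sum(a != b for a, b in zip(symbols, symbols[1:]))
        weights.append(0.5 * q ** (k - 1 - changes) * (1 - q) ** changes)
    heapq.heapify(weights)
    bits_per_tuple = 0.0
    while len(weights) > 1:
        merged = heapq.heappop(weights) + heapq.heappop(weights)
        bits_per_tuple += merged  # each merge costs one bit for all tuples below it
        heapq.heappush(weights, merged)
    return bits_per_tuple / k


def entropy_approximation(k: int, q: float) -> float:
    """Bound H_k = [H_1 + (k - 1) * H] / k with H_1 = 1 bit/source symbol."""
    h = q * log2(1 / q) + (1 - q) * log2(1 / (1 - q))
    return (1.0 + (k - 1) * h) / k


for k in range(1, 7):
    l_m = huffman_bits_per_symbol(k, 0.8)
    print(f"k = {k}: L_M = {l_m:.3f}, H_k = {entropy_approximation(k, 0.8):.3f}")
```

For  $k = 2$  and  $k = 3$  this reproduces  $L_{\rm M} = 0.9$  and  $L_{\rm M} = 0.84$. For every  $k$,  $L_{\rm M}$  stays at or above the respective bound  $H_k$.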