Exercise 2.7Z: Huffman Coding for Two-Tuples of a Ternary Source

From LNTwww
 
{{quiz-Header|Buchseite=Information_Theory/Entropy_Coding_According_to_Huffman
}}

[[File:P_ID2458__Inf_Z_2_7.png|right|frame|Huffman tree for <br>a ternary source]]
We consider the same situation as in&nbsp; [[Aufgaben:Exercise_2.7:_Huffman_Application_for_Binary_Two-Tuples|Exercise 2.7]]: &nbsp;
*The Huffman algorithm leads to a better result, i.e. to a smaller average code word length&nbsp; $L_{\rm M}$, if one does not apply it to individual symbols but forms&nbsp; $k$&ndash;tuples beforehand.
*This increases the symbol set size from&nbsp; $M$&nbsp; to&nbsp; $M\hspace{0.03cm}' = M^k$.
  
For the message source considered here, the following applies:
* Symbol set size: &nbsp; $M = 3$,
* Symbol set: &nbsp; $\{$ $\rm X$,&nbsp; $\rm Y$,&nbsp; $\rm Z$ $\}$,
* Probabilities: &nbsp;  $p_{\rm X} = 0.7$,&nbsp; $p_{\rm Y} = 0.2$,&nbsp; $p_{\rm Z} = 0.1$,
* Entropy: &nbsp; $H = 1.157 \ \rm  bit/ternary\hspace{0.12cm}symbol$.


The graph shows the Huffman tree when the Huffman algorithm is applied to single symbols&nbsp; $(k= 1)$. <br>In subtask&nbsp; '''(2)'''&nbsp; you are to give the corresponding Huffman code when two-tuples are formed beforehand&nbsp; $(k=2)$.


<u>Hints:</u>
*The exercise belongs to the chapter&nbsp; [[Information_Theory/Entropiecodierung_nach_Huffman|Entropy Coding according to Huffman]].
*In particular, reference is made to the page&nbsp; [[Information_Theory/Entropiecodierung_nach_Huffman#Application_of_Huffman_coding_to_.7F.27.22.60UNIQ-MathJax168-QINU.60.22.27.7F.E2.80.93tuples|Application of Huffman coding to&nbsp; $k$-tuples]].
*A comparable task with binary input symbols is dealt with in&nbsp; [[Aufgaben:Exercise_2.7:_Huffman_Application_for_Binary_Two-Tuples|Exercise 2.7]].
*Designate the possible two-tuples with &nbsp; &nbsp; $\rm XX = A$,&nbsp; &nbsp;$\rm XY = B$,&nbsp; &nbsp;$\rm XZ = C$,&nbsp;&nbsp; $\rm YX = D$,&nbsp; &nbsp;$\rm YY = E$,&nbsp; &nbsp;$\rm YZ = F$,&nbsp; &nbsp;$\rm ZX = G$,&nbsp; &nbsp;$\rm ZY = H$,&nbsp; &nbsp;$\rm ZZ = I$.
 +
  
  
===Questions===
<quiz display=simple>
{What is the average code word length when the Huffman algorithm is applied directly to the ternary source symbols&nbsp; $\rm X$,&nbsp; $\rm Y$&nbsp; and&nbsp; $\rm Z$?
|type="{}"}
$\underline{k=1}\text{:} \hspace{0.25cm}L_{\rm M} \ = \ $ { 1.3 3% } $\ \rm bit/source\hspace{0.12cm}symbol$
  
  
{What are the tuple probabilities here?&nbsp; In particular:
|type="{}"}
$p_{\rm A} = \rm Pr(XX)\ = \ $ { 0.49 3% }
$p_{\rm B} = \rm Pr(XY)\ = \ $ { 0.14 3% }
$p_{\rm C} = \rm Pr(XZ)\ = \ $ { 0.07 3% }
  
  
{What is the average code word length if you first form two-tuples and apply the Huffman algorithm to them?
|type="{}"}
$\underline{k=2}\text{:} \hspace{0.25cm}L_{\rm M} \ = \ $ { 1.165 3% } $\ \rm bit/source\hspace{0.12cm}symbol$
  
  
{Which of the following statements are true when more than two ternary symbols are combined&nbsp; $(k>2)$?
|type="[]"}
+ $L_{\rm M}$&nbsp; decreases monotonically with increasing &nbsp;$k$.
- $L_{\rm M}$&nbsp; does not change when &nbsp;$k$&nbsp; is increased.
- For &nbsp;$k= 3$&nbsp; you get &nbsp;$L_{\rm M} = 1.05 \ \rm bit/source\hspace{0.12cm}symbol$.
  
  
</quiz>
  
===Solution===
{{ML-Kopf}}
'''(1)'''&nbsp; With &nbsp;$p_{\rm X} = 0.7$, &nbsp;$L_{\rm X} = 1$, &nbsp;$p_{\rm Y} = 0.2$, &nbsp;$L_{\rm Y} = 2$, &nbsp;$p_{\rm Z} = 0.1$, &nbsp;$L_{\rm Z} = 2$&nbsp; the average code word length is:
:$$L_{\rm M} = p_{\rm X} \cdot 1 + (p_{\rm Y} + p_{\rm Z}) \cdot 2 \hspace{0.15cm}\underline{= 1.3\,\,{\rm bit/source\hspace{0.12cm}symbol}}\hspace{0.05cm}. $$
*This value is still significantly greater than the source entropy&nbsp; $H = 1.157 \ \rm bit/source\hspace{0.12cm}symbol$.
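The calculation of&nbsp; $L_{\rm M}$&nbsp; can be reproduced numerically. A minimal Python sketch (the variable names are illustrative; the code word lengths are read off the Huffman tree in the graph):

```python
# Symbol probabilities and Huffman code word lengths for k = 1,
# read off the Huffman tree in the graph (X -> one bit, Y and Z -> two bits)
p = {'X': 0.7, 'Y': 0.2, 'Z': 0.1}
length = {'X': 1, 'Y': 2, 'Z': 2}

# Average code word length L_M = sum over p_i * L_i
L_M = sum(p[s] * length[s] for s in p)
print(round(L_M, 3))   # 1.3 bit/source symbol
```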
  
'''(2)'''&nbsp; There are&nbsp; $M\hspace{0.03cm}' = M^k = 3^2 = 9$&nbsp; two-tuples with the following probabilities:
[[File:P_ID2459__Inf_Z_2_7c.png|right|frame|Huffman tree for the ternary source and two-tuples]]
:$$p_{\rm A} = \rm Pr(XX) = 0.7 \cdot 0.7\hspace{0.15cm}\underline{= 0.49},$$
:$$p_{\rm B} = \rm Pr(XY) = 0.7 \cdot 0.2\hspace{0.15cm}\underline{= 0.14},$$
:$$p_{\rm C} = \rm Pr(XZ) = 0.7 \cdot 0.1\hspace{0.15cm}\underline{= 0.07},$$
:$$p_{\rm D} = \rm Pr(YX) = 0.2 \cdot 0.7 = 0.14,$$
:$$p_{\rm E} = \rm Pr(YY) = 0.2 \cdot 0.2 = 0.04,$$
:$$p_{\rm F} = \rm Pr(YZ) = 0.2 \cdot 0.1 = 0.02,$$
:$$p_{\rm G} = \rm Pr(ZX) = 0.1 \cdot 0.7 = 0.07,$$
:$$p_{\rm H} = \rm Pr(ZY) = 0.1 \cdot 0.2 = 0.02,$$
:$$p_{\rm I} = \rm Pr(ZZ) = 0.1 \cdot 0.1 = 0.01.$$
<br clear=all>
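Since the source is memoryless, each two-tuple probability is simply the product of the two symbol probabilities. A short Python sketch of this step (names are illustrative):

```python
from itertools import product

# Symbol probabilities of the memoryless ternary source
p = {'X': 0.7, 'Y': 0.2, 'Z': 0.1}

# Two-tuple probabilities as products, e.g. Pr(XY) = p_X * p_Y
p2 = {a + b: p[a] * p[b] for a, b in product(p, repeat=2)}

print(p2['XX'], p2['XY'], p2['XZ'])   # 0.49, 0.14, 0.07 (up to float rounding)
```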
  
'''(3)'''&nbsp; The graph shows the Huffman tree for the application with&nbsp; $k = 2$.&nbsp; Thus we obtain
* for the individual two-tuples the following binary codings: <br>
: &nbsp; &nbsp; $\rm XX = A$ &nbsp; &#8594; &nbsp; '''0''', &nbsp; &nbsp; $\rm XY = B$ &nbsp; &#8594; &nbsp; '''111''', &nbsp; &nbsp; $\rm XZ = C$ &nbsp; &#8594; &nbsp; '''1011''',
: &nbsp; &nbsp; $\rm YX = D$ &nbsp; &#8594; &nbsp; '''110''', &nbsp; &nbsp; $\rm YY = E$ &nbsp; &#8594; &nbsp; '''1000''', &nbsp; &nbsp; $\rm YZ = F$ &nbsp; &#8594; &nbsp; '''10010''',
: &nbsp; &nbsp; $\rm ZX = G$ &nbsp; &#8594; &nbsp; '''1010''', &nbsp; &nbsp; $\rm ZY = H$ &nbsp; &#8594; &nbsp; '''100111''', &nbsp; &nbsp; $\rm ZZ = I$ &nbsp; &#8594; &nbsp; '''100110''';
* for the average code word length:
:$$L_{\rm M}\hspace{0.01cm}' = 0.49 \cdot 1 + (0.14 + 0.14) \cdot 3 + (0.07 + 0.04 + 0.07) \cdot 4 + 0.02 \cdot 5 + (0.02 + 0.01) \cdot 6 = 2.33\,\,{\rm bit/two\hspace{0.12cm}tuple}$$
:$$\Rightarrow\hspace{0.3cm}L_{\rm M} = {L_{\rm M}\hspace{0.01cm}'}/{2}\hspace{0.15cm}\underline{= 1.165\,\,{\rm bit/source\hspace{0.12cm}symbol}}\hspace{0.05cm}.$$
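The result&nbsp; $L_{\rm M}\hspace{0.01cm}' = 2.33$&nbsp; bit/two-tuple can also be checked programmatically. The following sketch builds a binary Huffman code with Python's&nbsp; <code>heapq</code>; its tie-breaking (and hence the individual code words) may differ from the tree in the graph, but every Huffman code for these probabilities has the same average length:

```python
import heapq
from itertools import product

def huffman_lengths(probs):
    """Code word lengths of a binary Huffman code for the given
    probability dictionary (standard heap-based construction)."""
    heap = [(pr, i, {sym: 0}) for i, (sym, pr) in enumerate(probs.items())]
    heapq.heapify(heap)
    next_id = len(heap)                      # unique tie-breaker for equal probabilities
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)      # merge the two least probable nodes;
        p2, _, d2 = heapq.heappop(heap)      # every symbol below them gets one bit longer
        merged = {s: l + 1 for s, l in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Two-tuple probabilities of the memoryless ternary source
p = {'X': 0.7, 'Y': 0.2, 'Z': 0.1}
tuples = {a + b: p[a] * p[b] for a, b in product(p, repeat=2)}

lengths = huffman_lengths(tuples)
L_M2 = sum(tuples[t] * lengths[t] for t in tuples)   # bit per two-tuple
print(round(L_M2, 3), round(L_M2 / 2, 3))            # 2.33  1.165
```

The most probable tuple&nbsp; $\rm XX$&nbsp; always receives the one-bit code word, matching the tree above.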
 
  
'''(4)'''&nbsp; <u>Statement 1</u>&nbsp; is correct,&nbsp; even if&nbsp; $L_{\rm M}$&nbsp; decreases only very slowly with increasing&nbsp; $k$.
* The last statement is false, since&nbsp; $L_{\rm M}$&nbsp; cannot be smaller than&nbsp; $H = 1.157 \ \rm bit/source\hspace{0.12cm}symbol$&nbsp; even for&nbsp; $k \to \infty$.
* The second statement is not correct either: &nbsp; Since&nbsp; $L_{\rm M} > H$&nbsp; still holds for&nbsp; $k = 2$,&nbsp; $k = 3$&nbsp; can lead to a further improvement.
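As a plausibility check, the results of subtasks&nbsp; '''(1)'''&nbsp; and&nbsp; '''(3)'''&nbsp; satisfy the source coding theorem bound&nbsp; $H \le L_{\rm M} < H + 1/k$&nbsp; (a small sketch using only the numbers from this exercise):

```python
# Entropy of the ternary source and the average code word lengths
# found in subtasks (1) and (3), per source symbol
H = 1.157
L_M = {1: 1.3, 2: 1.165}

# The source coding theorem guarantees H <= L_M < H + 1/k for each k
for k, L in L_M.items():
    assert H <= L < H + 1 / k
    print(f'k = {k}:  {H} <= {L} < {H + 1 / k:.3f}')
```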
 
 
{{ML-Fuß}}
  
  
  
[[Category:Information Theory: Exercises|^2.3 Entropy Coding according to Huffman^]]

Latest revision as of 16:57, 1 November 2022
