Difference between revisions of "Aufgaben:Exercise 2.5: Residual Redundancy with LZW Coding"

Latest revision as of 13:12, 10 August 2021

Residual redundancy $r(N)$ and approximation $r\hspace{0.05cm}'(N)$ of three sources

We assume here a binary input sequence of length $N$ and consider three different binary sources:

$\rm BQ1$: Symbol probabilities $p_{\rm A} = 0.89$ and $p_{\rm B} = 0.11$, i.e. different
⇒ entropy $H = 0.5\text{ bit/source symbol}$ ⇒ the source is redundant.
$\rm BQ2$: $p_{\rm A} = p_{\rm B} = 0.5$ (equally probable)
⇒ entropy $H = 1\text{ bit/source symbol}$ ⇒ the source is redundancy-free.
$\rm BQ3$: There is no concrete information on the statistics here.
In subtask (6) you are to estimate the entropy $H$ of this source.

For these three sources, the respective "residual redundancy" $r(N)$ was determined by simulation, which remains in the binary sequence after Lempel–Ziv–Welch coding.

The results are shown in the first column of the above table for the sources

$\rm BQ1$ (yellow background),
$\rm BQ2$ (green background) and
$\rm BQ3$ (blue background)

whereby we have restricted ourselves to sequence lengths $N ≤ 50000$ in the simulation.

The "relative redundancy of the output sequence" – simplified called "residual redundancy" – can be calculated from

the length $N$ of the input sequence,
the length $L(N)$ of the output sequence and
the entropy $H$

in the following way:

$$r(N) = \frac{L(N) - N \cdot H}{L(N)}= 1 - \frac{ N \cdot H}{L(N)}\hspace{0.05cm}.$$

This takes into account that with perfect source coding the length of the output sequence could be lowered to the value $L_{\rm min} = N · H$ .

With non-perfect source coding, $L(n) - N · H$ gives the remaining redundancy (with the pseudo–unit "bit").
After dividing by $L(n)$, one obtains the relative redundancy $r(n)$ with the value range between zero and one; $r(n)$ should be as small as possible.

A second parameter for measuring the efficiency of LZW coding is the "compression factor" $K(N)$ as the quotient of the lengths of the output and input sequences, which should also be very small:

$$K(N) = {L(N) }/{N} \hspace{0.05cm},$$

In the theory section it was shown that the residual redundancy $r(n)$ is well approximated by the function

$$r\hspace{0.05cm}'(N) =\frac {A}{{\rm lg}\hspace{0.1cm}(N)} \hspace{0.5cm}{\rm mit}\hspace{0.5cm} A = 4 \cdot {r(N = 10000)} \hspace{0.05cm}.$$

This approximation $r\hspace{0.05cm}'(N)$ is given for $\rm BQ1$ in the second column of the table above.
In subtasks (4) and (5) you are to make the approximation for sources $\rm BQ2$ and $\rm BQ3$ .

Hints:

The task belongs to the chapter Compression according to Lempel, Ziv and Welch.
In particular, reference is made to the pages

Remaining redundancy as a measure for the efficiency of coding methods,

Efficiency of Lempel-Ziv coding and

Quantitative statements on asymptotic optimality.

The descriptive variables $K(N)$ and $r(N)$ are deterministically related.

Questions

$A \ = \ $

$N_{2} \ = \ $

$\ \cdot 10^{21}$

$N_{3} \ = \ $

$\ \cdot 10^{6}$

$r'(N = 50000)\ = \ $

$r'(N = 10^6)\ = \ $

$r'(N = 10^{12})\ = \ $

$r'(N = 50000)\ = \ $

$r'(N = 10^6)\ = \ $

$r'(N = 10^{12})\ = \ $

	$H = 1.00 \ \rm bit/source\hspace{0.15cm}symbol$,
	$H = 0.75 \ \rm bit/source\hspace{0.15cm} symbol$,
	$H = 0.50 \ \rm bit/source\hspace{0.15cm} symbol$,
	$H = 0.25 \ \rm bit/source\hspace{0.15cm} symbol$.

Solution

(1) The approximation $r\hspace{0.05cm}'(N)$ agrees exactly by definition for the sequence length $N = 10000$ with the residual redundancy $r(N) = 0.265$ determined by simulation.

Thus

$$A = 4 \cdot r(N = 10000) =4 \cdot {0.265} \hspace{0.15cm}\underline{= 1.06} \hspace{0.05cm}. $$

(2) From the relationship ${A}/{\rm lg}\hspace{0.1cm}(N) ≤ 0.05$ ⇒ ${A}/{\rm lg}\hspace{0.1cm}(N) = 0.05$ it follows:

$${{\rm lg}\hspace{0.1cm}N_{\rm 2}} = \frac{A}{0.05} = 21.2 \hspace{0.3cm}\Rightarrow\hspace{0.3cm} N_{\rm 2} = 10^{21.2} \hspace{0.15cm}\underline{= 1.58 \cdot 10^{21}} \hspace{0.05cm}.$$

(3) In general, $r(N) = 1 - {H}/{K(N)} \hspace{0.05cm}.$

$\rm BQ1$ has entropy $H = 0.5$ bit/source symbol.
It follows that because $r(N) ≈ r\hspace{0.05cm}'(N)$ for $K(N_3) = 0.6$:

$$r(N_{\rm 3}) = 1 - \frac{0.5}{0.6} = 0.167 \hspace{0.1cm}\Rightarrow\hspace{0.1cm} {\rm lg}\hspace{0.1cm}N_{\rm 3} = \frac{A}{0.167} = 6.36 \hspace{0.1cm}\Rightarrow\hspace{0.1cm} N_{\rm 3} = 10^{6.36} \hspace{0.15cm}\underline{= 2.29 \cdot 10^{6}} \hspace{0.05cm}.$$

Results for $\rm BQ2$

(4) For $N = 10000$ ⇒ $r(N) ≈ r\hspace{0.05cm}'(N) = 0.19$:

$$\frac{A}{{\rm lg}\hspace{0.1cm}10000} = 0.19 \hspace{0.3cm}\Rightarrow\hspace{0.3cm} A = 0.19 \cdot 4 = 0.76 \hspace{0.05cm}. $$

The results are summarised in the table opposite.
One can see the very good agreement between $r(N)$ and $r\hspace{0.05cm}'(N)$.
The numerical values sought are marked in red in the table:

$$r'(N = 50000)\hspace{0.15cm}\underline{ = 0.162},\hspace{0.3cm}r'(N = 10^{6})\hspace{0.15cm}\underline{ = 0.127},\hspace{0.3cm} r'(N = 10^{12})\hspace{0.15cm}\underline{ = 0.063}.$$

For the compression factor – the apostrophe indicates that the approximation $r\hspace{0.05cm}'(N)$ was assumed:

$$K\hspace{0.05cm}'(N) = \frac{1}{1 - r\hspace{0.05cm}'(N)}\hspace{0.05cm}.$$

Thus, for the length of the LZW output string:

$$L\hspace{0.05cm}'(N) = K\hspace{0.05cm}'(N) \cdot N = \frac{N}{1 - r\hspace{0.05cm}'(N)}\hspace{0.05cm}.$$

Results for $\rm BQ3$

(5) Following a similar procedure as in subtask (4) we obtain the fitting parameter $A = 1.36$ for the binary source $\rm BQ3$ and from this the results according to the table with a blue background.

Hint: The last column of this table is only understandable with knowledge of subtask (6). There it is shown that the source $\rm BQ3$ has the entropy $H = 0.25$ bit/source symbol.

In this case, the following applies to the compression factor:

$$K\hspace{0.05cm}'(N) = \frac{H}{1 - r\hspace{0.05cm}'(N)} = \frac{0.25}{1 - r'(N)} \hspace{0.05cm}.$$

Thus, for the values of residual redundancy we are looking for, we obtain:

$$r\hspace{0.05cm}'(N = 50000)\hspace{0.15cm}\underline{ = 0.289},\hspace{0.3cm}r\hspace{0.05cm}'(N = 10^{6})\hspace{0.15cm}\underline{ = 0.227},\hspace{0.3cm} r\hspace{0.05cm}'(N = 10^{12})\hspace{0.15cm}\underline{ = 0.113}.$$

Thus, for $N = 10^{12}$ , the compression factor $(0.282)$ still deviates significantly from the entropy $(0.25)$ which can only be achieved for $N \to \infty$ ("Source Coding Theorem").

(6) The individual approximations $r\hspace{0.05cm}'(N)$ differ only by the parameter $A$. Here we found:

Source $\rm BQ1$ with $H = 0.50$ ⇒ $A = 1.06$ ⇒ according to the specification sheet,
Source $\rm BQ2$ with $H = 1.00$ ⇒ $A = 0.76$ ⇒ see subtask (4),
Source $\rm BQ3$ $(H$ unknown$)$: $A = 4 · 0.34 =1.36$ ⇒ corresponding to the table on the right.

Obviously, the smaller the entropy $H$ the larger the adjustment factor $A$ (and vice versa).
Since exactly one solution is possible, $H = 0.25$ bit/source symbol must be correct ⇒ answer 4.

In fact, the probabilities $p_{\rm A} = 0.96$ and $p_{\rm B} = 0.04$ ⇒ $H ≈ 0.25$ were used in the simulation for the source $\rm BQ3$.

@@ Line 1: / Line 1: @@
-{{quiz-Header|Buchseite=Informationstheorie/Komprimierung nach Lempel, Ziv und Welch
+{{quiz-Header|Buchseite=Information_Theory/Compression_According_to_Lempel,_Ziv_and_Welch
 }}
-[[File:P_ID2446__Inf_A_2_5_neu.png|right|Restredundanz <i>r</i>(<i>N</i>) und Näherung <i>r</i> ′(<i>N</i>) dreier Quellen]]
+[[File:P_ID2446__Inf_A_2_5_neu.png|right|frame|Residual redundancy&nbsp; $r(N)$&nbsp; and approximation &nbsp; $r\hspace{0.05cm}'(N)$&nbsp; of three sources]]
-Wir gehen hier von einer binären Eingangsfolge der Länge <i>N</i> aus und  betrachten drei verschiedene binäre Nachrichtenquellen:
+We assume here a binary input sequence of length&nbsp; $N$&nbsp; and consider three different binary sources:
-* '''BQ1''': &nbsp; Symbolwahrscheinlichkeiten   <i>p</i><sub>A</sub> = 0.89, <i>p</i><sub>B</sub>&nbsp;=&nbsp;0.11, also unterschiedlich<br> &#8658;&nbsp; Entropie <i>H</i> = 0.5 bit/Quellensymbol &nbsp; &#8658; &nbsp; Quelle ist redundant.
+* $\rm BQ1$: &nbsp; Symbol probabilities&nbsp;   $p_{\rm A} = 0.89$&nbsp; and&nbsp; $p_{\rm B} = 0.11$, i.e. different<br> &nbsp;  &#8658; &nbsp; entropy &nbsp; $H = 0.5\text{ bit/source&nbsp;symbol}$ &nbsp; &#8658; &nbsp; the source is redundant.
-* '''BQ2''':  &nbsp; <i>p</i><sub>A</sub> = <i>p</i><sub>B</sub> = 0.5 (gleichwahrscheinlich) &nbsp; &#8658; &nbsp; Entropie <i>H</i> = 1 bit/Quellensymbol <br>&#8658;&nbsp; Quelle ist redundanzfrei.
+* $\rm BQ2$:  &nbsp; $p_{\rm A} = p_{\rm B} = 0.5$&nbsp; (equally probable)<br> &nbsp; &#8658; &nbsp; entropy &nbsp; $H = 1\text{ bit/source symbol}$ &nbsp; &#8658; &nbsp; the source is redundancy-free.
-* '''BQ3''': &nbsp;   Hier gibt es keine konkreten Angaben zur Statistik. In der Teilaufgabe (6) sollen Sie die Entropie <i>H</i> dieser Quelle  abschätzen.
+* $\rm BQ3$: &nbsp;   There is no concrete information on the statistics here.&nbsp; <br>In subtask&nbsp; '''(6)'''&nbsp; you are to estimate the entropy&nbsp; $H$&nbsp; of this source.
-Für diese drei Quellen wurden per Simulation die jeweilige <i>Restredundanz</i> <i>r</i>(<i>N</i>) ermittelt, die nach der [[Informationstheorie/Komprimierung_nach_Lempel,_Ziv_und_Welch#Der_Lempel.E2.80.93Ziv.E2.80.93Welch.E2.80.93Algorithmus|Lempel&ndash;Ziv&ndash;Welch&ndash;Codierung]] in der Binärfolge verbleibt. Die Ergebnisse sind in der jeweils ersten Spalte obiger Tabelle für die Quellen
-'''BQ1''' (gelbe Hinterlegung), '''BQ2''' (grüne Hinterlegung) und '''BQ3''' (blaue Hinterlegung) eingetragen, wobei wir uns bei der Simulation auf Folgenlängen <i>N</i> &#8804; 50000 beschränkt haben.
-Die <i>relative Redundanz der Ausgangsfolge</i> &ndash; vereinfachend <i>Restredundanz</i> genannt &ndash;  kann aus
+For these three sources, the respective&nbsp; "residual redundancy"&nbsp; $r(N)$&nbsp; was determined by simulation, which remains in the binary sequence after&nbsp; [[Information_Theory/Komprimierung_nach_Lempel,_Ziv_und_Welch#The_Lempel-Ziv-Welch_algorithm |Lempel&ndash;Ziv&ndash;Welch coding]].&nbsp;
-:* der Länge <i>N</i> der Eingangsfolge,
+The results are shown in the first column of the above table for the sources
+*$\rm BQ1$&nbsp; (yellow background),
+*$\rm BQ2$&nbsp; (green background) and
+*$\rm BQ3$&nbsp; (blue background)
-:* der Länge <i>L</i>(<i>N</i>) der Ausgangsfolge und
-:* der Quellenentropie <i>H</i>
+whereby we have restricted ourselves to sequence lengths&nbsp; $N &#8804; 50000$&nbsp; in the simulation.
-:in folgender Weise berechnet werden:
+The&nbsp; "relative redundancy of the output sequence" &ndash; simplified called&nbsp; "residual redundancy" &ndash;  can be calculated from
+* the length&nbsp;  $N$&nbsp;  of the input sequence,
+* the length&nbsp;  $L(N)$&nbsp;  of the output sequence and
+*the entropy&nbsp;  $H$
+in the following way:
 :$$r(N) = \frac{L(N) - N \cdot H}{L(N)}= 1 -  \frac{ N \cdot H}{L(N)}\hspace{0.05cm}.$$
-:Hierbei ist berücksichtigt, dass bei perfekter Quellencodierung die Länge der Ausgangsfolge bis auf den Wert  <i>L</i><sub>min</sub> = <i>N</i> &middot; <i>H</i> herabgesenkt werden könnte. Bei nichtperfekter Quellencodierung  gibt <i>L</i>(<i>N</i>) &ndash; <i>N</i> &middot; <i>H</i> die verbleibende Redundanz (mit der Pseudo&ndash;Einheit &bdquo;bit&rdquo;) an. Nach Division durch <i>L</i>(<i>N</i>) erhält man die relative Redundanz <i>r</i>(<i>N</i>) mit dem Wertebereich zwischen 0 und 1; <i>r</i>(<i>N</i>) sollte möglichst klein sein.
-:Eine zweite Kenngröße zur Effizienzmessung der LZW&ndash;Codierung ist der <i>Komprimierungsfaktor</i>
+This takes into account that with perfect source coding the length of the output sequence could be lowered to the value&nbsp;   $L_{\rm min} = N &middot; H$&nbsp;.
+*With non-perfect source coding,&nbsp;  $L(n) - N &middot; H$&nbsp;  gives the remaining redundancy&nbsp; (with the pseudo&ndash;unit&nbsp; "bit").
+*After dividing by&nbsp;  $L(n)$,&nbsp;  one obtains the relative redundancy&nbsp;  $r(n)$&nbsp;  with the value range between zero and one;&nbsp; $r(n)$&nbsp; should be as small as possible.
+A second parameter for measuring the efficiency of LZW coding is the&nbsp;  "compression factor"&nbsp;  $K(N)$&nbsp; as the quotient of the lengths of the output and input sequences, which should also be very small:
 :$$K(N) = {L(N) }/{N} \hspace{0.05cm},$$
-:der ebenfalls klein sein sollte. <i>Hinweis</i>: <i>K</i>(<i>N</i>) und <i>r</i>(<i>N</i>) hängen deterministisch zusammen.
-:Im Theorieteil wurde gezeigt, dass die Restredundanz <i>r</i>(<i>N</i>) durch die Funktion
+In the&nbsp;  [[Information_Theory/Komprimierung_nach_Lempel,_Ziv_und_Welch#Quantitative_statements_on_asymptotic_optimality|theory section]]&nbsp;  it was shown that the residual redundancy&nbsp;  $r(n)$&nbsp;  is well approximated by the function
-:$$r'(N) ={A}/{{\rm lg}\hspace{0.1cm}(N)}
+:$$r\hspace{0.05cm}'(N) =\frac {A}{{\rm lg}\hspace{0.1cm}(N)}
 \hspace{0.5cm}{\rm mit}\hspace{0.5cm} A = 4 \cdot {r(N = 10000)}
 \hspace{0.05cm}.$$
-:oft gut angenähert wird. Die Näherung <i>r</i>&prime;(<i>N</i>) ist für BQ1 in der zweiten Spalte obiger Tabelle angegeben. In den Teilaufgaben (d) und (e) sollen Sie die Approximation für die Quellen BQ2 und BQ3 vornehmen.
-''Hinweise:''
+*This approximation&nbsp;  $r\hspace{0.05cm}'(N)$&nbsp;  is given for&nbsp;  $\rm BQ1$&nbsp;  in the second column of the table above.
-*Die Aufgabe gehört zum  Kapitel [[Informationstheorie/Komprimierung_nach_Lempel,_Ziv_und_Welch|Komprimierung nach Lempel, Ziv und Welch]].
+* In subtasks&nbsp;  '''(4)'''&nbsp;  and&nbsp;  '''(5)'''&nbsp;  you are to make the approximation for sources&nbsp;  $\rm BQ2$&nbsp;  and&nbsp;  $\rm BQ3$&nbsp;.
-*Insbesondere wird  Bezug genommen auf die Seiten [[Informationstheorie/Komprimierung_nach_Lempel,_Ziv_und_Welch#Der_Lempel.E2.80.93Ziv.E2.80.93Welch.E2.80.93Algorithmus|LZ77 &ndash; die Grundform der Lempel-Ziv-Algorithmen]], [[Informationstheorie/Komprimierung_nach_Lempel,_Ziv_und_Welch#Lempel.E2.80.93Ziv.E2.80.93Codierung_mit_variabler_Indexbitl.C3.A4nge|Lempel-Ziv-Codierung mit variabler Indexbitlänge]] sowie [[Informationstheorie/Komprimierung_nach_Lempel,_Ziv_und_Welch#Decodierung_des_LZW.E2.80.93Algorithmus|Decodierung des LZW-Agorithmus]].
-*Sollte die Eingabe des Zahlenwertes &bdquo;0&rdquo; erforderlich sein, so geben Sie bitte &bdquo;0.&rdquo; ein.
-*Die [[Aufgaben:2.3_Zur_LZ78-Komprimierung|Aufgabe 2.3]] sowie die [[Aufgaben:2.3Z_Zur_LZ77-Codierung|Zusatzaufgabe 2.3Z]] behandeln andere LZ-Verfahren in ähnlicher Weise.
-:<b>Hinweis:</b> Die Aufgabe bezieht sich auf Seite 7 und Seite 8 von Kapitel 2.2.
+Hints:
+*The task belongs to the chapter&nbsp;  [[Information_Theory/Komprimierung_nach_Lempel,_Ziv_und_Welch|Compression according to Lempel, Ziv and Welch]].
+*In particular, reference is made to the pages
+:: [[Information_Theory/Komprimierung_nach_Lempel,_Ziv_und_Welch#Remaining_redundancy_as_a_measure_for_the_efficiency_of_coding_methods|Remaining redundancy as a measure for the efficiency of coding methods]],
+:: [[Information_Theory/Komprimierung_nach_Lempel,_Ziv_und_Welch#Efficiency_of_Lempel-Ziv_coding|Efficiency of Lempel-Ziv coding]] and
+:: [[Information_Theory/Komprimierung_nach_Lempel,_Ziv_und_Welch#Quantitative_statements_on_asymptotic_optimality|Quantitative statements on asymptotic optimality]].
+*The descriptive variables&nbsp;  $K(N)$&nbsp;  and&nbsp;  $r(N)$&nbsp;  are deterministically related.
-===Fragebogen===
+===Questions===
 <quiz display=simple>
-{Mit welchem Parameter <i>A</i> wurde die Näherung <i>r'</i>(<i>N</i>) der Restredundanz für die Binärquelle BQ1 erstellt?
+{With which parameter&nbsp; $A$&nbsp; was the approximation&nbsp; $r\hspace{0.05cm}'(N)$&nbsp; of the residual redundancy for the binary source&nbsp; $\rm BQ1$&nbsp; created?
 |type="{}"}
-$BQ1:\ A$ = { 1.06 3% }
+$A \ = \ $  { 1.06 3% }
-{Wie groß muss <i>N</i> = <i>N</i><sub>b</sub> mindestens sein, damit die Restredundanz die Bedingung <i>r</i>(<i>N</i>) &#8804; 5% erfüllt? <i>Hinweis:</i> Ersetzen Sie <i>r</i>(<i>N</i>) durch <i>r'</i><sub> </sub>(<i>N</i>).
+{What is the minimum size of&nbsp; $N = N_2$&nbsp; for&nbsp;  $\rm BQ1$&nbsp;  so that the residual redundancy satisfies the condition&nbsp; $r(N) &asymp; r\hspace{0.05cm}'(N) \le 5\%$&nbsp;?
 |type="{}"}
-$BQ1:\ N_b$ = { 1.58 3% } $\cdot 10^{21}$
+$N_{2} \ =  \ $ { 1.58 3% } $\ \cdot 10^{21}$
-{Wie groß muss <i>N</i> = <i>N</i><sub>c</sub> mindestens sein, damit der Komprimierungsfaktor &nbsp;&#8658;&nbsp;&nbsp; <i>K</i>(<i>N</i>) = <i>L</i>(<i>N</i>)/<i>N</i> nicht größer ist als 0.6?
+{What is the minimum size of&nbsp; $N = N_3$&nbsp;  at&nbsp;  $\rm BQ1$&nbsp; so that the compression factor&nbsp; $K(N)= L(N)/N$&nbsp; is not greater than&nbsp; $0.6$?
 |type="{}"}
-$BQ1:\ N_c$ = { 2.29 3% } $\cdot 10^{6}$
+$N_{3} \ =  \ $ { 2.29 3% } $\ \cdot 10^{6}$
-{Bestimmen Sie nun die Redundanznäherung <i>r'</i>(<i>N</i>) für die redundanzfreie Binärquelle BQ2, insbesondere:
+{Now determine the redundancy approximation&nbsp; $r\hspace{0.05cm}'(N)$&nbsp; for the redundancy-free binary source $\rm BQ2$, in particular:
 |type="{}"}
-$BQ2:\ r'(N = 50000)$ = { 0.162 3% }
+$r'(N = 50000)\ =  \ $ { 0.162 3% }
-$r'(N = 10^6)$ = { 0.127 3% }
+$r'(N = 10^6)\ =  \ $ { 0.127 3% }
-$r'(N = 10^12)$ = { 0.063 3% }
+$r'(N = 10^{12})\ =  \ $ { 0.063 3% }
-{Welche Werte liefert die Redundanznäherung <i>r</i>&prime;(<i>N</i>) für die nicht näher spezifizierte Binärquelle BQ3? Insbesondere:
+{What values does the redundancy approximation&nbsp; $r\hspace{0.05cm}'(N)$&nbsp; yield for the unspecified binary source&nbsp; $\rm BQ3$? In particular:
 |type="{}"}
-$BQ2:\ r'(N = 50000)$ = { 0.289 3% }
+$r'(N = 50000)\ =  \ $ { 0.289 3% }
-$r'(N = 10^6)$ = { 0.227 3% }
+$r'(N = 10^6)\ =  \ $ { 0.227 3% }
-$r'(N = 10^12)$ = { 0.113 3% }
+$r'(N = 10^{12})\ =  \ $ { 0.113 3% }
-{Welche Quellenentropie <i>H</i> könnte BQ3 nach diesem Ergebnis besitzen? <i>Hinweis:</i> Es ist genau eine Antwort richtig.
+{According to this result, what source entropy&nbsp; $H$&nbsp; could&nbsp; $\rm BQ3$&nbsp; ?&nbsp; <u>Hint:</u>&nbsp; Exactly one answer is correct.
-|type="[]"}
+|type="()"}
-- <i>H</i> = 1.00 bit/Quellensymbol,
+- $H = 1.00 \ \rm bit/source\hspace{0.15cm}symbol$,
-- <i>H</i> = 0.75 bit/Quellensymbol,
+- $H = 0.75 \ \rm bit/source\hspace{0.15cm} symbol$,
-- <i>H</i> = 0.50 bit/Quellensymbol,
+- $H = 0.50 \ \rm bit/source\hspace{0.15cm} symbol$,
-+ <i>H</i> = 0.25 bit/Quellensymbol.
++ $H = 0.25 \ \rm bit/source\hspace{0.15cm} symbol$.
@@ Line 88: / Line 107: @@
 </quiz>
-===Musterlösung===
+===Solution===
 {{ML-Kopf}}
-:<b>1.</b>&nbsp;&nbsp;Die Näherung <i>r'</i><sup> </sup>(<i>N</i>) stimmt definitionsgemäß für die Eingangsfolgenlänge  <i>N</i> = 10000 mit der per Simulation ermittelten Restredundanz <i>r</i>(<i>N</i>) = 0.265 exakt überein. Damit ist
+'''(1)'''&nbsp; The approximation&nbsp; $r\hspace{0.05cm}'(N)$&nbsp; agrees exactly by definition for the sequence length&nbsp;  $N = 10000$&nbsp; with the residual redundancy&nbsp; $r(N) = 0.265$&nbsp; determined by simulation.
+*Thus
 :$$A = 4 \cdot  r(N = 10000) =4 \cdot {0.265} \hspace{0.15cm}\underline{= 1.06}
 \hspace{0.05cm}. $$
-:<b>2.</b>&nbsp;&nbsp;Aus der Beziehung <i>A</i>/lg (<i>N</i>) &#8804; 0.05 &nbsp;&nbsp;&nbsp;&#8658;&nbsp;&nbsp;&nbsp; <i>A</i>/lg (<i>N</i><sub>b</sub>) = 0.05 folgt:
-:$${{\rm lg}\hspace{0.1cm}N_{\rm b}} = \frac{A}{0.05} = 21.2 \hspace{0.3cm}\Rightarrow\hspace{0.3cm}
-N_{\rm b} = 10^{21.2} \hspace{0.15cm}\underline{= 1.58 \cdot 10^{21}}
+'''(2)'''&nbsp; From the relationship&nbsp; ${A}/{\rm lg}\hspace{0.1cm}(N) &#8804; 0.05$ &nbsp;&nbsp;&nbsp;&#8658;&nbsp;&nbsp;&nbsp; ${A}/{\rm lg}\hspace{0.1cm}(N) = 0.05$&nbsp; it follows:
+:$${{\rm lg}\hspace{0.1cm}N_{\rm 2}} = \frac{A}{0.05} = 21.2 \hspace{0.3cm}\Rightarrow\hspace{0.3cm}
+N_{\rm 2} = 10^{21.2} \hspace{0.15cm}\underline{= 1.58 \cdot 10^{21}}
 \hspace{0.05cm}.$$
-:<b>3.</b>&nbsp;&nbsp;Allgemein gilt:
-:$$r(N) = 1 - \frac{H}{K(N)}
-\hspace{0.05cm}. $$
+'''(3)'''&nbsp; In general, &nbsp;$r(N) = 1 - {H}/{K(N)} \hspace{0.05cm}.$&nbsp;
-:BQ1 hat die Entropie <i>H</i> = 0.5 bit/Symbol. Daraus folgt wegen <i>r</i>(<i>N</i>) &asymp; <i>r'</i><sup> </sup>(<i>N</i>) für <i>K</i>(<i>N</i><sub>c</sub>) = 0.6:
+*$\rm BQ1$&nbsp; has entropy&nbsp; $H = 0.5$ bit/source symbol.&nbsp;
-:$$r(N_{\rm c}) = 1 - \frac{0.5}{0.6} = 0.167 \hspace{0.1cm}\Rightarrow\hspace{0.1cm}
+*It follows that because &nbsp;$r(N) &asymp; r\hspace{0.05cm}'(N)$&nbsp; for&nbsp; $K(N_3) = 0.6$:
-{\rm lg}\hspace{0.1cm}N_{\rm c} = \frac{A}{0.167} = 6.36
+:$$r(N_{\rm 3}) = 1 - \frac{0.5}{0.6} = 0.167 \hspace{0.1cm}\Rightarrow\hspace{0.1cm}
+{\rm lg}\hspace{0.1cm}N_{\rm 3} = \frac{A}{0.167} = 6.36
 \hspace{0.1cm}\Rightarrow\hspace{0.1cm}
-N_{\rm c} = 10^{6.36} \hspace{0.15cm}\underline{= 2.29 \cdot 10^{6}}
+N_{\rm 3} = 10^{6.36} \hspace{0.15cm}\underline{= 2.29 \cdot 10^{6}}
 \hspace{0.05cm}.$$
-[[File:P_ID2447__Inf_A_2_5d.png|right|]]
-:<b>4.</b>&nbsp;&nbsp;Für <i>N</i> = 10000 gilt <i>r</i>&prime;(<i>N</i>) = <i>r</i>(<i>N</i>) = 0.190:
+[[File:EN_Inf_A_2_5d_v2.png|right|frame|Results for&nbsp;  $\rm BQ2$]]
+'''(4)'''&nbsp; For&nbsp; $N = 10000$ &nbsp; &rArr; &nbsp; $r(N) &asymp; r\hspace{0.05cm}'(N) = 0.19$:
 :$$\frac{A}{{\rm lg}\hspace{0.1cm}10000} = 0.19 \hspace{0.3cm}\Rightarrow\hspace{0.3cm}
 A = 0.19 \cdot 4 = 0.76 \hspace{0.05cm}. $$
-:Die Ergebnisse sind in nebenstehender Tabelle zusammengefasst. Man erkennt die sehr gute Übereinstimmung zwischen <i>r</i>(<i>N</i>) und <i>r</i>&prime;(<i>N</i>). Die gesuchten Zahlenwerte sind in der Tabelle rot markiert.
+*The results are summarised in the table opposite.
+*One can see the very good agreement between&nbsp; &nbsp;$r(N)$&nbsp; and  &nbsp;$r\hspace{0.05cm}'(N)$.
+*The numerical values sought are marked in red in the table:
+$$r'(N = 50000)\hspace{0.15cm}\underline{ = 0.162},\hspace{0.3cm}r'(N = 10^{6})\hspace{0.15cm}\underline{ = 0.127},\hspace{0.3cm}
+r'(N = 10^{12})\hspace{0.15cm}\underline{ = 0.063}.$$
+*For the compression factor &nbsp; &ndash; &nbsp; the apostrophe indicates that the approximation &nbsp;$r\hspace{0.05cm}'(N)$&nbsp; was assumed:
+:$$K\hspace{0.05cm}'(N) = \frac{1}{1 - r\hspace{0.05cm}'(N)}\hspace{0.05cm}.$$
+*Thus, for the length of the LZW output string:
+:$$L\hspace{0.05cm}'(N) = K\hspace{0.05cm}'(N) \cdot N  = \frac{N}{1 - r\hspace{0.05cm}'(N)}\hspace{0.05cm}.$$
-:Für den Komprimierungsfaktor gilt (der Apostroph weist darauf hin, dass von der Näherung <i>r</i>&prime;(<i>N</i>) ausgegangen wurde):
-:$$K'(N) = \frac{1}{1 - r'(N)}\hspace{0.05cm}.$$
-:Damit gilt für die Länge des LZW&ndash;Ausgabestrings:
-:$$L'(N) = K'(N) \cdot N \hspace{0.05cm}.$$
-:<br><br><br>
+[[File:EN_Inf_A_2_5e_v2.png|right|frame|Results for&nbsp; $\rm BQ3$]]
-:<b>5.</b>&nbsp;&nbsp;Nach ähnlicher Vorgehensweise wie in der Teilaufgabe d) erhält man für die Binärquelle BQ3 den Anpassungsparameter <i>A</i> = 1.36 und daraus die Ergebnisse gemäß der blau hinterlegten Tabelle.
+'''(5)'''&nbsp; Following a similar procedure as in subtask&nbsp; '''(4)'''&nbsp; we obtain the fitting parameter&nbsp; $A = 1.36$&nbsp; for the binary source&nbsp; $\rm BQ3$&nbsp; and from this the results according to the table with a blue background.
-:<u>Hinweis:</u> Die letzte Spalte dieser Tabelle ist nur bei Kenntnis der Teilaufgabe (f) verständlich. Dort wird gezeigt, dass die  Quelle BQ3 die Entropie   <i>H</i> = 0.25 bit/Quellensymbol besitzt.
+<u>Hint:</u> &nbsp; The last column of this table is only understandable with knowledge of subtask&nbsp; '''(6)'''.&nbsp; There it is shown that the source&nbsp; $\rm BQ3$&nbsp; has the entropy&nbsp; $H = 0.25$&nbsp; bit/source symbol.
-:In diesem Fall gilt für den Komprimierungsfaktor:
+*In this case, the following applies to the compression factor:
-:$$K'(N) = \frac{H}{1 - r'(N)} = \frac{0.25}{1 - r'(N)} \hspace{0.05cm}.$$
+:$$K\hspace{0.05cm}'(N) = \frac{H}{1 - r\hspace{0.05cm}'(N)} = \frac{0.25}{1 - r'(N)} \hspace{0.05cm}.$$
-[[File:P_ID2448__Inf_A_2_5e.png|right|]]
+*Thus, for the values of residual redundancy we are looking for, we obtain:
-:Für <i>N</i> = 10<sup>12</sup> weicht also der Komprimierungsfaktor (0.282) noch deutlich von der Entropie (0.25) ab, die  für <i>N</i> &#8594;  &#8734; erreicht werden kann (Quellencodierungstheorem).<br><br><br>
+:$$r\hspace{0.05cm}'(N = 50000)\hspace{0.15cm}\underline{ = 0.289},\hspace{0.3cm}r\hspace{0.05cm}'(N = 10^{6})\hspace{0.15cm}\underline{ = 0.227},\hspace{0.3cm}
+r\hspace{0.05cm}'(N = 10^{12})\hspace{0.15cm}\underline{ = 0.113}.$$
+*Thus, for&nbsp; $N = 10^{12}$&nbsp; , the compression factor&nbsp; $(0.282)$&nbsp; still deviates significantly from the entropy&nbsp; $(0.25)$&nbsp; which can only be achieved for&nbsp; $N \to \infty$&nbsp; ("Source Coding Theorem").
-:<b>6.</b>&nbsp;&nbsp;Die einzelnen Näherungen <i>r</i>&prime;(<i>N</i>) unterscheiden sich nur durch den Parameter <i>A</i>. Dabei haben wir festgestellt:
-:* Quelle BQ2 (<i>H</i> = 1.00): <i>A</i> = 0.76,
-:* Quelle BQ1 (<i>H</i> = 0.50): <i>A</i> = 1.06,
-:* Quelle BQ3 (<i>H</i> unbekannt): <i>A</i> = 1.36.
+'''(6)'''&nbsp; The individual approximations&nbsp; $r\hspace{0.05cm}'(N)$&nbsp; differ only by the parameter&nbsp; $A$.&nbsp; Here we found:
+# Source&nbsp; $\rm BQ1$&nbsp; with&nbsp; $H = 0.50$ &nbsp; &rArr; &nbsp; $A = 1.06$ &nbsp; &rArr; &nbsp; according to the specification sheet,
+# Source&nbsp; $\rm BQ2$&nbsp; with&nbsp; $H = 1.00$ &nbsp; &rArr; &nbsp; $A = 0.76$ &nbsp; &rArr; &nbsp; see subtask&nbsp; '''(4)''',
+# Source&nbsp; $\rm BQ3$&nbsp; $(H$ unknown$)$: $A = 4 &middot; 0.34 =1.36$ &nbsp; &rArr; &nbsp; corresponding to the table on the right.
-:Je kleiner die Entropie <i>H</i> ist, um so größer ist offensichtlich der Anpassungsfaktor <i>A</i>. Da genau eine Lösung möglich ist, muss <i>H</i> = 0.25 bit/Quellensymbol richtig sein &nbsp;&nbsp;&#8658;&nbsp;&nbsp; <u>Antwort 4</u>.
+*Obviously, the smaller the entropy&nbsp; $H$&nbsp; the larger the adjustment factor&nbsp; $A$&nbsp; (and vice versa).
+*Since exactly one solution is possible, &nbsp; $H = 0.25$&nbsp; bit/source&nbsp; symbol must be correct &nbsp;&nbsp;&#8658;&nbsp;&nbsp; <u>answer 4</u>.
-:Tatsächlich wurden bei der Simulation für die Quelle Q3 die Wahrscheinlichkeiten <i>p</i><sub>A</sub> = 0.96, <i>p</i><sub>B</sub> = 0.04 &#8658; <i>H</i> &asymp; 0.25 verwendet.<br><br>
+*In fact, the probabilities&nbsp;$p_{\rm A} = 0.96$&nbsp; and &nbsp;$p_{\rm B} = 0.04$&nbsp; &nbsp; &#8658; &nbsp; $H &asymp; 0.25$ were used in the simulation for the source&nbsp; $\rm BQ3$.
 {{ML-Fuß}}
-[[Category:Aufgaben zu Informationstheorie|^2.2 Verfahren von Lempel, Ziv, Welch^]]
+[[Category:Information Theory: Exercises|^2.2 Lempel-Ziv-Welch Compression^]]