Difference between revisions of "Aufgaben:Exercise 2.4Z: LZW Coding and Decoding again"

Revision as of 11:58, 15 July 2021

Snapshots of the LZW dictionaries

The graph shows snapshots of the dictionary created during LZW coding of the input symbol sequence.

The upper graph is for input symbol sequence ABABABBAA.
The lower dictionary is created during LZW coding of the sequence ABABABABA.

In both cases, it is assumed that no characters other than A and B can occur at later times.

In LZW decoding, the same dictionaries are created, but the dictionary entries are made one step later. Subtask (3) asks for which coding step and for which decoding step the two snapshots shown are valid.

In LZW coding, an index $I$ is selected for each coding step $i$ and transmitted (binary). The character pair AB is represented by the index $I = 2$ in the two dictionaries. Here we consider the index $I$ as a decimal number and leave the binary representation out of consideration for this task.

With LZW decoding, a character or character sequence is generated in the same way from each index $I$ with the help of the dictionary, for example $I = 1$ leads to the character B and $I = 2$ to the character pair AB.

If a dictionary entry with the desired index $I$ is actually found, the decoding runs smoothly. However, this is not always the case:

If a new index $I$ is entered during encoding at step $i$ and this $I$ is at the same time the encoding result of the step, this index is not yet occupied in the dictionary at decoding step $i$. The reason is that the decoder makes its entries one step later than the coder.
For a binary input sequence $($all characters are A or B$)$, a special rule must be applied in LZW decoding exactly when the entry with index $I = i$ was made in coding step $i$.

This special rule shall be illustrated by an example:

For step $i$ there is no entry in the decoder dictionary matching index $I$.
We assume that in the previous step $(i- 1)$ the decoding result was ABBABA.
Then we append the first character of this string to its end. Here: ABBABAA.
Then the sequence ABBABAA is entered into the dictionary under index $I$.
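
The special rule illustrated above can be sketched in a few lines of Python. The function name `apply_special_rule` is an assumption made for this illustration and does not come from the exercise.

```python
def apply_special_rule(previous_output: str) -> str:
    """Build the missing dictionary entry from the previous decoding result.

    Hypothetical helper: the result of step (i - 1) is extended by its
    own first character.
    """
    return previous_output + previous_output[0]


# Example from the text: the previous decoding result was ABBABA.
new_entry = apply_special_rule("ABBABA")
print(new_entry)  # ABBABAA
```

In the solution to subtask (4) the string built this way serves both as the decoding result of step $i$ and as the new dictionary entry.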

Hints:

The exercise belongs to the chapter Compression according to Lempel, Ziv and Welch.
In particular, reference is made to the pages

LZ77 - the basic form of Lempel-Ziv algorithms,

Lempel-Ziv coding with variable index bit length,

Decoding of the LZW algorithm.

Also, when solving this task, note that the LZW algorithm does not assume an empty dictionary.
Rather, the indices $I = 0$ to $I = M- 1$ contain all $M$ permissible characters of the alphabet.

Questions

$I_1 \ = \ $

$I_2 \ = \ $

$I_3 \ = \ $

$I_4 \ = \ $

$I_5 \ = \ $

$I_4 \ = \ $

$I_5 \ = \ $

$\text{Encoding:} \ \ i \ = \ $

$\text{Decoding:} \ \ i \ = \ $

	When decoding ABABABBAA in step $i = 4$.
	When decoding ABABABABA in step $i = 4$.
	When decoding ABABABABA in step $i = 5$.

Solution

(1) We denote by $W(I)$ an array that describes the dictionary and whose elements are characters or character strings.

The coding of ABABABBAA then proceeds as follows:

$i = 1$: A → $\underline{I=0}$, $W(I = 2) =$ AB,

$i = 2$: B → $\underline{I=1}$, $W(I = 3) =$ BA,

$i = 3$: AB → $\underline{I=2}$, $W(I = 4) =$ ABA,

$i = 4$: AB → $\underline{I=2}$, $W(I = 5) =$ ABB,

$i = 5$: BA → $\underline{I=3}$, $W(I = 6) =$ BAA.

Note that at time $i = 5$ the last character $($A$)$ of the input string ABABABBAA has already been taken into account in the dictionary entry, but has not yet been coded.

(2) For steps $i = 1$ to $i = 3$ nothing changes compared to subtask (1).

After that:

$i = 4$: ABA → $\underline{I=4}$, $W(I = 5) =$ ABAB,

$i = 5$: BA → $\underline{I=3}$, coding complete, no new dictionary entry possible.
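
The coding steps of subtasks (1) and (2) can be reproduced with a minimal LZW encoder sketch. The function name `lzw_encode` and its return format are assumptions made for this illustration; the sketch presumes that the input contains only characters of the preassigned alphabet.

```python
def lzw_encode(text, alphabet="AB"):
    """Return (indices, new_entries) for a textbook LZW encoder.

    Assumption: every character of `text` occurs in `alphabet`.
    """
    # Preassign indices 0 .. M-1 with the M characters of the alphabet.
    table = {ch: idx for idx, ch in enumerate(alphabet)}
    indices = []       # index I emitted in each coding step i
    new_entries = []   # (index, string) pairs added to the dictionary
    current = ""
    for ch in text:
        if current + ch in table:
            # Keep extending the match while it is still in the dictionary.
            current += ch
        else:
            indices.append(table[current])
            table[current + ch] = len(table)
            new_entries.append((table[current + ch], current + ch))
            current = ch
    if current:
        # Emit the index of the remaining match; no new entry is possible.
        indices.append(table[current])
    return indices, new_entries


print(lzw_encode("ABABABBAA")[0])  # [0, 1, 2, 2, 3, 0]
print(lzw_encode("ABABABABA")[0])  # [0, 1, 2, 4, 3]
```

For ABABABBAA the first five indices match subtask (1); the sixth index codes the final A, which is still pending at step $i = 5$.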

(3) Comparison with the above results shows:

The coder's dictionary contains the entries shown after exactly $\underline{i=4}$ coding steps.
At the decoder, by contrast, there is a delay of one step: $\underline{i=5}$.

(4) Proposed solution 2 is correct:

The special-case rule for decoding is needed (in the present example) when the index $I =i$ is output in coding step $i$.
The decoder then does not find the required mapping "index → character string", because at time $i$ the generated dictionary contains only entries with indices $I < i$.
For the sequence ABABABBAA, according to subtask (1), $I < i$ always holds.
In contrast, the sequence ABABABABA yields the following indices:

$$i = 1\hspace{-0.15cm}: \hspace{0.15cm} I = 0\hspace{0.05cm}, \hspace{0.5cm}i = 2\hspace{-0.15cm}: \hspace{0.15cm}I = 1\hspace{0.05cm}, \hspace{0.5cm}i = 3\hspace{-0.15cm}: \hspace{0.15cm}I = 2\hspace{0.05cm}, \hspace{0.5cm}\underline{i = 4\hspace{-0.15cm}: \hspace{0.15cm}I = 4}\hspace{0.05cm}, \hspace{0.5cm}i = 5\hspace{-0.15cm}: \hspace{0.15cm}I = 3\hspace{0.05cm}. $$
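
The condition can also be checked mechanically. The following sketch assumes the step numbering of this exercise (the decoder adds one entry per step from step 2 onward); the name `special_case_steps` is hypothetical.

```python
def special_case_steps(indices, alphabet_size=2):
    """Return the decoding steps i at which the received index is not yet known.

    Assumption: before decoding step i the decoder holds the alphabet_size
    preassigned entries plus one entry for each of the steps 2 .. i-1.
    """
    steps = []
    for step, index in enumerate(indices, start=1):
        known_entries = alphabet_size + max(step - 2, 0)
        if index == known_entries:
            # The index refers to the entry the coder has only just created.
            steps.append(step)
    return steps


print(special_case_steps([0, 1, 2, 2, 3]))  # []  (ABABABBAA, steps 1 to 5)
print(special_case_steps([0, 1, 2, 4, 3]))  # [4] (ABABABABA)
```

For the binary alphabet, `known_entries` equals $i$ for every step $i \ge 2$, so the test reduces to $I = i$ as stated above.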

Finally, a summary of the complete decoding of ABABABABA:

The dictionary is preassigned with $(I=0$: A$)$ and $(I=1$: B$)$.

Then, with the dictionary array $W(I)$:

$i = 1$: Decoding $I=0$ → A,

$i = 2$: Decoding $I=1$ → B, $W(I = 2) =$ AB,

$i = 3$: Decoding $I=2$ → AB, $W(I = 3) =$ BA,

$i = 4$: An entry with index $I = 4$ does not exist ⇒ special-case rule:
Take the last decoding result $($here AB$)$ and append the first character of this sequence ⇒ ABA.
ABA is then stored in the dictionary under index $I = 4$.

$i = 5$: Decoding $I=3$ → BA. End of the decoder input sequence.
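
The decoding steps listed above can be sketched as a minimal LZW decoder that includes the special-case rule. The function name `lzw_decode` and its return format are assumptions made for this illustration, not part of the exercise.

```python
def lzw_decode(indices, alphabet="AB"):
    """Return (decoded_text, table) for a textbook LZW decoder.

    Assumption: `indices` is a non-empty, valid LZW index sequence.
    """
    # Preassigned dictionary W(0) .. W(M-1).
    table = list(alphabet)
    previous = table[indices[0]]
    output = [previous]
    for index in indices[1:]:
        if index < len(table):
            entry = table[index]
        elif index == len(table):
            # Special case: the index is not in the dictionary yet.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW index {index}")
        # The decoder makes its entry one step later than the coder.
        table.append(previous + entry[0])
        output.append(entry)
        previous = entry
    return "".join(output), table


text, table = lzw_decode([0, 1, 2, 4, 3])
print(text)   # ABABABABA
print(table)  # ['A', 'B', 'AB', 'BA', 'ABA', 'ABAB']
```

In this sketch the entry ABAB is added at step $i = 5$, one step after the coder created it, which agrees with the delay found in subtask (3).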

@@ Line 3: / Line 3: @@
 }}
-[[File:EN_Inf_Z_2_4.png|right|frame|Momentaufnahmen der LZW–Wörterbüchern]]
+[[File:EN_Inf_Z_2_4.png|right|frame|Snapshots of the LZW dictionaries]]
-Die Grafik zeigt Momentaufnahmen des Wörterbuchs, das während der&nbsp; <u>LZW&ndash;Codierung</u>&nbsp; der Eingangssymbolfolge entsteht.
+The graph shows snapshots of the dictionary created during LZW coding of the input symbol sequence.
-*Die obere Grafik gilt für Eingangssymbolfolge&nbsp; <b>ABABABBAA</b>.
+*The upper graph is for input symbol sequence&nbsp; <b>ABABABBAA</b>.
-*Das untere Wörterbuch entsteht bei der LZW&ndash;Codierung der Sequenz&nbsp; <b>ABABABABA</b>.
+*The lower dictionary is created during LZW coding of the sequence&nbsp; <b>ABABABABA</b>.
-In beiden Fällen wird vorausgesetzt, dass auch zu späteren Zeitpunkten keine anderen Zeichen als&nbsp; <b>A</b>&nbsp; und&nbsp; <b>B</b>&nbsp; vorkommen können.
+In both cases, it is assumed that no characters other than&nbsp; <b>A</b>&nbsp; and&nbsp; <b>B</b>&nbsp; can occur at later times.
-Bei der&nbsp; <u>LZW&ndash;Decodierung</u>&nbsp; entstehen gleiche Wörterbücher, doch erfolgen dann die Wörterbucheinträge erst einen Schritt später.&nbsp; In der Teilaufgabe&nbsp; '''(3)'''&nbsp; wird gefragt, für welchen Codierschritt bzw. für welchen Decodierschritt die beiden dargestellten Momentaufnahmen gültig sind.
+In <u>LZW decoding</u>, the same dictionaries are created, but then the dictionary entries occur one step later.&nbsp; In subtask&nbsp; '''(3)'''&nbsp; the question is asked for which coding step or for which decoding step the two snapshots shown are valid.
-Bei der <u>LZW&ndash;Codierung</u> wird zu jedem Codierschritt $i$ ein Index $I$ ausgewählt und (binär) übertragen.&nbsp; Das Zeichenpaar <b>AB</b> wird bei den beiden Wörterbüchern durch den Index $I = 2$ dargestellt.&nbsp; Wir betrachten hier den Index $I$ als Dezimalzahl und lassen bei dieser Aufgabe die Binärdarstellung außer Betracht.
+In <u>LZW coding</u>, an index $I$ is selected for each coding step $i$ and transmitted (binary).&nbsp; The character pair <b>AB</b> is represented by the index  $I = 2$ in the two dictionaries.&nbsp; Here we consider the index $I$ as a decimal number and leave the binary representation out of consideration for this task.
-Bei der <u>LZW&ndash;Decodierung</u> wird in gleicher Weise mit Hilfe des Wörterbuchs aus jedem Index&nbsp; $I$&nbsp; ein Zeichen bzw. eine Zeichenfolge generiert, zum Beispiel führt&nbsp; $I = 1$&nbsp; zum Zeichen&nbsp; <b>B</b>&nbsp; und&nbsp; $I = 2$&nbsp; zum Zeichenpaar&nbsp; <b>AB</b>.
+With <u>LZW decoding</u>, a character or character sequence is generated in the same way from each index&nbsp; $I$&nbsp; with the help of the dictionary, for example&nbsp; $I = 1$&nbsp; leads to the character&nbsp; <b>B</b>&nbsp; and&nbsp; $I = 2$&nbsp; to the character pair&nbsp; <b>AB</b>.
-Wird  tatsächlich ein Wörterbucheintrag mit dem gewünschten Index&nbsp; $I$&nbsp; gefunden, so läuft die Decodierung problemlos ab. Dies ist aber nicht immer so:
+If a dictionary entry with the desired index&nbsp; $I$&nbsp; is actually found, the decoding runs smoothly. However, this is not always the case:
-* Wird bei der Codierung beim Schritt&nbsp; $i$&nbsp; ein neuer Index&nbsp; $I$&nbsp; eingetragen und ist dieses&nbsp; $I$&nbsp; gleichzeitig das Codierergebnis des Schrittes, so ist dieser Index beim Decodierschritt&nbsp; $i$&nbsp; im Wörterbuch noch nicht belegt.&nbsp; Der Grund hierfür ist, dass  beim Decoder die Einträge um einen Schritt später erfolgen als beim Coder.
+* If a new index&nbsp; $I$&nbsp; is entered during encoding at step&nbsp; $i$&nbsp; and this&nbsp; $I$&nbsp; is at the same time the encoding result of the step, this index is not yet occupied in the dictionary at decoding step&nbsp; $i$&nbsp;.&nbsp; The reason for this is that with the decoder the entries are made one step later than with the coder.
-* Bei binärer Eingangsfolge&nbsp; $($alle Zeichen seien&nbsp; <b>A</b>&nbsp; oder&nbsp; <b>B</b>$)$&nbsp; ist bei der LZW&ndash;Decodierung genau immer dann eine Sonderregelung anzuwenden, wenn im Codierschritt&nbsp; $i$&nbsp; der Eintrag mit dem Index&nbsp; $I = i$&nbsp; vorgenommen wurde.
+* In the case of a binary input sequence&nbsp; $($all characters are&nbsp; <b>A</b>&nbsp; or&nbsp; <b>B</b>$)$&nbsp; , a special rule must always be applied in LZW decoding if the entry with index&nbsp; $I = i$&nbsp; was made in coding step&nbsp; $i$&nbsp;.
-Diese Sonderregelung soll an einem Beispiel veranschaulicht werden:
+This special rule shall be illustrated by an example:
-* Zum Schritt&nbsp; $i$&nbsp; gibt es keinen zum Index&nbsp; $I$&nbsp; passenden  Eintrag im Decoder&ndash;Wörterbuch.
+* For step&nbsp; $i$&nbsp; there is no entry in the decoder dictionary matching index&nbsp; $I$&nbsp;.
-* Wir nehmen an, dass beim vorherigen Schritt&nbsp; $(i- 1)$&nbsp; das Decodierergebnis&nbsp; <b>ABBABA</b>&nbsp; war.
+* We assume that in the previous step&nbsp; $(i- 1)$&nbsp; the decoding result was&nbsp; <b>ABBABA</b>&nbsp;.
-* Dann ergänzt man diese Zeichenfolge um das erste Zeichen der Folge.&nbsp; Hier:&nbsp; <b>ABBABAA</b>.
+* Then we add the first character of the sequence to this string. &nbsp; Here:&nbsp; <b>ABBABAA</b>.
-* Anschließend trägt man die Sequenz&nbsp; <b>ABBABAA</b>&nbsp; in das Wörterbuch unter dem Index&nbsp; $I$&nbsp; ein.
+* Then one enters the sequence&nbsp; <b>ABBABAA</b>&nbsp; into the dictionary under index&nbsp; $I$&nbsp;.
@@ Line 44: / Line 44: @@
 :: [[Information_Theory/Komprimierung_nach_Lempel,_Ziv_und_Welch#Decoding_of_the_LZW_algorithm|Decoding of the LZW algorithm]].
-*Beachten Sie zudem bei der Lösung dieser Aufgabe, dass beim LZW&ndash;Algorithmus nicht von einem leeren Wörterbuch ausgegangen wird.
+*Also, when solving this task, note that the LZW algorithm does not assume an empty dictionary.
-*Vielmehr beinhalten die Indizes&nbsp; $I = 0$&nbsp; bis&nbsp; $I = M- 1$&nbsp; alle&nbsp; $M$&nbsp; zulässigen Zeichen des Alphabets.
+*Rather, the indices&nbsp; $I = 0$&nbsp; to&nbsp; $I = M- 1$&nbsp; contain all&nbsp; $M$&nbsp; permissible characters of the alphabet.
+===Questions===
-===Fragebogen===
 <quiz display=simple>
-{Codieren Sie die Eingangsfolge&nbsp; <b>ABABABBAA</b>&nbsp; entsprechend der <u>oberen Grafik</u>.&nbsp; Welche Indizes&nbsp; $I_i$&nbsp; ergeben sich zu den Schritten&nbsp; $i=1$, ... , $i=5$?
+{Code the input sequence&nbsp; <b>ABABABBAA</b>&nbsp; according to the <u>above diagram</u>.&nbsp; Which indices&nbsp; $I_i$&nbsp; result for the steps&nbsp; $i=1$, ... , $i=5$?
 |type="{}"}
 $I_1 \ = \ $ { 0. }
@@ Line 61: / Line 60: @@
-{Codieren Sie nun die Eingangsfolge&nbsp; <b>ABABABABA</b>&nbsp; entsprechend der <u>unteren Grafik</u>.&nbsp; Geben Sie die Indizes&nbsp; $I_i$&nbsp; zu den Schritten&nbsp; $i=4$&nbsp; und&nbsp; $i=5$&nbsp; an.
+{Now code the input sequence&nbsp; <b>ABABABABA</b>&nbsp; according to the <u>diagram below</u>.&nbsp; Specify the indices&nbsp; $I_i$&nbsp; for the steps&nbsp; $i=4$&nbsp; and&nbsp; $i=5$&nbsp;.
 |type="{}"}
 $I_4 \ = \ $ { 4 }
@@ Line 67: / Line 66: @@
-{Für welchen Schritt&nbsp; $i$&nbsp; gilt die Momentaufnahme des auf der Angabenseite dargestellten Wörterbuchs bezüglich
+{For which step&nbsp; $i$&nbsp; does the snapshot of the dictionary shown on the input page apply with respect to
 |type="{}"}
-$\text{Codierung:} \ \ i \ = \ $ { 4 }
+$\text{Encoding:} \ \ i \ = \ $ { 4 }
-$\text{Decodierung:} \ \ i \ = \ $ { 5 }
+$\text{Decoding:} \ \ i \ = \ $ { 5 }
-{Wann muss man auf die Decodier&ndash;Sonderfallregelung zurückgreifen?
+{When does one have to resort to the decoding special case rule?
 |type="[]"}
-- Bei der Decodierung von&nbsp; <b>ABABABBAA</b>&nbsp; im Schritt&nbsp; $i = 4$.
+- When decoding&nbsp; <b>ABABABBAA</b>&nbsp; in step&nbsp; $i = 4$.
-+ Bei der Decodierung von&nbsp; <b>ABABABABA</b>&nbsp; im Schritt&nbsp; $i = 4$.
++ When decoding&nbsp; <b>ABABABABA</b>&nbsp; in step&nbsp; $i = 4$.
-- Bei der Decodierung von&nbsp; <b>ABABABABA</b>&nbsp; im Schritt&nbsp; $i = 5$.
+- When decoding&nbsp; <b>ABABABABA</b>&nbsp; in step&nbsp; $i = 5$.
@@ Line 83: / Line 82: @@
 </quiz>
-===Musterlösung===
+===Solution===
 {{ML-Kopf}}
 '''(1)'''&nbsp; Wir bezeichnen mit&nbsp; $W(I)$&nbsp; ein Feld (Array), welches das Wörterbuch beschreibt und dessen Elemente Character oder Zeichenfolgen beinhalten.&nbsp;