Difference between revisions of "Exercise 2.13Z: Combination of BWT and MTF"

From LNTwww
 
(23 intermediate revisions by 5 users not shown)
Line 1: Line 1:
{{quiz-Header|Buchseite=Informationstheorie/Weitere Quellencodierverfahren
+
{{quiz-Header|Buchseite=Information_Theory/Further_Source_Coding_Methods
 
}}
 
}}
  
[[File:P_ID2480__Inf_Z_2_14.png|right|frame|Schema für die Burrows–Wheeler–Datenkomprimierung]]
+
[[File:EN_Inf_Z_2_14b.png|right|frame|Scheme for Burrows-Wheeler data compression]]
Wir beziehen uns auf die Theorieseite [[Informationstheorie/Weitere_Quellencodierverfahren#Anwendungsszenario_f.C3.BCr_die_Burrows.E2.80.93Wheeler.E2.80.93Transformation|Anwendungsszenario für die Burrows-Wheeler-Transformation]] und betrachten das rechts skizzierte Codiersystem, bestehend aus den Blöcken.
+
We refer to the theory section  [[Information_Theory/Weitere_Quellencodierverfahren#Application_scenario_for_the_Burrows-Wheeler_transformation|Application Scenario for the Burrows-Wheeler Transform]]  and consider the coding system sketched on the right, consisting of the blocks
* <i>Burrows&ndash;Wheeler&ndash;Transformation</i> $\rm (BWT)$ gemäß der Beschreibung in [[Aufgaben:2.13_Burrows-Wheeler-Rücktransformation|Aufgabe 2.13]]; die beiden Zeichenmengen am Ein&ndash; und Ausgang des BWT sind gleich: &nbsp; $\{$ $\rm D$, $\rm E$, $\rm I$, $\rm M$, $\rm N$, $\rm S$ $\}$;
+
* "Burrows&ndash;Wheeler Transform"&nbsp; $\rm (BWT)$&nbsp; as described in&nbsp; [[Aufgaben:2.13_Burrows-Wheeler-Rücktransformation|Exercise 2.13]];&nbsp; the character sets at the input and the output of the BWT are the same: &nbsp; $\{$ $\rm D$,&nbsp; $\rm E$,&nbsp; $\rm I$,&nbsp; $\rm M$,&nbsp; $\rm N$,&nbsp; $\rm S$ $\}$;
* <i>Move&ndash;to&ndash;Front</i> $\rm (MTF)$, ein Sortieralgorithmus, der eine gleich lange Zeichenfolge (im Beispiel $N = 12$), aber mit anderem Alphabet $\{$<b>0</b>, <b>1</b>, <b>2</b>, <b>3</b>, <b>4</b>, <b>5</b>$\}$ ausgibt;
+
* "Move&ndash;to&ndash;Front"&nbsp; $\rm (MTF)$, a sorting algorithm that outputs a string of the same length&nbsp; $($&nbsp; $N = 12$&nbsp; in the example$)$,&nbsp; but with a different alphabet&nbsp; $\{$<b>0</b>,&nbsp; <b>1</b>,&nbsp; <b>2</b>,&nbsp; <b>3</b>,&nbsp; <b>4</b>,&nbsp; <b>5</b>$\}$;
* $\rm RLC0$ &ndash; eine Lauflängencodierung speziell für die nach BWT und MTF (möglichst) häufige Null; alle anderen Indizes werden durch &bdquo;RLC0&rdquo; nicht verändert;
+
* $\rm RLC0$&nbsp; &ndash; a run-length encoding specifically for&nbsp; <b>0</b>, which should occur as frequently as possible after&nbsp; $\rm BWT$&nbsp; and&nbsp; $\rm MTF$;&nbsp; all other indices are left unchanged by&nbsp; $\rm RLC0$;
* $\rm Huffman$ gemäß der Beschreibung im Kapitel [[Informationstheorie/Entropiecodierung_nach_Huffman|Entropiecodierung nach Huffman]]; häufige Zeichen werden durch kurze Binärfolgen dargestellt und seltene durch lange.
+
* $\rm Huffman$&nbsp; as described in the chapter&nbsp; [[Information_Theory/Entropiecodierung_nach_Huffman|Entropy coding according to Huffman]]; frequent characters are represented by short binary sequences and rare ones by long ones.
  
  
Der MTF&ndash;Algorithmus lässt sich bei $M = 6$ Eingangssymbolen  wie folgt beschreiben:
+
The&nbsp; $\rm MTF$&nbsp; algorithm can be described as follows for&nbsp; $M = 6$&nbsp; input symbols:
  
* Die Ausgangsfolge des MTF ist eine Aneinanderreihung von Indizes aus der Menge
+
* The output sequence of the&nbsp; $\rm MTF$&nbsp; is a string of indices from the set
:&nbsp; &nbsp;  &nbsp; &nbsp;$ I = \{$<b>0</b>, <b>1</b>, <b>2</b>, <b>3</b>, <b>4</b>, <b>5</b>$\}$.
+
:&nbsp; &nbsp;  &nbsp; &nbsp;$ I = \{$<b>0</b>,&nbsp; <b>1</b>,&nbsp; <b>2</b>,&nbsp; <b>3</b>,&nbsp; <b>4</b>,&nbsp; <b>5</b>$\}$.
* Vor Beginn des eigentlichen MTF&ndash;Algorithmus werden die möglichen Eingangssymbole lexikografisch sortiert und den folgenden Indizes zugeordnet:
+
* Before starting the actual&nbsp; $\rm MTF$ algorithm, the possible input symbols are sorted lexicographically and assigned to the following indices:
: &nbsp; &nbsp;  &nbsp; &nbsp;  $\rm D$ &#8594; <b>0</b>, &nbsp; &nbsp; $\rm E$ &#8594; <b>1</b>, &nbsp; &nbsp;  $\rm I$ &#8594; <b>2</b>, &nbsp; &nbsp; $\rm M$ &#8594; <b>3</b>, &nbsp; &nbsp;  $\rm N$ &#8594; <b>4</b>,&nbsp; &nbsp;  $\rm S$ &#8594; <b>5</b>.
+
: &nbsp; &nbsp;  &nbsp; &nbsp;  $\rm D$ &nbsp; &#8594; &nbsp; <b>0</b>, &nbsp; &nbsp; $\rm E$ &nbsp; &#8594; &nbsp; <b>1</b>, &nbsp; &nbsp;  $\rm I$ &nbsp; &#8594; &nbsp; <b>2</b>, &nbsp; &nbsp; $\rm M$ &nbsp; &#8594; &nbsp; <b>3</b>, &nbsp; &nbsp;  $\rm N$ &nbsp; &#8594; &nbsp; <b>4</b>,&nbsp; &nbsp;  $\rm S$ &nbsp; &#8594; &nbsp; <b>5</b>.
* Der MTF&ndash;Eingabestring sei hier $\rm N\hspace{0.05cm}M\hspace{0.05cm}S\hspace{0.05cm}D\hspace{0.05cm}E\hspace{0.05cm}E\hspace{0.05cm}E\hspace{0.05cm}N\hspace{0.05cm}I\hspace{0.05cm}I\hspace{0.05cm}I\hspace{0.05cm}N$ . Dies war das BWT&ndash;Ergebnis in der [[Aufgaben:2.13_Burrows-Wheeler-Rücktransformation|Aufgabe 2.13]]. Das erste $\rm N$ wird gemäß Voreinstellung mit $I = 4$ dargestellt.
+
* Let the&nbsp; $\rm MTF$ input string here be&nbsp; $\rm N\hspace{0.05cm}M\hspace{0.05cm}S\hspace{0.05cm}D\hspace{0.05cm}E\hspace{0.05cm}E\hspace{0.05cm}E\hspace{0.05cm}N\hspace{0.05cm}I\hspace{0.05cm}I\hspace{0.05cm}I\hspace{0.05cm}N$.&nbsp; This was the&nbsp; $\rm BWT$ result in&nbsp; [[Aufgaben:2.13_Burrows-Wheeler-Rücktransformation|Exercise 2.13]].&nbsp; The first&nbsp; $\rm N$&nbsp; is represented as&nbsp; $I = 4$&nbsp; according to the default setting.
* Anschließend wird das $\rm N$ in der Sortierung an den Anfang gestellt, so dass nach dem Codierschritt $i = 1$ die Zuordnung gilt:
+
* Then the&nbsp; $\rm N$&nbsp; is placed at the beginning in the sorting, so that after the coding step&nbsp; $i = 1$&nbsp; the assignment holds:
: &nbsp; &nbsp;  &nbsp; &nbsp;  $\rm N$ &#8594; <b>0</b>, &nbsp; &nbsp; $\rm D$ &#8594; <b>1</b>, &nbsp; &nbsp;  $\rm E$ &#8594; <b>2</b>, &nbsp; &nbsp; $\rm I$ &#8594; <b>3</b>, &nbsp; &nbsp;  $\rm M$ &#8594; <b>4</b>,&nbsp; &nbsp;  $\rm S$ &#8594; <b>5</b>.
+
: &nbsp; &nbsp;  &nbsp; &nbsp;  $\rm N$ &nbsp; &#8594; &nbsp; <b>0</b>, &nbsp; &nbsp; $\rm D$ &nbsp; &#8594; &nbsp; <b>1</b>, &nbsp; &nbsp;  $\rm E$ &nbsp; &#8594; &nbsp;  <b>2</b>, &nbsp; &nbsp; $\rm I$ &nbsp; &#8594; &nbsp; <b>3</b>, &nbsp; &nbsp;  $\rm M$ &nbsp; &#8594; &nbsp; <b>4</b>,&nbsp; &nbsp;  $\rm S$ &nbsp; &#8594; &nbsp; <b>5</b>.
* In gleicher Weise fährt man fort, bis der gesamte Eingangstext abgearbeitet ist. Steht ein Zeichen bereits an Position <b>0</b>, so ist keine Neusortierung erforderlich.
+
* Continue in the same way until the entire input text has been processed.&nbsp; If a character is already at position&nbsp; <b>0</b>, no reordering is necessary.
  
  
  
  
 +
<u>Hints:</u>
 +
*The exercise belongs to the chapter&nbsp; [[Information_Theory/Further_Source_Coding_Methods|Further Source Coding Methods]].
 +
*In particular, reference is made to the section&nbsp; [[Information_Theory/Further_Source_Coding_Methods#Burrows.E2.80.93Wheeler_transformation|Burrows&ndash;Wheeler Transformation]].
 +
*Information on the Huffman code can be found in the chapter&nbsp;  [[Information_Theory/Entropiecodierung_nach_Huffman|Entropy Coding according to Huffman]].&nbsp;  This information is not necessary for the solution of the task.
  
''Hinweise:''
+
===Questions===
*Die Aufgabe gehört zum  Kapitel [[Informationstheorie/Weitere_Quellencodierverfahren|Weitere Quellencodierverfahren]].
 
*Insbesondere wird  Bezug genommen auf die Seite [[Informationstheorie/Weitere_Quellencodierverfahren#Burrows.E2.80.93Wheeler.E2.80.93Transformation|Burrows&ndash;Wheeler&ndash;Transformation]].
 
*Informationen zum Huffman&ndash;Code finden Sie im Kapitel [[Informationstheorie/Entropiecodierung_nach_Huffman|Entropiecodierung nach Huffman]]. Für die Lösung dieser Aufgabe sind diese Informationen aber nicht erforderlich.
 
 
 
 
 
===Fragebogen===
 
  
 
<quiz display=simple>
 
<quiz display=simple>
{Welche Aussagen gelten für den Block &bdquo;BWT&rdquo; des Codiersystems?
+
{Which statements are true for block&nbsp; $\rm BWT$&nbsp; of the coding system?
 
|type="[]"}
 
|type="[]"}
+ Die Eingangszeichenmenge ist $\{$<b>D</b>, <b>E</b>, <b>I</b>, <b>M</b>, <b>N</b>, <b>S</b>$\}$.
+
+ The input character set is&nbsp; $\{ \hspace{0.05cm}\rm D,\hspace{0.05cm}  E,\hspace{0.05cm}  I,\hspace{0.05cm}  M,\hspace{0.05cm}  N , \hspace{0.05cm} S \}$.
+ Die Ausgangszeichenmenge ist $\{$<b>D</b>, <b>E</b>, <b>I</b>, <b>M</b>, <b>N</b>, <b>S</b>$\}$.
+
+ The output character set is $\{ \hspace{0.05cm}\rm D,\hspace{0.05cm}  E,\hspace{0.05cm}  I,\hspace{0.05cm}  M,\hspace{0.05cm}  N , \hspace{0.05cm} S \}$.
- In der Ausgangsfolge treten alle $M = 6$ Zeichen gruppiert auf.
+
- In the output sequence, all&nbsp; $M = 6$&nbsp; characters occur grouped together.
  
  
{Welche Aussagen gelten für den Block &bdquo;MTF&rdquo; des Codiersystems?
+
{Which statements are true for the block&nbsp; $\rm MTF$&nbsp; of the coding system?
 
|type="[]"}
 
|type="[]"}
- Die Ausgangszeichenmenge ist $\{$<b>D</b>, <b>E</b>, <b>I</b>, <b>M</b>, <b>N</b>, <b>S</b>$\}$.
+
- The output character set is&nbsp; $\{ \hspace{0.05cm}\rm D,\hspace{0.05cm}  E,\hspace{0.05cm}  I,\hspace{0.05cm}  M,\hspace{0.05cm}  N , \hspace{0.05cm} S \}$.
+ Die Ausgangszeichenmenge ist $\{$<b>0</b>, <b>1</b>, <b>2</b>, <b>3</b>, <b>4</b>, <b>5</b>$\}$.
+
+ The output character set is&nbsp; $\{ \hspace{0.05cm}\rm 0,\hspace{0.05cm}  1,\hspace{0.05cm}  2,\hspace{0.05cm}  3,\hspace{0.05cm}  4 , \hspace{0.05cm} 5 \}$.
+ Die MTF&ndash;Ausgangsfolge hat die Länge $N = 12$.
+
+ The MTF output sequence has length&nbsp; $N = 12$.
  
  
{Wie lautet die MTF&ndash;Ausgangsfolge?
+
{What is the&nbsp; $\rm MTF$ output sequence?
|type="[]"}
+
|type="()"}
- <b>230000100405</b>,
+
- $\rm 230000100405$,
+ <b>445340045001</b>,
+
+ $\rm 445340045001$,
- <b>543120345123</b>.
+
- $\rm 543120345123$.
  
  
{Welche Aussagen gelten für den Block &bdquo;RLC0&rdquo; des Codiersystems?
+
{Which statements apply to block&nbsp; $\rm RLC0$&nbsp; of the coding system?
 
|type="[]"}
 
|type="[]"}
+ Der Eingangswert <b>0</b> erfährt eine Sonderbehandlung.
+
+ The input value&nbsp; $0$&nbsp; receives special treatment.
+ Je häufiger eine <b>0</b> auftritt, um so effektiver ist dieser Block.
+
+ The more often a&nbsp; $0$&nbsp; occurs, the more effective this block is.
- Am besten wäre Pr(<b>0</b>) &asymp; Pr(<b>1</b>) &asymp; ... &asymp; Pr(<b>5</b>).
+
- The best case would be&nbsp; $\rm Pr(0) \approx Pr(1) \approx \text{...} \approx Pr(5)$.
  
  
{Welche Aussagen gelten für den abschließenden Block &bdquo;Huffman&rdquo;?
+
{Which statements are true for the final block "Huffman"?
 
|type="[]"}
 
|type="[]"}
+ Die Ausgangsfolge ist binär.
+
+ The output sequence is binary.
+ Er bewirkt eine möglichst kleine mittlere Codewortlänge.
+
+ It yields the smallest possible average codeword length.
+ Die Dimensionierung richtet sich nach den anderen Blöcken.
+
+ The dimensioning depends on the other blocks.
  
  
Line 72: Line 70:
 
</quiz>
 
</quiz>
  
===Musterlösung===
+
===Solution===
 
{{ML-Kopf}}
 
{{ML-Kopf}}
'''(1)'''&nbsp; Die Grafik auf der Angabenseite zeigt, dass die <u>Lösungsvorschläge 1 und 2</u> richtig sind und der Vorschlag 3 falsch ist. <b>E</b> und <b>I</b> treten zwar gruppiert auf, aber nicht die <b>N</b>&ndash;Zeichen.
+
[[File:EN_Inf_Z_2_14.png|right|frame|Example of the MTF algorithm]]
 +
'''(1)'''&nbsp; The diagram in the problem statement shows that <u>solution suggestions 1 and 2</u> are correct and suggestion 3 is incorrect:
 +
*$\rm E$&nbsp; and&nbsp; $\rm I$&nbsp; occur grouped together,
 +
*but not the&nbsp; $\rm N$&nbsp; characters.
 +
 
 +
 
 +
'''(2)'''&nbsp;<u>Proposed solutions 2 and 3</u> are correct:
 +
*The input sequence is processed&nbsp; "character by character".&nbsp; <br>Thus, the output sequence also has the length&nbsp; $N = 12$.
 +
*In fact, the input set&nbsp; $\{ \hspace{0.05cm}\rm D,\hspace{0.05cm}  E,\hspace{0.05cm}  I,\hspace{0.05cm}  M,\hspace{0.05cm}  N , \hspace{0.05cm} S \}$&nbsp; is converted into the output set&nbsp;  $\{ \hspace{0.05cm}\rm 0,\hspace{0.05cm}  1,\hspace{0.05cm}  2,\hspace{0.05cm}  3,\hspace{0.05cm}  4 , \hspace{0.05cm} 5 \}$.
 +
*However, not by simple&nbsp; "mapping",&nbsp; but by an algorithm which is outlined below.
 +
<br clear=all>
 +
'''(3)'''&nbsp; <u>Solution suggestion 2</u> is correct:
 +
*The table shows the MTF algorithm.&nbsp; The step&nbsp; $i=0$&nbsp; (red background) indicates the preassignment.&nbsp; The MTF input is highlighted in yellow, the output in green.
 +
* In step&nbsp; $i=1$,&nbsp; the input character&nbsp; $\rm N$&nbsp; corresponding to column&nbsp; $i=0$&nbsp; is represented by index&nbsp; $I = 4$&nbsp;.&nbsp; Subsequently,&nbsp; $\rm N$&nbsp; is sorted to the front, while the order of the other characters remains the same.
 +
*The input character&nbsp; $\rm M$&nbsp; in the second step is also given the index&nbsp; $I = 4$,&nbsp; according to column&nbsp; $i=1$.&nbsp; One continues in the same way up to the twelfth character&nbsp; $\rm N$, to which the index&nbsp; $I = 1$&nbsp; is assigned.
 +
*You can see from the above table that at the times&nbsp; $i=6$,&nbsp; $i=7$,&nbsp; $i=10$&nbsp; and&nbsp; $i=11$&nbsp; the output index is&nbsp; $I = 0$&nbsp;.
 +
 
  
'''(2)'''&nbsp; Richtig sind die <u>Lösungsvorschläge 2 und 3</u>:
 
*Die Eingangsfolge wird Zeichen für Zeichen abgearbeitet. Auch die Ausgangsfolge hat somit die Länge <i>N</i> = 12.
 
*Tatsächlich wird die Eingangsmenge {<b>D</b>, <b>E</b>, <b>I</b>, <b>N</b>, <b>M</b>, <b>S</b>} in die Ausgangsmenge  {<b>0</b>, <b>1</b>, <b>2</b>, <b>3</b>, <b>4</b>, <b>5</b>} gewandelt.
 
*Allerdings nicht durch einfaches <i>Mapping</i>, sondern durch einen Algorithmus, der nachfolgend skizziert wird.
 
  
'''(3)'''&nbsp; Die folgende Tabelle zeigt den MTF&ndash;Algorithmus. Der Schritt <i>i</i>&nbsp;=&nbsp;0 (rote Hinterlegung) gibt die Vorbelegung an. Die Eingabe der MTF ist gelb hinterlegt, die Ausgabe grün. Richtig ist der <u>Lösungsvorschlag 2</u>:
+
'''(4)'''&nbsp; <u>Statements 1 and 2</u>&nbsp; are correct:
* Im Schritt <i>i</i> = 1 wird das Eingangszeichen <b>N</b> entsprechend der Spalte <i>i</i> = 0 durch den Index <i>I</i> = <b>4</b> dargestellt. Anschließend wird <b>N</b> nach vorne sortiert, während die Reihenfolge der anderen Zeichen gleich bleibt.
+
*The preprocessing blocks&nbsp; "BWT"&nbsp; and&nbsp; "MTF"&nbsp; serve only to generate as many zeros as possible.
* Das Eingangszeichen <b>M</b> im zweiten Schritt erhält entsprechend der Spalte <i>i</i> = 1 ebenfalls den Index <i>I</i> = <b>4</b>. In gleicher Weise macht man weiter bis zum 12. Zeichen <b>N</b>, dem der Index <i>I</i> = <b>1</b> zugeordnet wird.
 
*Man erkennt aus obiger Tabelle weiter, dass zu den Zeitpunkten <i>i</i> = 6, <i>i</i>&nbsp;=&nbsp;7, <i>i</i> = 10 und <i>i</i> = 11 der Ausgabeindex jeweils <i>I</i> = <b>0</b> ist.
 
  
:[[File:P_ID2481__Inf_Z_2_14b.png|Beispiel für den MTF–Algorithmus]]
 
  
'''(4)'''&nbsp; Richtig sind die <u>Aussagen 1 und 2</u>. Die Vorverarbeitungen &bdquo;BWT&rdquo; und &bdquo;MTF&rdquo;  haben nur die Aufgabe, möglichst viele Nullen zu generieren.
 
  
'''(5)'''&nbsp; <u>Alle Aussagen</u> sind richtig. Nähere Angaben zum Huffman&ndash;Algorithmus finden Sie im Kapitel &bdquo;Entropiecodierung nach Huffman&rdquo;.
+
'''(5)'''&nbsp; <u>All statements</u>&nbsp; are correct.
 +
*You can find more information on the Huffman algorithm in the chapter&nbsp; "Entropy coding according to Huffman".
 
{{ML-Fuß}}
 
{{ML-Fuß}}
  
  
[[Category:Aufgaben zu Informationstheorie|^2.4 Weitere Quellencodierverfahren^]]
+
[[Category:Information Theory: Exercises|^2.4 Further Source Coding Methods^]]

Latest revision as of 01:05, 13 November 2022

Scheme for Burrows-Wheeler data compression

We refer to the theory section  Application Scenario for the Burrows-Wheeler Transform  and consider the coding system sketched on the right, consisting of the blocks

  • "Burrows–Wheeler Transform"  $\rm (BWT)$  as described in  Exercise 2.13;  the character sets at the input and the output of the BWT are the same:   $\{$ $\rm D$,  $\rm E$,  $\rm I$,  $\rm M$,  $\rm N$,  $\rm S$ $\}$;
  • "Move–to–Front"  $\rm (MTF)$, a sorting algorithm that outputs a string of the same length  $($  $N = 12$  in the example$)$,  but with a different alphabet  $\{$0, 1, 2, 3, 4, 5$\}$;
  • $\rm RLC0$  – a run-length encoding specifically for  0, which should occur as frequently as possible after  $\rm BWT$  and  $\rm MTF$;  all other indices are left unchanged by  $\rm RLC0$;
  • $\rm Huffman$  as described in the chapter  Entropy coding according to Huffman; frequent characters are represented by short binary sequences and rare ones by long ones.


The  $\rm MTF$  algorithm can be described as follows for  $M = 6$  input symbols:

  • The output sequence of the  $\rm MTF$  is a string of indices from the set
       $ I = \{$0, 1, 2, 3, 4, 5$\}$.
  • Before starting the actual  $\rm MTF$ algorithm, the possible input symbols are sorted lexicographically and assigned to the following indices:
        $\rm D$   →   0,     $\rm E$   →   1,     $\rm I$   →   2,     $\rm M$   →   3,     $\rm N$   →   4,    $\rm S$   →   5.
  • Let the  $\rm MTF$ input string here be  $\rm N\hspace{0.05cm}M\hspace{0.05cm}S\hspace{0.05cm}D\hspace{0.05cm}E\hspace{0.05cm}E\hspace{0.05cm}E\hspace{0.05cm}N\hspace{0.05cm}I\hspace{0.05cm}I\hspace{0.05cm}I\hspace{0.05cm}N$.  This was the  $\rm BWT$ result in  Exercise 2.13.  The first  $\rm N$  is represented as  $I = 4$  according to the default setting.
  • Then the  $\rm N$  is placed at the beginning in the sorting, so that after the coding step  $i = 1$  the assignment holds:
        $\rm N$   →   0,     $\rm D$   →   1,     $\rm E$   →   2,     $\rm I$   →   3,     $\rm M$   →   4,    $\rm S$   →   5.
  • Continue in the same way until the entire input text has been processed.  If a character is already at position  0, no reordering is necessary.
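
The steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not code from the exercise: the function name `mtf_encode` and its parameters are chosen here for demonstration, and the alphabet is assumed to be given as a string of distinct symbols.

```python
def mtf_encode(text, alphabet):
    """Move-to-Front: map each symbol to its current table position,
    then move that symbol to the front of the table."""
    table = sorted(alphabet)          # lexicographic preassignment: D E I M N S
    indices = []
    for symbol in text:
        idx = table.index(symbol)     # current position of the symbol
        indices.append(idx)
        if idx != 0:                  # already at position 0: no reordering
            table.insert(0, table.pop(idx))
    return indices


indices = mtf_encode("NMSDEEENIIIN", "DEIMNS")
print("".join(str(i) for i in indices))   # prints 445340045001
```

The output has the same length $N = 12$ as the input, and repeated symbols in the input show up as zeros in the output.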



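Hints:

  • The exercise belongs to the chapter  Further Source Coding Methods.
  • In particular, reference is made to the section  Burrows–Wheeler Transformation.
  • Information on the Huffman code can be found in the chapter  Entropy Coding according to Huffman.  This information is not necessary for the solution of the task.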

Questions

1

Which statements are true for block  $\rm BWT$  of the coding system?

The input character set is  $\{ \hspace{0.05cm}\rm D,\hspace{0.05cm} E,\hspace{0.05cm} I,\hspace{0.05cm} M,\hspace{0.05cm} N , \hspace{0.05cm} S \}$.
The output character set is $\{ \hspace{0.05cm}\rm D,\hspace{0.05cm} E,\hspace{0.05cm} I,\hspace{0.05cm} M,\hspace{0.05cm} N , \hspace{0.05cm} S \}$.
In the output sequence, all  $M = 6$  characters occur grouped together.

2

Which statements are true for the block  $\rm MTF$  of the coding system?

The output character set is  $\{ \hspace{0.05cm}\rm D,\hspace{0.05cm} E,\hspace{0.05cm} I,\hspace{0.05cm} M,\hspace{0.05cm} N , \hspace{0.05cm} S \}$.
The output character set is  $\{ \hspace{0.05cm}\rm 0,\hspace{0.05cm} 1,\hspace{0.05cm} 2,\hspace{0.05cm} 3,\hspace{0.05cm} 4 , \hspace{0.05cm} 5 \}$.
The MTF output sequence has length  $N = 12$.

3

What is the  $\rm MTF$ output sequence?

$\rm 230000100405$,
$\rm 445340045001$,
$\rm 543120345123$.

4

Which statements apply to block  $\rm RLC0$  of the coding system?

The input value  $0$  receives special treatment.
The more often a  $0$  occurs, the more effective this block is.
The best case would be  $\rm Pr(0) \approx Pr(1) \approx \text{...} \approx Pr(5)$.

5

Which statements are true for the final block "Huffman"?

The output sequence is binary.
It yields the smallest possible average codeword length.
The dimensioning depends on the other blocks.


Solution

Example of the MTF algorithm

(1)  The diagram in the problem statement shows that solution suggestions 1 and 2 are correct and suggestion 3 is incorrect:

  • $\rm E$  and  $\rm I$  occur grouped together,
  • but not the  $\rm N$  characters.
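
The grouping claim can be checked mechanically on the BWT output string $\rm NMSDEEENIIIN$ given in the problem statement. The helper names `runs` and `is_grouped` below are illustrative and not part of the exercise; a symbol counts as "grouped" here if all of its occurrences form one contiguous run.

```python
from itertools import groupby


def runs(text):
    """Split the text into (symbol, run_length) pairs of consecutive equal symbols."""
    return [(symbol, len(list(group))) for symbol, group in groupby(text)]


def is_grouped(text, symbol):
    """True if all occurrences of the symbol form exactly one contiguous run."""
    return sum(1 for s, _ in runs(text) if s == symbol) == 1


bwt_output = "NMSDEEENIIIN"
for symbol in "EIN":
    print(symbol, is_grouped(bwt_output, symbol))
```

For this string, $\rm E$ and $\rm I$ each form a single run of length 3, whereas $\rm N$ is spread over three separate runs.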


(2) Proposed solutions 2 and 3 are correct:

  • The input sequence is processed  "character by character". 
    Thus, the output sequence also has the length  $N = 12$.
  • In fact, the input set  $\{ \hspace{0.05cm}\rm D,\hspace{0.05cm} E,\hspace{0.05cm} I,\hspace{0.05cm} M,\hspace{0.05cm} N , \hspace{0.05cm} S \}$  is converted into the output set  $\{ \hspace{0.05cm}\rm 0,\hspace{0.05cm} 1,\hspace{0.05cm} 2,\hspace{0.05cm} 3,\hspace{0.05cm} 4 , \hspace{0.05cm} 5 \}$.
  • However, not by simple  "mapping",  but by an algorithm which is outlined below.


(3)  Solution suggestion 2 is correct:

  • The table shows the MTF algorithm.  The step  $i=0$  (red background) indicates the preassignment.  The MTF input is highlighted in yellow, the output in green.
  • In step  $i=1$,  the input character  $\rm N$  corresponding to column  $i=0$  is represented by index  $I = 4$ .  Subsequently,  $\rm N$  is sorted to the front, while the order of the other characters remains the same.
  • The input character  $\rm M$  in the second step is also given the index  $I = 4$,  according to column  $i=1$.  One continues in the same way up to the twelfth character  $\rm N$, to which the index  $I = 1$  is assigned.
  • You can see from the above table that at the times  $i=6$,  $i=7$,  $i=10$  and  $i=11$  the output index is  $I = 0$ .
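
Since the table itself is an image, the following sketch reproduces its columns step by step. It is an illustrative reconstruction under the rules stated in the problem; the function name `mtf_trace` and the row layout are assumptions made here, not taken from the exercise.

```python
def mtf_trace(text, alphabet):
    """Return one row per step: (i, input symbol, output index, table after the step).
    Row i = 0 holds the preassignment."""
    table = sorted(alphabet)
    rows = [(0, None, None, "".join(table))]
    for i, symbol in enumerate(text, start=1):
        idx = table.index(symbol)             # index read from column i - 1
        table.insert(0, table.pop(idx))       # move the symbol to the front
        rows.append((i, symbol, idx, "".join(table)))
    return rows


rows = mtf_trace("NMSDEEENIIIN", "DEIMNS")
for row in rows:
    print(row)

# Steps whose output index is 0
zero_steps = [i for i, _, idx, _ in rows if idx == 0]
print(zero_steps)   # prints [6, 7, 10, 11]
```

Each row's index is read from the table state of the previous row, which is why $\rm M$ in step $i=2$ is looked up in column $i=1$.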


(4)  Statements 1 and 2  are correct:

  • The preprocessing blocks  "BWT"  and  "MTF"  serve only to generate as many zeros as possible.
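
To illustrate why frequent zeros help, the sketch below collapses each run of zeros into a single token and leaves all other indices unchanged. The exact output format of $\rm RLC0$ is not specified in this exercise, so the token `("run0", n)` is purely a hypothetical representation chosen for illustration.

```python
from itertools import groupby


def collapse_zero_runs(indices):
    """Replace each run of n consecutive zeros by one token ("run0", n).
    Assumption: this token format is illustrative only, not the RLC0 definition."""
    tokens = []
    for value, group in groupby(indices):
        n = len(list(group))
        if value == 0:
            tokens.append(("run0", n))    # special treatment of the index 0
        else:
            tokens.extend([value] * n)    # all other indices pass through unchanged
    return tokens


mtf_output = [4, 4, 5, 3, 4, 0, 0, 4, 5, 0, 0, 1]
print(collapse_zero_runs(mtf_output))
```

With only two short zero runs, the 12 indices shrink to 10 tokens; longer runs of zeros would give a correspondingly larger saving.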


(5)  All statements  are correct.

  • You can find more information on the Huffman algorithm in the chapter  "Entropy coding according to Huffman".
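
As a rough illustration of how the Huffman block depends on the symbol statistics delivered by the preceding blocks, the sketch below derives Huffman codeword lengths from symbol counts. For simplicity it is applied directly to the MTF output and skips $\rm RLC0$, which is an assumption made only for this example; the function name `huffman_code_lengths` is likewise hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: codeword length} of a binary Huffman code for the given sequence."""
    counts = Counter(symbols)
    if len(counts) == 1:                       # degenerate case: a single symbol
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth in subtree})
    heap = [(w, k, {s: 0}) for k, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)        # two least probable subtrees
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


sequence = "445340045001"
lengths = huffman_code_lengths(sequence)
total_bits = sum(lengths[s] for s in sequence)
print(lengths, total_bits)
```

Frequent indices such as 0 and 4 receive short codewords and rare ones such as 1 and 3 receive long ones. The individual lengths can vary with tie-breaking, but the total number of bits for this sequence does not.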