Difference between revisions of "Aufgaben:Exercise 2.9: Huffman Decoding after Errors"

From LNTwww
 
(24 intermediate revisions by 4 users not shown)
Line 1: Line 1:
  
{{quiz-Header|Buchseite=Informationstheorie/Entropiecodierung nach Huffman
+
{{quiz-Header|Buchseite=Information_Theory/Entropy_Coding_According_to_Huffman
 
}}
 
}}
  
[[File:P_ID2464__Inf_A_2_9.png|right|]]
+
[[File:EN_Inf_A_2_9.png|right|frame|Overall system with "Huffman"]]
Wir betrachten die Huffman–Codierung gemäß folgender Zuordnung:
+
We consider Huffman coding according to the following assignment:
  
:<b>A</b> &#8594; <b>1</b>, <b>B</b> &#8594; <b>01</b>, <b>C</b> &#8594; <b>001</b>, <b>D</b> &#8594; <b>000</b>.
+
: &nbsp; $\rm A$ &nbsp; &#8594; &nbsp; <b>1</b>, &nbsp; &nbsp; $\rm B$ &nbsp; &#8594; &nbsp;  <b>01</b>, &nbsp; &nbsp; $\rm C$ &nbsp; &#8594; &nbsp; <b>001</b>, &nbsp; &nbsp; $\rm D$ &nbsp; &#8594; &nbsp; <b>000</b>.
  
Die Codierung nach Huffman ist stets <i>verlustlos</i>. Das bedeutet: Decodiert man die Codesymbolfolge &#9001;<i>c<sub>&nu;</sub></i>&#9002; nach dem Huffman&ndash;Codierer sofort wieder, so ist das Decodierergebnis &#9001;<i>&upsilon;<sub>&nu;</sub></i>&#9002; gleich der Quellensymbolfolge &#9001;<i>q<sub>&nu;</sub></i>&#9002;.
+
Huffman coding is always&nbsp; <u>lossless</u>.&nbsp; This means:
 +
*If the encoded sequence&nbsp; $\langle c_\nu \rangle$&nbsp; is immediately decoded again after the Huffman encoder, the decoding result&nbsp; $\langle v_\nu \rangle$&nbsp; is equal to the source symbol sequence&nbsp; $\langle q_\nu \rangle$.
  
Stimmt dagegen die Empfangsfolge &#9001;<i>r<sub>&nu;</sub></i>&#9002; aufgrund von Fehlern bei der Übertragung (<b>0</b> &#8594; <b>1</b>, <b>1</b> &#8594; <b>0</b>) mit der erzeugten Codefolge &#9001;<i>c<sub>&nu;</sub></i>&#9002; nicht überein, so kann es zu einer Fehlerfortpflanzung kommen. Ein einziger Bitfehler kann dann dazu führen, dass (nahezu) alle nachfolgenden Zeichen falsch decodiert werden.
+
*If, on the other hand, the received sequence&nbsp; $\langle r_\nu \rangle$&nbsp; does not match the generated code sequence&nbsp; $\langle c_\nu \rangle$&nbsp; because of transmission errors&nbsp; <br>$($<b>0</b> &nbsp; &#8594; &nbsp; <b>1</b>, &nbsp; &nbsp; <b>1</b> &nbsp; &#8594; &nbsp; <b>0</b>$)$, error propagation may occur.
 +
*A single bit error can then lead to (almost) all subsequent characters being decoded incorrectly.
  
<b>Hinweis:</b> Die Aufgabe bezieht sich auf die Seite 5 von Kapitel 2.3.
 
  
  
===Fragebogen===
+
 
 +
 
 +
 
 +
<u>Hints:</u>
 +
*The exercise belongs to the chapter&nbsp;  [[Information_Theory/Entropiecodierung_nach_Huffman|Entropy coding according to Huffman]].
 +
*In particular, reference is made to the page&nbsp;  [[Information_Theory/Entropiecodierung_nach_Huffman#Influence_of_transmission_errors_on_decoding|Influence of transmission errors on decoding]]&nbsp;.
 +
 +
 
 +
 
 +
 
 +
===Questions===
  
 
<quiz display=simple>
 
<quiz display=simple>
{Wir betrachten die Codesymbolfolge <b>10100100011000010011</b>. Wie lautet die dazugehörige Quellensymbolfolge?
+
{We consider the encoded sequence&nbsp;  $\langle c_\nu \rangle = \rm \langle 10100100011000010011 \rangle$.&nbsp;  What is the corresponding source symbol sequence?
|type="[]"}
+
|type="()"}
- <b>CCDAADBCA</b>,
+
- $\langle q_\nu \rangle = \rm \langle \rm CCDAADBCA \rangle$,
- <b>ABDDAADBCA</b>,
+
- $\langle q_\nu \rangle = \rm \langle\rm ABDDAADBCA \rangle$,
+ <b>ABCDAADBCA</b>,
+
+ $\langle q_\nu \rangle = \rm \langle\rm ABCDAADBCA \rangle$,
- Anders als die drei genannten.
+
- A sequence other than the three above.
  
  
{Welche Folge ergibt sich nach der Decodierung, wenn das erste Bit verfälscht wird (<b>1</b> &#8594; <b>0</b>)? &nbsp;&#8658;&nbsp; Anliegende Folge <b>00100100011000010011</b>.
+
{Which sequence&nbsp;  $\langle v_\nu \rangle$&nbsp; results after decoding if the first bit is falsified&nbsp;  $\rm (1 &nbsp; &#8594; &nbsp; 0)$? <br> &nbsp;  &nbsp;  $\langle c_\nu \rangle = \rm \langle 10100100011000010011 \rangle$ &nbsp; &nbsp; &rArr; &nbsp; &nbsp; $\langle r_\nu \rangle = \rm \langle \underline{0}0100100011000010011 \rangle$.
|type="[]"}
+
|type="()"}
+ <b>CCDAADBCA</b>,
+
+ $\langle v_\nu \rangle = \rm \langle \rm CCDAADBCA \rangle$,
- <b>ABDDAADBCA</b>,
+
- $\langle v_\nu \rangle = \rm \langle\rm ABDDAADBCA \rangle$,
- <b>ABCDAADBCA</b>,
+
- $\langle v_\nu \rangle = \rm \langle\rm ABCDAADBCA \rangle$,
- Anders als die drei genannten.
+
- A sequence other than the three above.
  
  
{Ist es möglich, dass durch einen weiteren Bitfehler die späteren Symbole alle wieder richtig decodiert werden?
+
{Assume the first bit is falsified as in question 2. Can a second bit error cause all later symbols to be decoded correctly again?
 
|type="[]"}
 
|type="[]"}
+ Ja, durch einen zweiten Bitfehler an Position 2.
+
+ Yes, by a second bit error at position 2.
- Ja, durch einen zweiten Bitfehler an Position 10.
+
- Yes, by a second bit error at position 10.
+ Ja, durch einen zweiten Bitfehler an Position 15.
+
+ Yes, by a second bit error at position 15.
- Nein.
+
- No.
  
  
{Welche Folge ergibt sich nach der Decodierung, wenn das sechste Bit verfälscht wird (<b>1</b> &#8594; <b>0</b>)?
+
{Which sequence&nbsp;  $\langle v_\nu \rangle$&nbsp; results after decoding if the sixth bit is falsified&nbsp;  $\rm (1 &nbsp; &#8594; &nbsp; 0)$? <br> &nbsp;  &nbsp; $\langle c_\nu \rangle = \rm \langle 10100100011000010011 \rangle$ &nbsp; &rArr; &nbsp; $\langle r_\nu \rangle = \rm \langle 10100\underline{0}00011000010011 \rangle$.
|type="[]"}
+
|type="()"}
- <b>CCDAADBCA</b>,
+
- $\langle v_\nu \rangle = \rm \langle \rm CCDAADBCA \rangle$,
+ <b>ABDDAADBCA</b>,
+
+ $\langle v_\nu \rangle = \rm \langle\rm ABDDAADBCA \rangle$,
- <b>ABCDAADBCA</b>,
+
- $\langle v_\nu \rangle = \rm \langle\rm ABCDAADBCA \rangle$,
- Anders als die drei genannten..
+
- A sequence other than the three above.
  
  
Line 53: Line 64:
 
</quiz>
 
</quiz>
  
===Musterlösung===
+
===Solution===
 
{{ML-Kopf}}
 
{{ML-Kopf}}
<b>1.</b>&nbsp;&nbsp;Richtig ist der <u>Vorschlag 3</u>. Nachfolgend sehen Sie die durch Hochkommata eingeteilte Codesymbolfolge <b>1&prime;01&prime;001&prime;000&prime;1&prime;1&prime;000&prime;01&prime;001&prime;1</b> &nbsp;&nbsp;&#8658;&nbsp;&nbsp; Quellensymbolfolge <b>ABCDAADBCA</b>.
+
'''(1)'''&nbsp;<u>Proposed solution 3</u> is correct:
 +
*The encoded sequence is shown below, with apostrophes marking the code word boundaries:
 +
:$$\langle c_\nu \rangle = \rm \langle 1'01'001'000'1'1'000'01'001'1 \rangle .$$
 +
*This belongs to the following source symbol sequence:
 +
:$$\langle q_\nu \rangle = \rm \langle ABCDAADBCA \rangle .$$
 +
 
 +
 
 +
 
 +
'''(2)'''&nbsp;<u>Proposed solution 1</u> is correct:
 +
*With a bit error at position 1, one obtains for the reception sequence:
 +
:$$\langle r_\nu \rangle = \rm \langle 00100100011000010011 \rangle .$$
 +
*The apostrophes mark the individual blocks found by the decoder:
 +
:$$\langle r_\nu \rangle = \rm \langle 001'001'000'1'1'000'01'001'1 \rangle .$$
 +
*This leads to the following sink symbol sequence:
 +
:$$\langle v_\nu \rangle = \rm \langle CCDAADBCA \rangle .$$
 +
 
 +
Interpretation:
 +
*$\rm AB$&nbsp; is replaced by&nbsp; $\rm C$&nbsp;, the further text&nbsp; $\rm CDAADBCA$&nbsp; is unchanged, but shifted by one position.
 +
*However, if you compare the first nine symbols of the original with the decoding result&nbsp; <u>position by position</u>, as an automaton would do, you will find that eight of them differ.
 +
 
 +
 
 +
 
 +
'''(3)'''&nbsp; The correct <u>answers are 1 and 3</u>:
 +
 
 +
* An additional bit error at position 2&nbsp; $\rm (0 &nbsp; &#8594; &nbsp; 1)$&nbsp; falsifies&nbsp; $\rm AB$&nbsp; to&nbsp; $\rm BA$,&nbsp; but all further symbols are recognised correctly again.
 +
* An additional bit error at position 15&nbsp;  $\rm (0 &nbsp; &#8594; &nbsp; 1)$&nbsp; leads to
 +
:$$\langle r_\nu \rangle = \rm {\langle \underline{0}01'001'000'1'1'000'\underline{1}'1'001'1 \rangle} \hspace{0.3cm}\Rightarrow \hspace{0.3cm}\langle \it v_\nu \rangle = \rm \langle \underline{C}CDAAD\underline{AA}CA \rangle .$$
 +
 +
::Due to the bit error at position 1&nbsp; $\rm (1 &nbsp; &#8594; &nbsp; 0)$,&nbsp; $\rm AB$&nbsp; is falsified to&nbsp; $\rm C$,&nbsp; i.e. a character is&nbsp; "swallowed".&nbsp; The additional bit error at position 15&nbsp; $\rm (0 &nbsp; &#8594; &nbsp; 1)$&nbsp; turns&nbsp; $\rm B$&nbsp; into the tuple&nbsp; $\rm AA$.&nbsp; From&nbsp; $\rm CA$&nbsp; onwards, all symbols are again in the correct position and are decoded correctly.
  
<b>2.</b>&nbsp;&nbsp;Mit einem Bitfehler an Position 1 erhält man das folgende Decodierergebnis:
+
* An additional bit error at position 10&nbsp;  $\rm (1 &nbsp; &#8594; &nbsp; 0)$&nbsp; on the other hand, leads to
 +
:$$\langle r_\nu \rangle = \rm \langle \underline{0}01'001'000'\underline{0}1'000'01'001'1 \rangle \hspace{0.3cm}\Rightarrow \hspace{0.3cm}\langle \it  v_\nu \rangle = \rm \langle \underline{C}CD\underline{B}DBCA \rangle .$$
 +
::The bit error at position 10 turns&nbsp; $\rm AA$&nbsp; into&nbsp; $\rm B$.&nbsp; The decoder thus swallows a total of two characters, so all subsequently decoded characters are out of position.
  
:<b><font color="#cc0000"><span style="font-weight: bold;">0</span></font>01&prime;001&prime;000&prime;1&prime;1&prime;000&prime;01&prime;001&prime;1</b> &nbsp;&nbsp;&#8658;&nbsp;&nbsp; <b>CCDAADBCA</b> &#8658; <u>Lösungsvorschlag 1</u>.
 
  
Das heißt: <b>AB</b> wird durch <b>C</b> ersetzt, der weitere Text <b>CDAADBCA</b> ist unverändert, allerdings um eine Position verschoben. Vergleicht man jedoch die ersten neun Symbole des Originals mit der Decodierung <i>Stelle für Stelle</i>, wie es ein Automat machen würde, so erkennt man acht unterschiedliche Symbole.
 
  
<b>3.</b>&nbsp;&nbsp;Richtig sind die <u>Antworten 1 und 3</u>:
+
'''(4)'''&nbsp;<u>Proposed solution 2</u> is correct:
  
:* Durch einen zusätzlichen Bitfehler an Position 2 (<b>0</b> &#8594; <b>1</b>) wird <b>AB</b> zu <b>BA</b> verfälscht, aber alle weiteren Symbole wieder richtig erkannt.
+
*A bit error at position 6&nbsp;  $\rm (1 &nbsp; &#8594; &nbsp; 0)$&nbsp; yields
 +
:$$\langle r_\nu \rangle = \rm \langle 1'01'00\underline{0}'000'1'1'000'01'001'1 \rangle \hspace{0.3cm}\Rightarrow \hspace{0.3cm}\langle \it  v_\nu \rangle = \rm \langle AB\underline{D}DAADBCA \rangle .$$
  
:* Ein zusätzlicher Bitfehler an Position 15 (<b>0</b> &#8594; <b>1</b>) führt zu <b>001&prime;001&prime;000&prime;1&prime;1&prime;000&prime;<font color="#cc0000"><span style="font-weight: bold;">1</span></font>&prime; 1&prime; 001&prime; 1</b> und damit zur Sinkensymbolfolge <b>CCDAADAA<font color="#008800"><span style="font-weight: bold;">CA</span></font></b>. Das neunte und das zehnte Symbol (beide grün markiert) und eventuell weitere Symbole werden richtig erkannt.
+
*The first&nbsp; $\rm C$&nbsp; becomes a&nbsp; $\rm D$.&nbsp; All other symbols are decoded correctly.
  
:* Durch den ersten Bitfehler an Position 1 wird <b>AB</b> in <b>C</b> verfälscht, also ein Zeichen &bdquo;verschluckt&rdquo;. Ein weiterer Bitfehler an Position 10 macht aus <b>AA</b> ein <b>B</b>. Insgesamt verschluckt so der Decoder zwei Zeichen, und alle nachfolgend decodierten Zeichen stehen nicht an der richtigen Position.
 
  
<b>4.</b>&nbsp;&nbsp;Aus <b>001</b> wird <b>000</b>. Das bewirkt, dass insgesamt nur ein Fehler <b>C</b> &#8594; <b>D</b> entsteht:
 
  
<b><font color="#000000"><span style="font-weight: bold;">1</span></font>01&prime;00<b><font color="#cc0000"><span style="font-weight: bold;">0</span></font>&prime;000&prime;1&prime;1&prime;000&prime;01&prime;001&prime;1</b> &nbsp;&nbsp;&#8658;&nbsp;&nbsp; <b>AB<font color="#cc0000"><span style="font-weight: bold;">D</span></font>DAADBCA</b> &nbsp;&nbsp;&#8658;&nbsp;&nbsp;</b> <u>Lösungsvorschlag 2</u>.
 
 
{{ML-Fuß}}
 
{{ML-Fuß}}
  
  
  
[[Category:Aufgaben zu Informationstheorie|^2.3 Entropiecodierung nach Huffman^]]
+
[[Category:Information Theory: Exercises|^2.3 Entropy Coding according to Huffman^]]

Latest revision as of 17:29, 23 January 2023

Figure: Overall system with "Huffman"

We consider Huffman coding according to the following assignment:

  $\rm A$   →   1,     $\rm B$   →   01,     $\rm C$   →   001,     $\rm D$   →   000.

Huffman coding is always  lossless.  This means:

  • If the encoded sequence  $\langle c_\nu \rangle$  is immediately decoded again after the Huffman encoder, the decoding result  $\langle v_\nu \rangle$  is equal to the source symbol sequence  $\langle q_\nu \rangle$.
  • If, on the other hand, the received sequence  $\langle r_\nu \rangle$  does not match the generated code sequence  $\langle c_\nu \rangle$  because of transmission errors  (0 → 1,  1 → 0), error propagation may occur.
  • A single bit error can then lead to (almost) all subsequent characters being decoded incorrectly.
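
The lossless round trip described above can be sketched in a few lines of Python. This is only an illustration and not part of the exercise: the names `CODE`, `encode` and `decode` are assumptions introduced here, and the decoder simply emits a symbol as soon as the buffered bits form a code word.

```python
# Code table from the exercise statement.
CODE = {"A": "1", "B": "01", "C": "001", "D": "000"}
INVERSE = {word: symbol for symbol, word in CODE.items()}


def encode(symbols: str) -> str:
    """Concatenate the code words of all source symbols."""
    return "".join(CODE[s] for s in symbols)


def decode(bits: str) -> str:
    """Greedy prefix decoding: emit a symbol whenever the buffer is a code word."""
    symbols, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in INVERSE:
            symbols.append(INVERSE[buffer])
            buffer = ""
    # Trailing bits that do not complete a code word are dropped in this sketch.
    return "".join(symbols)


q = "ABCDAADBCA"
c = encode(q)
v = decode(c)
print(c, v, v == q)
```

Because no code word is a prefix of another, the greedy decoder is unambiguous as long as the received bits equal the transmitted ones.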




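Hints:

  • The exercise belongs to the chapter "Entropy coding according to Huffman".
  • In particular, reference is made to the page "Influence of transmission errors on decoding".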



Questions

1

We consider the encoded sequence  $\langle c_\nu \rangle = \rm \langle 10100100011000010011 \rangle$.  What is the corresponding source symbol sequence?

$\langle q_\nu \rangle = \rm \langle \rm CCDAADBCA \rangle$,
$\langle q_\nu \rangle = \rm \langle\rm ABDDAADBCA \rangle$,
$\langle q_\nu \rangle = \rm \langle\rm ABCDAADBCA \rangle$,
A sequence other than the three above.

2

Which sequence  $\langle v_\nu \rangle$  results after decoding if the first bit is falsified  $\rm (1   →   0)$?
    $\langle c_\nu \rangle = \rm \langle 10100100011000010011 \rangle$     ⇒     $\langle r_\nu \rangle = \rm \langle \underline{0}0100100011000010011 \rangle$.

$\langle v_\nu \rangle = \rm \langle \rm CCDAADBCA \rangle$,
$\langle v_\nu \rangle = \rm \langle\rm ABDDAADBCA \rangle$,
$\langle v_\nu \rangle = \rm \langle\rm ABCDAADBCA \rangle$,
A sequence other than the three above.

3

Assume the first bit is falsified as in question 2. Can a second bit error cause all later symbols to be decoded correctly again?

Yes, by a second bit error at position 2.
Yes, by a second bit error at position 10.
Yes, by a second bit error at position 15.
No.

4

Which sequence  $\langle v_\nu \rangle$  results after decoding if the sixth bit is falsified  $\rm (1   →   0)$?
    $\langle c_\nu \rangle = \rm \langle 10100100011000010011 \rangle$   ⇒   $\langle r_\nu \rangle = \rm \langle 10100\underline{0}00011000010011 \rangle$.

$\langle v_\nu \rangle = \rm \langle \rm CCDAADBCA \rangle$,
$\langle v_\nu \rangle = \rm \langle\rm ABDDAADBCA \rangle$,
$\langle v_\nu \rangle = \rm \langle\rm ABCDAADBCA \rangle$,
A sequence other than the three above.


Solution

(1) Proposed solution 3 is correct:

  • The encoded sequence is shown below, with apostrophes marking the code word boundaries:
$$\langle c_\nu \rangle = \rm \langle 1'01'001'000'1'1'000'01'001'1 \rangle .$$
  • This corresponds to the following source symbol sequence:
$$\langle q_\nu \rangle = \rm \langle ABCDAADBCA \rangle .$$


(2) Proposed solution 1 is correct:

  • With a bit error at position 1, the received sequence is:
$$\langle r_\nu \rangle = \rm \langle 00100100011000010011 \rangle .$$
  • The apostrophes mark the individual blocks found by the decoder:
$$\langle r_\nu \rangle = \rm \langle 001'001'000'1'1'000'01'001'1 \rangle .$$
  • This leads to the following sink symbol sequence:
$$\langle v_\nu \rangle = \rm \langle CCDAADBCA \rangle .$$

Interpretation:

  • $\rm AB$  is replaced by  $\rm C$.  The remaining text  $\rm CDAADBCA$  is unchanged, but shifted by one position.
  • However, if you compare the first nine symbols of the original with the decoding result  position by position, as an automaton would do, you will find that eight of them differ.
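
The position-by-position comparison can be reproduced with a short Python sketch. The helper name `count_mismatches` is an assumption introduced here for illustration; it compares the two sequences only over the length of the shorter one, which matches the "first nine symbols" of the interpretation.

```python
def count_mismatches(original: str, decoded: str) -> int:
    """Count positions where the sequences differ, over the shorter length."""
    return sum(q != v for q, v in zip(original, decoded))


original = "ABCDAADBCA"  # source symbol sequence from (1)
decoded = "CCDAADBCA"    # sink symbol sequence from (2)

mismatches = count_mismatches(original, decoded)
print(mismatches)
```

Only the fifth symbol ($\rm A$ in both sequences) coincides by chance; the shift by one position makes every other compared pair differ.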


(3)  The correct answers are 1 and 3:

  • An additional bit error at position 2  $\rm (0   →   1)$  turns  $\rm AB$  into  $\rm BA$,  but all subsequent symbols are decoded correctly again.
  • An additional bit error at position 15  $\rm (0   →   1)$  leads to
$$\langle r_\nu \rangle = \rm {\langle \underline{0}01'001'000'1'1'000'\underline{1}'1'001'1 \rangle} \hspace{0.3cm}\Rightarrow \hspace{0.3cm}\langle \it v_\nu \rangle = \rm \langle \underline{C}CDAAD\underline{AA}CA \rangle .$$
Due to the bit error at position 1  $\rm (1   →   0)$,  $\rm AB$  is falsified to  $\rm C$,  i.e. a character is  "swallowed".  The additional bit error at position 15  $\rm (0   →   1)$  turns  $\rm B$  into the tuple  $\rm AA$.  From  $\rm CA$  onwards, all symbols are again in the correct position and are decoded correctly.
  • An additional bit error at position 10  $\rm (1   →   0)$,  on the other hand, leads to
$$\langle r_\nu \rangle = \rm \langle \underline{0}01'001'000'\underline{0}1'000'01'001'1 \rangle \hspace{0.3cm}\Rightarrow \hspace{0.3cm}\langle \it v_\nu \rangle = \rm \langle \underline{C}CD\underline{B}DBCA \rangle .$$
The bit error at position 10 turns  $\rm AA$  into  $\rm B$.  The decoder thus swallows a total of two characters, so all subsequently decoded characters are out of position.
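
Decoding by hand is error-prone, so the three candidate positions can be checked with a short Python sketch. The helper names `flip` and `decode` are assumptions introduced here; bit positions are 1-based, as in the exercise.

```python
CODE = {"A": "1", "B": "01", "C": "001", "D": "000"}
INVERSE = {word: symbol for symbol, word in CODE.items()}


def flip(bits: str, positions) -> str:
    """Invert the bits at the given 1-based positions."""
    out = list(bits)
    for p in positions:
        out[p - 1] = "1" if out[p - 1] == "0" else "0"
    return "".join(out)


def decode(bits: str) -> str:
    """Greedy prefix decoding; incomplete trailing bits are dropped."""
    symbols, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in INVERSE:
            symbols.append(INVERSE[buffer])
            buffer = ""
    return "".join(symbols)


c = "10100100011000010011"  # encodes ABCDAADBCA
# First error at position 1, second error at the candidate position p.
results = {p: decode(flip(c, [1, p])) for p in (2, 10, 15)}
for p, v in results.items():
    print(p, v, len(v))
```

A decoded length of 10, equal to the length of the source sequence, is necessary for the later symbols to sit in their original positions. That holds for positions 2 and 15, whereas position 10 leaves only eight symbols.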


(4) Proposed solution 2 is correct:

  • A bit error at position 6  $\rm (1   →   0)$  yields
$$\langle r_\nu \rangle = \rm \langle 1'01'00\underline{0}'000'1'1'000'01'001'1 \rangle \hspace{0.3cm}\Rightarrow \hspace{0.3cm}\langle \it v_\nu \rangle = \rm \langle AB\underline{D}DAADBCA \rangle .$$
  • The code word  001  becomes  000,  so the first  $\rm C$  becomes a  $\rm D$.  All other symbols are decoded correctly.
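
The effect of the bit error at position 6 on the code word boundaries can be made visible with a small Python sketch. The function name `segment` is an assumption introduced here; it splits a bit string into code words and joins them with apostrophes, the notation used in the solutions.

```python
CODE_WORDS = {"1", "01", "001", "000"}  # code words for A, B, C, D


def segment(bits: str) -> str:
    """Split a bit string into code words, separated by apostrophes."""
    words, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in CODE_WORDS:
            words.append(buffer)
            buffer = ""
    return "'".join(words)


c = "10100100011000010011"
r = c[:5] + "0" + c[6:]  # bit 6 (1-based) changed from 1 to 0

print(segment(c))
print(segment(r))
```

Since 001 and 000 have the same length, the flip replaces one code word by another without moving any boundary, which is why the error stays confined to a single symbol.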