Difference between revisions of "Aufgaben:Exercise 2.11Z: Arithmetic Coding once again"

Latest revision as of 16:26, 12 August 2021

Preset interval ranges

Here we consider arithmetic coding $(\rm AC)$. All necessary information about this type of entropy coding can be found in Exercise 2.11.

The graph is also the result of Exercise 2.11. The numerical values for coding steps 3 and 7 that are important for the current task are highlighted in colour:

The interval for $N= 3$ $($symbol sequence $\rm XXY)$ starts at $B_3 = 0.343$ and goes up to $E_3 = 0.392$.
The interval limits for $N= 7$ $($symbol sequence $\rm XXYXXXZ)$ are $B_7 = 0.3564456$ and $E_7 =0.359807$.

This task is only about assigning binary sequences to the selected intervals. Procedure:

The interval $I$ is determined by the beginning $B$, the end $E$, the interval width ${\it \Delta} = E-B$ as well as the interval centre $M = (B+E)/2$.
The interval $I$ is characterised by the binary representation (with limited resolution) of any real number value $r \in I$. For example, one chooses $r \approx M$.
The required number of bits results from the interval width according to the following equation (the open square brackets mean "round up"):

$$N_{\rm bits} = \left\lceil{\rm log_2} \hspace{0.15cm} 1/{\it \Delta} \right\rceil+1\hspace{0.05cm}. $$

For example, for $N_{\rm bits} = 5$ the binary code 01001 stands for the following real-valued number $r$:

$$r = 0 \cdot 2^{-1}+1 \cdot 2^{-2}+0 \cdot 2^{-3}+0 \cdot 2^{-4}+1 \cdot 2^{-5} = 0.28125 \hspace{0.05cm}. $$

Hints:

The exercise belongs to the chapter Further source coding methods.
In particular, reference is made to the page Arithmetic Coding.
Further information on the topic can also be found in this Wikipedia article.

Questions

$N_\text{bits} \ = \ $

	$\rm AC = $ 01011,
	$\rm AC = $ 010111,
	$\rm AC = $ 110111.

$N_\text{bits} \ = \ $

	Yes.
	No.

	It is a common coding of entire sequences.
	A computer with 32-bit architecture limits the sequence length $N$.
	This problem can be circumvented by integer realisation.
	Integer realisation increases the coding speed.

Solution

(1) The selected interval begins at $B_3 = 0.343$ and ends at $E_3 = 0.392$.

The interval width is therefore ${\it \Delta}_3 = 0.049$ and thus the following applies with the logarithm to base 2:

$$N_{\rm bits} = {\rm log_2} \hspace{0.15cm} \left\lceil \frac{1}{0.049}\right\rceil+1\hspace{0.15cm}\underline{= 6} \hspace{0.05cm}.$$

(2) The selected interval results in $I = \big[0.343, \ 0.392\big)$.

The middle lies at $M_3 = 0.3675$.
To determine the arithmetic code, we try to reach the centre of the interval as well as possible by using a binary representation.
Since we do not have a corresponding tool for solving this task at the moment, we assume the following secondary calculations:

$H_4 = 2^{-2} + 2^{-4} = 0.3125$ ⇒ does not belong to interval $I$.

$H_5 = H_4 +2^{-5} = 0.34375 \in I$ ⇒ binary representation: 0.01011 ⇒ Code: 01011.

$H_6 = H_5 +2^{-6} = 0.359375 \in I$ ⇒ binary representation: 0.010111 ⇒ Code: 010111.

$H_7 = H_6 +2^{-7} = 0.3671875 \in I$ ⇒ binary representation: 0.0101111 ⇒ Code: 0101111.

$H_{12} = H_7 +2^{-12} = 0.3674316406 \in I$ ⇒ binary representation: 0.010111100001 ⇒ Code: 010111100001.

The corresponding 6–bit code is therefore $\rm AC =$ 010111 ⇒ Correct is solution suggestion 2.

(3) Here, with the beginning $B_7 = 0.3564456$ and the end $E_7 = 0.359807$ the interval width is ${\it \Delta}_7 = 0.0033614$ and thus

$$N_{\rm bits} = \left\lceil {\rm log_2} \hspace{0.15cm} \frac{1}{0.0033614} \right\rceil + 1\hspace{0.15cm} = \left\lceil {\rm log_2} \hspace{0.15cm} 297.5 \right\rceil + 1\hspace{0.15cm} \underline{= 11} \hspace{0.05cm}.$$

(4) The binary representation of the code 01011100001 results in

$$2^{-2}+ 2^{-4}+ 2^{-5}+ 2^{-6}+ 2^{-11} = 0.3598632813 > E_7 \hspace{0.05cm}.$$

The correct answer is therefore NO. The valid arithmetic code is $\rm AC =$ 01011011101, because of

$$2^{-2}+ 2^{-4}+ 2^{-5}+ 2^{-7}+ 2^{-8}+ 2^{-9}+ 2^{-11} =0.3579101563 \hspace{0.3cm} \Rightarrow \hspace{0.3cm} B_7 \le 0.3579101563 < E_7.$$

(5) All statements are correct. See also this Wikipedia article.

@@ Line 1: / Line 1: @@
-{{quiz-Header|Buchseite=Informationstheorie/Weitere Quellencodierverfahren
+{{quiz-Header|Buchseite=Information_Theory/Further_Source_Coding_Methods
 }}
 [[File:EN_Inf_Z_2_11_v2.png|right|frame|Preset interval ranges]]
-Here we consider arithmetic coding&nbsp; $(\rm AC)$.&nbsp; All necessary information about this type of entropy coding can be found in&nbsp; [[Aufgaben:2.11_Arithmetische_Codierung|Exercise 2.11]].
+Here we consider arithmetic coding&nbsp; $(\rm AC)$.&nbsp; All necessary information about this type of entropy coding can be found in&nbsp; [[Aufgaben:Exercise_2.11:_Arithmetic_Coding|Exercise 2.11]].
 The graph is also the result of Exercise 2.11.&nbsp; The numerical values for coding steps 3 and 7 that are important for the current task are highlighted in colour:
@@ Line 13: / Line 13: @@
 This task is only about assigning binary sequences to the selected intervals. Procedure:
-* The interval&nbsp; $I$&nbsp; is determined by the beginning&nbsp; $B$, the end&nbsp; $E$,&nbsp; the interval width&nbsp; ${\it \Delta} = E-B$ &nbsp;sowie die Intervallmitte &nbsp;$M = (B+E)/2$.
+* The interval&nbsp; $I$&nbsp; is determined by the beginning&nbsp; $B$, the end&nbsp; $E$,&nbsp; the interval width&nbsp; ${\it \Delta} = E-B$ &nbsp;as well as the interval centre &nbsp;$M = (B+E)/2$.
-* The interval&nbsp; $I$&nbsp; is characterised by the binary representation (with limited resolution) of any real number value&nbsp; $r \in I$.&nbsp; For example, one chooses &nbsp;$r \approx M$.
+* The interval&nbsp; $I$&nbsp; is characterised by the binary representation&nbsp; (with limited resolution)&nbsp; of any real number value&nbsp; $r \in I$.&nbsp; For example, one chooses &nbsp;$r \approx M$.
-* The required number of bits results from the interval width according to the following equation (the open square brackets mean "round up"):
+* The required number of bits results from the interval width according to the following equation&nbsp; (the open square brackets mean&nbsp; "round up"):
-:$$N_{\rm Bit} =  \left\lceil{\rm log_2} \hspace{0.15cm} 1/{\it \Delta} \right\rceil+1\hspace{0.05cm}. $$
+:$$N_{\rm bits} =  \left\lceil{\rm log_2} \hspace{0.15cm} 1/{\it \Delta} \right\rceil+1\hspace{0.05cm}. $$
-For example, for&nbsp; $N_{\rm Bit} = 5$&nbsp; the binary code&nbsp; <b>01001</b>&nbsp; stands for the following real-valued number&nbsp; $r$:
+For example, for&nbsp; $N_{\rm bits} = 5$&nbsp; the binary code&nbsp; <b>01001</b>&nbsp; stands for the following real-valued number&nbsp; $r$:
 :$$r = 0 \cdot 2^{-1}+1 \cdot 2^{-2}+0 \cdot 2^{-3}+0 \cdot 2^{-4}+1 \cdot 2^{-5} = 0.28125
 \hspace{0.05cm}. $$
@@ Line 28: / Line 28: @@
-Hints:
+<u>Hints:</u>
-*The task belongs to the chapter&nbsp; [[Information_Theory/Weitere_Quellencodierverfahren|Other source coding methods]].
+*The exercise belongs to the chapter&nbsp; [[Information_Theory/Further_Source_Coding_Methods|Further source coding methods]].
-*In particular, reference is made to the page &nbsp; [[Information_Theory/Weitere_Quellencodierverfahren#Arithmetische_Codierung|Arithmetic coding]].
+*In particular, reference is made to the page&nbsp; [[Information_Theory/Further_Source_Coding_Methods#Arithmetic_coding|Arithmetic Coding]].
 *Further information on the topic can also be found in this&nbsp; [https://en.wikipedia.org/wiki/Arithmetic_coding Wikipedia article].
@@ Line 40: / Line 40: @@
 {How many bits are used to represent the source symbol sequence&nbsp; $\rm XXY$ &nbsp; &rArr; &nbsp; $N = 3$&nbsp;?
 |type="{}"}
-$N_\text{Bit} \ = \ $ { 6 }
+$N_\text{bits} \ = \ $ { 6 }
@@ Line 52: / Line 52: @@
 {How many bits are used to represent the source symbol sequence&nbsp; $\rm XXYXXXZ$ &nbsp; &rArr; &nbsp; $N = 7$&nbsp;?
 |type="{}"}
-$N_\text{Bit} \ = \ $ { 11 }
+$N_\text{bits} \ = \ $ { 11 }
@@ Line 64: / Line 64: @@
 |type="[]"}
 + It is a common coding of entire sequences.
-+ A 32 Bit&ndash;computer architecture limits the sequence length&nbsp; $N$.
++ A computer with 32-bit architecture limits the sequence length&nbsp; $N$.
 + This problem can be circumvented by integer realisation.
 + Integer realisation increases the coding speed.
@@ Line 74: / Line 74: @@
 ===Solution===
 {{ML-Kopf}}
-'''(1)'''&nbsp; Das ausgewählte Intervall beginnt bei&nbsp; $B_3 = 0.343$&nbsp; und endet bei&nbsp; $E_3 = 0.392$.
+'''(1)'''&nbsp; The selected interval begins at&nbsp; $B_3 = 0.343$&nbsp; and ends at&nbsp; $E_3 = 0.392$.
-*Die Intervallbreite ist somit&nbsp; ${\it \Delta}_3 = 0.049$&nbsp; und damit gilt mit dem <i>Logarithmus dualis</i>:
+*The interval width is therefore&nbsp; ${\it \Delta}_3 = 0.049$&nbsp; and thus the following applies with the logarithm to base 2:
-:$$N_{\rm Bit} = {\rm log_2} \hspace{0.15cm} \left\lceil \frac{1}{0.049}\right\rceil+1\hspace{0.15cm}\underline{= 6}
+:$$N_{\rm bits} = {\rm log_2} \hspace{0.15cm} \left\lceil \frac{1}{0.049}\right\rceil+1\hspace{0.15cm}\underline{= 6}
 \hspace{0.05cm}.$$
-'''(2)'''&nbsp; Das ausgewählte Intervall ergibt sich zu&nbsp; $I = \big[0.343, \ 0.392\big)$.
+'''(2)'''&nbsp; The selected interval results in&nbsp; $I = \big[0.343, \ 0.392\big)$.
-*Die Mitte liegt bei&nbsp; $M_3 = 0.3675$.
+*The middle lies at&nbsp; $M_3 = 0.3675$.
-*Zur Bestimmung des arithmetischen Codes versuchen wir, die Intervallmitte durch eine Binärdarstellung möglichst gut zu erreichen.
+*To determine the arithmetic code, we try to reach the centre of the interval as well as possible by using a binary representation.
-*Da uns gerade kein entsprechendes Tool zur Lösung dieser Aufgabe zur Verfügung steht, gehen wir von folgenden Nebenrechnungen aus:
+*Since we do not have a corresponding tool for solving this task at the moment, we assume the following secondary calculations:
-:&nbsp;$H_4 = 2^{-2} + 2^{-2} = 0.3125$ &nbsp; &#8658; &nbsp; gehört nicht zum Intervall&nbsp; $I$.
-:&nbsp;$H_5 = H_4 +2^{-5} = 0.34375  \in I$  &nbsp; &#8658; &nbsp;  Binärdarstellung: &nbsp; '''0.01011'''  &nbsp;&#8658;&nbsp;  Code: &nbsp; <b>01011</b>.
+:&nbsp;$H_4 = 2^{-2} + 2^{-4} = 0.3125$ &nbsp; &#8658; &nbsp; does not belong to interval&nbsp; $I$.
-:&nbsp;$H_6 = H_5 +2^{-6} = 0.359375  \in I$  &nbsp; &#8658; &nbsp;  Binärdarstellung: &nbsp; '''0.010111'''  &nbsp;&#8658;&nbsp;  Code: &nbsp; <b>010111</b>.
+:&nbsp;$H_5 = H_4 +2^{-5} = 0.34375  \in I$  &nbsp; &#8658; &nbsp;  binary representation: &nbsp; '''0.01011'''  &nbsp;&#8658;&nbsp;  Code: &nbsp; <b>01011</b>.
-:&nbsp;$H_7 = H_6 +2^{-7} = 0.3671875  \in I$  &nbsp; &#8658; &nbsp;  Binärdarstellung: &nbsp; '''0.0101111'''  &nbsp;&#8658;&nbsp;  Code: &nbsp; <b>0101111</b>.
+:&nbsp;$H_6 = H_5 +2^{-6} = 0.359375  \in I$  &nbsp; &#8658; &nbsp;  binary representation: &nbsp; '''0.010111'''  &nbsp;&#8658;&nbsp;  Code: &nbsp; <b>010111</b>.
-:&nbsp;$H_{12} = H_7 +2^{-12} = 0.3674316406  \in I$  &nbsp; &#8658; &nbsp;  Binärdarstellung: &nbsp; '''0.010111100001'''  &nbsp;&#8658;&nbsp;  Code: &nbsp; <b>010111100001</b>.
+:&nbsp;$H_7 = H_6 +2^{-7} = 0.3671875  \in I$  &nbsp; &#8658; &nbsp;  binary representation: &nbsp; '''0.0101111'''  &nbsp;&#8658;&nbsp;  Code: &nbsp; <b>0101111</b>.
-Der entsprechende 6 Bit&ndash;Code lautet somit&nbsp; $\rm AC =$&nbsp;  <b>010111</b> &nbsp;&nbsp;&#8658;&nbsp;&nbsp; Richtig ist der <u>Lösungsvorschlag 2</u>.
+:&nbsp;$H_{12} = H_7 +2^{-12} = 0.3674316406  \in I$  &nbsp; &#8658; &nbsp;  binary representation: &nbsp; '''0.010111100001'''  &nbsp;&#8658;&nbsp;  Code: &nbsp; <b>010111100001</b>.
+The corresponding 6&ndash;bit code is therefore&nbsp; $\rm AC =$&nbsp;  <b>010111</b> &nbsp;&nbsp;&#8658;&nbsp;&nbsp; Correct is&nbsp; <u>solution suggestion 2</u>.
-'''(3)'''&nbsp; Hier ergibt sich mit dem Beginn&nbsp; $B_7 = 0.3564456$&nbsp; und dem Ende&nbsp; $E_7  = 0.359807$&nbsp; die Intervallbreite&nbsp; ${\it \Delta}_7 = 0.0033614$&nbsp; und damit
-:$$N_{\rm Bit} = \left\lceil  {\rm log_2} \hspace{0.15cm} \frac{1}{0.0033614} \right\rceil + 1\hspace{0.15cm} =
+'''(3)'''&nbsp; Here, with the beginning&nbsp; $B_7 = 0.3564456$&nbsp; and the end&nbsp; $E_7  = 0.359807$&nbsp; the interval width is&nbsp; ${\it \Delta}_7 = 0.0033614$&nbsp; and thus
+:$$N_{\rm bits} = \left\lceil  {\rm log_2} \hspace{0.15cm} \frac{1}{0.0033614} \right\rceil + 1\hspace{0.15cm} =
 \left\lceil  {\rm log_2} \hspace{0.15cm} 297.5 \right\rceil + 1\hspace{0.15cm} \underline{= 11}
 \hspace{0.05cm}.$$
@@ Line 107: / Line 108: @@
-'''(4)'''&nbsp; Die Binärdarstellung des Codes&nbsp; <b>01011100001</b>&nbsp; ergibt
+'''(4)'''&nbsp; The binary representation of the code&nbsp; <b>01011100001</b>&nbsp; results in
 :$$2^{-2}+ 2^{-4}+ 2^{-5}+ 2^{-6}+ 2^{-11} = 0.3598632813 > E_7
 \hspace{0.05cm}.$$
-*Richtig ist also <u>NEIN</u>.&nbsp; Der gültige arithmetische Code ist&nbsp; $\rm AC =$&nbsp; <b>01011011101</b>, wegen
+*The correct answer is therefore <u>NO</u>.&nbsp; The valid arithmetic code is&nbsp; $\rm AC =$&nbsp; <b>01011011101</b>, because of
 :$$2^{-2}+ 2^{-4}+ 2^{-5}+ 2^{-7}+ 2^{-8}+ 2^{-9}+ 2^{-11} =0.3579101563 \hspace{0.3cm}
 \Rightarrow \hspace{0.3cm} B_7 \le 0.3579101563 <  E_7.$$
-'''(5)'''&nbsp; <u>Alle Aussagen</u> sind richtig. Siehe auch:
+'''(5)'''&nbsp; <u>All statements</u> are correct. See also &nbsp; [https://en.wikipedia.org/wiki/Arithmetic_coding '''this Wikipedia article'''].
-:Bodden, E.; Clasen, M.; Kneis, J.: Algebraische Kodierung. Proseminar, Lehrstuhl für Informatik IV, RWTH Aachen, 2002.
 {{ML-Fuß}}