Difference between revisions of "Aufgaben:Exercise 2.11Z: Arithmetic Coding once again"
(27 intermediate revisions by 5 users not shown) | |||
Line 1: | Line 1: | ||
− | {{quiz-Header|Buchseite= | + | {{quiz-Header|Buchseite=Information_Theory/Further_Source_Coding_Methods |
}} | }} | ||
− | [[File: | + | [[File:EN_Inf_Z_2_11_v2.png|right|frame|Preset interval ranges]] |
− | + | Here we consider arithmetic coding $(\rm AC)$. All necessary information about this type of entropy coding can be found in [[Aufgaben:Exercise_2.11:_Arithmetic_Coding|Exercise 2.11]]. | |
− | + | The graph is also the result of Exercise 2.11. The numerical values for coding steps 3 and 7 that are important for the current task are highlighted in colour: | |
+ | * The interval for N=3 (symbol sequence XXY) starts at B3=0.343 and goes up to E3=0.392. | ||
+ | * The interval limits for N=7 (symbol sequence XXYXXXZ) are B7=0.3564456 and E7=0.359807. | ||
− | |||
− | : | + | This task is only about assigning binary sequences to the selected intervals. Procedure: |
− | + | * The interval I is determined by the beginning B, the end E, the interval width Δ=E−B as well as the interval centre M=(B+E)/2. | |
+ | * The interval I is characterised by the binary representation (with limited resolution) of any real number value r∈I. For example, one chooses r≈M. | ||
+ | * The required number of bits results from the interval width according to the following equation (the open square brackets mean "round up"): | ||
+ | :Nbits=⌈log21/Δ⌉+1. | ||
− | + | For example, for $N_{\rm bits} = 5$ the binary code <b>01001</b> stands for the following real-valued number r: | |
+ | :$$r = 0 \cdot 2^{-1}+1 \cdot 2^{-2}+0 \cdot 2^{-3}+0 \cdot 2^{-4}+1 \cdot 2^{-5} = 0.28125 | ||
+ | \hspace{0.05cm}. $$ | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | === | + | |
+ | |||
+ | <u>Hints:</u> | ||
+ | *The exercise belongs to the chapter [[Information_Theory/Further_Source_Coding_Methods|Further source coding methods]]. | ||
+ | *In particular, reference is made to the page [[Information_Theory/Further_Source_Coding_Methods#Arithmetic_coding|Arithmetic Coding]]. | ||
+ | *Further information on the topic can also be found in this [https://en.wikipedia.org/wiki/Arithmetic_coding Wikipedia article]. | ||
+ | |||
+ | |||
+ | |||
+ | ===Questions=== | ||
<quiz display=simple> | <quiz display=simple> | ||
− | { | + | {How many bits are used to represent the source symbol sequence $\rm XXY ⇒ N = 3$ ? |
|type="{}"} | |type="{}"} | ||
− | $ | + | $N_\text{bits} \ = \ $ { 6 } |
− | { | + | {Which arithmetic code $\rm (AC)$ applies to this case? |
− | |type=" | + | |type="()"} |
− | - AC = <b>01011</b>, | + | - $\rm AC = $ <b>01011</b>, |
− | + AC = <b>010111</b>, | + | + $\rm AC = $ <b>010111</b>, |
− | - AC = <b>110111</b>. | + | - $\rm AC = $ <b>110111</b>. |
− | { | + | {How many bits are used to represent the source symbol sequence $\rm XXYXXXZ ⇒ N = 7$ ? |
|type="{}"} | |type="{}"} | ||
− | $ | + | $N_\text{bits} \ = \ $ { 11 } |
− | { | + | {Is <b>01011100001</b> a valid code for the source symbol sequence $\rm XXYXXXZ$? |
− | |type=" | + | |type="()"} |
− | - | + | - Yes. |
− | + | + | + No. |
− | { | + | {Which statements apply to arithmetic coding in general? |
|type="[]"} | |type="[]"} | ||
− | + | + | + It is a common coding of entire sequences. |
− | + | + | + A computer with 32-bit architecture limits the sequence length $N$. |
− | + | + | + This problem can be circumvented by integer realisation. |
− | + | + | + Integer realisation increases the coding speed. |
Line 64: | Line 72: | ||
</quiz> | </quiz> | ||
− | === | + | ===Solution=== |
{{ML-Kopf}} | {{ML-Kopf}} | ||
− | + | '''(1)''' The selected interval begins at $B_3 = 0.343 and ends at E_3 = 0.392$. | |
− | :$$N_{\rm | + | *The interval width is therefore ${\it \Delta}_3 = 0.049$ and thus the following applies with the logarithm to base 2: |
+ | :$$N_{\rm bits} = {\rm log_2} \hspace{0.15cm} \left\lceil \frac{1}{0.049}\right\rceil+1\hspace{0.15cm}\underline{= 6} | ||
\hspace{0.05cm}.$$ | \hspace{0.05cm}.$$ | ||
− | |||
− | + | '''(2)''' The selected interval results in $I = \big[0.343, \ 0.392\big)$. | |
+ | *The middle lies at M3=0.3675. | ||
+ | *To determine the arithmetic code, we try to reach the centre of the interval as well as possible by using a binary representation. | ||
+ | *Since we do not have a corresponding tool for solving this task at the moment, we assume the following secondary calculations: | ||
− | |||
− | |||
− | : | + | : $H_4 = 2^{-2} + 2^{-4} = 0.3125$ ⇒ does not belong to interval I. |
− | : | + | : $H_5 = H_4 +2^{-5} = 0.34375 \in I$ ⇒ binary representation: '''0.01011''' ⇒ Code: <b>01011</b>. |
− | + | : $H_6 = H_5 +2^{-6} = 0.359375 \in I$ ⇒ binary representation: '''0.010111''' ⇒ Code: <b>010111</b>. | |
− | <b> | + | : H7=H6+2−7=0.3671875∈I ⇒ binary representation: '''0.0101111''' ⇒ Code: <b>0101111</b>. |
− | :$$N_{\rm | + | |
− | \left\lceil {\rm | + | : H12=H7+2−12=0.3674316406∈I ⇒ binary representation: '''0.010111100001''' ⇒ Code: <b>010111100001</b>. |
+ | |||
+ | The corresponding 6–bit code is therefore $\rm AC =$ <b>010111</b> ⇒ Correct is <u>solution suggestion 2</u>. | ||
+ | |||
+ | |||
+ | |||
+ | '''(3)''' Here, with the beginning B7=0.3564456 and the end $E_7 = 0.359807$ the interval width is ${\it \Delta}_7 = 0.0033614$ and thus | ||
+ | :$$N_{\rm bits} = \left\lceil {\rm log_2} \hspace{0.15cm} \frac{1}{0.0033614} \right\rceil + 1\hspace{0.15cm} = | ||
+ | \left\lceil {\rm log_2} \hspace{0.15cm} 297.5 \right\rceil + 1\hspace{0.15cm} \underline{= 11} | ||
\hspace{0.05cm}.$$ | \hspace{0.05cm}.$$ | ||
− | + | ||
+ | |||
+ | '''(4)''' The binary representation of the code <b>01011100001</b> results in | ||
:$$2^{-2}+ 2^{-4}+ 2^{-5}+ 2^{-6}+ 2^{-11} = 0.3598632813 > E_7 | :$$2^{-2}+ 2^{-4}+ 2^{-5}+ 2^{-6}+ 2^{-11} = 0.3598632813 > E_7 | ||
\hspace{0.05cm}.$$ | \hspace{0.05cm}.$$ | ||
− | + | *The correct answer is therefore <u>NO</u>. The valid arithmetic code is AC= <b>01011011101</b>, because of | |
− | :$$2^{-2}+ 2^{-4}+ 2^{-5}+ 2^{-7}+ 2^{-8}+ 2^{-9}+ 2^{-11} =0.3579101563 | + | :$$2^{-2}+ 2^{-4}+ 2^{-5}+ 2^{-7}+ 2^{-8}+ 2^{-9}+ 2^{-11} =0.3579101563 \hspace{0.3cm} |
− | + | \Rightarrow \hspace{0.3cm} B_7 \le 0.3579101563 < E_7.$$ | |
− | + | ||
− | + | '''(5)''' <u>All statements</u> are correct. See also [https://en.wikipedia.org/wiki/Arithmetic_coding '''this Wikipedia article''']. | |
{{ML-Fuß}} | {{ML-Fuß}} | ||
− | [[Category: | + | [[Category:Information Theory: Exercises|^2.4 Further Source Coding Methods^]] |
Latest revision as of 17:26, 12 August 2021
Here we consider arithmetic coding (AC). All necessary information about this type of entropy coding can be found in Exercise 2.11.
The graph is also the result of Exercise 2.11. The numerical values for coding steps 3 and 7 that are important for the current task are highlighted in colour:
- The interval for N=3 (symbol sequence XXY) starts at B3=0.343 and goes up to E3=0.392.
- The interval limits for N=7 (symbol sequence XXYXXXZ) are B7=0.3564456 and E7=0.359807.
This task is only about assigning binary sequences to the selected intervals. Procedure:
- The interval I is determined by the beginning B, the end E, the interval width Δ=E−B as well as the interval centre M=(B+E)/2.
- The interval I is characterised by the binary representation (with limited resolution) of any real number value r∈I. For example, one chooses r≈M.
- The required number of bits results from the interval width according to the following equation (the open square brackets mean "round up"):
- Nbits=⌈log21/Δ⌉+1.
For example, for Nbits=5 the binary code 01001 stands for the following real-valued number r:
- r=0⋅2−1+1⋅2−2+0⋅2−3+0⋅2−4+1⋅2−5=0.28125.
Hints:
- The exercise belongs to the chapter Further source coding methods.
- In particular, reference is made to the page Arithmetic Coding.
- Further information on the topic can also be found in this Wikipedia article.
Questions
Solution
- The interval width is therefore Δ3=0.049 and thus the following applies with the logarithm to base 2:
- Nbits=log2⌈10.049⌉+1=6_.
(2) The selected interval results in I=[0.343, 0.392).
- The middle lies at M3=0.3675.
- To determine the arithmetic code, we try to reach the centre of the interval as well as possible by using a binary representation.
- Since we do not have a corresponding tool for solving this task at the moment, we assume the following secondary calculations:
- H4=2−2+2−4=0.3125 ⇒ does not belong to interval I.
- H5=H4+2−5=0.34375∈I ⇒ binary representation: 0.01011 ⇒ Code: 01011.
- H6=H5+2−6=0.359375∈I ⇒ binary representation: 0.010111 ⇒ Code: 010111.
- H7=H6+2−7=0.3671875∈I ⇒ binary representation: 0.0101111 ⇒ Code: 0101111.
- H12=H7+2−12=0.3674316406∈I ⇒ binary representation: 0.010111100001 ⇒ Code: 010111100001.
The corresponding 6–bit code is therefore AC= 010111 ⇒ Correct is solution suggestion 2.
(3) Here, with the beginning B7=0.3564456 and the end E7=0.359807 the interval width is Δ7=0.0033614 and thus
- Nbits=⌈log210.0033614⌉+1=⌈log2297.5⌉+1=11_.
(4) The binary representation of the code 01011100001 results in
- 2−2+2−4+2−5+2−6+2−11=0.3598632813>E7.
- The correct answer is therefore NO. The valid arithmetic code is AC= 01011011101, because of
- 2−2+2−4+2−5+2−7+2−8+2−9+2−11=0.3579101563⇒B7≤0.3579101563<E7.
(5) All statements are correct. See also this Wikipedia article.