Difference between revisions of "Exercise 2.12: Run–Length Coding and Run–Length Limited Coding"
m (Text replacement - "[File:" to "[File:") |
|||
(20 intermediate revisions by 4 users not shown) | |||
Line 1: | Line 1: | ||
− | {{quiz-Header|Buchseite= | + | {{quiz-Header|Buchseite=Information_Theory/Further_Source_Coding_Methods |
}} | }} | ||
− | [[File: | + | [[File:EN_Inf_A_2_13.png|right|frame|Table on Run-Length Coding]] |
− | + | We consider a binary source with the symbol set A and B, where B however, occurs only very rarely. | |
− | * | + | * Without source coding, exactly one bit would be needed per source symbol, and accordingly, for a source symbol sequence of length N , the encoded sequence would also have length $N_\text{bits} = N$. |
− | * | + | * Entropy coding makes little sense here without further measures (example: combining several symbols into a tuple) because of the unfavourable symbol probabilities. |
− | * | + | * The remedy is [[Information_Theory/Weitere_Quellencodierverfahren#Run.E2.80.93Length_coding|Run-Length Coding]] (RLC), which is described in the theory section under the link mentioned. For example, for the symbol sequence ABAABAAAABBAAB... the corresponding output of "Run–Length Coding": 2; 3; 5; 1; 3;... |
− | * | + | * Of course, the lengths L1=2, L2=3, ... of the individual substrings, each separated by B , must be represented in binary before transmission. If one uses $D = 3$ (bit) for all $L_i$ one obtains the RLC binary symbol sequence |
:010'011'101'001'011'... | :010'011'101'001'011'... | ||
− | + | The graph shows the RLC result to be analyzed. Columns 2 and 3 show the substring lengths Li in binary and decimal, respectively, and column 4 shows them in cumulative form (values from column 3 added up). | |
− | * | + | *One problem of "Run-Length Coding" (RLC) is the unlimited range of values of the quantities Li. With D=3, no value Li>7 can be represented and with D=2, the restriction is 1≤Li≤3. |
− | * | + | *The problem is circumvented with "Run–Length Limited Coding" (RLLC). If a value is Li≥2D, one replaces Li with a special character <b>S</b> and the difference Li−2D+1. With the RLLC decoder, this special character <b>S</b> is expanded again. |
Line 25: | Line 25: | ||
− | + | Hints: | |
− | * | + | *This exercise belongs to the chapter [[Information_Theory/Further_Source_Coding_Methods|Further source coding methods]]. |
− | * | + | *In particular, reference is made to the section [[Information_Theory/Further_Source_Coding_Methods#Run.E2.80.93Length_coding|Run-Length Coding]]. |
{{GraueBox| | {{GraueBox| | ||
− | TEXT=$\text{RLLC | + | TEXT=$\text{RLLC Example}$: We again assume the above sequence and the parameter D=2 : |
− | * | + | * Source symbol sequence: ABAABAAAABBAAB... |
− | * RLC& | + | * RLC decimal sequence: '''2; 3; 5; 1; 3;''' ... |
− | * RLLC& | + | * RLLC decimal sequence: '''2; 3; '''<font color="#cc0000"><span style="font-weight: bold;"> S;</span></font>''' 2; 1; 3;''' ... |
− | * RLLC | + | * RLLC binary sequence: '''10′11′''' <font color="#cc0000"><span style="font-weight: bold;">00′</span></font>'''10′01′11′'''... |
− | + | You can see: | |
− | * | + | *The special character <b>S</b> is binary-coded here as <b>00</b> . This is only an example – it does not have to be like this. |
− | * | + | *Since with D=2 for all real RLC values 1≤Li≤3 , the decoder recognizes the special character '''00'''. |
− | * | + | *It replaces this again with 2D−1 (three in the example) A–symbols.}} |
− | === | + | ===Questions=== |
<quiz display=simple> | <quiz display=simple> | ||
− | { | + | {How many bits would be needed <u>without source coding</u> , i.e. with the assignment A → <b>0</b> and B → <b>1</b>? |
|type="{}"} | |type="{}"} | ||
− | $N_\text{ | + | $N_\text{bits} \ = \ $ { 1250 } |
− | { | + | {What is the relative frequency of symbol B? |
|type="{}"} | |type="{}"} | ||
hB = { 2 3% } % | hB = { 2 3% } % | ||
− | { | + | {How many bits are needed for <u>Run–Length Coding</U> (RLC) according to the given table with eight bits per code word (D=8)? |
|type="{}"} | |type="{}"} | ||
− | $N_\text{ | + | $N_\text{bits} \ = \ $ { 200 } |
− | { | + | {Is <u>Run–Length Coding</u> with seven bit code words (D=7) possible here? |
|type="()"} | |type="()"} | ||
− | - | + | - Yes. |
− | + | + | + No. |
− | { | + | {How many bits are needed for <u>Run–Length Limited Coding</U> (RLLC) with seven bits per code word (D=7)? |
|type="{}"} | |type="{}"} | ||
− | $N_\text{ | + | $N_\text{bits} \ = \ $ { 182 } |
− | { | + | {How many bits are needed for <u>Run–Length Limited Coding</u> (RLLC) with six bits per code word (D=6)? |
|type="{}"} | |type="{}"} | ||
− | $N_\text{ | + | $N_\text{bits} \ = \ ${ 204 } |
Line 83: | Line 83: | ||
</quiz> | </quiz> | ||
− | === | + | ===Solution=== |
{{ML-Kopf}} | {{ML-Kopf}} | ||
− | '''(1)''' | + | '''(1)''' The binary sequence consists of N=1250 binary symbols (can be read from the last column in the table). |
− | :$$N_\text{ | + | *This means that the same number of bits is needed without coding: |
+ | :$$N_\text{bits}\hspace{0.15cm}\underline{= 1250}.$$ | ||
− | '''(2)''' | + | '''(2)''' The entire symbol sequence of length N=1250 contains NB=25 symbols B and thus NA=1225 symbols A. |
− | * | + | *The number NB of symbols B is equal to the number of rows in the table given at the front. |
− | * | + | *Thus the following applies to the relative frequency of symbol B: |
:hB=NBN=251250=0.02_=2%. | :hB=NBN=251250=0.02_=2%. | ||
− | '''(3)''' | + | '''(3)''' We now consider "Run–Length Coding" (RLC), where each distance between two B–symbols is represented by eight bits (D=8). |
− | * | + | *Thus, with NB=25, we get: |
− | :$$N_{\rm | + | :$$N_{\rm bits} = N_{\rm B} \cdot 8 \hspace{0.15cm}\underline{= 200} \hspace{0.05cm}.$$ |
− | '''(4)''' RLC | + | '''(4)''' RLC with D=7 only allows values between 1 und $2^7-1 =127for L_i$. |
− | * | + | *However, the entry "226" in line 19 is greater ⇒ <u>NO</u>. |
− | '''(5)''' | + | '''(5)''' Even with Run–Length Limited Coding (RLLC), only values up to 127 are permitted for the "real" distances $L_i$ with $D = 7$. |
− | * | + | *The entry "226" in line 19 is replaced by the following for RLLC: |
− | :* | + | :* Line 19a: <b>S</b> = <b>0000000</b> ⇒ special character, stands for "+127", |
− | :* | + | :* Line 19b: <b>1100011</b> ⇒ decimal 99. |
− | * | + | *This gives a total of 26 words of seven bits each: |
− | :$$N_{\rm | + | :$$N_{\rm bits} = 26 \cdot 7 \hspace{0.15cm}\underline{= 182} \hspace{0.05cm}.$$ |
− | '''(6)''' | + | '''(6)''' With D=6 the following changes have to be made in RLLC compared to RLC (see table): |
− | * | + | * Line 1: 122 = 1 · 63 + 59 (one word more), |
− | * | + | * Line 6: 70 = 1 · 63 + 7 (one word more), |
− | * | + | * Line 7: 80 = 1 · 63 + 17 (one word more), |
− | * | + | * Line 12: $79 = 1 · 63 + 16$ (one word more), |
− | * | + | * Line 13: 93 = 1 · 63 + 30 (one word more), |
− | * | + | * Line 19: 226 = 3 · 63 + 37 (one word more), |
− | * | + | * Line 25: 97 = 1 · 63 + 34 (one word more). |
− | + | This gives a total of $25+9=34$ words of six bits each: | |
− | :$$N_{\rm | + | :$$N_{\rm bits} = 34 \cdot 6 \hspace{0.15cm}\underline{= 204} \hspace{0.05cm},$$ |
− | + | i.e. a worse result than with seven bits according to subtask '''(5)'''. | |
{{ML-Fuß}} | {{ML-Fuß}} | ||
− | [[Category: | + | [[Category:Information Theory: Exercises|^2.4 Further Source Coding Methods^]] |
Latest revision as of 01:34, 13 November 2022
We consider a binary source with the symbol set A and B, where B however, occurs only very rarely.
- Without source coding, exactly one bit would be needed per source symbol, and accordingly, for a source symbol sequence of length N , the encoded sequence would also have length Nbits=N.
- Entropy coding makes little sense here without further measures (example: combining several symbols into a tuple) because of the unfavourable symbol probabilities.
- The remedy is Run-Length Coding (RLC), which is described in the theory section under the link mentioned. For example, for the symbol sequence ABAABAAAABBAAB... the corresponding output of "Run–Length Coding": 2; 3; 5; 1; 3;...
- Of course, the lengths L1=2, L2=3, ... of the individual substrings, each separated by B , must be represented in binary before transmission. If one uses D=3 (bit) for all Li one obtains the RLC binary symbol sequence
- 010'011'101'001'011'...
The graph shows the RLC result to be analyzed. Columns 2 and 3 show the substring lengths Li in binary and decimal, respectively, and column 4 shows them in cumulative form (values from column 3 added up).
- One problem of "Run-Length Coding" (RLC) is the unlimited range of values of the quantities Li. With D=3, no value Li>7 can be represented and with D=2, the restriction is 1≤Li≤3.
- The problem is circumvented with "Run–Length Limited Coding" (RLLC). If a value is Li≥2D, one replaces Li with a special character S and the difference Li−2D+1. With the RLLC decoder, this special character S is expanded again.
Hints:
- This exercise belongs to the chapter Further source coding methods.
- In particular, reference is made to the section Run-Length Coding.
RLLC Example: We again assume the above sequence and the parameter D=2 :
- Source symbol sequence: ABAABAAAABBAAB...
- RLC decimal sequence: 2; 3; 5; 1; 3; ...
- RLLC decimal sequence: 2; 3; S; 2; 1; 3; ...
- RLLC binary sequence: 10′11′ 00′10′01′11′...
You can see:
- The special character S is binary-coded here as 00 . This is only an example – it does not have to be like this.
- Since with D=2 for all real RLC values 1≤Li≤3 , the decoder recognizes the special character 00.
- It replaces this again with 2D−1 (three in the example) A–symbols.
Questions
Solution
- This means that the same number of bits is needed without coding:
- Nbits=1250_.
(2) The entire symbol sequence of length N=1250 contains NB=25 symbols B and thus NA=1225 symbols A.
- The number NB of symbols B is equal to the number of rows in the table given at the front.
- Thus the following applies to the relative frequency of symbol B:
- hB=NBN=251250=0.02_=2%.
(3) We now consider "Run–Length Coding" (RLC), where each distance between two B–symbols is represented by eight bits (D=8).
- Thus, with NB=25, we get:
- Nbits=NB⋅8=200_.
(4) RLC with D=7 only allows values between 1 und 27−1=127 for Li.
- However, the entry "226" in line 19 is greater ⇒ NO.
(5) Even with Run–Length Limited Coding (RLLC), only values up to 127 are permitted for the "real" distances Li with D=7.
- The entry "226" in line 19 is replaced by the following for RLLC:
- Line 19a: S = 0000000 ⇒ special character, stands for "+127",
- Line 19b: 1100011 ⇒ decimal 99.
- This gives a total of 26 words of seven bits each:
- Nbits=26⋅7=182_.
(6) With D=6 the following changes have to be made in RLLC compared to RLC (see table):
- Line 1: 122=1·63+59 (one word more),
- Line 6: 70=1·63+7 (one word more),
- Line 7: 80=1·63+17 (one word more),
- Line 12: 79=1·63+16 (one word more),
- Line 13: 93=1·63+30 (one word more),
- Line 19: 226=3·63+37 (one word more),
- Line 25: 97=1·63+34 (one word more).
This gives a total of 25+9=34 words of six bits each:
- Nbits=34⋅6=204_,
i.e. a worse result than with seven bits according to subtask (5).