Exercise 1.3: Entropy Approximations

{{quiz-Header|Buchseite=Information_Theory/Discrete_Sources_with_Memory
}}
[[File:EN_Inf_A_1_3_v2.png|right|frame|Different binary sequences]]
The graphic on the right shows four symbol sequences&nbsp; $\langle q_\nu \rangle $,&nbsp; each with length&nbsp; $N = 60$.&nbsp; The source symbols are&nbsp; $\rm A$&nbsp; and&nbsp; $\rm B$.  
*It follows directly that&nbsp; $H_0 = 1 \; \rm bit/symbol$&nbsp; applies to the decision content of all sources considered.
*However, the symbols&nbsp; $\rm A$&nbsp; and&nbsp; $\rm B$&nbsp; do not occur with equal probability, but with the probabilities&nbsp; $p_{\rm A}$&nbsp; and&nbsp; $p_{\rm B}$.

In addition to&nbsp; $H_0$&nbsp;, the table below shows the entropy approximations
* $H_1$,&nbsp; based on&nbsp; $p_{\rm A}$&nbsp; and&nbsp; $p_{\rm B}$&nbsp; (column 2),
* $H_2$,&nbsp; based on two-tuples (column 3),
* $H_3$,&nbsp; based on three-tuples (column 4),
* $H_4$,&nbsp; based on four-tuples (column 5),
* the actual entropy&nbsp; $H$,&nbsp; which is obtained from&nbsp; $H_k$&nbsp; by taking the limit&nbsp; $k \to \infty$&nbsp; (last column).

The following relations hold between these entropies: &nbsp; $H \le$ ... $\le H_3 \le H_2 \le H_1 \le H_0 \hspace{0.05cm}.$

*What is not known is the assignment between the sources&nbsp; $\rm Q1$,&nbsp; $\rm Q2$,&nbsp; $\rm Q3$,&nbsp; $\rm Q4$&nbsp; and the symbol sequences shown in the graphic&nbsp; <br>(black, blue, red, green).
*It is only known that source&nbsp; $\rm Q4$&nbsp; contains a repetition code.&nbsp; Since every second symbol of the corresponding symbol sequence provides no information,&nbsp; the final entropy value is&nbsp; $H = 0.5 \; \rm bit/symbol$.
*In addition, the entropy approximations&nbsp; $H_1 = 1 \; \rm bit/symbol$&nbsp; and&nbsp; $H_4 \approx 0.789 \; \rm bit/symbol$&nbsp; are given for this source.

Finally, the entropy approximations&nbsp; $H_2$&nbsp; and&nbsp; $H_3$&nbsp; are to be determined for the source&nbsp; $\rm Q4$.
[[File:EN_Inf_A_1_3b_v2.png|left|frame|Source entropy and approximations in "bit/symbol"]]
<br clear=all>
''Hints:''
*This task belongs to the chapter&nbsp; [[Information_Theory/Discrete_Sources_with_Memory|Discrete Sources with Memory]].
*For the&nbsp; $k$&ndash;th entropy approximation, the following holds for binary sources&nbsp; $(M = 2)$&nbsp; with the joint probability&nbsp; $ p_i^{(k)}$&nbsp; of a&nbsp; $k$&ndash;tuple:
:$$H_k = \frac{1}{k} \cdot \sum_{i=1}^{2^k} p_i^{(k)} \cdot {\rm log}_2\hspace{0.1cm}\frac {1}{p_i^{(k)}} \hspace{0.5cm}({\rm unit\hspace{-0.1cm}: \hspace{0.1cm}bit/symbol})
 \hspace{0.05cm}.$$
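
The following Python sketch&nbsp; (not part of the original exercise)&nbsp; shows how such an entropy approximation can be estimated from the relative frequencies of the overlapping&nbsp; $k$&ndash;tuples of an observed symbol sequence.&nbsp; The sequence&nbsp; <code>seq</code>&nbsp; is only an assumed example, not one of the four sequences in the graphic;&nbsp; for a finite length&nbsp; $N$&nbsp; the result is only an estimate of the probabilities&nbsp; $p_i^{(k)}$,&nbsp; and thus of&nbsp; $H_k$.
<pre>
# Estimate H_k (in bit/symbol) from the relative k-tuple frequencies of a symbol string.
from collections import Counter
from math import log2

def entropy_approximation(sequence: str, k: int) -> float:
    """k-th entropy approximation H_k, estimated from overlapping k-tuples."""
    tuples = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(tuples)
    total = sum(counts.values())
    return sum(n / total * log2(total / n) for n in counts.values()) / k

seq = "AABBAABBBBAA" * 5      # assumed example sequence
print([round(entropy_approximation(seq, k), 3) for k in (1, 2, 3, 4)])
</pre>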
  
  
===Questions===
  
 
<quiz display=simple>
{What is the source of the <u>black symbol sequence</u>?
|type="[]"}
+
|type="()"}
+ Es muss über 3<sup>4</sup> = 81 Summanden gemittelt werden.
+
- $\rm Q1$,
+ Es gilt 1 bit/Symbol < <i>H</i><sub>4</sub> < <i>H</i><sub>3</sub>.
+
- $\rm Q2$,
- Nach langer Rechnung erhält man <i>H</i><sub>4</sub>&nbsp;=&nbsp;1.333&nbsp;bit/Symbol.
+
+ $\rm Q3$,
 
+
- $\rm Q4$.
 
 
{What is the source of the <u>blue symbol sequence</u>?
|type="[]"}
+
|type="()"}
+ Q1,
+
+ $\rm Q1$,
- Q2,
+
- $\rm Q2$,
- Q3,
+
- $\rm Q3$,
- Q4.
+
- $\rm Q4$.
  
  
{What is the source of the <u>red symbol sequence</u>?
|type="[]"}
+
|type="()"}
- Q1,
+
- $\rm Q1$,
+ Q2,
+
+ $\rm Q2$,
- Q3,
+
- $\rm Q3$,
- Q4.
+
- $\rm Q4$.
  
  
{Calculate the entropy approximation&nbsp; $H_2$&nbsp; of the repetition code&nbsp; $\rm Q4$.
|type="{}"}
$H_2 \ = \ $ { 0.906 3% } $\ \rm bit/symbol$
  
  
{Calculate the entropy approximation&nbsp; $H_3$&nbsp; of the repetition code&nbsp; $\rm Q4$.
|type="{}"}
$H_3 \ =  \ $ { 0.833 3% } $\ \rm bit/symbol$
  
  
</quiz>
  
===Solution===
 
{{ML-Kopf}}
'''(1)'''&nbsp; The black binary sequence comes from the source&nbsp; $\underline{\rm Q3}$,  
*since the symbols are equally probable &nbsp;&nbsp;&#8658;&nbsp;&nbsp; $H_1 = H_0$,&nbsp; and
*there are no statistical dependencies between the symbols &nbsp;&nbsp;&#8658;&nbsp;&nbsp; $H=$ ... $= H_2 = H_1$.

'''(2)'''&nbsp; It can be seen in the blue binary sequence that&nbsp; $\rm A$&nbsp; occurs much more frequently than&nbsp; $\rm B$, so that&nbsp; $H_1 < H_0$&nbsp; must hold.
*According to the table, only source&nbsp; $\underline{\rm Q1}$&nbsp; fulfils this condition.
*From&nbsp; $H_1 = 0.5 \; \rm bit/symbol$&nbsp; one can determine the symbol probabilities&nbsp; $p_{\rm A} \approx 0.89$&nbsp; and&nbsp; $p_{\rm B} \approx 0.11$&nbsp; (a numerical check is sketched below).
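
A minimal numerical check&nbsp; (not part of the original solution):&nbsp; the binary entropy function is inverted by bisection, assuming&nbsp; $p_{\rm A} \ge 0.5$.
<pre>
# Invert H_bin(p) = 0.5 bit/symbol by bisection on p in [0.5, 1],
# where H_bin falls monotonically from 1 to 0.
from math import log2

def h_bin(p: float) -> float:
    """Binary entropy function in bit/symbol."""
    return 0.0 if p in (0.0, 1.0) else p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))

lo, hi = 0.5, 1.0
for _ in range(60):                      # bisect until machine precision
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if h_bin(mid) > 0.5 else (lo, mid)

p_A = (lo + hi) / 2
print(round(p_A, 3), round(1 - p_A, 3))  # approximately 0.89 and 0.11
</pre>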
 
  
'''(3)'''&nbsp; By process of elimination one arrives at the result&nbsp; $\underline{\rm Q2}$&nbsp; for the red binary sequence:
*The source&nbsp; $\rm Q1$&nbsp; belongs to the blue sequence,&nbsp; $\rm Q3$&nbsp; to the black one, and&nbsp; $\rm Q4$&nbsp; to the repetition code and thus obviously to the green symbol sequence.
*The red symbol sequence has the following properties:
:* Because of&nbsp; $H_1 = H_0$,&nbsp; the symbols are equally probable: &nbsp; $p_{\rm A} = p_{\rm B} = 0.5$.
:* Because of&nbsp; $H < H_1$,&nbsp; there are statistical dependencies within the sequence.&nbsp; These can be recognised by the fact that there are more transitions between&nbsp; $\rm A$&nbsp; and&nbsp; $\rm B$&nbsp; than under statistical independence.
  
'''(4)'''&nbsp; In the green symbol sequence&nbsp; $($source&nbsp; $\rm Q4)$&nbsp;, the symbols&nbsp; $\rm A$&nbsp; and&nbsp; $\rm B$&nbsp; are equally likely:
[[File:P_ID2247__Inf_A_1_3d.png|right|frame|Symbol sequences of a binary repetition code]]
:$$p_{\rm A} = p_{\rm B} = 0.5 \hspace{0.3cm}\Rightarrow\hspace{0.3cm}H_1 = 1\,{\rm bit/symbol}
 \hspace{0.05cm}.$$
To determine&nbsp; $H_2$,&nbsp; one considers two-tuples.&nbsp; From these, the joint probabilities&nbsp; $p_{\rm AA}$,&nbsp; $p_{\rm AB}$,&nbsp; $p_{\rm BA}$&nbsp; and&nbsp; $p_{\rm BB}$&nbsp; can be calculated.&nbsp; One can see from the sketch:
* The combinations&nbsp; $\rm AB$&nbsp; and&nbsp; $\rm BA$&nbsp; are only possible if a tuple starts at an even index&nbsp; $\nu$.&nbsp; For the joint probabilities&nbsp; $p_{\rm AB}$&nbsp; and&nbsp; $p_{\rm BA}$,&nbsp; the following then holds:
:$$p_{\rm AB} \hspace{0.1cm} =  \hspace{0.1cm} {\rm Pr}(\nu {\rm \hspace{0.15cm}is\hspace{0.15cm}even})  \cdot {\rm Pr}( q_{\nu} = \mathbf{A}) \cdot {\rm Pr}(q_{\nu+1} = \mathbf{B}\hspace{0.05cm} | q_{\nu} = \mathbf{A}) = {1}/{2} \cdot {1}/{2} \cdot {1}/{2} = {1}/{8} = p_{\rm BA}
 \hspace{0.05cm}.$$
*In contrast, for the other two combinations&nbsp; $\rm AA$&nbsp; and&nbsp; $\rm BB$:
:$$p_{\rm AA} ={\rm Pr}(\nu = 1) \cdot {\rm Pr}( q_1 = \mathbf{A}) \cdot {\rm Pr}(q_{2} = \mathbf{A}\hspace{0.05cm} | q_{1} = \mathbf{A}) +  {\rm Pr}(\nu=2) \cdot {\rm Pr}( q_{2} = \mathbf{A}) \cdot {\rm Pr}(q_{3} = \mathbf{A}\hspace{0.05cm} | q_{2} = \mathbf{A})
 \hspace{0.05cm}.$$
:$$\Rightarrow \hspace{0.3cm}p_{\rm AA}  = \frac{1}{2} \cdot \frac{1}{2} \cdot 1+ \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{3}{8} = p_{\rm BB}
 \hspace{0.05cm}.$$
:Here&nbsp; $\nu = 1$&nbsp; stands for all odd indices and&nbsp; $\nu = 2$&nbsp; for all even indices.
  
*This gives for the entropy approximation:
:$$H_2 = \frac{1}{2} \cdot \left [ 2 \cdot \frac{3}{8} \cdot {\rm log}_2\hspace{0.1cm}\frac {8}{3} +
  2 \cdot \frac{1}{8} \cdot {\rm log}_2\hspace{0.1cm}(8)\right ] =
  \frac{3}{8} \cdot {\rm log}_2\hspace{0.1cm}(8) - \frac{3}{8} \cdot {\rm log}_2\hspace{0.1cm}(3) +  \frac{1}{8} \cdot {\rm log}_2\hspace{0.1cm}(8)  \hspace{0.15cm} \underline {= 0.906 \,{\rm bit/symbol}}
 \hspace{0.05cm}.$$
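
These joint probabilities,&nbsp; as well as the approximations&nbsp; $H_3$&nbsp; and&nbsp; $H_4$&nbsp; derived in part&nbsp; '''(5)''',&nbsp; can be verified by a small brute-force sketch&nbsp; (not part of the original solution).&nbsp; It assumes that the repetition code transmits each equally probable source symbol twice, and that a&nbsp; $k$&ndash;tuple starts at an even or odd position with probability&nbsp; $1/2$&nbsp; each.
<pre>
# Exact k-tuple probabilities and entropy approximations H_k for the repetition code.
from itertools import product
from math import log2
from collections import defaultdict

def tuple_probabilities(k: int) -> dict:
    """Probabilities of all k-tuples emitted by the binary repetition code."""
    n_info = k // 2 + 2                    # enough information symbols to cover k outputs
    probs = defaultdict(float)
    for phase in (0, 1):                   # tuple starts on 1st or 2nd symbol of a pair
        for bits in product("AB", repeat=n_info):
            seq = "".join(b * 2 for b in bits)        # every information symbol sent twice
            probs[seq[phase:phase + k]] += 0.5 * 0.5 ** n_info
    return dict(probs)

def h_k(k: int) -> float:
    """Entropy approximation H_k in bit/symbol."""
    return sum(p * log2(1 / p) for p in tuple_probabilities(k).values()) / k

print(tuple_probabilities(2))                    # p_AA = p_BB = 3/8,  p_AB = p_BA = 1/8
print([round(h_k(k), 3) for k in (1, 2, 3, 4)])  # [1.0, 0.906, 0.833, 0.789]
</pre>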
  
'''(5)'''&nbsp; Following a similar procedure, we arrive at the joint probabilities for three-tuples:
:$$p_{\rm AAA} \hspace{0.1cm} =  \hspace{0.1cm} p_{\rm BBB} = 1/4 \hspace{0.05cm},\hspace{0.2cm} p_{\rm ABA} = p_{\rm BAB} = 0 \hspace{0.05cm},\hspace{0.2cm} p_{\rm AAB} \hspace{0.1cm} =  \hspace{0.1cm} p_{\rm ABB} = p_{\rm BBA} = p_{\rm BAA} = 1/8$$
and from this to the entropy approximation
:$$H_3 = \frac{1}{3} \cdot \left [ 2 \cdot \frac{1}{4} \cdot {\rm log}_2\hspace{0.1cm}(4) +
  4 \cdot \frac{1}{8} \cdot {\rm log}_2\hspace{0.1cm}(8)\right ] = \frac{2.5}{3} \hspace{0.15cm} \underline {= 0.833 \,{\rm bit/symbol}}
 \hspace{0.05cm}.$$
To calculate&nbsp; $H_4$,&nbsp; the&nbsp; $16$&nbsp; joint probabilities are as follows:
:$$p_{\rm AAAA} \hspace{0.1cm} =  \hspace{0.1cm} p_{\rm BBBB} = 3/16 \hspace{0.05cm},\hspace{0.2cm} p_{\rm AABB} = p_{\rm BBAA} = 2/16 \hspace{0.05cm},$$
:$$ p_{\rm AAAB} \hspace{0.1cm} =  \hspace{0.1cm} p_{\rm ABBA} = p_{\rm ABBB} = p_{\rm BBBA} = p_{\rm BAAB} = p_{\rm BAAA}= 1/16
 \hspace{0.05cm},$$
:$$ p_{\rm AABA} \hspace{0.1cm} =  \hspace{0.1cm} p_{\rm ABAA} = p_{\rm ABAB} = p_{\rm BBAB} = p_{\rm BABB} = p_{\rm BABA}= 0\hspace{0.05cm}.$$
It follows that:
:$$H_4= \frac{1}{4} \hspace{-0.05cm}\cdot \hspace{-0.05cm}\left [ 2 \hspace{-0.05cm}\cdot \hspace{-0.05cm} \frac{3}{16} \hspace{-0.05cm}\cdot \hspace{-0.05cm} {\rm log}_2\hspace{0.1cm}\frac{16}{3} +
  2 \hspace{-0.05cm}\cdot \hspace{-0.05cm} \frac{1}{8} \hspace{-0.05cm}\cdot \hspace{-0.05cm}{\rm log}_2\hspace{0.1cm}(8) +
  6 \hspace{-0.05cm}\cdot \hspace{-0.05cm} \frac{1}{16} \hspace{-0.05cm}\cdot \hspace{-0.05cm} {\rm log}_2\hspace{0.1cm}(16)\right ] =\frac{\left [ 6 \hspace{-0.05cm}\cdot \hspace{-0.05cm}
{\rm log}_2\hspace{0.01cm}(16) - 6 \hspace{-0.05cm}\cdot \hspace{-0.05cm} {\rm log}_2\hspace{0.01cm}(3) + 4 \hspace{-0.05cm}\cdot \hspace{-0.05cm}
{\rm log}_2\hspace{0.01cm}(8) + 6\hspace{-0.05cm}\cdot \hspace{-0.05cm} {\rm log}_2\hspace{0.01cm}(16)\right ]}{64} \approx 0.789 \,{\rm bit/symbol} \hspace{0.05cm}.$$
One can see:
*Even the approximation&nbsp; $H_4 \approx 0.789\,{\rm bit/symbol}$&nbsp; still deviates significantly from the final entropy value&nbsp; $H = 0.5\,{\rm bit/symbol}$.
*The repetition code obviously cannot be modelled by a Markov source.&nbsp; If&nbsp; $\rm Q4$&nbsp; were a Markov source, then the following would have to apply:
 
 
:$$H = 2 \cdot H_2 - H_1
\hspace{0.3cm}\Rightarrow\hspace{0.3cm}H_2 = 1/2 \cdot (H+H_1) = 1/2 \cdot (0.5+1) = 0.75 \,{\rm bit/symbol}\hspace{0.05cm}.$$
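
A short numerical check of this contradiction&nbsp; (not part of the original solution):
<pre>
# If Q4 were a (first-order) Markov source, H = 2*H_2 - H_1 would have to hold.
H, H1 = 0.5, 1.0
print((H + H1) / 2)   # 0.75 bit/symbol, contradicting the computed H_2 = 0.906 bit/symbol
</pre>
Since the actual value is&nbsp; $H_2 = 0.906 \; \rm bit/symbol$,&nbsp; this relation is violated, in agreement with the statement above.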
  
  
[[Category:Information Theory: Exercises|^1.2 Sources with Memory^]]
