Exercise 1.3: Entropy Approximations
{{quiz-Header|Buchseite=Information_Theory/Discrete_Sources_with_Memory
}}
[[File:EN_Inf_A_1_3_v2.png|right|frame|Different binary sequences]]
The graphic on the right shows four symbol sequences $\langle q_\nu \rangle$, each of length $N = 60$. The source symbols are $\rm A$ and $\rm B$.
*It follows directly that the decision content $H_0 = 1 \; \rm bit/symbol$ applies to all sources considered.
*However, the symbols $\rm A$ and $\rm B$ do not necessarily occur with equal probability; rather, they occur with the probabilities $p_{\rm A}$ and $p_{\rm B}$.
In addition to $H_0$, the table below shows the entropy approximations
* $H_1$, based on $p_{\rm A}$ and $p_{\rm B}$ (column 2),
* $H_2$, based on two-tuples (column 3),
* $H_3$, based on three-tuples (column 4),
* $H_4$, based on four-tuples (column 5),
* the actual entropy $H$, which is obtained from $H_k$ in the limit $k \to \infty$ (last column).

The following ordering holds between these entropies: $H \le$ ... $\le H_3 \le H_2 \le H_1 \le H_0 \hspace{0.05cm}.$
*It is not known which of the sources $\rm Q1$, $\rm Q2$, $\rm Q3$, $\rm Q4$ belongs to which of the symbol sequences shown in the graph <br>(black, blue, red, green).
*It is only known that source $\rm Q4$ uses a repetition code. Since every second symbol of the corresponding symbol sequence carries no information, the final entropy value is $H = 0.5 \; \rm bit/symbol$.
*In addition, the entropy approximations $H_1 = 1 \; \rm bit/symbol$ and $H_4 \approx 0.789 \; \rm bit/symbol$ are given.

Finally, the entropy approximations $H_2$ and $H_3$ are to be determined for the source $\rm Q4$.
[[File:EN_Inf_A_1_3b_v2.png|left|frame|Source entropy and approximations in "bit/symbol"]]
<br clear=all>
''Hints:''
*This exercise belongs to the chapter [[Information_Theory/Discrete_Sources_with_Memory|Discrete Sources with Memory]].
*For the $k$–th entropy approximation of a binary source $(M = 2)$, the following holds with the composite probabilities $p_i^{(k)}$ of the $k$–tuples (a short numerical sketch is given below the formula):
:$$H_k = \frac{1}{k} \cdot \sum_{i=1}^{2^k} p_i^{(k)} \cdot {\rm log}_2\hspace{0.1cm}\frac {1}{p_i^{(k)}} \hspace{0.5cm}({\rm unit\hspace{-0.1cm}: \hspace{0.1cm}bit/symbol})
\hspace{0.05cm}.$$
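This formula can be evaluated directly; a minimal Python sketch, assuming the $2^k$ composite probabilities are already available as a list:
<syntaxhighlight lang="python">
from math import log2

def entropy_approximation(tuple_probs, k):
    """k-th entropy approximation H_k in bit/symbol from the 2^k tuple probabilities."""
    return sum(p * log2(1.0 / p) for p in tuple_probs if p > 0) / k

# Example: equally probable, statistically independent binary symbols (k = 2)
print(entropy_approximation([0.25, 0.25, 0.25, 0.25], k=2))   # 1.0 bit/symbol
</syntaxhighlight>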
===Questions===
<quiz display=simple>
{What is the source of the <u>black symbol sequence</u>?
|type="()"}
- $\rm Q1$,
- $\rm Q2$,
+ $\rm Q3$,
- $\rm Q4$.

{What is the source of the <u>blue symbol sequence</u>?
|type="()"}
+ $\rm Q1$,
- $\rm Q2$,
- $\rm Q3$,
- $\rm Q4$.

{What is the source of the <u>red symbol sequence</u>?
|type="()"}
- $\rm Q1$,
+ $\rm Q2$,
- $\rm Q3$,
- $\rm Q4$.
{Calculate the entropy approximation $H_2$ of the repetition code $\rm Q4$.
|type="{}"}
$H_2 \ = \ $ { 0.906 3% } $\ \rm bit/symbol$

{Calculate the entropy approximation $H_3$ of the repetition code $\rm Q4$.
|type="{}"}
$H_3 \ = \ $ { 0.833 3% } $\ \rm bit/symbol$
</quiz>

===Solution===
{{ML-Kopf}}
'''(1)''' The black binary sequence comes from source $\underline{\rm Q3}$,
*since the symbols are equally probable ⇒ $H_1 = H_0$, and
*there are no statistical dependencies between the symbols ⇒ $H=$ ... $= H_2 = H_1$.


'''(2)''' In the blue binary sequence it can be seen that $\rm A$ occurs much more frequently than $\rm B$, so $H_1 < H_0$ must hold.
*According to the table, only source $\underline{\rm Q1}$ fulfils this condition.
*From $H_1 = 0.5 \; \rm bit/symbol$ one can determine the symbol probabilities $p_{\rm A} = 0.89$ and $p_{\rm B} = 0.11$ (checked numerically in the sketch below).
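These probabilities can be checked numerically; a minimal Python sketch, assuming the binary entropy function is inverted by bisection on $(0, 0.5]$ (the number of iterations is an arbitrary choice):
<syntaxhighlight lang="python">
from math import log2

def h_bin(p):
    """Binary entropy function in bit/symbol."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

# Find p_B in (0, 0.5] with h_bin(p_B) = 0.5 bit/symbol by bisection
lo, hi = 0.0, 0.5
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if h_bin(mid) < 0.5 else (lo, mid)
print(round(lo, 2), round(1 - lo, 2))   # 0.11 and 0.89
</syntaxhighlight>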
'''(3)''' By process of elimination one arrives at the result $\underline{\rm Q2}$ for the red binary sequence:
*Source $\rm Q1$ belongs to the blue sequence, $\rm Q3$ to the black one, and $\rm Q4$ to the repetition code and thus obviously to the green symbol sequence.
*The red symbol sequence has the following properties:
:* Because of $H_1 = H_0$, the symbols are equally probable: $p_{\rm A} = p_{\rm B} = 0.5$.
:* Because of $H < H_1$, there are statistical dependencies within the sequence.
*These can be recognised by the fact that there are more transitions between $\rm A$ and $\rm B$ than would occur with statistical independence.


'''(4)''' In the green symbol sequence $($source $\rm Q4)$, the symbols $\rm A$ and $\rm B$ are equally likely:
[[File:P_ID2247__Inf_A_1_3d.png|right|frame|Symbol sequences of a binary repetition code]]
:$$p_{\rm A} = p_{\rm B} = 0.5 \hspace{0.3cm}\Rightarrow\hspace{0.3cm}H_1 = 1\,{\rm bit/symbol}
\hspace{0.05cm}.$$
To determine $H_2$, one considers two-tuples, for which the composite probabilities $p_{\rm AA}$, $p_{\rm AB}$, $p_{\rm BA}$ and $p_{\rm BB}$ have to be calculated. From the sketch one can see:
* The combinations $\rm AB$ and $\rm BA$ are only possible if a two-tuple starts at an even position $\nu$. For the composite probabilities $p_{\rm AB}$ and $p_{\rm BA}$ it then holds:
:$$p_{\rm AB} \hspace{0.1cm} = \hspace{0.1cm} {\rm Pr}(\nu {\rm \hspace{0.15cm}is\hspace{0.15cm}even}) \cdot {\rm Pr}( q_{\nu} = \mathbf{A}) \cdot {\rm Pr}(q_{\nu+1} = \mathbf{B}\hspace{0.05cm} | q_{\nu} = \mathbf{A}) = {1}/{2} \cdot {1}/{2} \cdot {1}/{2} = {1}/{8} = p_{\rm BA}
\hspace{0.05cm}.$$
*In contrast, for the other two combinations $\rm AA$ and $\rm BB$:
:$$p_{\rm AA} ={\rm Pr}(\nu = 1) \cdot {\rm Pr}( q_1 = \mathbf{A}) \cdot {\rm Pr}(q_{2} = \mathbf{A}\hspace{0.05cm} | q_{1} = \mathbf{A}) + {\rm Pr}(\nu=2) \cdot {\rm Pr}( q_{2} = \mathbf{A}) \cdot {\rm Pr}(q_{3} = \mathbf{A}\hspace{0.05cm} | q_{2} = \mathbf{A})
\hspace{0.05cm}.$$
:$$\Rightarrow \hspace{0.3cm}p_{\rm AA} = \frac{1}{2} \cdot \frac{1}{2} \cdot 1+ \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{3}{8} = p_{\rm BB}
\hspace{0.05cm}.$$
:Here $\nu = 1$ stands for all odd indices and $\nu = 2$ for all even indices.
*This gives the entropy approximation (also checked by the simulation sketch below):
:$$H_2 = \frac{1}{2} \cdot \left [ 2 \cdot \frac{3}{8} \cdot {\rm log}_2\hspace{0.1cm}\frac {8}{3} +
2 \cdot \frac{1}{8} \cdot {\rm log}_2\hspace{0.1cm}(8)\right ] =
\frac{3}{8} \cdot
{\rm log}_2\hspace{0.1cm}(8) - \frac{3}{8} \cdot {\rm log}_2\hspace{0.1cm}(3) + \frac{1}{8} \cdot {\rm log}_2\hspace{0.1cm}(8) \hspace{0.15cm} \underline {= 0.906 \,{\rm bit/symbol}}
\hspace{0.05cm}.$$
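This result can also be checked by simulation; a minimal Python sketch, assuming an arbitrarily chosen source length and random seed, repetition-codes a random binary sequence, estimates the two-tuple probabilities from overlapping windows and evaluates $H_2$:
<syntaxhighlight lang="python">
from math import log2
from itertools import product
import random

# Repetition code: each equally probable source symbol is transmitted twice
random.seed(1)                      # arbitrary seed, for reproducibility only
source = [random.choice("AB") for _ in range(200000)]
seq = "".join(s * 2 for s in source)

# Relative frequencies of all overlapping two-tuples
counts = {"".join(t): 0 for t in product("AB", repeat=2)}
for i in range(len(seq) - 1):
    counts[seq[i:i + 2]] += 1
total = sum(counts.values())
probs = {t: c / total for t, c in counts.items()}
print(probs)                        # approx. {'AA': 3/8, 'AB': 1/8, 'BA': 1/8, 'BB': 3/8}

H2 = sum(p * log2(1 / p) for p in probs.values() if p > 0) / 2
print(round(H2, 3))                 # approx. 0.906 bit/symbol
</syntaxhighlight>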
'''(5)''' Following a similar procedure, we arrive at the composite probabilities of the three-tuples
:$$p_{\rm AAA} \hspace{0.1cm} = \hspace{0.1cm} p_{\rm BBB} = 1/4 \hspace{0.05cm},\hspace{0.2cm} p_{\rm ABA} = p_{\rm BAB} = 0 \hspace{0.05cm},\hspace{0.2cm} p_{\rm AAB} \hspace{0.1cm} = \hspace{0.1cm} p_{\rm ABB} = p_{\rm BBA} = p_{\rm BAA} = 1/8$$
and from these at the entropy approximation
:$$H_3 = \frac{1}{3} \cdot \left [ 2 \cdot \frac{1}{4} \cdot {\rm log}_2\hspace{0.1cm}(4) +
4 \cdot \frac{1}{8} \cdot {\rm log}_2\hspace{0.1cm}(8)\right ] = \frac{2.5}{3} \hspace{0.15cm} \underline {= 0.833 \,{\rm bit/symbol}}
\hspace{0.05cm}.$$
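The value $H_3$ can be verified directly from the three-tuple probabilities listed above, for instance with this short sketch:
<syntaxhighlight lang="python">
from math import log2

# Three-tuple probabilities of the repetition code as derived above
p3 = 2 * [1/4] + 2 * [0] + 4 * [1/8]
H3 = sum(p * log2(1 / p) for p in p3 if p > 0) / 3
print(round(H3, 3))   # 0.833 bit/symbol
</syntaxhighlight>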
To calculate $H_4$, the $16$ composite probabilities of the four-tuples are as follows:
:$$p_{\rm AAAA} \hspace{0.1cm} = \hspace{0.1cm} p_{\rm BBBB} = 3/16 \hspace{0.05cm},\hspace{0.2cm} p_{\rm AABB} = p_{\rm BBAA} = 2/16 \hspace{0.05cm},$$
:$$ p_{\rm AAAB} \hspace{0.1cm} = \hspace{0.1cm} p_{\rm ABBA} = p_{\rm ABBB} = p_{\rm BBBA} = p_{\rm BAAB} = p_{\rm BAAA}= 1/16
\hspace{0.05cm},$$
:$$ p_{\rm AABA} \hspace{0.1cm} = \hspace{0.1cm} p_{\rm ABAA} = p_{\rm ABAB} = p_{\rm BBAB} = p_{\rm BABB} = p_{\rm BABA}= 0\hspace{0.05cm}.$$
It follows that:
:$$H_4= \frac{1}{4} \hspace{-0.05cm}\cdot \hspace{-0.05cm}\left [ 2 \hspace{-0.05cm}\cdot \hspace{-0.05cm} \frac{3}{16} \hspace{-0.05cm}\cdot \hspace{-0.05cm} {\rm log}_2\hspace{0.1cm}\frac{16}{3} +
2 \hspace{-0.05cm}\cdot \hspace{-0.05cm} \frac{1}{8} \hspace{-0.05cm}\cdot \hspace{-0.05cm}{\rm log}_2\hspace{0.1cm}(8) +
6 \hspace{-0.05cm}\cdot \hspace{-0.05cm} \frac{1}{16} \hspace{-0.05cm}\cdot \hspace{-0.05cm} {\rm log}_2\hspace{0.1cm}(16)\right ] =\frac{\left [ 6 \hspace{-0.05cm}\cdot \hspace{-0.05cm}
{\rm log}_2\hspace{0.01cm}(16) - 6 \hspace{-0.05cm}\cdot \hspace{-0.05cm} {\rm log}_2\hspace{0.01cm}(3) + 4 \hspace{-0.05cm}\cdot \hspace{-0.05cm}
{\rm log}_2\hspace{0.01cm}(8) + 6\hspace{-0.05cm}\cdot \hspace{-0.05cm} {\rm log}_2\hspace{0.01cm}(16)\right ]}{64} \hspace{0.15cm} \underline {\approx 0.789 \,{\rm bit/symbol}}\hspace{0.05cm}.$$
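The value $H_4$ can likewise be verified numerically from the four-tuple probabilities listed above; the second line evaluates the expanded form of the equation:
<syntaxhighlight lang="python">
from math import log2

# Four-tuple probabilities of the repetition code as listed above
p4 = 2 * [3/16] + 2 * [2/16] + 6 * [1/16] + 6 * [0]
print(round(sum(p * log2(1 / p) for p in p4 if p > 0) / 4, 3))            # 0.789
# Same value via the expanded form with the common denominator 64
print(round((6*log2(16) - 6*log2(3) + 4*log2(8) + 6*log2(16)) / 64, 3))   # 0.789
</syntaxhighlight>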
One can see:
*Even the approximation $H_4 = 0.789\,{\rm bit/symbol}$ still deviates significantly from the final entropy value $H = 0.5\,{\rm bit/symbol}$.
*The repetition code obviously cannot be modelled by a Markov source. If $\rm Q4$ were a Markov source, the following would have to hold (see the numerical comparison below):
:$$H = 2 \cdot H_2 - H_1
\hspace{0.3cm}\Rightarrow\hspace{0.3cm}H_2 = 1/2 \cdot (H+H_1) =
1/2 \cdot (0.5+1) = 0.75 \,{\rm bit/symbol}\hspace{0.05cm}.$$
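A quick numerical comparison, using the values determined above for $\rm Q4$, illustrates this inconsistency:
<syntaxhighlight lang="python">
# Markov relation H = 2*H2 - H1 checked with the values of Q4
H1, H2, H = 1.0, 0.906, 0.5
print(2 * H2 - H1)      # 0.812, not the true final value H = 0.5
print(0.5 * (H + H1))   # 0.75, the H2 a Markov source with H = 0.5 would require
</syntaxhighlight>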
[[Category:Information Theory: Exercises|^1.2 Sources with Memory^]]