Exercise 3.15: Data Processing Theorem

{{quiz-Header|Buchseite=Information_Theory/Application_to_Digital_Signal_Transmission
 
}}
 
  
[[File:P_ID2818__Inf_A_3_14.png|right|frame|"Data Processing Theorem"]]
We consider the following data processing chain:
* Binary input data  $X$  is processed by processor  $1$  (top half in the graph),  which can be described by conditional probabilities &nbsp; &rArr; &nbsp; $P_{Y\hspace{0.03cm}|\hspace{0.03cm}X}(\cdot)$.  Its output variable is  $Y$.
* A second processor with random variable  $Y$  at input and random variable  $Z$  at output is given by  $P_{Z\hspace{0.03cm}|\hspace{0.03cm}Y}(\cdot)$   (lower half in the graph).  $Z$  depends on  $Y$  alone  (deterministic or stochastic)  and is independent of  $X$:
:$$P_{Z\hspace{0.05cm}|\hspace{0.03cm} XY\hspace{-0.03cm}}(z\hspace{0.03cm}|\hspace{0.03cm} x, y) =P_{Z\hspace{0.05cm}|\hspace{0.03cm} Y\hspace{-0.03cm}}(z\hspace{0.03cm}|\hspace{0.03cm} y) \hspace{0.05cm}.$$
The following nomenclature was used for this description:
:$$x \in X = \{0, 1\}\hspace{0.02cm},\hspace{0.3cm} y \in Y = \{0,1\}\hspace{0.02cm},\hspace{0.3cm} z \in Z = \{0, 1\}\hspace{0.02cm}.$$
The joint probability mass function is:
:$$P_{XYZ}(x, y, z) = P_{X}(x) \cdot P_{Y\hspace{0.05cm}|\hspace{0.03cm} X\hspace{-0.03cm}}(y\hspace{0.03cm}|\hspace{0.03cm} x)\cdot P_{Z\hspace{0.05cm}|\hspace{0.03cm} Y\hspace{-0.03cm}}(z\hspace{0.03cm}|\hspace{0.03cm} y) \hspace{0.05cm}.$$
This also means:
$X → Y → Z$  form a  [[Theory_of_Stochastic_Signals/Markovketten|Markov chain]].  For such a chain, the  "Data Processing Theorem"  holds with the following consequence:
:$$I(X;Z)  \le  I(X;Y ) \hspace{0.05cm}, $$
:$$I(X;Z)  \le  I(Y;Z ) \hspace{0.05cm}.$$
The theorem thus states:
:* One cannot gain additional information about input  $X$  by manipulating   ("processing")  data  $Y$ .
:* Data processing  (by processor  $2$)  serves only the purpose of making the information about  $X$  more visible.
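
The two inequalities can also be verified numerically.  The following Python sketch is not part of the original exercise; the helper names  "mutual_information"  and  "cascade_pmfs"  are chosen freely here.  It builds the joint probability mass functions of the Markov chain from  $P_X$,  $P_{Y\hspace{0.03cm}|\hspace{0.03cm}X}$  and  $P_{Z\hspace{0.03cm}|\hspace{0.03cm}Y}$  and evaluates the three mutual informations by brute force:

<pre>
import itertools
from math import log2

def mutual_information(joint):
    # I(A;B) in bit from a dictionary {(a, b): probability}
    pa, pb = {}, {}
    for (a, b), prob in joint.items():
        pa[a] = pa.get(a, 0.0) + prob
        pb[b] = pb.get(b, 0.0) + prob
    return sum(prob * log2(prob / (pa[a] * pb[b]))
               for (a, b), prob in joint.items() if prob > 0)

def cascade_pmfs(p_x, p_y_x, p_z_y):
    # Markov chain X -> Y -> Z:  P_XYZ(x,y,z) = P_X(x) * P_Y|X(y|x) * P_Z|Y(z|y)
    p_xy, p_yz, p_xz = {}, {}, {}
    for x, y, z in itertools.product((0, 1), repeat=3):
        prob = p_x[x] * p_y_x[x][y] * p_z_y[y][z]
        p_xy[(x, y)] = p_xy.get((x, y), 0.0) + prob
        p_yz[(y, z)] = p_yz.get((y, z), 0.0) + prob
        p_xz[(x, z)] = p_xz.get((x, z), 0.0) + prob
    return p_xy, p_yz, p_xz

# Two binary symmetric channels in cascade (illustrative values p = 0.1, q = 0.2)
p, q = 0.1, 0.2
p_x   = {0: 0.5, 1: 0.5}
p_y_x = {0: {0: 1 - p, 1: p}, 1: {0: p, 1: 1 - p}}
p_z_y = {0: {0: 1 - q, 1: q}, 1: {0: q, 1: 1 - q}}

p_xy, p_yz, p_xz = cascade_pmfs(p_x, p_y_x, p_z_y)
i_xy, i_yz, i_xz = (mutual_information(j) for j in (p_xy, p_yz, p_xz))
print(round(i_xy, 3), round(i_yz, 3), round(i_xz, 3))   # 0.531 0.278 0.173 (bit)
print(i_xz <= i_xy and i_xz <= i_yz)                    # True: both inequalities hold
</pre>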
  
 
Hints:
*The task belongs to the chapter  [[Information_Theory/Anwendung_auf_die_Digitalsignalübertragung| Application to Digital Signal Transmission]].
*Reference is also made to the page  [[Information_Theory/Different_Entropy_Measures_of_Two-Dimensional_Random_Variables#Chain_rule_of_the_mutual_information|Chain rule of the mutual information]]  in the previous chapter.
===Questions===
  
 
<quiz display=simple>
{How can the result &nbsp;$I(X; Y) = 1 - H_{\rm bin}(p)$&nbsp; be derived?
 
|type="[]"}
 
|type="[]"}
+ The derivation is based on the properties of a strictly symmetric channel.
- It is exploited that &nbsp;$H_{\rm bin}(p)$&nbsp; is a concave function.
- The result is valid for any probability mass function &nbsp;$P_X(X)$.
  
  
{Which mutual information &nbsp;$I(X; Y)$&nbsp; results for the first processor with &nbsp;$p = 0.1$?
 
|type="{}"}
 
|type="{}"}
$ I(X; Y) \ = \ $ { 0.531 3% } $\ \rm bit$
  
{Which mutual information &nbsp;$I(Y; Z)$&nbsp; results for the second processor with &nbsp;$q = 0.2$?
 
|type="{}"}
 
|type="{}"}
$I(Y; Z) \ = \ $ { 0.278 3% } $\ \rm bit$
  
{Which mutual information &nbsp;$I(X; Z)$&nbsp; results for the whole system with&nbsp; $p = 0.1$ &nbsp;and&nbsp; $q = 0.2$?  
 
|type="{}"}
 
|type="{}"}
$I(X; Z) \ = \ $ { 0.173 3% } $\ \rm bit$
  
{Does this example satisfy the&nbsp; "Data Processing Theorem"?
|type="[]"}
+
|type="()"}
+ Yes.
- No.
 
</quiz>
  
===Solution===
 
{{ML-Kopf}}
'''(1)'''&nbsp; Only <u>solution proposal 1</u> is correct:
* Both processors describe strictly symmetric channels &nbsp; &rArr; &nbsp; both uniformly dispersive and uniformly focusing.
* For such a binary channel, with&nbsp; $Y = \{0, 1\} \ \Rightarrow \ |Y| = 2$:
:$$I(X;Y) = 1 + \sum_{y \hspace{0.05cm}\in\hspace{0.05cm} Y} \hspace{0.1cm} P_{Y\hspace{0.03cm}|\hspace{0.03cm} X}(y\hspace{0.03cm}|\hspace{0.03cm}x) \cdot {\rm log}_2 \hspace{0.1cm}P_{\hspace{0.01cm}Y \hspace{0.03cm}|\hspace{0.03cm} X}(y\hspace{0.03cm}|\hspace{0.03cm}x) \hspace{0.05cm}.$$
*Here it does not matter at all whether one starts from&nbsp; $X = 0$&nbsp; or from&nbsp; $X = 1$.
*For&nbsp; $X = 0$&nbsp; one obtains with&nbsp; $P_{Y\hspace{0.03cm}|\hspace{0.03cm}X}(Y = 1\hspace{0.03cm}|\hspace{0.03cm}X = 0) = p$ &nbsp;and&nbsp; $P_{Y\hspace{0.03cm}|\hspace{0.03cm}X}(Y = 0\hspace{0.03cm}|\hspace{0.03cm}X = 0) = 1 - p$:
:$$ I(X;Y) = 1 + p \cdot {\rm log}_2 \hspace{0.1cm} (p) + (1-p) \cdot {\rm log}_2 \hspace{0.1cm} (1-p) = 1 - H_{\rm bin}(p)\hspace{0.05cm}, \hspace{1.0cm}H_{\rm bin}(p)= p \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{p}+ (1-p) \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{1-p}\hspace{0.05cm}.$$
*However, the result holds only for&nbsp; $P_X(X) = (0.5, \ 0.5)$ &nbsp; &rArr; &nbsp; maximum mutual information &nbsp; &rArr; &nbsp; channel capacity.  
*Otherwise,&nbsp; $I(X; Y)$&nbsp; is smaller.&nbsp; For example, for&nbsp; $P_X(X) = (1, \ 0)$: &nbsp; $H(X) = 0$ &nbsp; &rArr; &nbsp; $I(X; Y) = 0.$
*The binary entropy function is concave, but this property was not used in the derivation &nbsp; &rArr; &nbsp; answer 2 is incorrect.
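
Purely as a cross-check (not part of the original solution), the closed-form result for the strictly symmetric binary channel with equiprobable input symbols can be evaluated with a few lines of Python; the helper names below are chosen freely:

<pre>
from math import log2

def h_bin(p):
    # binary entropy function H_bin(p) in bit
    if p in (0.0, 1.0):
        return 0.0
    return p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))

def i_bsc(p):
    # I(X;Y) = 1 - H_bin(p) for a BSC with equiprobable input symbols
    return 1 - h_bin(p)

print(round(i_bsc(0.1), 3))   # 0.531 bit  ->  subtask (2)
print(round(i_bsc(0.2), 3))   # 0.278 bit  ->  subtask (3)
</pre>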
 
'''(2)'''&nbsp; For processor&nbsp; $1$,&nbsp; $p = 0.1$&nbsp; gives:
:$$ I(X;Y) = 1 - H_{\rm bin}(0.1) = 1 - 0.469 \hspace{0.15cm} \underline {=0.531 \,{\rm (bit)}} \hspace{0.05cm}.$$
  
 
  
'''(3)'''&nbsp; Correspondingly, for the second processor with&nbsp; $q = 0.2\hspace{0.05cm}$:
:$$I(Y;Z) = 1 - H_{\rm bin}(0.2) = 1 - 0.722 \hspace{0.15cm} \underline {=0.278 \,{\rm (bit)}} \hspace{0.05cm}.$$
  
 
  
'''(4)'''&nbsp; The probability of&nbsp; $Z = 0$&nbsp; under the condition&nbsp; $X = 0$&nbsp; is obtained via the two possible paths&nbsp; (over&nbsp; $Y = 0$&nbsp; and over&nbsp; $Y = 1$)&nbsp; as
:$$P(\hspace{0.01cm}Z\hspace{-0.05cm} = 0\hspace{0.03cm} | \hspace{0.03cm} X\hspace{-0.05cm} = \hspace{-0.05cm}0) = (1-p) \cdot (1-q) + p \cdot q = 1 - p - q + 2pq \stackrel{!}{=} 1 - \varepsilon \hspace{0.05cm}.$$
*The overall system then has exactly the same BSC structure as processors $1$ and $2$, but now with falsification probability&nbsp; $\varepsilon = p + q - 2pq \hspace{0.05cm}.$  
*For&nbsp; $p = 0.1$&nbsp; and&nbsp; $q = 0.2$&nbsp;, we obtain:
:$$ \varepsilon = 0.1 + 0.2 - 2\cdot 0.1 \cdot 0.2 = 0.26 \hspace{0.3cm}\Rightarrow \hspace{0.3cm} I(X;Z) = 1 - H_{\rm bin}(0.26) = 1 - 0.827 \hspace{0.15cm} \underline {=0.173 \,{\rm (bit)}} \hspace{0.05cm}.$$
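
A short numerical sketch of this calculation (again only for verification, with a freely chosen helper for $H_{\rm bin}$):

<pre>
from math import log2

def h_bin(p):
    # binary entropy function in bit
    return -p * log2(p) - (1 - p) * log2(1 - p)

p, q = 0.1, 0.2
eps = p + q - 2 * p * q           # falsification probability of the overall BSC
print(round(eps, 2))              # 0.26
print(round(1 - h_bin(eps), 3))   # 0.173 bit  =  I(X;Z)
</pre>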
 
  
  
'''(5)'''&nbsp; The answer is&nbsp; <u>YES</u>,&nbsp; because the&nbsp; "Data Processing Theorem" assumes exactly the conditions given here.
  
However, we want to evaluate some additional numerical results:
* If&nbsp; $0 ≤ p < 0.5$ &nbsp;and&nbsp; $0 ≤ q < 0.5$&nbsp; hold, we obtain:
:$$\varepsilon = p + q \cdot (1-2p) > p \hspace{0.3cm}\Rightarrow \hspace{0.3cm} I(X;Z) < I(X;Y) \hspace{0.05cm},$$
:$$\varepsilon = q + p \cdot (1-2q) > q \hspace{0.3cm}\Rightarrow \hspace{0.3cm} I(X;Z) < I(Y;Z) \hspace{0.05cm}.$$
*For &nbsp;$p = 0.5$,&nbsp; the following holds independently of &nbsp;$q$,&nbsp; since&nbsp; $I(X; Z)$&nbsp; cannot be greater than&nbsp; $I(X; Y)$:
:$$\varepsilon = 0.5 + q \cdot (1-1) = 0.5 \hspace{0.3cm}\Rightarrow \hspace{0.3cm} I(X;Y) = 0 \hspace{0.3cm}\Rightarrow \hspace{0.3cm} I(X;Z) = 0 \hspace{0.05cm}.$$
*Similarly, with &nbsp;$q = 0.5$&nbsp; we obtain, independently of &nbsp;$p$:
:$$\varepsilon = 0.5 + p \cdot (1-1) = 0.5 \hspace{0.3cm}\Rightarrow \hspace{0.3cm} I(Y;Z) = 0 \hspace{0.3cm}\Rightarrow \hspace{0.3cm} I(X;Z) = 0 \hspace{0.05cm}.$$
*The Data Processing Theorem is not violated for &nbsp;$p < 0.5$&nbsp; and &nbsp;$q > 0.5$&nbsp; either, which is shown here by just one example:
::With &nbsp;$p = 0.1$&nbsp; and &nbsp;$q = 0.8$&nbsp;, the same result is obtained as in subtask&nbsp; '''(4)''':
::$$\varepsilon = 0.1 + 0.8 - 2\cdot 0.1 \cdot 0.8 = 0.74 \hspace{0.3cm} \Rightarrow \hspace{0.3cm} I(X;Z) = 1 - H_{\rm bin}(0.74) = 1 - H_{\rm bin}(0.26) =0.173 \,{\rm (bit)} \hspace{0.05cm}.$$
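
The above statements can also be confirmed over the entire parameter range by a small scan, sketched below in Python (the helper names are again freely chosen; the tolerance of $10^{-12}$ only absorbs floating-point rounding):

<pre>
from math import log2

def h_bin(p):
    # binary entropy function in bit
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def i_bsc(p):
    # mutual information of a BSC with equiprobable input symbols
    return 1 - h_bin(p)

violations = 0
steps = 101
for i in range(steps):
    for j in range(steps):
        p, q = i / (steps - 1), j / (steps - 1)
        eps = p + q - 2 * p * q          # falsification probability of the cascade
        i_xz = i_bsc(eps)
        # Data Processing Theorem:  I(X;Z) <= I(X;Y)  and  I(X;Z) <= I(Y;Z)
        if i_xz > i_bsc(p) + 1e-12 or i_xz > i_bsc(q) + 1e-12:
            violations += 1
print(violations)   # 0  ->  the theorem is never violated
</pre>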
 
{{ML-Fuß}}
  
  
  
[[Category:Information Theory: Exercises|^3.3 Application to Digital Signal Transmission^]]
