{{quiz-Header|Buchseite=Information_Theory/Different_Entropy_Measures_of_Two-Dimensional_Random_Variables
}}
  
[[File:P_ID2813__Inf_A_3_8.png|right|frame|Result&nbsp; $W$&nbsp; as a function <br>of&nbsp; $X$,&nbsp; $Y$,&nbsp; $Z$]]
We assume statistically independent random variables&nbsp; $X$,&nbsp; $Y$&nbsp; and&nbsp; $Z$&nbsp; with the following properties:
:$$X \in \{1,\ 2 \} \hspace{0.05cm},\hspace{0.35cm}
Y \in \{1,\ 2 \} \hspace{0.05cm},\hspace{0.35cm}
Z \in \{1,\ 2 \} \hspace{0.05cm},\hspace{0.35cm} P_X(X) = P_Y(Y) = \big [ 1/2, \ 1/2 \big ]\hspace{0.05cm},\hspace{0.35cm}P_Z(Z) = \big [ p, \ 1-p \big ].$$
  
From&nbsp; $X$,&nbsp; $Y$&nbsp; and&nbsp; $Z$&nbsp; we form the new random variable&nbsp; $W = (X+Y) \cdot Z$.
*It is obvious that there are statistical dependencies between&nbsp; $X$&nbsp; and&nbsp; $W$ &nbsp; &rArr; &nbsp; mutual information&nbsp; $I(X; W) \ne 0$.
*Furthermore,&nbsp; $I(Y; W) \ne 0$&nbsp; as well as&nbsp; $I(Z; W) \ne 0$&nbsp; will also apply, but this will not be discussed in detail in this exercise.
  
  
Three different definitions of mutual information are used in this exercise:
*the&nbsp; <u>conventional</u>&nbsp; mutual information between&nbsp; $X$&nbsp; and&nbsp; $W$:
:$$I(X;W) = H(X) - H(X|\hspace{0.05cm}W) \hspace{0.05cm},$$
*the&nbsp; <u>conditional</u>&nbsp; mutual information between&nbsp; $X$&nbsp; and&nbsp; $W$&nbsp; with a&nbsp; <u>given fixed value</u>&nbsp; $Z = z$:
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = z) = H(X\hspace{0.05cm}|\hspace{0.05cm} Z = z) - H(X|\hspace{0.05cm}W ,\hspace{0.05cm} Z = z) \hspace{0.05cm},$$
*the&nbsp; <u>conditional</u>&nbsp; mutual information between&nbsp; $X$&nbsp; and&nbsp; $W$&nbsp; for a&nbsp; <u>given random variable</u>&nbsp; $Z$:
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z ) = H(X\hspace{0.05cm}|\hspace{0.05cm} Z ) - H(X|\hspace{0.05cm}W \hspace{0.05cm} Z ) \hspace{0.05cm}.$$
  
The relationship between the last two definitions is:
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z ) = \sum_{z \hspace{0.1cm}\in \hspace{0.1cm}{\rm supp} (P_{Z})} \hspace{-0.2cm}
 P_Z(z) \cdot  I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = z)\hspace{0.05cm}.$$
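The following Python sketch is not part of the original exercise; it is a minimal numerical aid (all function and variable names are chosen here only for illustration) that evaluates the mutual information and the conditional mutual information directly from a joint probability mass function, following the definitions and the weighted sum above.
<syntaxhighlight lang="python">
from collections import defaultdict
from math import log2

def mutual_information(pxy):
    """I(X;Y) in bit, computed from a joint PMF given as a dict {(x, y): probability}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), prob in pxy.items():
        px[x] += prob
        py[y] += prob
    return sum(prob * log2(prob / (px[x] * py[y]))
               for (x, y), prob in pxy.items() if prob > 0)

def conditional_mutual_information(pxyz):
    """I(X;Y|Z) = sum over z of P_Z(z) * I(X;Y|Z=z), from a joint PMF {(x, y, z): probability}."""
    pz = defaultdict(float)
    for (x, y, z), prob in pxyz.items():
        pz[z] += prob
    result = 0.0
    for z0, pz0 in pz.items():
        # joint PMF of (X, Y) under the fixed condition Z = z0
        pxy_given_z = {(x, y): prob / pz0
                       for (x, y, z), prob in pxyz.items() if z == z0}
        result += pz0 * mutual_information(pxy_given_z)
    return result
</syntaxhighlight>
A concrete application to the random variables of this exercise is sketched at the end of the solution section.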
  
  
Hints:
*The exercise belongs to the chapter&nbsp; [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen|Different entropies of two-dimensional random variables]].
*In particular, reference is made to the page&nbsp; [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Conditional_mutual_information|Conditional mutual information]].


===Questions===
<quiz display=simple>
  
{How large is the mutual information between&nbsp; $X$&nbsp; and&nbsp; $W$,&nbsp; if&nbsp; $Z = 1$&nbsp; always holds?
|type="{}"}
$I(X; W | Z = 1) \ = \ $ { 0.5 3% } $\ \rm bit$
  
{How large is the mutual information between&nbsp; $X$&nbsp; and&nbsp; $W$,&nbsp; if&nbsp; $Z = 2$&nbsp; always holds?
|type="{}"}
$I(X; W | Z = 2) \ = \ $ { 0.5 3% } $\ \rm bit$
  
{Now let &nbsp;$p = {\rm Pr}(Z = 1)$.&nbsp; How large is the conditional mutual information between&nbsp; $X$&nbsp; and&nbsp; $W$,&nbsp; if&nbsp; $z \in Z = \{1,\ 2\}$&nbsp; is known?
|type="{}"}
$p = 1/2\text{:} \ \ \ I(X; W | Z) \ = \ $ { 0.5 3% } $\ \rm bit$
$p = 3/4\text{:} \ \ \ I(X; W | Z) \ = \ $ { 0.5 3% } $\ \rm bit$
  
{How large is the unconditional mutual information for&nbsp; $p = 1/2$?
|type="{}"}
$I(X; W) \ = \ $ { 0.25 3% } $\ \rm bit$
  
  
</quiz>
  
===Solution===
{{ML-Kopf}}
[[File:P_ID2814__Inf_A_3_8a.png|right|frame|Two-dimensional probability mass functions for&nbsp; $Z = 1$]]
'''(1)'''&nbsp; The upper graph applies to&nbsp; $Z = 1$ &nbsp; &rArr; &nbsp; $W = X + Y$.
*Under the conditions&nbsp; $P_X(X) = \big [1/2, \ 1/2 \big]$&nbsp; and&nbsp; $P_Y(Y) = \big [1/2, \ 1/2 \big]$,&nbsp; the joint probabilities&nbsp; $P_{XW|Z=1}(X, W)$&nbsp; result according to the right graph (grey background).
*Thus the following applies to the mutual information under the fixed condition&nbsp; $Z = 1$:
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 1) \hspace{-0.05cm} = \hspace{-1.1cm}\sum_{(x,w) \hspace{0.1cm}\in \hspace{0.1cm}{\rm supp} (P_{XW}\hspace{0.01cm}|\hspace{0.01cm} Z\hspace{-0.03cm} =\hspace{-0.03cm} 1)} \hspace{-1.1cm}
  P_{XW\hspace{0.01cm}|\hspace{0.01cm} Z\hspace{-0.03cm} =\hspace{-0.03cm} 1} (x,w) \cdot {\rm log}_2 \hspace{0.1cm} \frac{P_{XW\hspace{0.01cm}|\hspace{0.01cm} Z\hspace{-0.03cm} =\hspace{-0.03cm} 1} (x,w) }{P_X(x) \cdot P_{W\hspace{0.01cm}|\hspace{0.01cm} Z\hspace{-0.03cm} =\hspace{-0.03cm} 1} (w) }$$
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 1)  =  2 \cdot \frac{1}{4} \cdot {\rm log}_2 \hspace{0.1cm} \frac{1/4}{1/2 \cdot 1/4} +
2 \cdot \frac{1}{4} \cdot {\rm log}_2 \hspace{0.1cm} \frac{1/4}{1/2 \cdot 1/2}$$
:$$\Rightarrow \hspace{0.3cm} I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 1)
\hspace{0.15cm} \underline {=0.5\,{\rm (bit)}}
\hspace{0.05cm}.$$
*The first term combines the two horizontally shaded fields in the graph, the second term the vertically shaded fields.
*The second term does not contribute because of&nbsp; $\log_2 (1) = 0$.
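A quick numerical cross-check of this sum (not part of the original solution; plain Python arithmetic only):
<syntaxhighlight lang="python">
from math import log2

# two horizontally shaded fields: P_{XW|Z=1} = 1/4, P_X = 1/2, P_{W|Z=1} = 1/4
term1 = 2 * (1/4) * log2((1/4) / ((1/2) * (1/4)))
# two vertically shaded fields:   P_{XW|Z=1} = 1/4, P_X = 1/2, P_{W|Z=1} = 1/2
term2 = 2 * (1/4) * log2((1/4) / ((1/2) * (1/2)))
print(term1, term2, term1 + term2)   # 0.5  0.0  0.5  ->  I(X;W|Z=1) = 0.5 bit
</syntaxhighlight>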
 
  
  
[[File:P_ID2815__Inf_A_3_8b.png|right|frame|Two-dimensional probability mass functions for&nbsp; $Z = 2$]]
'''(2)'''&nbsp; For&nbsp; $Z = 2$&nbsp; we have&nbsp; $W \in \{4,\ 6,\ 8\}$, but nothing changes with respect to the probability mass functions compared to subtask&nbsp; '''(1)'''.
*Consequently, the same conditional mutual information is obtained:
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 2) = I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 1)
\hspace{0.15cm} \underline {=0.5\,{\rm (bit)}}
\hspace{0.05cm}.$$
 
  
 
'''(3)'''&nbsp; For&nbsp; $Z = \{1,\ 2\}$&nbsp; with&nbsp; ${\rm Pr}(Z = 1) = p$&nbsp; and&nbsp; ${\rm Pr}(Z = 2) = 1 - p$,&nbsp; the given equation reads:
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z) =  p \cdot I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 1) + (1-p) \cdot I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 2)\hspace{0.15cm} \underline {=0.5\,{\rm (bit)}}
\hspace{0.05cm}.$$
*Here it is taken into account that, according to subtasks&nbsp; '''(1)'''&nbsp; and&nbsp; '''(2)''',&nbsp; the conditional mutual informations for given&nbsp; $Z = 1$&nbsp; and for given&nbsp; $Z = 2$&nbsp; are equal.
*Thus&nbsp; $I(X; W|Z)$,&nbsp; i.e. the mutual information under the condition of a random variable&nbsp; $Z = \{1,\ 2\}$&nbsp; with&nbsp; $P_Z(Z) = \big [p, \ 1 - p\big ]$,&nbsp; is independent of&nbsp; $p$.
*In particular, the result also holds for&nbsp; $\underline{p = 1/2}$&nbsp; and&nbsp; $\underline{p = 3/4}$.
  
  
[[File:P_ID2816__Inf_A_3_8d.png|right|frame|To calculate the joint probability for&nbsp; $XW$]]
'''(4)'''&nbsp; The joint probabilities&nbsp; $P_{XW}(\cdot)$&nbsp; also depend on the&nbsp; $Z$&nbsp;probabilities&nbsp; $p$&nbsp; and&nbsp; $1 - p$.
*For&nbsp; ${\rm Pr}(Z = 1) = {\rm Pr}(Z = 2) = 1/2$&nbsp; the scheme sketched on the right results.
*Again, only the two horizontally shaded fields contribute to the mutual information:
:$$ I(X;W) = 2 \cdot \frac{1}{8} \cdot {\rm log}_2 \hspace{0.1cm} \frac{1/8}{1/2 \cdot 1/8}
\hspace{0.15cm} \underline {=0.25\,{\rm (bit)}} \hspace{0.35cm} < \hspace{0.35cm} I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z)
\hspace{0.05cm}.$$
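The complete exercise can also be checked numerically. The following self-contained Python sketch (not part of the original solution; all names are chosen freely, and the mutual-information helper repeats the one sketched in the problem statement) builds the joint distribution of $(X, W, Z)$ for $p = 1/2$ and reproduces both results, $I(X;W) = 0.25$ bit and $I(X;W|Z) = 0.5$ bit:
<syntaxhighlight lang="python">
from collections import defaultdict
from itertools import product
from math import log2

def mutual_information(pxy):
    """I(X;Y) in bit from a joint PMF {(x, y): probability}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), prob in pxy.items():
        px[x] += prob
        py[y] += prob
    return sum(prob * log2(prob / (px[x] * py[y]))
               for (x, y), prob in pxy.items() if prob > 0)

p = 0.5                                        # Pr(Z = 1)
pxwz = defaultdict(float)                      # joint PMF of (X, W, Z)
for x, y, z in product([1, 2], [1, 2], [1, 2]):
    pxwz[(x, (x + y) * z, z)] += 0.5 * 0.5 * (p if z == 1 else 1 - p)

# unconditional mutual information I(X;W): marginalise Z out first
pxw = defaultdict(float)
for (x, w, z), prob in pxwz.items():
    pxw[(x, w)] += prob
print(mutual_information(pxw))                 # -> 0.25 bit

# conditional mutual information I(X;W|Z) as the weighted sum over Z = z
i_cond = sum((p if z0 == 1 else 1 - p) *
             mutual_information({(x, w): prob / (p if z0 == 1 else 1 - p)
                                 for (x, w, z), prob in pxwz.items() if z == z0})
             for z0 in (1, 2))
print(i_cond)                                  # -> 0.5 bit
</syntaxhighlight>
Setting, for example, $p = 0.75$ in this sketch leaves $I(X;W|Z) = 0.5$ bit unchanged, in line with subtask&nbsp; '''(3)'''.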
  
 
The result&nbsp; $I(X; W|Z) > I(X; W)$&nbsp; holds for this example, but also for many other applications:
*If we know&nbsp; $Z$,&nbsp; we know more about the two-dimensional random variable&nbsp; $XW$&nbsp; than without this knowledge.
*However, one must not generalize this result:
:Sometimes&nbsp; $I(X; W) > I(X; W|Z)$&nbsp; actually applies, as in&nbsp; [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgr%C3%B6%C3%9Fen#Conditional_mutual_information|Example 4]]&nbsp; in the theory section.
 
 
 
 
{{ML-Fuß}}
  
  
  
[[Category:Information Theory: Exercises|^3.2 Entropies of 2D Random Variables^]]
