
Exercise 3.9: Conditional Mutual Information

  
[[File:P_ID2813__Inf_A_3_8.png|right|frame|Result&nbsp; W&nbsp; as a function <br>of&nbsp;  X,&nbsp; Y,&nbsp; Z]]
We assume statistically independent random variables&nbsp; X,&nbsp; Y&nbsp; and&nbsp; Z&nbsp; with the following properties:
 
:$$X \in \{1,\ 2 \} \hspace{0.05cm},\hspace{0.35cm}
Y \in \{1,\ 2 \} \hspace{0.05cm},\hspace{0.35cm}
Z \in \{1,\ 2 \} \hspace{0.05cm},\hspace{0.35cm} P_X(X) = P_Y(Y) = \big [ 1/2, \ 1/2 \big ]\hspace{0.05cm},\hspace{0.35cm}P_Z(Z) = \big [ p, \ 1-p \big ].$$
  
From&nbsp; X,&nbsp; Y&nbsp; and&nbsp; Z&nbsp; we form the new random variable&nbsp; W=(X+Y)Z.
*It is obvious that there are statistical dependencies between&nbsp; X&nbsp; and&nbsp; W &nbsp; &rArr; &nbsp; mutual information&nbsp; I(X;W) ≠ 0.
*Furthermore,&nbsp; I(Y;W) ≠ 0 &nbsp;as well as&nbsp; I(Z;W) ≠ 0&nbsp; will also hold, but this is not discussed further in this exercise.
  
  
Three different definitions of mutual information are used in this exercise:
*the ''conventional''&nbsp; mutual information between&nbsp; X&nbsp; and&nbsp; W:
 
:$$I(X;W) = H(X) - H(X\hspace{0.05cm}|\hspace{0.05cm}W) \hspace{0.05cm},$$
* the ''conditional''&nbsp; mutual information between&nbsp; X&nbsp; and&nbsp; W&nbsp; with a ''given fixed value''&nbsp; Z=z:
 
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = z) = H(X\hspace{0.05cm}|\hspace{0.05cm} Z = z) - H(X\hspace{0.05cm}|\hspace{0.05cm}W ,\hspace{0.05cm} Z = z) \hspace{0.05cm},$$
* the ''conditional''&nbsp; mutual information between&nbsp; X&nbsp; and&nbsp; W&nbsp; for the ''given random variable''&nbsp; Z:
 
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z ) = H(X\hspace{0.05cm}|\hspace{0.05cm} Z ) - H(X\hspace{0.05cm}|\hspace{0.05cm}W\hspace{0.05cm} Z ) \hspace{0.05cm}.$$
  
The relationship between the last two definitions is:
 
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z ) = \sum_{z \hspace{0.1cm}\in \hspace{0.1cm}{\rm supp} (P_{Z})} \hspace{-0.2cm}
  P_Z(z) \cdot  I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = z)\hspace{0.05cm}.$$
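The following minimal Python sketch (not part of the original exercise; the function names are purely illustrative) shows how these definitions can be evaluated numerically once the joint probability functions are given as dictionaries:

<pre>
from math import log2
from collections import defaultdict

def mutual_information(joint_xw):
    """I(X;W) in bit, with joint_xw = {(x, w): probability}."""
    p_x, p_w = defaultdict(float), defaultdict(float)
    for (x, w), pr in joint_xw.items():
        p_x[x] += pr
        p_w[w] += pr
    return sum(pr * log2(pr / (p_x[x] * p_w[w]))
               for (x, w), pr in joint_xw.items() if pr > 0)

def conditional_mutual_information(p_z, joint_xw_given_z):
    """I(X;W|Z) as the P_Z-weighted sum of I(X;W|Z=z)."""
    return sum(p_z[z] * mutual_information(joint_xw_given_z[z])
               for z in p_z)
</pre>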
  
  
Hints:
*The exercise belongs to the chapter&nbsp; [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen|Different entropies of two-dimensional random variables]].
*In particular, reference is made to the page&nbsp; [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Bedingte_Transinformation|Conditional mutual information]].  
 
  
  
 
===Questions===
 
 
<quiz display=simple>
  
{How large is the mutual information between&nbsp; X&nbsp; and&nbsp; W,&nbsp; if&nbsp; Z=1&nbsp; always holds?
 
|type="{}"}
I(X;W|Z=1) =  { 0.5 3% }  bit
  
{How large is the mutual information between&nbsp; X&nbsp; and&nbsp; W,&nbsp; if&nbsp; Z=2&nbsp; always holds?
 
|type="{}"}
I(X;W|Z=2) =  { 0.5 3% }  bit
  
{Now let &nbsp;p = Pr(Z=1).&nbsp; How large is the conditional mutual information between&nbsp; X&nbsp; and&nbsp; W,&nbsp; if&nbsp; z ∈ Z = {1, 2}&nbsp; is known?
 
|type="{}"}
p=1/2:   I(X;W|Z) =   { 0.5 3% }  bit
p=3/4:   I(X;W|Z) =   { 0.5 3% }  bit
  
{How large is the unconditional mutual information for&nbsp; p=1/2?  
 
|type="{}"}
I(X;W) =  { 0.25 3% }  bit
 
</quiz>
  
===Solution===
 
{{ML-Kopf}}
[[File:P_ID2814__Inf_A_3_8a.png|right|frame|2D probability functions for&nbsp; Z=1]]
'''(1)'''&nbsp; The upper graph applies to&nbsp; Z=1 &nbsp; &rArr; &nbsp; W = X + Y.
*With&nbsp; P_X(X) = [1/2, 1/2]&nbsp; and&nbsp; P_Y(Y) = [1/2, 1/2],&nbsp; the joint probabilities&nbsp; P_{XW|Z=1}(X, W)&nbsp; result as shown in the graph on the right (grey background).
  
*Thus the following applies to the mutual information under the fixed condition&nbsp; Z=1:
 
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 1) \hspace{-0.05cm} = \hspace{-1.1cm}\sum_{(x,w) \hspace{0.1cm}\in \hspace{0.1cm}{\rm supp} (P_{XW}\hspace{0.01cm}|\hspace{0.01cm} Z\hspace{-0.03cm} =\hspace{-0.03cm} 1)} \hspace{-1.1cm}
  P_{XW\hspace{0.01cm}|\hspace{0.01cm} Z\hspace{-0.03cm} =\hspace{-0.03cm} 1} (x,w) \cdot {\rm log}_2 \hspace{0.1cm} \frac{P_{XW\hspace{0.01cm}|\hspace{0.01cm} Z\hspace{-0.03cm} =\hspace{-0.03cm} 1} (x,w) }{P_X(x) \cdot P_{W\hspace{0.01cm}|\hspace{0.01cm} Z\hspace{-0.03cm} =\hspace{-0.03cm} 1} (w) }$$
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 1) = 2 \cdot \frac{1}{4} \cdot {\rm log}_2 \hspace{0.1cm} \frac{1/4}{1/2 \cdot 1/4} + 2 \cdot \frac{1}{4} \cdot {\rm log}_2 \hspace{0.1cm} \frac{1/4}{1/2 \cdot 1/2}$$
:$$\Rightarrow \hspace{0.3cm} I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 1) \hspace{0.15cm} \underline {=0.5\,{\rm (bit)}} \hspace{0.05cm}.$$
  
*The first term summarises the two horizontally shaded fields in the graph, the second term the vertically shaded fields.
*The latter do not contribute, since&nbsp; log2(1) = 0.
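The same value can be obtained with a short numerical check (a sketch only, not part of the original solution; the probabilities are read from the table above):

<pre>
from math import log2

# Conditional joint PMF P_XW|Z=1(x, w) for W = X + Y  (grey fields in the table)
joint = {(1, 2): 1/4, (1, 3): 1/4, (2, 3): 1/4, (2, 4): 1/4}
p_x   = {1: 1/2, 2: 1/2}              # marginal PMF of X
p_w   = {2: 1/4, 3: 1/2, 4: 1/4}      # marginal PMF of W given Z = 1

i_xw_z1 = sum(pr * log2(pr / (p_x[x] * p_w[w])) for (x, w), pr in joint.items())
print(i_xw_z1)                         # 0.5 (bit); only (1,2) and (2,4) contribute
</pre>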
  
  
  
[[File:P_ID2815__Inf_A_3_8b.png|right|frame|2D probability functions for&nbsp; Z=2]]
'''(2)'''&nbsp; For&nbsp; Z=2,&nbsp; the range of values is&nbsp; W = {4, 6, 8},&nbsp; but nothing changes with respect to the probability functions compared to subtask&nbsp; '''(1)'''.
  
*Consequently, the same conditional mutual information is obtained:
 
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 2) = I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 1)
\hspace{0.15cm} \underline {=0.5\,{\rm (bit)}}
\hspace{0.05cm}.$$
 
<br clear=all>
'''(3)'''&nbsp; For&nbsp; Z = {1, 2}&nbsp; with&nbsp; Pr(Z=1) = p&nbsp; and&nbsp; Pr(Z=2) = 1 - p,&nbsp; the equation reads:
 
:$$I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z) =  p \cdot I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 1) + (1-p) \cdot I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z = 2)\hspace{0.15cm} \underline {=0.5\,{\rm (bit)}}
\hspace{0.05cm}.$$
*Here it is taken into account that, according to subtasks&nbsp; '''(1)'''&nbsp; and&nbsp; '''(2)''',&nbsp; the conditional mutual information is the same for given&nbsp; Z=1&nbsp; and for given&nbsp; Z=2.
*Thus&nbsp; I(X;W|Z), i.e. the mutual information conditioned on the random variable&nbsp; Z = {1, 2}&nbsp; with&nbsp; P_Z(Z) = [p, 1 - p],&nbsp; is independent of&nbsp; p.
*In particular, the result is also valid for&nbsp; \underline{p = 1/2}&nbsp; and&nbsp; \underline{p = 3/4}.
  
  
[[File:P_ID2816__Inf_A_3_8d.png|right|frame|Calculation of the joint probability for&nbsp; XW]]
'''(4)'''&nbsp; The joint probability&nbsp; P_{XW}&nbsp; depends on the probabilities&nbsp; p&nbsp; and&nbsp; 1 - p&nbsp; of&nbsp; Z.
*For&nbsp; Pr(Z = 1) = Pr(Z = 2) = 1/2,&nbsp; this leads to the scheme sketched on the right.
*Again, only the two horizontally shaded fields contribute to the mutual information:
 
:$$ I(X;W) = 2 \cdot \frac{1}{8} \cdot {\rm log}_2 \hspace{0.1cm} \frac{1/8}{1/2 \cdot 1/8}
\hspace{0.15cm} \underline {=0.25\,{\rm (bit)}} \hspace{0.35cm} < \hspace{0.35cm} I(X;W \hspace{0.05cm}|\hspace{0.05cm} Z)
\hspace{0.05cm}.$$
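This value can also be checked by a small enumeration over the independent variables&nbsp; X,&nbsp; Y,&nbsp; Z&nbsp; (again only a sketch, not part of the original solution; the variable names are illustrative):

<pre>
from math import log2
from collections import defaultdict
from itertools import product

p   = 1/2                                   # Pr(Z = 1); Pr(Z = 2) = 1 - p
p_z = {1: p, 2: 1 - p}

# Joint PMF P_XW(x, w) for W = (X + Y) * Z, built by enumerating X, Y, Z
joint = defaultdict(float)
for x, y, z in product([1, 2], [1, 2], [1, 2]):
    joint[(x, (x + y) * z)] += 1/2 * 1/2 * p_z[z]

p_x, p_w = defaultdict(float), defaultdict(float)
for (x, w), pr in joint.items():
    p_x[x] += pr
    p_w[w] += pr

i_xw = sum(pr * log2(pr / (p_x[x] * p_w[w])) for (x, w), pr in joint.items())
print(i_xw)                                  # 0.25 (bit)  <  I(X;W|Z) = 0.5 (bit)
</pre>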
  
The result&nbsp; I(X; W|Z) > I(X; W)&nbsp; is true for this example, but also for many other applications:
*If I know&nbsp; Z, I know more about the 2D random variable&nbsp; XW&nbsp; than without this knowledge.
*However, one must not generalise this result:
:Sometimes&nbsp; I(X; W) > I(X; W|Z)&nbsp; actually holds, as in&nbsp; [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgr%C3%B6%C3%9Fen#Bedingte_Transinformation|Example 3]] in the theory section.
 
 
 
{{ML-Fuß}}
  
