Difference between revisions of "Aufgaben:Exercise 3.8: Once more Mutual Information"

From LNTwww
 
(9 intermediate revisions by 3 users not shown)
Line 1: Line 1:
  
{{quiz-Header|Buchseite=Informationstheorie/Verschiedene Entropien zweidimensionaler Zufallsgrößen
+
{{quiz-Header|Buchseite=Information_Theory/Different_Entropy_Measures_of_Two-Dimensional_Random_Variables
 
}}
 
}}
  
[[File:P_ID2768__Inf_A_3_7_neu.png|right|frame|"Wahrscheinlichkeiten" $P_{ XY }$  und  $P_{ XW }$]]
+
[[File:P_ID2768__Inf_A_3_7_neu.png|right|frame|2D&ndash;Functions&nbsp; <br>&nbsp;$P_{ XY }$&nbsp; and&nbsp; $P_{ XW }$]]
Wir betrachten das Tupel&nbsp; $Z = (X, Y)$, wobei die Einzelkomponenten&nbsp; $X$&nbsp; und&nbsp; $Y$&nbsp; jeweils ternäre Zufallsgrößen darstellen:  
+
We consider the tuple&nbsp; $Z = (X, Y)$, where the individual components&nbsp; $X$&nbsp; and&nbsp; $Y$&nbsp; each represent ternary random variables:  
 
:$$X = \{ 0 ,\ 1 ,\ 2 \} , \hspace{0.3cm}Y= \{ 0 ,\ 1 ,\ 2 \}.$$  
 
:$$X = \{ 0 ,\ 1 ,\ 2 \} , \hspace{0.3cm}Y= \{ 0 ,\ 1 ,\ 2 \}.$$  
  
Die gemeinsame Wahrscheinlichkeitsfunktion&nbsp; $P_{ XY }(X, Y)$&nbsp; beider Zufallsgrößen ist in der oberen Grafik angegeben.&nbsp;  
+
The joint probability function&nbsp; $P_{ XY }(X, Y)$&nbsp; of both random variables is given in the upper graph.&nbsp;  
  
In der&nbsp; [[Aufgaben:3.07Z_Tupel_aus_tern%C3%A4ren_Zufallsgr%C3%B6%C3%9Fen|Aufgabe 3.8Z]]&nbsp; wird diese Konstellation ausführlich analysiert.&nbsp; Man erhält als Ergebnis (alle Angaben in "bit"):
+
In&nbsp; [[Aufgaben:Exercise_3.8Z:_Tuples_from_Ternary_Random_Variables|Exercise 3.8Z]]&nbsp; this constellation is analyzed in detail.&nbsp; One obtains as a result (all data in "bit"):
 
:* $H(X) = H(Y) = \log_2 (3) = 1.585,$
 
:* $H(X) = H(Y) = \log_2 (3) = 1.585,$
 
:* $H(XY) = \log_2 (9) = 3.170,$
 
:* $H(XY) = \log_2 (9) = 3.170,$
Line 16: Line 16:
 
:* $I(X, Z) = 1.585.$
 
:* $I(X, Z) = 1.585.$
  
Desweiteren betrachten wir die Zufallsgröße&nbsp; $W = \{ 0,\ 1,\ 2,\ 3,\ 4 \}$, deren Eigenschaften sich aus der Verbundwahrscheinlichkeitsfunktion&nbsp; $P_{ XW }(X, W)$&nbsp; nach der unteren Skizze ergeben.&nbsp; Die Wahrscheinlichkeiten sind in allen weiß hinterlegten Feldern jeweils Null.
+
Furthermore, we consider the random variable&nbsp; $W = \{ 0,\ 1,\ 2,\ 3,\ 4 \}$, whose properties follow from the joint probability function&nbsp; $P_{ XW }(X, W)$&nbsp; shown in the lower sketch.&nbsp; The probabilities are zero in all fields with a white background.
  
Gesucht ist in der vorliegenden Aufgabe die Transinformation zwischen
+
What is sought in the present exercise is the mutual information between
:*den Zufallsgrößen&nbsp; $X$&nbsp; und&nbsp; $W$ &nbsp; ⇒ &nbsp;  $I(X; W)$,
+
:*the random variables&nbsp; $X$&nbsp; and&nbsp; $W$ &nbsp; ⇒ &nbsp;  $I(X; W)$,
:* den Zufallsgrößen&nbsp; $Z$&nbsp; und&nbsp; $W &nbsp; ⇒ &nbsp; I(Z; W)$.
+
:*the random variables&nbsp; $Z$&nbsp; and&nbsp; $W$ &nbsp; ⇒ &nbsp;  $I(Z; W)$.
  
  
Line 26: Line 26:
  
  
''Hinweise:''
+
Hints:  
*Die Aufgabe gehört zum  Kapitel&nbsp; [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen|Verschiedene Entropien zweidimensionaler Zufallsgrößen]].
+
*The exercise belongs to the chapter&nbsp; [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen|Different entropies of two-dimensional random variables]].
*Insbesondere wird Bezug genommen auf die Seiten <br> &nbsp; &nbsp;  [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Bedingte_Wahrscheinlichkeit_und_bedingte_Entropie|Bedingte Wahrscheinlichkeit und bedingte Entropie]] sowie <br> &nbsp; &nbsp;  [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Transinformation_zwischen_zwei_Zufallsgr.C3.B6.C3.9Fen|Transinformation zwischen zwei Zufallsgrößen]].
+
*In particular, reference is made to the sections <br> &nbsp; &nbsp;  [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Conditional_probability_and_conditional_entropy|Conditional probability and conditional entropy]] as well as <br> &nbsp; &nbsp;  [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Mutual_information_between_two_random_variables|Mutual information between two random variables]].
 
   
 
   
  
  
===Fragebogen===
+
===Questions===
  
 
<quiz display=simple>
 
<quiz display=simple>
{Wie könnten die Größen&nbsp; $X$,&nbsp; $Y$&nbsp; und&nbsp; $W$&nbsp; zusammenhängen?  
+
{How might the variables&nbsp; $X$,&nbsp; $Y$&nbsp; and&nbsp; $W$&nbsp; be related?  
 
|type="[]"}
 
|type="[]"}
 
+ $W = X + Y$,
 
+ $W = X + Y$,
Line 41: Line 41:
 
-$W = Y - X + 2$.
 
-$W = Y - X + 2$.
  
{Welche Transinformation besteht zwischen den Zufallsgrößen&nbsp; $X$&nbsp; und&nbsp; $W$?
+
{What is the mutual information between the random variables&nbsp; $X$&nbsp; and&nbsp; $W$?
 
|type="{}"}
 
|type="{}"}
 
$I(X; W) \ = \ $ { 0.612 3%  } $\ \rm bit$
 
$I(X; W) \ = \ $ { 0.612 3%  } $\ \rm bit$
  
{Welche Transinformation besteht zwischen den Zufallsgrößen&nbsp; $Z$&nbsp; und&nbsp; $W$?
+
{What is the mutual information between the random variables&nbsp; $Z$&nbsp; and&nbsp; $W$?
 
|type="{}"}
 
|type="{}"}
 
$I(Z; W) \ = \ $ { 2.197 3%  } $\ \rm bit$
 
$I(Z; W) \ = \ $ { 2.197 3%  } $\ \rm bit$
  
{Welche der nachfolgenden Aussagen sind zutreffend?
+
{Which of the following statements are true?
 
|type="[]"}
 
|type="[]"}
+ Es gilt&nbsp; $H(ZW) = H(XW)$.
+
+ &nbsp; $H(ZW) = H(XW)$&nbsp; is true.
+ Es gilt&nbsp; $H(W|Z) = 0$.
+
+ &nbsp; $H(W|Z) = 0$&nbsp; is true.
+ Es gilt&nbsp; $I(Z; W) > I(X; W)$.
+
+ &nbsp; $I(Z; W) > I(X; W)$&nbsp; is true.
  
 
</quiz>
 
</quiz>
  
===Musterlösung===
+
===Solution===
 
{{ML-Kopf}}
 
{{ML-Kopf}}
'''(1)'''&nbsp; Richtig sind die <u>Lösungsvorschläge 1 und 2</u>:
+
'''(1)'''&nbsp; The <u>correct solutions are 1 and 2</u>:
*Mit&nbsp; $X = \{0,\ 1,\ 2\}$,&nbsp; $Y = \{0,\ 1,\ 2\}$&nbsp; gilt&nbsp; $X + Y = \{0,\ 1,\ 2,\ 3,\ 4\}$.&nbsp;  
+
*With&nbsp; $X = \{0,\ 1,\ 2\}$,&nbsp; $Y = \{0,\ 1,\ 2\}$&nbsp;, &nbsp; $X + Y = \{0,\ 1,\ 2,\ 3,\ 4\}$ holds.&nbsp;  
*Auch die Wahrscheinlichkeiten stimmen mit der gegebenen Wahrscheinlichkeitsfunktion überein.  
+
*The probabilities also agree with the given probability function.
*Die Überprüfung der beiden anderen Vorgaben zeigt, dass auch $W = X Y + 2$ möglich ist, nicht jedoch $W = Y X + 2$.
+
*Checking the other two specifications shows that&nbsp; $W = X - Y + 2$&nbsp; is also possible, but not&nbsp; $W = Y - X + 2$.
  
  
 +
[[File:P_ID2769__Inf_A_3_7d.png|right|frame|To calculate the mutual information]]
  
'''(2)'''&nbsp; Aus der 2D–Wahrscheinlichkeitsfunktion&nbsp; $P_{ XW }(X, W)$&nbsp; auf der Angabenseite erhält man für
+
 
*die Verbundentropie:
+
'''(2)'''&nbsp; From the two-dimensional probability mass function&nbsp; $P_{ XW }(X, W)$&nbsp; in the specification section, one obtains for
 +
*the joint entropy:
 
:$$H(XW) =  {\rm log}_2 \hspace{0.1cm} (9)  
 
:$$H(XW) =  {\rm log}_2 \hspace{0.1cm} (9)  
 
= 3.170\ {\rm (bit)}
 
= 3.170\ {\rm (bit)}
 
\hspace{0.05cm},$$
 
\hspace{0.05cm},$$
* die Wahrsacheinlichkeitsfunktion der Zufallsgröße&nbsp; $W$:
+
* the probability function of the random variable&nbsp; $W$:
 
:$$P_W(W) = \big [\hspace{0.05cm}1/9\hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm},\hspace{0.15cm} 3/9 \hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm}, \hspace{0.15cm} 1/9\hspace{0.05cm} \big ]\hspace{0.05cm},$$
 
:$$P_W(W) = \big [\hspace{0.05cm}1/9\hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm},\hspace{0.15cm} 3/9 \hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm}, \hspace{0.15cm} 1/9\hspace{0.05cm} \big ]\hspace{0.05cm},$$
*die Entropie der Zufallsgröße $W$:
+
*the entropy of the random variable&nbsp; $W$:
 
:$$H(W) = 2 \cdot \frac{1}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{1} + 2 \cdot \frac{2}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{2} +
 
:$$H(W) = 2 \cdot \frac{1}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{1} + 2 \cdot \frac{2}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{2} +
 
\frac{3}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{3}
 
\frac{3}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{3}
 
  {= 2.197\ {\rm (bit)}} \hspace{0.05cm}.$$
 
  {= 2.197\ {\rm (bit)}} \hspace{0.05cm}.$$
  
Mit&nbsp; $H(X) = 1.585 \ \rm bit$&nbsp; (wurde vorgegeben) ergibt sich somit für die&nbsp; ''Mutual Information'':  
+
Thus, with&nbsp; $H(X) = 1.585 \ \rm bit$&nbsp; (as given), the mutual information is:
 
:$$I(X;W) = H(X) + H(W) - H(XW) = 1.585 + 2.197- 3.170\hspace{0.15cm} \underline {= 0.612\ {\rm (bit)}} \hspace{0.05cm}.$$
 
:$$I(X;W) = H(X) + H(W) - H(XW) = 1.585 + 2.197- 3.170\hspace{0.15cm} \underline {= 0.612\ {\rm (bit)}} \hspace{0.05cm}.$$
  
[[File:P_ID2769__Inf_A_3_7d.png|right|frame|Zur Berechnung der Transinformation]]
+
The left of the two diagrams illustrates the calculation of the mutual information&nbsp; $I(X; W)$&nbsp; between the first component&nbsp; $X$&nbsp; and the sum&nbsp; $W$.
Das linke der beiden Schaubilder verdeutlicht die Berechnung der Transinformation&nbsp; $I(X; W)$&nbsp; zwischen der ersten Komponente&nbsp; $X$&nbsp; und der Summe&nbsp; $W$.
 
<br clear=all>
 
[[File:P_ID2770__Inf_A_3_7c.png|right|Verbundwahrscheinlichkeit zwischen&nbsp; $Z$&nbsp; und&nbsp; $W$]]
 
'''(3)'''&nbsp;  Die zweite Grafik zeigt die Verbundwahrscheinlichkeit&nbsp; $P_{ ZW }(⋅)$.&nbsp; Das Schema besteht aus&nbsp; $5 · 9 = 45$&nbsp; Feldern im Gegensatz zur Darstellung von&nbsp; $P_{ XW }(⋅)$&nbsp; auf der Angabenseite mit&nbsp; $3 · 9 = 27$&nbsp; Feldern.
 
*Von den&nbsp; $45$&nbsp; Feldern sind aber auch nur neun mit Wahrscheinlichkeiten ungleich Null belegt.&nbsp; Für die Verbundentropie gilt: &nbsp; $H(ZW)  = 3.170\ {\rm (bit)} \hspace{0.05cm}.$
 
*Mit den weiteren Entropien&nbsp; $H(Z)  = 3.170\ {\rm (bit)}\hspace{0.05cm}$&nbsp; und&nbsp; $H(W)  = 2.197\ {\rm (bit)}\hspace{0.05cm}$&nbsp; entsprechend der&nbsp; [[Aufgaben:3.07Z_Tupel_aus_tern%C3%A4ren_Zufallsgr%C3%B6%C3%9Fen| Aufgabe 3.8Z]]&nbsp; bzw. der Teilfrage&nbsp; '''(2)'''&nbsp; dieser Aufgabe erhält man für die Transinformation:
 
:$$I(Z;W) = H(Z) + H(W) - H(ZW) \hspace{0.15cm} \underline {= 2.197\,{\rm (bit)}} \hspace{0.05cm}.$$
 
  
  
'''(4)'''&nbsp; <u>Alle drei Aussagen</u> treffen zu, wie auch aus dem rechten der beiden oberen Schaubilder ersichtlich ist.
 
  
Wir versuchen eine Interpretation dieser numerischen Ergebnisse:
+
[[File:P_ID2770__Inf_A_3_7c.png|right|frame|Joint probability between&nbsp; $Z$&nbsp; and&nbsp; $W$]]
* Die Verbundwahrscheinlichkeit&nbsp; $P_{ ZW }(⋅)$&nbsp; setzt sich ebenso wie&nbsp; $P_{ XW }(⋅)$&nbsp; aus neun gleichwahrscheinlichen Elementenungleich 0 zusammen. Damit ist offensichtlich, dass auch die Verbundentropien gleich sind &nbsp; ⇒ &nbsp; $H(ZW) =  H(XW) = 3.170 \ \rm (bit)$.   
+
'''(3)'''&nbsp;  The second graph shows the joint probability&nbsp; $P_{ ZW }(⋅)$.&nbsp;
* Wenn ich das Tupel&nbsp; $Z = (X, Y)$&nbsp; kenne, kenne ich natürlich auch die Summe&nbsp; $W = X + Y$.&nbsp; Damit ist&nbsp; $H(W|Z) = 0$.  
+
*The scheme consists of&nbsp; $5 · 9 = 45$&nbsp; fields, in contrast to the representation of&nbsp; $P_{ XW }(⋅)$&nbsp; in the problem statement with&nbsp; $3 · 5 = 15$&nbsp; fields.
*Dagegen ist&nbsp; $H(Z|W) \ne 0$.&nbsp; Vielmehr gilt&nbsp; $H(Z|W) = H(X|W) = 0.973  \ \rm (bit)$.
+
*However, of the&nbsp; $45$&nbsp; fields, only nine are also assigned non-zero probabilities.&nbsp; The following applies to the joint entropy:  &nbsp; $H(ZW)  = 3.170\ {\rm (bit)} \hspace{0.05cm}.$
* Die Zufallsgröße&nbsp; $W$&nbsp; liefert also die genau gleiche Information hinsichtlich des Tupels&nbsp; $Z$&nbsp; wie für die Einzelkomponente&nbsp; $X$.&nbsp; Dies ist die verbale Interpretation der Aussage&nbsp; $H(Z|W) = H(X|W)$.
+
*With the further entropies&nbsp; $H(Z)  = 3.170\ {\rm (bit)}\hspace{0.05cm}$&nbsp; and&nbsp; $H(W)  = 2.197\ {\rm (bit)}\hspace{0.05cm}$&nbsp; according to&nbsp; [[Aufgaben:Exercise_3.8Z:_Tuples_from_Ternary_Random_Variables| Exercise 3.8Z]]&nbsp; or the subquestion&nbsp; '''(2)'''&nbsp; of this exercise, one obtains for the mutual information:
* Die gemeinsame Information von&nbsp; $Z$&nbsp; und&nbsp; $W$&nbsp; &nbsp; ⇒ &nbsp; $I(Z; W)$&nbsp; ist größer als die gemeinsame Information von&nbsp; $X$&nbsp; und&nbsp; $W$  &nbsp; ⇒ &nbsp; $I(X; W)$, weil&nbsp; $H(W|Z) =0$&nbsp; gilt, während&nbsp; $H(W|X)$&nbsp; ungleich Null ist, nämlich genau so groß ist wie&nbsp; $H(X)$ :
+
:$$I(Z;W) = H(Z) + H(W) - H(ZW) \hspace{0.15cm} \underline {= 2.197\,{\rm (bit)}} \hspace{0.05cm}.$$
 +
<br clear=all>
 +
'''(4)'''&nbsp; <u>All three statements</u> are true, as can also be seen from the right-hand side of the two upper diagrams.&nbsp; We attempt an interpretation of these numerical results:
 +
* The joint probability&nbsp; $P_{ ZW }(⋅)$&nbsp;is composed, like&nbsp; $P_{ XW }(⋅)$&nbsp;, of nine equally probable elements unequal to 0.&nbsp; It is thus obvious that the joint entropies are also equal &nbsp; ⇒ &nbsp; $H(ZW) =  H(XW) = 3.170 \ \rm (bit)$.   
 +
* If I know the tuple&nbsp; $Z = (X, Y)$,&nbsp; I naturally also know the sum&nbsp; $W = X + Y$.&nbsp; Thus&nbsp; $H(W|Z) = 0$.  
 +
*In contrast,&nbsp; $H(Z|W) \ne 0$.&nbsp; Rather,&nbsp; $H(Z|W) = H(X|W) = 0.973  \ \rm (bit)$.
 +
*Knowing&nbsp; $W$&nbsp; therefore leaves exactly the same residual uncertainty about the tuple&nbsp; $Z$&nbsp; as about the individual component&nbsp; $X$.&nbsp; This is the verbal interpretation of the statement&nbsp; $H(Z|W) = H(X|W)$.
 +
*The mutual information of&nbsp; $Z$&nbsp; and&nbsp; $W$&nbsp; &nbsp; ⇒ &nbsp; $I(Z; W)$&nbsp; is greater than the mutual information of&nbsp; $X$&nbsp; and&nbsp; $W$  &nbsp; ⇒ &nbsp; $I(X; W)$,&nbsp; because&nbsp; $H(W|Z) =0$,&nbsp; while&nbsp; $H(W|X)$&nbsp; is non-zero, namely exactly as large as&nbsp; $H(X)$ :
 
:$$I(Z;W)  = H(W) - H(W|Z) = 2.197 - 0= 2.197\,{\rm (bit)} \hspace{0.05cm},$$
 
:$$I(Z;W)  = H(W) - H(W|Z) = 2.197 - 0= 2.197\,{\rm (bit)} \hspace{0.05cm},$$
 
:$$I(X;W) =  H(W) - H(W|X) = 2.197 - 1.585= 0.612\,{\rm (bit)} \hspace{0.05cm}.$$
 
:$$I(X;W) =  H(W) - H(W|X) = 2.197 - 1.585= 0.612\,{\rm (bit)} \hspace{0.05cm}.$$

Latest revision as of 14:50, 17 November 2022

2D functions $P_{ XY }$ and $P_{ XW }$

We consider the tuple  $Z = (X, Y)$, where the individual components  $X$  and  $Y$  each represent ternary random variables:

$$X = \{ 0 ,\ 1 ,\ 2 \} , \hspace{0.3cm}Y= \{ 0 ,\ 1 ,\ 2 \}.$$

The joint probability function  $P_{ XY }(X, Y)$  of both random variables is given in the upper graph. 

In Exercise 3.8Z this constellation is analyzed in detail. The results are (all values in bit):

  • $H(X) = H(Y) = \log_2 (3) = 1.585,$
  • $H(XY) = \log_2 (9) = 3.170,$
  • $I(X; Y) = 0,$
  • $H(Z) = H(XZ) = 3.170,$
  • $I(X; Z) = 1.585.$

Furthermore, we consider the random variable $W = \{ 0,\ 1,\ 2,\ 3,\ 4 \}$, whose properties follow from the joint probability function $P_{ XW }(X, W)$ shown in the lower sketch. The probabilities are zero in all fields with a white background.

This exercise asks for the mutual information between

  • the random variables  $X$  and  $W$   ⇒   $I(X; W)$,
  • the random variables $Z$ and $W$ ⇒ $I(Z; W)$.



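Hints:

  • The exercise belongs to the chapter Different entropies of two-dimensional random variables.
  • In particular, reference is made to the sections Conditional probability and conditional entropy and Mutual information between two random variables.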


Questions

1

How might the variables  $X$,  $Y$  and  $W$  be related?

$W = X + Y$,
$W = X - Y + 2$,
$W = Y - X + 2$.

2

What is the mutual information between the random variables  $X$  and  $W$?

$I(X; W) \ = \ $ ______ $\ \rm bit$

3

What is the mutual information between the random variables  $Z$  and  $W$?

$I(Z; W) \ = \ $ ______ $\ \rm bit$

4

Which of the following statements are true?

  $H(ZW) = H(XW)$  is true.
  $H(W|Z) = 0$  is true.
  $I(Z; W) > I(X; W)$  is true.


Solution

(1)  Proposed solutions 1 and 2 are correct:

  • With $X = \{0,\ 1,\ 2\}$ and $Y = \{0,\ 1,\ 2\}$, the sum takes the values $X + Y = \{0,\ 1,\ 2,\ 3,\ 4\}$.
  • The probabilities also agree with the given probability function.
  • Checking the other two options shows that $W = X - Y + 2$ is also possible, but not $W = Y - X + 2$.
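The check in (1) can be reproduced numerically. The following is a minimal sketch, not part of the original solution. It assumes that $X$ and $Y$ are independent and uniform on $\{0, 1, 2\}$ (consistent with $H(XY) = \log_2 (9)$ and $I(X, Y) = 0$ stated above) and that the $P_{XW}$ of the figure is the one produced by $W = X + Y$, as the solution states. The helper name `joint_xw` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def joint_xw(rule):
    """Joint PMF of (X, W) for W = rule(X, Y), with (X, Y) uniform on {0,1,2}^2."""
    counts = Counter((x, rule(x, y)) for x, y in product(range(3), repeat=2))
    return {pair: Fraction(n, 9) for pair, n in counts.items()}


# Assumption: the figure's P_XW equals the PMF generated by W = X + Y.
reference = joint_xw(lambda x, y: x + y)

candidates = {
    "W = X + Y": lambda x, y: x + y,
    "W = X - Y + 2": lambda x, y: x - y + 2,
    "W = Y - X + 2": lambda x, y: y - x + 2,
}

# A candidate is "possible" if it yields the same joint PMF of (X, W).
matches = {name: joint_xw(rule) == reference for name, rule in candidates.items()}

for name, ok in matches.items():
    print(f"{name}: {'matches P_XW' if ok else 'does not match P_XW'}")
```

Exact fractions are used so that the comparison of the two probability tables does not depend on floating-point rounding.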


To calculate the mutual information


(2)  From the two-dimensional probability mass function $P_{ XW }(X, W)$ given in the problem statement, one obtains

  • the joint entropy:
$$H(XW) = {\rm log}_2 \hspace{0.1cm} (9) = 3.170\ {\rm (bit)} \hspace{0.05cm},$$
  • the probability function of the random variable  $W$:
$$P_W(W) = \big [\hspace{0.05cm}1/9\hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm},\hspace{0.15cm} 3/9 \hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm}, \hspace{0.15cm} 1/9\hspace{0.05cm} \big ]\hspace{0.05cm},$$
  • the entropy of the random variable  $W$:
$$H(W) = 2 \cdot \frac{1}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{1} + 2 \cdot \frac{2}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{2} + \frac{3}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{3} {= 2.197\ {\rm (bit)}} \hspace{0.05cm}.$$

Thus, with $H(X) = 1.585 \ \rm bit$ (as given), the mutual information is:

$$I(X;W) = H(X) + H(W) - H(XW) = 1.585 + 2.197- 3.170\hspace{0.15cm} \underline {= 0.612\ {\rm (bit)}} \hspace{0.05cm}.$$

The left of the two diagrams illustrates the calculation of the mutual information  $I(X; W)$  between the first component  $X$  and the sum  $W$.
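The numbers of subtask (2) can be recomputed with a few lines of Python. This is an illustrative sketch under the assumption that $(X, Y)$ is uniform over the nine pairs and $W = X + Y$. The function name `entropy` and the variable names are chosen here for illustration and do not appear in the exercise.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(pmf):
    """Entropy in bit of a PMF given as {outcome: probability}."""
    return sum(p * log2(1 / p) for p in pmf.values() if p > 0)


# Assumption: (X, Y) uniform over the 9 pairs, W = X + Y.
pairs = list(product(range(3), repeat=2))

p_x = {x: n / 9 for x, n in Counter(x for x, _ in pairs).items()}
p_w = {w: n / 9 for w, n in Counter(x + y for x, y in pairs).items()}
p_xw = {xw: n / 9 for xw, n in Counter((x, x + y) for x, y in pairs).items()}

h_x = entropy(p_x)
h_w = entropy(p_w)
h_xw = entropy(p_xw)

# I(X;W) = H(X) + H(W) - H(XW), as in the equation above.
i_xw = h_x + h_w - h_xw

print(f"H(X)   = {h_x:.3f} bit")
print(f"H(W)   = {h_w:.3f} bit")
print(f"H(XW)  = {h_xw:.3f} bit")
print(f"I(X;W) = {i_xw:.3f} bit")
```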


Joint probability between  $Z$  and  $W$

(3)  The second graph shows the joint probability  $P_{ ZW }(⋅)$. 

  • The scheme consists of $5 · 9 = 45$ fields, in contrast to the representation of $P_{ XW }(⋅)$ in the problem statement with $3 · 5 = 15$ fields.
  • Here too, however, only nine of the $45$ fields have non-zero probabilities. The joint entropy is therefore: $H(ZW) = 3.170\ {\rm (bit)} \hspace{0.05cm}.$
  • With the further entropies $H(Z) = 3.170\ {\rm (bit)}\hspace{0.05cm}$ and $H(W) = 2.197\ {\rm (bit)}\hspace{0.05cm}$ from Exercise 3.8Z and subtask (2) of this exercise, respectively, one obtains for the mutual information:
$$I(Z;W) = H(Z) + H(W) - H(ZW) \hspace{0.15cm} \underline {= 2.197\,{\rm (bit)}} \hspace{0.05cm}.$$
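The same computation for the tuple $Z$ can be sketched as follows, again assuming that $Z = (X, Y)$ is uniform over nine values and $W = X + Y$. The names `p_z`, `p_zw` and `n_fields` are illustrative only.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(pmf):
    """Entropy in bit of a PMF given as {outcome: probability}."""
    return sum(p * log2(1 / p) for p in pmf.values() if p > 0)


# Assumption: Z = (X, Y) uniform over 9 tuples, W = X + Y.
tuples = list(product(range(3), repeat=2))

p_z = {z: 1 / 9 for z in tuples}
p_w = {w: n / 9 for w, n in Counter(x + y for x, y in tuples).items()}
# Each tuple z determines exactly one w, so only 9 fields are occupied.
p_zw = {(z, z[0] + z[1]): 1 / 9 for z in tuples}

n_fields = len(p_z) * len(p_w)   # size of the full Z-by-W grid
n_nonzero = len(p_zw)            # occupied fields

i_zw = entropy(p_z) + entropy(p_w) - entropy(p_zw)

print(f"{n_nonzero} of {n_fields} fields are non-zero")
print(f"I(Z;W) = {i_zw:.3f} bit")
```

Since $H(Z)$ and $H(ZW)$ cancel, the result equals $H(W)$.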


(4)  All three statements are true, as can also be seen from the right-hand one of the two diagrams above. We attempt an interpretation of these numerical results:

  • Like $P_{ XW }(⋅)$, the joint probability $P_{ ZW }(⋅)$ is composed of nine equally probable non-zero elements. It is thus obvious that the joint entropies are also equal ⇒ $H(ZW) = H(XW) = 3.170 \ \rm (bit)$.
  • If I know the tuple  $Z = (X, Y)$,  I naturally also know the sum  $W = X + Y$.  Thus  $H(W|Z) = 0$.
  • In contrast,  $H(Z|W) \ne 0$.  Rather,  $H(Z|W) = H(X|W) = 0.973 \ \rm (bit)$.
  • Knowing $W$ therefore leaves exactly the same residual uncertainty about the tuple $Z$ as about the individual component $X$. This is the verbal interpretation of the statement $H(Z|W) = H(X|W)$.
  • The mutual information of $Z$ and $W$ ⇒ $I(Z; W)$ is greater than the mutual information of $X$ and $W$ ⇒ $I(X; W)$, because $H(W|Z) = 0$, while $H(W|X)$ is non-zero, namely exactly as large as $H(X)$:
$$I(Z;W) = H(W) - H(W|Z) = 2.197 - 0= 2.197\,{\rm (bit)} \hspace{0.05cm},$$
$$I(X;W) = H(W) - H(W|X) = 2.197 - 1.585= 0.612\,{\rm (bit)} \hspace{0.05cm}.$$
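The conditional entropies used in this interpretation can be checked with the chain rule $H(A|B) = H(AB) - H(B)$. The sketch below assumes, as before, uniform $(X, Y)$ and $W = X + Y$. The helpers `entropy` and `marginal` are hypothetical names introduced for this example.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(pmf):
    """Entropy in bit of a PMF given as {outcome: probability}."""
    return sum(p * log2(1 / p) for p in pmf.values() if p > 0)


def marginal(joint, index):
    """Marginal PMF of component `index` of a joint PMF {(a, b): p}."""
    out = Counter()
    for key, p in joint.items():
        out[key[index]] += p
    return dict(out)


# Assumption: (X, Y) uniform over 9 pairs, Z = (X, Y), W = X + Y.
pairs = list(product(range(3), repeat=2))
p_zw = {((x, y), x + y): 1 / 9 for x, y in pairs}
p_xw = {(x, x + y): 1 / 9 for x, y in pairs}

h_w = entropy(marginal(p_zw, 1))

# Chain rule: H(A|B) = H(AB) - H(B).
h_w_given_z = entropy(p_zw) - entropy(marginal(p_zw, 0))
h_w_given_x = entropy(p_xw) - entropy(marginal(p_xw, 0))
h_z_given_w = entropy(p_zw) - h_w
h_x_given_w = entropy(p_xw) - h_w

# Mutual information as H(W) minus the conditional entropy of W.
i_zw = h_w - h_w_given_z
i_xw = h_w - h_w_given_x

print(f"H(W|Z) = {h_w_given_z:.3f}, H(W|X) = {h_w_given_x:.3f}")
print(f"H(Z|W) = {h_z_given_w:.3f}, H(X|W) = {h_x_given_w:.3f}")
print(f"I(Z;W) = {i_zw:.3f}, I(X;W) = {i_xw:.3f}")
```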