Difference between revisions of "Aufgaben:Exercise 3.8: Once more Mutual Information"

Latest revision as of 14:50, 17 November 2022

Figure: 2D functions $P_{ XY }$ and $P_{ XW }$

We consider the tuple $Z = (X, Y)$, where the individual components $X$ and $Y$ each represent ternary random variables:

$$X = \{ 0 ,\ 1 ,\ 2 \} , \hspace{0.3cm}Y= \{ 0 ,\ 1 ,\ 2 \}.$$

The joint probability function $P_{ XY }(X, Y)$ of both random variables is given in the upper graph.

This constellation is analyzed in detail in Exercise 3.8Z, which yields the following results (all values in bit):

$H(X) = H(Y) = \log_2 (3) = 1.585,$
$H(XY) = \log_2 (9) = 3.170,$
$I(X; Y) = 0,$
$H(Z) = H(XZ) = 3.170,$
$I(X; Z) = 1.585.$
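These values can be reproduced numerically. The following is a minimal sketch that assumes $P_{ XY }$ is uniform over the nine pairs, which is the only distribution consistent with $H(XY) = \log_2 (9)$. The helper names `entropy` and `marginal` are illustrative and not part of the exercise.

```python
from collections import defaultdict
from math import log2

def entropy(pmf):
    """Entropy in bit of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def marginal(joint, index):
    """Marginal pmf of one component of a joint pmf keyed by tuples."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[outcome[index]] += p
    return dict(out)

# Assumption: P_XY is uniform, i.e. 1/9 for each pair (x, y).
p_xy = {(x, y): 1 / 9 for x in range(3) for y in range(3)}

h_x = entropy(marginal(p_xy, 0))
h_y = entropy(marginal(p_xy, 1))
h_xy = entropy(p_xy)
i_xy = h_x + h_y - h_xy

# Z = (X, Y) already contains X, so P_XZ has the same nine
# non-zero entries as P_XY.
p_xz = {(x, (x, y)): p for (x, y), p in p_xy.items()}
h_z = entropy(marginal(p_xz, 1))
h_xz = entropy(p_xz)
i_xz = h_x + h_z - h_xz

print(f"H(X) = {h_x:.3f}, H(XY) = {h_xy:.3f}, H(Z) = {h_z:.3f}")
print(f"|I(X;Y)| = {abs(i_xy):.3f}, I(X;Z) = {i_xz:.3f}")
```

Since $X$ and $Y$ are independent under this assumption, $I(X; Y)$ vanishes up to floating-point rounding, while $I(X; Z)$ equals $H(X)$ because $Z$ determines $X$.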

Furthermore, we consider the random variable $W = \{ 0,\ 1,\ 2,\ 3,\ 4 \}$, whose properties result from the joint probability function $P_{ XW }(X, W)$ shown in the lower sketch. The probabilities are zero in all fields with a white background.

What is sought in the present exercise is the mutual information between

the random variables $X$ and $W$ ⇒ $I(X; W)$,
the random variables $Z$ and $W$ ⇒ $I(Z; W)$.

Hints:

The exercise belongs to the chapter Different entropies of two-dimensional random variables.
In particular, reference is made to the sections
Conditional probability and conditional entropy as well as
Mutual information between two random variables.

Questions

Solution

(1) The correct solutions are 1 and 2:

With $X = \{0,\ 1,\ 2\}$ and $Y = \{0,\ 1,\ 2\}$, the sum takes the values $X + Y = \{0,\ 1,\ 2,\ 3,\ 4\}$.
The probabilities also agree with the given probability function.
Checking the other two specifications shows that $W = X - Y + 2$ is also possible, but not $W = Y - X + 2$.
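The check of the three candidate relations can be sketched in a few lines. The sketch assumes that $(X, Y)$ is uniform on the nine pairs and takes the joint function generated by $W = X + Y$ as the reference for $P_{ XW }$, since the figure itself is not reproduced here. The names `joint_xw`, `candidates` and `matches` are illustrative.

```python
from collections import Counter
from fractions import Fraction

def joint_xw(relation):
    """Joint pmf of (X, W) for W = relation(X, Y), (X, Y) uniform on {0,1,2}^2."""
    counts = Counter((x, relation(x, y)) for x in range(3) for y in range(3))
    return {xw: Fraction(n, 9) for xw, n in counts.items()}

# Assumption: the P_XW of the exercise is the one produced by W = X + Y.
reference = joint_xw(lambda x, y: x + y)

candidates = {
    "W = X + Y": lambda x, y: x + y,
    "W = X - Y + 2": lambda x, y: x - y + 2,
    "W = Y - X + 2": lambda x, y: y - x + 2,
}

# A candidate is consistent if it yields exactly the same joint pmf of (X, W).
matches = {name: joint_xw(f) == reference for name, f in candidates.items()}
for name, ok in matches.items():
    print(name, "->", "consistent" if ok else "not consistent")
```

All three relations give the same marginal $P_W$; they differ only in which values of $W$ can occur together with a given $X$, which is why the comparison is made on the joint function rather than on $P_W$ alone.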

Figure: Calculation of the mutual information

(2) From the two-dimensional probability function $P_{ XW }(X, W)$ given in the problem statement, one obtains

the joint entropy:

$$H(XW) = {\rm log}_2 \hspace{0.1cm} (9) = 3.170\ {\rm (bit)} \hspace{0.05cm},$$

the probability function of the random variable $W$:

$$P_W(W) = \big [\hspace{0.05cm}1/9\hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm},\hspace{0.15cm} 3/9 \hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm}, \hspace{0.15cm} 1/9\hspace{0.05cm} \big ]\hspace{0.05cm},$$

the entropy of the random variable $W$:

$$H(W) = 2 \cdot \frac{1}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{1} + 2 \cdot \frac{2}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{2} + \frac{3}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{3} {= 2.197\ {\rm (bit)}} \hspace{0.05cm}.$$

With $H(X) = 1.585 \ \rm bit$ (given), the mutual information is therefore:

$$I(X;W) = H(X) + H(W) - H(XW) = 1.585 + 2.197- 3.170\hspace{0.15cm} \underline {= 0.612\ {\rm (bit)}} \hspace{0.05cm}.$$

The left of the two diagrams illustrates the calculation of the mutual information $I(X; W)$ between the first component $X$ and the sum $W$.
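The numbers of subtask (2) can be checked with a short script. This is a sketch under the assumption that $P_{ XW }$ results from $W = X + Y$ with $(X, Y)$ uniform on the nine pairs; the variable names are illustrative.

```python
from collections import defaultdict
from math import log2

def entropy(pmf):
    """Entropy in bit of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

# Assumption: W = X + Y, each pair (x, y) has probability 1/9.
p_xw = defaultdict(float)
for x in range(3):
    for y in range(3):
        p_xw[(x, x + y)] += 1 / 9

# Marginals of X and W from the joint function.
p_x = defaultdict(float)
p_w = defaultdict(float)
for (x, w), p in p_xw.items():
    p_x[x] += p
    p_w[w] += p

h_x, h_w, h_xw = entropy(p_x), entropy(p_w), entropy(p_xw)
i_xw = h_x + h_w - h_xw
print(f"H(W) = {h_w:.3f}, H(XW) = {h_xw:.3f}, I(X;W) = {i_xw:.3f}")
```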

Figure: Joint probability of $Z$ and $W$

(3) The second graph shows the joint probability $P_{ ZW }(⋅)$.

The scheme consists of $5 · 9 = 45$ fields, in contrast to the plot of $P_{ XW }(⋅)$ in the problem statement with $3 · 5 = 15$ fields.
Of the $45$ fields, however, only nine have non-zero probabilities. The joint entropy is therefore $H(ZW) = 3.170\ {\rm (bit)} \hspace{0.05cm}.$
With the further entropies $H(Z) = 3.170\ {\rm (bit)}\hspace{0.05cm}$ from Exercise 3.8Z and $H(W) = 2.197\ {\rm (bit)}\hspace{0.05cm}$ from subtask (2) of this exercise, one obtains for the mutual information:

$$I(Z;W) = H(Z) + H(W) - H(ZW) = 3.170 + 2.197 - 3.170 \hspace{0.15cm} \underline {= 2.197\,{\rm (bit)}} \hspace{0.05cm}.$$
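The field count and the result for $I(Z; W)$ can be reproduced as follows. The sketch again assumes $W = X + Y$ with $(X, Y)$ uniform on the nine pairs; `p_zw`, `nonzero_cells` and `total_cells` are illustrative names.

```python
from math import log2

def entropy(probs):
    """Entropy in bit of an iterable of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Assumption: (X, Y) uniform on {0,1,2}^2 and W = X + Y.
# P_ZW is keyed by (z, w) with z = (x, y); only these nine cells are non-zero.
p_zw = {((x, y), x + y): 1 / 9 for x in range(3) for y in range(3)}

# Marginal of W, accumulated over all tuples z.
p_w = {}
for (_, w), p in p_zw.items():
    p_w[w] = p_w.get(w, 0.0) + p

nonzero_cells = len(p_zw)
total_cells = 9 * len(p_w)  # nine tuples z times five values of w

h_z = log2(9)  # Z is uniform over nine tuples
h_w = entropy(p_w.values())
h_zw = entropy(p_zw.values())
i_zw = h_z + h_w - h_zw
print(nonzero_cells, "of", total_cells, "cells are non-zero")
print(f"I(Z;W) = {i_zw:.3f} bit")
```

Because $H(ZW) = H(Z)$ here, the mutual information reduces to $H(W)$.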

(4) All three statements are true, as can also be seen from the right-hand one of the two diagrams above. We attempt an interpretation of these numerical results:

Like $P_{ XW }(⋅)$, the joint probability $P_{ ZW }(⋅)$ consists of nine equally probable non-zero elements. The joint entropies are therefore equal ⇒ $H(ZW) = H(XW) = 3.170 \ \rm (bit)$.
If the tuple $Z = (X, Y)$ is known, the sum $W = X + Y$ is known as well. Thus $H(W|Z) = 0$.
In contrast, $H(Z|W) \ne 0$. Rather, $H(Z|W) = H(ZW) - H(W) = 3.170 - 2.197 = 0.973 \ \rm (bit)$, and likewise $H(X|W) = H(XW) - H(W) = 0.973 \ \rm (bit)$.
Given $W$, the component $X$ determines $Y = W - X$ and hence the whole tuple $Z$. Knowing $W$ therefore leaves exactly the same residual uncertainty about the tuple $Z$ as about the individual component $X$. This is the verbal interpretation of the statement $H(Z|W) = H(X|W)$.
The mutual information of $Z$ and $W$ ⇒ $I(Z; W)$ is greater than the mutual information of $X$ and $W$ ⇒ $I(X; W)$, because $H(W|Z) = 0$, while $H(W|X) = H(XW) - H(X) = 3.170 - 1.585 = 1.585 \ \rm (bit)$ is non-zero and exactly as large as $H(X)$:

$$I(Z;W) = H(W) - H(W|Z) = 2.197 - 0= 2.197\,{\rm (bit)} \hspace{0.05cm},$$

$$I(X;W) = H(W) - H(W|X) = 2.197 - 1.585= 0.612\,{\rm (bit)} \hspace{0.05cm}.$$
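The conditional entropies used in this interpretation can also be computed directly from their definition, $H(A|B) = \sum P_{AB}(a, b) \cdot \log_2 \big[ P_B(b) / P_{AB}(a, b) \big]$, rather than from differences of entropies. The sketch below does this under the assumption $W = X + Y$ with $(X, Y)$ uniform on the nine pairs; `cond_entropy` is an illustrative helper.

```python
from collections import defaultdict
from math import log2

def cond_entropy(joint):
    """H(A|B) in bit for a joint pmf {(a, b): p}, computed from the definition."""
    p_b = defaultdict(float)
    for (_, b), p in joint.items():
        p_b[b] += p
    return sum(p * log2(p_b[b] / p) for (_, b), p in joint.items() if p > 0)

# Assumption: (X, Y) uniform on {0,1,2}^2 and W = X + Y.
pairs = [(x, y) for x in range(3) for y in range(3)]

h_w_given_z = cond_entropy({(x + y, (x, y)): 1 / 9 for x, y in pairs})
h_z_given_w = cond_entropy({((x, y), x + y): 1 / 9 for x, y in pairs})
h_x_given_w = cond_entropy({(x, x + y): 1 / 9 for x, y in pairs})
h_w_given_x = cond_entropy({(x + y, x): 1 / 9 for x, y in pairs})

h_w = 2.197  # entropy of W in bit, taken from subtask (2)
print(f"H(W|Z) = {h_w_given_z:.3f}, H(Z|W) = {h_z_given_w:.3f}")
print(f"H(X|W) = {h_x_given_w:.3f}, H(W|X) = {h_w_given_x:.3f}")
print(f"I(Z;W) = {h_w - h_w_given_z:.3f}, I(X;W) = {h_w - h_w_given_x:.3f}")
```

The dictionary comprehensions are safe here because every key is unique: for a given $x$, the value of $x + y$ identifies $y$.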

@@ Line 1: / Line 1: @@
-{{quiz-Header|Buchseite=Informationstheorie/Verschiedene Entropien zweidimensionaler Zufallsgrößen
+{{quiz-Header|Buchseite=Information_Theory/Different_Entropy_Measures_of_Two-Dimensional_Random_Variables
 }}
-[[File:P_ID2768__Inf_A_3_7_neu.png|right|frame|"Wahrscheinlichkeiten" $P_{ XY }$ &nbsp;und&nbsp; $P_{ XW }$]]
+[[File:P_ID2768__Inf_A_3_7_neu.png|right|frame|2D&ndash;Functions&nbsp; <br>&nbsp;$P_{ XY }$&nbsp; and&nbsp; $P_{ XW }$]]
-Wir betrachten das Tupel&nbsp; $Z = (X, Y)$, wobei die Einzelkomponenten&nbsp; $X$&nbsp; und&nbsp; $Y$&nbsp; jeweils ternäre Zufallsgrößen darstellen:
+We consider the tuple&nbsp; $Z = (X, Y)$, where the individual components&nbsp; $X$&nbsp; and&nbsp; $Y$&nbsp; each represent ternary random variables:
 :$$X = \{ 0 ,\ 1 ,\ 2 \} , \hspace{0.3cm}Y= \{ 0 ,\ 1 ,\ 2 \}.$$
-Die gemeinsame Wahrscheinlichkeitsfunktion&nbsp; $P_{ XY }(X, Y)$&nbsp; beider Zufallsgrößen ist in der oberen Grafik angegeben.&nbsp;
+The joint probability function&nbsp; $P_{ XY }(X, Y)$&nbsp; of both random variables is given in the upper graph.&nbsp;
-In der&nbsp; [[Aufgaben:3.07Z_Tupel_aus_tern%C3%A4ren_Zufallsgr%C3%B6%C3%9Fen|Aufgabe 3.8Z]]&nbsp; wird diese Konstellation ausführlich analysiert.&nbsp; Man erhält als Ergebnis (alle Angaben in "bit"):
+In&nbsp; [[Aufgaben:Exercise_3.8Z:_Tuples_from_Ternary_Random_Variables|Exercise 3.8Z]]&nbsp; this constellation is analyzed in detail.&nbsp; One obtains as a result (all data in "bit"):
 :* $H(X) = H(Y) = \log_2 (3) = 1.585,$
 :* $H(XY) = \log_2 (9) = 3.170,$
@@ Line 16: / Line 16: @@
 :* $I(X, Z) = 1.585.$
-Desweiteren betrachten wir die Zufallsgröße&nbsp; $W = \{ 0,\ 1,\ 2,\ 3,\ 4 \}$, deren Eigenschaften sich aus der Verbundwahrscheinlichkeitsfunktion&nbsp; $P_{ XW }(X, W)$&nbsp; nach der unteren Skizze ergeben.&nbsp; Die Wahrscheinlichkeiten sind in allen weiß hinterlegten Feldern jeweils Null.
+Furthermore, we consider the random variable&nbsp; $W = \{ 0,\ 1,\ 2,\ 3,\ 4 \}$, whose properties result from the joint probability function&nbsp; $P_{ XW }(X, W)$&nbsp; shown in the lower sketch.&nbsp; The probabilities are zero in all fields with a white background.
-Gesucht ist in der vorliegenden Aufgabe die Transinformation zwischen
+What is sought in the present exercise is the mutual information between
-:*den Zufallsgrößen&nbsp; $X$&nbsp; und&nbsp; $W$ &nbsp; ⇒ &nbsp;  $I(X; W)$,
+:*the random variables&nbsp; $X$&nbsp; and&nbsp; $W$ &nbsp; ⇒ &nbsp;  $I(X; W)$,
-:* den Zufallsgrößen&nbsp; $Z$&nbsp; und&nbsp; $W &nbsp; ⇒ &nbsp; I(Z; W)$.
+:*the random variables&nbsp; $Z$&nbsp; and&nbsp; $W$ &nbsp; ⇒ &nbsp; $I(Z; W)$.
@@ Line 26: / Line 26: @@
-''Hinweise:''
+Hints:
-*Die Aufgabe gehört zum  Kapitel&nbsp; [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen|Verschiedene Entropien zweidimensionaler Zufallsgrößen]].
+*The exercise belongs to the chapter&nbsp; [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen|Different entropies of two-dimensional random variables]].
-*Insbesondere wird Bezug genommen auf die Seiten <br> &nbsp; &nbsp;  [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Bedingte_Wahrscheinlichkeit_und_bedingte_Entropie|Bedingte Wahrscheinlichkeit und bedingte Entropie]] sowie <br> &nbsp; &nbsp;  [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Transinformation_zwischen_zwei_Zufallsgr.C3.B6.C3.9Fen|Transinformation zwischen zwei Zufallsgrößen]].
+*In particular, reference is made to the sections <br> &nbsp; &nbsp;  [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Conditional_probability_and_conditional_entropy|Conditional probability and conditional entropy]] as well as <br> &nbsp; &nbsp;  [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Mutual_information_between_two_random_variables|Mutual information between two random variables]].
-===Fragebogen===
+===Questions===
 <quiz display=simple>
-{Wie könnten die Größen&nbsp; $X$,&nbsp; $Y$&nbsp; und&nbsp; $W$&nbsp; zusammenhängen?
+{How might the variables&nbsp; $X$,&nbsp; $Y$&nbsp; and&nbsp; $W$&nbsp; be related?
 |type="[]"}
 + $W = X + Y$,
@@ Line 41: / Line 41: @@
 -$W = Y - X + 2$.
-{Welche Transinformation besteht zwischen den Zufallsgrößen&nbsp; $X$&nbsp; und&nbsp; $W$?
+{What is the mutual information between the random variables&nbsp; $X$&nbsp; and&nbsp; $W$?
 |type="{}"}
 $I(X; W) \ = \ $ { 0.612 3%  } $\ \rm bit$
-{Welche Transinformation besteht zwischen den Zufallsgrößen&nbsp; $Z$&nbsp; und&nbsp; $W$?
+{What is the mutual information between the random variables&nbsp; $Z$&nbsp; and&nbsp; $W$?
 |type="{}"}
 $I(Z; W) \ = \ $ { 2.197 3%  } $\ \rm bit$
-{Welche der nachfolgenden Aussagen sind zutreffend?
+{Which of the following statements are true?
 |type="[]"}
-+ Es gilt&nbsp; $H(ZW) = H(XW)$.
++ &nbsp; $H(ZW) = H(XW)$&nbsp; is true.
-+ Es gilt&nbsp; $H(W|Z) = 0$.
++ &nbsp; $H(W|Z) = 0$&nbsp; is true.
-+ Es gilt&nbsp; $I(Z; W) > I(X; W)$.
++ &nbsp; $I(Z; W) > I(X; W)$&nbsp; is true.
 </quiz>
-===Musterlösung===
+===Solution===
 {{ML-Kopf}}
-'''(1)'''&nbsp; Richtig sind die <u>Lösungsvorschläge 1 und 2</u>:
+'''(1)'''&nbsp; The <u>correct solutions are 1 and 2</u>:
-*Mit&nbsp; $X = \{0,\ 1,\ 2\}$,&nbsp; $Y = \{0,\ 1,\ 2\}$&nbsp; gilt&nbsp; $X + Y = \{0,\ 1,\ 2,\ 3,\ 4\}$.&nbsp;
+*With&nbsp; $X = \{0,\ 1,\ 2\}$,&nbsp; $Y = \{0,\ 1,\ 2\}$&nbsp;, &nbsp; $X + Y = \{0,\ 1,\ 2,\ 3,\ 4\}$ holds.&nbsp;
-*Auch die Wahrscheinlichkeiten stimmen mit der gegebenen Wahrscheinlichkeitsfunktion überein.
+*The probabilities also agree with the given probability function.
-*Die Überprüfung der beiden anderen Vorgaben zeigt, dass auch $W = X – Y + 2$ möglich ist, nicht jedoch $W = Y – X + 2$.
+*Checking the other two specifications shows that&nbsp; $W = X - Y + 2$&nbsp; is also possible, but not&nbsp; $W = Y - X + 2$.
+[[File:P_ID2769__Inf_A_3_7d.png|right|frame|To calculate the mutual information]]
-'''(2)'''&nbsp; Aus der 2D–Wahrscheinlichkeitsfunktion&nbsp; $P_{ XW }(X, W)$&nbsp; auf der Angabenseite erhält man für
-*die Verbundentropie:
+'''(2)'''&nbsp; From the two-dimensional probability mass function&nbsp; $P_{ XW }(X, W)$&nbsp; in the specification section, one obtains for
+*the joint entropy:
 :$$H(XW) =  {\rm log}_2 \hspace{0.1cm} (9)
 = 3.170\ {\rm (bit)}
 	\hspace{0.05cm},$$
-* die Wahrsacheinlichkeitsfunktion der Zufallsgröße&nbsp; $W$:
+* the probability function of the random variable&nbsp; $W$:
 :$$P_W(W) = \big [\hspace{0.05cm}1/9\hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm},\hspace{0.15cm} 3/9 \hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm}, \hspace{0.15cm} 1/9\hspace{0.05cm} \big ]\hspace{0.05cm},$$
-*die Entropie der Zufallsgröße $W$:
+*the entropy of the random variable&nbsp; $W$:
 :$$H(W) = 2 \cdot \frac{1}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{1} + 2 \cdot \frac{2}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{2} +
 \frac{3}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{3}
   {= 2.197\ {\rm (bit)}} \hspace{0.05cm}.$$
-Mit&nbsp; $H(X) = 1.585 \ \rm bit$&nbsp; (wurde vorgegeben) ergibt sich somit für die&nbsp; ''Mutual Information'':
+With&nbsp; $H(X) = 1.585 \ \rm bit$&nbsp; (given), the&nbsp; mutual information&nbsp; is therefore:
 :$$I(X;W) = H(X) + H(W) - H(XW) = 1.585 + 2.197- 3.170\hspace{0.15cm} \underline {= 0.612\ {\rm (bit)}} \hspace{0.05cm}.$$
-[[File:P_ID2769__Inf_A_3_7d.png|right|frame|Zur Berechnung der Transinformation]]
+The left of the two diagrams illustrates the calculation of the mutual information&nbsp; $I(X; W)$&nbsp; between the first component&nbsp; $X$&nbsp; and the sum&nbsp; $W$.
-Das linke der beiden Schaubilder verdeutlicht die Berechnung der Transinformation&nbsp; $I(X; W)$&nbsp; zwischen der ersten Komponente&nbsp; $X$&nbsp; und der Summe&nbsp; $W$.
-<br clear=all>
-[[File:P_ID2770__Inf_A_3_7c.png|right|Verbundwahrscheinlichkeit zwischen&nbsp; $Z$&nbsp; und&nbsp; $W$]]
-'''(3)'''&nbsp;  Die zweite Grafik zeigt die Verbundwahrscheinlichkeit&nbsp; $P_{ ZW }(⋅)$.&nbsp; Das Schema besteht aus&nbsp; $5 · 9 = 45$&nbsp; Feldern im Gegensatz zur Darstellung von&nbsp; $P_{ XW }(⋅)$&nbsp; auf der Angabenseite mit&nbsp; $3 · 9 = 27$&nbsp; Feldern.
-*Von den&nbsp; $45$&nbsp; Feldern sind aber auch nur neun mit Wahrscheinlichkeiten ungleich Null belegt.&nbsp; Für die Verbundentropie gilt: &nbsp; $H(ZW)  = 3.170\ {\rm (bit)} \hspace{0.05cm}.$
-*Mit den weiteren Entropien&nbsp; $H(Z)  = 3.170\ {\rm (bit)}\hspace{0.05cm}$&nbsp; und&nbsp; $H(W)  = 2.197\ {\rm (bit)}\hspace{0.05cm}$&nbsp; entsprechend der&nbsp; [[Aufgaben:3.07Z_Tupel_aus_tern%C3%A4ren_Zufallsgr%C3%B6%C3%9Fen| Aufgabe 3.8Z]]&nbsp; bzw. der Teilfrage&nbsp; '''(2)'''&nbsp; dieser Aufgabe erhält man für die Transinformation:
-:$$I(Z;W) = H(Z) + H(W) - H(ZW) \hspace{0.15cm} \underline {= 2.197\,{\rm (bit)}} \hspace{0.05cm}.$$
-'''(4)'''&nbsp; <u>Alle drei Aussagen</u> treffen zu, wie auch aus dem rechten der beiden oberen Schaubilder ersichtlich ist.
-Wir versuchen eine Interpretation dieser numerischen Ergebnisse:
+[[File:P_ID2770__Inf_A_3_7c.png|right|frame|Joint probability between&nbsp; $Z$&nbsp; and&nbsp; $W$]]
-* Die Verbundwahrscheinlichkeit&nbsp; $P_{ ZW }(⋅)$&nbsp; setzt sich ebenso wie&nbsp; $P_{ XW }(⋅)$&nbsp; aus neun gleichwahrscheinlichen Elementenungleich 0 zusammen. Damit ist offensichtlich, dass auch die Verbundentropien gleich sind &nbsp; ⇒ &nbsp; $H(ZW) =  H(XW) = 3.170 \ \rm (bit)$.
+'''(3)'''&nbsp;  The second graph shows the joint probability&nbsp; $P_{ ZW }(⋅)$.&nbsp;
-* Wenn ich das Tupel&nbsp; $Z = (X, Y)$&nbsp; kenne, kenne ich natürlich auch die Summe&nbsp; $W = X + Y$.&nbsp; Damit ist&nbsp; $H(W|Z) = 0$.
+*The scheme consists of&nbsp; $5 · 9 = 45$&nbsp; fields, in contrast to the plot of&nbsp; $P_{ XW }(⋅)$&nbsp; in the problem statement with&nbsp; $3 · 5 = 15$&nbsp; fields.
-*Dagegen ist&nbsp; $H(Z|W) \ne 0$.&nbsp; Vielmehr gilt&nbsp; $H(Z|W) = H(X|W) = 0.973   \ \rm (bit)$.
+*However, of the&nbsp; $45$&nbsp; fields, only nine are also assigned non-zero probabilities.&nbsp; The following applies to the joint entropy:  &nbsp; $H(ZW)  = 3.170\ {\rm (bit)} \hspace{0.05cm}.$
-* Die Zufallsgröße&nbsp; $W$&nbsp; liefert also die genau gleiche Information hinsichtlich des Tupels&nbsp; $Z$&nbsp; wie für die Einzelkomponente&nbsp; $X$.&nbsp; Dies ist die verbale Interpretation der Aussage&nbsp; $H(Z|W) = H(X|W)$.
+*With the further entropies&nbsp; $H(Z)  = 3.170\ {\rm (bit)}\hspace{0.05cm}$&nbsp; and&nbsp; $H(W)  = 2.197\ {\rm (bit)}\hspace{0.05cm}$&nbsp; according to&nbsp; [[Aufgaben:Exercise_3.8Z:_Tuples_from_Ternary_Random_Variables| Exercise 3.8Z]]&nbsp; or the subquestion&nbsp; '''(2)'''&nbsp; of this exercise, one obtains for the mutual information:
-* Die gemeinsame Information von&nbsp; $Z$&nbsp; und&nbsp; $W$&nbsp; &nbsp; ⇒ &nbsp; $I(Z; W)$&nbsp; ist größer als die gemeinsame Information von&nbsp; $X$&nbsp; und&nbsp; $W$  &nbsp; ⇒ &nbsp; $I(X; W)$, weil&nbsp; $H(W|Z) =0$&nbsp; gilt, während&nbsp; $H(W|X)$&nbsp; ungleich Null ist, nämlich genau so groß ist wie&nbsp; $H(X)$ :
+:$$I(Z;W) = H(Z) + H(W) - H(ZW) \hspace{0.15cm} \underline {= 2.197\,{\rm (bit)}} \hspace{0.05cm}.$$
+<br clear=all>
+'''(4)'''&nbsp; <u>All three statements</u> are true, as can also be seen from the right-hand side of the two upper diagrams.&nbsp; We attempt an interpretation of these numerical results:
+* The joint probability&nbsp; $P_{ ZW }(⋅)$&nbsp;is composed, like&nbsp; $P_{ XW }(⋅)$&nbsp;, of nine equally probable elements unequal to 0.&nbsp; It is thus obvious that the joint entropies are also equal &nbsp; ⇒ &nbsp; $H(ZW) =  H(XW) = 3.170 \ \rm (bit)$.
+* If I know the tuple&nbsp; $Z = (X, Y)$,&nbsp; I naturally also know the sum&nbsp; $W = X + Y$.&nbsp; Thus&nbsp; $H(W|Z) = 0$.
+*In contrast,&nbsp; $H(Z|W) \ne 0$.&nbsp; Rather,&nbsp; $H(Z|W) = H(X|W) = 0.973   \ \rm (bit)$.
+*Given&nbsp; $W$,&nbsp; the component&nbsp; $X$&nbsp; determines&nbsp; $Y = W - X$&nbsp; and hence the whole tuple&nbsp; $Z$.&nbsp; Knowing&nbsp; $W$&nbsp; therefore leaves exactly the same residual uncertainty about the tuple&nbsp; $Z$&nbsp; as about the individual component&nbsp; $X$.&nbsp; This is the verbal interpretation of the statement&nbsp; $H(Z|W) = H(X|W)$.
+*The mutual information of&nbsp; $Z$&nbsp; and&nbsp; $W$&nbsp; &nbsp; ⇒ &nbsp; $I(Z; W)$&nbsp; is greater than the mutual information of&nbsp; $X$&nbsp; and&nbsp; $W$  &nbsp; ⇒ &nbsp; $I(X; W)$,&nbsp; because&nbsp; $H(W|Z) =0$,&nbsp; while&nbsp; $H(W|X) = H(XW) - H(X) = 1.585 \ \rm (bit)$&nbsp; is non-zero and exactly as large as&nbsp; $H(X)$ :
 :$$I(Z;W)  = H(W) - H(W|Z) = 2.197 - 0= 2.197\,{\rm (bit)} \hspace{0.05cm},$$
 :$$I(X;W) =  H(W) - H(W|X) = 2.197 - 1.585= 0.612\,{\rm (bit)} \hspace{0.05cm}.$$
