Difference between revisions of "Aufgaben:Exercise 3.8: Once more Mutual Information"

Latest revision as of 13:50, 17 November 2022

2D–Functions
$P_{ XY }$ and $P_{ XW }$

We consider the tuple $Z = (X, Y)$, where the individual components $X$ and $Y$ each represent ternary random variables:

$$X = \{ 0 ,\ 1 ,\ 2 \} , \hspace{0.3cm}Y= \{ 0 ,\ 1 ,\ 2 \}.$$

The joint probability function $P_{ XY }(X, Y)$ of both random variables is given in the upper graph.

This constellation is analyzed in detail in Exercise 3.8Z. The results are (all values in bit):

$H(X) = H(Y) = \log_2 (3) = 1.585,$
$H(XY) = \log_2 (9) = 3.170,$
$I(X; Y) = 0,$
$H(Z) = H(XZ) = 3.170,$
$I(X; Z) = 1.585.$
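These values can be checked numerically. The following is a minimal Python sketch, assuming that $P_{ XY }$ is uniform over the nine pairs $(x, y)$, which is what $H(XY) = \log_2 (9)$ implies. The helper names `entropy` and `marginal` are illustrative and not part of the exercise.

```python
from itertools import product
from math import log2

def entropy(pmf):
    """Entropy in bit of a probability mass function given as a dict."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def marginal(joint, index):
    """Marginalize a joint PMF (keys are tuples) onto one coordinate."""
    out = {}
    for key, p in joint.items():
        out[key[index]] = out.get(key[index], 0.0) + p
    return out

# Assumption: P_XY is uniform over the nine pairs (x, y).
p_xy = {(x, y): 1 / 9 for x, y in product(range(3), repeat=2)}

h_x = entropy(marginal(p_xy, 0))
h_y = entropy(marginal(p_xy, 1))
h_xy = entropy(p_xy)
i_xy = h_x + h_y - h_xy

# Z = (X, Y): the pair (X, Z) carries no more outcomes than Z itself.
p_xz = {(x, (x, y)): p for (x, y), p in p_xy.items()}
h_z = h_xy
h_xz = entropy(p_xz)
i_xz = h_x + h_z - h_xz

print(f"H(X) = {h_x:.3f}, H(Y) = {h_y:.3f}, H(XY) = {h_xy:.3f}")
print(f"H(XZ) = {h_xz:.3f}, I(X;Z) = {i_xz:.3f}, |I(X;Y)| = {abs(i_xy):.3f}")
```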

Furthermore, we consider the random variable $W = \{ 0,\ 1,\ 2,\ 3,\ 4 \}$, whose properties follow from the joint probability function $P_{ XW }(X, W)$ shown in the lower sketch. The probabilities are zero in all fields with a white background.

What is sought in the present exercise is the mutual information between

the random variables $X$ and $W$ ⇒ $I(X; W)$,
the random variables $Z$ and $W$ ⇒ $I(Z; W)$.

Hints:

The exercise belongs to the chapter Different entropies of two-dimensional random variables.
In particular, reference is made to the sections
Conditional probability and conditional entropy as well as
Mutual information between two random variables.

Questions

Solution

(1) The correct solutions are 1 and 2:

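With $X = \{0,\ 1,\ 2\}$ and $Y = \{0,\ 1,\ 2\}$, the sum takes the values $X + Y = \{0,\ 1,\ 2,\ 3,\ 4\}$.
The probabilities also agree with the given probability function.
Checking the other two proposals shows that $W = X - Y + 2$ is also possible, because it assigns the same three values of $W$ to each value of $X$ as the sum does. In contrast, $W = Y - X + 2$ is not possible: it would pair $X = 0$ with $W \in \{2,\ 3,\ 4\}$ instead of $W \in \{0,\ 1,\ 2\}$.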
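The comparison of the three proposals can be sketched in a few lines of Python. The sketch assumes that $(X, Y)$ is uniformly distributed and takes the function produced by $W = X + Y$ as the reference for the sketched $P_{ XW }$, as stated in the solution. The names `joint_xw` and `candidates` are illustrative.

```python
from itertools import product

def joint_xw(relation):
    """Joint PMF of (X, W), as counts out of 9, for W = relation(X, Y)."""
    pmf = {}
    for x, y in product(range(3), repeat=2):
        key = (x, relation(x, y))
        pmf[key] = pmf.get(key, 0) + 1
    return pmf

candidates = {
    "W = X + Y": lambda x, y: x + y,
    "W = X - Y + 2": lambda x, y: x - y + 2,
    "W = Y - X + 2": lambda x, y: y - x + 2,
}

# Assumption: the sketched P_XW is the one produced by W = X + Y.
reference = joint_xw(candidates["W = X + Y"])

for name, relation in candidates.items():
    verdict = "matches" if joint_xw(relation) == reference else "does not match"
    print(f"{name}: {verdict} P_XW")
```

Integer counts are used instead of probabilities so that the comparison of the two functions is exact.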

To calculate the mutual information

(2) From the two-dimensional probability mass function $P_{ XW }(X, W)$ given in the problem statement, one obtains

the joint entropy:

$$H(XW) = {\rm log}_2 \hspace{0.1cm} (9) = 3.170\ {\rm (bit)} \hspace{0.05cm},$$

the probability function of the random variable $W$:

$$P_W(W) = \big [\hspace{0.05cm}1/9\hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm},\hspace{0.15cm} 3/9 \hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm}, \hspace{0.15cm} 1/9\hspace{0.05cm} \big ]\hspace{0.05cm},$$

the entropy of the random variable $W$:

$$H(W) = 2 \cdot \frac{1}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{1} + 2 \cdot \frac{2}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{2} + \frac{3}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{3} {= 2.197\ {\rm (bit)}} \hspace{0.05cm}.$$

Thus, with $H(X) = 1.585 \ \rm bit$ (as given), the mutual information is:

$$I(X;W) = H(X) + H(W) - H(XW) = 1.585 + 2.197- 3.170\hspace{0.15cm} \underline {= 0.612\ {\rm (bit)}} \hspace{0.05cm}.$$

The left of the two diagrams illustrates the calculation of the mutual information $I(X; W)$ between the first component $X$ and the sum $W$.
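The calculation in (2) can be reproduced numerically. The following Python sketch assumes $W = X + Y$ with $(X, Y)$ uniform over nine pairs, so that $P_{ XW }$ has nine entries of $1/9$. The variable names are illustrative.

```python
from itertools import product
from math import log2

def entropy(probabilities):
    """Entropy in bit of an iterable of probabilities (zeros are skipped)."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

# Assumption: W = X + Y, with (X, Y) uniform over the nine pairs.
p_xw = {}
for x, y in product(range(3), repeat=2):
    p_xw[(x, x + y)] = p_xw.get((x, x + y), 0.0) + 1 / 9

# Marginals P_X and P_W obtained by summing over the other variable.
p_x = [sum(p for (x, _), p in p_xw.items() if x == a) for a in range(3)]
p_w = [sum(p for (_, w), p in p_xw.items() if w == b) for b in range(5)]

h_xw = entropy(p_xw.values())
h_x = entropy(p_x)
h_w = entropy(p_w)
i_xw = h_x + h_w - h_xw

print(f"H(XW) = {h_xw:.3f}, H(W) = {h_w:.3f}, I(X;W) = {i_xw:.3f}")
```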

Joint probability between $Z$ and $W$

(3) The second graph shows the joint probability $P_{ ZW }(⋅)$.

The scheme consists of $5 · 9 = 45$ fields, in contrast to the plot of $P_{ XW }(⋅)$ in the problem statement with $3 · 5 = 15$ fields.
Of the $45$ fields, however, only nine are assigned non-zero probabilities. Since these nine are equally probable, the joint entropy is $H(ZW) = 3.170\ {\rm (bit)} \hspace{0.05cm}.$
With the further entropies $H(Z) = 3.170\ {\rm (bit)}\hspace{0.05cm}$ and $H(W) = 2.197\ {\rm (bit)}\hspace{0.05cm}$ from Exercise 3.8Z and subtask (2) of this exercise, respectively, one obtains for the mutual information:

$$I(Z;W) = H(Z) + H(W) - H(ZW) \hspace{0.15cm} \underline {= 2.197\,{\rm (bit)}} \hspace{0.05cm}.$$
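As a cross-check of (3), the full $9 \times 5$ scheme can be built explicitly. This is a minimal Python sketch under the assumption that $W = X + Y$ and that all nine tuples $Z = (X, Y)$ are equally probable. The names `p_zw`, `fields` and `nonzero` are illustrative.

```python
from itertools import product
from math import log2

def entropy(probabilities):
    """Entropy in bit of an iterable of probabilities (zeros are skipped)."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

z_values = list(product(range(3), repeat=2))  # the nine tuples (x, y)
w_values = range(5)

# Assumption: W = X + Y, so only the field with w = x + y is occupied per tuple.
p_zw = {(z, w): (1 / 9 if sum(z) == w else 0.0)
        for z in z_values for w in w_values}

fields = len(p_zw)                                 # all fields of the scheme
nonzero = sum(1 for p in p_zw.values() if p > 0)   # occupied fields

p_z = [sum(p_zw[(z, w)] for w in w_values) for z in z_values]
p_w = [sum(p_zw[(z, w)] for z in z_values) for w in w_values]

h_zw = entropy(p_zw.values())
i_zw = entropy(p_z) + entropy(p_w) - h_zw

print(f"{fields} fields, {nonzero} non-zero, H(ZW) = {h_zw:.3f}, I(Z;W) = {i_zw:.3f}")
```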

(4) All three statements are true, as can also be seen from the right-hand one of the two diagrams above. We attempt an interpretation of these numerical results:

The joint probability $P_{ ZW }(⋅)$ is composed, like $P_{ XW }(⋅)$, of nine equally probable non-zero elements. It follows that the joint entropies are also equal ⇒ $H(ZW) = H(XW) = 3.170 \ \rm (bit)$.
If I know the tuple $Z = (X, Y)$, I naturally also know the sum $W = X + Y$. Thus $H(W|Z) = 0$.
In contrast, $H(Z|W) \ne 0$. Rather, $H(Z|W) = H(X|W) = 0.973 \ \rm (bit)$.
Once $W$ is known, the remaining uncertainty about the tuple $Z$ is thus exactly as large as the remaining uncertainty about the individual component $X$. This is the verbal interpretation of the statement $H(Z|W) = H(X|W)$.
The mutual information of $Z$ and $W$ ⇒ $I(Z; W)$ is greater than the mutual information of $X$ and $W$ ⇒ $I(X; W)$, because $H(W|Z) = 0$, while $H(W|X)$ is non-zero. Given $X$, the sum $W$ is still as uncertain as $Y$, so that $H(W|X) = H(Y)$, which is exactly as large as $H(X)$:

$$I(Z;W) = H(W) - H(W|Z) = 2.197 - 0= 2.197\,{\rm (bit)} \hspace{0.05cm},$$

$$I(X;W) = H(W) - H(W|X) = 2.197 - 1.585= 0.612\,{\rm (bit)} \hspace{0.05cm}.$$
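The four conditional entropies used in this interpretation follow from the chain rule $H(A|B) = H(AB) - H(B)$. The following Python sketch evaluates them, again assuming $W = X + Y$ with uniformly distributed $(X, Y)$. The helper `conditional_entropy` is illustrative.

```python
from itertools import product
from math import log2

def entropy(pmf):
    """Entropy in bit of a probability mass function given as a dict."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def conditional_entropy(joint):
    """H(A|B) = H(AB) - H(B) for a joint PMF with keys (a, b)."""
    p_b = {}
    for (_, b), p in joint.items():
        p_b[b] = p_b.get(b, 0.0) + p
    return entropy(joint) - entropy(p_b)

pairs = list(product(range(3), repeat=2))

# Assumption: W = X + Y, each pair (x, y) has probability 1/9.
p_wz = {(x + y, (x, y)): 1 / 9 for x, y in pairs}
p_wx = {(x + y, x): 1 / 9 for x, y in pairs}
p_zw = {((x, y), x + y): 1 / 9 for x, y in pairs}
p_xw = {(x, x + y): 1 / 9 for x, y in pairs}

h_w_given_z = conditional_entropy(p_wz)
h_w_given_x = conditional_entropy(p_wx)
h_z_given_w = conditional_entropy(p_zw)
h_x_given_w = conditional_entropy(p_xw)

print(f"H(W|X) = {h_w_given_x:.3f}, H(Z|W) = {h_z_given_w:.3f}, "
      f"H(X|W) = {h_x_given_w:.3f}, |H(W|Z)| = {abs(h_w_given_z):.3f}")
```

Subtracting `h_w_given_z` and `h_w_given_x` from $H(W) = 2.197$ bit gives the two mutual information values of the last two equations.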

@@ Line 1: / Line 1: @@
-{{quiz-Header|Buchseite=Informationstheorie/Verschiedene Entropien zweidimensionaler Zufallsgrößen
+{{quiz-Header|Buchseite=Information_Theory/Different_Entropy_Measures_of_Two-Dimensional_Random_Variables
 }}
-[[File:P_ID2768__Inf_A_3_7_neu.png|right|]]
+[[File:P_ID2768__Inf_A_3_7_neu.png|right|frame|2D&ndash;Functions&nbsp; <br>&nbsp;$P_{ XY }$&nbsp; und&nbsp; $P_{ XW }$]]
-Wir betrachten das Tupel $Z = (X, Y)$, wobei die Einzelkomponenten $X$ und $Y$ jeweils ternäre Zufallsgrößen darstellen:
+We consider the tuple&nbsp; $Z = (X, Y)$, where the individual components&nbsp; $X$&nbsp; and&nbsp; $Y$&nbsp; each represent ternary random variables:
+:$$X = \{ 0 ,\ 1 ,\ 2 \} , \hspace{0.3cm}Y= \{ 0 ,\ 1 ,\ 2 \}.$$
-$X = \{ 0 , 1 , 2 \}$ , $Y= \{ 0 , 1 , 2 \}$.
+The joint probability function&nbsp; $P_{ XY }(X, Y)$&nbsp; of both random variables is given in the upper graph.&nbsp;
+In&nbsp; [[Aufgaben:Exercise_3.8Z:_Tuples_from_Ternary_Random_Variables|Exercise 3.8Z]]&nbsp; this constellation is analyzed in detail.&nbsp; One obtains as a result (all data in "bit"):
+:* $H(X) = H(Y) = \log_2 (3) = 1.585,$
+:* $H(XY) = \log_2 (9) = 3.170,$
+:* $I(X, Y) = 0,$
+:* $H(Z) = H(XZ) = 3.170,$
+:* $I(X, Z) = 1.585.$
-Die gemeinsame Wahrscheinlichkeitsfunktion $P_{ XY }(X, Y)$ beider Zufallsgrößen ist oben angegeben. In der [http://en.lntwww.de/Aufgaben:3.07Z_Tupel_aus_tern%C3%A4ren_Zufallsgr%C3%B6%C3%9Fen Zusatzaufgabe Z3.7] wird diese Konstellation ausführlich analysiert. Man erhält als Ergebnis:
+Furthermore, we consider the random variable&nbsp; $W = \{ 0,\ 1,\ 2,\ 3,\ 4 \}$, whose properties result from the composite probability function&nbsp; $P_{ XW }(X, W)$&nbsp; according to the sketch below.&nbsp; The probabilities are zero in all fields with a white background.
-:* $H(X) = H(Y) = log_2 (3) = 1.585 bit$
-:*$H(XY) = log_2 (9) = 3.170 bit$,
-:*$I(X, Y) = 0$,
-:*$H(Z) = H(XZ) = 3.170 bit$,
-:*$I(X, Z) = 1.585 bit$
-Desweiteren betrachten wir hier die Zufallsgröße $W = \{ 0, 1, 2, 3, 4 \}$, deren Eigenschaften sich aus der Verbundwahrscheinlichkeitsfunktion $P_{ XW }(X, W)$ nach der unteren Skizze ergeben. Die Wahrscheinlichkeiten in allen weiß hinterlegten Feldern sind jeweils $0$.
+What is sought in the present exercise is the mutual information between
+:*the random variables&nbsp; $X$&nbsp; and&nbsp; $W$ &nbsp; ⇒ &nbsp;  $I(X; W)$,
+:*the random variables&nbsp; $Z$&nbsp; and&nbsp; $W &nbsp; ⇒ &nbsp; I(Z; W)$.
-Gesucht ist in der vorliegenden Aufgabe die Transinformation
-:* zwischen den Zufallsgrößen $X$ und $W \Rightarrow  I(X; W)$,
-:* zwischen den Zufallsgrößen $Z$ und $W  ⇒  I(Z; W)$.
-'''Hinweis:'''  Die Aufgabe bezieht sich auf  [http://en.lntwww.de/Informationstheorie/Verschiedene_Entropien_zweidimensionaler_Zufallsgr%C3%B6%C3%9Fen Kapitel 3.2]
-===Fragebogen===
+Hints:
+*The exercise belongs to the chapter&nbsp; [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen|Different entropies of two-dimensional random variables]].
+*In particular, reference is made to the sections <br> &nbsp; &nbsp;  [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Conditional_probability_and_conditional_entropy|Conditional probability and conditional entropy]] as well as <br> &nbsp; &nbsp;  [[Information_Theory/Verschiedene_Entropien_zweidimensionaler_Zufallsgrößen#Mutual_information_between_two_random_variables|Mutual information between two random variables]].
+===Questions===
 <quiz display=simple>
-{Wie könnten die Größen X, Y und W zusammenhängen? Es gilt
+{How might the variables&nbsp; $X$,&nbsp; $Y$&nbsp; and&nbsp; $W$&nbsp; be related?
 |type="[]"}
-+ $W = X + Y$
++ $W = X + Y$,
-+$W = X – Y + 2$
++$W = X - Y + 2$,
--$W = Y – X + 2$
+-$W = Y - X + 2$.
-{Welche Transinformationen besteht zwischen den Zufallsgrößen $X$ und $W$?
+{What is the mutual information between the random variables&nbsp; $X$&nbsp; and&nbsp; $W$?
 |type="{}"}
-$I(X; Y)$ = { 0.612 3%  } $bit$
+$I(X; W) \ = \ $ { 0.612 3%  } $\ \rm bit$
-{Welche Transinformation besteht zwischen den Zufallsgrößen $Z$ und $W$?
+{What is the mutual information between the random variables&nbsp; $Z$&nbsp; and&nbsp; $W$?
 |type="{}"}
-$I(X; Z)$ = { 2.197 3%  } $bit$
+$I(Z; W) \ = \ $ { 2.197 3%  } $\ \rm bit$
-{Welche der nachfolgenden Aussagen sind zutreffend?
+{Which of the following statements are true?
 |type="[]"}
-+ Es gilt $H(ZW) = H(XW)$.
++ &nbsp; $H(ZW) = H(XW)$&nbsp; is true.
-+ Es gilt $H(W|Z) = 0$.
++ &nbsp; $H(W|Z) = 0$&nbsp; is true.
-+ Es gilt $I(Z; W) > I(X; W)$.
++ &nbsp; $I(Z; W) > I(X; W)$&nbsp; is true.
 </quiz>
-===Musterlösung===
+===Solution===
 {{ML-Kopf}}
-'''1.''' Mit $X = \{0, 1, 2\}$, $Y = \{0, 1, 2\}$ gilt $X + Y = \{0, 1, 2, 3, 4\}$ und auch die Wahrscheinlichkeiten stimmen mit der vorgegebenen Wahrscheinlichkeitsfunktion überein. Die Überprüfung der beiden anderen Vorgaben zeigt, dass auch $W = X – Y + 2$ möglich ist  $\Rightarrow$ $Lösungsvorschläge 1$ und $2$.
+'''(1)'''&nbsp; The <u>correct solutions are 1 and 2</u>:
+*With&nbsp; $X = \{0,\ 1,\ 2\}$,&nbsp; $Y = \{0,\ 1,\ 2\}$&nbsp;, &nbsp; $X + Y = \{0,\ 1,\ 2,\ 3,\ 4\}$ holds.&nbsp;
-'''2.'''Aus der 2D–Wahrscheinlichkeitsfunktion $P_{ XW }(X, W)$ auf der Angabenseite erhält man für
+*The probabilities also agree with the given probability function.
-:*die Verbundentropie:
+*Checking the other two specifications shows that&nbsp; $W = X - Y + 2$&nbsp; is also possible, but not&nbsp; $W = Y - X + 2$.
-$$H(XW) = log_2(9) = 3.170$$,
-:* die Wahrsacheinlichkeitsfunktion der Zufallsgröße $W$:
-$$P_W(W) = [ 1/9 , 2/9 ,  3/9 ,  2/9 ,  1/9]$$,
-:*die Entropie der Zufallsgröße $W$:
-$$H(W) = 2 . \frac{1}{9} .  log_2\frac{9}{1} + 2 . \frac{2}{9} .  log_2\frac{9}{2} + 2 . \frac{3}{9} .  log_2\frac{9}{3} = 2.197 ( bit)$$.
-Mit $H(X) = 1.585$ bit (wurde angegeben) ergibt sich somit für die ''Mutual Information'':
-$$I(X;W) = H(X) + H(W) - H(XW)=$$
-$$=1.585+2.197-3.170=0.612(bit)$$
-[[File:P_ID2769__Inf_A_3_7d.png|right|]]
-Das Rechte Schaubild verdeutlicht die Berechnung der Transinformation $I(X; W)$ zwischen der ersten Komponente $X$ und der Summe $W$.
-'''3.''' Die Grafik zeigt die Verbundwahrscheinlichkeit $P_{ ZW }(⋅)$. Das Schema besteht aus $5 · 9 = 45$ Feldern im Gegensatz zur Darstellung von $P_{ XW }(⋅)$ auf der Angabenseite mit $3 · 9 = 27$ Feldern.
-[[File:P_ID2770__Inf_A_3_7c.png|right|]]
-Von den $45$ Feldern sind aber auch nur neun mit Wahrscheinlichkeiten $≠ 0$ belegt. Für die Verbundentropie gilt:
-$H(ZW) = 3.170(bit)$
-Mit den weiteren Entropien
+[[File:P_ID2769__Inf_A_3_7d.png|right|frame|To calculate the mutual information]]
-$$H(Z) = 3.170 (bit)$$
-$$H(W) = 2.197 (bit)$$
-entsprechend der [http://en.lntwww.de/Aufgaben:3.07Z_Tupel_aus_tern%C3%A4ren_Zufallsgr%C3%B6%C3%9Fen Aufgabe Z3.7] bzw. der Teilaufgabe (b) erhält man für die Transinformation:
-$$I(Z;W) = H(Z) + H(W) - H(ZW) = 2.197 (bit)$$
+'''(2)'''&nbsp; From the two-dimensional probability mass function&nbsp; $P_{ XW }(X, W)$&nbsp; in the specification section, one obtains for
-wie auch aus dem rechten oberen Schaubild hervorgeht.
+*the joint entropy:
+:$$H(XW) =  {\rm log}_2 \hspace{0.1cm} (9)
+= 3.170\ {\rm (bit)}
+	\hspace{0.05cm},$$
+* the probability function of the random variable&nbsp; $W$:
+:$$P_W(W) = \big [\hspace{0.05cm}1/9\hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm},\hspace{0.15cm} 3/9 \hspace{0.05cm}, \hspace{0.15cm} 2/9\hspace{0.05cm}, \hspace{0.15cm} 1/9\hspace{0.05cm} \big ]\hspace{0.05cm},$$
+*the entropy of the random variable&nbsp; $W$:
+:$$H(W) = 2 \cdot \frac{1}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{1} + 2 \cdot \frac{2}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{2} +
+\frac{3}{9} \cdot {\rm log}_2 \hspace{0.1cm} \frac{9}{3}
+ {= 2.197\ {\rm (bit)}} \hspace{0.05cm}.$$
+Thus, with&nbsp; $H(X) = 1.585 \ \rm bit$&nbsp; (was given), the result for the&nbsp; mutual information:
+:$$I(X;W) = H(X) + H(W) - H(XW) = 1.585 + 2.197- 3.170\hspace{0.15cm} \underline {= 0.612\ {\rm (bit)}} \hspace{0.05cm}.$$
-'''4.'''  $Alle$  $drei$  $Aussagen$ treffen zu, wie auch aus dem oberen Schaubild ersichtlich ist. Wir versuchen eine Interpretation dieser numerischen Ergebnisse:
+The left of the two diagrams illustrates the calculation of the mutual information&nbsp; $I(X; W)$&nbsp; between the first component&nbsp; $X$&nbsp; and the sum&nbsp; $W$.
-:* Die Verbundwahrscheinlichkeit $P_{ ZW }$ setzt sich ebenso wie $P_{ XW }$ aus neun gleichwahrscheinlichen Elementen $≠ 0$ zusammen. Damit ist offensichtlich, dass auch die Verbundentropien gleich sind:
-$H(ZW) =  H(XW) = 3.170 (bit)$.
-:* Wenn ich das Tupel $Z = (X, Y)$ kenne, kenne ich natürlich auch die Summe $W = X + Y$. Damit ist $H(W|Z) = 0$. Dagegen ist $H(Z|W)$ ungleich $0$. Vielmehr gilt $H(Z|W) = H(X|W) = 0.973  bit$.
-:* Die Zufallsgröße $W$ liefert also die genau gleiche Information hinsichtlich des Tupels $Z$ wie für die Einzelkomponente $X$. Dies ist die verbale Interpretation für die Aussage $H(Z|W) = H(X|W)$
-:* Die gemeinsame Information von $Z$ und $W \Rightarrow  I(Z; W)$ ist größer als die von $X und W \Rightarrow  I(X; W)$, weil $H(W|Z)$ gleich $0$ ist, während $H(W|X)$ ungleich $0$ ist, nämlich genau so groß ist wie $H(X)$ :
-$$I(Z;W) = H(W) - H(W|Z) = 2.197 - 0 = 2.197 (bit)$$
+[[File:P_ID2770__Inf_A_3_7c.png|right|frame|Joint probability between&nbsp; $Z$&nbsp; and&nbsp; $W$]]
-$$I(X;W) = H(W) - H(W|X) = 2.197 - 1.585 = 0.612 (bit)$$
+'''(3)'''&nbsp;  The second graph shows the joint probability&nbsp; $P_{ ZW }(⋅)$.&nbsp;
+*The scheme consists of&nbsp; $5 · 9 = 45$&nbsp; fields in contrast to the plot of&nbsp; $P_{ XW }(⋅)$&nbsp; in the data section with&nbsp; $3 · 9 = 27$&nbsp; fields.
+*However, of the&nbsp; $45$&nbsp; fields, only nine are also assigned non-zero probabilities.&nbsp; The following applies to the joint entropy:  &nbsp; $H(ZW)  = 3.170\ {\rm (bit)} \hspace{0.05cm}.$
+*With the further entropies&nbsp; $H(Z)  = 3.170\ {\rm (bit)}\hspace{0.05cm}$&nbsp; and&nbsp; $H(W)  = 2.197\ {\rm (bit)}\hspace{0.05cm}$&nbsp; according to&nbsp; [[Aufgaben:Exercise_3.8Z:_Tuples_from_Ternary_Random_Variables| Exercise 3.8Z]]&nbsp; or the subquestion&nbsp; '''(2)'''&nbsp; of this exercise, one obtains for the mutual information:
+:$$I(Z;W) = H(Z) + H(W) - H(ZW) \hspace{0.15cm} \underline {= 2.197\,{\rm (bit)}} \hspace{0.05cm}.$$
+<br clear=all>
+'''(4)'''&nbsp; <u>All three statements</u> are true, as can also be seen from the right-hand side of the two upper diagrams.&nbsp; We attempt an interpretation of these numerical results:
+* The joint probability&nbsp; $P_{ ZW }(⋅)$&nbsp;is composed, like&nbsp; $P_{ XW }(⋅)$&nbsp;, of nine equally probable elements unequal to 0.&nbsp; It is thus obvious that the joint entropies are also equal &nbsp; ⇒ &nbsp; $H(ZW) =  H(XW) = 3.170 \ \rm (bit)$.
+* If I know the tuple&nbsp; $Z = (X, Y)$,&nbsp; I naturally also know the sum&nbsp; $W = X + Y$.&nbsp; Thus&nbsp; $H(W|Z) = 0$.
+*In contrast,&nbsp; $H(Z|W) \ne 0$.&nbsp; Rather,&nbsp; $H(Z|W) = H(X|W) = 0.973   \ \rm (bit)$.
+*The random variable&nbsp; $W$&nbsp; thus provides exactly the same information with regard to the tuple&nbsp; $Z$&nbsp; as for the individual component&nbsp; $X$.&nbsp; This is the verbal interpretation of the statement&nbsp; $H(Z|W) = H(X|W)$.
+*The joint information of&nbsp; $Z$&nbsp; and&nbsp; $W$&nbsp; &nbsp; ⇒ &nbsp; $I(Z; W)$&nbsp; is greater than the joint information of&nbsp; $X$&nbsp; and&nbsp; $W$  &nbsp; ⇒ &nbsp; $I(X; W)$,&nbsp; because&nbsp; $H(W|Z) =0$,&nbsp; while&nbsp; $H(W|X)$&nbsp; is non-zero, namely exactly as great as&nbsp; $H(X)$ :
+:$$I(Z;W)  = H(W) - H(W|Z) = 2.197 - 0= 2.197\,{\rm (bit)} \hspace{0.05cm},$$
+:$$I(X;W) =  H(W) - H(W|X) = 2.197 - 1.585= 0.612\,{\rm (bit)} \hspace{0.05cm}.$$
@@ Line 104: / Line 108: @@
-[[Category:Aufgaben zu Informationstheorie |^3.2Verschiedene Entropien zweidimensionaler Zufallsgrößen^]]
+[[Category:Information Theory: Exercises |^3.2 Entropies of 2D Random Variables^]]
