Exercise 3.6: Partitioning Inequality

{{quiz-Header|Buchseite=Information_Theory/Some_Preliminary_Remarks_on_Two-Dimensional_Random_Variables
}}
  
[[File:P_ID2812__Inf_A_3_5.png|right|frame|Two probability functions  $P_X$  and  $Q_X$]]
The  '''Kullback–Leibler distance'''  (KLD for short)  is also used in the  "partitioning inequality":
* We assume the set  $X =  \{ x_1, \hspace{0.15cm} x_2, \text{...} \hspace{0.05cm}, \hspace{0.15cm} x_M  \}$  and the probability functions
:$$P_X(X) = P_X (  x_1, \hspace{0.15cm} x_2, \text{...} \hspace{0.05cm}, \hspace{0.15cm} x_M  )\hspace{0.05cm},$$
:$$Q_X(X) =Q_X (  x_1, \hspace{0.15cm} x_2, \text{...} \hspace{0.05cm}, \hspace{0.15cm} x_M  ), $$
:which are said to be  "similar in some way".
  
* We divide the set  $X$  into partitions  $A_1, \text{...} ,  A_K$,  which are pairwise  [[Theory_of_Stochastic_Signals/Mengentheoretische_Grundlagen#Disjunkte_Mengen|disjoint]]  and form a  [[Theory_of_Stochastic_Signals/Mengentheoretische_Grundlagen#Vollst.C3.A4ndiges_System|complete system]]:
:$$\bigcup_{i=1}^{K} A_i = X, \hspace{0.5cm} A_i \cap A_j = {\it \phi} \hspace{0.25cm}\text{for}\hspace{0.25cm} 1 \le i \ne j \le K .$$
  
* In the following, we denote the probability functions with respect to the partition  $A = \{A_1,\ A_2, \text{...} ,\ A_K\}$  by
:$$P_X^{\hspace{0.15cm}(A)} = \big [ P_X ( A_1  )\hspace{0.05cm}, \hspace{0.05cm}\text{...}\hspace{0.1cm},P_X (  A_K  ) \big  ],\hspace{0.05cm}\hspace{0.5cm}{\rm where}\hspace{0.15cm} P_X (  A_i  ) = \sum_{ x \in A_i} P_X ( x  )\hspace{0.05cm},$$
:$$Q_X^{\hspace{0.15cm}(A)}= \big  [ Q_X (  A_1  )\hspace{0.05cm}, \hspace{0.05cm}\text{...}\hspace{0.1cm},Q_X (  A_K  ) \big  ],\hspace{0.05cm}\hspace{0.40cm}{\rm where}\hspace{0.15cm} Q_X (  A_i  ) = \sum_{ x \in A_i} Q_X ( x  )\hspace{0.05cm}. $$
  
{{BlaueBox|TEXT=
$\text{Please note:}$  The  '''partitioning inequality'''  states the following relation between the Kullback-Leibler distances:
:$$D(P_X^{\hspace{0.15cm}(A)} \hspace{0.05cm} \vert \vert \hspace{0.05cm} Q_X^{\hspace{0.15cm}(A)})
\hspace{0.25cm}\le \hspace{0.25cm}D(P_X \hspace{0.05cm}\vert \vert \hspace{0.05cm} Q_X) \hspace{0.05cm}.$$}}
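
The definitions above can be checked numerically. The following Python sketch (not part of the original exercise; the helper names kld and partition_pmf are chosen here only for illustration) computes a Kullback-Leibler distance in bits as well as the distance of the partitioned probability functions, using the values given in the hints below:

<pre>
from math import log2

def kld(p, q):
    # Kullback-Leibler distance in bit; terms with p(x) = 0 contribute nothing
    return sum(px * log2(px / qx) for px, qx in zip(p, q) if px > 0)

def partition_pmf(pmf, partition):
    # P_X(A_i) = sum of P_X(x) over all x in A_i, for every block A_i
    return [sum(pmf[x] for x in block) for block in partition]

P_X = [1/4, 1/2, 1/4]                  # probability function P_X(X) from the graph
Q_X = [1/8, 3/4, 1/8]                  # probability function Q_X(X) from the graph
A   = [[0], [1, 2]]                    # partition A_1 = {0}, A_2 = {1, 2}

D_full = kld(P_X, Q_X)
D_part = kld(partition_pmf(P_X, A), partition_pmf(Q_X, A))
print(D_full >= D_part)                # True: the partitioning inequality holds
</pre>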
  
In subtask  '''(1)'''  the Kullback-Leibler distance of the two probability functions  $P_X(X)$  and  $Q_X(X)$  for  $X = \{0,\ 1,\ 2\}$   ⇒   $|X| = 3$  is to be determined.
  
*Then the set  $X$  is to be partitioned with  $K = 2$  according to
:* $A = \{A_1 ,\ A_2\}$   with   $A_1 =\{0\}$   and  $A_2 = \{ 1,\ 2 \}$,
:* $B = \{B_1 ,\ B_2\}$   with   $B_1 =\{1\}$   and  $B_2 = \{ 0,\ 2 \}$,
:* $C = \{C_1 ,\ C_2\}$   with   $C_1 =\{2\}$   and  $C_2 = \{ 0,\ 1\}$.
  
*Then the respective Kullback-Leibler distances are to be given:
:* $D(P_X^{ (A) } \hspace{0.05cm} \vert \vert \hspace{0.05cm} Q_X^{ (A) } )$,
:* $D(P_X^{ (B) } \hspace{0.05cm} \vert \vert \hspace{0.05cm} Q_X^{ (B) } )$,
:* $D(P_X^{ (C) } \hspace{0.05cm} \vert \vert \hspace{0.05cm} Q_X^{ (C) } )$.
  
*Finally, subtask  '''(5)'''  asks for the conditions that must be satisfied for the equal sign to be true in the above inequality.
  
  
Hints:
*The exercise belongs to the chapter  [[Information_Theory/Some_Preliminary_Remarks_on_Two-Dimensional_Random_Variables|Some preliminary remarks on two-dimensional random variables]].
*In particular, reference is made to the page  [[Information_Theory/Some_Preliminary_Remarks_on_Two-Dimensional_Random_Variables#Informational_divergence_-_Kullback-Leibler_distance|Relative entropy – Kullback-Leibler distance]].
*The two probability functions can be read from the above graph as follows:
:$$P_X(X) = \big [1/4 , \ 1/2 , \ 1/4 \big ],\hspace{0.5cm} Q_X(X) = \big [1/8, \ 3/4, \ 1/8 \big].$$
  
  
 
===Questions===
 
 
 
 
 
 
 
 
 
  
 
<quiz display=simple>
 
  
{Calculate the Kullback-Leibler distance in general.
 
|type="{}"}
 
|type="{}"}
$ D(P_X \parallel Q_X)$ = { 0.2075 3% } $bit$
+
$ D(P_X \hspace{0.05cm} \vert \vert \hspace{0.05cm}  Q_X) \ = \ $ { 0.2075 3% } $\ \rm bit$
  
{What is the Kullback-Leibler distance for the partition&nbsp; $ A_1 = \{0\},\ A_2 = \{1, 2\}$?
 
|type="{}"}
 
|type="{}"}
$D(P_X^{ (A) } \parallel Q_X^{ (A) } )$ = { 0.0832 3% } $bit$
+
$D(P_X^{ (A) } \hspace{0.05cm} \vert \vert \hspace{0.05cm} Q_X^{ (A) } ) \ = \ $ { 0.0832 3% } $\ \rm bit$
  
{What is the Kullback-Leibler distance for the partition&nbsp; $ B_1 = \{1\}, \ B_2 = \{0, 2\}$?
 
|type="{}"}
 
|type="{}"}
$D(P_X^{ (B) } \parallel Q_X^{ (B) } )$ = { 0.2075 3% } $bit$
+
$D(P_X^{ (B) } \hspace{0.05cm} \vert \vert \hspace{0.05cm} Q_X^{ (B) } ) \ = \ $ { 0.2075 3% } $\ \rm bit$
 
 
  
{What is the Kullback-Leibler distance for the partition&nbsp; $ C_1 = \{2\},\ C_2 = \{0, 1\}$?
|type="[]"}
+
|type="()"}
+ The same result as for partition&nbsp; $\rm A$.
- The same result as for partition&nbsp; $\rm B$.
- A completely different result.
  
{Under which conditions does equality result for general&nbsp; $K$&nbsp;?
 
|type="[]"}
 
|type="[]"}
+ Es müssen $|X|$ Gleichungen erfüllt sein.
+
+ &nbsp; $|X|$&nbsp; equations must be fulfilled.
+ For&nbsp; $x \in A_i$,&nbsp; the following must hold: &nbsp; $P_X(x)/Q_X(x) = P_X(A_i)/ Q_X(A_i)$.
  
 
</quiz>
 
  
===Solution===
 
{{ML-Kopf}}
 
  
'''(1)'''&nbsp; For the Kullback-Leibler distance of the two (non-partitioned) probability functions&nbsp; $P_X(X)$&nbsp; and&nbsp; $Q_X(X)$,&nbsp; the following holds:
 
 
:$$D(P_X \hspace{0.05cm} ||  \hspace{0.05cm}Q_X) = {\rm E} \left [ {\rm log}_2 \hspace{0.1cm} \frac{P_X(X)}{Q_X(X)}\right ] \hspace{0.2cm}=\hspace{0.2cm} \sum_{x \hspace{0.05cm}\in \hspace{0.05cm}X}
P_X(x) \cdot {\rm log}_2 \hspace{0.1cm} \frac{P_X(x)}{Q_X(x)}\hspace{0.05cm},$$
:$$\Rightarrow \hspace{0.3cm} D(P_X \hspace{0.05cm} ||  \hspace{0.05cm}Q_X) =  \frac{1}{2}\cdot {\rm log}_2 \hspace{0.1cm} \frac{1/2}{3/4} +
2 \cdot \frac{1}{4}\cdot {\rm log}_2 \hspace{0.1cm} \frac{1/4}{1/8} =
\frac{1}{2}\cdot {\rm log}_2 \hspace{0.1cm} \frac{2}{3} + \frac{1}{2}\cdot {\rm log}_2 \hspace{0.1cm} (2) = 1- \frac{1}{2}\cdot {\rm log}_2 \hspace{0.1cm} (3)
\hspace{0.15cm} \underline {=0.2075\,{\rm (bit)}} \hspace{0.05cm}.$$
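
As a numerical cross-check (a sketch only, not part of the original solution), the same sum can be evaluated directly in Python:

<pre>
from math import log2

P_X = [1/4, 1/2, 1/4]
Q_X = [1/8, 3/4, 1/8]

# D(P_X || Q_X) = sum over x of P_X(x) * log2( P_X(x) / Q_X(x) )
D = sum(p * log2(p / q) for p, q in zip(P_X, Q_X))
print(round(D, 4))                     # 0.2075 (bit)
</pre>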
  
'''(2)'''&nbsp;  With&nbsp; $\text{partition }A$ &nbsp; &rArr; &nbsp; $A_1 = \{0\}$,&nbsp; $A_2 = \{ 1 ,\ 2 \}$&nbsp; we get&nbsp; $P_X^{ (A) } (X) = \{1/4 , \ 3/4\}$&nbsp; and&nbsp; $Q_X^{ (A) } (X) = \{1/8 , \ 7/8\}$.&nbsp;  From this it follows:
:$$D(P_X^{\hspace{0.15cm}(A)} \hspace{0.05cm}|| \hspace{0.05cm} Q_X^{\hspace{0.15cm}(A)}) = \frac{1}{4}\cdot {\rm log}_2 \hspace{0.1cm} \frac{1/4}{1/8} +
\frac{3}{4}\cdot {\rm log}_2 \hspace{0.1cm} \frac{3/4}{7/8} =\frac{1}{4}\cdot {\rm log}_2 \hspace{0.1cm} (2) + \frac{3}{4}\cdot {\rm log}_2 \hspace{0.1cm} \frac{6}{7} \hspace{0.15cm} \underline {=0.0832\,{\rm (bit)}}
\hspace{0.05cm}.$$
*This Kullback-Leibler distance is thus smaller than the one obtained in subtask&nbsp; '''(1)'''.
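
Numerically, the same value is obtained by first aggregating the probabilities over the two blocks of partition&nbsp; $A$&nbsp; (sketch only, not part of the original solution):

<pre>
from math import log2

P_X = [1/4, 1/2, 1/4]
Q_X = [1/8, 3/4, 1/8]
A   = [[0], [1, 2]]                    # A_1 = {0}, A_2 = {1, 2}

# aggregate:  P_X(A_i) = sum of P_X(x) over x in A_i  (same for Q_X)
P_A = [sum(P_X[x] for x in block) for block in A]   # [1/4, 3/4]
Q_A = [sum(Q_X[x] for x in block) for block in A]   # [1/8, 7/8]

D_A = sum(p * log2(p / q) for p, q in zip(P_A, Q_A))
print(round(D_A, 4))                   # 0.0832 (bit)
</pre>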
  
  
'''(3)'''&nbsp; With&nbsp; $\text{partition } B$ &nbsp; &rArr; &nbsp;  $B_1 = \{1\}$,&nbsp;  $B_2 = \{ 0 ,\ 2 \}$&nbsp; the probability functions are&nbsp;  $P_X^{ (B) } (X) = \{1/2 , \ 1/2\}$&nbsp; and&nbsp; $Q_X^{ (B) } (X) = \{3/4 , \ 1/4\}$.
*Analogous to subtask&nbsp; '''(2)'''&nbsp; one thus obtains:
:$$D(P_X^{\hspace{0.15cm}(B)} \hspace{0.05cm}|| \hspace{0.05cm} Q_X^{\hspace{0.15cm}(B)})  = \frac{1}{2}\cdot {\rm log}_2 \hspace{0.1cm} \frac{1/2}{3/4} +
\frac{1}{2}\cdot {\rm log}_2 \hspace{0.1cm} \frac{1/2}{1/4} \hspace{0.15cm} \underline {=0.2075\,{\rm (bit)}}
\hspace{0.05cm}.$$
*The result agrees with that of subtask&nbsp; '''(1)''' &nbsp; &rArr; &nbsp; with&nbsp; $\text{partition }B$&nbsp; the equal sign applies.
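
The corresponding cross-check for partition&nbsp; $B$&nbsp; (sketch only) reproduces the full Kullback-Leibler distance:

<pre>
from math import log2

P_X = [1/4, 1/2, 1/4]
Q_X = [1/8, 3/4, 1/8]
B   = [[1], [0, 2]]                    # B_1 = {1}, B_2 = {0, 2}

P_B = [sum(P_X[x] for x in block) for block in B]   # [1/2, 1/2]
Q_B = [sum(Q_X[x] for x in block) for block in B]   # [3/4, 1/4]

D_B = sum(p * log2(p / q) for p, q in zip(P_B, Q_B))
print(round(D_B, 4))                   # 0.2075 (bit), equal to D(P_X || Q_X)
</pre>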
  
'''(4)'''&nbsp; With&nbsp; $\text{partition } C$ &nbsp; &rArr; &nbsp;  $C_1 = \{2\}$ ,  $C_2 = \{ 0 , \ 1\}$&nbsp; one obtains&nbsp;  $P_X^{ (C) } (X) = \{1/4, \  3/4\}$  , $Q_X^{ (C) } (X) = \{1/8, \ 7/8\}$, <br>i.e. the same functions as for the&nbsp; $\text{partition }A$ &nbsp; &rArr; &nbsp; <u>solution proposal 1</u>.
  
'''(5)'''&nbsp; The&nbsp; $\text{partition }B$&nbsp; has led to the result&nbsp; $D(P_X^{ (B) } \hspace{0.05cm} ||  \hspace{0.05cm}Q_X^{ (B) } ) = D(P_X \hspace{0.05cm} ||  \hspace{0.05cm}Q_X)$.
*For this case, the following holds:
:$$\frac{P_X(1)}{Q_X(1)} = \frac{1/2}{3/4} = \frac{2}{3},  \hspace{0.3cm} \frac{P_X(B_1)}{Q_X(B_1)} = \frac{1/2}{3/4} = \frac{2}{3},$$
:$$\frac{P_X(0)}{Q_X(0)} =  \frac{1/4}{1/8} = 2, \hspace{0.3cm} \frac{P_X(B_2)}{Q_X(B_2)} = \frac{1/2}{1/4} = 2,$$
:$$\frac{P_X(2)}{Q_X(2)} =  \frac{1/4}{1/8} = 2,  \hspace{0.3cm} \frac{P_X(B_2)}{Q_X(B_2)} = \frac{1/2}{1/4} = 2.$$
*It must therefore hold for all&nbsp; $x \in X$:
:$$\frac{P_X(x)}{Q_X(x)}  = \frac{P_X(B_1)}{Q_X(B_1)}, \text{ if } x \in B_1, \hspace{0.5cm}\frac{P_X(x)}{Q_X(x)}  = \frac{P_X(B_2)}{Q_X(B_2)}, \text{ if } x \in B_2.$$
*By generalization, one can see that&nbsp; <u>both proposed solutions</u>&nbsp; are correct.
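
This equality condition can also be verified numerically for partition&nbsp; $B$&nbsp; (a minimal sketch, not part of the original solution): within each block, the element ratios&nbsp; $P_X(x)/Q_X(x)$&nbsp; coincide with the block ratio&nbsp; $P_X(B_i)/Q_X(B_i)$:

<pre>
from math import isclose

P_X = [1/4, 1/2, 1/4]
Q_X = [1/8, 3/4, 1/8]
B   = [[1], [0, 2]]                    # B_1 = {1}, B_2 = {0, 2}

for block in B:
    block_ratio = sum(P_X[x] for x in block) / sum(Q_X[x] for x in block)
    # equality requires P_X(x)/Q_X(x) = P_X(B_i)/Q_X(B_i) for every x in B_i
    print(all(isclose(P_X[x] / Q_X[x], block_ratio) for x in block))   # True
</pre>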
  
 
{{ML-Fuß}}
 
  
  
[[Category:Information Theory: Exercises |^3.1 General Information on 2D Random Variables^]]
