Difference between revisions of "Aufgaben:Exercise 3.3: Entropy of Ternary Quantities"

From LNTwww
 
(25 intermediate revisions by 4 users not shown)
Line 1: Line 1:
  
{{quiz-Header|Buchseite=Informationstheorie/Einige Vorbemerkungen zu zweidimensionalen Zufallsgrößen
+
{{quiz-Header|Buchseite=Information_Theory/Some_Preliminary_Remarks_on_Two-Dimensional_Random_Variables
 
}}
 
}}
  
[[File:P_ID2754__Inf_A_3_3.png|right|]]
+
[[File:P_ID2754__Inf_A_3_3.png|right|frame|Given entropy functions]]
Rechts sehen Sie die Entropiefunktionen <i>H</i><sub>R</sub>(<i>p</i>), <i>H</i><sub>B</sub>(<i>p</i>) und <i>H</i><sub>G</sub>(<i>p</i>), wobei &bdquo;R&rdquo; für &bdquo;Rot&rdquo; steht, usw. Für alle Zufallsgrößen lautet die Wahrscheinlichkeitsfunktion:
+
On the right you see the entropy functions&nbsp; $H_{\rm R}(p)$,&nbsp; $H_{\rm B}(p)$&nbsp; and&nbsp; $H_{\rm G}(p)$, where&nbsp; $\rm R$&nbsp; stands for "red",&nbsp; $\rm B$&nbsp; for "blue" and&nbsp; $\rm G$&nbsp; for "green".&nbsp; The probability functions are for all random variables:
:$$P_X(X) = [\hspace{0.05cm}p_1\hspace{0.05cm}, p_2\hspace{0.05cm}, p_3\hspace{0.05cm}]\hspace{0.3cm}\hspace{0.3cm} \Rightarrow \hspace{0.3cm} |X| = 3\hspace{0.05cm}.$$
+
:$$P_X(X) = [\hspace{0.05cm}p_1\hspace{0.05cm},\ p_2\hspace{0.05cm},\ p_3\hspace{0.05cm}]\hspace{0.3cm}\hspace{0.3cm} \Rightarrow \hspace{0.3cm} |X| = 3\hspace{0.05cm}.$$
Es gilt der Zusammenhang <i>p</i><sub>1</sub>&nbsp;=&nbsp;<i>p</i> und <i>p</i><sub>2</sub>&nbsp;= 1 &ndash;&nbsp;<i>p</i><sub>3</sub>&nbsp;&ndash;&nbsp;<i>p</i>.
+
For the questionnaire, the relationship&nbsp; $p_1 = p$&nbsp; and&nbsp; $p_2 = 1 - p_3- p$.
  
Die Wahrscheinlichkeitsfunktion einer Zufallsgröße
+
The probability function of a random variable
:$$X = \{\hspace{0.05cm}x_1\hspace{0.05cm}, \hspace{0.05cm} x_2\hspace{0.05cm},\hspace{0.05cm} ...\hspace{0.1cm} ,\hspace{0.05cm} x_{\mu}\hspace{0.05cm}, \hspace{0.05cm}...\hspace{0.1cm} , \hspace{0.05cm} x_{M}\hspace{0.05cm}\}$$
+
:$$X = \big \{\hspace{0.05cm}x_1\hspace{0.05cm}, \hspace{0.15cm} x_2\hspace{0.05cm},\hspace{0.15cm} \text{...}\hspace{0.1cm} ,\hspace{0.15cm} x_{\mu}\hspace{0.05cm}, \hspace{0.05cm}\text{...}\hspace{0.1cm} , \hspace{0.15cm} x_{M}\hspace{0.05cm}\big  \}$$
mit dem Symbolumfang |<i>X</i>|&nbsp;=&nbsp;<i>M</i> lautet allgemein:
+
with the symbolic range&nbsp; $|X| = M$&nbsp; is generally:
:$$P_X(X) = [\hspace{0.05cm}p_1\hspace{0.05cm}, \hspace{0.05cm} p_2\hspace{0.05cm},\hspace{0.05cm} ...\hspace{0.1cm} ,\hspace{0.05cm} p_{\mu}\hspace{0.05cm}, \hspace{0.05cm}...\hspace{0.1cm} , \hspace{0.05cm} p_{M}\hspace{0.05cm}]\hspace{0.05cm}.$$
+
:$$P_X(X) = [\hspace{0.05cm}p_1\hspace{0.05cm}, \hspace{0.15cm} p_2\hspace{0.05cm},\hspace{0.05cm} ...\hspace{0.1cm} ,\hspace{0.15cm} p_{\mu}\hspace{0.05cm}, \hspace{0.05cm}...\hspace{0.1cm} , \hspace{0.15cm} p_{M}\hspace{0.05cm}]\hspace{0.05cm}.$$
Die Entropie (Unsicherheit) dieser Zufallsgröße berechnet sich entsprechend der Gleichung
+
The entropy (uncertainty) of this random variable is calculated according to the equation
:$$H(X) = {\rm E} \left [{\rm log}_2 \hspace{0.1cm} {1}/{P_X(X)} \right ]\hspace{0.05cm},$$
+
:$$H(X) = {\rm E} \big [\log_2 \hspace{0.05cm} {1}/{P_X(X)} \big ]\hspace{0.05cm},$$
und liegt stets im Bereich 0 &#8804; <i>H</i>(<i>X</i>) &#8804; log<sub>2</Sub> |<i>X</i>|. Die untere Schranke <i>H</i>(<i>X</i>) = 0 ergibt sich, wenn eine beliebige Wahrscheinlichkeit <i>p<sub>&mu;</sub></i> = 1 ist und alle anderen 0 sind. Die obere Schranke soll hier wie in [Kra13] hergeleitet werden:
+
and always lies in the range&nbsp; $0 \le H(X) \le  \log_2 \hspace{0.05cm}  |X|$.  
  
:* Durch Erweiterung obiger Gleichung um |<i>X</i>| in Zähler und Nenner erhält man unter Verwendung von log<sub>2</sub>(<i>x</i>) = ln(<i>x</i>)/ln(2):
+
*The lower bound&nbsp; $H(X) = 0$&nbsp; results when any probability&nbsp; $p_\mu = 1$&nbsp; and all others are zero.
:$$H(X) = \frac{1}{{\rm ln}(2)}\cdot {\rm E} \left [{\rm ln} \hspace{0.1cm} \frac{1}{|X| \cdot P_X(X)} \right ] + {\rm log}_2 \hspace{0.1cm}|X| \hspace{0.05cm}.$$
+
 
:* Wie aus nachfolgender Grafik hervorgeht, gilt die Abschätzung ln(<i>x</i>) &#8804; <i>x</i> &ndash; 1 mit der Identität für <i>x</i>&nbsp;=&nbsp;1. Somit kann geschrieben werden:
+
*The upper bound is to be derived here as in the lecture "Information Theory" by&nbsp; [[Biographies_and_Bibliographies/Lehrstuhlinhaber_des_LNT#Prof._Dr._sc._techn._Gerhard_Kramer_.28seit_2010.29|Gerhard Kramer]]&nbsp; at the TU Munich:
:$$H(X) \le \frac{1}{{\rm ln}(2)}\cdot {\rm E} \left [\frac{1}{|X| \cdot P_X(X)} -1 \right ] + {\rm log}_2 \hspace{0.1cm}|X| \hspace{0.05cm}.$$
+
[[File:P_ID2755__Inf_A_3_3_B_neu.png|right|frame|Upper bound estimate for the natural logarithm]]
[[File:P_ID2755__Inf_A_3_3_B_neu.png|right|]]
+
:* By extending the above equation by&nbsp; $|X|$&nbsp; in the numerator and denominator, using&nbsp; $\log_2 \hspace{0.05cm}x= \ln(x)/\ln(2)$, we obtain:
:* In Aufgabe A3.2 wurde für den Fall, dass <i>p<sub>&mu;</sub></i> &ne; 0 für alle <i>&mu;</i> gilt, der Erwartungswert E[1/<i>P<sub>X</sub></i>(<i>X</i>)] zu |<i>X</i>| berechnet. Damit verschwindet der erste Term und man erhält das bekannte Ergebnis:
+
::$$H(X) = \frac{1}{{\rm ln}(2)}\cdot {\rm E} \left [{\rm ln} \hspace{0.1cm} \frac{1}{|X| \cdot P_X(X)} \right ] + {\rm log}_2 \hspace{0.1cm}|X| \hspace{0.05cm}.$$
:$$H(X) \le {\rm log}_2 \hspace{0.1cm}|X| \hspace{0.05cm}.$$
+
:* As can be seen from the graph opposite, the estimation&nbsp; $\ln(x) \le x-1$&nbsp; holds with the identity for&nbsp; $x=1$.&nbsp; Thus, it can be written:
<b>Hinweis:</b> Die Aufgabe gehört zu Kapitel 3.1. Es wird auf die binäre Entropiefunktion Bezug genommen:
+
::$$H(X) \le \frac{1}{{\rm ln}(2)}\cdot {\rm E} \left [\frac{1}{|X| \cdot P_X(X)} -1 \right ] + {\rm log}_2 \hspace{0.1cm}|X| \hspace{0.05cm}.$$
:$$H_{{\rm bin}}(p) =  p \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{p} +  
+
:* In &nbsp; [[Aufgaben:3.2_Erwartungswertberechnungen|Exercise 3.2]]&nbsp; the expected value&nbsp; ${\rm E} \big [\log_2 \hspace{0.05cm} {1}/{P_X(X)} \big ] =|X|$&nbsp; was calculated for the case&nbsp; $p_\mu \ne 0$&nbsp; for all&nbsp; $\mu$&nbsp;.&nbsp; Thus, the first term disappears and the known result is obtained:
 +
::$$H(X) \le {\rm log}_2 \hspace{0.1cm}|X| \hspace{0.05cm}.$$
 +
 
 +
 
 +
 
 +
 
 +
 
 +
 
 +
Hints:  
 +
*The exercise belongs to the chapter&nbsp; [[Information_Theory/Einige_Vorbemerkungen_zu_zweidimensionalen_Zufallsgrößen|Some preliminary remarks on 2D random variables]].
 +
*In particular, reference is made to the page&nbsp; [[Information_Theory/Einige_Vorbemerkungen_zu_zweidimensionalen_Zufallsgrößen#Wahrscheinlichkeitsfunktion_und_Entropie|Probability function and entropy]].
 +
*The same constellation is assumed here as in&nbsp; [[Aufgaben:Aufgabe_3.2:_Erwartungswertberechnungen|Exercise 3.2]].
 +
 +
*The equation of the binary entropy function is:
 +
:$$H_{\rm bin}(p) =  p \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{p} +  
 
  (1-p) \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{1-p} \hspace{0.05cm}.$$
 
  (1-p) \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{1-p} \hspace{0.05cm}.$$
  
  
===Fragebogen===
+
===Questions===
  
 
<quiz display=simple>
 
<quiz display=simple>
{Welche Aussagen gelten für die rote Entropiefunktion <i>H</i><sub>R</sub>(<i>p</i>)?
+
{Which statements are true for the red entropy function&nbsp; $H_{\rm R}(p)$?
 
|type="[]"}
 
|type="[]"}
+ <i>H</i><sub>R</sub>(<i>p</i>) ergibt sich z.B. mit <i>p</i><sub>3</sub> = 0, <i>p</i><sub>1</sub> = <i>p</i>, <i>p</i><sub>2</sub> = 1 &ndash; <i>p</i>.
+
+ $H_{\rm R}(p)$&nbsp; results, for example, with &nbsp;$p_1 = p$, &nbsp;$p_2 = 1- p$  &nbsp;and&nbsp; $p_3 = 0$.
+ <i>H</i><sub>R</sub>(<i>p</i>) ist identisch mit der binären Entropiefunktion <i>H</i><sub>bin</sub></i>(<i>p</i>).
+
+ $H_{\rm R}(p)$&nbsp; is identical to the binary entropy function&nbsp; $H_{\rm bin}(p)$.
  
  
{Welche Eigenschaften weist die binäre Entropiefunktion auf?
+
{What are the properties of the binary entropy function&nbsp; $H_{\rm bin}(p)$&nbsp;?
 
|type="[]"}
 
|type="[]"}
+ <i>H</i><sub>bin</sub></i>(<i>p</i>) ist konkav hinsichtlich des Parameters <i>p</i>.
+
+ $H_{\rm bin}(p)$&nbsp; is concave with respect to the parameter&nbsp; $p$.
- Es gilt Max[<i>H</i><sub>bin</sub></i>(<i>p</i>)] = 2 bit.
+
- &nbsp; $\text {Max [H_{\rm bin}(p)] = 2$&nbsp; bit applies.
  
  
{Welche Aussagen gelten für die blaue Entropiefunktion <i>H</i><sub>B</sub>(<i>p</i>)?
+
{Which statements are true for the blue entropy function&nbsp; $H_{\rm B}(p)$?
 
|type="[]"}
 
|type="[]"}
+ <i>H</i><sub>B</sub>(<i>p</i>) ergibt sich z.B. mit <i>p</i><sub>3</sub> = 1/2, <i>p</i><sub>1</sub> = <i>p</i>, <i>p</i><sub>2</sub> = 1/2 &ndash; <i>p</i>.
+
+ $H_{\rm B}(p)$&nbsp; ergibt sich beispielsweise mit &nbsp;$p_1 = p$, &nbsp;$p_2 = 1/2- p$  &nbsp;und&nbsp; $p_3 = 1/2$.
+ Es gilt <i>H</i><sub>B</sub>(<i>p</i> = 0) = 1 bit.
+
+ Es gilt&nbsp; $H_{\rm B}(p = 0)= 1$&nbsp; bit.  
- Es gilt Max[<i>H</i><sub>B</sub>(<i>p</i>)] = log<sub>2</sub> (3) bit.
+
- Es gilt&nbsp; $\text {Max } [H_{\rm B}(p)] = \log_2 \hspace{0.1cm} (3)$&nbsp; bit.
  
  
{Welche Aussagen gelten für die grüne Entropiefunktion <i>H</i><sub>G</sub>(<i>p</i>)?
+
{Which statements are true for the green entropy function&nbsp; $H_{\rm G}(p)$?
 
|type="[]"}
 
|type="[]"}
+ <i>H</i><sub>G</sub>(<i>p</i>) ergibt sich z.B.  mit <i>p</i><sub>3</sub> = 1/3, <i>p</i><sub>1</sub> = <i>p</i>, <i>p</i><sub>2</sub> = 2/3 &ndash; <i>p</i>.
+
+ $H_{\rm G}(p)$&nbsp; results, for example, with &nbsp;$p_1 = p$, &nbsp;$p_2 = 2/3- p$  &nbsp;and&nbsp; $p_3 = 1/3$.
- Es gilt <i>H</i><sub>G</sub>(<i>p</i> = 0) = 1 bit.
+
- &nbsp; $H_{\rm G}(p = 0)= 1$&nbsp; bit is valid.
+ Es gilt Max[<i>H</i><sub>G</sub>(<i>p</i>)] = log<sub>2</sub> (3) bit.
+
+ &nbsp; $\text {Max } [H_{\rm G}(p)] = \log_2 \hspace{0.1cm} (3)$ bit applies.
  
  
Line 60: Line 74:
 
</quiz>
 
</quiz>
  
===Musterlösung===
+
===Solution===
 
{{ML-Kopf}}
 
{{ML-Kopf}}
<b>1.</b>&nbsp;&nbsp;<u>Beide Aussagen sind richtig.</u> Setzt man <i>p</i><sub>3</sub> = 0 und formal <i>p</i><sub>1</sub> = <i>p</i> &nbsp;&#8658;&nbsp; <i>p</i><sub>2</sub> = 1 &ndash; <i>p</i>, so ergibt sich die binäre Entropiefunktion
+
'''(1)'''&nbsp; <u>Both statements are correct:</u>  
 +
*If we set &nbsp;$p_3 = 0$ and formally &nbsp;$p_1 = p$ &nbsp; &#8658; &nbsp; &nbsp;$p_2 = 1- p$, we get the binary entropy function
 
:$$H_{\rm bin}(p) =  p \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{p} +  
 
:$$H_{\rm bin}(p) =  p \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{p} +  
 
  (1-p) \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{1-p} \hspace{0.05cm}.$$
 
  (1-p) \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{1-p} \hspace{0.05cm}.$$
  
<b>2.</b>&nbsp;&nbsp;Man kann die binäre Entropiefunktion wegen log<sub>2</sub>(<i>x</i>) = ln(<i>x</i>)/ln(2) auch in die folgende Form bringen:
 
:$$H_{\rm bin}(p) = \frac{-1}{{\rm ln}(2)} \cdot \left [  p \cdot {\rm ln}(p)  +
 
(1-p) \cdot {\rm ln}(1-p) \right ] \hspace{0.05cm}.$$
 
Die erste Ableitung führt zum Ergebnis
 
:$$\frac {\rm dH_{\rm bin}(p)}{\rm dp} \hspace{0.15cm}  =  \hspace{0.15cm} \frac{-1}{{\rm ln}(2)} \cdot \left [  {\rm ln}(p)  + p \cdot \frac{1}{p} -
 
  {\rm ln}(1-p) - (1-p) \cdot \frac{1}{1-p} \right ] =\\
 
=  \hspace{0.15cm} \frac{1}{{\rm ln}(2)} \cdot \left [ {\rm ln}(1-p) - {\rm ln}(p)  \right ] = {\rm log}_2 \hspace{0.1cm} \frac{1-p}{p} \hspace{0.05cm}.$$
 
Durch Nullsetzen dieser Ableitung erhält man den Abszissenwert <i>p</i> = 0.5, der zum Maximum der Entropiefunktion führt: <i>H</i><sub>bin</sub>(<i>p</i> = 0.5) = 1 bit &nbsp;&#8658;&nbsp; Lösungsvorschlag 2 ist falsch..
 
  
Durch nochmaliges Differenzieren erhält man für die zweite Ableitung:
+
'''(2)'''&nbsp; Only the <u>proposed solution 1</u> is correct:
:$$\frac {\rm d^2H_{\rm bin}(p)}{\rm dp^2} = \frac{1}{{\rm ln}(2)} \cdot \left
+
*One can also put the binary entropy function into the following form because of&nbsp; $\log(x) = \ln(x)/\ln(2)$&nbsp;:
 +
:$$H_{\rm bin}(p) = \frac{-1}{{\rm ln}(2)} \cdot \big [  p \cdot {\rm ln}(p)  +
 +
(1-p) \cdot {\rm ln}(1-p) \big ] \hspace{0.05cm}.$$
 +
*The first derivation leads to the result
 +
:$$\frac {{\rm d}H_{\rm bin}(p)}{{\rm d}p} = \frac{-1}{{\rm ln}(2)} \cdot \big [  {\rm ln}(p)  + p \cdot \frac{1}{p} -
 +
  {\rm ln}(1-p) - (1-p) \cdot \frac{1}{1-p} \big ] =
 +
\frac{1}{{\rm ln}(2)} \cdot \big [ {\rm ln}(1-p) - {\rm ln}(p)  \big ] = {\rm log}_2 \hspace{0.1cm} \frac{1-p}{p} \hspace{0.05cm}.$$
 +
*By setting this derivative to zero, we obtain the abscissa value&nbsp; $p = 0.5$, which leads to the maximum of the entropy function: &nbsp; $H_{\rm bin}(p =0.5) = 1$ bit <br>&#8658; &nbsp; the proposed solution 2 is wrong.
 +
*By differentiating again, one obtains for the second derivative
 +
:$$\frac {{\rm d}^2H_{\rm bin}(p)}{{\rm d}p^2} = \frac{1}{{\rm ln}(2)} \cdot \left
 
  [  \frac{-1}{1-p}  - \frac{1}{p}    \right ] =
 
  [  \frac{-1}{1-p}  - \frac{1}{p}    \right ] =
 
\frac{-1}{{\rm ln}(2) \cdot p \cdot (1-p)}  \hspace{0.05cm}.$$
 
\frac{-1}{{\rm ln}(2) \cdot p \cdot (1-p)}  \hspace{0.05cm}.$$
Diese Funktion ist im gesamten Definitionsgebiet 0 &#8804; <i>p</i> &#8804; 1 negativ &nbsp;&#8658;&nbsp; <i>H</i><sub>bin</sub>(<i>p</i>) ist konkav &nbsp;&#8658;&nbsp; Richtig ist dementsprechend (allein) der <u>Lösungsvorschlag 1</u>.
+
*This function is negative in the entire definition domain&nbsp; $0 &#8804; p &#8804; 1$&nbsp; &nbsp; &#8658; &nbsp; $H_{\rm bin}(p)$ is concave &nbsp; &#8658; &nbsp; the proposed solution 1 is correct.
  
<b>3.</b>&nbsp;&nbsp;Richtig sind hier die <u>Aussagen 1 und 2</u>:
 
  
:* Für <i>p</i> = 0 erhält man die Wahrscheinlichkeitsfunktion <i>P<sub>X</sub></i>(<i>X</i>) = [0, 1/2, 1/2] &nbsp;&#8658;&nbsp; <i>H</i>(<i>X</i>) = 1 bit.
 
  
:* Das Maximum unter der Voraussetzung <i>p</i><sub>3</sub> = 1/2 ergibt sich für <i>p</i><sub>1</sub> = <i>p</i><sub>2</sub> = 1/4:
+
[[File:P_ID2756__Inf_A_3_3_ML.png|right|frame|Three entropy functions with&nbsp; $M = 3$]]
:$$P_X(X) = [\hspace{0.05cm}1/4\hspace{0.05cm}, \hspace{0.05cm} 1/4\hspace{0.05cm},\hspace{0.05cm} 1/2 \hspace{0.05cm}]
+
'''(3)'''&nbsp; <u>Propositions 1 and 2</u> are correct here:
 +
* For&nbsp; $p = 0$&nbsp; one obtains the probability function&nbsp; $P_X(X) = \big  [\hspace{0.05cm}0\hspace{0.05cm}, \hspace{0.15cm} 1/2\hspace{0.05cm},\hspace{0.15cm} 1/2 \hspace{0.05cm} \big ]$ &nbsp; &#8658; &nbsp; $H(X) = 1$&nbsp; bit.
 +
* The maximum under the condition&nbsp; $p_3 = 1/2$&nbsp; is obtained for&nbsp; $p_1 = p_2 = 1/4$:
 +
:$$P_X(X) = \big  [\hspace{0.05cm}1/4\hspace{0.05cm}, \hspace{0.05cm} 1/4\hspace{0.05cm},\hspace{0.05cm} 1/2 \hspace{0.05cm} \big ]
 
\hspace{0.3cm} \Rightarrow \hspace{0.3cm}
 
\hspace{0.3cm} \Rightarrow \hspace{0.3cm}
{\rm Max} [H_{\rm B}(p)] = 1.5\,{\rm bit}
+
{\rm Max} \ [H_{\rm B}(p)] = 1.5 \ \rm bit
 
\hspace{0.05cm}.$$
 
\hspace{0.05cm}.$$
[[File:P_ID2756__Inf_A_3_3_ML.png|right|]]
+
*In compact form,&nbsp; $H_{\rm B}(p)$&nbsp; with the constraint&nbsp; $0 &#8804; p &#8804; 1/2$&nbsp; can be represented as follows:
In kompakter Form lässt sich <i>H</i><sub>B</sub>(<i>p</i>) mit der Einschränkung 0 &#8804; <i>p</i> &#8804; 1/2 wie folgt darstellen:
 
 
:$$H_{\rm B}(p) = 1.0\,{\rm bit} + {1}/{2} \cdot H_{\rm bin}(2p)  
 
:$$H_{\rm B}(p) = 1.0\,{\rm bit} + {1}/{2} \cdot H_{\rm bin}(2p)  
 
\hspace{0.05cm}.$$
 
\hspace{0.05cm}.$$
  
<b>4.</b>&nbsp;&nbsp;Richtig sind die <u> erste und letzte Aussage</u>. Der grüne Kurvenzug beinhaltet mit <i>p</i> = 1/3 auch die Gleichverteilung aller Wahrscheinlichkeiten &#8658; Max[<i>H</i><sub>G</sub>(<i>p</i>)] = log<sub>2</sub> (3) bit. Allgemein lässt sich der gesamte Kurvenverlauf im Bereich 0 &#8804; <i>p</i> &#8804; 2/3 wie folgt ausdrücken:
+
 
 +
'''(4)'''&nbsp; <u>The first and last statements</u> are correct here::
 +
* The green curve with&nbsp; $p = 1/3$&nbsp; also contains the equal distribution of all probabilities 
 +
:$$ {\rm Max} \ [H_{\rm G}(p)] = \log_2 (3)\ \text{bit}.$$
 +
*In general, the entire curve in the range&nbsp; $0 &#8804; p &#8804; 2/3$&nbsp; can be expressed as follows:
 
:$$H_{\rm G}(p) = H_{\rm G}(p= 0) + {2}/{3} \cdot H_{\rm bin}(3p/2)  
 
:$$H_{\rm G}(p) = H_{\rm G}(p= 0) + {2}/{3} \cdot H_{\rm bin}(3p/2)  
 
\hspace{0.05cm}.$$
 
\hspace{0.05cm}.$$
Aus der Grafik auf der Angabenseite erkennt man auch, dass folgende Bedingung erfüllt sein muss:
+
*From the graph on the information page, one can also see that the following condition must be fulfilled:
 
:$$H_{\rm G}(p = 0) + {2}/{3}= {\rm log}_2 \hspace{0.01cm} (3)
 
:$$H_{\rm G}(p = 0) + {2}/{3}= {\rm log}_2 \hspace{0.01cm} (3)
 
\hspace{0.3cm} \Rightarrow \hspace{0.3cm}
 
\hspace{0.3cm} \Rightarrow \hspace{0.3cm}
 
H_{\rm G}(p= 0) = 1.585 - 0.667 = 0.918 \,{\rm bit}
 
H_{\rm G}(p= 0) = 1.585 - 0.667 = 0.918 \,{\rm bit}
 
\hspace{0.05cm}.$$
 
\hspace{0.05cm}.$$
Der Lösungsvorschlag 2 ist hier somit falsch. Zum gleichen Ergebnis gelangt man über die Gleichung
+
*The second suggested solution is therefore wrong.&nbsp; The same result can be obtained with the equation
 
:$$H_{\rm G}(p = 0) = {1}/{3} \cdot  {\rm log}_2 \hspace{0.01cm} (3)
 
:$$H_{\rm G}(p = 0) = {1}/{3} \cdot  {\rm log}_2 \hspace{0.01cm} (3)
 
+{2}/{3} \cdot  {\rm log}_2 \hspace{0.01cm} (3/2) = {\rm log}_2 \hspace{0.01cm} (3) -2/3 \cdot  
 
+{2}/{3} \cdot  {\rm log}_2 \hspace{0.01cm} (3/2) = {\rm log}_2 \hspace{0.01cm} (3) -2/3 \cdot  
 
{\rm log}_2 \hspace{0.01cm} (2)
 
{\rm log}_2 \hspace{0.01cm} (2)
 
\hspace{0.05cm}.$$
 
\hspace{0.05cm}.$$
Die Grafik zeigt nochmals die Ausgangsgrafik, aber nun mit Bemaßungen.
 
 
{{ML-Fuß}}
 
{{ML-Fuß}}
  
  
  
[[Category:Aufgaben zu Informationstheorie|^3.1 Einige Vorbemerkungen zu zweidimensionalen Zufallsgrößen^]]
+
[[Category:Information Theory: Exercises|^3.1 General Information on 2D Random Variables^]]

Latest revision as of 10:13, 24 September 2021

Given entropy functions

On the right you see the entropy functions  $H_{\rm R}(p)$,  $H_{\rm B}(p)$  and  $H_{\rm G}(p)$, where  $\rm R$  stands for "red",  $\rm B$  for "blue" and  $\rm G$  for "green".  The probability functions are for all random variables:

$$P_X(X) = [\hspace{0.05cm}p_1\hspace{0.05cm},\ p_2\hspace{0.05cm},\ p_3\hspace{0.05cm}]\hspace{0.3cm}\hspace{0.3cm} \Rightarrow \hspace{0.3cm} |X| = 3\hspace{0.05cm}.$$

For the questionnaire, the relationship  $p_1 = p$  and  $p_2 = 1 - p_3- p$.

The probability function of a random variable

$$X = \big \{\hspace{0.05cm}x_1\hspace{0.05cm}, \hspace{0.15cm} x_2\hspace{0.05cm},\hspace{0.15cm} \text{...}\hspace{0.1cm} ,\hspace{0.15cm} x_{\mu}\hspace{0.05cm}, \hspace{0.05cm}\text{...}\hspace{0.1cm} , \hspace{0.15cm} x_{M}\hspace{0.05cm}\big \}$$

with the symbolic range  $|X| = M$  is generally:

$$P_X(X) = [\hspace{0.05cm}p_1\hspace{0.05cm}, \hspace{0.15cm} p_2\hspace{0.05cm},\hspace{0.05cm} ...\hspace{0.1cm} ,\hspace{0.15cm} p_{\mu}\hspace{0.05cm}, \hspace{0.05cm}...\hspace{0.1cm} , \hspace{0.15cm} p_{M}\hspace{0.05cm}]\hspace{0.05cm}.$$

The entropy (uncertainty) of this random variable is calculated according to the equation

$$H(X) = {\rm E} \big [\log_2 \hspace{0.05cm} {1}/{P_X(X)} \big ]\hspace{0.05cm},$$

and always lies in the range  $0 \le H(X) \le \log_2 \hspace{0.05cm} |X|$.

  • The lower bound  $H(X) = 0$  results when any probability  $p_\mu = 1$  and all others are zero.
  • The upper bound is to be derived here as in the lecture "Information Theory" by  Gerhard Kramer  at the TU Munich:
Upper bound estimate for the natural logarithm
  • By extending the above equation by  $|X|$  in the numerator and denominator, using  $\log_2 \hspace{0.05cm}x= \ln(x)/\ln(2)$, we obtain:
$$H(X) = \frac{1}{{\rm ln}(2)}\cdot {\rm E} \left [{\rm ln} \hspace{0.1cm} \frac{1}{|X| \cdot P_X(X)} \right ] + {\rm log}_2 \hspace{0.1cm}|X| \hspace{0.05cm}.$$
  • As can be seen from the graph opposite, the estimation  $\ln(x) \le x-1$  holds with the identity for  $x=1$.  Thus, it can be written:
$$H(X) \le \frac{1}{{\rm ln}(2)}\cdot {\rm E} \left [\frac{1}{|X| \cdot P_X(X)} -1 \right ] + {\rm log}_2 \hspace{0.1cm}|X| \hspace{0.05cm}.$$
  • In   Exercise 3.2  the expected value  ${\rm E} \big [\log_2 \hspace{0.05cm} {1}/{P_X(X)} \big ] =|X|$  was calculated for the case  $p_\mu \ne 0$  for all  $\mu$ .  Thus, the first term disappears and the known result is obtained:
$$H(X) \le {\rm log}_2 \hspace{0.1cm}|X| \hspace{0.05cm}.$$




Hints:

  • The equation of the binary entropy function is:
$$H_{\rm bin}(p) = p \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{p} + (1-p) \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{1-p} \hspace{0.05cm}.$$


Questions

1

Which statements are true for the red entropy function  $H_{\rm R}(p)$?

$H_{\rm R}(p)$  results, for example, with  $p_1 = p$,  $p_2 = 1- p$  and  $p_3 = 0$.
$H_{\rm R}(p)$  is identical to the binary entropy function  $H_{\rm bin}(p)$.

2

What are the properties of the binary entropy function  $H_{\rm bin}(p)$ ?

$H_{\rm bin}(p)$  is concave with respect to the parameter  $p$.
  $\text {Max } [H_{\rm bin}(p)] = 2$  bit applies.

3

Which statements are true for the blue entropy function  $H_{\rm B}(p)$?

$H_{\rm B}(p)$  ergibt sich beispielsweise mit  $p_1 = p$,  $p_2 = 1/2- p$  und  $p_3 = 1/2$.
Es gilt  $H_{\rm B}(p = 0)= 1$  bit.
Es gilt  $\text {Max } [H_{\rm B}(p)] = \log_2 \hspace{0.1cm} (3)$  bit.

4

Which statements are true for the green entropy function  $H_{\rm G}(p)$?

$H_{\rm G}(p)$  results, for example, with  $p_1 = p$,  $p_2 = 2/3- p$  and  $p_3 = 1/3$.
  $H_{\rm G}(p = 0)= 1$  bit is valid.
  $\text {Max } [H_{\rm G}(p)] = \log_2 \hspace{0.1cm} (3)$ bit applies.


Solution

(1)  Both statements are correct:

  • If we set  $p_3 = 0$ and formally  $p_1 = p$   ⇒    $p_2 = 1- p$, we get the binary entropy function
$$H_{\rm bin}(p) = p \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{p} + (1-p) \cdot {\rm log}_2 \hspace{0.1cm} \frac{1}{1-p} \hspace{0.05cm}.$$


(2)  Only the proposed solution 1 is correct:

  • One can also put the binary entropy function into the following form because of  $\log(x) = \ln(x)/\ln(2)$ :
$$H_{\rm bin}(p) = \frac{-1}{{\rm ln}(2)} \cdot \big [ p \cdot {\rm ln}(p) + (1-p) \cdot {\rm ln}(1-p) \big ] \hspace{0.05cm}.$$
  • The first derivation leads to the result
$$\frac {{\rm d}H_{\rm bin}(p)}{{\rm d}p} = \frac{-1}{{\rm ln}(2)} \cdot \big [ {\rm ln}(p) + p \cdot \frac{1}{p} - {\rm ln}(1-p) - (1-p) \cdot \frac{1}{1-p} \big ] = \frac{1}{{\rm ln}(2)} \cdot \big [ {\rm ln}(1-p) - {\rm ln}(p) \big ] = {\rm log}_2 \hspace{0.1cm} \frac{1-p}{p} \hspace{0.05cm}.$$
  • By setting this derivative to zero, we obtain the abscissa value  $p = 0.5$, which leads to the maximum of the entropy function:   $H_{\rm bin}(p =0.5) = 1$ bit
    ⇒   the proposed solution 2 is wrong.
  • By differentiating again, one obtains for the second derivative
$$\frac {{\rm d}^2H_{\rm bin}(p)}{{\rm d}p^2} = \frac{1}{{\rm ln}(2)} \cdot \left [ \frac{-1}{1-p} - \frac{1}{p} \right ] = \frac{-1}{{\rm ln}(2) \cdot p \cdot (1-p)} \hspace{0.05cm}.$$
  • This function is negative in the entire definition domain  $0 ≤ p ≤ 1$    ⇒   $H_{\rm bin}(p)$ is concave   ⇒   the proposed solution 1 is correct.


Three entropy functions with  $M = 3$

(3)  Propositions 1 and 2 are correct here:

  • For  $p = 0$  one obtains the probability function  $P_X(X) = \big [\hspace{0.05cm}0\hspace{0.05cm}, \hspace{0.15cm} 1/2\hspace{0.05cm},\hspace{0.15cm} 1/2 \hspace{0.05cm} \big ]$   ⇒   $H(X) = 1$  bit.
  • The maximum under the condition  $p_3 = 1/2$  is obtained for  $p_1 = p_2 = 1/4$:
$$P_X(X) = \big [\hspace{0.05cm}1/4\hspace{0.05cm}, \hspace{0.05cm} 1/4\hspace{0.05cm},\hspace{0.05cm} 1/2 \hspace{0.05cm} \big ] \hspace{0.3cm} \Rightarrow \hspace{0.3cm} {\rm Max} \ [H_{\rm B}(p)] = 1.5 \ \rm bit \hspace{0.05cm}.$$
  • In compact form,  $H_{\rm B}(p)$  with the constraint  $0 ≤ p ≤ 1/2$  can be represented as follows:
$$H_{\rm B}(p) = 1.0\,{\rm bit} + {1}/{2} \cdot H_{\rm bin}(2p) \hspace{0.05cm}.$$


(4)  The first and last statements are correct here::

  • The green curve with  $p = 1/3$  also contains the equal distribution of all probabilities
$$ {\rm Max} \ [H_{\rm G}(p)] = \log_2 (3)\ \text{bit}.$$
  • In general, the entire curve in the range  $0 ≤ p ≤ 2/3$  can be expressed as follows:
$$H_{\rm G}(p) = H_{\rm G}(p= 0) + {2}/{3} \cdot H_{\rm bin}(3p/2) \hspace{0.05cm}.$$
  • From the graph on the information page, one can also see that the following condition must be fulfilled:
$$H_{\rm G}(p = 0) + {2}/{3}= {\rm log}_2 \hspace{0.01cm} (3) \hspace{0.3cm} \Rightarrow \hspace{0.3cm} H_{\rm G}(p= 0) = 1.585 - 0.667 = 0.918 \,{\rm bit} \hspace{0.05cm}.$$
  • The second suggested solution is therefore wrong.  The same result can be obtained with the equation
$$H_{\rm G}(p = 0) = {1}/{3} \cdot {\rm log}_2 \hspace{0.01cm} (3) +{2}/{3} \cdot {\rm log}_2 \hspace{0.01cm} (3/2) = {\rm log}_2 \hspace{0.01cm} (3) -2/3 \cdot {\rm log}_2 \hspace{0.01cm} (2) \hspace{0.05cm}.$$