Difference between revisions of "Aufgaben:Exercise 4.1: PDF, CDF and Probability"

From LNTwww
m (Text replacement - "value-discrete" to "discrete")
 
(6 intermediate revisions by 3 users not shown)
Line 1: Line 1:
  
{{quiz-Header|Buchseite=Informationstheorie/Differentielle Entropie
+
{{quiz-Header|Buchseite=Information_Theory/Differential_Entropy
 
}}
 
}}
  
[[File:P_ID2862__Inf_A_4_1_neu.png|right|frame|CDF (top) and PDF (bottom)]]
+
[[File:P_ID2862__Inf_A_4_1_neu.png|right|frame|$\rm  (CDF)$  (top),  $\rm  (PDF)$  (bottom)]]
 
To repeat some important basics from the book "Theory of Stochastic Signals" we are dealing with
 
To repeat some important basics from the book "Theory of Stochastic Signals" we are dealing with
* the  [[Theory_of_Stochastic_Signals/Wahrscheinlichkeitsdichtefunktion|probability density function]] (PDF),
+
* the  [[Theory_of_Stochastic_Signals/Wahrscheinlichkeitsdichtefunktion|probability density function]] $\rm  (PDF)$,
* the  [[Theory_of_Stochastic_Signals/Verteilungsfunktion|cumulative distribution function]] (CDF).
+
* the  [[Theory_of_Stochastic_Signals/Cumulative_Distribution_Function_(CDF)|cumulative distribution function]] $\rm  (CDF)$.
  
  
The upper plot shows the distribution function  $F_X(x)$  of a discrete value random variable   $X$.  The corresponding PDF  $f_X(x)$  has to be determined in subtask  '''(1)''' .
+
The upper plot shows the cumulative distribution function  $F_X(x)$  of a discrete random variable   $X$.  The corresponding probability density function  $f_X(x)$  has to be determined in subtask  '''(1)'''.
  
 
The equation
 
The equation
 
:$$ {\rm Pr}(A < X \le B) = F_X(B) - F_X(A) =  \lim_{\varepsilon \hspace{0.05cm}\rightarrow \hspace{0.05cm}0} \int_{A+\varepsilon}^{B+\varepsilon} \hspace{-0.15cm}  f_X(x) \hspace{0.1cm}{\rm d}x $$
 
:$$ {\rm Pr}(A < X \le B) = F_X(B) - F_X(A) =  \lim_{\varepsilon \hspace{0.05cm}\rightarrow \hspace{0.05cm}0} \int_{A+\varepsilon}^{B+\varepsilon} \hspace{-0.15cm}  f_X(x) \hspace{0.1cm}{\rm d}x $$
  
represents two ways to calculate the probability for the event&nbsp; „The random variable&nbsp; $X$&nbsp; lies in a given interval”&nbsp; from the CDF and the PDF, respectively.
+
represents two ways to calculate the probability for the event&nbsp; "The random variable&nbsp; $X$&nbsp; lies in a given interval"&nbsp; from the CDF and the PDF,&nbsp; respectively.
  
 
The lower graph shows the probability density function
 
The lower graph shows the probability density function
 
:$$ f_Y(y) = \left\{ \begin{array}{c} \hspace{0.1cm}1/2 \cdot \cos^2(\pi/4 \cdot y) \\ \hspace{0.1cm} 0 \\  \end{array} \right.\quad \begin{array}{*{20}c}  {\rm{f\ddot{u}r}}  \\    {\rm{f\ddot{u}r}}  \\ \end{array}\begin{array}{*{20}l}  | y| \le 2, \\   
 
:$$ f_Y(y) = \left\{ \begin{array}{c} \hspace{0.1cm}1/2 \cdot \cos^2(\pi/4 \cdot y) \\ \hspace{0.1cm} 0 \\  \end{array} \right.\quad \begin{array}{*{20}c}  {\rm{f\ddot{u}r}}  \\    {\rm{f\ddot{u}r}}  \\ \end{array}\begin{array}{*{20}l}  | y| \le 2, \\   
 
y < -2 \hspace{0.1cm}{\rm und}\hspace{0.1cm}y > +2 \\ \end{array}$$
 
y < -2 \hspace{0.1cm}{\rm und}\hspace{0.1cm}y > +2 \\ \end{array}$$
of a continuous-valued random variable&nbsp; $Y$, which is restricted to the range&nbsp; $|Y| \le 2$&nbsp;.
+
of a continuous random variable&nbsp; $Y$,&nbsp; which is restricted to the range&nbsp; $|Y| \le 2$&nbsp;.&nbsp;
 
In principle, the same relationship between PDF, CDF and probabilities exists for the continuous random variable&nbsp; $Y$&nbsp; as for a discrete random variable.&nbsp; Nevertheless, you will notice some differences in details.
 
In principle, the same relationship between PDF, CDF and probabilities exists for the continuous random variable&nbsp; $Y$&nbsp; as for a discrete random variable.&nbsp; Nevertheless, you will notice some differences in details.
  
For example, for the continuous random variable&nbsp; $Y$&nbsp;, the boundary transition can be omitted in the above equation, and we obtain simplified:
+
For example, for the continuous random variable&nbsp; $Y$,&nbsp; the boundary transition can be omitted in the above equation, and we obtain simplified:
 
:$${\rm Pr}(A \le Y \le B) = F_Y(B) - F_Y(A) =\int_{A}^{B} \hspace{-0.01cm}  f_Y(y)
 
:$${\rm Pr}(A \le Y \le B) = F_Y(B) - F_Y(A) =\int_{A}^{B} \hspace{-0.01cm}  f_Y(y)
 
\hspace{0.1cm}{\rm d}y\hspace{0.05cm}.$$
 
\hspace{0.1cm}{\rm d}y\hspace{0.05cm}.$$
Line 35: Line 35:
  
 
Hints:
 
Hints:
*The task belongs to the chapter&nbsp; [[Information_Theory/Differentielle_Entropie|Differential Entropy]].
+
*The exercise belongs to the chapter&nbsp; [[Information_Theory/Differentielle_Entropie|Differential Entropy]].
*Useful hints for solving this problem and further information on continuous-valued random variables can be found in the third chapter "Continuous Random Variables" of the book&nbsp;  [[Theory of Stochastic Signals]].
+
*Useful hints for solving this problem and further information on continuous random variables can be found in the third chapter&nbsp; "Continuous Random Variables"&nbsp; of the book&nbsp;  [[Theory of Stochastic Signals]].
 
   
 
   
 
*Given also is the following indefinite integral:
 
*Given also is the following indefinite integral:
Line 46: Line 46:
  
 
<quiz display=simple>
 
<quiz display=simple>
{Determine the PDF&nbsp; $f_X(x)$&nbsp; of the discrete value random variable&nbsp; $X$.&nbsp; Which of the following statements are true?
+
{Determine the PDF&nbsp; $f_X(x)$&nbsp; of the discrete random variable&nbsp; $X$.&nbsp; Which of the following statements are true?
 
|type="[]"}
 
|type="[]"}
 
+ The PDF is composed of five Dirac functions.
 
+ The PDF is composed of five Dirac functions.
+ &nbsp;${\rm Pr}(X= 0) = 0.4$ &nbsp;and&nbsp; ${\rm Pr}(X= 1) = 0.2$ are true.
+
+ &nbsp;${\rm Pr}(X= 0) = 0.4$&nbsp; &nbsp;and&nbsp; ${\rm Pr}(X= 1) = 0.2$&nbsp; are true.
- &nbsp;${\rm Pr}(X= 2) = 0.4$ is true.
+
- &nbsp;${\rm Pr}(X= 2) = 0.4$&nbsp; is true.
  
  
Line 58: Line 58:
 
${\rm Pr}(|X| ≤ 1) \ =  \ $ { 0.8 3% }
 
${\rm Pr}(|X| ≤ 1) \ =  \ $ { 0.8 3% }
  
{What are the values of the cumulative distribution function&nbsp; $F_Y(y)  ={\rm Pr}(Y \le y)$&nbsp; of the continuous value random variable&nbsp; $Y$,&nbsp; in particular:
+
{What are the values of the cumulative distribution function&nbsp; $F_Y(y)  ={\rm Pr}(Y \le y)$&nbsp; of the continuous random variable&nbsp; $Y$,&nbsp; in particular:
 
|type="{}"}
 
|type="{}"}
 
$F_Y(y = 0) \ =  \ $ { 0.5 3% }
 
$F_Y(y = 0) \ =  \ $ { 0.5 3% }
Line 83: Line 83:
 
[[File:P_ID2857__Inf_A_4_1a_neu.png|right|frame|PDF and CDF of the discrete random variable&nbsp; $X$]]
 
[[File:P_ID2857__Inf_A_4_1a_neu.png|right|frame|PDF and CDF of the discrete random variable&nbsp; $X$]]
 
'''(1)'''&nbsp; <u>Proposed solutions 1 and 2</u> are correct:
 
'''(1)'''&nbsp; <u>Proposed solutions 1 and 2</u> are correct:
*The cumulative distribution function (CDF)&nbsp; $F_X(x)$&nbsp; is obtained from the probability density function&nbsp; $f_X(x)$&nbsp; by integration over the (renamed) random variable in the range from&nbsp; $- \infty$&nbsp; to&nbsp; $x$.  
+
*The cumulative distribution function&nbsp; $F_X(x)$&nbsp; is obtained from the probability density function&nbsp; $f_X(x)$&nbsp; by integration over the (renamed) random variable in the range from&nbsp; $- \infty$&nbsp; to&nbsp; $x$.  
*The inverse is: &nbsp; given the CDF, obtain the PDF by differentiation.
+
*The inverse is: &nbsp; Given the CDF, obtain the PDF by differentiation.
 
*The given CDF contains five discontinuity points, which after differentiation lead to five Dirac functions:
 
*The given CDF contains five discontinuity points, which after differentiation lead to five Dirac functions:
 
:$$f_X(x) =  0.1 \cdot {\rm \delta}( x+2)  
 
:$$f_X(x) =  0.1 \cdot {\rm \delta}( x+2)  
Line 90: Line 90:
 
  +  0.4 \cdot {\rm \delta}( x) + 0.2 \cdot {\rm \delta}( x-1)  
 
  +  0.4 \cdot {\rm \delta}( x) + 0.2 \cdot {\rm \delta}( x-1)  
 
   + 0.1 \cdot {\rm \delta}( x-2)\hspace{0.05cm}.$$
 
   + 0.1 \cdot {\rm \delta}( x-2)\hspace{0.05cm}.$$
*The Dirac weights give the occurrence probabilities of the random variable&nbsp; $X = \{-2,\ -1,\ 0,\ +1,\ +2\}$ an, <br>for example:
+
*The Dirac weights give the occurrence probabilities of the random variable&nbsp; $X = \{-2,\ -1,\ 0,\ +1,\ +2\}$&nbsp;, e.g.:
 
:$${\rm Pr}(X = 0) = F_X(x \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{+}) - F_X(x \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{-}) =
 
:$${\rm Pr}(X = 0) = F_X(x \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{+}) - F_X(x \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{-}) =
 
   0.7 - 0.3 = 0.4\hspace{0.05cm}.$$
 
   0.7 - 0.3 = 0.4\hspace{0.05cm}.$$
Line 105: Line 105:
 
\hspace{0.15cm}\underline {= 0.8}\hspace{0.05cm}.$$
 
\hspace{0.15cm}\underline {= 0.8}\hspace{0.05cm}.$$
  
The same result is obtained using the distribution function.&nbsp; Here the general equation, which is equally valid for discrete-value and continuous-value random variables, is:
+
The same result is obtained using the CDF.&nbsp; Here the general equation, which is equally valid for discrete and continuous random variables, is:
 
:$${\rm Pr}(A < X \le B) =F_X(B) - F_X(A) \hspace{0.05cm}.$$  
 
:$${\rm Pr}(A < X \le B) =F_X(B) - F_X(A) \hspace{0.05cm}.$$  
  
 
* Thus, with&nbsp; $A= 0$&nbsp; and&nbsp; $B = +2$&nbsp; we obtain:
 
* Thus, with&nbsp; $A= 0$&nbsp; and&nbsp; $B = +2$&nbsp; we obtain:
 
:$${\rm Pr}(0 < X \le +2) = {\rm Pr}(X >0)= F_X(+2) - F_X(0) = 1 - 0.7 \hspace{0.15cm}\underline {= 0.3} \hspace{0.05cm}.$$
 
:$${\rm Pr}(0 < X \le +2) = {\rm Pr}(X >0)= F_X(+2) - F_X(0) = 1 - 0.7 \hspace{0.15cm}\underline {= 0.3} \hspace{0.05cm}.$$
*Setting $A=-2$ and $B = +1$, we get:
+
*Setting&nbsp; $A=-2$&nbsp; and&nbsp; $B = +1$,&nbsp; we get:
 
:$${\rm Pr}(-2 < X \le +1) = {\rm Pr}(|X|  \le 1)= F_X(+1) - F_X(-2) = 0.9 - 0.1 \hspace{0.15cm}\underline {= 0.8} \hspace{0.05cm}.$$
 
:$${\rm Pr}(-2 < X \le +1) = {\rm Pr}(|X|  \le 1)= F_X(+1) - F_X(-2) = 0.9 - 0.1 \hspace{0.15cm}\underline {= 0.8} \hspace{0.05cm}.$$
  
Line 116: Line 116:
  
 
[[File:P_ID2858__Inf_A_4_1c_neu.png|right|frame|PDF and CDF of the continuous random variable&nbsp; $Y$]]
 
[[File:P_ID2858__Inf_A_4_1c_neu.png|right|frame|PDF and CDF of the continuous random variable&nbsp; $Y$]]
'''(3)'''&nbsp; The cumulative distribution function&nbsp; $F_Y(y)$&nbsp; is obtained from the (renamed) WDF&nbsp; $f_Y(\eta)$&nbsp; by integrating&nbsp; $- \infty$&nbsp; to&nbsp; $x$.&nbsp; Due to symmetry, this can be written in the range&nbsp; $0 \le y \le +2$&nbsp;:
+
'''(3)'''&nbsp; The cumulative distribution function&nbsp; $F_Y(y)$&nbsp; is obtained from the (renamed) WDF&nbsp; $f_Y(\eta)$&nbsp; by integrating&nbsp; $- \infty$&nbsp; to&nbsp; $x$.&nbsp; Due to symmetry, this can be written in the range&nbsp; $0 \le y \le +2$:
 
:$$F_Y(y) = \int_{-\infty}^{\hspace{0.05cm}y} \hspace{-0.1cm}f_Y(\eta) \hspace{0.1cm}{\rm d}\eta ={1}/{2}+\int_{0}^{\hspace{0.05cm}y} \hspace{-0.1cm}f_Y(\eta) \hspace{0.1cm}{\rm d}\eta$$
 
:$$F_Y(y) = \int_{-\infty}^{\hspace{0.05cm}y} \hspace{-0.1cm}f_Y(\eta) \hspace{0.1cm}{\rm d}\eta ={1}/{2}+\int_{0}^{\hspace{0.05cm}y} \hspace{-0.1cm}f_Y(\eta) \hspace{0.1cm}{\rm d}\eta$$
 
:$$\Rightarrow \hspace{0.3cm}F_Y(y) = \frac{1}{2}+\int_{0}^{\hspace{0.05cm}y} \hspace{0.1cm}\frac{1}{2} \cdot \cos^2({\pi}/{4} \cdot \eta) \hspace{0.1cm}{\rm d}\eta = \frac{1}{2}+\frac{y}{4} + \frac{1}{2\pi} \cdot \sin({\pi}/{2} \cdot y).$$
 
:$$\Rightarrow \hspace{0.3cm}F_Y(y) = \frac{1}{2}+\int_{0}^{\hspace{0.05cm}y} \hspace{0.1cm}\frac{1}{2} \cdot \cos^2({\pi}/{4} \cdot \eta) \hspace{0.1cm}{\rm d}\eta = \frac{1}{2}+\frac{y}{4} + \frac{1}{2\pi} \cdot \sin({\pi}/{2} \cdot y).$$
Line 126: Line 126:
  
  
'''(4)'''&nbsp; The probability that the continuous-value random variable&nbsp; $Y$&nbsp; lies in the range from&nbsp; $-\varepsilon$&nbsp; to&nbsp; $+\varepsilon$&nbsp; can be calculated using the given equation as follows:
+
'''(4)'''&nbsp; The probability that the continuous random variable&nbsp; $Y$&nbsp; lies in the range from&nbsp; $-\varepsilon$&nbsp; to&nbsp; $+\varepsilon$&nbsp; can be calculated using the given equation as follows:
 
:$${\rm Pr}(-\varepsilon \le Y \le +\varepsilon) = F_Y(+\varepsilon) - F_Y(-\varepsilon) \hspace{0.05cm}.$$
 
:$${\rm Pr}(-\varepsilon \le Y \le +\varepsilon) = F_Y(+\varepsilon) - F_Y(-\varepsilon) \hspace{0.05cm}.$$
  
*It was taken into account that for the continuous random variable&nbsp; $Y$&nbsp; the "<"sign can be replaced by the "&#8804;" sign without distortion.
+
*It was taken into account that for the random variable&nbsp; $Y$&nbsp; the "<"sign can be replaced by the "&#8804;" sign without distortion.
*With the boundary transition&nbsp; $\varepsilon \to 0$&nbsp;, the probability we are looking for is obtained:
+
*With the boundary transition&nbsp; $\varepsilon \to 0$,&nbsp; the probability we are looking for is obtained:
 
:$${\rm Pr}(Y = 0)  =\lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm}{\rm Pr}(-\varepsilon \le Y \le +\varepsilon) =  
 
:$${\rm Pr}(Y = 0)  =\lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm}{\rm Pr}(-\varepsilon \le Y \le +\varepsilon) =  
 
\lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm} F_Y(+\varepsilon) - \lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm} F_Y(-\varepsilon) =
 
\lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm} F_Y(+\varepsilon) - \lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm} F_Y(-\varepsilon) =
Line 138: Line 138:
  
  
'''In general''': &nbsp; The probability&nbsp; ${\rm Pr}(Y = y_0)$ that a continuous value random variable&nbsp; $Y$&nbsp; takes a fixed value&nbsp; $y_0$&nbsp; is always zero.
+
'''In general''': &nbsp; The probability&nbsp; ${\rm Pr}(Y = y_0)$&nbsp; that a continuous random variable&nbsp; $Y$&nbsp; takes a fixed value&nbsp; $y_0$,&nbsp; is always zero.
  
  
Line 145: Line 145:
 
*Based on the PDF at hand, the result&nbsp; $Y=3$&nbsp; can be excluded.
 
*Based on the PDF at hand, the result&nbsp; $Y=3$&nbsp; can be excluded.
 
*The result&nbsp; $Y=0$&nbsp; on the other hand is quite possible, although&nbsp; ${\rm Pr}(Y = 0) = 0$&nbsp;.
 
*The result&nbsp; $Y=0$&nbsp; on the other hand is quite possible, although&nbsp; ${\rm Pr}(Y = 0) = 0$&nbsp;.
*For example, if one performs a random experiment&nbsp; $N \to \infty$&nbsp; times and obtains the result&nbsp; $Y= 0$ &nbsp; $N_0$&nbsp; times, then with finite &nbsp; $N_0$&nbsp; according to the classical definition of probability:
+
*For example, if one performs a random experiment&nbsp; $N \to \infty$&nbsp; times and obtains the result&nbsp; $Y= 0$ &nbsp; for&nbsp; $N_0$&nbsp; times, then with finite &nbsp; $N_0$&nbsp; according to the classical definition of probability:
 
:$${\rm Pr}(Y = 0) = \lim_{N\hspace{0.05cm}\rightarrow\hspace{0.05cm}\infty}\hspace{0.1cm}{N_0}/{N} = 0\hspace{0.05cm}.$$
 
:$${\rm Pr}(Y = 0) = \lim_{N\hspace{0.05cm}\rightarrow\hspace{0.05cm}\infty}\hspace{0.1cm}{N_0}/{N} = 0\hspace{0.05cm}.$$
  
  
 
+
'''(6)'''&nbsp; We again assume the equation&nbsp; $ {\rm Pr}(A \le Y \le B) = F_Y(B) - F_Y(A)$&nbsp; &nbsp; valid for the continuous random quantity &nbsp;$Y$:
'''(6)'''&nbsp; We again assume the equation&nbsp; $Y$&nbsp; valid for the continuous random quantity &nbsp; $ {\rm Pr}(A \le Y \le B) = F_Y(B) - F_Y(A)$&nbsp;:
+
*With&nbsp; $A = 0$&nbsp; and&nbsp; $B \to \infty$&nbsp; $($or&nbsp; $B = 2)$&nbsp; we obtain:
*With&nbsp; $A = 0$&nbsp; and&nbsp; $B \to \infty$&nbsp; $($bzw.&nbsp; $B = 2)$&nbsp; we obtain:
 
 
:$${\rm Pr}( Y > 0) = {\rm Pr}(0 \le Y \le \infty)  
 
:$${\rm Pr}( Y > 0) = {\rm Pr}(0 \le Y \le \infty)  
 
= {\rm Pr}(0 \le Y \le 2) = F_Y(2) - F_Y(0)  
 
= {\rm Pr}(0 \le Y \le 2) = F_Y(2) - F_Y(0)  
 
\hspace{0.15cm}\underline {= 0.5}\hspace{0.05cm}.$$
 
\hspace{0.15cm}\underline {= 0.5}\hspace{0.05cm}.$$
*Thus, for the symmetric continuous random variable&nbsp; $Y$&nbsp;  is indeed as expected &nbsp;${\rm Pr}( Y > 0) = 1/2$.  
+
*Thus, for the symmetric continuous random variable&nbsp; $Y$&nbsp;  holds indeed as expected: &nbsp;${\rm Pr}( Y > 0) = 1/2$.  
*Although the discrete value random variable&nbsp; $X$&nbsp; is also symmetric about&nbsp;$x= 0$&nbsp;,&nbsp;${\rm Pr}( X > 0)  = 0.3$&nbsp; was determined in subtask&nbsp; '''(3)'''&nbsp;, on the other hand.  
+
*Although the discrete random variable&nbsp; $X$&nbsp; is also symmetrical about&nbsp;$x= 0$ &nbsp; &rArr; &nbsp; ${\rm Pr}( X > 0)  = 0.3$&nbsp; was determined in subtask&nbsp; '''(3)''', on the other hand.  
*Further, with &nbsp;$A = -1$&nbsp; and &nbsp;$B = +1$&nbsp;, one obtains because of&nbsp;$F_Y(-1) = 1- F_Y(+1)$:
+
*Further, with &nbsp;$A = -1$&nbsp; and &nbsp;$B = +1$,&nbsp; one obtains because of&nbsp;$F_Y(-1) = 1- F_Y(+1)$:
 
:$${\rm Pr}( |Y| \le 1)  =  {\rm Pr}(-1 \le Y \le +1)  
 
:$${\rm Pr}( |Y| \le 1)  =  {\rm Pr}(-1 \le Y \le +1)  
 
=  F_Y(+1) - F_Y(-1)  =  2 \cdot F_Y(+1) -1 = 2 \cdot 0.909 -1 \hspace{0.15cm}\underline {= 0.818}. $$
 
=  F_Y(+1) - F_Y(-1)  =  2 \cdot F_Y(+1) -1 = 2 \cdot 0.909 -1 \hspace{0.15cm}\underline {= 0.818}. $$

Latest revision as of 09:27, 11 October 2021

CDF $F_X(x)$ (top) and PDF $f_Y(y)$ (bottom)

To review some important basics from the book "Theory of Stochastic Signals", this exercise deals with

  • the probability density function (PDF),
  • the cumulative distribution function (CDF).

The upper plot shows the cumulative distribution function  $F_X(x)$  of a discrete random variable  $X$.  The corresponding probability density function  $f_X(x)$  has to be determined in subtask  (1).

The equation

$$ {\rm Pr}(A < X \le B) = F_X(B) - F_X(A) = \lim_{\varepsilon \hspace{0.05cm}\rightarrow \hspace{0.05cm}0} \int_{A+\varepsilon}^{B+\varepsilon} \hspace{-0.15cm} f_X(x) \hspace{0.1cm}{\rm d}x $$

represents two ways to calculate the probability for the event  "The random variable  $X$  lies in a given interval"  from the CDF and the PDF,  respectively.

The lower graph shows the probability density function

$$ f_Y(y) = \left\{ \begin{array}{c} \hspace{0.1cm}1/2 \cdot \cos^2(\pi/4 \cdot y) \\ \hspace{0.1cm} 0 \\ \end{array} \right.\quad \begin{array}{*{20}c} {\rm for} \\ {\rm for} \\ \end{array}\begin{array}{*{20}l} | y| \le 2, \\ y < -2 \hspace{0.1cm}{\rm or}\hspace{0.1cm}y > +2 \\ \end{array}$$

of a continuous random variable $Y$, which is restricted to the range $|Y| \le 2$. In principle, the same relationships between PDF, CDF and probabilities hold for the continuous random variable $Y$ as for a discrete random variable. Nevertheless, some details differ.

For example, for the continuous random variable $Y$ the limit $\varepsilon \to 0$ can be omitted from the above equation, which simplifies to:

$${\rm Pr}(A \le Y \le B) = F_Y(B) - F_Y(A) =\int_{A}^{B} \hspace{-0.01cm} f_Y(y) \hspace{0.1cm}{\rm d}y\hspace{0.05cm}.$$





Hints:

  • The exercise belongs to the chapter  Differential Entropy.
  • Useful hints for solving this problem and further information on continuous random variables can be found in the third chapter  "Continuous Random Variables"  of the book  Theory of Stochastic Signals.
  • The following indefinite integral is also given:
$$\int \hspace{0.1cm} \cos^2(A \eta) \hspace{0.1cm}{\rm d}\eta = \frac{\eta}{2} + \frac{1}{4A} \cdot \sin(2A \eta).$$


Questions

1

Determine the PDF  $f_X(x)$  of the discrete random variable  $X$.  Which of the following statements are true?

The PDF is composed of five Dirac functions.
 ${\rm Pr}(X= 0) = 0.4$   and  ${\rm Pr}(X= 1) = 0.2$  are true.
 ${\rm Pr}(X= 2) = 0.4$  is true.

2

Calculate the following probabilities:

${\rm Pr}(X > 0) \ = \ $

${\rm Pr}(|X| ≤ 1) \ = \ $

3

What are the values of the cumulative distribution function  $F_Y(y) ={\rm Pr}(Y \le y)$  of the continuous random variable  $Y$,  in particular:

$F_Y(y = 0) \ = \ $

$F_Y(y = 1) \ = \ $

$F_Y(y = 2) \ = \ $

4

What is the probability that  $Y = 0$ ?

${\rm Pr}(Y = 0) \ = \ $

5

Which of the following statements are correct?

The result  $Y = 0$  is impossible.
The result  $Y = 3$  is impossible.

6

What are the following probabilities?

${\rm Pr}(Y > 0) \ = \ $

${\rm Pr}(|Y| ≤ 1) \ = \ $


Solution

PDF and CDF of the discrete random variable  $X$

(1)  Proposed solutions 1 and 2 are correct:

  • The cumulative distribution function  $F_X(x)$  is obtained from the probability density function  $f_X(x)$  by integration over the (renamed) random variable in the range from  $- \infty$  to  $x$.
  • Conversely, the PDF is obtained from the CDF by differentiation.
  • The given CDF has five discontinuities (jumps), which lead to five Dirac functions after differentiation:
$$f_X(x) = 0.1 \cdot {\rm \delta}( x+2) + 0.2 \cdot {\rm \delta}( x+1) + 0.4 \cdot {\rm \delta}( x) + 0.2 \cdot {\rm \delta}( x-1) + 0.1 \cdot {\rm \delta}( x-2)\hspace{0.05cm}.$$
  • The Dirac weights give the probabilities of the possible values $X \in \{-2,\ -1,\ 0,\ +1,\ +2\}$, e.g.:
$${\rm Pr}(X = 0) = F_X(x \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{+}) - F_X(x \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{-}) = 0.7 - 0.3 = 0.4\hspace{0.05cm}.$$
  • Accordingly, the other probabilities are:
$${\rm Pr}(X = +1) = {\rm Pr}(X = -1) = 0.2\hspace{0.05cm},\hspace{0.3cm} {\rm Pr}(X = +2) = {\rm Pr}(X = -2) = 0.1\hspace{0.05cm}.$$
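
The differentiation step can be checked numerically. The following is only a minimal sketch: it assumes the CDF step values used in this solution ($0.1$, $0.3$, $0.7$, $0.9$, $1$) and recovers the Dirac weights as jump heights. The names `cdf_after_jump` and `pmf_from_cdf` are illustrative and not part of the exercise.

```python
# CDF values of X just to the right of each jump, as used in the solution.
# (Illustrative names; not part of the original exercise.)
cdf_after_jump = {-2: 0.1, -1: 0.3, 0: 0.7, 1: 0.9, 2: 1.0}

def pmf_from_cdf(cdf):
    """Return the Dirac weights Pr(X = x) as the jump heights of the CDF."""
    pmf = {}
    previous = 0.0  # F_X(x) = 0 to the left of the first jump
    for x in sorted(cdf):
        # Round to suppress floating-point noise in the differences.
        pmf[x] = round(cdf[x] - previous, 10)
        previous = cdf[x]
    return pmf

pmf = pmf_from_cdf(cdf_after_jump)
print(pmf)
```

Each dictionary value corresponds to one Dirac weight in $f_X(x)$; the weights sum to one.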


(2)  From the PDF just calculated, we obtain:

$${\rm Pr}(X >0) = {\rm Pr}(X = +1) + {\rm Pr}(X = +2) \hspace{0.15cm}\underline {= 0.3}\hspace{0.05cm},$$
$${\rm Pr}(|X| \le 1) ={\rm Pr}(X = -1) + {\rm Pr}(X = 0) + {\rm Pr}(X = +1) = 0.2 + 0.4 +0.2 \hspace{0.15cm}\underline {= 0.8}\hspace{0.05cm}.$$

The same result is obtained using the CDF.  Here the general equation, which is equally valid for discrete and continuous random variables, is:

$${\rm Pr}(A < X \le B) =F_X(B) - F_X(A) \hspace{0.05cm}.$$
  • Thus, with  $A= 0$  and  $B = +2$  we obtain:
$${\rm Pr}(0 < X \le +2) = {\rm Pr}(X >0)= F_X(+2) - F_X(0) = 1 - 0.7 \hspace{0.15cm}\underline {= 0.3} \hspace{0.05cm}.$$
  • Setting  $A=-2$  and  $B = +1$,  we get:
$${\rm Pr}(-2 < X \le +1) = {\rm Pr}(|X| \le 1)= F_X(+1) - F_X(-2) = 0.9 - 0.1 \hspace{0.15cm}\underline {= 0.8} \hspace{0.05cm}.$$
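
Both calculation paths of subtask (2) can be compared in a few lines. This is a sketch under the assumption that the Dirac weights from subtask (1) are given; the helper names `cdf` and `prob_interval` are hypothetical.

```python
# Dirac weights Pr(X = x) from subtask (1).
pmf = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}

def cdf(x):
    """F_X(x) = Pr(X <= x): sum of all Dirac weights at or below x."""
    return sum(p for value, p in pmf.items() if value <= x)

def prob_interval(a, b):
    """Pr(a < X <= b) = F_X(b) - F_X(a)."""
    return cdf(b) - cdf(a)

# Path 1: add up the Dirac weights directly.
pr_pos_pmf = sum(p for value, p in pmf.items() if value > 0)
pr_abs_pmf = sum(p for value, p in pmf.items() if abs(value) <= 1)

# Path 2: use CDF differences with the half-open interval (A, B].
pr_pos_cdf = prob_interval(0, 2)
pr_abs_cdf = prob_interval(-2, 1)

print(round(pr_pos_pmf, 3), round(pr_pos_cdf, 3))
print(round(pr_abs_pmf, 3), round(pr_abs_cdf, 3))
```

Note that the lower limit $A$ is excluded from the interval, which is why $A = -2$ (and not $A = -1$) is needed for ${\rm Pr}(|X| \le 1)$.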


PDF and CDF of the continuous random variable  $Y$

(3)  The cumulative distribution function $F_Y(y)$ is obtained from the PDF $f_Y(\eta)$ (with renamed integration variable) by integrating from $- \infty$ to $y$. Due to symmetry, this can be written in the range $0 \le y \le +2$ as:

$$F_Y(y) = \int_{-\infty}^{\hspace{0.05cm}y} \hspace{-0.1cm}f_Y(\eta) \hspace{0.1cm}{\rm d}\eta ={1}/{2}+\int_{0}^{\hspace{0.05cm}y} \hspace{-0.1cm}f_Y(\eta) \hspace{0.1cm}{\rm d}\eta$$
$$\Rightarrow \hspace{0.3cm}F_Y(y) = \frac{1}{2}+\int_{0}^{\hspace{0.05cm}y} \hspace{0.1cm}\frac{1}{2} \cdot \cos^2({\pi}/{4} \cdot \eta) \hspace{0.1cm}{\rm d}\eta = \frac{1}{2}+\frac{y}{4} + \frac{1}{2\pi} \cdot \sin({\pi}/{2} \cdot y).$$

The equation holds in the entire range  $0 \le y \le +2$.  The CDF values we are looking for are thus:

  • $F_Y(y=0)\hspace{0.15cm}\underline{= 0.5}$  (integral over half the PDF),
  • $F_Y(y=1)= 3/4 + 1/(2 \pi)\hspace{0.15cm}\underline{= 0.909}$  (area highlighted in red in the PDF plot),
  • $F_Y(y=2)\hspace{0.15cm}\underline{= 1}$  (integral over the entire PDF).
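
As a plausibility check, the closed-form CDF can be compared with a numerical integration of the PDF. The sketch below uses only the Python standard library; the midpoint rule and the step count are arbitrary choices made for illustration, and the function names are not part of the exercise.

```python
import math

def pdf_y(y):
    """PDF f_Y(y) = 1/2 * cos^2(pi/4 * y) for |y| <= 2, zero elsewhere."""
    return 0.5 * math.cos(math.pi / 4 * y) ** 2 if abs(y) <= 2 else 0.0

def cdf_y_closed(y):
    """Closed-form CDF from the solution, used here for 0 <= y <= 2."""
    return 0.5 + y / 4 + math.sin(math.pi / 2 * y) / (2 * math.pi)

def cdf_y_numeric(y, steps=20000):
    """Midpoint-rule approximation of the integral of the PDF from -2 to y."""
    width = (y + 2) / steps
    return sum(pdf_y(-2 + (k + 0.5) * width) for k in range(steps)) * width

for y in (0, 1, 2):
    print(y, round(cdf_y_closed(y), 3), round(cdf_y_numeric(y), 3))
```

The two columns should agree to within the accuracy of the numerical integration.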


(4)  The probability that the continuous random variable  $Y$  lies in the range from  $-\varepsilon$  to  $+\varepsilon$  can be calculated using the given equation as follows:

$${\rm Pr}(-\varepsilon \le Y \le +\varepsilon) = F_Y(+\varepsilon) - F_Y(-\varepsilon) \hspace{0.05cm}.$$
  • Here it was taken into account that for the random variable $Y$ the "<" sign can be replaced by the "≤" sign without changing the result.
  • Taking the limit $\varepsilon \to 0$ yields the probability we are looking for:
$${\rm Pr}(Y = 0) =\lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm}{\rm Pr}(-\varepsilon \le Y \le +\varepsilon) = \lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm} F_Y(+\varepsilon) - \lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm} F_Y(-\varepsilon) = F_Y(y \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{+}) - F_Y(y \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{-})\hspace{0.05cm}.$$
  • Since for a continuous random variable the two limits are equal, $\underline{{\rm Pr}(Y = 0) = 0}$.


In general:   The probability ${\rm Pr}(Y = y_0)$ that a continuous random variable $Y$ takes a fixed value $y_0$ is always zero.
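
The limit can be illustrated by evaluating ${\rm Pr}(-\varepsilon \le Y \le +\varepsilon)$ for shrinking $\varepsilon$. The sketch below assumes the closed-form CDF from subtask (3) and extends it to negative arguments via the symmetry $F_Y(-y) = 1 - F_Y(y)$; the function names are illustrative.

```python
import math

def cdf_y(y):
    """CDF of Y: closed form for 0 <= y <= 2, symmetry for y < 0."""
    if y < 0:
        return 1.0 - cdf_y(-y)  # F_Y(-y) = 1 - F_Y(y) for the symmetric PDF
    y = min(y, 2.0)             # F_Y(y) = 1 for y >= 2
    return 0.5 + y / 4 + math.sin(math.pi / 2 * y) / (2 * math.pi)

def prob_around_zero(eps):
    """Pr(-eps <= Y <= +eps) = F_Y(+eps) - F_Y(-eps)."""
    return cdf_y(eps) - cdf_y(-eps)

for eps in (1.0, 0.1, 0.01, 0.001):
    print(eps, prob_around_zero(eps))
```

For small $\varepsilon$ the probability is approximately $2\varepsilon \cdot f_Y(0) = \varepsilon$, so it vanishes as $\varepsilon \to 0$.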


(5)  Proposed solution 2 is correct:

  • Based on the PDF at hand, the result  $Y=3$  can be excluded.
  • The result $Y=0$, on the other hand, is quite possible, although ${\rm Pr}(Y = 0) = 0$.
  • For example, if a random experiment is performed $N \to \infty$ times and the result $Y= 0$ occurs $N_0$ times, then for finite $N_0$ the classical definition of probability gives:
$${\rm Pr}(Y = 0) = \lim_{N\hspace{0.05cm}\rightarrow\hspace{0.05cm}\infty}\hspace{0.1cm}{N_0}/{N} = 0\hspace{0.05cm}.$$


(6)  We again start from the equation $ {\rm Pr}(A \le Y \le B) = F_Y(B) - F_Y(A)$, which is valid for the continuous random variable $Y$:

  • With  $A = 0$  and  $B \to \infty$  $($or  $B = 2)$  we obtain:
$${\rm Pr}( Y > 0) = {\rm Pr}(0 \le Y \le \infty) = {\rm Pr}(0 \le Y \le 2) = F_Y(2) - F_Y(0) \hspace{0.15cm}\underline {= 0.5}\hspace{0.05cm}.$$
  • Thus, as expected, ${\rm Pr}( Y > 0) = 1/2$ holds for the symmetric continuous random variable $Y$.
  • In contrast, although the discrete random variable $X$ is also symmetric about $x= 0$, the result in subtask (2) was ${\rm Pr}( X > 0) = 0.3$, because ${\rm Pr}( X = 0) = 0.4$ is not zero.
  • Further, with  $A = -1$  and  $B = +1$,  one obtains because of $F_Y(-1) = 1- F_Y(+1)$:
$${\rm Pr}( |Y| \le 1) = {\rm Pr}(-1 \le Y \le +1) = F_Y(+1) - F_Y(-1) = 2 \cdot F_Y(+1) -1 = 2 \cdot 0.909 -1 \hspace{0.15cm}\underline {= 0.818}. $$
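
These two results can also be cross-checked by simulation. The following sketch draws samples of $Y$ by rejection sampling, a technique chosen here purely for illustration and not used in the exercise itself; the seed, the sample count and all names are arbitrary assumptions.

```python
import math
import random

def pdf_y(y):
    """PDF f_Y(y) = 1/2 * cos^2(pi/4 * y) for |y| <= 2, zero elsewhere."""
    return 0.5 * math.cos(math.pi / 4 * y) ** 2 if abs(y) <= 2 else 0.0

def sample_y(rng):
    """Draw one sample of Y by rejection sampling, using the bound f_Y <= 1/2."""
    while True:
        y = rng.uniform(-2.0, 2.0)             # uniform proposal on the support
        if rng.uniform(0.0, 0.5) <= pdf_y(y):  # accept with probability f_Y(y) / 0.5
            return y

rng = random.Random(1)  # fixed seed (arbitrary) for reproducibility
n = 200_000
samples = [sample_y(rng) for _ in range(n)]

pr_positive = sum(y > 0 for y in samples) / n
pr_abs_le_1 = sum(abs(y) <= 1 for y in samples) / n
print(round(pr_positive, 2), round(pr_abs_le_1, 2))
```

With this many samples the relative frequencies should lie close to the analytical values $0.5$ and $0.818$, up to statistical fluctuation.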