Difference between revisions of "Aufgaben:Exercise 4.1: PDF, CDF and Probability"

From LNTwww
Latest revision as of 09:27, 11 October 2021

$\rm (CDF)$  (top),  $\rm (PDF)$  (bottom)

To review some important basics from the book "Theory of Stochastic Signals", this exercise deals with

  • the probability density function (PDF),
  • the cumulative distribution function (CDF).


The upper plot shows the cumulative distribution function  $F_X(x)$  of a discrete random variable  $X$.  The corresponding probability density function  $f_X(x)$  has to be determined in subtask  (1).

The equation

$$ {\rm Pr}(A < X \le B) = F_X(B) - F_X(A) = \lim_{\varepsilon \hspace{0.05cm}\rightarrow \hspace{0.05cm}0} \int_{A+\varepsilon}^{B+\varepsilon} \hspace{-0.15cm} f_X(x) \hspace{0.1cm}{\rm d}x $$

represents two ways to calculate the probability for the event  "The random variable  $X$  lies in a given interval"  from the CDF and the PDF,  respectively.

The lower graph shows the probability density function

$$ f_Y(y) = \left\{ \begin{array}{c} \hspace{0.1cm}1/2 \cdot \cos^2(\pi/4 \cdot y) \\ \hspace{0.1cm} 0 \\ \end{array} \right.\quad \begin{array}{*{20}c} {\rm for} \\ {\rm for} \\ \end{array}\begin{array}{*{20}l} | y| \le 2, \\ y < -2 \hspace{0.1cm}{\rm or}\hspace{0.1cm}y > +2 \\ \end{array}$$

of a continuous random variable  $Y$,  which is restricted to the range  $|Y| \le 2$.  In principle, the same relationship between PDF, CDF and probabilities holds for the continuous random variable  $Y$  as for a discrete random variable.  Nevertheless, you will notice some differences in detail.

For example, for the continuous random variable  $Y$,  the limit in the above equation can be omitted, and we obtain the simplified form:

$${\rm Pr}(A \le Y \le B) = F_Y(B) - F_Y(A) =\int_{A}^{B} \hspace{-0.01cm} f_Y(y) \hspace{0.1cm}{\rm d}y\hspace{0.05cm}.$$
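The simplified equation can be checked numerically. The following is a minimal sketch, not part of the original exercise: it integrates the given PDF with the midpoint rule, where the helper name `prob_interval`, the step count and the example intervals are assumptions chosen only for illustration.

```python
import math

def f_Y(y):
    """PDF of Y as given in the exercise: 1/2 * cos^2(pi/4 * y) for |y| <= 2, else 0."""
    if abs(y) > 2:
        return 0.0
    return 0.5 * math.cos(math.pi / 4 * y) ** 2

def prob_interval(a, b, steps=20_000):
    """Approximate Pr(a <= Y <= b) by integrating f_Y with the midpoint rule.

    Hypothetical helper; the step count is an arbitrary choice.
    """
    h = (b - a) / steps
    return h * sum(f_Y(a + (k + 0.5) * h) for k in range(steps))

total = prob_interval(-2.0, 2.0)   # normalization: should be close to 1
left = prob_interval(-1.5, -0.5)   # an arbitrary interval left of zero
right = prob_interval(0.5, 1.5)    # its mirror image; f_Y is an even function
print(f"total = {total:.6f}, left = {left:.6f}, right = {right:.6f}")
```

Because the integrand has no Dirac components, it makes no difference here whether the interval boundaries are included or not.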





Hints:

  • The exercise belongs to the chapter  Differential Entropy.
  • Useful hints for solving this problem and further information on continuous random variables can be found in the third chapter  "Continuous Random Variables"  of the book  Theory of Stochastic Signals.
  • The following indefinite integral is also given:
$$\int \hspace{0.1cm} \cos^2(A \eta) \hspace{0.1cm}{\rm d}\eta = \frac{\eta}{2} + \frac{1}{4A} \cdot \sin(2A \eta).$$
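The given indefinite integral can be sanity-checked by differentiating it numerically. This is only a sketch under assumptions: the function names, the sample points and the step width `h` are illustrative choices, and the constant $A = \pi/4$ is the one that appears in the PDF $f_Y(y)$.

```python
import math

def antiderivative(eta, A):
    """Right-hand side of the given integral: eta/2 + sin(2*A*eta) / (4*A)."""
    return eta / 2 + math.sin(2 * A * eta) / (4 * A)

def central_difference(func, x, h=1e-5):
    """Numerical derivative of func at x (hypothetical helper)."""
    return (func(x + h) - func(x - h)) / (2 * h)

A = math.pi / 4  # value used in f_Y(y) of this exercise
max_error = 0.0
for eta in (-1.5, 0.0, 0.7, 2.0):
    slope = central_difference(lambda t: antiderivative(t, A), eta)
    max_error = max(max_error, abs(slope - math.cos(A * eta) ** 2))

print(f"largest deviation from cos^2(A*eta): {max_error:.2e}")
```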


Questions

1

Determine the PDF  $f_X(x)$  of the discrete random variable  $X$.  Which of the following statements are true?

The PDF is composed of five Dirac functions.
 ${\rm Pr}(X= 0) = 0.4$   and  ${\rm Pr}(X= 1) = 0.2$  are true.
 ${\rm Pr}(X= 2) = 0.4$  is true.

2

Calculate the following probabilities:

${\rm Pr}(X > 0) \ = \ $

${\rm Pr}(|X| ≤ 1) \ = \ $

3

What are the values of the cumulative distribution function  $F_Y(y) ={\rm Pr}(Y \le y)$  of the continuous random variable  $Y$,  in particular:

$F_Y(y = 0) \ = \ $

$F_Y(y = 1) \ = \ $

$F_Y(y = 2) \ = \ $

4

What is the probability that  $Y = 0$ ?

${\rm Pr}(Y = 0) \ = \ $

5

Which of the following statements are correct?

The result  $Y = 0$  is impossible.
The result  $Y = 3$  is impossible.

6

What are the following probabilities?

${\rm Pr}(Y > 0) \ = \ $

${\rm Pr}(|Y| ≤ 1) \ = \ $


Solution

PDF and CDF of the discrete random variable  $X$

(1)  Proposed solutions 1 and 2 are correct:

  • The cumulative distribution function  $F_X(x)$  is obtained from the probability density function  $f_X(x)$  by integration over the (renamed) random variable in the range from  $- \infty$  to  $x$.
  • Conversely, the PDF is obtained from the CDF by differentiation.
  • The given CDF contains five discontinuity points, which after differentiation lead to five Dirac functions:
$$f_X(x) = 0.1 \cdot {\rm \delta}( x+2) + 0.2 \cdot {\rm \delta}( x+1) + 0.4 \cdot {\rm \delta}( x) + 0.2 \cdot {\rm \delta}( x-1) + 0.1 \cdot {\rm \delta}( x-2)\hspace{0.05cm}.$$
  • The Dirac weights give the occurrence probabilities of the possible values  $X \in \{-2,\ -1,\ 0,\ +1,\ +2\}$,  e.g.:
$${\rm Pr}(X = 0) = F_X(x \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{+}) - F_X(x \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{-}) = 0.7 - 0.3 = 0.4\hspace{0.05cm}.$$
  • Accordingly, the other probabilities are:
$${\rm Pr}(X = +1) = {\rm Pr}(X = -1) = 0.2\hspace{0.05cm},\hspace{0.3cm} {\rm Pr}(X = +2) = {\rm Pr}(X = -2) = 0.1\hspace{0.05cm}.$$
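The relation between Dirac weights and CDF jumps can be sketched in a few lines. The dictionary `pmf` holds the weights from the equation for $f_X(x)$ above; the names `F_X` and `jump` and the offset `eps` are assumptions made for this illustration.

```python
# Dirac weights of f_X(x): value -> probability
pmf = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}

def F_X(x):
    """CDF F_X(x) = Pr(X <= x): sum of all Dirac weights located at values <= x."""
    return sum(p for value, p in pmf.items() if value <= x)

def jump(x0, eps=1e-9):
    """Height of the CDF step at x0, i.e. F_X(x0 + eps) - F_X(x0 - eps)."""
    return F_X(x0 + eps) - F_X(x0 - eps)

print(f"Pr(X = 0)   = {jump(0):.1f}")    # step from 0.3 to 0.7
print(f"Pr(X = 2)   = {jump(2):.1f}")
print(f"Pr(X = 0.5) = {jump(0.5):.1f}")  # no Dirac function at 0.5
```

Each jump height equals the corresponding Dirac weight, and the CDF is flat wherever the PDF has no Dirac function.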


(2)  From the PDF just calculated, we obtain:

$${\rm Pr}(X >0) = {\rm Pr}(X = +1) + {\rm Pr}(X = +2) \hspace{0.15cm}\underline {= 0.3}\hspace{0.05cm},$$
$${\rm Pr}(|X| \le 1) ={\rm Pr}(X = -1) + {\rm Pr}(X = 0) + {\rm Pr}(X = +1) = 0.2 + 0.4 +0.2 \hspace{0.15cm}\underline {= 0.8}\hspace{0.05cm}.$$

The same result is obtained using the CDF.  Here the general equation, which is equally valid for discrete and continuous random variables, is:

$${\rm Pr}(A < X \le B) =F_X(B) - F_X(A) \hspace{0.05cm}.$$
  • Thus, with  $A= 0$  and  $B = +2$  we obtain:
$${\rm Pr}(0 < X \le +2) = {\rm Pr}(X >0)= F_X(+2) - F_X(0) = 1 - 0.7 \hspace{0.15cm}\underline {= 0.3} \hspace{0.05cm}.$$
  • Setting  $A=-2$  and  $B = +1$,  we get:
$${\rm Pr}(-2 < X \le +1) = {\rm Pr}(|X| \le 1)= F_X(+1) - F_X(-2) = 0.9 - 0.1 \hspace{0.15cm}\underline {= 0.8} \hspace{0.05cm}.$$
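Both ways of calculating the probabilities in subtask (2) can be compared directly. The sketch below reuses the Dirac weights from subtask (1); the variable names are hypothetical. Note that the lower limit $A$ is excluded in ${\rm Pr}(A < X \le B)$, which is why $A = -2$ (and not $A = -1$) is needed for ${\rm Pr}(|X| \le 1)$.

```python
# Dirac weights of f_X(x) from subtask (1): value -> probability
pmf = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}

def F_X(x):
    """CDF F_X(x) = Pr(X <= x)."""
    return sum(p for value, p in pmf.items() if value <= x)

# Way 1: add up the Dirac weights inside the event
pos_from_pdf = sum(p for value, p in pmf.items() if value > 0)
abs_from_pdf = sum(p for value, p in pmf.items() if abs(value) <= 1)

# Way 2: Pr(A < X <= B) = F_X(B) - F_X(A); the lower limit A is excluded
pos_from_cdf = F_X(2) - F_X(0)    # A = 0,  B = +2
abs_from_cdf = F_X(1) - F_X(-2)   # A = -2, B = +1

print(f"Pr(X > 0)    : {pos_from_pdf:.1f} (PDF), {pos_from_cdf:.1f} (CDF)")
print(f"Pr(|X| <= 1) : {abs_from_pdf:.1f} (PDF), {abs_from_cdf:.1f} (CDF)")
```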


PDF and CDF of the continuous random variable  $Y$

(3)  The cumulative distribution function  $F_Y(y)$  is obtained from the PDF  $f_Y(\eta)$  (with renamed integration variable) by integrating from  $- \infty$  to  $y$.  Due to symmetry, this can be written in the range  $0 \le y \le +2$  as:

$$F_Y(y) = \int_{-\infty}^{\hspace{0.05cm}y} \hspace{-0.1cm}f_Y(\eta) \hspace{0.1cm}{\rm d}\eta ={1}/{2}+\int_{0}^{\hspace{0.05cm}y} \hspace{-0.1cm}f_Y(\eta) \hspace{0.1cm}{\rm d}\eta$$
$$\Rightarrow \hspace{0.3cm}F_Y(y) = \frac{1}{2}+\int_{0}^{\hspace{0.05cm}y} \hspace{0.1cm}\frac{1}{2} \cdot \cos^2({\pi}/{4} \cdot \eta) \hspace{0.1cm}{\rm d}\eta = \frac{1}{2}+\frac{y}{4} + \frac{1}{2\pi} \cdot \sin({\pi}/{2} \cdot y).$$

The equation holds in the entire range  $0 \le y \le +2$.  The CDF values we are looking for are thus:

  • $F_Y(y=0)\hspace{0.15cm}\underline{= 0.5}$  (integral over half the PDF),
  • $F_Y(y=1)= 3/4 + 1/(2 \pi)\hspace{0.15cm}\underline{= 0.909}$  (area highlighted in red under the PDF),
  • $F_Y(y=2)\hspace{0.15cm}\underline{= 1}$  (integral over the entire PDF).
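The three CDF values can be reproduced with the closed-form expression derived above. This is a sketch under one assumption: the text states the formula for $0 \le y \le +2$, and the code also applies it to $-2 \le y < 0$, which is justified because the antiderivative is an odd function of $y$.

```python
import math

def F_Y(y):
    """CDF of Y: 1/2 + y/4 + sin(pi/2 * y) / (2*pi) inside |y| <= 2.

    Outside this range the CDF is 0 (left) or 1 (right). Using the formula
    for negative y is an assumption based on the symmetry of the PDF.
    """
    if y <= -2:
        return 0.0
    if y >= 2:
        return 1.0
    return 0.5 + y / 4 + math.sin(math.pi / 2 * y) / (2 * math.pi)

for y in (0, 1, 2):
    print(f"F_Y({y}) = {F_Y(y):.3f}")
```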


(4)  The probability that the continuous random variable  $Y$  lies in the range from  $-\varepsilon$  to  $+\varepsilon$  can be calculated using the given equation as follows:

$${\rm Pr}(-\varepsilon \le Y \le +\varepsilon) = F_Y(+\varepsilon) - F_Y(-\varepsilon) \hspace{0.05cm}.$$
  • Here it was taken into account that for the continuous random variable  $Y$  the "<" sign can be replaced by the "≤" sign without changing the result.
  • Taking the limit  $\varepsilon \to 0$  yields the probability we are looking for:
$${\rm Pr}(Y = 0) =\lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm}{\rm Pr}(-\varepsilon \le Y \le +\varepsilon) = \lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm} F_Y(+\varepsilon) - \lim_{\varepsilon\hspace{0.05cm}\rightarrow\hspace{0.05cm}0}\hspace{0.1cm} F_Y(-\varepsilon) = F_Y(y \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{+}) - F_Y(y \hspace{0.05cm}\rightarrow\hspace{0.05cm}0^{-})\hspace{0.05cm}.$$
  • Since for a continuous random variable the two limits are equal, $\underline{{\rm Pr}(Y = 0) = 0}$.


In general:   The probability  ${\rm Pr}(Y = y_0)$  that a continuous random variable  $Y$  takes a fixed value  $y_0$  is always zero.
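The limit can be made tangible by evaluating $F_Y(+\varepsilon) - F_Y(-\varepsilon)$ for shrinking $\varepsilon$. The sketch below uses the closed-form CDF from subtask (3); the chosen values of `eps` are arbitrary. For small $\varepsilon$ the probability is approximately $2\varepsilon \cdot f_Y(0) = \varepsilon$, so it vanishes in the limit.

```python
import math

def F_Y(y):
    """Closed-form CDF of Y from subtask (3), clipped to 0 and 1 outside |y| <= 2."""
    if y <= -2:
        return 0.0
    if y >= 2:
        return 1.0
    return 0.5 + y / 4 + math.sin(math.pi / 2 * y) / (2 * math.pi)

# Pr(-eps <= Y <= +eps) for shrinking eps (arbitrary example values)
results = []
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    p = F_Y(eps) - F_Y(-eps)
    results.append((eps, p))
    print(f"eps = {eps:.0e}:  Pr(-eps <= Y <= +eps) = {p:.6f}")
```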


(5)  Proposed solution 2 is correct:

  • Based on the PDF at hand, the result  $Y=3$  can be excluded.
  • The result  $Y=0$,  on the other hand, is quite possible, although  ${\rm Pr}(Y = 0) = 0$.
  • For example, if a random experiment is performed  $N \to \infty$  times and the result  $Y= 0$  occurs  $N_0$  times, then for finite  $N_0$  the classical definition of probability gives:
$${\rm Pr}(Y = 0) = \lim_{N\hspace{0.05cm}\rightarrow\hspace{0.05cm}\infty}\hspace{0.1cm}{N_0}/{N} = 0\hspace{0.05cm}.$$
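The relative-frequency argument can be illustrated by simulation. The following sketch is not part of the original solution: it draws samples of $Y$ by rejection sampling (an assumed technique, chosen for simplicity), and the seed, the sample size `N` and the window width are arbitrary. Floating-point numbers are only a finite approximation of a continuous random variable, so the simulation is an illustration, not a proof.

```python
import math
import random

def sample_Y(rng):
    """Draw one sample with PDF 1/2 * cos^2(pi/4 * y) on [-2, 2] by rejection sampling.

    Propose y uniformly on [-2, 2] and accept it with probability cos^2(pi/4 * y).
    """
    while True:
        y = rng.uniform(-2.0, 2.0)
        if rng.random() <= math.cos(math.pi / 4 * y) ** 2:
            return y

rng = random.Random(1)   # fixed seed only for reproducibility
N = 100_000
samples = [sample_Y(rng) for _ in range(N)]

N0 = sum(1 for y in samples if y == 0.0)          # exact hits of Y = 0
N3 = sum(1 for y in samples if y == 3.0)          # impossible: outside the PDF support
near0 = sum(1 for y in samples if abs(y) < 0.05)  # small window around 0

print(f"N0/N = {N0 / N}, N3/N = {N3 / N}, Pr(|Y| < 0.05) ~ {near0 / N:.4f}")
```

Values close to zero occur regularly, which shows that the result $Y = 0$ is not excluded, whereas $Y = 3$ can never occur.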


(6)  We again start from the equation  $ {\rm Pr}(A \le Y \le B) = F_Y(B) - F_Y(A)$,  which is valid for the continuous random variable  $Y$:

  • With  $A = 0$  and  $B \to \infty$  $($or  $B = 2)$  we obtain:
$${\rm Pr}( Y > 0) = {\rm Pr}(0 \le Y \le \infty) = {\rm Pr}(0 \le Y \le 2) = F_Y(2) - F_Y(0) \hspace{0.15cm}\underline {= 0.5}\hspace{0.05cm}.$$
  • Thus, as expected,  ${\rm Pr}( Y > 0) = 1/2$  holds for the symmetric continuous random variable  $Y$.
  • The discrete random variable  $X$  is also symmetric about  $x= 0$;  nevertheless,  ${\rm Pr}( X > 0) = 0.3$  was determined in subtask  (2),  because  ${\rm Pr}(X = 0) = 0.4$  is not zero.
  • Further, with  $A = -1$  and  $B = +1$,  one obtains, because of  $F_Y(-1) = 1- F_Y(+1)$:
$${\rm Pr}( |Y| \le 1) = {\rm Pr}(-1 \le Y \le +1) = F_Y(+1) - F_Y(-1) = 2 \cdot F_Y(+1) -1 = 2 \cdot 0.909 -1 \hspace{0.15cm}\underline {= 0.818}. $$
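The results of subtask (6) can be recomputed from the closed-form CDF, and the comparison with the discrete random variable $X$ can be made explicit. This is a sketch under the same assumptions as before (formula of subtask (3) applied to the whole range $|y| \le 2$; variable names are hypothetical).

```python
import math

def F_Y(y):
    """Closed-form CDF of Y from subtask (3), clipped to 0 and 1 outside |y| <= 2."""
    if y <= -2:
        return 0.0
    if y >= 2:
        return 1.0
    return 0.5 + y / 4 + math.sin(math.pi / 2 * y) / (2 * math.pi)

pr_Y_positive = F_Y(2) - F_Y(0)   # Pr(Y > 0)
pr_Y_abs_le_1 = F_Y(1) - F_Y(-1)  # Pr(|Y| <= 1), equals 2 * F_Y(1) - 1

# Discrete counterpart: the Dirac weight at x = 0 takes 0.4 of the probability,
# so only (1 - 0.4) / 2 remains for X > 0 despite the symmetry.
pmf_X = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}
pr_X_positive = sum(p for value, p in pmf_X.items() if value > 0)

print(f"Pr(Y > 0)    = {pr_Y_positive:.3f}")
print(f"Pr(|Y| <= 1) = {pr_Y_abs_le_1:.3f}")
print(f"Pr(X > 0)    = {pr_X_positive:.3f}")
```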