Difference between revisions of "Theory of Stochastic Signals/Cumulative Distribution Function"

From LNTwww
 
(20 intermediate revisions by 2 users not shown)
Line 7: Line 7:
 
==Relationship between PDF and CDF==
 
==Relationship between PDF and CDF==
 
<br>
 
<br>
To describe random variables,&nbsp; in addition to the&nbsp; [[Theory_of_Stochastic_Signals/Probability_Density_Function_(PDF)|probability density function]]&nbsp; $\rm (PDF)$,&nbsp; we also use the&nbsp; cumulative distribution function&nbsp; $\rm (CDF)$&nbsp; which is defined as follows:  
+
To describe random variables,&nbsp; in addition to the&nbsp; [[Theory_of_Stochastic_Signals/Probability_Density_Function|&raquo;probability density function&laquo;]]&nbsp; $\rm (PDF)$,&nbsp; we use the&nbsp; &raquo;cumulative distribution function&laquo;&nbsp; $\rm (CDF)$&nbsp; which is defined as follows:  
  
 
{{BlaueBox|TEXT=   
 
{{BlaueBox|TEXT=   
$\text{Definition:}$&nbsp; The&nbsp; '''cumulative distribution function'''&nbsp; $F_{x}(r)$&nbsp; corresponds to the probability that the random variable&nbsp; $x$&nbsp; is less than or equal to a real number&nbsp; $r$:  
+
$\text{Definition:}$&nbsp; The&nbsp; &raquo;'''cumulative distribution function'''&laquo;&nbsp; $F_{x}(r)$&nbsp; corresponds to the probability that the random variable&nbsp; $x$&nbsp; is less than or equal to a real number&nbsp; $r$:  
 
:$$F_{x}(r) = {\rm Pr}( x \le r).$$}}
 
:$$F_{x}(r) = {\rm Pr}( x \le r).$$}}
  
  
For a continuous random variable,&nbsp; the following statements are possible regarding the CDF:  
+
For a value-continuous random variable,&nbsp; the following statements are possible regarding the CDF:  
 
*The CDF is computable from the probability density function&nbsp; $f_{x}(x)$&nbsp; by integration.&nbsp; It holds:  
 
*The CDF is computable from the probability density function&nbsp; $f_{x}(x)$&nbsp; by integration.&nbsp; It holds:  
 
:$$F_{x}(r) = \int_{-\infty}^{r}f_x(x)\,{\rm d}x.$$
 
:$$F_{x}(r) = \int_{-\infty}^{r}f_x(x)\,{\rm d}x.$$
*Since the PDF is never negative,&nbsp; $F_{x}(r)$&nbsp; increases at least weakly monotonically,&nbsp; and always lies between the following limits:  
+
*Since the PDF is never negative,&nbsp; $F_{x}(r)$&nbsp; increases at least weakly monotonically,&nbsp; and the function always lies between the following limits:  
 
:$$F_{x}(r → \hspace{0.05cm} - \hspace{0.05cm} ∞) = 0, \hspace{0.5cm}F_{x}(r → +∞) = 1.$$  
 
:$$F_{x}(r → \hspace{0.05cm} - \hspace{0.05cm} ∞) = 0, \hspace{0.5cm}F_{x}(r → +∞) = 1.$$  
 
*Inversely,&nbsp; the probability density function can be determined from the CDF by differentiation:  
 
*Inversely,&nbsp; the probability density function can be determined from the CDF by differentiation:  
 
:$$f_{x}(x)=\frac{{\rm d} F_{x}(r)}{{\rm d} r}\Bigg |_{\hspace{0.1cm}r=x}.$$
 
:$$f_{x}(x)=\frac{{\rm d} F_{x}(r)}{{\rm d} r}\Bigg |_{\hspace{0.1cm}r=x}.$$
:The addition&nbsp; "$r = x$"&nbsp; makes it clear that in our nomenclature the PDF argument is the random variable itself, while the CDF argument specifies any real variable&nbsp; $r$&nbsp;.
+
:The addition&nbsp; &raquo;$r = x$&laquo;&nbsp; makes it clear that in our nomenclature the PDF argument is the random variable&nbsp; $x$&nbsp; itself, while the CDF argument specifies any real variable&nbsp; $r$&nbsp;.
 +
 
  
 
{{BlaueBox|TEXT=  
 
{{BlaueBox|TEXT=  
$\text{Nomenclature Notes:}$&nbsp;
+
$\text{Notes on nomenclature:}$&nbsp; If in the definitions of&nbsp; $\rm PDF$&nbsp; and&nbsp; $\rm CDF$&nbsp; we had distinguished
 +
*between the random variable&nbsp; $X$&nbsp;
 
   
 
   
If in the definitions of&nbsp; $\rm PDF$&nbsp; and&nbsp; $\rm CDF$&nbsp; we had distinguished
 
*between the random variable&nbsp; $X$&nbsp;
 
 
*and the realizations&nbsp; $x ∈ X$&nbsp; &nbsp; ⇒ &nbsp; $f_{X}(x), F_{X}(x)$,  
 
*and the realizations&nbsp; $x ∈ X$&nbsp; &nbsp; ⇒ &nbsp; $f_{X}(x), F_{X}(x)$,  
  
Line 34: Line 34:
 
:$$F_{X}(x) = {\rm Pr}(X \le x) = \int_{-\infty}^{x}f_{x}(\xi)\,{\rm d}\xi.$$
 
:$$F_{X}(x) = {\rm Pr}(X \le x) = \int_{-\infty}^{x}f_{x}(\xi)\,{\rm d}\xi.$$
  
Unfortunately,&nbsp; at the beginning of our&nbsp; $\rm LNTwww$ project&nbsp; (2001)&nbsp; we decided to use our nomenclature for quite legitimate reasons,&nbsp which now&nbsp; (2017)&nbsp; cannot be changed,&nbsp; also with regard to the realized learning videos.  
+
Unfortunately,&nbsp; at the beginning of our&nbsp; $\rm LNTwww$ project&nbsp; $(2001)$&nbsp; we decided to use our nomenclature for quite legitimate reasons,&nbsp; which now&nbsp; $(2017)$&nbsp; cannot be changed,&nbsp; also with regard to the realized learning videos. &nbsp; '''So we stick with&nbsp; $f_{x}(x)$&nbsp; instead of&nbsp; $f_{X}(x)$&nbsp; as well as&nbsp; $F_{x}(r)$&nbsp; instead of&nbsp; $F_{X}(x).$}}
 
 
'''So we stick with&nbsp; $f_{x}(x)$&nbsp; instead of&nbsp; $f_{X}(x)$&nbsp; as well as&nbsp; $F_{x}(r)$&nbsp; instead of&nbsp; $F_{X}(x).$}}
 
  
==CDF for continuous-valued random variables==
+
==CDF for value-continuous random variables==
 
<br>
 
<br>
The equations given in the last section apply only to continuous-valued random variables and will be illustrated here by an example.&nbsp; In the next section it will be shown that for&nbsp; [[Theory_of_Stochastic_Signals/Cumulative_Distribution_Function#CDF_for_discrete-valued_random_variables|discrete-valued random variables]]&nbsp; the equations must be modified somewhat.
+
The equations given in the last section apply only to value-continuous random variables and will be illustrated here by an example.&nbsp; In the next section it will be shown that for&nbsp; [[Theory_of_Stochastic_Signals/Cumulative_Distribution_Function#CDF_for_value-discrete_random_variables|&raquo;value-discrete random variables&laquo;]]&nbsp; the equations must be modified somewhat.
  
 
{{GraueBox|TEXT=   
 
{{GraueBox|TEXT=   
$\text{Example 1:}$&nbsp; The left image shows the photo&nbsp; "Lena",&nbsp; which is often used as a test template for image coding procedures.
+
$\text{Example 1:}$&nbsp; The left image shows the photo&nbsp; &raquo;Lena&laquo;,&nbsp; which is often used as a test template for image coding procedures.
[[File:P_ID617__Sto_T_3_2_S1b_neu.png |right|frame| PDF and CDF of a continuous-valued image]]
+
[[File:P_ID617__Sto_T_3_2_S1b_neu.png |right|frame| PDF and CDF of a value-continuous image]]
 
   
 
   
*If this image is divided into&nbsp; $256 × 256$&nbsp; (image) pixels,&nbsp;  and the brightness is determined for each pixel,&nbsp; a sequence&nbsp; $〈x_ν〉$&nbsp; of gray values is obtained whose length&nbsp; $N = 256^2 = 65536$.
+
*If this image is divided into&nbsp; $256 × 256$&nbsp; pixels,&nbsp;  and the brightness is determined for each pixel,&nbsp; a sequence&nbsp; $〈x_ν〉$&nbsp; of gray values is obtained whose length&nbsp; $N = 256^2 = 65\hspace{0.06cm}536$.
*The gray value&nbsp; $x$&nbsp; is a continuous-valued random variable,&nbsp; where the assignment to numerical values is arbitrary.&nbsp; For example,&nbsp; let&nbsp; "black"&nbsp; be characterized by the value&nbsp; $x = 0$&nbsp; and&nbsp; "white"&nbsp; by&nbsp; $x = 1$:&nbsp; The value&nbsp; $x =0.5$&nbsp; then characterizes a medium gray coloration.  
 
  
 +
*The gray value&nbsp; $x$&nbsp; is a value-continuous random variable,&nbsp; where the assignment to numerical values is arbitrary.&nbsp; For example,&nbsp; let&nbsp; &raquo;black&laquo;&nbsp; be characterized by&nbsp; $x = 0$&nbsp; and&nbsp; &raquo;white&laquo;&nbsp; by&nbsp; $x = 1$:&nbsp; The value&nbsp; $x =0.5$&nbsp; then characterizes a medium gray coloration.
  
The middle diagram shows the PDF&nbsp; $f_{x}(x)$&nbsp; which is also often referred to in the literature as&nbsp; "gray value statistics".
 
*It can be seen that in the original image some gray values are preferred and the two extreme values&nbsp; $x =0$&nbsp; ("deep black")&nbsp; or&nbsp; $x =1$&nbsp; ("pure white")&nbsp; occur very rarely.
 
*The distribution function&nbsp; $F_{x}(r)$&nbsp; of this continuous random variable is continuous and increases monotonically from&nbsp; $0$&nbsp; to&nbsp; $1$&nbsp; as the right figure shows. For&nbsp; $r \approx 0$&nbsp; and&nbsp; $r \approx 1$&nbsp; the CDF is horizontal due to the lack of PDF components.
 
  
 +
The middle diagram shows the PDF&nbsp; $f_{x}(x)$&nbsp; which is also often referred to in the literature as&nbsp; &raquo;gray value statistics&laquo;.
 +
*In the original image some gray values are preferred and the two extreme values&nbsp; $x =0$&nbsp; ("deep black")&nbsp; or&nbsp; $x =1$&nbsp; ("pure white")&nbsp; occur very rarely.
 +
 +
*The cumulative distribution function&nbsp; $F_{x}(r)$&nbsp; is continuous in value and increases monotonically from&nbsp; $0$&nbsp; to&nbsp; $1$&nbsp; as the right figure shows.&nbsp;
 +
 +
*For&nbsp; $r \approx 0$&nbsp; and&nbsp; $r \approx 1$&nbsp; the CDF is horizontal due to the lack of PDF components.
  
''Note:'' &nbsp; Strictly speaking, for an image that can be displayed on a computer - in contrast to an "analog" photograph - the gray value is always a discrete value random variable.&nbsp; However, with large resolution of the color information ("color depth"), this random variablecan be approximated to be continuous in value. }}
 
  
 +
$\text{Note:}$  &nbsp; Strictly speaking,&nbsp; for an image that can be displayed on a computer&nbsp; $($in contrast to an analog photograph$)$:
 +
# &nbsp; The gray value is always discrete in value.&nbsp;
 +
# &nbsp; However,&nbsp; with large resolution of the color information&nbsp; $($&raquo;color depth&laquo;$)$,&nbsp; this random variable can be approximated to be continuous in value. }}
  
The topic of this chapter is illustrated with examples in the (German language) learning video&nbsp; [[Zusammenhang_zwischen_WDF_und_VTF_(Lernvideo)|Zusammenhang zwischen WDF und VTF]]&nbsp; $\Rightarrow$ relationship between PDF and CDF.
 
  
 +
&raquo; &nbsp; The topic of this chapter is illustrated with examples in the&nbsp; (German language)&nbsp; learning video&nbsp; <br> &nbsp; &nbsp; &nbsp;&nbsp; &nbsp;[[Zusammenhang_zwischen_WDF_und_VTF_(Lernvideo)|&raquo;Zusammenhang zwischen WDF und VTF&laquo;]]&nbsp; $\Rightarrow$ &raquo;Relationship between PDF and CDF&laquo;.
  
==CDF for discrete-valued random variables==
+
 
 +
==CDF for value-discrete random variables==
 
<br>
 
<br>
For the calculation of the distribution function of a discrete value random variable&nbsp; $x$&nbsp; from its PDF, a more general equation must always be assumed&nbsp; Here, with the auxiliary variable $\varepsilon > 0$:  
+
For the CDF calculation of a value-discrete random variable&nbsp; $x$&nbsp; from its PDF,&nbsp; a more general equation must always be assumed.&nbsp; Here,&nbsp; with the auxiliary variable&nbsp; $\varepsilon > 0$:  
 
:$$F_{x}(r)=\lim_{\varepsilon\hspace{0.05cm}\to \hspace{0.05cm}0}\int_{-\infty}^{r+\varepsilon}f_x(x)\,{\rm d}x.$$
 
:$$F_{x}(r)=\lim_{\varepsilon\hspace{0.05cm}\to \hspace{0.05cm}0}\int_{-\infty}^{r+\varepsilon}f_x(x)\,{\rm d}x.$$
  
*Calculation of the distribution function by boundary value formation is required due to the "≤"&ndash;sign in the&nbsp; [[Theory_of_Stochastic_Signals/Cumulative_Distribution_Function#Relationship_between_PDF_and_CDF|general definition]]&nbsp;.
+
*Due to the&nbsp; &raquo;less than/equal to&laquo;&nbsp; sign in the&nbsp; [[Theory_of_Stochastic_Signals/Cumulative_Distribution_Function#Relationship_between_PDF_and_CDF|&raquo;general definition&laquo;]], a limit value must be formed for the CDF calculation.&nbsp; If we also take into account that,&nbsp; for a value-discrete random variable,&nbsp; the PDF consists of a sum of weighted&nbsp; [[Signal_Representation/Direct_Current_Signal_-_Limit_Case_of_a_Periodic_Signal#Dirac_.28delta.29_function_in_frequency_domain|&raquo;Dirac delta functions&laquo;]],&nbsp; we obtain:  
*If we also take into account that, for a discrete random variable, the PDF consists of a sum of weighted&nbsp; [[Signal_Representation/General_Description/Gleichsignal_-_Grenzfall_eines_periodischen_Signals#Diracfunktion_im_Frequenzbereich|Dirac functions]]&nbsp;, we obtain:  
 
 
:$$F_{x}(r)=\lim_{\varepsilon\hspace{0.05cm}\to \hspace{0.05cm} 0}\int_{-\infty}^{r+\varepsilon}\sum\limits_{\mu= 1}^{ M}p_\mu\cdot \delta(x-x_\mu)\,{\rm d}x.$$
 
:$$F_{x}(r)=\lim_{\varepsilon\hspace{0.05cm}\to \hspace{0.05cm} 0}\int_{-\infty}^{r+\varepsilon}\sum\limits_{\mu= 1}^{ M}p_\mu\cdot \delta(x-x_\mu)\,{\rm d}x.$$
*If we interchange integration and summation in this equation, and consider that integration over the Dirac function yields the step function, we obtain:  
+
*If we interchange integration and summation in this equation,&nbsp; and consider that an integration over the Dirac delta function yields the step function,&nbsp; we obtain:  
 
:$$F_{x}(r)=\sum\limits_{\mu= \rm 1}^{\it M}p_\mu\cdot \gamma_0 (r-x_\mu),\hspace{0.4cm}{\rm with} \hspace{0.4cm}\gamma_0(x)=\lim_{\epsilon\hspace{0.05cm}\to \hspace{0.05cm} 0}\int_{-\infty}^{x+\varepsilon}\delta (u)\,{\rm d} u = \left\{ \begin{array}{*{2}{c}}  0 \hspace{0.4cm}  {\rm if}\hspace{0.1cm} x< 0,\\ 1  \hspace{0.4cm} {\rm if}\hspace{0.1cm}x\ge 0. \\ \end{array} \right.$$
 
:$$F_{x}(r)=\sum\limits_{\mu= \rm 1}^{\it M}p_\mu\cdot \gamma_0 (r-x_\mu),\hspace{0.4cm}{\rm with} \hspace{0.4cm}\gamma_0(x)=\lim_{\epsilon\hspace{0.05cm}\to \hspace{0.05cm} 0}\int_{-\infty}^{x+\varepsilon}\delta (u)\,{\rm d} u = \left\{ \begin{array}{*{2}{c}}  0 \hspace{0.4cm}  {\rm if}\hspace{0.1cm} x< 0,\\ 1  \hspace{0.4cm} {\rm if}\hspace{0.1cm}x\ge 0. \\ \end{array} \right.$$
It should be noted that:
+
::The function&nbsp; $γ_0(x)$&nbsp; differs from the&nbsp; [[Signal_Representation/Fourier_Transform_Theorems#Assignment_Theorem|&raquo;unit step function&laquo;]]&nbsp; $γ(x)$&nbsp; often used in systems theory in that at the jump point&nbsp; $x = 0$&nbsp; the right-hand side limit&nbsp; $1$&nbsp; is valid&nbsp; $($instead of the mean value&nbsp; $0.5$&nbsp; between left&ndash; and right&ndash;hand side limits$)$.  
* $γ_0(x)$&nbsp; differs from the usual in systems theory&nbsp; [[Signal_Representation/Fourier_Transform_Theorems#Assignment_Theorem|unit step function]]&nbsp; $γ(x)$ in that at the jump point&nbsp; $x = 0$&nbsp; the right-hand side limit "one" is valid&nbsp; (instead of the mean value "$1/2$" between left- and right-hand side limits).  
+
*With the above CDF definition,&nbsp; the following probability equation holds for value-continuous and value-discrete random variables equally,&nbsp; and of course also for&nbsp; mixed random variables&nbsp; with discrete and continuous parts:
*With the above CDF definition, then, for the probability of continuous and discrete random variables equally, and of course also for&nbsp; ''mixed random variables''&nbsp; with discrete and continuous parts:
 
 
:$${\rm Pr}(x_{\rm u}<x \le x_{\rm o})=F_x(x_{\rm o})-F_x(x_{\rm u}).$$
 
:$${\rm Pr}(x_{\rm u}<x \le x_{\rm o})=F_x(x_{\rm o})-F_x(x_{\rm u}).$$
*For purely continuous random variables, the "less than" sign and the "less than/equal to" sign could be substituted for each other here.  
+
*For purely value-continuous random variables,&nbsp; the&nbsp; &raquo;less than&laquo;&nbsp; sign and the&nbsp; &raquo;less than/equal to&laquo;&nbsp; sign could be substituted for each other here.  
 
:$${\rm Pr}(x_{\rm u}<x \le x_{\rm o}) ={\rm Pr}(x_{\rm u}\le x \le x_{\rm o}) ={\rm Pr}(x_{\rm u}\le x < x_{\rm o}) ={\rm Pr}(x_{\rm u}<x < x_{\rm o}).$$
 
:$${\rm Pr}(x_{\rm u}<x \le x_{\rm o}) ={\rm Pr}(x_{\rm u}\le x \le x_{\rm o}) ={\rm Pr}(x_{\rm u}\le x < x_{\rm o}) ={\rm Pr}(x_{\rm u}<x < x_{\rm o}).$$
  
 
{{GraueBox|TEXT=   
 
{{GraueBox|TEXT=   
$\text{Example 2:}$&nbsp; If the gray value of the&nbsp; [[Theory_of_Stochastic_Signals/Cumulative_Distribution_Function#CDF_for_continuous_random_variables|original Lena photo]]&nbsp; is quantized by eight levels, so that each pixel can be represented by three bits and transmitted digitally, the discrete random variable&nbsp; $q$ is obtained. &nbsp; However, due to the quantization, a part of the image information is lost, which is reflected in the quantized image by clearly recognizable "contours".
+
$\text{Example 2:}$&nbsp; If the gray value of the&nbsp; [[Theory_of_Stochastic_Signals/Cumulative_Distribution_Function#CDF_for_continuous-valued_random_variables|&raquo;original Lena photo&laquo;]]&nbsp; is quantized by eight levels,&nbsp; so that each pixel can be represented by three bits and transmitted digitally,&nbsp; the discrete random variable&nbsp; $q$&nbsp; is obtained. &nbsp; However, due to the quantization,&nbsp; a part of the image information is lost,&nbsp; which is reflected in the quantized image by clearly recognizable&nbsp; &raquo;contours&laquo;.
  
[[File:P_ID74__Sto_T_3_2_S2b_neu.png |center|frame| PDF and CDF of a discrete value image]]
+
[[File:P_ID74__Sto_T_3_2_S2b_neu.png |right|frame| PDF and CDF of a value-discrete image]]
  
*The associated probability density function&nbsp; $f_{q}(q)$&nbsp; is composed of&nbsp; $M = 8$&nbsp; Dirac functions, where, in the quantization chosen here, the possible gray levels are assigned the values&nbsp; $q_\mu = (\mu - 1)/7$&nbsp; with&nbsp; $\mu = 1, 2,$ ... , $8$&nbsp; are assigned.  
+
*The associated PDF&nbsp; $f_{q}(q)$&nbsp; is composed of&nbsp; $M = 8$&nbsp; Dirac delta functions, where,&nbsp; in the quantization chosen here,&nbsp; the possible gray levels are assigned the values&nbsp; $q_\mu = (\mu - 1)/7$&nbsp; with&nbsp; $\mu = 1, 2,$ ... , $8$.
*The weights of the Dirac functions can be calculated from the PDF&nbsp; $f_{x}(x)$&nbsp; of the original image.&nbsp; One obtains  
+
:$$p_\mu={\rm Pr}(q = q_\mu ) = {\rm Pr}(\frac{2\mu-\rm 3}{14}< {x} \le\frac{2\it \mu- \rm 1}{14}) \rm = \int_{(2\it \mu- \rm 3)/14}^{(2\mu-1)/14}\it f_{x}{\rm (}x{\rm )}\,{\rm d}x.$$
+
*The weights of the Dirac delta functions can be calculated from the PDF&nbsp; $f_{x}(x)$&nbsp; of the original image.&nbsp; One obtains  
*For the undefined border areas&nbsp; $(x<0$&nbsp; resp.&nbsp; $x>1)$&nbsp; is to be set here respectively&nbsp; $f_{x}(x) = 0$&nbsp;.
+
:$$p_\mu={\rm Pr}(q = q_\mu ) = {\rm Pr}(\frac{2\mu-\rm 3}{14}< {x} \le\frac{2\it \mu- \rm 1}{14}) $$
*Since in the original image the gray levels&nbsp; $x ≈0$&nbsp; ("very deep black") &nbsp;or&nbsp; $x ≈1$&nbsp; ("almost pure white")&nbsp; are largely missing, the probabilities&nbsp; $p_1 ≈ p_8 ≈ 0$ result, and in fact only six Dirac functions are visible in the PDF. These two missing Dirac functions at&nbsp; $q = 0$&nbsp; and&nbsp; $q =1$&nbsp; are only indicated by dots in the middle graph.  
+
:$$\Rightarrow \hspace{0.3cm} p_\mu={\rm Pr}(q = q_\mu ) = \int_{(2\it \mu- \rm 3)/14}^{(2\mu-1)/14}\it f_{x}{\rm (}x{\rm )}\,{\rm d}x.$$
 +
*For the undefined areas&nbsp; $(x<0$, &nbsp; $x>1)$&nbsp; is to be set&nbsp; $f_{x}(x) = 0$.&nbsp; Since in the original image the gray levels&nbsp; $x ≈0$&nbsp; $($&raquo;very deep black&laquo;$)$&nbsp; or&nbsp; $x ≈1$&nbsp; $($&raquo;almost pure white&laquo;$)$&nbsp; are largely missing,&nbsp; $p_1 ≈ p_8 ≈ 0$ result.
  
*The distribution function sketched on the right&nbsp; $F_{q}(r)$&nbsp; thus has six points of discontinuity, where the right-hand side limit is valid in each case}}.
+
* Thus,&nbsp; only six Dirac delta functions are visible in the PDF.&nbsp; The two missing Diracs at&nbsp; $q = 0$&nbsp; and&nbsp; $q =1$&nbsp; are only indicated by dots.
 +
 +
*The step-shaped CDF&nbsp; $F_{q}(r)$&nbsp; sketched on the right thus has six points of discontinuity,&nbsp; where in each case the right-hand side limit is valid.}}
  
  
The topic of this chapter is illustrated with examples in the (German language) learning video&nbsp; [[Zusammenhang_zwischen_WDF_und_VTF_(Lernvideo)|Zusammenhang zwischen WDF und VTF]]&nbsp; $\Rightarrow$ relationship between PDF and CDF.
+
&raquo; &nbsp; The topic of this chapter is illustrated with examples in the&nbsp; (German language)&nbsp; learning video&nbsp; <br> &nbsp; &nbsp; &nbsp;&nbsp; &nbsp;[[Zusammenhang_zwischen_WDF_und_VTF_(Lernvideo)|&raquo;Zusammenhang zwischen WDF und VTF&laquo;]]&nbsp; $\Rightarrow$ &raquo;Relationship between PDF and CDF&laquo;.
  
  
 
==Exercises for the chapter==
 
==Exercises for the chapter==
 
<br>
 
<br>
[[Aufgaben:Exercise_3.2:_cos²-CDF_and_CDF_with_Step_Functions|Exercise 3.2: cos²-CDF and CDF with Step Functions]]
+
[[Aufgaben:Exercise_3.2:_CDF_for_Exercise_3.1|Exercise 3.2: CDF for Exercise 3.1]]
  
 
[[Aufgaben:Exercise_3.2Z:_Relationship_between_PDF_and_CDF|Exercise 3.2Z: Relationship between PDF and CDF]]
 
[[Aufgaben:Exercise_3.2Z:_Relationship_between_PDF_and_CDF|Exercise 3.2Z: Relationship between PDF and CDF]]

Latest revision as of 17:47, 19 February 2024

Relationship between PDF and CDF


To describe random variables,  in addition to the  »probability density function«  $\rm (PDF)$,  we use the  »cumulative distribution function«  $\rm (CDF)$  which is defined as follows:

$\text{Definition:}$  The  »cumulative distribution function«  $F_{x}(r)$  corresponds to the probability that the random variable  $x$  is less than or equal to a real number  $r$:

$$F_{x}(r) = {\rm Pr}( x \le r).$$


For a value-continuous random variable,  the following statements are possible regarding the CDF:

  • The CDF is computable from the probability density function  $f_{x}(x)$  by integration.  It holds:
$$F_{x}(r) = \int_{-\infty}^{r}f_x(x)\,{\rm d}x.$$
  • Since the PDF is never negative,  $F_{x}(r)$  increases at least weakly monotonically,  and the function always lies between the following limits:
$$F_{x}(r → \hspace{0.05cm} - \hspace{0.05cm} ∞) = 0, \hspace{0.5cm}F_{x}(r → +∞) = 1.$$
  • Inversely,  the probability density function can be determined from the CDF by differentiation:
$$f_{x}(x)=\frac{{\rm d} F_{x}(r)}{{\rm d} r}\Bigg |_{\hspace{0.1cm}r=x}.$$
The addition  »$r = x$«  makes it clear that in our nomenclature the PDF argument is the random variable  $x$  itself, while the CDF argument specifies any real variable  $r$ .
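
The integration and differentiation relations above can be checked numerically. The following is a minimal sketch, not part of the original text: the PDF  $f_{x}(x) = 1 - \cos(2\pi x)$  on  $0 \le x \le 1$  is an assumed example chosen only because its CDF has a simple closed form, and the function names `pdf` and `cdf_exact` are hypothetical.

```python
import numpy as np

def pdf(x):
    """Assumed example PDF: f_x(x) = 1 - cos(2*pi*x) on [0, 1], zero elsewhere."""
    x = np.asarray(x, dtype=float)
    inside = (x >= 0.0) & (x <= 1.0)
    return np.where(inside, 1.0 - np.cos(2.0 * np.pi * x), 0.0)

def cdf_exact(r):
    """Closed-form CDF of the assumed PDF, used only as a reference."""
    rc = np.clip(np.asarray(r, dtype=float), 0.0, 1.0)
    return rc - np.sin(2.0 * np.pi * rc) / (2.0 * np.pi)

# Grid wider than the support [0, 1], so integrating from the left edge
# is equivalent to integrating from minus infinity.
r = np.linspace(-0.5, 1.5, 2001)
dr = r[1] - r[0]
f = pdf(r)

# F_x(r): running trapezoidal integral of the PDF.
cdf = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * dr)))

# f_x(x): numerical derivative of the CDF with respect to r.
pdf_back = np.gradient(cdf, dr)

print("F_x at left edge :", cdf[0])
print("F_x at right edge:", round(float(cdf[-1]), 6))
print("max |CDF - closed form| =", np.max(np.abs(cdf - cdf_exact(r))))
print("max |dF/dr - f_x|       =", np.max(np.abs(pdf_back - f)))
```

Because the assumed PDF is never negative, the running integral can only stay constant or grow, which is the weak monotonicity stated above.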


$\text{Notes on nomenclature:}$  If in the definitions of  $\rm PDF$  and  $\rm CDF$  we had distinguished

  • between the random variable  $X$ 
  • and the realizations  $x ∈ X$    ⇒   $f_{X}(x), F_{X}(x)$,


we would have the following nomenclature:

$$F_{X}(x) = {\rm Pr}(X \le x) = \int_{-\infty}^{x}f_{X}(\xi)\,{\rm d}\xi.$$

Unfortunately,  at the beginning of our  $\rm LNTwww$ project  $(2001)$  we chose our nomenclature for quite legitimate reasons,  and it can no longer be changed now  $(2017)$,  not least because of the learning videos that have already been produced.   So we stick with  $f_{x}(x)$  instead of  $f_{X}(x)$  as well as  $F_{x}(r)$  instead of  $F_{X}(x).$

CDF for value-continuous random variables


The equations given in the last section apply only to value-continuous random variables and will be illustrated here by an example.  In the next section it will be shown that for  »value-discrete random variables«  the equations must be modified somewhat.

$\text{Example 1:}$  The left image shows the photo  »Lena«,  which is often used as a test template for image coding procedures.

PDF and CDF of a value-continuous image
  • If this image is divided into  $256 × 256$  pixels,  and the brightness is determined for each pixel,  a sequence  $〈x_ν〉$  of gray values is obtained whose length  $N = 256^2 = 65\hspace{0.06cm}536$.
  • The gray value  $x$  is a value-continuous random variable,  where the assignment to numerical values is arbitrary.  For example,  let  »black«  be characterized by  $x = 0$  and  »white«  by  $x = 1$:  The value  $x =0.5$  then characterizes a medium gray coloration.


The middle diagram shows the PDF  $f_{x}(x)$  which is also often referred to in the literature as  »gray value statistics«.

  • In the original image some gray values are preferred,  while the two extreme values  $x =0$  $($»deep black«$)$  and  $x =1$  $($»pure white«$)$  occur very rarely.
  • The cumulative distribution function  $F_{x}(r)$  is continuous in value and increases monotonically from  $0$  to  $1$  as the right figure shows. 
  • For  $r \approx 0$  and  $r \approx 1$  the CDF is horizontal due to the lack of PDF components.


$\text{Note:}$   Strictly speaking,  for an image that can be displayed on a computer  $($in contrast to an analog photograph$)$:

  1.   The gray value is always discrete in value. 
  2.   However,  with large resolution of the color information  $($»color depth«$)$,  this random variable can be approximated to be continuous in value.
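
How a gray value sequence  $〈x_ν〉$  leads to an estimated PDF and CDF can be sketched as follows. This is only an illustration under stated assumptions: the Lena pixel data are not available here, so the samples are drawn from a Beta(2, 2) distribution as a stand-in, and the names `ecdf`, `hist` and `n_levels` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

N = 256 * 256                      # sequence length, as in Example 1
# Assumption: synthetic gray values in [0, 1] instead of real image pixels.
x = rng.beta(2.0, 2.0, size=N)

# Histogram normalized to unit area: a simple estimate of the PDF f_x(x).
hist, edges = np.histogram(x, bins=64, range=(0.0, 1.0), density=True)

# Empirical CDF: fraction of samples with x_nu <= r.
x_sorted = np.sort(x)

def ecdf(r):
    """Estimate of F_x(r) = Pr(x <= r) from the sorted samples."""
    return np.searchsorted(x_sorted, r, side="right") / N

# Note from the text: with 8 bit color depth the gray value is value-discrete,
# but 256 levels are fine enough to treat it as approximately value-continuous.
n_levels = np.unique(np.round(255.0 * x)).size

print("N =", N, " distinct 8-bit levels:", n_levels)
print("F_x(0.25), F_x(0.5), F_x(0.75) ~", ecdf(np.array([0.25, 0.5, 0.75])))
```

The option `side="right"` counts samples that are equal to  $r$,  which matches the  »less than or equal to«  in the CDF definition.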


»   The topic of this chapter is illustrated with examples in the  (German language)  learning video 
        »Zusammenhang zwischen WDF und VTF«  $\Rightarrow$ »Relationship between PDF and CDF«.


CDF for value-discrete random variables


For the CDF calculation of a value-discrete random variable  $x$  from its PDF,  a more general equation must always be assumed.  Here,  with the auxiliary variable  $\varepsilon > 0$:

$$F_{x}(r)=\lim_{\varepsilon\hspace{0.05cm}\to \hspace{0.05cm}0}\int_{-\infty}^{r+\varepsilon}f_x(x)\,{\rm d}x.$$
  • Due to the  »less than/equal to«  sign in the  »general definition«,  a limit must be taken in the CDF calculation.  If we also take into account that,  for a value-discrete random variable,  the PDF consists of a sum of weighted  »Dirac delta functions«,  we obtain:
$$F_{x}(r)=\lim_{\varepsilon\hspace{0.05cm}\to \hspace{0.05cm} 0}\int_{-\infty}^{r+\varepsilon}\sum\limits_{\mu= 1}^{ M}p_\mu\cdot \delta(x-x_\mu)\,{\rm d}x.$$
  • If we interchange integration and summation in this equation,  and consider that an integration over the Dirac delta function yields the step function,  we obtain:
$$F_{x}(r)=\sum\limits_{\mu= \rm 1}^{\it M}p_\mu\cdot \gamma_0 (r-x_\mu),\hspace{0.4cm}{\rm with} \hspace{0.4cm}\gamma_0(x)=\lim_{\varepsilon\hspace{0.05cm}\to \hspace{0.05cm} 0}\int_{-\infty}^{x+\varepsilon}\delta (u)\,{\rm d} u = \left\{ \begin{array}{*{2}{c}} 0 \hspace{0.4cm} {\rm if}\hspace{0.1cm} x< 0,\\ 1 \hspace{0.4cm} {\rm if}\hspace{0.1cm}x\ge 0. \\ \end{array} \right.$$
The function  $γ_0(x)$  differs from the  »unit step function«  $γ(x)$  often used in systems theory in that at the jump point  $x = 0$  the right-hand side limit  $1$  is valid  $($instead of the mean value  $0.5$  between left– and right–hand side limits$)$.
  • With the above CDF definition,  the following probability equation holds for value-continuous and value-discrete random variables equally,  and of course also for  mixed random variables  with discrete and continuous parts:
$${\rm Pr}(x_{\rm u}<x \le x_{\rm o})=F_x(x_{\rm o})-F_x(x_{\rm u}).$$
  • For purely value-continuous random variables,  the  »less than«  sign and the  »less than/equal to«  sign could be substituted for each other here.
$${\rm Pr}(x_{\rm u}<x \le x_{\rm o}) ={\rm Pr}(x_{\rm u}\le x \le x_{\rm o}) ={\rm Pr}(x_{\rm u}\le x < x_{\rm o}) ={\rm Pr}(x_{\rm u}<x < x_{\rm o}).$$
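
The step-function form of the CDF and the interval probability can be illustrated with a short sketch. The four-valued random variable used here  (values  $0, 1, 2, 3$  with probabilities  $1/8, 1/4, 1/2, 1/8$)  is an assumed example and does not come from the text;  the function name `discrete_cdf` is hypothetical.

```python
import numpy as np

def discrete_cdf(r, values, probs):
    """F_x(r) = sum over mu of p_mu * gamma_0(r - x_mu), with gamma_0(0) = 1."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # gamma_0(r - x_mu) equals 1 where x_mu <= r (right-hand limit at the jump).
    steps = (r[:, None] >= values[None, :]).astype(float)
    return steps @ probs

# Assumed example: M = 4 possible values with their probabilities.
x_mu = np.array([0.0, 1.0, 2.0, 3.0])
p_mu = np.array([0.125, 0.25, 0.5, 0.125])

# At a jump point the right-hand side limit is valid.
F_at_jump = discrete_cdf(1.0, x_mu, p_mu)[0]          # 0.375
F_just_below = discrete_cdf(1.0 - 1e-9, x_mu, p_mu)[0]  # 0.125

# Pr(x_u < x <= x_o) = F_x(x_o) - F_x(x_u)
x_u, x_o = 0.0, 2.0
pr_from_cdf = discrete_cdf(x_o, x_mu, p_mu)[0] - discrete_cdf(x_u, x_mu, p_mu)[0]
pr_half_open = p_mu[(x_mu > x_u) & (x_mu <= x_o)].sum()   # 0.75
pr_closed = p_mu[(x_mu >= x_u) & (x_mu <= x_o)].sum()     # 0.875

print(F_at_jump, F_just_below, pr_from_cdf, pr_half_open, pr_closed)
```

In this example  ${\rm Pr}(x_{\rm u} \le x \le x_{\rm o})$  is larger than  ${\rm Pr}(x_{\rm u} < x \le x_{\rm o})$  by the weight at  $x_{\rm u} = 0$.  This shows why the two signs are interchangeable only for purely value-continuous random variables.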

$\text{Example 2:}$  If the gray value of the  »original Lena photo«  is quantized to eight levels,  so that each pixel can be represented by three bits and transmitted digitally,  the discrete random variable  $q$  is obtained.   However,  due to the quantization,  a part of the image information is lost,  which is reflected in the quantized image by clearly recognizable  »contours«.

PDF and CDF of a value-discrete image
  • The associated PDF  $f_{q}(q)$  is composed of  $M = 8$  Dirac delta functions, where,  in the quantization chosen here,  the possible gray levels are assigned the values  $q_\mu = (\mu - 1)/7$  with  $\mu = 1, 2,$ ... , $8$.
  • The weights of the Dirac delta functions can be calculated from the PDF  $f_{x}(x)$  of the original image.  One obtains
$$p_\mu={\rm Pr}(q = q_\mu ) = {\rm Pr}(\frac{2\mu-\rm 3}{14}< {x} \le\frac{2\it \mu- \rm 1}{14}) $$
$$\Rightarrow \hspace{0.3cm} p_\mu={\rm Pr}(q = q_\mu ) = \int_{(2\it \mu- \rm 3)/14}^{(2\mu-1)/14}\it f_{x}{\rm (}x{\rm )}\,{\rm d}x.$$
  • Outside the range  $0 \le x \le 1$,  that is for  $x<0$  and  $x>1$,  the PDF is set to  $f_{x}(x) = 0$.  Since in the original image the gray levels  $x ≈0$  $($»very deep black«$)$  and  $x ≈1$  $($»almost pure white«$)$  are largely missing,  the result is  $p_1 ≈ p_8 ≈ 0$.
  • Thus,  only six Dirac delta functions are visible in the PDF.  The two missing Diracs at  $q = 0$  and  $q =1$  are only indicated by dots.
  • The step-shaped CDF  $F_{q}(r)$  sketched on the right thus has six points of discontinuity,  where in each case the right-hand side limit is valid.
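
The calculation of the Dirac weights  $p_\mu$  from the PDF of the original image can be sketched as follows. Since the actual gray value statistics of the Lena photo are not given in numerical form, the sketch assumes the PDF  $f_{x}(x) = 1 - \cos(2\pi x)$  on  $0 \le x \le 1$,  which also has almost no components near  $x = 0$  and  $x = 1$.  The name `cdf_x` is hypothetical.

```python
import numpy as np

def cdf_x(r):
    """CDF of the assumed original PDF f_x(x) = 1 - cos(2*pi*x) on [0, 1].

    Clipping implements f_x(x) = 0 for x < 0 and x > 1.
    """
    rc = np.clip(np.asarray(r, dtype=float), 0.0, 1.0)
    return rc - np.sin(2.0 * np.pi * rc) / (2.0 * np.pi)

M = 8
mu = np.arange(1, M + 1)

q_mu = (mu - 1) / 7.0            # quantized gray levels q_mu = (mu - 1)/7
lower = (2 * mu - 3) / 14.0      # lower interval limits (2*mu - 3)/14
upper = (2 * mu - 1) / 14.0      # upper interval limits (2*mu - 1)/14

# p_mu = Pr(lower < x <= upper) = F_x(upper) - F_x(lower)
p_mu = cdf_x(upper) - cdf_x(lower)

# Step-shaped CDF of q, evaluated at the jump points (right-hand side limits).
F_q = np.cumsum(p_mu)

for level, weight, height in zip(q_mu, p_mu, F_q):
    print(f"q = {level:.3f}   p = {weight:.4f}   F_q = {height:.4f}")
```

Under this assumed PDF the two outer weights  $p_1$  and  $p_8$  come out close to zero,  so that essentially six Dirac delta functions remain,  in line with the qualitative statement of the example.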


»   The topic of this chapter is illustrated with examples in the  (German language)  learning video 
        »Zusammenhang zwischen WDF und VTF«  $\Rightarrow$ »Relationship between PDF and CDF«.


Exercises for the chapter


Exercise 3.2: CDF for Exercise 3.1

Exercise 3.2Z: Relationship between PDF and CDF