Difference between revisions of "Theory of Stochastic Signals/Statistical Dependence and Independence"

From LNTwww
 
(15 intermediate revisions by 3 users not shown)
Line 6: Line 6:
 
}}
 
}}
  
==General Definition of Statistical Dependence==
+
==General definition of statistical dependence==
 
<br>
 
<br>
So far we have not paid much attention to&nbsp; ''statistical dependence''&nbsp;  between events, even though we have already used it as in the case of two disjoint sets: &nbsp; If an element belongs to&nbsp; $A$, it cannot with certainty also be contained in the disjoint set&nbsp; $B$&nbsp;.
+
So far we have not paid much attention to&nbsp; &raquo;statistical dependence&laquo;&nbsp;  between events,&nbsp; even though we have already used it as in the case of two&nbsp; &raquo;disjoint sets&laquo;: &nbsp;  
 +
*If an element belongs to&nbsp; $A$,&nbsp;  
  
The strongest form of dependence at all is such a&nbsp; '''deterministic dependence'''&nbsp; between two sets or two events.&nbsp; Less pronounced is the statistical dependence. Let us start with its complement:
+
*it certainly cannot also be contained in the disjoint set&nbsp; $B$.
 +
 
 +
 
 +
The strongest form of dependence of all is such a&nbsp; &raquo;'''deterministic dependence'''&laquo;&nbsp; between two sets or two events.&nbsp; Statistical dependence is less pronounced.&nbsp; Let us start with its complement:
  
 
{{BlaueBox|TEXT=   
 
{{BlaueBox|TEXT=   
$\text{Definition:}$&nbsp;
+
$\text{Definitions:}$&nbsp;
Two events&nbsp; $A$&nbsp; and&nbsp; $B$&nbsp; are called&nbsp; '''statistically independent'''&nbsp;, if the probability of the intersection&nbsp; $A ∩ B$&nbsp;  is equal to the product of the individual probabilities:
 
:$${\rm Pr}(A \cap B) = {\rm Pr}(A)\cdot {\rm Pr}(B).$$}}
 
  
 +
$(1)$&nbsp; Two events&nbsp; $A$&nbsp; and&nbsp; $B$&nbsp; are called&nbsp; &raquo;'''statistically independent'''&laquo;,&nbsp; if the probability of the intersection&nbsp; $A ∩ B$&nbsp;  is equal to the product of the individual probabilities:
 +
:$${\rm Pr}(A \cap B) = {\rm Pr}(A)\cdot {\rm Pr}(B).$$
 +
$(2)$&nbsp; If this condition is not satisfied,&nbsp; then the events&nbsp; $A$&nbsp; and&nbsp; $B$&nbsp; are &raquo;'''statistically dependent'''&laquo;:
 +
:$${\rm Pr}(A \cap B) \ne {\rm Pr}(A)\cdot {\rm Pr}(B).$$}}
  
*In some applications, statistical independence is obvious, for example, in the "coin toss" experiment. The probability for "heads" or "tails" is independent of whether&nbsp; ''heads''&nbsp; or&nbsp; ''tails''&nbsp; occurred in the last toss.
 
  
*And also the individual results in the random experiment "throwing a roulette ball" are always statistically independent of each other under fair conditions, even if individual system players do not want to admit this.
+
*In some applications,&nbsp; statistical independence is obvious,&nbsp; for example,&nbsp; in the&nbsp; &raquo;coin toss&laquo;&nbsp; experiment.&nbsp; The probability for&nbsp; &raquo;heads&laquo;&nbsp; or&nbsp; &raquo;tails&laquo;&nbsp; is independent of whether&nbsp; &raquo;heads&laquo;&nbsp; or&nbsp; &raquo;tails&laquo;&nbsp; occurred in the last toss.
  
*In other applications, on the other hand, the question whether two events are statistically independent or not is not or only very difficult to answer instinctively.&nbsp; Here one can only arrive at the correct answer by checking the formal independence criterion given above, as the following example will show.
+
*And also the individual results in the random experiment&nbsp; &raquo;throwing a roulette ball&laquo;&nbsp; are always statistically independent of each other under fair conditions,&nbsp; even if individual system players do not want to admit this.
 +
 
 +
*In other applications,&nbsp; on the other hand,&nbsp; the question of whether two events are statistically independent is difficult or even impossible to answer intuitively.&nbsp; Here one can only arrive at the correct answer by checking the formal independence criterion given above,&nbsp; as the following example will show.
  
  
[[File:EN_Sto_T_1_3_S1.png|right|frame| Examples for statistically independent events]]
 
 
{{GraueBox|TEXT=   
 
{{GraueBox|TEXT=   
 
$\text{Example 1:}$&nbsp;
 
$\text{Example 1:}$&nbsp;
We consider the experiment  "throwing two dice", where the two dice can be distinguished by their colors red&nbsp; $(R)$&nbsp; and blue&nbsp; $(B)$&nbsp;.
+
We consider the experiment&nbsp; &raquo;throwing two dice&laquo;,&nbsp; where the two dice&nbsp; $($in graphic:&nbsp; "cubes"$)$&nbsp; can be distinguished by their colors red&nbsp; $(R)$&nbsp; and blue&nbsp; $(B)$.&nbsp; The graph illustrates this fact,&nbsp; where the sum&nbsp; $S = R + B$&nbsp; is entered in the two-dimensional field&nbsp; $(R, B)$.
 +
 
 +
For the following description we define the following events:
 +
[[File:EN_Sto_T_1_3_S1.png|right|frame| Examples for statistically independent events]]
 +
*$A_1$:&nbsp; The outcome of the red cube is&nbsp; $R < 4$&nbsp; $($red background$)$ &nbsp; &rArr; &nbsp; ${\rm Pr}(A_1) = 1/2$,
 +
*$A_2$:&nbsp; The outcome of the blue cube is&nbsp; $B > 4$&nbsp; $($blue font$)$ &nbsp; &rArr; &nbsp; ${\rm Pr}(A_2) = 1/3$,
 +
*$A_3$:&nbsp; The sum of the two cubes is&nbsp; $S = 7$&nbsp; $($green outline$)$ &nbsp; &rArr; &nbsp; ${\rm Pr}(A_3) = 1/6$,
 +
*$A_4$:&nbsp; The sum of the two cubes is&nbsp; $S = 8$&nbsp;  &nbsp; &rArr; &nbsp; ${\rm Pr}(A_4) = 5/36$,
 +
*$A_5$:&nbsp; The sum of the two cubes is&nbsp; $S = 10$&nbsp;  &nbsp; &rArr; &nbsp; ${\rm Pr}(A_5) = 3/36$.
  
The graph illustrates this fact, where the sum&nbsp; $S = R + B$&nbsp; is entered in the two-dimensional field&nbsp; $(R, B)$&nbsp;.
 
  
For the following description we define the following events:
 
*$A_1$:&nbsp; The number of eyes of the red die is&nbsp; $R < 4$&nbsp; (red background) &nbsp; &rArr; &nbsp; ${\rm Pr}(A_1) = 1/2$,
 
*$A_2$:&nbsp; The number of eyes of the blue die is&nbsp; $B > 4$&nbsp; (blue font) &nbsp; &rArr; &nbsp; ${\rm Pr}(A_2) = 1/3$,
 
*$A_3$:&nbsp; The sum of the two dice is&nbsp; $S = 7$&nbsp; (green outline) &nbsp; &rArr; &nbsp; ${\rm Pr}(A_3) = 1/6$,
 
*$A_4$:&nbsp; The sum of the two dice is&nbsp; $S = 8$&nbsp;  &nbsp; &rArr; &nbsp; ${\rm Pr}(A_4) = 5/36$,
 
*$A_5$:&nbsp; The sum of the two dice is&nbsp; $S = 10$&nbsp;  &nbsp; &rArr; &nbsp; ${\rm Pr}(A_5) = 3/36$.
 
<br clear=all>
 
 
The graph can be interpreted as follows:
 
The graph can be interpreted as follows:
*The two events&nbsp; $A_1$&nbsp; and&nbsp; $A_2$&nbsp; are statistically independent because the probability&nbsp; ${\rm Pr}(A_1 ∩ A_2) = 1/6$&nbsp; of the intersection is equal to the product of the two individual probabilities&nbsp; ${\rm Pr}(A_1) = 1/2$&nbsp; and&nbsp; ${\rm Pr}(A_2) = 1/3$&nbsp;.&nbsp; Given the problem definition, any other result would also have been very surprising.
+
*The two events&nbsp; $A_1$&nbsp; and&nbsp; $A_2$&nbsp; are statistically independent because the probability&nbsp; ${\rm Pr}(A_1 ∩ A_2) = 1/6$&nbsp; of the intersection is equal to the product of the two individual probabilities&nbsp; ${\rm Pr}(A_1) = 1/2$&nbsp; and&nbsp; ${\rm Pr}(A_2) = 1/3$&nbsp;.&nbsp; Given the problem definition,&nbsp; any other result would also have been very surprising.
*But also the events&nbsp; $A_1$&nbsp; and&nbsp; $A_3$&nbsp; are statistically independent because of&nbsp; ${\rm Pr}(A_1) = 1/2$,&nbsp; ${\rm Pr}(A_3) = 1/6$&nbsp; and&nbsp; ${\rm Pr}(A_1 ∩ A_3) = 1/12$.&nbsp; The probability of intersection&nbsp; $(1/12)$&nbsp; arises because three of the&nbsp; $36$&nbsp; squares are both highlighted in red and outlined in green.
+
 
*In contrast, there are statistical ties between events&nbsp; $A_1$&nbsp; and&nbsp; $A_4$&nbsp; because the probability of intersection &nbsp; &rArr; &nbsp; ${\rm Pr}(A_1 ∩ A_4) = 1/18 = 4/72$&nbsp; is not equal to the product&nbsp; ${\rm Pr}(A_1) \cdot {\rm Pr}(A_4)= 1/2 \cdot 5/36 = 5/72$&nbsp;.
+
*But also the events&nbsp; $A_1$&nbsp; and&nbsp; $A_3$&nbsp; are statistically independent because of&nbsp; ${\rm Pr}(A_1) = 1/2$,&nbsp; ${\rm Pr}(A_3) = 1/6$&nbsp; and&nbsp; ${\rm Pr}(A_1 ∩ A_3) = 1/12$.&nbsp; The probability of intersection&nbsp; $(1/12)$&nbsp; arises because three of the&nbsp; $36$&nbsp; squares are both highlighted in red and outlined in green.
*The two events&nbsp; $A_1$&nbsp; and&nbsp; $A_5$&nbsp; are even disjoint &nbsp; &rArr; &nbsp; ${\rm Pr}(A_1 ∩ A_5) = 0$: &nbsp; none of the boxes with red background is labeled&nbsp; $S=10$&nbsp;. This example shows that disjointness is a particularly pronounced form of statistical dependence. }}
+
 
 +
*In contrast,&nbsp; there are statistical bindings between&nbsp; $A_1$&nbsp; and&nbsp; $A_4$&nbsp; because the probability of intersection &nbsp; &rArr; &nbsp; ${\rm Pr}(A_1 ∩ A_4) = 1/18 = 4/72$&nbsp; is not equal to the product&nbsp; ${\rm Pr}(A_1) \cdot {\rm Pr}(A_4)= 1/2 \cdot 5/36 = 5/72$&nbsp;.
 +
 
 +
*The two events&nbsp; $A_1$&nbsp; and&nbsp; $A_5$&nbsp; are even disjoint &nbsp; &rArr; &nbsp; ${\rm Pr}(A_1 ∩ A_5) = 0$: &nbsp; none of the boxes with red background is labeled&nbsp; $S=10$&nbsp;.  
 +
 
 +
 
 +
'''This example shows that disjointness is a particularly pronounced form of statistical dependence'''. }}
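
The statements of Example 1 can be checked by enumerating all&nbsp; $36$&nbsp; equally likely outcomes&nbsp; $(R, B)$.&nbsp; The following Python sketch is only an illustration;&nbsp; the helper names `pr` and `independent` are assumptions made here and are not part of the original text.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (R, B) of the red and the blue die.
OUTCOMES = list(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event given as a predicate on an outcome (R, B)."""
    favorable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favorable, len(OUTCOMES))


def independent(e1, e2):
    """Check the criterion Pr(E1 ∩ E2) == Pr(E1) * Pr(E2)."""
    joint = pr(lambda o: e1(o) and e2(o))
    return joint == pr(e1) * pr(e2)


# Events of Example 1; o[0] is the red die R, o[1] is the blue die B.
def A1(o): return o[0] < 4            # R < 4
def A2(o): return o[1] > 4            # B > 4
def A3(o): return o[0] + o[1] == 7    # S = 7
def A4(o): return o[0] + o[1] == 8    # S = 8
def A5(o): return o[0] + o[1] == 10   # S = 10


for name, other in [("A2", A2), ("A3", A3), ("A4", A4), ("A5", A5)]:
    joint = pr(lambda o: A1(o) and other(o))
    print(f"A1 and {name}: joint = {joint}, "
          f"product = {pr(A1) * pr(other)}, "
          f"independent = {independent(A1, other)}")
```

Exact fractions are used instead of floating-point numbers so that the equality test in the independence criterion is not affected by rounding.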
  
==Conditional Probability==
+
==Conditional probability==
 
<br>
 
<br>
If there are statistical ties between the two events&nbsp; $A$&nbsp; and&nbsp; $B$&nbsp;, the (unconditional) probabilities&nbsp; ${\rm Pr}(A)$&nbsp; and&nbsp; ${\rm Pr}(B)$&nbsp; do not describe the situation unambiguously in the statistical sense.&nbsp; So-called conditional probabilities are then required.
+
If there are statistical bindings between the two events&nbsp; $A$&nbsp; and&nbsp; $B$,&nbsp; the&nbsp; $($unconditional$)$&nbsp; probabilities&nbsp; ${\rm Pr}(A)$&nbsp; and&nbsp; ${\rm Pr}(B)$&nbsp; do not describe the situation unambiguously in the statistical sense.&nbsp; So-called&nbsp; &raquo;conditional probabilities&laquo;&nbsp; are then required.
  
 
{{BlaueBox|TEXT=   
 
{{BlaueBox|TEXT=   
 
$\text{Definitions:}$&nbsp;
 
$\text{Definitions:}$&nbsp;
The&nbsp; '''conditional probability''' of&nbsp; $A$&nbsp; under condition&nbsp; $B$&nbsp; can be calculated as follows:
+
 
 +
$(1)$&nbsp; The&nbsp; &raquo;'''conditional probability'''&laquo; of&nbsp; $A$&nbsp; under condition&nbsp; $B$&nbsp; can be calculated as follows:
 
:$${\rm Pr}(A\hspace{0.05cm} \vert \hspace{0.05cm} B) = \frac{ {\rm Pr}(A \cap B)}{ {\rm Pr}(B)}.$$
 
:$${\rm Pr}(A\hspace{0.05cm} \vert \hspace{0.05cm} B) = \frac{ {\rm Pr}(A \cap B)}{ {\rm Pr}(B)}.$$
  
Similarly, the conditional probability of&nbsp; $B$&nbsp; under condition&nbsp; $A$ is:
+
$(2)$&nbsp; Similarly,&nbsp; the conditional probability of&nbsp; $B$&nbsp; under condition&nbsp; $A$&nbsp; is:
 
:$${\rm Pr}(B\hspace{0.05cm} \vert \hspace{0.05cm}A) = \frac{ {\rm Pr}(A \cap B)}{ {\rm Pr}(A)}.$$
 
:$${\rm Pr}(B\hspace{0.05cm} \vert \hspace{0.05cm}A) = \frac{ {\rm Pr}(A \cap B)}{ {\rm Pr}(A)}.$$
  
Combining these two equations, we get&nbsp; [https://en.wikipedia.org/wiki/Thomas_Bayes Bayes'] Theorem:
+
$(3)$&nbsp; Combining these two equations,&nbsp; we get&nbsp; [https://en.wikipedia.org/wiki/Thomas_Bayes $\text{Bayes'}$]&nbsp; theorem:
 
:$${\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm} A) = \frac{ {\rm Pr}(A\hspace{0.05cm} \vert \hspace{0.05cm} B)\cdot {\rm Pr}(B)}{ {\rm Pr}(A)}.$$}}
 
:$${\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm} A) = \frac{ {\rm Pr}(A\hspace{0.05cm} \vert \hspace{0.05cm} B)\cdot {\rm Pr}(B)}{ {\rm Pr}(A)}.$$}}
  
  
 
Below are some properties of conditional probabilities:
 
Below are some properties of conditional probabilities:
*A conditional probability also always lies between&nbsp; $0$&nbsp; and&nbsp; $1$&nbsp; including these two limits: &nbsp; $0 \le {\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm} B) \le 1$.
+
#A conditional probability also always lies between&nbsp; $0$&nbsp; and&nbsp; $1$,&nbsp; including these two limits: &nbsp; $0 \le {\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm} B) \le 1$.
*If the condition&nbsp; $B$&nbsp; can be regarded as constant, all calculation rules given in the chapter&nbsp; [[Theory_of_Stochastic_Signals/Set_Theory_Basics|Set Theory Basics]]&nbsp;  for the unconditional probabilities&nbsp; ${\rm Pr}(A)$&nbsp; and&nbsp; ${\rm Pr}(B)$&nbsp; still apply.
+
#With constant  condition&nbsp; $B$,&nbsp; all calculation rules given in the chapter&nbsp; [[Theory_of_Stochastic_Signals/Set_Theory_Basics|&raquo;Set Theory Basics&laquo;]]&nbsp;  for the unconditional probabilities&nbsp; ${\rm Pr}(A)$&nbsp; and&nbsp; ${\rm Pr}(B)$&nbsp; still apply.
*If the existing events&nbsp; $A$&nbsp; and&nbsp; $B$&nbsp; are disjoint, then&nbsp; ${\rm Pr}(A\hspace{0.05cm} | \hspace{0.05cm} B) = {\rm Pr}(B\hspace{0.05cm} | \hspace{0.05cm}A)= 0$.   
+
#If the existing events&nbsp; $A$&nbsp; and&nbsp; $B$&nbsp; are disjoint,&nbsp; then&nbsp; ${\rm Pr}(A\hspace{0.05cm} | \hspace{0.05cm} B) = {\rm Pr}(B\hspace{0.05cm} | \hspace{0.05cm}A)= 0$&nbsp; $($convention:&nbsp; event&nbsp; $A$&nbsp; &raquo;exists&laquo;&nbsp; if&nbsp; ${\rm Pr}(A) > 0)$.
*If&nbsp; $B$&nbsp; is a proper or improper subset of&nbsp; $A$, then&nbsp; ${\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm} B) =1$. &nbsp;
+
#If&nbsp; $B$&nbsp; is a proper or improper subset of&nbsp; $A$,&nbsp; then&nbsp; ${\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm} B) =1$. &nbsp;
*If two events&nbsp; $A$&nbsp; and&nbsp; $B$ are statistically independent, their conditional probabilities are equal to the unconditional ones, as the following calculation shows:
+
#If two events&nbsp; $A$&nbsp; and&nbsp; $B$ are statistically independent,&nbsp; their conditional probabilities are equal to the unconditional ones,&nbsp; as the following calculation shows:
:$${\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm} B) = \frac{{\rm Pr}(A \cap B)}{{\rm Pr}(B)} = \frac{{\rm Pr} ( A) \cdot {\rm Pr} ( B)} { {\rm Pr}(B)} = {\rm Pr} ( A).$$
+
::$${\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm} B) = \frac{{\rm Pr}(A \cap B)}{{\rm Pr}(B)} = \frac{{\rm Pr} ( A) \cdot {\rm Pr} ( B)} { {\rm Pr}(B)} = {\rm Pr} ( A).$$
  
[[File:EN_Sto_T_1_3_S2.png |frame| Example of statistically dependent events|right]]
 
 
{{GraueBox|TEXT=   
 
{{GraueBox|TEXT=   
 
$\text{Example 2:}$&nbsp;
 
$\text{Example 2:}$&nbsp;
We again consider the experiment "Throwing two dice", where, as in&nbsp; [[Theory_of_Stochastic_Signals/Statistical_Dependence_and_Independence#General_Definition_of_Statistical_Dependence|$\text{Example 1}$]]&nbsp; $S = R + B$&nbsp; denotes the sum of the red and blue dice.
+
We again consider the experiment&nbsp; &raquo;Throwing two dice&laquo;,&nbsp; where&nbsp; $S = R + B$&nbsp; denotes the sum of the red and blue dice&nbsp; $($cube$)$.
  
Here we consider ties between the two events
+
[[File:EN_Sto_T_1_3_S2.png |frame| Example of statistically dependent events|right]]
*$A_1$:&nbsp; The roll of the red die is&nbsp; $R < 4$&nbsp; (red background) &nbsp; &rArr; &nbsp; ${\rm Pr}(A_1) = 1/2$,
+
Here we consider bindings between the two events
*$A_4$:&nbsp; The sum of the two dice is&nbsp; $S = 8$&nbsp; (green outline)  &nbsp; &rArr; &nbsp; ${\rm Pr}(A_4) = 5/36$,
+
*$A_1$:&nbsp; &raquo;The outcome of the red cube is&nbsp; $R < 4$ &laquo;&nbsp; $($red background$)$ &nbsp; &rArr; &nbsp; ${\rm Pr}(A_1) = 1/2$,
  
 +
*$A_4$:&nbsp; &raquo;The sum of the two cubes is&nbsp; $S = 8$ &laquo;&nbsp; $($green outline$)$  &nbsp; &rArr; &nbsp; ${\rm Pr}(A_4) = 5/36$,
  
and refer again to the event
+
 
*$A_3$:&nbsp; The sum of the two cubes is&nbsp; $S = 7$ &nbsp; &rArr; &nbsp; ${\rm Pr}(A_3) = 1/6$.
+
and refer again to the event of&nbsp; [[Theory_of_Stochastic_Signals/Statistical_Dependence_and_Independence#General_definition_of_statistical_dependence|$\text{Example 1}$]]:&nbsp;
<br clear=all>
+
*$A_3$:&nbsp; &raquo;The sum of the two cubes is&nbsp; $S = 7$ &laquo; &nbsp; &rArr; &nbsp; ${\rm Pr}(A_3) = 1/6$.
Regarding this graph, note:
+
 
*There are statistical ties between events&nbsp; $A_1$&nbsp; and&nbsp; $A_4$&nbsp;, since the probability of intersection  &nbsp; &rArr; &nbsp; ${\rm Pr}(A_1 ∩ A_4) = 2/36 = 4/72$&nbsp; is not equal to the product&nbsp; ${\rm Pr}(A_1) \cdot {\rm Pr}(A_4)= 1/2 \cdot 5/36 = 5/72$&nbsp;.
*The conditional probability&nbsp; ${\rm Pr}(A_1 \hspace{0.05cm} \vert \hspace{0.05cm} A_4) = 2/5$&nbsp; can be calculated from the quotient of the joint probability&nbsp; ${\rm Pr}(A_1 ∩ A_4) = 2/36$&nbsp; and the probability &nbsp; ${\rm Pr}(A_4) = 5/36$&nbsp;.
+
 
*Since&nbsp; $A_1$&nbsp; and&nbsp; $A_4$&nbsp; are statistically dependent, the conditional probability&nbsp; ${\rm Pr}(A_1 \hspace{0.05cm}\vert \hspace{0.05cm} A_4) = 2/5$&nbsp;  (two of the five squares outlined in green are highlighted in red)&nbsp; is not equal to the absolute probability&nbsp; ${\rm Pr}(A_1) = 1/2$&nbsp; (half of all squares are highlighted in red).  
+
Regarding this graph,&nbsp; note:
*Similarly, the conditional probability&nbsp; ${\rm Pr}(A_4 \hspace{0.05cm} \vert \hspace{0.05cm} A_1) = 2/18 = 4/36$&nbsp;  (two of the&nbsp; $18$&nbsp; fields with a red background are outlined in green) is unequal to the absolute probability&nbsp; ${\rm Pr}(A_4) = 5/36$&nbsp; (a total of five of the&nbsp; $36$&nbsp; fields are outlined in green).  
+
*There are statistical bindings between the two events&nbsp; $A_1$&nbsp; and&nbsp; $A_4$,&nbsp; since the probability of intersection  &nbsp; &rArr; &nbsp; ${\rm Pr}(A_1 ∩ A_4) = 2/36 = 4/72$&nbsp; is not equal to the product&nbsp; ${\rm Pr}(A_1) \cdot {\rm Pr}(A_4)= 1/2 \cdot 5/36 = 5/72$.
*This last result can also be derived using&nbsp; '''Bayes' theorem'''&nbsp;, for example:
+
   
 +
*The conditional probability&nbsp; ${\rm Pr}(A_1 \hspace{0.05cm} \vert \hspace{0.05cm} A_4) = 2/5$&nbsp; can be calculated from the quotient of the&nbsp; &raquo;joint probability&laquo;&nbsp; ${\rm Pr}(A_1 ∩ A_4) = 2/36$&nbsp; and the absolute probability &nbsp; ${\rm Pr}(A_4) = 5/36$.
 +
 
 +
*Since the events&nbsp; $A_1$&nbsp; and&nbsp; $A_4$&nbsp; are statistically dependent,&nbsp; the conditional probability&nbsp; ${\rm Pr}(A_1 \hspace{0.05cm}\vert \hspace{0.05cm} A_4) = 2/5$&nbsp;  $($two of the five squares outlined in green are highlighted in red$)$&nbsp; is not equal to the absolute probability&nbsp; ${\rm Pr}(A_1) = 1/2$&nbsp; $($half of all squares are highlighted in red$)$.
 +
 +
*Similarly,&nbsp; the conditional probability&nbsp; ${\rm Pr}(A_4 \hspace{0.05cm} \vert \hspace{0.05cm} A_1) = 2/18 = 1/9$&nbsp;  $($two of the&nbsp; $18$&nbsp; fields with a red background are outlined in green$)$&nbsp; is unequal to the absolute probability&nbsp; ${\rm Pr}(A_4) = 5/36$&nbsp; $($a total of five of the&nbsp; $36$&nbsp; fields are outlined in green$)$.  
 +
 
 +
*This last result can also be derived using&nbsp; &raquo;'''Bayes'&nbsp; theorem'''&laquo;, &nbsp; for example:
 
:$${\rm Pr}(A_4 \hspace{0.05cm} \vert\hspace{0.05cm} A_1) =  \frac{ {\rm Pr}(A_1 \hspace{0.05cm} \vert\hspace{0.05cm} A_4)\cdot {\rm Pr} ( A_4)} {  {\rm Pr}(A_1)}  = \frac{2/5 \cdot 5/36}{1/2}  = 1/9.$$
 
:$${\rm Pr}(A_4 \hspace{0.05cm} \vert\hspace{0.05cm} A_1) =  \frac{ {\rm Pr}(A_1 \hspace{0.05cm} \vert\hspace{0.05cm} A_4)\cdot {\rm Pr} ( A_4)} {  {\rm Pr}(A_1)}  = \frac{2/5 \cdot 5/36}{1/2}  = 1/9.$$
*In contrast, the following conditional probabilities hold for&nbsp; $A_1$&nbsp; and the statistically independent event&nbsp; $A_3$&nbsp; , see&nbsp; [[Theory_of_Stochastic_Signals/Statistische_Abhängigkeit_und_Unabhängigkeit#General_Definition_of_Statistical_Dependence| Example 1]]:
+
*In contrast,&nbsp; the following conditional probabilities hold for&nbsp; $A_1$&nbsp; and the statistically independent event&nbsp; $A_3$,&nbsp; see&nbsp; [[Theory_of_Stochastic_Signals/Statistical_Dependence_and_Independence#General_definition_of_statistical_dependence|$\text{Example 1}$]]:
:$${\rm Pr}(A_{\rm 1} \hspace{0.05cm}\vert \hspace{0.05cm} A_{\rm 3}) = {\rm Pr}(A_{\rm 1}) = \rm 1/2\hspace{0.5cm}{\rm resp.}\hspace{0.5cm}{\rm Pr}(A_{\rm 3} \hspace{0.05cm} \vert \hspace{0.05cm} A_{\rm 1}) = {\rm Pr}(A_{\rm 3}) = 1/6.$$}}
+
:$${\rm Pr}(A_{\rm 1} \hspace{0.05cm}\vert \hspace{0.05cm} A_{\rm 3}) = {\rm Pr}(A_{\rm 1}) = \rm 1/2\hspace{0.5cm}{\rm resp.}\hspace{0.5cm}{\rm Pr}(A_{\rm 3} \hspace{0.05cm} \vert \hspace{0.05cm} A_{\rm 1}) = {\rm Pr}(A_{\rm 3}) = 1/6.$$}}
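
The conditional probabilities of Example 2 and the check via Bayes' theorem can be reproduced numerically.&nbsp; The sketch below again enumerates the&nbsp; $36$&nbsp; outcomes;&nbsp; the function names `pr` and `pr_given` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (R, B) of the red and the blue die.
OUTCOMES = list(product(range(1, 7), repeat=2))


def pr(event):
    """Unconditional probability of an event (predicate on (R, B))."""
    favorable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favorable, len(OUTCOMES))


def pr_given(event, condition):
    """Conditional probability Pr(event | condition) = Pr(both) / Pr(condition)."""
    return pr(lambda o: event(o) and condition(o)) / pr(condition)


def A1(o): return o[0] < 4            # R < 4
def A3(o): return o[0] + o[1] == 7    # S = 7
def A4(o): return o[0] + o[1] == 8    # S = 8


p_a1_given_a4 = pr_given(A1, A4)
p_a4_given_a1 = pr_given(A4, A1)

# Bayes' theorem: Pr(A4 | A1) = Pr(A1 | A4) * Pr(A4) / Pr(A1)
p_a4_given_a1_bayes = p_a1_given_a4 * pr(A4) / pr(A1)

print("Pr(A1|A4) =", p_a1_given_a4)
print("Pr(A4|A1) =", p_a4_given_a1, "(direct),", p_a4_given_a1_bayes, "(Bayes)")
print("Pr(A1|A3) =", pr_given(A1, A3), " Pr(A3|A1) =", pr_given(A3, A1))
```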
  
  
 
==General multiplication theorem==
 
==General multiplication theorem==
 
<br>
 
<br>
We consider several events denoted as&nbsp;  $A_i$&nbsp; with&nbsp; $1 ≤ i ≤ I$&nbsp;.&nbsp; However, these events&nbsp; $A_i$&nbsp; now no longer represent a&nbsp; [[Theory_of_Stochastic_Signals/Set_Theory_Basics#Complete_system|complete system]]&nbsp;, viz,
+
Furthermore,&nbsp; we consider several events denoted as&nbsp;  $A_i$&nbsp; with&nbsp; $1 ≤ i ≤ I$.&nbsp; However,&nbsp; these events&nbsp; $A_i$&nbsp; no longer represent a&nbsp; [[Theory_of_Stochastic_Signals/Set_Theory_Basics#Complete_system|&raquo;complete system&laquo;]]&nbsp;, viz:
*they are not pairwise disjoint to each other, and
+
*They are not pairwise disjoint to each other.&nbsp;
*there may also be statistical ties between the individual events.
+
 
 +
*There may also be statistical bindings between the individual events.
  
  
 
{{BlaueBox|TEXT=   
 
{{BlaueBox|TEXT=   
$\text{Definition:}$&nbsp;
+
$\text{Definition:}$&nbsp;  
For the so-called&nbsp; '''joint probability''', i.e. for the probability of the intersection of all&nbsp; $I$&nbsp; events&nbsp; $A_i$, holds in this case:
+
 
 +
$(1)$&nbsp; The so-called&nbsp; &raquo;'''joint probability'''&laquo;,&nbsp; i.e. the probability of the intersection of all&nbsp; $I$&nbsp; events&nbsp; $A_i$,&nbsp; is given in this case by:
 
:$${\rm Pr}(A_{\rm 1} \cap \hspace{0.02cm}\text{ ...}\hspace{0.1cm} \cap A_{I})  =  
 
:$${\rm Pr}(A_{\rm 1} \cap \hspace{0.02cm}\text{ ...}\hspace{0.1cm} \cap A_{I})  =  
 
  {\rm Pr}(A_{I})\hspace{0.05cm}\cdot\hspace{0.05cm}{\rm Pr}(A_{I \rm -1} \hspace{0.05cm}\vert  \hspace{0.05cm} A_I) \hspace{0.05cm}\cdot \hspace{0.05cm}{\rm Pr}(A_{I \rm -2} \hspace{0.05cm}\vert\hspace{0.05cm} A_{I - \rm 1}\cap A_I)\hspace{0.05cm} \cdot  \hspace{0.02cm}\text{ ...}  \hspace{0.1cm}  \cdot\hspace{0.05cm} {\rm Pr}(A_{\rm 1} \hspace{0.05cm}\vert  \hspace{0.05cm}A_{\rm 2} \cap \hspace{0.02cm}\text{ ...}  \hspace{0.1cm}\cap A_{ I}).$$
 
  {\rm Pr}(A_{I})\hspace{0.05cm}\cdot\hspace{0.05cm}{\rm Pr}(A_{I \rm -1} \hspace{0.05cm}\vert  \hspace{0.05cm} A_I) \hspace{0.05cm}\cdot \hspace{0.05cm}{\rm Pr}(A_{I \rm -2} \hspace{0.05cm}\vert\hspace{0.05cm} A_{I - \rm 1}\cap A_I)\hspace{0.05cm} \cdot  \hspace{0.02cm}\text{ ...}  \hspace{0.1cm}  \cdot\hspace{0.05cm} {\rm Pr}(A_{\rm 1} \hspace{0.05cm}\vert  \hspace{0.05cm}A_{\rm 2} \cap \hspace{0.02cm}\text{ ...}  \hspace{0.1cm}\cap A_{ I}).$$
  
In the same way, of course, holds:
+
$(2)$&nbsp; In the same way,&nbsp; the following holds:
 
:$${\rm Pr}(A_{\rm 1} \cap \hspace{0.02cm}\text{ ...}\hspace{0.1cm} \cap A_{I})  =  {\rm Pr}(A_1)\hspace{0.05cm}\cdot\hspace{0.05cm}{\rm Pr}(A_2 \hspace{0.05cm}\vert  \hspace{0.05cm} A_1) \hspace{0.05cm}\cdot \hspace{0.05cm}{\rm Pr}(A_3 \hspace{0.05cm}\vert \hspace{0.05cm}  A_1\cap  A_2)\hspace{0.05cm} \cdot \hspace{0.02cm}\text{ ...}\hspace{0.1cm}  \cdot\hspace{0.05cm} {\rm Pr}(A_I \hspace{0.05cm}\vert  \hspace{0.05cm}A_1 \cap \hspace{0.02cm} \text{ ...}  \hspace{0.1cm}\cap A_{ I-1}).$$}}
 
:$${\rm Pr}(A_{\rm 1} \cap \hspace{0.02cm}\text{ ...}\hspace{0.1cm} \cap A_{I})  =  {\rm Pr}(A_1)\hspace{0.05cm}\cdot\hspace{0.05cm}{\rm Pr}(A_2 \hspace{0.05cm}\vert  \hspace{0.05cm} A_1) \hspace{0.05cm}\cdot \hspace{0.05cm}{\rm Pr}(A_3 \hspace{0.05cm}\vert \hspace{0.05cm}  A_1\cap  A_2)\hspace{0.05cm} \cdot \hspace{0.02cm}\text{ ...}\hspace{0.1cm}  \cdot\hspace{0.05cm} {\rm Pr}(A_I \hspace{0.05cm}\vert  \hspace{0.05cm}A_1 \cap \hspace{0.02cm} \text{ ...}  \hspace{0.1cm}\cap A_{ I-1}).$$}}
  
Line 111: Line 133:
 
{{GraueBox|TEXT=   
 
{{GraueBox|TEXT=   
 
$\text{Example 3:}$&nbsp;
 
$\text{Example 3:}$&nbsp;
A lottery drum contains ten lots, including three hits&nbsp; $($event $T_1)$.&nbsp; Then the probability of drawing two hits with two tickets is:
+
A lottery drum contains ten tickets,&nbsp; including three hits&nbsp; $($event&nbsp; $T_1$:&nbsp; a hit in the first draw$)$.&nbsp;
 +
 
 +
*Then the probability of drawing two hits with two tickets is:
  
 
:$${\rm Pr}(T_1 \cap T_2) = {\rm Pr}(T_1) \cdot {\rm Pr}(T_2 \hspace{0.05cm }\vert \hspace{0.05cm} T_1) = 3/10 \cdot 2/9 = 1/15 \approx 6.7 \%.$$
 
:$${\rm Pr}(T_1 \cap T_2) = {\rm Pr}(T_1) \cdot {\rm Pr}(T_2 \hspace{0.05cm }\vert \hspace{0.05cm} T_1) = 3/10 \cdot 2/9 = 1/15 \approx 6.7 \%.$$
  
*This takes into account that in the second draw&nbsp; $($event $T_2)$&nbsp; there would be only nine tickets and two hits in the urn if one hit had been drawn in the first run &nbsp; &rArr; &nbsp;  ${\rm Pr}(T_2 \hspace{0.05cm} \vert\hspace{0.05cm} T_1) = 2/9$ .
+
*This takes into account that in the second draw&nbsp; $($event&nbsp; $T_2)$&nbsp; there would be only nine tickets and two hits in the drum if one hit had been drawn in the first draw:
*However, if the tickets were returned to the drum after the draw, the events&nbsp; $T_1$&nbsp; and&nbsp; $T_2$&nbsp; would be statistically independent and it would hold:   
+
:$${\rm Pr}(T_2 \hspace{0.05cm} \vert\hspace{0.05cm} T_1) = 2/9\approx 22.2 \%.$$  
 +
*However,&nbsp; if the tickets were returned to the drum after the draw,&nbsp; the events&nbsp; $T_1$&nbsp; and&nbsp; $T_2$&nbsp; would be statistically independent and it would hold:   
 
:$$ {\rm Pr}(T_1 ∩ T_2) = (3/10)^2 = 9\%.$$}}
 
:$$ {\rm Pr}(T_1 ∩ T_2) = (3/10)^2 = 9\%.$$}}
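
Both results of Example 3 can be verified by counting ordered pairs of drawn tickets.&nbsp; This is a minimal sketch under the assumption that the tickets are numbered&nbsp; $0$&nbsp; to&nbsp; $9$&nbsp; and that tickets&nbsp; $0, 1, 2$&nbsp; are the hits;&nbsp; this labeling and the function name `pr_two_hits` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

TICKETS = range(10)   # ten tickets in the drum
HITS = {0, 1, 2}      # assumed labeling: tickets 0, 1, 2 are the three hits


def pr_two_hits(ordered_draws):
    """Fraction of equally likely (first, second) draws in which both are hits."""
    draws = list(ordered_draws)
    favorable = sum(1 for first, second in draws
                    if first in HITS and second in HITS)
    return Fraction(favorable, len(draws))


# Without replacement: the second ticket differs from the first (90 ordered pairs).
without_replacement = pr_two_hits(permutations(TICKETS, 2))

# With replacement: the first ticket is returned to the drum (100 ordered pairs).
with_replacement = pr_two_hits(product(TICKETS, repeat=2))

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```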
  
 
==Inference probability==
 
==Inference probability==
 
<br>
 
<br>
Given again events&nbsp; $A_i$&nbsp; with&nbsp; $1 ≤ i ≤ I$, that form a complete system. That is:
+
Given again events&nbsp; $A_i$&nbsp; with&nbsp; $1 ≤ i ≤ I$&nbsp; that form a&nbsp; [[Theory_of_Stochastic_Signals/Set_Theory_Basics#Complete_system|&raquo;complete system&laquo;]].&nbsp; That is:
 
*All events are pairwise disjoint&nbsp; $(A_i ∩ A_j = ϕ$&nbsp; for all&nbsp; $i ≠ j$&nbsp;).
 
*All events are pairwise disjoint&nbsp; $(A_i ∩ A_j = ϕ$&nbsp; for all&nbsp; $i ≠ j$&nbsp;).
 +
 
*The union gives the universal set:
 
*The union gives the universal set:
 
:$$\rm \bigcup_{\it i=1}^{\it I}\it A_i = \it G.$$
 
:$$\rm \bigcup_{\it i=1}^{\it I}\it A_i = \it G.$$
  
Besides, we consider the event&nbsp; $B$, of which all conditional probabilities&nbsp; ${\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm} A_i)$&nbsp; with indices&nbsp; $1 ≤ i ≤ I$&nbsp; are known.
+
Besides,&nbsp; we consider the event&nbsp; $B$,&nbsp; for which all conditional probabilities&nbsp; ${\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm} A_i)$&nbsp; with indices&nbsp; $1 ≤ i ≤ I$&nbsp; are known.
  
 
{{BlaueBox|TEXT=   
 
{{BlaueBox|TEXT=   
$\text{Theorem of total probability:}$&nbsp;
+
$\text{Theorem of total probability:}$&nbsp; Under the above conditions,&nbsp; the&nbsp; &raquo;'''unconditional&nbsp; probability'''&laquo;  of event&nbsp; $B$&nbsp; is:
Under the above conditions, the (unconditional) probability of event&nbsp; $B$ is:
 
 
:$${\rm Pr}(B) = \sum_{i={\rm1} }^{I}{\rm Pr}(B \cap A_i) = \sum_{i={\rm1} }^{I}{\rm Pr}(B \hspace{0.05cm} \vert\hspace{0.05cm} A_i)\cdot{\rm Pr}(A_i).$$}}
 
:$${\rm Pr}(B) = \sum_{i={\rm1} }^{I}{\rm Pr}(B \cap A_i) = \sum_{i={\rm1} }^{I}{\rm Pr}(B \hspace{0.05cm} \vert\hspace{0.05cm} A_i)\cdot{\rm Pr}(A_i).$$}}
  
  
 
{{BlaueBox|TEXT=   
 
{{BlaueBox|TEXT=   
$\text{Definition:}$&nbsp;
+
$\text{Definition:}$&nbsp; From this equation,&nbsp; using&nbsp;  [[Theory_of_Stochastic_Signals/Statistical_Dependence_and_Independence#Conditional_probability|$\text{Bayes' theorem}$]],&nbsp; we obtain the&nbsp; &raquo;'''inference probability'''&laquo;:
From this equation, using&nbsp;  [[Theory_of_Stochastic_Signals/Statistical_Dependence_and_Independence#Conditional_Probability|Bayes' theorem]]&nbsp;  for&nbsp; '''Inference probability''':
 
 
:$${\rm Pr}(A_i \hspace{0.05cm} \vert \hspace{0.05cm} B) = \frac{ {\rm Pr}( B \mid A_i)\cdot {\rm Pr}(A_i )}{ {\rm Pr}(B)} = \frac{ {\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm} A_i)\cdot {\rm Pr}(A_i )}{\sum_{k={\rm1} }^{I}{\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm} A_k)\cdot{\rm Pr}(A_k) }.$$}}
 
:$${\rm Pr}(A_i \hspace{0.05cm} \vert \hspace{0.05cm} B) = \frac{ {\rm Pr}( B \mid A_i)\cdot {\rm Pr}(A_i )}{ {\rm Pr}(B)} = \frac{ {\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm} A_i)\cdot {\rm Pr}(A_i )}{\sum_{k={\rm1} }^{I}{\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm} A_k)\cdot{\rm Pr}(A_k) }.$$}}
  
Line 142: Line 166:
 
{{GraueBox|TEXT=   
 
{{GraueBox|TEXT=   
 
$\text{Example 4:}$&nbsp;
 
$\text{Example 4:}$&nbsp;
Munich's student dorms are occupied by students from
+
Munich's student hostels are occupied by students from
*the Ludwig Maximilian University of Munich&nbsp; $($event&nbsp; $L$ &nbsp; &rArr; &nbsp; ${\rm Pr}(L) = 70\%)$&nbsp; and  
+
*Ludwig Maximilian University of Munich&nbsp; $($event&nbsp; $L$ &nbsp; &rArr; &nbsp; ${\rm Pr}(L) = 70\%)$&nbsp; and
*the Technical University of Munich&nbsp; $($event&nbsp; $T$ &nbsp; &rArr; &nbsp;  ${\rm Pr}(T) = 30\%)$.  
+
 +
*Technical University of Munich&nbsp; $($event&nbsp; $T$ &nbsp; &rArr; &nbsp;  ${\rm Pr}(T) = 30\%)$.  
  
  
It is further known that at LMU&nbsp; $60\%$&nbsp; of all students are female, whereas at TUM only&nbsp; $10\%$ are female.  
+
It is further known that at LMU&nbsp; $60\%$&nbsp; of all students are female,&nbsp; whereas at TUM only&nbsp; $10\%$&nbsp; are female.  
  
*The proportion of all female students in the dormitory&nbsp; $($event $W)$&nbsp; can then be determined using the total probability theorem:
+
*The proportion of all female students in the hostel&nbsp; $($event $F)$&nbsp; can then be determined using the total probability theorem:
:$${\rm Pr}(W) = {\rm Pr}(W \hspace{0.05cm} \vert \hspace{0.05cm} L)\hspace{0.01cm}\cdot\hspace{0.01cm}{\rm Pr}(L) \hspace{0.05cm}+\hspace{0.05cm} {\rm Pr}(W \hspace{0.05cm} \vert \hspace{0.05cm} T)\hspace{0.01cm}\cdot\hspace{0.01cm}{\rm Pr}(T) = \rm 0.6\hspace{0.01cm}\cdot\hspace{0.01cm}0.7\hspace{0.05cm}+\hspace{0.05cm}0.1\hspace{0.01cm}\cdot \hspace{0.01cm}0.3 = 45 \%.$$
+
:$${\rm Pr}(F) = {\rm Pr}(F \hspace{0.05cm} \vert \hspace{0.05cm} L)\hspace{0.01cm}\cdot\hspace{0.01cm}{\rm Pr}(L) \hspace{0.05cm}+\hspace{0.05cm} {\rm Pr}(F \hspace{0.05cm} \vert \hspace{0.05cm} T)\hspace{0.01cm}\cdot\hspace{0.01cm}{\rm Pr}(T) = \rm 0.6\hspace{0.01cm}\cdot\hspace{0.01cm}0.7\hspace{0.05cm}+\hspace{0.05cm}0.1\hspace{0.01cm}\cdot \hspace{0.01cm}0.3 = 45 \%.$$
 
*If we meet a female student, we can use the inference probability
 
*If we meet a female student, we can use the inference probability
:$${\rm Pr}(L \hspace{-0.05cm}\mid  \hspace{-0.05cm}W) = \frac{ {\rm Pr}(W \hspace{-0.05cm}\mid  \hspace{-0.05cm}L)\cdot  {\rm Pr}(L) }{ {\rm Pr}(W \hspace{-0.05cm}\mid  \hspace{-0.05cm}L) \cdot  {\rm Pr}(L) +{\rm Pr}(W \hspace{-0.05cm}\mid  \hspace{-0.05cm}T) \cdot  {\rm Pr}(T)}=\rm \frac{0.6\cdot 0.7}{0.6\cdot 0.7 + 0.1\cdot 0.3}=\frac{14}{15}\approx 93.3 \%$$
+
:$${\rm Pr}(L \hspace{-0.05cm}\mid  \hspace{-0.05cm}F) = \frac{ {\rm Pr}(F \hspace{-0.05cm}\mid  \hspace{-0.05cm}L)\cdot  {\rm Pr}(L) }{ {\rm Pr}(F \hspace{-0.05cm}\mid  \hspace{-0.05cm}L) \cdot  {\rm Pr}(L) +{\rm Pr}(F \hspace{-0.05cm}\mid  \hspace{-0.05cm}T) \cdot  {\rm Pr}(T)}=\rm \frac{0.6\cdot 0.7}{0.6\cdot 0.7 + 0.1\cdot 0.3}=\frac{14}{15}\approx 93.3 \%$$
:to predict that she will study at LMU. A quite realistic result&nbsp; (at least in the past).}}
+
:to predict that she will study at LMU.&nbsp; A quite realistic result&nbsp; $($at least in the past$)$.}}
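
The two steps of Example 4,&nbsp; first the theorem of total probability and then the inference probability,&nbsp; can be written compactly in code.&nbsp; The following sketch uses the numerical values given above;&nbsp; the dictionary layout and the function names `total_probability` and `inference_probability` are assumptions chosen for illustration.

```python
from fractions import Fraction

# Probabilities Pr(A_i) of the complete system {L, T} (LMU, TUM).
prior = {"L": Fraction(7, 10), "T": Fraction(3, 10)}

# Conditional probabilities Pr(F | A_i) that a student of A_i is female.
likelihood = {"L": Fraction(6, 10), "T": Fraction(1, 10)}


def total_probability(prior, likelihood):
    """Pr(B) = sum over i of Pr(B | A_i) * Pr(A_i)."""
    return sum(likelihood[i] * prior[i] for i in prior)


def inference_probability(i, prior, likelihood):
    """Pr(A_i | B) = Pr(B | A_i) * Pr(A_i) / Pr(B)."""
    return likelihood[i] * prior[i] / total_probability(prior, likelihood)


p_female = total_probability(prior, likelihood)
p_lmu_given_female = inference_probability("L", prior, likelihood)
p_tum_given_female = inference_probability("T", prior, likelihood)

print("Pr(F)     =", p_female, "=", float(p_female))
print("Pr(L | F) =", p_lmu_given_female, "=", float(p_lmu_given_female))
print("Pr(T | F) =", p_tum_given_female, "=", float(p_tum_given_female))
```

Since&nbsp; $L$&nbsp; and&nbsp; $T$&nbsp; form a complete system,&nbsp; the two inference probabilities add up to&nbsp; $1$.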
  
  
The topic of this chapter is illustrated with examples in the&nbsp;  (German language)&nbsp;  learning video
+
&rArr; &nbsp; The topic of this chapter is illustrated with examples in the&nbsp;  $($German language$)$&nbsp;  learning video
:[[Statistische_Abhängigkeit_und_Unabhängigkeit_(Lernvideo)|Statistische Abhängigkeit und Unabhängigkeit]] &nbsp; $\Rightarrow$ &nbsp; "Statistical Dependence and Independence.".
+
::[[Statistische_Abhängigkeit_und_Unabhängigkeit_(Lernvideo)|&raquo;Statistische Abhängigkeit und Unabhängigkeit&laquo;]] &nbsp; $\Rightarrow$ &nbsp; &raquo;Statistical Dependence and Independence&laquo;.
  
 
==Exercises for the chapter==
 
==Exercises for the chapter==
 
<br>
 
<br>
[[Aufgaben:1.4_2S/3E-Kanalmodell|Aufgabe 1.4: 2S/3E-Kanalmodell]]
+
[[Aufgaben:Exercise_1.4:_2S/3E_Channel_Model|Exercise 1.4: 2S/3E Channel Model]]
  
[[Aufgaben:1.4Z_Summe von Ternärgrößen|Aufgabe 1.4Z: Summe von Ternärgrößen]]
+
[[Aufgaben:Exercise_1.4Z:_Sum_of_Ternary_Quantities|Exercise 1.4Z: Sum of Ternary Quantities]]
  
[[Aufgaben:1.5 Karten ziehen|Aufgabe 1.5: Karten ziehen]]
+
[[Aufgaben:Exercise_1.5:_Drawing_Cards|Exercise 1.5: Drawing Cards]]
  
[[Aufgaben:1.5Z_Ausfallwahrscheinlichkeiten|Aufgabe 1.5Z: Ausfallwahrscheinlichkeiten]]
+
[[Aufgaben:Exercise_1.5Z:_Probabilities_of_Default|Exercise 1.5Z: Probabilities of Default]]
  
  
 
{{Display}}
 
{{Display}}

Latest revision as of 17:25, 6 December 2023

General definition of statistical dependence


So far we have not paid much attention to  »statistical dependence«  between events,  even though we have already used it,  for example in the case of two  »disjoint sets«:  

  • If an element belongs to  $A$, 
  • it certainly cannot also be contained in the disjoint set  $B$.


The strongest form of dependence is such a  »deterministic dependence«  between two sets or two events.  Statistical dependence is less pronounced.  Let us start with its complement:

$\text{Definitions:}$ 

$(1)$  Two events  $A$  and  $B$  are called  »statistically independent«,  if the probability of the intersection  $A ∩ B$  is equal to the product of the individual probabilities:

$${\rm Pr}(A \cap B) = {\rm Pr}(A)\cdot {\rm Pr}(B).$$

$(2)$  If this condition is not satisfied,  then the events  $A$  and  $B$  are »statistically dependent«:

$${\rm Pr}(A \cap B) \ne {\rm Pr}(A)\cdot {\rm Pr}(B).$$


  • In some applications,  statistical independence is obvious,  for example,  in the  »coin toss«  experiment.  The probability for  »heads«  or  »tails«  is independent of whether  »heads«  or  »tails«  occurred in the last toss.
  • And also the individual results in the random experiment  »throwing a roulette ball«  are always statistically independent of each other under fair conditions,  even if individual system players do not want to admit this.
  • In other applications,  on the other hand,  the question whether two events are statistically independent is impossible or at least very difficult to answer intuitively.  Here one can only arrive at the correct answer by checking the formal independence criterion given above,  as the following example will show.


$\text{Example 1:}$  We consider the experiment  »throwing two dice«,  where the two dice  $($in graphic:  "cubes"$)$  can be distinguished by their colors red  $(R)$  and blue  $(B)$.  The graph illustrates this fact,  where the sum  $S = R + B$  is entered in the two-dimensional field  $(R, B)$.

For the following description we define the following events:

Examples for statistically independent events
  • $A_1$:  The outcome of the red cube is  $R < 4$  $($red background$)$   ⇒   ${\rm Pr}(A_1) = 1/2$,
  • $A_2$:  The outcome of the blue cube is  $B > 4$  $($blue font$)$   ⇒   ${\rm Pr}(A_2) = 1/3$,
  • $A_3$:  The sum of the two cubes is  $S = 7$  $($green outline$)$   ⇒   ${\rm Pr}(A_3) = 1/6$,
  • $A_4$:  The sum of the two cubes is  $S = 8$    ⇒   ${\rm Pr}(A_4) = 5/36$,
  • $A_5$:  The sum of the two cubes is  $S = 10$    ⇒   ${\rm Pr}(A_5) = 3/36$.


The graph can be interpreted as follows:

  • The two events  $A_1$  and  $A_2$  are statistically independent because the probability  ${\rm Pr}(A_1 ∩ A_2) = 1/6$  of the intersection is equal to the product of the two individual probabilities  ${\rm Pr}(A_1) = 1/2$  and  ${\rm Pr}(A_2) = 1/3$ .  Given the problem definition,  any other result would also have been very surprising.
  • But also the events  $A_1$  and  $A_3$  are statistically independent because of  ${\rm Pr}(A_1) = 1/2$,  ${\rm Pr}(A_3) = 1/6$  and  ${\rm Pr}(A_1 ∩ A_3) = 1/12$.  The probability of intersection  $(1/12)$  arises because three of the  $36$  squares are both highlighted in red and outlined in green.
  • In contrast,  there are statistical bindings between  $A_1$  and  $A_4$  because the probability of intersection   ⇒   ${\rm Pr}(A_1 ∩ A_4) = 1/18 = 4/72$  is not equal to the product  ${\rm Pr}(A_1) \cdot {\rm Pr}(A_4)= 1/2 \cdot 5/36 = 5/72$ .
  • The two events  $A_1$  and  $A_5$  are even disjoint   ⇒   ${\rm Pr}(A_1 ∩ A_5) = 0$:   none of the boxes with red background is labeled  $S=10$ .


This example shows that disjointness is a particularly pronounced form of statistical dependence.
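
The independence checks of  $\text{Example 1}$  can be reproduced by enumerating all  $36$  equally likely outcomes.  The following is only a minimal sketch:  the helper names  `pr`  and  `independent`  are chosen here for illustration and do not come from the original text.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (R, B) of the red and the blue die.
OUTCOMES = list(product(range(1, 7), repeat=2))

def pr(event):
    """Probability of an event given as a predicate on (r, b)."""
    favorable = sum(1 for r, b in OUTCOMES if event(r, b))
    return Fraction(favorable, len(OUTCOMES))

def independent(e1, e2):
    """Check the product criterion Pr(A ∩ B) = Pr(A) · Pr(B)."""
    joint = pr(lambda r, b: e1(r, b) and e2(r, b))
    return joint == pr(e1) * pr(e2)

# Events of Example 1 (hypothetical encoding as predicates).
A1 = lambda r, b: r < 4        # red die shows less than 4
A2 = lambda r, b: b > 4        # blue die shows more than 4
A3 = lambda r, b: r + b == 7   # sum equals 7
A4 = lambda r, b: r + b == 8   # sum equals 8
A5 = lambda r, b: r + b == 10  # sum equals 10

for name, event in [("A2", A2), ("A3", A3), ("A4", A4), ("A5", A5)]:
    print(f"A1 and {name} independent: {independent(A1, event)}")
```

Exact fractions are used instead of floating-point numbers so that the equality test in the product criterion is not disturbed by rounding.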

Conditional probability


If there are statistical bindings between the two events  $A$  and  $B$,  the  $($unconditional$)$  probabilities  ${\rm Pr}(A)$  and  ${\rm Pr}(B)$  do not describe the situation unambiguously in the statistical sense.  So-called  »conditional probabilities«  are then required.

$\text{Definitions:}$ 

$(1)$  The  »conditional probability« of  $A$  under condition  $B$  can be calculated as follows:

$${\rm Pr}(A\hspace{0.05cm} \vert \hspace{0.05cm} B) = \frac{ {\rm Pr}(A \cap B)}{ {\rm Pr}(B)}.$$

$(2)$  Similarly,  the conditional probability of  $B$  under condition  $A$  is:

$${\rm Pr}(B\hspace{0.05cm} \vert \hspace{0.05cm}A) = \frac{ {\rm Pr}(A \cap B)}{ {\rm Pr}(A)}.$$

$(3)$  Combining these two equations,  we get  $\text{Bayes'}$  theorem:

$${\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm} A) = \frac{ {\rm Pr}(A\hspace{0.05cm} \vert \hspace{0.05cm} B)\cdot {\rm Pr}(B)}{ {\rm Pr}(A)}.$$


Below are some properties of conditional probabilities:

  1. A conditional probability also always lies between  $0$  and  $1$,  including these two limits:   $0 \le {\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm} B) \le 1$.
  2. With constant condition  $B$,  all calculation rules given in the chapter  »Set Theory Basics«  for the unconditional probabilities  ${\rm Pr}(A)$  and  ${\rm Pr}(B)$  still apply.
  3. If the existing events  $A$  and  $B$  are disjoint,  then  ${\rm Pr}(A\hspace{0.05cm} | \hspace{0.05cm} B) = {\rm Pr}(B\hspace{0.05cm} | \hspace{0.05cm}A)= 0$  $($convention:  an event  $A$  »exists«  if  ${\rm Pr}(A) > 0)$.
  4. If  $B$  is a proper or improper subset of  $A$,  then  ${\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm} B) =1$.  
  5. If two events  $A$  and  $B$ are statistically independent,  their conditional probabilities are equal to the unconditional ones,  as the following calculation shows:
$${\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm} B) = \frac{{\rm Pr}(A \cap B)}{{\rm Pr}(B)} = \frac{{\rm Pr} ( A) \cdot {\rm Pr} ( B)} { {\rm Pr}(B)} = {\rm Pr} ( A).$$

$\text{Example 2:}$  We again consider the experiment  »Throwing two dice«,  where  $S = R + B$  denotes the sum of the red and blue dice  $($cube$)$.

Example of statistically dependent events

Here we consider bindings between the two events

  • $A_1$:  »The outcome of the red cube is  $R < 4$ «  $($red background$)$   ⇒   ${\rm Pr}(A_1) = 1/2$,
  • $A_4$:  »The sum of the two cubes is  $S = 8$ «  $($green outline$)$   ⇒   ${\rm Pr}(A_4) = 5/36$,


and refer again to the event of  $\text{Example 1}$

  • $A_3$:  »The sum of the two cubes is  $S = 7$ «   ⇒   ${\rm Pr}(A_3) = 1/6$.


Regarding this graph,  note:

  • There are statistical bindings between the two events  $A_1$  and  $A_4$,  since the probability of intersection   ⇒   ${\rm Pr}(A_1 ∩ A_4) = 2/36 = 4/72$  is not equal to the product  ${\rm Pr}(A_1) \cdot {\rm Pr}(A_4)= 1/2 \cdot 5/36 = 5/72$.
  • The conditional probability  ${\rm Pr}(A_1 \hspace{0.05cm} \vert \hspace{0.05cm} A_4) = 2/5$  can be calculated from the quotient of the  »joint probability«  ${\rm Pr}(A_1 ∩ A_4) = 2/36$  and the absolute probability   ${\rm Pr}(A_4) = 5/36$.
  • Since the events  $A_1$  and  $A_4$  are statistically dependent,  the conditional probability  ${\rm Pr}(A_1 \hspace{0.05cm}\vert \hspace{0.05cm} A_4) = 2/5$  $($two of the five squares outlined in green are highlighted in red$)$  is not equal to the absolute probability  ${\rm Pr}(A_1) = 1/2$  $($half of all squares are highlighted in red$)$.
  • Similarly,  the conditional probability  ${\rm Pr}(A_4 \hspace{0.05cm} \vert \hspace{0.05cm} A_1) = 2/18 = 1/9$  $($two of the  $18$  fields with a red background are outlined in green$)$  is unequal to the absolute probability  ${\rm Pr}(A_4) = 5/36$  $($a total of five of the  $36$  fields are outlined in green$)$.
  • This last result can also be derived using  »Bayes'  theorem«,   for example:
$${\rm Pr}(A_4 \hspace{0.05cm} \vert\hspace{0.05cm} A_1) = \frac{ {\rm Pr}(A_1 \hspace{0.05cm} \vert\hspace{0.05cm} A_4)\cdot {\rm Pr} ( A_4)} { {\rm Pr}(A_1)} = \frac{2/5 \cdot 5/36}{1/2} = 1/9.$$
  • In contrast,  the following conditional probabilities hold for  $A_1$  and the statistically independent event  $A_3$,  see  $\text{Example 1}$:
$${\rm Pr}(A_{\rm 1} \hspace{0.05cm}\vert \hspace{0.05cm} A_{\rm 3}) = {\rm Pr}(A_{\rm 1}) = \rm 1/2\hspace{0.5cm}{\rm resp.}\hspace{0.5cm}{\rm Pr}(A_{\rm 3} \hspace{0.05cm} \vert \hspace{0.05cm} A_{\rm 1}) = {\rm Pr}(A_{\rm 3}) = 1/6.$$
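
The conditional probabilities of  $\text{Example 2}$  can be checked numerically.  The sketch below represents each event as a set of outcomes  $(R, B)$;  the function names  `pr`  and  `cond`  are assumptions made for this illustration only.

```python
from fractions import Fraction
from itertools import product

# Universal set: all 36 equally likely outcomes (R, B).
OUTCOMES = set(product(range(1, 7), repeat=2))

def pr(event):
    """Probability of an event given as a set of outcomes."""
    return Fraction(len(event), len(OUTCOMES))

def cond(a, b):
    """Conditional probability Pr(a | b) = Pr(a ∩ b) / Pr(b); needs Pr(b) > 0."""
    return pr(a & b) / pr(b)

A1 = {(r, b) for r, b in OUTCOMES if r < 4}
A3 = {(r, b) for r, b in OUTCOMES if r + b == 7}
A4 = {(r, b) for r, b in OUTCOMES if r + b == 8}

p_a1_given_a4 = cond(A1, A4)   # 2/5
p_a4_given_a1 = cond(A4, A1)   # 1/9

# Bayes' theorem must give the same value as the direct computation.
p_bayes = p_a1_given_a4 * pr(A4) / pr(A1)

print(p_a1_given_a4, p_a4_given_a1, p_bayes)
print(cond(A1, A3) == pr(A1), cond(A3, A1) == pr(A3))
```

The last line confirms that for the statistically independent events  $A_1$  and  $A_3$  the conditional probabilities coincide with the unconditional ones.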


General multiplication theorem


Furthermore,  we consider several events denoted as  $A_i$  with  $1 ≤ i ≤ I$.  However,  these events  $A_i$  no longer represent a  »complete system«,  that is:

  • They are not necessarily pairwise disjoint. 
  • There may also be statistical bindings between the individual events.


$\text{Definition:}$ 

$(1)$  In this case,  the so-called  »joint probability«,  i.e. the probability of the intersection of all  $I$  events  $A_i$,  is given by:

$${\rm Pr}(A_{\rm 1} \cap \hspace{0.02cm}\text{ ...}\hspace{0.1cm} \cap A_{I}) = {\rm Pr}(A_{I})\hspace{0.05cm}\cdot\hspace{0.05cm}{\rm Pr}(A_{I \rm -1} \hspace{0.05cm}\vert \hspace{0.05cm} A_I) \hspace{0.05cm}\cdot \hspace{0.05cm}{\rm Pr}(A_{I \rm -2} \hspace{0.05cm}\vert\hspace{0.05cm} A_{I - \rm 1}\cap A_I)\hspace{0.05cm} \cdot \hspace{0.02cm}\text{ ...} \hspace{0.1cm} \cdot\hspace{0.05cm} {\rm Pr}(A_{\rm 1} \hspace{0.05cm}\vert \hspace{0.05cm}A_{\rm 2} \cap \hspace{0.02cm}\text{ ...} \hspace{0.1cm}\cap A_{ I}).$$

$(2)$  In the same way,  the following holds:

$${\rm Pr}(A_{\rm 1} \cap \hspace{0.02cm}\text{ ...}\hspace{0.1cm} \cap A_{I}) = {\rm Pr}(A_1)\hspace{0.05cm}\cdot\hspace{0.05cm}{\rm Pr}(A_2 \hspace{0.05cm}\vert \hspace{0.05cm} A_1) \hspace{0.05cm}\cdot \hspace{0.05cm}{\rm Pr}(A_3 \hspace{0.05cm}\vert \hspace{0.05cm} A_1\cap A_2)\hspace{0.05cm} \cdot \hspace{0.02cm}\text{ ...}\hspace{0.1cm} \cdot\hspace{0.05cm} {\rm Pr}(A_I \hspace{0.05cm}\vert \hspace{0.05cm}A_1 \cap \hspace{0.02cm} \text{ ...} \hspace{0.1cm}\cap A_{ I-1}).$$
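
As an illustration,  the case  $I = 3$  of equation  $(2)$  can be verified by inserting the definition of the conditional probability.  This sketch assumes  ${\rm Pr}(A_1 \cap A_2) > 0$,  so that all quotients are defined:

$${\rm Pr}(A_1)\cdot{\rm Pr}(A_2 \hspace{0.05cm}\vert \hspace{0.05cm} A_1)\cdot{\rm Pr}(A_3 \hspace{0.05cm}\vert \hspace{0.05cm} A_1\cap A_2) = {\rm Pr}(A_1)\cdot\frac{ {\rm Pr}(A_1 \cap A_2)}{ {\rm Pr}(A_1)}\cdot\frac{ {\rm Pr}(A_1 \cap A_2 \cap A_3)}{ {\rm Pr}(A_1 \cap A_2)} = {\rm Pr}(A_1 \cap A_2 \cap A_3).$$

The intermediate terms cancel in pairs,  which is why the same argument carries over to any number  $I$  of events.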


$\text{Example 3:}$  A lottery drum contains ten tickets,  including three hits.  The event  $T_1$  denotes a hit in the first draw. 

  • Then the probability of drawing two hits with two tickets is:
$${\rm Pr}(T_1 \cap T_2) = {\rm Pr}(T_1) \cdot {\rm Pr}(T_2 \hspace{0.05cm}\vert \hspace{0.05cm} T_1) = 3/10 \cdot 2/9 = 1/15 \approx 6.7 \%.$$
  • This takes into account that in the second draw  $($event  $T_2)$  only nine tickets and two hits are left in the drum if a hit was drawn in the first draw:
$${\rm Pr}(T_2 \hspace{0.05cm} \vert\hspace{0.05cm} T_1) = 2/9\approx 22.2 \%.$$
  • However,  if the tickets were returned to the drum after the draw,  the events  $T_1$  and  $T_2$  would be statistically independent and it would hold:
$$ {\rm Pr}(T_1 ∩ T_2) = (3/10)^2 = 9\%.$$
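
The multiplication theorem applied in  $\text{Example 3}$  can be sketched in a few lines.  The function  `all_hits`  and its parameter names are hypothetical and serve only to illustrate the product of conditional probabilities for drawing with and without replacement.

```python
from fractions import Fraction

def all_hits(tickets, hits, draws, replace=False):
    """Probability that each of `draws` consecutive draws is a hit.

    Without replacement, the k-th factor is the conditional probability
    of a hit given that all previous draws were hits.
    """
    if not replace and draws > tickets:
        raise ValueError("cannot draw more tickets than the drum contains")
    p = Fraction(1)
    for k in range(draws):
        if replace:
            # Independent draws: the same probability every time.
            p *= Fraction(hits, tickets)
        else:
            # k hits are already gone: hits - k of tickets - k remain.
            p *= Fraction(max(hits - k, 0), tickets - k)
    return p

print(all_hits(10, 3, 2))                # without replacement
print(all_hits(10, 3, 2, replace=True))  # with replacement
```

With the numbers of the example,  the two calls give  $1/15$  and  $9/100$,  respectively.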

Inference probability


We again consider events  $A_i$  with  $1 ≤ i ≤ I$  that form a  »complete system«.  That is:

  • All events are pairwise disjoint  $(A_i ∩ A_j = ϕ$  for all  $i ≠ j$ ).
  • The union gives the universal set:
$$\rm \bigcup_{\it i=1}^{\it I}\it A_i = \it G.$$

Besides,  we consider the event  $B$,  of which all conditional probabilities  ${\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm} A_i)$  with indices  $1 ≤ i ≤ I$  are known.

$\text{Theorem of total probability:}$  Under the above conditions,  the  »unconditional  probability« of event  $B$  is:

$${\rm Pr}(B) = \sum_{i={\rm1} }^{I}{\rm Pr}(B \cap A_i) = \sum_{i={\rm1} }^{I}{\rm Pr}(B \hspace{0.05cm} \vert\hspace{0.05cm} A_i)\cdot{\rm Pr}(A_i).$$


$\text{Definition:}$  From this equation,  using  $\text{Bayes' theorem}$,  we obtain the  »inference probability«:

$${\rm Pr}(A_i \hspace{0.05cm} \vert \hspace{0.05cm} B) = \frac{ {\rm Pr}( B \mid A_i)\cdot {\rm Pr}(A_i )}{ {\rm Pr}(B)} = \frac{ {\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm} A_i)\cdot {\rm Pr}(A_i )}{\sum_{k={\rm1} }^{I}{\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm} A_k)\cdot{\rm Pr}(A_k) }.$$


$\text{Example 4:}$  Munich's student hostels are occupied by students from

  • Ludwig Maximilian University of Munich  $($event  $L$   ⇒   ${\rm Pr}(L) = 70\%)$  and
  • Technical University of Munich  $($event  $T$   ⇒   ${\rm Pr}(T) = 30\%)$.


It is further known that at LMU  $60\%$  of all students are female,  whereas at TUM only  $10\%$  are female.

  • The proportion of all female students in the hostel  $($event $F)$  can then be determined using the total probability theorem:
$${\rm Pr}(F) = {\rm Pr}(F \hspace{0.05cm} \vert \hspace{0.05cm} L)\hspace{0.01cm}\cdot\hspace{0.01cm}{\rm Pr}(L) \hspace{0.05cm}+\hspace{0.05cm} {\rm Pr}(F \hspace{0.05cm} \vert \hspace{0.05cm} T)\hspace{0.01cm}\cdot\hspace{0.01cm}{\rm Pr}(T) = \rm 0.6\hspace{0.01cm}\cdot\hspace{0.01cm}0.7\hspace{0.05cm}+\hspace{0.05cm}0.1\hspace{0.01cm}\cdot \hspace{0.01cm}0.3 = 45 \%.$$
  • If we meet a female student, we can use the inference probability
$${\rm Pr}(L \hspace{-0.05cm}\mid \hspace{-0.05cm}F) = \frac{ {\rm Pr}(F \hspace{-0.05cm}\mid \hspace{-0.05cm}L)\cdot {\rm Pr}(L) }{ {\rm Pr}(F \hspace{-0.05cm}\mid \hspace{-0.05cm}L) \cdot {\rm Pr}(L) +{\rm Pr}(F \hspace{-0.05cm}\mid \hspace{-0.05cm}T) \cdot {\rm Pr}(T)}=\rm \frac{0.6\cdot 0.7}{0.6\cdot 0.7 + 0.1\cdot 0.3}=\frac{14}{15}\approx 93.3 \%$$
to predict that she studies at LMU.  A quite realistic result  $($at least in the past$)$.
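
The two steps of  $\text{Example 4}$,  total probability followed by the inference probability,  can be sketched as follows.  The function names and the dictionary layout are assumptions made for this illustration;  the numerical values are those of the example.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """Pr(B) = sum over i of Pr(B | A_i) * Pr(A_i) for a complete system A_i."""
    return sum(likelihoods[k] * priors[k] for k in priors)

def inference(priors, likelihoods):
    """Inference probabilities Pr(A_i | B) for all i via Bayes' theorem."""
    pr_b = total_probability(priors, likelihoods)
    return {k: likelihoods[k] * priors[k] / pr_b for k in priors}

# Pr(L), Pr(T): shares of LMU and TUM students in the hostels.
priors = {"L": Fraction(7, 10), "T": Fraction(3, 10)}
# Pr(F | L), Pr(F | T): shares of female students at each university.
likelihoods = {"L": Fraction(6, 10), "T": Fraction(1, 10)}

pr_f = total_probability(priors, likelihoods)
posterior = inference(priors, likelihoods)

print(pr_f)            # Pr(F)
print(posterior["L"])  # Pr(L | F)
print(posterior["T"])  # Pr(T | F)
```

Since  $L$  and  $T$  form a complete system,  the two inference probabilities add up to one,  which is a simple consistency check on the result.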


⇒   The topic of this chapter is illustrated with examples in the  $($German language$)$  learning video

»Statistische Abhängigkeit und Unabhängigkeit«   $\Rightarrow$   »Statistical Dependence and Independence«.

Exercises for the chapter


Exercise 1.4: 2S/3E Channel Model

Exercise 1.4Z: Sum of Ternary Quantities

Exercise 1.5: Drawing Cards

Exercise 1.5Z: Probabilities of Default