Difference between revisions of "Theory of Stochastic Signals/Markov Chains"

Revision as of 13:16, 29 November 2021

Considered scenario

Finally, we consider the case where one runs a random experiment continuously and that at each discrete time $(ν = 1, 2, 3, \text{...})$ a new event $E_ν$ . Here, let hold:

Possible events and event sequence

$$E_\nu \in G = \{ E_{\rm 1}, E_{\rm 2}, \hspace{0.1cm}\text{...}\hspace{0.1cm}, E_\mu , \hspace{0.1cm}\text{...}\hspace{0.1cm}, E_M \}.$$

This mathematically not quite proper nomenclature means (see graph):

The $M$ possible events are numbered consecutively with the index $μ$ .
The index $ν$ names the discrete time points at which events occur.

For simplicity, we restrict ourselves in the following to the case $M = 2$ with the universal set $G = \{ A, \ B \}$. Let further hold:

The probability of event $E_ν$ may well depend on all previous events – that is, on events $E_{ν\hspace{0.05cm}-1}, E_{ν\hspace{0.05cm}-2}, E_{ν\hspace{0.05cm}-3},$ , and so on.
This statement also means that in this chapter we consider a sequence of events with internal statistical ties .

This scenario is a special case of a discrete-time, discrete-value random process, which will be discussed in more detail in the chapter Random Processes to be discussed in more detail.

$\text{Example 1:}$ Cards are drawn one after the other from a deck of $32$ cards (including four aces). With the events

$A\text{ :}=$ „the drawn card is an ace”, and
$B = \overline{A}:=$ „the drawn card is not an ace”, the probabilities at time $ν = 1$:

$${\rm Pr} (A_{\rm 1}) ={4}/{32}= {1}/{8}, \hspace{0.5cm}{\rm Pr} (B_{\rm 1}) = {28}/{32}= {7}/{8}.$$

The probability ${\rm Pr} (A_{\rm 2})$, that an ace is drawn as the second card $(ν = 2)$ now depends on,

whether an ace was drawn at time $ν = 1$ ⇒ ${\rm Pr} (A_{\rm 2}) = 3/31 < 1/8$, or
whether at time $ν = 1$ no ace was drawn ⇒ ${\rm Pr} (A_{\rm 2}) = 4/31 > 1/8$.

Also, the probabilities ${\rm Pr} (A_{\nu})$ at later times $ν$ always depend on the occurrence or non-occurrence of all previous events $E_1, \hspace{0.1cm}\text{...}\hspace{0.1cm} ,E_{ν\hspace{0.05cm}–1}$ .

General definition of a Markov chain

In special cases, which however occur very frequently, the scenario described above can be described by a Markov chain.

$\text{Definition:}$ A $k$-th orderMarkov Chain serves as a model for time- and value-discrete processes, where

the event probabilities at time $ν$ depend on the previous events $E_{ν\hspace{0.05cm}–1}, \hspace{0.1cm}\text{...}\hspace{0.1cm}, E_{ν\hspace{0.05cm}–k}$ abhängen, and
can be expressed by $M^{k+1}$ conditional probabilities.

Second order Markov chain

The diagram illustrates this issue using $k = 2$ as an example..

Therefore, for $M = 2$ there are $2^{k+1}$ such conditional probabilities.

$${\rm Pr} ( E_\nu \hspace {0.05cm}\vert \hspace {0.05cm}E_{\nu {\rm -1 } },\hspace {0.1cm}\text{...}\hspace {0.1cm}, E_{\nu { -k } }).$$

With $E_{\nu }\in \{ A, B \}, \hspace {0.1cm}\text{...}\hspace {0.1cm}, E_{\nu { -k } } \in \{ A, B \}$ these are:

${\rm Pr} ( A_\nu \hspace {0.05cm}\vert \hspace {0.05cm}A_{\nu {\rm -1 } }, A_{\nu { -2 } })$, ${\rm Pr} ( B_\nu \hspace {0.05cm}\vert \hspace {0.05cm}A_{\nu {\rm -1 } }, A_{\nu { -2 } })$,
${\rm Pr} ( A_\nu \hspace {0.05cm}\vert \hspace {0.05cm}A_{\nu {\rm -1 } },B_{\nu { -2 } })$, ${\rm Pr} ( B_\nu \hspace {0.05cm}\vert \hspace {0.05cm}A_{\nu {\rm -1 } }, B_{\nu { -2 } })$,
${\rm Pr} ( A_\nu \hspace {0.05cm}\vert \hspace {0.05cm}B_{\nu {\rm -1 } }, A_{\nu { -2 } })$, ${\rm Pr} ( B_\nu \hspace {0.05cm}\vert \hspace {0.05cm}B_{\nu {\rm -1 } }, A_{\nu { -2 } })$,
${\rm Pr} ( A_\nu \hspace {0.05cm}\vert \hspace {0.05cm}B_{\nu {\rm -1 } }, B_{\nu { -2 } })$, ${\rm Pr} ( B_\nu \hspace {0.05cm}\vert \hspace {0.05cm}B_{\nu {\rm -1 } }, B_{\nu { -2 } })$.

$\text{Example 2:}$ Natural languages are often describable by Markov chains, although the order $k$ tends to infinity. In this example, however, texts are only approximated by second-order Markov chains.

Synthetically generated texts (German and English)

The graphic shows two synthetically generated texts:

The left text was synthetically generated starting from a German book template with bindings up to second order.
For the right text, an English template was used.

One recognizes despite the restriction to $k = 2$ many (short) German and/or English words and also that German words are on the average longer than English. There is no meaningful content, but the structure of the respective language is recognizable.

First order Markov chain

In the following, we always restrict ourselves to the special case $k =1$ .

$\text{Definition:}$ In a First Order Markov Chain only the statistical binding to the last event is considered, which in practice is usually the strongest.

A binary first order Markov chain ⇒ universal set $G = \{ A, \ B \}$ has the following event probabilities at time $\nu$ :

$${\rm Pr}(A_\nu) = {\rm Pr}(A_\nu \hspace{0.05cm} \vert \hspace{0.05cm}A_{\nu - 1}) \cdot {\rm Pr}(A_{\nu - 1}) \hspace{0.1cm} + \hspace{0.1cm} {\rm Pr}(A_\nu \hspace{0.05cm} \vert \hspace{0.05cm}B_{\nu - 1}) \cdot {\rm Pr}(B_{\nu - 1}) ,$$

$${\rm Pr}(B_\nu) = {\rm Pr}(B_\nu \hspace{0.05cm} \vert \hspace{0.05cm}A_{\nu - 1}) \cdot {\rm Pr}(A_{\nu - 1}) \hspace{0.1cm} + \hspace{0.1cm} {\rm Pr}(B_\nu \hspace{0.05cm} \vert \hspace{0.05cm}B_{\nu - 1}) \cdot {\rm Pr}(B_{\nu - 1}) .$$

Regarding these equations, we note:

${\rm Pr}(A_\nu)$ sis shorthand for the probability that at timenbsp; $ν$ the event $E_ν = A = \overline{B}$ occurs, and ${\rm Pr}(B_\nu) = 1 - {\rm Pr}(A_\nu)$ holds.
At each time there are four transition probabilities ${\rm Pr}(E_ν\hspace{0.05cm} |\hspace{0.05cm} E_{ν–1})$, but only two of them are independent, because it holds:

$${\rm Pr}(B_\nu \hspace{0.05cm} | \hspace{0.05cm}A_{\nu - 1}) = 1 - {\rm Pr}(A_\nu \hspace{0.05cm} | \hspace{0.05cm}A_{\nu - 1}), \hspace{0.5cm}{\rm Pr}(A_\nu \hspace{0.05cm} | \hspace{0.05cm}B_{\nu - 1}) = 1 - {\rm Pr}(B_\nu \hspace{0.05cm} | \hspace{0.05cm}B_{\nu - 1}).$$

Generalizing this last statement, one arrives at the result that for a Markov chain with $M$ events, there are exactly $M · (M – 1)$ independent transition probabilities at each time $ν$ .

$\text{Example 3:}$ With the given transition probabilities

$${\rm Pr}(A_ν\hspace{0.05cm}\vert\hspace{0.05cm} A_{ν–1}) = 0.2 \hspace{0.3cm}\text{und}\hspace{0.3cm} {\rm Pr}(B_ν\hspace{0.05cm} \vert\hspace{0.05cm} B_{ν–1}) = 0.4$$

the other two transition probabilities are also uniquely specified

$${\rm Pr}(B_ν\hspace{0.05cm} \vert\hspace{0.05cm} A_{ν–1}) = 1- 0.2 = 0.8 \hspace{0.3cm}\text{und}\hspace{0.3cm} {\rm Pr}(A_ν\hspace{0.05cm} \vert\hspace{0.05cm} B_{ν–1}) = 1- 0.4 = 0.6.$$

Homogeneous Markov chains

Applicability of Markov chains to practical problems is usually only given further restrictive conditions.

$\text{Definition:}$ If all transition probabilities are independent of the time $ν$, the Markov chain is homogeneous .

In the case $M = 2$ we use the following abbreviations for this:

$${\rm Pr}(A \hspace{0.05cm} \vert \hspace{0.05cm}A) = {\rm Pr}(A_\nu \hspace{0.05cm}\vert \hspace{0.05cm}A_{\nu - 1}) , \hspace{0.5cm} {\rm Pr}(A \hspace{0.05cm} \vert\hspace{0.05cm}B) = {\rm Pr}(A_\nu \hspace{0.05cm} \vert \hspace{0.05cm}B_{\nu - 1}) ,$$

$${\rm Pr}(B \hspace{0.05cm} \vert\hspace{0.05cm}A) = {\rm Pr}(B_\nu \hspace{0.05cm} \vert \hspace{0.05cm}A_{\nu - 1}) , \hspace{0.5cm} {\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm}B) = {\rm Pr}(B_\nu \hspace{0.05cm} \vert \hspace{0.05cm}B_{\nu - 1}) .$$

Thus, the event probabilities of a binary homogeneous Markov chain, which are absolute probabilities in contrast to the conditional transition probabilities, are :

First order homogeneous Markov chain

$${\rm Pr}(A_\nu) = {\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}A) \cdot {\rm Pr}(A_{\nu - 1}) \hspace{0.1cm} + \hspace{0.1cm} {\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}B) \cdot {\rm Pr}(B_{\nu - 1}) ,$$

$${\rm Pr}(B_\nu) = {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}A) \cdot {\rm Pr}(A_{\nu - 1}) \hspace{0.1cm} + \hspace{0.1cm} {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}B) \cdot {\rm Pr}(B_{\nu - 1}) .$$

We can also read from the Markov diagram:

The sum of the outgoing arrows of an event is always one

$${\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}A) + {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}A) =1,$$

$${\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}B) + {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}B) =1.$$

The sums ${\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}A) + {\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}B)$ bzw. ${\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}A) + {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}B)$ , on the other hand, are not subject to any restrictions.

You can calculate and display the "transient behavior" of the event probabilities of such a binary Markov chain with the interactive applet (German) Event Probabilities of a First Order Markov Chain .

Stationary probabilities

Important properties of random processes are stationarity und ergodicity. Although these terms are not defined until the fourth main chapter "Random Variables with Statistical Dependence" , we apply them here already in advance to Markov chains.

$\text{Definition:}$ If, in addition to the transition probabilities, all event probabilities of a Markov chain are independent of the time $ν$, then the Markov chain is called stationary. One then omits the index $ν$ and writes in the binary case:

$${\rm Pr}(A_\nu ) = {\rm Pr}(A ) \hspace{0.5 cm} {\rm bzw.} \hspace{0.5 cm} {\rm Pr}(B_\nu ) = {\rm Pr}(B).$$

These quantities are also called the ergodic probabilities of the Markov chain.

Stationary Markov chains have the special features mentioned below:

To calculate the ergodic probabilities of a binary Markov chain $(M =2)$ one can use the following equations:

$${\rm Pr}(A) = {\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}A) \cdot {\rm Pr}(A) \hspace{0.1cm} + \hspace{0.1cm} {\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}B) \cdot {\rm Pr}(B) ,$$

$${\rm Pr}(B) = {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}A) \cdot {\rm Pr}(A) \hspace{0.1cm} + \hspace{0.1cm} {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}B) \cdot {\rm Pr}(B) .$$

Since these equations depend linearly on each other, one may use only one of them. For example, as the second equation of determination one may use:

$${\rm Pr}(A) + {\rm Pr}(B) = 1.$$

The ergodic probabilities result from it to

$${\rm Pr}(A) = \frac {{\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}B) }{{\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}B) + {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}A) }\hspace{0.1cm}, \hspace{0.5cm}{\rm Pr}(B) = \frac {{\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}A) }{{\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}B) + {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}A) }.$$

$\text{Example 4:}$ We consider a binary Markov chain with events $A$ and $B$ and transition probabilities ${\rm Pr}(A \hspace{0.05cm} \vert\hspace{0.05cm}A) = 0.4$ and ${\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm}B) = 0.8$.
Furthermore, we assume that each realization of this chain starts with event $A$ at the starting time $ν = 0$ .

We then obtain the event probabilities listed below (with three decimal places):

Zum Einschwingen der Ereigniswahrscheinlichkeiten

In a strict sense, this is a non-stationary Markov chain, but it is (nearly) settled after a short time:

At later times $(ν > 5)$ , the event probabilities ${\rm Pr}(A_ν) \approx 1/4$ and ${\rm Pr}(B_ν) \approx 3/4$ are no longer seriously changed.
However, from ${\rm Pr}(A_{ν=5}) = 0.250$ and ${\rm Pr}(B_{ν=5}) = 0.750$ it should not be concluded that the Markov chain is already perfectly steady at time $ν = 5$ .
Indeed, the exact values are ${\rm Pr}(A_{ν=5}) = 0.25024$ and ${\rm Pr}(B_{ν=5}) = 0.74976$.

$\text{Conclusion:}$ For a stationary Markov chain, the ergodic probabilities always occur with good approximation very long after the chain is switched on $(ν → ∞)$ , regardless of the initial conditions ${\rm Pr}(A_0)$ and ${\rm Pr}(B_0) = 1 - {\rm Pr}(A_0)$.

Matrix vector representation

Homogeneous Markov chains can be represented very compactly with vectors and matrices. This is especially recommended for more than two events:

$$E_\nu \in G = \{ E_{\rm 1}, E_{\rm 2}, \hspace{0.1cm}\text{...}\hspace{0.1cm}, E_\mu , \hspace{0.1cm}\text{...}\hspace{0.1cm}, E_M \}.$$

For the matrix-vector representation, we use the following nomenclature:

We summarize the $M$ probabilities at time $ν$ in a column vector:

$${\mathbf{p}^{(\nu)}} = \left[ \begin{array}{c} {p_{\rm 1}}^{(\nu)} \\ \dots \\ {p_{M}}^{(\nu)} \end{array} \right] \hspace{0.5cm}{\rm mit} \hspace{0.5cm} {p_{\mu}}^{(\nu)} = {\rm Pr}(E_\nu = E_\mu ).$$

The transition probabilities are expressed by an $M \times M$ matrix:

$${\mathbf{P}} =\left( p_{ij} \right) = \left[ \begin{array}{cccc} p_{11} & p_{12} & \cdots & p_{1M} \\ p_{21} & p_{22}& \cdots & p_{2M} \\ \dots & \dots & \dots & \dots \\ p_{M1} & p_{M2} & \cdots & p_{MM} \end{array} \right] \hspace{0.5cm}{\rm mit} \hspace{0.5cm} {p_{ij}} = {\rm Pr}(E_{\nu +1 } = E_j \hspace{0.05cm}| \hspace{0.05cm} E_{\nu } = E_i).$$

The accompanying figure illustrates this nomenclature using the example $M = 2$. The new event probability vector after one step is:

To denote the transition probabilities

$${\mathbf{p}^{(\nu + 1)}} = {\mathbf{P}}^{\rm T} \cdot {\mathbf{p}^{(\nu )}} .$$

Here,${\mathbf{P}^{\rm T}}$ denotes the transposed matrix to $\mathbf{P}$. Thus, after $n$ steps, the result is

$${\mathbf{p}^{(\nu +{\it n})}} = \left( {\mathbf{P}}^{\rm T} \right)^n \cdot {\mathbf{p}^{(\nu )}} .$$

In the limit transition $(n → ∞)$ one then always achieves stationarity of the Markov chain:

$$\lim_{n \to\infty}\hspace{0.1cm}{\mathbf{p}^{(\nu + n)}} = \lim_{n \to\infty} \left( {\mathbf{P}}^{\rm T} \right)^n \cdot {\mathbf{p}^{(\nu )}} = {\mathbf{p}}_{\rm erg}= \left[ \begin{array}{c} {p_{\rm 1}} \\ \dots \\ {p_{M}} \end{array} \right] .$$

$\text{Definition:}$ Multiplying the transition matrix ${\mathbf{P} }$ infinitely many times by itself and denoting the result by ${\mathbf{P} }_{\rm erg}$, the resulting matrix consists of $M$ equal rows:

$${\mathbf{P} }_{\rm erg} = \lim_{n \to\infty} {\mathbf{P} }^n = \left[ \begin{array}{cccc} p_{1} & p_{2} & \cdots & p_{M} \\ p_{1} & p_{2}& \cdots & p_{M} \\ \dots & \dots & \dots & \dots \\ p_{1} & p_{2} & \cdots & p_{M} \end{array} \right] .$$

The probabilities $p_1, \hspace{0.1cm}\text{...}\hspace{0.1cm}, p_M$ in each of these lines are called the ergodic probabilities.

Markov diagram with three events

$\text{Example 5:}$ We consider a Markov chain with the events $E_1, E_2$ and $E_3$ ⇒ $M=3$. The graph shows

on the left the Markov diagram,
on the upper right the transition matrix $\mathbf{P}$ and
below their powers $\mathbf{P}^2$ and $\mathbf{P}^3$.

In the Markov diagram, all transition probabilities with the numerical value $1/2$ are shown in blue; green indicates the probability $1/4$.

At the start $(ν =0)$ all events are equally probable. For $ν = 1$ then holds:

$${\mathbf{p}^{(1)} } = {\mathbf{P} }^{\rm T} \cdot {\mathbf{p}^{(0 )} }= \left[ \begin{array}{ccc} 1/2 & 1/2& 1/2 \\ 1/4 & 0 & 1/2 \\ 1/4& 1/2& 0 \end{array} \right] \left[ \begin{array}{c} 1/3 \\ 1/3 \\ 1/3 \end{array} \right] = \left[ \begin{array}{c} 1/2 \\ 1/4 \\ 1/4 \end{array} \right] .$$

From this it can be seen that we get from the matrix $\mathbf{P}$ to the transposed matrix ${\mathbf{P}^{\rm T} }$ by exchanging the rows and columns.

At time $ν =2$ (and also at all later times) the same probabilities result:

$${\mathbf{p}^{(2)} } = {\mathbf{P} }^{\rm T} \cdot {\mathbf{p}^{(1 )} }= \left[ \begin{array}{ccc} 1/2 & 1/2& 1/2 \\ 1/4 & 0 & 1/2 \\ 1/4& 1/2& 0 \end{array} \right] \left[ \begin{array}{c} 1/2 \\ 1/4 \\ 1/4 \end{array} \right] = \left[ \begin{array}{c} 1/2 \\ 1/4 \\ 1/4 \end{array} \right] .$$

This means: the ergodic probabilities of the three events are $1/2$ $($for $E_1)$, $1/4$ $($for $E_2)$, and $1/4$ $($for $E_3)$.

This result could have been read directly from the ergodic matrix:

$${\mathbf{P} }_{\rm erg} = \lim_{n \to\infty} {\mathbf{P} }^n = \left[ \begin{array}{ccc} 1/2 & 1/4 & 1/4 \\ 1/2 & 1/4 & 1/4 \\ 1/2 & 1/4 & 1/4 \end{array} \right] .$$

This is obtained by continuously multiplying the transition matrix by itself.

In the picture above the powers $\mathbf{P}^2$ and $\mathbf{P}^3$ are given, which approximate the matrix $\mathbf{P}_{\rm erg}$ .

Exercises for the chapter

Exercise 1.6: Transition Probabilities

Exercise 1.6Z: Ergodic Probabilities

Exercise 1.7: Ternary Markov Chain

Exercise 1.7Z: BARBARA Generator

@@ Line 7: / Line 7: @@
 ==Considered scenario==
 <br>
-WFinally, we consider the case where one runs a random experiment continuously and that at each discrete time&nbsp; $(ν = 1, 2, 3, \text{...})$&nbsp; a new event&nbsp; $E_ν$&nbsp;. Here, let hold:
+Finally, we consider the case where one runs a random experiment continuously and that at each discrete time&nbsp; $(ν = 1, 2, 3, \text{...})$&nbsp; a new event&nbsp; $E_ν$&nbsp;. Here, let hold:
 [[File:En_Sto_T_1_4_S1.png |right|frame|Possible events and event sequence]]
 :$$E_\nu \in G = \{ E_{\rm 1}, E_{\rm 2}, \hspace{0.1cm}\text{...}\hspace{0.1cm}, E_\mu , \hspace{0.1cm}\text{...}\hspace{0.1cm}, E_M \}.$$
@@ Line 21: / Line 21: @@
-This scenario is a special case of a discrete-time, discrete-value random process, which will be discussed in more detail in the chapter&nbsp;} [[Theory_of_Stochastic_Signals/Auto-Correlation_Function_(ACF)#Zufallsprozesse|Random Processes]]&nbsp; to be discussed in more detail.
+This scenario is a special case of a discrete-time, discrete-value random process, which will be discussed in more detail in the chapter&nbsp; [[Theory_of_Stochastic_Signals/Auto-Correlation_Function_(ACF)#Zufallsprozesse|Random Processes]]&nbsp; to be discussed in more detail.
 {{GraueBox|TEXT=
@@ Line 126: / Line 126: @@
-You can calculate and display the "transient behavior" of the event probabilities of such a binary Markov chain with the interactive applet (German) &nbsp;  [[Applets:Markovketten|Event Probabilities of a First Order Markov Chain.]]&nbsp;.
+You can calculate and display the "transient behavior" of the event probabilities of such a binary Markov chain with the interactive applet (German) &nbsp;  [[Applets:Markovketten|Event Probabilities of a First Order Markov Chain]]&nbsp;.
-==Stationäre Wahrscheinlichkeiten==
+==Stationary probabilities==
 <br>
-Wichtige Eigenschaften von Zufallsprozessen sind&nbsp;  [[Theory_of_Stochastic_Signals/Autokorrelationsfunktion_(AKF)#Station.C3.A4re_Zufallsprozesse|Stationarität]]&nbsp;  und&nbsp; [[Theory_of_Stochastic_Signals/Autokorrelationsfunktion_(AKF)#Ergodische_Zufallsprozesse|Ergodizität]].&nbsp; Obwohl diese Begriffe erst im vierten Hauptkapitel&nbsp; "Zufallsgrößen mit statistischen Bindungen"&nbsp;  definiert werden, wenden wir sie hier bereits vorausgreifend auf Markovketten an.
+Important properties of random processes are&nbsp;  [[Theory_of_Stochastic_Signals/Auto-Correlation_Function_(ACF)#Stationary_random_processes|stationarity]]&nbsp;  und&nbsp; [[Theory_of_Stochastic_Signals/Auto-Correlation_Function_(ACF)#Ergodic_random_processes|ergodicity]].&nbsp; Although these terms are not defined until the fourth main chapter&nbsp; "Random Variables with Statistical Dependence"&nbsp;, we apply them here already in advance to Markov chains.
 {{BlaueBox|TEXT=
-$\text{Definition:}$&nbsp; Sind bei einer Markovkette neben den Übergangswahrscheinlichkeiten auch alle Ereigniswahrscheinlichkeiten unabhängig von der Zeit&nbsp; $ν$, so bezeichnet man die Markovkette als&nbsp; '''stationär'''&nbsp; (englisch:&nbsp; ''stationary'').&nbsp; Man verzichtet dann auf den Index $ν$ und schreibt im binären Fall:
+$\text{Definition:}$&nbsp;  If, in addition to the transition probabilities, all event probabilities of a Markov chain are independent of the time&nbsp; $ν$, then the Markov chain is called&nbsp; '''stationary'''.&nbsp; One then omits the index $ν$ and writes in the binary case:
 :$${\rm Pr}(A_\nu ) = {\rm Pr}(A )  \hspace{0.5 cm} {\rm bzw.} \hspace{0.5 cm} {\rm Pr}(B_\nu ) = {\rm Pr}(B).$$
-Diese Größen nennt man auch die&nbsp; '''ergodischen Wahrscheinlichkeiten'''&nbsp; der Markovkette.}}
+These quantities are also called the&nbsp; '''ergodic probabilities'''&nbsp; of the Markov chain.}}
-Stationäre Markovketten weisen die nachfolgend genannten Besonderheiten auf:
+Stationary Markov chains have the special features mentioned below:
-*Zur Berechnung der ergodischen Wahrscheinlichkeiten einer binären Markovkette&nbsp; $(M =2)$&nbsp; kann man folgende Gleichungen verwenden:
+*To calculate the ergodic probabilities of a binary Markov chain&nbsp; $(M =2)$&nbsp; one can use the following equations:
 :$${\rm Pr}(A) = {\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}A) \cdot {\rm Pr}(A) \hspace{0.1cm} + \hspace{0.1cm} {\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}B) \cdot {\rm Pr}(B) ,$$
 :$${\rm Pr}(B) = {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}A) \cdot {\rm Pr}(A) \hspace{0.1cm} + \hspace{0.1cm} {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}B) \cdot {\rm Pr}(B) .$$
-*Da diese Gleichungen linear voneinander abhängen, darf man nur eine hiervon benutzen.&nbsp; Als zweite Bestimmungsgleichung kann man beispielsweise  verwenden:
+*Since these equations depend linearly on each other, one may use only one of them.  For example, as the second equation of determination one may use:
 :$${\rm Pr}(A) +  {\rm Pr}(B)  = 1.$$
-*Die ergodischen Wahrscheinlichkeiten ergeben sich daraus zu
+*The ergodic probabilities result from it to
 :$${\rm Pr}(A) = \frac {{\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}B) }{{\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}B)  + {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}A) }\hspace{0.1cm}, \hspace{0.5cm}{\rm Pr}(B) = \frac {{\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}A) }{{\rm Pr}(A \hspace{0.05cm} | \hspace{0.05cm}B)  + {\rm Pr}(B \hspace{0.05cm} | \hspace{0.05cm}A) }.$$
 {{GraueBox|TEXT=
-$\text{Beispiel 4:}$&nbsp; Wir betrachten eine binäre Markovkette mit den Ereignissen&nbsp; $A$&nbsp; und $B$&nbsp; und den Übergangswahrscheinlichkeiten&nbsp; ${\rm Pr}(A \hspace{0.05cm} \vert\hspace{0.05cm}A) = 0.4$&nbsp; und&nbsp;  ${\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm}B) = 0.8$. <br>Weiterhin setzen wir voraus, dass jede Realisierung dieser Kette zum Startzeitpunkt&nbsp; $ν = 0$&nbsp; mit dem Ereignis&nbsp; $A$&nbsp; beginnt.
+$\text{Example 4:}$&nbsp; We consider a binary Markov chain with events&nbsp; $A$&nbsp; and $B$&nbsp; and transition probabilities&nbsp; ${\rm Pr}(A \hspace{0.05cm} \vert\hspace{0.05cm}A) = 0.4$&nbsp; and&nbsp;  ${\rm Pr}(B \hspace{0.05cm} \vert \hspace{0.05cm}B) = 0.8$. <br>Furthermore, we assume that each realization of this chain starts with event&nbsp; $A$&nbsp; at the starting time&nbsp; $ν = 0$&nbsp;.
-Man erhält dann die nachfolgend aufgelisteten Ereigniswahrscheinlichkeiten (mit drei Nachkommastellen):
+We then obtain the event probabilities listed below (with three decimal places):
 [[File:P_ID642__Sto_T_1_4_S5_neu.png |center|frame| Zum Einschwingen der Ereigniswahrscheinlichkeiten]]
-Es handelt sich hier im strengen Sinne um eine nichtstationäre Markovkette, die jedoch schon nach kurzer Zeit (nahezu) eingeschwungen ist:
+In a strict sense, this is a non-stationary Markov chain, but it is (nearly) settled after a short time:
-*Zu späteren Zeiten&nbsp; $(ν > 5)$&nbsp; werden die Ereigniswahrscheinlichkeiten&nbsp; ${\rm Pr}(A_ν) \approx 1/4$&nbsp; und&nbsp; ${\rm Pr}(B_ν) \approx 3/4$&nbsp; nicht mehr gravierend verändert.
+*At later times&nbsp; $(ν > 5)$&nbsp;, the event probabilities&nbsp; ${\rm Pr}(A_ν) \approx 1/4$&nbsp; and&nbsp; ${\rm Pr}(B_ν) \approx 3/4$&nbsp; are no longer seriously changed.
-*Aus der Angabe&nbsp; ${\rm Pr}(A_{ν=5}) = 0.250$&nbsp; und&nbsp;  ${\rm Pr}(B_{ν=5}) = 0.750$&nbsp;  sollte allerdings  nicht geschlossen werden, dass die Markovkette zum Zeitpunkt&nbsp; $ν = 5$&nbsp;  schon vollkommen eingeschwungen ist.
+*However, from&nbsp; ${\rm Pr}(A_{ν=5}) = 0.250$&nbsp; and&nbsp;  ${\rm Pr}(B_{ν=5}) = 0.750$&nbsp; it should not be concluded that the Markov chain is already perfectly steady at time &nbsp; $ν = 5$&nbsp;.
-*Die exakten Werte sind nämlich&nbsp; ${\rm Pr}(A_{ν=5}) = 0.25024$&nbsp; und&nbsp; ${\rm Pr}(B_{ν=5}) = 0.74976$.}}
+*Indeed, the exact values are&nbsp; ${\rm Pr}(A_{ν=5}) = 0.25024$&nbsp; and&nbsp; ${\rm Pr}(B_{ν=5}) = 0.74976$.}}
 {{BlaueBox|TEXT=
-$\text{Fazit:}$&nbsp; Bei einer stationären Markovkette treten sehr lange nach Einschalten der Kette&nbsp; $(ν → ∞)$&nbsp; mit guter Näherung stets die ergodischen Wahrscheinlichkeiten auf, unabhängig von den Startbedingungen&nbsp; ${\rm Pr}(A_0)$&nbsp; und&nbsp; ${\rm Pr}(B_0) = 1 - {\rm Pr}(A_0)$.
+$\text{Conclusion:}$&nbsp; For a stationary Markov chain, the ergodic probabilities always occur with good approximation very long after the chain is switched on&nbsp; $(ν → ∞)$&nbsp;, regardless of the initial conditions&nbsp; ${\rm Pr}(A_0)$&nbsp; and&nbsp; ${\rm Pr}(B_0) = 1 - {\rm Pr}(A_0)$.
 }}
-==Matrix-Vektordarstellung==
+==Matrix vector representation==
 <br>
-Homogene Markovketten können mit Vektoren und Matrizen sehr kompakt dargestellt werden. Dies empfiehlt sich insbesondere bei mehr als zwei Ereignissen:
+Homogeneous Markov chains can be represented very compactly with vectors and matrices. This is especially recommended for more than two events:
 :$$E_\nu \in G = \{ E_{\rm 1}, E_{\rm 2}, \hspace{0.1cm}\text{...}\hspace{0.1cm}, E_\mu , \hspace{0.1cm}\text{...}\hspace{0.1cm}, E_M \}.$$
-Für die Matrix-Vektorendarstellung verwenden wir folgende Nomenklatur:
+For the matrix-vector representation, we use the following nomenclature:
-*Die&nbsp; $M$&nbsp; Wahrscheinlichkeiten zum Zeitpunkt&nbsp; $ν$&nbsp; fassen wir in einem Spaltenvektor zusammen:
+*We summarize the&nbsp; $M$&nbsp; probabilities at time&nbsp; $ν$&nbsp; in a column vector:
 :$${\mathbf{p}^{(\nu)}} = \left[ \begin{array}{c} {p_{\rm 1}}^{(\nu)} \\ \dots \\ {p_{M}}^{(\nu)} \end{array} \right] \hspace{0.5cm}{\rm mit} \hspace{0.5cm} {p_{\mu}}^{(\nu)} = {\rm Pr}(E_\nu = E_\mu ).$$
-*Die Übergangswahrscheinlichkeiten werden durch eine&nbsp; $M \times M$-Matrix ausgedrückt:
+*The transition probabilities are expressed by an&nbsp; $M \times M$ matrix:
 :$${\mathbf{P}} =\left( p_{ij} \right) = \left[ \begin{array}{cccc} p_{11} & p_{12} & \cdots & p_{1M} \\ p_{21} & p_{22}& \cdots & p_{2M} \\ \dots & \dots & \dots & \dots \\ p_{M1} & p_{M2} & \cdots & p_{MM}  \end{array} \right] \hspace{0.5cm}{\rm mit} \hspace{0.5cm} {p_{ij}} = {\rm Pr}(E_{\nu +1 } = E_j \hspace{0.05cm}| \hspace{0.05cm} E_{\nu } = E_i).$$
-Die nebenstehende Abbildung verdeutlicht diese Nomenklatur am Beispiel&nbsp; $M = 2$.&nbsp; Der neue Ereigniswahrscheinlichkeitsvektor nach einem Schritt lautet:
+The accompanying figure illustrates this nomenclature using the example&nbsp; $M = 2$.&nbsp; The new event probability vector after one step is:
-[[File:P_ID448__Sto_T_1_4_S6_neu.png  |frame|Zur Bezeichnung der Übergangswahrscheinlichkeiten |right]]
+[[File:P_ID448__Sto_T_1_4_S6_neu.png  |frame|To denote the transition probabilities |right]]
 :$${\mathbf{p}^{(\nu + 1)}} = {\mathbf{P}}^{\rm T} \cdot {\mathbf{p}^{(\nu )}} .$$
-${\mathbf{P}^{\rm T}}$&nbsp; bezeichnet hierbei die&nbsp; ''transponierte''&nbsp; Matrix zu&nbsp; $\mathbf{P}$.&nbsp; Nach&nbsp; $n$&nbsp; Schritten ergibt sich somit
+Here,${\mathbf{P}^{\rm T}}$&nbsp; denotes the&nbsp; ''transposed''&nbsp; matrix to&nbsp; $\mathbf{P}$.&nbsp; Thus, after&nbsp; $n$&nbsp; steps, the result is
 :$${\mathbf{p}^{(\nu +{\it n})}} = \left( {\mathbf{P}}^{\rm T} \right)^n \cdot {\mathbf{p}^{(\nu )}} .$$
-Im Grenzübergang&nbsp; $(n → ∞)$&nbsp; erreicht man dann stets die Stationarität der Markovkette:
+In the limit transition&nbsp; $(n → ∞)$&nbsp; one then always achieves stationarity of the Markov chain:
 :$$\lim_{n \to\infty}\hspace{0.1cm}{\mathbf{p}^{(\nu + n)}} = \lim_{n \to\infty} \left( {\mathbf{P}}^{\rm T} \right)^n \cdot {\mathbf{p}^{(\nu )}}  = {\mathbf{p}}_{\rm erg}= \left[ \begin{array}{c} {p_{\rm 1}} \\ \dots \\ {p_{M}} \end{array} \right] .$$
 {{BlaueBox|TEXT=
-$\text{Definition:}$&nbsp; Multipliziert man die Übergangsmatrix&nbsp; ${\mathbf{P} }$&nbsp; unendlich oft mit sich selbst und benennt das Ergebnis mit&nbsp; ${\mathbf{P} }_{\rm erg}$, so besteht die resultierende Matrix aus&nbsp; $M$&nbsp; gleichen Zeilen:
+$\text{Definition:}$&nbsp; Multiplying the transition matrix&nbsp; ${\mathbf{P} }$&nbsp; infinitely many times by itself and denoting the result by&nbsp; ${\mathbf{P} }_{\rm erg}$, the resulting matrix consists of&nbsp; $M$&nbsp; equal rows:
 :$${\mathbf{P} }_{\rm erg} = \lim_{n \to\infty}  {\mathbf{P} }^n = \left[ \begin{array}{cccc} p_{1} & p_{2} & \cdots & p_{M} \\ p_{1} & p_{2}& \cdots & p_{M} \\ \dots & \dots & \dots & \dots \\ p_{1} & p_{2} & \cdots & p_{M}  \end{array} \right] .$$
-Die Wahrscheinlichkeiten&nbsp; $p_1, \hspace{0.1cm}\text{...}\hspace{0.1cm}, p_M$&nbsp; in jeder dieser Zeilen bezeichnet man als die&nbsp; '''ergodischen Wahrscheinlichkeiten'''. }}
+The probabilities&nbsp; $p_1, \hspace{0.1cm}\text{...}\hspace{0.1cm}, p_M$&nbsp; in each of these lines are called the&nbsp; '''ergodic probabilities'''. }}
-[[File:P_ID1445__Sto_T_1_4_S6c_neu.png | right|frame|Markovdiagramm mit drei Ereignissen]]
+[[File:P_ID1445__Sto_T_1_4_S6c_neu.png | right|frame|Markov diagram with three events]]
 {{GraueBox|TEXT=
-$\text{Beispiel 5:}$&nbsp; Wir betrachten beispelhaft eine Markovkette mit den Ereignissen&nbsp; $E_1,&nbsp; E_2$&nbsp; und&nbsp; $E_3$ &nbsp; &rArr; &nbsp; $M=3$. Die Grafik zeigt
+$\text{Example 5:}$&nbsp; We consider a Markov chain with the events&nbsp; $E_1,&nbsp; E_2$&nbsp; and&nbsp; $E_3$ &nbsp; &rArr; &nbsp; $M=3$. The graph shows
-*links das Markovdiagramm,
+*on the left the Markov diagram,
-*rechts oben die Übergangsmatrix&nbsp; $\mathbf{P}$ und
+*on the upper right the transition matrix&nbsp; $\mathbf{P}$ and
-*darunter deren Potenzen&nbsp; $\mathbf{P}^2$&nbsp; und&nbsp; $\mathbf{P}^3$.
+*below their powers&nbsp; $\mathbf{P}^2$&nbsp; and&nbsp; $\mathbf{P}^3$.
-Im Markovdiagramm sind alle Übergangswahrscheinlichkeiten mit dem Zahlenwert&nbsp; $1/2$&nbsp; blau eingezeichnet;&nbsp; grün kennzeichnet die Wahrscheinlichkeit&nbsp; $1/4$.
+In the Markov diagram, all transition probabilities with the numerical value&nbsp; $1/2$&nbsp; are shown in blue;&nbsp; green indicates the probability&nbsp; $1/4$.
-*Beim Start&nbsp; $(ν =0)$&nbsp; seien alle Ereignisse gleichwahrscheinlich.&nbsp; Für&nbsp; $ν = 1$&nbsp; gilt dann:
+*At the start&nbsp; $(ν =0)$&nbsp; all events are equally probable.&nbsp; For&nbsp; $ν = 1$&nbsp; then holds:
 :$${\mathbf{p}^{(1)} } = {\mathbf{P} }^{\rm T} \cdot {\mathbf{p}^{(0 )} }= \left[ \begin{array}{ccc} 1/2 & 1/2&  1/2 \\ 1/4 & 0 & 1/2  \\ 1/4& 1/2& 0  \end{array} \right] \left[ \begin{array}{c} 1/3 \\ 1/3  \\ 1/3  \end{array} \right] = \left[ \begin{array}{c} 1/2 \\ 1/4  \\ 1/4  \end{array} \right] .$$
-:Hieraus ist zu ersehen, dass man von der Matrix&nbsp;  $\mathbf{P}$&nbsp; durch den Austausch der Zeilen und Spalten zur transponierten Matrix&nbsp; ${\mathbf{P}^{\rm T} }$&nbsp; kommt.
+:From this it can be seen that we get from the matrix&nbsp;  $\mathbf{P}$&nbsp; to the transposed matrix&nbsp; ${\mathbf{P}^{\rm T} }$&nbsp; by exchanging the rows and columns.
-*Zum Zeitpunkt&nbsp; $ν =2$&nbsp; (und auch zu allen späteren Zeiten) ergeben sich die gleichen Wahrscheinlichkeiten:
+*At time&nbsp; $ν =2$&nbsp; (and also at all later times) the same probabilities result:
 :$${\mathbf{p}^{(2)} } = {\mathbf{P} }^{\rm T} \cdot {\mathbf{p}^{(1 )} }= \left[ \begin{array}{ccc} 1/2 & 1/2&  1/2 \\ 1/4 & 0 & 1/2  \\ 1/4& 1/2& 0  \end{array} \right] \left[ \begin{array}{c} 1/2 \\ 1/4  \\ 1/4  \end{array} \right] = \left[ \begin{array}{c} 1/2 \\ 1/4  \\ 1/4  \end{array} \right] .$$
-:Das bedeutet:&nbsp; Die ergodischen Wahrscheinlichkeiten der drei Ereignisse sind&nbsp;  $1/2$&nbsp; $($für&nbsp; $E_1)$,&nbsp; $1/4$&nbsp; $($für&nbsp; $E_2)$,&nbsp; und $1/4$&nbsp; $($für&nbsp; $E_3)$.
+:This means:&nbsp; the ergodic probabilities of the three events are&nbsp;  $1/2$&nbsp; $($for&nbsp; $E_1)$,&nbsp; $1/4$&nbsp; $($for&nbsp; $E_2)$,&nbsp; and $1/4$&nbsp; $($for&nbsp; $E_3)$.
-*Dieses Ergebnis hätte man auch direkt aus der ergodischen Matrix ablesen können:
+*This result could have been read directly from the ergodic matrix:
 :$${\mathbf{P} }_{\rm erg} = \lim_{n \to\infty}  {\mathbf{P} }^n = \left[ \begin{array}{ccc} 1/2 & 1/4 & 1/4 \\ 1/2 & 1/4 & 1/4  \\ 1/2 & 1/4 & 1/4  \end{array} \right] .$$
-:Diese ergibt sich durch fortlaufende Multiplikation der Übergangsmatrix mit sich selbst.
+:This is obtained by continuously multiplying the transition matrix by itself.
-*Im obigen Bild sind die Potenzen&nbsp;  $\mathbf{P}^2$&nbsp; und&nbsp; $\mathbf{P}^3$&nbsp; angegeben, die sich der Matrix&nbsp; $\mathbf{P}_{\rm erg}$&nbsp; annähern.}}
+*In the picture above the powers&nbsp;  $\mathbf{P}^2$&nbsp; and&nbsp; $\mathbf{P}^3$&nbsp; are given, which approximate the matrix&nbsp; $\mathbf{P}_{\rm erg}$&nbsp;.}}
-==Aufgaben zum Kapitel==
+==Exercises for the chapter==
-[[Aufgaben:1.6_Übergangswahrscheinlichkeiten|Aufgabe 1.6: Übergangswahrscheinlichkeiten]]
+[[Aufgaben:Exercise_1.6:_Transition_Probabilities|Exercise 1.6: Transition Probabilities]]
-[[Aufgaben:1.6Z_Ergodische_Wahrscheinlichkeiten|Aufgabe 1.6Z: Ergodische Wahrscheinlichkeiten]]
+[[Aufgaben:Exercise_1.6Z:_Ergodic_Probabilities|Exercise 1.6Z: Ergodic Probabilities]]
-[[Aufgaben:1.7 Ternäre Markovkette|Aufgabe 1.7: Ternäre Markovkette]]
+[[Aufgaben:Exercise_1.7:_Ternary_Markov_Chain|Exercise 1.7: Ternary Markov Chain]]
-[[Aufgaben:1.7Z_BARBARA-Generator|Aufgabe 1.7Z: BARBARA-Generator]]
+[[Aufgaben:Exercise_1.7Z:_BARBARA_Generator|Exercise 1.7Z: BARBARA Generator]]
 {{Display}}