Difference between revisions of "Digital Signal Transmission/Structure of the Optimal Receiver"

Revision as of 11:24, 20 June 2022

1 Block diagram and prerequisites
2 Fundamental approach to optimal receiver design
3 The theorem of irrelevance
4 Some properties of the AWGN channel
5 Description of the AWGN channel by orthonormal basis functions
6 Optimal receiver for the AWGN channel
7 Implementation aspects
8 Probability density function of the received values
9 N–dimensional Gaussian noise
10 Exercises for the chapter

Block diagram and prerequisites

In this chapter, the structure of the optimal receiver of a digital transmission system is derived in very general terms, whereby

the modulation process and further system details are not specified further,
the basis functions and the signal space representation according to the chapter "Signals, Basis Functions and Vector Spaces" are assumed.

General block diagram of a communication system

The following should be noted about the above block diagram:

The symbol set size of the source is $M$ and the symbol set is $\{m_i\}$ with $i = 0$, ... , $M-1$. Let the corresponding symbol probabilities ${\rm Pr}(m = m_i)$ also be known to the receiver.

For message transmission, $M$ different signal forms $s_i(t)$ are available, indexed likewise by $i = 0$, ... , $M-1$. There is a fixed relation between the messages $\{m_i\}$ and the signals $\{s_i(t)\}$: if $m =m_i$ is transmitted, the transmitted signal is $s(t) =s_i(t)$.

Linear channel distortions are taken into account in the above diagram by the impulse response $h(t)$. In addition, noise $n(t)$ (of some kind) is present. With these two effects interfering with the transmission, the signal $r(t)$ arriving at the receiver is:

$$r(t) = s(t) \star h(t) + n(t) \hspace{0.05cm}.$$

The task of the (optimal) receiver is to find out, on the basis of its input signal $r(t)$, which of the $M$ possible messages $m_i$ – or which of the signals $s_i(t)$ – was sent. The estimated value for $m$ found by the receiver is characterized by a circumflex (French: Circonflexe) ⇒ $\hat{m}$.

$\text{Definition:}$ One speaks of an optimal receiver if the symbol error probability assumes the smallest possible value under the given boundary conditions:

$$p_{\rm S} = {\rm Pr} ({\cal E}) = {\rm Pr} ( \hat{m} \ne m) \hspace{0.15cm} \Rightarrow \hspace{0.15cm}{\rm minimum} \hspace{0.05cm}.$$

Notes:

In the following, we mostly assume the AWGN approach ⇒ $r(t) = s(t) + n(t)$, which means that the channel is assumed to be distortion-free: $h(t) = \delta(t)$.
Otherwise, we can redefine the signals $s_i(t)$ as ${s_i}'(t) = s_i(t) \star h(t)$, i.e., impose the deterministic channel distortions on the transmitted signal.

Fundamental approach to optimal receiver design

Compared to the "block diagram" shown on the previous page, we now perform some essential generalizations:

The transmission channel is described by the "conditional probability density function" $p_{\hspace{0.02cm}r(t)\hspace{0.02cm} \vert \hspace{0.02cm}s(t)}$ which determines the dependence of the received signal $r(t)$ on the transmitted signal $s(t)$.

If a certain signal $r(t) = \rho(t)$ has been received, the receiver has the task of determining, based on this signal realization $\rho(t)$ and the $M$ conditional probability density functions

$$p_{\hspace{0.05cm}r(t) \hspace{0.05cm} \vert \hspace{0.05cm} s(t) } (\rho(t) \hspace{0.05cm} \vert \hspace{0.05cm} s_i(t))\hspace{0.2cm}{\rm with}\hspace{0.2cm} i = 0, \text{...} \hspace{0.05cm}, M-1$$

which of the possible messages $m_i$ (or which of the possible signals $s_i(t)$) was most likely transmitted, taking into account all possible transmitted signals $s_i(t)$ and their probabilities of occurrence ${\rm Pr}(m = m_i)$.

Thus, the estimate of the optimal receiver is determined in general by the equation

$$\hat{m} = {\rm arg} \max_i \hspace{0.1cm} p_{\hspace{0.02cm}s(t) \hspace{0.05cm} \vert \hspace{0.05cm} r(t) } ( s_i(t) \hspace{0.05cm} \vert \hspace{0.05cm} \rho(t)) = {\rm arg} \max_i \hspace{0.1cm} p_{m \hspace{0.05cm} \vert \hspace{0.05cm} r(t) } ( \hspace{0.05cm}m_i\hspace{0.05cm} \vert \hspace{0.05cm}\rho(t))\hspace{0.05cm},$$

where it is considered that the transmitted message $m = m_i$ and the transmitted signal $s(t) = s_i(t)$ can be uniquely transformed into each other.

$\text{In other words:}$ The optimal receiver considers as most likely transmitted that message $m_i$ for which the conditional probability density function $p_{\hspace{0.02cm}m \hspace{0.05cm} \vert \hspace{0.05cm} r(t) }$, evaluated at $m =m_i$, takes the largest possible value for the applied received signal $\rho(t)$.

Before we discuss the above decision rule in more detail, the optimal receiver is first divided into two functional blocks according to the diagram:

Model for deriving the optimal receiver

The detector takes various measurements on the received signal $r(t)$ and summarizes them in the vector $\boldsymbol{r}$. With $K$ measurements $\boldsymbol{r}$ corresponds to a point in the $K$–dimensional vector space.

The decision block forms the estimated value $\hat{m}$ depending on this vector. For a given vector $\boldsymbol{r} = \boldsymbol{\rho}$ the decision rule is:

$$\hat{m} = {\rm arg}\hspace{0.05cm} \max_i \hspace{0.1cm} P_{m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} } ( m_i\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{\rho}) \hspace{0.05cm}.$$

In contrast to the decision rule above, a conditional probability $P_{m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} }$ now occurs instead of the conditional probability density function (PDF) $p_{m\hspace{0.05cm} \vert \hspace{0.05cm}r(t)}$. Note that upper and lower case distinguish the two meanings.

$\text{Example 1:}$ We now consider the function $y = {\rm arg}\hspace{0.05cm} \max \ p(x)$, where $p(x)$ describes the probability density function (PDF) of a continuous-valued or discrete-valued random variable $x$. In the second case (right graph), the PDF consists of a sum of Dirac functions with the probabilities as pulse weights.

Illustration of the "arg max" function

The graphic shows example functions. In both cases the PDF maximum $(17)$ is at $x = 6$:

$$\max_x \hspace{0.1cm} p(x) = 17\hspace{0.05cm},$$

$$y = {\rm \hspace{0.05cm}arg} \max_x \hspace{0.1cm} p(x) = 6\hspace{0.05cm}.$$
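The difference between the maximum and its argument can be reproduced in a few lines of Python. This is only a sketch: the table of function values is hypothetical and not read from the graphic; only the peak value 17 at $x = 6$ follows Example 1.

```python
# Hypothetical sample values p(x); only the peak (17 at x = 6) follows Example 1.
p = {2: 3, 3: 8, 4: 12, 5: 15, 6: 17, 7: 14, 8: 9, 9: 5}

p_max = max(p.values())      # "max": the largest function value
y = max(p, key=p.get)        # "arg max": the argument x at which it occurs

print(p_max, y)
```

The receiver is interested only in the second quantity: it asks which message index yields the largest value, not how large that value is.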

The (conditional) probabilities in the equation

$$\hat{m} = {\rm arg}\hspace{0.05cm} \max_i \hspace{0.1cm} P_{\hspace{0.02cm}m\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{ r} } ( m_i \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho})$$

are a posteriori probabilities.

Using "Bayes' theorem", they can be written as:

$$P_{\hspace{0.02cm}m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} } ( m_i \hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{\rho}) = \frac{ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm}m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm}m_i )}{p_{\boldsymbol{ r} } (\boldsymbol{\rho})} \hspace{0.05cm}.$$

The denominator term is the same for all alternatives $m_i$ and need not be considered for the decision. This gives the following rules:

$\text{Theorem:}$ The decision rule of the optimal receiver, also known as the MAP receiver (MAP stands for maximum–a–posteriori), is:

$$\hat{m}_{\rm MAP} = {\rm \hspace{0.05cm} arg} \max_i \hspace{0.1cm} P_{\hspace{0.02cm}m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} } ( m_i \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}) = {\rm \hspace{0.05cm}arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm} m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm} m_i )\big ]\hspace{0.05cm}.$$

The advantage of this equation is that the conditional PDF $p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm} m }$ ("output under the condition input") describing the forward direction of the channel can be used. In contrast, the first equation uses the inference probabilities $P_{\hspace{0.05cm}m\hspace{0.05cm} \vert \hspace{0.02cm} \boldsymbol{ r} } $ ("input under the condition output").

$\text{Theorem:}$ A maximum likelihood receiver (ML receiver in short) uses the decision rule

$$\hat{m}_{\rm ML} = \hspace{-0.1cm} {\rm arg} \max_i \hspace{0.1cm} p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm}m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm}m_i )\hspace{0.05cm}.$$

In this case, the possibly different occurrence probabilities ${\rm Pr}(m = m_i)$ are not used for the decision process, for example, because they are not known to the receiver.

See the earlier chapter "Optimal Receiver Strategies" for other derivations for these receiver types.

$\text{Conclusion:}$ For equally likely messages $\{m_i\}$ ⇒ ${\rm Pr}(m = m_i) = 1/M$, the generally slightly worse ML receiver is equivalent to the MAP receiver:

$$\hat{m}_{\rm MAP} = \hat{m}_{\rm ML} =\hspace{-0.1cm} {\rm\hspace{0.05cm} arg} \max_i \hspace{0.1cm} p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm}m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm}m_i )\hspace{0.05cm}.$$
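The two decision rules can be compared numerically. The following sketch assumes, purely for illustration, a scalar received value $\rho$, two hypothetical signal values $s_0 = +1$ and $s_1 = -1$, and a Gaussian conditional PDF with standard deviation `sigma`; none of these values come from the text above.

```python
import math

def gauss_pdf(rho, s, sigma):
    """Assumed conditional PDF p_{r|m}(rho | m_i) for r = s_i + n, n ~ N(0, sigma^2)."""
    return math.exp(-(rho - s) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def map_decision(rho, signals, priors, sigma):
    """Index i that maximizes Pr(m_i) * p_{r|m}(rho | m_i)."""
    return max(range(len(signals)),
               key=lambda i: priors[i] * gauss_pdf(rho, signals[i], sigma))

def ml_decision(rho, signals, sigma):
    """Index i that maximizes p_{r|m}(rho | m_i); the priors are ignored."""
    return max(range(len(signals)),
               key=lambda i: gauss_pdf(rho, signals[i], sigma))

signals = [+1.0, -1.0]   # hypothetical s_0, s_1
sigma = 1.0              # hypothetical noise standard deviation
rho = -0.2               # hypothetical received value

ml = ml_decision(rho, signals, sigma)
map_unequal = map_decision(rho, signals, [0.8, 0.2], sigma)
map_equal = map_decision(rho, signals, [0.5, 0.5], sigma)

print(ml, map_unequal, map_equal)
```

With the strongly unequal priors, the MAP rule decides for $m_0$ although $\rho$ lies closer to $s_1$; with equal priors it coincides with the ML rule.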

The theorem of irrelevance

Note that the receiver described in the last section is optimal only if the detector is implemented in the best possible way, i.e., if no information is lost by the transition from the continuous signal $r(t)$ to the vector $\boldsymbol{r}$.

About the theorem of irrelevance

The theorem of irrelevance helps to clarify which and how many measurements have to be performed on the received signal $r(t)$ to guarantee optimality. For this purpose, we consider the sketched receiver whose detector derives the two vectors $\boldsymbol{r}_1$ and $\boldsymbol{r}_2$ from the received signal $r(t)$ and makes them available to the decision. These quantities are related to the message $ m \in \{m_i\}$ via the joint probability density $p_{\boldsymbol{ r}_1, \hspace{0.05cm}\boldsymbol{ r}_2\hspace{0.05cm} \vert \hspace{0.05cm}m }$.

The decision rule of the MAP receiver with adaptation to this example is:

$$\hat{m}_{\rm MAP} \hspace{-0.1cm} = \hspace{-0.1cm} {\rm arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1 , \hspace{0.05cm}\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1, \hspace{0.05cm}\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} m_i ) \big]= {\rm arg} \max_i \hspace{0.1cm}\big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1 \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1 \hspace{0.05cm} \vert \hspace{0.05cm}m_i ) \cdot p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i )\big] \hspace{0.05cm}.$$

Note the following:

The vectors $\boldsymbol{r}_1$ and $\boldsymbol{r}_2$ are random variables. Their realizations are denoted here and in the following by $\boldsymbol{\rho}_1$ and $\boldsymbol{\rho}_2$. For emphasis, all vectors are shown in red in the graph.

The requirements for applying the "theorem of irrelevance" are the same as those for a first-order "Markov chain". The random variables $x$, $y$, $z$ form a first-order Markov chain if the distribution of $z$ is independent of $x$ for a given $y$. For a first-order Markov chain, the following holds:

$$p(x, y, z) = p(x) \cdot p(y\hspace{0.05cm} \vert \hspace{0.05cm}x) \cdot p(z\hspace{0.05cm} \vert \hspace{0.05cm}y) \hspace{0.25cm} {\rm instead \hspace{0.15cm}of} \hspace{0.25cm}p(x, y, z) = p(x) \cdot p(y\hspace{0.05cm} \vert \hspace{0.05cm}x) \cdot p(z\hspace{0.05cm} \vert \hspace{0.05cm}x, y) \hspace{0.05cm}.$$

In the general case, the optimal receiver must evaluate both vectors $\boldsymbol{r}_1$ and $\boldsymbol{r}_2$, since both conditional probability densities $p_{\boldsymbol{ r}_1\hspace{0.05cm} \vert \hspace{0.05cm}m }$ and $p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{ r}_1, \hspace{0.05cm}m }$ occur in the above decision rule.

In contrast, the receiver can neglect the second measurement without loss of information if $\boldsymbol{r}_2$ is independent of message $m$ for given $\boldsymbol{r}_1$:

$$p_{\boldsymbol{ r}_2\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i )= p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 } \hspace{0.05cm} (\boldsymbol{\rho}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 ) \hspace{0.05cm}.$$

In this case, the decision rule can be further simplified:

$$\hat{m}_{\rm MAP} = {\rm arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1 \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1 \hspace{0.05cm} \vert \hspace{0.05cm}m_i ) \cdot p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i ) \big]$$

$$\Rightarrow \hspace{0.3cm}\hat{m}_{\rm MAP} = {\rm arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1 \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1 \hspace{0.05cm} \vert \hspace{0.05cm}m_i ) \cdot p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 )\big]$$

$$\Rightarrow \hspace{0.3cm}\hat{m}_{\rm MAP} = {\rm arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1 \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1 \hspace{0.05cm} \vert \hspace{0.05cm}m_i ) \big]\hspace{0.05cm}.$$

$\text{Example 2:}$ We consider two different system configurations with two noise terms $\boldsymbol{ n}_1$ and $\boldsymbol{ n}_2$ each to illustrate the theorem of irrelevance just presented. In the diagram all vectorial quantities are drawn in red. Moreover, the quantities $\boldsymbol{s}$, $\boldsymbol{ n}_1$ and $\boldsymbol{ n}_2$ are independent of each other.

Two examples of the theorem of irrelevance

The analysis of these two arrangements yields the following results:

In both cases, the decision must consider the component $\boldsymbol{ r}_1= \boldsymbol{ s}_i + \boldsymbol{ n}_1$, since only this component provides the information about the useful signal $\boldsymbol{ s}_i$ and thus about the transmitted message $m_i$.

In the upper configuration, $\boldsymbol{ r}_2$ contains no information about $m_i$ that has not already been provided by $\boldsymbol{ r}_1$. Rather, $\boldsymbol{ r}_2= \boldsymbol{ r}_1 + \boldsymbol{ n}_2$ is just a noisy version of $\boldsymbol{ r}_1$ and depends only on the noise $\boldsymbol{ n}_2$ once $\boldsymbol{ r}_1$ is known ⇒ $\boldsymbol{ r}_2$ is irrelevant:

$$p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i )= p_{\boldsymbol{ r}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{\rho}_1 )= p_{\boldsymbol{ n}_2 } \hspace{0.05cm} (\boldsymbol{\rho}_2 - \boldsymbol{\rho}_1 )\hspace{0.05cm}.$$

In the lower configuration, on the other hand, $\boldsymbol{ r}_2= \boldsymbol{ n}_1 + \boldsymbol{ n}_2$ is helpful to the receiver, since it provides it with an estimate of the noise term $\boldsymbol{ n}_1$ ⇒ $\boldsymbol{ r}_2$ should therefore not be discarded here. Formally, this result can be expressed as follows:

$$p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i ) = p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ n}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 - \boldsymbol{s}_i, \hspace{0.05cm}m_i)= p_{\boldsymbol{ n}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ n}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2- \boldsymbol{\rho}_1 + \boldsymbol{s}_i \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 - \boldsymbol{s}_i, \hspace{0.05cm}m_i) = p_{\boldsymbol{ n}_2 } \hspace{0.05cm} (\boldsymbol{\rho}_2- \boldsymbol{\rho}_1 + \boldsymbol{s}_i ) \hspace{0.05cm}.$$

Since the signal vector $\boldsymbol{ s}_i$ (and thus the message $m_i$) now appears in the argument of this function, $\boldsymbol{ r}_2$ is not irrelevant here but relevant for the decision.
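The two configurations of Example 2 can be checked numerically. The sketch below assumes scalar quantities and zero-mean Gaussian noise terms; these assumptions, the helper names and all numerical values are hypothetical and chosen only for illustration. It compares the likelihood ratio between $s_0$ and $s_1$ obtained from $\rho_1$ alone with the ratio obtained from the pair $(\rho_1, \rho_2)$.

```python
import math

def gauss(x, sigma):
    """Zero-mean Gaussian PDF (assumed noise model for this sketch)."""
    return math.exp(-x * x / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def joint_upper(rho1, rho2, s, sigma1, sigma2):
    # Upper configuration: r1 = s + n1, r2 = r1 + n2
    return gauss(rho1 - s, sigma1) * gauss(rho2 - rho1, sigma2)

def joint_lower(rho1, rho2, s, sigma1, sigma2):
    # Lower configuration: r1 = s + n1, r2 = n1 + n2
    return gauss(rho1 - s, sigma1) * gauss(rho2 - rho1 + s, sigma2)

def ratio(joint, rho1, rho2, s0, s1, sigma1, sigma2):
    """Likelihood ratio of s0 versus s1 for the observed pair (rho1, rho2)."""
    return joint(rho1, rho2, s0, sigma1, sigma2) / joint(rho1, rho2, s1, sigma1, sigma2)

s0, s1 = +1.0, -1.0          # hypothetical signal values
sigma1 = sigma2 = 1.0        # hypothetical noise standard deviations
rho1 = 0.3                   # hypothetical first measurement
rho2_values = (-2.0, 0.0, 2.0)

ratio_r1_only = gauss(rho1 - s0, sigma1) / gauss(rho1 - s1, sigma1)
upper = [ratio(joint_upper, rho1, r2, s0, s1, sigma1, sigma2) for r2 in rho2_values]
lower = [ratio(joint_lower, rho1, r2, s0, s1, sigma1, sigma2) for r2 in rho2_values]

print(ratio_r1_only, upper, lower)
```

In the upper configuration the ratio does not change with $\rho_2$, so the second measurement can be dropped. In the lower configuration the ratio depends strongly on $\rho_2$ and can even fall below one, which reverses the decision that $\rho_1$ alone would suggest.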

Some properties of the AWGN channel

In order to make further statements about the nature of the optimal measurements of the vector $\boldsymbol{ r}$, it is necessary to further specify the (conditional) probability density function $p_{\hspace{0.02cm}r(t)\hspace{0.05cm} \vert \hspace{0.05cm}s(t)}$ characterizing the channel. In the following, we will consider communication over the "AWGN channel", whose most important properties are briefly summarized again here:

The output signal of the AWGN channel is $r(t) = s(t)+n(t)$, where $s(t)$ indicates the transmitted signal and $n(t)$ is represented by a Gaussian noise process.

A random process $\{n(t)\}$ is said to be Gaussian if the elements of the $k$–dimensional random variables $\{n_1(t)\hspace{0.05cm} \text{...} \hspace{0.05cm}n_k(t)\}$ are jointly Gaussian.

The mean of the AWGN noise is ${\rm E}\big[n(t)\big] = 0$. Moreover, $n(t)$ is "white", which means that the "power-spectral density" (PSD) is constant for all frequencies $($from $-\infty$ to $+\infty)$:

$${\it \Phi}_n(f) = {N_0}/{2} \hspace{0.05cm}.$$

According to the "Wiener–Khinchin theorem", the auto-correlation function (ACF) is obtained as the inverse Fourier transform of ${\it \Phi_n(f)}$:

$${\varphi_n(\tau)} = {\rm E}\big [n(t) \cdot n(t+\tau)\big ] = {N_0}/{2} \cdot \delta(\tau)\hspace{0.3cm} \Rightarrow \hspace{0.3cm} {\rm E}\big [n(t) \cdot n(t+\tau)\big ] = \left\{ \begin{array}{c} \rightarrow \infty \\ 0 \end{array} \right.\quad \begin{array}{*{1}c} {\rm for} \hspace{0.15cm} \tau = 0 \hspace{0.05cm}, \\ {\rm for} \hspace{0.15cm} \tau \ne 0 \hspace{0.05cm}.\\ \end{array}$$

Here, $N_0$ denotes the physical noise power density (defined only for $f \ge 0$). The constant PSD value $(N_0/2)$ and the weight of the Dirac function in the ACF $($also $N_0/2)$ result solely from the two-sided approach.

Further information on this topic is provided in the second part of the learning video "The AWGN channel".

Description of the AWGN channel by orthonormal basis functions

From the penultimate statement on the previous page, it follows that

pure AWGN noise $n(t)$ always has infinite variance (power): $\sigma_n^2 \to \infty$,
consequently, only filtered noise $n\hspace{0.05cm}'(t) = n(t) \star h_n(t)$ can occur in reality.

With the impulse response $h_n(t)$ and the frequency response $H_n(f) = {\rm F}\big [h_n(t)\big ]$, the following equations hold:

$${\rm E}\big[n\hspace{0.05cm}'(t) \big] \hspace{0.15cm} = \hspace{0.2cm} {\rm E}\big[n(t) \big] = 0 \hspace{0.05cm},$$

$${\it \Phi_{n\hspace{0.05cm}'}(f)} \hspace{0.1cm} = \hspace{0.1cm} {N_0}/{2} \cdot |H_{n}(f)|^2 \hspace{0.05cm},$$

$$ {\it \varphi_{n\hspace{0.05cm}'}(\tau)} \hspace{0.1cm} = \hspace{0.1cm} {N_0}/{2}\hspace{0.1cm} \cdot \big [h_{n}(\tau) \star h_{n}(-\tau)\big ]\hspace{0.05cm},$$

$$\sigma_n^2 \hspace{0.1cm} = \hspace{0.1cm} { \varphi_{n\hspace{0.05cm}'}(\tau = 0)} = {N_0}/{2} \cdot \int_{-\infty}^{+\infty}h_n^2(t)\,{\rm d} t ={N_0}/{2}\hspace{0.1cm} \cdot < \hspace{-0.1cm}h_n(t), \hspace{0.1cm} h_n(t) \hspace{-0.05cm} > \hspace{0.1cm} = \int_{-\infty}^{+\infty}{\it \Phi}_{n\hspace{0.05cm}'}(f)\,{\rm d} f = {N_0}/{2} \cdot \int_{-\infty}^{+\infty}|H_n(f)|^2\,{\rm d} f \hspace{0.05cm}.$$
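The variance formula $\sigma_n^2 = N_0/2 \cdot \int h_n^2(t)\,{\rm d}t$ can be checked numerically for a concrete filter. The sketch below assumes a first-order low-pass with impulse response $h_n(t) = 1/\tau \cdot {\rm e}^{-t/\tau}$ for $t \ge 0$; this filter and the values of `N0` and `tau` are hypothetical choices for illustration. For this filter the integral equals $1/(2\tau)$, so that $\sigma_n^2 = N_0/(4\tau)$.

```python
import math

N0 = 2e-3        # hypothetical noise power density
tau = 1e-3       # hypothetical time constant of the low-pass

def h_n(t):
    """Impulse response of a first-order low-pass (assumed for illustration)."""
    return math.exp(-t / tau) / tau if t >= 0 else 0.0

# Midpoint-rule approximation of the integral of h_n(t)^2 over [0, 40*tau];
# the contribution beyond 40*tau is negligible.
steps = 40_000
dt = 40 * tau / steps
energy = sum(h_n((k + 0.5) * dt) ** 2 for k in range(steps)) * dt

sigma2_numeric = N0 / 2 * energy
sigma2_closed = N0 / (4 * tau)     # closed form, since the integral is 1/(2*tau)

print(sigma2_numeric, sigma2_closed)
```

The shorter the time constant (that is, the wider the filter bandwidth), the larger the variance, which is consistent with $\sigma_n^2 \to \infty$ for unfiltered white noise.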

In the following, $n(t)$ always implicitly includes a band limitation; the notation $n'(t)$ is therefore omitted from now on.

$\text{Please note:}$ Similar to the transmitted signal $s(t)$, the noise process $\{n(t)\}$ can also be written as a weighted sum of orthonormal basis functions $\varphi_j(t)$.

In contrast to $s(t)$, however, a restriction to a finite number of basis functions is not possible here.
Rather, for purely stochastic quantities, the corresponding signal representation is always

$$n(t) = \lim_{N \rightarrow \infty} \sum\limits_{j = 1}^{N}n_j \cdot \varphi_j(t) \hspace{0.05cm},$$

where the coefficient $n_j$ is determined by the projection of $n(t)$ onto the basis function $\varphi_j(t)$:

$$n_j = \hspace{0.1cm} < \hspace{-0.1cm}n(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} > \hspace{0.05cm}.$$

Note: To avoid confusion with the basis functions $\varphi_j(t)$, the ACF $\varphi_n(\tau)$ of the noise process is from now on expressed only as the expected value ${\rm E}\big [n(t) \cdot n(t + \tau)\big ]$.

Optimal receiver for the AWGN channel

Optimal receiver at the AWGN channel

The received signal $r(t) = s(t) + n(t)$ can also be decomposed into basis functions in the familiar way:

$$r(t) = \sum\limits_{j = 1}^{\infty}r_j \cdot \varphi_j(t) \hspace{0.05cm}.$$

The following must be taken into account:

The $M$ possible transmitted signals $\{s_i(t)\}$ span a signal space with a total of $N$ basis functions $\varphi_1(t)$, ... , $\varphi_N(t)$.

These $N$ basis functions $\varphi_j(t)$ are used simultaneously to describe the noise signal $n(t)$ and the received signal $r(t)$.

For a complete characterization of $n(t)$ or $r(t)$, however, infinitely many further basis functions $\varphi_{N+1}(t)$, $\varphi_{N+2}(t)$, ... are needed in addition.

This yields the coefficients of the received signal $r(t)$ according to the following equation, which takes into account that the signals $s_i(t)$ and the noise $n(t)$ are independent of each other:

$$r_j \hspace{0.1cm} = \hspace{0.1cm} \hspace{0.1cm} < \hspace{-0.1cm}r(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} > \hspace{0.1cm}=\hspace{0.1cm} \left\{ \begin{array}{c} < \hspace{-0.1cm}s_i(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} > + < \hspace{-0.1cm}n(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} > \hspace{0.1cm}= s_{ij}+ n_j\\ < \hspace{-0.1cm}n(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} > \hspace{0.1cm} = n_j \end{array} \right.\quad \begin{array}{*{1}c} {j = 1, 2, \hspace{0.05cm}\text{...}\hspace{0.05cm} \hspace{0.05cm}, N} \hspace{0.05cm}, \\ {j > N} \hspace{0.05cm}.\\ \end{array}$$

This results in the structure sketched above for the optimal receiver.

Let us first consider the AWGN channel. Here, the prefilter with the frequency response $W(f)$, which is intended for colored noise, can be dispensed with.

The detector of the optimal receiver forms the coefficients $r_j \hspace{0.1cm} = \hspace{0.1cm} \hspace{0.1cm} < \hspace{-0.1cm}r(t), \hspace{0.1cm} \varphi_j(t)\hspace{-0.05cm} >$ and passes them on to the decision. If the decision is based on all (that is, infinitely many) coefficients $r_j$, the probability of a wrong decision is minimal and the receiver is optimal.

The real-valued coefficients $r_j$ were calculated above as follows:

$$r_j = \left\{ \begin{array}{c} s_{ij} + n_j\\ n_j \end{array} \right.\quad \begin{array}{*{1}c} {j = 1, 2, \hspace{0.05cm}\text{...}\hspace{0.05cm}, N} \hspace{0.05cm}, \\ {j > N} \hspace{0.05cm}.\\ \end{array}$$

According to the theorem of irrelevance, it can be shown that for additive white Gaussian noise

optimality is not reduced if the coefficients $r_{N+1}$, $r_{N+2}$, ... , which do not depend on the message $(s_{ij})$, are not included in the decision process, and therefore

the detector only has to form the projections of the received signal $r(t)$ onto the $N$ basis functions $\varphi_{1}(t)$, ... , $\varphi_{N}(t)$ given by the useful signal $s(t)$.

In the graphic, this significant simplification is indicated by the gray background.

In the case of colored noise ⇒ power-spectral density ${\it \Phi}_n(f) \ne {\rm const.}$, only a prefilter with the amplitude response $|W(f)| = {1}/{\sqrt{ {\it \Phi}_n(f)} }$ is additionally required. This filter is also called a "whitening filter", since the noise power density at its output is constant again, that is, "white".

More details can be found in the chapter "Matched filter for colored interference" of the book "Theory of Stochastic Signals".

Implementation aspects

Essential components of the optimal receiver are the computations of the inner products according to the equations $r_j \hspace{0.1cm} = \hspace{0.1cm} \hspace{0.1cm} < \hspace{-0.1cm}r(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} >$. These can be implemented in several ways:

In the correlation receiver (see the chapter of the same name for more details on this implementation), the inner products are realized directly according to the definition, using analog multipliers and integrators:

$$r_j = \int_{-\infty}^{+\infty}r(t) \cdot \varphi_j(t) \,{\rm d} t \hspace{0.05cm}.$$

The matched filter receiver, already derived in the chapter "Optimal Binary Receiver" at the beginning of this book, achieves the same result using a linear filter with the impulse response $h_j(t) = \varphi_j(T-t)$ followed by sampling at time $t = T$:

$$r_j(t) = \int_{-\infty}^{+\infty}r(\tau) \cdot h_j(t-\tau) \,{\rm d} \tau = \int_{-\infty}^{+\infty}r(\tau) \cdot \varphi_j(T-t+\tau) \,{\rm d} \tau \hspace{0.3cm} \Rightarrow \hspace{0.3cm} r_j (t = T) = \int_{-\infty}^{+\infty}r(\tau) \cdot \varphi_j(\tau) \,{\rm d} \tau = r_j \hspace{0.05cm}.$$

The figure shows the two possible realizations of the optimal detector.

Three different implementations of the inner product
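The equivalence of the correlation receiver and the matched filter receiver can be illustrated in discrete time. The following sketch replaces the integrals by midpoint sums over $K$ samples in $[0, T]$. The ramp-shaped basis function, the received signal (a scaled basis function plus a deterministic sinusoidal disturbance instead of random noise) and all helper names are hypothetical choices for illustration.

```python
import math

T = 1.0                      # hypothetical symbol duration
K = 1000                     # number of samples in [0, T]
dt = T / K
t = [(k + 0.5) * dt for k in range(K)]   # midpoint sampling instants

# Hypothetical unit-energy basis function: a ramp, deliberately asymmetric in time
phi = [math.sqrt(3.0 / T**3) * tk for tk in t]

# Hypothetical received signal: 0.8 * phi(t) plus a deterministic disturbance
r = [0.8 * p + 0.1 * math.sin(2 * math.pi * 5 * tk / T) for p, tk in zip(phi, t)]

def correlator(r, phi, dt):
    """Inner product <r, phi>: multiply sample by sample and integrate."""
    return sum(rk * pk for rk, pk in zip(r, phi)) * dt

def matched_filter_output(r, phi, dt):
    """Filter r with h(t) = phi(T - t) and take the output sample belonging to t = T."""
    h = phi[::-1]            # time-reversed basis function on the same grid
    n = len(r) - 1           # output index that corresponds to t = T
    return sum(r[k] * h[n - k] for k in range(n + 1)) * dt

r_corr = correlator(r, phi, dt)
r_mf = matched_filter_output(r, phi, dt)

print(r_corr, r_mf)
```

Because the basis function is not symmetric, the time reversal in $h_j(t) = \varphi_j(T-t)$ matters: filtering with $\varphi_j(t)$ itself would in general give a different value at $t = T$.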

Probability density function of the received values

Before we turn, in the following chapter, to the optimal design of the decision and to the calculation and approximation of the error probability, we first carry out a statistical analysis of the decision variables $r_j$ that is valid for the AWGN channel.

Signal space constellation and PDF of the received signal

For this purpose, we consider once again the optimal binary receiver for bipolar baseband transmission over the AWGN channel, starting from the description form valid for the fourth main chapter.

With the parameters $N = 1$ and $M = 2$, the signal space constellation shown in the left graphic is obtained for the transmitted signal

with only one basis function $\varphi_1(t)$, because of $N = 1$,
with the two signal space points $s_i \in \{s_0, \hspace{0.05cm}s_1\}$, because of $M = 2$.

For the signal $r(t) = s(t) + n(t)$ at the output of the AWGN channel, exactly the same constellation results in the noise-free case ⇒ $r(t) = s(t)$. The signal space points are thus at

$$r_0 = s_0 = \sqrt{E}\hspace{0.05cm},\hspace{0.2cm}r_1 = s_1 = -\sqrt{E}\hspace{0.05cm}.$$

When the (band-limited) AWGN noise $n(t)$ is taken into account, Gaussian curves with variance $\sigma_n^2$ ⇒ standard deviation $\sigma_n$ are superimposed on each of the two points $r_0$ and $r_1$ (see right graphic). The PDF of the noise component $n(t)$ is:

$$p_n(n) = \frac{1}{\sqrt{2\pi} \cdot \sigma_n}\cdot {\rm e}^{ - {n^2}/(2 \sigma_n^2)}\hspace{0.05cm}.$$

For the conditional probability density that the received value $\rho$ is present when $s_i$ has been transmitted, the following expression is then obtained:

$$p_{\hspace{0.02cm}r\hspace{0.05cm}|\hspace{0.05cm}s}(\rho\hspace{0.05cm}|\hspace{0.05cm}s_i) = \frac{1}{\sqrt{2\pi} \cdot \sigma_n}\cdot {\rm e}^{ - {(\rho - s_i)^2}/(2 \sigma_n^2)} \hspace{0.05cm}.$$

Regarding the units of the quantities listed here, note:

$r_0 = s_0$ and $r_1 = s_1$ as well as $n$ are each scalars with the unit "square root of energy".

It is thus obvious that $\sigma_n$ also has the unit "square root of energy" and that $\sigma_n^2$ represents an energy.

For the AWGN channel, the noise variance is $\sigma_n^2 = N_0/2$. It is therefore also a physical quantity with the unit $\rm W/Hz = Ws$.

The topic addressed here is illustrated with examples in Exercise 4.6.

N–dimensional Gaussian noise

If an $N$–dimensional modulation method is present, that is, if with $0 \le i \le M-1$ and $1 \le j \le N$ the following holds:

$$s_i(t) = \sum\limits_{j = 1}^{N} s_{ij} \cdot \varphi_j(t) = s_{i1} \cdot \varphi_1(t) + s_{i2} \cdot \varphi_2(t) + \hspace{0.05cm}\text{...}\hspace{0.05cm} + s_{iN} \cdot \varphi_N(t)\hspace{0.05cm}\hspace{0.3cm} \Rightarrow \hspace{0.3cm} \boldsymbol{ s}_i = \left(s_{i1}, s_{i2}, \hspace{0.05cm}\text{...}\hspace{0.05cm}, s_{iN}\right ) \hspace{0.05cm},$$

then the noise vector $\boldsymbol{ n}$ must also be assumed to have dimension $N$. The same applies to the received vector $\boldsymbol{ r}$:

$$\boldsymbol{ n} = \left(n_{1}, n_{2}, \hspace{0.05cm}\text{...}\hspace{0.05cm}, n_{N}\right ) \hspace{0.01cm},\hspace{0.2cm}\boldsymbol{ r} = \left(r_{1}, r_{2}, \hspace{0.05cm}\text{...}\hspace{0.05cm}, r_{N}\right )\hspace{0.05cm}.$$

For the AWGN channel, the probability density function (PDF) with the realization $\boldsymbol{ \eta}$ of the noise signal is then

$$p_{\boldsymbol{ n}}(\boldsymbol{ \eta}) = \frac{1}{\left( \sqrt{2\pi} \cdot \sigma_n \right)^N } \cdot {\rm exp} \left [ - \frac{|| \boldsymbol{ \eta} ||^2}{2 \sigma_n^2}\right ]\hspace{0.05cm},$$

and for the conditional PDF in the maximum likelihood decision rule, the following applies:

$$p_{\hspace{0.02cm}\boldsymbol{ r}\hspace{0.05cm} | \hspace{0.05cm} \boldsymbol{ s}}(\boldsymbol{ \rho} \hspace{0.05cm}|\hspace{0.05cm} \boldsymbol{ s}_i) \hspace{-0.1cm} = \hspace{0.1cm} p_{\hspace{0.02cm} \boldsymbol{ n}\hspace{0.05cm} | \hspace{0.05cm} \boldsymbol{ s}}(\boldsymbol{ \rho} - \boldsymbol{ s}_i \hspace{0.05cm} | \hspace{0.05cm} \boldsymbol{ s}_i) = \frac{1}{\left( \sqrt{2\pi} \cdot \sigma_n \right)^N } \cdot {\rm exp} \left [ - \frac{|| \boldsymbol{ \rho} - \boldsymbol{ s}_i ||^2}{2 \sigma_n^2}\right ]\hspace{0.05cm}.$$

The equation follows from the general representation of the $N$–dimensional Gaussian PDF in the section "Correlation matrix" of the book "Theory of Stochastic Signals", under the assumption that the components are uncorrelated (and thus statistically independent). $||\boldsymbol{ \eta}||$ is called the norm (length) of the vector $\boldsymbol{ \eta}$.
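Since the exponential function decreases monotonically with $||\boldsymbol{\rho} - \boldsymbol{s}_i||^2$, maximizing this conditional PDF over $i$ selects the signal point closest to $\boldsymbol{\rho}$. The sketch below checks this for $N = 2$ and $M = 4$; the constellation, the value of `sigma` and the received point are hypothetical.

```python
import math

def awgn_likelihood(rho, s, sigma):
    """Conditional PDF p_{r|s}(rho | s_i) for N-dimensional AWGN, variance sigma^2 per component."""
    n_dim = len(rho)
    dist2 = sum((a - b) ** 2 for a, b in zip(rho, s))
    return math.exp(-dist2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma) ** n_dim

# Hypothetical constellation with N = 2 dimensions and M = 4 signal points
constellation = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
sigma = 0.7                  # hypothetical noise standard deviation
rho = (0.2, -0.9)            # hypothetical received vector

# Maximum likelihood decision: largest conditional PDF
ml_index = max(range(len(constellation)),
               key=lambda i: awgn_likelihood(rho, constellation[i], sigma))

# Minimum distance decision: smallest Euclidean distance to rho
nearest_index = min(range(len(constellation)),
                    key=lambda i: math.dist(rho, constellation[i]))

print(ml_index, nearest_index)
```

Both rules select the same index, here the signal point $(1, -1)$.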

Two-dimensional Gaussian PDF

$\text{Example 3:}$ Shown on the right is the two-dimensional Gaussian PDF $p_{\boldsymbol{ n} } (\boldsymbol{ \eta})$ of the 2D random variable $\boldsymbol{ n} = (n_1,\hspace{0.05cm}n_2)$. Arbitrary realizations of the random variable $\boldsymbol{ n}$ are denoted by $\boldsymbol{ \eta} = (\eta_1,\hspace{0.05cm}\eta_2)$.

The equation of the bell curve shown is:

$$p_{n_1, n_2}(\eta_1, \eta_2) = \frac{1}{\left( \sqrt{2\pi} \cdot \sigma_n \right)^2 } \cdot {\rm exp} \left [ - \frac{ \eta_1^2 + \eta_2^2}{2 \sigma_n^2}\right ]\hspace{0.05cm}. $$

The maximum of this function is at $\eta_1 = \eta_2 = 0$ and has the value $1/(2\pi \cdot \sigma_n^2)$.
With $\sigma_n^2 = N_0/2$, the 2D PDF can also be written in vector form as follows:

$$p_{\boldsymbol{ n} }(\boldsymbol{ \eta}) = \frac{1}{\pi \cdot N_0 } \cdot {\rm exp} \left [ - \frac{\vert \vert \boldsymbol{ \eta} \vert \vert ^2}{N_0}\right ]\hspace{0.05cm}.$$

This rotationally symmetric PDF is suitable, for example, for describing and investigating a two-dimensional modulation method such as M–QAM, M–PSK or 2–FSK.
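The two notations of the 2D PDF, its peak value and its rotational symmetry can be verified with a short numerical check. The value of `sigma_n` and the evaluation points are hypothetical; the function names are introduced only for this sketch.

```python
import math

def pdf_sigma(eta1, eta2, sigma_n):
    """2D Gaussian PDF written with the variance sigma_n^2 per component."""
    return math.exp(-(eta1**2 + eta2**2) / (2 * sigma_n**2)) / (2 * math.pi * sigma_n**2)

def pdf_n0(eta1, eta2, n0):
    """The same PDF written with N0 = 2 * sigma_n^2."""
    return math.exp(-(eta1**2 + eta2**2) / n0) / (math.pi * n0)

sigma_n = 0.5                # hypothetical standard deviation per component
n0 = 2 * sigma_n**2

points = [(0.0, 0.0), (0.3, -0.4), (1.0, 2.0)]
values = [(pdf_sigma(a, b, sigma_n), pdf_n0(a, b, n0)) for a, b in points]

peak = pdf_sigma(0.0, 0.0, sigma_n)          # value at eta_1 = eta_2 = 0
on_circle_a = pdf_sigma(0.3, -0.4, sigma_n)  # point with radius 0.5
on_circle_b = pdf_sigma(0.5, 0.0, sigma_n)   # another point with radius 0.5

print(values, peak, on_circle_a, on_circle_b)
```

The peak equals $1/(2\pi \cdot \sigma_n^2) = 1/(\pi \cdot N_0)$, and points with the same distance from the origin yield the same PDF value.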

Often, however, two-dimensional real random variables are also represented as one-dimensional complex variables, usually in the form $n(t) = n_{\rm I}(t) + {\rm j} \cdot n_{\rm Q}(t)$. The two components are then called the in-phase component $n_{\rm I}(t)$ and the quadrature component $n_{\rm Q}(t)$ of the noise.

The probability density function depends only on the magnitude $\vert n(t) \vert$ of the noise variable and not on the angle ${\rm arc} \ n(t)$. This means: complex noise is circularly symmetric (see graph).

Circularly symmetric also means that the in-phase component $n_{\rm I}(t)$ and the quadrature component $n_{\rm Q}(t)$ have the same distribution and thus also the same variance (and standard deviation):

$$ {\rm E} \big [ n_{\rm I}^2(t)\big ] = {\rm E}\big [ n_{\rm Q}^2(t) \big ] = \sigma_n^2 \hspace{0.05cm},\hspace{1cm}{\rm E}\big [ n(t) \cdot n^*(t) \big ]\hspace{0.1cm} = \hspace{0.1cm} {\rm E}\big [ n_{\rm I}^2(t) \big ] + {\rm E}\big [ n_{\rm Q}^2(t)\big ] = 2\sigma_n^2 \hspace{0.05cm}.$$
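The variance relation above can be illustrated with a small simulation. This is only a sketch under assumptions: the helper `complex_noise_samples`, the value $\sigma_n = 0.7$, the sample count and the fixed seed are hypothetical choices, and the two components are drawn independently.

```python
import random

def complex_noise_samples(sigma_n, count, seed=0):
    """Draw complex noise n = n_I + j*n_Q with independent N(0, sigma_n^2) components."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    return [complex(rng.gauss(0.0, sigma_n), rng.gauss(0.0, sigma_n))
            for _ in range(count)]

sigma_n = 0.7  # assumed standard deviation per component (illustrative value)
samples = complex_noise_samples(sigma_n, 100_000)

# Sample estimates of E[n_I^2], E[n_Q^2] and E[n * n^*]
power_i = sum(z.real ** 2 for z in samples) / len(samples)
power_q = sum(z.imag ** 2 for z in samples) / len(samples)
power_total = sum(abs(z) ** 2 for z in samples) / len(samples)

print(power_i, power_q, power_total)
```

The first two estimates should lie close to $\sigma_n^2$ and the third close to $2\sigma_n^2$, up to the statistical fluctuation of a finite sample.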

Finally, some notation variants for Gaussian random variables:

$$x ={\cal N}(\mu, \sigma^2) \hspace{-0.1cm}: \hspace{0.3cm}\text{real Gaussian distributed random variable, with mean}\hspace{0.1cm}\mu \text { and variance}\hspace{0.15cm}\sigma^2 \hspace{0.05cm},$$

$$y={\cal CN}(\mu, \sigma^2)\hspace{-0.1cm}: \hspace{0.12cm}\text{complex Gaussian distributed random variable} \hspace{0.05cm}.$$

Exercises for the chapter

Exercise 4.4: Maximum–a–posteriori and Maximum–Likelihood

Exercise 4.5: Theorem of Irrelevance

@@ Line 23: / Line 23: @@
 :$$r(t) = s(t) \star h(t) + n(t) \hspace{0.05cm}.$$
-*Aufgabe des (optimalen) Empfängers ist es, anhand seines Eingangssignals&nbsp; $r(t)$&nbsp; herauszufinden, welche der&nbsp; $M$&nbsp; möglichen Nachrichten&nbsp; $m_i$&nbsp; &ndash; bzw. welches der Signale&nbsp; $s_i(t)$&nbsp; &ndash; gesendet wurde. Der vom Empfänger gefundene Schätzwert für&nbsp; $m$&nbsp; wird durch ein Zirkumflex (französisch: ''Circonflexe'') gekennzeichnet &nbsp; &rArr; &nbsp;  $\hat{m}$.
+*The task of the (optimal) receiver is to find out, on the basis of its input signal&nbsp; $r(t)$,&nbsp; which of the&nbsp; $M$&nbsp; possible messages&nbsp; $m_i$&nbsp; &ndash; or which of the signals&nbsp; $s_i(t)$&nbsp; &ndash; was sent. The estimated value for&nbsp; $m$&nbsp; found by the receiver is characterized by a circumflex (French: ''Circonflexe'') &nbsp; &rArr; &nbsp;  $\hat{m}$.
 {{BlaueBox|TEXT=
-$\text{Definition:}$&nbsp; Man spricht von einem '''optimalen Empfänger''', wenn die Symbolfehlerwahrscheinlichkeit den für die Randbedingungen kleinstmöglichsten Wert annimmt:
+$\text{Definition:}$&nbsp; One speaks of an '''optimal receiver''' if the symbol error probability assumes the smallest possible value for the boundary conditions:
-:$$p_{\rm S} = {\rm Pr}  ({\cal E}) = {\rm Pr} ( \hat{m} \ne m) \hspace{0.15cm} \Rightarrow \hspace{0.15cm}{\rm Minimum}   \hspace{0.05cm}.$$}}
+:$$p_{\rm S} = {\rm Pr}  ({\cal E}) = {\rm Pr} ( \hat{m} \ne m) \hspace{0.15cm} \Rightarrow \hspace{0.15cm}{\rm minimum}   \hspace{0.05cm}.$$}}
-<i>Hinweise:</i>
+<i>Notes:</i>
-#Im Folgenden wird meist der AWGN&ndash;Ansatz &nbsp; &rArr; &nbsp;  $r(t) =  s(t) + n(t)$&nbsp; vorausgesetzt, was bedeutet, dass &nbsp;$h(t) =  \delta(t)$&nbsp; als verzerrungsfrei angenommen wird.
+#In the following, we mostly assume the AWGN approach &nbsp; &rArr; &nbsp;  $r(t) =  s(t) + n(t)$,&nbsp; which means that the channel is assumed to be distortion-free &nbsp; &rArr; &nbsp; $h(t) =  \delta(t)$.
-#Andernfalls können wir die Signale&nbsp; $s_i(t)$&nbsp; als &nbsp;${s_i}'(t) = s_i(t) \star h(t)$&nbsp; neu definieren, also die deterministischen Kanalverzerrungen dem Sendesignal beaufschlagen.<br>
+#Otherwise, we can redefine the signals&nbsp; $s_i(t)$&nbsp; as &nbsp;${s_i}'(t) = s_i(t) \star h(t)$,&nbsp; i.e., impose the deterministic channel distortions on the transmitted signal.<br>
-== Fundamentaler Ansatz zum optimalen Empfängerentwurf==
+== Fundamental approach to optimal receiver design==
 <br>
-Gegenüber dem auf der vorherigen Seite gezeigten&nbsp; [[Digitalsignal%C3%BCbertragung/Struktur_des_optimalen_Empf%C3%A4ngers#Blockschaltbild_und_Voraussetzungen| Blockschaltbild]]&nbsp; führen wir nun einige wesentliche Verallgemeinerungen durch:
+Compared to the&nbsp; [[Digital_Signal_Transmission/Structure_of_the_Optimal_Receiver#Block_diagram_and_prerequisites|"block diagram"]]&nbsp; shown on the previous page, we now perform some essential generalizations:
-*Der Übertragungskanal wird durch die&nbsp; [[Theory_of_Stochastic_Signals/Statistische_Abhängigkeit_und_Unabhängigkeit#Bedingte_Wahrscheinlichkeit|bedingte Wahrscheinlichkeitsdichtefunktion]]&nbsp; $p_{\hspace{0.02cm}r(t)\hspace{0.02cm} \vert \hspace{0.02cm}s(t)}$&nbsp; beschrieben, welche die Abhängigkeit des Empfangssignals&nbsp; $r(t)$&nbsp; vom Sendesignal&nbsp; $s(t)$&nbsp; festlegt.<br>
+*The transmission channel is described by the&nbsp; [[Theory_of_Stochastic_Signals/Statistical_Dependence_and_Independence#Conditional_Probability|"conditional probability density function"]]&nbsp; $p_{\hspace{0.02cm}r(t)\hspace{0.02cm} \vert \hspace{0.02cm}s(t)}$&nbsp; which determines the dependence of the received signal&nbsp; $r(t)$&nbsp; on the transmitted signal&nbsp; $s(t)$.&nbsp; <br>
-*Wurde nun ein ganz bestimmtes Signal&nbsp; $r(t) = \rho(t)$&nbsp; empfangen, so hat der Empfänger die Aufgabe, anhand dieser Signalrealisierung&nbsp; $\rho(t)$&nbsp; sowie der&nbsp; $M$&nbsp; bedingten Wahrscheinlichkeitsdichtefunktionen
+*If a certain signal&nbsp; $r(t) = \rho(t)$&nbsp; has been received, the receiver must, on the basis of this signal realization&nbsp; $\rho(t)$&nbsp; and the&nbsp; $M$&nbsp; conditional probability density functions
-:$$p_{\hspace{0.05cm}r(t) \hspace{0.05cm} \vert \hspace{0.05cm} s(t) } (\rho(t) \hspace{0.05cm} \vert \hspace{0.05cm} s_i(t))\hspace{0.2cm}{\rm mit}\hspace{0.2cm} i = 0, \text{...} \hspace{0.05cm}, M-1$$
+:$$p_{\hspace{0.05cm}r(t) \hspace{0.05cm} \vert \hspace{0.05cm} s(t) } (\rho(t) \hspace{0.05cm} \vert \hspace{0.05cm} s_i(t))\hspace{0.2cm}{\rm with}\hspace{0.2cm} i = 0, \text{...} \hspace{0.05cm}, M-1$$
-:unter Berücksichtigung aller möglichen Sendesignale&nbsp; $s_i(t)$&nbsp; und deren Auftrittswahrscheinlichkeiten&nbsp; ${\rm Pr}(m = m_i)$&nbsp; herauszufinden, welche der möglichen Nachrichten&nbsp; $m_i$&nbsp; bzw. welches der möglichen Signale&nbsp; $s_i(t)$&nbsp; am wahrscheinlichsten gesendet wurde.<br>
+:taking into account all possible transmitted signals&nbsp; $s_i(t)$&nbsp; and their probabilities of occurrence&nbsp; ${\rm Pr}(m = m_i)$,&nbsp; find out which of the possible messages&nbsp; $m_i$&nbsp; or which of the possible signals&nbsp; $s_i(t)$&nbsp; was most likely transmitted.<br>
-*Die Schätzung des optimalen Empfängers ist also ganz allgemein bestimmt durch die Gleichung
+*Thus, the estimate of the optimal receiver is determined in general by the equation
 :$$\hat{m} = {\rm arg} \max_i \hspace{0.1cm} p_{\hspace{0.02cm}s(t) \hspace{0.05cm} \vert \hspace{0.05cm} r(t) } (  s_i(t) \hspace{0.05cm} \vert \hspace{0.05cm} \rho(t)) = {\rm arg} \max_i \hspace{0.1cm} p_{m \hspace{0.05cm} \vert \hspace{0.05cm} r(t) } (  \hspace{0.05cm}m_i\hspace{0.05cm} \vert \hspace{0.05cm}\rho(t))\hspace{0.05cm},$$
-:wobei  berücksichtigt ist, dass die gesendete Nachricht&nbsp; $m = m_i$&nbsp; und das gesendete Signal&nbsp; $s(t) = s_i(t)$&nbsp; eineindeutig ineinander übergeführt werden können.<br>
+:where it is considered that the transmitted message&nbsp; $m = m_i$&nbsp; and the transmitted signal&nbsp; $s(t) = s_i(t)$&nbsp; can be uniquely transformed into each other.<br>
 {{BlaueBox|TEXT=
-$\text{In anderen Worten:}$&nbsp; Der optimale Empfänger betrachtet diejenige Nachricht&nbsp; $m_i$&nbsp; als die am wahrscheinlichsten gesendete, deren bedingte Wahrscheinlichkeitsdichtefunktion&nbsp; $p_{\hspace{0.02cm}m \hspace{0.05cm} \vert \hspace{0.05cm} r(t) }$&nbsp;  für das anliegende Empfangssignal&nbsp; $\rho(t)$&nbsp; sowie unter der Annahme&nbsp; $m =m_i$&nbsp; den größtmöglichen Wert annimmt.}}<br>
+$\text{In other words:}$&nbsp; The optimal receiver considers that message&nbsp; $m_i$&nbsp; to be the most likely transmitted one whose conditional probability density function&nbsp; $p_{\hspace{0.02cm}m \hspace{0.05cm} \vert \hspace{0.05cm} r(t) }$&nbsp; takes the largest possible value for the applied received signal&nbsp; $\rho(t)$&nbsp; and under the assumption&nbsp; $m =m_i$.&nbsp; }}<br>
-Bevor wir die obige Entscheidungsregel näher diskutieren, soll der optimale Empfänger entsprechend der Grafik noch in zwei Funktionsblöcke aufgeteilt werden:
+Before we discuss the above decision rule in more detail, the optimal receiver should still be divided into two functional blocks according to the diagram:
-[[File:P ID2001 Dig T 4 2 S2 version2.png|right|frame|Modell zur Herleitung des optimalen Empfängers|class=fit]]
+[[File:P ID2001 Dig T 4 2 S2 version2.png|right|frame|Model for deriving the optimal receiver|class=fit]]
-*Der &nbsp;'''Detektor'''&nbsp; nimmt am Empfangssignal&nbsp; $r(t)$&nbsp; verschiedene Messungen vor und fasst diese im Vektor &nbsp;$\boldsymbol{r}$&nbsp; zusammen. Bei &nbsp;$K$&nbsp; Messungen entspricht&nbsp; $\boldsymbol{r}$&nbsp; einem Punkt im &nbsp;$K$&ndash;dimensionalen Vektorraum.<br>
+*The &nbsp;'''detector'''&nbsp; takes various measurements on the received signal&nbsp; $r(t)$&nbsp; and summarizes them in the vector &nbsp;$\boldsymbol{r}$.&nbsp; With &nbsp;$K$&nbsp; measurements&nbsp; $\boldsymbol{r}$&nbsp; corresponds to a point in the &nbsp;$K$&ndash;dimensional vector space.<br>
-*Der &nbsp;'''Entscheider'''&nbsp; bildet abhängig von diesem Vektor den Schätzwert. Bei einem gegebenen Vektor&nbsp; $\boldsymbol{r} = \boldsymbol{\rho}$&nbsp; lautet dabei die Entscheidungsregel:
+*The &nbsp;'''decision device'''&nbsp; forms the estimated value depending on this vector. For a given vector&nbsp; $\boldsymbol{r} = \boldsymbol{\rho}$,&nbsp; the decision rule is:
 :$$\hat{m} = {\rm arg}\hspace{0.05cm} \max_i \hspace{0.1cm} P_{m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} } (  m_i\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{\rho}) \hspace{0.05cm}.$$
 <br clear=all>
-Im Gegensatz zur oberen Entscheidungsregel tritt nun eine bedingte Wahrscheinlichkeit&nbsp; $P_{m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} }$&nbsp; anstelle der bedingten Wahrscheinlichkeitskeitsdichtefunktion (WDF)&nbsp; $p_{m\hspace{0.05cm} \vert \hspace{0.05cm}r(t)}$&nbsp; auf. Beachten Sie bitte die Groß&ndash; bzw. Kleinschreibung für die unterschiedlichen Bedeutungen.
+In contrast to the upper decision rule, a conditional probability&nbsp; $P_{m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} }$&nbsp; now occurs instead of the conditional probability density function (PDF)&nbsp; $p_{m\hspace{0.05cm} \vert \hspace{0.05cm}r(t)}$.&nbsp; Please note the upper and lower case for the different meanings.
 <br clear=all>
 {{GraueBox|TEXT=
-$\text{Beispiel 1:}$&nbsp; Wir betrachten nun die Funktion&nbsp; $y =  {\rm arg}\hspace{0.05cm} \max \ p(x)$, wobei&nbsp; $p(x)$&nbsp; die Wahrscheinlichkeitsdichtefunktion (WDF) einer wertkontinuierlichen oder wertdiskreten Zufallsgröße&nbsp; $x$&nbsp; beschreibt. Im zweiten Fall (rechte Grafik) besteht die WDF aus einer Summe von Diracfunktionen mit den Wahrscheinlichkeiten als Impulsgewichte.<br>
+$\text{Example 1:}$&nbsp; We now consider the function&nbsp; $y =  {\rm arg}\hspace{0.05cm} \max \ p(x)$, where&nbsp; $p(x)$&nbsp; describes the probability density function (PDF) of a continuous-valued or discrete-valued random variable&nbsp; $x$.&nbsp; In the second case (right graph), the PDF consists of a sum of Dirac functions with the probabilities as pulse weights.<br>
-[[File:P ID2002 Dig T 4 2 S2b version1.png|righ|frame|Zur Verdeutlichung der Funktion „arg max”|class=fit]]
+[[File:P ID2002 Dig T 4 2 S2b version1.png|right|frame|Illustration of the "arg max" function|class=fit]]
-Die Grafik zeigt beispielhafte Funktionen. In beiden Fällen liegt das WDF&ndash;Maximum&nbsp; $(17)$&nbsp; bei&nbsp; $x = 6$:
+The graphic shows exemplary functions. In both cases the PDF maximum&nbsp; $(17)$&nbsp; is at&nbsp; $x = 6$:
 :$$\max_i \hspace{0.1cm} p(x) = 17\hspace{0.05cm},$$
 :$$y = {\rm \hspace{0.05cm}arg} \max_i \hspace{0.1cm} p(x) = 6\hspace{0.05cm}.$$
-Die (bedingten) Wahrscheinlichkeiten in der Gleichung
+The (conditional) probabilities in the equation
 :$$\hat{m} = {\rm arg}\hspace{0.05cm} \max_i \hspace{0.1cm} P_{\hspace{0.02cm}m\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{ r} } (  m_i \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho})$$
-sind  &nbsp;'''a&ndash;Posteriori&ndash;Wahrscheinlichkeiten'''.
+are &nbsp;'''a posteriori probabilities'''.
-Mit dem&nbsp; [[Theory_of_Stochastic_Signals/Statistische_Abh%C3%A4ngigkeit_und_Unabh%C3%A4ngigkeit#R.C3.BCckschlusswahrscheinlichkeit| Satz von Bayes]]&nbsp; kann hierfür geschrieben werden:
+Using&nbsp; [[Theory_of_Stochastic_Signals/Statistical_Dependence_and_Independence#Inference_probability|"Bayes' theorem"]],&nbsp; this can be written as:
 :$$P_{\hspace{0.02cm}m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} } (  m_i \hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{\rho}) =
 \frac{ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm}m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm}m_i )}{p_{\boldsymbol{ r} } (\boldsymbol{\rho})}
@@ Line 85: / Line 85: @@
-Der Nennerterm ist für alle Alternativen&nbsp; $m_i$&nbsp; gleich und muss für die Entscheidung nicht berücksichtigt werden. Damit erhält man folgende Regeln:
+The denominator term is the same for all alternatives&nbsp; $m_i$&nbsp; and need not be considered for the decision. This gives the following rules:
 {{BlaueBox|TEXT=
-$\text{Satz:}$&nbsp; Die Entscheidungsregel des optimalen Empfängers, auch bekannt als &nbsp;'''MAP&ndash;Empfänger'''&nbsp; (steht für ''Maximum&ndash;a&ndash;posteriori''), lautet:
+$\text{Theorem:}$&nbsp; The decision rule of the optimal receiver, also known as &nbsp;'''MAP receiver'''&nbsp; (stands for ''maximum&ndash;a&ndash;posteriori''), is:
 :$$\hat{m}_{\rm MAP} = {\rm \hspace{0.05cm} arg} \max_i \hspace{0.1cm} P_{\hspace{0.02cm}m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} } (  m_i \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}) = {\rm \hspace{0.05cm}arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm} m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm} m_i )\big ]\hspace{0.05cm}.$$
-Der Vorteil dieser Gleichung ist, dass die die Vorwärtsrichtung des Kanals beschreibende bedingte WDF&nbsp; $p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm} m }$&nbsp; ("Ausgang unter der Bedingung Eingang") verwendet werden kann. Dagegen verwendet die erste Gleichung die Rückschlusswahrscheinlichkeiten&nbsp; $P_{\hspace{0.05cm}m\hspace{0.05cm} \vert \hspace{0.02cm} \boldsymbol{ r} } $&nbsp;  ("Eingang unter der Bedingung Ausgang").}}
+The advantage of this equation is that the conditional PDF&nbsp; $p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm} m }$&nbsp; ("output under the condition input") describing the forward direction of the channel can be used. In contrast, the first equation uses the inference probabilities&nbsp; $P_{\hspace{0.05cm}m\hspace{0.05cm} \vert \hspace{0.02cm} \boldsymbol{ r} } $&nbsp;  ("input under the condition output").}}
 {{BlaueBox|TEXT=
-$\text{Satz:}$&nbsp; Ein &nbsp;'''Maximum&ndash;Likelihood&ndash;Empfänger'''&nbsp; (kurz ML&ndash;Empfänger) verwendet die Entscheidungsregel
+$\text{Theorem:}$&nbsp; A &nbsp;'''maximum likelihood receiver'''&nbsp; (ML receiver in short) uses the decision rule
 :$$\hat{m}_{\rm ML} = \hspace{-0.1cm} {\rm arg} \max_i \hspace{0.1cm}  p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm}m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm}m_i )\hspace{0.05cm}.$$
-Bei diesem werden die möglicherweise unterschiedlichen Auftrittswahrscheinlichkeiten&nbsp; ${\rm Pr}(m = m_i)$&nbsp; für den Entscheidungsprozess nicht herangezogen, zum Beispiel, weil sie dem Empfänger nicht bekannt sind.}}<br>
+In this case, the possibly different occurrence probabilities&nbsp; ${\rm Pr}(m = m_i)$&nbsp; are not used for the decision process, for example, because they are not known to the receiver.}}<br>
-Im früheren Kapitel&nbsp; [[Digital_Signal_Transmission/Optimale_Empfängerstrategien|Optimale Empfängerstrategien]]&nbsp; finden Sie andere Herleitungen für diese Empfängertypen.
+See the earlier chapter&nbsp; [[Digital_Signal_Transmission/Optimal_Receiver_Strategies|"Optimal Receiver Strategies"]]&nbsp; for other derivations for these receiver types.
 {{BlaueBox|TEXT=
-$\text{Fazit:}$&nbsp; Bei gleichwahrscheinlichen Nachrichten&nbsp; $\{m_i\}$   &nbsp; &#8658; &nbsp; ${\rm Pr}(m = m_i) = 1/M$&nbsp; ist der im Allgemeinen etwas schlechtere ML&ndash;Empfänger gleichwertig mit dem MAP&ndash;Empfänger:
+$\text{Conclusion:}$&nbsp; For equally likely messages&nbsp; $\{m_i\}$   &nbsp; &#8658; &nbsp; ${\rm Pr}(m = m_i) = 1/M$,&nbsp; the generally slightly worse ML receiver is equivalent to the MAP receiver:
 :$$\hat{m}_{\rm MAP} = \hat{m}_{\rm ML} =\hspace{-0.1cm} {\rm\hspace{0.05cm} arg} \max_i \hspace{0.1cm}
    p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm}m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm}m_i )\hspace{0.05cm}.$$}}
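The difference between the MAP rule and the ML rule can be illustrated with a small numerical sketch. Everything specific here is an assumption made only for illustration: a scalar Gaussian channel with $\sigma_n = 1$, the two signal values $\pm 1$, the unequal priors and the function names.

```python
import math

def gaussian_likelihood(rho, s, sigma_n):
    """Conditional PDF p(rho | s) of an assumed scalar Gaussian channel r = s + n."""
    return math.exp(-(rho - s) ** 2 / (2 * sigma_n ** 2)) / (math.sqrt(2 * math.pi) * sigma_n)

def decide_map(rho, signals, priors, sigma_n):
    """MAP rule: index i that maximizes Pr(m_i) * p(rho | m_i)."""
    scores = [p * gaussian_likelihood(rho, s, sigma_n) for s, p in zip(signals, priors)]
    return max(range(len(signals)), key=scores.__getitem__)

def decide_ml(rho, signals, sigma_n):
    """ML rule: index i that maximizes p(rho | m_i); the priors are ignored."""
    scores = [gaussian_likelihood(rho, s, sigma_n) for s in signals]
    return max(range(len(signals)), key=scores.__getitem__)

signals = [-1.0, +1.0]   # hypothetical signal values for m_0 and m_1
priors = [0.9, 0.1]      # hypothetical, strongly unequal symbol probabilities
sigma_n = 1.0
rho = 0.2                # received value, slightly closer to +1

m_hat_ml = decide_ml(rho, signals, sigma_n)
m_hat_map = decide_map(rho, signals, priors, sigma_n)
m_hat_map_equal = decide_map(rho, signals, [0.5, 0.5], sigma_n)

print(m_hat_ml, m_hat_map, m_hat_map_equal)
```

With these values the ML rule picks the nearer signal $m_1$, while the MAP rule picks $m_0$ because of its large prior; with equal priors both rules agree, as stated in the conclusion above.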
-== Das Theorem der Irrelevanz==
+== The theorem of irrelevance==
 <br>
-Zu beachten ist, dass der auf der letzten Seite beschriebene Empfänger nur dann optimal ist, wenn auch der Detektor bestmöglich implementiert ist, das heißt, wenn durch den Übergang vom kontinuierlichen Signal&nbsp; $r(t)$&nbsp; zum Vektor&nbsp; $\boldsymbol{r}$&nbsp; keine Information verloren geht.<br>
+Note that the receiver described in the last section is optimal only if the detector is implemented in the best possible way, i.e., if no information is lost by the transition from the continuous signal&nbsp; $r(t)$&nbsp; to the vector&nbsp; $\boldsymbol{r}$.&nbsp; <br>
-[[File:P ID2003 Dig T 4 2 S3a version2.png|center|frame|Zum Theorem der Irrelevanz|class=fit]]
+[[File:P ID2003 Dig T 4 2 S3a version2.png|center|frame|About the theorem of irrelevance|class=fit]]
-Um die Frage zu klären, welche und wieviele Messungen am Empfangssignal&nbsp; $r(t)$&nbsp; durchzuführen sind, um Optimalität zu garantieren, ist das <i>Theorem der Irrelevanz</i>&nbsp; hilfreich. Dazu betrachten wir den skizzierten Empfänger, dessen Detektor aus dem Empfangssignal&nbsp; $r(t)$&nbsp; die zwei Vektoren&nbsp; $\boldsymbol{r}_1$&nbsp; und&nbsp; $\boldsymbol{r}_2$&nbsp; ableitet und dem Entscheider zur Verfügung stellt. Diese Größen stehen mit der Nachricht&nbsp; $ m \in \{m_i\}$&nbsp; über die Verbundwahrscheinlichkeitsdichte&nbsp; $p_{\boldsymbol{ r}_1, \hspace{0.05cm}\boldsymbol{ r}_2\hspace{0.05cm} \vert \hspace{0.05cm}m }$&nbsp; in Zusammenhang.<br>
+To clarify the question of which and how many measurements have to be performed on the received signal&nbsp; $r(t)$&nbsp; to guarantee optimality, the <i>theorem of irrelevance</i>&nbsp; is helpful. For this purpose, we consider the sketched receiver whose detector derives the two vectors&nbsp; $\boldsymbol{r}_1$&nbsp; and&nbsp; $\boldsymbol{r}_2$&nbsp; from the received signal&nbsp; $r(t)$&nbsp; and makes them available to the decision device. These quantities are related to the message&nbsp; $ m \in \{m_i\}$&nbsp; via the joint probability density&nbsp; $p_{\boldsymbol{ r}_1, \hspace{0.05cm}\boldsymbol{ r}_2\hspace{0.05cm} \vert \hspace{0.05cm}m }$.&nbsp; <br>
-Die Entscheidungsregel des MAP&ndash;Empfängers lautet mit Anpassung an dieses Beispiel:
+The decision rule of the MAP receiver with adaptation to this example is:
 :$$\hat{m}_{\rm MAP} \hspace{-0.1cm}  =  \hspace{-0.1cm} {\rm arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1 , \hspace{0.05cm}\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1, \hspace{0.05cm}\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} m_i ) \big]=
@@ Line 126: / Line 126: @@
 \hspace{0.05cm}.$$
-Hierzu ist anzumerken:
+The following should be noted:
-*Die Vektoren&nbsp; $\boldsymbol{r}_1$&nbsp; und &nbsp;$\boldsymbol{r}_2$&nbsp; sind Zufallsgrößen. Ihre Realisierungen werden hier und im Folgenden mit&nbsp; $\boldsymbol{\rho}_1$&nbsp; und &nbsp;$\boldsymbol{\rho}_2$&nbsp; bezeichnet. Zur Hervorhebung sind alle Vektoren in der Grafik rot eingetragen.
+*The vectors&nbsp; $\boldsymbol{r}_1$&nbsp; and &nbsp;$\boldsymbol{r}_2$&nbsp; are random variables. Their realizations are denoted here and in the following by&nbsp; $\boldsymbol{\rho}_1$&nbsp; and &nbsp;$\boldsymbol{\rho}_2$.&nbsp; For emphasis, all vectors are shown in red in the graph.
-*Die Voraussetzungen für die Anwendung des "Theorems der Irrelevanz" sind die gleichen wie die an eine&nbsp; [[Theory_of_Stochastic_Signals/Markovketten#Betrachtetes_Szenario| Markovkette]]&nbsp; erster Ordnung. Die Zufallsvariablen&nbsp; $x$,&nbsp; $y$,&nbsp; $z$&nbsp; formen dann eine Markovkette erster Ordnung, falls die Verteilung von&nbsp; $z$&nbsp; bei gegebenem &nbsp;$y$&nbsp; unabhängig von&nbsp; $x$&nbsp; ist:
+*The requirements for the application of the "theorem of irrelevance" are the same as those for a first order&nbsp; [[Theory_of_Stochastic_Signals/Markov_Chains#Considered_scenario|"Markov chain"]].&nbsp; The random variables&nbsp; $x$,&nbsp; $y$,&nbsp; $z$&nbsp; form a first order Markov chain if the distribution of&nbsp; $z$&nbsp; is independent of&nbsp; $x$&nbsp; for a given&nbsp; $y$.&nbsp; For a first order Markov chain, the following holds:
-:$$p(x, y, z) = p(x) \cdot p(y\hspace{0.05cm} \vert \hspace{0.05cm}x) \cdot p(z\hspace{0.05cm} \vert \hspace{0.05cm}y) \hspace{0.25cm} {\rm anstelle \hspace{0.15cm}von} \hspace{0.25cm}p(x, y, z) = p(x) \cdot p(y\hspace{0.05cm} \vert \hspace{0.05cm}x) \cdot p(z\hspace{0.05cm} \vert \hspace{0.05cm}x, y) \hspace{0.05cm}.$$
+:$$p(x, y, z) = p(x) \cdot p(y\hspace{0.05cm} \vert \hspace{0.05cm}x) \cdot p(z\hspace{0.05cm} \vert \hspace{0.05cm}y) \hspace{0.25cm} {\rm instead \hspace{0.15cm}of} \hspace{0.25cm}p(x, y, z) = p(x) \cdot p(y\hspace{0.05cm} \vert \hspace{0.05cm}x) \cdot p(z\hspace{0.05cm} \vert \hspace{0.05cm}x, y) \hspace{0.05cm}.$$
-*Der optimale Empfänger muss im allgemeinen Fall beide Vektoren&nbsp; $\boldsymbol{r}_1$&nbsp; und&nbsp; $\boldsymbol{r}_2$&nbsp; auswerten, da in obiger Entscheidungsregel beide Verbundwahrscheinlichkeitsdichten&nbsp; $p_{\boldsymbol{ r}_1\hspace{0.05cm} \vert \hspace{0.05cm}m }$&nbsp;  und&nbsp; $p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{ r}_1, \hspace{0.05cm}m }$&nbsp;  auftreten.
+*In the general case, the optimal receiver must evaluate both vectors&nbsp; $\boldsymbol{r}_1$&nbsp; and&nbsp; $\boldsymbol{r}_2$,&nbsp; since both conditional probability densities&nbsp; $p_{\boldsymbol{ r}_1\hspace{0.05cm} \vert \hspace{0.05cm}m }$&nbsp; and&nbsp; $p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{ r}_1, \hspace{0.05cm}m }$&nbsp; occur in the above decision rule.
-*Dagegen kann der Empfänger ohne Informationseinbuße die zweite Messung vernachlässigen, falls&nbsp; $\boldsymbol{r}_2$&nbsp; bei gegebenem&nbsp; $\boldsymbol{r}_1$&nbsp; unabhängig von der Nachricht&nbsp; $m$&nbsp; ist:
+*In contrast, the receiver can neglect the second measurement without loss of information if&nbsp; $\boldsymbol{r}_2$&nbsp; is independent of message&nbsp; $m$&nbsp; for given&nbsp; $\boldsymbol{r}_1$:&nbsp;
 :$$p_{\boldsymbol{ r}_2\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i )=
 p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1  } \hspace{0.05cm} (\boldsymbol{\rho}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1  )
 \hspace{0.05cm}.$$
-*In diesem Fall lässt sich die Entscheidungsregel weiter vereinfachen:
+*In this case, the decision rule can be further simplified:
 :$$\hat{m}_{\rm MAP} =
 {\rm arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1  \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1
@@ Line 154: / Line 154: @@
 {{GraueBox|TEXT=
-$\text{Beispiel 2:}$&nbsp; Wir betrachten zur Verdeutlichung des soeben vorgestellten Theorems der Irrelevanz zwei verschiedene Systemkonfigurationen mit jeweils zwei Rauschtermen&nbsp; $\boldsymbol{ n}_1$&nbsp; und&nbsp; $\boldsymbol{ n}_2$. In der Grafik sind alle vektoriellen Größen rot eingezeichnet. Die Größen&nbsp; $\boldsymbol{s}$,&nbsp; $\boldsymbol{ n}_1$&nbsp; und &nbsp;$\boldsymbol{ n}_2$&nbsp; seien zudem jeweils unabhängig voneinander.<br>
+$\text{Example 2:}$&nbsp; We consider two different system configurations with two noise terms&nbsp; $\boldsymbol{ n}_1$&nbsp; and&nbsp; $\boldsymbol{ n}_2$ each to illustrate the theorem of irrelevance just presented. In the diagram all vectorial quantities are drawn in red. Moreover, the quantities&nbsp; $\boldsymbol{s}$,&nbsp; $\boldsymbol{ n}_1$&nbsp; and &nbsp;$\boldsymbol{ n}_2$&nbsp; are independent of each other.<br>
-[[File:Dig_T_4_2_S3b_version2.png|center|frame|Zwei Beispiele zum Theorem der Irrelevanz|class=fit]]
+[[File:Dig_T_4_2_S3b_version2.png|center|frame|Two examples of the theorem of irrelevance|class=fit]]
-Die Analyse dieser beiden Anordnungen liefert folgende Ergebnisse:
+The analysis of these two arrangements yields the following results:
-*Der Entscheider muss in beiden Fällen die Komponente&nbsp; $\boldsymbol{ r}_1= \boldsymbol{ s}_i + \boldsymbol{ n}_1$&nbsp; berücksichtigen, da nur diese die Information über das Nutzsignal&nbsp; $\boldsymbol{ s}_i$&nbsp; und damit über die gesendete Nachricht&nbsp; $m_i$&nbsp; liefert.<br>
+*In both cases, the decision device must consider the component&nbsp; $\boldsymbol{ r}_1= \boldsymbol{ s}_i + \boldsymbol{ n}_1$,&nbsp; since only this component provides the information about the useful signal&nbsp; $\boldsymbol{ s}_i$&nbsp; and thus about the transmitted message&nbsp; $m_i$.&nbsp; <br>
-*Bei der oberen Konfiguration enthält&nbsp; $\boldsymbol{ r}_2$&nbsp; keine Information über&nbsp; $m_i$, die nicht bereits von &nbsp;$\boldsymbol{ r}_1$&nbsp; geliefert wurde. Vielmehr ist&nbsp; $\boldsymbol{ r}_2= \boldsymbol{ r}_1 + \boldsymbol{ n}_2$&nbsp; nur eine verrauschte Version von&nbsp; $\boldsymbol{ r}_1$&nbsp; und hängt nur vom Rauschen&nbsp; $\boldsymbol{ n}_2$&nbsp; ab, sobald&nbsp; $\boldsymbol{ r}_1$&nbsp; bekannt ist &nbsp; &#8658; &nbsp; $\boldsymbol{ r}_2$&nbsp; ist irrelevant:
+*In the upper configuration, &nbsp; $\boldsymbol{ r}_2$&nbsp; contains no information about&nbsp; $m_i$ that has not already been provided by &nbsp;$\boldsymbol{ r}_1$.&nbsp; Rather, &nbsp; $\boldsymbol{ r}_2= \boldsymbol{ r}_1 + \boldsymbol{ n}_2$&nbsp; is just a noisy version of&nbsp; $\boldsymbol{ r}_1$&nbsp; and depends only on the noise&nbsp; $\boldsymbol{ n}_2$&nbsp; once&nbsp; $\boldsymbol{ r}_1$&nbsp; is known &nbsp; &#8658; &nbsp; $\boldsymbol{ r}_2$&nbsp; is irrelevant:
 :$$p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i )=
 p_{\boldsymbol{ r}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1  } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{\rho}_1  )=
 p_{\boldsymbol{ n}_2  } \hspace{0.05cm} (\boldsymbol{\rho}_2 - \boldsymbol{\rho}_1  )\hspace{0.05cm}.$$
-*Bei der unteren Konfiguration ist dagegen $\boldsymbol{ r}_2= \boldsymbol{ n}_1 + \boldsymbol{ n}_2$ für den Empfänger hilfreich, da ihm so ein Schätzwert für den Rauschterm $\boldsymbol{ n}_1$ geliefert wird &nbsp; &#8658; &nbsp; $\boldsymbol{ r}_2$ sollte deshalb hier nicht verworfen werden. Formal lässt sich dieses Resultat wie folgt ausdrücken:
+*In the lower configuration, on the other hand, $\boldsymbol{ r}_2= \boldsymbol{ n}_1 + \boldsymbol{ n}_2$ is helpful to the receiver, since it provides it with an estimate of the noise term $\boldsymbol{ n}_1$ &nbsp; &#8658; &nbsp; $\boldsymbol{ r}_2$ should therefore not be discarded here. Formally, this result can be expressed as follows:
 :$$p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm}  \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i ) = p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm}  \boldsymbol{ n}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1  - \boldsymbol{s}_i, \hspace{0.05cm}m_i)= p_{\boldsymbol{ n}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ n}_1 , \hspace{0.05cm} m  } \hspace{0.05cm} (\boldsymbol{\rho}_2- \boldsymbol{\rho}_1  + \boldsymbol{s}_i \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1  - \boldsymbol{s}_i, \hspace{0.05cm}m_i) = p_{\boldsymbol{ n}_2  } \hspace{0.05cm} (\boldsymbol{\rho}_2- \boldsymbol{\rho}_1  + \boldsymbol{s}_i )
 \hspace{0.05cm}.$$
-*Da nun im Argument dieser Funktion die Nachricht $\boldsymbol{ s}_i$ erscheint, ist $\boldsymbol{ r}_2$ "nicht irrelevant", sondern durchaus relevant.}}<br>
+*Since the signal vector $\boldsymbol{ s}_i$, and hence the message $m_i$, now appears in the argument of this function, $\boldsymbol{ r}_2$ is "not irrelevant" but quite relevant.}}<br>
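The two results of this example can be checked numerically. The sketch below rests on assumptions not made in the text: all quantities are scalars, $\boldsymbol{n}_2$ is zero-mean Gaussian with $\sigma = 1$, and the signal values, observed values and function names are hypothetical.

```python
import math

def gauss_pdf(x, sigma):
    """Zero-mean Gaussian PDF, used here as the assumed PDF of the noise n_2."""
    return math.exp(-x ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def cond_pdf_upper(rho2, rho1, s_i, sigma2):
    """Upper configuration r_2 = r_1 + n_2: p(rho2 | rho1, m_i) = p_n2(rho2 - rho1)."""
    return gauss_pdf(rho2 - rho1, sigma2)          # s_i does not enter

def cond_pdf_lower(rho2, rho1, s_i, sigma2):
    """Lower configuration r_2 = n_1 + n_2: p(rho2 | rho1, m_i) = p_n2(rho2 - rho1 + s_i)."""
    return gauss_pdf(rho2 - rho1 + s_i, sigma2)    # s_i enters the argument

signals = [-1.0, +1.0]    # hypothetical scalar signal values
rho1, rho2 = 0.3, 0.5     # hypothetical observed values

upper = [cond_pdf_upper(rho2, rho1, s, 1.0) for s in signals]
lower = [cond_pdf_lower(rho2, rho1, s, 1.0) for s in signals]

print(upper, lower)
```

In the upper configuration the conditional PDF is identical for all $\boldsymbol{s}_i$, so $\boldsymbol{r}_2$ cannot influence the arg max. In the lower configuration the values differ, so $\boldsymbol{r}_2$ contributes to the decision.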
-== Einige Eigenschaften des AWGN-Kanals==
+== Some properties of the AWGN channel==
 <br>
-Um weitere Aussagen über die Art der optimalen Messungen des Vektors&nbsp; $\boldsymbol{ r}$&nbsp; machen zu können, ist es notwendig, die den Kanal charakterisierende (bedingte) Wahrscheinlichkeitsdichtefunktion&nbsp; $p_{\hspace{0.02cm}r(t)\hspace{0.05cm} \vert \hspace{0.05cm}s(t)}$&nbsp; weiter zu spezifizieren. Im Folgenden wird die Kommunikation über den&nbsp; [[Modulationsverfahren/Qualit%C3%A4tskriterien#Einige_Anmerkungen_zum_AWGN.E2.80.93Kanalmodell| AWGN&ndash;Kanal]]&nbsp; betrachtet, dessen wichtigste Eigenschaften hier nochmals kurz zusammengestellt werden:
+In order to make further statements about the nature of the optimal measurements of the vector&nbsp; $\boldsymbol{ r}$,&nbsp; it is necessary to further specify the (conditional) probability density function&nbsp; $p_{\hspace{0.02cm}r(t)\hspace{0.05cm} \vert \hspace{0.05cm}s(t)}$&nbsp; characterizing the channel. In the following, we will consider communication over the&nbsp; [[Modulation_Methods/Quality_Criteria#Some_remarks_on_the_AWGN_channel_model| "AWGN channel"]],&nbsp; whose most important properties are briefly summarized again here:
-*Das Ausgangssignal des AWGN&ndash;Kanals ist&nbsp; $r(t) = s(t)+n(t)$, wobei&nbsp; $s(t)$&nbsp; das Sendesignal angibt und&nbsp; $n(t)$&nbsp; durch einen Gaußschen Rauschprozess dargestellt wird.<br>
+*The output signal of the AWGN channel is&nbsp; $r(t) = s(t)+n(t)$, where &nbsp; $s(t)$&nbsp; indicates the transmitted signal and&nbsp; $n(t)$&nbsp; is represented by a Gaussian noise process.<br>
-*Einen Zufallsprozess&nbsp; $\{n(t)\}$&nbsp; bezeichnet man als gaußisch, wenn die Elemente der&nbsp; $k$&ndash;dimensionalen Zufallsvariablen&nbsp; $\{n_1(t)\hspace{0.05cm} \text{...} \hspace{0.05cm}n_k(t)\}$&nbsp; gemeinsam gaußverteilt sind &nbsp; &rArr; &nbsp; <i>"Jointly Gaussian"</i>.<br>
+*A random process&nbsp; $\{n(t)\}$&nbsp; is said to be Gaussian if the elements of the&nbsp; $k$&ndash;dimensional random variable&nbsp; $\{n_1(t)\hspace{0.05cm} \text{...} \hspace{0.05cm}n_k(t)\}$&nbsp; are jointly Gaussian distributed &nbsp; &rArr; &nbsp; <i>"jointly Gaussian"</i>.<br>
-*Der Mittelwert des AWGN&ndash;Rauschens ist&nbsp; ${\rm E}\big[n(t)\big] = 0$. Außerdem ist&nbsp; $n(t)$&nbsp; "weiß", was bedeutet, dass das&nbsp; [[Theory_of_Stochastic_Signals/Leistungsdichtespektrum_(LDS)|Leistungsdichtespektrum]]&nbsp; (LDS) für alle Frequenzen &nbsp;$($von &nbsp;$-\infty$ bis $+\infty)$&nbsp; konstant ist: &nbsp;
+*The average value of the AWGN noise is&nbsp; ${\rm E}\big[n(t)\big] = 0$. Moreover,&nbsp; $n(t)$&nbsp; is "white", which means that the&nbsp; [[Theory_of_Stochastic_Signals/Power-Spectral_Density|"power-spectral density"]]&nbsp; (PSD) is constant for all frequencies &nbsp;$($from &nbsp;$-\infty$ to $+\infty)$:&nbsp; &nbsp;
 :$${\it \Phi}_n(f) = {N_0}/{2}
 \hspace{0.05cm}.$$
-*Nach dem&nbsp; [[Theory_of_Stochastic_Signals/Leistungsdichtespektrum_(LDS)#Theorem_von_Wiener-Chintchine |Wiener&ndash;Chintchine&ndash;Theorem]]&nbsp; ergibt sich die Autokorrelationsfunktion (AKF) als die&nbsp; [[Signal_Representation/Fourier_Transform_and_Its_Inverse#Das_zweite_Fourierintegral| Fourierrücktransformierte]]&nbsp; von&nbsp; ${\it \Phi_n(f)}$:
+*According to the&nbsp; [[Theory_of_Stochastic_Signals/Power-Spectral_Density#Wiener-Khintchine_Theorem|"Wiener-Khintchine theorem"]],&nbsp; the auto-correlation function (ACF) is obtained as the&nbsp; [[Signal_Representation/Fourier_Transform_and_its_Inverse#The_second_Fourier_integral| "inverse Fourier transform"]]&nbsp; of&nbsp; ${\it \Phi_n(f)}$:
 :$${\varphi_n(\tau)} = {\rm E}\big [n(t) \cdot n(t+\tau)\big  ] = {N_0}/{2} \cdot \delta(\tau)\hspace{0.3cm}
 \Rightarrow \hspace{0.3cm} {\rm E}\big [n(t) \cdot n(t+\tau)\big  ]  =
 \left\{ \begin{array}{c} \rightarrow \infty \\
  = 0 \\  \end{array} \right.\quad
-  \begin{array}{*{1}c} {\rm f{\rm \ddot{u}r}}  \hspace{0.15cm} \tau = 0 \hspace{0.05cm},
+  \begin{array}{*{1}c} {\rm f{or}}  \hspace{0.15cm} \tau = 0 \hspace{0.05cm},
-\\  {\rm f{\rm \ddot{u}r}}  \hspace{0.15cm} \tau \ne 0 \hspace{0.05cm},\\ \end{array}$$
+\\  {\rm f{or}}  \hspace{0.15cm} \tau \ne 0 \hspace{0.05cm},\\ \end{array}$$
 *Here,&nbsp; $N_0$&nbsp; indicates the physical noise power density (defined only for &nbsp;$f \ge 0$). The constant PSD value&nbsp; $(N_0/2)$&nbsp; and the weight of the Dirac function in the ACF $($also &nbsp;$N_0/2)$&nbsp; result solely from the two-sided approach.<br><br>
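As a brief worked check of the two-sided convention (the one-sided bandwidth $B$ is an assumed symbol introduced only for this illustration): the noise power within an ideal band of width $B$ is the same in the two-sided and in the one-sided (physical) view:
:$$\int_{-B}^{+B} \frac{N_0}{2}\hspace{0.1cm}{\rm d}f = N_0 \cdot B = \int_{0}^{B} N_0 \hspace{0.1cm}{\rm d}f \hspace{0.05cm}.$$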