{{Header
|Untermenü=Generalized Description of Digital Modulation Methods
|Vorherige Seite=Signale, Basisfunktionen und Vektorräume
|Nächste Seite=Approximation der Fehlerwahrscheinlichkeit
}}
  
== Block diagram and prerequisites ==
 
<br>
In this chapter,&nbsp; the structure of the optimal receiver of a digital transmission system is derived in very general terms,&nbsp; whereby
*the modulation method and other system details are not further specified,<br>
*the basis functions and the signal space representation according to the chapter&nbsp; [[Digital_Signal_Transmission/Signals,_Basis_Functions_and_Vector_Spaces|"Signals, Basis Functions and Vector Spaces"]]&nbsp; are assumed.
  
[[File:EN_Dig_T_4_2_S1.png|right|frame|General block diagram of a communication system|class=fit]]
<br>
Regarding the above block diagram,&nbsp; the following is to be noted:
*The symbol set size of the source is&nbsp; $M$&nbsp; and the source symbol set is&nbsp; $\{m_i\}$&nbsp; with&nbsp; $i = 0$, ... , $M-1$.
  
*Let the corresponding source symbol probabilities&nbsp; ${\rm Pr}(m = m_i)$&nbsp; also be known to the receiver.<br>
  
*For the transmission,&nbsp; $M$&nbsp; different signal forms&nbsp; $s_i(t)$&nbsp; are available;&nbsp; the index again runs over&nbsp; $i = 0$, ... , $M-1$.
  
*There is a fixed relation between the messages&nbsp; $\{m_i\}$&nbsp; and the signals&nbsp; $\{s_i(t)\}$:&nbsp; If&nbsp; $m = m_i$,&nbsp; the transmitted signal is&nbsp; $s(t) = s_i(t)$.<br>
  
*Linear channel distortions are taken into account in the above diagram by the impulse response&nbsp; $h(t)$.&nbsp; In addition,&nbsp; some form of noise&nbsp; $n(t)$&nbsp; is effective.
  
*With these two effects interfering with the transmission,&nbsp; the signal&nbsp; $r(t)$&nbsp; arriving at the receiver can be written as:
:$$r(t) = s(t) \star h(t) + n(t) \hspace{0.05cm}.$$
  
*The task of the&nbsp; $($optimal$)$&nbsp; receiver is to find out on the basis of its input signal&nbsp; $r(t)$&nbsp; which of the&nbsp; $M$&nbsp; possible messages&nbsp; $m_i$&nbsp; &ndash; or which of the signals&nbsp; $s_i(t)$&nbsp; &ndash; was sent.&nbsp; The estimated value for&nbsp; $m$&nbsp; found by the receiver is marked by a&nbsp; "circumflex" &nbsp; &rArr; &nbsp; $\hat{m}$.
 
  
{{BlaueBox|TEXT= 
$\text{Definition:}$&nbsp; One speaks of an&nbsp; '''optimal receiver'''&nbsp; if the symbol error probability assumes the smallest possible value under the given boundary conditions:
:$$p_{\rm S} = {\rm Pr}  ({\cal E}) = {\rm Pr} ( \hat{m} \ne m) \hspace{0.15cm} \Rightarrow \hspace{0.15cm}{\rm minimum}  \hspace{0.05cm}.$$}}
  
<u>Notes:</u>
#In the following,&nbsp; we mostly assume the AWGN approach &nbsp; &rArr; &nbsp;  $r(t) = s(t) + n(t)$,&nbsp; which means that the channel with&nbsp; $h(t) = \delta(t)$&nbsp; is assumed to be distortion-free.
#Otherwise,&nbsp; we can redefine the signals&nbsp; $s_i(t)$&nbsp; as &nbsp; ${s_i}'(t) = s_i(t) \star h(t)$,&nbsp; i.e.,&nbsp; impose the deterministic channel distortions on the transmitted signal,&nbsp; as the sketch below illustrates.<br>

== Fundamental approach to optimal receiver design ==
<br>
Compared to the&nbsp; [[Digital_Signal_Transmission/Structure_of_the_Optimal_Receiver#Block_diagram_and_prerequisites|"block diagram"]]&nbsp; shown in the previous section,&nbsp; we now perform some essential generalizations:
[[File:EN_Dig_T_4_2_S2b.png|right|frame|Model for deriving the optimal receiver|class=fit]]
  
*The transmission channel is described by the&nbsp; [[Channel_Coding/Channel_Models_and_Decision_Structures#AWGN_channel_at_Binary_Input|"conditional probability density function"]]&nbsp; $p_{\hspace{0.02cm}r(t)\hspace{0.02cm} \vert \hspace{0.02cm}s(t)}$,&nbsp; which determines the dependence of the received signal&nbsp; $r(t)$&nbsp; on the transmitted signal&nbsp; $s(t)$.
  
*If a certain signal&nbsp; $r(t) = \rho(t)$&nbsp; has been received,&nbsp; the receiver has the task,&nbsp; based on this&nbsp; "signal realization"&nbsp; $\rho(t)$&nbsp; and the&nbsp; $M$&nbsp; conditional probability density functions
:$$p_{\hspace{0.05cm}r(t) \hspace{0.05cm} \vert \hspace{0.05cm} s(t) } (\rho(t) \hspace{0.05cm} \vert \hspace{0.05cm} s_i(t))\hspace{0.2cm}{\rm with}\hspace{0.2cm} i = 0, \text{...} \hspace{0.05cm}, M-1,$$
  
:to find out which message&nbsp; $\hat{m}$&nbsp; was most probably transmitted,&nbsp; taking into account all possible transmitted signals&nbsp; $s_i(t)$&nbsp; and their occurrence probabilities&nbsp; ${\rm Pr}(m = m_i)$.
  
*Thus,&nbsp; the estimate of the optimal receiver is determined in general by
:$$\hat{m} = {\rm arg} \max_i \hspace{0.1cm} p_{\hspace{0.02cm}s(t) \hspace{0.05cm} \vert \hspace{0.05cm} r(t) } (  s_i(t) \hspace{0.05cm} \vert \hspace{0.05cm} \rho(t)) = {\rm arg} \max_i \hspace{0.1cm} p_{m \hspace{0.05cm} \vert \hspace{0.05cm} r(t) } (  \hspace{0.05cm}m_i\hspace{0.05cm} \vert \hspace{0.05cm}\rho(t))\hspace{0.05cm}.$$
  
{{BlaueBox|TEXT= 
$\text{In other words:}$&nbsp; The optimal receiver considers as the most likely transmitted message&nbsp; $\hat{m} \in \{m_i\}$&nbsp; the one whose conditional probability density function&nbsp; $p_{\hspace{0.02cm}m \hspace{0.05cm} \vert \hspace{0.05cm} r(t) }$&nbsp; takes the largest possible value for the applied received signal&nbsp; $\rho(t)$&nbsp; and under the assumption&nbsp; $m =\hat{m}$.}}<br>
  
Before we discuss the above decision rule in more detail,&nbsp; the optimal receiver shall first be divided into two functional blocks according to the diagram:
*The&nbsp; '''detector'''&nbsp; takes various measurements on the received signal&nbsp; $r(t)$&nbsp; and summarizes them in the vector&nbsp; $\boldsymbol{r}$.&nbsp; With&nbsp; $K$&nbsp; measurements,&nbsp; $\boldsymbol{r}$&nbsp; corresponds to a point in the&nbsp; $K$&ndash;dimensional vector space.<br>
  
*The&nbsp; '''decision'''&nbsp; forms the estimated value based on this vector.&nbsp; For a given vector&nbsp; $\boldsymbol{r} = \boldsymbol{\rho}$,&nbsp; the decision rule is:
:$$\hat{m} = {\rm arg}\hspace{0.05cm} \max_i \hspace{0.1cm} P_{m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} } ( m_i\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{\rho}) \hspace{0.05cm}.$$
  
In contrast to the above decision rule,&nbsp; a conditional probability&nbsp; $P_{m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} }$&nbsp; now occurs instead of the conditional probability density function&nbsp; $\rm (PDF)$&nbsp; $p_{m\hspace{0.05cm} \vert \hspace{0.05cm}r(t)}$.&nbsp; Please note the upper and lower case notation for the different meanings.
  
{{GraueBox|TEXT= 
$\text{Example 1:}$&nbsp; We now consider the function&nbsp; $y =  {\rm arg}\hspace{0.05cm} \max \ p(x)$,&nbsp; where&nbsp; $p(x)$&nbsp; describes the probability density function&nbsp; $\rm (PDF)$&nbsp; of a continuous-valued or discrete-valued random variable&nbsp; $x$.&nbsp; In the second case&nbsp; (right graph),&nbsp; the PDF consists of a sum of Dirac delta functions with the probabilities as impulse weights.<br>
  
[[File:EN_Dig_T_4_2_S2c.png|right|frame|Illustration of the "arg max" function|class=fit]]

&rArr; &nbsp; The graph shows two exemplary functions.&nbsp; In both cases the PDF maximum&nbsp; $(17)$&nbsp; is at&nbsp; $x = 6$:
:$$\max \hspace{0.1cm} p(x) = 17\hspace{0.05cm},$$
:$$y = {\rm \hspace{0.05cm}arg} \max \hspace{0.1cm} p(x) = 6\hspace{0.05cm}.$$
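A two-line numerical check of this reading of the&nbsp; "arg max"&nbsp; notation&nbsp; (the weights besides the maximum&nbsp; $17$&nbsp; at&nbsp; $x = 6$&nbsp; are invented for illustration):
<pre>
import numpy as np

# Discrete "PDF" as in the right part of the graph: impulse weights over x = 0 ... 8
x = np.arange(9)
p = np.array([2, 4, 6, 9, 12, 16, 17, 8, 3])   # assumed weights, maximum 17 at x = 6

print("max  p(x) =", p.max())                  # -> 17  (the maximum value itself)
print("arg max p(x) =", x[np.argmax(p)])       # -> 6   (the position of the maximum)
</pre>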
  

&rArr; &nbsp; The&nbsp; (conditional)&nbsp; probabilities in the equation
:$$\hat{m} = {\rm arg}\hspace{0.05cm} \max_i \hspace{0.1cm} P_{\hspace{0.02cm}m\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{ r} } (  m_i \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho})$$
  
are&nbsp; '''a-posteriori probabilities'''.&nbsp; Using&nbsp; [[Theory_of_Stochastic_Signals/Statistical_Dependence_and_Independence#Inference_probability|"Bayes' theorem"]],&nbsp; they can be written as:
:$$P_{\hspace{0.02cm}m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} } ( m_i \hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{\rho}) =
\frac{ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm}m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm}m_i )}{p_{\boldsymbol{ r} } (\boldsymbol{\rho})}
\hspace{0.05cm}.$$}}
  

The denominator term&nbsp; $p_{\boldsymbol{ r} }(\boldsymbol{\rho})$&nbsp; is the same for all alternatives&nbsp; $m_i$&nbsp; and need not be considered for the decision.&nbsp; This gives the following rules:
  
{{BlaueBox|TEXT= 
$\text{Theorem:}$&nbsp; The decision rule of the optimal receiver,&nbsp; also known as the&nbsp; '''maximum–a–posteriori receiver'''&nbsp; $($in short:&nbsp; '''MAP receiver'''$)$,&nbsp; is:
:$$\hat{m}_{\rm MAP} = {\rm \hspace{0.05cm} arg} \max_i \hspace{0.1cm} P_{\hspace{0.02cm}m\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r} } (  m_i \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}) = {\rm \hspace{0.05cm}arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm} m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm} m_i )\big ]\hspace{0.05cm}.$$
*The advantage of this equation is that the conditional PDF&nbsp; $p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm} m }$&nbsp; $($"output under the condition input"$)$&nbsp; describing the forward direction of the channel can be used.
*In contrast,&nbsp; the first form uses the inference probabilities&nbsp; $P_{\hspace{0.05cm}m\hspace{0.05cm} \vert \hspace{0.02cm} \boldsymbol{ r} } $&nbsp; $($"input under the condition output"$)$.}}
  

{{BlaueBox|TEXT= 
$\text{Theorem:}$&nbsp; A&nbsp; '''maximum likelihood receiver'''&nbsp; $($in short:&nbsp; '''ML receiver'''$)$&nbsp; uses the following decision rule:
:$$\hat{m}_{\rm ML} =  {\rm arg} \max_i \hspace{0.1cm}  p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm}m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm}m_i )\hspace{0.05cm}.$$
  
*In this case,&nbsp; the possibly different occurrence probabilities&nbsp; ${\rm Pr}(m = m_i)$&nbsp; are not used for the decision process,&nbsp; for example because they are not known to the receiver.}}<br>

See the earlier chapter&nbsp; [[Digital_Signal_Transmission/Optimal_Receiver_Strategies|"Optimal Receiver Strategies"]]&nbsp; for other derivations of these receiver types.
  
{{BlaueBox|TEXT= 
$\text{Conclusion:}$&nbsp; For equally likely messages&nbsp; $\{m_i\}$ &nbsp; &#8658; &nbsp; ${\rm Pr}(m = m_i) = 1/M$,&nbsp; the generally slightly worse&nbsp; "maximum likelihood receiver"&nbsp; is equivalent to the&nbsp; "maximum–a–posteriori receiver":
:$$\hat{m}_{\rm MAP} = \hat{m}_{\rm ML} = {\rm\hspace{0.05cm} arg} \max_i \hspace{0.1cm}
  p_{\boldsymbol{ r}\hspace{0.05cm} \vert \hspace{0.05cm}m } (\boldsymbol{\rho}\hspace{0.05cm} \vert \hspace{0.05cm}m_i )\hspace{0.05cm}.$$}}
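The difference between the two rules can be made concrete with a scalar AWGN sketch;&nbsp; the signal points,&nbsp; the noise rms value and the occurrence probabilities below are assumptions for illustration only:
<pre>
import numpy as np

s     = np.array([+1.0, -1.0])       # M = 2 antipodal signal points (assumption)
sigma = 1.0                          # noise rms value (assumption)
Pr    = np.array([0.8, 0.2])         # unequal occurrence probabilities (assumption)

def likelihood(rho):                 # p_{r|m}(rho|m_i) for AWGN, up to a common constant
    return np.exp(-(rho - s)**2 / (2 * sigma**2))

rho     = -0.2                               # one received value
ml_est  = np.argmax(likelihood(rho))         # ML:  maximize p_{r|m} only
map_est = np.argmax(Pr * likelihood(rho))    # MAP: weight the likelihoods with Pr(m_i)
print("ML decision:  m_%d" % ml_est)         # -> m_1 (the closer signal point)
print("MAP decision: m_%d" % map_est)        # -> m_0 (the prior outweighs the likelihood)
</pre>
With equal occurrence probabilities&nbsp; $(0.5,\ 0.5)$&nbsp; both rules deliver the same estimate,&nbsp; in line with the above conclusion.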
  

== The irrelevance theorem ==
<br>
[[File:EN_Dig_T_4_2_S3a.png|right|frame|About the irrelevance theorem|class=fit]]
Note that the receiver described in the last section is optimal only if the detector is implemented in the best possible way,&nbsp; i.e.,&nbsp; if no information is lost by the transition from the continuous signal&nbsp; $r(t)$&nbsp; to the vector&nbsp; $\boldsymbol{r}$.<br>
  
To clarify the question which and how many measurements have to be performed on the received signal&nbsp; $r(t)$&nbsp; to guarantee optimality,&nbsp; the&nbsp; "irrelevance theorem"&nbsp; is helpful.
  
*For this purpose,&nbsp; we consider the sketched receiver whose detector derives the two vectors&nbsp; $\boldsymbol{r}_1$&nbsp; and&nbsp; $\boldsymbol{r}_2$&nbsp; from the received signal&nbsp; $r(t)$&nbsp; and makes them available to the decision.
*These quantities are related to the message&nbsp; $ m \in \{m_i\}$&nbsp; via the joint probability density function&nbsp; $p_{\boldsymbol{ r}_1, \hspace{0.05cm}\boldsymbol{ r}_2\hspace{0.05cm} \vert \hspace{0.05cm}m }$.<br>
  
*The decision rule of the MAP receiver,&nbsp; adapted to this example,&nbsp; is:
:$$\hat{m}_{\rm MAP} = {\rm arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1 , \hspace{0.05cm}\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1, \hspace{0.05cm}\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} m_i ) \big]=
{\rm arg} \max_i \hspace{0.1cm}\big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1  \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1
\hspace{0.05cm} \vert \hspace{0.05cm}m_i )
\cdot p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i )\big]
\hspace{0.05cm}.$$
  
Here it is to be noted:
*The vectors&nbsp; $\boldsymbol{r}_1$&nbsp; and&nbsp; $\boldsymbol{r}_2$&nbsp; are random variables.&nbsp; Their realizations are denoted here and in the following by&nbsp; $\boldsymbol{\rho}_1$&nbsp; and&nbsp; $\boldsymbol{\rho}_2$.&nbsp; For emphasis,&nbsp; all vectors are shown in red in the graph.
*The requirements for the application of the&nbsp; "irrelevance theorem"&nbsp; are the same as those for a first-order&nbsp; [[Theory_of_Stochastic_Signals/Markov_Chains#Considered_scenario|"Markov chain"]].&nbsp; The random variables&nbsp; $x$,&nbsp; $y$,&nbsp; $z$&nbsp; form a first-order Markov chain if the distribution of&nbsp; $z$&nbsp; given&nbsp; $y$&nbsp; is independent of&nbsp; $x$:
:$$p(x,\hspace{0.05cm} y,\hspace{0.05cm} z) = p(x) \cdot p(y\hspace{0.05cm} \vert \hspace{0.05cm}x) \cdot p(z\hspace{0.05cm} \vert \hspace{0.05cm}y) \hspace{0.75cm} {\rm instead \hspace{0.15cm}of} \hspace{0.75cm}p(x,\hspace{0.05cm} y,\hspace{0.05cm} z) = p(x) \cdot p(y\hspace{0.05cm} \vert \hspace{0.05cm}x) \cdot p(z\hspace{0.05cm} \vert \hspace{0.05cm}x,\hspace{0.05cm} y) \hspace{0.05cm}.$$
  
*In the general case,&nbsp; the optimal receiver must evaluate both vectors&nbsp; $\boldsymbol{r}_1$&nbsp; and&nbsp; $\boldsymbol{r}_2$,&nbsp; since both conditional probability densities&nbsp; $p_{\boldsymbol{ r}_1\hspace{0.05cm} \vert \hspace{0.05cm}m }$&nbsp; and&nbsp; $p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{ r}_1, \hspace{0.05cm}m }$&nbsp; occur in the above decision rule.&nbsp; In contrast,&nbsp; the receiver can neglect the second measurement without loss of information if&nbsp; $\boldsymbol{r}_2$&nbsp; is independent of the message&nbsp; $m$&nbsp; for given&nbsp; $\boldsymbol{r}_1$:
:$$p_{\boldsymbol{ r}_2\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i )=
p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1  } \hspace{0.05cm} (\boldsymbol{\rho}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1  )
\hspace{0.05cm}.$$
  
*In this case,&nbsp; the decision rule can be further simplified:
:$$\hat{m}_{\rm MAP} =
{\rm arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1  \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1
\hspace{0.05cm} \vert \hspace{0.05cm}m_i )
\cdot p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i ) \big]$$
:$$\Rightarrow \hspace{0.3cm}\hat{m}_{\rm MAP} =
{\rm arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1 \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1
\hspace{0.05cm} \vert \hspace{0.05cm}m_i )
\cdot p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1  )\big]$$
:$$\Rightarrow \hspace{0.3cm}\hat{m}_{\rm MAP} =
{\rm arg} \max_i \hspace{0.1cm} \big [ {\rm Pr}( m_i) \cdot p_{\boldsymbol{ r}_1 \hspace{0.05cm} \vert \hspace{0.05cm}m } \hspace{0.05cm} (\boldsymbol{\rho}_1
\hspace{0.05cm} \vert \hspace{0.05cm}m_i )
\big]\hspace{0.05cm}.$$
  
{{GraueBox|TEXT=
[[File:EN_Dig_T_4_2_S3b.png|right|frame|Two examples of the irrelevance theorem|class=fit]] 
$\text{Example 2:}$&nbsp; To illustrate the irrelevance theorem just presented,&nbsp; we consider two different system configurations,&nbsp; each with two noise terms&nbsp; $\boldsymbol{ n}_1$&nbsp; and&nbsp; $\boldsymbol{ n}_2$.
*In the diagram all vectorial quantities are shown in red.
*Moreover,&nbsp; the quantities&nbsp; $\boldsymbol{s}$,&nbsp; $\boldsymbol{ n}_1$&nbsp; and&nbsp; $\boldsymbol{ n}_2$&nbsp; are independent of each other.<br>

The analysis of these two arrangements yields the following results:
*In both cases,&nbsp; the decision must consider the component&nbsp; $\boldsymbol{ r}_1= \boldsymbol{ s}_i + \boldsymbol{ n}_1$,&nbsp; since only this component provides the information about the possible transmitted signals&nbsp; $\boldsymbol{ s}_i$&nbsp; and thus about the message&nbsp; $m_i$.<br>
  
*In the upper configuration,&nbsp; $\boldsymbol{ r}_2$&nbsp; contains no information about&nbsp; $m_i$&nbsp; that has not already been provided by&nbsp; $\boldsymbol{ r}_1$.&nbsp; Rather,&nbsp; $\boldsymbol{ r}_2= \boldsymbol{ r}_1 + \boldsymbol{ n}_2$&nbsp; is just a noisy version of&nbsp; $\boldsymbol{ r}_1$&nbsp; and depends only on the noise&nbsp; $\boldsymbol{ n}_2$&nbsp; once&nbsp; $\boldsymbol{ r}_1$&nbsp; is known &nbsp; &#8658; &nbsp; $\boldsymbol{ r}_2$&nbsp; '''is irrelevant''':
:$$p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i )=
p_{\boldsymbol{ r}_2\hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1  } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm}\boldsymbol{\rho}_1  )=
p_{\boldsymbol{ n}_2  } \hspace{0.05cm} (\boldsymbol{\rho}_2 - \boldsymbol{\rho}_1  )\hspace{0.05cm}.$$
  
*In the lower configuration,&nbsp; on the other hand,&nbsp; $\boldsymbol{ r}_2= \boldsymbol{ n}_1 + \boldsymbol{ n}_2$&nbsp; is helpful to the receiver,&nbsp; since it provides an estimate of the noise term&nbsp; $\boldsymbol{ n}_1$ &nbsp; &#8658; &nbsp; $\boldsymbol{ r}_2$&nbsp; should therefore not be discarded here.
*Formally,&nbsp; this result can be expressed as follows:
:$$p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ r}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2\hspace{0.05cm} \vert \hspace{0.05cm}  \boldsymbol{\rho}_1 , \hspace{0.05cm}m_i ) = p_{\boldsymbol{ r}_2 \hspace{0.05cm} \vert \hspace{0.05cm}  \boldsymbol{ n}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1  - \boldsymbol{s}_i, \hspace{0.05cm}m_i)= p_{\boldsymbol{ n}_2 \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{ n}_1 , \hspace{0.05cm} m } \hspace{0.05cm} (\boldsymbol{\rho}_2- \boldsymbol{\rho}_1  + \boldsymbol{s}_i \hspace{0.05cm} \vert \hspace{0.05cm} \boldsymbol{\rho}_1  - \boldsymbol{s}_i, \hspace{0.05cm}m_i) = p_{\boldsymbol{ n}_2  } \hspace{0.05cm} (\boldsymbol{\rho}_2- \boldsymbol{\rho}_1  + \boldsymbol{s}_i )
\hspace{0.05cm}.$$
  
*Since the possible transmitted signal&nbsp; $\boldsymbol{ s}_i$&nbsp; now appears in the argument of this function,&nbsp; $\boldsymbol{ r}_2$&nbsp; '''is not irrelevant,&nbsp; but quite relevant'''.}}<br>
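A Monte Carlo sketch&nbsp; (a scalar stand-in for the vector quantities;&nbsp; the noise rms values are assumptions)&nbsp; shows the consequence for the lower configuration,&nbsp; where&nbsp; $\boldsymbol{ r}_2$&nbsp; is relevant;&nbsp; in the upper configuration the decision&nbsp; ${\rm sign}(r_1)$&nbsp; would already be optimal:
<pre>
import numpy as np

rng = np.random.default_rng(0)
N  = 200_000
s  = rng.choice([-1.0, +1.0], N)       # equiprobable binary message (scalar example)
n1 = 0.8 * rng.standard_normal(N)      # noise term n_1, sigma_1 = 0.8 (assumption)
n2 = 0.8 * rng.standard_normal(N)      # noise term n_2, sigma_2 = 0.8 (assumption)
r1 = s + n1
r2 = n1 + n2                           # lower configuration: r_2 carries info about n_1

# MMSE estimate of n_1 from r_2 is a*r_2; sign(r_1 - a*r_2) is the optimal decision here
a = 0.8**2 / (0.8**2 + 0.8**2)
pe_r1_only  = np.mean(np.sign(r1) != s)
pe_combined = np.mean(np.sign(r1 - a * r2) != s)
print(f"error rate with r_1 only:    {pe_r1_only:.4f}")    # about Q(1/0.8) = 0.106
print(f"error rate with r_1 and r_2: {pe_combined:.4f}")   # clearly smaller, about 0.039
</pre>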
  

== Some properties of the AWGN channel ==
<br>
In order to make further statements about the nature of the optimal measurements of the vector&nbsp; $\boldsymbol{ r}$,&nbsp; it is necessary to further specify the&nbsp; (conditional)&nbsp; probability density function&nbsp; $p_{\hspace{0.02cm}r(t)\hspace{0.05cm} \vert \hspace{0.05cm}s(t)}$&nbsp; characterizing the channel.&nbsp; In the following,&nbsp; we will consider communication over the&nbsp; [[Modulation_Methods/Quality_Criteria#Some_remarks_on_the_AWGN_channel_model| "AWGN channel"]],&nbsp; whose most important properties are briefly summarized again here:
*The output signal of the AWGN channel is&nbsp; $r(t) = s(t)+n(t)$,&nbsp; where&nbsp; $s(t)$&nbsp; indicates the transmitted signal and&nbsp; $n(t)$&nbsp; is represented by a Gaussian noise process.<br>
  
*A random process&nbsp; $\{n(t)\}$&nbsp; is Gaussian if the elements of the&nbsp; $k$&ndash;dimensional random variable&nbsp; $\{n(t_1)\hspace{0.05cm} \text{...} \hspace{0.05cm}n(t_k)\}$&nbsp; are&nbsp; "jointly Gaussian".<br>
  
*The mean value of the AWGN noise is&nbsp; ${\rm E}\big[n(t)\big] = 0$.&nbsp; Moreover,&nbsp; $n(t)$&nbsp; is&nbsp; "white",&nbsp; which means that the&nbsp; [[Theory_of_Stochastic_Signals/Power-Spectral_Density|"power-spectral density"]]&nbsp; $\rm (PSD)$&nbsp; is constant for all frequencies&nbsp; $($from&nbsp; $-\infty$&nbsp; to&nbsp; $+\infty)$:
:$${\it \Phi}_n(f) = {N_0}/{2}
\hspace{0.05cm}.$$
  
*According to the&nbsp; [[Theory_of_Stochastic_Signals/Power-Spectral_Density#Wiener-Khintchine_Theorem|"Wiener-Khintchine theorem"]],&nbsp; the auto-correlation function&nbsp; $\rm (ACF)$&nbsp; is obtained as the inverse Fourier transform&nbsp; $($the&nbsp; [[Signal_Representation/Fourier_Transform_and_its_Inverse#The_second_Fourier_integral|"second Fourier integral"]]$)$&nbsp; of&nbsp; ${\it \Phi}_n(f)$:
:$${\varphi_n(\tau)} = {\rm E}\big [n(t) \cdot n(t+\tau)\big  ] = {N_0}/{2} \cdot \delta(\tau)\hspace{0.3cm}  
\Rightarrow \hspace{0.3cm} {\rm E}\big [n(t) \cdot n(t+\tau)\big  ]  =
\left\{ \begin{array}{c} \rightarrow \infty \\
0  \end{array} \right.\quad
\begin{array}{*{1}c} {\rm for}  \hspace{0.15cm} \tau = 0 \hspace{0.05cm},
\\  {\rm for}  \hspace{0.15cm} \tau \ne 0 \hspace{0.05cm}.\\ \end{array}$$
  
*Here,&nbsp; $N_0$&nbsp; denotes the physical noise power density&nbsp; $($defined only for&nbsp; $f \ge 0)$.&nbsp; The constant PSD value&nbsp; $(N_0/2)$&nbsp; and the weight of the Dirac delta function in the ACF&nbsp; $($also&nbsp; $N_0/2)$&nbsp; result solely from the two-sided representation.<br><br>

&rArr; &nbsp; More information on this topic is provided in part two of the&nbsp; (German language)&nbsp; learning video&nbsp; [[Der_AWGN-Kanal_(Lernvideo)|"The AWGN channel"]].<br>
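The delta-shaped ACF can be reproduced with discrete-time&nbsp; (i.e., inherently band-limited)&nbsp; white noise;&nbsp; the sampling rate and the value of&nbsp; $N_0$&nbsp; below are assumptions for illustration:
<pre>
import numpy as np

rng = np.random.default_rng(2)
N0, fs = 2.0, 1000.0                   # PSD value N0/2 = 1 and sampling rate (assumptions)
n = rng.standard_normal(100_000) * np.sqrt(N0 / 2 * fs)   # sampled noise: sigma^2 = N0/2*fs

# Estimate the ACF  E[n(t) * n(t + tau)]  for a few discrete lags tau = k/fs
for k in range(4):
    acf = np.mean(n[:len(n) - k] * n[k:])
    print(f"tau = {k/fs:.4f} s:  ACF estimate = {acf:9.1f}")
# k = 0 gives about N0/2 * fs = 1000 (growing with the bandwidth); all other lags are near 0
</pre>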
  

== Description of the AWGN channel by orthonormal basis functions ==
<br>
From the penultimate statement in the last section,&nbsp; we see that
*pure AWGN noise&nbsp; $n(t)$&nbsp; always has infinite variance&nbsp; (power): &nbsp; $\sigma_n^2 \to \infty$,<br>
*consequently,&nbsp; in reality only filtered noise&nbsp; $n\hspace{0.05cm}'(t) = n(t) \star h_n(t)$&nbsp; can occur.<br><br>
  
With the impulse response&nbsp; $h_n(t)$&nbsp; and the frequency response&nbsp; $H_n(f) = {\rm F}\big [h_n(t)\big ]$,&nbsp; the following equations hold:<br>
:$${\rm E}\big[n\hspace{0.05cm}'(t)  \big] = {\rm E}\big[n(t) \big] = 0 \hspace{0.05cm},$$
:$${\it \Phi}_{n\hspace{0.05cm}'}(f) =  {N_0}/{2} \cdot |H_{n}(f)|^2 \hspace{0.05cm},$$
:$$ {\varphi}_{n\hspace{0.05cm}'}(\tau) =  {N_0}/{2}\hspace{0.1cm} \cdot \big [h_{n}(\tau) \star h_{n}(-\tau)\big  ]\hspace{0.05cm},$$
:$$\sigma_n^2 = {  \varphi_{n\hspace{0.05cm}'}(\tau = 0)} =  {N_0}/{2} \cdot
\int_{-\infty}^{+\infty}h_n^2(t)\,{\rm d} t ={N_0}/{2}\hspace{0.1cm} \cdot < \hspace{-0.1cm}h_n(t), \hspace{0.1cm} h_n(t) \hspace{-0.05cm} > \hspace{0.1cm} $$
:$$\Rightarrow \hspace{0.3cm} \sigma_n^2  \hspace{0.1cm} = 
\int_{-\infty}^{+\infty}{\it \Phi}_{n\hspace{0.05cm}'}(f)\,{\rm d} f = {N_0}/{2} \cdot \int_{-\infty}^{+\infty}|H_n(f)|^2\,{\rm d} f \hspace{0.05cm}.$$
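The agreement of the time-domain and frequency-domain expressions for&nbsp; $\sigma_n^2$&nbsp; can be checked numerically;&nbsp; the exponential impulse response and the value of&nbsp; $N_0$&nbsp; are assumptions for illustration:
<pre>
import numpy as np

N0 = 2.0                                 # noise power density, so N0/2 = 1 (assumption)
dt = 1e-4
t  = np.arange(0, 1, dt)

T0 = 0.01                                # exemplary band-limiting filter h_n(t) (assumption)
h  = np.exp(-t / T0) / T0                # integral of h_n^2(t) dt = 1/(2*T0) analytically

sigma2_time = N0 / 2 * np.sum(h**2) * dt             # N0/2 * integral h_n^2(t) dt

H  = np.fft.fft(h) * dt                              # numerical spectrum H_n(f)
df = 1 / (len(t) * dt)
sigma2_freq = N0 / 2 * np.sum(np.abs(H)**2) * df     # N0/2 * integral |H_n(f)|^2 df

print(sigma2_time, sigma2_freq)          # both close to N0/2 * 1/(2*T0) = 50
</pre>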
  
In the following,&nbsp; $n(t)$&nbsp; always implicitly includes a band limitation;&nbsp; the notation&nbsp; $n\hspace{0.05cm}'(t)$&nbsp; will thus be omitted in the future.<br>
  
{{BlaueBox|TEXT=   
$\text{Please note:}$&nbsp; Similar to the transmitted signal&nbsp; $s(t)$,&nbsp; the noise process&nbsp; $\{n(t)\}$&nbsp; can be written as a weighted sum of orthonormal basis functions&nbsp; $\varphi_j(t)$.
*In contrast to&nbsp; $s(t)$,&nbsp; however,&nbsp; a restriction to a finite number of basis functions is not possible.
*Rather,&nbsp; for purely stochastic quantities,&nbsp; the following always holds for the corresponding signal representation:
:$$n(t) = \lim_{N \rightarrow \infty} \sum\limits_{j = 1}^{N}n_j \cdot \varphi_j(t) \hspace{0.05cm},$$
  
:where the coefficient&nbsp; $n_j$&nbsp; is determined by the projection of&nbsp; $n(t)$&nbsp; onto the basis function&nbsp; $\varphi_j(t)$:
:$$n_j = \hspace{0.1cm} < \hspace{-0.1cm}n(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} > \hspace{0.05cm}.$$}}
  

<u>Note:</u> &nbsp; To avoid confusion with the basis functions&nbsp; $\varphi_j(t)$,&nbsp; we will in the following express the auto-correlation function&nbsp; $\rm (ACF)$&nbsp; $\varphi_n(\tau)$&nbsp; of the noise process only as the expected value
:$${\rm E}\big [n(t) \cdot n(t + \tau)\big ] \equiv \varphi_n(\tau)  .$$ <br>
  
== Optimal receiver for the AWGN channel ==
<br>
[[File:EN_Dig_T_4_2_S5b.png|right|frame|Optimal receiver for the AWGN channel|class=fit]]
The received signal&nbsp; $r(t) = s(t) + n(t)$&nbsp; can also be decomposed into basis functions in the familiar way:
:$$r(t) =  \sum\limits_{j = 1}^{\infty}r_j \cdot \varphi_j(t) \hspace{0.05cm}.$$

The following must be taken into account:
*The&nbsp; $M$&nbsp; possible transmitted signals&nbsp; $\{s_i(t)\}$&nbsp; span a signal space with a total of&nbsp; $N$&nbsp; basis functions&nbsp; $\varphi_1(t)$, ... , $\varphi_N(t)$.<br>
*These&nbsp; $N$&nbsp; basis functions&nbsp; $\varphi_j(t)$&nbsp; are used simultaneously to describe the noise signal&nbsp; $n(t)$&nbsp; and the received signal&nbsp; $r(t)$.<br>
*For a complete characterization of&nbsp; $n(t)$&nbsp; or&nbsp; $r(t)$,&nbsp; however,&nbsp; an infinite number of further basis functions&nbsp; $\varphi_{N+1}(t)$,&nbsp; $\varphi_{N+2}(t)$,&nbsp; ... are needed.<br>
  
Thus,&nbsp; the coefficients of the received signal&nbsp; $r(t)$&nbsp; are obtained according to the following equation,&nbsp; taking into account that the signals&nbsp; $s_i(t)$&nbsp; and the noise&nbsp; $n(t)$&nbsp; are independent of each other:
:$$r_j \hspace{0.1cm}  =  \hspace{0.1cm} \hspace{0.1cm} < \hspace{-0.1cm}r(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} > \hspace{0.1cm}=\hspace{0.1cm}
  \left\{ \begin{array}{c} < \hspace{-0.1cm}s_i(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} > + < \hspace{-0.1cm}n(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} >  \hspace{0.1cm}= s_{ij}+ n_j\\ 
  < \hspace{-0.1cm}n(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} >  \hspace{0.1cm} = n_j \end{array} \right.\quad
  \begin{array}{*{1}c} {j = 1, 2, \hspace{0.05cm}\text{...}\hspace{0.05cm} \hspace{0.05cm}, N} \hspace{0.05cm},
\\  {j > N}  \hspace{0.05cm}.\\ \end{array}$$
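The relation&nbsp; $r_j = s_{ij} + n_j$&nbsp; for&nbsp; $j \le N$&nbsp; can be reproduced numerically;&nbsp; the two orthonormal basis functions on&nbsp; $0 \le t < 1$&nbsp; and all concrete values are assumptions for illustration:
<pre>
import numpy as np

rng = np.random.default_rng(3)
dt = 1e-4
t  = np.arange(0, 1, dt)
inner = lambda x, y: np.sum(x * y) * dt     # inner product <x(t), y(t)> on [0, 1)

# N = 2 orthonormal basis functions (check: inner(phi_i, phi_j) = delta_ij)
phi1 = np.ones_like(t)
phi2 = np.sqrt(2) * np.cos(2 * np.pi * t)

s_i = 0.7 * phi1 - 1.2 * phi2               # transmitted signal: s_i1 = 0.7, s_i2 = -1.2
n   = 3 * rng.standard_normal(len(t))       # broadband noise (discrete surrogate for AWGN)
r   = s_i + n                               # received signal

r1, n1 = inner(r, phi1), inner(n, phi1)
r2, n2 = inner(r, phi2), inner(n, phi2)
print(f"r_1 = {r1:+.4f},  s_i1 + n_1 = {0.7 + n1:+.4f}")   # identical by linearity
print(f"r_2 = {r2:+.4f},  s_i2 + n_2 = {-1.2 + n2:+.4f}")
</pre>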
 
 
Thus,&nbsp; the structure sketched above results for the optimal receiver.<br>
 
 

Let us first consider the&nbsp; '''AWGN channel'''.&nbsp; Here,&nbsp; the prefilter with the frequency response&nbsp; $W(f)$,&nbsp; which is intended for colored noise,&nbsp; can be dispensed with.<br>
  
#The detector of the optimal receiver forms the coefficients&nbsp; $r_j \hspace{0.1cm}  = \hspace{0.1cm} \hspace{0.1cm} < \hspace{-0.1cm}r(t), \hspace{0.1cm} \varphi_j(t)\hspace{-0.05cm} >$&nbsp; and passes them on to the decision.
#If the decision is based on all&nbsp; $($i.e.,&nbsp; infinitely many$)$&nbsp; coefficients&nbsp; $r_j$,&nbsp; the probability of a wrong decision is minimal and the receiver is optimal.<br>
#The real-valued coefficients&nbsp; $r_j$&nbsp; were calculated in the last section as follows:
::$$r_j =   
  \left\{ \begin{array}{c}  s_{ij} + n_j\\ 
   n_j \end{array} \right.\quad
  \begin{array}{*{1}c} {j = 1, 2, \hspace{0.05cm}\text{...}\hspace{0.05cm}, N} \hspace{0.05cm},
\\  {j > N}  \hspace{0.05cm}.\\ \end{array}$$
 
 

According to the&nbsp; [[Digital_Signal_Transmission/Structure_of_the_Optimal_Receiver#The_irrelevance_theorem|"irrelevance theorem"]],&nbsp; it can be shown that for additive white Gaussian noise
*the optimality is not lowered if the coefficients&nbsp; $r_{N+1}$,&nbsp; $r_{N+2}$,&nbsp; ... ,&nbsp; which do not depend on the message&nbsp; $(s_{ij})$,&nbsp; are not included in the decision process,&nbsp; and therefore<br>
*the detector has to form only the projections of the received signal&nbsp; $r(t)$&nbsp; onto the&nbsp; $N$&nbsp; basis functions&nbsp; $\varphi_{1}(t)$, ... , $\varphi_{N}(t)$&nbsp; given by the useful signal&nbsp; $s(t)$.
  
Im Fall von farbigem Rauschen &nbsp;&nbsp;&#8658;&nbsp;&nbsp; Leistungsdichtespektrum <i>&Phi;<sub>n</sub></i>(<i>f</i>) &ne; const. ist lediglich zusätzlich ein Vorfilter mit dem Amplitudengang
 
  
:<math>|W(f)| = \frac{1}{\sqrt{\it \Phi_n(f)}}</math>
+
In the graph this significant simplification is indicated by the gray background.<br>
  
In the case of&nbsp; '''colored noise''' &nbsp; &#8658; &nbsp; power-spectral density&nbsp; ${\it \Phi}_n(f) \ne {\rm const.}$,&nbsp; only an additional prefilter with the amplitude response&nbsp; $|W(f)| = {1}/{\sqrt{{\it \Phi}_n(f)}}$&nbsp; is required.
#This filter is called a&nbsp; "whitening filter",&nbsp; because the noise power-spectral density at its output is constant again &nbsp; &#8658; &nbsp; "white".
#More details can be found in the chapter&nbsp; [[Theory_of_Stochastic_Signals/Matched_Filter#Generalized_matched_filter_for_the_case_of_colored_interference|"Matched filter for colored interference"]]&nbsp; of the book&nbsp; "Theory of Stochastic Signals".<br>
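
The effect of the whitening filter can be illustrated with a short Python sketch.&nbsp; The colored-noise power-spectral density used here&nbsp; (a first-order low-pass shape)&nbsp; and all other parameters are arbitrary assumptions; the point is only that shaping white noise with&nbsp; $\sqrt{{\it \Phi}_n(f)}$&nbsp; and then filtering with&nbsp; $|W(f)| = 1/\sqrt{{\it \Phi}_n(f)}$&nbsp; restores a flat&nbsp; ("white")&nbsp; spectrum.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(seed=1)
K = 4096                                   # number of noise samples (assumption)
f = np.fft.rfftfreq(K, d=1.0)              # normalized frequency grid

# Assumed colored-noise PSD (first-order low-pass shape), purely for illustration
Phi_n = 1.0 / (1.0 + (f / 0.1)**2)

# Generate colored noise by shaping white Gaussian noise with sqrt(Phi_n(f))
white = rng.standard_normal(K)
colored = np.fft.irfft(np.fft.rfft(white) * np.sqrt(Phi_n), n=K)

# Whitening filter |W(f)| = 1/sqrt(Phi_n(f)): the output spectrum is flat again
whitened = np.fft.irfft(np.fft.rfft(colored) / np.sqrt(Phi_n), n=K)

print(np.allclose(white, whitened))        # True: the whitened noise is white again
</syntaxhighlight>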
  
== Implementation aspects ==
<br>
Essential components of the optimal receiver are the calculations of the inner products according to the equations&nbsp; $r_j \hspace{0.1cm}  =  \hspace{0.1cm} < \hspace{-0.1cm}r(t), \hspace{0.1cm} \varphi_j(t) \hspace{-0.05cm} >$.
{{BlaueBox|TEXT= 
$\text{These can be implemented in several ways:}$&nbsp;
*In the&nbsp; '''correlation receiver'''&nbsp; $($see the&nbsp; [[Digital_Signal_Transmission/Optimal_Receiver_Strategies#Correlation_receiver_with_unipolar_signaling|"chapter of the same name"]]&nbsp; for more details on this implementation$)$,&nbsp; the inner products are realized directly according to the definition with analog multipliers and integrators:
:$$r_j = \int_{-\infty}^{+\infty}r(t) \cdot \varphi_j(t) \,{\rm d} t \hspace{0.05cm}.$$
*The&nbsp; '''matched filter receiver''',&nbsp; already derived in the chapter&nbsp; [[Digital_Signal_Transmission/Error_Probability_for_Baseband_Transmission#Optimal_binary_receiver_.E2.80.93_.22Matched_Filter.22_realization|"Optimal Binary Receiver"]]&nbsp; at the beginning of this book,&nbsp; achieves the same result using a linear filter with the impulse response&nbsp; $h_j(t) = \varphi_j(T-t)$&nbsp; followed by sampling at time&nbsp; $t = T$:
:$$r_j(t) = \int_{-\infty}^{+\infty}r(\tau) \cdot h_j(t-\tau) \,{\rm d} \tau
= \int_{-\infty}^{+\infty}r(\tau) \cdot \varphi_j(T-t+\tau) \,{\rm d} \tau \hspace{0.3cm}
\Rightarrow \hspace{0.3cm} r_j (t = T) = \int_{-\infty}^{+\infty}r(\tau) \cdot \varphi_j(\tau) \,{\rm d} \tau = r_j
\hspace{0.05cm}.$$
[[File:EN_Dig_T_4_2_S6.png|left|frame|Three different implementations of the inner product|class=fit]]
<br><br><br>
The figure shows possible realizations <br>of the optimal detector.}}
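
The equivalence of the two implementations can also be verified numerically.&nbsp; In the following Python sketch the unit-energy ramp used as basis function, the signal coefficient and the noise level are arbitrary assumptions; correlation and matched filtering&nbsp; $($with sampling at&nbsp; $t = T)$&nbsp; yield the same coefficient&nbsp; $r_1$&nbsp; up to rounding.
<syntaxhighlight lang="python">
import numpy as np

fs, T = 1000, 1.0                          # assumed sampling rate and symbol duration
t = np.arange(0, T, 1/fs)
phi1 = t * np.sqrt(3 / T**3)               # assumed basis function: unit-energy ramp

rng = np.random.default_rng(seed=2)
r = 0.8 * phi1 + 0.2 * rng.standard_normal(t.size)   # received signal r(t)

# (a) Correlation receiver: direct evaluation of <r(t), phi1(t)>
r1_corr = np.sum(r * phi1) / fs

# (b) Matched filter h1(t) = phi1(T - t), output sampled at t = T
h1 = phi1[::-1]                            # time reversal of the basis function
y = np.convolve(r, h1) / fs                # discrete approximation of the convolution
r1_mf = y[t.size - 1]                      # sample of the filter output at t = T

print(r1_corr, r1_mf)                      # both yield the same coefficient r_1
</syntaxhighlight>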
  
== Probability density function of the received values ==
<br>
Before we turn to the optimal design of the decision maker and to the calculation and approximation of the error probability in the following chapter,&nbsp; we first perform a statistical analysis of the decision variables&nbsp; $r_j$&nbsp; valid for the AWGN channel.

[[File:P ID2009 Dig T 4 2 S7 version1.png|right|frame|Signal space constellation&nbsp; (left)&nbsp; and PDF of the received signal&nbsp; (right)|class=fit]]
For this purpose,&nbsp; we consider again the optimal binary receiver for bipolar baseband transmission over the AWGN channel,&nbsp; starting from the description form valid for the fourth main chapter.

With the parameters&nbsp; $N = 1$&nbsp; and&nbsp; $M = 2$,&nbsp; the signal space constellation shown in the left graph is obtained for the transmitted signal
*with only one basis function&nbsp; $\varphi_1(t)$,&nbsp; because of&nbsp; $N = 1$,
*with the two signal space points&nbsp; $s_i \in \{s_0, \hspace{0.05cm}s_1\}$,&nbsp; because of&nbsp; $M = 2$.
<br clear=all>
For the signal&nbsp; $r(t) = s(t) + n(t)$&nbsp; at the AWGN channel output,&nbsp; the noise-free case &nbsp; &#8658; &nbsp; $r(t) = s(t)$&nbsp; yields exactly the same constellation;&nbsp; the signal space points are at
:$$r_0 = s_0 = \sqrt{E}\hspace{0.05cm},\hspace{0.2cm}r_1 = s_1 = -\sqrt{E}\hspace{0.05cm}.$$
  
Considering the&nbsp; (band-limited)&nbsp; AWGN noise&nbsp; $n(t)$:
*Gaussian curves with variance&nbsp; $\sigma_n^2$ &nbsp;&#8658;&nbsp; standard deviation&nbsp; $\sigma_n$&nbsp; are superimposed on each of the two points&nbsp; $r_0$&nbsp; and&nbsp; $r_1$&nbsp; $($see right sketch$)$.
*The probability density function&nbsp; $\rm (PDF)$&nbsp; of the noise component&nbsp; $n(t)$&nbsp; is then:
:$$p_n(n) = \frac{1}{\sqrt{2\pi} \cdot \sigma_n}\cdot {\rm e}^{ - {n^2}/(2 \sigma_n^2)}\hspace{0.05cm}.$$

The following expression is then obtained for the conditional probability density that the received value&nbsp; $\rho$&nbsp; is present when&nbsp; $s_i$&nbsp; has been transmitted:
:$$p_{\hspace{0.02cm}r\hspace{0.05cm}|\hspace{0.05cm}s}(\rho\hspace{0.05cm}|\hspace{0.05cm}s_i) = \frac{1}{\sqrt{2\pi} \cdot \sigma_n}\cdot {\rm e}^{ - {(\rho - s_i)^2}/(2 \sigma_n^2)} \hspace{0.05cm}.$$
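
A short Python sketch of this decision metric&nbsp; $($symbol energy, noise rms value and received value are assumed numbers$)$:&nbsp; for equally probable symbols, comparing the two conditional densities amounts to a threshold decision at&nbsp; $\rho = 0$.
<syntaxhighlight lang="python">
import numpy as np

def p_r_given_s(rho, s_i, sigma_n):
    """Conditional PDF p_{r|s}(rho | s_i) of the AWGN channel."""
    return np.exp(-(rho - s_i)**2 / (2 * sigma_n**2)) / (np.sqrt(2 * np.pi) * sigma_n)

E, sigma_n = 1.0, 0.5                     # assumed symbol energy and noise rms value
s0, s1 = np.sqrt(E), -np.sqrt(E)          # bipolar signal space points

rho = 0.3                                 # an assumed received value
# For equally likely symbols the ML decision picks the larger conditional density;
# for this constellation that is equivalent to a threshold decision at rho = 0
if p_r_given_s(rho, s0, sigma_n) > p_r_given_s(rho, s1, sigma_n):
    print("decide m_hat = m_0")           # here: rho lies closer to s_0
else:
    print("decide m_hat = m_1")
</syntaxhighlight>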
  
Regarding the units of the quantities listed here,&nbsp; we note:
*$r_0 = s_0$&nbsp; and&nbsp; $r_1 = s_1$&nbsp; as well as&nbsp; $n$&nbsp; are each scalars with the unit&nbsp; "square root of energy".
*Thus,&nbsp; it is obvious that&nbsp; $\sigma_n$&nbsp; also has the unit&nbsp; "square root of energy"&nbsp; and that&nbsp; $\sigma_n^2$&nbsp; represents an energy.
*For the AWGN channel,&nbsp; the noise variance is&nbsp; $\sigma_n^2 = N_0/2$,&nbsp; so this is also a physical quantity with the unit&nbsp; "$\rm W/Hz \equiv Ws$".

The topic addressed here is illustrated by examples in&nbsp; [[Aufgaben:Aufgabe_4.06:_Optimale_Entscheidungsgrenzen|"Exercise 4.6"]].<br>
  
== N-dimensional Gaussian noise ==
<br>
If an&nbsp; $N$&ndash;dimensional modulation process is present,&nbsp; i.e.,&nbsp; with&nbsp; $0 \le i \le M-1$&nbsp; and&nbsp; $1 \le j \le N$:
:$$s_i(t) = \sum\limits_{j = 1}^{N} s_{ij} \cdot \varphi_j(t) = s_{i1} \cdot \varphi_1(t)
+ s_{i2} \cdot \varphi_2(t) + \hspace{0.05cm}\text{...}\hspace{0.05cm} + s_{iN} \cdot \varphi_N(t)
\hspace{0.3cm} \Rightarrow \hspace{0.3cm} \boldsymbol{ s}_i = \left(s_{i1}, s_{i2}, \hspace{0.05cm}\text{...}\hspace{0.05cm},  s_{iN}\right )
\hspace{0.05cm},$$

then the noise vector&nbsp; $\boldsymbol{ n}$&nbsp; must also be assumed to have dimension&nbsp; $N$.&nbsp; The same is true for the received vector&nbsp; $\boldsymbol{ r}$:
:$$\boldsymbol{ n} = \left(n_{1}, n_{2}, \hspace{0.05cm}\text{...}\hspace{0.05cm},  n_{N}\right )
\hspace{0.01cm},$$
:$$\boldsymbol{ r} = \left(r_{1}, r_{2}, \hspace{0.05cm}\text{...}\hspace{0.05cm},  r_{N}\right )\hspace{0.05cm}.$$

For the AWGN channel,&nbsp; the probability density function&nbsp; $\rm (PDF)$&nbsp; of the noise realization&nbsp; $\boldsymbol{ \eta}$&nbsp; is
:$$p_{\boldsymbol{ n}}(\boldsymbol{ \eta}) = \frac{1}{\left( \sqrt{2\pi}  \cdot \sigma_n \right)^N } \cdot
{\rm exp} \left [ - \frac{|| \boldsymbol{ \eta} ||^2}{2 \sigma_n^2}\right ]\hspace{0.05cm},$$

and for the conditional PDF in the maximum likelihood decision rule:
:$$p_{\hspace{0.02cm}\boldsymbol{ r}\hspace{0.05cm} | \hspace{0.05cm} \boldsymbol{ s}}(\boldsymbol{ \rho} \hspace{0.05cm}|\hspace{0.05cm} \boldsymbol{ s}_i) =
p_{\hspace{0.02cm} \boldsymbol{ n}\hspace{0.05cm} | \hspace{0.05cm} \boldsymbol{ s}}(\boldsymbol{ \rho} - \boldsymbol{ s}_i \hspace{0.05cm} | \hspace{0.05cm} \boldsymbol{ s}_i) = \frac{1}{\left( \sqrt{2\pi}  \cdot \sigma_n \right)^N } \cdot
{\rm exp} \left [ - \frac{|| \boldsymbol{ \rho} - \boldsymbol{ s}_i  ||^2}{2 \sigma_n^2}\right ]\hspace{0.05cm}.$$

The equation follows
*from the general representation of the&nbsp; $N$&ndash;dimensional Gaussian PDF in the section&nbsp; [[Theory_of_Stochastic_Signals/Generalization_to_N-Dimensional_Random_Variables#Correlation_matrix|"correlation matrix"]]&nbsp; of the book&nbsp; "Theory of Stochastic Signals",
*under the assumption that the components are uncorrelated&nbsp; (and thus statistically independent).

Here,&nbsp; $||\boldsymbol{ \eta}||$&nbsp; denotes the&nbsp; "norm"&nbsp; (length)&nbsp; of the vector&nbsp; $\boldsymbol{ \eta}$.<br>
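
Since this conditional PDF decreases monotonically with the Euclidean distance&nbsp; $||\boldsymbol{ \rho} - \boldsymbol{ s}_i||$,&nbsp; maximizing it amounts to choosing the signal point closest to the received vector.&nbsp; A minimal Python sketch, where the&nbsp; $N = 2$&nbsp; constellation, the noise level and the seed are assumptions for illustration:
<syntaxhighlight lang="python">
import numpy as np

sigma_n = 0.4                                   # assumed noise rms value per dimension
S = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]]) / np.sqrt(2)  # assumed constellation

def ml_decision(rho, S):
    """ML decision for AWGN: maximize p_{r|s} <=> minimize ||rho - s_i||^2."""
    d2 = np.sum((S - rho)**2, axis=1)           # squared distances to all points
    return int(np.argmin(d2))

rng = np.random.default_rng(seed=3)
i = 2                                           # index of the transmitted message
rho = S[i] + sigma_n * rng.standard_normal(2)   # received vector r = s_i + n
print(ml_decision(rho, S))                      # usually returns 2 for small sigma_n
</syntaxhighlight>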
  
{{GraueBox|TEXT= 
$\text{Example 3:}$&nbsp;
Shown on the right is the two-dimensional Gaussian probability density function&nbsp; $p_{\boldsymbol{ n} } (\boldsymbol{ \eta})$&nbsp; of the two-dimensional random variable&nbsp; $\boldsymbol{ n} = (n_1,\hspace{0.05cm}n_2)$.&nbsp; Arbitrary realizations of the random variable&nbsp; $\boldsymbol{ n}$&nbsp; are denoted by&nbsp; $\boldsymbol{ \eta} = (\eta_1,\hspace{0.05cm}\eta_2)$.&nbsp; The equation of the represented two-dimensional&nbsp; "Gaussian bell curve"&nbsp; is:
[[File:EN_Dig_T_4_2_S8.png|right|frame|Two-dimensional Gaussian PDF]]
:$$p_{n_1, n_2}(\eta_1, \eta_2) = \frac{1}{\left( \sqrt{2\pi}  \cdot \sigma_n \right)^2 }  \cdot
{\rm exp} \left [ - \frac{ \eta_1^2 + \eta_2^2}{2 \sigma_n^2}\right ]\hspace{0.05cm}. $$
*The maximum of this function is at&nbsp; $\eta_1 = \eta_2 = 0$&nbsp; and has the value&nbsp; $1/(2\pi \cdot \sigma_n^2)$.&nbsp; With&nbsp; $\sigma_n^2 = N_0/2$,&nbsp; the two-dimensional PDF in vector form can also be written as follows:
:$$p_{\boldsymbol{ n} }(\boldsymbol{ \eta}) = \frac{1}{\pi \cdot N_0 }  \cdot
{\rm exp} \left [ - \frac{\vert \vert \boldsymbol{ \eta} \vert \vert ^2}{N_0}\right ]\hspace{0.05cm}.$$
*This rotationally symmetric PDF is suitable,&nbsp; for example,&nbsp; for describing/investigating a&nbsp; "two-dimensional modulation process"&nbsp; such as&nbsp; [[Digital_Signal_Transmission/Carrier_Frequency_Systems_with_Coherent_Demodulation#Quadrature_amplitude_modulation_.28M-QAM.29|"M&ndash;QAM"]],&nbsp; [[Digital_Signal_Transmission/Carrier_Frequency_Systems_with_Coherent_Demodulation#Multi-level_phase.E2.80.93shift_keying_.28M.E2.80.93PSK.29|"M&ndash;PSK"]]&nbsp; or&nbsp; [[Modulation_Methods/Non-Linear_Digital_Modulation#FSK_.E2.80.93_Frequency_Shift_Keying|"2&ndash;FSK"]].
*However,&nbsp; two-dimensional real random variables are often represented in a one-dimensional complex way,&nbsp; usually in the form&nbsp; $n(t) = n_{\rm I}(t) + {\rm j} \cdot n_{\rm Q}(t)$.&nbsp; The two components are then called the&nbsp; "in-phase component"&nbsp; $n_{\rm I}(t)$&nbsp; and the&nbsp; "quadrature component"&nbsp; $n_{\rm Q}(t)$&nbsp; of the noise.
*The probability density function depends only on the magnitude&nbsp; $\vert n(t) \vert$&nbsp; of the noise variable and not on its angle&nbsp; ${\rm arc} \ n(t)$.&nbsp; This means: &nbsp; complex noise is circularly symmetric&nbsp; $($see graph$)$.
*Circularly symmetric also means that the in-phase component&nbsp; $n_{\rm I}(t)$&nbsp; and the quadrature component&nbsp; $n_{\rm Q}(t)$&nbsp; have the same distribution and thus also the same variance&nbsp; $($and standard deviation$)$:
:$$ {\rm E} \big [ n_{\rm I}^2(t)\big  ]  = {\rm E}\big [ n_{\rm Q}^2(t) \big ] = \sigma_n^2 \hspace{0.05cm},\hspace{1cm}{\rm E}\big  [ n(t) \cdot n^*(t) \big  ]  =  {\rm E}\big [ n_{\rm I}^2(t) \big ] + {\rm E}\big [ n_{\rm Q}^2(t)\big  ] = 2\sigma_n^2 \hspace{0.05cm}.$$}}
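
The relations of the last bullet point can be reproduced with a few lines of Python&nbsp; $($rms value, sample count and seed are assumptions$)$:
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(seed=4)
sigma_n, K = 0.7, 200_000              # assumed rms value per component, sample count

# Complex noise n = nI + j*nQ with i.i.d. Gaussian components (circularly symmetric)
n_I = sigma_n * rng.standard_normal(K)
n_Q = sigma_n * rng.standard_normal(K)
n = n_I + 1j * n_Q

print(np.var(n_I), np.var(n_Q))        # both approx. sigma_n^2 = 0.49
print(np.mean(n * np.conj(n)).real)    # E[n n*] approx. 2*sigma_n^2 = 0.98
</syntaxhighlight>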
  
Finally,&nbsp; some&nbsp; '''denotation variants'''&nbsp; for Gaussian random variables:
:$$x ={\cal N}(\mu, \sigma^2) \hspace{-0.1cm}: \hspace{0.3cm}\text{real Gaussian distributed random variable with mean}\hspace{0.15cm}\mu\hspace{0.15cm}\text{and variance}\hspace{0.15cm}\sigma^2 \hspace{0.05cm},$$
:$$y={\cal CN}(\mu, \sigma^2)\hspace{-0.1cm}: \hspace{0.12cm}\text{complex Gaussian distributed random variable} \hspace{0.05cm}.$$
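
A sketch of how such variables might be sampled in Python.&nbsp; Note the assumed convention: for&nbsp; ${\cal CN}(\mu, \sigma^2)$&nbsp; the parameter&nbsp; $\sigma^2$&nbsp; is taken here as the total variance&nbsp; ${\rm E}\big[\vert y - \mu \vert^2\big]$,&nbsp; i.e.&nbsp; $\sigma^2/2$&nbsp; per real component; other texts define it per component.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(seed=5)
mu, sigma2, K = 0.0, 1.0, 100_000

x = mu + np.sqrt(sigma2) * rng.standard_normal(K)          # x ~ N(mu, sigma^2)
# Assumption: sigma^2 is the total variance E[|y - mu|^2] of the complex variable,
# i.e. sigma^2/2 per real component (consistent with E[n n*] = 2*sigma_n^2 above)
y = mu + np.sqrt(sigma2 / 2) * (rng.standard_normal(K) + 1j * rng.standard_normal(K))

print(np.var(x), np.mean(np.abs(y - mu)**2))               # both approx. sigma2
</syntaxhighlight>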
 
  
== Exercises for the chapter ==
<br>
[[Aufgaben:Exercise_4.4:_Maximum–a–posteriori_and_Maximum–Likelihood|Exercise 4.4: Maximum–a–posteriori and Maximum–Likelihood]]

[[Aufgaben:Exercise_4.5:_Irrelevance_Theorem|Exercise 4.5: Irrelevance Theorem]]

{{Display}}
