From Random Experiment to Random Variable

From LNTwww

# OVERVIEW OF THE SECOND MAIN CHAPTER #


This chapter is intended to familiarize you with  discrete random variables  assuming them to be statistically independent.  Such random variables are needed in Communications Engineering,  for example,  for the simulation of a binary or multilevel digital signal,  but equally for the emulation of a channel with statistically independent errors by a digital model,  for example,  the BSC model.

In detail, it covers:

  • the  »relationship between probability and relative frequency«,
  • the  »expected values and moments«,
  • the  »binomial and Poisson distributions«  as special cases of discrete distributions,
  • the  »generation of pseudo-random binary symbols«  using  »PN generators«,  and
  • the  »generation of multilevel random variables«  on a digital computer.


On the concept of random variable


In the first chapter of this book, the term  Random Experiment  has already been explained. By this is meant an experiment that can be repeated any number of times under the same conditions with an uncertain outcome  $E$,  but in which the set  $\{E_μ \}$  of possible outcomes is specifiable.

Often the experimental results are numerical values, as for example in the random experiment "throwing a die".  In contrast, the experiment "coin toss" yields the possible results "heads" and "tails".

For a uniform description of different kinds of experiments and also because of the better manageability one uses the term of  "random variable" .

$\text{Definition:}$  A  random variable  $z$  is a one-to-one mapping of the result set  $\{E_μ \}$  onto the set of real numbers.  Complementary to this definition, it is still allowed that the random variable has a unit in addition to the numerical value.

Definition of a random variable



Some examples of random variables are given below:

  • In the random experiment "throwing a roulette ball", a distinction between  $E$  and  $z$  has no practical implications, but may be useful for formal reasons.  Thus,  $E_μ = 8$ denotes that the ball has come to rest in the spot of the roulette wheel marked "8".  Arithmetic operations  (for example, an expected value formation)  are not possible on the basis of the results.  In contrast, the random variable  $z$  actually denotes a numerical value  $($here integer between  $0$  and  $36)$, from which the expected mean value of the random variable  $($here  $18)$  can be determined.  Quite possible, but not useful, would be, for example, the assignment  $E_μ = 8$    ⇔   $z_μ ≠ 8$.
  • In the "coin toss" experiment, the possible outcomes are  "heads"  and  "tails", to which no arithmetic operations can be applied per se.   Only by the arbitrary but unambiguous assignment between the event set  $\{E_μ\} = \{$"heads", "tails"$ \}$  and the number set  $\{z_μ\} = \{0, 1\}$  a characteristic value can be given here at all.   Similarly, however, one could also specify the assignment "heads"   ⇔   $0$  and "tails"   ⇔   $1$ .
  • In circuit engineering, one designates the two possible logical states of a memory cell (flip-flops) according to the possible voltage levels with  $\rm L$  (Low) and  $\rm H$  (High).  We adopt these designations here also for binary symbols.  For practical work, one usually maps these symbols back to random variables, although this mapping is also arbitrary, but should make sense.
  • In coding theory, it is useful to map  $\{ \text{L, H}\}$  to  $\{0, 1\}$  in order to be able to use the possibilities of modulo algebra.  On the other hand, to describe modulation with bipolar (antipodal) signals, one better chooses the mapping  $\{ \text{L, H}\}$ ⇔ $ \{-1, +1\}$.


Continuous and discrete random variables


$\text{Definition:}$  According to the possible numerical values  $z_μ = z(E_μ)$  we distinguish here between continuous and discrete random variables:

  • continuous random variable  $z$  can – at least in certain ranges – assume infinitely many different values.  More precisely:   The set of admissible values is also uncountable for such variables.
  • Examples for continuous random variables are the speed of a car  $($at appropriate driving between  $v = 0$  and  $v = 120 \ \rm km/h)$  or also the noise voltage at a communication system.  Both random variables have besides a numerical value also a unit.
  • If, on the other hand, the set  $\{z_μ\}$ is countable, it is a  discrete random variable. Usually, the number of possible values of  $z$  is limited to  $M$  In communications engineering,  $M$  is called the symbol range  (in the sense of coding theory)  or the number of steps  (from the point of view of transmission engineering).


First, we restrict ourselves to discrete,  $M$–stage random variables with no internal statistical bindings, which are fully characterized by the  $M$  probabilities of occurrence  $p_μ ={\rm Pr}(z = z_μ)$  according to the section  Some Basic Definitions  .   By definition, the sum over all  $M$  probabilities is equal to  $1$.

In contrast, the probability  ${\rm Pr}(z = z_μ)$  for a continuous random variable  $z$  to take on a very specific value  $z_μ$  is identically zero.   Here, as will be described in the following chapter  Continuous Random Variables  we must move to the  probability density function  (PDF).

Random process and random sequence


$\text{Definition:}$  A  random process  differs from the random experiment considered so far in that it yields not just one outcome (event), but a  temporal sequence of outcomes  (events).


This brings us to the  random sequence  $\langle z_ν\rangle$  with the following properties established for our representation:

  • The index  $ν$  describes the temporal process sequence and can take values between  $1$  and  $N$ . Often such a sequence is also represented as  $N$–dimensional vector.
  • At any time  $ν$  the random variable  $z_ν$  can take on a value different from  $M$ . We use the following nomenclature for this:
$$z_\nu \in z_\mu \hspace{0.3cm} {\rm with} \hspace{0.3cm} \nu = 1,\hspace{0.1cm}\text{...} \hspace{0.1cm}, N \hspace{0.3cm} {\rm and} \hspace{0.3cm} \mu = 1,\hspace{0.1cm}\text{...} \hspace{0.1cm} , M.$$
  • If the process is  ergodic, then each random sequence  $\langle z_ν\rangle$  has the same statistical properties and can be used as a representative for the entire random process.
  • We first assume here that there are no statistical bindings between the individual sequence elements, that is, it holds for the  conditional probabilities:
$${\rm Pr}(z_\nu | z_{\nu \rm{ -1}} \hspace{0.1cm}\text{...} \hspace{0.1cm}z_{\rm 1})={\rm Pr}(z_\nu).$$

More and especially more detailed information about the characterization of random processes can be found in the later chapter  Autocorrelation Function.


$\text{Example 1:}$  If we repeat the random experiment "throwing a roulette ball" ten times, we get, for example, the following random sequence:

$$\langle z_ν\rangle = \langle \ 8; \ 23; \ 0; \ 17; \ 36; \ 0; \ 33; \ 11; \ 25 \ \rangle.$$

At any point in time, all random variables between  $0$  and  $36$  are nevertheless possible - regardless of the past - and also equally probable, but this cannot be read from such a short sequence.

Bernoulli's law of large numbers


$\text{Definitions:}$  To describe an  $M$–stage random variable, one uses the following descriptive variables whose sum over all  $μ = 1,\hspace{0.1cm}\text{...} \hspace{0.1cm} , M$  each yields the value  $1$  :

  • The  probabilities  $p_μ = {\rm Pr}(z = z_μ)$  provide predictions about the expected outcome of a statistical experiment and are thus so-called  a priori characteristics.
  • The  relative frequencies  $h_μ^{(N)}$  are  a-posteriori characteristics  and allow statistical statements to be made with respect to a previously conducted experiment.  They are determined as follows:
$$h_{\mu}^{(N)} = \frac{n_{\mu} }{N}= \frac{ {\rm number \hspace{0.15cm}of \hspace{0.15cm}experiments \hspace{0.15cm}with \hspace{0.15cm}the\hspace{0.15cm}result\hspace{0.15cm} }z_{\mu} } { {\rm number \hspace{0.15cm}of \hspace{0.15cm}all \hspace{0.15cm}attempts } } \hspace{1cm}(\mu=1,\hspace{0.1cm}\text{...} \hspace{0.1cm},M).$$


Only in the limiting case  $N → ∞$  do the relative frequencies coincide "exactly" with the corresponding probabilities, at least in the statistical sense.  In contrast, according to the "law of large numbers" formulated by  Jacob Bernoulli  for finite values of  $N$:

$$\rm Pr \left( \it \mid h_\mu^{(N)} - p_\mu\mid \hspace{0.1cm} \ge \varepsilon \hspace{0.1cm} \right) \le \frac{1}{\rm 4\cdot \it N\cdot \varepsilon^{\rm 2}}.$$

It also follows that for infinite random sequences  $(N → ∞)$  the relative frequencies  $h_μ^{(N)}$  and the probabilities  $p_μ$  are identical with probability  $1$ .

$\text{Example 2:}$  A binary data file consists of  $N = 10^6$  binary symbols ("bits"), where the zeros and ones are equally probable:   $p_0 = p_1 = 0.5$.

Bernoulli's law of large numbers  $($with  $\varepsilon = 0.01)$  now states that the probability of the event  "the number of ones in the file is between  $495  \hspace{0.05cm}000$  and  $505\hspace{0.05cm}000$"  is greater than or equal to  $1 - 1/400 = 99.75\%$  .


According to the above equation, the probability that the relative frequency  $h_μ^{(N)}$  of an event  $E_μ$  and the associated probability  $p_μ$  differ in amount by more than one value  $\varepsilon$  is not greater than  $1/(4 - N - ε^2)$.   For a given  $\varepsilon$  and a probability to be guaranteed, the minimum required value of  $N$  can be calculated from this.

Further, it should be noted:

  • The monotonic decay with  $N$  is valid only in the statistical sense and not for each individual realization.  Thus, in the "coin toss" experiment, after  $N = 1000$  tosses, the relative frequencies of "heads" and "tails" may well be exactly equal  $0. 5$  be  (if  $n_{\rm heads} = n_{\rm tails} = 500$  is)  and after  $N = 2000$  tosses deviate from it again more or less strongly.
  • If several subjects perform the experiment "coin toss" in parallel and the relative frequency is plotted as a function of  $N$  , the result is a curve which tends to decrease, but not monotonously.
  • If, however, one calculates the mean value over an infinite number of such curves, one obtains the monotonically with  $N$  decreasing course according to the Bernouillian prediction.


This topic, specifically the experiment of  Karl Pearson, is dealt with in the learning video (German)  Bernoullisches Gesetz der großen Zahlen $\Rightarrow$ Bernoulli's Law of Large Numbers.

Exercises to the Chapter


Exercise 2.1: Election Demand

Exercise 2.1Z: Different Signal Courses