Exercise 5.5: Error Sequence and Error Distance Sequence

Figure: error sequence (blue) and error distance sequence (red)

Any error sequence $〈e_{\nu}〉$ can also be specified as the sequence $〈a_n〉$ of error distances. If the average error probability is not too large, then this results in a lower memory requirement than if the error sequence is stored.
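The equivalence of the two representations can be sketched in a few lines of Python. This is a minimal illustration, not part of the original exercise: the function names are hypothetical, and the first error distance is assumed to be counted from a virtual error immediately before the first element.

```python
def error_distances(errors):
    """Convert a 0/1 error sequence into the list of error distances."""
    distances = []
    last = -1  # assumption: virtual error just before the first element
    for idx, e in enumerate(errors):
        if e == 1:
            distances.append(idx - last)
            last = idx
    return distances


def error_sequence(distances, length):
    """Rebuild the 0/1 error sequence of the given length from the distances."""
    seq = [0] * length
    pos = -1
    for a in distances:
        pos += a
        seq[pos] = 1
    return seq


e = [0, 1, 0, 0, 1, 1, 0, 0, 0, 1]
a = error_distances(e)
print(a)                           # [2, 3, 1, 4]
print(error_sequence(a, len(e)))   # reproduces e
```

Ten sequence elements are described here by four distances; the saving grows as errors become rarer.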

For the comparison in this exercise, the following assumptions are to be made:

An error sequence of length $N = 10^6$ elements is to be stored in each case.

The most memory-efficient method (one bit per sequence element) is to be used for storing $〈e_{\nu}〉$.

Each error distance is represented by $4$ bytes $(32$ bits$)$.

If the underlying channel model is renewing, such as the BSC model, two different methods can be used to generate the error sequence $〈e_{\nu}〉$ on a digital computer:

The symbol-wise generation of the errors, in the BSC model according to the probabilities $p$ ("error") and $1-p$ ("no error"),
The generation of the error distances, which in the BSC model follow the geometric distribution ${\rm Pr}(a = k) = (1-p)^{k-1} \cdot p$.
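Both generation methods can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used by the authors: the function names are hypothetical, $0 < p < 1$ is assumed, and the error distances are drawn by inverse-transform sampling of the geometric distribution ${\rm Pr}(a = k) = (1-p)^{k-1} \cdot p$.

```python
import math
import random


def generate_symbolwise(n, p, rng):
    """Draw each element independently: 1 (error) with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def generate_by_distances(n, p, rng):
    """Place errors by drawing geometrically distributed error distances."""
    seq = [0] * n
    log_q = math.log(1.0 - p)      # requires 0 < p < 1
    pos = -1                       # assumption: virtual error before the start
    while True:
        u = 1.0 - rng.random()     # uniform in (0, 1], avoids log(0)
        a = int(math.log(u) / log_q) + 1   # geometric distance, a >= 1
        pos += a
        if pos >= n:
            break
        seq[pos] = 1
    return seq


rng = random.Random(1)
n, p = 100_000, 0.01
print(sum(generate_symbolwise(n, p, rng)))    # close to n * p = 1000
print(sum(generate_by_distances(n, p, rng)))  # close to n * p = 1000
```

For small $p$ the second method needs only about $N \cdot p$ random draws instead of $N$, which is why it is attractive for simulating long sequences.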

Notes:

The exercise belongs to the chapter "Binary Symmetric Channel".

In the following questions,
- $G_e$ indicates the required file size (in bytes) for storing the error sequence $〈e_{\nu}〉$, and
- $G_a$ indicates (also in bytes) the file size when storing the error distance sequence $〈a_n〉$.

Questions

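(1) What is the file size when the error sequence $〈e_{\nu}〉$ is stored?

$G_e \ = \ $ ____ $\ \rm kByte$

(2) What is the file size when the error distances are stored and $p_{\rm M} = 10^{-3}$?

$G_a \ = \ $ ____ $\ \rm kByte$

(3) What is the file size when the error distances are stored and $p_{\rm M} = 0.5$?

$G_a \ = \ $ ____ $\ \rm kByte$

(4) Up to which mean error probability $p_{\rm M, \ max}$ does storing the error distances require less memory?

$p_{\rm M, \ max} \ = \ $ ____ $\ \% $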

Solution

(1) For each element $e_{\nu}$ of the error sequence exactly one $\rm bit$ is needed.

Multiplying by $N$ gives $10^6 \ \rm bits$, i.e. $125\,000 \ \rm bytes$, corresponding to $G_e \ \underline {= 125 \ \rm kByte}$.

(2) With $N = 10^6$ and $p_{\rm M} = 10^{-3}$, about one thousand error distances have to be stored, each with $4 \ \rm bytes$ ⇒ $G_a \ \underline {= 4 \ \rm kByte}$.

In contrast to the storage of the error sequence, this value will vary slightly, since an error sequence of (limited) length $N = 10^6$ will not always contain exactly $1000$ errors.
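The size of this variation can be estimated with a small simulation. The sketch below is only illustrative: the function name and the seed are arbitrary choices, the errors are placed via geometrically distributed error distances, and $1 \ \rm kByte = 1000 \ bytes$ is assumed as in the solution above.

```python
import math
import random


def count_errors(n, p, rng):
    """Count the errors in a BSC sequence of length n by drawing error distances."""
    log_q = math.log(1.0 - p)   # requires 0 < p < 1
    pos, count = -1, 0
    while True:
        # geometric error distance via inverse-transform sampling
        pos += int(math.log(1.0 - rng.random()) / log_q) + 1
        if pos >= n:
            return count
        count += 1


rng = random.Random(2024)       # arbitrary seed, for reproducibility only
sizes_kbyte = [4 * count_errors(10**6, 1e-3, rng) / 1000 for _ in range(5)]
print(sizes_kbyte)              # five values scattered around 4 kByte
```

The number of errors is binomially distributed with mean $N \cdot p_{\rm M} = 1000$ and standard deviation $\sqrt{N \cdot p_{\rm M} \cdot (1-p_{\rm M})} \approx 31.6$, so $G_a$ typically deviates from $4 \ \rm kByte$ by a few percent.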

(3) With $p_{\rm M} = 0.5$, on average $0.5 \cdot 10^6$ errors will occur, each again requiring $4 \ \rm bytes$ ⇒ $G_a \ \underline {= 2000 \ \rm kByte}$.

From this it can be seen that storing the error distances only makes sense if the (mean) error probability is not too large.
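The three file sizes can be reproduced with a short calculation. This is a sketch with hypothetical function names; it uses the expected number of errors $N \cdot p_{\rm M}$ and assumes $1 \ \rm kByte = 1000 \ bytes$, consistent with the values given above.

```python
def size_error_sequence_kbyte(n):
    """File size in kByte when storing one bit per sequence element."""
    return n / 8 / 1000


def size_error_distances_kbyte(n, p, bytes_per_distance=4):
    """Expected file size in kByte when storing each error distance."""
    return n * p * bytes_per_distance / 1000


N = 10**6
print(f"G_e            = {size_error_sequence_kbyte(N):.1f} kByte")
print(f"G_a (p = 1e-3) = {size_error_distances_kbyte(N, 1e-3):.1f} kByte")
print(f"G_a (p = 0.5)  = {size_error_distances_kbyte(N, 0.5):.1f} kByte")
```

The printed values are 125.0, 4.0 and 2000.0 kByte, matching subtasks (1) to (3).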

(4) From the results of the previous subtasks it follows:

$$N \cdot p_{\rm M} \cdot 4 < {N}/{8} \Rightarrow \hspace{0.3cm}p_{\rm M, \hspace{0.1cm}max} = {1}/{32} \hspace{0.15cm}\underline {= 3.125\%}\hspace{0.05cm}.$$

This result is independent of the sequence length $N$.
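The break-even point and its independence of $N$ can be checked with exact rational arithmetic. The sketch below is illustrative only; the function name and the parameter `bits_per_distance` are assumptions introduced here.

```python
from fractions import Fraction


def max_error_probability(bits_per_distance=32):
    """Error probability at which both storage methods need the same memory."""
    return Fraction(1, bits_per_distance)


p_max = max_error_probability()
print(float(p_max) * 100)   # 3.125 (percent)

# At p_max both file sizes (in bytes) coincide for every sequence length N.
for n in (10**3, 10**6, 10**9):
    assert n * p_max * 4 == Fraction(n, 8)
```

The threshold depends only on the number of bits spent per error distance: a shorter representation of each distance would raise it accordingly.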