Speech and data transmission components
On the right you can see the block diagram of the GSM transmission system at the transmitting end, which is
- suitable for both digitized speech signals (sampling rate: 8 kHz, quantization: 13 bit ⇒ data rate: 104 kbit/s)
- as well as being suitable for 9.6 kbit/s data signals.
Here is a brief description of each component:
- Speech coding compresses the speech signal by a factor of 8 ⇒ from 104 kbit/s to 13 kbit/s. The bit rate given in the graph is for the full rate codec, which delivers exactly 260 bits per speech frame (duration T_{\rm R} = 20 \ \rm ms).
- In its highest mode, the »AMR codec« delivers R_{\rm B} = 12.2 \ \rm kbit/s ⇒ 244 bits per speech frame. However, the speech codec must also transmit additional information about the mode ⇒ so the data rate before channel coding is 13 kbit/s.
- The task of the »Voice Activity Detection« block (dashed in the diagram) is to decide whether the current speech frame actually contains a speech signal or just a pause, during which the power of the transmit amplifier should be turned down.
- »Channel coding« adds redundancy again to allow error correction at the receiver ⇒ the channel encoder outputs 456 bits per frame, resulting in a data rate of 22.8 kbit/s. The more important bits are given stronger protection.
- The »interleaver« scrambles the resulting bit sequence to reduce the influence of burst errors. The 456 input bits are split into four blocks of 114 bits each. Thus, two consecutive bits are always transmitted in two different bursts.
- A »data channel« – marked in red in the figure – differs from a speech channel (marked in blue) only by the different input rate (9.6 kbit/s instead of 104 kbit/s) and the use of a second, outer channel encoder instead of the speech encoder.
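The rates quoted in the list above follow directly from the number of bits per 20 ms frame. The following minimal sketch only reproduces this arithmetic; the helper name `rate_kbit_s` is chosen here for illustration.

```python
# Bit-rate bookkeeping for the GSM speech path (all numbers taken from the text).
FRAME_MS = 20                          # speech frame duration T_R in ms

def rate_kbit_s(bits_per_frame, frame_ms=FRAME_MS):
    """Data rate in kbit/s for a given number of bits per frame."""
    return bits_per_frame / frame_ms   # bits per millisecond equals kbit/s

pcm_rate = 8 * 13                      # 8 kHz sampling x 13 bit = 104 kbit/s
speech_rate = rate_kbit_s(260)         # full rate codec output
amr_rate = rate_kbit_s(244)            # AMR codec, highest mode
coded_rate = rate_kbit_s(456)          # after channel coding

print("PCM input      :", pcm_rate, "kbit/s")
print("full rate codec:", speech_rate, "kbit/s")
print("AMR (highest)  :", amr_rate, "kbit/s")
print("channel coded  :", coded_rate, "kbit/s")
print("compression    :", pcm_rate / speech_rate)
```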
The components highlighted in green apply equally to speech and data transmission.
⇒ The first common system component for speech and data transmission in the block diagram of the GSM transmitter is the »encryption«, which is intended to prevent unauthorized persons from gaining access to the data. There are two fundamentally different encryption methods:
- »Symmetric encryption«: This method uses only one secret key, which serves both for encrypting the messages in the transmitter and for decrypting them in the receiver. The key must be generated prior to communication and exchanged between the communication partners via a secure channel. The advantage of this encryption method, which is used in conventional GSM, is that it works very quickly.
- »Asymmetric encryption«: This method uses two independent but matching asymmetric keys. It is not possible to use one key to calculate the other. The "public key" is publicly available and is used for encryption. The "private key" is secret and used for decryption. In contrast to the symmetric encryption methods, the asymmetric methods are much slower, but also offer higher security.
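The principle of symmetric encryption can be illustrated with a toy example in which both sides derive the same keystream from a shared key and combine it with the data by XOR. This is only a sketch of the principle and not the cipher actually used in GSM; the function names and the key value are hypothetical, and Python's `random` module is not cryptographically secure.

```python
import random

def keystream(key, n):
    """Hypothetical keystream generator: n pseudo-random bits from a shared key.
    random.Random is NOT secure; it merely stands in for a real cipher."""
    rng = random.Random(key)
    return [rng.getrandbits(1) for _ in range(n)]

def xor_bits(bits, key):
    """Encrypt or decrypt: XOR with the same keystream is its own inverse."""
    return [b ^ k for b, k in zip(bits, keystream(key, len(bits)))]

shared_key = 0x5EC2E7                  # assumed to be exchanged beforehand
plain = [1, 0, 1, 1, 0, 0, 1, 0]
cipher = xor_bits(plain, shared_key)   # transmitter side
recovered = xor_bits(cipher, shared_key)   # receiver side, same key

print(plain, cipher, recovered)
```

Because the same operation is applied on both sides, the method is fast, but it only works if the key has been shared over a secure channel.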
⇒ The second green block is the »burst composition«, for which there are different burst types. In a "normal burst" the 114 encoded, scrambled and encrypted bits are expanded to 156.25 bits (duration T_{\rm burst} = 576.9 \ \rm µ s) by adding the "guard period", signaling bits, etc.
These are transmitted within a time slot of duration T_{\rm Z} = T_{\rm burst} by means of the modulation method "GMSK". This results in the gross data rate 270.833 \ \rm kbit/s.
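The gross data rate can be checked from the burst length and the slot duration. The sketch below assumes the exact slot duration 15/26 ms, which rounds to the 576.9 µs given in the text; this exact value and all variable names are assumptions that do not appear in the original.

```python
from fractions import Fraction

bits_per_burst = Fraction(625, 4)      # 156.25 bit periods per time slot
t_burst_ms = Fraction(15, 26)          # assumed exact slot duration (about 576.9 µs)

gross_rate_kbit_s = bits_per_burst / t_burst_ms        # bits per ms equals kbit/s
bit_duration_us = 1000 * t_burst_ms / bits_per_burst   # duration of one bit
payload_share = Fraction(114) / bits_per_burst         # fraction used for payload

print("slot duration  :", float(1000 * t_burst_ms), "µs")
print("gross data rate:", float(gross_rate_kbit_s), "kbit/s")
print("bit duration   :", float(bit_duration_us), "µs")
print("payload share  :", float(payload_share))
```

Only 114 of the 156.25 bit periods in a normal burst carry payload, which is about 73 % of the slot.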
The receiver contains the corresponding blocks in reverse order: "demodulation", "burst decomposition", "decryption", "de-interleaving", "channel decoding", "speech decoding".
In the next sections all blocks of the above transmission scheme are presented in detail.
Encoding for speech signals
Uncoded radio data transmission leads to bit error rates in the percentage range. However, with \text{channel coding} some transmission errors can be detected or even corrected at the receiver. The bit error rate can thus be reduced to values smaller than 10^{-5}.
First, we consider GSM channel encoding for speech channels, assuming as speech encoder the \text{Full Rate Codec}. The channel coding of a speech frame of 20\ \rm ms duration is done in four consecutive steps according to the diagram.
From the description in chapter "Speech Coding" it can be seen that not all 260 bits have the same influence on the subjectively perceived speech quality.
- Therefore, the data are divided into classes according to their importance: The 50 most important bits form "Class 1a", another 132 are assigned to "Class 1b", and the remaining 78 bits form the less important "Class 2".
- In the next step, a three-bit long \text{Cyclic Redundancy Check} \rm (CRC) checksum is calculated for the 50 class 1a bits using a feedback shift register. The generator polynomial for this CRC check is:
- G_{\rm CRC}(D) = D^3 + D +1\hspace{0.05cm}.
- Subsequently, four (yellow) "tail bits" (0000) are appended to the 185 bits of classes 1a and 1b, which include the three (red) CRC parity bits. These bits reset the four memory cells of the subsequent convolutional encoder to zero, so that a defined state can be assumed for each speech frame.
- The rate 1/2 convolutional encoder doubles these 189 most important bits to 378 bits and thus protects them significantly against transmission errors. Then the 78 bits of the less important class 2 are appended unprotected.
This way, there are exactly 456 bits per 20 \ \rm ms speech frame after channel coding.
- This corresponds to an (encoded) data rate of 22.8\ \rm kbit/s compared to 13\ \rm kbit/s after the speech coding.
- The effective channel coding rate is thus 260/456 \approx 57\%.
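The four steps described above can be sketched as follows. This is a simplified illustration, not the standardized bit mapping: it assumes that the 260 input bits are already sorted by importance, that the CRC bits are placed directly after class 1a, and that the speech channel uses the generator polynomials G_0(D) = 1 + D^3 + D^4 and G_1(D) = 1 + D + D^3 + D^4. All function names are chosen here for illustration.

```python
import random

def crc3(bits):
    """3 parity bits: remainder of bits(D) * D^3 divided by G(D) = D^3 + D + 1."""
    reg = list(bits) + [0, 0, 0]
    gen = [1, 0, 1, 1]                       # coefficients of D^3, D^2, D^1, D^0
    for i in range(len(bits)):
        if reg[i]:                           # cancel the leading one by adding G
            for j, g in enumerate(gen):
                reg[i + j] ^= g
    return reg[-3:]

def conv_encode(bits, taps0=(0, 3, 4), taps1=(0, 1, 3, 4)):
    """Rate-1/2 convolutional encoder with memory 4, starting in the zero state."""
    state = [0, 0, 0, 0]                     # state[k-1] = input delayed by k steps
    out = []
    for b in bits:
        window = [b] + state                 # window[k] = input delayed by k steps
        out.append(sum(window[k] for k in taps0) % 2)
        out.append(sum(window[k] for k in taps1) % 2)
        state = window[:4]
    return out

def encode_speech_frame(frame):
    """Map 260 speech bits to 456 channel bits (simplified bit ordering)."""
    assert len(frame) == 260
    class_1a, class_1b, class_2 = frame[:50], frame[50:182], frame[182:]
    protected = class_1a + crc3(class_1a) + class_1b + [0, 0, 0, 0]   # 189 bits
    return conv_encode(protected) + class_2  # 378 protected + 78 unprotected bits

random.seed(0)
frame = [random.randint(0, 1) for _ in range(260)]
coded = encode_speech_frame(frame)
print(len(coded), "bits per frame, code rate", round(260 / len(coded), 3))
```

The tail bits at the end of the protected block return the encoder to the all-zero state, which is why each frame can be encoded independently of the previous one.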
Interleaving for speech signals
The result of convolutional decoding depends not only on the frequency of the transmission errors, but also on their distribution.
- To achieve good correction results, the channel should not have any memory, but should provide statistically independent bit errors as far as possible.
- In mobile radio systems, however, transmission errors usually occur in bursts ("burst errors", also called "bundle errors").
- Interleaving distributes such burst errors evenly over several bursts and thus mitigates their effects.
For a speech channel, the interleaver works in the following way:
- The 456 input bits per speech frame are divided into four blocks of 114 bits each according to a fixed algorithm. We denote these for the n–th speech frame by A_n, B_n, C_n and D_n. The index n-1 denotes the preceding frame and n+1 the succeeding one.
- The block A_n is further divided into two sub-blocks A_{{\rm g},\hspace{0.08cm}n} and A_{{\rm u},\hspace{0.08cm}n} of 57 bits each, where A_{{\rm g},\hspace{0.08cm}n} denote only the even (German: "gerade" ⇒ "g") and A_{{\rm u},\hspace{0.08cm}n} denote the odd (German: "ungerade" ⇒ "u") bit positions. In the graph, one recognizes A_{{\rm g},\hspace{0.08cm}n} and A_{{\rm u},\hspace{0.08cm}n} by red resp. blue backgrounds.
- The sub-block A_{{\rm g},\hspace{0.08cm}n} of the n-th speech frame is combined with the sub-block A_{{\rm u},\hspace{0.05cm}n-1} of the previous frame and yields the 114 payload bits of a "normal burst": \left (A_{{\rm g},\hspace{0.08cm}n}, A_{{\rm u},\hspace{0.08cm}n-1}\right ). The same applies to the next three bursts: \left (B_{{\rm g},\hspace{0.08cm}n},\hspace{0.12cm} B_{{\rm u},\hspace{0.08cm}n-1}\right ), \left (C_{{\rm g},\hspace{0.08cm}n}, C_{{\rm u},\hspace{0.08cm}n-1}\right ), \left (D_{{\rm g},\hspace{0.08cm}n}, D_{{\rm u},\hspace{0.08cm}n-1}\right ).
- In the same way, the odd subblocks of the n-th speech frame are nested with the even sub-blocks of the following frame: \left (A_{{\rm g},\hspace{0.08cm}n+1},\hspace{0.12cm} A_{{\rm u},\hspace{0.08cm}n}\right ), ... , \left (D_{{\rm g},\hspace{0.08cm}n+1},\hspace{0.12cm} D_{{\rm u},\hspace{0.08cm}n}\right ).
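The nesting of even and odd sub-blocks described above can be sketched in a few lines. The split of a frame into the blocks A, B, C, D is a simple contiguous stand-in for the fixed algorithm, which is not specified here, and the frame preceding the first one is assumed to be all zeros. Frames are filled with their frame number instead of real bits so that the spreading becomes visible.

```python
def split_frame(frame):
    """Split 456 coded bits into four 114-bit blocks, each as (even, odd) halves.
    The contiguous block assignment is an assumption made for illustration."""
    assert len(frame) == 456
    blocks = [frame[i * 114:(i + 1) * 114] for i in range(4)]
    return [(blk[0::2], blk[1::2]) for blk in blocks]     # 57 + 57 entries each

def burst_payloads(frames):
    """Yield the 114-entry payloads (X_g,n , X_u,n-1) for X = A, B, C, D."""
    prev_odd = [[0] * 57 for _ in range(4)]   # assumed all-zero frame before frame 0
    for frame in frames:
        halves = split_frame(frame)
        for x, (even, odd) in enumerate(halves):
            yield even + prev_odd[x]          # even part of frame n, odd part of n-1
        prev_odd = [odd for _, odd in halves]

# Label every entry of frame n with the value n + 1 to trace where it ends up.
frames = [[n + 1] * 456 for n in range(3)]
bursts = list(burst_payloads(frames))
spread = [i for i, burst in enumerate(bursts) if 1 in burst]
print("frame 0 is carried by bursts", spread)
```

In this sketch the odd halves of the last frame are only sent once a further frame arrives, which reflects the delay that interleaving introduces.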
Conclusions: The type of interleaving described here is called "block-diagonal interleaving", in this case of degree 8:
- This reduces the susceptibility to burst errors.
- Two consecutive bits of a data block are never sent directly after each other.
- After the de-interleaver, multi-bit errors appear as isolated errors and can thus be corrected more effectively.
Encoding and interleaving for data signals
For GSM data transmission, each subscriber only has a net data rate of 9.6\ \rm kbit/s available. Two methods are used for error protection:
- »Forward Error Correction« \rm (FEC) is implemented at the physical layer by applying convolutional codes.
- With »Automatic Repeat Request« \rm (ARQ), defective packets that cannot be corrected are re-requested at the link layer.
The graph illustrates channel coding and interleaving for the data channel with 9.6\ \rm kbit/s. In contrast to the channel coding of the speech channel (with bit error rate 10^{-5}... 10^{-6}), it allows an almost error-free reconstruction of the data. Note:
- The data bit rate of 9.6\ \rm kbit/s is first increased in the "Terminal Equipment" of the mobile station through non-GSM specific channel encoding by 25\% to 12\ \rm kbit/s to allow error detection in circuit-switched networks.
- In data transmission, all bits are equally important, so unlike speech channel coding, there are no classes. The 240 bits per 20 \rm ms time frame are combined with four tail bits "0000" to form a single data frame.
- These 244 bits are doubled to 488 bits by a convolutional encoder of rate 1/2 as in speech channels. Two encoded symbols are generated per incoming bit, e.g. according to the generator polynomials G_0(D) = 1 + D^3 + D^4 (red marks in the second graph) and G_1(D) = 1 + D + D^3 + D^4.
- The following interleaver expects as input – just like a "speech interleaver" – only 456 bits per frame. Therefore, 32 of the 488 bits at the output of the convolutional encoder are removed ("puncturing"), namely those at the positions 15 \cdot j - 4 \ \ ( j = 1, ... , 32).
- Since data transmission is less time-critical than speech transmission, a higher interleaving degree is chosen here. The 456 bits are distributed over up to 24 interleaver blocks of 19 bits each, which would not be possible for speech services for reasons of real-time transmission.
- Then the 456 bits are split into four consecutive "normal bursts" and sent. When packing in the bursts, groupings of even and odd bits are again formed, similar to interleaving in the speech channel.
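The bit counts of the data channel can be traced with the following sketch. It assumes the puncturing positions 15·j − 4 for j = 1, ..., 32, counted from zero, and the generator polynomials named above; the function and constant names are chosen here for illustration, and the interleaving over many bursts is not modelled.

```python
import random

def conv_encode(bits, taps0=(0, 3, 4), taps1=(0, 1, 3, 4)):
    """Rate-1/2 encoder for G0 = 1 + D^3 + D^4 and G1 = 1 + D + D^3 + D^4."""
    state = [0, 0, 0, 0]                     # state[k-1] = input delayed by k steps
    out = []
    for b in bits:
        window = [b] + state                 # window[k] = input delayed by k steps
        out.append(sum(window[k] for k in taps0) % 2)
        out.append(sum(window[k] for k in taps1) % 2)
        state = window[:4]
    return out

# Positions removed by puncturing (assumed zero-based): 11, 26, ..., 476.
PUNCTURED = {15 * j - 4 for j in range(1, 33)}

def encode_data_frame(data_bits):
    """240 data bits (12 kbit/s x 20 ms) -> 456 channel bits."""
    assert len(data_bits) == 240
    coded = conv_encode(list(data_bits) + [0, 0, 0, 0])    # 244 -> 488 bits
    return [c for i, c in enumerate(coded) if i not in PUNCTURED]

random.seed(1)
data = [random.randint(0, 1) for _ in range(240)]
tx = encode_data_frame(data)
print(len(tx), "bits per 20 ms frame ->", len(tx) / 20, "kbit/s")
```

Puncturing raises the code rate from 244/488 to 244/456, so the data frame fits into the same 456-bit structure as a speech frame.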
Receiver side of the GSM link - Decoding
The GSM receiver (highlighted in yellow) includes GMSK demodulation, burst decomposition, decryption, de-interleaving, and channel and speech decoding.
Regarding the last two blocks in the graph, it should be noted:
- The decoding method is not prescribed by the GSM specification, but is left to the individual network operators. The performance depends on the error correction algorithm used.
- For example, with the decoding procedure "Maximum Likelihood Sequence Estimation" \rm (MLSE), the most probable bit sequence is determined using the Viterbi algorithm or a MAP receiver ("Maximum A-posteriori Probability").
- After error correction, the "Cyclic Redundancy Check" \rm (CRC) is performed, where for the full rate codec the degree of the CRC generator polynomial used is G= 3. Such a CRC is guaranteed to detect every single-bit error and every burst error up to length 3; other error patterns remain undetected only if they are multiples of the generator polynomial.
- The CRC is used to decide whether each speech frame is usable. If the check passes, the speech signal is synthesized from the 260 parameters per frame in the subsequent speech decoder.
- If a frame fails the check, parameters of earlier frames that were detected as correct are used for interpolation ⇒ "error concealment".
- If several incorrect speech frames occur in succession, the output is continuously lowered to mute.
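Since the decoding method is not prescribed, the following is only one possible illustration: a hard-decision Viterbi decoder for the rate-1/2 code with G_0(D) = 1 + D^3 + D^4 and G_1(D) = 1 + D + D^3 + D^4. It assumes that the encoder starts in the all-zero state and is returned to it by the four tail bits; all function names are chosen here for illustration.

```python
def step(state, bit, taps0=(0, 3, 4), taps1=(0, 1, 3, 4)):
    """One encoder step; state holds the last four input bits, newest first."""
    window = (bit,) + state
    out = (sum(window[k] for k in taps0) % 2, sum(window[k] for k in taps1) % 2)
    return window[:4], out

def conv_encode(bits):
    """Rate-1/2 encoding, starting in the all-zero state."""
    state, out = (0, 0, 0, 0), []
    for b in bits:
        state, pair = step(state, b)
        out.extend(pair)
    return out

def viterbi_decode(received):
    """Return the input sequence whose code word is closest in Hamming distance."""
    start = (0, 0, 0, 0)
    paths = {start: (0, [])}                 # state -> (metric, decoded bits so far)
    for i in range(0, len(received), 2):
        r0, r1 = received[i], received[i + 1]
        new_paths = {}
        for state, (metric, bits) in paths.items():
            for b in (0, 1):
                nxt, (c0, c1) = step(state, b)
                m = metric + (c0 != r0) + (c1 != r1)
                if nxt not in new_paths or m < new_paths[nxt][0]:
                    new_paths[nxt] = (m, bits + [b])   # keep the survivor path
        paths = new_paths
    return paths[start][1]                   # tail bits force the final state to zero

message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0] + [0, 0, 0, 0]   # data + tail bits
received = conv_encode(message)
received[5] ^= 1                             # two isolated transmission errors
received[20] ^= 1
decoded = viterbi_decode(received)
print("errors corrected:", decoded == message)
```

The sketch keeps the full survivor path for each of the 16 states, which is simple but memory-hungry; practical decoders would typically use a traceback structure instead.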
Exercise for the chapter
Exercise 3.7: GSM System Components