Similarities Between GSM and UMTS

Cellular architecture

Cellular network structure, idealized (left) and realistic (right)

A characteristic feature of GSM and UMTS is the cellular network structure, which is often approximated by hexagons (see the left-hand graphic):

The colors white, yellow and blue indicate different frequencies $($here: Reuse factor $3)$, thus avoiding intercell interference.
The graphic on the right shows a more realistic layout with non–hexagonal and also differently sized cells, depending on expected participant density and terrain topology. The base station is also not always located in the center of the cell.
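The reuse factor $3$ of the idealized hexagonal layout can be reproduced with a small sketch. It assumes axial hexagon coordinates $(q, r)$ and the assignment rule $(q - r) \bmod 3$; both are illustrative choices that are not taken from the text. The check confirms that no cell shares its frequency group with any of its six direct neighbours.

```python
# Axial coordinates (q, r) for hexagonal cells (an assumed, common convention).
# The six direct neighbours of a cell are reached by these offsets.
NEIGHBOUR_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def frequency_group(q: int, r: int) -> int:
    """Return the frequency group (0, 1 or 2) of the cell at axial (q, r)."""
    return (q - r) % 3


def neighbours_differ(radius: int = 5) -> bool:
    """Check that no cell shares its frequency group with a direct neighbour."""
    for q in range(-radius, radius + 1):
        for r in range(-radius, radius + 1):
            own = frequency_group(q, r)
            for dq, dr in NEIGHBOUR_OFFSETS:
                if frequency_group(q + dq, r + dr) == own:
                    return False
    return True


print(neighbours_differ())
```

Real networks deviate from this pattern as soon as the cells are no longer regular hexagons.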

In the "GSM–D" network $(f_{\rm T} = 900 \ \ \rm MHz)$ the maximum cell radius is specified as $\text{35 km}$. In the "GSM–E" network, the maximum radius is only half as large because of the higher carrier frequency $(1800 \ \ \rm MHz)$.

Cell structure in UMTS

In the UMTS network $(f_{\rm T} \approx 2 \ \rm GHz)$ there are different types of radio cells:

»Macro cells« cover the complete coverage area and follow the classic design. Both overlaps and "holes" between cells should be minimized. A macro cell usually has several macro–neighbours: exactly six for ideal hexagonal cells, in reality somewhat more or fewer. The base stations work with high power $\text{(20 to 40 W)}$, are mounted at a great height and use sectorized antennas. In sparsely populated regions, macro cells have diameters of up to several kilometers. In city centres, however, macro cells are kept compact to increase capacity, often only a few hundred metres in diameter.

»Microcells« cover a small part of a macrocell and are primarily used to increase local capacity $($illumination of dead spots$)$. They usually have only one macro–neighbor, but can also have other micro/pico/femto–neighbors. The power is somewhat lower $\text{(5 to 10 W)}$ and the devices are smaller than in a macro cell. However, the antennas, most of which are not sectorized, also have to be positioned sufficiently high (on a mast or house wall).

»Picocells« supply small areas $(d \approx 100 \ \rm m)$ with very high data volume $($examples: airports, shopping centres, stadiums$)$. They allow higher data rates, but at the expense of the speed of movement. The devices of a picocell are significantly smaller than those in a microcell and operate with lower power $\text{(1 to 5 W)}$, but are more flexible in assembly.

»Femtocells« are often administered privately and in an uncoordinated manner $($example: WLAN access point$)$, sometimes with private backhaul $($own DSL line$)$. One also speaks of a "Home Base Station". They operate indoors and work with low power $\text{(< 1 W)}$.

Interference power and cell breathing

If several subscribers use the same frequency channel, interference may occur and thus a very low carrier–to–interference ratio $\rm (CIR)$, which considerably impairs the transmission quality. The problem is serious in the case of UMTS, which is based on the multiple access method $\text{CDMA}$ ("Code Division Multiple Access"), since here all participants use the same frequency channel.

To illustrate intra- and intercellular interference

A distinction is made between two types of interference according to the graphic:

»Intracell interference« occurs when the same frequency channel is used by several users within the same cell. In the shown example, this case arises for $f_1 = f_2$.

»Intercell interference« occurs when users of neighboring cells use the same frequency, in the scenario shown, for example, when $f_3 = f_4$ applies.
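The carrier–to–interference ratio can be sketched numerically. The function below simply sums the received interferer powers, whether they stem from the same cell or from neighboring cells. The example of eleven users with equal received power on one frequency channel is a hypothetical illustration, and the spreading gain of CDMA is deliberately left out.

```python
import math


def cir_db(carrier_w: float, interferers_w: list[float]) -> float:
    """Carrier-to-interference ratio in dB for received powers given in watts."""
    total_interference = sum(interferers_w)
    if total_interference <= 0:
        raise ValueError("at least one interferer with positive power is required")
    return 10 * math.log10(carrier_w / total_interference)


# Hypothetical example: 11 users, all received with 1 pW on the same channel.
# For each user the other 10 act as interferers.
example_cir = cir_db(1e-12, [1e-12] * 10)
print(f"CIR = {example_cir:.1f} dB")
```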

Both intracell and intercell interference lead to a reduction in transmission quality. In the case of intercell interference (same frequency channel in adjacent cells), the disturbing influence of the interference power on the transmission quality can be limited by

$\text{Cell breathing}$: If the number of active users in a UMTS cell increases significantly, the cell radius is reduced and with it the current interference power. A less busy neighboring cell then takes over the supply of the users at the edge of the busy cell.

$\text{Power control}$: If the total interference power within a radio cell exceeds a specified limit, the transmission power of all users is reduced accordingly, but this also results in poorer transmission quality.

In contrast, in the case of intracell interference, each user must be regulated individually, for example by reducing transmission power and/or data rate.

Near-Far effect

The Near–Far effect is exclusively a problem of the uplink, i.e. the transmission of mobile users to a base station. We consider a scenario with two users at different distances from the base station ("Node B") according to the following graphic.

Scenarios for the Near-Far effect

If both mobile stations transmit with the same power, the received power of the red user $\rm A$ at the base station is significantly lower than that of the blue user $\rm B$ due to the path loss (left scenario). In large macro cells the difference can be up to $\text{100 dB}$. As a result, the red signal is largely masked by the blue one.

You can largely avoid the Near–Far effect if the more distant user $\rm A$ transmits with higher power than user $\rm B$, as indicated in the right scenario. At the base station the received power of both mobile stations is then (almost) equal.
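How large the received power gap becomes can be estimated with a simple model. The sketch below assumes a log–distance path loss with a freely chosen exponent. The model, the exponent $3.5$ and the two distances are illustrative assumptions, not values from the text. The same number of dB is what the distant user would have to add to its transmit power to arrive at the base station as strongly as the nearby user.

```python
import math


def path_loss_db(distance_m: float, exponent: float, ref_distance_m: float = 1.0) -> float:
    """Path loss relative to the reference distance for a log-distance model."""
    return 10 * exponent * math.log10(distance_m / ref_distance_m)


def received_power_gap_db(d_far_m: float, d_near_m: float, exponent: float) -> float:
    """How many dB weaker the far user arrives when both transmit equal power."""
    return path_loss_db(d_far_m, exponent) - path_loss_db(d_near_m, exponent)


# Hypothetical scenario: user A at 5 km, user B at 50 m, path loss exponent 3.5.
gap = received_power_gap_db(d_far_m=5000.0, d_near_m=50.0, exponent=3.5)
print(f"far user is received {gap:.1f} dB weaker")
```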

Note: In an idealized system $($single–path channel, ideal A/D converters, completely linear amplifiers$)$ the transmitted signals of the users are orthogonal to each other, and one could detect the users individually even with very different received powers.

This statement is valid for GSM due to the multiple access methods FDMA and TDMA, but also for UMTS (CDMA) and for the 4G system LTE (TDMA/OFDMA).

In reality, however, orthogonality is not always given due to the following reasons:

different received paths ⇒ multipath channel,

not ideal properties of spreading and scrambling codes with CDMA,

asynchrony of users in the time domain (basic propagation delay of the paths) and in the frequency domain (non–ideal oscillators and Doppler shift due to user mobility).

Consequently, the users are no longer orthogonal to each other and the signal-to-noise ratio of the user to be detected is not arbitrarily high compared to other users:

For GSM and LTE one can assume signal–to–noise ratios of $\text{25 dB}$ and more,
for CDMA only approx. $\text{15 dB}$ ,
for high rate data transmission rather less.

Power control

To avoid the Near–Far effect according to the right graph in the last section, however, a sufficiently good power control is required. It should be noted here:

For all systems (GSM, UMTS and LTE), a dynamic range of $\text{80 dB}$ must be assumed at the base station. The changes due to path loss and shadowing occur rather slowly, so that for these a regulation on a time scale of seconds by about $±\text{5 dB}$ is sufficient.

For GSM and LTE, a regulation in the range of seconds is sufficient, since the signal–to–noise ratio between a user and the other users is more than $\text{25 dB}$ due to the good properties of FDMA/OFDMA. Very fast fluctuations of the "Fast Fading" $($Dynamic range between $\text{10 and 20 dB})$ do not have to be compensated.

With UMTS, on the other hand, fast fading must also be compensated, since the signal–to–noise ratio between users is less than the fluctuations of fast fading. For UMTS, "Fast Power Control" was specified, whereby the transmission power can be changed every $\text{0.67 ms}$ by $±\text{1 dB}$ , with an initial delay of $\text{2 ms}$ .

Otherwise, if a user changed from very bad to rather good fading conditions within about $\text{10 ms}$, the base station would suddenly receive $\text{10 to 20 dB}$ more power from this user. All other users in the cell would be severely disturbed by this.
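The behaviour of such a fast power control loop can be illustrated with a minimal sketch. It assumes one up/down command of $±\text{1 dB}$ per slot and $15$ slots per $\text{10 ms}$ frame, which matches the interval of about $\text{0.67 ms}$ given above. The initial delay of $\text{2 ms}$ is ignored and the function names are hypothetical.

```python
import math

SLOT_MS = 10 / 15   # one power control command per slot, about 0.67 ms (assumed 15 slots per 10 ms)
STEP_DB = 1.0       # step size per command, as stated in the text


def track_power(received_db: float, target_db: float, n_slots: int) -> list[float]:
    """Apply one up/down command of STEP_DB per slot and return the trajectory."""
    trajectory = []
    for _ in range(n_slots):
        # The base station requests less power above the target, otherwise more.
        received_db += -STEP_DB if received_db > target_db else STEP_DB
        trajectory.append(received_db)
    return trajectory


def slots_to_settle(offset_db: float) -> int:
    """Number of commands needed to remove a sudden level change of offset_db."""
    return math.ceil(abs(offset_db) / STEP_DB)


# A sudden 20 dB rise of the received level is removed after 20 commands.
settle_ms = slots_to_settle(20.0) * SLOT_MS
print(f"{slots_to_settle(20.0)} slots, about {settle_ms:.1f} ms")
```

Once the target is reached, this simple loop keeps toggling by one step around it, since every slot carries either an "up" or a "down" command.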

As already mentioned, the Near–Far effect and thus also the power control is exclusively an uplink problem. For the downlink a sophisticated power control is less essential.

If, however, the users near the base station are supplied with a lower power, the intercell interference is also reduced.
This means: All other users in the cell under consideration are then less affected by traffic to the nearby user.

Various handover strategies

A second problem besides the Near–Far effect occurs when a mobile user switches from one cell to another. In order to make the transition between different cells appear as uninterrupted as possible to the user, a so–called »$\text{Handover}$« is used for circuit–switched UMTS services and for GSM. A distinction is made between two types:

»Hard Handover«: Here, at a certain point in time the connection is suddenly redirected from the current base station to another base station.

»Soft Handover«: The handover of a user from one base station to another is gradual until the user has finally left the first cell. By combining several links, in UMTS up to three, even a diversity gain can be achieved.

$\text{Example 1:}$ The graphic shows a downlink scenario where a mobile station can receive its signal from two different base stations $(\rm BS1$ and $\rm BS2)$ at certain locations.

Handover scenarios for the downlink

For "Hard Handover" the mobile station at point $\rm A$ only evaluates the signal from $\rm BS1$ and at point $\rm C$ only the signal from $\rm BS2$. Switchover is performed immediately when the user is at point $\rm B$.

If you use "Soft Handover" and "Soft Combining", the mobile station benefits from both signals.

At any location $(\rm A, \ B, \ C)$ the received power increases.
This results in a diversity gain depending on the channel SNR,
and additionally in a coherence gain of $\text{3 dB}$.
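The coherence gain of $\text{3 dB}$ can be made plausible with a short calculation. The sketch assumes two links of equal strength that are combined with perfect phase alignment while their noise contributions are independent and of equal power. These are idealizing assumptions made here for illustration. The signal amplitudes then add, whereas only the noise powers add.

```python
import math


def coherent_combining_gain_db(n_links: int) -> float:
    """SNR gain of coherently combining n equal-strength links with independent noise."""
    if n_links < 1:
        raise ValueError("at least one link is required")
    signal_power_gain = n_links ** 2   # amplitudes add coherently
    noise_power_gain = n_links         # independent noise: powers add
    return 10 * math.log10(signal_power_gain / noise_power_gain)


two_link_gain = coherent_combining_gain_db(2)
print(f"gain with two links: {two_link_gain:.2f} dB")
```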

These statements can also be applied to the right-hand scenario, in which the base station radiates with directional antennas in three sectors. Here it is assumed that the radiation angle is somewhat larger than $120^\circ$, which can be assumed in practice.

In the "UMTS downlink" the data is split in the "Radio Network Controller" $\rm (RNC)$, broadcast via different base stations and reassembled in the mobile station ("Rake Processing").

In the "UMTS uplink" the transmitted data is received by all participating base stations. The data is combined ("Soft Combining") in the RNC. The RNC then forwards the data to the "Core Network" $\rm (CN)$. A distinction is made here:

»Softer Handover«: A base station receives the signal of a user over two sectors and does "Soft Combining". There is a diversity gain and a coherence gain of $\text{3 dB}$.

»Intra–RNC Handover«: Two base stations decode the signal, make a $\rm CRC$ check and report their result to the RNC (or report a CRC error).
If only one "Node B" reports a CRC error, the data of the other is used. Here there is no coherence gain and the diversity gain is less than with "Softer Handover".

Typical mobile radio transmission system

Now some components of mobile radio systems, which are necessary for both GSM and UMTS, will be explained.

The diagram shows the components of the GSM transmitter; the rates given apply only to GSM.
For the GSM extension $\text{GPRS}$ you get different numerical values.
For UMTS a similar structure results, but not exactly the same.
In addition, the bit rates of UMTS data transmission are significantly higher, while for voice transmission, comparable rates to GSM can be assumed.

Let us first look at the $\text{GSM voice transmission}$, i.e. the upper branch of the graphic:

Components of the voice and data communication with GSM

The data rate of a PCM speech signal limited to $4\ \rm kHz$ is obtained by sampling with $8\ \rm kHz$ and quantizing with $13\ \rm bit$ to $104 \ \rm kbit/s$. With GSM, speech coding is used to extract exactly $260\ \rm bit$ for each $T_{\rm R}=20\ \rm ms$ frame. A bit stream with $13 \ \rm kbit/s$ is generated.
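The rates named in this step follow from simple arithmetic, which the following sketch reproduces. All numbers are taken from the text; only the variable names are chosen here.

```python
SAMPLING_RATE_HZ = 8_000   # sampling of the speech signal limited to 4 kHz
BITS_PER_SAMPLE = 13       # PCM quantization
FRAME_MS = 20              # speech frame duration T_R
BITS_PER_FRAME = 260       # output of the GSM speech coder per frame

# PCM data rate before speech coding, in kbit/s.
pcm_rate_kbps = SAMPLING_RATE_HZ * BITS_PER_SAMPLE / 1000

# Bits per millisecond equals kbit/s.
codec_rate_kbps = BITS_PER_FRAME / FRAME_MS

# Compression achieved by the speech coder.
compression_factor = pcm_rate_kbps / codec_rate_kbps

print(pcm_rate_kbps, codec_rate_kbps, compression_factor)
```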

The task of the $\text{Voice Activity Detection}$ (dotted line) is to decide whether the current voice frame actually contains a speech signal or a voice pause during which the power of the transmit amplifier can be reduced.

$\text{Channel coding}$ $\rm (CRC$ and convolutional code$)$ is used to add redundancy to enable error correction at the receiver. This increases the data rate to $22.8 \ \rm kbit/s$, whereby the more important bits of the speech encoder are especially protected.

The $\text{Interleaver}$ scrambles the bit sequence of the channel encoder and thus reduces the influence of burst errors. For this purpose, the $456$ input bits are divided and interleaved into four time frames of $114\ \rm bit$ each. Successive bits are always transmitted in eight different bursts.
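The idea that successive bits end up in eight different bursts can be illustrated with a strongly simplified sketch. It assumes that bit $k$ of the $456$–bit block is assigned to sub–block $k \bmod 8$, giving eight sub–blocks of $57$ bits. This is not the exact GSM permutation, which additionally reorders the bits within each sub–block. The function names are hypothetical.

```python
N_BITS = 456        # coded bits per 20 ms frame
N_SUBBLOCKS = 8     # successive bits are spread over eight bursts


def split_into_subblocks(bits: list[int]) -> list[list[int]]:
    """Distribute the block over 8 sub-blocks of 57 bits (bit k -> sub-block k mod 8)."""
    if len(bits) != N_BITS:
        raise ValueError(f"expected {N_BITS} bits, got {len(bits)}")
    return [bits[i::N_SUBBLOCKS] for i in range(N_SUBBLOCKS)]


def merge_subblocks(subblocks: list[list[int]]) -> list[int]:
    """Inverse operation at the receiver: restore the original bit order."""
    bits = [0] * N_BITS
    for i, sub in enumerate(subblocks):
        bits[i::N_SUBBLOCKS] = sub
    return bits


# Use the bit positions themselves as payload to make the mapping visible.
positions = list(range(N_BITS))
subblocks = split_into_subblocks(positions)
print([sub[:3] for sub in subblocks])
```

A burst error that destroys one sub–block therefore hits only every eighth bit of the coded sequence, which the convolutional decoder can handle far better than a run of consecutive errors.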

With the $\text{GSM Data transmission}$ (lower branch of the graphic) the user data rate is limited to $9.6 \ \rm kbit/s$ to give more room for channel coding. Here the resulting code rate $192/456 \approx 0.421$ is smaller than in the upper branch $(260/456 \approx 0.57)$.
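The code rates of both branches can be checked with a few lines. The numbers come from the text; the $192$ information bits of the data branch are obtained here as $9.6 \ \rm kbit/s$ times the $20\ \rm ms$ frame duration.

```python
FRAME_MS = 20
CODED_BITS_PER_FRAME = 456

# Data rate after channel coding, identical for voice and data (bits per ms = kbit/s).
coded_rate_kbps = CODED_BITS_PER_FRAME / FRAME_MS

# Voice branch: 260 information bits per frame.
voice_code_rate = 260 / CODED_BITS_PER_FRAME

# Data branch: 9.6 kbit/s over 20 ms gives 192 information bits per frame.
data_bits_per_frame = round(9.6 * FRAME_MS)
data_code_rate = data_bits_per_frame / CODED_BITS_PER_FRAME

print(f"{coded_rate_kbps:.1f} kbit/s, R_voice = {voice_code_rate:.3f}, R_data = {data_code_rate:.3f}")
```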

Interleaving is also organized differently for data than for voice. The effective data rate of $22.8 \ \rm kbit/s$ after interleaving is the same for both branches. The rest of the description applies to voice and data equally:

After interleaving follows the "Encryption" for the purpose of authenticating users and securing the radio interface against "eavesdropping". UMTS offers some more $\text{Security Measures}$.

The next block is the "Burst building": The $456\ \rm bit$ after channel encoding, interleaving and encryption are completed by adding signaling bits, "Guard Period", etc. to $625\ \rm bit$ which are transmitted within four time slots $(4 \cdot T_{\rm Z})$.

This results in the total data rate of $625/(4 · 576.9\ \rm µs) \approx 270.833 \ \rm kbit/s$, so that for each of the eight GSM users connected via TDMA, a gross data rate of approximately $33.854 \ \rm kbit/s$ is available.
For voice transmission, however, only $13/33.854 =38.4\%$ of this data rate is available to the user, and for data transmission only $9.6/33.854 =28.4\%$.
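These figures can be reproduced as follows. The sketch assumes the exact slot duration of $15/26\ \rm ms$ (about $576.9\ \rm µs$), which is not stated in the text but yields the quoted total rate to three decimal places.

```python
BITS_PER_FOUR_SLOTS = 625
SLOT_MS = 15 / 26          # assumed exact GSM time slot duration, about 0.5769 ms
USERS_PER_CARRIER = 8      # TDMA users sharing one carrier

# Total data rate on the carrier (bits per ms = kbit/s).
total_rate_kbps = BITS_PER_FOUR_SLOTS / (4 * SLOT_MS)

# Gross data rate available to each of the eight users.
gross_rate_per_user_kbps = total_rate_kbps / USERS_PER_CARRIER

# Share of the gross rate that carries user information.
voice_share = 13.0 / gross_rate_per_user_kbps
data_share = 9.6 / gross_rate_per_user_kbps

print(f"{total_rate_kbps:.3f} kbit/s total, {gross_rate_per_user_kbps:.3f} kbit/s per user")
print(f"voice: {voice_share:.1%}, data: {data_share:.1%}")
```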

An essential $\text{difference between GSM and UMTS}$ are the different modulation and multiple access methods:

For GSM: "Gaussian Minimum Shift Keying" $\rm (GMSK)$ together with $\rm FDMA$ and $\rm TDMA$,
for UMTS: "Quaternary Phase Shift Keying" $\rm (QPSK)$ together with $\rm CDMA$ and $\rm TDMA$.

This will be discussed in more detail in the chapters "Characteristics of GSM" and "Characteristics of UMTS".

The following points should also be noted:

The CDMA–based $\rm UMTS$ is characterized by the chip rate $R_{\rm C} = 3.84 \ \rm Mchip/s$, from which the bit rate $R_{\rm B} = R_{\rm C}/J$ follows according to the selected spreading factor $J$.
With $J = 4$, ... , $512$ this results in gross data rates between $7.5 \ \rm kbit/s$ and $960 \ \rm kbit/s$, which are selected depending on the current channel conditions.
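The relation $R_{\rm B} = R_{\rm C}/J$ is easily tabulated. The sketch assumes that the spreading factors are the powers of two between $4$ and $512$; the text itself only names the two end points.

```python
CHIP_RATE_KCHIPS = 3840.0   # R_C = 3.84 Mchip/s

# Assumed set of spreading factors: powers of two from 4 to 512.
SPREADING_FACTORS = [2 ** n for n in range(2, 10)]


def gross_bit_rate_kbps(spreading_factor: int) -> float:
    """Gross bit rate R_B = R_C / J in kbit/s for spreading factor J."""
    if spreading_factor not in SPREADING_FACTORS:
        raise ValueError(f"unsupported spreading factor: {spreading_factor}")
    return CHIP_RATE_KCHIPS / spreading_factor


for j in SPREADING_FACTORS:
    print(f"J = {j:3d}  ->  R_B = {gross_bit_rate_kbps(j):6.1f} kbit/s")
```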

Due to the different transmission technology, the block "Burst building" is organized differently in UMTS. This is based on the "Transmission Time Interval" $\rm (TTI)$. In the original UMTS specification such a "TTI" had a duration between $10\ \rm ms$ and $80\ \rm ms$.

In order to reduce the loss of time for the required block repetitions in bad channel conditions, this TTI value was reduced to $2\ \rm ms$ $($for $\rm HSDPA)$ in later releases.

$\text{Conclusion:}$ In particular, the following problems have to be solved in both $\rm GSM$ and $\rm UMTS$:

a suitable channel estimation and feedback to the transmitter,
a working carrier phase and system clock recovery,
the frame synchronization.

Common speech coding methods

For GSM voice transmission each user has only a net data rate of $13\ \rm kbit/s$ available $($with channel coding $22.8\ \rm kbit/s)$, while PCM transmission with $8\ \rm kHz$ sampling and $13\ \rm bit$ quantization would require a data rate of $104\ \rm kbit/s$.

To satisfy the sampling theorem with a certain margin, the audio signal is limited by filtering to the frequency range from $300\ \rm Hz$ to $3.4\ \rm kHz$. The necessary compression by the factor $8$ is the task of »Speech coding« (a special form of $\text{Source coding}$), for which several standards were defined in the 1990s:

The »$\text{GSM Full Rate Vocoder}$« is based on the three compression methods $\text{LPC}$ (Linear Predictive Coding), $\text{LTP}$ (Long Term Prediction) and $\text{RPE}$ (Regular Pulse Excitation). From each $20\ \rm ms$ speech frame, this coder extracts $76$ parameters with a total of $260\ \rm bit$, which results in the data rate $13\ \rm kbit/s$. At the receiver, the speech signal must be synthesized again from these $260\ \rm bit$.

The »$\text{GSM Half Rate Vocoder}$« was specified in 1994 and offers the possibility to transmit an audio signal bandlimited to $4\ \rm kHz$ with nearly the same quality in half a traffic channel. Today, this speech codec plays only a minor role.

The »$\text{Enhanced Full Rate Codec}$« (short "EFR Codec") was developed in 1995 for the US–American DCS 1900–system. It works according to the coding method "Algebraic Code Excited Linear Prediction" $\rm (ACELP)$ and offers a significantly higher speech quality than the conventional full rate codec, owing to the ACELP principle and the improved error detection and concealment.

From the $\text{ACELP block diagram}$ one recognizes

the segmentation of the digitized speech signal into frames and sub-blocks,
the LPC analysis through a digital filter $A(z)$ in the red highlighted block,
the Long Term Prediction $\rm (LTP)$ with the help of the adaptive codebook (framed in blue), and
the search for the best entry in the fixed codebook (highlighted in green).

In the chapter $\text{Speech Coding}$ of the book "Examples of Communication Systems" the EFR Codec is described in detail. The data rate of $12.2 \ \rm kbit/s$ is identical to the highest mode of the AMR codec, which is briefly introduced in the next section.

Adaptive Multi Rate Codec

Modes of the AMR and Wideband AMR

The most widely used speech encoder, owing to its flexibility, is the AMR codec ("Adaptive Multi Rate"), which processes narrowband signals $($in the frequency range between $300\ \rm Hz$ and $3.4\ \rm kHz)$ according to the ACELP principle. This encoder provides eight different modes with data rates

between $12.2\ \rm kbit/s$ $(244 \ \rm bits$ per $20\ \rm ms$ speech frame$)$
and $4.75\ \rm kbit/s$ $(95 \ \rm bits$ per speech frame$)$.
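The connection between bits per speech frame and data rate can be sketched for all eight modes. Only the values $244$ and $95$ appear in the text. The remaining frame sizes are quoted here from the AMR specification as an addition and should be checked against the table of modes.

```python
FRAME_MS = 20

# Bits per 20 ms speech frame for the eight AMR modes.
# 244 and 95 are from the text; the other six values are an added assumption.
AMR_BITS_PER_FRAME = [244, 204, 159, 148, 134, 118, 103, 95]


def mode_rate_kbps(bits_per_frame: int) -> float:
    """Data rate in kbit/s of a mode with the given number of bits per frame."""
    return bits_per_frame / FRAME_MS


amr_rates_kbps = [mode_rate_kbps(bits) for bits in AMR_BITS_PER_FRAME]

for bits, rate in zip(AMR_BITS_PER_FRAME, amr_rates_kbps):
    print(f"{bits:3d} bit per frame  ->  {rate:5.2f} kbit/s")
```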

Three modes play a special role (highlighted darker in the first line of the table), namely

$12.2 \ \rm kbit/s$ ⇒ the improved GSM Full rate codec ⇒ EFR codec,
$7.4 \ \rm kbit/s$ ⇒ compression according to the US standard IS–641,
$6.7 \ \rm kbit/s$ ⇒ EFR voice transmission of the Japanese PDC system.

$\text{Conclusion:}$ The $\text{AMR codec}$ has the following properties:

It adapts flexibly to the current conditions of the radio channel and to the respective network,

either in full rate mode $($higher voice quality ⇒ mode $\ge 7.4 \ \rm kbit/s)$,
or in half rate mode $($also with lower data rate ⇒ mode $\le 6.7 \ \rm kbit/s)$.

There are also several intermediate levels.

The AMR codec provides improved voice quality for both the full rate and the half rate traffic channel.
This is especially due to the flexible distribution of the available gross channel data rate between speech and channel coding.

The AMR codec is used in the same way for GSM and UMTS.

On the other hand, the "Wideband AMR" $\text{(W-AMR)}$ is used exclusively for UMTS for broadband signals between $50\ \rm Hz$ and $7\ \rm kHz$.
Sampling is done here with $16\ \rm kHz$ and quantization with $14 \ \rm bit$.
The nine defined modes of the W–AMR are shown in the bottom line of the table above; the more frequently used ones are again highlighted darker.

The interactive SWF applet (German language) "Quality of different voice codecs" acoustically demonstrates the achievable voice quality of the codecs described here.