Exercise 2.2: Kraft–McMillan Inequality
In the figure some exemplary binary and ternary codes are given.
- With the binary code B1 all possible source symbols qμ (with index μ=1, ... , 8) are represented by a encoded sequence ⟨cμ⟩ of uniform length Lμ=3 .
- This code is unsuitable for data compression for this reason.
The possibility of data compression arises only when
- the M source symbols are not equally likely, and
- the length Lμ of the code words are different.
For example, the binary code B2 has this property:
- Here, one code word each has the length 1, 2 and 3, respectively (N1=1, N2=2, N3=3).
- Two code words have the length Lμ=4 (N4=N5=4).
A prerequisite for the decodability of such a code is that the code is prefix-free .
- That is, no code word may be the prefix (i.e., the beginning) of a longer code word.
- A necessary condition for a code for data compression to be prefix-free was stated by Leon Kraft in 1949, called Kraft's inequality:
- M∑μ=1D−Lμ≤1.
Here denote
- M the number of possible source symbols qμ ⇒ "symbol set size",
- Lμ the length of the code word cμ associated with the source symbol qμ,
- D=2 denotes a binary code (0 or 1) and D=3 denotes a ternary code (0, 1, 2).
A code can be prefix-free only if Kraft's inequality is satisfied.
The converse does not hold: If Kraft's inequality is satisfied, it does not mean that this code is actually prefix-free.
Hint:
- The exercise belongs to the chapter General Description of Source Coding.
Question
Solution
- B1: 8⋅2−3=1 ⇒ condition fulfilled,
- B2: 1⋅2−1+1⋅2−2+1⋅2−3+2⋅2−4=1 ⇒ condition fulfilled,
- B3: 1⋅2−1+1⋅2−2+1⋅2−3+2⋅2−4=1 ⇒ condition fulfilled,
- B4: 1⋅2−1+1⋅2−2+2⋅2−3+1⋅2−4=17/16 ⇒ condition not fulfilled.
(2) Proposed solutions 1 and 2 are correct:
- The code B4, which does not satisfy Kraft's inequality, is certainly not prefix-free either.
- But if Kraft's inequality is fulfilled, it is still not certain that this code is also prefix-free.
- In code B3 the code word "10" is the beginning of the code word "1011".
- In contrast, codes B1 and B2 are actually prefix-free.
(3) The correct solutions are 1 and 3:
- Kraft's inequality is satisfied by all three codes.
- As can be seen from the table, codes T1 and T3 are indeed prefix-free.
- The code T2 , on the other hand, is not prefix-free because "1" is the beginning of the code word "10".
(4) Ni indicates how many code words with i symbols there are in the code. For the code T1 it is:
- N1=1_,N2=2_,N3=6_.
(5) According to Kraft's inequality, the following must be true
- N1⋅3−1+N2⋅3−2+N3⋅3−3≤1.
For given N1=1 and N2=2 , this is satisfied as long as holds:
- N3⋅3−3≤1−1/3−2/9=4/9⇒N3≤12⇒ΔN3=6_.
The additional code words are 210,211,212,220,221,222.
(6) For code T3 it holds:
- S(T3)=2⋅3−1+2⋅3−2+1⋅3−3=25/27.
- Because of S(T3)≤1 the ternary code T3 satisfies Kraft's inequality and it is also prefix-free.
Let us now consider the proposed new codes.
- Code T4 (N1=2, N2=2, N3=5):
- S(T4)=S(T3)+4⋅3−3=29/27>1⇒T4isunsuitable,
- Code T5 (N1=2, N2=2, N3=1, N4=4):
- S(T5)=S(T3)+4⋅3−4=79/81<1⇒T5issuitable,
- Code T6 (N1=2, N2=2, N3=2, N4=3):
- S(T6)=S(T3)+1⋅3−3+3⋅3−4=75+3+381=1⇒T6issuitable.
Thus, the two last proposed solutions are correct.
For example, the total N=9 code words of the prefix-free code T6 are:
- 0,1,20,21,220,221,2220,2221,2222.