Question: Problem: The source Alphabet is A={a,b,c}. The symbol counts are n a =16, n b =2, n c =2 . You have to encode the

Problem:

The source Alphabet is A={a,b,c}. The symbol counts are na=16, nb=2, nc=2 . You have to encode the sequence ccba using the incremental arithmetic coding algorithm. Show your work for every step: calculate intervals, indicate the bits shifted to the code sequence if any, indicate what type of scaling is applied (E1, E2, E3) if any; E3 counter (if E3 is applied).

hint:

Integer Implementation

For integer implementation, we use rather counts of occurrences of each symbol, ni.

Cumulative counts (Cum_Count) per symbol play the role of the cdf, that is, we use an estimated value of cdf. For the k-th symbol

,

where

Calculation of intervals is refined to carry out arithmetic such that the uniqueness of codewords are guaranteed.

(**)

where xn is the symbol in the alphabet to be encoded. Please note that we round down in the above equations. Since we care about getting the tag in a quarter of the unit interval (not lesser as we do re-scaling), the total number of bits for representing the tag, m is calculated from the logarithm of what is four times greater than the total count of symbols in the sequence. Recall, the latter represents in integer arithmetic the complete unit interval.

or

m > 2+ log2( Total_Count)

For example, for the total number of symbols in the message/file to be encoded is 230, the number of binary digits for representing the tag is to be 2+8= 10.

The complete coding algorithm is formalized below:

Initialize l and u.

Get the symbol, update the interval:

while (MSB of u and l are both equal to b or E3 condition holds)

if (MSB of u and l are both equal to b)

{

send b (where b is a bit)

shift l to the left by 1 bit and shift 0 into LSB

shift u to the left by 1 bit and shift 1 into LSB

while(Scale3 > 0)

{

send complement of b

decrement Scale3

}

}

if (E3 condition holds)

{

shift l to the left by 1 bit and shift 0 into LSB

shift u to the left by 1 bit and shift 1 into LSB

complement (new) MSB of l and u

increment Scale3

}.

This algorithm lends itself to a simple circuit implementation. There is only a bit check and shift operation to be performed.

Before going into video examples, a few notes on implementation would help.

Let the number of bits needed for tag representation is 6, the algorithm initializes with l= 000000 and u= 111111. After reading from the input sequence a symbol, we have to recalculate intervals using Eqs. (**).

Few scenarios would help with understanding the implementation:

Let the new interval bounds are l=001001 and u =010111. Both l and u are in the low half of the unit interval, so we output 0 that indicates the low interval. At this point we have to perform re-scaling of the interval using the mapping function E1. In integer implementation, the equivalent mapping is by shifting left the binary sequence with shift-in from the right 0 for l and 1 for u. That is, the new interval bounds are l= 010010 and u=101111. The interval belongs to [0.25, 0.75) that can be recognized by two most significant bits being 01 (0.25), which with the binary point added is 0.01 and 10, that is 0.102= 0.510. Thus, we have to perform E3 mapping. This is performed by flipping the second bit in both l and u, and shift left both with shift-in bits 0 and 1 for low and upper bounds, respectively. The counter for E3=1. New interval values are l= 000100 and u=111111. Now, because the interval straddles the midpoint, we are ready to get another symbol. Assume, that after calculating new intervals, l=001001 and u=001111. As this is an indication of the low half of the unit interval, we output 0, and because E3 counter is 1, we then output 1 (opposite to 0). New interval is l= 010010 and u= 011111. It is in the low half. Output 0, shift again: l=100100 and u=111111. Now, it is in the upper half, so we output 1 (shift left). New intervals are 001001 and u 111111. Thus, weare ready to get another symbol. If it were a last symbol, we had to send the tag value, and output 1 (meaning 0.5). Aa a matter of fact, we can take any value in the interval, and because the low limit is not included in the previous intervals, we can use that as a tag. So, we take l. This is a great simplification for the algorithm. Do not be confused with this change. All before, that is taking the midpoint of the interval has been made for proving the theorems. Once it is proved, and we understand that any value in the interval is a valid code, we can use this fact for implementation.

Scaling the interval of coding

As you could have noticed, the Tag (code) is calculated, when the last symbol of a sequence is received from the source. It would be advantageous to start coding immediately with the first symbol of the source. We can produce a number of bits earlier than the last symbols of the sequence is received. This type of coding is called incremental.

There are few issues though, for example, the precision problem. The interval of coding for large sequences can shrink to a very small value and would need an infinitely large number of bits for encoding. The solution for this is to re-scale the interval whenever there is a "danger" of interval shrinking. This is understood particularly when the binary numbers are considered. In the previous discussion we have already used only integers (discarded the binary point). The binary number then, can be a very large.

The MSB of the tag determines the half of the unit interval:

MSB=1 --> Tag 0.5

MSB =0 --> Tag <0.5

Example:

0.10000.0 is 0.5;

0.01=0.25

In the coding procedure, we will generate a single bit indicating whether the tag is in the upper half, that is in [ 0.5, 1.0), or in the lower half, that is, in [0, 0.5). Produce the MSB bit indicating which half the interval the tag belongs and then ignore a half of the unit interval and concentrate on the interval containing the tag. Re-scale the half of the interval containing the tag to the full unit, that is [0,1) by doubling the interval to the full unit value as follows:

for the lower half.

for the upper half.

What if the interval straddles the midpoint? Consider the tag in the interval [0.25, 0.75), that is l(n) > 0.25 and u(n)= < 0.75. In the following figure, for A={a,b,c}; p(a)=p(b)=0.25 and p(c)=0.5, you can observe how fast the length of the tag increases, when the interval shrinks about the midpoint.

When this happens, we double the tag interval using another mapping function, E3.

;

If E3 re-scaling is due, increment a counter; If E3 is to be done a number of times, keep incrementing, and send/store nothing. Thereafter, if E1 or E2 re-scaling takes place, output a number of 0s or 1s equal to the E3 counter value, depending of which interval we are in now, decrement the counter after each bit is sent.

Examples:

;

If no re-scaling occurs of E1 or E2 types, output the tag value which is 0.5, that is 1 and then append the number of zeroes corresponding to the E3 counter value.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!