I am given a message encoded with a non-optimal Huffman code. I need to decode the message and encode it again, this time with an optimal Huffman code, so that afterwards I can compute average_number_of_bits_per_symbol = num_of_bits_in_message / num_of_characters_in_message
Example:
input:
6
A 11
B 10
C 01
D 0001
E 001
F 0000
101100000001000010001001100011000000010001010010000000100001000110001100011000000010000000000000100110001000100000000000000011011111010100100101011001010000111001010110000100000010010000000111010000001100100100001010001000100000000001
output:
2.469
Here is what I got from decoding:
// The third column is the number of occurrences of each letter
A 11 5
B 10 19
C 01 12
D 0001 8
E 001 18
F 0000 19
BAFDFBEEBEBFEDCEFDFBEBEBEBFEFFFCEBEDFFFDBAABBBCECCBCCFABCCCBDFEEFDACFEBCEFBBEDFFE
num_of_characters: 81
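For reference, this is roughly how I did the decoding (a minimal Python sketch; the code table and bit string are copied from the input above). Since the given code is a prefix code, walking the bit string and emitting a symbol as soon as the accumulated prefix matches a codeword is enough:

```python
from collections import Counter

# Code table from the input (symbol -> codeword).
codes = {"A": "11", "B": "10", "C": "01", "D": "0001", "E": "001", "F": "0000"}

# The encoded message, copied verbatim from the input.
bits = "101100000001000010001001100011000000010001010010000000100001000110001100011000000010000000000000100110001000100000000000000011011111010100100101011001010000111001010110000100000010010000000111010000001100100100001010001000100000000001"

# Invert the table and scan the bits; because no codeword is a prefix
# of another, the first match is always the intended symbol.
decode = {v: k for k, v in codes.items()}
out, prefix = [], ""
for b in bits:
    prefix += b
    if prefix in decode:
        out.append(decode[prefix])
        prefix = ""

message = "".join(out)
print(len(message))        # number of characters in the message
print(Counter(message))    # occurrences of each letter
```

Running this reproduces the 81-character message and the per-letter counts in my table.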
So now I have to find optimal codes for these letters. I've tried a tree like this one:
 19   19   18   12    8    5
  B    F    E    C    D    A
   \  /      \  /      \  /
   0  1      0  1      0  1
  (38)      (30)      (13)
    \          \       /
     \         0      1
      \         \    /
       0         (43)
        \        /
         \      1
          \    /
           (81)
And with this I will get:
A 111 => 3 * 5 = 15
D 110 => 3 * 8 = 24
C 101 => 3 * 12 = 36
E 100 => 3 * 18 = 54
F 01 => 2 * 19 = 38
B 00 => 2 * 19 = 38
num_of_bits = 205
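That arithmetic can be double-checked with a small script (a minimal sketch; the code lengths and letter counts are copied from my table above):

```python
# Code lengths from my tree and letter counts from the decoded message.
lengths = {"A": 3, "D": 3, "C": 3, "E": 3, "F": 2, "B": 2}
counts  = {"A": 5, "D": 8, "C": 12, "E": 18, "F": 19, "B": 19}

# Total bits = sum over symbols of (codeword length * occurrences).
total_bits = sum(lengths[s] * counts[s] for s in counts)
print(total_bits)              # 205, as computed by hand above
print(total_bits / 81)         # average bits per symbol
```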
So 205 / 81 ≈ 2.531, which does not match the example output...
So what am I doing wrong?
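For comparison, here is a sketch of the textbook greedy construction (repeatedly merge the two lowest-weight nodes, e.g. with heapq, and every merge pushes all symbols under the two merged nodes one level deeper). I suspect my hand-built tree above does not follow this merge order:

```python
import heapq

counts = {"A": 5, "B": 19, "C": 12, "D": 8, "E": 18, "F": 19}

# Heap entries: (weight, tiebreaker, set of symbols in this subtree).
heap = [(w, i, {s}) for i, (s, w) in enumerate(sorted(counts.items()))]
heapq.heapify(heap)

depth = {s: 0 for s in counts}   # depth[s] becomes the codeword length of s
tie = len(heap)
while len(heap) > 1:
    w1, _, s1 = heapq.heappop(heap)   # two lightest subtrees
    w2, _, s2 = heapq.heappop(heap)
    for s in s1 | s2:
        depth[s] += 1                 # one level deeper after the merge
    heapq.heappush(heap, (w1 + w2, tie, s1 | s2))
    tie += 1

total_bits = sum(depth[s] * counts[s] for s in counts)
print(depth)
print(total_bits, round(total_bits / 81, 3))
```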