ABSTRACT

Fixed Length CBC is the simplest method for mapping an alphabet to binary. By assigning every character a bitstring of the exact same length, we ensure that decoding is a simple matter of “chopping” the binary stream into uniform blocks.


The Core Logic

In this scheme, we treat each character as an independent unit.

  1. Count the number of unique characters in the alphabet ($n$).
  2. Calculate the uniform bit length ($k$) needed to provide a unique binary address for each character.
  3. Map each character to a unique binary sequence of length $k$.
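The three steps above can be sketched in Python as follows. The function name `build_fixed_length_code`, the variable names `n` and `k`, and the choice to hand out codewords in sorted order are assumptions made for this illustration; any one-to-one assignment of equal-length bitstrings would serve just as well.

```python
def build_fixed_length_code(alphabet):
    """Return a dict mapping each distinct character to a k-bit string."""
    # Step 1: count the unique characters.
    symbols = sorted(set(alphabet))
    n = len(symbols)
    if n == 0:
        raise ValueError("alphabet must contain at least one character")

    # Step 2: smallest k with 2**k >= n, using integer arithmetic only.
    # A one-character alphabet would give k = 0, so this sketch uses at least 1 bit.
    k = max(1, (n - 1).bit_length())

    # Step 3: give the i-th character the k-bit binary form of i.
    return {ch: format(i, f"0{k}b") for i, ch in enumerate(symbols)}


table = build_fixed_length_code("ABCDEF")
```

Here `table` maps each of the six letters to its own 3-bit string.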

The Minimum Bits Formula

If an alphabet has $n$ characters, the minimum number of bits per character is:

$$k = \lceil \log_2 n \rceil$$

TIP

This formula ensures we have enough “slots” ($2^k$, where $k$ is the bit length) to cover all $n$ characters. If $2^k > n$, some bitstrings will simply remain unused (e.g., in an alphabet of 6, the strings 110 and 111 are “wasted” if using 3-bit encoding).
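As a quick check of the formula and of the “wasted” slots, here is a small Python sketch. The helper names `min_bits` and `unused_codewords` are hypothetical, and the sketch assumes that codewords are assigned in counting order (0, 1, 2, ...), so the unused ones are those left at the top of the range.

```python
def min_bits(n):
    """Smallest k such that 2**k >= n, i.e. the ceiling of log2(n)."""
    if n < 1:
        raise ValueError("alphabet size must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) and avoids floating-point rounding.
    return (n - 1).bit_length()


def unused_codewords(n):
    """Bitstrings of length min_bits(n) that no character is assigned to."""
    k = min_bits(n)
    return [format(value, f"0{k}b") for value in range(n, 2 ** k)]
```

For example, `min_bits(6)` is 3 and `unused_codewords(6)` is `['110', '111']`, while an alphabet whose size is an exact power of two leaves nothing unused.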


Examples & Deep Dive

1. The Alphabet $\{A, B, C, D, E, F\}$

For an alphabet of 6 characters: $\lceil \log_2 6 \rceil = 3$ bits per character.

Letter  Binary
A       000
B       001
C       010
D       011
E       100
F       101

Encoding “BADD”:

  • B A D D → 001 000 011 011 (Total 12 bits)
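The encoding and the “chopping” decode can be sketched in Python using the table above. The names `CODE`, `K`, `encode`, and `decode` are chosen for this illustration only.

```python
# Codeword table for the six-letter example alphabet.
CODE = {"A": "000", "B": "001", "C": "010", "D": "011", "E": "100", "F": "101"}
K = 3  # bits per character


def encode(text):
    """Concatenate the fixed-length codeword of every character."""
    return "".join(CODE[ch] for ch in text)


def decode(bits):
    """Chop the bitstring into K-bit blocks and look each block up."""
    if len(bits) % K != 0:
        raise ValueError("bit length is not a multiple of the block size")
    inverse = {codeword: ch for ch, codeword in CODE.items()}
    return "".join(inverse[bits[i:i + K]] for i in range(0, len(bits), K))


encoded = encode("BADD")
```

Here `encoded` is the 12-bit string `001000011011`, and `decode(encoded)` returns `BADD`. A block equal to one of the unused strings (110 or 111) would raise a `KeyError` in this sketch.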

2. Industry Standard: ASCII

ASCII is the most famous Fixed Length CBC. Standard ASCII defines 128 characters (uppercase, lowercase, digits, punctuation, and control codes) using 7-bit codes. In practice each code is stored in one 8-bit byte, and 8-bit extensions such as ISO 8859-1 use the extra bit to reach 256 code points.

  • Even a simple character like !, which could theoretically be represented with fewer bits in a tiny alphabet, still takes up the same fixed width as every other character (one full byte as stored) to maintain the fixed-length property.
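To see the fixed width in practice, the following Python sketch prints each character of a string as the byte that stores it. The helper name `ascii_bits` is hypothetical; `str.encode("ascii")` and `format(..., "08b")` are standard Python.

```python
def ascii_bits(text):
    """Return the 8-bit binary form of each ASCII character in text."""
    # str.encode("ascii") raises UnicodeEncodeError for non-ASCII characters.
    return [format(byte, "08b") for byte in text.encode("ascii")]


blocks = ascii_bits("Hi!")
```

Here `blocks` is `['01001000', '01101001', '00100001']`: every character occupies one byte, and the leading bit is 0 for each because ASCII code points stay below 128.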

The Efficiency Trade-off

The “Waste” of Independence

Fixed Length CBC is often sub-optimal because it treats every character as a separate entity rather than looking at the string as a whole.

For example, a 4-letter string using the 6-character alphabet $\{A, B, C, D, E, F\}$ has $6^4 = 1296$ possible combinations.

  • Fixed Length CBC: Always uses $4 \times 3 = 12$ bits.
  • Theoretical Optimum: $\lceil \log_2 1296 \rceil = 11$ bits.

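This “lost bit” occurs because Fixed Length CBC rounds up to a whole number of bits for every character separately: each of the 6 letters carries only $\log_2 6 \approx 2.585$ bits of information but is stored in 3, and that rounding waste accumulates across the positions of the string. To reach that 11-bit optimum, you would need to encode the entire string at once using Strings as Integers.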
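The gap between 12 and 11 bits can be checked with a short Python sketch. Strings as Integers is not defined in this section, so the sketch assumes it means reading the string as a base-6 number over the alphabet A to F; the names `ALPHABET`, `pack`, and `unpack` are illustrative only.

```python
ALPHABET = "ABCDEF"
BASE = len(ALPHABET)


def pack(text):
    """Encode the whole string as one integer, written in the fewest bits
    that can hold every string of the same length."""
    if not text:
        return ""
    value = 0
    for ch in text:
        value = value * BASE + ALPHABET.index(ch)  # base-6 positional value
    width = (BASE ** len(text) - 1).bit_length()   # ceil(log2(6**len))
    return format(value, f"0{width}b")


def unpack(bits, length):
    """Invert pack(); the string length must be known to the decoder."""
    value = int(bits, 2) if bits else 0
    chars = []
    for _ in range(length):
        value, digit = divmod(value, BASE)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


packed = pack("BADD")
```

Here `packed` is 11 bits long, one fewer than the 12 bits of the per-character encoding, and `unpack(packed, 4)` recovers `BADD`. The price is that the decoder can no longer chop the stream into independent blocks.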


Pros and Cons

Strengths

  • Instant Decoding: No need to look ahead; just read $k$ bits at a time.
  • Random Access: You can jump to the $i$-th character easily by calculating the bit offset ($i \times k$, counting characters from 0).
  • Robustness: A single bit error only affects one character, not the entire subsequent string.

Weaknesses

  • Space Inefficiency: Often uses more bits than the theoretical optimum.
  • Uniformity Trap: Uses the same space for frequent characters (like ‘E’) as it does for rare ones (like ‘Z’).
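The random-access and robustness properties can be illustrated with a short Python sketch over the six-letter example code. The names `CODE`, `K`, `char_at`, and `flip_bit` are hypothetical, and characters are counted from 0.

```python
CODE = {"A": "000", "B": "001", "C": "010", "D": "011", "E": "100", "F": "101"}
INVERSE = {codeword: ch for ch, codeword in CODE.items()}
K = 3  # bits per character


def char_at(bits, i):
    """Read character i directly: its block starts at bit offset i * K."""
    start = i * K
    return INVERSE[bits[start:start + K]]


def flip_bit(bits, position):
    """Simulate a single bit error at the given bit position."""
    flipped = "1" if bits[position] == "0" else "0"
    return bits[:position] + flipped + bits[position + 1:]


bits = "001000011011"            # the encoding of "BADD"
damaged = flip_bit(bits, 4)      # corrupt one bit inside the second block
decoded = "".join(char_at(damaged, i) for i in range(len(damaged) // K))
```

Here `char_at(bits, 2)` returns `D` without reading the earlier blocks, and `decoded` is `BCDD`: only the second character changed. This covers flipped bits only; an inserted or dropped bit would shift every later block. A flip can also produce an unused string such as 110, which this sketch would report as a `KeyError`.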