Run Length Encoding

From MultimediaWiki
Jump to: navigation, search

Run length encoding (RLE) is perhaps the simplest compression technique of all. The basic idea behind this concept is to encode information about runs of identical numbers rather than encode the numbers themselves.

Theory

Note that this falls into the category of lossless compression algorithms. The decoded data will be the exact same as the original data.

As a basic example, consider the following string of numbers:

5 5 5 5 8 8 8 2 2 2 2 2

There is a fair amount of redundancy there. In RLE notation, this same string could be expressed as:

4 5 3 8 5 2

This indicates a run of 4 ‘5′ numbers, followed by a run of 3 ‘8′ numbers, followed by a run of 5 ‘2′ numbers. Suppose that each number was represented by a byte on disk. The original encoding requires 12 bytes. The RLE encoding requires 6 bytes. 2:1 compression in this simple case– not bad. But consider the following number string:

1 2 3 4 5 6

Apply the same RLE compression scheme as before:

1 1 1 2 1 3 1 4 1 5 1 6

So the original string was 6 bytes and the RLE string is 12 bytes. That did not go too well as the compression ratio is now 1:2. This is why pure RLE implementations have a mode for encoding strings of dissimilar numbers as well. It usually works by encoding a negative number that, when negated, will give the number of succeeding bytes that are not RLE data. Since the above example contains a string of 6 dissimilar numbers, the encoding would be:

-6 1 2 3 4 5 6

A decoder would see that the RLE code is negative, negate it to get 6, and then copy 6 bytes from the encoded byte stream to the decoded byte stream.

Using the basic concepts described so far, let’s decode the following byte stream:

4 5 -6 1 2 3 4 5 6 3 8 5 2

This is a combination of examples from above and decodes to:

5 5 5 5 1 2 3 4 5 6 8 8 8 2 2 2 2 2

Work through it by hand as necessary. So the encoded string is 13 bytes and the decoded string is 18 bytes. Modest compression. RLE compression naturally works best where there are long runs of the same pixel value. At worst, there will be no compression if there are almost no pairs of adjacent pixels with the same value.

Run Level Coding

Closely related to RLE is run level (RL) coding. There are various codecs that, during the coding process, produce long runs of zeros between non-zero numbers. Consider this string:

0 0 0 15 0 9 0 0 0 0 0 0 0 0 12

The non-zero numbers represent levels. Instead of coding each of those numbers independently, RL coding encodes such strings as (run, level), e.g.:

(3, 15), (1, 9), (8, 12)

A run of 3 zeros is followed by a level of 15, and so on. The application of this technique will be examined in later articles.

What A Multimedia Hacker Needs To Know

RLE was often seen in older multimedia compression schemes. It was often used by itself in order to compress palettized data. Notable examples include:

Also, numerous game-related formats developed in the mid-1990s used variations of this algorithm.