LZ77 compression

The first of the Lempel-Ziv substitutional compression schemes, proposed in 1977. LZ77 compression keeps track of the last n bytes of data seen, and when a phrase is encountered that has already been seen, it outputs a pair of values corresponding to the position of the phrase in the previously-seen buffer of data, and the length of the phrase. In effect the compressor moves a fixed-size "window" over the data (generally referred to as a "sliding window"), with the position part of the (position, length) pair referring to the position of the phrase within the window.

The most commonly used algorithms are derived from the LZSS scheme described by James Storer and Thomas Szymanski in 1982. In this scheme the compressor maintains a window of size N bytes and a "lookahead buffer", the contents of which it tries to find a match for in the window:

 while (lookAheadBuffer not empty)
 {
     get a pointer (position, length) to the longest match in
     the window for the lookahead buffer;

     if (length > MINIMUM_MATCH_LENGTH)
     {
       output a (position, length) pair;
       shift the window length characters along;
     }
     else
     {
       output the first character in the lookahead buffer;
       shift the window 1 character along;
     }
 }
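
The loop above can be sketched in Python as follows. This is an illustrative sketch rather than any particular archiver's implementation: the names lzss_encode, WINDOW_SIZE, LOOKAHEAD and MIN_MATCH and their values are assumptions, the match search is a deliberately naive linear scan, and the "position" is expressed as a distance back from the current input position instead of an offset within the window.

```python
# Assumed, illustrative parameters (not taken from any specific format).
WINDOW_SIZE = 4096   # N: how far back a match may start
LOOKAHEAD = 18       # maximum match length considered
MIN_MATCH = 2        # a pair is emitted only when length > MIN_MATCH

def lzss_encode(data, window_size=WINDOW_SIZE, lookahead=LOOKAHEAD,
                min_match=MIN_MATCH):
    """Return a list of tokens: an int for a literal byte, or a
    (distance, length) tuple for a match found in the window."""
    tokens = []
    i = 0
    while i < len(data):                      # lookahead buffer not empty
        start = max(0, i - window_size)       # left edge of the window
        max_len = min(lookahead, len(data) - i)
        best_len, best_dist = 0, 0
        for j in range(start, i):             # naive longest-match search
            length = 0
            while length < max_len and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len > min_match:
            tokens.append((best_dist, best_len))
            i += best_len                     # shift by the match length
        else:
            tokens.append(data[i])            # emit one literal byte
            i += 1                            # shift by one character
    return tokens
```

Note that a match is allowed to run past the current position and overlap the lookahead buffer, which is how a run such as "abcabcabc" collapses into three literals and a single pair. Practical compressors replace the linear scan with hash chains or trees.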

Decompression is simple and fast: whenever a (POSITION, LENGTH) pair is encountered, go to that POSITION in the window and copy LENGTH bytes to the output.
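
A minimal sketch of such a decoder, under stated assumptions: the name lzss_decode is hypothetical, the token stream is a list in which an int is a literal byte and a (distance, length) tuple is a match, and the position is given as a distance back from the end of the output produced so far.

```python
def lzss_decode(tokens):
    """Rebuild the original bytes from literal and (distance, length) tokens."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            # Copy one byte at a time so that a match may overlap the
            # bytes it is itself producing (e.g. distance 1, length 3).
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(tok)                   # literal byte
    return bytes(out)
```

For example, lzss_decode([97, 98, 99, (3, 6)]) rebuilds b"abcabcabc". The byte-by-byte copy matters: a block copy would fail whenever the length exceeds the distance.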

Sliding-window-based schemes can be simplified by numbering the input text characters mod N, in effect creating a circular buffer. The sliding window approach automatically creates the LRU effect which must be done explicitly in LZ78 schemes.

Variants of this method apply additional compression to the output of the LZSS compressor. These include a simple variable-length code (LZB), dynamic Huffman coding (LZH), and Shannon-Fano coding (ZIP 1.x), all of which result in a certain degree of improvement over the basic scheme, especially when the data are rather random and the LZSS compressor has little effect.

An algorithm was developed which combines the ideas behind LZ77 and LZ78 to produce a hybrid called LZFG. LZFG uses the standard sliding window, but stores the data in a modified trie data structure and produces as output the position of the text in the trie. Since LZFG only inserts complete *phrases* into the dictionary, it should run faster than other LZ77-based compressors.
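
The mod-N numbering mentioned above can be sketched as a small ring buffer. The class name RingWindow and its methods are hypothetical, chosen only for illustration: each input byte at absolute position p is stored in slot p mod N, so the oldest byte is overwritten automatically once the window is full.

```python
class RingWindow:
    """Sliding window of the last n bytes, stored in a circular buffer."""

    def __init__(self, n):
        self.n = n
        self.buf = bytearray(n)
        self.count = 0                # total number of bytes pushed so far

    def push(self, byte):
        # Slot index is the absolute position taken mod n; the oldest
        # byte is discarded implicitly when its slot is reused.
        self.buf[self.count % self.n] = byte
        self.count += 1

    def get(self, pos):
        """Return the byte at absolute input position pos, if it is
        still inside the window; raise IndexError otherwise."""
        oldest = max(0, self.count - self.n)
        if not (oldest <= pos < self.count):
            raise IndexError("position has slid out of the window")
        return self.buf[pos % self.n]
```

With n = 4, pushing the six bytes of b"abcdef" leaves positions 2 to 5 readable, while positions 0 and 1 have been overwritten without any explicit eviction step.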

All popular archivers (arj, lha, zip, zoo) are variations on LZ77.

[comp.compression FAQ].

Last updated: 1995-04-07

LZ78 compression

A substitutional compression scheme which works by entering phrases into a dictionary and then, when a reoccurrence of that particular phrase is found, outputting the dictionary index instead of the phrase. Several algorithms are based on this principle, differing mainly in the manner in which they manage the dictionary.
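
As an illustration, a minimal LZ78-style encoder might look like the sketch below. The function name lz78_encode is hypothetical, and the output format is an assumption based on the classic formulation, in which each step emits the index of the longest known phrase (0 for the empty phrase) together with the character that follows it. Dictionary management here is the simplest possible: the dictionary grows without bound.

```python
def lz78_encode(text):
    """Return a list of (dictionary index, next character) pairs."""
    dictionary = {}                   # phrase -> index, indices start at 1
    output = []
    phrase = ""
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            phrase = candidate        # keep extending the current phrase
        else:
            # Emit the index of the known prefix plus the new character,
            # then enter the extended phrase into the dictionary.
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[candidate] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Input ended in the middle of a known phrase: emit its index alone.
        output.append((dictionary[phrase], ""))
    return output
```

For the input "abababa" this produces [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'a')]: the phrases "a", "b", "ab" and "aba" are entered into the dictionary in turn.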

The best-known Lempel-Ziv scheme is Terry Welch's Lempel-Ziv Welch variant of LZ78.

[comp.compression FAQ].

LZ compression

Lempel-Ziv compression

lzexe

An executable file compression utility for MS-DOS. It adds a minimal header to the executable to decompress it when it is executed. See also pklite.

lzh

<filename extension>

The filename extension for a file produced by the LHA program.

Last updated: 1995-04-03

LZH compression

<algorithm>

(After Lempel-Ziv and Haruyasu, the inventors) A compression algorithm derived from the LZSS scheme with a sliding window and additional compression applied to the output of the LZSS compressor by dynamic Huffman coding.

Last updated: 1995-04-07

LZW compression

Lempel-Ziv Welch compression
