LZ77 and LZ78
LZ = Lempel Ziv
Replaces repeated occurrences of data with references.
The algorithm is based on the frequency of occurrence of the data item(byte). The most frequent data items will represented and encoded with a lower number of bits.
The main idea of the algorithm is create a binary tree, called Huffman tree, based on the bytes frequency on the data, where the leafs are the bytes symbols, and the path from the root to a leaf determines the new representation of that leaf byte.
Using the text ABCBAACD as example and applying those steps, we have the following tree:
So the new representation of the bytes on the text are:
- A: 0
- B: 10
- C: 111
- D: 110
The original representation has 8 bytes(64 bits) and the new representation have only 9 bits, that is 86% smaller than the original.