
Data Compression

Here are some of my data compression research models, compared against the popular compressors gzip and 7-zip. Click on the names to download them as command-line tools. All of these compressors are experimental, and none of them is compatible with the others. RH5 and its variants are designed for high speed and modest memory usage, and are meant to be practical solid multi-file archivers.

These benchmarks use enwik8, a 100 MB text file from the English-language Wikipedia. More information, along with benchmarks of some of my compressors, can be found on the Large Text Compression Benchmark.

Benchmarks (i7 2600, 4 GB memory)

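| Program | Algorithm | Compressed size (bytes) | Compression time (s) | Decompression time (s) | Compression memory (MB) | Decompression memory (MB) | Compression speed (MB/s) | Decompression speed (MB/s) |
|---|---|---|---|---|---|---|---|---|
| Original | | 100,000,000 | | | | | | |
| BTCM max | BWT + CM | 20,955,165 | 21.20 | 22.60 | 822 | 657 | 4.72 | 4.42 |
| BTCM 8 | BWT + CM | 23,786,763 | 17.26 | 16.80 | 52 | 42 | 5.79 | 5.95 |
| CM5 x64 | CM | 25,042,264 | 16.80 | 16.94 | 35 | 35 | 5.95 | 5.90 |
| 7-zip (normal) | LZMA | 25,899,684 | 72.00 | 1.40 | 186 | 18 | 1.39 | 71.43 |
| RH5ba_x64 max | LZMA | 27,510,180 | 17.00 | 4.00 | 130 | 47 | 5.88 | 25.00 |
| RH5_x64 max | LZ + ctx | 29,878,256 | 13.20 | 0.53 | 19 | 12 | 7.58 | 188.68 |
| ctxn (32-bit) | LZMA | 30,211,251 | 9.00 | 5.00 | 67 | 67 | 11.11 | 20.00 |
| RH4_x64 | ROLZ | 31,309,689 | 3.10 | 0.58 | 29 | 25 | 32.26 | 172.41 |
| RH5_x64 | LZ + ctx | 31,798,141 | 2.10 | 0.61 | 19 | 12 | 47.62 | 163.93 |
| RH5m_x64 | LZ + ctx | 33,638,243 | 3.40 | 0.69 | 2.9 | 1.8 | 29.41 | 144.93 |
| gzip -9 | LZ77 | 35,194,719 | 14.00 | 0.94 | 4 | 3 | 7.14 | 106.38 |
| gzip | LZ77 | 37,907,623 | 4.10 | 0.97 | 4 | 3 | 24.39 | 103.09 |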
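
The speed columns in the table appear to be derived from the timings: the 100 MB input divided by the elapsed seconds, taking 1 MB as 1,000,000 bytes. The sketch below reproduces that arithmetic for two rows. The function name `derived_metrics` and the extra `ratio` field are my own illustration, not part of any of the tools listed here.

```python
# Minimal sketch: recompute the speed columns from size and time.
# Assumption: 1 MB = 1,000,000 bytes, which matches the table's figures.

ORIGINAL_BYTES = 100_000_000  # size of enwik8


def derived_metrics(compressed_bytes, comp_seconds, decomp_seconds,
                    original_bytes=ORIGINAL_BYTES):
    """Return compression ratio and throughput for one benchmark row.

    `derived_metrics` is a hypothetical helper written for this example.
    """
    input_mb = original_bytes / 1_000_000
    return {
        # Compressed size as a fraction of the original (not in the table).
        "ratio": compressed_bytes / original_bytes,
        # Throughput is always measured against the uncompressed size.
        "comp_mb_per_s": round(input_mb / comp_seconds, 2),
        "decomp_mb_per_s": round(input_mb / decomp_seconds, 2),
    }


# (compressed bytes, compression seconds, decompression seconds),
# copied from the benchmark rows for RH5_x64 and gzip -9.
rows = {
    "RH5_x64": (31_798_141, 2.10, 0.61),
    "gzip -9": (35_194_719, 14.00, 0.94),
}

for name, (size, comp_s, decomp_s) in rows.items():
    m = derived_metrics(size, comp_s, decomp_s)
    print(f"{name}: ratio={m['ratio']:.4f}, "
          f"compress={m['comp_mb_per_s']} MB/s, "
          f"decompress={m['decomp_mb_per_s']} MB/s")
```

Both speeds are computed against the uncompressed size, so decompression speed here means output bytes produced per second rather than compressed bytes consumed.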