Time-series storage design (part two)

NOTE: this article is out of date. Akumuli doesn’t work this way anymore.

Let’s talk about compression algorithms. You probably know that DBMS can be row-oriented or column-oriented. Both have some strengths and weaknesses. Row-orientation has higher write-throughput and not so great compression. Column-orientation on the other hand has lower write-throughput and better compression. This is achieved because the data in each column is uniform. We can compress all the timestamps using one algorithm and all the IDs using another one that’s better suited for IDs.

Akumuli is row-oriented in the sense that every chunk is a row. But on the other hand every chunk is column-oriented. This gives us both high write-throughput and good a compression ratio.

Compression algorithms

Akumuli uses several compression algorithms for different column types.

  • Timestamps are compressed using delta encoding followed by run-length encoding (RLE). This is great because timestamps are always increasing thus can be easily compressed by delta-encoding. Time-series data often has periodic nature. In this case delta encoding will give us a series of equal numbers that can be compressed by RLE very efficiently! Delta-RLE encoded data then gets encoded by the LEB128 algorithm. (The implementation can be found here).
  • IDs are compressed using LEB128 encoding. In this scheme small IDs (less then 128) will occupy a single byte and larger IDs will occupy two or more bytes.
  • Data offsets (used for binary BLOBs) are compressed using delta encoding followed by ZigZag encoding, RLE and finally LEB128 encoding. We need ZigZag encoding because offsets can go out of order sometimes and delta-encoding can produce negative numbers. For numeric values this column contains zeroes (most values in the chunk should be numeric) thus can be compressed by the RLE encoder very well.
  • The most interesting part is the one of numeric values! Let’s look at it more closely.

Numeric data compression algorithm

My compression algorithm is based on this paper (you can find PDF online). If we have an array of floating-point values we can’t delta-encode them because otherwise precision would be lost. Instead we can XOR two adjacent values and get a binary diff. In the case of numeric time series we will get some 64bit number with leading zeroes. This number can be encoded using a very small amount of memory.

Akumuli uses four bits to represent the length of a number and sequence of four bit numbers to encode its binary diff. The aforementioned paper describes a slightly more complex algorithm which uses forecasting to predict a new value. The predicted and actual value are then used to calculate the binary diff. This diff should be smaller than the diff between the previous and new value. I am not using this because it’s harder to implement and my simple algorithm already gives very good results.

The floating-point compression algorithm implementation can be found here. It uses the __builtin_clzl compiler intrinsic function to count leading zeroes for better performance.

Lies and benchmarks

In my tests this specialized compression algorithms gives up to 50% better compression ratio than a generic compression algorithm (gzip).