High-Performance Concurrency Control Mechanism for Main-Memory Databases

This paper describes the concurrency control mechanisms of the Hekaton storage engine of Microsoft SQL Server, one is optimistic and one is pessimistic. The authors also implemented a simple single-version locking concurrency control method optimized for in-memory databases, and report performance comparisons of the three.

The authors found that single-version locking is very fragile, in that it works well for short transactions and low contention, but its performance degrades rapidly under high contention, or if there is even one long running transaction. On the contrary, multi-version concurrency control method perform well under high contention and when there are long running transactions. The optimistic method consistently performs better than the pessimistic method.

The authors first experimented with homogeneous workload of short update transactions, which represents an extreme update-heavy scenario, and varied the multi-programming level of the system.

The pessimistic method, under low contention and read committed isolation level, has 30% lower performance than the optimistic method (because of extra writes for tracking dependencies and locks, which cause increased memory traffic. It takes pessimistic method 20% more cycles to execute the same number of instructions, and the additional control logic translates into 10% more instructions per transaction. Under high contention, the optimistic method performed the best throughput, with pessimistic method close behind. Single-verison locking achieved good throughput numbers initially but did not scale beyond 8 threads.

If we fix the contention level (low, 24 threads) and transaction workload type but vary the isolation level, optimistic MVCC’s performance degrades the fastest, followed by pessimistic MVCC and the single-version locking’s performance reduces only by 1.8%.

Screen Shot 2020-04-21 at 11.13.18 AM.png

Then the authors experimented with heterogeneous workload, fixing the multi-programming level and varying the percentage of short and long read-only transactions in the mix.

Starting with short read transactions, under low contention, single-version locking out-performed both MV until 90% read-only workload, at which point it was surpassed by both MV schemes. Under high contention MVCC schemes start to show more advantage as percentage of read-only transactions increases. When 80% of the transactions are read-only, two MVCC schemes achieve 63% and 73% high throughput than 1V.

If we mix in long-running read transactions instead, the advantage of MVCC schemes shows even more. 1V achieves twice the update throughput of the MV schemes when there is 0 long read transactions. However mixing in even one long read transaction, 1V’s update throughput drops by 75%, in contrast to only 5% drop of MV schemes. When 50% of the active transactions are long readers, MV has 80x high update throughput than 1V. In terms of throughput, MV schemes consistently outperform 1V as the gap widens as the number of long readers increases.

TATP benchmark simulates a telecommunications application. 80% of the transactions are queries, 16% of them are updates, 2% are inserts, and 2% are deletes. 1V achieved 4,220,119 txn/s, pessmisitic MV achieved 3,129,816 txn/s and optimistic MV achieved 3,121,494 txn/s.

I am only summarizing the performance results here, for detailed description of the two multi-version concurrency control implementations, I suggest that readers go to the original paper.