High-Performance Concurrency Control Mechanism for Main-Memory Databases

This paper describes the concurrency control mechanisms of the Hekaton storage engine of Microsoft SQL Server, one is optimistic and one is pessimistic. The authors also implemented a simple single-version locking concurrency control method optimized for in-memory databases, and report performance comparisons of the three.

The authors found that single-version locking is very fragile, in that it works well for short transactions and low contention, but its performance degrades rapidly under high contention, or if there is even one long running transaction. On the contrary, multi-version concurrency control method perform well under high contention and when there are long running transactions. The optimistic method consistently performs better than the pessimistic method.

The authors first experimented with homogeneous workload of short update transactions, which represents an extreme update-heavy scenario, and varied the multi-programming level of the system.

The pessimistic method, under low contention and read committed isolation level, has 30% lower performance than the optimistic method (because of extra writes for tracking dependencies and locks, which cause increased memory traffic. It takes pessimistic method 20% more cycles to execute the same number of instructions, and the additional control logic translates into 10% more instructions per transaction. Under high contention, the optimistic method performed the best throughput, with pessimistic method close behind. Single-verison locking achieved good throughput numbers initially but did not scale beyond 8 threads.

If we fix the contention level (low, 24 threads) and transaction workload type but vary the isolation level, optimistic MVCC’s performance degrades the fastest, followed by pessimistic MVCC and the single-version locking’s performance reduces only by 1.8%.

Screen Shot 2020-04-21 at 11.13.18 AM.png

Then the authors experimented with heterogeneous workload, fixing the multi-programming level and varying the percentage of short and long read-only transactions in the mix.

Starting with short read transactions, under low contention, single-version locking out-performed both MV until 90% read-only workload, at which point it was surpassed by both MV schemes. Under high contention MVCC schemes start to show more advantage as percentage of read-only transactions increases. When 80% of the transactions are read-only, two MVCC schemes achieve 63% and 73% high throughput than 1V.

If we mix in long-running read transactions instead, the advantage of MVCC schemes shows even more. 1V achieves twice the update throughput of the MV schemes when there is 0 long read transactions. However mixing in even one long read transaction, 1V’s update throughput drops by 75%, in contrast to only 5% drop of MV schemes. When 50% of the active transactions are long readers, MV has 80x high update throughput than 1V. In terms of throughput, MV schemes consistently outperform 1V as the gap widens as the number of long readers increases.

TATP benchmark simulates a telecommunications application. 80% of the transactions are queries, 16% of them are updates, 2% are inserts, and 2% are deletes. 1V achieved 4,220,119 txn/s, pessmisitic MV achieved 3,129,816 txn/s and optimistic MV achieved 3,121,494 txn/s.

I am only summarizing the performance results here, for detailed description of the two multi-version concurrency control implementations, I suggest that readers go to the original paper.


OLTP Through the Looking Glass, and What We Found There

This paper from 2008 provides an instruction-level breakdown of the performance impact of each major component of a OLTP database, running a subset of TPC-C.

Since the invention of transactional databases in the 1970s and 1980s, a few things have changes:

  1. Hardware improvements. CPUs are faster so that each transaction now only takes a few microseconds to complete. Memory is getting cheaper so that most OLTP databases fit in memory.
  2. Modern internet applications and data intensive applications no long require full standard database features, and can live with varying levels of consistency, reliability, concurrency, replication and query-ability.

Some alternative database designs are now worth considering, such log-less databases, single-threaded databases, and transaction-less databases.

The authors started with a system called Shore, and progressively striped down each of the following components of it, to identify the overhead of each one. The benchmark they run were NewOrder and Payment transactions from TPC-C.

  • Logging
  • Locking
  • Latching
  • Buffer management

The detailed instruction count breakdown was given by figure 5 and figure 6 of the paper.

Screen Shot 2020-04-14 at 10.34.33 AM.png

Buffer management and locking operations are the most significant contributors to performance overhead, followed by logging and latching operations.

The authors concluded that unless one strips down all of these components, it is hard to get a main-memory optimized database that performs much better than a traditional database where most data fits into RAM. But if one does strip down all of the components, i.e, to design a single-threaded, in-memory, transaction-less databases that uses copy-state-over-network recovery, the performance will be orders of magnitude better than the system one starts with.