In Part I of this series, we reviewed the benefits that In-Memory OLTP may provide for workloads that are bound by insert performance.In part II, we took a look at the benefits it may offer for workloads that are less constrained by inserts, but require ultra-fast access to read-only, or read-mostly data. Today, we will look at the benefits that In-Memory OLTP can offer workloads that suffer from extreme contention between reads and writes.

Read / write contention consists of several levels. You are probably familiar with lock contention that may block readers from accessing data that is currently being modified by other, concurrent transactions or vice versa – transactions that try to modify data but must wait until concurrent transactions that read data complete and release the locks that are held. SQL Server uses these locks, (mainly exclusive and shared locks) to facilitate transaction isolation according (more or less) to ANSI definitions and to prevent certain read phenomenon and modification anomalies. The isolation level of the transaction determines how, which type, and for how long locks are held and may significantly impact the system’s ability to process concurrent transactions. There are several ways to minimize the impact of these blocking locks by optimizing data access patterns, code and indexes but they can’t be avoided completely. Since version 2005, SQL Server also offers a lock-free isolation mechanism known as row-versioning or snapshot isolation which eliminates the need for locks, but introduces its own overhead in the form of increased activity in TempDB. The latter is needed since the row-versioning was added on top of existing data storage structures which did not include the required architectural abilities to implement it.

However, there is another level of read-write contention which is completely separate from the locking mechanism known as latch contention. In an excellent whitepaper titled “Diagnosing and resolving latch contention in SQL Server” (http://download.microsoft.com/download/B/9/E/B9EDF2CD-1DBF-4954-B81E-82522880A2DC/SQLServerLatchContention.docx) latches are defined as “lightweight synchronization primitives that are used by the SQL Server engine to guarantee consistency of in-memory structures including; index, data pages and internal structures such as non-leaf pages in a B-Tree.”. In the simplest terms, when SQL Server needs to modify the content of a page, it needs to make sure no one else is reading it at the same time or conflicts may occur. This is also true for the reverse case of course. Unlike lock contention, this has nothing to do with the logical aspects of transaction isolation but is part of the way SQL Server handles data constructs internally. This contention can even occur for a single transaction that utilized parallel execution where multiple threads try to access or modify the same data constructs concurrently. This also means that it can’t be overcome by snapshot isolation. Latch contention is most prominent when there is a small ‘region’ of data that gets hit heavily by both readers and writers, such as the very common case of an ever-increasing column (identity for example) being used as the leading key in a clustered index. For these, data will always be inserted at the ‘end’, or the last page of the table while typically, being the most recent data, it will be needed by users for read purposes as well. This contention has been one of the most challenging to work around and I’ve spent many months helping customers that encountered this type of contention as the bottleneck to system performance and scalability. In some cases, these workarounds required extensive, painful architectural redesigns of the schema and code.

This was the one of the main scenarios that drove the design of the In-Memory OLTP engine and it now offers a 100% lock and latch free solution.

  • Data structures were redesigned from scratch to overcome the need for latching. There are no more pages or extents that need to be moved from disk to memory or be protected by latches.
  • Row construction was redesigned from scratch and now includes a built-in row-versioning mechanism as part of the base row construct. These constructs use timestamps and potentially multiple copies of the row in-memory to server concurrent transactions without the need for locks or blocking.

However, there are no free-meals and this mechanism now introduces potential new challenges.

  • Since each row can exist multiple times in memory, each with a different time stamp, data that is heavily modified may take up significantly more space than traditionally constructed disk-based data rows.
  • No locks exist to protect data from being logically mishandled. For example, two transactions can try to change the same row concurrently without ‘knowing’ of each other. Also, transactions that require high isolation levels as serializable isolation level which must guarantee that no new rows are introduced within a range read by a particular transaction until it commits may fail this requirement without realizing it. Of course, logical consistency is of the utmost important so these rules are validated when the transaction attempts to commit. If a transaction can’t commit due to such consistency violations, an error will be raised by SQL Server and the application must be able to handle these errors correctly. This may require application changes.

Although this may sound like pretty restrictive or challenging changes to implement, the benefit is so huge that in many cases it was a game changer for applications that hit scalability barriers due to latch contention. If you think that is the case, I highly recommend that you read the case study of Bwin.Party (http://www.microsoft.com/casestudies/Microsoft-SQL-Server-2014/bwin.party/Gaming-Site-Can-Scale-to-250-000-Requests-Per-Second-and-Improve-Player-Experience/710000003117)
The bottom line – “By migrating to In-Memory OLTP, batch requests increased from a limit of 15,000 to 250,000 for a gain of 16X in throughput. This allowed BWin.party to scale-up its environment and consolidate from 18 logically partitioned server instances of SQL Server to one. To accomplish this, BWin.party did not modify any code outside of SQL Server.” (“In-Memory OLTP – Common Workload Patterns and Migration Considerations” white paper at http://msdn.microsoft.com/en-us/library/dn673538.aspx).

Isn’t that something???

In part IV we will discuss some more features of SQL Server 2014 that may help you get your application to new levels of performance.

Have a wonderful weekend!