In Part I of this series, we reviewed the benefits that In-Memory OLTP may provide for workloads that are bound by insert performance. Today, we’ll take a look at the benefits it may offer for workloads that are less constrained by inserts, but require ultra-fast access to read-only, or read-mostly data.
Although a read operation is relatively fast compared to data modifications, in many cases the sheer volume of read activity, makes the associated overhead (however small) for each operation and layer in the execution stack critical. High performance read-only workloads with traditional disk based tables are typically bound by the following bottlenecks:
- If the pages needed to read the data (metadata, index and actual data pages may be needed) are not in the buffer cache, they need to be fetched from disk by incurring physical reads. For most systems, physical IO is still the most expensive resource in terms of duration. Remember that access times to RAM are measured in nanoseconds, while access to disks, spinning or solid-state, is still measured in milliseconds. That is two orders of magnitude slower.
- The access path to any page with disk based tables may consist of one of the following:
- Metadata pages need to be retrieved first
- Table (heap) or index (clustered or non-clustered) scans which involve reading of multiple pages
- Clustered or non-clustered index seeks which involve traversing the non-leaf levels of the BTree which typically involve a few levels. For non-clustered indexes, the operation may require an additional lookup which will traverse all levels of the clustered index in addition to the non-clustered one.
- Data is accessed using traditional, interpreted T-SQL which introduces latency overhead that may impact overall throughput when multiple concurrent sessions read the data. These steps involve:
- Parsing of the SQL Statement (whether ad-hoc or in a stored procedure)
- Cache lookup to see if a suitable plan already exists
- Algebrizing, parameterizing and optimizing of the statement
- Compiling and executing the selected execution plan
- Additional optimizations which are performed at the storage engine level if pages are not found in cache
- Lock and latch management to enable safe execution of the statement and isolate the transaction and protect it from concurrency anomalies
Typical solutions for highly demanding read workloads with traditional disk-based tables may involve one or more of the following techniques and technologies:
- Optimizing cache reuse to avoid excessive recompilations by using stored procedures, parameterized statement or even plan reuse hints.
- Middle tier or client side caching solutions for direct access to the data
- Scale out solutions that utilize multiple copies of the data to distribute the read load across multiple servers. These type of solutions may use various data replication technologies such as hardware level replication, log shipping, SQL Server replication, database mirroring or availability groups.
In-Memory OLTP may offer significant benefits for this type of workloads with the following features.
- As data resides in memory, there is no need for physical IO operations
- The highly optimized, In-Memory data page-less structures reduces the metadata overhead to access the actual data rows
- Hash indexes are highly optimized for singleton (or a few rows) lookup type of queries as they do not require traversing non-leaf levels of BTrees or performing any additional lookups
- Utilizing natively-compiled stored procedures drastically reduces the overhead and code path length of T-SQL statement execution by pre-compilation of the plans and storing them as directly executable DLLs
- In-Memory OLTP is completely lock and latch free, minimizing the overhead and contention latency that is inherent to traditional, disk-based data access
- Always On availability groups may be utilized for scale out and distribution of the read workload across multiple servers. Typically, satisfying SLAs will require much fewer servers than with traditional disk-based scale out solutions
However, workloads that have the following characteristics may gain less benefits from migration to In-Memory OLTP:
- Data volumes are too large to effectively fit in memory. Remember that even if you have enough memory to hold the required read data, this memory will be exclusively reserved and will impact memory availability for other data structured and for internal memory required for the normal operations of SQL Server
- DSS type of queries that benefit from parallel execution. In-Memory OLTP does not currently support parallelism. Consider Columnstore indexes as an alternative solution for DSS type queries or any queries that can significantly benefit from parallel execution
In Part III, we will cover the benefits of In-Memory OLTP for high contention workloads that require both reads and writes on the same data regions.
Have a wonderful weekend 🙂