- December 29, 2014
- Posted by: Warner Chaves
- Category: Database Administration, SQL Server Performance Tuning
Ask 10 DBA’s what the biggest new feature of SQL Server 2014 is and they will answer “Hekaton” or “In-Memory OLTP”. Of course those are the two names for the same thing: the new In-Memory storage and query compilation for SQL Server.
Despite there being an abundance of marketing and tech material about this new feature, I’ve had several clients and DBA’s come up to me and say “I understand what it is, I understand why they’re making such a big deal about it but I don’t know if I should start planning to use it or where I would even use it on my systems”. In the end, we have a very famous new feature but a lot of hesitation for DBAs and clients to try out.
To help in this regard, I will be outlining in this post the different use cases I’ve discovered so far with my clients and the easiest ways to start dipping your toes in the In-Memory OLTP pool. That’s a very nice thing about how Microsoft decided to implement this feature, it’s not an all-or-nothing proposition. There are several ways to start playing with it and getting performance gains and gradually investing more and more into it.
Before diving into the use cases, let’s go over some of the basics in case anyone needs a refresher.
The Fundamental Premises
These are the fundamental driving forces behind the development of this technology:
- CPU speed is not increasing but instead we have more cores per chip.
- Cost of RAM keeps going down and servers can hold more and more memory.
- There are aggressive optimizations that work on RAM but do NOT work on disk (magnetic or SSD).
I want to stress on that last point. Many people have mentioned to me “well, I already have a 512GB server and my database fits all in memory”, or “I have SSDs for all my storage so I don’t think going In-Memory will help with performance”. This is NOT THE SAME. Even if your database is completely in memory now, the storage engine is NOT OPTIMIZED for In-Memory access. Even if all your storage is SSD, that’s still about 3 orders of magnitude slower than RAM.
Remember this: magnetic disks usually respond in milliseconds, SSDs in microseconds and RAM in nanoseconds.
Where are the improvements?
Now, despite of what marketing would have you believe, In-Memory OLTP is not going to be the snake oil that fixes all your system issues. Don’t get me wrong, it’s a great tool, but like any tool, you have to know where to apply it.
Let’s go over the different components of SQL Server and how In-Memory affects them:
- Network Protocol: there are no changes to this. So if your bottleneck is network, this is NOT the feature for you.
- Transaction log: durable data produces less log records because storage operations don’t need to be logged, for example, page splits. Also, we now have the concept of non-durable data that don’t write to the log AT ALL.
- Data files: data file IO is now optimized for streaming sequential IO. This could benefit your system if your storage can be optimized for sequential IO writes.
- Query optimization: SQL Server 2014 introduces native compilation when used with In-Memory OLTP. This allows stored procedures to compile directly to machine code. This can be a big benefit but not all of T-SQL is supported.
- Query execution: there’s no locking or latching, it’s a completely optimistic concurrency model.
- Indexing: there are new hash indexes for super fast equality predicates, indexes are always covering and there’s no need to worry about fragmentation. This is a BIG IMPROVEMENT.
- Memory use: table variables can now be really in memory. Indexes don’t keep copies of the data but pointers instead. Multiple versions of rows need to be in memory at all times though.
- CPU: CPU is maximized by potentially removing IO, memory and execution bottlenecks and further optimized if native compilation is used.
Keep in mind where the major improvements are when we discuss the use cases because that’s really what it comes down to. Your system will benefit if your major bottlenecks are directly attacked by the improvements in In-Memory OLTP. Plain and simple, if your bottleneck is something else, then the feature is not for you.
The compromises of In-Memory OLTP V1
Like any new technology, there are several limitations in place that one needs to be aware of. It’s important to keep these in mind before starting a project to migrate to In-Memory OLTP so you don’t end up wasting a bunch of time only to hit a wall and having to roll back your entire work. These are the main limitations of this first release:
- No table partitioning (can be simulated with a view).
- Requires a lot of careful measurements that can’t be easily undone (amount of RAM, hash buckets, etc).
- No parallel plans when In-Memory OLTP is involved.
- No FKs, Unique indexes, triggers.
- Schema is 100% static.
- The T-SQL for native compilation is very limited.
Now I’m sure that the team at Microsoft is going to continue to work on this feature and these limitations will get smaller and smaller with each new release.
Alright, now that we have a solid foundation, let’s jump into the Use Cases!
Use Case #1: The Dials of In-Memory and Durability
This use case basically means that we adapt how we use In-Memory OLTP to the type of bottleneck we have in our system. For example, if we have a lot of locking and tempdb contention, then In-Memory is a good option. If on top of that we also have log IO issues then we should study our data and see if any of it could be delayed durable or completely non-durable. Here’s a graphic outlining the different combinations:
Here are the specific scenarios I’ve encountered so far:
- You have locking issues and row versioning didn’t significantly improve throughput. (Go In-Memory).
- You don’t have locking but you still have latching and it’s limiting your throughput. (Go In-Memory).
- Your locking and latching are not significant but you still need to improve throughput. (Go In-Memory and native compilation).
- You have to ingest large amounts of data where there are big spikes of activity and then low periods. (Go In-Memory and mix with Classic).
- You have high throughput data where its OK if you lose SOME as long as most is there. (Go In-Memory and Delayed Durable).
- You have data that needs to be queried with the richness of T-SQL but it’s not required to keep between restarts. (Go In-Memory and Schema-Only).
- You have non-durable data that you want to scale-out with AlwaysOn Availability Groups. (Go In-Memory and Delayed Durable).
SCHEMA_ONLY will NOT work on this previous use case because there’s no transaction log operation involved.
Use Case #2: The Landing Pad
Another scenario where In-Memory can actually yield some very nice benefits is on landing or staging tables used in ETL processes. A lot of the times, the ETL will take the data load into a table inside SQL Server, do some process with it and finally insert it into the table where it’s going to be permanently. Even though the landing table will just get truncated and reloaded every time the ETL runs, if it’s a classic table then it will be doing all kinds of transaction logging and data file IO while your ETL is running.
Replacing these landing tables with In-Memory Schema-Only tables removes all the data file and transaction log IO. Another option is to use an In-Memory table with delayed durability, this way we get rid of the WRITELOG waits since the log IO will happen asynchronously.
This is one of my favorite ways to get a client started with using In-Memory OLTP. It’s easy to setup, troubleshoot and measure improvements and will show clients the potential of the technology.
One important note: for best performance you want to be able to parallelize the tasks and do the loading with multiple threads. In a future video I will be showing exactly how to do this with SSIS!
Use Case #3: The Table Variable Conversion
For a long time the SQL Server community had this myth that table variables were always in memory. This is not true and anyone curious enough can easily verify that table variables allocate pages from tempdb as well as produce tempdb IO when being used. Now with In-Memory OLTP, there are memory optimized table variables and these are REALLY IN-MEMORY. There’s no allocation from tempdb and no IO generated at all.
This is another great way to get clients to start using some of the In-Memory technology. The existing code will not have to be changed, all they need to do is redefine their user defined types as Memory_Optimized and recompile their stored procedures. Here are the scenarios where I’ve used this so far:
- Anywhere table variables are heavily used and hitting tempdb significantly.
- Anywhere where temp tables are used but there’s no advantage to them being on tempdb (more indexes, actual stats needed for the plan, etc).
- The plans you currently have do NOT depend on parallelism.
And these are the 3 use cases I’ve encountered so far, I’m sure that as the adoption of the technology grows, we will see more and more use cases, patterns and best practices begin to come up.
As I mentioned, I will be doing a video recording showing a demo of each one of these use cases and the type of gains that you can expect with each. Stay tuned!