Published on September 30th, 2016 | by Warner Chaves0
SQL On The Edge #10 – SQL 2016 Columnstore Improvements
SQL Server 2016 hit general availability on June 1st 2016 and comes with a big set of new features and improvements. From built-in JSON support to mobile and tablet friendly Reporting Services, as usual there is a lot of investment done by Microsoft before a major release of the flagship database product.
In this post I want to focus on the Columnstore indexes feature and give you an idea of why it’s important and where you can leverage this functionality. SQL 2016 continues to build on this feature and it’s one of the best ways for Enterprise license customers to get the most bang for their buck.
What is a Columnstore?
Columnstore indexes are a different way of organizing and processing data for a database. Instead of organizing the records where all the fields are stored together, a Columnstore will store column values together and then reconstruct the record based on the different column values. Columnar storage has been implemented by most database vendors especially targeting the data warehousing space. Microsoft, Vertica, Teradata, Oracle, Amazon and Google have all implemented their version of this type of data storage.
There are two main advantages to running Columnstores in SQL Server:
- Compression: due to column values being stored together, data compression can yield great benefits to space savings and increasing data density in RAM.
- Batch mode: SQL Server has a special processing mode used for Columnstore data where it will process multiple rows (about 1000) at the same time in one CPU instruction. This yields great performance benefits.
The use case for Columnstores is large data warehousing tables or as an analytics-friendly index on top of OLTP data. It is not meant to be the main structure supporting an OLTP system as it’s not efficient in updating or deleting data.
What is the history of the Columnstore feature?
SQL Server first introduced Columnar Storage with the SQL 2012 Enterprise release. In this release, Columnstores were read-only indexes, so it required to drop the index, load the table or partition and then rebuild the index to refresh it with the latest data.
SQL Server 2014 upgraded Columnstores with full read-write capabilities, allowing the Columnstore to become the ‘clustered’ index for the table and hold all the data instead of just being one more index on top of row-organized data. 2014 also introduced many improvements to batch operations so more pieces of an execution plan could take advantage of this faster processing mode.
What’s new in SQL Server 2016?
With SQL 2016 the Columnstore story just keeps getting stronger. Here are the new features in this release:
- Batch mode is enabled even for single-thread queries and covers more execution operators.
- Multi-version row isolation is supported now for Columnstores.
- Availability Groups secondaries can provide read queries on Columnstores.
- Updateable non-clustered Columnstore on top of a heap, row-based clustered index or In-Memory table. This is the scenario called Real-time analytics.
- Non-clustered b-tree indexes on a clustered Columnstore. This allows creating PKs and FKs on the Columnstore in a trade-off of integrity for performance.
- Non-clustered Columnstores can have filter and compression delays defined in order to fine-tune when a row is moved to the Columnstore
Keep in mind that the latest Columnstore technology is available not only in SQL Server 2016 but also on Azure SQL Database (for real-time analytics or small datamarts) or Azure SQL DataWarehouse (full data warehouse support).
In the demo I’m going to highlight the new feature of PK and FK constraints on Columnstores. Check it out below!