Published on December 10th, 2014 | by Warner Chaves0
Practical Intro: SQL Server Table Change Tracking – Part 2 – Column Tracking
Hi everybody, this is part two on our “Practical Intro” series to SQL Server Table Change Tracking. If you haven’t seen Part 1 you probably want to check it right here.
On this second part we’re going to focus on two key topics for this feature: tracking updated columns and change tracking internal tables.
Tracking updated columns is a more advanced use of Change Tracking and like the name suggests, it not only will tell us what records have changed but also what columns specifically.
When should you use it?
Tracking updated columns is heavier on the change tracking mechanism because more information needs to be maintained regarding the changes that have happened in the system. For this reason, we recommend activating it when it’s really necessary for these use cases:
- Some columns are a lot more frequently updated in your records as opposed to the rest of the columns. This is specially true if the updated column is small and the record size is large.
- The data is synced fairly frequently and the speed of the synchronization is very important. Since we’ll only be transmitting data from the columns that have changed, the amount of information that needs to flow is less and can make for much faster syncs.
Let’s see an example
First things first, we’ll update our table so column tracking is now enabled. For this, we need to DISABLE it and REENABLE while activating column tracking:
ALTER TABLE HoursTracked DISABLE CHANGE_TRACKING; GO ALTER TABLE HoursTracked ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);
Now at this point let’s do some updates, hitting different columns:
UPDATE HoursTracked set TotalHours=TotalHours+1 where EmployeeId=1; UPDATE HoursTracked set Month=10 where EmployeeId=2;
With these two updates we have modified a total of 3 records, twice on the TotalHours column and once on the Month column. So let’s see how we can extract this information now:
select SYS_CHANGE_OPERATION,SYS_CHANGE_COLUMNS,RecordId from CHANGETABLE(CHANGES HoursTracked, 6) ct
Note I’m passing a value of version 6 and I’m retrieving the SYS_CHANGE_COLUMNS column. This column holds a binary representation of what columns changed and can be decoded by using the CHANGE_TRACKING_IS_COLUMN_IN_MASK function.
To actually know what columns changed, we need to have the column ID that we are interested in and then use the CHANGETABLE results as parameters to the function. Here’s an example for the TotalHours column:
select SYS_CHANGE_OPERATION,SYS_CHANGE_COLUMNS,RecordId, CHANGE_TRACKING_IS_COLUMN_IN_MASK(COLUMNPROPERTY(OBJECT_ID('HoursTracked'), 'TotalHours', 'ColumnId'),ct.SYS_CHANGE_COLUMNS) TotalHoursChange from CHANGETABLE(CHANGES HoursTracked, 6) ct
In this case we added a new column TotalHoursChange where we can see that it is populated with a value of 1 when that column changed and a value of 0 when it didn’t change.
Taking this logic into account, we can write a SELECT statement that will only bring the columns with updated values:
declare @employeeId int = COLUMNPROPERTY(OBJECT_ID('HoursTracked'), 'Employee', 'ColumnId'); declare @yearId int = COLUMNPROPERTY(OBJECT_ID('HoursTracked'), 'Year', 'ColumnId'); declare @monthId int = COLUMNPROPERTY(OBJECT_ID('HoursTracked'), 'Month', 'ColumnId'); declare @totalHoursId int = COLUMNPROPERTY(OBJECT_ID('HoursTracked'), 'TotalHours', 'ColumnId'); select SYS_CHANGE_OPERATION,ht.RecordId, CASE WHEN CHANGE_TRACKING_IS_COLUMN_IN_MASK(@employeeId,ct.SYS_CHANGE_COLUMNS)=1 then ht.EmployeeId ELSE NULL END EmployeeId, CASE WHEN CHANGE_TRACKING_IS_COLUMN_IN_MASK(@yearId,ct.SYS_CHANGE_COLUMNS)=1 then ht.[Year] ELSE NULL END Year, CASE WHEN CHANGE_TRACKING_IS_COLUMN_IN_MASK(@monthId,ct.SYS_CHANGE_COLUMNS)=1 then ht.[Month] ELSE NuLL END Month, CASE WHEN CHANGE_TRACKING_IS_COLUMN_IN_MASK(@totalHoursId,ct.SYS_CHANGE_COLUMNS)=1 then ht.TotalHours ELSE NuLL END TotalHours from CHANGETABLE(CHANGES HoursTracked, 6) ct inner join HoursTracked ht on ct.RecordId=ht.RecordId
And with this, we only bring in the data for the columns that have changed and the unchanged ones are left as NULL:
Internal Change Tracking Tables
There are two main tables that are used for ChangeTracking:
- sys.syscommittab: this tracks transactions per database and gets a row added per committed transaction.
- sys.change_tracking_***** where the **** will be a set of numbers generated by SQL Server : this is one per table and it tracks the changes themselves with one row added every time a record is changed.
To find the exact name of the sys.change_tracking table, you can look inside sys.internal_tables or you can look at the execution plan generated on any query that uses CHANGETABLE.
You will want to keep an eye out for the size of these tables to make sure they don’t grow out of control and compromise your system. This can specially be the case when the auto cleanup gets disabled and then it never gets turned ON again. The specific size of the tables will depend on the amount of activity on the system and the width of the tables and PK. You should monitor it and baseline it so you can setup an alert when they go significantly above the baseline.
This concludes our Practical Intro to Table Change Tracking, if there’s anything else you want us to cover feel free to comment on this post. Cheers!