SQL – Differential Backup Size Prediction
One of the tasks I regularly have to perform at my job involves moving multi-terabyte databases from one server to another. The goal is to make these database moves happen with as little downtime as possible, so I always make use of differential backups and restores to keep the move times to a minimum. I’ll do a full backup of the source database, then I’ll perform a restore of that database to the new server, specifying ‘with norecovery’ in the restore command. For a 2 terabyte database, this process will take many hours, but that’s ok because I’ll make sure it’s complete prior to the actual maintenance window. Then when the maintenance window begins, I’ll disable access to the database and perform a differential backup of the database on the source server. Then I restore the differential backup ‘with recovery’ to the new server. The process of doing a differential backup and restore is much less time consuming than the full backup and restore, and this allows the actual maintenance window to be much smaller since the full backup and restore is completed at an earlier time. However, when I’m dealing with such large databases, it becomes extremely helpful to know how big the differential backup is going to be before I actually execute it. This way I’m able to estimate how long the whole process will take. When you have a 2 terabyte database, it’s not uncommon to have a several hundred gigabyte differential backup. You could see why it might help to know in advance whether the differential is going to be 10GB or 200GB.
Darwin Hatheway wrote a really nice article explaining how the differential backup size can be estimated. He gets all the credit for teaching me how to do this. I have two implementations below. One is a straight T-SQL script, and the other is a HTML Application (HTA) that utilizes the T-SQL script with a bit of vbscript. On a 2TB database it generally only takes a handful of seconds for the script to complete.
T-SQL Version: SQL Differential Backup Size Prediction
HTML Application: http://dougzuck.com/hta
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 | /*SQL Differential Backup Size Prediction - Doug Zuckerman, 2009 - http://dougzuck.com */ IF isNULL(object_id('tempdb.dbo.##showFileStats'), 1) <> 1 DROP TABLE ##showFileStats CREATE TABLE ##showFileStats ( fileID INT, fileGroup INT, totalExtents BIGINT, usedExtents BIGINT, logicalFileName VARCHAR (500), filePath VARCHAR (1000) ) IF isNULL(object_id('tempdb.dbo.##DCM'), 1) <> 1 DROP TABLE ##DCM CREATE TABLE ##DCM ( parentObject VARCHAR(5000), [object] VARCHAR(5000), FIELD VARCHAR (5000), value VARCHAR (5000) ) /*we need to get a list of all the files in the database. each file needs to be looked at*/ INSERT INTO ##showFileStats EXEC('DBCC SHOWFILESTATS with tableresults') DECLARE @currentFileID INT, @totalExtentsOfFile BIGINT, @dbname VARCHAR(100), @SQL VARCHAR(200), @currentDCM BIGINT, @step INT SET @dbname = db_name() SET @step = 511232 DECLARE myCursor SCROLL CURSOR FOR SELECT fileID, totalExtents FROM ##showFileStats OPEN myCursor FETCH NEXT FROM myCursor INTO @currentFileID, @totalExtentsOfFile /*look at each differential change map page in each data file of the database and put the output into ##DCM*/ WHILE @@FETCH_STATUS = 0 BEGIN SET @currentDCM = 6 WHILE @currentDCM <= @totalExtentsOfFile*8 BEGIN SET @SQL = 'dbcc page('+ @dbname + ', ' + CAST(@currentFileID AS VARCHAR) + ', ' + CAST(@currentDCM AS VARCHAR) + ', 3) WITH TABLERESULTS' INSERT INTO ##DCM EXEC (@SQL) SET @currentDCM = @currentDCM + @step END FETCH NEXT FROM myCursor INTO @currentFileID, @totalExtentsOfFile END CLOSE myCursor DEALLOCATE myCursor /*remove all unneeded rows from our results table*/ DELETE FROM ##DCM WHERE value = 'NOT CHANGED' OR parentObject NOT LIKE 'DIFF_MAP%' --SELECT * FROM ##DCM /*sum the extentTally column*/ SELECT SUM (extentTally) AS totalChangedExtents, SUM(extentTally)/16 AS 'diffPrediction(MB)', SUM(extentTally)/16/1024 AS 'diffPrediction(GB)' FROM /*create extentTally column*/ (SELECT extentTally = CASE WHEN secondChangedExtent > 0 THEN CAST(secondChangedExtent AS BIGINT) - CAST(firstChangedExtent AS BIGINT) + 1 ELSE 1 END FROM /*parse the 'field' column to give us the first and last extents of the range*/ (SELECT (SUBSTRING(FIELD,(SELECT CHARINDEX(':', FIELD, 0))+1,(CHARINDEX(')', FIELD, 0))-(CHARINDEX(':', FIELD, 0))-1))/8 AS firstChangedExtent, secondChangedExtent = CASE WHEN CHARINDEX(':', FIELD, CHARINDEX(':', FIELD, 0)+1) > 0 THEN (SUBSTRING(FIELD,(CHARINDEX(':', FIELD, CHARINDEX(':', FIELD, 0)+1)+1),(CHARINDEX(')', FIELD,CHARINDEX(')', FIELD, 0)+1))-(CHARINDEX(':', FIELD, CHARINDEX(':', FIELD, 0)+1))-1))/8 ELSE '' END FROM ##DCM)parsedFieldColumn)extentTallyColumn |
This is great, thanks for sharing. We are about to start implementing differntials in our environment and needed to understand the impact.
Awesome, thank-you thank-you thank-you. Glad I found this post. This saved me time and a headache. I have been meaning to start testing to see if a Full + Diffs would improve my backup scenario but have run into the situation where the Diff was often as big as the full backup. Now I can conditionally perform a full or diff backup based on the the predicted change percent. Seems to me this should be part of the native SQL Server functionality!
Glad you’re finding it as useful as I do. Thanks for the comment.
-Doug
Agreed, this is awesome. Thanks.
Thanks, Adam.