SQL – Differential Backup Size Prediction

One of the tasks I regularly have to perform at my job involves moving multi-terabyte databases from one server to another. The goal is to make these database moves happen with as little downtime as possible, so I always make use of differential backups and restores to keep the move times to a minimum. I’ll do a full backup of the source database, then I’ll perform a restore of that database to the new server, specifying ‘with norecovery’ in the restore command. For a 2 terabyte database, this process will take many hours, but that’s ok because I’ll make sure it’s complete prior to the actual maintenance window. Then when the maintenance window begins, I’ll disable access to the database and perform a differential backup of the database on the source server. Then I restore the differential backup ‘with recovery’ to the new server. The process of doing a differential backup and restore is much less time consuming than the full backup and restore, and this allows the actual maintenance window to be much smaller since the full backup and restore is completed at an earlier time. However, when I’m dealing with such large databases, it becomes extremely helpful to know how big the differential backup is going to be before I actually execute it. This way I’m able to estimate how long the whole process will take. When you have a 2 terabyte database, it’s not uncommon to have a several hundred gigabyte differential backup. You could see why it might help to know in advance whether the differential is going to be 10GB or 200GB.

Darwin Hatheway wrote a really nice article explaining how the differential backup size can be estimated.  He gets all the credit for teaching me how to do this.  I have two implementations below.  One is a straight T-SQL script, and the other is a HTML Application (HTA) that utilizes the T-SQL script with a bit of vbscript.  On a 2TB database it generally only takes a handful of seconds for the script to complete.

T-SQL Version: SQL Differential Backup Size Prediction
HTML Application: http://dougzuck.com/hta

diffPredictionResults

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
/*SQL Differential Backup Size Prediction - Doug Zuckerman, 2009 - http://dougzuck.com */
 
IF isNULL(object_id('tempdb.dbo.##showFileStats'), 1) <> 1
	DROP TABLE ##showFileStats
 
CREATE TABLE ##showFileStats (
	fileID INT,
	fileGroup INT,
	totalExtents BIGINT,
	usedExtents BIGINT,
	logicalFileName VARCHAR (500),	
	filePath VARCHAR (1000)
)
 
IF isNULL(object_id('tempdb.dbo.##DCM'), 1) <> 1
	DROP TABLE ##DCM
 
CREATE TABLE ##DCM (
	parentObject VARCHAR(5000),
	[object] VARCHAR(5000),
	FIELD VARCHAR (5000),
	value VARCHAR (5000)
)
 
/*we need to get a list of all the files in the database.  each file needs to be looked at*/	
INSERT INTO ##showFileStats EXEC('DBCC SHOWFILESTATS with tableresults')
 
DECLARE @currentFileID INT,
	@totalExtentsOfFile BIGINT,
	@dbname VARCHAR(100),
	@SQL VARCHAR(200),
	@currentDCM BIGINT,
	@step INT
 
SET @dbname = db_name()
SET @step = 511232
 
DECLARE myCursor SCROLL CURSOR FOR
SELECT fileID, totalExtents 
FROM ##showFileStats
 
OPEN myCursor
FETCH NEXT FROM myCursor INTO @currentFileID, @totalExtentsOfFile
 
/*look at each differential change map page in each data file of the database and put the output into ##DCM*/
WHILE @@FETCH_STATUS = 0 
BEGIN
 
	SET @currentDCM = 6
	WHILE @currentDCM <= @totalExtentsOfFile*8
	BEGIN	
		SET @SQL = 'dbcc page('+ @dbname + ', ' + CAST(@currentFileID AS VARCHAR) + ', ' + CAST(@currentDCM AS VARCHAR) + ', 3) WITH TABLERESULTS'
		INSERT INTO ##DCM EXEC (@SQL)
		SET @currentDCM = @currentDCM + @step
	END
 
	FETCH NEXT FROM myCursor INTO @currentFileID, @totalExtentsOfFile
END
CLOSE myCursor
DEALLOCATE myCursor
 
/*remove all unneeded rows from our results table*/
DELETE FROM ##DCM WHERE value = 'NOT CHANGED' OR parentObject NOT LIKE 'DIFF_MAP%'
--SELECT * FROM ##DCM
 
/*sum the extentTally column*/
SELECT SUM (extentTally) AS totalChangedExtents, SUM(extentTally)/16 AS 'diffPrediction(MB)', SUM(extentTally)/16/1024 AS 'diffPrediction(GB)' 
FROM
	/*create extentTally column*/
	(SELECT extentTally =
	CASE
		WHEN secondChangedExtent > 0 THEN CAST(secondChangedExtent AS BIGINT) - CAST(firstChangedExtent AS BIGINT) + 1
		ELSE 1
	END
	FROM
		/*parse the 'field' column to give us the first and last extents of the range*/
		(SELECT (SUBSTRING(FIELD,(SELECT CHARINDEX(':', FIELD, 0))+1,(CHARINDEX(')', FIELD, 0))-(CHARINDEX(':', FIELD, 0))-1))/8 AS firstChangedExtent,
		secondChangedExtent = 
		CASE	
			WHEN CHARINDEX(':', FIELD, CHARINDEX(':', FIELD, 0)+1) > 0 THEN (SUBSTRING(FIELD,(CHARINDEX(':', FIELD, CHARINDEX(':', FIELD, 0)+1)+1),(CHARINDEX(')', FIELD,CHARINDEX(')', FIELD, 0)+1))-(CHARINDEX(':', FIELD, CHARINDEX(':', FIELD, 0)+1))-1))/8
			ELSE ''
		END 
		FROM ##DCM)parsedFieldColumn)extentTallyColumn
Aug 2nd, 2009 | Posted in SQL, Technology
Tags:
  1. Oct 27th, 2010 at 15:59 | #1

    This is great, thanks for sharing. We are about to start implementing differntials in our environment and needed to understand the impact.

  2. John Danley
    Feb 2nd, 2012 at 12:26 | #2

    Awesome, thank-you thank-you thank-you. Glad I found this post. This saved me time and a headache. I have been meaning to start testing to see if a Full + Diffs would improve my backup scenario but have run into the situation where the Diff was often as big as the full backup. Now I can conditionally perform a full or diff backup based on the the predicted change percent. Seems to me this should be part of the native SQL Server functionality!

    • Feb 3rd, 2012 at 20:13 | #3

      Glad you’re finding it as useful as I do. Thanks for the comment.

      -Doug

  3. Feb 12th, 2012 at 15:11 | #4

    Agreed, this is awesome. Thanks.

Leave a comment

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped="">