SQL – Differential Backup Size Prediction / Estimation


Darwin Hatheway wrote a really nice article explaining how the differential backup size can be estimated or predicted.  He gets all the credit for teaching me how to do this.  However, since he didn’t provide a complete solution, I had to do it myself. There are two implementations below.  One is a straight T-SQL script, and the other is a HTML Application (HTA) that utilizes the T-SQL script with a bit of vbscript.

T-SQL Version: SQL Differential Backup Size Prediction
HTML Application: http://dougzuck.com/hta

diffPredictionResults

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
/*SQL Differential Backup Size Prediction / Estimation - Doug Zuckerman, 2009 - http://dougzuck.com */
 
IF isNULL(object_id('tempdb.dbo.##showFileStats'), 1) <> 1
	DROP TABLE ##showFileStats
 
CREATE TABLE ##showFileStats (
	fileID INT,
	fileGroup INT,
	totalExtents BIGINT,
	usedExtents BIGINT,
	logicalFileName VARCHAR (500),	
	filePath VARCHAR (1000)
)
 
IF isNULL(object_id('tempdb.dbo.##DCM'), 1) <> 1
	DROP TABLE ##DCM
 
CREATE TABLE ##DCM (
	parentObject VARCHAR(5000),
	[object] VARCHAR(5000),
	FIELD VARCHAR (5000),
	VALUE VARCHAR (5000)
)
 
/*we need to get a list of all the files in the database.  each file needs to be looked at*/	
INSERT INTO ##showFileStats EXEC('DBCC SHOWFILESTATS with tableresults')
 
DECLARE @currentFileID INT,
	@totalExtentsOfFile BIGINT,
	@dbname VARCHAR(100),
	@SQL VARCHAR(200),
	@currentDCM BIGINT,
	@step INT
 
SET @dbname = db_name()
SET @step = 511232
 
DECLARE myCursor SCROLL CURSOR FOR
SELECT fileID, totalExtents 
FROM ##showFileStats
 
OPEN myCursor
FETCH NEXT FROM myCursor INTO @currentFileID, @totalExtentsOfFile
 
/*look at each differential change map page in each data file of the database and put the output into ##DCM*/
WHILE @@FETCH_STATUS = 0 
BEGIN
 
	SET @currentDCM = 6
	WHILE @currentDCM <= @totalExtentsOfFile*8
	BEGIN	
		SET @SQL = 'dbcc page('+ @dbname + ', ' + CAST(@currentFileID AS VARCHAR) + ', ' + CAST(@currentDCM AS VARCHAR) + ', 3) WITH TABLERESULTS'
		INSERT INTO ##DCM EXEC (@SQL)
		SET @currentDCM = @currentDCM + @step
	END
 
	FETCH NEXT FROM myCursor INTO @currentFileID, @totalExtentsOfFile
END
CLOSE myCursor
DEALLOCATE myCursor
 
/*remove all unneeded rows from our results table*/
DELETE FROM ##DCM WHERE VALUE = 'NOT CHANGED' OR parentObject NOT LIKE 'DIFF_MAP%'
--SELECT * FROM ##DCM
 
/*sum the extentTally column*/
SELECT SUM (extentTally) AS totalChangedExtents, SUM(extentTally)/16 AS 'diffPrediction(MB)', SUM(extentTally)/16/1024 AS 'diffPrediction(GB)' 
FROM
	/*create extentTally column*/
	(SELECT extentTally =
	CASE
		WHEN secondChangedExtent > 0 THEN CAST(secondChangedExtent AS BIGINT) - CAST(firstChangedExtent AS BIGINT) + 1
		ELSE 1
	END
	FROM
		/*parse the 'field' column to give us the first and last extents of the range*/
		(SELECT (SUBSTRING(FIELD,(SELECT CHARINDEX(':', FIELD, 0))+1,(CHARINDEX(')', FIELD, 0))-(CHARINDEX(':', FIELD, 0))-1))/8 AS firstChangedExtent,
		secondChangedExtent = 
		CASE	
			WHEN CHARINDEX(':', FIELD, CHARINDEX(':', FIELD, 0)+1) > 0 THEN (SUBSTRING(FIELD,(CHARINDEX(':', FIELD, CHARINDEX(':', FIELD, 0)+1)+1),(CHARINDEX(')', FIELD,CHARINDEX(')', FIELD, 0)+1))-(CHARINDEX(':', FIELD, CHARINDEX(':', FIELD, 0)+1))-1))/8
			ELSE ''
		END 
		FROM ##DCM)parsedFieldColumn)extentTallyColumn

  1. #1 by Oscar Zamora on October 27, 2010 - 3:59 pm

    This is great, thanks for sharing. We are about to start implementing differntials in our environment and needed to understand the impact.

  2. #2 by John Danley on February 2, 2012 - 12:26 pm

    Awesome, thank-you thank-you thank-you. Glad I found this post. This saved me time and a headache. I have been meaning to start testing to see if a Full + Diffs would improve my backup scenario but have run into the situation where the Diff was often as big as the full backup. Now I can conditionally perform a full or diff backup based on the the predicted change percent. Seems to me this should be part of the native SQL Server functionality!

    • #3 by doug on February 3, 2012 - 8:13 pm

      Glad you’re finding it as useful as I do. Thanks for the comment.

      -Doug

  3. #4 by Adam Mikolaj on February 12, 2012 - 3:11 pm

    Agreed, this is awesome. Thanks.

  4. #6 by Ram on September 25, 2013 - 3:57 am

    Tons of Applauds.. Really great Work

(will not be published)