Refresh the Staging Server with Production Data – A Real-World Scenario

Recently I had to restore the Production Reporting database onto our Staging environment for testing. The database was around 280 GB in size, with roughly 120 GB of data in the Primary Data File (.mdf), 130 GB in the Secondary Data File (.ndf) and around 30 GB in the Transaction Log File (.ldf). The database on the Production Reporting Server was used by the Business Intelligence Datasync package to ship the data onto the BI Server and perform complex analysis. This BI package took approximately 24 hours to execute, which was deemed an acceptable timeframe. I suspected that the T-SQL queries contained a lot of unwanted conditions which, after proper analysis, could be carefully removed and would in turn bring down the query execution time. In order to test this thoroughly, we decided to restore the database onto our Staging environment.

The Production Reporting Server was located in a Data Centre in Chicago; it had reasonable network speed, 16 GB of RAM and 12 CPUs, and used a Fibre Channel optical SAN. The Staging Server was located in a Data Centre in Atlanta; its network speed was not as good as the Production environment's, and it had 12 GB of RAM and 12 CPUs with an iSCSI SAN. The name of the Production Reporting database was ABC_Reporting.

We did not have any server in the Chicago Data Centre with around 300 GB of free space on it. We did have a NAS box in the Atlanta Data Centre with plenty of storage, but the main concern was that taking the full backup across the network directly onto the NAS server would take a very long time, because the network speed between the two sites was poor. I identified two servers in the Chicago Data Centre on which I could store at least 100 GB and 50 GB of data respectively, and decided to keep the remaining portion on the NAS box. The two Chicago servers were named A and B, the NAS box was named C, and the Production Reporting Server was named D; all of the servers in Chicago and Atlanta were in the same domain. On servers A, B and C, inside the B drive, I created a folder named Staging_Backup, and inside it another folder named D. Against the master database on server D, I executed the T-SQL query below to split the backup.

BACKUP DATABASE ABC_Reporting
TO DISK='\\A\Staging_Backup\D\ABC_Staging_Restore_1.bak',
   DISK='\\A\Staging_Backup\D\ABC_Staging_Restore_2.bak',
   DISK='\\B\Staging_Backup\D\ABC_Staging_Restore_3.bak',
   DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_4.bak',
   DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_5.bak'
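
Since the network was the main constraint here, backup compression could also have reduced both the on-disk size and the copy time. A minimal sketch, assuming the instance is SQL Server 2008 or later and the edition supports backup compression (Enterprise in 2008, Standard and above from 2008 R2):

BACKUP DATABASE ABC_Reporting
TO DISK='\\A\Staging_Backup\D\ABC_Staging_Restore_1.bak',
   DISK='\\A\Staging_Backup\D\ABC_Staging_Restore_2.bak',
   DISK='\\B\Staging_Backup\D\ABC_Staging_Restore_3.bak',
   DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_4.bak',
   DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_5.bak'
WITH COMPRESSION, STATS=10 -- COMPRESSION is unavailable on editions/versions that predate backup compression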

It took around 13 hours for the complete 280 GB backup to finish successfully. Once the backup completed, I transferred the split backup files on servers A and B to the location \\C\Staging_Backup\D on the NAS box in the Atlanta Data Centre. It took around 5 hours to copy the three split backup files onto server C.

The Staging Server onto which these backups needed to be restored was named 'E', and the drive onto which the database needed to be restored was labelled 'O'. This server was also in the same domain as the other servers discussed above. Once the backup files had been transferred to the NAS box, the next step was to perform a RESTORE VERIFYONLY to ensure that the backup set was free from any sort of corruption. Against the master database on server E, I executed the T-SQL query below to prove that the backup set was free from corruption.

RESTORE VERIFYONLY
FROM DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_1.bak',
DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_2.bak',
DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_3.bak',
DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_4.bak',
DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_5.bak'

It took around 2 hours for the above query to execute, and the output was:

The backup set is valid.

which proved that the backup set was free from corruption and could be restored without any issues.

The next step was to restore the database backups onto drive 'O' on server 'E'. One important point is that drive O was configured as RAID 5. Before starting the restore of the backup files, I needed to find the logical names of the Primary Data File, Secondary Data File and Transaction Log File of the ABC_Reporting database on the Production Reporting Server D. Against the ABC_Reporting database on server D (sysfiles is scoped to the current database, so it must be queried from that database rather than master), I executed the T-SQL below to find the logical names.

SELECT * FROM sysfiles -- deprecated but still available; sys.database_files is the modern equivalent

This returned the logical names of the Primary Data File, Secondary Data File and Transaction Log File as ABC_Data, ABC_Data_Secondary and ABC_Log respectively. On server 'E', inside drive O, I created the path 'Microsoft SQL Server\MSSQL1\Data'.
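
As an aside, the same logical names can also be read directly from the backup set itself, which is handy when the source server cannot be queried. A minimal sketch against the split backup on the NAS box (all five media families of a split backup must be supplied):

-- Lists the logical and physical file names stored in the backup set
RESTORE FILELISTONLY
FROM DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_1.bak',
     DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_2.bak',
     DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_3.bak',
     DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_4.bak',
     DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_5.bak'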

Once the path was created, I executed the T-SQL below against the master database on server 'E'.

RESTORE DATABASE ABC_Reporting_v1
FROM DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_1.bak',
     DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_2.bak',
     DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_3.bak',
     DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_4.bak',
     DISK='\\C\Staging_Backup\D\ABC_Staging_Restore_5.bak'
WITH MOVE 'ABC_Data' TO 'O:\Microsoft SQL Server\MSSQL1\Data\ABC_Data.mdf',
     MOVE 'ABC_Data_Secondary' TO 'O:\Microsoft SQL Server\MSSQL1\Data\ABC_Data_Secondary.ndf',
     MOVE 'ABC_Log' TO 'O:\Microsoft SQL Server\MSSQL1\Data\ABC_Log.ldf',
     STATS=10

Once the above T-SQL was executed, I noticed that the restore speed was very poor, around 3.5 GB/hr, which was unacceptable. Initially I thought this was due to the poor speed of the RAID 5 drive, but after thorough investigation I concluded that this wasn't the case. One important point to mention here is that server 'E' and the NAS server 'C' holding the database backups were located in the same Data Centre, so the network was not the bottleneck either. After analysing the complete situation, I decided to enable "Instant File Initialization" to speed up the restore. When instant file initialization is turned on, SQL Server does not have to write zeros over every 8 KB page it allocates; instead, the operating system hands the space to SQL Server in one go, which is a very quick operation and can save a great deal of time. (Note that this applies only to data files; log files are always zero-initialized.) The feature is enabled at the server level by granting the SQL Server service account the "Perform Volume Maintenance Tasks" privilege. Once it was enabled, it took just around 1 hour and 53 minutes for the entire 280 GB database to be restored. Since this was a Staging Server, as soon as the restore was complete I immediately turned Instant File Initialization off again.
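
On recent builds you can confirm from within SQL Server that the privilege actually took effect; a minimal sketch, assuming SQL Server 2016 SP1 or later (or 2012 SP4), where sys.dm_server_services exposes this column:

-- instant_file_initialization_enabled returns Y when the engine's service
-- account holds the Perform Volume Maintenance Tasks privilege
SELECT servicename, instant_file_initialization_enabled
FROM sys.dm_server_services;

Note that granting or revoking the privilege only takes effect after the SQL Server service is restarted.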

The next step was to change the owner of the restored database to sa. This was done by executing the T-SQL below against the ABC_Reporting_v1 database on server E.

EXEC sp_changedbowner 'sa'
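
As a side note, sp_changedbowner is deprecated; on SQL Server 2005 and later the same change can be made with ALTER AUTHORIZATION:

-- Modern replacement for sp_changedbowner
ALTER AUTHORIZATION ON DATABASE::ABC_Reporting_v1 TO sa;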

The next step was to get the list of all the orphaned users in the database 'ABC_Reporting_v1' by executing the T-SQL below.

EXEC sp_change_users_login 'Report'

Once we had the list of all the orphaned users, the next step was to fix each one by executing the T-SQL below.

EXEC sp_change_users_login 'Update_One', '<database_user>', '<login_name>';
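
sp_change_users_login is likewise deprecated; on SQL Server 2005 SP2 and later the recommended fix is ALTER USER, using the same placeholders as above:

-- Remaps an orphaned database user to the matching server login
ALTER USER [<database_user>] WITH LOGIN = [<login_name>];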

Once the orphaned users were fixed, the last step was to execute the re-indexing and update-statistics procedures:

Re-indexing the Database:

-- Ensure a USE <database_name> statement has been executed first.
SET NOCOUNT ON;
DECLARE @objectid int;
DECLARE @indexid int;
DECLARE @partitioncount bigint;
DECLARE @schemaname nvarchar(130);
DECLARE @objectname nvarchar(130);
DECLARE @indexname nvarchar(130);
DECLARE @partitionnum bigint;
DECLARE @partitions bigint;
DECLARE @frag float;
DECLARE @command nvarchar(4000);
-- Conditionally select tables and indexes from the sys.dm_db_index_physical_stats function
-- and convert object and index IDs to names.
SELECT
object_id AS objectid,
index_id AS indexid,
partition_number AS partitionnum,
avg_fragmentation_in_percent AS frag
INTO #work_to_do
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL , NULL, 'LIMITED')
WHERE avg_fragmentation_in_percent > 10.0 AND index_id > 0;
-- Declare the cursor for the list of partitions to be processed.
DECLARE partitions CURSOR FOR SELECT * FROM #work_to_do;
-- Open the cursor.
OPEN partitions;
-- Loop through the partitions.
WHILE (1=1)
    BEGIN;
FETCH NEXT
FROM partitions
INTO @objectid, @indexid, @partitionnum, @frag;
IF @@FETCH_STATUS < 0 BREAK;
SELECT @objectname = QUOTENAME(o.name), @schemaname = QUOTENAME(s.name)
FROM sys.objects AS o
JOIN sys.schemas as s ON s.schema_id = o.schema_id
WHERE o.object_id = @objectid;
SELECT @indexname = QUOTENAME(name)
FROM sys.indexes
WHERE  object_id = @objectid AND index_id = @indexid;
SELECT @partitioncount = count (*)
FROM sys.partitions
WHERE object_id = @objectid AND index_id = @indexid;
-- 30 is an arbitrary decision point at which to switch between reorganizing and rebuilding.
IF @frag < 30.0
SET @command = N'ALTER INDEX ' + @indexname + N' ON ' + @schemaname + N'.' + @objectname + N' REORGANIZE';
IF @frag >= 30.0
SET @command = N'ALTER INDEX ' + @indexname + N' ON ' + @schemaname + N'.' + @objectname + N' REBUILD';
IF @partitioncount > 1
SET @command = @command + N' PARTITION=' + CAST(@partitionnum AS nvarchar(10));
EXEC (@command);
--PRINT N'Executed: ' + @command;
    END;
-- Close and deallocate the cursor.
CLOSE partitions;
DEALLOCATE partitions;
-- Drop the temporary table.
DROP TABLE #work_to_do;
GO

Update Statistics Without NTEXT and IMAGE Data Types:

SET NOCOUNT ON
DECLARE @columnname VARCHAR(MAX)
DECLARE @tablename SYSNAME
DECLARE @statsname SYSNAME
DECLARE @sql NVARCHAR(4000)
DECLARE @NAME VARCHAR(MAX)
declare @i INT
declare @j INT
create table #temp
(
tablename varchar(1000),
statsname varchar(1000),
columnname varchar(1000)
)
insert #temp(tablename,statsname,columnname)
SELECT DISTINCT
OBJECT_NAME(s.[object_id]),
s.name AS StatName,
COALESCE(@NAME+ ', ', '')+c.name
FROM sys.stats s JOIN sys.stats_columns sc ON sc.[object_id] = s.[object_id] AND sc.stats_id = s.stats_id
JOIN sys.columns c ON c.[object_id] = sc.[object_id] AND c.column_id = sc.column_id
JOIN INFORMATION_SCHEMA.COLUMNS D ON D.[COLUMN_NAME]= C.[NAME]
JOIN sys.partitions par ON par.[object_id] = s.[object_id]
JOIN sys.objects obj ON par.[object_id] = obj.[object_id]
WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1
AND D.DATA_TYPE NOT IN('NTEXT','IMAGE')
create table #temp1
(
id int identity(1,1),
tablename varchar(8000),
statsname varchar(8000),
columnname varchar(8000)
)
insert #temp1(tablename,statsname,columnname)
select tablename,statsname,stuff(
(
select ','+ [columnname] from #temp where
statsname = t.statsname for XML path('')
),1,1,'')
from (select distinct tablename,statsname from #temp )t
SELECT @i=1
SELECT @j=MAX(ID) FROM #temp1
WHILE(@I<=@J)
BEGIN
SELECT @statsname = statsname from #temp1 where id = @i
SELECT @tablename = tablename from #temp1 where id = @i
SELECT @columnname = columnname from #temp1 where id = @i
SET @sql = N'UPDATE STATISTICS '+QUOTENAME(@tablename)+space(1)+QUOTENAME(@statsname)+space(1)+'WITH FULLSCAN' -- note: assumes tables in the default schema
--PRINT @sql
EXEC sp_executesql @sql
SET @i = @i+1
END
DROP TABLE #temp
DROP TABLE #temp1

Update Statistics With NTEXT and IMAGE Data Types:

SET NOCOUNT ON
DECLARE @columnname VARCHAR(MAX)
DECLARE @tablename SYSNAME
DECLARE @statsname SYSNAME
DECLARE @sql NVARCHAR(4000)
DECLARE @NAME VARCHAR(MAX)
declare @i INT
declare @j INT
create table #temp
(
tablename varchar(1000),
statsname varchar(1000),
columnname varchar(1000)
)
insert #temp(tablename,statsname,columnname)
SELECT DISTINCT
OBJECT_NAME(s.[object_id]),
s.name AS StatName,
COALESCE(@NAME+ ', ', '')+c.name
FROM sys.stats s JOIN sys.stats_columns sc ON sc.[object_id] = s.[object_id] AND sc.stats_id = s.stats_id
JOIN sys.columns c ON c.[object_id] = sc.[object_id] AND c.column_id = sc.column_id
JOIN INFORMATION_SCHEMA.COLUMNS D ON D.[COLUMN_NAME]= C.[NAME]
JOIN sys.partitions par ON par.[object_id] = s.[object_id]
JOIN sys.objects obj ON par.[object_id] = obj.[object_id]
WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1
AND D.DATA_TYPE  IN('NTEXT','IMAGE')
create table #temp1
(
id int identity(1,1),
tablename varchar(8000),
statsname varchar(8000),
columnname varchar(8000)
)
insert #temp1(tablename,statsname,columnname)
select tablename,statsname,stuff(
(
select ','+ [columnname] from #temp where
statsname = t.statsname for XML path('')
),1,1,'')
from (select distinct tablename,statsname from #temp )t
SELECT @i=1
SELECT @j=MAX(ID) FROM #temp1
WHILE(@I<=@J)
BEGIN
SELECT @statsname = statsname from #temp1 where id = @i
SELECT @tablename = tablename from #temp1 where id = @i
SELECT @columnname = columnname from #temp1 where id = @i
SET @sql = N'UPDATE STATISTICS '+QUOTENAME(@tablename)+space(1)+QUOTENAME(@statsname) -- no FULLSCAN for stats on NTEXT/IMAGE columns
--PRINT @sql
EXEC sp_executesql @sql
SET @i = @i+1
END
DROP TABLE #temp
DROP TABLE #temp1
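
For a staging refresh where a full, statistic-by-statistic pass with FULLSCAN is more than is needed, the built-in sp_updatestats procedure is a lighter-weight alternative; it updates only the statistics that actually require it, using the default sampling rate:

-- Runs in the context of the current database and updates out-of-date
-- statistics on all user tables with default sampling
USE ABC_Reporting_v1
GO
EXEC sp_updatestats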

Although it took me around three days, essentially an entire extended weekend, to complete this task, as a DBA I felt it was a great set of concepts to learn and understand. My sincere thanks to all the readers for giving their valuable time to read this post. If you have any suggestions, please do let us know.
