SQL Server Performance

_Extremely_ slow updates/inserts

Discussion in 'Performance Tuning for DBAs' started by Hartmut5, Oct 13, 2004.

  1. Hartmut5 New Member

    Off and on, I've been having problems with certain updates/inserts taking an extremely long time to complete. In one case, an update statement that updated a non-indexed bit field for every row in a table took 7 hours to complete. (Table size = 5 million rows, about 2.5GB data size)

    Later, an update statement for the same table that populates 2 non-indexed fields with summarized data from another table took somewhere between 26 and 36 hours to complete! For the time being, I am updating this table using a cursor to update each row individually, which completes in about 20 minutes.

    This, however, is not my biggest problem. I have a stored procedure that truncates a table, then repopulates it with an INSERT...SELECT statement that summarizes data from another table. I have previously run this stored procedure and it completed in a little over an hour. However, I ran the same stored procedure again, and it is currently at 51 hours and still going. (Truncate/Insert table = 17M rows/2.2GB; Summarized table = 59M rows/20GB)

    The wait type for the process is almost always PAGEIOLATCH_SH or PAGEIOLATCH_EX, and the wait time sometimes goes up to 400ms, although the wait time is usually 0 (based on process info in Enterprise Manager). I checked the performance stats on the server from time to time and they alternate between the two sets of stats below, with the majority of the time being the first set:

    Memory:
    Pages/Sec: 0.0 0.0

    Physical Disk:
    % Idle: 0.5 0.0
    Avg Q Len: 1.2 90.0
    Disk Reads/s: 80.0 4.0
    Disk Writes/s: 0.2 260.0

    CPU Usage: 4%

    It appears that tempdb is not being used, since the process is holding no locks on that database.

    The server hardware is sub-optimal by far, since it is but a test machine (data + log on 2-disk RAID 0), but I would never expect anything to take this long.

    I can tell that there is some sort of I/O bottleneck here, but what I don't understand is why it is so extreme. How can a simple update statement take 26 hours to run, while a cursor can accomplish the same task in 20 minutes? And why on earth have my hard drives been crunching for more than 2 days straight on a stored procedure that took a little over an hour to complete previously? Should I simply use a cursor-based insert/update method for anything that affects more than a couple million rows? I already know that ideally the log and data should be on separate drives, but our current budget does not allow for this on the test machine.

    Thanks in advance for any help/pointers on this matter.



    -Hartmut5
  2. satya Moderator

    How often do you recompile this stored procedure and execute UPDATE STATISTICS on the involved tables?
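    For concreteness, both maintenance operations are one-liners in SQL Server 2000; this is a sketch with placeholder names (MyProc is hypothetical, TableB is one of the tables discussed later in this thread):

```sql
-- Sketch: force a fresh plan and refresh optimizer statistics.
-- 'MyProc' is a hypothetical procedure name.
EXEC sp_recompile 'MyProc'              -- plan is rebuilt on next execution

UPDATE STATISTICS TableB                -- default sampling rate
-- or, at the cost of reading every row:
-- UPDATE STATISTICS TableB WITH FULLSCAN
```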

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  3. joechang New Member

    Without disk constraints, a single statement is usually faster than a loop of single-row operations; however, it also requires a very large write to the log.
    How big is your log file? I think what is happening is that this operation requires reading from the data partition and writing to the log, disrupting sequential I/O on both operations.
    Also, are there other people using the db at this time?
    If so, your query may not be able to escalate to a table lock, requiring a massive number of row locks, taking memory away from the buffer cache.
    Just stick with the cursor for now, though you could use batches of 1000-4000 rows.
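    A batched loop along those lines might look like this; a sketch using the SQL Server 2000 SET ROWCOUNT idiom, where the bit-flag predicate is a hypothetical stand-in for whatever test distinguishes rows not yet updated:

```sql
-- Sketch: update in batches of 4000 rows so each transaction (and its
-- log write) stays small. Assumes the WHERE clause excludes rows that
-- were already updated; Active_Fl here is just an illustrative column.
SET ROWCOUNT 4000
WHILE 1 = 1
BEGIN
    UPDATE TableA
    SET Active_Fl = 1
    WHERE Active_Fl = 0         -- skips rows handled by earlier batches
    IF @@ROWCOUNT = 0 BREAK     -- nothing left to update
END
SET ROWCOUNT 0                  -- restore normal behavior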
  4. Hartmut5 New Member

    I have not recompiled the stored procedure or explicitly run UPDATE STATISTICS. The SQL in the stored procedure is very simple - 2 statements: Truncate Table and Insert...Select. The number of rows in the table being summarized has not changed, only the non-indexed values being summarized have been updated. Since there is no WHERE clause, a table scan is a given. Does the expected size of the SELECT part of the INSERT statement affect SQL Server's data storage plan? This is only about the third time I've run this stored procedure on this database/server.

    Some other info:
    18GB Log File, 30% in use. (Recovery model is SIMPLE for the time being)
    The server is for data processing, and this is the only thing running on it at the moment (besides the occasional connections for stats, etc.).
    The process currently holds an exclusive table lock on the INSERT table, and "IS" table locks on the two other tables involved.

    EM>>Management>>ProcessInfo:
    PhysicalIO: 36,000,000

    Time running: 70 hours and still going

    That comes to around 143 I/O's per second, which seems about right for the hard drives. However, this still seems like a whole lot of disk crunching for the task being accomplished.

    Any suggestions as to how to calculate when I would need to use cursor methods due to insert/update size limitations? Would a tempdb hint help in this situation? Or would that just create even more disk activity? At times, this process will be run for smaller sub-sets, so to avoid 3+ days of down-time in the future, I would like to know if I need to use a cursor method exclusively, or if the slowness is directly related to RAM/resultset sizes. Any suggestions or information about the reason for the jump from 1 hour to 70+ and any way to predict it would be greatly appreciated.


    -Hartmut5
  5. satya Moderator

    Get the PERFMON stats for the Processor, Process, Memory, SQL Server Locks, SQL Server Memory etc. counters for assessment.

    Also, PROFILER will help to fine-tune the slow-running queries.

  6. joechang New Member

    Also get the perfmon physical disk counters.
    I also need to see the DDL (generated script) for the tables with indexes and FKs,
    and the full INSERT/SELECT statement.
    I think the single big op is simply too big for your system, and that a batch loop of 1-4K rows at a time should be faster than a cursor loop of 1 row at a time.
  7. Hartmut5 New Member

    Here are the perfmon stats. Again, there are 2 distinct sets of stats. Each set represents the average stats from 10 samples at 10 second intervals. The first set was taken during "light" hard drive activity, and the second set was taken when I could hear the hard drive activity increase from the next room. These stats alternate at regular intervals, with the majority of the time spent in the state that generates the first set.

    94 hours and still going...

    Memory
    Pages/sec: 0.010 / 0.030

    Physical Disk: C:
    % Disk Read Time: 123.921 / 121.079
    % Disk Write Time: 0.778 / 8477.988
    % Idle Time: 0.505 / 0.000
    Avg. Disk Queue Length: 1.247 / 85.991
    Disk Reads/sec: 84.127 / 3.079
    Disk Writes/sec: 0.870 / 234.196

    Process: sqlservr
    % Processor Time: 0.860 / 0.641
    IO Other Operations/sec: 0.610 / 0.610
    IO Read Operations/sec: 84.137 / 3.079
    IO Write Operations/sec: 0.250 / 233.426
    Working Set: 857694208.000 / 856956928.000

    Processor: _Total
    % Processor Time: 2.850 / 3.537

    SQLServer:Locks _Total
    Lock Requests/sec: 398.253 / 27.584
    All Other Counters: 0.000 / 0.000

    SQL Server:Memory Manager
    Connection Memory (KB): 352.000 / 352.000
    Granted Workspace Memory (KB): 11296.000 / 11296.000
    Lock Blocks: 1001.000 / 1001.000
    Lock Blocks Allocated: 3500.000 / 3500.000
    Lock Memory (KB): 336.000 / 336.000
    Lock Owner Blocks: 1001.000 / 1001.000
    Lock Owner Blocks Allocated: 3505.000 / 3505.000
    Maximum Workspace Memory (KB): 605880.000 / 605880.000
    Memory Grants Outstanding: 1.000 / 1.000
    Memory Grants Pending: 0.000 / 0.000
    Optimizer Memory (KB): 0.000 / 0.000
    SQL Cache Memory (KB): 160.000 / 160.000
    Target Server Memory (KB): 819364.000 / 818641.600
    Total Server Memory (KB): 819364.000 / 818641.600

    SQL Server:Buffer Manager
    Buffer cache hit ratio: 89.563 / 89.774
    Free Pages: 36.778 / 33.889


    Let me know if I should be looking at other stats besides these.


    -Hartmut5
  8. Hartmut5 New Member

    Here is the DDL (with substitutions -- hopefully that didn't break it). I removed the collation specs to save space. Fillfactors are set to 100 since the data is generally read-only after being processed, and nearly the entire database is re-indexed once all processing is complete.

    CREATE TABLE [dbo].[TableA] (
    [ID_No] [varchar] (9) NOT NULL ,
    [Sub_ID_No] [varchar] (5) NOT NULL ,
    [Full_ID_No] [varchar] (19) NULL ,
    [Group_Code] [varchar] (2) NOT NULL ,
    [SubGroup_Code] [varchar] (3) NOT NULL ,
    [Special_SubGroup_Code] [varchar] (3) NOT NULL ,
    [Active_Fl] [bit] NOT NULL ,
    [DateField1] [datetime] NULL ,
    [DateField2] [datetime] NULL ,
    [DateField3] [datetime] NULL ,
    [FloatField1] [float] NULL ,
    [FloatField2] [float] NULL ,
    [DataField1] [varchar] (2) NULL ,
    [DataField2] [varchar] (1) NULL ,
    [DataField3] [varchar] (3) NULL ,
    [DataField4] [varchar] (1) NULL ,
    [DataField5] [varchar] (50) NULL ,
    [DataField6] [varchar] (50) NULL ,
    [DataField7] [varchar] (50) NULL ,
    [DataField8] [varchar] (50) NULL ,
    [DataField9] [varchar] (40) NULL ,
    [DataField10] [datetime] NULL ,
    [DataField11] [varchar] (40) NULL ,
    [DataField12] [bigint] NULL ,
    [DataField13] [bit] NULL ,
    [DataField14] [varchar] (5) NULL ,
    [DataField15] [varchar] (40) NULL ,
    [DataField16] [varchar] (40) NULL ,
    [DataField17] [varchar] (10) NULL ,
    [DataField18] [varchar] (60) NULL ,
    [DataField19] [varchar] (30) NULL ,
    [DataField20] [varchar] (30) NULL ,
    [DataField21] [varchar] (2) NULL ,
    [DataField22] [varchar] (2) NULL ,
    [DataField23] [varchar] (5) NULL ,
    [DataField24] [varchar] (5) NULL ,
    [DataField25] [varchar] (4) NULL ,
    [DataField26] [varchar] (4) NULL ,
    [DataField27] [varchar] (7) NULL ,
    [DataField28] [varchar] (35) NULL ,
    [DataField29] [varchar] (40) NULL ,
    [DataField30] [varchar] (40) NULL ,
    [DataField31] [varchar] (5) NULL ,
    [DataField32] [varchar] (40) NULL ,
    [DataField33] [varchar] (40) NULL ,
    [DataField34] [varchar] (10) NULL ,
    [DataField35] [varchar] (60) NULL ,
    [DataField36] [varchar] (30) NULL ,
    [DataField37] [varchar] (30) NULL ,
    [DataField38] [varchar] (2) NULL ,
    [DataField39] [varchar] (2) NULL ,
    [DataField40] [varchar] (5) NULL ,
    [DataField41] [varchar] (5) NULL ,
    [DataField42] [varchar] (4) NULL ,
    [DataField43] [varchar] (4) NULL ,
    [DataField44] [varchar] (7) NULL ,
    [DataField45] [varchar] (35) NULL ,
    [DataField46] [varchar] (5) NULL ,
    [DataField47] [varchar] (5) NULL ,
    [DataField48] [varchar] (1) NULL ,
    [DataField49] [varchar] (15) NULL ,
    [DataField50] [varchar] (6000) NULL ,
    [Created_Date_Time] [datetime] NULL ,
    [Modified_Date_Time] [datetime] NULL ,
    [Modified_ID] [varchar] (8) NULL ,
    [Edit_Date_Time] [datetime] NULL ,
    [Edit_Initials] [varchar] (3) NULL ,
    [Edit_Workstation] [varchar] (50) NULL ,
    [DataField51] [money] NULL ,
    [DataField52] [money] NULL ,
    [DataField53] [money] NULL ,
    [DataField54] [money] NULL ,
    [DataField55] [tinyint] NOT NULL
    ) ON [PRIMARY]
    GO

    CREATE TABLE [dbo].[TableB] (
    [TableB_ID] [numeric](19, 0) IDENTITY (1, 1) NOT NULL ,
    [DateRange_Begin_Year] [smallint] NOT NULL ,
    [DateRange_Begin_Month] [tinyint] NOT NULL ,
    [DateRange_Begin_Day] [tinyint] NOT NULL ,
    [DateRange_End_Year] [smallint] NOT NULL ,
    [DateRange_End_Month] [tinyint] NOT NULL ,
    [DateRange_End_Day] [tinyint] NOT NULL ,
    [DateField4] [datetime] NULL ,
    [DateField4_Year] [smallint] NULL ,
    [DateField5] [datetime] NULL ,
    [DateField5_Year] [smallint] NULL ,
    [ID_No] [varchar] (9) NOT NULL ,
    [DataField46] [varchar] (4) NULL ,
    [DataField47] [varchar] (4) NULL ,
    [Group_Code] [varchar] (2) NOT NULL ,
    [SubGroup_Code] [varchar] (3) NOT NULL ,
    [DataField56] [varchar] (3) NULL ,
    [DataField57] [varchar] (4) NULL ,
    [DataField58] [varchar] (2) NULL ,
    [DataField59] [varchar] (6) NULL ,
    [DataField60] [varchar] (6) NULL ,
    [DataField61] [varchar] (6) NULL ,
    [DataField1] [varchar] (2) NULL ,
    [DataField2] [varchar] (1) NULL ,
    [DataField62] [varchar] (2) NULL ,
    [DataField3] [varchar] (3) NULL ,
    [DataField4] [varchar] (1) NULL ,
    [DataField63] [smallint] NULL ,
    [DataField64] [tinyint] NULL ,
    [DataField65] [tinyint] NULL ,
    [DataField66] [varchar] (2) NULL ,
    [DataField67] [varchar] (3) NULL ,
    [NumberField1] [money] NULL ,
    [NumberField2] [money] NULL ,
    [NumberField3] [money] NULL ,
    [NumberField4] [money] NULL ,
    [NumberField5] [money] NULL ,
    [DataField68] [varchar] (23) NULL ,
    [Created_Date_Time] [datetime] NULL ,
    [NumberField6] [money] NULL ,
    [NumberField7] [money] NULL ,
    [NumberField8] [money] NULL ,
    [NumberField9] [money] NULL ,
    [NumberField10] [money] NULL ,
    [NumberField11] [money] NULL ,
    [NumberField12] [money] NULL ,
    [NumberField13] [money] NULL ,
    [NumberField14] [money] NULL ,
    [NumberField15] [money] NULL ,
    [NumberField16] [money] NULL ,
    [NumberField17] [money] NULL ,
    [NumberField18] [money] NULL ,
    [NumberField19] [money] NULL ,
    [NumberField20] [money] NULL ,
    [NumberField21] [money] NULL
    ) ON [PRIMARY]
    GO

    CREATE TABLE [dbo].[TableC] (
    [Group_Code] [varchar] (2) NOT NULL ,
    [SubGroup_Code] [varchar] (3) NOT NULL ,
    [ID_No] [varchar] (9) NOT NULL ,
    [DateField4_Year] [smallint] NOT NULL ,
    [NumberField22] [money] NOT NULL ,
    [NumberField23] [money] NOT NULL ,
    [NumberField24] [money] NOT NULL ,
    [NumberField25] [money] NOT NULL ,
    [NumberField26] [money] NOT NULL ,
    [NumberField27] [money] NOT NULL ,
    [NumberField28] [money] NOT NULL ,
    [NumberField29] [money] NOT NULL ,
    [Created_Date_Time] [datetime] NULL ,
    [NumberField30] [money] NULL ,
    [NumberField31] [money] NULL ,
    [NumberField32] [money] NULL ,
    [NumberField33] [money] NULL ,
    [DataField69] [varchar] (5) NULL
    ) ON [PRIMARY]
    GO

    ALTER TABLE [dbo].[TableC] WITH NOCHECK ADD
    CONSTRAINT [PK_TableC] PRIMARY KEY CLUSTERED
    (
    [Group_Code],
    [SubGroup_Code],
    [ID_No],
    [DateField4_Year]
    ) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE CLUSTERED INDEX [Group_SubGroup_Code] ON [dbo].[TableA]([Group_Code], [SubGroup_Code]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE CLUSTERED INDEX [Group_SubGroup_Code] ON [dbo].[TableB]([Group_Code], [SubGroup_Code], [ID_No]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    ALTER TABLE [dbo].[TableA] WITH NOCHECK ADD
    CONSTRAINT [DF_TableA_Active_Fl] DEFAULT (1) FOR [Active_Fl],
    CONSTRAINT [DF_TableA_FloatField1] DEFAULT (1) FOR [FloatField1],
    CONSTRAINT [DF_TableA_DataField55] DEFAULT (1) FOR [DataField55],
    CONSTRAINT [PK_TableA] PRIMARY KEY NONCLUSTERED
    (
    [ID_No],
    [Sub_ID_No],
    [Group_Code],
    [SubGroup_Code]
    ) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    ALTER TABLE [dbo].[TableB] WITH NOCHECK ADD
    CONSTRAINT [DF_TableB_NumberField1] DEFAULT (0) FOR [NumberField1],
    CONSTRAINT [DF_TableB_NumberField2] DEFAULT (0) FOR [NumberField2],
    CONSTRAINT [DF_TableB_NumberField3] DEFAULT (0) FOR [NumberField3],
    CONSTRAINT [DF_TableB_NumberField4] DEFAULT (0) FOR [NumberField4],
    CONSTRAINT [DF_TableB_NumberField5] DEFAULT (0) FOR [NumberField5],
    CONSTRAINT [DF_TableB_NumberField6] DEFAULT (0) FOR [NumberField6],
    CONSTRAINT [DF_TableB_NumberField7] DEFAULT (0) FOR [NumberField7],
    CONSTRAINT [DF_TableB_NumberField8] DEFAULT (0) FOR [NumberField8],
    CONSTRAINT [DF_TableB_NumberField9] DEFAULT (0) FOR [NumberField9],
    CONSTRAINT [DF_TableB_NumberField10] DEFAULT (0) FOR [NumberField10],
    CONSTRAINT [DF_TableB_NumberField11] DEFAULT (0) FOR [NumberField11],
    CONSTRAINT [DF_TableB_NumberField12] DEFAULT (0) FOR [NumberField12],
    CONSTRAINT [DF_TableB_NumberField13] DEFAULT (0) FOR [NumberField13],
    CONSTRAINT [DF_TableB_NumberField14] DEFAULT (0) FOR [NumberField14],
    CONSTRAINT [DF_TableB_NumberField15] DEFAULT (0) FOR [NumberField15],
    CONSTRAINT [DF_TableB_NumberField16] DEFAULT (0) FOR [NumberField16],
    CONSTRAINT [DF_TableB_NumberField17] DEFAULT (0) FOR [NumberField17],
    CONSTRAINT [DF_TableB_NumberField18] DEFAULT (0) FOR [NumberField18],
    CONSTRAINT [DF_TableB_NumberField19] DEFAULT (0) FOR [NumberField19],
    CONSTRAINT [DF_TableB_NumberField20] DEFAULT (0) FOR [NumberField20],
    CONSTRAINT [DF_TableB_NumberField21] DEFAULT (0) FOR [NumberField21],
    CONSTRAINT [PK_TableB] PRIMARY KEY NONCLUSTERED
    (
    [TableB_ID]
    ) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    ALTER TABLE [dbo].[TableC] WITH NOCHECK ADD
    CONSTRAINT [DF_TableC_NumberField22] DEFAULT (0) FOR [NumberField22],
    CONSTRAINT [DF_TableC_NumberField23] DEFAULT (0) FOR [NumberField23],
    CONSTRAINT [DF_TableC_NumberField24] DEFAULT (0) FOR [NumberField24],
    CONSTRAINT [DF_TableC_NumberField25] DEFAULT (0) FOR [NumberField25],
    CONSTRAINT [DF_TableC_NumberField26] DEFAULT (0) FOR [NumberField26],
    CONSTRAINT [DF_TableC_NumberField27] DEFAULT (0) FOR [NumberField27],
    CONSTRAINT [DF_TableC_NumberField28] DEFAULT (0) FOR [NumberField28],
    CONSTRAINT [DF_TableC_NumberField29] DEFAULT (0) FOR [NumberField29],
    CONSTRAINT [DF_TableC_Created_Date_Time] DEFAULT (getdate()) FOR [Created_Date_Time]
    GO

    CREATE UNIQUE INDEX [ix_TableA_FullIDNo] ON [dbo].[TableA]([Full_ID_No]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE INDEX [ix_TableA_DateField2] ON [dbo].[TableA]([DateField2]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE UNIQUE INDEX [ix_TableA_DataField6] ON [dbo].[TableA]([DataField6], [Full_ID_No]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE UNIQUE INDEX [ix_TableA_DataField8] ON [dbo].[TableA]([DataField8], [Full_ID_No]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE UNIQUE INDEX [ix_TableA_DataField47] ON [dbo].[TableA]([DataField47], [Full_ID_No]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    /****** The index created by the following statement is for internal use only. ******/
    /****** It is not a real index but exists as statistics only. ******/
    if (@@microsoftversion > 0x07000000 )
    EXEC ('CREATE STATISTICS [Statistic_Special_SubGroup_Code] ON [dbo].[TableA] ([Special_SubGroup_Code]) ')
    GO

    CREATE INDEX [ix_TableA_Group_SpecialSubGroupCode] ON [dbo].[TableA]([Group_Code], [Special_SubGroup_Code]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE INDEX [ix_TableA_DataField1217] ON [dbo].[TableA]([DataField15], [DataField12], [DataField14], [DataField16], [DataField17]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE INDEX [ix_TableA_DataField3034] ON [dbo].[TableA]([DataField32], [DataField30], [DataField31], [DataField33], [DataField34]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE UNIQUE INDEX [ix_TableA_DataField18] ON [dbo].[TableA]([DataField18], [Full_ID_No]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    /****** The index created by the following statement is for internal use only. ******/
    /****** It is not a real index but exists as statistics only. ******/
    if (@@microsoftversion > 0x07000000 )
    EXEC ('CREATE STATISTICS [Statistic_DateField3] ON [dbo].[TableA] ([DateField3]) ')
    GO

    /****** The index created by the following statement is for internal use only. ******/
    /****** It is not a real index but exists as statistics only. ******/
    if (@@microsoftversion > 0x07000000 )
    EXEC ('CREATE STATISTICS [Statistic_DataField1] ON [dbo].[TableA] ([DataField1]) ')
    GO

    /****** The index created by the following statement is for internal use only. ******/
    /****** It is not a real index but exists as statistics only. ******/
    if (@@microsoftversion > 0x07000000 )
    EXEC ('CREATE STATISTICS [Statistic_DataField2] ON [dbo].[TableA] ([DataField2]) ')
    GO

    /****** The index created by the following statement is for internal use only. ******/
    /****** It is not a real index but exists as statistics only. ******/
    if (@@microsoftversion > 0x07000000 )
    EXEC ('CREATE STATISTICS [Statistic_DataField4] ON [dbo].[TableA] ([DataField4]) ')
    GO

    /****** The index created by the following statement is for internal use only. ******/
    /****** It is not a real index but exists as statistics only. ******/
    if (@@microsoftversion > 0x07000000 )
    EXEC ('CREATE STATISTICS [Statistic_DataField5] ON [dbo].[TableA] ([DataField5]) ')
    GO

    /****** The index created by the following statement is for internal use only. ******/
    /****** It is not a real index but exists as statistics only. ******/
    if (@@microsoftversion > 0x07000000 )
    EXEC ('CREATE STATISTICS [Statistic_DataField7] ON [dbo].[TableA] ([DataField7]) ')
    GO

    /****** The index created by the following statement is for internal use only. ******/
    /****** It is not a real index but exists as statistics only. ******/
    if (@@microsoftversion > 0x07000000 )
    EXEC ('CREATE STATISTICS [Statistic_DataField9] ON [dbo].[TableA] ([DataField9]) ')
    GO

    CREATE UNIQUE INDEX [ix_TableA_DataField35] ON [dbo].[TableA]([DataField35], [Full_ID_No]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE INDEX [ix_TableA_DateField1] ON [dbo].[TableA]([DateField1]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE UNIQUE INDEX [ix_TableA_FilterFields] ON [dbo].[TableA]([Full_ID_No], [Special_SubGroup_Code], [DateField1], [DateField2], [DataField47], [Active_Fl]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE UNIQUE INDEX [ix_TableA_DataField51] ON [dbo].[TableA]([DataField51], [Full_ID_No]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE UNIQUE INDEX [ix_TableA_DataField52] ON [dbo].[TableA]([DataField52], [Full_ID_No]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE INDEX [ix_TableB_CreatedDateTime] ON [dbo].[TableB]([Created_Date_Time], [ID_No], [DataField47], [DateField4_Year], [DateField5_Year]) WITH FILLFACTOR = 100 ON [PRIMARY]
    GO

    CREATE INDEX [ix_TableB_IDNo] ON [dbo].[TableB]([ID_No]) ON [PRIMARY]
    GO

    CREATE INDEX [ix_TableC_IDNo] ON [dbo].[TableC]([ID_No], [Group_Code], [SubGroup_Code]) ON [PRIMARY]
    GO

    -Hartmut5
  9. Hartmut5 New Member

    Whew! Thank goodness for cut and paste. By the way, is there any way to indent in these forums? Following is the entire text of the stored procedure:

    TRUNCATE TABLE TableC

    INSERT INTO TableC
    (Group_Code, SubGroup_Code
    ,DateField4_Year, ID_No
    ,NumberField22, NumberField23, NumberField30
    ,NumberField24, NumberField25, NumberField31
    ,NumberField26, NumberField27, NumberField32
    ,NumberField28, NumberField29, NumberField33
    ,DataField69
    )
    SELECT b.Group_Code, b.SubGroup_Code
    ,b.DateField4_Year, b.ID_No
    ,Sum(b.NumberField6), Sum(b.NumberField7), Sum(b.NumberField6) + Sum(b.NumberField7)
    ,Sum(b.NumberField8), Sum(b.NumberField9), Sum(b.NumberField8) + Sum(b.NumberField9)
    ,Sum(b.NumberField10), Sum(b.NumberField11), Sum(b.NumberField10) + Sum(b.NumberField11)
    ,Sum(b.NumberField12), Sum(b.NumberField13), Sum(b.NumberField12) + Sum(b.NumberField13)
    ,IsNull((SELECT TOP 1 a.DataField47 FROM TableA a
    WHERE a.Group_Code = b.Group_Code
    AND a.SubGroup_Code = b.SubGroup_Code
    AND a.ID_No = b.ID_No), Max(b.DataField47))
    FROM TableB b
    GROUP BY b.Group_Code, b.SubGroup_Code, b.DateField4_Year, b.ID_No

    -Hartmut5
  10. satya Moderator

    Tip: You can use [ code ] .... [ /code ] snippets for such representations.
    I've modified your posts.

  11. satya Moderator

    Also refer to the points raised by Joe:
    How big is your log file? I think what is happening is that this operation requires reading from the data partition and writing to the log, disrupting sequential I/O on both operations.
    Also, are there other people using the db at this time?


  12. Hartmut5 New Member

    163 hours and still going.

    The log file is still 18GB in size, but now 57% in use. (It was 30% at 70 hours, so the active part of the log file has grown about 4.9GB in 93 hours.) There are no other users for this database/server.

    I ran a SELECT COUNT(*) WITH (NOLOCK) to get the number of rows in the table, which is now around 16.9 million. I believe there should be a little more than 17 million rows, so it looks like the operation is nearly complete. I'll finally get to see the actual execution plan once it is done, but I'm not sure if this will reveal anything, since log activity is not included.


    -Hartmut5
  13. Twan New Member

    Hi Hartmut,

    It would be interesting to see if the statement is an order of magnitude or two faster without the following subquery:



    ,IsNull((SELECT TOP 1 a.DataField47 FROM TableA a
    WHERE a.Group_Code = b.Group_Code
    AND a.SubGroup_Code = b.SubGroup_Code
    AND a.ID_No = b.ID_No), Max(b.DataField47))


    If it is, then perhaps the statement could be reworked a little...?

    Cheers
    Twan

  14. Hartmut5 New Member

    It finally finished! It looks like the slowness may have been due to a statistics problem. When I finally got to see the actual execution plan, the estimated rowcount was only 83,500 -- nowhere near the 17M that it ended up being. As a result, the execution plan did a primary key index seek and then a bookmark lookup for each of the 17M rows, storing the results in an in-memory hash table. The IO stats it produced are pretty scary:


    Table 'TableC'. Scan count 0, logical reads 233166026, physical reads 13014123, read-ahead reads 5934807.
    Table 'TableA'. Scan count 17074727, logical reads 137699216, physical reads 22783654, read-ahead reads 947548.
    Table 'TableB'. Scan count 1, logical reads 2573204, physical reads 3, read-ahead reads 2596596.

    (17101856 row(s) affected)

    Even after running UPDATE STATISTICS on TableB (the one that determines the row count) I still get 85k rows instead of 17M. I'll try updating the statistics with a larger sample and see how the execution plan differs from the one it's currently choosing. It seems to me that the estimated execution plan (before I ran the stored procedure) used a temporary index on TableA. And I'm pretty sure the same thing showed up in the execution plan the first time I ran the stored procedure (when it only took a little over an hour).
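    A larger-sample statistics refresh on the driving table might look like this (a sketch; the sample percentage is a judgment call between accuracy and runtime):

```sql
-- Sketch: resample statistics on TableB more aggressively than the default.
UPDATE STATISTICS TableB WITH SAMPLE 50 PERCENT
-- or pay for an exact pass over every row:
-- UPDATE STATISTICS TableB WITH FULLSCAN
```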

    It looks like my mistake was in assuming that the optimizer would choose the same plan twice. I also should have picked up on the temporary index in the prior query plans and built that index into the stored procedure, even if I do end up dropping it at the end of the procedure.

    I guess one possibility for re-working the procedure (to eliminate the sub-select) would be to insert the values, leaving the sub-select field blank, then updating that field for the entire table using a join. I'll have to time it and see if that would be faster than creating a temporary index.
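    That rework might look roughly like this. It is only a sketch: the column lists are abbreviated to a few representative fields, Max(b.DataField47) is loaded up front as the fallback value, and the joined UPDATE picks an arbitrary matching TableA row when several exist, much like the original TOP 1 without an ORDER BY (the NULL handling is close to, but not exactly, the original IsNull logic):

```sql
-- Step 1 (sketch): aggregate into TableC, loading the fallback value for
-- DataField69 directly. Remaining NumberField sums as in the original proc.
INSERT INTO TableC (Group_Code, SubGroup_Code, DateField4_Year, ID_No,
                    NumberField22, NumberField23, NumberField30, DataField69)
SELECT b.Group_Code, b.SubGroup_Code, b.DateField4_Year, b.ID_No,
       Sum(b.NumberField6), Sum(b.NumberField7),
       Sum(b.NumberField6) + Sum(b.NumberField7),
       Max(b.DataField47)
FROM TableB b
GROUP BY b.Group_Code, b.SubGroup_Code, b.DateField4_Year, b.ID_No

-- Step 2 (sketch): overwrite the fallback wherever TableA has a match.
UPDATE c
SET c.DataField69 = a.DataField47
FROM TableC c
JOIN TableA a
  ON a.Group_Code = c.Group_Code
 AND a.SubGroup_Code = c.SubGroup_Code
 AND a.ID_No = c.ID_No
```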

    Also, sorting the SELECT statement to match TableC's clustered index shouldn't make any difference, since it normally sorts the rows while inserting anyway, correct?

    Thanks again for all your help. This is a great site!


    -Hartmut5
  15. Twan New Member

    Hi Hartmut,

    The clustered index on TableC is not in the same order as the GROUP BY, so there will most likely be some resorting of rows as they are inserted into the table. I'm not sure whether this performance degradation is larger than matching the clustered index and GROUP BY column order and adding an ORDER BY to the INSERT...SELECT (can't remember if that's possible...)

    The indexes have all been created with fillfactor 100... This is only really good for read-only tables, not for anything that is inserted/updated. It is much better to stick to the default of 0, which won't totally fill up the intermediary pages, thus avoiding page splits at those levels (which can result in fragmentation and, as a result, degradation at both read and write time).

    In terms of the insert proc, it does then look like the problem was the selecting from TableA. Is it possible to add an index to TableA on (Group_Code, SubGroup_Code, ID_No, DataField47)?
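    Such an index might be declared like this (a sketch; the index name is hypothetical, and since this is SQL Server 2000 all four columns go into the key so the correlated subquery can be answered from the index alone):

```sql
-- Hypothetical covering index for the correlated subquery on TableA:
-- seeks on (Group_Code, SubGroup_Code, ID_No) and returns DataField47
-- without a bookmark lookup into the base table.
CREATE INDEX [ix_TableA_GroupSubGroupID_DF47]
ON [dbo].[TableA] ([Group_Code], [SubGroup_Code], [ID_No], [DataField47])
```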

    Cheers
    Twan
  16. Hartmut5 New Member

    I updated the statistics on TableB (I can't remember if I used 25% or 50%), which took about 45 minutes. At this point the estimated row count went up to 4M, which was enough to push the optimizer to create the temporary index. Re-running the stored procedure then finished in 51 minutes. Here are the much improved IO stats:


    Table 'TableC'. Scan count 0, logical reads 179959494, physical reads 0, read-ahead reads 0.
    Table 'Worktable'. Scan count 4832515, logical reads 39329234, physical reads 44715, read-ahead reads 0.
    Table 'TableA'. Scan count 1, logical reads 409904, physical reads 0, read-ahead reads 411349.
    Table 'TableB'. Scan count 1, logical reads 2573204, physical reads 4, read-ahead reads 2596596.

    (17101856 row(s) affected)

    Since this is only one step in a long process, I'm going for the shortest possible time for the entire process. Given that the statistics update took about 45 minutes, I'm going to try explicitly creating the index and then dropping it once the procedure is complete. I'm hoping that this will keep the total time for this step down to around 1 hour (including the index creation).

    Is an Index Spool/Eager Spool stored in memory (or otherwise optimized)? Or is it basically like a normal index? I'm just wondering if relying on the optimizer to create the index would have any advantage over manually creating one (when the resulting rowcount is known).



    -Hartmut5
  17. satya Moderator

    The hit seems to be an Index Spool/Eager Spool in the execution plan which adds some writes to the process (according to the trace).


  18. Hartmut5 New Member

    I re-ran the stored procedure again, this time with the index pre-defined, and it still ran longer than expected. When I looked at the query plan this time, it used hash grouping instead of order grouping. I'm trying it one more time using OPTION ( ORDER GROUP ), which should hopefully bring the time down to about an hour instead of 24+.
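    Appended to the INSERT...SELECT, the hint looks like this (a sketch with the column list and aggregates abbreviated; the full proc lists every NumberField as before):

```sql
-- Sketch: force stream (order-based) aggregation instead of hash aggregation.
INSERT INTO TableC (Group_Code, SubGroup_Code, DateField4_Year, ID_No, NumberField22)
SELECT b.Group_Code, b.SubGroup_Code, b.DateField4_Year, b.ID_No,
       Sum(b.NumberField6)
FROM TableB b
GROUP BY b.Group_Code, b.SubGroup_Code, b.DateField4_Year, b.ID_No
OPTION (ORDER GROUP)
```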


    -Hartmut5
  19. zenonmist New Member

    1.
    Your clustered indexes store the data ordered; each time you add new data, it may be reordered depending on the data. SQL Server is probably rebuilding your indexes with each new insert.
    Pad the indexes and leave some space (say 25%) so it can reorder the indexes without recreating them.
    Better yet, drop your indexes before you INSERT and create them anew after the insert.
    Or replace clustered indexes with nonclustered indexes. This will eliminate the ordering completely. Then add new nonclustered indexes that provide the order you need in the queries.

    2. Instead of selecting from table X directly into table Y, you might want to select what you need into table Z (with no indexes, and the exact shape of the select query), then INSERT into table Y. This simplifies SQL Server's task by eliminating any ordering in the query.
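    Under that approach, the staging step might look like this (a sketch; TableZ is a hypothetical name and the aggregate columns are abbreviated, with the remaining sums as in the original proc):

```sql
-- Sketch: materialize the aggregate into an unindexed staging heap first,
-- then load the indexed target in a second, simpler pass.
SELECT b.Group_Code, b.SubGroup_Code, b.DateField4_Year, b.ID_No,
       Sum(b.NumberField6) AS NumberField22
INTO TableZ                              -- heap, no indexes to maintain
FROM TableB b
GROUP BY b.Group_Code, b.SubGroup_Code, b.DateField4_Year, b.ID_No

INSERT INTO TableC (Group_Code, SubGroup_Code, DateField4_Year, ID_No, NumberField22)
SELECT Group_Code, SubGroup_Code, DateField4_Year, ID_No, NumberField22
FROM TableZ

DROP TABLE TableZ                        -- staging table is throwaway
```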

    These approaches have fixed performance lag by many magnitudes in my experience.
  20. mmarovic Active Member

    Hartmut5, is your problem solved yet?
