SQL Server Performance

Indexing a table with 25 million rows

Discussion in 'SQL Server 2005 T-SQL Performance Tuning' started by rejimani, Feb 23, 2009.

  1. rejimani New Member

    H i,
    I have a table with 25 million records. Sample format is given below.
    SELECT TOP 5 Id, Call_Time, Duration, Rate, Charge, Calling_Nop, Called_Nop, Tier
    FROM EWSD
    ----------------------------------------------------------------------------------
    463935 2009-01-30 00:28:17.000 10 11.4 1.9 MTN VIS MOB
    463936 2009-01-30 00:28:20.000 10 11.4 1.9 MTN VIS MOB
    463937 2009-01-30 00:27:07.000 87 11.4 16.53 MTN VIS MOB
    463938 2009-01-30 00:28:34.000 1 11.4 0.19 MTN VIS MOB
    463939 2009-01-30 00:28:08.000 30 11.4 5.7 ETS VIS MOB
    ----------------------------------------------------------------------------------
    I ahve clustered index (primary key) on ID column and non-clustered indexes on Call_Date and Call_Time columns. Aproximately 3-400,000 records are added to this table every day.
    I have a query executed from a stored procedure which takes 5 minutes to complete.
    SELECT
    Call_Date,
    Count(ID) AS CallCount,
    sum(Duration) AS Duration
    FROM
    EWSD
    WHERE
    Call_Time BETWEEN '02/01/2009 00:00:01' AND '02/23/2009 10:10:00'
    AND Called_Nop='MTN'
    GROUP BY
    Call_Date
    Explain plan shows it is using the non-clustered index on call_time column.
    Any suggestions to improve the performance of this query execution?
    Reji Mani

  2. FrankKalis Moderator

    I would probably add "Calling_Nop" as "regular" column to the index and "Duration as INCLUDED column.
    Currently neither of your two indices satisfy the query, so SQL Server needs to scan or lookup the missing data which is costly. Still 5 minutes is quite long.
  3. SqlMan4u New Member

    Let me know whether the execution plan shows it is using non-clustered index scan or seek.
    Any delete operations are going on in large scale on this table? If so, how frequently?
  4. Atesim New Member

    Hi there,

    I'm not certain as to how the nonclustered indexes are defined. Are you saying you have a nonclustered composite with calltime+calldate? Or two separate nonclustered, one each for calltime and calldate? I'm guessing the latter something like:

    id, primary key clustered
    create nonclustered index idx_ewsd__call_date on ewsd (call_date)
    create nonclustered index idx_ewsd__call_time on ewsd (call_time)

    So, I created a table with schema based on your notes as well as used a data generator for the data (50million rows). Did this on my laptop.

    The index configuration you describe doesn't quite take 5 minutes. In fact, for a result set of approximately 11million rows, I clocked the query at 62 seconds.
    Execution plan looks like:
    SELECT 0%-->ComputeScalar 0%-->StreamAgg 0%-->Sort 63% --> NestedLoops 0% FORK1--> IndexSeek (idx_ewsd__call_time) 18%
    FORK2-->KeyLookup (pk_ewsd_id) 18%

    Now, I think redefining your nonclustered indexes to be composite and even cover might yield better results.

    I dropped previous indexes (excluding clustered) and created the following:

    create nonclustered index idx_ewsd__call_time_called_nop on ewsd (call_time, called_nop, call_date) include (duration)


    As expected, this changes the execution plan:
    SELECT 0%-->ComputeScalar 0%-->StreamAgg 0%-->Sort 78% --> IndexSeek (idx_ewsd__call_time_callednop_call_date) 22%

    Returning the same number of rows in 11 seconds.
    Now, I can't know about how else the table is used (what other type of queries are being executed) so that is something to consider. This allows for the query search predicate and group by/sym is satisfied by the index.

    If you're using SQL2005, you might consider the big picture by looking at the DMVs for index utilization. SQL server is pretty good about telling you also which indexes are essentially useless and which are missing (yes, it will recommend to you which columns and in what order!) The stats in these DMVs are cumulative and begin from service startup.
    See the following link for more information:
    http://www.sql-server-performance.c...Missing_Indexes_Using_SQL_Server_DMVs_p1.aspx

    Cheers!






  5. moh_hassan20 New Member

    [quote user="Atesim"]
    ........
    In fact, for a result set of approximately 11million rows, I clocked the query at 62 seconds.
    Execution plan looks like:
    SELECT 0%-->ComputeScalar 0%-->StreamAgg 0%-->Sort 63% --> NestedLoops 0% FORK1--> IndexSeek (idx_ewsd__call_time) 18%
    FORK2-->KeyLookup (pk_ewsd_id) 18%

    .....
    As expected, this changes the execution plan:
    SELECT 0%-->ComputeScalar 0%-->StreamAgg 0%-->Sort 78% --> IndexSeek (idx_ewsd__call_time_callednop_call_date) 22%

    Returning the same number of rows in 11 seconds.
    .....
    [/quote]
    Atesim, that is excellent QA lab for simulating the problem performance
    Can you set statistics IO ON , set statistics time on , and post the result of the two solution to have complete picture of the performance.
    These values are good measure for CPU and IO usage resources.

    There is other solution:
    Can you create idx_ewsd__call_time_called_nop as a clustered index , and
    id, primary key non clustered , and show us the result of performance

    it is better for avoid auto increment cluster key (if table has many insertion)

  6. Adriaan New Member

    "it is better for avoid auto increment cluster key (if table has many insertion)"
    Uh, no quite the opposite - if the key values are purely incremental, like an identity column, it is the ideal candidate for a clustered index.
  7. moh_hassan20 New Member

    <p>Hi Adriaan ,&nbsp;</p><p>&nbsp;I agree with you to some extent , but i need to clarify that point because there are two opinions for auto increment fields.<br></p><p>The problem is in statistics update , and all insertions are in the last page which may lead to page lock.</p><p>&nbsp;</p><table class="MsoTableGrid" style="border: medium none ; border-collapse: collapse;" border="1" cellpadding="0" cellspacing="0"> <tbody><tr style=""> <td style="border: 1pt solid windowtext; padding: 0cm 5.4pt; width: 426.1pt;" valign="top" width="568"> <h4>Ascending Keys and Auto Quick Corrected Statistics <span title="Rated Excellent [5 out of 5 / rated 1 time(s)]." id="ctl00___ctl00___ctl00_ctl00_bcr_ctl00___Entry___Ratings"><!--[if gte vml 1]><v:shapetype id="_x0000_t75" coordsize="21600,21600" o:spt="75" o:preferrelative="t" path="m@4@5l@4@11@9@11@9@5xe" filled="f" stroked="f"> <v:stroke joinstyle="miter"/> <v:formulas> <v:f eqn="if lineDrawn pixelLineWidth 0"/> <v:f eqn="sum @0 1 0"/> <v:f eqn="sum 0 0 @1"/> <v:f eqn="prod @2 1 2"/> <v:f eqn="prod @3 21600 pixelWidth"/> <v:f eqn="prod @3 21600 pixelHeight"/> <v:f eqn="sum @0 0 1"/> <v:f eqn="prod @6 1 2"/> <v:f eqn="prod @7 21600 pixelWidth"/> <v:f eqn="sum @8 21600 0"/> <v:f eqn="prod @7 21600 pixelHeight"/> <v:f eqn="sum @10 21600 0"/> </v:formulas> <v:path o:extrusionok="f" gradientshapeok="t" o:connecttype="rect"/> <o:lock v:ext="edit" aspectratio="t"/> </v:shapetype><v:shape id="_x0000_i1025" type="#_x0000_t75" alt="" style='width:4.5pt; height:9pt'> <v:imagedata src="file:///G:DOCUME~1mLOCALS~1Tempmsohtml11clip_image001.gif" o:href="http://blogs.msdn.com/Themes/default/images/common/star-left-on.gif"/> </v:shape><![endif]--><!--[if !vml]--><img src="file:///G:/DOCUME%7E1/m/LOCALS%7E1/Temp/msohtml1/01/clip_image001.gif" v:shapes="_x0000_i1025" align="absmiddle" border="0" height="12" width="6"><!--[endif]--><!--[if gte vml 1]><v:shape id="_x0000_i1026" type="#_x0000_t75" alt="" style='width:3.75pt;height:9pt'> <v:imagedata src="file:///G:DOCUME~1mLOCALS~1Tempmsohtml11clip_image002.gif" o:href="http://blogs.msdn.com/Themes/default/images/common/star-right-on.gif"/> </v:shape><![endif]--><!--[if !vml]--><img src="file:///G:/DOCUME%7E1/m/LOCALS%7E1/Temp/msohtml1/01/clip_image002.gif" v:shapes="_x0000_i1026" align="absmiddle" border="0" height="12" width="5"><!--[endif]--><!--[if gte vml 1]><v:shape id="_x0000_i1027" type="#_x0000_t75" alt="" style='width:4.5pt;height:9pt'> <v:imagedata src="file:///G:DOCUME~1mLOCALS~1Tempmsohtml11clip_image001.gif" o:href="http://blogs.msdn.com/Themes/default/images/common/star-left-on.gif"/> </v:shape><![endif]--><!--[if !vml]--><img src="file:///G:/DOCUME%7E1/m/LOCALS%7E1/Temp/msohtml1/01/clip_image001.gif" v:shapes="_x0000_i1027" align="absmiddle" border="0" height="12" width="6"><!--[endif]--><!--[if gte vml 1]><v:shape id="_x0000_i1028" type="#_x0000_t75" alt="" style='width:3.75pt;height:9pt'> <v:imagedata src="file:///G:DOCUME~1mLOCALS~1Tempmsohtml11clip_image002.gif" o:href="http://blogs.msdn.com/Themes/default/images/common/star-right-on.gif"/> </v:shape><![endif]--><!--[if !vml]--><img src="file:///G:/DOCUME%7E1/m/LOCALS%7E1/Temp/msohtml1/01/clip_image002.gif" v:shapes="_x0000_i1028" align="absmiddle" border="0" height="12" width="5"><!--[endif]--><!--[if gte vml 1]><v:shape id="_x0000_i1029" type="#_x0000_t75" alt="" style='width:4.5pt;height:9pt'> <v:imagedata src="file:///G:DOCUME~1mLOCALS~1Tempmsohtml11clip_image001.gif" o:href="http://blogs.msdn.com/Themes/default/images/common/star-left-on.gif"/> </v:shape><![endif]--><!--[if !vml]--><img src="file:///G:/DOCUME%7E1/m/LOCALS%7E1/Temp/msohtml1/01/clip_image001.gif" v:shapes="_x0000_i1029" align="absmiddle" border="0" height="12" width="6"><!--[endif]--><!--[if gte vml 1]><v:shape id="_x0000_i1030" type="#_x0000_t75" alt="" style='width:3.75pt;height:9pt'> <v:imagedata src="file:///G:DOCUME~1mLOCALS~1Tempmsohtml11clip_image002.gif" o:href="http://blogs.msdn.com/Themes/default/images/common/star-right-on.gif"/> </v:shape><![endif]--><!--[if !vml]--><img src="file:///G:/DOCUME%7E1/m/LOCALS%7E1/Temp/msohtml1/01/clip_image002.gif" v:shapes="_x0000_i1030" align="absmiddle" border="0" height="12" width="5"><!--[endif]--><!--[if gte vml 1]><v:shape id="_x0000_i1031" type="#_x0000_t75" alt="" style='width:4.5pt;height:9pt'> <v:imagedata src="file:///G:DOCUME~1mLOCALS~1Tempmsohtml11clip_image001.gif" o:href="http://blogs.msdn.com/Themes/default/images/common/star-left-on.gif"/> </v:shape><![endif]--><!--[if !vml]--><img src="file:///G:/DOCUME%7E1/m/LOCALS%7E1/Temp/msohtml1/01/clip_image001.gif" v:shapes="_x0000_i1031" align="absmiddle" border="0" height="12" width="6"><!--[endif]--><!--[if gte vml 1]><v:shape id="_x0000_i1032" type="#_x0000_t75" alt="" style='width:3.75pt;height:9pt'> <v:imagedata src="file:///G:DOCUME~1mLOCALS~1Tempmsohtml11clip_image002.gif" o:href="http://blogs.msdn.com/Themes/default/images/common/star-right-on.gif"/> </v:shape><![endif]--><!--[if !vml]--><img src="file:///G:/DOCUME%7E1/m/LOCALS%7E1/Temp/msohtml1/01/clip_image002.gif" v:shapes="_x0000_i1032" align="absmiddle" border="0" height="12" width="5"><!--[endif]--><!--[if gte vml 1]><v:shape id="_x0000_i1033" type="#_x0000_t75" alt="" style='width:4.5pt;height:9pt'> <v:imagedata src="file:///G:DOCUME~1mLOCALS~1Tempmsohtml11clip_image001.gif" o:href="http://blogs.msdn.com/Themes/default/images/common/star-left-on.gif"/> </v:shape><![endif]--><!--[if !vml]--><img src="file:///G:/DOCUME%7E1/m/LOCALS%7E1/Temp/msohtml1/01/clip_image001.gif" v:shapes="_x0000_i1033" align="absmiddle" border="0" height="12" width="6"><!--[endif]--><!--[if gte vml 1]><v:shape id="_x0000_i1034" type="#_x0000_t75" alt="" style='width:3.75pt;height:9pt'> <v:imagedata src="file:///G:DOCUME~1mLOCALS~1Tempmsohtml11clip_image002.gif" o:href="http://blogs.msdn.com/Themes/default/images/common/star-right-on.gif"/> </v:shape><![endif]--><!--[if !vml]--><img src="file:///G:/DOCUME%7E1/m/LOCALS%7E1/Temp/msohtml1/01/clip_image002.gif" v:shapes="_x0000_i1034" align="absmiddle" border="0" height="12" width="5"><!--[endif]--><span style="display: none;"></span><input value="5" id="ctl00___ctl00___ctl00_ctl00_bcr_ctl00___Entry___Ratings_Value" type="hidden"></span><o:p></o:p></h4> <p>A common problem for some SQL Server applications are cases where data typically ascends.&nbsp; For example, datetime columns where the column represents a current date.&nbsp; SQL Server builds statistics with the assumption that the data will by in large be similar in the future.&nbsp; However, when data typically ascends, most new&nbsp;insertions are out of the previously found range.&nbsp; <i><b>This can lead to poorly performing plans as filters selecting recent data seem to exclude the entire relation when in fact a significant number of rows are included.</b></i><o:p></o:p></p> <p class="MsoNormal" style="text-align: left; direction: ltr; unicode-bidi: embed;"><span style=""><o:p>&nbsp;ref: http://blogs.msdn.com/ianjo/archive/2006/04/24/582227.aspx</o:p></span></p><p class="MsoNormal" style="text-align: left; direction: ltr; unicode-bidi: embed;">also there is a white paper for microsoft , but may be i find the url latter.&nbsp;</p> </td> </tr></tbody></table><p>&nbsp;for luck , Microsoft in sql 2005 SP1 resolved that issue usingTrace flag 2389 and 2390, both new in SQL Server 2005 SP1, can help to address this problem.&nbsp;&nbsp; </p><p>but in sql 2000 , that issue is still as is.<br></p><p>so , let me say i agree with you in case sql2005 SP1 or higher&nbsp; [:D]<br></p><p>i&nbsp; agree with me&nbsp; in sql 2000&nbsp;[:)]</p><p>Any way , i respect your opinion.</p><p>&nbsp;</p>
  8. Adriaan New Member

    Hi Mohamed,
    Thanks for the follow-up. Sounds like that problem can be targeted with the proper maintenance plan?

Share This Page