SQL Server Performance

Optimizer not using our indexes

Discussion in 'SQL Server 2005 Performance Tuning for DBAs' started by Lychev, Dec 18, 2006.

  1. Lychev New Member

    Description of problem.<br />We have in the Database table ConsumerListing with a number of<br />searchable<br />columns. We can not predict the combination of search conditions.<br />Because of<br />that we chose to create a number of one-column non-clustered indexes.<br /><br />Two types of searches are being performed on the table.<br />Type 1: we need to access all records which satisfy certain conditions.<br />In<br />our example people who Dogs and interested in Gardening in the household.<br />Type 2 we need to access records which satisfy certain conditions and<br />have<br />certain phone numbers. To achieve that we decided in addition to<br />one-column<br />non-clustered indexes to create indexes with keys AreaCode, Phone and<br />add<br />all the other search conditions on the leaf level of non clustered index<br />(included columns new for SQL Server 2005).<br /><br />We decided to test query<br />select count(*) from ConsumerListing<br />where<br />Dogs='Y' AND Gardner='Y'<br />It took long. Query optimizer issued following execution plan:<br />StmtText<br />------------------------------------------------------------------------<br />----<br />------------------------------------------------------------------------<br />----<br />-----------------------------------------------<br /> |--Compute<br />Scalar(DEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Expr1004]=CONVERT_IMPLICIT(int,[globalagg1008],0)))<br /> |--Stream<br />Aggregate(DEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[globalagg1008]=SUM([partialagg1007])))<br /> |--Parallelism(Gather Streams)<br /> |--Stream Aggregate(DEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[partialagg1007]=Count(*)))<br /> |--Index<br />Scan(OBJECT<img src='/community/emoticons/emotion-6.gif' alt=':(' />[DD_Data].[dbo].[ConsumerListing].[Ind_Phone]),<br />WHERE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[DD_Data].[dbo].[ConsumerListing].[Dogs]='Y' AND<br />[DD_Data].[dbo].[ConsumerListing].[Gardner]='Y'))<br />Meaning instead of seeking indexes on Dogs and Cats and merging results<br />(Behavior we expected) optimaizer choose to scan<br />Ind_Phone.<br />We decided to overwrite this with index hints<br />select count(*) from ConsumerListing<br />with (index (IND_Dogs, Ind_Gardner))<br />where<br />Dogs='Y' AND Gardner='Y'<br />We get expected Behavior<br />StmtText<br />------------------------------------------------------------------------<br />----<br />------------------------------------------------------------------------<br />----<br />------------------------------------------------------------------------<br />----<br />------------------------------------------------------------------------<br />----<br />------------------------------------------------------------------------<br />----<br />------------------------------------------------------------------------<br />----<br />-----------------------------------------------<br /> |--Compute<br />Scalar(DEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Expr1004]=CONVERT_IMPLICIT(int,[Expr1007],0)))<br /> |--Stream Aggregate(DEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Expr1007]=Count(*)))<br /> |--Merge Join(Inner Join,<br />MERGE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[DD_Data].[dbo].[ConsumerListing].[AddressID],<br />[DD_Data].[dbo].[ConsumerListing].[IndividualId],<br />[Uniq1002])=([DD_Data].[dbo].[ConsumerListing].[AddressID],<br />[DD_Data].[dbo].[ConsumerListing].[IndividualId], [Uniq1002]),<br />RESIDUAL<img src='/community/emoticons/emotion-6.gif' alt=':(' />[DD_Data].[dbo].[ConsumerListing].[AddressID] =<br />[DD_Data].[dbo].[ConsumerListing].[AddressID] AND<br />[DD_Data].[dbo].[ConsumerListing].[IndividualId] =<br />[DD_Data].[dbo].[ConsumerListing].[IndividualId] AND [Uniq1002] =<br />[Uniq1002]))<br /> |--Index<br />Seek(OBJECT<img src='/community/emoticons/emotion-6.gif' alt=':(' />[DD_Data].[dbo].[ConsumerListing].[Ind_Dogs]),<br />SEEK<img src='/community/emoticons/emotion-6.gif' alt=':(' />[DD_Data].[dbo].[ConsumerListing].[Dogs]='Y') ORDERED FORWARD)<br /> |--Index<br />Seek(OBJECT<img src='/community/emoticons/emotion-6.gif' alt=':(' />[DD_Data].[dbo].[ConsumerListing].[Ind_Gardner]),<br />SEEK<img src='/community/emoticons/emotion-6.gif' alt=':(' />[DD_Data].[dbo].[ConsumerListing].[Gardner]='Y') ORDERED FORWARD)<br />We decided to test both situations with<br /><br />DBCC DROPCLEANBUFFERS<br />DBCC FREEPROCCACHE<br />SET STATISTICS IO ON<br />SET STATISTICS TIME ON<br />Resulsts for SET STATISTICS TIME ON SQL Server Execution Times:<br />CPU time = 36266 ms, elapsed time = 396475 ms.--default<br />CPU time = 19032 ms, elapsed time = 19245 ms.--with index hints<br />In my oppinion wiht index hints times match better<br />Results from STATISTICS TIME ON<br />Table 'ConsumerListing'. Scan count 5, logical reads 3948831, physical<br />reads<br />7, read-ahead reads 3939681, lob logical reads 0, lob physical reads 0,<br />lob<br />read-ahead reads 0.--default<br />Table 'ConsumerListing'. Scan count 2, logical reads 88589, physical<br />reads<br />6, read-ahead reads 88581, lob logical reads 0, lob physical reads 0,<br />lob<br />read-ahead reads 0.----with index hints<br /><br />Again all results are better when we're using two one column indexes.<br />All our one column indexes has low selectivity<br />Partial formatted results from DBCC SHOW_STATISTICS bellow:<br />Dogs<br />RANGE_HI_KEYRANGE_ROWSEQ_ROWSDISTINCT_RANGE_ROWSAVG_RANGE_ROWS<br />U0169,693,20001<br />Y034,562,52001<br />Gardener<br />RANGE_HI_KEYRANGE_ROWSEQ_ROWSDISTINCT_RANGE_ROWSAVG_RANGE_ROWS<br />U0159,358,90001<br />Y044,896,82001<br /><br />Table has 204,255,720<br />Please note that regarding of low selectivity if you will run <br />select count(*) from ConsumerListing<br />where<br />Dogs='Y'<br />(Or anything with only one condition) optimizer will use ind_dogs (seek)(or any other appropriate index.<br /><br />The question is why optimizer is using such an ineffective execution<br />plan?<br />What can we do to avoid index hints( will be difficult to program)?<br />Help will be greatly appreciated.<br /><br /><br /><br />
  2. MohammedU New Member

  3. gurucb New Member

    Please upload the entire plan for better comparisions. Also along with below options
    SET STATISTICS IO ON
    SET STATISTICS TIME ON
    also enable
    SET STATISTICS PROFILE ON to compare execution plans, rows returned and subtree cost for each plan.

    Tried to do a simple repro of your problem and below are sample script see if they imitate your problem:


    create table TestPerformance
    (
    CustomerName varchar(1000),
    hascar int,
    hasHouse int,
    hasDog int,
    hasPhone int,
    HasGarden int,
    haswife int
    )

    create clustered index ind on TestPerformance(customername)
    create index indcar on TestPerformance(hascar)
    create index indHouse on TestPerformance(hasHouse)
    create index indDog on TestPerformance(hasDog)
    create index indPhone on TestPerformance(hasPhone)
    create index indGarden on TestPerformance(HasGarden)
    create index indwife on TestPerformance(haswife)

    declare @i int
    set @i = 1
    while @i < 10000
    begin
    insert into TestPerformance
    values(@i,@i%2,@i%3,@i%4,@i%5,@i%6,@i%7)
    set @i = @i + 1
    end

    update statistics testperformance with fullscan
    set statistics profile on
    go
    set statistics time on
    go
    set statistics io on
    go

    select count(*) from testperformance with (index(indcar,indwife))
    where
    hascar=0 AND haswife=0


    select count(*) from testperformance --with (index(indcar,indwife))
    where
    hascar=0 AND haswife=0

    Got almost same time of plans that you had pasted in here.

    With index hints provided there seems to be a merge join occuring between different columns and I bet if you looked at the subtree cost of both plans plan with index hints costed higher than default one.

    Also CONVERT_IMPLICIT is also present CONVERT_IMPLICIT(int,[globalagg1008],0))) that can impact performance.

    The plan with index hints is going to be a CPU hogging one if put under stress test as since it is a merge join we are more on cpu also in some cases there may be sort operations involved prior to merge join.

    In SQL Server 2000 earlier version book mark lookups cost are not properly estimaated and we are gradually moving away from bookmark lookups.



Share This Page