SQL Server Performance

Top operator yields slow performance

Discussion in 'T-SQL Performance Tuning for Developers' started by chinchy, Nov 22, 2005.

  1. chinchy New Member

    Our developers wrote some code:

    There are two tables.

    Table1 - Fields such as ID, DocumentID, subDocumentID
    Table2 - Fields such as subDocumentID, FileType, FilePath, etc

    I'm guessing they want to pull out the FilePath, and guessin that all subDocuments for each given Document is stored in the same location, so all they would need is the first record, from which they can parse out the path.

    The query they execute looks something like this:

    Select top 1 b.FilePath
    from Table1 a, Table2 b
    where a.DocumentID = '12345' and a.subDocumentID = b.subDocumentID
    and b.FileType <= 2

    Both tables have many entries, 11,000,000 +.

    The problem is that when the statement executes, it takes some 2-5 minutes to execute to return 1 row. If however, the Top 1 clause is removed, it executes in 3 seconds, even though it might return say 1500-2500 rows.

    With regards to indexes, Table1 has a nonclustered index on DocumentID, and Table2 has a primary clustered index on subDocumentID and FileType.

    I've also noticed that if we execute the query and add an order by, it returns the results quite quickly. I.e.:

    Select top 1 b.FilePath
    from Table1 a, Table2 b
    where a.DocumentID = '12345' and a.subDocumentID = b.subDocumentID
    and b.FileType <= 2
    order by a.ID

    (where Table1 has a primary clustered index on ID)


    Any ideas on why the query runs so slowly with the Top operator?
  2. druer New Member

    Very odd indeed. I'd be very interested in having you run the query with the TOP 1 in the Query Analyzer with the Show Execution Plan on, and document what it does differently than when you run it without the TOP 1. Obviously for some reason it must be forcing optimizer to ignore an index or something causing a table scan or something.
  3. Madhivanan Moderator

    Try using Set RowCount and see if there is any performance gain

    SET RowCount 1
    Select b.FilePath
    from Table1 a, Table2 b
    where a.DocumentID = '12345' and a.subDocumentID = b.subDocumentID
    and b.FileType <= 2
    order by a.ID
    SET RowCount 0

    Madhivanan

    Failing to plan is Planning to fail
  4. SQLDBcontrol New Member

    The TOP keyword is meaningless without the order by clause. For SQL Server to know what is top and what is bottom you have to provide an order by clause. If you do not, then SQL Server will use the default order (based on the clustered index) which in your case spans two columns, subDocumentID and FileType.

    As you noticed already, if you explicitly specify the order by clause on a.ID it runs quicker.

    Not only is it a good idea to explicitly specify the order by clause for this reason but also, without it, you might in fact get different results.

    Karl Grambow

    www.sqldbcontrol.com
  5. chinchy New Member

    Hmm.. that makes sense... Thanks Karl!

Share This Page