SQL Server Performance

Pls optimize this query

Discussion in 'T-SQL Performance Tuning for Developers' started by kingfung_lam, May 16, 2006.

  1. kingfung_lam New Member

    I need to create a report on the traffic distribution to our site.
    It means that I have to find # of people from different country.

    THe IP Look up table :

    fromIP bigint
    toIP bigint
    Country varchar(20)


    the QUery I have is :
    select country, count(*) from iplookup, traffic
    where fromip <=traffic.ip and traffic.ip<=toIP


    I have millions of record per day and it takes 7 minute for 10000 records.

    How can i improve the runtime?

    Note: IP is is represented in a number form by : eg: 192.168.0.1 = 192 * 256^3 + 168^2 + 1
  2. Luis Martin Moderator

    Did you see execution plan?. May be you need some index.


    Luis Martin
    Moderator
    SQL-Server-Performance.com

    Although nature commences with reason and ends in experience it is necessary for us to do the opposite, that is to commence with experience and from this to proceed to investigate the reason.
    Leonardo Da Vinci

    Nunca esperes el reconocimiento de tus hijos, eso ocurrirá luego de tu muerte


    All postings are provided “AS IS” with no warranties for accuracy.



  3. kingfung_lam New Member

    I tried to make a cluster index on ipfrom and ipto, but it doesn't help.
    THe index tunning wizard has no suggestion for indexing
  4. Luis Martin Moderator

    Could you post both tables, (columns and attributes)?

    Luis Martin
    Moderator
    SQL-Server-Performance.com

    Although nature commences with reason and ends in experience it is necessary for us to do the opposite, that is to commence with experience and from this to proceed to investigate the reason.
    Leonardo Da Vinci

    Nunca esperes el reconocimiento de tus hijos, eso ocurrirá luego de tu muerte


    All postings are provided “AS IS” with no warranties for accuracy.



  5. kingfung_lam New Member

    The tables struture :
    IPLookup:

    FromIP bigInt
    ToIp bigint
    Country varchar(20)



    Tracking table:

    trackingid int pk
    TrackingIP bigint
    trackingtimestamp datetime

  6. derrickleggett New Member

    You can't possibly be running that query against those tables. The columns/table names in your select don't match what you just posted.

    Run your query like this (changing it so it works):

    SET STATISTICS IO ON
    SET STATISTICS PROFILE ON
    SET STATISTICS TIME ON

    select country, count(*) from iplookup, traffic
    where fromip <=traffic.ip and traffic.ip<=toIP

    SET STATISTICS IO OFF
    SET STATISTICS PROFILE OFF
    SET STATISTICS TIME OFF

    Paste the results of both the results and messages tab into here (minus the actual result set of data) with [ q u o t e ] [ / q u o t e ] around without the spaces in the tags.

    Also, how many rows are in both those tables when you run this query against 10,000 rows? What is the criteria for only selecting 10k rows?

    Have you tried putting non-clustered indexes on the TrackingIP, FromIP, and ToIP? Also, you might not be getting any mileage at all out of your indexes, depending on what the data in those tables looks like, since you basically using a BETWEEN clause to join these two tables together.

    Last, if you run the query like this:



    SET STATISTICS IO ON
    SET STATISTICS PROFILE ON
    SET STATISTICS TIME ON

    SELECT
    il.country,
    count(1)
    FROM
    IPLookup il
    INNER LOOP JOIN Tracking t ON il.FromIP <= t.TrackingIP
    AND t.TrackingIP <=il.ToIP

    SET STATISTICS IO OFF
    SET STATISTICS PROFILE OFF
    SET STATISTICS TIME OFF


    Again, you have to correct either your code you posted or the table structures you posted, so this whole thing makes sense to us.


    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  7. mmarovic Active Member

    The query can't be faster unless you add date range restriction in the where clause. If you do so, you also need to make sure you have appropriate index.
  8. mmarovic Active Member

    Also group by clause is missing in your query.
  9. kingfung_lam New Member

    mmarovic, it will be better if you can have more constructive input.
    date range is up to the report requirement.

    "need to make sure you have appropriate index".

    What's approperiate? Can you elborate instead of giving common sense?


    "Also group by clause is missing in your query."

    I cant give the run time unless the query is having the right syntax.
    WHy don't you give me a faster query suggestion instead of picking on typo?
  10. Twan New Member

    Hi ya,

    Mirko means that your query as it is will give way too many rows, it should have a group by cluase as in



    select country, count(*) from iplookup, traffic
    where fromip <=traffic.ip and traffic.ip<=toIP
    group by country


    How many rows in traffic? since it is going to table scan the entire table with this query...?

    I'd put a clustered index on fromip, toip, and a nonclustered index on traffic.ip. Finally I'd change count(*) to count(country) assuming country is never null

    Cheers
    Twan
  11. mmarovic Active Member

    kungfung_lam,

    I allways give a constructive input. However, you didn't provided working query. You can't have aggregate function (count) and a column in the same query without group by clause.
    That means we don't know what is the query we are supposed to optimize.

    More, even if you did just forget group by, there is no where clause so it looks like your query returns all records from the big table (unless lookup table contains only part of possible ranges).

    So, if you don't add a where clause no significatn improvement is possible.

    I hope it is clear now.

    However, I don't like to be accussed to be "not constructive" when I spent my time trying to help. So this is my last answer to any problem you post on this board.
  12. mmarovic Active Member

    One more thing, forgot about indexes (this is really my last answer):

    What do you think is appropriate index when searching on date range? Common sense is enough to find the answer.
  13. kingfung_lam New Member

    1 day. to 1 month.
    1 day with 73641 data
    1 month with 1965766 data



    I have also mention in the first post that I only test with the first 10K data for 1 day and it runs like 7 minutes
    "I have millions of record per day and it takes 7 minute for 10000 records."


    So, your common sense is not good enough to observe.

    Please elbroate your index by answering How if you know database.
  14. Twan New Member

    ok so you have a clustered index on traffic.trackingtimestamp then?

    Cheers
    Twan
  15. kingfung_lam New Member

    yes. I have a cluster on the time.
    I have another cluster on the fromip and toip, but it does't help on the runtime
  16. Twan New Member

    Hi ya,

    and the query you have has the group by too?

    could you please post the results of the command when first running "set showplan on"

    Cheers
    Twan
  17. cmdr_skywalker New Member

    This is a technical forum so it is important that we, those trying to help, get as much as possible, accurate technical information. The performance might not be on the query but also on the configuration and the hardware itself. Please provide us the execution plan so we can determine if you have a optimum index/execution path. Do you have a baseline?

    Do you have another server with the same hardware spec? If you do, run a baseline generator for both server and see if there is a difference. If there is, then the problem might not be in the query itself but the SQL config or OS config.

    May the Almighty God bless us all!
    www.empoweredinformation.com
  18. derrickleggett New Member

    If you want help, listen. Don't criticize people trying to help you. If you can't post code that at least would compile, you're not having performance issues. You can't have performance issues when nothing works.

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  19. kingfung_lam New Member

    Here is the thing I ran<br />SET STATISTICS IO ON<br />SET STATISTICS PROFILE ON<br />SET STATISTICS TIME ON<br /><br />select countryshort , count(*) from tbl_IPCountry, <br />(select top 1000 * from tbl_tracking) tbl_tracking<br />where ipfrom &lt;= dbo.getIPNumber(tracking_IP) and dbo.getIPNumber(tracking_ip) &lt;= ipTo<br />group by countryshort<br /><br />SET STATISTICS IO OFF<br />SET STATISTICS PROFILE OFF<br />SET STATISTICS TIME OFF<br /><br /><br /><br />result: (How can I make it not that nested?) It takes 1 hr 37 mins<br /><br />131select countryshort , count(*) from tbl_IPCountry, (select top 1000 * from tbl_tracking) tbl_tracking where ipfrom &lt;= dbo.getIPNumber(tracking_IP) and dbo.getIPNumber(tracking_ip) &lt;= ipTo group by countryshort210NULLNULLNULLNULL232.0NULLNULLNULL1.9521354NULLNULLSELECT0NULL<br />131 |--Compute Scalar(DEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Expr1004]=Convert([globalagg1006])))221Compute ScalarCompute ScalarDEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Expr1004]=Convert([globalagg1006]))[Expr1004]=Convert([globalagg1006])232.00.00.0000232161.9521354[tbl_IPCountry].[countrySHORT], [Expr1004]NULLPLAN_ROW01.0<br />131 |--Hash Match(Aggregate, HASH<img src='/community/emoticons/emotion-6.gif' alt=':(' />[tbl_IPCountry].[countrySHORT]), RESIDUAL<img src='/community/emoticons/emotion-6.gif' alt=':(' />[tbl_IPCountry].[countrySHORT]=[tbl_IPCountry].[countrySHORT]) DEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[globalagg1006]=SUM([partialagg1005])))232Hash MatchAggregateHASH<img src='/community/emoticons/emotion-6.gif' alt=':(' />[tbl_IPCountry].[countrySHORT]), RESIDUAL<img src='/community/emoticons/emotion-6.gif' alt=':(' />[tbl_IPCountry].[countrySHORT]=[tbl_IPCountry].[countrySHORT])[globalagg1006]=SUM([partialagg1005])232.00.00.48632047201.9521123[tbl_IPCountry].[countrySHORT], [globalagg1006]NULLPLAN_ROW01.0<br />9411 |--Nested Loops(Inner Join, OUTER REFERENCES<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Tbl_Tracking].[Tracking_IP]))243Nested LoopsInner JoinOUTER REFERENCES<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Tbl_Tracking].[Tracking_IP])NULL70111.8830.00.293067661141.4657918[tbl_IPCountry].[countrySHORT], [partialagg1005]NULLPLAN_ROW01.0<br />9411 |--Hash Match(Aggregate, HASH<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Tbl_Tracking].[Tracking_IP]), RESIDUAL<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Tbl_Tracking].[Tracking_IP]=[Tbl_Tracking].[Tracking_IP]) DEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[partialagg1005]=COUNT(*)))254Hash MatchAggregateHASH<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Tbl_Tracking].[Tracking_IP]), RESIDUAL<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Tbl_Tracking].[Tracking_IP]=[Tbl_Tracking].[Tracking_IP])[partialagg1005]=COUNT(*)9.48175330.02.4479087E-2313.8903072E-2[Tbl_Tracking].[Tracking_IP], [partialagg1005]NULLPLAN_ROW01.0<br />10001 | |--Top(1000)265TopTopNULLNULL1000.00.09.9999997E-5231.4423984E-2[Tbl_Tracking].[Tracking_IP]NULLPLAN_ROW01.0<br />10001 | |--Clustered Index Scan(OBJECT<img src='/community/emoticons/emotion-6.gif' alt=':(' />[MEDIAWHIZ_SURVEY_NEW_DB].[dbo].[Tbl_Tracking].[export_friendly_index]))286Clustered Index ScanClustered Index ScanOBJECT<img src='/community/emoticons/emotion-6.gif' alt=':(' />[MEDIAWHIZ_SURVEY_NEW_DB].[dbo].[Tbl_Tracking].[export_friendly_index])[Tbl_Tracking].[Tracking_IP]1000.07.1119918E-37.1119918E-32451.4223984E-2[Tbl_Tracking].[Tracking_IP]NULLPLAN_ROW01.0<br />941941 |--Filter(WHERE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[dbo].[getIPNumber]([Tbl_Tracking].[Tracking_IP])&lt;=[tbl_IPCountry].[ipTO]))2134FilterFilterWHERE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[dbo].[getIPNumber]([Tbl_Tracking].[Tracking_IP])&lt;=[tbl_IPCountry].[ipTO])NULL7394.40040.01.4295841E-2921.1338211[tbl_IPCountry].[countrySHORT]NULLPLAN_ROW09.4817533<br />16911448941 |--Clustered Index Seek(OBJECT<img src='/community/emoticons/emotion-6.gif' alt=':(' />[MEDIAWHIZ_SURVEY_NEW_DB].[dbo].[tbl_IPCountry].[testIP]), SEEK<img src='/community/emoticons/emotion-6.gif' alt=':(' />[tbl_IPCountry].[ipFROM] &lt;= [dbo].[getIPNumber]([Tbl_Tracking].[Tracking_IP])) ORDERED FORWARD)21413Clustered Index SeekClustered Index SeekOBJECT<img src='/community/emoticons/emotion-6.gif' alt=':(' />[MEDIAWHIZ_SURVEY_NEW_DB].[dbo].[tbl_IPCountry].[testIP]), SEEK<img src='/community/emoticons/emotion-6.gif' alt=':(' />[tbl_IPCountry].[ipFROM] &lt;= [dbo].[getIPNumber]([Tbl_Tracking].[Tracking_IP])) ORDERED FORWARD[tbl_IPCountry].[ipTO], [tbl_IPCountry].[countrySHORT]24648.0020.103944172.7289893E-2920.99827147[tbl_IPCountry].[ipTO], [tbl_IPCountry].[countrySHORT]NULLPLAN_ROW09.4817533<br />
  20. joechang New Member

    i think your are being killed on your getIPNumber function call
    i had another customer who used string IP's, which has less overhead than a function call

    when rebuild the table to use binary(4) IP, there was a significant improvement

    basically, convert IP to binary(4), store it in the table that way,
    convert your input IP to binary(4)
    then write your query so there is no function call
  21. kingfung_lam New Member

    Can someone please help on this Problem?
  22. cmdr_skywalker New Member

    As Joe suggested, SQL will perform faster if it is a number (binary) compared to varchar. I suggest you time the following statements:

    --check how long this statement
    SELECT TOP 1000 dbo.getIPNumber(tracking_IP)
    FROM tbl_tracking
    GO

    --time this separately
    select countryshort , count(*)
    from tbl_IPCountry
    WHERE '192.168.0.1' between ipfrom and ipTo --hardcoded
    GO

    --time this separately
    select countryshort , count(*)
    from tbl_IPCountry
    WHERE dbo.getIPNumber('192.168.0.1') between ipfrom and ipTo --hardcoded
    GO

    Hopefully, this will give you an idea where's the issue is.



    May the Almighty God bless us all!
    www.empoweredinformation.com
  23. kingfung_lam New Member

    SELECT TOP 1000 dbo.getIPNumber(tracking_IP)
    FROM tbl_tracking
    GO


    It Takes 0 second


    select countryshort , count(*)
    from tbl_IPCountry
    WHERE dbo.getIpnumber('192.168.0.1') between ipfrom and ipTo --hardcoded
    group by countryshort

    It takes 1 sec


  24. derrickleggett New Member

    Turn the getIpnumber into a table function. Join to the table function instead of calling it each time in the WHERE clause. This will allow you to fully take advantage of indexes. That should help you out quite a bit.

    (It's hard to troubleshoot this without knowing exactly what the getIpnumber function does.)


    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  25. cmdr_skywalker New Member

    When I said about time the statement, it is not the time when it starts executing but when it finishes it. That means, after the 1000 records. The reason I am asking you this is because query has outer reference of tbl_tracking which does not use the index that well with other ip. This means that the 1000 records is constrained by the other conditions, not necessarily the subquery of selecting 1000 records.

    Can you tell us if you want TO (1) EXTRACT FIRST THE 1000 RECORDS OF TBL_TRACKING AND THEN, CHECK if it meets the condition OR (2) you want to extract the TOP 1000 records of TBL_TRACKING THAT MEETS THE CONDITION.

    Which one is functionally equivalent:

    SELECT countryshort, count(*)
    FROM
    tbl_IPCountry TC LEFT JOIN (SELECT TOP 1000 DISTINCT dbo.getIPNumber(tracking_IP) IP
    FROM tbl_tracking
    ) AS TA ON (TA.IP between tc.ipFrom and tc.ipTo)
    GROUP BY countryshort

    OR
    SELECT countryshort, count(*)
    FROM
    tbl_IPCountry TC JOIN (SELECT TOP 1000 DISTINCT dbo.getIPNumber(tracking_IP) IP
    FROM tbl_tracking
    ) AS TA ON (TA.IP between tc.ipFrom and tc.ipTo)
    GROUP BY countryshort









    DECLARE
    @Start DATETIME,
    @End DATETIME
    SET @Start = GETDATE()

    SET @End = GETDATE()

    May the Almighty God bless us all!
    www.empoweredinformation.com

Share This Page