SQL Server Performance

Table Statistics

Discussion in 'General Developer Questions' started by amitm79, Mar 8, 2005.

  1. amitm79 New Member

    Hi All,
    I have some basic doubts regarding statistics. I hope some of you guys can answer most of my doubts.

    1> When i say Auto create statistics, what are the columns for which statistics are collected? Are these all columns part of the primary key or just the index statistics?
    2> What is the difference between index statistics and column statistics?
    3> Do i need to explicitly create statistics for columns which are not part of primary key but are used in queries?
    4> Who are candidates for column statistics?
    5> If i explicitly create statistics on some columns, is updating them my resposibility or its taken care of by auto update statistics?
    6> Which system table stores statistics information?
    7> What are storage implications of adding statistics (default, column etc)?

    I will appreciate related links/tutotials and any other material for thought.

    Regards,
    Amit
  2. Luis Martin Moderator

    1) Both.
    2) Index statistics are some columns of table.
    3) No. If you have auto create statistics on, SQL does this task.
    4) Candidates are columns frequently used in queries, but not neccesary to need a Index.
    5) Auto update take care. But auto update on, consume resources, is better to schedule a job to update statistics.
    6) Sysindexes.
    7) None.

    Bonus track:

    With this query you can see Indexs and Statistics.

    Use yourdatabase
    select si.rows as 'Rows', SO.Name as Tabla, SI.name as 'Index', SFG.groupname as 'Filegroup'
    from sysobjects as SO
    join sysindexes as SI
    on SO.Id = SI.id
    join sysfilegroups as SFG
    on SI.GroupId = SFG.GroupId

    order by si.rows desc, SO.Name , SI.name, SFG.GroupName

    Luis Martin
    Moderator
    SQL-Server-Performance.com

    One of the symptoms of an approaching nervous breakdown is the belief that one's work is terribly important
    Bertrand Russell


    All postings are provided “AS IS” with no warranties for accuracy.



  3. amitm79 New Member

    Hi Luis,
    Thanks for the reply and the query. When i run this query and check the table i frequently query on then apart from primary key i see more rows. These all rows have index name starting from _WA_Sys_. These are typically the dimensions and measures i query on. Are these some kind of automatic statistics on my table?

    I see automatic stats created even for those columns which are part of the index. Only the first column in the index is not there. Is it that on creating an composite index just the first column is used?

    Should i put create statistics statement for every dimension i query on, while creating the table itself?

    Is there any way to see the difference in query response after creating statistics?

    Regards,
    Amit
  4. Luis Martin Moderator

    Yes, _WA_Sys_ are automatic statistics created by SQL.

    "Should i put create statistics statement for every dimension i query on, while creating the table itself?"

    No, when database is in use, SQL create statistics depending on witch queries are used.

    "Is there any way to see the difference in query response after creating statistics?"

    Yes, but you have to update frequently (Allways I set auto update statistics off, and I update via job).

    Also, if you pick one heavy query and run Index Tuning Wizard (using QA) the recomendations include statistics, I mean Indexs and Statistics.


    One article to read:

    http://www.sql-server-performance.com/nb_execution_plan_statistics.asp




    Luis Martin
    Moderator
    SQL-Server-Performance.com

    One of the symptoms of an approaching nervous breakdown is the belief that one's work is terribly important
    Bertrand Russell


    All postings are provided “AS IS” with no warranties for accuracy.



  5. amitm79 New Member

    Hi Luis,
    I understand that SQL Server creates statistics automatically based on queries for the data. Then why do i receive missing statistics warning in SQL query analyzer?

    Regards,
    Amit
  6. Luis Martin Moderator

    If you made some change in table structure, like rename a column or insert a new one, is possible that one or more statistics linked to that columns now is obsolet.

    I suppose you find that looking a execution plan. In that case you can right click on index and update or create missing statistics.

    The following script give show null statistics, you can save the output and run in QA.

    SELECT 'update statistics' + ' ' + RTRIM(object_name(I.id)) + ' ' + RTRIM(name)

    /* DATALENGTH (statblob) size,
    STATS_DATE (I.id, I.indid) last_updated */
    FROM
    sysindexes as I

    WHERE
    OBJECTPROPERTY(I.id, N'IsUserTable') = 1 AND
    INDEXPROPERTY (I.id , name , 'IsAutoStatistics' ) = 1 AND
    DATALENGTH (statblob) is null

    You can save the output in sql form, and execute to update those null statistics.

    If you run again and there is still some null statistics, you can use the following to delete it.


    SELECT 'drop statistics' + ' ' + RTRIM(object_name(I.id)) + '.' + RTRIM(name)

    /* DATALENGTH (statblob) size,
    STATS_DATE (I.id, I.indid) last_updated */
    FROM
    sysindexes as I

    WHERE
    OBJECTPROPERTY(I.id, N'IsUserTable') = 1 AND
    INDEXPROPERTY (I.id , name , 'IsAutoStatistics' ) = 1 AND
    DATALENGTH (statblob) is null



    Luis Martin
    Moderator
    SQL-Server-Performance.com

    One of the symptoms of an approaching nervous breakdown is the belief that one's work is terribly important
    Bertrand Russell


    All postings are provided “AS IS” with no warranties for accuracy.



  7. amitm79 New Member

    Hi Luis,
    I have not made any changes to the table structure after creating the table. If i directly execute a query on a table the execution plan gives me a warning about missing statistics. I have another table which has some auto statistics created on it by sql server. Is it that after analyzing your queries for some time sql creates statistics for some columns and not immediately?

    The query for null statistics is not returning any row to me.

    Is there any way in which i can know in advance that what all columns will lead to missing statistics warning in QA and can create them upfront?

    Regards,
    Amit
  8. Luis Martin Moderator

    " Is it that after analyzing your queries for some time sql creates statistics for some columns and not immediately?"
    Yes.

    "Is there any way in which i can know in advance that what all columns will lead to missing statistics warning in QA and can create them upfront?"

    As far I know, no. But if run DBCC DBREINDEX, all statistics will update.




    Luis Martin
    Moderator
    SQL-Server-Performance.com

    One of the symptoms of an approaching nervous breakdown is the belief that one's work is terribly important
    Bertrand Russell


    All postings are provided “AS IS” with no warranties for accuracy.



  9. amitm79 New Member

    Hi Luis,
    A typical one. Any idea when does sql server creates statitics based on the queries i have fired in a given point of time?

    DBCC DBREINDEX will update existing statistics but won't do anything for those which are not there. I mean the ones sql server creates based on user queries.

    Regards,
    Amit
  10. mmarovic Active Member

    quote:Originally posted by amitm79

    Any idea when does sql server creates statitics based on the queries i have fired in a given point of time?
    I think they are created when the query runs for the first time. If query optimizer "thinks" it needs statistcs on specific SARG column to build specific execution plan it will do it first, then create exec plan and finally execute it. However, I have never tested that theory. If you are willing to test it, feedback would be appreciated.
  11. Luis Martin Moderator

    "DBCC DBREINDEX will update existing statistics but won't do anything for those which are not there. I mean the ones sql server creates based on user queries."

    I don't understand what you mean.

    If you run DBCC DBREINDEX (authors, '', 70) all statistics for authors tables will update.




    Luis Martin
    Moderator
    SQL-Server-Performance.com

    One of the symptoms of an approaching nervous breakdown is the belief that one's work is terribly important
    Bertrand Russell


    All postings are provided “AS IS” with no warranties for accuracy.



  12. amitm79 New Member

    Hi Luis,
    By the line
    "DBCC DBREINDEX will update existing statistics but won't do anything for those which are not there. I mean the ones sql server creates based on user queries."

    i mean that it just picks the entries that are there in sysindexes and won't create new entries. If it doesn't create new indexes (for statistics) then there is no chance it can correct missing statistics issue. I tried this on one of my tables which is giving missing statistics warning and the warning is still there after reindex.

    mmarovic
    No its not doing it. It just leaves a warning every time. But in say 24 hour period, it checks all the queries which have been executed and creates statistics for all the columns which have been queried. I want to know when does it happens? The reason for it is that system performance may not be optimized unless complete statistics are available.

    Basically the bottomline is that i still don't know when does sql server creates automatic statistics on frequently queried columns.

    Regards,
    Amit
  13. mmarovic Active Member

    quote:mmarovic
    No its not doing it. It just leaves a warning every time. But in say 24 hour period, it checks all the queries which have been executed and creates statistics for all the columns which have been queried. I want to know when does it happens? The reason for it is that system performance may not be optimized unless complete statistics are available.
    Yes, I came to the same conclusion after I posted the answer. I believe the rule about when statistics are updated/creted is not published. The closer thing I remember is that it happens in lower activity period. I even don't remember if Microsoft was the source.
  14. amitm79 New Member

    Hi mmarovic,
    I agree to what you have observed. I am just wondering why nobody else is seeing this issue. I think there must be some way out. I think having missing statistics warning can be a serious issue. I am seeing significant gain in performance after creating missing statitics. Only thing is i don't know a direct solution. Right now i am creating statistics wherever i recieve a warning.

    The workaround i am thinking of implementing right now is
    1> Check execution plans for all frequntly running queries
    2> Make a list of all columns which are missing statistics (warning)
    3> Feed this information back into table creation script and create statistics explicitly for those columns.

    Only problem here is that i may not be able to cover all the scenarios where wrong execution plan is generated.

    Regards,
    Amit
  15. satya Moderator

    The auto update statistics option can slow down your system, but don't disable the option simply because you're concerned that it might affect performance. More often than not, keeping this option enabled is the correct decision and will lead to substantially better performance. If you load a lot of data into a table during a peak processing period, SQL Server automatic statistics update on that table might further degrade performance.

    In this case, you can disable the auto statistics option for the table during the load, and then manually run UPDATE STATISTICS when the load is finished. If f you've disabled the auto statistics option for the table, mowever, you might find that certain queries begin to choose inefficient query plans, which could cause the statistics to no longer reflect the accurate distribution of data in the table and result in SQL Server choosing a suboptimal execution plan.

    As for how often SQL Server performs this update, see the Microsoft article "INF: How SQL Server 7.0 and SQL Server 2000 Autostats Work." This interesting article covers the algorithms SQL Server uses to choose when to automatically update statistics.

    The cost of this automatic statistics update is minimized by sampling the data rather than analyzing all of it. Under some circumstances, statistical sampling will not be able to accurately characterize the data in a table. You can control the amount of data that is sampled during manual statistics updates on a table-by-table basis by using the SAMPLE and FULLSCAN clauses of the UPDATE STATISTICS statement. The FULLSCAN clause specifies that all of the data in the table is scanned to gather statistics, whereas the SAMPLE clause can be used to specify either the percentage of rows to sample or the number of rows to sample.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  16. mmarovic Active Member

    Satya, you are the man!
  17. amitm79 New Member

    Hi Satya,
    I am not considering turning off the auto stats option unless i really see an issue in data loading.

    Is there any way i can know that what all columns do in need to create statistics on (in case auto stats is not there)?

    Regards,
    Amit
  18. Luis Martin Moderator

    The only way I know is Index Tuning Wizard.


    Luis Martin
    Moderator
    SQL-Server-Performance.com

    One of the symptoms of an approaching nervous breakdown is the belief that one's work is terribly important
    Bertrand Russell


    All postings are provided “AS IS” with no warranties for accuracy.



  19. TRACEYSQL New Member

    i ran your command
    SELECT 'update statistics' + ' ' + RTRIM(object_name(I.id)) + ' ' + RTRIM(name)

    /* DATALENGTH (statblob) size,
    STATS_DATE (I.id, I.indid) last_updated */
    FROM
    sysindexes as I

    WHERE
    OBJECTPROPERTY(I.id, N'IsUserTable') = 1 AND
    INDEXPROPERTY (I.id , name , 'IsAutoStatistics' ) = 1 AND
    DATALENGTH (statblob) is null

    and get
    update statistics Z_PJPABJP_ERROR _WA_Sys_ERROR_DESC_007FC3B8
    update statistics Z_PJPABJP_ERROR _WA_Sys_PROJ_ID_007FC3B8
    update statistics Z_PJPABJP_ERROR _WA_Sys_ERROR_NO_007FC3B8
    update statistics Z_APPREVVR_INIT _WA_Sys_ERROR_DESC_0153EC60
    update statistics Z_APPREVVR_INIT _WA_Sys_FY_CD_0153EC60
    update statistics Z_APPREVVR_INIT _WA_Sys_PD_NO_0153EC60
    update statistics Z_APPREVVR_INIT _WA_Sys_SUB_PD_NO_0153EC60
    update statistics Z_APPREVVR_INIT _WA_Sys_VCHR_KEY_0153EC60
    update statistics Z_APPREVVR_INIT _WA_Sys_VCHR_NO_0153EC60

    so do i need the _WA indexes?

    i did update statistcis_Z_PJPABJP
    still comes out as missing
  20. satya Moderator

    Thanks Mmarovice, hope that helps as I can see this question more in forums.

    Amit,
    I'm not referring to disable AUTO STATS option and as suggested you have to keepup a historic information on statistics of table, refer to this KBAhttp://support.microsoft.com/kb/195565 that clarifies on the how stats are being update (even though its for SQL pre-decessor version).

    At my end we have a weekly job that takes care of this udpate stats on the volatile table to take care of optimum stats for optimum performance.

    quote:Originally posted by amitm79

    Hi Satya,
    I am not considering turning off the auto stats option unless i really see an issue in data loading.

    Is there any way i can know that what all columns do in need to create statistics on (in case auto stats is not there)?

    Regards,
    Amit

    Satya SKJ
    Microsoft SQL Server MVP
    Writer, Contributing Editor & Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided AS IS with no rights for the sake of knowledge sharing. The greatest discovery of my generation is that a human being can alter his life by altering his attitudes of mind.
  21. hi

    iam durga prasad working as SE in accenture ,i have one query
    i wana create table using stored procedures is there any possibility.
    if it is possible send me reply to durgaprasad.gundepalli@gmail.com

    durgaprasad

Share This Page