SQL Server Performance

Insert speed slowing on Indexed Table

Discussion in 'Performance Tuning for DBAs' started by dineshks, Sep 16, 2004.

  1. dineshks New Member

    I have a table with 28 columns and around 10 indexes (one of them clustered) defined on it. The record length is around 2900 bytes (of which 8 nvarchar fields total around 2700 bytes).

    My application continuously inserts records into it, up to 20-30 million in total (in batches of 200-300 records).
    The insert speed drops drastically over time (like a c/x curve).
    I set the fill factor for all indexes to 50%, but the speed still falls off in the same fashion.

    The insert speed is initially around 1700 to 2000 records per second, but over time it drops drastically: it reaches around 200 records per second after 2.5 million records, and comes down to around 40 records/sec after 4 million inserts.

    Is there any way I can improve the inserts (avoiding the deep drops and keeping the insert rate above 300-400 rec/sec through the 20-30 million range)?

    We need all the indexes defined, otherwise the applications that query the DB will not work.
  2. derrickleggett New Member

    Have you used Profiler and Performance Monitor to see what's going on during this time? Do you have a lot of locking or blocking going on? That would definitely cause this type of behavior. I would look there first.

    If you're not having any issues there, try running DBCC INDEXDEFRAG and UPDATE STATISTICS on the table periodically. The DBCC INDEXDEFRAG could be run twice a day; with the UPDATE STATISTICS I would experiment.
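
    Something along these lines is what I mean (the database, table and index names below are only placeholders for your own):

    -- Defragment one index without taking it offline (SQL Server 2000)
    DBCC INDEXDEFRAG (MyDatabase, MyBigTable, IX_MyBigTable_Col1)
    GO
    -- Refresh the optimizer statistics on the table afterwards
    UPDATE STATISTICS dbo.MyBigTable
    GO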

    MeanOldDBA
    derrickleggett@hotmail.com

    When life gives you a lemon, fire the DBA.
  3. dineshks New Member

    I am running only the application that is inserting into the table. No locking or blocking is happening.

    It looks like the indexes are filling up (even after specifying the 50% fill factor) and then the inserts slow down. I can see that the page splits graph also rises and falls along with the insert speed. I see two peaks (more than 1000 rec/sec) in insert speed - the first between 0-2 million records and another in the 4.5-7.5 million record range.
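
    (For reference, a rough way to read that counter from Query Analyzer instead of Perfmon - the value exposed by sysperfinfo is cumulative, so it has to be sampled twice and compared:)

    -- Page Splits/sec as exposed through master.dbo.sysperfinfo
    -- (cumulative counter: take two samples and look at the difference)
    SELECT counter_name, cntr_value
    FROM master.dbo.sysperfinfo
    WHERE object_name LIKE '%Access Methods%'
      AND counter_name = 'Page Splits/sec'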

    Can I run INDEXDEFRAG while the application is continuously trying to insert records? Will it not cause query timeouts?
  4. thomas New Member

    If the inserts are the only thing using this table (no selects, etc.), why not remove the indexes? If you're not querying the table as well, there's no need for them (unless they enforce constraints, e.g. primary keys).

    Tom Pullen
    DBA, Oxfam GB
  5. dineshks New Member

    There are other applications that will query this table, hence the indexes are needed.

    Maybe we can stop the other applications using the table until the inserts finish.

    I removed the indexes (except the primary key index) and looked at the performance. The insert speed comes down after 16 million records and then the curve flattens.

    Is index removal and re-creation going to take a long time on a huge table?
  6. thomas New Member

    Yes, the index removal and re-creation will take a long time. Is the PK clustered?

    Removing nonclustered indexes doesn't take long provided the clustered index is left in place; recreating them will take a long time. If your PK isn't clustered, I'd try making it clustered. Are rows going to be inserted in PK order? If they are (e.g. if it's an identity column), and the PK is clustered, theoretically no fragmentation should happen, and the speed shouldn't decline over time.
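
    A minimal sketch of that situation (table and column names are just placeholders):

    -- An ever-increasing IDENTITY value as the clustered primary key means
    -- new rows always go to the end of the clustered index, so there are
    -- no page splits in the middle of the table.
    CREATE TABLE dbo.MyBigTable
    (
        RowID    int IDENTITY(1,1) NOT NULL,
        HashKey  nvarchar(50) NOT NULL,   -- the existing hash-generated key
        -- ... the remaining columns ...
        CONSTRAINT PK_MyBigTable PRIMARY KEY CLUSTERED (RowID)
    )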

    You may be constrained by the requirements of other queries, etc.

    Tom Pullen
    DBA, Oxfam GB
  7. dineshks New Member

    Yes, the PK is clustered. Insertions are in random order (key values are generated through hashing).

    I'm not sure how much faster the insertions will be if I make it non-clustered. (Maybe later make it clustered again so that the other applications' select queries work faster?)

    I would like to get a feel for the time gained by these alterations. Currently it takes around 50 hrs for 20 million records, 28 hrs for 15 million and 12 hrs for 10 million (into an empty table).

    When I removed all indexes except the clustered PK, I got the following results: 1.5 hrs for 10 million, 4.5 hrs for 14 million and 31 hrs for 17 million.

    Wouldn't it be a great time saving, for high-volume loads of say more than 15 million records, to remove the indexes, do the inserts and then recreate the indexes?
    [I couldn't try removing the PK, since the application that does the inserts doesn't work if the PK is removed.]
  8. Adriaan New Member

    You mentioned you're inserting in batches of 200-300 rows, at effectively 1700-2000 rows per second. Do you have any influence over the number of rows in each batch? I would test what happens with bigger batches.
  9. FrankKalis Moderator

    Just out of curiosity, how are you inserting and what is your recovery model?

    -----------------------
    --Frank
    http://www.insidesql.de
    -----------------------
  10. FrankKalis Moderator

    quote:
    I'm not sure how much faster the insertions will be if I make it non-clustered. (Maybe later make it clustered again so that the other applications' select queries work faster?)
    Btw, I wouldn't do that.

    Here's an extract from BOL:

    quote:
    "If a clustered index is created on a table with several secondary indexes,
    all of the secondary indexes must be rebuilt so that they contain the
    clustering key value instead of the row identifier (RID). Likewise, if a
    clustered index is deleted on a table that has several nonclustered indexes,
    the nonclustered indexes are all rebuilt as part of the DROP operation. This
    may take significant time on large tables.

    The preferred way to build indexes on large tables is to start with the
    clustered index and then build the nonclustered indexes. When dropping all
    indexes, drop the nonclustered indexes first and the clustered index last.
    That way, no indexes need to be rebuilt."
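
    In other words, something like this (the index and table names are placeholders):

    -- Drop the nonclustered indexes first, the clustered index last
    DROP INDEX MyBigTable.IX_MyBigTable_Col1
    DROP INDEX MyBigTable.IX_MyBigTable_Col2
    DROP INDEX MyBigTable.IX_MyBigTable_Clustered
    -- (a clustered PRIMARY KEY would have to be dropped with ALTER TABLE ... DROP CONSTRAINT)

    -- ... load the data ...

    -- Recreate the clustered index first, then the nonclustered ones
    CREATE CLUSTERED INDEX IX_MyBigTable_Clustered ON dbo.MyBigTable (KeyCol)
    CREATE INDEX IX_MyBigTable_Col1 ON dbo.MyBigTable (Col1)
    CREATE INDEX IX_MyBigTable_Col2 ON dbo.MyBigTable (Col2)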

    -----------------------
    --Frank
    http://www.insidesql.de
    -----------------------
  11. dineshks New Member

    Yes, I tried inserting in batches of 1024 records and found it better than the 300-row batches.

    The data I gave is for the 1024-record batches. With 300-record batches it was taking around 18 hrs for 10 million rows.

    Do you think I can go even higher (say 2000, 4000, etc.) and get a better result?
  12. FrankKalis Moderator

    Another point I don't understand is

    quote:
    I would like to get a feel for the time gained by these alterations. Currently it takes around 50 hrs for 20 million records, 28 hrs for 15 million and 12 hrs for 10 million (into an empty table).

    When I removed all indexes except the clustered PK, I got the following results: 1.5 hrs for 10 million, 4.5 hrs for 14 million and 31 hrs for 17 million.

    Wouldn't it be a great time saving, for high-volume loads of say more than 15 million records, to remove the indexes, do the inserts and then recreate the indexes?
    [I couldn't try removing the PK, since the application that does the inserts doesn't work if the PK is removed.]
    I guess that has been suggested before: load the data first, then build the indexes. You said you can't do that because there are apps querying the data at the same time.

    Is this now an option for you or not?

    Anyway, here's a new and very interesting case study from Microsoft:
    http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/incbulkload.mspx

    -----------------------
    --Frank
    http://www.insidesql.de
    -----------------------
  13. dineshks New Member

    quote:Originally posted by FrankKalis

    Just out of curiosity, how are you inserting and what is your recovery model?



    BEGIN TRANSACTION and COMMIT TRANSACTION are used for the insertion. Is that what you are asking?
  14. dineshks New Member

    quote:Originally posted by FrankKalis

    Another point I don't understand is

    quote:
    I would like to get a feel for the time gained by these alterations. Currently it takes around 50 hrs for 20 million records, 28 hrs for 15 million and 12 hrs for 10 million (into an empty table).

    When I removed all indexes except the clustered PK, I got the following results: 1.5 hrs for 10 million, 4.5 hrs for 14 million and 31 hrs for 17 million.

    Wouldn't it be a great time saving, for high-volume loads of say more than 15 million records, to remove the indexes, do the inserts and then recreate the indexes?
    [I couldn't try removing the PK, since the application that does the inserts doesn't work if the PK is removed.]
    I guess that has been suggested before: load the data first, then build the indexes. You said you can't do that because there are apps querying the data at the same time.

    Is this now an option for you or not?

    Anyway, here's a new and very interesting case study from Microsoft:
    http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/incbulkload.mspx



    Frank,

    I am basically looking for improvement options that keep the indexes intact.
    If it is certain that there is no option other than drop indexes - insert - re-create indexes, then we may have to redesign our system for big volumes of data.

    I am interested in knowing what the size limit is for getting decent performance, and what other tunable parameters there are to maximize performance.
  15. FrankKalis Moderator

    Yes and no.
    So, your insertions are fully recorded in the transaction log? Or do you use SELECT INTO, BULK COPY or bcp?
    That's what I was asking: what method do you use to insert the data?
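
    You can check it like this (the database name is a placeholder):

    -- Current recovery model of the database
    SELECT DATABASEPROPERTYEX('MyDatabase', 'Recovery')

    -- If you can load via bcp/BULK INSERT, the BULK_LOGGED model keeps the
    -- log overhead down; plain INSERT statements stay fully logged anyway.
    -- ALTER DATABASE MyDatabase SET RECOVERY BULK_LOGGED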

    -----------------------
    --Frank
    http://www.insidesql.de
    -----------------------
  16. Adriaan New Member

    The idea about the batch size was just a wild guess about file growth issues. Why not try even bigger batches and see if that further improves performance.

    Do you have the indexes on a separate file from the data?
  17. dineshks New Member

    quote:Originally posted by FrankKalis

    Yes and no.
    So, your insertions are fully recorded in the transaction log? Or do you use SELECT INTO, BULK COPY or bcp?
    That's what I was asking: what method do you use to insert the data?



    Yes, the insertions are recorded in the transaction log. INSERT INTO is used (from a C++ program) to insert multiple rows together.
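
    Roughly like this, simplified and with placeholder table and column names:

    -- One transaction per batch, with many single-row INSERTs inside it
    BEGIN TRANSACTION
        INSERT INTO dbo.MyBigTable (HashKey, Col1) VALUES (N'a1b2c3', N'...')
        INSERT INTO dbo.MyBigTable (HashKey, Col1) VALUES (N'd4e5f6', N'...')
        -- ... up to the batch size (currently 1024 rows) ...
    COMMIT TRANSACTION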
  18. dineshks New Member

    quote:Originally posted by Adriaan

    The idea about the batch size was just a wild guess about file growth issues. Why not try even bigger batches and see if that further improves performance.

    Do you have the indexes on a separate file from the data?

    I have put the indexes on a different disk (64K block size filesystem) and tried that.
    Putting the indexes on a different disk improved performance to some extent.

    I want to know: what are the implications of increasing the batch sizes?
  19. Adriaan New Member

    quote:Originally posted by dineshks
    I have put the indexes on a different disk (64K block size filesystem) and tried that.
    Putting the indexes on a different disk improved performance to some extent.

    I want to know: what are the implications of increasing the batch sizes?

    Also test the file growth setting for the index file.
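
    For example (the logical file name and growth size are placeholders):

    -- A fixed, generous growth increment avoids frequent autogrow pauses
    -- during the load, which can stall the inserts.
    ALTER DATABASE MyDatabase
    MODIFY FILE (NAME = MyDatabase_Indexes, FILEGROWTH = 500MB)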
  20. joechang New Member

    what are your physical disk performance counters showing?
    specifically, for the data partition:
    Disk reads/sec
    Disk writes/sec
    avg disk sec/read
    avg disk sec/write
    avg Disk Queue length

    if the disks are overloaded, you need to do one or more of:
    1) have fewer indexes
    2) use sequentially generated keys, to reduce the number of 8K blocks in which related data is located
    3) add more disks
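
    to see how scattered the data already is from the random keys, check fragmentation on the clustered index (names below are placeholders):

    -- scan density and extent fragmentation for the clustered index
    DBCC SHOWCONTIG (MyBigTable, PK_MyBigTable)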
  21. dhilditch New Member

    The main implication of increasing the batch size is that you will get a performance gain over the entire data load, but at the cost of increased delays for other users trying to query the table.

    I know you have said that you don't want to change the indexes, but I would recommend changing your clustered index so that all new rows are placed at the end of it - e.g. using an identity key as the clustered index would achieve this, or placing the clustered index on your time_stamp value if you record the time the row is inserted. Inserting this many rows sounds like inserting is the primary purpose - you won't experience any slowdown due to size if you do it this way.
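
    A sketch of the idea (names are placeholders; the existing clustered PK would first have to be recreated as nonclustered, since a table can only have one clustered index):

    -- Cluster on a value that only ever increases (an IDENTITY column or the
    -- insert timestamp), so every new row is appended at the end of the
    -- clustered index instead of splitting pages in the middle.
    CREATE CLUSTERED INDEX IX_MyBigTable_InsertTime
        ON dbo.MyBigTable (time_stamp)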

    Dave Hilditch.
  22. dineshks New Member

    Dave,

    I just tried changing the PK to non-clustered, and I got a phenomenal insert speed improvement.
    Now my insert speed stays high (more than 700-800 records/sec) even after 40 million records!

    I guess I may not need that index to be clustered, since part of it is a hash value for which no range queries are needed.
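
    The change was essentially this (simplified, with placeholder names):

    -- Recreate the PK as nonclustered; the table then becomes a heap, so
    -- inserts simply go to free pages instead of splitting index pages.
    ALTER TABLE dbo.MyBigTable DROP CONSTRAINT PK_MyBigTable
    ALTER TABLE dbo.MyBigTable ADD CONSTRAINT PK_MyBigTable
        PRIMARY KEY NONCLUSTERED (HashKey)   -- plus the other key columns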


    Thanks everybody for the suggestions.
  23. dhilditch New Member

    One additional note: if you don't have a clustered index, everything is added to what is called a 'heap', so there is no definite ordering in your table any more.

    From what I can tell, you probably have a fair number of non-clustered indexes on your table - these will all slow down the inserts. If you can pick a clustered index column that is small (i.e. a small number of bytes) and that increases as rows are inserted, you should see a performance boost: the non-clustered indexes point to the clustered index key, so the smaller that key is, the smaller and quicker to update your non-clustered indexes will be. If any of the non-clustered indexes can be removed, or perhaps merged, you will also see a performance boost.

    When inserting rows, each non-clustered index has to be updated. The clustered index, on the other hand, is not so much updated as kept in order, which would be why you were seeing a slowdown - inserting enough records somewhere in the middle will eventually cause data to be shuffled around (page splits) to make space for the new records. With 20 million rows that takes a while.
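
    A quick way to get an overview of what is on the table (the table name is a placeholder):

    -- Lists every index on the table with its key columns, which makes it
    -- easier to spot nonclustered indexes that could be removed or merged
    EXEC sp_helpindex 'dbo.MyBigTable'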

    Dave Hilditch.
  24. simas New Member

    Hello Dinesh,
    You have had some good information sent your way in the posts above; however, it is hard to advise anything without knowing what your goals are.
    a) You mentioned that you have both inserts and selects - what is your read/write ratio? (Go to Task Manager -> Processes -> View -> Select Columns (I/O Reads, I/O Read Bytes, I/O Writes, I/O Write Bytes, etc.)
    b) What has higher priority in your optimization process - how slow a select are you willing to tolerate to optimize inserts, and vice versa? Right now removing the clustered index may make your inserts faster (as will dropping the other indexes), but you are paying for it each and every time a select, delete or update is issued against that table - what is the priority? Will the users be OK if updating a status goes from 5 seconds to 15 minutes (as an XX million row heap structure is scanned)?
    c) How is your server performing overall - what are its bottlenecks (read the articles Brad posted on this site on how to do a performance evaluation of your server)? What is this system - DSS, pure OLTP, mixed? If mixed, is the DSS processing separate from OLTP (by time windows) or does it run concurrently? If it runs concurrently, what has priority, and is the priority fixed or flexible depending on the business rules?
    d) Is the system designed properly (everything from normalization to transaction length to dataflow, etc.)? Why is the application doing massive inserts via manual transactions from a C++ application?
    You need to identify what you want to get out of this round of optimization and then sit down with whoever designed the system and whoever designed the application. Without looking at the overall design of the system, "poking around" (dropping indexes, changing clustered into non-clustered and back) is risky and can be dangerous to the overall health of your system.

    Simas
  25. sify New Member

    Have you checked what the cache hit ratio is during this heavy insertion? During heavy inserts the cache hit ratio will gradually come down.
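
    If you want to read it from T-SQL rather than Perfmon, something along these lines should work (the ratio is the counter value divided by its base counter):

    -- Buffer cache hit ratio from sysperfinfo: value / base, as a percentage
    SELECT 100.0 * v.cntr_value / NULLIF(b.cntr_value, 0) AS [Buffer cache hit ratio %]
    FROM master.dbo.sysperfinfo v
    JOIN master.dbo.sysperfinfo b
      ON b.object_name = v.object_name
    WHERE v.object_name LIKE '%Buffer Manager%'
      AND v.counter_name = 'Buffer cache hit ratio'
      AND b.counter_name = 'Buffer cache hit ratio base'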
