SQL Server Performance Forum – Threads Archive
Need help speeding up an insertI’ve got a job that runs nightly (M-F), inserting about 700k rows to a table that already has about 65M rows. Prior to the insert, I remove all records I don’t want inserted, so the select portion of the insert has no where clause. The source and destination tables both reside in the same database and within the same filegroup. Unfortunately, when the server was setup, it was setup using RAID 5 for both the data and log files. We never do anything with the log files, so a RAID 0 configuration for those would suffice for our needs. My problem is, to insert the ~700k rows, it typically takes just over 3.5 hours. I’ve set the database up in Bulk Logged mode (according to a local consultant) and tested creating an index (to see if dropping/inserting/recreating would be an option), and creating an index on one date field, one smallint, and 3 tinyints took 16 minutes. Since there are probably 6 or 7 indexes on this table, it probably wouldn’t save me the amount of time I’m looking to save to perform all of these operations prior to my insert every evening. Outside of switching the log files over to a RAID 0 array, what can I do to speed things up? Thanks for your help.
I assume you are bulk inserting the records into the table. My suggestions is to insert, maybe, 50000 records at a time and loop through until all records are inserted. This would definitely be faster. Try intervals that will provide maximum performance for you.
"How do you expect to beat me when I am forever?"
Royv’s advice is good and a standard recommendation in your case. Also, someone else recently asked a similar question in the forum. Do a search on the forum and see the feedback presented there. ——————
Brad M. McGehee
Bulk Insert requires your source to be a data file doesn’t it? My source is a table in the same database as the destination.
Some questions: Is there a clustered index on the destination table? This can slow down your inserts if data has to be re-arranged for every insert. Depening on your queries you might not need a clustered index. If you do, what is the speed difference when dropping the clustered index before insert and then re-creating it after? Do you really need 65M rows in the destination table? If you are dealing with yearly, monthly, weekly data etc, maybe it’s possible to off-load some data to archvie tables? Is it possible to do the inserts more frequently and not only once per day to minimize the number of inserts required each time? /Argyle
Good questions. No, there is no clustered index on the table. The main querying criteria for the table is on a date field, and when adding records, better than 90 of the records would go at the bottom of the table if there were a clustered index. Unfortunately, we do need that many records. That’s 3 months worth of data, and that’s what the system is built to show.
You could possibly see performance improvements by dropping the index altogether, and reapplying it after all insertions have completed.
Sorry, that was assuming you have a NON clustered index existing
You could try and export the 700k rows to say seven 100k datafiles and then bulk insert them in parallel into the 65M table from seven connections. For this to work you need to drop all indexes and use the TABLOCK hint when using the bulk insert command. The rows_per_batch size should be the same size as the files being imported. But since the data you import is just a few percent of the total ammount of data you might not benefit at all since rebuilding the indexes might take a long time. You have to test it out. You could still try and export the 700k data to a single file and then bulk insert it with the indexes in place. /Argyle