SQL Server Performance

Data Load Performance and Transaction Log Issue

Discussion in 'Performance Tuning for DBAs' started by sonnysingh, Aug 19, 2006.

  1. sonnysingh Member

    Hi All

    I am creating 21,100,000 records in a table. I have the recovery model set to 'SIMPLE'. It is taking more than 2 hours, and the main problem is that my transaction log is getting full in spite of 20GB of free space being available.

    1. how do I speed up the data creation operation?
    2. how do I solve the transaction log issue?

    Thanks in advance..

  2. Luis Martin Moderator

    How are you loading the data? BCP? DTS?


    Luis Martin
    Moderator
    SQL-Server-Performance.com

    All in Love is Fair
    Stevie Wonder


    All postings are provided “AS IS” with no warranties for accuracy.



  3. mmarovic Active Member

    Split the data loading into smaller batches (e.g. 10,000 rows per iteration) to handle the log growth problem.
    Create the indexes after the data is loaded...
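
    For illustration, a minimal sketch of key-based batching with INSERT ... SELECT; the table, column and batch size are hypothetical, and the CHECKPOINT assumes the SIMPLE recovery model the poster mentioned.

        DECLARE @LastID int
        SET @LastID = 0

        WHILE EXISTS (SELECT 1 FROM dbo.SourceData WHERE ID > @LastID)
        BEGIN
            -- load the next 10,000 rows as their own transaction
            INSERT INTO dbo.DestData (ID, Col1, Col2)
            SELECT TOP 10000 ID, Col1, Col2
            FROM dbo.SourceData
            WHERE ID > @LastID
            ORDER BY ID

            -- remember how far we got
            SELECT @LastID = MAX(ID) FROM dbo.DestData

            -- under SIMPLE recovery a checkpoint lets the inactive log space be reused
            CHECKPOINT
        END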
  4. satya Moderator

    ... also try running BACKUP LOG WITH NO_LOG in order to clear the virtual log; that can help keep the size down in this case.
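
    A minimal sketch of that command, assuming a hypothetical database named MyDB and logical log file name MyDB_Log (NO_LOG is valid on SQL Server 2000/2005, though deprecated in later versions):

        -- discard the inactive log records without backing them up
        BACKUP LOG MyDB WITH NO_LOG

        -- optionally reclaim the freed space from the log file (target size in MB)
        DBCC SHRINKFILE (MyDB_Log, 100)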

    Satya SKJ
    Microsoft SQL Server MVP
    Contributing Editor & Forums Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided AS IS with no rights for the sake of knowledge sharing.
  5. FrankKalis Moderator

    Check out this really interesting link: http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/incbulkload.mspx

    The main reason for bad performance is typically the presence of index(es) on the base table. Drop the indexes before the load (if there are any) and recreate them after the load is completed. You should see that this is still much faster than loading the data while the indexes are present. However, more than 2 hours for 21 million rows is, to say the least, _very_ slow. In my tests it took about 1.5 minutes to load 1 million rows using BULK INSERT.
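
    A minimal sketch of the drop/load/recreate pattern with BULK INSERT; the table, index and file names, terminators and batch size are hypothetical:

        -- drop the nonclustered index before the load (SQL Server 2000-style syntax)
        DROP INDEX DestData.IX_DestData_LoadDate

        -- bulk load in committed batches; TABLOCK helps qualify for minimal logging
        BULK INSERT dbo.DestData
        FROM 'C:\load\destdata.txt'
        WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', BATCHSIZE = 100000, TABLOCK)

        -- recreate the index once the data is in
        CREATE INDEX IX_DestData_LoadDate ON dbo.DestData (LoadDate)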

    --
    Frank Kalis
    Moderator
    Microsoft SQL Server MVP
    Webmaster:http://www.insidesql.de
  6. sonnysingh Member

    Thanks a lot guys..

    I have tried some of your suggestions over the weekend..
    1. split into batches (2 months of data at a time)
    2. truncated the log with the TRUNCATE_ONLY option.. (satya, what would be the difference, since you mentioned "BACKUP LOG WITH NO_LOG"? please advise)
    3. shrank the file with the DBCC SHRINKFILE command.
    4. set the recovery model to BULK_LOGGED mode...

    It works fine, but I want to know whether I have taken the right approach or whether I need to alter/add anything else????
    Luis... I am just using a simple INSERT ... SELECT statement, since the tables are within the same database.
    mmarovic... what do you mean by creating the indexes after the data is loaded? please elaborate...

    regards guys..




  7. gurucb New Member

    The database is in simple recovery mode, yet the transaction log is getting full?
    So, are you inserting all of the rows in a single transaction (an open transaction will not allow the transaction log to be truncated)?

    Also, what are the growth options for the transaction log: is it by size in MB or by %? If %, what is the size of the transaction log before it gets full?

    Frequent auto-grow increments also cause the transaction log file to grow.

    In your case the best way to go is the bulk copy methods.
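
    A minimal sketch of the checks described above (the database name MyDB is hypothetical; run sp_helpfile in the database being loaded):

        -- oldest open transaction, which would block log truncation
        DBCC OPENTRAN ('MyDB')

        -- log size and percentage used for every database
        DBCC SQLPERF (LOGSPACE)

        -- current size and growth setting (MB or %) for each file of the current database
        EXEC sp_helpfile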
  8. satya Moderator

    truncated the log with the TRUNCATE_ONLY option.. (satya, what would be the difference, since you mentioned "BACKUP LOG WITH NO_LOG"? please advise)
    In the simple recovery model, a sequence of transaction log backups is not being maintained. All log records before the MinLSN can be truncated at any time, except while a BACKUP statement is being processed. NO_LOG and TRUNCATE_ONLY are the only BACKUP LOG options that are valid for a database that is using the simple recovery model, and they are in fact synonyms: both remove the inactive part of the log without making a backup copy of it.

    I suggest looking at Books Online for the transaction log architecture in this case, for more information.


    Satya SKJ
    Microsoft SQL Server MVP
    Contributing Editor & Forums Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided AS IS with no rights for the sake of knowledge sharing.
  9. merrillaldrich New Member

    Could it be that the data file(s) or transaction log file are auto-growing during the load, causing the slowdown?
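
    If auto-grow is the culprit, one option is to pre-size the files before the load so growth does not fire mid-load; a minimal sketch, with hypothetical database name, logical file names and sizes:

        -- grow the data file up front to roughly its expected final size
        ALTER DATABASE MyDB
        MODIFY FILE (NAME = MyDB_Data, SIZE = 40960MB, FILEGROWTH = 512MB)

        -- give the log enough room for the biggest batch, with a fixed-size growth step
        ALTER DATABASE MyDB
        MODIFY FILE (NAME = MyDB_Log, SIZE = 4096MB, FILEGROWTH = 512MB)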
  10. waqar Member

    Hi,

    Just to add my experience,
    I always have to add somewhere around 40-50 million records into one table, and a lot of pre-processing is required. If I try to run it in one go, my log can reach 20-30 GB, so I always break the import into small portions, which lets me clear the log after each portion finishes.

    ________________________________________________
    Be great in act, as you have been in thought.
  11. sonnysingh Member

    Thanks a lot guys..
    Like I said above..

    1. set the recovery model to BULK_LOGGED mode.
    2. split into batches (2 months of data at a time)
    3. truncate the log with BACKUP LOG ... WITH TRUNCATE_ONLY between each load.
    4. shrink the file with the DBCC SHRINKFILE command between each load.

    It is the transaction log file which is causing the problem... of course the data file is also growing because data is being inserted... can I shrink the data (mdf) file? and up to what percentage can I shrink it?

    Now I have the BULK_LOGGED recovery model, so how can I truncate the log and the data file in this situation???

    regards
  12. sonnysingh Member

    Hi me again...

    guys.. my database is 44GB and I have 10GB of space left on the hard drive, and I still need to insert 20 million records into it. Because of the space problem my job crashed.. presently, I have the above mentioned configuration for the database.

    Please advise on an approach or method to solve this situation.. it is very urgent, guys.

    really appreciated your help..

    regards
  13. waqar Member

    Sonny bro,

    Break your 20 million records into batches of 5 million each.

    ->insert
    ->backup log dbname with truncate_only
    ->dbcc shrinkfile(logfile,1)
    ->check hdd space

    repeat all steps again.
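
    A minimal T-SQL sketch of one iteration of these steps; the database, table, key range and logical log file name are hypothetical, and xp_fixeddrives is an undocumented but commonly used procedure for checking free disk space:

        -- step 1: insert one 5-million-row slice
        INSERT INTO dbo.DestData (ID, Col1)
        SELECT ID, Col1 FROM dbo.SourceData WHERE ID BETWEEN 1 AND 5000000

        -- step 2: discard the inactive log records
        BACKUP LOG MyDB WITH TRUNCATE_ONLY

        -- step 3: shrink the log file back down (target size in MB)
        DBCC SHRINKFILE (MyDB_Log, 1)

        -- step 4: check free space left on each drive
        EXEC master..xp_fixeddrives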


    ________________________________________________
    Be great in act, as you have been in thought.
  14. gurucb New Member

    One more pointer: in bcp there is a parameter for batch size; by default the batch is the whole file, so everything is loaded as one single transaction.

    Specifies the number of rows per batch of imported data. Each batch is imported and logged as a separate transaction that imports the whole batch before being committed. By default, all the rows in the data file are imported as one batch. To distribute the rows among multiple batches, specify a batch_size that is smaller than the number of rows in the data file. If the transaction for any batch fails, only insertions from the current batch are rolled back. Batches already imported by committed transactions are unaffected by a later failure.
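
    For example, a hedged sketch of a bcp command line using the -b switch; the server, database, table and file names are hypothetical:

        rem character-mode import, comma-delimited, committed every 10,000 rows, trusted connection
        bcp MyDB.dbo.DestData in C:\load\destdata.txt -c -t, -b 10000 -S MyServer -T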

  15. sonnysingh Member

    thanks guys, those were good steps
  16. mmarovic Active Member

    quote: mmarovic... what do you mean by creating the indexes after the data is loaded? please elaborate...
    See Frank's answer. Also, Merrill's question was a good one. Don't shrink the log files (if you currently do) if they are only going to grow again the next time you load data. Keep the log size big enough that the files don't have to grow during the load and slow the process down further.
  17. sonnysingh Member

    mmarovic.... why shouldn't I shrink the log file?? Like I have mentioned, I am doing the following steps to handle the situation:

    - inserting records
    - backing up the log with (BACKUP LOG Traveldb WITH NO_LOG)
    - then DBCC SHRINKFILE (Travel_LOG, 20, TRUNCATEONLY)

    They worked fine... but could you explain why I shouldn't shrink the log file??

    regards



  18. mmarovic Active Member

    A one-time shrink makes sense. If you are doing it regularly, it means that the next time SQL Server needs more log space than is available in the log files, it has to expand them. That means it needs again the space you had temporarily given back to the OS. So why not just keep this space reserved for SQL Server in advance, instead of forcing SQL Server to expand the files every time you have heavy transaction activity?
  19. sonnysingh Member

    mmarovic...

    The problem is that I have limited space available on the disk, and if I don't shrink the log file it will fill the disk up and the data load process will halt and end, won't it?
    But on the other hand, you are right that shrinking is not the ideal solution, so how do I reserve the space for the log file (by increasing the size of the log file)? And if so, then what happens when all that space fills up, since I am loading 25GB of data in batches into a table from another table (it is an 'insert into ... select from' operation)?

    Please suggest ideal way...
    regards

  20. mmarovic Active Member

    The logical log space is reused after a transaction log backup, or after a checkpoint if the recovery model is simple. The possible exceptions are uncompleted transactions. To avoid one big transaction occupying all the space on the disk, split the insert ... select into smaller batches. Keep the transaction log file with enough space for the biggest transaction load and don't shrink it.
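
    A minimal sketch of date-range batching (two months per iteration, as mentioned earlier in the thread); the table and column names, date boundaries and database name are hypothetical:

        DECLARE @From datetime, @To datetime
        SET @From = '20050101'

        WHILE @From < '20070101'
        BEGIN
            SET @To = DATEADD(month, 2, @From)

            -- each two-month slice commits as its own transaction
            INSERT INTO dbo.DestData (TravelDate, Col1)
            SELECT TravelDate, Col1
            FROM dbo.SourceData
            WHERE TravelDate >= @From AND TravelDate < @To

            -- free the inactive log between slices: CHECKPOINT is enough under SIMPLE,
            -- under BULK_LOGGED take a log backup instead
            CHECKPOINT

            SET @From = @To
        END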
  21. sonnysingh Member

    I am using the BULK_LOGGED recovery model... in order to minimize the log generated in the log file. So what would be your suggested method in this recovery model, and in the full recovery model, for the situation I am in?????

    regards
  22. mmarovic Active Member

    1. Schedule transaction log backups frequently enough.
    2. Split large inserts/updates/deletes into smaller batches.
    3. Rebuild only the indexes that are fragmented.
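
    For point 3, a minimal sketch using the SQL Server 2000/2005 commands; the table and index names and the 30% threshold are hypothetical:

        -- report fragmentation for one index
        DBCC SHOWCONTIG ('dbo.DestData', 'IX_DestData_LoadDate')

        -- only if Logical Scan Fragmentation is high (e.g. above 30%), rebuild that index
        DBCC DBREINDEX ('dbo.DestData', 'IX_DestData_LoadDate', 90)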
