SQL Server Performance

How to accelerate the "insert" rate and reduce log

Discussion in 'SQL Server 2005 General DBA Questions' started by shuai_sh, Jan 9, 2007.

  1. shuai_sh New Member

    SQL Server 2005<br />XEON CPU 3.0G<br />MEMORY 2.0G<br />RAID<br /><br />Tow tables:<br />HIS_HTTP_ONLINE_LOG(PARTITION) FOR HISTORY DATA<br />REL_HTTP_ONLINE_LOG(NOT PARTITIONED) FOR EVERYDAY DATA,<br />AND THEY HAVE THE SAME STRUCTURE<br /><br />CREATE TABLE HIS_HTTP_ONLINE_LOG(<br />ID numeric(20,0) NOT NULL,<br />USERID varchar(32) NOT NULL,<br />USERIP varchar(16) NOT NULL,<br />USERPORT numeric(10, 0) NULL,<br />OBJECTIP varchar(16) NULL,<br />OBJECTPORT numeric(10, 0) NULL,<br />HTTPURL varchar(256) NULL,<br />HTTPHOST varchar(12<img src='/community/emoticons/emotion-11.gif' alt='8)' /> NULL,<br />HTTPDNS varchar(12<img src='/community/emoticons/emotion-11.gif' alt='8)' /> NULL,<br />VISITIME numeric(10, 0) NULL,<br />STARTIME datetime NOT NULL,<br />ENDTIME datetime NOT NULL<br />).......<br /><br />SELECT * INTO REL_HTTP_ONLINE_LOG <br />SELECT * <br />FROM HIS_HTTP_ONLINE_LOGWHERE 1=2<br /><br />There are 5 indexes in HIS_HTTP_ONLINE_LOG ,<br />There is not one index in REL_HTTP_ONLINE_LOG<br /><br />There are about 5000,000 records in REL_HTTP_ONLINE_LOG everyday,<br />at night it will move into HIS_HTTP_ONLINE_LOG automatically,<br />The data of everyday in REL_HTTP_ONLINE_LOG will be last 90 days.<br /><br />My operations:<br />1: set the the recovery model of database "simple";<br />2: EXEC SP_DBOPTION DB, 'select into/bulkcopy', 'TRUE'<br />3£ºINSERT INTO HIS_HTTP_ONLINE_LOG<br /> SELECT * FROM REL_HTTP_ONLINE_LOG<br />4: TRUNCATE TABLE REL_HTTP_ONLINE_LOG<br /><br />ASK:<br />why the step 3 cost so much time ? (about 1 hour)<br /> and how can I reduce the transaction logs in this period ? <br /> <br />Could you give me some suggestions ?<br />Thanks!
  2. MohammedU New Member

    In step3 you are inserting the data into REL_HTTP_ONLINE_LOG then truncating REL_HTTP_ONLINE_LOG table in step4 why?

    May you are inserting the data into HIS_HTTP_ONLINE_LOG from REL_HTTP_ONLINE_LOG then trucating the REL_HTTP_ONLINE_LOG tabel...

    To reduce the step3 time: Make sure you have enough allocated space in your db...



    Mohammed U.
  3. alzdba Member

    Ad and order by claus to your SELECT * FROM HIS_HTTP_ONLINE_LOGWHERE 1=2 so the inserted data matches the clustering index of your destination table !
    This just to avoid pagesplit overhaed

  4. shuai_sh New Member

    quote:Originally posted by MohammedU

    In step3 you are inserting the data into REL_HTTP_ONLINE_LOG then truncating REL_HTTP_ONLINE_LOG table in step4 why?

    May you are inserting the data into HIS_HTTP_ONLINE_LOG from REL_HTTP_ONLINE_LOG then trucating the REL_HTTP_ONLINE_LOG tabel...

    To reduce the step3 time: Make sure you have enough allocated space in your db...



    Mohammed U.

    REL_HTTP_ONLINE_LOG is for realtime and it works at daytime,
    its data will be moved into HIS_HTTP_ONLINE_LOG at midnight

    I have enough space in my db

    Now I'm seeking for how to accelerate the "insert " operation

    Could you tell me how to ?
  5. madhuottapalam New Member

    Correct me if i am mistaken

    (a) Your History table will have last 90 days history
    (b) Your Online table will have 5 million records daily
    (c) you are pumping this 5 million rows to Historytable daily at midnight
    (d) my question is does this history table contain approx 90 * 5 million rows ?
    (f) actually you want a solution to pump this live data from Online table to History table in a faster method ?

    I think what u can do is use BCP or Bulk Insert. This link is bit confusing, anyhow try in that way.

    http://www.sql-server-performance.com/bcp.asp

    Madhu
  6. mmarovic Active Member

    is:

    3£ºINSERT INTO REL_HTTP_ONLINE_LOG
    SELECT * FROM HIS_HTTP_ONLINE_LOGWHERE 1=2[/qoute]
    actually:


    INSERT INTO REL_HTTP_ONLINE_LOG
    SELECT *
    FROM HIS_HTTP_ONLINE_LOG
    WHERE 1=2

    If so, what is the purpose of such command? It will not insert any row.
  7. HankS New Member

    You sure that is the code?

    INSERT INTO REL_HTTP_ONLINE_LOG
    SELECT * FROM HIS_HTTP_ONLINE_LOGWHERE 1=2
    is guaranteed to return no results. Common way to create an empty recordset/dataset in an app. Also can be used to create an empty table if used in a SELECT INTO.

    Hank
  8. MohammedU New Member

    You can use Madhu recomendation to speed up but your will not be FAST BCP, because my guess is you have the indexes on History table... without indexes querying such a big table is pain...

    To reduce the time and log size you can load the data every hour instead whole day data as your table has datetime column in it and make sure you the index on datetime coulumn...




    Mohammed U.
  9. shuai_sh New Member

    "INSERT INTO REL_HTTP_ONLINE_LOG
    SELECT * FROM HIS_HTTP_ONLINE_LOGWHERE 1=2"
    is just in order to build a table not for other reason .


    madhuottapalam:
    Thanks for your help, it seems you have known my question
    (d) my question is does this history table contain approx 90 * 5 million rows ?
    Yes,the history table will contain approx 90 * 5 million rows .

    (f) actually you want a solution to pump this live data from Online table to History table in a faster method ?
    Yes

    but I don't know the following:
    "I think what u can do is use BCP or Bulk Insert. This link is bit confusing, anyhow try in that way."

    why can "bcp and bulk insert " adjust for my question ?
    Today I will try it and then give you a answer.
  10. shuai_sh New Member

    Sorry,
    I wrote the step 3 wrong

    I have corrected it now
  11. madhuottapalam New Member

    Read about BCP and Bulk Insert first in BOL and internet. such bulk data transfer can be done in a better way using BCP or Bulk Insert.

    Madhu
  12. mmarovic Active Member

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by shuai_sh</i><br /><br /><b>SQL Server 2005</b><br />XEON CPU 3.0G<br />MEMORY 2.0G<br />RAID<br /><br />Tow tables:<br />HIS_HTTP_ONLINE_LOG(<b>PARTITION</b>) FOR HISTORY DATA<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />Good. Partitioned by range of id column I guess. <br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><br />REL_HTTP_ONLINE_LOG(NOT PARTITIONED) FOR EVERYDAY DATA,<br />AND THEY HAVE THE SAME STRUCTURE<br /><br />CREATE TABLE HIS_HTTP_ONLINE_LOG(<br />ID numeric(20,0) NOT NULL,<br />USERID varchar(32) NOT NULL,<br />USERIP varchar(16) NOT NULL,<br />USERPORT numeric(10, 0) NULL,<br />OBJECTIP varchar(16) NULL,<br />OBJECTPORT numeric(10, 0) NULL,<br />HTTPURL varchar(256) NULL,<br />HTTPHOST varchar(12<img src='/community/emoticons/emotion-11.gif' alt='8)' /> NULL,<br />HTTPDNS varchar(12<img src='/community/emoticons/emotion-11.gif' alt='8)' /> NULL,<br />VISITIME numeric(10, 0) NULL,<br />STARTIME datetime NOT NULL,<br />ENDTIME datetime NOT NULL<br />).......<br /><br />SELECT * INTO REL_HTTP_ONLINE_LOG <br />SELECT * <br />FROM HIS_HTTP_ONLINE_LOGWHERE 1=2<br /><br />There are 5 indexes in HIS_HTTP_ONLINE_LOG ,<br />There is not one index in REL_HTTP_ONLINE_LOG<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />So REL_HTTP_ONLINE_LOG is heap table (without clustered index). I would make one and make sure clustered index is used in the insert query. It is important to insert data in the order of destination table clustered index which should be on id column.<br /><br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><br />There are about 5000,000 records in REL_HTTP_ONLINE_LOG everyday,<br />at night it will move into HIS_HTTP_ONLINE_LOG automatically,<br />The data of everyday in REL_HTTP_ONLINE_LOG will be last 90 days.<br /><br />My operations:<br />1: set the the recovery model of database "simple";<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />Simple recovery model doesn't help here unless you split insert into batches of 10,000 rows or something similar. If you want to reduce log growth here you need to insert data in smaller batches. I hope you don't shrink your db and you don't autoexpand it. The slowness may also come from expansion of transaction log or data files.
  13. alzdba Member

  14. mmarovic Active Member

    Yes, I read it once and forgot. This is great new feature of sql server 2005.
  15. deepakontheweb New Member

    No doubt, if i had to work on this... would have used

    1. Table Partitioning.
    2. BCP with batch size at leat 10,000 or more (depending on row size & DISK IO)
    3. Put database in simple recovery, if possible.

    if still insert are slow:
    1. Ask for faster Disk to get better Disk IO
    2. Having a RAID is best suitable as DB is in Simple Recovery too.. (just for avoiding disk crash and having no TL backups)



    Deepak Kumar
    MVP, MCDBA - SQL Server

    Disclaimer: This post is provided as is with no rights & warranty for accuracy, for the sake of knowledge sharing only.
  16. MohammedU New Member

    If you don't want to do major modificatins why don't you copy hourly data in regular hours to history using BCP/DTS instead of whole day one time at night...


    Mohammed U.
  17. shuai_sh New Member

    I have tested by using BCP and I find it will also bring large log
    where bcp the data in table

    test results:
    "BCP REL_HTTP_ONLINE_LOG(5 million records ,no index) OUT ......" will bring out 835M datafile
    and it took 60s
    "BCP the datafile in HIS_HTTP_ONLINE_LOG(6 indexes)......" will bring out log 13.8G MB
    and it took 5000s
    it is slower than my way
    {
    1: disable all HIS_HTTP_ONLINE_LOG's indexes
    2:insert into HIS_HTTP_ONLINE_LOG
    select *
    frin REL_HTTP_ONLINE_LOG
    3: enable all HIS_HTTP_ONLINE_LOG's indexes
    the period was less than 30 minutes
    }
    Now I'm testing alzdba's suggestion
    and later I will give you a result,
    I think it will handle my question.
  18. shuai_sh New Member

    Hello my friends ,I tested from three ways

    SQL Server 2005
    before testing I will clear the log


    1£º
    a) disable all indexes
    b) insert into HIS_HTTP_ONLINE_LOG
    select * from REL_HTTP_ONLINE_LOG
    3) enable all indexes


    The time of step b) increased along with the increase data of HIS_HTTP_ONLINE_LOG
    5million 00:03:45.017
    10million 00:10:04.563
    15million 00:15:05.233
    20million 00:19:47.173
    25million 00:24:02.970
    30million 00:28:16.343


    2£º
    Use BCP
    exporting datafile is very quick ,about 60 seconds,
    however inporting datafile is very slowly (-b£½10000)
    Time:01:06:00
    Log: 13.8GB


    3£º
    HIS_HTTP_ONLINE_LOG uses partition(STARTIME) £¬and it will bring a partition daily,build 6 indexes

    INSERT INTO HIS_HTTP_ONLINE_LOG
    SELECT *
    FROM REL_HTTP_ONLINE_LOG (do not disable any indexes)

    because STARTIME=getdate()
    The time of this step is steady ,about 20 minutes
    Log:15GB

Share This Page