SQL Server Performance Forum – Threads Archive

INSERT INTO vs SELECT … INTO

While doing some performance testing with a stored procedure I am working on, I decided to try using a temp table in an effort to speed up part of the stored proc. My first step was to write a query that extracted the data I wanted to put in the temp table, and ran it a few times to make sure a) it got the right data and b) it was pretty fast. FWIW, the timing code I used is: declare @_starttime datetime, @_sectionstart datetime, @_sectionend datetime
set @_starttime = getdate() set @_sectionstart = getdate() — Code to time goes here… set @_sectionend = getdate()
print ‘~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~’
print ‘Section: ‘ + convert(varchar, datediff(ms, @_sectionstart, @_sectionend))
print ‘Elapsed: ‘ + convert(varchar, datediff(ms, @_starttime, @_sectionend))
print ‘~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~’
set @_sectionstart = getdate() This showed the query as taking about 30 ms to execute. Now I thought it would be a simple matter of sticking an INSERT INTO #tmp in front of the query and everything would be fine… Naturally I created my temp table first, and ran the code with the INSERT INTO in place. I ran the code a couple of times to allow for caching/compiling/other system performance/etc. I expected the INSERT to take a bit longer than the SELECT, but was amazed to see it now took 1500ms to execute this block! I duplicated the block of code, so now I had the straight SELECT and the INSERT INTO – the SELECT took ~30ms and the INSERT took ~1500ms. As a test, I tried removing the INSERT INTO and CREATE TABLE and turning the query into a SELECT … INTO query. This is where things started to get strange – SELECT … INTO only took 160ms?! Now I was confused, so I tried a few more tests, this time creating the temp table, doing an INSERT INTO with a field list (INSERT INTO #tmp (field1, field2…) ), truncating the table and doing a second INSERT INTO without a field list and finally doing the SELECT … INTO (into a different table of course). The first INSERT INTO again took around 1500ms, while the second INSERT INTO took only 60ms. The SELECT … INTO was consistently 160ms. I then tried swapping the order of the INSERT INTOs so the one without the field list was first, yet I got the same result – the first time an INSERT INTO was executed, it took about 1500ms. Trying to get clever, I tried doing a SELECT … INTO #tmp, truncating the table then an INSERT INTO #tmp – yet the INSRT INTO took about 1500ms again. There goes the theory I was working on that the huge time difference was caused by the table being created only on the first access. Time and again I tried different combinations with the same result – the first INSERT INTO always took around 1500ms, while the second (and subsequent) INSERT INTO took only 60ms. About the only reason I can come up with is that INSERT INTO causes the query processor to worry about datatypes, whereas SELECT … INTO doesn’t care – the datatypes will "fall out" of the underlying SELECT query. After the first INSERT INTO, the datatypes have been determined and maybe cached which would explain the dramatic speed up. Can anyone help me understand what is going on here? I try to avoid temp tables like the plague, but sometimes they are necessary and can be very useful. Where I do use temp tables, I invariably opt for INSERT INTO to avoid potential locking problems with tempdb, yet if the performance difference is so substantial, sometimes SELECT … INTO might be more appropriate. Why the big difference between INSERT INTO and SELECT … INTO? As it turned out, I revisited the original query and was able to use derived tables to avoid the temp tables altogether. NOTE: Execution plans and IO statistics were identical for each query. These results were all obtained by running the queries in Query Analyser, not as a stored procedure.

Hi the select into statement
is actually a mix of 3 stmts
create , select and into
Hi hope this answers ur question

quote:Originally posted by kate_for_u Hi the select into statement
is actually a mix of 3 stmts
create , select and INSERT
Hi hope this answers ur question

The difference is in logging. As far as I know SELECT INTO is less logged then INSERT INTO and therefore quicker. Bambola.

Refer to this link http://www.sql-server-performance.com/nb_avoid_bottlenecks.asp] also. _________
Satya SKJ

quote:Originally posted by satya Refer to this link http://www.sql-server-performance.com/nb_avoid_bottlenecks.asp] also.

This article (and other information similar to this) is why I was trying to avoid using SELECT … INTO, although in this case the server is running 7.0, so the tempdb locking should not be a problem – it is just a habit I got into a while back to avoid SELECT … INTO. BTW, I tried doing the SELECT … INTO … WHERE 1=0 to create the table, then INSERT INTO to populate it, yet the results are still the same – the first time I try an INSERT INTO, it takes a LOT longer. As far as logging goes – I am fairly certain we have the SELECT INTO/BULK COPY flag is cleared, so the SELECT INTO AND INSERT INTO statements should both be logged the same, but I will double check that no-one has changed this option on our servers. Even if SELECT INTO/BULK COPY is set, that wouldn’t explain why subsequent INSERT INTO statements (even inserting different data) are so much (ie, orders of magnitude) faster.

This may be because Select into is a minimal logged operation, where as
Insert into is a fully logged in the transaction log. Thanks Rushendra

Try this on you DEV server. Restrict log growth on a test database to 1MB. Run an INSERT INTO with a table big enought to generate an error ("The log file for database ‘….’ is full") Now do the SELECT INTO on the same table and it will run with no problems.
both in full and in bulk recovery. So it is a logging issue. Bambola.

]]>