Generate 370 mill rows in a table | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Generate 370 mill rows in a table

I need to generate 370 million rows in a table. Are there any special tricks I should know about to do this or can I just create a while-loop and wait for it to finish? The table will only consist of an identity-field incrementing with 1 starting at 1, a userid which will get populated later and a datetime-field with GETDATE() in it. Also, what storage-capacity shuld I expect something like this to require? I have never worked with such large tables before so I really haven’t got a clue…oh and one more thing…I would really like the identityfield to have a mask, som row number 1 has ID 0000000000001 or something. Clues anyone? —
Frettmaestro
"Real programmers don’t document, if it was hard to write it should be hard to understand"
I would try avoid inserting one row at a time. CREATE TABLE zzz (i int) EXEC sp_indexoption ‘dbo.zzz’, ‘disallowrowlocks’, true
EXEC sp_indexoption ‘dbo.zzz’, ‘disallowpagelocks’, true SET NOCOUNT ON TRUNCATE TABLE zzz
DECLARE @id TABLE (n int) DECLARE @x int
, @numer_of_rows int SELECT @x = 1
, @numer_of_rows = 2000000 INSERT @id
SELECT 0 n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 WHILE @x + 100000 – 1 <= @numer_of_rows
BEGIN
INSERT INTO zzz
SELECT (a.n + b.n*10 + c.n*100 + d.n*1000 + e.n*10000) + @x
FROM @id a, @id b, @id c, @id d , @id e
ORDER BY 1
SELECT @x = @x + 100000
END EXEC sp_indexoption ‘dbo.zzz’, ‘disallowrowlocks’, false
EXEC sp_indexoption ‘dbo.zzz’, ‘disallowpagelocks’, false This runs 16 seconds for 2,000,000 rows. Without changing the locks it goes to 30 seconds.
This numbers apply to SIMPLE recovery model. With FULL it takes 48 seconds (with disallowrowlocks and disallowpagelocks set to true). So if you can, change recovery model to SIMPLE. I think you should create the table with an identity and set identity_insert to off.
Otherwise, without using EM which will recreate the table, I don’t know how you can
change an existing int column to identity. Anyone knows if there’s a way to do that? For the mask you could use this. declare @i int
select @i = 1
select RIGHT(‘000000000000’ + convert(varchar(12), @i), 12)
select @i = 123456789
select RIGHT(‘000000000000’ + convert(varchar(12), @i), 12) You can also make it a calculated column if you need to. Bambola.
Wow…I would never have been able to come up with anything like this. I have looked at it for 15 minutes now and I still don’t understand what’s going on. I ran it on my local database and it took 32 secs to run through and when it finished the table had (as expected offcourse) 2 mill. rows but what I did not expect was for the table to be 54MB allready! We are talking 10GB for this table of mine! Hmm…this calls for some serious thaught. —
Frettmaestro
"Real programmers don’t document, if it was hard to write it should be hard to understand"
I will explain. DECLARE @id TABLE (n int)
DECLARE @id_0 TABLE (n int) This will insert into @t numbers 0, 1, 2…. to 9.
INSERT @id
SELECT 0 n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 This will insert into @t_0 numbers 0, 10, 20… to 90.
INSERT @id_0
SELECT 0 n UNION ALL SELECT 10 UNION ALL SELECT 20 UNION ALL SELECT 30 UNION ALL SELECT 40
UNION ALL SELECT 50 UNION ALL SELECT 60 UNION ALL SELECT 70 UNION ALL SELECT 80 UNION ALL SELECT 90 A CROSS JOIN between this 2 table will return all possible combinations. select a.n, b.n, a.n + b.n
from @id a cross join @id_0 b So sum of the columns will return numbers 0 to 99. If you run the above select it woud be more clear. But I don’t really need the second table. I can multiply the column in the first table by 10 and
have the same results. And so on… select a.n, b.n, a.n + b.n from @id a cross join @id b
(or select a.n, b.n, a.n + b.n from @id a, @id b) @x is initially set to 1 (to start with 1 and not 0), and incremented by number of rows inserted (100000). So in the first loop will add 1 to the results, in the second 100001 and so on. EXEC sp_indexoption ‘dbo.zzz’, ‘disallowrowlocks’, true
EXEC sp_indexoption ‘dbo.zzz’, ‘disallowpagelocks’, true This will disable row and page lock so a table lock will be used and will be more efficient.
Simple recovery model will not log the operation. Bambola.
Now this was some pretty cool stuff!! The way you did those two table-variables and combined them was just awesome… but unfortunaetly I don’t think I’ll be able to use it after all <img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ /> I knew that this table would be large but I guess I just didn’t do my homework properly. Some simple math would tell me that this table would just be too big for it to be worth it. Each row was supposed to represent one share in the company I work for and I’m supposed to make a shareholder-registration/sharetrading-program but I have to find another way of solving this. But you should know that I *truly* appreciate your effort and I will most definetly treasure your solution in my bag of neat tricks.<br /><br />Thanx alot <img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ /><br /><br />–<br />Frettmaestro<br />"Real programmers don’t document, if it was hard to write it should be hard to understand"
]]>