SQL Server Performance Forum – Threads Archive

Large unused space on heap table after mass insert

Hi Guys,
Does any of you have ever seen this happening? I can’t figure it out what causes this. What happen is I have a program that perform rapid massive inserts into multiple heap (non-clustered indexed) tables via ODBC. Before the program is run, one of the tables has about 50,000 rows and size about 70 MB. After the program has been executed for a while, The program crashes out because my server runs out of disk space. 1000,000 rows were inserted into this particular table. The table grew to 2.3GB. Here is the result of SP_SPACEUSED with updateusage flag on.
TableRowsReservedDataIndexUnused
itemdata1508652358152 KB284576 KB23536 KB2050040 KB
2GB of UNUSED?????? How does this happen??
I investigated some more. Here is the output from DBCC SHOWCONTIG of the table
DBCC SHOWCONTIG scanning ‘XXX’ table…
Table: ‘XXX’ (61243273); index ID: 0, database ID: 7
TABLE level scan performed.
– Pages Scanned…………………………..: 35572
– Extents Scanned…………………………: 34745
– Extent Switches…………………………: 34744
– Avg. Pages per Extent……………………: 1.0
– Scan Density [Best Count:Actual Count]…….: 12.80% [4447:34745]
– Extent Scan Fragmentation ……………….: 99.62%
– Avg. Bytes Free per Page…………………: 7285.8
– Avg. Page Density (full)…………………: 9.98%
DBCC execution completed. If DBCC printed error messages, contact your system administrator. Average pages per extent is only 1. and the page density is only 10%. I know that before the program was run the average pages per extent was close to 8 and page density was close to 100%. The only way that I could bring the table size down is to create and drop clustered index on the table. Then, everything is back to normal. What#%92s really killing me is that how does this happen? Any comments would be greatly appreciated. Thanks,
Jack

Run the SP_SPACUESE @UPDATEUSAGE=’TRUE’ for optimum results and confirm whether the table has any text data type columns to insert. Satya SKJ
Moderator
http://www.SQL-Server-Performance.Com/forum
This posting is provided â€œAS ISâ€ with no rights for the sake of knowledge sharing.

Satya,
Thanks for the reply,
If you looked at my first post, you will see that i have run the sp_spaceused with updateusage flag enabled. The table consist of INT and CHAR data type only.
Any more input would be appriciated.
Thanks,
Jack

By the way,
after clustered index is created and dropped,
Here is the sp_spaceused result for the table
table rows reserved data index unused
XXX15088956032 KB29448 KB23528 KB3056 KB
Thanks,
Jack

I can’t explain why it did happen because I allways create clustered index on tables, so I don’t have experience with tables on the heap. The reason I allways create is table maintenance. I think your results supports my theory. If you decide to create clustered index, it is good idea to create it on identity column to avoid page splits during heavy inserts. If you don’t have identity column then create clustered index on column that behaves like identity which means order of column values corresponds to order rows are inserted. I mean clustered index column value increases for each new row inserted. If you don’t have such column in the table then create clustered index with fill factor other then 0 (e.g. 80). You will have to experiment to find optimal value. Fill factor will reduce page splits and fragmentation and allow to perform less frequent index defragmentation and still have acceptable performance.

Mmarovic,
Thanks for the answer, I guess i would like to know why it happened more than how to fix it. I believe that having a clustered index on primary key would solve this problem, but I still can’t figure out why it happen. Could you explian your page split theory to me abit more? If a table is a heap, why would a page needed to be splitted because SQL Server should not care about what page the data should be allocated in a heap. It should just keep filling up the page until it’s full and move on the the next page. This also does not explain having 1 page per extent either. SP_Configure shows that fill factor for the database is set to default. The table is a heap, does fill factor effect heap table? I experimented with placing the program on a remote machine and hitting my database machine from network, the problem does not seem to occur. Maybe the speed of inserts is the factor here. I will try to investigate more, but if you have anymore input, I would greatly appriciate it.
Thanks again,
Jack

JackP,
My discussion about page splits was not related to heap tables at all. However, if you create clustered index on the wrong column and without proprer fill-factor you may experience extreme number of page splits and fragmentation. In that case clustered index fragmentation may be as bad as fragmentation you have now. Server wide fill-factor is used when you don’t include fill-factor option in index creation script. I would also like to know the reason for behaviour you described, but unfortunatelly I don’t.

]]>