Inputting large data into the database | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Inputting large data into the database

I have created a database that stores millions of rows of X,Y,Z coordinate data. I optimised the database for retrieving data, as that is the main work required. However, on occasion new X,Y,Z data needs to be added to the tables, generally about 200,000 rows at a time. Because the database is not optimised for data input, loading takes ages (approx. 3 hours), and this includes a test to see whether each X,Y point is already in the database and whether it needs overwriting.

At the moment I am using VB to input the data from an array, calling the AddNew method for each row and then UpdateBatch once all the data is added. Before I add each new point I call two stored procedures from VB: 1) one to identify whether the X,Y point is already in the database; VB then tests whether it needs overwriting, and if so 2) another stored procedure is called to delete the point. I think this is where the speed issues occur, with the testing of each point. Any ideas on how to improve this method of testing a point? Also, is there a way to pass an array from VB into a stored procedure and loop through it within the SP? Thanks in advance for any help
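On the array question: T-SQL on SQL Server 2000 has no native array type, but a common workaround is to pass the values as a single delimited string and split it inside the procedure. A minimal sketch (the procedure name and parameter are hypothetical, not from this thread):

```sql
-- Hypothetical sketch: pass a comma-delimited list from VB as one
-- VARCHAR parameter and walk through it inside the procedure.
CREATE PROCEDURE dbo.ProcessPointList
    @List VARCHAR(8000)   -- e.g. '12.5,13.0,14.25'
AS
BEGIN
    DECLARE @Pos INT, @Item VARCHAR(100)
    SET @List = @List + ','              -- trailing delimiter simplifies the loop
    WHILE LEN(@List) > 0
    BEGIN
        SET @Pos  = CHARINDEX(',', @List)
        SET @Item = LEFT(@List, @Pos - 1)
        -- ... process @Item here (e.g. insert it into a work table) ...
        SET @List = SUBSTRING(@List, @Pos + 1, LEN(@List))
    END
END
```

Note the VARCHAR(8000) ceiling: for 200,000 rows at a time this approach runs out of room quickly, which is why a staging table loaded by bcp or BULK INSERT (as suggested below in the thread) is usually the better route at that volume.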
What is your indexing scheme? How do you know the speed issue is with the two SPs?
Suggest importing the data from a file into a ‘staging’ table using bcp or BULK INSERT.
You can then join the staging table to the target table, and insert any records which do not already exist.
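A sketch of that approach (the table and file names are assumptions, as is the column layout): load the file into an identically shaped staging table with BULK INSERT, overwrite the points that already exist, then insert the rest, all as set-based statements instead of 200,000 round trips from VB.

```sql
-- Assumed schema: Points(X, Y, Z, CoSys) with primary key (X, Y, CoSys),
-- and a staging table PointsStaging with the same columns.
BULK INSERT dbo.PointsStaging
FROM 'C:\data\newpoints.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')

-- Overwrite Z for points that already exist...
UPDATE p
SET    p.Z = s.Z
FROM   dbo.Points p
JOIN   dbo.PointsStaging s
       ON p.X = s.X AND p.Y = s.Y AND p.CoSys = s.CoSys

-- ...and insert the points that do not.
INSERT INTO dbo.Points (X, Y, Z, CoSys)
SELECT s.X, s.Y, s.Z, s.CoSys
FROM   dbo.PointsStaging s
LEFT JOIN dbo.Points p
       ON p.X = s.X AND p.Y = s.Y AND p.CoSys = s.CoSys
WHERE  p.X IS NULL
```

This replaces the per-row existence check and delete with two joins, which is typically where the hours go.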
I have four columns in my table: X, Y, Z and CoSys (the coordinate system number which the X, Y, Z data is in). I have a primary key on X, Y and CoSys (to avoid duplicate values) and a second index on CoSys, as the select statement is:
Select X, Y, Z from [Table] where X between [Xmin] and [Xmax] and Y between [Ymin] and [Ymax] and CoSys = [coordinate system number]
I have run the Index Tuning Wizard and it does not suggest any improvements, but feel free to suggest any. Also, is a 'staging' table just a temporary table? What is the difference between bcp and BULK INSERT?
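One thing worth trying for that select (my suggestion, not something the wizard proposed, and the table name is assumed) is a composite index that leads on the equality column, so the BETWEEN ranges on X and Y only have to scan rows within one coordinate system:

```sql
-- Sketch: lead with the equality-filtered column (CoSys) so the
-- range predicates on X and Y scan a single coordinate system's rows.
CREATE INDEX IX_Points_CoSys_X_Y
ON dbo.Points (CoSys, X, Y)
```

A standalone index on CoSys alone is unlikely to help much here, since the optimiser still has to fetch the rows to apply the X and Y ranges.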
Books online explains:
The bcp command prompt utility copies Microsoft® SQL Server™ data to or from a data file. It is used most frequently to transfer large volumes of data into a SQL Server table from another program, usually another database management system (DBMS). The data is first exported from the source program to a data file, and then imported from the data file into a SQL Server table using bcp. Alternatively, bcp can be used to transfer data from a SQL Server table to a data file for use in other programs. For example, the data can be copied from an instance of SQL Server into a data file. From there, another program can import the data.

Data can also be transferred into a SQL Server table from a data file using the BULK INSERT statement. However, the BULK INSERT statement cannot bulk copy data from an instance of SQL Server to a data file. The BULK INSERT statement allows you to bulk copy data to an instance of SQL Server using the functionality of the bcp utility with a Transact-SQL statement, rather than from the command prompt.

Satya SKJ
This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
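For concreteness, the two would be invoked like this (server, database, table and file names are placeholders):

```sql
-- From the command prompt, bcp imports a comma-delimited file like so
-- (-c = character data, -t, = field terminator, -T = trusted connection):
--   bcp MyDatabase.dbo.PointsStaging in C:\data\newpoints.txt -c -t, -S MYSERVER -T
--
-- From T-SQL, BULK INSERT performs the same load without leaving the query tool:
BULK INSERT dbo.PointsStaging
FROM 'C:\data\newpoints.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')
```

Either gets the file into the staging table far faster than row-by-row inserts from VB; the choice is mainly whether you want to drive the load from a batch file (bcp) or from within a stored procedure (BULK INSERT).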