SQL Server Performance: Query Tuning vs. Process Tuning
A well-designed relational OLAP database receives all of its mutations through source files. Each line in a file holds the complete record for one entity. A record is laid out as follows:
The first 50 characters identify the file and its origin, using fixed-length fields.
After this, one or more categories are listed. These correspond to one or more tables in our database. The first four characters identify the category. The next three characters identify the length of the category content. After this, the next category starts.
Within a category, one or more elements are listed. These correspond to fields in the database. The first four characters identify the element, the next three characters identify the length of the content.
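To make the layout concrete, here is a minimal T-SQL sketch that decodes the first category and its first element from a sample string. The sample data and variable names are invented for illustration. It also assumes that the three-character length counts only the content and not the seven-character header, which the description above does not state explicitly.

```sql
-- Illustrative sample only: category '0001' whose content is 18 characters
-- long and holds two elements, '0010' (length 3) and '0020' (length 1).
DECLARE @Body VARCHAR(200) =
    '0001' + '018' + '0010' + '003' + 'abc' + '0020' + '001' + 'x';

-- Category header: 4-character identifier, then 3-character content length.
DECLARE @CatId   CHAR(4)      = SUBSTRING(@Body, 1, 4);
DECLARE @CatLen  INT          = CONVERT(INT, SUBSTRING(@Body, 5, 3));
DECLARE @CatBody VARCHAR(200) = SUBSTRING(@Body, 8, @CatLen);

-- Element header inside the category content: same 4 + 3 layout.
DECLARE @ElemId    CHAR(4)      = SUBSTRING(@CatBody, 1, 4);
DECLARE @ElemLen   INT          = CONVERT(INT, SUBSTRING(@CatBody, 5, 3));
DECLARE @ElemValue VARCHAR(200) = SUBSTRING(@CatBody, 8, @ElemLen);

-- Returns: 0001, 18, 0010, 3, abc
SELECT @CatId AS CatId, @CatLen AS CatLen,
       @ElemId AS ElemId, @ElemLen AS ElemLen, @ElemValue AS ElemValue;
```

Because every header carries its own length, the next category or element always starts at a position that can be computed from the current one. That is what makes the format parseable with nothing more than SUBSTRING.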
Because the number of categories and the number of elements both vary, and all of them have to be linked to a single entity, the original approach was to parse the file in a .NET application that split each record into the relational model. The application then issued an INSERT with VALUES for every table, for every record. Loading a file of a million records (a common size) into an average of nine tables therefore led to about 9 million connections / executions against the database.
Instead, I set up a file import table into which each complete record was bulk loaded. I added an identity column to identify each record / entity uniquely. From there, I parsed the identifying fields for the file, wrote them to a separate table, and removed that part from the record.
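A minimal sketch of such a bulk-load step might look like the following. The table name dbo.tbl_RecordImport and the columns RecordID and RecordBody are taken from the code in this article. The column types, the view name, and the file path are assumptions made for illustration. Loading through a single-column view is one common way to let the identity column fill itself, not necessarily the way it was done here. It also assumes the file contains no tab characters, since tab is the default field terminator. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Staging table: one row per line of the source file.
-- Column types are assumptions; the article does not state them.
CREATE TABLE dbo.tbl_RecordImport
(
    RecordID   INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    RecordBody VARCHAR(MAX)       NOT NULL
);
GO

-- Hypothetical helper view exposing only the body column, so that
-- BULK INSERT does not try to supply a value for the identity column.
CREATE VIEW dbo.vw_RecordImportBody
AS
SELECT RecordBody FROM dbo.tbl_RecordImport;
GO

-- Hypothetical file path; each line of the file becomes one row.
BULK INSERT dbo.vw_RecordImportBody
FROM 'C:\import\records.txt'
WITH (ROWTERMINATOR = '\n');
GO
```

The point of the design is that the whole file arrives in one statement, so the per-record round trips of the original approach disappear before any parsing starts.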
Next, I inserted the first category of every record into a separate table, removed that category from the record body, and repeated the step until all categories had been split off. I then repeated the same procedure for each element within a category.
I used this code for it:
DECLARE @CategoryCount INT, @Rows INT
-- CAPTURE RECORD INFO
INSERT INTO dbo.tbl_RecordInfo (RecordID, IDField, Status, …)
SELECT RecordID, SUBSTRING(RecordBody, 1, 10), SUBSTRING(RecordBody, 11, 1), …
FROM dbo.tbl_RecordImport
-- REMOVE RECORD INFO (FIRST 50 CHARACTERS) FROM BODY
UPDATE dbo.tbl_RecordImport
SET RecordBody = SUBSTRING(RecordBody, 51, LEN(RecordBody) - 50)
SET @CategoryCount = 0
CATEGORY_LOOP:
SET @CategoryCount = @CategoryCount + 1
-- CAPTURE THE FIRST REMAINING CATEGORY OF EVERY RECORD
-- (HERE: 2-CHARACTER NUMBER, 3-CHARACTER LENGTH, CONTENT FROM POSITION 6)
INSERT INTO dbo.tbl_RecordCategory (RecordID, SortOrder, CatNumber, CatBody)
SELECT RecordID, @CategoryCount, SUBSTRING(RecordBody, 1, 2), SUBSTRING(RecordBody, 6, CONVERT(INT, SUBSTRING(RecordBody, 3, 3)))
FROM dbo.tbl_RecordImport
-- REMOVE ALL RECORDS OF WHICH ALL CATEGORIES ARE HANDLED
DELETE FROM dbo.tbl_RecordImport
WHERE (LEN(RecordBody) = CONVERT(INT, SUBSTRING(RecordBody, 3, 3)) + 10)
-- REMOVE HANDLED CATEGORY FROM BODY
UPDATE dbo.tbl_RecordImport
SET RecordBody = SUBSTRING(RecordBody, CONVERT(INT, SUBSTRING(RecordBody, 3, 3)) + 6, LEN(RecordBody) - CONVERT(INT, SUBSTRING(RecordBody, 3, 3)) + 5)
SET @Rows = @@ROWCOUNT
-- IF NOT ALL CATEGORIES ARE HANDLED, LOOP AGAIN
IF @Rows > 0 GOTO CATEGORY_LOOP
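For readers who want to try the category loop in isolation, here is a minimal, self-contained sketch on temporary tables. It is not the production code. The temp table names and sample rows are invented, and it uses a WHILE loop instead of GOTO. It also deletes a record once its body is empty, which replaces the length check used above. The category layout (2-character number, 3-character length, then content) follows the SUBSTRING offsets in the code above.

```sql
-- Hypothetical scratch tables; names and sample data are illustrative only.
CREATE TABLE #Import
(
    RecordID   INT IDENTITY(1, 1) PRIMARY KEY,
    RecordBody VARCHAR(MAX) NOT NULL
);
CREATE TABLE #Category
(
    RecordID  INT,
    SortOrder INT,
    CatNumber CHAR(2),
    CatBody   VARCHAR(MAX)
);

INSERT INTO #Import (RecordBody) VALUES
    ('01003abc02002xy'),  -- two categories: '01' = 'abc', '02' = 'xy'
    ('01005hello');       -- one category:   '01' = 'hello'

DECLARE @SortOrder INT = 0;

WHILE EXISTS (SELECT 1 FROM #Import)
BEGIN
    SET @SortOrder = @SortOrder + 1;

    -- Split off the leading category of every remaining record at once.
    INSERT INTO #Category (RecordID, SortOrder, CatNumber, CatBody)
    SELECT RecordID, @SortOrder,
           SUBSTRING(RecordBody, 1, 2),
           SUBSTRING(RecordBody, 6, CONVERT(INT, SUBSTRING(RecordBody, 3, 3)))
    FROM #Import;

    -- Strip the handled category (5-character header + content) from the body.
    UPDATE #Import
    SET RecordBody = SUBSTRING(RecordBody,
                               CONVERT(INT, SUBSTRING(RecordBody, 3, 3)) + 6,
                               LEN(RecordBody));

    -- Drop records that have no categories left.
    DELETE FROM #Import WHERE LEN(RecordBody) = 0;
END;

-- Returns three rows: (1, 1, '01', 'abc'), (1, 2, '02', 'xy'), (2, 1, '01', 'hello')
SELECT RecordID, SortOrder, CatNumber, CatBody
FROM #Category
ORDER BY RecordID, SortOrder;
```

The number of loop iterations equals the largest number of categories in any single record, not the number of records, so a file of a million records with nine categories each is handled in nine passes of set-based statements.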