SQL Server Performance Forum – Threads Archive
Is it wise to split large table into multiple?Hi people, I have the following database problem:
I have a table with about 500.000 rows, which is joined in a large query. This large query is requested a *lot*, and this large table seems to be a bottleneck for the speed of the query. This large table already contains quite some indexes. I was wondering if it is a good idea to split this large table up to about 60-70 new tables, all containing the same columnns as the original, but with a smaller amount of rows, to speed things up.
No thats not good idea to split into many tables only to store mimimum records
Make sure your tables are properly normalised
500,000 rows is not a big one. Did you use proper index for that table?
Post the table structure and your large query Madhivanan Failing to plan is Planning to fail
To be honest 500,000 rows is not considered a very large table. For improving performance, I would look at that "large query" you mentioned. Why is the query so large? Does it return all columns just so the same query can be used everywhere in the application? That is not a very good approach: use separate queries that return only the necessary columns. And if the query returns many rows that must be filtered in the application, then you should use a stored procedure instead that will filter the data before it is returned.
Ok, here’s the query: CREATE PROCEDURE getMenusSql @ShopId INT
, @FileSizeLimit INT
, @BrowserId INT AS
BEGIN SET NOCOUNT ON SELECT
, ParentCategory_Handle FROM
INNER JOIN Tones_FileTypes ON File_Id = Tone_FileType_File_Id AND File_Size <= @FileSizeLimit
INNER JOIN FileTypes ON Tone_FileType_FileType_Id = FileType_Id
INNER JOIN Browsers_FileTypes ON FileType_Id = Browser_FileType_FileType_Id AND Browser_FileType_Browser_Id = @BrowserId
INNER JOIN Tones ON Tone_FileType_Tone_Id = Tone_Id
INNER JOIN Shops_Tones ON Shop_Tone_Shop_Id = @ShopId AND Tone_Id = Shop_Tone_Tone_Id
INNER JOIN Shops ON Shop_Tone_Shop_Id = Shop_Id
INNER JOIN Charts ON Tone_Chart_Chart_Id = Chart_Id
INNER JOIN Shops_Charts ON Chart_Id = Shop_Chart_Chart_Id AND Shop_Chart_Shop_Id = @ShopId
FULL OUTER JOIN Tones_Categories
INNER JOIN Categories ON Tone_Category_Category_Id = Category_Id
INNER JOIN Shops_Categories ON Category_Id = Shop_Category_Category_Id AND Shop_Category_Shop_Id = @ShopId
INNER JOIN ParentCategories ON Category_ParentCategory_Id = ParentCategory_Id
INNER JOIN Shops_ParentCategories ON ParentCategory_Id = Shop_ParentCategory_ParentCategory_Id AND Shop_ParentCategory_Shop_Id = @ShopId
ON 1 = 0
ON Tone_Id = Tone_Category_Tone_Id OR Tone_Id = Tone_Chart_Tone_Id
NOT FileType_Id IN (SELECT Shop_Excluded_FileType_FileType_Id FROM Shops_Excluded_FileTypes WHERE Shop_Excluded_FileType_Shop_Id = @ShopId)
EXISTS (SELECT Shop_Recordlabel_Recordlabel_Id FROM Shops_Recordlabels WHERE Shop_Recordlabel_Shop_Id = @ShopId AND Shop_Recordlabel_Recordlabel_Id = Tone_Recordlabel_Id)
OR Tone_Recordlabel_Id IS NULL
, ParentCategory_Handle ORDER BY
ISNULL(Shop_ParentCategory_Position, 0) + ISNULL(Shop_Chart_Position, 0)
, ISNULL(Shop_ParentCategory_DisplayName, ”) + ISNULL(Shop_Chart_DisplayName, ”) END
This is a IO report when the query is executed: Table ‘Files’. Scan count 7858, logical reads 24624, physical reads 0, read-ahead reads 0.
Table ‘Shops_Excluded_Filetypes’. Scan count 909, logical reads 1818, physical reads 0, read-ahead reads 0.
Table ‘FileTypes’. Scan count 909, logical reads 909, physical reads 0, read-ahead reads 0.
Table ‘Tones_FileTypes’. Scan count 7448, logical reads 23131, physical reads 0, read-ahead reads 0.
Table ‘Shops_Tones’. Scan count 11265, logical reads 34818, physical reads 0, read-ahead reads 0.
Table ‘Shops_Recordlabels’. Scan count 497, logical reads 994, physical reads 0, read-ahead reads 0.
Table ‘Tones’. Scan count 13215, logical reads 26538, physical reads 0, read-ahead reads 0.
Table ‘Worktable’. Scan count 1, logical reads 173, physical reads 0, read-ahead reads 0.
Table ‘Tones_Charts’. Scan count 10, logical reads 20, physical reads 0, read-ahead reads 0.
Table ‘Charts’. Scan count 10, logical reads 20, physical reads 0, read-ahead reads 0.
Table ‘Shops_Charts’. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.
Table ‘Worktable’. Scan count 1, logical reads 13273, physical reads 0, read-ahead reads 0.
Table ‘Tones_Categories’. Scan count 155, logical reads 330, physical reads 0, read-ahead reads 0.
Table ‘ParentCategories’. Scan count 155, logical reads 155, physical reads 0, read-ahead reads 0.
Table ‘Shops_ParentCategories’. Scan count 157, logical reads 624, physical reads 0, read-ahead reads 0.
Table ‘Categories’. Scan count 157, logical reads 314, physical reads 0, read-ahead reads 0.
Table ‘Shops_Categories’. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0.
Table ‘Shops’. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.
Table ‘Browsers_FileTypes’. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0. The table shops_tones is the large table I mentioned.
When I look at the query, I see a FULL OUTER JOIN, which doesn’t strike me as being a fast join. Is it an idea to split this query up into 2 smaller queries? Right now it takes 7 seconds to execute, which is way too slow.
This looks like generated by some reporting software.
You haven’t written anything about your indexes or the table structures. With that much JOINs, you probably see mostly scans anyway. What so you mean by "requested a lot" and how many rows is the query supposed to return? —
Microsoft SQL Server MVP
Ich unterstÃ¼tze PASS Deutschland e.V. http://www.sqlpass.de)
There is a specific thing that only FULL JOIN can do: it shows all records from both tables, even if there is no match to a row on the opposite side. This is helpful for example if you want to see a list of all suppliers, with or without orders, and the number of orders. What you may find is that the GROUP BY was added to avoid duplicate rows that are occurring because of one of the joins. Experience tells me that this may well have been the reason why an EXISTS clause was added, and perhaps they forgot to remove the GROUP BY clause afterwards. There are lots of tables being joined here, and I would start looking very carefully if each table is in fact necessary. In this respect it is not very helpful that the SELECT list doesn’t say in which table the named columns are occurring …
Thanks for your input, but I think I alredy solved it by splitting the query in 2 separate ones. The FULL outer join made it amazingly slow…
Sometimes, people get caught up on "we have too many stored procedures". If a query is doing two seperate things and you could either:<br /><br />1. Create two simple stored procs that do each and run quickly.<br />2. Make a really complicated stored procedure you’re proud of that runs like crap.<br /><br />You should pick #1. Sometimes that means a little bit more coding. It was probably the right decision in this case…..just so you know.<br /><br />[<img src=’/community/emoticons/emotion-1.gif’ alt=’‘ />]<br /><br />MeanOldDBA<br />[email protected]<br /><br />When life gives you a lemon, fire the DBA.
I think you can even focus on partitioned views just in back of your mind.
I feel even that can be fruitful at times.