Multiple product category database design | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Multiple product category database design

Hi, I’ve got a scenario, as many ecommerce websites do, whereby our customer wishes to sell many different types of products – from bikes to CD’s. A design i’ve used before in this type of scenario is a three table approach….a table listing all the possible attributes, a products table which lists the common attributes across all the product ranges such as title,price…and then a productattributes table which stores a list of attributes for each product… Although this method has worked for me in the past, it is a complete and utter pain….theres far more work involved in the coding, much harder to maintain data integrity, and to get comparable speeds to a normalized table approach takes forever….and a lot of testing and reworking…. Is this really the sort of approach other large online ecommerce websites such as amazon adopt? Is there no clever methods for mapping product categories to specific tables without having to use dynamic SQL??? Any thoughts anyone??? Thanks Al
Hi Al,
I think you need to generalise your products to groups…we have a similar think with Assets in our system..will see if i can explain the approach we are moving towards using your example… Say you have table Customer (customer details) a customer can have many products (table CustomerProduct) all the generic Product stuff that most/all products have would go in this table (Price, etc).
We would then have a ProductType table that would list all the types of products that are available. Every Product in Customer Product would have to be assigned a Type. Each type would then have a table eg CustomerProductBicycle (linked to customer product) that would be designed to hold the additional data on a bicycle. This approach means you need to spec out the list of ProductTypes you will accept, and add a new table each time there is a new type that has different attributes…Let me know if this makes sense, or if u want any further explaination.. Ben
I think you can’t generalised the all the products as their attributes are different.
What you can do is categorised the products depending on their attributes
say for vehicles one category, stationery one category

Al I suggest you to refer to the case studies listed on the MS SQL home page for further information. Satya SKJ
Moderator
http://www.SQL-Server-Performance.Com/forum
This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
Hi satya, Which case studies are you referring to? I’ve had a look and nothing seems to resemble my scenario… Thanks for your reply Al
http://www.microsoft.com/sql/evaluation/casestudies/solutions.asp – list of companies. BTW can you be specific where exactly you’re hitting performance with your approach, as I can see we can reduce this overhead with a few optimization tips. Satya SKJ
Moderator
http://www.SQL-Server-Performance.Com/forum
This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
Hi Satya, Well in particular when trying to return data…especially when i need to return multiple attributes for a product…The stored proc below seems to hammer the cpu… CREATE PROCEDURE [dbo].[sp_GENERAL_GetLatestTracks]
(
@intNoOfRows Int,
@intGenreID Int = NULL
)
AS BEGIN –turn system messages off
SET NOCOUNT ON CREATE TABLE #Tracks
(
ID int IDENTITY(1,1),
TrackID Int NOT NULL,
TrackTypeID Int NOT NULL,
TrackTitle VarChar(150),
Genre VarChar(50) NOT NULL,
[Description] VarChar(250),
Artistname VarChar(100) NOT NULL,
ProfilePath VarChar(255) NOT NULL
) INSERT INTO #Tracks(TrackID,TrackTypeID,TrackTitle,Genre,[Description],Artistname,ProfilePath) SELECT P.ProductID,
PA6.TrackTypeID,
COALESCE(PA.Title,MST.Title + ‘ (‘ + MSA.[Name] + ‘)’),
Genres.[Name],
ISNULL(NULLIF(P.[Description],”),NULL) As [Description],
AP.Artistname,
AP.ProfilePath FROM Products P ####################### LEFT JOIN
(
SELECT ProductID,
AttributeValue ‘Title’
FROM ProductAttributes
WHERE AttributeID = 24
) PA
ON P.ProductID = PA.ProductID LEFT JOIN
(
SELECT ProductID,
Cast(AttributeValue As Int) ‘MainstreamTrackID’
FROM ProductAttributes
WHERE AttributeID = 34
) PA2
ON P.ProductID = PA2.ProductID LEFT JOIN
(
SELECT ProductID,
Cast(AttributeValue As Int) ‘UserID’
FROM ProductAttributes
WHERE AttributeID = 31
) PA3
ON P.ProductID = PA3.ProductID LEFT JOIN
(
SELECT ProductID,
Cast(AttributeValue As Int) ‘IsAuthorised’
FROM ProductAttributes
WHERE AttributeID = 56
) PA4
ON P.ProductID = PA4.ProductID LEFT JOIN
(
SELECT ProductID,
Cast(AttributeValue As Int) ‘GenreID’
FROM ProductAttributes
WHERE AttributeID = 40
) PA5
ON P.ProductID = PA5.ProductID LEFT JOIN
(
SELECT ProductID,
Cast(AttributeValue As Int) ‘TrackTypeID’
FROM ProductAttributes
WHERE AttributeID = 39
) PA6
ON P.ProductID = PA6.ProductID LEFT JOIN
(
SELECT ProductID,
CAST(AttributeValue As Int) ‘IsDeleted’
FROM ProductAttributes
WHERE AttributeID = 67
) PA7
ON P.ProductID = PA7.ProductID ######################## JOIN ArtistProfiles AP ON
AP.ArtistID = PA3.UserID JOIN Users ON
Users.UserID = PA3.UserID JOIN Genres ON
Genres.GenreID = PA5.GenreID LEFT JOIN MainstreamTracks MST ON
MST.TrackID = PA2.MainstreamTrackID
LEFT JOIN MainstreamArtists MSA ON
MSA.ArtistID = MST.ArtistID WHERE (Users.IsActive = 1 OR AP.IsMember = 1)
AND PA4.IsAuthorised = 1
AND PA7.IsDeleted = 0
AND P.CategoryID = 2
AND ISNULL(NULLIF(AP.ProfilePath,”),NULL) IS NOT NULL
AND ISNULL(NULLIF(AP.Artistname,”),NULL) IS NOT NULL
AND ((PA5.GenreID = @intGenreID AND @intGenreID IS NOT NULL) OR @intGenreID IS NULL) ORDER BY P.ProductID DESC –then return the number of rows as specified by the variables passed into the SP – @intNoOfRows
SELECT *
FROM #Tracks
WHERE #Tracks.[ID] <= @intNoOfRows DROP TABLE #Tracks –turn system messages back on
SET NOCOUNT OFF END I might be wrong, but i dont see anything amazingly wrong with this proc, but it seems to read a lot of rows, and uses a lot of cpu….i had looked at replacing all the left joins between the ############# lines above with a singular join using "where attributeid in(24,31,etc)", however this returns multiple rows so is of no great benefit…. Any ideas Satya, Thanks Al
Do run the particular SP frequently then i fell u can replace the
temporary table into permanent table.
And i think the time consuming process is joining the tables.
Try the ITW on this and go for indexing on joined column.
Exactly run the PROFILER while executing this SP/query and submit the trace to Index tuning wizard for recommendations on the indexes. http://www.sql-server-performance.com/rd_temp_tables.asp – about temp tables information.
Use Table variables – If you’ve enough memory to support the SQL server, then the primary reason for using them is they reside in memory but if the data gets huge and memory cant accomodate it then it results in paging thereby speed suffers.
Satya SKJ
Moderator
http://www.SQL-Server-Performance.Com/forum
This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
quote:Use Table variables – If you’ve enough memory to support the SQL server, then the primary reason for using them is they reside in memory but if the data gets huge and memory cant accomodate it then it results in paging thereby speed suffers.
I thought so for a long time, but recently I saw an article declaring table variables are stored in tempdb too. I’m to busy right now to check, can someone help?
Hi, Ive just ran the procedure lots of times, and monitored it using profiler…some typical results are: CPU: 234
Read: 24504
Duration: 1250 When i use the tuning wizard, it gives me no suggestions whatsoever…how can that be??? Thanks al
Mmarovic KBAhttp://support.microsoft.com/default.aspx?scid=kb;en-us;305977 FAQ about table variables. Al
Clear the procedure cache and recompile the Sp plan. Satya SKJ
Moderator
http://www.SQL-Server-Performance.Com/forum
This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
Hi Mmarovic, Do i just clear the SP cache using sp_recompile ‘objectname’ ? Secondly, either im using the profiler wrong, or the index tuning wizard is nothing short of useless. Im running my queries in query analyzer whilst running a trace in profiler…i save the results to a .trc file and then load it into the index tuning wizard…all it does, bearing in mind my query uses 8 different tables is to suggest one column (which i already have as an index!!) Any ideas?? Thanks Al
Thank you Satya. Here is the answer:
quote:Q4: Are table variables memory-only structures that are assured better performance as compared to temporary or permanent tables, because they are maintained in a database that resides on the physical disk? A4: A table variable is not a memory-only structure. Because a table variable might hold more data than can fit in memory, it has to have a place on disk to store data. Table variables are created in the tempdb database similar to temporary tables. If memory is available, both table variables and temporary tables are created and processed while in memory (data cache).
Hi higgsy.
quote:Do i just clear the SP cache using sp_recompile ‘objectname’ ?
That way you don’t clear the sp cache, you just replace execution plan of this specific stored procedure with new one, which is what you need actually. dbcc dropCleanBuffers cleans entire sp cache. I haven’t used ITW for a very long time. I prefer to tune indexes all by myself. That way I can discover other changes in table design that may be usefull. My approach would be to find the most used attributes and put them as a ‘static column’ in appropriate table. Call them predifined attributes. Leave in dynamic structure only really product specific attributes. Actually I wasn’t in position to design database for that kind of requirements, so I can’t really comment. I may change my opinion when I have to realy work on it.

I’ve just read Ben Willson’s recommendation. I like that approach and I used it and it works well if there is a restricted number of types that can be used. It is actually database object inheritance. [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />] The problem appears when it is requiered that application administrator can define attributes users are interested in.
Try this version:
CREATE PROCEDURE [dbo].[sp_GENERAL_GetLatestTracks]
(
@intNoOfRows Int,
@intGenreID Int = NULL
)
AS BEGIN –turn system messages off
SET NOCOUNT ON
–restrict number of rows
set rowCount @intGenreID SELECT
P.ProductID,
PA6.TrackTypeID,
COALESCE(PA.Title,MST.Title + ‘ (‘ + MSA.[Name] + ‘)’),
Genres.[Name],
ISNULL(NULLIF(P.[Description],”),NULL) As [Description],
AP.Artistname,
AP.ProfilePath
FROM Products P ####################### LEFT JOIN
(
SELECT ProductID, AttributeValue ‘Title’
FROM ProductAttributes
WHERE AttributeID = 24
) PA
ON P.ProductID = PA.ProductID LEFT JOIN
(
SELECT ProductID, Cast(AttributeValue As Int) ‘MainstreamTrackID’
FROM ProductAttributes
WHERE AttributeID = 34
) PA2
ON P.ProductID = PA2.ProductID
LEFT JOIN
(
SELECT ProductID, Cast(AttributeValue As Int) ‘UserID’
FROM ProductAttributes
WHERE AttributeID = 31
) PA3
ON P.ProductID = PA3.ProductID LEFT JOIN
(
SELECT ProductID, Cast(AttributeValue As Int) ‘IsAuthorised’
FROM ProductAttributes
WHERE AttributeID = 56
) PA4
ON P.ProductID = PA4.ProductID
LEFT JOIN
(
SELECT ProductID, Cast(AttributeValue As Int) ‘GenreID’
FROM ProductAttributes
WHERE AttributeID = 40
) PA5
ON P.ProductID = PA5.ProductID LEFT JOIN
(
SELECT ProductID, Cast(AttributeValue As Int) ‘TrackTypeID’
FROM ProductAttributes
WHERE AttributeID = 39
) PA6
ON P.ProductID = PA6.ProductID
LEFT JOIN
(
SELECT ProductID, CAST(AttributeValue As Int) ‘IsDeleted’
FROM ProductAttributes
WHERE AttributeID = 67
) PA7
ON P.ProductID = PA7.ProductID ######################## JOIN ArtistProfiles AP ON AP.ArtistID = PA3.UserID
JOIN Users ON Users.UserID = PA3.UserID
JOIN Genres ON Genres.GenreID = PA5.GenreID
LEFT JOIN MainstreamTracks MST ON MST.TrackID = PA2.MainstreamTrackID
LEFT JOIN MainstreamArtists MSA ON MSA.ArtistID = MST.ArtistID
WHERE
(Users.IsActive = 1 OR AP.IsMember = 1)
AND PA4.IsAuthorised = 1
AND PA7.IsDeleted = 0
AND P.CategoryID = 2
–AND ISNULL(NULLIF(AP.ProfilePath,”),NULL) IS NOT NULL
and ap.ProfilePath > ”
–AND ISNULL(NULLIF(AP.Artistname,”),NULL) IS NOT NULL
and ap.ArtistName > ”
AND PA5.GenreID = isNull(@intGenreID, -555)
ORDER BY P.ProductID DESC –turn system messages back on
SET NOCOUNT OFF END
You can also try normal left joins using different alias for each ‘instance’ of ProductAttributes table instead of using derived tables. I don’t know if it helps, but you can try.

quote:My approach would be to find the most used attributes and put them as a ‘static column’ in appropriate table. Call them predifined attributes. Leave in dynamic structure only really product specific attributes.
E.g. UserID, isDeleted and title should not be dynamic attributes. IsDeleted caries business logic, UserId too. I guess title is actually name and it is present for each product.
Hi everyone, sorry for the delay in responding. I just tried the same procedure, and added UserID to the products table and indexed it…it was about 200 times faster…so thats good news….there are a few confusing points though, that i need to get straight in my head before i could change the structure site wide… Firstly, mmarovic mentions to keep all the fields that have business logic outside of the attributes, i.e. static columns in the products table…..although this is not a problem, i will probably end up with 10-15 static columns as a lot of our different groups of products have fields that are relevant purely to that group that also have business logic….Whats the best way to get round this??? Thanks Al
quote:Firstly, mmarovic mentions to keep all the fields that have business logic outside of the attributes, i.e. static columns in the products table…..although this is not a problem, i will probably end up with 10-15 static columns as a lot of our different groups of products have fields that are relevant purely to that group that also have business logic….Whats the best way to get round this???
Ok, first 10-15 static columns should improve performance a lot and also simplify the code. If you have group specific columns involved in hard-coded business logic than there is no sense to have them as dynamic attributes as well. In that case I would use technique Ben Willson mentioned.
Hi Ben, I got an email saying you had replied to my post….but theres no reply here….whats going on with this forum, messages seem to just disapear!!! Al
case of the missing post…yes, i did indeed write a reply to Al’s post, but it appears the gremlins in the system have destroyed it…heres hoping Al could get something useful from the email…
Currently the website is under transmission to a new server and DNS servers are struggling to keepup the process. I believe Brad has fixed this but still you may get such issues and soon they will get resolved. Satya SKJ
Moderator
http://www.SQL-Server-Performance.Com/forum
This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
]]>