SQL Server Performance

Multiple product category database design

Discussion in 'General DBA Questions' started by higgsy, Jun 5, 2005.

  1. higgsy New Member

    Hi,

    I've got a scenario, as many ecommerce websites do, whereby our customer wishes to sell many different types of products - from bikes to CD's. A design i've used before in this type of scenario is a three table approach....a table listing all the possible attributes, a products table which lists the common attributes across all the product ranges such as title,price...and then a productattributes table which stores a list of attributes for each product...

    Although this method has worked for me in the past, it is a complete and utter pain....theres far more work involved in the coding, much harder to maintain data integrity, and to get comparable speeds to a normalized table approach takes forever....and a lot of testing and reworking....

    Is this really the sort of approach other large online ecommerce websites such as amazon adopt? Is there no clever methods for mapping product categories to specific tables without having to use dynamic SQL???

    Any thoughts anyone???

    Thanks

    Al
  2. benwilson New Member

    Hi Al,
    I think you need to generalise your products to groups...we have a similar think with Assets in our system..will see if i can explain the approach we are moving towards using your example...

    Say you have table Customer (customer details) a customer can have many products (table CustomerProduct) all the generic Product stuff that most/all products have would go in this table (Price, etc).
    We would then have a ProductType table that would list all the types of products that are available. Every Product in Customer Product would have to be assigned a Type. Each type would then have a table eg CustomerProductBicycle (linked to customer product) that would be designed to hold the additional data on a bicycle.

    This approach means you need to spec out the list of ProductTypes you will accept, and add a new table each time there is a new type that has different attributes...Let me know if this makes sense, or if u want any further explaination..

    Ben
  3. dineshasanka Moderator

    I think you can't generalised the all the products as their attributes are different.
    What you can do is categorised the products depending on their attributes
    say for vehicles one category, stationery one category
  4. satya Moderator

    Al

    I suggest you to refer to the case studies listed on the MS SQL home page for further information.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  5. higgsy New Member

    Hi satya,

    Which case studies are you referring to? I've had a look and nothing seems to resemble my scenario...

    Thanks for your reply

    Al
  6. satya Moderator

    http://www.microsoft.com/sql/evaluation/casestudies/solutions.asp - list of companies.

    BTW can you be specific where exactly you're hitting performance with your approach, as I can see we can reduce this overhead with a few optimization tips.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  7. higgsy New Member

    Hi Satya,

    Well in particular when trying to return data...especially when i need to return multiple attributes for a product...The stored proc below seems to hammer the cpu...

    CREATE PROCEDURE [dbo].[sp_GENERAL_GetLatestTracks]
    (
    @intNoOfRows Int,
    @intGenreID Int = NULL
    )
    AS

    BEGIN

    --turn system messages off
    SET NOCOUNT ON

    CREATE TABLE #Tracks
    (
    ID int IDENTITY(1,1),
    TrackID Int NOT NULL,
    TrackTypeID Int NOT NULL,
    TrackTitle VarChar(150),
    Genre VarChar(50) NOT NULL,
    [Description] VarChar(250),
    Artistname VarChar(100) NOT NULL,
    ProfilePath VarChar(255) NOT NULL
    )

    INSERT INTO #Tracks(TrackID,TrackTypeID,TrackTitle,Genre,[Description],Artistname,ProfilePath)

    SELECT P.ProductID,
    PA6.TrackTypeID,
    COALESCE(PA.Title,MST.Title + ' (' + MSA.[Name] + ')'),
    Genres.[Name],
    ISNULL(NULLIF(P.[Description],''),NULL) As [Description],
    AP.Artistname,
    AP.ProfilePath

    FROM Products P

    #######################

    LEFT JOIN
    (
    SELECT ProductID,
    AttributeValue 'Title'
    FROM ProductAttributes
    WHERE AttributeID = 24
    ) PA
    ON P.ProductID = PA.ProductID

    LEFT JOIN
    (
    SELECT ProductID,
    Cast(AttributeValue As Int) 'MainstreamTrackID'
    FROM ProductAttributes
    WHERE AttributeID = 34
    ) PA2
    ON P.ProductID = PA2.ProductID

    LEFT JOIN
    (
    SELECT ProductID,
    Cast(AttributeValue As Int) 'UserID'
    FROM ProductAttributes
    WHERE AttributeID = 31
    ) PA3
    ON P.ProductID = PA3.ProductID

    LEFT JOIN
    (
    SELECT ProductID,
    Cast(AttributeValue As Int) 'IsAuthorised'
    FROM ProductAttributes
    WHERE AttributeID = 56
    ) PA4
    ON P.ProductID = PA4.ProductID

    LEFT JOIN
    (
    SELECT ProductID,
    Cast(AttributeValue As Int) 'GenreID'
    FROM ProductAttributes
    WHERE AttributeID = 40
    ) PA5
    ON P.ProductID = PA5.ProductID

    LEFT JOIN
    (
    SELECT ProductID,
    Cast(AttributeValue As Int) 'TrackTypeID'
    FROM ProductAttributes
    WHERE AttributeID = 39
    ) PA6
    ON P.ProductID = PA6.ProductID

    LEFT JOIN
    (
    SELECT ProductID,
    CAST(AttributeValue As Int) 'IsDeleted'
    FROM ProductAttributes
    WHERE AttributeID = 67
    ) PA7
    ON P.ProductID = PA7.ProductID

    ########################

    JOIN ArtistProfiles AP ON
    AP.ArtistID = PA3.UserID

    JOIN Users ON
    Users.UserID = PA3.UserID

    JOIN Genres ON
    Genres.GenreID = PA5.GenreID

    LEFT JOIN MainstreamTracks MST ON
    MST.TrackID = PA2.MainstreamTrackID
    LEFT JOIN MainstreamArtists MSA ON
    MSA.ArtistID = MST.ArtistID

    WHERE (Users.IsActive = 1 OR AP.IsMember = 1)
    AND PA4.IsAuthorised = 1
    AND PA7.IsDeleted = 0
    AND P.CategoryID = 2
    AND ISNULL(NULLIF(AP.ProfilePath,''),NULL) IS NOT NULL
    AND ISNULL(NULLIF(AP.Artistname,''),NULL) IS NOT NULL
    AND ((PA5.GenreID = @intGenreID AND @intGenreID IS NOT NULL) OR @intGenreID IS NULL)

    ORDER BY P.ProductID DESC

    --then return the number of rows as specified by the variables passed into the SP - @intNoOfRows
    SELECT *
    FROM #Tracks
    WHERE #Tracks.[ID] <= @intNoOfRows

    DROP TABLE #Tracks

    --turn system messages back on
    SET NOCOUNT OFF

    END

    I might be wrong, but i dont see anything amazingly wrong with this proc, but it seems to read a lot of rows, and uses a lot of cpu....i had looked at replacing all the left joins between the ############# lines above with a singular join using "where attributeid in(24,31,etc)", however this returns multiple rows so is of no great benefit....

    Any ideas Satya,

    Thanks

    Al
  8. ranjitjain New Member

    Do run the particular SP frequently then i fell u can replace the
    temporary table into permanent table.
    And i think the time consuming process is joining the tables.
    Try the ITW on this and go for indexing on joined column.
  9. satya Moderator

    Exactly run the PROFILER while executing this SP/query and submit the trace to Index tuning wizard for recommendations on the indexes.

    http://www.sql-server-performance.com/rd_temp_tables.asp - about temp tables information.
    Use Table variables - If you've enough memory to support the SQL server, then the primary reason for using them is they reside in memory but if the data gets huge and memory cant accomodate it then it results in paging thereby speed suffers.


    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  10. mmarovic Active Member

    quote:Use Table variables - If you've enough memory to support the SQL server, then the primary reason for using them is they reside in memory but if the data gets huge and memory cant accomodate it then it results in paging thereby speed suffers.
    I thought so for a long time, but recently I saw an article declaring table variables are stored in tempdb too. I'm to busy right now to check, can someone help?
  11. higgsy New Member

    Hi,

    Ive just ran the procedure lots of times, and monitored it using profiler...some typical results are:

    CPU: 234
    Read: 24504
    Duration: 1250

    When i use the tuning wizard, it gives me no suggestions whatsoever...how can that be???

    Thanks

    al
  12. satya Moderator

  13. higgsy New Member

    Hi Mmarovic,

    Do i just clear the SP cache using sp_recompile 'objectname' ?

    Secondly, either im using the profiler wrong, or the index tuning wizard is nothing short of useless. Im running my queries in query analyzer whilst running a trace in profiler...i save the results to a .trc file and then load it into the index tuning wizard...all it does, bearing in mind my query uses 8 different tables is to suggest one column (which i already have as an index!!)

    Any ideas??

    Thanks

    Al
  14. mmarovic Active Member

    Thank you Satya. Here is the answer:
    quote:Q4: Are table variables memory-only structures that are assured better performance as compared to temporary or permanent tables, because they are maintained in a database that resides on the physical disk?

    A4: A table variable is not a memory-only structure. Because a table variable might hold more data than can fit in memory, it has to have a place on disk to store data. Table variables are created in the tempdb database similar to temporary tables. If memory is available, both table variables and temporary tables are created and processed while in memory (data cache).


    Hi higgsy.
    quote:Do i just clear the SP cache using sp_recompile 'objectname' ?
    That way you don't clear the sp cache, you just replace execution plan of this specific stored procedure with new one, which is what you need actually. dbcc dropCleanBuffers cleans entire sp cache. I haven't used ITW for a very long time. I prefer to tune indexes all by myself. That way I can discover other changes in table design that may be usefull.

    My approach would be to find the most used attributes and put them as a 'static column' in appropriate table. Call them predifined attributes. Leave in dynamic structure only really product specific attributes.

    Actually I wasn't in position to design database for that kind of requirements, so I can't really comment. I may change my opinion when I have to realy work on it.


  15. mmarovic Active Member

    I've just read Ben Willson's recommendation. I like that approach and I used it and it works well if there is a restricted number of types that can be used. It is actually database object inheritance. [<img src='/community/emoticons/emotion-1.gif' alt=':)' />] The problem appears when it is requiered that application administrator can define attributes users are interested in.
  16. mmarovic Active Member

    Try this version:
    CREATE PROCEDURE [dbo].[sp_GENERAL_GetLatestTracks]
    (
    @intNoOfRows Int,
    @intGenreID Int = NULL
    )
    AS

    BEGIN

    --turn system messages off
    SET NOCOUNT ON
    --restrict number of rows
    set rowCount @intGenreID

    SELECT
    P.ProductID,
    PA6.TrackTypeID,
    COALESCE(PA.Title,MST.Title + ' (' + MSA.[Name] + ')'),
    Genres.[Name],
    ISNULL(NULLIF(P.[Description],''),NULL) As [Description],
    AP.Artistname,
    AP.ProfilePath
    FROM Products P

    #######################

    LEFT JOIN
    (
    SELECT ProductID, AttributeValue 'Title'
    FROM ProductAttributes
    WHERE AttributeID = 24
    ) PA
    ON P.ProductID = PA.ProductID

    LEFT JOIN
    (
    SELECT ProductID, Cast(AttributeValue As Int) 'MainstreamTrackID'
    FROM ProductAttributes
    WHERE AttributeID = 34
    ) PA2
    ON P.ProductID = PA2.ProductID
    LEFT JOIN
    (
    SELECT ProductID, Cast(AttributeValue As Int) 'UserID'
    FROM ProductAttributes
    WHERE AttributeID = 31
    ) PA3
    ON P.ProductID = PA3.ProductID

    LEFT JOIN
    (
    SELECT ProductID, Cast(AttributeValue As Int) 'IsAuthorised'
    FROM ProductAttributes
    WHERE AttributeID = 56
    ) PA4
    ON P.ProductID = PA4.ProductID
    LEFT JOIN
    (
    SELECT ProductID, Cast(AttributeValue As Int) 'GenreID'
    FROM ProductAttributes
    WHERE AttributeID = 40
    ) PA5
    ON P.ProductID = PA5.ProductID

    LEFT JOIN
    (
    SELECT ProductID, Cast(AttributeValue As Int) 'TrackTypeID'
    FROM ProductAttributes
    WHERE AttributeID = 39
    ) PA6
    ON P.ProductID = PA6.ProductID
    LEFT JOIN
    (
    SELECT ProductID, CAST(AttributeValue As Int) 'IsDeleted'
    FROM ProductAttributes
    WHERE AttributeID = 67
    ) PA7
    ON P.ProductID = PA7.ProductID

    ########################

    JOIN ArtistProfiles AP ON AP.ArtistID = PA3.UserID
    JOIN Users ON Users.UserID = PA3.UserID
    JOIN Genres ON Genres.GenreID = PA5.GenreID
    LEFT JOIN MainstreamTracks MST ON MST.TrackID = PA2.MainstreamTrackID
    LEFT JOIN MainstreamArtists MSA ON MSA.ArtistID = MST.ArtistID
    WHERE
    (Users.IsActive = 1 OR AP.IsMember = 1)
    AND PA4.IsAuthorised = 1
    AND PA7.IsDeleted = 0
    AND P.CategoryID = 2
    --AND ISNULL(NULLIF(AP.ProfilePath,''),NULL) IS NOT NULL
    and ap.ProfilePath > ''
    --AND ISNULL(NULLIF(AP.Artistname,''),NULL) IS NOT NULL
    and ap.ArtistName > ''
    AND PA5.GenreID = isNull(@intGenreID, -555)
    ORDER BY P.ProductID DESC

    --turn system messages back on
    SET NOCOUNT OFF

    END
    You can also try normal left joins using different alias for each 'instance' of ProductAttributes table instead of using derived tables. I don't know if it helps, but you can try.
  17. mmarovic Active Member

    quote:My approach would be to find the most used attributes and put them as a 'static column' in appropriate table. Call them predifined attributes. Leave in dynamic structure only really product specific attributes.
    E.g. UserID, isDeleted and title should not be dynamic attributes. IsDeleted caries business logic, UserId too. I guess title is actually name and it is present for each product.
  18. higgsy New Member

    Hi everyone,

    sorry for the delay in responding. I just tried the same procedure, and added UserID to the products table and indexed it...it was about 200 times faster...so thats good news....there are a few confusing points though, that i need to get straight in my head before i could change the structure site wide...

    Firstly, mmarovic mentions to keep all the fields that have business logic outside of the attributes, i.e. static columns in the products table.....although this is not a problem, i will probably end up with 10-15 static columns as a lot of our different groups of products have fields that are relevant purely to that group that also have business logic....Whats the best way to get round this???

    Thanks

    Al
  19. mmarovic Active Member

    quote:Firstly, mmarovic mentions to keep all the fields that have business logic outside of the attributes, i.e. static columns in the products table.....although this is not a problem, i will probably end up with 10-15 static columns as a lot of our different groups of products have fields that are relevant purely to that group that also have business logic....Whats the best way to get round this???
    Ok, first 10-15 static columns should improve performance a lot and also simplify the code. If you have group specific columns involved in hard-coded business logic than there is no sense to have them as dynamic attributes as well. In that case I would use technique Ben Willson mentioned.
  20. higgsy New Member

    Hi Ben,

    I got an email saying you had replied to my post....but theres no reply here....whats going on with this forum, messages seem to just disapear!!!

    Al
  21. benwilson New Member

    case of the missing post...yes, i did indeed write a reply to Al's post, but it appears the gremlins in the system have destroyed it...heres hoping Al could get something useful from the email...
  22. satya Moderator

    Currently the website is under transmission to a new server and DNS servers are struggling to keepup the process. I believe Brad has fixed this but still you may get such issues and soon they will get resolved.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.

Share This Page