SQL Server Performance

Triangle join - need tip and help for performance!

Discussion in 'SQL Server 2005 Performance Tuning for DBAs' started by infinity1975, Feb 5, 2009.

  1. infinity1975 New Member

    Hello. I have a table with ids that i want to build a "chain" like a triangle, or even more like a square.. Table looks something like this:

    id1 id2
    1 2
    1 3
    1 4
    2 3
    2 4
    3 1
    3 6
    3 4
    3 8
    4 1
    4 3

    SELECT DISTINCT id AS m_1.id1, m_2.id AS id2, m_3.id AS id3
    FROM t AS tbl1
    INNER JOIN t AS tbl2 ON tbl1.id2 = tbl2.id1
    INNER JOIN t AS tbl3 ON tbl2.id2 = tbl3.id1 AND tbl3.id2 = tbl1.id1

    The query should build a triangle with example that id (1) matches id (2) and id (2) matches id (4) and id (4) matches id (1)

    so the table consists of same foreign primary key to another table, and just want to build a chain of it.
    I have in the table already looked which id matches which id, but now i want to build a chain of it.

    Is there a better way than storing this in a table like that, the query will be run on a table that has about 5 million rows so its insanely time consuming when 100 people does it at the same time. I only have one server at the moment, but can expand to more if needed.
  2. preethi Member

    Hi,Do you know the set of rows already? What are the indexes you have on the table? Do you have any filtering method?
    Your method will scan the table three times for it may not be a good idea if your table is large. One method could be while storing the data, generate the possible triangles into another table so that you don’t have to search for it.
    Additionally, I assume that you want to get the connections only and not the order of connections. you need to remove some triangles which are duplicated simply because the starting point is different. from you example, I can get a triangle using rows 1,4 & 6. You can generate the same triangle when the order is different (4, 6, & 1, is the same as 1, 6 & 4 and 6, 1 & 4) Using DISTINCT will not help you at all. It does an additional sorting but will not remove any rows.
  3. infinity1975 New Member


    I have a unique index on both column1 and column2.
    What do you mean by filtering method ?

    The flow im using today is:

    1. Create the rows with id's matching one id to another. ie. 1 => 2, 3 => 1 etc

    2. I will do the query building the triangle

    3. I will save the triangle with sorted id's beginning with the lowest id in the first column in a new table called "Triangles"


    1 2 3
    x x x


    But, when i need to recalculate the values, its a resource consuming process, doing several steps.

    Do you have a smarter idea.. The Id's is something matching another id.. ie. 1 matching 2, 3 matching 1.

    Thanks for your reply, this is a really tricky problem.
  4. preethi Member

    [quote user="infinity1975"]I have a unique index on both column1 and column2.
    What do you mean by filtering method ?
    Do you want all the rows all the time, or do you filter them (Using WHERE clause)?
    [quote user="infinity1975"]But, when i need to recalculate the values, its a resource consuming process, doing several steps.
    Whenever I come across a problem like calculating values based on entire table, I think of storing them permemantly. When ever you do an insert, update or delete, find out the new triangles to be added and triangles to be removed and store them into a table. That means you will calculate the trangles related to the rows affected only and often it is a tiny subset when compared to the entire table.
  5. infinity1975 New Member

    Yes I use a filter with a where clause. The tables consist of ids from advertisers. I want to connect them together so they can switch the ads between them, in a chain like situation. Either 2 parts, 3 parts or even 4 parts.No mather what, the calculation of triangles can sometimes take up to 40-60 seconds. and I have tons of users doing this daily. Every new add does this, and we get around 100 new per day. the id id table consists of many millions of rows.
  6. preethi Member

    From your answers I fee that preparing the calculated values before hand and storing them in another table is the best solution.
  7. FrankKalis Moderator

    What are you going to model here?

Share This Page