SQL Server Performance

Remove duplicate values in a subset

Discussion in 'SQL Server 2008 General Developer Questions' started by Jelte1234, Mar 1, 2010.

  1. Jelte1234 New Member

    Hi!
    I don't know for sure if this is the correct forum for this question, but i'll try it.
    I keep getting a duplicate keys error when performing an insert into query. This is because the values in the column which has to be inserted in the primary key column contain duplicate values. What is the syntax to select only unique values out of a colomn?
    greetings jelte
  2. FrankKalis Moderator

    Welcome to the forum!
    Are you looking for something like:
    SELECT DISTINCT col FROM Table;
  3. Jelte1234 New Member

    I tried it. But is returns distinct values of a row. I need distinct values of one column, not the whole row. I also tried it with group by, but keep getting the error that the query does not include the specified expression 'Title' as part of an aggregate function.
    my query is:
    INSERT INTO Model ( model_id, model_name, L, H, W, V, TV, manufacturable )
    SELECT Order_Import.STL_nr, Order_Import.Title, Order_Import.length, Order_Import.height, Order_Import.width, Order_Import.volume, Order_Import.total_volume, "Unchecked" AS Expr1
    FROM Order_Import
    GROUP BY Order_Import.STL_nr;
    jelte
  4. FrankKalis Moderator

    Okay, can you show some sample data of what's in Order_Import and which of that data you'd want to insert into Model please?
  5. Jelte1234 New Member

    This is the query i'm using:
    INSERT INTO Model ( model_id, model_name, L, H, W, V, TV, manufacturable )
    SELECT DISTINCT Order_Import.STL_nr, Order_Import.Title, Order_Import.length, Order_Import.height, Order_Import.width, Order_Import.volume, Order_Import.total_volume, "Unchecked" AS Expr1
    FROM Order_Import
    WHERE NOT EXISTS (SELECT * FROM Model WHERE Model.model_id = Order_Import.STL_nr);
    STL_nr = text, example 74794s1_v1
    Title = text, example Chair
    length, height, width, volume, total_volume = number, example 3,23
    The problem is that it is possible that the same STL_nr hasn't has the same value is the other cells. Since model_id is a primairy key a error occurs in this case. I'm manipulating the db from an external program (Bonita Studio, business process modeler) and when an error occurs the query fails. Therefore I want to have only unique values in the selection of the column STL_nr.
  6. FrankKalis Moderator

    I don't know if your tool support common Table Expressions, but one way to solve this could be something like this. Untested...!!!
    ;WITH mycte
    AS
    (SELECT
    ROW_NUMBER() OVER(PARTITION BY STL_nr ORDER BY STL_nr) AS rn,
    Order_Import.STL_nr, Order_Import.Title, Order_Import.length, Order_Import.height, Order_Import.width, Order_Import.volume, Order_Import.total_volume, "Unchecked" AS Expr1
    FROM Order_Import
    )

    INSERT INTO Model ( model_id, model_name, L, H, W, V, TV, manufacturable )
    SELECT
    *
    FROM
    mycte
    WHERE
    rn = 1 AND NOT EXISTS
    (SELECT *
    FROM
    Model
    WHERE
    Model.model_id = mycte.STL_nr);
    If there is some logic as to what STL_nr row should be selected, you would need to change to PARTITION BY STL_nr ORDER BY STL_nr ORDER BY clause to your needs.
  7. Jelte1234 New Member

    I tried it but de syntax was not supported. I run the query in an access database (for now, this will be a sql db in the near future). The query is saved in the database and the saved query is called from the external progam (in doing this the limited functions of de external program are overcome). So do you know how to do this in acces sql?
  8. FrankKalis Moderator

    Sorry, no idea how this could work in Access SQL. [:(]
  9. Adriaan New Member

    If it is clear which of the columns is causing the duplicates, then decide which value you need to copy over from that column. Instead of the bare column name, you add an aggregate function around it, and remove the column from the GROUP BY clause.
    The aggregate functions you could use would be MIN(column), MAX(column) or AVG(column)which work fine for both alphanumeric and numeric data - except AVG().
    In Access, you can also use FIRST(column) and LAST(column) =,unless you select the option for SQL-92 queries only for the current MDB. ---This would be a good idea anyway, with a view to the future upgrade.
  10. paresh_sqldba New Member

    select re.* from
    (
    select row_number over(partition by id) as row_num, name
    from table1
    ) res
    where res.row_num = 1
    it will return only distinct records

Share This Page