Write a query to delete duplicate records in sql server

id name Age
1 A 25
2 B 26
3 C 24
2 B 26

 Create Table t1
   id int,
   name char(10),
   age int

 insert into t1
  select 1, 'A', 25 union all
  select 2, 'B', 26 union all
  select 3, 'C', 24 union all
  select 2, 'B', 26

First, run the above Group By query to determine how many sets of duplicate Pk { Primary Key } values exists, and the count of duplicates for each set.

Select the duplicate key values into a holding table. For example :

 Select id, name, count = count(*) 
  into holdkey
  from t1
  Group by id, name 
  Having count(*) > 1

Select the duplicate rows into a holding table, eliminating duplicates in the process. For example :

 Select Distinct t1.*
        into holddups
        from t1, holdkey
        Where t1.id = holdkey.id and t1.name = holdkey.name

At this point, the holddups table should have unique Pks , however, this will not be the case if t1 had duplicate Pks , yet unique rows (as in the SSN example above.) Verify that each key in holddups is unique, and that you do not have duplicate keys, yet unique rows. If so, you must stop here and reconcile which of the rows you wish to keep for a given duplicate key value. For example, the query

 Select id, name, count(*)
 From holddups
 Group by id, name

Delete the duplicate rows from the original table. For example

 Delete t1
 From t1, holdkey
 Where t1.id = holdkey.id

Post a Comment