Friday, 18 June 2010

Finding Duplicates

This is just a short post that uses a contrived example to demonstrate how to find duplicate records in a table. I needed to identify some duplicate records for a supplier recently and I wanted to make some notes on what I did for future reference.


 

First I will create a very simple table for this example


 

CREATE
TABLE [dbo].[Dups]


(

[FirstName] [nvarchar](50)
NULL,

[lastName] [nvarchar](50)
NULL,

[Company] [nvarchar](50)
NULL


)

ON [PRIMARY]


 

I will then add some example data, including some duplicate rows:


 

INSERT
INTO [C_AVG].[dbo].[Dups]


(

[FirstName],

[lastName],

[Company]


)

VALUES (


'Gethyn',


'Ellis',


'GRE'


)


 


 

INSERT
INTO [C_AVG].[dbo].[Dups]


(

[FirstName],

[lastName],

[Company]


)

VALUES (


'Lisa',


'Ellis',


'GRE'


)


 

INSERT
INTO [C_AVG].[dbo].[Dups]


(

[FirstName],

[lastName],

[Company]


)

VALUES (


'Ron',


'Ellis',


'GRE'


)


 

INSERT
INTO [C_AVG].[dbo].[Dups]


(

[FirstName],

[lastName],

[Company]


)

VALUES (


'Lisa',


'Ellis',


'GRE'


)


 

INSERT
INTO [C_AVG].[dbo].[Dups]


(

[FirstName],

[lastName],

[Company]


)

VALUES (


'Lisa',


'Ellis',


'GRE'


)


 


 

When I run a very simple select against this table I get the following output:


 

Gethyn    Ellis    GRE

Lisa        Ellis    GRE

Ron        Ellis    GRE

Lisa        Ellis    GRE


 

as we can see, Lisa is included in this table twice this but if you had a table with a couple million rows in this table and you suspected that it had duplicates spotting the duplicates maybe a little more difficult the following script will identify them for you:


 

This shows that


 

SELECT FirstName, lastName, Company FROM dups

GROUP
BY FirstName, lastName, Company

HAVING (COUNT
(*)
> 1)


 

This returns all the duplicate entries:


 

Lisa    Ellis    GRE


 

This only identifies the rows that exist more than once, cleaning up duplicates through deletion will be covered in another post.


 

No comments:

Post a Comment

Featured post

OUT OF SUPPORT: SQL Server 2008 and SQL Server 2008 R2

Welcome to this post, on SQL Server 2008 and 2008 R2, both of these versions of SQL server will go out of  extended support with Microsoft t...