
Finding Duplicates

This is just a short post that uses a contrived example to demonstrate how to find duplicate records in a table. I needed to identify some duplicate records for a supplier recently and I wanted to make some notes on what I did for future reference.


 

First I will create a very simple table for this example:


 

CREATE TABLE [dbo].[Dups]
(
    [FirstName] [nvarchar](50) NULL,
    [lastName] [nvarchar](50) NULL,
    [Company] [nvarchar](50) NULL
)
ON [PRIMARY]


 

I will then add some example data, including some duplicate rows:


 

INSERT INTO [C_AVG].[dbo].[Dups] ([FirstName], [lastName], [Company])
VALUES ('Gethyn', 'Ellis', 'GRE')


 


 

INSERT INTO [C_AVG].[dbo].[Dups] ([FirstName], [lastName], [Company])
VALUES ('Lisa', 'Ellis', 'GRE')


 

INSERT INTO [C_AVG].[dbo].[Dups] ([FirstName], [lastName], [Company])
VALUES ('Ron', 'Ellis', 'GRE')


 

INSERT INTO [C_AVG].[dbo].[Dups] ([FirstName], [lastName], [Company])
VALUES ('Lisa', 'Ellis', 'GRE')


 



 


 

When I run a very simple select against this table I get the following output:


 

Gethyn    Ellis    GRE
Lisa      Ellis    GRE
Ron       Ellis    GRE
Lisa      Ellis    GRE


 

As we can see, Lisa appears in this table twice. If the table had a couple of million rows and you suspected that it contained duplicates, spotting them by eye would be a lot more difficult. The following script will identify them for you:


 



 

SELECT FirstName, lastName, Company
FROM [dbo].[Dups]
GROUP BY FirstName, lastName, Company
HAVING COUNT(*) > 1


 

This returns all the duplicate entries:


 

Lisa    Ellis    GRE
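It can also be handy to see how many copies of each row exist. The following is only a sketch of one way to do that, reusing the same grouping and adding the count to the select list; the column alias Occurrences is a name I have made up for illustration.

```sql
-- Sketch: same grouping as above, but also report how many times
-- each duplicated combination appears. "Occurrences" is a made-up alias.
SELECT FirstName, lastName, Company, COUNT(*) AS Occurrences
FROM [dbo].[Dups]
GROUP BY FirstName, lastName, Company
HAVING COUNT(*) > 1
ORDER BY Occurrences DESC;
```

Because all three columns are nullable, it is worth knowing that GROUP BY puts NULLs in the same group, so rows that match on the non-NULL columns and are NULL in the same places will also be reported as duplicates.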


 

This only identifies the rows that exist more than once; cleaning up duplicates through deletion will be covered in another post.
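The GROUP BY query returns one row per duplicated combination. If you would rather list every individual copy, one possible approach (a sketch, not necessarily what I ran for the supplier) is a windowed count, which needs SQL Server 2005 or later. The alias names counted and Occurrences are invented for this example.

```sql
-- Sketch: return every physical row that belongs to a duplicated group,
-- rather than one row per group. Alias names are illustrative only.
SELECT FirstName, lastName, Company
FROM (
    SELECT FirstName, lastName, Company,
           COUNT(*) OVER (PARTITION BY FirstName, lastName, Company) AS Occurrences
    FROM [dbo].[Dups]
) AS counted
WHERE Occurrences > 1;
```

Against the example table this should list each Lisa row separately, while Gethyn and Ron are left out because they only appear once.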


 
