Skip to main content

Dynamic Data Masking

By David Postlethwaite


Dynamic data masking is a really useful new tool in SQL Server 2016. Keeping data safe is one of the most important tasks that a company has; this goes further than just making sure that users can only see their own data and hackers can’t get access. One case that will be familiar to many is where a copy of production data is required by developers for testing or for production support issues. In this case it is often important that the data is obfuscated so that real personal data is protected. With large data sets running an obfuscation script can take a long time and hold up support or testing. What would be really useful is a way of masking the data without having to run these long obfuscation scripts. Introducing Dynamic Data Masking.

Microsoft has introduced a solution to this in SQL Server 2016 and Azure SQL database called Dynamic Data Masking. With DDM, whenever a SELECT statement is run a mask is placed over the data that is returned so that the original values are obfuscated from those who shouldn’t be allowed to see it.

DDM is very easy to configure using T-SQL or via the Azure portal. You can control which users and groups will be subject to the mask. Once enabled SQL will mask the data regardless of how it is viewed so you won’t get staff using SSMS as a backdoor to the real data. 

The great advantage of DDM is that you can build it into your production database but not actually enable it. When you copy the database to Production Support or Testing then it is simply a case of enabling masking and the data is instantly obfuscated.

In SQL 2016 there are four masking functions available. There has been no mention of this being expanded in SQL 2017. 

  • Default:   Replaces all characters with ‘XXXX’ and numbers with 0
  • Email: Replaces the part before the @ with the first letter and XXXXand puts a ‘’ at the end
  • Random: Replaces numbers with random values
  • Partial:  Here you can define the number of characters to display at the beginning and end and the masking characters to use in the middle
In Azure, the portal also offers a mask called “Credit Card” which is a prefilled partial function of partial(0, "xxxx-xxxx-xxxx-", 4).

You can create a table with masking functions using T-SQL

CREATE TABLE [dbo].[Customers](
[CustomerID] [int] NOT NULL,
[FirstName] [dbo].[Name] NOT NULL,
[LastName] [dbo].[Name] MASKED WITH (FUNCTION='partial(2,"XXXX",2)') NOT NULL,
[EmailAddress] [nvarchar](50) MASKED WITH (FUNCTION = 'email()') NULL,
[PhoneNumber] [nvarchar](25) MASKED WITH (FUNCTION = 'default()') NULL,
[TerritoryID] [int] NULL

Or you can alter a table to add masking functions
ALTER TABLE [dbo].[Customers]
ALTER TABLE [dbo].[Customers]
ALTER TABLE [dbo].[Customers]

Let’s insert some data for our example

--insert some data
INSERT INTO [Customers] VALUES('19169','Arianna','Bryant','816-14-3866','77778052342171','','725-555-0131',1)
INSERT INTO [Customers] VALUES('17423','Arianna','Alexander','513-30-2965','77776580079386','','140-555-0167',1)
INSERT INTO [Customers] VALUES('17722','Jerome','Serrano','319-51-1661','55559881820491','','1 (11) 500 555-0136',8)
INSERT INTO [Customers] VALUES('22728','Arianna','Russell','239-35-4068','33338016910017','','179-555-0197',6)
INSERT INTO [Customers] VALUES('29187','Jerome','Ramos','918-75-6173','11115373226233','','1 (11) 500 555-0162',9)
INSERT INTO [Customers] VALUES('14902','Arianna','Diaz','162-76-6285','33331128313730','','142-555-0189',4)
INSERT INTO [Customers] VALUES('21817','Jerome','Carlson','833-14-5928','11113048024782','','1 (11) 500 555-0113',10)
INSERT INTO [Customers] VALUES('25991','Arianna','Hayes','302-26-4124','33336197015476','','582-555-0144',6)

A user with db_owner permissions can view the unmasked data but a user who isn’t a db_owner will see masked data so it is important that your database security has been properly designed. I know from experience that many developers simply give everyone db_owner permissions to ensure that all procedures and functions work correctly. This will need to be changed to use DDM effectively.

Let’s create a user for this demonstration.
-- create user and give data reader role
ALTER ROLE [db_datareader] ADD MEMBER [MaskUser]

If the user tries to view the data it will be returned with the mask applied.

--run select as Maskuser - see the masked data
SELECT [CustomerID]
  FROM [dbo].[Customers]


You can remove the mask from a user with the GRANT UNMASK command

-- Granting users or roles the UNMASK permission will enable them to see unmasked data:

The select statement will now show the real data.

You can re-enable the mask with the REVOKE UNMASK

-- Revoking users or roles the UNMASK permission will force users to see masked data:

Masking Data in SQL Server 2016 is that simple. 
However, there are some drawbacks. 
  • The underlying data is still intact so if someone was able to take a copy of the database they would have access to all the personal data.
  • DBAs with their higher privileges will almost certainly be able to see the unmasked data.
  • As already mentioned, database security must be configured correctly to ensure DDM works for all users.
  • The “WHERE” and “ORDER BY” clauses use the underlying data not the masked data so clever users who can run adhoc queries could use this weakness to work out what the masked data actually is. For instance, if there was a salary field they could use an order by clause to see who is the highest and lowest paid staff. 
--clever users can work out the underlying data by using where clause
SELECT [CustomerID]
  FROM [dbo].[Customers]
  WHERE LastName = 'Bryant'

In the above statement the user has guessed a last name and this has returned a row even though the data returned has a mask on it.

However, if you insert the masked data to write data to a new table hoping to see the unmasked data you will be out of luck. SQL copies the masked data to a new table; you will not see the underlying data.

--if you write data to a new table you still get the masked data
SELECT CustomerId,Lastname, EmailAddress, PhoneNumber INTO #temp_table
FROM [dbo].[Customers] -- Masked Email and Phone

SELECT CustomerId,Lastname, EmailAddress, PhoneNumber FROM #temp_table

Azure SQL Database supports Dynamic Data Masking. You can configure it from the portal. In fact the portal will suggest columns for masking.

Dynamic Data Masking looks like a useful tool in the fight to protect your production data but you will have to use it wisely to ensure that users can’t use the limitations to get around it and access the underlying data. 
I haven’t seen any improvements mentioned in SQL 2017 but I am sure it will be improved at some point.


Popular posts from this blog

SQL Server 2012 and Virtual Service Accounts

This post is written by David Postlethwaite
If you are using SQL Server 2012 you will probably have noticed that the default account for the SQL services has changed from that used in previous versions. With SQL 2005 and 2008 the default account for SQL service and SQL Agent service was “NT Authority\System”. This is one the built in accounts on a Windows machine, managed by the machine and selectable from a dedicated dropdown list

The Network Service account was introduced in Windows 2003 as an alternative to using the LocalSystem account, which has full local system privileges on the local machine, a major security concern.
The Network Service has limited local privileges easing these security concerns but when many services on a machine use the Network Service account it becomes harder to track which service is actually accessing resources and performing actions, because all the services are using the one Network Service account.
Also, this account, by default, has sysadmin per…

Always Encrypted

By David Postlethwaite

Always Encrypted is new features in SQL Server 2016 and it is also available in Azure SQL Database. Here you can encrypt columns in a table with a master key and a certificate so that they will appear as encrypted strings to those who don’t have the required certificate installed on their pc.
Once the certificate is installed on the computer then the unencrypted data can then be seen as normal.

The data passes from database to your application as the encrypted value, only the application with the correct certificate can unencrypt the data so it is secure across the wire. This will go some way to resolving the concern of people worried about putting their sensitive data on a shared server in the cloud such as Microsoft Azure and accessing the data across the Internet.

At the time of writing Always Encrypted is only supported with ADO.NET 4.6, JDBC 6.0 and ODBC 13.1 but expect other driver to become available.

The calling application (including SSMS) must also hav…

New in SQL Server 2017: Graph Databases

David has recorded and published a video of his presentation on SQL Server Graph Database. In his video which you can watch below, David provides an excellent introduction into SQL Server 2017 Graph Databases. In his presentation he looks at Tennis results at tournaments for  his favourite player "The Fed"  Rodger Federer.

David  shows how to set up graph database and work with them in SQL Server 2017.

Graph Database is not new. Other vendors have had graph database capabilities for some time so Microsoft are quite late to the market. In David presentation it appears that Microsoft have done a reasonable job of implementing some of the graph database features but he does point some of the limitations of the Microsoft product too and suggests that it is not ready for production yet but Microsoft seem serious about this feature.

Please watch the video and feel free to leave a comment or feedback - David is delivering a version of this talk on Graph databases in SQL Saturday Ka…