Learn more about how we de-identify data in our ONS guide to data de-identification video.

At the Office for National Statistics (ONS), we collect and process data to create statistics that help us understand the UK's economy, society and population.

It is important that when we use data, we do it in a way that respects the privacy of individuals and businesses.

One of the ways that we can reduce the risk of someone being identified is by “de-identifying” the data.

What does de-identifying data mean?

De-identifying data means removing any personal information that would allow you to be directly identified, such as your name, address or date of birth.

How does data de-identification work?

De-identification works by removing from the data any personal information that would allow someone to be identified. This includes names, addresses and dates of birth. If needed, we can also create a new reference number that relates to but does not identify an individual.

Researchers from the ONS who have been given permission to access the data can use the new reference number to analyse and link it with other sources of data. They can now do this without revealing an individual’s or business’s personal information.
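The steps above can be sketched in code. This is a minimal illustration, not the ONS's actual method: the field names, the use of a keyed hash (HMAC-SHA256) to derive the reference number, and the example key are all assumptions made for the example. It removes the direct identifiers from each record, adds a reference number, and then links two de-identified datasets on that reference.

```python
import hashlib
import hmac

# Hypothetical field names; the source does not specify a schema.
DIRECT_IDENTIFIERS = ("name", "address", "date_of_birth")


def make_reference(record, secret_key):
    """Derive a reference number from the direct identifiers with a keyed hash.

    Assumption: a keyed hash is used so that the same person gets the same
    reference in every dataset, while the reference itself reveals nothing.
    """
    normalised = "|".join(str(record[f]).strip().lower() for f in DIRECT_IDENTIFIERS)
    digest = hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def de_identify(record, secret_key):
    """Return a copy of the record with direct identifiers replaced by a reference."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["ref"] = make_reference(record, secret_key)
    return cleaned


def link(left, right):
    """Join two de-identified datasets on the shared reference number."""
    right_by_ref = {row["ref"]: row for row in right}
    return [
        {**row, **right_by_ref[row["ref"]]}
        for row in left
        if row["ref"] in right_by_ref
    ]


key = b"example-key-held-separately"  # illustrative only
survey = [{"name": "Ada Smith", "address": "1 High St",
           "date_of_birth": "1990-01-01", "employed": True}]
health = [{"name": "ada smith ", "address": "1 High St",
           "date_of_birth": "1990-01-01", "visits": 3}]

survey_deid = [de_identify(r, key) for r in survey]
health_deid = [de_identify(r, key) for r in health]
linked = link(survey_deid, health_deid)
```

In this sketch the linked record contains only the reference number and the analytical fields (`employed`, `visits`), so the two sources can be analysed together without names, addresses or dates of birth appearing in the result.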

Using de-identified data to create statistics

Before we can publish statistics, they must go through a process to remove anything that might identify someone. Controls are in place to make sure that data cannot be re-identified during this process.

It is our legal and ethical duty to do this.

After processing, we use statistical disclosure control methods to make sure no one can be identified in our statistics. These methods help to further reduce the risk of someone being identified, even after personal information has been removed.

Statistical disclosure control might involve:

  • grouping data together
  • removing elements of the data
  • changing the way numbers are presented, such as changing an actual count into a percentage
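As a rough illustration of the three techniques in the list above, the sketch below groups ages into bands, removes bands with very few people, and reports percentages instead of counts. The band width, the threshold of 3 and the function names are hypothetical choices for the example; in practice the right methods and thresholds depend on the data and the publication.

```python
from collections import Counter


def age_band(age, width=10):
    """Group an exact age into a band, for example 27 becomes '20-29'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def disclosure_controlled_table(ages, threshold=3, width=10):
    """Band the ages, drop small bands, and present shares as percentages."""
    # Grouping data together: count people per age band.
    counts = Counter(age_band(a, width) for a in ages)
    # Removing elements of the data: drop bands below the threshold.
    kept = {band: n for band, n in counts.items() if n >= threshold}
    # Changing the presentation: whole-number percentages of all records.
    total = len(ages)
    return {band: round(100 * n / total) for band, n in sorted(kept.items())}


ages = [21, 24, 27, 29, 33, 35, 36, 38, 39, 71]
table = disclosure_controlled_table(ages)
print(table)  # {'20-29': 40, '30-39': 50}
```

The single person aged 71 would stand out in a table of exact ages. After banding and suppression, that band no longer appears in the output.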

Alongside these methods, we will consider:

  • the rights of the people who shared their data
  • where the information will be published
  • how and why the data were collected
  • how the statistics help the public good

There will be many more factors, and the specific methods we use will be different in each case. We keep a log of all the decisions we make and the methods we use.

Learn more about our statistical disclosure control methods.

Why is data de-identification important?

Data de-identification protects people’s personal information. This allows researchers to use sensitive data safely and securely. By analysing it in greater detail, they can create statistics that give a more complete picture of society’s needs. By de-identifying data we:

  • protect people’s privacy
  • comply with legislation and security requirements
  • can securely link data with other data sources
  • can share analysis for better decision-making, while protecting people’s personal information

Our de-identification policy gives an outline of the principles we follow when we produce statistics.