The Problem with Data Masking

We often get asked, is there a difference between data masking and data encryption? The short answer: there is a HUGE difference.

Data masking (DM) or data anonymization is required by many compliance frameworks such as GDPR and ISO 27002:2022 (Control 8.11) and is recognized by Gartner as a growing category within data security technology. While it’s touted as a solution for protecting sensitive data, the reality is that most DM solutions only provide a band-aid for the gaping wound that is data theft and data ransom.

According to Gartner, DM is based on the premise that sensitive data can be transformed into less sensitive, but still useful, data. The market for data protection, DM included, continues to evolve with technologies designed to redact, anonymize, pseudonymize, or in some way deidentify sensitive data in order to protect it against confidentiality or privacy risk. Masking techniques have become almost commonplace in the clinical trial and research industries, which must maintain the privacy of the subjects in their studies.

Since the rise of COVID, the data masking market has kept growing: from a little over half a billion dollars in 2022, it is expected to reach over one billion dollars by 2028, according to an April 20, 2023 press release from Market Research Guru.

On paper, all of this is great for data masking vendors, but it has the potential to create even more complexity for end users. After all, multiple versions of the same sensitive data rolling around your network is exactly what data minimization requirements are trying to address. Rather than masking or anonymizing multiple versions of the sensitive data, organizations should focus on data encryption: fully encrypting their data in use, in transit, and at rest, within a single data set. Why simply mask the data when there are solutions that ensure data is always encrypted? This is the ultimate form of data security.

The Problem With Data Masking vs. Data Encryption

  • Also known as a form of data obfuscation, DM hides the actual data behind modified content, such as substitute characters or whole numbers. The main objective is to create an alternate version of the real data that cannot be easily identified or reverse engineered. Behind this new dataset there is usually unencrypted plaintext data that supports the newly created masked dataset.
  • DM applications only work with plaintext data. Therefore, the critical data is still fully exposed while in memory. Today’s threat environment is a wound that requires a lot more attention than a band-aid like a DM solution can provide.
  • The typical DM solution often requires development and database administration time, and it is ultimately just an application that sits in the path of data delivery.
  • Typically, DM sits just under the query application. In essence, the query pulls the data from plaintext memory (a plaintext-on-plaintext search); then, as the data is passed back to the query engine, it runs through the masking application. In this case, the presented data is masked, but the queried data is fully exposed plaintext.
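
To make the read path described in the bullets above concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the in-memory `PLAINTEXT_ROWS` list stands in for a database, and `mask_ssn` and `query_by_ssn` are made-up names, not the API of any particular masking product.

```python
import re

# Hypothetical plaintext store standing in for the database behind a masking layer.
PLAINTEXT_ROWS = [
    {"name": "Alice Example", "ssn": "123-45-6789"},
    {"name": "Bob Example", "ssn": "987-65-4321"},
]


def mask_ssn(ssn: str) -> str:
    """Replace every digit except the last four with 'X', keeping separators."""
    head, tail = ssn[:-4], ssn[-4:]
    return re.sub(r"\d", "X", head) + tail


def query_by_ssn(ssn: str) -> list:
    """Search the plaintext rows, then mask only the copy that is returned."""
    # The comparison itself runs on plaintext values held in memory.
    matches = [row for row in PLAINTEXT_ROWS if row["ssn"] == ssn]
    # Masking happens on the way out; the stored rows are left untouched.
    return [{**row, "ssn": mask_ssn(row["ssn"])} for row in matches]


print(mask_ssn("123-45-6789"))  # XXX-XX-6789
print(query_by_ssn("123-45-6789"))
```

The caller only ever sees the masked value, yet `PLAINTEXT_ROWS` still holds the real number after the query returns, which is the exposure the bullets describe.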

While some DM techniques may include data encryption, it’s most often Format Preserving Encryption (FPE), which maintains the format of the data as part of the encryption protocol. With FPE, a 9-digit social security number, for example, is encrypted into another 9-digit value. This form of data encryption is better than no encryption, but it can be cracked in minutes or even seconds by a threat actor. Many authorities on data encryption, such as NIST, recommend AES-256 as a true means to keep sensitive data secure.
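
For a rough sense of scale, the arithmetic below compares the number of possible 9-digit values with the size of an AES-256 key space. It is a back-of-the-envelope sketch only: it does not model any specific FPE algorithm or attack, and the guess rate is an assumed, hypothetical figure for an attacker who is somehow able to test candidate values.

```python
import math

# Every possible 9-digit value (000000000 through 999999999).
ssn_domain = 10 ** 9

# Number of possible 256-bit keys.
aes256_keyspace = 2 ** 256

# Size of the 9-digit domain expressed in bits (about 29.9).
ssn_bits = math.log2(ssn_domain)

# ASSUMPTION: a hypothetical attacker testing one million candidates per second.
guesses_per_second = 1_000_000
seconds_to_enumerate = ssn_domain / guesses_per_second

print(f"9-digit domain: {ssn_domain:,} values (~{ssn_bits:.1f} bits)")
print(f"AES-256 key space: 2^256 (~{float(aes256_keyspace):.2e} keys)")
print(f"Time to try every 9-digit value at the assumed rate: {seconds_to_enumerate / 60:.1f} minutes")
```

At the assumed rate, walking through every 9-digit value takes 1,000 seconds, or roughly 17 minutes; a faster attacker shortens that proportionally.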

Benefits of Encryption-in-Use over DM

  • Data privacy: Searchable encryption ensures that the contents of the data are always kept private, even from the server that stores the data.
  • Searchability: Searchable data encryption allows users to search for data without having to ever decrypt it first.
  • Flexibility: Searchable encryption can be used to store and search a variety of different types of data, including text, images, and videos.
  • Cost: For the cost of a Data Masking solution, Paperclip SAFE® provides full searchable data encryption. SAFE has access control so that the end user only sees the results of the search and only what they’re allowed to see.
  • Ease of Use: For the same, or less, effort that it takes to implement a Masking application, you can implement Paperclip SAFE full searchable data encryption. One SAFE solution can service many business applications, creating a sensitive data exchange. This eliminates the need to have the same sensitive data residing in multiple data silos servicing multiple business applications. This supports both data minimization and segmentation requirements.
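
To illustrate the general idea behind the Searchability point, here is a small sketch of one well-known approach to exact-match search over protected data: a keyed "blind index" built with HMAC-SHA256. This is an assumption chosen for illustration and is not a description of how Paperclip SAFE works. The names `INDEX_KEY`, `blind_index`, `store`, and `search` are hypothetical, and the ciphertext is an opaque placeholder where a real system would hold the encrypted record.

```python
import hashlib
import hmac
import secrets

# Hypothetical index key, assumed to be held by the client and never by the server.
INDEX_KEY = secrets.token_bytes(32)

# Hypothetical server-side store: search token -> opaque ciphertext.
server_store = {}


def blind_index(value: str, key: bytes = INDEX_KEY) -> str:
    """Derive a keyed search token; the server cannot invert it without the key."""
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def store(value: str, ciphertext: bytes) -> None:
    """File the ciphertext under the token, so the plaintext never reaches the store."""
    server_store[blind_index(value)] = ciphertext


def search(value: str):
    """Look up by token; the match is found without decrypting anything."""
    return server_store.get(blind_index(value))


# Placeholder bytes stand in for a record encrypted elsewhere (for example with AES-256).
store("123-45-6789", b"<opaque ciphertext>")
print(search("123-45-6789") is not None)
print(search("000-00-0000") is None)
```

This sketch only supports exact matches; richer queries need more elaborate schemes, which are beyond what is shown here.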

Security and encryption technology is constantly changing, and it can feel impossible to keep up. The simple answer is often the best answer: data that is always encrypted is the safest.

Let’s save the masks for Halloween.