Code names and flowers: Rabobank's novel approach to protecting customer data
Europe's new data privacy rules have forced banks to get creative to protect sensitive data from inappropriate access or breaches.
Take Dutch bank Rabobank, for example, which now converts customer data to the Latin names of flowers and animals to comply with the General Data Protection Regulation's requirement that sensitive client information be disguised.
“If you want to use client data, you need to pseudonymize it or encrypt it,” said Peter Claassen, delivery manager radical automation at the bank. “Otherwise you can see data that you’re not allowed to see or it can leak and then you have even bigger problems.”
So customer Willem Degreef might be listed as Papaver Orientale, the Latin name for the Oriental poppy, for instance.
How Rabobank is managing its compliance with the rules, in addition to California's new Consumer Privacy Act and the New York State Department of Financial Services' cybersecurity regulations, provides lessons for other institutions facing the same enhanced standards.
The need to protect data
At the heart of all these regulations is the mandate that companies must make sure no one can access customer data who shouldn’t, and that every effort is made to protect that data from breaches. Storing customer data in the clear — not encrypted, anonymized, or pseudonymized — is not acceptable, to regulators or anyone else.
GDPR, which took effect May 25 and affects any business whose services are accessed by people in Europe, calls for “data protection by design.” This means, “the use of pseudonymisation (replacing personally identifiable material with artificial identifiers) and encryption (encoding messages so only those authorised can read them),” according to the official website.
The next phase of the New York agency's cybersecurity regulation, which will take effect Sept. 1, will require banks to either encrypt sensitive data or provide compensating controls to protect that data (as well as provide data audit trails).
The California Consumer Privacy Act says businesses that suffer a security breach involving consumers’ personal information will be held liable “if the business has failed to implement and maintain reasonable security procedures and practices, appropriate to the nature of the information, to protect the personal information from unauthorized disclosure.”
The use of pseudonyms at Rabobank
To protect its customers’ data, Rabobank has cryptographically converted terabytes of its most sensitive client data, including names, birth dates and account numbers, into a “desensitized representation” — meaning, it looks and behaves like the real data, but it's not.
The reason to use pseudonyms, such as the Latin names of plants, rather than encrypting or anonymizing the data, is that pseudonymized data can still be used for app testing and analytics.
Claassen’s group has been using the pseudonymized data to test the performance of apps that need to respond to requests in milliseconds.
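To make the idea concrete, here is a minimal sketch of one way a deterministic pseudonym could be derived. It is not Rabobank's or IBM's actual method, which the article does not describe. The secret key, the five-entry plant list, the hex suffix and the function name `pseudonymize` are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key; a real system would keep this in a key vault.
SECRET_KEY = b"demo-key-not-for-production"

# Tiny illustrative list; a real deployment would need far more names.
PLANT_NAMES = [
    "Papaver orientale",
    "Bellis perennis",
    "Tulipa gesneriana",
    "Rosa canina",
    "Quercus robur",
]


def pseudonymize(value: str) -> str:
    """Map a real value to a plant name plus a short suffix.

    The same input and key always give the same pseudonym, so records
    stay consistent, but the real value cannot be read off the output.
    """
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(PLANT_NAMES)
    # The suffix makes collisions unlikely with such a short list;
    # it reduces them but does not rule them out.
    suffix = digest[8:12].hex()
    return f"{PLANT_NAMES[index]} {suffix}"


print(pseudonymize("Willem Degreef"))
```

Because the mapping is keyed and repeatable, a test environment would see the same stand-in name wherever the same customer appears, which is the property that keeps the data usable for testing.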
“To test whether that works, we need to have an environment just like production, but we’re not allowed [under GDPR] to use production data,” Claassen said. “So we created an environment to do this performance testing and use pseudonymized data to simulate the performance of production in our test environment.”
Why Latin flower and animal names?
“We needed to replace the names with something else,” Claassen said. “How do we make clear to the outside world that it’s not a real person or a real name? That’s why we use Latin plant names, because everyone can see it’s not a real name. The same for the address: The use of a Latin animal name shows it’s not a real address.”
The other reason is, there are a lot of Latin plant and animal names, enough to cover all of Rabobank’s 8 million customers and then some.
The hardest part of replacing customer data with pseudonyms, according to Claassen, is pre-installation: deciding who needs to have access to what kind of data and identifying what is personally identifiable data that must be protected.
“You need application specialists to specify, this is data that needs to be pseudonymized and this data can go through untouched,” he said. “That whole process takes time when implementing it.”
The bank plans to expand the use of pseudonyms to other areas, like analytics, Claassen said.
Pseudonymize or anonymize?
Pseudonyms, encryption and anonymization are all legitimate approaches to protecting customer data under GDPR, according to Richard Parry, consultant and former risk executive at JPMorgan Chase and Citigroup.
“Anything that gets around identity en clair is good,” he said. “Different vendors may prefer one approach over another because they want to incorporate other proprietary approaches to secure their niche in an increasingly crowded market.”
Jennifer Everett, associate at Jones Day, agreed that pseudonymization can be a useful tool for organizations and minimizes the risks of data breaches.
“We envision banking institutions and companies in other industries will use pseudonymization for this reason, and indeed it is encouraged under the GDPR,” she said. “It is, however, important to still keep in mind that pseudonymized data remains subject to the GDPR, unlike anonymized data.”
Encryption also has its uses in data protection, Everett said.
“It can serve as a safe harbor for companies who experience a personal data breach and obviate them from having to notify supervisory authorities and data subjects, assuming the encryption key is not also compromised,” she said.
Michael Osborne, manager of the security and privacy group at IBM Research, which developed the technology Rabobank is using, argued that pseudonymization lets companies retain all of the data's usefulness. (Other providers of this type of technology include Protegrity and Delphix.) Pseudonyms also don’t require system changes, according to Osborne.
“If you start with pseudonymization, you can retain 100% of the data utility,” Osborne said. “You just can’t identify someone directly from that data set. Pseudonymization is what the card industry has been doing for years under PCI, it’s called tokenization.”
Anonymization, on the other hand, is difficult, Osborne said.
“It destroys data unless you do it very carefully,” he said.
When data is anonymized, there’s no linkability, so no analytics can be run on it; trends and patterns can’t be identified, Osborne said. And sometimes with anonymization, noise is added.
“The stronger the anonymization, the more the data is changed,” Osborne said. “The more anonymization you apply, the less analytic value or utility that data has.”
Rabobank chose pseudonymization and not anonymization so it could use customer data in application testing.
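The trade-off Osborne describes can be seen in a small sketch. It assumes, purely for illustration, that anonymization is done by adding Gaussian noise to numeric values. The balances, the noise scales and the helper names `add_noise` and `mean_abs_error` are made up and do not come from any vendor's product.

```python
import random


def add_noise(values, scale, seed=0):
    """Return a copy of values with Gaussian noise of the given scale added."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [v + rng.gauss(0, scale) for v in values]


def mean_abs_error(original, noisy):
    """Average distance between each original value and its noisy copy."""
    return sum(abs(o - n) for o, n in zip(original, noisy)) / len(original)


balances = [1200.0, 560.0, 98000.0, 43.5, 7810.0]  # made-up account balances

# More noise hides individual values better but distorts the data more.
for scale in (1, 100, 10000):
    noisy = add_noise(balances, scale)
    print(scale, round(mean_abs_error(balances, noisy), 2))
```

The printed error grows with the noise scale, which is the loss of analytic value the quote refers to.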
“Then it’s usable not only in one application but in chained applications,” Claassen said. “With anonymization we wouldn’t be able to do that, because the data would be too scrambled. It loses its integrity; it loses its characteristics, so we can’t do our testing anymore.”
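A rough sketch of why consistent pseudonyms keep data usable across chained applications: if two systems replace the customer identifier with the same keyed token, their records still line up. The key, the sample records and the helpers `token` and `mask` are hypothetical and are not taken from the bank's setup.

```python
import hashlib
import hmac

KEY = b"shared-demo-key"  # hypothetical; both applications must share it


def token(value: str) -> str:
    """Deterministic stand-in for a customer identifier."""
    return hmac.new(KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def mask(rows):
    """Replace the 'customer' field in every row with its token."""
    return [{**row, "customer": token(row["customer"])} for row in rows]


# Made-up records held by two separate applications.
accounts = [{"customer": "Willem Degreef", "account_type": "savings"}]
payments = [{"customer": "Willem Degreef", "amount": 25.0}]

masked_accounts = mask(accounts)
masked_payments = mask(payments)

# The join still works because both sides carry the same token.
by_customer = {row["customer"]: row for row in masked_accounts}
joined = [
    {**by_customer[p["customer"]], **p}
    for p in masked_payments
    if p["customer"] in by_customer
]
print(joined)
```

With anonymized data there would be no shared token to join on, so a downstream application could not match a payment to its account.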
Richard Hogg, global GDPR evangelist at IBM (his business card has a drawing of a crown on it and says, “KEEP CALM #GDPR is live”), pointed out that none of these options is a silver bullet.
“It would be ideal if you could just encrypt everything and then know we could always control who has access to it when,” Hogg said. “Even if you put encryption in place as a base level of data protection internally, you could still have rogue IT employees with the keys to the safe effectively, who could still get data. Encryption is important to do as a first step, but it’s not a panacea. Knowing and managing the data while it’s in your custody and control is also important.”
Editor at Large Penny Crosman welcomes feedback at email@example.com.