BankThink

Synthetic data is the real deal for payments and financial services

By Lorn Davis, October 19, 2020, 11:00 a.m. EDT

If the advent and eventual rise of Big Tech has taught the financial industry one thing, it’s that vital customer data equals dollar signs.

The catch is that the financial institutions that are stewards of arguably some of the richest customer data in existence remain mostly on the sidelines when it comes to cashing in. This is not without cause, as compliance and regulatory webs create burdens Big Tech doesn’t have to deal with. The U.S. Gramm-Leach-Bliley Act of 1999, the European Union’s General Data Protection Regulation (GDPR), and the more recent California Consumer Privacy Act (CCPA), among others, have stifled data sharing at many institutions even further.

In theory, banks and credit unions that have vast amounts of data should be able to unlock its value by pulling it out of their silos, then anonymizing and organizing it for both internal and external use, while still complying with privacy regulations. However, this would require a skillset few institutions have in-house.

This is where synthetic data comes into play. An institution’s use of synthetic data enables it to become a data-driven organization and benefit from internal and external monetization. Internal monetization can come in the form of using the data to improve or propel artificial intelligence and machine learning initiatives that help to build better products and deliver better customer experiences, while protecting the source data at hand. The external monetization of synthetic data represents an opportunity where reliable and legitimate companies will pay for such insights to inform investments, business strategies and even fiscal policy.

In short, synthetic data is defined as “microdata records created to improve data utility while preventing disclosure of confidential respondent information.” It’s a data type used by a variety of industries (health care, science, technology, financial services). For example, epidemiologists across the globe are using synthetic data from clinical health data aggregated in the U.S. to help develop vaccines in the fight against COVID-19.

In the context of banking, synthetic data represents a direct one-to-one transformation of the original data that reveals true insights into consumer spending. For example, 10 million credit and debit card transactions can be converted to 10 million synthesized transactions that are stripped of identifying data such as name, address and birth date. More data equals truer consumer spending insights.

To be clear, synthetic data differs from anonymized data, where only the personally identifiable information is removed and the real transaction data is left unchanged. With synthetic data, raw data is run through special algorithms and generators to create new data sets that cannot be traced back to the original consumer or transaction. However, this “fake” data set retains the accuracy and statistical properties of the original data set, making it ideal for creating a baseline for future studies or testing, modeling business opportunities, projecting trends and more.
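To make the distinction concrete, here is a minimal Python sketch. It is an illustration only: the field names, the sample records and the log-normal model are all assumptions made for this example, and production synthetic-data generators are considerably more sophisticated than this toy.

```python
import math
import random
import statistics

# Hypothetical field names treated as personally identifiable information.
PII_FIELDS = {"name", "address", "birth_date"}

def anonymize(records):
    """Drop the PII fields; the real transaction values are left unchanged."""
    return [{k: v for k, v in r.items() if k not in PII_FIELDS} for r in records]

def synthesize(records, seed=0):
    """Toy generator (an assumption, not any vendor's method): fit a
    log-normal distribution to the real amounts, then sample one new
    transaction per original record."""
    rng = random.Random(seed)
    log_amounts = [math.log(r["amount"]) for r in records]
    mu = statistics.mean(log_amounts)
    sigma = statistics.stdev(log_amounts)
    dates = [r["purchase_date"] for r in records]
    return [
        {
            "purchase_date": rng.choice(dates),
            "amount": round(rng.lognormvariate(mu, sigma), 2),
        }
        for _ in records
    ]

# Made-up sample records for illustration.
raw = [
    {"name": "A. Smith", "address": "1 Main St", "birth_date": "1980-01-01",
     "purchase_date": "2020-09-01", "amount": 12.50},
    {"name": "B. Jones", "address": "2 Oak Ave", "birth_date": "1975-05-12",
     "purchase_date": "2020-09-02", "amount": 48.20},
    {"name": "C. Lee", "address": "3 Pine Rd", "birth_date": "1990-11-30",
     "purchase_date": "2020-09-02", "amount": 7.95},
    {"name": "D. Patel", "address": "4 Elm Ct", "birth_date": "1968-03-22",
     "purchase_date": "2020-09-03", "amount": 130.00},
]

anonymized = anonymize(raw)
synthetic = synthesize(raw, seed=42)
```

In this sketch, `anonymized` still contains the four real amounts, while `synthetic` contains the same number of records with newly sampled amounts that follow the fitted distribution.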

As for synthetic data itself, the fact that it’s both anonymized and statistically accurate offers three key benefits for organizations (whether financial, retail, government, etc.).

Safe and secure data monetization. Safely monetize data assets by delivering the “truth” of data insights without putting privacy at risk. For example, synthetic data strips away identifying personal information such as a consumer’s name, address and birth date, while revealing only the purchase date and transaction amount.

Privacy data protection. Safeguard personal data by reducing the degree to which authentic data needs to be shared internally or disclosed externally. Raw data never leaves your firewall.

Regulation compliance. Use of synthetic data can help organizations manage their privacy and data security-related legal obligations (e.g., CCPA, GDPR). This is becoming increasingly important as lawmakers turn up the heat on these issues.

Synthetic data helps financial institutions in a number of ways. One example is the ability to create or modify card products that become “top of wallet” for consumers.

Because synthetic data is a direct one-to-one transformation of the original data, it gives a clearer view of what consumers are spending money on. For example, if 10 million transactions show high spending in multiple categories (say ride-share services, pharmacies, casual dining), marketers can incentivize their customers to use their cards with bonus points at certain merchants. That card suddenly moves to top of wallet for the consumer.
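As a rough illustration of that kind of analysis, the following Python sketch totals spending by merchant category and picks the top categories as candidates for bonus points. The category labels, amounts and function name are hypothetical and chosen only for this example.

```python
from collections import defaultdict

def top_spend_categories(transactions, n=3):
    """Return the n merchant categories with the highest total spend,
    as (category, total) pairs sorted from highest to lowest."""
    totals = defaultdict(float)
    for t in transactions:
        totals[t["category"]] += t["amount"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Hypothetical synthesized transactions (made-up values).
synthetic_transactions = [
    {"category": "ride-share", "amount": 18.40},
    {"category": "pharmacy", "amount": 32.10},
    {"category": "casual dining", "amount": 54.75},
    {"category": "ride-share", "amount": 22.00},
    {"category": "grocery", "amount": 12.00},
    {"category": "casual dining", "amount": 41.25},
]

# Categories a marketer might consider for bonus points.
bonus_categories = [c for c, _ in top_spend_categories(synthetic_transactions, n=3)]
```

Because the input is synthetic, an analysis like this could be run or shared without exposing any real cardholder's records.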

As a bonus, financial institutions can have access to this type of data on a rolling basis, enabling them to make decisions on the fly based on current consumer spending, which is vital, especially in the pandemic era.

There’s no doubting the potential. The global data monetization market is projected to reach $370 billion by 2023, growing at a compound annual growth rate (CAGR) of 35.4%, according to Allied Market Research.

If the banking industry is to come to terms with the prevailing logic that we’re now operating in the data economy, the value of synthetic data cannot be overstated. The likes of Amazon, Apple, Google and the rest of Big Tech have been able to harness their own customer data in a profitable way, and are using synthetic data for product development and innovation. Apple, Facebook, Google and others are increasingly developing products in the realm of financial services. Banks and credit unions have an opportunity to turn the tables with synthetic data and become data powerhouses for the benefit of their customers.

Lorn Davis

Corporate and product strategy leader, Facteus