Small banks need to plug the data gap


Banking, along with many other industries, is racing toward a future driven by artificial intelligence. AI-related technologies are projected to save banks billions annually within the next half decade. Those that best harness these technologies and their benefits will gain a competitive advantage.

That should worry community and regional banks. Machine learning and deep learning solutions, which have driven many of the technological advances in AI in recent years, require large volumes of data to train and fine-tune before they are ready for market. In the age of AI-driven banking, this will give a significant advantage to the largest institutions, which have the most users and therefore the most data. Megabanks will be able to deploy more AI-based services than smaller competitors, and with more training data their solutions will be more effective and accurate.

To compete, community and regional banks will need to access greater volumes of data. Megabanks can simply harvest the data their customers provide by using their products and services, but smaller banks will need to get more creative. This will require exploring third-party data sources and leveraging data-sharing agreements, as well as considering new technologies that can alleviate this problem for smaller institutions.

Of course, many financial institutions have been purchasing data from third parties for years, but the increasing importance of AI technologies will likely drive renewed focus on third-party data. Purchasing data from data brokers or other entities can help smaller banks overcome their data scarcity and attain new insights.

For instance, Wescom Credit Union combined historical customer data with data from third-party sources to build individualized profiles for each of its members. Those profiles are then used to personalize timely marketing offers and recommendations based on recent activity.

Banks looking to acquire more external data can also turn to new sources. For instance, online data marketplaces are proliferating and growing more diverse in the types of data and sources they offer. Gartner has predicted that 25% of large enterprises across industries will be buying and selling data on such marketplaces by next year. Smaller banks should try to get in on the action too, to level the playing field.

Data sharing among organizations in the financial space is also likely to become more common, as banks and others look to increase their access to new data and insights. Large financial institutions and fintechs have already formed a new group to develop standards for sharing anonymized data among different industry participants. Smaller banks should investigate how they can leverage such data-sharing arrangements. Additionally, smaller banks can explore one-to-one data-sharing agreements with other firms, such as fintech providers. Large banks, including Wells Fargo and Chase, have been forming such partnerships to improve their customer experiences, but smaller banks may use them to gain access to anonymized data that can complement their own data sets.

Another path smaller banks can explore is synthetic data generation technology. This technology takes existing data sets and then uses algorithms to produce similar “synthetic” data that can stand in for real-world data.
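To make the idea concrete, here is a minimal sketch of one very simple approach: fit a statistical model to a small real data set, then sample new records from the fitted model. Everything in it is an assumption made for illustration. The two columns (monthly income and card spending), the eight hand-written records, and the function names are hypothetical, and a two-variable Gaussian model is far simpler than the generators used in practice.

```python
import math
import random
import statistics


def fit_gaussian(rows):
    """Estimate means, standard deviations, and correlation of two numeric columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    # Sample covariance, then normalize to get the correlation coefficient.
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return {"mx": mx, "my": my, "sx": sx, "sy": sy, "rho": cov / (sx * sy)}


def sample_synthetic(params, n, seed=0):
    """Draw n synthetic (x, y) records that follow the fitted statistics."""
    rng = random.Random(seed)
    rho = params["rho"]
    # Guard against tiny negative values caused by floating-point rounding.
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    records = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = params["mx"] + params["sx"] * z1
        # Mixing z1 into y reproduces the correlation between the two columns.
        y = params["my"] + params["sy"] * (rho * z1 + residual * z2)
        records.append((x, y))
    return records


# Hypothetical "real" records: (monthly income, monthly card spending).
real = [(3200, 410), (4100, 520), (2800, 300), (5200, 700),
        (3900, 480), (6100, 820), (4500, 560), (3000, 350)]

params = fit_gaussian(real)
synthetic = sample_synthetic(params, 5000, seed=42)
synthetic_params = fit_gaussian(synthetic)
```

None of the 5,000 synthetic records corresponds to an actual customer, yet taken together they follow roughly the same averages, spread, and income-to-spending relationship as the original eight. A real project would need a richer model and a check that the generator has not simply memorized individual source records.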

Although synthetic data generation is still relatively new, early results have been promising. A research paper from MIT’s Institute for Data, Systems, and Society involving predictive modeling experiments found that synthetic data delivered similar results to authentic data 70% of the time. Using synthetic data for developing AI models can also provide unique benefits. Since synthetic data isn’t generated by any real-world users, companies don’t have to worry about the data privacy concerns that come with using real-world data.

Although synthetic data generation is still in the experimental phase in banking, researchers and organizations are already starting to consider the implications of using synthetic data to build or enhance models for credit scoring and anti-money-laundering detection. Community and regional banks should monitor developments and consider where synthetic data could provide value.

Smaller banks that have not already done so should develop AI strategies that address how they will leverage AI technologies in a targeted manner to deliver new value to their customers and gain new efficiencies. To implement such a strategy, they will need to determine what data assets they need to train their AI-driven solutions. Once in-house data assets are assessed and inventoried, banks should explore what outside sources could supplement them. Finally, banks should earmark applications where present data volumes are insufficient as areas for potential exploration with synthetic data.
