BankThink

Make sure math justifies machine learning-based lending decisions

Machine learning models hold great promise in helping expand credit access to those who are unfairly denied, especially those with thin credit files. But they also pose implementation pitfalls for lenders.

If lenders fail to simultaneously adopt valid, math-based explainable technologies, the models run the risk of violating the law. And regulators may shoot the technology down before it gets off the ground.

When consumers are denied credit by ML models, the law (quite properly) requires lenders to tell consumers why. This helps consumers know what to do to improve their chances of getting approved next time. It also gives regulators confidence that lenders are making fair, business-justified lending decisions.

But do lenders truly know why their lending models make the decisions that they do?

Most lenders rely on underwriting algorithms to make these decisions for them — algorithms that rely on hundreds or thousands of variables. Which variables principally caused the lender to deny the loan? Was it variable 127 or variable 18?

That question gets even harder when credit decisions are made using ML models, which make more accurate predictions and decisions based on countless interactions among all those variables.

Trouble comes, however, when lenders try to explain their lending decisions and identify principal denial reasons using math designed for simpler, antiquated models. You can’t use old tools to explain ML models if you want to get the right answer every time, as the law requires.

Yet today most lenders use one of two seemingly reasonable methods to identify principal denial reasons: “drop one” and its cousin, “impute median.”

With drop one, lenders test which model variables contribute most to the model score by removing one variable from the model and measuring the change in the score to quantify the influence of the removed variable. With impute median, lenders do the same thing, but instead of dropping a variable, they replace each variable, one at a time, with the median value of that variable in the dataset.
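To make the mechanics concrete, here is a minimal Python sketch of both perturbation techniques. The scoring function, feature values and training data are hypothetical stand-ins, and in practice "drop one" can also mean refitting the model without the variable rather than neutralizing it as shown here.

```python
import numpy as np

def drop_one_reasons(score, x, baseline=0.0):
    # "Drop one" approximation: neutralize one feature at a time and
    # measure how much the applicant's score changes.
    # score: callable mapping a 1-D feature vector to a credit score (hypothetical).
    # x: the denied applicant's feature vector.
    full = score(x)
    deltas = np.empty(len(x))
    for j in range(len(x)):
        x_mod = x.copy()
        x_mod[j] = baseline            # stand-in for removing feature j from the model
        deltas[j] = full - score(x_mod)
    return deltas

def impute_median_reasons(score, x, X_train):
    # "Impute median": replace each feature, one at a time, with its
    # median value in the training data and measure the score change.
    medians = np.median(X_train, axis=0)
    full = score(x)
    deltas = np.empty(len(x))
    for j in range(len(x)):
        x_mod = x.copy()
        x_mod[j] = medians[j]
        deltas[j] = full - score(x_mod)
    return deltas
```

Whichever variables show the largest score change would then be reported as the principal denial reasons.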

These methods sound reasonable but in practice, they are often inaccurate. That’s because once you change the data that the model considers, you have moved from the real world into a hypothetical one. You end up trying to explain situations that would never happen in the real world.

These techniques also fail to account for the fact that variables interact, are not always independent and, in ML models, may point in different directions.

A better method is rooted in game theory developed in the 1950s by Nobel laureate Lloyd Shapley and his peers. Their approach was originally devised to explain how players in games like basketball each contributed to the final score.

It turns out that in ML models, variables act a lot like basketball players, making Shapley's methods perfect for explaining how the models operate and for accurately identifying the principal denial reasons every time.
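For readers who want the math, here is a minimal Python sketch of the exact Shapley formula. The value function is a hypothetical helper that returns the model's expected score when only a given subset of variables is treated as "present," with the rest averaged over a background sample of applicants; computing that efficiently is where practical tools differ.

```python
from itertools import combinations
from math import factorial
import numpy as np

def shapley_values(value_fn, n_features):
    # Exact Shapley attribution for a small number of features.
    # value_fn(S) -> expected model score when only the features in set S
    # are "present" (hypothetical helper; absent features are averaged over
    # a background sample of applicants).
    phi = np.zeros(n_features)
    for j in range(n_features):
        others = [k for k in range(n_features) if k != j]
        for size in range(len(others) + 1):
            for subset in combinations(others, size):
                S = frozenset(subset)
                # Shapley weight: |S|! * (n - |S| - 1)! / n!
                weight = (factorial(len(S)) * factorial(n_features - len(S) - 1)
                          / factorial(n_features))
                # Marginal contribution of feature j when it joins coalition S.
                phi[j] += weight * (value_fn(S | {j}) - value_fn(S))
    return phi
```

Because the exact formula sums over every subset of variables, production systems rely on model-specific shortcuts, but the resulting contributions can be sorted to identify which variables pushed an applicant's score toward denial.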

For example, this approach was tested in October during the Consumer Financial Protection Bureau's virtual tech sprint on improving adverse action notices. During the sprint, Zest AI and our partners from First National Bank of Omaha, WebBank and Citizens Bank demonstrated that the drop-one method produced the wrong denial reason every time, while Shapley-based methods got it right.

Looking ahead, it would help lenders if the CFPB updated or expanded its guidance to accept more of these explainability methods, which aim to give consumers more accurate information about why they were denied. Accordingly, the CFPB should revise its guidance to allow more mathematically justified methods for generating adverse action notices.

Such methodologies and technologies are available and used by lenders today. But it would provide clarity to the industry if the CFPB fostered their adoption so that consumers can get the precise information they need to build credit.
