Podcast

'Don't fall for the sales pitches': Advice on deploying AI

Eric Siegel, author, The AI Playbook

Transcription:
Transcripts are generated using a combination of speech recognition software and human transcribers, and may contain errors. Please check the corresponding audio for the authoritative record.

Penny Crosman (00:04):
Welcome to the American Banker Podcast. I'm Penny Crosman. Most banks are piloting and testing various forms of AI, including machine learning, deep learning and generative AI. But what does it take to get from an AI use case or idea to successful deployment and tangible results? Eric Siegel is a former college professor and an expert on predictive analytics and AI, so much so that he created a music video on predictive analytics. He has just written a book called The AI Playbook. He's here with us today, and he's going to share some of his thoughts on how to get practical results from advanced AI. Welcome, Eric.

Eric Siegel (00:43):
Thanks, Penelope. It's great to be here.

Penny Crosman (00:45):
Thank you. First of all, what inspired you to create a music video about predictive analytics?

Eric Siegel (01:02):
Well, I'll do anything to help educate and ramp up the world on this technology, because it is quite understandable. There are generally two kinds of content about what exactly it means to deploy a predictive model, and a predictive model is what machine learning gives you. One is high-level buzzwords and maybe overpromises, kind of hype, which gets abstract. My thing, though, is let's get concrete, because it is understandable. It's fascinating: learning from data to predict, and then using those predictions to improve pretty much any and all of the large-scale operations that make the world go round, including marketing targeting, fraud detection, credit scoring, insurance pricing and selection, and so many other application areas. It's fascinating and it's mandatory, and that's the message in my book, The AI Playbook: we need to bridge the gap between biz and tech, and bridging that gap requires business professionals to ramp up on a certain semi-technical understanding so they can collaborate deeply in a meaningful way. It's fun, it's interesting, and it's valuable. Right now, most new enterprise machine learning projects actually fail to reach deployment, and that's due to this gap and a lack of rigorous business-side deployment planning.

Penny Crosman (02:29):
Well, that was going to be one of my key questions for you, this idea that most machine learning projects fail to deploy. But let me go back to the idea that machine learning is mandatory. Why do you say it's mandatory? Is it because companies can't really compete or stay relevant if they don't use it?

Eric Siegel (02:49):
Well, just to clarify for a minute, it's mandatory to learn about it, but yeah, that's because it's mandatory to use it. It's one of the last remaining points of differentiation as large-scale enterprise processes become commoditized, everyone's doing largely the same thing, and products have largely the same look, touch and feel. This is what it means to improve business with science. The large-scale operations in those kinds of functions I mentioned a moment ago consist of many decisions, and prediction is the holy grail for improving decisions. Business is a numbers game, and this is the way you tip the odds in your favor and play that numbers game more effectively. We don't have clairvoyance, we don't have magic crystal balls, whether from a human or a machine, but using data and learning from it to predict means you can predict better than guessing, and often, for most of these applications where you're learning from a large amount of data, better than humans. That's a big win. So marketing is more effectively targeted, credit risk is more effectively assessed and fraud is more effectively detected.

Penny Crosman (03:55):
So when you say that most machine learning projects fail to deploy, would you say that's in a way appropriate, because not everything lends itself to machine learning and some machine learning models are not designed to do certain things? Or do you see this as a problem that needs to be overcome?

Eric Siegel (04:16):
No, I'm referring to a problem that needs to be overcome. I'm talking about projects where the opportunity has already been broadly sussed out: hey, look, our fraud auditors could be looking at a better-chosen pool of transactions to audit, ones significantly more likely than average to be fraudulent, and therefore a much better use of their precious and costly time. Places like that, where we have a very clear-cut use case and value proposition for predictive analytics, predictive AI, enterprise machine learning, whatever you want to call it. Machine learning generates models that predict. So the idea is already sussed out. The data scientist does the number crunching, uses the machine learning software and churns out a predictive model, with the intention that it will be deployed to improve those operations. But then the stakeholders ultimately get cold feet, and/or things just haven't been prepared rigorously enough, because the focus was on the technology, which is the cool rocket science part, rather than on the enterprise operations improvement, the business side of it, that change to operations, which is where the focus really needs to be. Things weren't planned rigorously enough, stakeholders weren't ramped up well enough and didn't participate in enough of the details. So if business stakeholders don't get their hands dirty, their feet will get cold. That's the syndrome. So these models get made, and they're potentially very valuable, but the value is not captured because the model is not deployed, not acted upon.

Penny Crosman (05:51):
And is that happening because of fear, or because of lack of understanding, or because of corporate bureaucracy and permafrost?

Eric Siegel (06:04):
Yeah, it happens because of fear, bureaucracy and lack of understanding. I mean, first of all, it's change management like any other. So here's sort of the bad news: you can't just use this incredible rocket science and do the core number crunching, which is, by the way, really amazing. It's the reason I got into the field of machine learning more than 30 years ago, and I dare say it's the reason most data scientists get into it. It's the coolest science: learning from data to predict, ascertaining or discovering patterns or formulas or rules that hold in general, that pan out and perform fairly well over new, unforeseen, unique cases and situations. In that sense, it's literally learned something that holds in general; that's why it's called machine learning. That's the coolest science. The bad news is that doing that science doesn't deliver value.

(07:01)

It doesn't capture or realize value; it generates potential value. Only by acting on it are you going to get enterprise value. When operations change, they can improve, but that improvement is change, so it needs change management. Change management isn't anything new, but with these projects, where everyone's kind of fetishizing the core technology, the focus isn't on change management. It's like people are forgetting: wait a minute, we're trying to improve the business. No, this is the panacea, the most awesome technology, it's incredible. And it is. But this is a business project first, an operations improvement project that uses machine learning as a necessary but not sufficient component. As part of the project, we now need to implement it, deploy it, operationalize it, change operations according to its predictions in order to improve them.

Penny Crosman (08:01):
So in financial services, as you mentioned, there is quite a bit of use of machine learning in making lending decisions, in fraud detection, in cybersecurity analysis, in marketing and areas like that. And in some of those areas there is some risk, like, for instance, where banks use machine learning in lending decisions. Regulators like Rohit Chopra, the director of the Consumer Financial Protection Bureau, frequently warn banks that when they use AI models, they can't be a black box; they have to be explainable, they have to be transparent, there can't be any bias, and the decisions must be fair and not have a disparate impact on protected groups. And we hear these warnings over and over again. Just based on what you know about how machine learning models generally work, do you think those kinds of worries are overblown or merited?

Eric Siegel (09:03):
I think they're mostly merited, though there are certain ways in which they're overblown. Let me go through some of them. First of all, the issues with responsible AI, responsible machine learning, the ethical considerations: I actually take those probably more seriously than your average data scientist. In fact, the second chapter of my first book, Predictive Analytics, is on ethics, and I've published a dozen op-eds in the Scientific American blog and the San Francisco Chronicle. Those are all available at civilrightsdata.com if you want to hear me pontificate. My pet causes are discriminatory models and machine bias, and I try to break those down. It has to do with the fact that models make, or at least inform, very consequential decisions: whether you're approved for credit, or even, in the case of law enforcement, whether you're approved for parole. So when the model makes a mistake, you could be unjustly left in jail for an extended period of time, or denied credit approval.

(10:01)

And those are just a couple of examples; with housing, there are so many. The problem is that we don't have a magic crystal ball. We can't predict with extremely high confidence whether somebody's going to commit a crime again after release. But we can predict better than guessing, and probably better than humans. Whether it's a human or a machine, there are going to be errors. The problem is when those errors that limit access to resources fall more heavily on a certain protected group, like one race versus another. That difference in what are called false positive rates, where those costly errors are incurred more for one group than another, is often referred to as machine bias. What I call discriminatory models is when the model explicitly makes decisions based on a protected class like race. So that's a whole issue. I think it's extremely important, and yes, you need visibility into how the model is making its decisions to suss those out.
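To make that metric concrete, here is a minimal sketch in Python of the false-positive-rate comparison he describes. The group labels, outcomes and model flags below are made up for illustration, not drawn from any real system.

    # Minimal sketch (hypothetical data): comparing false positive rates
    # across two groups, the disparity often called "machine bias."

    def false_positive_rate(y_true, y_pred):
        """Share of truly negative cases the model wrongly flags as positive."""
        negatives = [(t, p) for t, p in zip(y_true, y_pred) if t == 0]
        return sum(p for _, p in negatives) / len(negatives) if negatives else 0.0

    # Hypothetical outcomes: 1 = reoffended or defaulted, 0 = did not.
    # Hypothetical model flags: 1 = flagged high risk, 0 = not flagged.
    group_a_true = [0, 0, 0, 0, 1, 1, 0, 0]
    group_a_pred = [1, 0, 0, 1, 1, 0, 0, 1]  # 3 false positives out of 6 negatives
    group_b_true = [0, 0, 0, 0, 1, 1, 0, 0]
    group_b_pred = [0, 0, 0, 1, 1, 1, 0, 0]  # 1 false positive out of 6 negatives

    print(f"Group A false positive rate: {false_positive_rate(group_a_true, group_a_pred):.2f}")  # 0.50
    print(f"Group B false positive rate: {false_positive_rate(group_b_true, group_b_pred):.2f}")  # 0.17
    # A gap like this means one group bears more of the costly errors.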

(10:56)

I think the place where understandability of models gets overblown, and the requirement of that transparency gets overblown, is in a couple of ways. One is there's a sense that, hey, we need to understand the model in order to trust it. But there's a limit to our understanding in general. Most of these models are created over found data; there's no experimental design, there's no control group. So we're not actually getting causality. But that doesn't mean it's not predictive. It predicts, but it's hard to understand exactly why. For one ad targeting project, students who had indicated interest in the military were more likely than average to respond to an ad for the Art Institute. And you could explain that in a bunch of different ways. What's their family background? Are people interested in the military more well-balanced? There are a million ways you could explain it, but we won't know unless we do additional experiments, and we don't need to do those experiments for business value. We aren't doing sociology; we're not trying to understand what makes humans tick. We're just trying to decide which ad to show the person that they're most likely to click on. So there's a bit of mythology there about the degree to which we need to understand the model. But we do need transparency, at least for the ethical considerations.

Penny Crosman (12:19):
That certainly makes sense. So obviously the buzz over the last several months has been about generative AI and large language models. And I just wonder, what do you think are some of the most useful or practical use cases for large language models?

Eric Siegel (12:40):
Well, basically, it makes first drafts of writing, of computer code, of images. So I think there's a false promise in the general public narrative, which is that this thing is going to become capable of human-level activities in general, and there's a lot of hype about it. What it does is absolutely incredible. I spent six years in the natural language processing research group at Columbia in the nineties, and believe me, I never thought I'd see what these things can do now. The ability to create such seemingly human-like text, to respond in an often coherent, meaningful way to any kind of turn of phrase across topics, with metaphors and all the rest of human language use, is amazing. But those core large language models are trained on a per-word basis, or technically a per-token basis, but at that level of detail.

(13:42)

So they create this sort of seemingly human-like aura and, as a side effect, exhibit a lot of capabilities. But unless there are additional layers on top, they're not designed in and of themselves to meet higher-order human goals, such as being correct, or always knowing the right answer when you would expect a human expert to. That's another research effort, and it's TBD. And if you're trying to get the thing to really be human-level, well, they call that AGI, artificial general intelligence, and I like to call it artificial humans. I don't think we are actively headed in that direction, even if it may theoretically be possible someday, and we don't have a fraction of it either. So if you're churning out a hundred letters a day to customers for customer service, the amount of time your job takes could potentially be cut in half.

(14:37)

It depends on the very particular scope of your task, who you are and the exact language model you're using. And it's an empirical thing: you've got to try it out and see how well it helps and how much time it saves. It can potentially be a huge time saver, but there always has to be a human in the loop. You have to review everything it generates; you can't just trust it blindly. You can't have a consumer-facing chatbot out there giving expert advice for medical diagnosis or whatever. Whereas what I've been talking to you about, what the book focuses on and a lot of my career focuses on, we could call predictive AI, to distinguish it from generative AI. And this is the type of machine learning that you turn to if you want to improve pretty much any of your existing large-scale operations.

(15:23)

Operations that are already intrinsically full of errors: most mail is junk mail, lots of fraud goes undetected. We're only going to improve them; we don't have to be perfect. It can be autonomous, it can operate on its own. It can automatically decide which credit card transactions to hold as potentially fraudulent, instantly, without a human in the loop. And that's where we stand to improve what are already existing large-scale operations. Predictive AI is older, but it's not old school by any means. The potential has only barely been tapped. It has a track record of improvement, and there are still a lot more resources thrown at it than at generative, but it's not a competition, not a zero-sum game. Generative is a whole new world, and there are probably new ways to use it.
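As a minimal sketch of the kind of autonomous hold decision he just described, here is a toy Python example; the fraud probabilities and the cutoff are hypothetical stand-ins for a real model's scores and a business-chosen threshold.

    # Minimal sketch (hypothetical numbers): acting autonomously on a
    # predictive model's fraud scores, with no human in the loop.

    HOLD_THRESHOLD = 0.90  # hypothetical business-chosen cutoff

    def decide(transaction_id, fraud_probability):
        """Hold a transaction instantly if the model scores it risky enough."""
        if fraud_probability >= HOLD_THRESHOLD:
            return f"HOLD {transaction_id} as potentially fraudulent"
        return f"APPROVE {transaction_id}"

    # Scores as they might come from a trained model (made up here).
    for tx_id, score in [("tx-1001", 0.97), ("tx-1002", 0.12), ("tx-1003", 0.91)]:
        print(decide(tx_id, score))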

(16:14)

I'm not sure that we're ever going to come across the killer app that's generally expected, in terms of a huge amount of value. But just being able to have it write a first draft of code, in many cases a programmer will consider that, and often does consider that, a killer app. So that's a little bit of a subjective thing, and it's a little hard to manage the expectations without overblowing them. I think it's fascinating. And in fact, alongside the conference series that I've been running since 2009, Machine Learning Week, we're now launching a new sister conference, Generative AI World. Those two take place the first week of June in Phoenix.

Penny Crosman (16:46):
Well, a lot of what you said jibes with what we're seeing in financial services, where all of the hype and curiosity about generative AI has brought about an increase in interest in, and use of, more traditional forms of AI, like machine learning and natural language processing. I feel like the title of your book is appealing. I think a lot of companies would like to be given an AI playbook that just says, here, do this, this and this, and presto, you'll have a machine learning or AI deployment. But I suspect that the playbook would need to be a little bit different for each organization, each use case, each team. Do you think that's so, or do you think there are certain principles that everybody needs to follow when they're trying to deploy AI?

Eric Siegel (17:42):
Yeah, I mean, there are some principles, though they may not be fully sufficient; every project has its own ins and outs, whether it's machine learning or any other kind of project. But there are some principles that are routinely missing, and that's why new machine learning projects routinely fail to deploy; most new such projects actually fail to deploy. And the book, let me read the full title, is The AI Playbook: Mastering the Rare Art of Machine Learning Deployment. Unfortunately, it is a rare art right now. There is a disconnect between biz and tech. So what I offer in the book is a six-step paradigm, a playbook, a framework, that I call bizML, a business practice for running machine learning projects. The last step is actually deployment, so it culminates with actually getting the thing integrated and operationalized, so that operations are actually being changed. And the first step is to plan for that from the get-go.

(18:37)

But the broader theme is that across those six steps, we need a deep collaboration between the data scientist and the business stakeholder, the data scientist's client, maybe the manager in charge of the operations meant to be improved with a predictive model. That's generally missing, and that's what I'm trying to issue here: a clarion call to the world that, hey, look, the business stakeholders need to collaborate deeply, and to do so, they need to ramp up on some semi-technical understanding, which I can outline now. Basically, for any given project, you need to understand three things: what's predicted, how well, and what's done about it. So: let's predict which transactions are fraudulent in order to target auditor activity, or to automatically hold or block a transaction. Let's predict which customers are going to respond to marketing in order to decide who's worth spending $2 on to send a glossy brochure. Let's predict who's going to be a bad debtor.

(19:34)

And this is a standard use of a credit score: to decide whether to approve an application for a credit card or any other kind of loan. The "how well" part is: how good is it, right? That's often the key missing ingredient in these questions. How good is the AI? How do you quantify it? What are the pertinent metrics? Right now, the disconnect is as follows. In many cases, in most cases, the data scientists only measure the pure predictive performance, which only tells you, relatively, how well the model predicts compared to a baseline like random guessing. That's helpful to see, and it tells you the model is potentially valuable. Whereas we also need business metrics like profit, ROI, the number of customers saved, the number of dollars saved. That is to say, what are the pertinent business metrics that could be improved, and how much could they be improved, depending on how we deploy the model?
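Here is a minimal sketch, with made-up numbers, of the two kinds of metrics he contrasts: lift over random guessing on the predictive side, and profit and ROI on the business side. The response rates, mailing cost and per-sale margin are all hypothetical.

    # Minimal sketch (hypothetical numbers): a predictive metric (lift)
    # next to the business metrics (profit, ROI) a stakeholder needs.

    baseline_response_rate = 0.01   # 1% respond if we mail at random
    targeted_response_rate = 0.05   # 5% respond among the model's top picks
    lift = targeted_response_rate / baseline_response_rate
    print(f"Lift over random guessing: {lift:.1f}x")  # 5.0x

    # Business view of the same model: mail its top 10,000 prospects.
    mailed = 10_000
    cost_per_brochure = 2.00        # the $2 glossy brochure
    margin_per_sale = 120.00        # hypothetical profit per responder

    cost = mailed * cost_per_brochure
    profit = mailed * targeted_response_rate * margin_per_sale - cost
    print(f"Campaign profit: ${profit:,.0f}")   # $40,000
    print(f"ROI: {profit / cost:.0%}")          # 200%
    # Lift says the model predicts well; profit and ROI say whether
    # deploying it would actually improve the operation.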

(20:32)

That's where you're going to bridge that disconnect. By ramping up on those notions and what they mean for different projects, what's predicted, how well and what's done about it, the stakeholder is ready to participate. And more than anything, it needs you as a business professional: once you've ramped up on the semi-technical understanding these projects need in order to succeed, then whatever improvements come to the technology, it still needs you. And by the way, while the book goes through the six steps, as a side effect it's ramping you up as a reader on exactly that sort of understanding. I can't cover it in an hour; it's a book's worth of understanding. But it's not the core rocket science, and it's very accessible. It's sort of like driving a car: I don't need to understand what's under the hood. In fact, I've personally never changed a spark plug, and I don't know where they are in my car. I've only looked under the hood of my car once, and I was like, whoa, look at all the parts. But I'm an expert driver, just like you: I know friction, momentum, the rules of the road, how the car operates and the mutual expectations of drivers. That's a lot of expertise. You analogously need that expertise to drive a machine learning project if it's meant to successfully deploy and deliver value.

Penny Crosman (22:02):
That makes sense. Well, a lot of financial companies, especially small community banks, just don't have a staff of data scientists, programmers and other technology specialists. They might have two or three IT people, and that's about it. So companies like that are really dependent on vendors who prepackage these things for them. Do you have any advice on choosing the right AI-related vendors, vetting their products and working with them when you might be their smallest client?

Eric Siegel (22:37):
Yeah, absolutely. Don't fall for the software sales pitches. This is a consulting gig, not a plug-in solution. By definition, a machine learning project is not just the technical number crunching part; it's the actual change to operations, and that's what this practice is about. You can participate in the practice. You do need data scientists, and you can go external. The size of the company, by the way, is not in itself a determining factor for whether there's a potentially viable project. The use case is defined by what's predicted and what's done about it, and by whether the operation that would be changed is large enough. If you're sending marketing to a million prospects just once a year, you might be a pretty small company, but you've collected enough historical data, in terms of who did and didn't respond in the past, from which to learn.

(23:31)

So if the operation's big enough that tweaking it could deliver a huge benefit to the bottom line, then by virtue of the size of that operation, you've probably collected and aggregated enough historical learning examples; that's called the training data. Now it's a business practice: well, how could my operations potentially be changed, in terms of targeting marketing, or changing decisions about loan application processing, insurance pricing and selection, fraud detection? That's where you start, and it's reverse planning. OK, to that end, what exactly would I need to predict? OK, then what kind of data do I need to pull together? And it's about involvement: even if an external service provider is doing the analytics part, you're still the stakeholder, and it's still a collaboration across these steps. It's not plug and play. There's this notion of a citizen data scientist, and some of these machine learning software tools try to simplify things so much that I call them PhD tools: push here, dummy. It does everything for you, so you're sort of protected from the technical details and from deciding too much about the parameters when you set it up and hit go. But look, it still requires data science expertise, and it requires your business expertise. The core number crunching itself is literally step five out of six, the way I've formulated it. And that alone, the world needs to learn this lesson, that alone is not sufficient to deliver value.
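As a minimal sketch of that starting point, learning from response history the operation has already generated and then acting on the scores, here is a toy Python example; the features, data, decision cutoff and choice of scikit-learn are illustrative assumptions, not a prescription from the book.

    # Minimal sketch (toy data): learning from historical marketing
    # responses already collected, then scoring new prospects.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Hypothetical history: two features per past prospect
    # (say, tenure and prior purchases) plus whether they responded.
    X_history = rng.normal(size=(1000, 2))
    # Toy ground truth: response odds rise with both features.
    y_history = (X_history @ np.array([1.0, 0.5])
                 + rng.normal(size=1000) > 1.5).astype(int)

    model = LogisticRegression().fit(X_history, y_history)

    # Score new prospects and target only the likeliest responders.
    X_new = rng.normal(size=(5, 2))
    scores = model.predict_proba(X_new)[:, 1]  # P(respond) per prospect
    for i, p in enumerate(scores):
        action = "send brochure" if p > 0.3 else "skip"  # hypothetical cutoff
        print(f"Prospect {i}: P(respond)={p:.2f} -> {action}")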

Penny Crosman (25:08):
All right. Well, Eric Siegel, thanks so much for joining us today and to all of you, thank you for listening to the American Banker Podcast. I produced this episode with audio production by Kellie Malone Yee. Special thanks this week to Eric Siegel, author of the AI Playbook. Rate us, review us and subscribe to our content at www.americanbanker.com/subscribe. For American Banker, I'm Penny Crosman, and thanks for listening.