
Ally Bank was one of the first U.S. banks to deploy generative AI in its call centers to create call summaries. This work helped bring Sathish Muthukrishnan, chief information, data and digital officer at Ally, to American Banker's attention.
Muthukrishnan recently shared with American Banker some of $182 billion-asset Ally's recent AI initiatives, the returns it's seeing on AI, and how he keeps Ally's and its customers' data safe while multiple large language models consume it.
You recently launched generative AI to all employees. Did you do it in stages, or did you do it all at once?
SATHISH MUTHUKRISHNAN: We did it in stages.
Ally.ai, as you know, is the bridge between Ally and all of the other LLMs out there. So it gave us an opportunity to perfect the UI and the customer experience on Ally.ai, as well as pick and choose among the large language models externally. Then we made the decision earlier this year to roll it out to the entire company. From then on, we have been progressively talking about it and rolling out training, and we required every one of our 10,000 employees to take training before they got access to the tool.
We rolled it out July 23, and within the first 48 hours, more than 50% of our employee base had logged in and used the AI. More than 6,500 employees logged in and submitted more than 15,000 prompts within the first three days.
For those general employees across the organization, what are some of the top uses they're starting to make of this?
There are a number of uses, all the way from "tell me how to summarize this" to "give me ideas on how to write this email" to "help me understand this concept that I don't understand." Some people used it to summarize the latest Ally earnings call. And some people are learning how to use AI through Ally.ai now, [prompting it] "tell me some of the ways that I should be using this." The art of Ally.ai is that we have refined it, through system prompts, to behave like Ally, so it almost operates like an uber-copilot within our bank. It tells you how to use it for a number of things that you normally do on a day-to-day basis.
Is that based on what other people are doing, or on what people in other organizations have done with it, or something else?
Its foundational knowledge is coming from the broader internet, so what others are doing, but the personalization comes from Ally.ai sitting on top of these large language models, and we call that system prompts. We differentiate ourselves through the context, and we have made it possible to provide context through the implementation of retrieval augmented generation. [That means] Ally.ai has the ability to inspect the input and the output, to remove personally identifiable information before it leaves Ally. So the beauty of it is the foundational learning comes from what happens across the globe. But the nuance is the context that we provide Ally.ai to make it more relevant to our own employees.
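In broad strokes, the retrieval augmented generation pattern he describes, pulling relevant internal context into the prompt before it reaches an external LLM, can be sketched like this (a hypothetical illustration that uses simple word overlap in place of a production vector retriever; none of the names or documents below are Ally's):

```python
# Minimal sketch of retrieval augmented generation (RAG): find the internal
# document most relevant to a question and fold it into the prompt as context.
# Scoring here is simple word overlap; real systems use vector embeddings.

def score(query: str, doc: str) -> int:
    """Count query words that also appear in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, documents: list[str]) -> str:
    """Return the document with the highest overlap with the query."""
    return max(documents, key=lambda doc: score(query, doc))

def build_prompt(query: str, documents: list[str], system_prompt: str) -> str:
    """Assemble system prompt, retrieved context, and the user question."""
    context = retrieve(query, documents)
    return f"{system_prompt}\n\nContext:\n{context}\n\nQuestion: {query}"

# Hypothetical internal knowledge base
docs = [
    "Ally savings accounts compound interest daily.",
    "Auto loan payments can be scheduled in the mobile app.",
]
prompt = build_prompt(
    "How often does savings interest compound?",
    docs,
    "You are an assistant for a digital bank. Answer from the context only.",
)
```

A real deployment would retrieve over embeddings and layer the PII inspection he mentions on top of this assembly step.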
How did you build Ally.ai?
We were the first ones to build a platform like this. The first reason was we saw technology moving extremely fast, so we did not want to put all our eggs in one basket. [A particular large language model] might not exist tomorrow, or it might pose a big operational risk for us. So we needed to create this moat around Ally to mitigate the risk of fast-evolving tech and to protect Ally.
The No. 1 goal for us was not to just roll out AI for everybody. We wanted to be early adopters, but we wanted to scale it safely and securely. So we needed to establish the security protocols, and that's why we built this platform called Ally.ai, and it does a number of things. One, it provides the context to operate like a digital bank. Two, it allows us to inspect the input and the output, remove PII, and help us adhere to the three guidelines that we established.
For internal-facing use cases, we'll always have a human in the middle. And no personally identifiable information of Ally's customers will leave the Ally network. So Ally.ai allowed us to remove the PII before we consumed the LLM. We built that module on top of LangChain, our early partner for the RAG implementation, and we have open-sourced it; a number of other companies have used that PII module as part of their generative AI implementations.
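The PII-removal step he describes, inspecting text before it leaves the network, can be sketched as a pattern-matching pass (a regex-only illustration, not Ally's open-sourced LangChain module; production systems typically add named-entity recognition on top of patterns like these):

```python
# Sketch of a PII-scrubbing layer that inspects text before it is sent to an
# external LLM. Each matched value is replaced with a typed placeholder so the
# model still sees the shape of the sentence. Patterns are illustrative.
import re

PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ACCOUNT": re.compile(r"\b\d{10,16}\b"),
}

def redact(text: str) -> str:
    """Replace each detected PII value with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Customer jane@example.com called about account 1234567890."))
# -> Customer [EMAIL] called about account [ACCOUNT].
```

The same pass can run on the model's output as well, matching the "inspect the input and the output" behavior described above.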
Another element is personalization, having the LLM behave like the digital bank. And finally, Ally.ai allows us the freedom to experiment with a number of LLMs. The front end of Ally.ai stays constant for our employees, and the team evolves the UI/UX based on how our own employees are consuming the generative AI LLMs. Meanwhile, at the back end, we were plugging in multiple LLMs to see which one works better. One LLM was working great for summarization while another one was working great for code generation. So we were able to plug models in the way we wanted, and we were also able to experiment with a number of other LLMs without changing the UI every time.
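The "constant front end, swappable back ends" design he describes amounts to routing each task type to whichever model currently handles it best. A minimal sketch, with hypothetical stub backends standing in for real API clients:

```python
# Sketch of a pluggable LLM back end behind a stable front-end interface:
# the caller always invokes complete(), while the routing table decides which
# model serves each task. Backends and task names here are illustrative stubs.
from typing import Callable

def summarizer_llm(prompt: str) -> str:
    """Stub for a model that performs well at summarization."""
    return f"[summary-model] {prompt}"

def codegen_llm(prompt: str) -> str:
    """Stub for a model that performs well at code generation."""
    return f"[code-model] {prompt}"

ROUTES: dict[str, Callable[[str], str]] = {
    "summarize": summarizer_llm,
    "codegen": codegen_llm,
}

def complete(task: str, prompt: str) -> str:
    """Route a request to whichever backend currently handles this task."""
    backend = ROUTES.get(task, summarizer_llm)  # fall back to a default model
    return backend(prompt)
```

Swapping a model then means editing the routing table; nothing the caller sees changes, which is what lets the UI stay constant while the back end evolves.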
How did you teach it to act/think like Ally's digital bank?
If I talk about a bank, a generic LLM could talk about a river bank, with pebbles and sand. So we need to make sure we give it documents so it understands us. As a digital bank, any content we generate has to reflect how Ally has communicated with customers over the last 18 years since it became a digital bank. So we have given it all of the content that we have created. It knows the context behind how we have evolved our communication with customers, and it gives answers that are very relevant to what customers are doing today.
At one point, you were using OpenAI on Microsoft's Azure cloud. Are you still working on Microsoft's cloud?
Yes, we're still using Azure. Our data, however, is in the Amazon cloud. So we have mastered the art of bringing these two hyperscalers together, and we still use OpenAI on Azure.
You've been using gen AI for call summarization in the contact centers for a year and a half. How is that going? Is that still yielding good results for you?
Fantastic results. We have gotten tremendous feedback from the customer care reps that are using it. We started off with banking, and we have rolled it out across insurance and auto. We are getting so many other use cases from our customer care reps, who have been change leaders for us in generative AI, and we have used that gen AI summarization across other businesses and functions; our audit team has used it. Among customer care reps, acceptance of the summaries provided by AI has gone from about the mid-teens to 90%, as the team has been able to perfect how the AI works, and the accuracy and comprehensiveness of the conversation summaries have gone up from 40% to north of 80%. Our customer care reps are now able to singularly focus on our customers and let AI do all the mundane work of capturing the entire conversation and summarizing it.
Generative AI models sometimes miss the point of a conversation; they don't necessarily know the difference between an important point and a mundane occurrence or chitchat. Some people also fear that AI can introduce bias into a customer's history.
It's a constant watch item for us, and that is why we were patient in rolling this out. Initially, we had a number of pilot customer care associates working in parallel with the gen AI system, and they gave a thumbs up or a thumbs down on each summary. We saw some of them going back and editing it, and we captured all of that to understand where it was going in the initial days. There are constant checks and balances going on. There is an industry metric called ROUGE that allows us to go back and back-test all these conversations, learn from that, and constantly refine Ally.ai.
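ROUGE scores a machine-generated summary against a human-written reference by n-gram overlap. A minimal ROUGE-1 (unigram) version of the back-testing he describes might look like this (illustrative; packaged implementations such as rouge-score add stemming and the ROUGE-2/L variants):

```python
# Minimal ROUGE-1 scorer: compare a candidate summary to a human reference by
# unigram overlap, reporting precision, recall, and F1. The example call
# summaries below are invented for illustration.
from collections import Counter

def rouge1(reference: str, candidate: str) -> dict[str, float]:
    """Compute ROUGE-1 precision, recall, and F1 between two texts."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge1(
    "customer called to dispute a fee on the savings account",
    "customer called to dispute a savings account fee",
)
```

Running scores like these over logged calls is one way to track whether summary accuracy is trending up, as described above.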
We trained this LLM on the specific conversations we were having, so it was able to become more accurate quickly over time. We still have constant testing going on, checks and balances happening, and we catch biases. But again, those system prompts allow us to make sure that it is not introducing biases subconsciously, and we have told it to answer only questions that it accurately knows. If it doesn't, it's not supposed to answer. So you can actually prompt and configure the system to do all of this ahead of time so the outcome is less erratic and more consistent.
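The "only answer what it accurately knows" behavior he describes is typically enforced with a restrictive system prompt plus a confidence gate around the model's answer. A hypothetical sketch (the prompt text, threshold, and function names are illustrative, not Ally's configuration):

```python
# Sketch of a refusal guardrail: the system prompt constrains the model to the
# provided context, and a confidence gate substitutes a refusal message when
# the answer falls below a threshold instead of letting the model guess.
SYSTEM_PROMPT = (
    "You are an assistant for a digital bank. Answer only from the provided "
    "context. If the context does not contain the answer, reply exactly: "
    "'I don't have enough information to answer that.'"
)

REFUSAL = "I don't have enough information to answer that."

def gated_answer(model_answer: str, confidence: float,
                 threshold: float = 0.7) -> str:
    """Pass the model's answer through only when confidence clears the bar."""
    return model_answer if confidence >= threshold else REFUSAL
```

The gate makes the refusal deterministic rather than hoping the model follows the prompt every time.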
Have you seen specific results, like hours of time saved, with these summaries?
It varies from department to department. A minimum of 30 seconds to two to three minutes per call is what we have seen. There is an average of 10,000 calls per week, and every call averages around 15 minutes, so that's significant savings. For content creation, it has shaved off several weeks: even if the first draft is a throwaway, the writer's block or the creator's block is completely eliminated, and that could save anywhere from days to a week. So we're seeing anywhere from minutes to weeks of productivity gain, and it has allowed us to get to road map items that we would never have been able to get to. There is always less bandwidth than demand across the company, especially for a digital bank that is constantly evolving and trying to move the needle on customer needs.
I read that you have an Ally playbook that's like an AI policy. What are some things that are in that?
The AI playbook tells you how AI gets implemented safely at Ally. It tells you, if you have an idea, how to create an AI use case and progress it all the way from experimentation through execution, and which groups you have to engage along the way, because it could be overwhelming if you are some level removed from technology.
The AI playbook also gives you resources where you can go and learn about AI. Think of it as a primary index: if you're brand new to AI, here is where you start; if you are advanced, here is where you go. You already have a use case? Here is where you go to get it approved, and how you engage to get it implemented. It's a simple, 40-page document that gives an end-to-end understanding of AI at Ally and how you can engage with it.
I saw that Ally was the first bank to join the Responsible AI Institute. Can you tell me a little bit about that? What do you think it might be able to accomplish?
One of the things that all of us have been grappling with is how to safely and securely scale AI. As we progress in this journey, we also want to be cognizant of all of the controls and regulations coming from governing bodies, so that we implement them up front and are not catching up later. It's human nature to say, "I don't want the controls now. I'll figure it out later." But we did the opposite: we did the hard work of establishing controls and ensuring the data is safe and secure before we rolled out AI globally. The Responsible AI Institute was representing the broader industries in terms of how to scale AI, and coming back with best practices on how to go about it. We found there wasn't any representation from a financial industry perspective, so we were the first bank to join, not only to learn how to scale AI responsibly, but also to share some of the roadblocks we face and how the institute can represent us effectively. It's been a great partnership, both from a learning perspective and from influencing and contributing back to the broader industry.
Is it kind of a peer group, or are there specific AI experts who kind of give seminars?
Both. There are global experts as well as peer groups, so you can actually learn from how other industries are scaling AI. It provides accelerated learning and experimentation.
Are you using generative AI in coding, giving it to developers?
We have given it to developers. Initially, we had about 100 developers using Ally.ai, generating inline code and giving us feedback on what is working and what is not. We have since scaled it, taken it through our model governance, and rolled it out to a larger employee base to generate code. We have also rolled it out to people who write user stories. So across agile, we see generative AI playing a critical role in lifting how fast we bring ideas to market and in sharpening the development life cycle, which is especially important for a digital bank.