TD Bank Group's AI team has built and deployed its first AI agent — a model that completes mortgage loan applications in minutes as opposed to the 15 hours it normally takes a human to process.
The new AI agent was deployed in January, and mortgage loan officers have embraced it, according to TD's Chad Koziel.
"If you just imagine people who spent a lot of time doing some fairly arduous tasks, looking through various IDs, hunting through 100-page documents for that key piece of information, we can reshape their work into something that's much more closely aligned with delivering on the brand promise that TD has of 'more human,'" said Chad Koziel, an assistant vice president at TD and founder of the AI startup Layer 6 that the bank acquired in 2018.
The new agent comes at a time when banks across the country are exploring practical uses for agentic AI that can show a quick return on investment.
TD chose this initial agentic AI use case because it's a sharply defined task with clear inputs, clear outputs, and a human in the loop, and it can shorten loan approval times for customers as well as improve efficiency for the bank itself, Koziel told American Banker.
"Building a really good agentic system is hard, and hard is a corollary for expensive," Koziel said. "And so, for our first use case, we wanted to make sure it was something that made a massive positive impact for our clients, so the residential lending journey fit the bill."
Patrick Hall, chief AI officer at the George Washington University School of Business, said this project makes a lot of sense.
"Just like always, there's ways to do this right, and there's ways not to do this right," Hall told American Banker. "The TD application really struck me as doing a lot of things right. It's a document-based application, gathering documents, analyzing documents, and language models are good at and designed for handling documents. It's a fit for purpose use case."
The quantifiable savings of 15 hours is another positive, Hall said. "Quantitative measurement of generative AI and agents is substantially more difficult than the machine learning that banks were using three to five years ago," he said.
Brad Leimer, founder and principal at Leimer One Advisors and former head of innovation at Santander, also gave props to the project. "It's a good initial use case for agentic AI because it's a controlled, but very document-heavy workflow," Leimer told American Banker. "It has relatively standardized documents coming from clients, and clear timelines in terms of how the bank needs to respond. The mortgage loan process also has measurable friction, one where a human still needs to be accountable for the final decision. That's almost a perfect use case, as people still need to be very much involved, but you've shortened the time for the credit decision."
Leimer also thinks this effort is notable because it's not a chatbot experiment. "They seem to be building AI into their existing operating model," he said. "That is where more banks will see greater value. The ROI from AI comes from redesigning specific workflows where AI can gather, verify, summarize, escalate and route work more effectively."
What the new AI agent does
TD underwrites hundreds of thousands of mortgages every year, Koziel said. In the U.S. alone, TD Bank recorded $826 million in total residential mortgage origination volume in 2025.
Once a potential borrower has provided necessary documents, such as a purchase agreement, government-issued ID, account statements and proof of income, these are assembled into a package. In the past, a human reviewer would review the documents, extract the relevant information, verify details, check for inconsistencies and create a summary.
The new AI agent handles these steps and presents the results to a credit adjudicator to make a decision.
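As a rough illustration of the "check for inconsistencies and create a summary" step, the sketch below compares fields that have already been extracted from each document and flags any that disagree. It is not TD's implementation: the class names, field names and document types are all hypothetical, and the extraction itself (the hard part that a language model would handle) is assumed to have happened upstream.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ExtractedDocument:
    """Fields pulled from one document in the package (hypothetical shape)."""
    doc_type: str
    fields: Dict[str, str]


@dataclass
class PackageSummary:
    """What a human adjudicator would see: values plus flagged conflicts."""
    values: Dict[str, str] = field(default_factory=dict)
    inconsistencies: List[str] = field(default_factory=list)


def _normalize(value: str) -> str:
    # Ignore case and repeated whitespace when comparing values.
    return " ".join(value.lower().split())


def summarize_package(documents: List[ExtractedDocument]) -> PackageSummary:
    """Keep the first value seen for each field and flag later disagreements."""
    summary = PackageSummary()
    seen: Dict[str, Tuple[str, str]] = {}  # field -> (normalized value, source doc)
    for doc in documents:
        for name, value in doc.fields.items():
            norm = _normalize(value)
            if name not in seen:
                seen[name] = (norm, doc.doc_type)
                summary.values[name] = value
            elif seen[name][0] != norm:
                summary.inconsistencies.append(
                    f"{name}: {doc.doc_type} disagrees with {seen[name][1]}"
                )
    return summary


package = [
    ExtractedDocument("government_id", {"applicant_name": "Jane Doe"}),
    ExtractedDocument("pay_stub", {"applicant_name": "Jane  DOE", "employer": "Acme"}),
    ExtractedDocument("purchase_agreement", {"applicant_name": "Jane Do"}),
]
result = summarize_package(package)
print(result.inconsistencies)
```

The sketch only flags conflicts and never resolves them, which mirrors the human-in-the-loop design described here: the decision stays with the adjudicator.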
The mortgage officers "get a more complete package, they get a more accurate summary at the very end, and they get it in minutes," Koziel said.
He declined to say which foundation model underpins the agent, but said the bank creates some of its own models and also uses open-source models and off-the-shelf foundation models like Anthropic's Claude and OpenAI's ChatGPT.
He also declined to share how long it took to create the agent, or how many people were involved.
"But we'll say that it was thoughtful and involved, and the scale of something like this means you've got tons of people at TD involved in building it," he said. "We brought in subject matter experts from the business, we brought in scientists from Layer 6, we brought in engineers and technology professionals across the board, our risk and control partners as well are tightly involved end to end, and it's this combined, frankly heroic effort that results in this type of first transformation."
Hall gives high marks to this cross-functional participation. "If it is truly a cross-disciplinary project, then I think that makes the chances of success better than if it's a couple developers and a business line with a wild idea," he said.
Why an AI agent
Mortgage application processing might sound like work better suited to a rules-based model than to a predictive model.
But rules-based models can't handle this because of the differences among borrowers, Koziel said.
"In a rules-based framework, if we wanted to try and calculate somebody's income, there are all sorts of nuances: What time of year is it? If they give me a pay stub, how does that pay stub translate into an annualized income? How do I add their rental income to this? How do I extract that information when some of these are blurry phone photos?" Koziel said. "It is not possible to reliably account for all of these differences between humans using rules-based models, that's why we had humans do the process."
Generative AI is capable of this type of complex reasoning, he said.
Overcoming gen AI's poor math skills, other risks
Gen AI models are also famously bad at math, even counting. When asked how many "r"s appear in the word "strawberry," for instance, ChatGPT, Claude and Google's Gemini have all failed to come up with the correct number, in a phenomenon known as the strawberry test. Large language models do not read words letter by letter the way humans do. Instead, they use a process called tokenization, where words are broken down into chunks. That makes precise counting difficult without prompting the model to slow down or spell out the word.
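To make the contrast concrete, here is a small sketch in Python. The `toy_chunks` function is a made-up stand-in that simply cuts a word into fixed-size pieces; real tokenizers split text differently, so treat it only as a picture of why a model that sees chunks rather than letters can lose count. The `count_letter` function is the deterministic alternative, which reads the word character by character.

```python
from typing import List


def toy_chunks(word: str, size: int = 4) -> List[str]:
    """Hypothetical stand-in for tokenization: cut the word into fixed-size pieces."""
    return [word[i:i + size] for i in range(0, len(word), size)]


def count_letter(word: str, letter: str) -> int:
    """Deterministic, case-insensitive count of one letter in a word."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    return word.lower().count(letter.lower())


print(toy_chunks("strawberry"))         # ['stra', 'wber', 'ry']
print(count_letter("strawberry", "r"))  # 3
```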
Large language models struggle with math in general because they predict words based on statistical patterns rather than performing actual calculations. They sometimes hallucinate numbers, fail at multistep arithmetic and lack a rules-based system for operations.
This means generative AI models need to be shaped and optimized component by component, Koziel said.
"If we wanted a model to count the 'r's in strawberry, we wouldn't say, 'Count the r's in strawberry,'" Koziel said. "It's poor at that." Instead, TD developers will tell the model to use a deterministic, rules-based tool, like a calculator or Excel spreadsheet, to perform mathematical tasks like annualizing income from a pay stub. His team also feeds the model definitions for TD-specific terms and acronyms.
"All of that helps their performance as they undertake these complex reasoning and acting tasks," Koziel said. "We use guardrails to encourage the model to use certain things. We say, here is the library of things that you can use, here is what they are for."
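A minimal sketch of that idea, under stated assumptions: a small "library" maps tool names to a plain-language description (what a model would be shown) and a deterministic function (what actually does the arithmetic). Nothing here reflects TD's actual tooling. The names `Tool`, `TOOL_LIBRARY`, `call_tool` and `annualize_gross_pay` are hypothetical, and the pay-period table assumes gross pay is simply multiplied by the number of periods in a year, ignoring bonuses, overtime and partial years.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Dict

# Assumed pay frequencies and the number of pay periods each has per year.
PERIODS_PER_YEAR = {"weekly": 52, "biweekly": 26, "semimonthly": 24, "monthly": 12}


def annualize_gross_pay(gross_per_period: str, frequency: str) -> Decimal:
    """Multiply gross pay per period by the number of periods in a year."""
    if frequency not in PERIODS_PER_YEAR:
        raise ValueError(f"unknown pay frequency: {frequency}")
    # Decimal avoids binary floating-point rounding on currency amounts.
    return Decimal(gross_per_period) * PERIODS_PER_YEAR[frequency]


@dataclass(frozen=True)
class Tool:
    description: str             # shown to the model: what the tool is for
    func: Callable[..., Decimal]  # the deterministic code that runs


# The approved library: the only tools the agent is allowed to call.
TOOL_LIBRARY: Dict[str, Tool] = {
    "annualize_gross_pay": Tool(
        description="Convert gross pay per pay period into annual gross income.",
        func=annualize_gross_pay,
    ),
}


def call_tool(name: str, **kwargs) -> Decimal:
    """Run a tool by name, refusing anything outside the approved library."""
    if name not in TOOL_LIBRARY:
        raise KeyError(f"tool not in approved library: {name}")
    return TOOL_LIBRARY[name].func(**kwargs)


annual = call_tool("annualize_gross_pay", gross_per_period="2500.00", frequency="biweekly")
print(annual)  # 65000.00
```

In a setup like this, the model's job is limited to reading the pay stub and choosing the tool and its inputs; the multiplication itself never passes through the model.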
Keeping AI agents on a leash
Generative AI models are known to make errors.
"Our job is to tamp that down," Koziel said. "Our job is to create an environment in which the model does what we want it to do, only what we wanted to do, and frankly, it does it really bloody well."
TD has a Trustworthy AI team led by Jesse Cresswell, staff machine learning scientist at Layer 6.
"This team guides not just how we should develop these models, and then as we get into development, the subject matter expert interaction with research scientists who verify through a rigorous testing process that the model is doing what we want it to do and only what we want it to do," Koziel said. "We conduct some final tests before we go live, and then once we're live, these things are monitored to within an inch of their life."
The bank watches performance metrics, accuracy and relevance of the models' answers. It watches for mistakes that should be fixed in future releases, he said.
Another thing banks need to watch out for as they deploy agentic AI is security, Hall said.
"These agents are very complex systems, they have very complex attack surfaces," he said. "People are going to try to trick these LLMs. They're blind to the physical world, and for that reason and other reasons, smart people can almost always find ways to trick them. Can I write something tricky, malicious or adversarial in my document to trick this system?"
There's also a verification burden when humans check the work of AI models. In some cases, verification could take longer than expected. In other cases, people can get bored and fail to check everything carefully, said Hall, who had an overall positive reaction to the bank's AI efforts.
TD plans to produce more AI agents now that this first one is functioning.
"This acts as the foothold that enables us to accelerate everything we want to do in agentic AI in the future," Koziel said. "One of our first targets is a wholesale transformation of the lending journey. What we did is an impactful but small chunk of the overall journey towards funding a mortgage with TD. There's a lot more we can do there." Next up: AI agents for business loans.
Leimer noted that this project is a proof point for TD's acquisition of Layer 6.
"Buying AI talent and capability more than four years before this ChatGPT-led AI wave gave them institutional muscle that many banks are just now trying to build," Leimer said. "It is also smart of TD to start with one use case like this to prove out the operating model, and then expand it to other use cases. Too many institutions try to start with a giant enterprise AI vision instead of a specific workflow where the value, risk and control points are clear."












