Fraud Track: How AML tech drives personalization

Learning objectives 
  1. Data siloes (Why do they hurt AML efforts? Are the costs to breaking them down justified?)
  2. Cryptocurrency and crypto exchanges (Can banks do anything to limit the role crypto plays in money laundering?)
  3. Beneficial ownership and the Corporate Transparency Act (How much is the CTA changing this landscape?)
  4. Opportunities beyond AML (What synergies can a bank exploit while advancing its AML tech?)
Transcript:

Carter Pape (00:06):

My name is Carter. I'm a cybersecurity reporter for American Banker, but I also write about financial crimes including fraud and anti-money laundering. And I have here with me Scott Nathan, Global Head of Anti-Money Laundering Detection and Customer Insights at Citigroup. Scott was one of our three top honorees for Innovator of the Year. So I guess you should start by just introducing yourself a little bit.

Scott Nathan (00:33):

Sure. Scott Nathan. I lead the detection and client insights team at Citigroup globally, looking at all of the ways that our customers are interacting to confirm that we are complying with our global regulations. So for anything that passes through the bank, from a financial crimes risk management perspective, with all of those rules and regulations, we use various components of technology to consume all that data and identify potential anomalies that require action or investigation, and then potential subsequent filing for regulatory purposes, whether it's a SAR or other. So it's a multifaceted team covering all of our 95-plus geographies, and we're made up of a cross-function of data science teams, risk specialists, risk analysts, and technology specialists, and we also have a counterpart in our technology team that supports us. So it's an interesting, dynamic group, and we look at all of the petabytes of data that Citi consumes to make sure we're compliant.

Carter Pape (01:44):

So I wanted to talk through a few of the hottest topics in the AML world to kind of separate what's hyped from what's real. So you mentioned Citi processes petabytes and petabytes of data. One of the things that I hear constantly about is using artificial intelligence to detect and prevent financial crimes, which is obviously useful in cases where you're processing a lot of data. So at this point, is using AI to detect money laundering and other financial crimes such as fraud innovative, or is it something that most banks, credit unions and payment processors can access?

Scott Nathan (02:23):

Yeah, it's a good question. I think that it's definitely innovative in certain constructs. I think artificial intelligence has been kind of a misunderstood term, considering it has such a broad swath of applicability. You can think of the various components of AI. It's definitely become a buzzword in the financial crime space. But institutions, financial institutions and fintechs, have been using advanced algorithms to identify anomalous behavior. For instance, in the fraud space (I know we're talking a little bit about fraud and AML), we've been using AI-based machine learning techniques for a long time to identify potential patterns out of sequence, out of the norm. So that's how everyone gets those alerts on their cards, et cetera. The application of it has morphed; it's constantly changing. And I think that's one of the cool parts about the so-called notion of artificial intelligence. But in the financial crime space away from fraud, we're not making machines make the decisions. We always have to have a human in the loop for a variety of reasons, including compliance with laws and regulations. So facets of AI are completely expected and useful for us, meaning basic machine learning constructs, which are fully explainable and transparent. So we use a variety of machine learning techniques for training on data and looking at supervised and unsupervised applications for detecting behaviors. But we always have a human in the process. There are other robotic processing engines that allow our folks to populate data or populate filings in a way that would normally be a rote human process, and that could be more of an AI-based, RPA-based tool. But I like to take them down through the chain of sophistication. And when you get to the bottom or the top, whichever way you want to think about it, all the buzz of course now is large language models and GPT.
And that's definitely not something a financial crimes program would endeavor to use at this point. But we are experimenting with the likes of large language models in chat functions for other facets of the bank. And Citi has created an innovation hub that's really focused on experimenting with AI-type technologies and using internal data that's safeguarded and controlled in a sandbox, so we can understand what the benefits are and what the weaknesses are, and how we leverage explainability techniques to make sure that our model risk teams are up to speed on everything that's happening. So we're trying to build that framework now and experiment with how far we can take these capabilities. So we're always working behind the scenes to understand what the tools and outputs will look like, what benefit they may have, what risks they might also bring, what could potentially fail, and how we address that. So yeah, it's an evolving spectrum. Fraud has more risk appetite than we do in terms of letting the machine make decisions, but we're constantly looking to merge the fraud and AML spectrum so that we're more near real time.

Carter Pape (05:53):

How much interest have you seen from regulators in allowing machine learning to make decisions on fraud or specifically money laundering, and to what extent are they open to taking the human out of the loop when making those decisions?

Scott Nathan (06:09):

So it's a good question, and it's a complicated answer. I think there's a lot of different components that need to be addressed in order to make a regulator comfortable that you're taking the right steps and that you've properly covered the risks. And I think it all comes down to, like I said before, explainability and having the ability to actually demonstrate, with proper efficacy, the outcomes of each of the models. In certain situations, like I said earlier, there are constructs of suspicious behavior that can be instantly identified in a more linear fashion. They don't require complex modeling. In those situations, we're more comfortable experimenting and driving a machine to fully prepare an event disposition outcome, but we still have to have a QA process. We still have to have eyes on it, human eyes on it, just to constantly confirm that the systems are doing what we expect them to do. And I think any mature compliant organization will have that. It's just that the sheer magnitude of people needed to do those types of quality reviews is drastically reduced by using this technology. So it's part of our simplification and modernization construct.

Carter Pape (07:25):

Sure. Okay. So something else I wanted to ask about is data sharing frameworks. This gets a lot of attention for different reasons, but the main one that I see is just that there's a lot of promise people see in being able to catch fraud and money laundering and all financial crimes if you're able to share information between institutions about the transactions that are going on. So I wanted to ask you about that. So what regulatory mandate do banks and credit unions have at the moment with respect to sharing AML data between institutions?

Scott Nathan (07:58):

So there's no mandate to share. Financial institutions under the Bank Secrecy Act, and now the Patriot Act, are permitted to share certain components of data as long as they follow the rules specifically as they're written. So under section 314 of the Patriot Act, banks can actually make contact with other institutions to ask questions pertaining to specific suspicious behavior that may involve the facets of the scope. And there's a safe harbor provision built in to allow the banks to do that, to protect national security and to protect the financial system. So the sharing of information today requires a very robust documented process, and you have to follow a bunch of guidelines and you have to make sure that the entities are registered. So it's not a seamless, friction-free process. So the construct today has room for improvement. And that's something FinCEN, the Financial Crimes Enforcement Network, has pushed on since its 2018 issuance around innovation. There was an interagency memo that supported innovation in this space, and they were looking for banks and financial companies to experiment with what could be done to help improve this process. And data sharing became one of those constructs. So in the fraud domain, people want to recover, they want to get their money back; institutions want to protect their customers, get the bad actor, and get the funds back as quickly as possible. So time is obviously of the essence in a fraud event. So fraud investigations sometimes follow a different information sharing path, but ultimately there's something that we're looking to work on that allows us to anonymize data or encrypt data in ways that allow us to build and train these models against shared assets, which historically would've required getting through many, many privacy hurdles. And that's why we've struggled to date to get as far as we need to go in terms of sharing information.
So we're using and experimenting with some of these encryption techniques, like homomorphic encryption. There are ways to secure data and train against it and still produce value, but never expose the personally identifiable information that's contained within. And that's the exciting part about some of these projects that we're working on. But ultimately, to answer your question, there's no requirement that we share. The banks aren't even required to sign up to be a 314 participant. It's encouraged because it does help, and a lot of the 4,000-plus institutions in the United States alone will contact Citi, for instance, if there's a transaction that they see that comes through Citi and involves one of their customers and they need help to do their job. And we have a whole team established to respond to 314 requests. So it is an important process. And referrals throughout the network typically result in high-value investigations; we're constantly identifying threats and mitigating them as best we can. And it's 314, the data sharing programs, that have really proven themselves through the test of time. Even back in the days of 9/11, it was all very manual, but it was very powerful. So the more we automate it, the more we take advantage of technology and leverage these additional machine learning techniques, the better off we'll be. And ultimately, for me, FinCEN recently in the United States launched the national priorities as part of the AML Act of 2020. And those national priorities are helping institutions actually drive some of their resources towards proactive, algorithmic-based detection tools, instead of just boiling the ocean to identify behaviors that we think are meaningful. We know they're meaningful to us. We have different ways of measuring the tactical or the strategic value of the SAR or the filing with the regulators or the government, but we have no way of getting that feedback.
It's a one-way pipe typically. So FinCEN's Exchange program, and their ability to start testing ways of secure data sharing and how that can inform our detection algorithms, will then return back the meaningful data that they're looking for and that the end users in law enforcement are interested in. So taking those national priorities and decomposing them down into features, and then building features to address the behaviors, is where we're going now. So I think that's where things are going to start to pivot and hopefully allow us, and all of the institutions that are regulated in this capacity, to be more efficient at doing this work.

Carter Pape (12:39):

So just to kind of summarize, on one point, there are efforts on data sharing between institutions, but there are also efforts on sharing between institutions and FinCEN, correct? Like with FinCEN being a central hub for disseminating information.

Scott Nathan (12:54):

Yeah. FinCEN has its own program called the FinCEN Exchange. And as part of that program there are innovation hours and pilots and discussions around what could be done, and now that they've rolled out those priorities, we're trying to get a little more prescriptive around what that means.

Carter Pape (13:11):

Okay. So another hot topic, the third, is fusion centers. So at a basic level, these are departments at banks that combine AML, fraud prevention, and the prevention of other financial crimes into one function, sharing data to make that happen. So first of all, I guess give us a little bit more color on what exactly a fusion center does.

Scott Nathan (13:33):

Yeah, I mean, it's an interesting question. The notion of a fusion center, for me, formed within the government. I mean, we were working with fusion centers that were actually tasked with bringing data together across the interagency and across the intelligence community to identify potential threats, obviously. And that was a post-9/11 action as well. So the fusion center concept took off after 9/11. Financial institutions have had fusion centers. They just weren't branded in the cool way that we have now. There are different facets of fusion centers. We have cyber fusion centers that are obviously focused on the complete and daunting task of handling attacks on the networks and attacks on the financial ecosystem of technology. But then there are investigative fusion centers that take signals from across the various aspects of the programs and just look at things more holistically. And I think the fusion center approach is something that we work on in the financial crime space. We call it entity resolution, but it's really: what does our customer look like, and what are the events or activities associated with a customer and their counterparties across the vast network? And instead of leaving data in the silos, we bring the data together in a fusion capacity, and we use techniques to bring that data together. So for instance, we may have customers that reside in different data warehouses or data lakes. We use this technology to connect the dots between those data silos. And where a customer may have had seven identifiers, they now have one. And then we look at that customer's transaction traffic, and we evaluate the risk associated with that traffic. And we can now fuse media and other events into the data, and we can look at the way the customer is interacting across the globe using things like entity resolution and graph analytics.
So that's literally just fusing our data with other data, whether it's external data or data that we purchase from a vendor. And then we can do that as broadly internally as necessary. So with our fraud teams, we can fuse that data, and if you look at the client, you can see the risks and/or events associated with their behavior. So you'll see fraud events, you'll see cyber events, you'll see IP addresses that are anomalous. So there's a whole variety of information that historically was latent or not accessible. And so the fusion center approach, while it started manually by literally dragging files into a central data store and trying to figure out what the common keys and the connections were, is now being automated at scale in real time using these machine learning type techniques. They all have to go through the model validation process. But we end up with this really interesting fused view of a customer. And that's what makes it easier to both identify the risk and understand what's happening with that customer, but also improve. We're trying not to just be traditional compliance officers. We're trying to be in and supporting the business strategy. So we want to be there to help drive product to market. We are a financial institution. We're trying to promote financial stability, financial wellbeing. We're supposed to be providing services to our consumers, not limiting them. So bridging this technology with our business partners and their products and the product scale in payments, for instance digital payments, is where we're spending a lot of time, looking at, for an instant payment, what the risks associated with instant payments are. And without a fusion technique and a resolution technique, you're just buried in data that you'll never be able to sort through, and you'll miss the things that you need to find.
So as the volume of data scales and grows, we have to have the ability to attack it with the same types of technology that the rest of the industry's using.
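As a rough illustration of the entity resolution step described above, where a customer with several identifiers across data silos is collapsed to one, here is a minimal Python sketch. It assumes exact matching on shared keys (email, phone, tax ID) and uses a union-find structure; the record IDs, field names, and the `resolve_entities` function are hypothetical, and a production system would likely add fuzzy matching, confidence scoring, and model validation.

```python
from collections import defaultdict


def resolve_entities(records):
    """Group record IDs that share at least one identifying key.

    `records` maps a record ID to a set of (field, value) keys. Records that
    share any key, directly or through a chain of other records, end up in
    the same group.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Walk up to the group representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller ID as the representative.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}  # key -> first record ID that carried it
    for rid, keys in records.items():
        for key in keys:
            if key in first_seen:
                union(rid, first_seen[key])
            else:
                first_seen[key] = rid

    groups = defaultdict(list)
    for rid in records:
        groups[find(rid)].append(rid)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical records for the same person held in three separate silos,
# plus one unrelated customer.
records = {
    "crm-001": {("email", "scott@example.com"), ("phone", "555-0100")},
    "cards-17": {("phone", "555-0100"), ("tax_id", "TAX-1")},
    "wires-9": {("tax_id", "TAX-1")},
    "crm-002": {("email", "carter@example.com")},
}
entities = resolve_entities(records)
```

Here `crm-001` and `wires-9` share no key directly, but both link to `cards-17`, so all three collapse into one entity while `crm-002` stays separate.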

Carter Pape (17:47):

So the first thing I asked was about the accessibility of this technology, or really this organizational structure, to smaller institutions. What about fusion centers? Do you see community banks and credit unions using this sort of thing? Or is this just a thing that Citi and Wells Fargo are doing?

Scott Nathan (18:05):

No. So fusion centers, well, we do have large fusion centers at large money center banks. But I do see the fusion center approach cascading all the way down to community banks and credit unions. With the technological change has come the ability for even smaller teams to use these capabilities. And a lot of the vendors have added them as well. So a lot of the smaller institutions may have one core, but the core will produce and provide that type of fusion capability where historically it wouldn't have been there. So I think it's become a product differentiator for a lot of the fintechs. So the fusion centers are now resident within most of the organizations. And if you contact any of the institutions and ask, you'd be able to find someone who's at least able to connect you with the right resource that can see as much of the information as possible. Even if it's not all fused in a data frame, it still should be accessible through the technology.

Carter Pape (19:05):

Okay. So the popularity of fusion centers shows the utility of AML data for purposes other than preventing money laundering, obviously for fraud and cybercrime as well. So this is something that we discussed before specifically, as it relates to Citi's client 360 initiative. So tell us a little bit about client 360.

Scott Nathan (19:26):

Yeah, so as I was mentioning, we've endeavored on a path where historically the financial crimes programs would have their own pillars. We have the transaction monitoring program that looks at all the transactions in one process. We have the client onboarding process that looks at the client as they're onboarded and then maintains and manages the relationship throughout its lifecycle, including screening that information against watch lists, looking for other variations, looking at the way the customer behaves, and confirming that the requisite level of due diligence is completed. And then we have investigations teams that take the output from our other systems, like transaction monitoring, and they have to work the case to determine whether or not there is suspicious activity. And if there is suspicious activity, it then gets handed to a filing team that has to prepare the SAR and then file that with FinCEN in the U.S., or with whichever agency is responsible in whichever jurisdiction we're operating in. But on a global scale, it's a lot: a lot of steps and a lot of information being kind of locked up and trapped. And it's the trapped data that created a lot of the inefficiency in our process. And what we've endeavored to do now is this. Coming into Citi, I was looking at the entire ecosystem and looking at our compliance technology. Our compliance technology spend is primarily on financial crimes risk mitigation. So we're spending a lot on both the people side and the technology side on the frameworks that are used for these compliance purposes. And we looked at it, and the first piece of the puzzle I needed to crack was the transaction monitoring puzzle, because that's where all of the transaction traffic, billions of dollars a day, is monitored through systems, and we have to be careful about how we update that system.
So just the sheer process of updating our transaction monitoring engine is a huge effort. It's an ongoing project. But as we started to rip the covers off this black box of legacy engines that we're running for transaction monitoring, I quickly realized that this, the time when we're exposing the engine, is when we should start doing all of the heavy-duty lifting and mechanical repairs. So that's what we set out to do. A lot of the technology that we need for detection of suspicious behavior in the transaction monitoring context involves an algorithm going out and looking at Scott and his transactions and who he transacts with. So Scott and Carter may be transacting. Is the relationship between Scott and Carter interesting or not? Is it normal or is it not? And then is Carter linked to something else that may be problematic? So we're under this constant pressure to know our customer, to know our customer's customer, and to quickly respond to that risk, either for good or for bad. So one piece of the technology that enables that is something that's basically just called entity resolution, which I've talked about before. It's not a buzzword; it's just an industry function in data where you're talking about customers in different data stores, or information that's duplicated or triplicated, or you just have lots of data in different systems in different places, and you need to resolve it down with confidence, so that the surviving record is actually Scott or is Carter. So we have to do that anyway to make the system effective. So we've engineered a new tech stack that allows us to put the entity-resolved frame on a big data platform. It runs on advanced tech: it's on big data, it's on Hadoop, it can be cloud native. It uses Scala and Spark, so it's very fast. We use Elastic for search.
So we can put all that together now, resolve down our customer data, and then look at the customer and how they interact with that environment, rather than having to do manual calls or do it through humans. So creating that central view of the customer allows us to build features, like I talked about with the FinCEN projects and the national priorities. We need to see that customer in its full multi-dimensional format. We can't just look at it the way the systems used to, which was to say, okay, if Scott did X or Scott did Y and Carter was X, then it would generate an alert. Now we can train the model and look at the behavior of that graph and what that customer actually looks like, sounds like, and what that means in terms of predicting unusual behavior.
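The idea above of knowing your customer's customer can be sketched as a graph search. This is a minimal Python illustration, not Citi's implementation: it assumes transactions arrive as (sender, receiver, amount) tuples, and the party names, the `build_graph` and `hops_to_flagged` helpers, and the two-hop default are all hypothetical.

```python
from collections import defaultdict, deque


def build_graph(transactions):
    """Build an undirected counterparty graph from (sender, receiver, amount)."""
    graph = defaultdict(set)
    for sender, receiver, _amount in transactions:
        graph[sender].add(receiver)
        graph[receiver].add(sender)
    return graph


def hops_to_flagged(graph, start, flagged, max_hops=2):
    """Return the fewest hops from `start` to any flagged party, or None.

    Breadth-first search visits parties in order of distance, so the first
    flagged party found is also the closest one. The search stops expanding
    once it reaches `max_hops`.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node in flagged and node != start:
            return dist
        if dist < max_hops:
            for neighbor in graph.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append((neighbor, dist + 1))
    return None


# Hypothetical transaction traffic: Scott pays Carter, Carter pays a shell
# company, and the shell company pays an offshore entity.
transactions = [
    ("scott", "carter", 1_000_000),
    ("carter", "shell_co", 250_000),
    ("shell_co", "offshore_x", 240_000),
]
graph = build_graph(transactions)
```

A hop count like this could serve as one feature among many for a trained model, in contrast to a single hard-coded rule of the "if Scott did X, then alert" kind.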

(24:21)

So we apply that now to the customer. Now, once we've done that, we have this view of the customer. We have a 360 view of the customer. It's not hard to do, as long as you put the right technology to work and you have the right developers that know what they're doing. You build this graph view of your customer, but you have to be able to then defend it. You have to be able to explain how the resolved entity data was bridged, what the compound keys were, how that was then relied upon. So there's a lot of work that goes into that. But once we have this 360 view of the customer and we know who our customer is, we start improving the experience. We start looking at the lifecycle, from onboarding all the way through exit. How many times does this customer get touched? What happens in this relationship? And a lot of the frustration that our first line teams are dealing with can be tagged. And we start decommissioning processes that are overlapping or unnecessarily complicated. So it's simplification through this process of knowing our customer and then sharing that data internally with the right stakeholders. So if the financial crimes team, looking at the customer, knows that Scott sent a million-dollar wire to Carter, there's no reason the banker who has to deal with that private bank relationship should have to find out through separate mechanisms that there was that type of relationship there. So we enable our bankers to piggyback off of that data intelligence and use that context around how the customer is interacting to just create a better relationship experience, manage the relationship, and update the profiles. The systems, we're looking at their data anyway. You don't have to call. I mean, everybody's been contacted by their bank to update their employer, or: is this still your email address? Are you still a human? I mean, all of these things impact the customer experience, and it's one thing we do today.
Our Citi app even allows you to do it over the mobile app for traditional consumer banking, which isn't the challenge. The challenge is the complexity of institutional banking, where you have major multinational corporations with highly complicated ownership structures. They require a ton of compliance work. So it's about having a 360 view of this large entity and how it operates and how it transacts, using that data and the way the customer's actually conducting business, and then using the information from public sources to basically recreate what a human historically would've had to do. And if you're a low-risk customer, if you're a Fortune 100 company that is publicly traded, you don't require the same level of attention that a higher-risk cannabis business or cannabis-related business might. So why should we subject those customers to the same type of scrutiny, and why does it require so many calls and touch points? So we like to use the 360 view of the customer now to totally simplify that process and make it easier for our bankers. And that's one of the powers of this transformation that we've set out to complete. So not only do we have a full view of the risk profile, but now we can really sort out our spectrum of clients and how we apply resources more intelligently. Resources are scarce. Nobody wants to continuously attack the balance sheet with resource costs. So we're looking to put our valuable resources on the right things and not have them just work on rote tasks. So that 360 engine, I mean, it's a powerful tool because never before have we had so much information consolidated. It's a massive exercise to bring it all together, and we're doing it in regions and by business groups, but once it comes together, the teams are able to do things in new ways. We have compliance processes that would take upwards of eight weeks to perform. Like a specialized customer review would take eight weeks.
And with this new framework that we've kicked off, we've had our testing teams and our analysts come back to me with one to two hours. So that's a significant amount of time saved. But it's not just the time saved, and it's not just about reducing resource cost. It's really about context and insights and what they can see within that customer relationship that was not there before. It's akin to having your first CAT scan. I mean, you go through a CAT scan, you're going to find stuff that you didn't want to know about. It's there too. So we learn a lot. And that's a constant cat-and-mouse game, trying to balance the precision and recall of these things. But anyway, 360 gives us a really amazing tool, and it's nothing new. People talk about a holistic view of your customer. It's just actually getting it to work and then going through the process of validating it, going through model risk management, and then carefully decommissioning legacy rules engines that weren't fit for purpose. And then you just have to know how to document it and explain it, and then you end up really saving a lot of money and being a lot more effective at identifying anomalous behavior. So it's a fun project, and it's really gotten attention across the company because everybody can appreciate the ability to click on something. We see it in our day-to-day lives. We expect it from Amazon; we expect to be able to go in and look at our cart and search in our cart or something to that effect. But the ability to search petabytes of transactions in real time was unheard of, right? You'd go into a mainframe app and type Scott Nathan, and it would just spin and crash, right? Now you can type in attributes and you can understand how I exist and coexist within the enterprise, and which transactions are interesting, which ones need attention, and which ones don't matter. So it's really cool.
It's a powerful search algorithm for our entire data set and the amount of data we sit on. I mean, all of the banks have a tremendous amount of data assets. So it's about turning those data assets into something meaningful that not only returns value to the shareholders, but protects the customers. And that's what's exciting about it. Our clients entrust us with their data, and we use that data to make their lives easier.
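For readers unfamiliar with the precision and recall trade-off mentioned above, here is a minimal Python sketch. Precision is the share of alerted customers that turned out to be suspicious, and recall is the share of suspicious customers that were alerted. The customer IDs and the `precision_recall` helper are hypothetical.

```python
def precision_recall(alerted, suspicious):
    """Return (precision, recall) for a set of alerts.

    `alerted` holds the customers a detection model flagged; `suspicious`
    holds the customers that investigators later confirmed.
    """
    alerted, suspicious = set(alerted), set(suspicious)
    true_positives = len(alerted & suspicious)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(alerted) if alerted else 0.0
    recall = true_positives / len(suspicious) if suspicious else 0.0
    return precision, recall


# Hypothetical outcome: four alerts, two of them confirmed, and one
# confirmed case that the model missed.
alerted = {"c1", "c2", "c3", "c4"}
suspicious = {"c2", "c4", "c7"}
precision, recall = precision_recall(alerted, suspicious)
```

Loosening a model's thresholds tends to raise recall (fewer missed cases) at the cost of precision (more alerts for analysts to clear), which is the balance being described.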

(30:59)

We take it very seriously, but it's the power of that data that helps. I think that's just one of the benefits of being highly regulated: you're highly regulated, so you have this capability to support the financial system, and with that support comes the ability to draw context from things that were previously not accessible. So what makes being regulated worth it? People complain about regulation, but it's the regulation that gives me the confidence, because we're constantly under review. I mean, literally, my teams are under exams all the time. It's like painting the Golden Gate Bridge. We have hundreds of exams, so we're constantly being tested, so we don't have as much time to innovate as we would like, but having been able to kick this off really puts us in good stead for the next generation of technologies that will come forward. And that's 360 in a nutshell. There's just so much going on. It's just an exciting way to modernize the bank.

Carter Pape (32:02):

Yeah. Well, I'll let the next session get started, but thank you so much for your time.

Scott Nathan (32:07):

Thanks for having me.