Say you work for a large bank that has been embroiled in a crisis or scandal. In the discovery processes of the inevitable lawsuits that follow, investigators and lawyers (and in some cases, the U.S. Senate) find all manner of incriminating emails that the bank's risk and compliance departments did not know existed.
This scenario has occurred quite a bit in the aftermath of the mortgage crisis. A few classic examples from Standard & Poor's analysts — emails they wrote as they were inflating the ratings of worthless collateralized debt obligations — were made public by Rolling Stone magazine just this summer:
"Lord help our [expletive] scam … this has to be the stupidest place I have worked at," wrote one Standard & Poor's executive.
"As you know, I had difficulties explaining 'HOW' we got to those numbers since there is no science behind it," wrote a high-ranking S&P analyst.
"Let's hope we are all wealthy and retired by the time this house of card[s] falters," wrote another S&P executive.
Memorable gems. But you wouldn't want them discovered in your company.
Ten large U.S. and European banks are using natural language processing technology from Digital Reasoning — one of Bank Technology News' 'Top Ten Tech Companies to Watch for 2012' — to uncover such revealing documents before lawyers and examiners do.
The company launched six Proactive Compliance analytics products six months ago. The software is meant to find emails that reflect unethical behavior and violations of Dodd-Frank, anti-money laundering, Know Your Customer and other rules.
Some European banks use the software to analyze suspicious activity reports for signs of bribery. Other banks use it to find control room violations, to make sure their advisory services are clean, to keep insider information from leaking out of their organization, and to maintain the Chinese wall between trading and research.
Banks' current compliance solutions tend to focus on monitoring transactions and trade orders, Digital Reasoning executives say.
But much valuable information is buried not in transactions but in emails, instant messages, Word documents, PowerPoint presentations and other forms of "unstructured data" (which basically means any data not stored in a database).
Three large banks are using Proactive Compliance to catch employees who report that everything is fine but admit behind the scenes that disaster looms, in the manner of JPMorgan Chase's London Whale.
"All their internal systems were saying everything was good, but [trader Bruno Iksil] was busy communicating internally and externally about what he was doing and how he was doing it," says Stephen Epstein, vice president of product marketing at Digital Reasoning. "He was smart enough to conceal pieces of information he knew would be monitored. For instance, he didn't use the word 'portfolio' or 'basket' because those would trigger [compliance scrutiny]. He replaced them with words like 'umbrella.' Looking at the conversation, it was clear he was communicating about a transaction, and at the same time trying to conceal what he was doing in that transaction."
Banks tend to use keywords and lexicons to identify compliance violations in emails, Epstein says, which wouldn't pick up on such evasive messages.
Digital Reasoning's software "reads" unstructured files using natural language processing technology and looks for patterns in communication. The user feeds examples of the type of behavior a company is looking for into a platform called Synthesys. An example might be, "Today I bought 1,000 shares of IBM for $20,000." Synthesys will try to find people who talked about similar things.
"If someone tries to replace the word 'basket' with 'umbrella,' Synthesys knows that's not what's important," Epstein says. "What's important is the activity — someone bought or sold something and there was a counterparty involved. There's an action — a fire sale. That's the structure it's looking for, not a lexicon or keyword." The software will also take note if certain keywords, like portfolio or basket, are not used. "That makes the conversation more suspicious, because the person didn't use the expected language."
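The contrast Epstein draws can be illustrated with a toy sketch in Python. This is not how Synthesys works. The lexicon, the regular expressions, and the function names below are all hypothetical, chosen only to show why a keyword filter misses an evasive message that a crude structure check (an action, a quantity, a counterparty) still catches, and how the absence of expected terms can itself be treated as a signal.

```python
import re

# Hypothetical watch-list a keyword-based filter might use.
LEXICON = {"portfolio", "basket"}

# Hypothetical cues for the "structure" of a trade: an action verb,
# a quantity, and a counterparty introduced by "to" or "from".
ACTION = re.compile(r"\b(bought|sold|buy|sell)\b", re.I)
QUANTITY = re.compile(r"\$?\d[\d,]*(?:\.\d+)?")
COUNTERPARTY = re.compile(r"\b(?:to|from)\s+[A-Z][\w&]*")


def keyword_hit(message: str) -> bool:
    """True if the message contains any word from the lexicon."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return bool(words & LEXICON)


def looks_like_trade(message: str) -> bool:
    """True if all three structural cues appear in the message."""
    return all(p.search(message) for p in (ACTION, QUANTITY, COUNTERPARTY))


def missing_expected_terms(message: str) -> bool:
    """True if the message describes a trade but avoids the usual vocabulary."""
    return looks_like_trade(message) and not keyword_hit(message)


# Made-up messages for illustration only.
evasive = "We sold half the umbrella to Acme for $20,000 today."
plain = "We sold half the portfolio to Acme for $20,000 today."
chatter = "Lunch at noon? Bring an umbrella."

for text in (evasive, plain, chatter):
    print(keyword_hit(text), looks_like_trade(text), missing_expected_terms(text))
```

In this sketch the evasive message slips past the lexicon but still matches the trade structure, so it is the one flagged for avoiding the expected language. A real natural language processing system would rely on far richer analysis than three regular expressions.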
What if a bank doesn't have many real-life examples of, say, bribery to feed into the engine?
Such innocents can upload hypothetical examples, Epstein says. "Most banks have an idea of what constitutes bribery," he says.
The software will comb through the knowledge base and surface real communications that resemble the hypothetical examples. Then humans review the results and mark the false positives.
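The workflow described here (seed examples in, similar messages out, humans weeding out false positives) can be sketched roughly as follows. The article does not say how Synthesys measures similarity, so this sketch substitutes a plain bag-of-words cosine similarity as a stand-in. The seed text, the messages, the threshold, and every function name are assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Turn text into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(seeds, messages, threshold=0.3):
    """Return (message_id, score) pairs at or above the threshold, best first."""
    seed_vecs = [vectorize(s) for s in seeds]
    scored = []
    for msg_id, text in messages.items():
        vec = vectorize(text)
        score = max(cosine(vec, s) for s in seed_vecs)
        if score >= threshold:
            scored.append((msg_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# A hypothetical example of bribery, written by the compliance team.
seeds = ["We can offer the official a consulting fee if the permit is approved."]

# Made-up messages standing in for the bank's communications.
messages = {
    "m1": "Offer the inspector a consulting fee once the permit is approved.",
    "m2": "The permit fee is listed on the official city website.",
    "m3": "Team lunch moved to Friday.",
}

ranked = rank_candidates(seeds, messages)

# A human reviewer marks m2 as a false positive; it is dropped from the results.
false_positives = {"m2"}
confirmed = [msg_id for msg_id, _ in ranked if msg_id not in false_positives]
print(ranked, confirmed)
```

Word overlap alone pulls in the harmless message about a permit fee, which is the kind of false positive the human review step exists to catch.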
Another use case for Digital Reasoning's compliance modules is Know Your Customer Enhanced Due Diligence, which requires banks to take additional steps to validate information provided by the customer, and/or conduct additional research and inquiry about the customer.
Many bank analysts handle this by conducting Google searches on high-risk entities and customers, Epstein says. For a bank with thousands of clients, researching every customer every quarter would be too time-consuming, so analysts tend to run spot-check internet searches for signs that a customer has defaulted or received negative comments.
Digital Reasoning is working with two banks to automate the EDD process for public reviews of entities. They upload their entire customer list to Synthesys and point it to public sources of data such as Yahoo Finance and Twitter, using search engines like Google and Bing.
The software builds a queryable "knowledge graph" of customers. An employee could ask to see everything publicly available about a customer, for instance. The results could be output as a spider graph or a spreadsheet, and users can toggle between the different views.
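As a rough picture of what "queryable" could mean here, the sketch below stores facts about a customer as labeled edges and exports them as spreadsheet-style rows. It is a minimal illustration, not Digital Reasoning's data model: the `KnowledgeGraph` class, its methods, the customer name, and the facts are all hypothetical.

```python
import csv
import io
from collections import defaultdict


class KnowledgeGraph:
    """Toy graph: each entity maps to a list of (relation, value, source) facts."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add_fact(self, entity, relation, value, source):
        self._edges[entity].append((relation, value, source))

    def about(self, entity):
        """Everything recorded about one entity; empty list if unknown."""
        return list(self._edges.get(entity, []))

    def to_csv(self, entity):
        """Render the entity's facts as CSV text, one row per fact."""
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(["entity", "relation", "value", "source"])
        for relation, value, source in self.about(entity):
            writer.writerow([entity, relation, value, source])
        return buf.getvalue()


# Made-up customer and facts, for illustration only.
graph = KnowledgeGraph()
graph.add_fact("Acme Corp", "mentioned_in", "report of a missed loan payment", "Yahoo Finance")
graph.add_fact("Acme Corp", "mentioned_in", "customer complaint thread", "Twitter")

print(graph.to_csv("Acme Corp"))
```

Keeping the source alongside each fact means the same records could feed either a graph view or a spreadsheet view without being collected again.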
Digital Reasoning provides APIs to its software so that it can be used by other programs. "We're not in the business of building a dashboard or a case management system, we want to make this information available to other systems within the bank," Epstein says, such as risk management, marketing, and compliance.
The company's next set of solutions will be focused on revenue generation. For instance, one bank wants to mine voice data to flag conversations about large orders.