Aggregation's Next Step Impeded by Data Issues

July 26, 2001, 1:00 a.m. EDT

As aggregators move from presenting financial information to customers to actually using it in delivering services, data quality and data sources are becoming more of an issue.

People in the high-net-worth bracket are the most active users of aggregation and stand to benefit most from the next phase of its development. The coming generation of aggregator Web sites promises to let consumers and their financial advisers perform all sorts of advanced tasks, but these plans are running into interference from a host of data problems. For instance, some aggregators understandably are hesitant to develop and offer complex - and potentially costly - services from information culled electronically, and often without permission.

Studies show that individual consumers are increasingly doing business with multiple financial institutions. Estimates from BITS, the Financial Services Roundtable's technology arm, say that U.S. households use an average of 3.6 institutions for deposits, loans, and investments. It is estimated that two-thirds of online users of financial services want a consolidated view of all their accounts on the Web, and according to Gartner Inc., a Stamford, Conn., consulting firm, more than 30% are willing to pay for that service.

BITS says 300,000 to a million U.S. consumers are using aggregation services, and many more are expected to sign up once the services add the ability to transfer funds between accounts and pay bills. Celent Communications, of Cambridge, Mass., says the U.S. user base will be 3.4 million at the end of this year, 9.52 million at yearend 2002, and more than 35 million by yearend 2004.

Consumers view financial information at a variety of Web venues, including those of banks, brokerages, and financial advisers and Internet companies such as Yahoo. Most data used in account aggregation are collected through screen scraping, a practice by which aggregators use customers' passwords to collect static information from Web sites, usually without the site operator's consent.

Aggregators also collect information by connecting to an institution using standards-based information exchange protocols, most commonly through Open Financial Exchange, developed in 1997 to let institutions transmit certain types of data quickly and reliably. OFX data are generally of a better quality than screen-scraped data. For example, this information can be downloaded directly into a personal financial management application.

A third common method is to collect data from an institution through a direct connection to that firm's back-office systems, and this method usually yields much richer data than the other two.

"With screen scraping, the challenge is having the most up-to-date, reliable data, because as data changes it's not automatically updated," said Adam Holt, a senior analyst at J.P. Morgan H&Q in San Francisco who has written a report on account aggregation. "Screen scraping has to go out and proactively get data."

While some aggregators have slightly better technology and methods than others, screen scraping is generally unwieldy because aggregators' software must analyze Web pages that are designed to be accessed by people, not software applications. Thus, most of the information on a standard Web page - links, buttons, icons, charts - is extraneous to an account aggregator; only small portions of Web pages, such as areas detailing account balances and payment-due figures, are relevant to the aggregator, and it can be hard to locate these areas.
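To make the difficulty concrete, here is a minimal sketch of the kind of extraction a scraper performs, written in Python with only the standard library. The page markup, the element ids ("balance", "payment-due"), and the function names are all hypothetical; no aggregator's actual software is being described.

```python
from html.parser import HTMLParser

# Hypothetical account page: most of the markup is navigation and decoration.
SAMPLE_PAGE = """
<html><body>
  <a href="/home">Home</a> <a href="/offers">Special offers</a>
  <img src="logo.gif" alt="Bank logo">
  <table>
    <tr><td>Current balance</td><td id="balance">$1,234.56</td></tr>
    <tr><td>Payment due</td><td id="payment-due">$78.90</td></tr>
  </table>
  <button>Log out</button>
</body></html>
"""


class AccountFieldScraper(HTMLParser):
    """Collects the text of elements whose id is in a set of wanted ids."""

    def __init__(self, wanted_ids):
        super().__init__()
        self.wanted_ids = set(wanted_ids)
        self.current_id = None  # id of the wanted element being read, if any
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if element_id in self.wanted_ids:
            self.current_id = element_id
            self.fields[element_id] = ""

    def handle_data(self, data):
        # Text outside a wanted element (links, labels, buttons) is ignored.
        if self.current_id is not None:
            self.fields[self.current_id] += data.strip()

    def handle_endtag(self, tag):
        # Simplifying assumption: wanted elements contain no nested tags.
        self.current_id = None


def scrape_account_fields(html, wanted_ids=("balance", "payment-due")):
    parser = AccountFieldScraper(wanted_ids)
    parser.feed(html)
    return parser.fields
```

Calling `scrape_account_fields(SAMPLE_PAGE)` returns only the two figures and discards everything else on the page. The sketch also shows the fragility: if the site renames the `balance` id, that field simply goes missing from the result, with no error raised.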

And now that financial institutions are altering Web page layouts and locations, the process has become harder still, Mr. Holt said. In many cases, programmers have to make painstaking Web-page analyses to make changes.

"There isn't necessarily a mechanism with some screen scraping to monitor the cleanliness of the data relative to potential inconsistencies, like double entries or input mistakes," Mr. Holt said.

Because the quality of scraped data is low - generally static - its use in transactional operations poses problems, he said. For instance, it can be used for charting and in devising things like balance sheets and asset allocation models, but not for more difficult applications like calculating tax liabilities, or position accounting, which usually requires a direct data feed.
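The kind of monitoring Mr. Holt says is often missing could, in its simplest form, look something like the Python sketch below, which flags possible double entries in scraped records. The record layout (account, date, amount) and the sample data are assumptions made for illustration, not a description of any vendor's checks.

```python
from collections import Counter

# Hypothetical scraped records; the second entry repeats the first.
SCRAPED = [
    {"account": "checking", "date": "2001-07-20", "amount": -45.00},
    {"account": "checking", "date": "2001-07-20", "amount": -45.00},
    {"account": "savings", "date": "2001-07-21", "amount": 100.00},
]


def find_double_entries(records):
    """Return the (account, date, amount) keys that appear more than once."""
    keys = [(r["account"], r["date"], r["amount"]) for r in records]
    counts = Counter(keys)
    return sorted(key for key, n in counts.items() if n > 1)
```

A check like this can only raise a flag for review. Two genuine transactions for the same amount on the same day would look identical to it, which is one reason richer data, such as a transaction identifier from a direct feed, makes cleanup more reliable.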

With a direct feed, the aggregator acts as an extension of the institution instead of as a distinct third party.

"You can get beyond basic analytics and basic charting capabilities to do a more comprehensive and broad set of applications on that information," Mr. Holt said. "The data is better coming directly out of back-office systems, and therefore it provides a more reliable platform to perform functions ranging from performance analysis to financial planning, because it's a higher-quality, more consistent data."

But direct-feed systems have their drawbacks, not the least of which is that they can take from several weeks to several months to build. For these connections to work effectively, they require the endorsement of a financial institution or its technology provider.

In the last few months, financial advisers and other aggregators have begun offering tools that enable them to act on the data they are gathering.

Fidelity Investments last month launched Portfolio Planner, offering financial planning, portfolio analysis, and investment planning. The Boston company gets its data from Yodlee Inc., which assembles information from 1,200 financial sites, half from screen scraping or OFX and half from direct feeds.

Stephen Mitchell, senior vice president of product management and development at Fidelity, said Yodlee has provided "very reliable data, and in most cases pretty good position-level breakdowns" for simulations based on asset allocation, market value, and account holdings.

Fidelity has 100,000 users for its basic aggregation service. Launched in November, it only lets customers view information.

While Fidelity is getting market-value and position data from Yodlee, "what we're not getting is transaction history," Mr. Mitchell said. Without it, Fidelity cannot offer some tax-computation services.

"There are some significant limitations, but it is such a huge step forward to be able to provide a client with portfolio analysis," he said. "But if the industry moves toward more standard data feeds, for example, an aggregator could do performance reporting, because they had all the historical data to do it. That could be a huge next step."

The issue is not data accuracy, but "completeness," Mr. Mitchell continued. He said richer data could be used in applications that hit the "sweet spot" - the affluent market.

"Portfolio analysis and investment planning are very important and very useful for very-high-net-worth customers," Mr. Mitchell said. With better data, Fidelity could offer more personalized attention with services like estate planning and a deeper level of tax planning, which Portfolio Planner does not currently allow.

Jim Tascheta, chief marketing officer at Redwood City, Calif.-based Yodlee, said, "Our confidence level that our information is as accurate as what is coming out of financial institutions is very high." He noted that his company has built screen-scraping technology that matches direct feeds in data accuracy.

"To us, the accuracy of the information is of utmost importance, even at the presentation level," Mr. Tascheta said. "We are only as good as our data. Whether people are doing funds transfer or just looking at it, our data has to be accurate."

Yodlee does prefer direct feeds to screen-scraped data, he said, but only because operational efficiencies are higher for direct data feed - not because the data are more accurate.

"One of our strategies is to move to more data feeds, because of the operational efficiencies," he said. "Instead of having to have a robot look around at the Web site to bring that information back, we just send the request out to give us that information." It also is expensive to create and maintain the scripts that direct the screen-scraping robots, he said. "If you have a data feed, then you don't have to maintain scripts," he said.

Another provider of aggregation services, Advent Software Inc. in San Francisco, does not scrape. The company, whose services are used by 6,000 firms, gets all its information - from more than 150 banks, brokerages, and other information custodians - through direct feeds it began building about four years ago.

"In order to do true gains and losses and get good, high-quality investment advice, we need certain pieces of information that screen scraping today does not make available," said Steve Lewczyk, vice president of electronic communications at Advent. "Because most Web sites are inconsistent in what they display, you're limited in a way if you go scraping. If you are looking for a snapshot of what a customer owns, screen scraping is great, but it doesn't help answer the questions: How am I doing and what should I do?"

He said the most important tools for affluent customers are the ones most reliant on direct-feed data.

"If a high-net-worth investor wants to see how they're doing with their taxes, they need to have the cost-basis information, and that is what our portfolio account system tracks," he said. These investors "want to know how well their portfolios are performing against each other."

Automating information flow makes it possible for an adviser in a family office - who would be Advent's customer - to be more efficient and take on more clients, and become more profitable, he said.

Yodlee has had a relationship with Advent since November to get data feeds, which the company uses in specialized applications for high-net-worth clients, a spokeswoman for the aggregator said.

For its part, Advent is exploring using OFX as an option where direct data feed is not available, Mr. Lewczyk said.

A Chicago firm, Business Logic Corp., provides software to financial institutions and other aggregators that standardizes the information they get from various sources.
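As a rough illustration of what standardizing information from various sources can involve, the Python sketch below maps records from two differently shaped sources into one common structure. It is not Business Logic's design; the `Holding` type, the field names, and the two adapters are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Holding:
    account: str
    symbol: str
    market_value: float
    cost_basis: Optional[float] = None  # assumed absent from scraped pages
    source: str = ""


def from_scrape(raw):
    # Scraped values arrive as display text, e.g. "$1,234.50".
    value = float(raw["value"].replace("$", "").replace(",", ""))
    return Holding(raw["acct"], raw["ticker"], value, source="scrape")


def from_direct_feed(raw):
    # A direct feed is assumed to supply typed values and cost basis.
    return Holding(
        raw["account_id"],
        raw["symbol"],
        raw["market_value"],
        cost_basis=raw["cost_basis"],
        source="direct",
    )


# One adapter per source; supporting a new source means adding an entry here.
ADAPTERS = {"scrape": from_scrape, "direct": from_direct_feed}


def normalize(source, raw):
    """Convert a source-specific record into the common Holding shape."""
    return ADAPTERS[source](raw)
```

Downstream tools then work against `Holding` alone. A field such as `cost_basis` stays empty when the source cannot supply it, so an application that needs it can tell the difference between a missing value and a zero.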

"You're going to have to have a flexible architecture that allows you to access data from multiple sources, because there is no one source of data that is going to give you everything you need," said Dirk Quayle, Business Logic's executive vice president.

Aggregation is suffering from a "connectivity crisis," Mr. Quayle said.

"Getting the data is only half the battle - then connectivity has to occur," he said. "Depending on the line of business that a financial institution is trying to serve, there is a big difference in terms of the effectiveness of the data sources."

Using screen-scraped data for the high-net-worth customer is "very limited, because it has to go into pretty rigorous performance and analytics tools," Mr. Quayle said. Though screen scraping will continue to be used, high-net-worth clients are driving the movement toward direct connectivity, he said.

Users of Business Logic's software include Ibbotson Associates, a Chicago firm with an unusual specialty - it provides financial advice to the advisers of wealthy clients. Among other things, Ibbotson scrutinizes 401(k) holdings, which requires information on asset allocation, savings rates, matching rules, salary, age, and pensions.

"In general, screen scrapers don't allow you to get to" this information, Ibbotson president Mike Henkel said.

"It's not really accuracy we are concerned about with screen scraping, it's the completeness of it," he said. "What you may be able to do with it is limited by what was on the screen at the application being scraped."

Still, he said, automation, regardless of the source, will help advisers do their jobs better.

"The more you can get for an individual automatically, the less the risk of a data-entry error," he said.

If the players fail to address the accuracy issue, it could be costly for all those involved: financial institutions, vendors, and customers. What these parties don't sort out among themselves could be sorted out in the courts, because it is unclear where blame will fall if erroneous information results in financial loss.

"There is not a lot of law out there right now, no guidepost in terms of the court stepping in," said John Burke, counsel to BITS and an attorney at Foley Hoag LLP in Washington. "How you get information - what its currency is in terms of accuracy and who will be liable if consumers make a bad judgment based on data that is provided through an aggregation service - will only be resolved if there is litigation," he said. So far, he has not heard of any such lawsuits.

"Screen scraping is a pretty clunky technology," Mr. Burke said. "Using consumers' access codes, you don't know when the Web site was modified or updated, or, frankly, what the accuracy of the data is."

He said the idea of aggregation - letting consumers "access and assess products from a number of sources and work with their providers in making determinations of what is useful to them" - is good. But "there are some serious issues" because of the technology's flaws.

"If I'm a financial adviser, I need to know what I'm advising the client about is on the basis of good information," he said. "My advice is going to be limited by the accuracy of data I'm able to bring up."
