Data Duplication Must Stop. Copy That.

Database data is growing 25 percent yearly, with unstructured data related to online collaboration, Web 2.0 applications, messaging systems and other digital avenues increasing at 50 to 75 percent, according to The Enterprise Strategy Group. Storing all this data can strain IT architectures, a stress made worse by unnecessary duplication of data. Consider, for instance, that simply updating a customer address often prompts the system to resave the entire customer file even though nothing else in the file has changed.

Tackling data duplication by eliminating unnecessary backups reduces storage footprints and cuts the power and cost needed to maintain data center facilities. According to the ESG report, so-called data deduplication products identify and eliminate redundant data and thus "cut data protection capital and operation expenses, facilitate consolidation of distributed backup operation, and slash server virtualization-related storage costs. Data deduplication offerings should, therefore, be on every CIO's project short list."
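To make "identify and eliminate redundant data" concrete, here is a minimal sketch of one common approach, fixed-size chunking with content hashes. It is an illustration under stated assumptions, not how any vendor named in this article implements deduplication; the 4KB chunk size and the names `dedup_store` and `restore` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical chunk size in bytes; real products vary


def dedup_store(blobs, chunk_size=CHUNK_SIZE):
    """Split each blob into fixed-size chunks and keep one copy per unique chunk.

    Returns (store, recipes): store maps a SHA-256 digest to the chunk bytes,
    and recipes holds, for each blob, the ordered digests needed to rebuild it.
    """
    store = {}
    recipes = []
    for blob in blobs:
        recipe = []
        for start in range(0, len(blob), chunk_size):
            chunk = blob[start:start + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # only the first copy is stored
            recipe.append(digest)
        recipes.append(recipe)
    return store, recipes


def restore(store, recipe):
    """Rebuild a blob from its ordered list of chunk digests."""
    return b"".join(store[digest] for digest in recipe)


# Two saves of the same customer file; only the middle chunk (the address) changed.
original = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE + b"C" * CHUNK_SIZE
updated = b"A" * CHUNK_SIZE + b"X" * CHUNK_SIZE + b"C" * CHUNK_SIZE
store, recipes = dedup_store([original, updated])
```

In this toy case the two saves contain six chunks in total, but only four unique chunks are stored, and both versions can still be rebuilt from their recipes.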

Because it reduces energy needs, data deduplication also has a green aspect. In her report, Lauren Whitehouse, a senior analyst at ESG, notes that "seventy percent of business executives measure the success of corporate green initiatives by tracking reductions in energy costs. If IT executives want to align with business priorities, cutting power consumption via deduplication is a great start."

Laura DuBois, a program director at IDC, says she expects banks to start adopting the technology, though to date she's "not seen a groundswell." That may change as the pressure on data centers increases and the positive effect of data deduplication is demonstrated through clear metrics such as reduction ratios. A reduction ratio of 10x means an organization shrank its backup data by a factor of 10, from 500GB to 50GB for instance. Among the data protection survey respondents, 48 percent saw a 10-20x reduction after implementing data deduplication technologies, and 18 percent saw reductions ranging from 21x to more than 100x.
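The arithmetic behind a reduction ratio can be sketched in a few lines. The function names `reduction_ratio` and `percent_saved` are illustrative assumptions, not terms from the ESG report; the 500GB and 50GB figures are the example given above.

```python
def reduction_ratio(before_gb, after_gb):
    """Return how many times smaller the stored data is after deduplication."""
    if before_gb <= 0 or after_gb <= 0:
        raise ValueError("sizes must be positive")
    return before_gb / after_gb


def percent_saved(ratio):
    """Convert a reduction ratio (10 for 10x) into percent of storage saved."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return (1 - 1 / ratio) * 100


# The article's example: 500GB of backup data reduced to 50GB.
ratio = reduction_ratio(500, 50)  # a 10x reduction
saved = percent_saved(ratio)      # roughly 90 percent of the space saved
```

The conversion shows why the ratios flatten out in practical terms: a 10x reduction saves about 90 percent of the space, while a 20x reduction saves about 95 percent.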

EMC, which just purchased industry leader Data Domain, is a big player in the space, as is Symantec. But plenty of other technology firms are getting into the crowded field: ExaGrid Systems, NetApp, Quantum, FalconStor, Sepaton, CommVault and Atempo are just a few.

Virginia Credit Union recently implemented a deduplication solution from NetApp. Rich Barlow, senior systems architect at the credit union, says, "We've seen an 80 percent savings on backup copies, 78 percent in the [Virtual Desktop Infrastructure], and we're routinely achieving 25 percent on home directories and group shares, 35 percent in our live documentation environment, and 50 percent savings in our scratch volumes."

Nevertheless, some are skeptical. Rod Nelsestuen, senior research director at TowerGroup, and once a CIO himself at AgriBank, FCB, recalls that "I was promised a lot. I'm a little surprised by the huge quick payback" that ESG claims. Still, he agrees that a technology like data deduplication is needed to slow data growth.

While Nelsestuen and DuBois say there is a green component to the data deduplication story, Bart Narter, a senior vice president at Celent, begs to differ. Most data deduplication involves customer data that needs to be normalized in order to better serve and understand the customer, he says; this is necessary to analyze relationship pricing and profitability. "It's not tied to green; it's tied to one-view of the customer."
