'Availability' as Vital for CUs as Disaster Recovery


The ability to recover quickly and completely from systems failures and disasters is perhaps more critical for credit unions and other financial institutions than for many other types of organizations.

Because financial institutions play a crucial role in the overall economy, the Federal Financial Institutions Examination Council (FFIEC), in its Business Continuity Planning Booklet, reminds them that "Disruptions in service should be minimized in order to maintain public trust and confidence in the financial system." Particularly with increased public sensitivity to the financial industry's robustness, recovery has to go beyond the reentry of lost data to the instant and total restoration of applications and business processes in order to mitigate the effects of adverse incidents, reduce monetary loss and allow uninterrupted service to members.

Two key measures define the nature and effectiveness of a credit union's Disaster Recovery (DR) strategy: Recovery Time Objective (RTO) and Recovery Point Objective (RPO). RTO specifies the maximum amount of time a credit union can allow its data and applications to be unavailable after a shutdown of its primary data center. RPO specifies the amount of data that can be lost without severely impacting the recovery of operations.

While any service disruption will inconvenience members, institutions may decide to accept a small, but non-zero, RTO. However, RPO is another matter: members will not be forgiving if any record of their deposits is wiped out.

Nightly tape backups should only be a last line of defense because they fail to meet both RTOs and RPOs. Despite high-speed tape drives, data recovery can take several hours or even days, particularly if the backup tapes must be retrieved from an offsite vault. Further, because backup tapes are generally created nightly, new transactions are not captured on tape until the next evening. Should a disaster strike the data center just prior to the next backup, a full day's worth of transactions could be lost. Worse, if for any reason the databases can't be restored from the most recent backup tape, which happens approximately 25% of the time, two days' worth of transactions could be lost.
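The worst-case arithmetic in this scenario can be sketched in a few lines of Python. This is only an illustration: the function names, the 24-hour backup interval and the five-minute RPO used in the usage note are assumptions chosen for the example, not figures from any regulator or vendor.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta, unusable_backups: int = 0) -> timedelta:
    """Worst-case window of lost transactions for periodic backups.

    A disaster just before the next backup loses one full interval.
    Each most-recent backup that cannot be restored adds another interval.
    """
    if unusable_backups < 0:
        raise ValueError("unusable_backups must be non-negative")
    return backup_interval * (1 + unusable_backups)


def meets_rpo(loss: timedelta, rpo: timedelta) -> bool:
    """True when the worst-case loss stays within the Recovery Point Objective."""
    return loss <= rpo


nightly = timedelta(hours=24)                         # assumed nightly backup cycle
loss_latest_tape_ok = worst_case_data_loss(nightly)   # latest tape restores cleanly
loss_latest_tape_bad = worst_case_data_loss(nightly, unusable_backups=1)
```

Under these assumptions the two results correspond to the one-day and two-day losses described above, and neither would satisfy an RPO of, say, five minutes (`meets_rpo(loss_latest_tape_ok, timedelta(minutes=5))` is false).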

CUs can compensate for this unacceptable RPO by maintaining paper transaction records until the data is reliably captured on a few generations of backup tapes. Paper records stored in branches are protected from a disaster that strikes the data center, provided the data center is sufficiently distant from all branches. After a disaster, backup tapes can restore databases to the previous night's status, and the paper records can then be keyed in manually to restore the day's transactions. However, the keying effort will lengthen an already long recovery time. In addition, a paper-based approach won't protect paperless transactions.

Next Generation: High Availability

High Availability (HA) technology solves the problems associated with tape-based backups by maintaining ready-to-run replicas of both data and applications on a secondary computer. Rather than backing up data at night, HA software captures all updates applied to a production server and instantaneously replicates them to a backup server.

Consequently, the backup server is always current and ready to run. Should the primary computer become unavailable, the high availability software simply switches users to the backup. Provided the backup server is located remotely, it remains isolated from disasters that might strike the primary production server. Operations can be restored in minutes.

Recent advances in HA technologies have resulted in new Continuous Data Protection (CDP) features that permit the recovery of databases to a selectable point in time, allowing institutions to choose a precise recovery point that predates the disaster.
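One common way to think about point-in-time recovery is as the replay of a timestamped journal of updates on top of an earlier copy of the data, stopping at the chosen moment. The sketch below illustrates that idea only; the `JournalEntry` type, the `restore_to_point` function and the account data are hypothetical and do not describe how any particular CDP product works.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class JournalEntry:
    timestamp: float   # seconds on some shared clock (illustrative)
    account: str
    balance: float     # balance after the update was applied


def restore_to_point(base: dict[str, float],
                     journal: list[JournalEntry],
                     recover_to: float) -> dict[str, float]:
    """Rebuild account state by replaying entries up to recover_to (inclusive)."""
    state = dict(base)  # work on a copy so the base image is left untouched
    for entry in sorted(journal, key=lambda e: e.timestamp):
        if entry.timestamp > recover_to:
            break       # everything later than the chosen point is ignored
        state[entry.account] = entry.balance
    return state


base_image = {"A": 100.0}
journal = [
    JournalEntry(10, "A", 150.0),   # legitimate deposit
    JournalEntry(20, "B", 40.0),    # new account
    JournalEntry(30, "A", 0.0),     # the corrupting update
]
recovered = restore_to_point(base_image, journal, recover_to=29)
```

Here `recovered` holds the state just before the update at time 30, which is the "point that predates the incident" in this toy example.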

HA systems consist of four primary components:

  • System-to-system communications between the primary production server and the backup
  • A data replication engine which replicates or mirrors transactions between the production and backup machines
  • A mechanism that continually monitors the data replication processes
  • Role-swapping capability to move users to the backup system
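How these four components fit together can be sketched with a toy, in-memory model. Everything here is an assumption made for illustration: the `Server` and `HAPair` classes are hypothetical, a simple queue stands in for the communications link, and a real HA product would replicate over a network and handle many failure modes this sketch ignores.

```python
from collections import deque


class Server:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class HAPair:
    """Toy model of the four HA components (all names are hypothetical)."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.link = deque()  # 1. communications link (a queue stands in for the network)

    def write(self, key, value):
        """Apply an update on the primary and send it toward the backup."""
        if not self.primary.alive:
            raise RuntimeError("primary unavailable")
        self.primary.data[key] = value
        self.link.append((key, value))

    def replicate(self):
        """2. Replication engine: apply queued updates to the backup."""
        while self.link:
            key, value = self.link.popleft()
            self.backup.data[key] = value

    def in_sync(self):
        """3. Monitor: nothing pending and both copies match."""
        return not self.link and self.primary.data == self.backup.data

    def role_swap(self):
        """4. Role swap: promote the backup so users are served by it."""
        self.replicate()  # apply updates already received before promoting
        self.primary, self.backup = self.backup, self.primary


pair = HAPair(Server("production"), Server("backup"))
pair.write("acct-1", 100)
pair.replicate()
pair.primary.alive = False   # simulate loss of the production server
pair.role_swap()
```

After the swap, `pair.primary` is the former backup server and already holds the replicated record, so new writes can continue against it.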

In addition to HA, complex institutions may need complementary software or hardware components to provide maximum protection against all downtime risks. These include secondary or standby communications links, uninterruptible power supplies, physical asset security and fire protection systems.

With today's concerns about the health of financial institutions, credit unions need to scrutinize the total cost of ownership of any software. Fortunately, HA technologies have become so highly autonomous and easy to use that fewer personnel are needed to manage them. In addition, new employees can be trained and new solutions tested in a test environment that is updated with a current copy of production data in real time. HA also helps satisfy NCUA and SOX regulatory requirements for data protection and availability. The software ensures that should anything corrupt or accidentally delete a record, the credit union can "dial back" in time to the point just before the incident.

Why spend money on third-party recovery services or worry about incomplete tape backups? High Availability technology not only protects data in the event of a disaster and speeds recovery afterwards, it also provides immediate savings in operating costs. Through prudent HA and DR investments, risks relating to data integrity can be mitigated or even eliminated at a justifiable cost for credit unions of any size.

Henry Martinez is Senior VP-Engineering with Vision Solutions, Inc., a provider of high availability and disaster recovery solutions. For info: www.visionsolutions.com or call 801 541-7769.
