For Members, Downtime (Naughty) More Important Than New Gizmo (Nice)
It's that time of year: time for making wish lists filled with gifts we hope to receive. But instead of a list of "nice-to-have" gifts, credit unions should create a list of "must-have" strategies to ensure high availability in 2004. Bah-humbug, you say? Consider this: Today's members have little tolerance for downtime of any kind, for any reason, even for the shortest duration. To keep them happy in the coming year, you need to ensure your systems and services are available 24/7. No high availability wish list would be complete without the following.
Backup Power: Remember the massive outage that cut power to much of the East Coast and parts of Canada this summer? It didn't matter how many duplicate servers and backup tapes you had; if you didn't have an alternate power source, you were out of luck. And it doesn't even take an event of that magnitude to cause trouble: strong winds or a car crashing into a utility pole can leave you in the dark. To withstand a power failure, invest in a backup generator. And be sure the specs include the ability to detect a partial power loss.
Redundant Disks: If the disk on your web server fails (you know, the one all of your home banking users rely on), you'll have a big problem on your hands, unless you're using RAID (Redundant Array of Independent Disks). RAID involves the use of two or more redundant disks; so if one fails, you only lose a disk, not the whole system. (Of course, if you end up with a failed disk, don't forget to replace it to stay redundant.)
Off-site Shadowing: CUs are increasingly replicating their systems in real time to a secondary machine, a process known as "shadowing." But if you take this approach, don't make the mistake of keeping the primary and secondary hardware at the same site.
Clustering: Keeping a hot spare for your primary processing system used to be good enough; today, members won't wait patiently while you switch machines manually. It's time to move to clustering: the use of two or more systems that can automatically stand in for one another if one fails. The fail-over is automatic, and the transition is nearly instantaneous and virtually transparent. When not failing over, the clustered machines can share the processing load when volume is high.
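For readers who want to see the idea behind automatic fail-over, here is a minimal sketch. It is an illustration only, not how any particular clustering product works: the function name choose_active, the node names, and the health-check callback are all hypothetical, and the sketch covers only the stand-in decision, not load balancing.

```python
def choose_active(nodes, is_healthy):
    """Return the first healthy node in priority order, or None if all are down.

    nodes      -- node names, listed from most to least preferred (hypothetical)
    is_healthy -- callable that reports whether a node is currently up (hypothetical)
    """
    for node in nodes:
        if is_healthy(node):
            return node
    return None


# Example: the primary has failed, so the secondary stands in automatically.
status = {"primary": False, "secondary": True}
active = choose_active(["primary", "secondary"], lambda name: status[name])
print(active)
```

A real cluster runs a check like this continuously, so that no one has to notice the failure and switch machines by hand.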
Consistent Upgrades: If you have multiple systems designed to stand in for one another and you upgrade the software on one, you'd better upgrade the others to be identical.
Redundant networking/communications: Keeping the lines of communication open is critical to ensure access to ATMs, Internet services, and other self-serve options, as well as branch communication. That's why some credit unions have both a primary ISP (Internet service provider) and a backup; and some use a VPN (Virtual Private Network) as a fall-back for branch communications. Whatever your approach, don't place yourself at the mercy of a single ISP. And always keep hot spares of key networking components, like hubs and switches.
Backup Certificates: If you offer Internet banking, it's likely you have a backup web server with duplicate software. But you may not have a backup Verisign certificate (used for encrypting and authenticating transactions over the web). If your certificate is unavailable due to a system failure, it'll take three days to get a new one. (No problem; your members will wait patiently, right?) Obtain a second copy of the certificate and make it accessible to select staff.
Redundant People: We're not suggesting you hire duplicate employees. But if only one person knows a vital password and he/she is on vacation, you've got a redundancy problem. When you centralize key info (or access to it) through a single person, you increase your vulnerability. It's important to control information access; but balance that by appointing a secondary person who can gain access if needed.
Accessible People: How reachable are the people you'd need to restore your systems and services? Are key phone numbers accessible, stored in multiple places, and known to multiple people? This must-have item won't hurt your budget; not having it could hurt your recovery.
A Living, Tested Plan: Drafting a plan that covers all of these areas is a great first step, but it's just that. Make the plan a living document, updating it whenever a new product, service or technology is added, new staff is hired, or a procedure changes. And test every component, because if you never take it out of the box and try it, how can you count on it? Appoint an employee to "own" the plan and to keep it updated and tested.
Board Buy-in: Since some of your wish list items require capital, board buy-in is a must. To secure it, quantify your anticipated ROI by calculating the cost of a hypothetical interruption (including lost revenue and unexpected expenses) and comparing it to the cost of your redundancy measures.
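The comparison described above comes down to simple arithmetic, sketched here as one possible way to frame it for the board. Every figure is a made-up placeholder, and the function names and the simple ROI formula (avoided cost minus redundancy cost, divided by redundancy cost) are assumptions for illustration; substitute your own estimates.

```python
def outage_cost(hours_down, lost_revenue_per_hour, unexpected_expenses):
    """Estimated cost of one hypothetical interruption."""
    return hours_down * lost_revenue_per_hour + unexpected_expenses


def redundancy_roi(avoided_cost, redundancy_cost):
    """Simple ROI: net savings as a fraction of what the redundancy costs."""
    return (avoided_cost - redundancy_cost) / redundancy_cost


# Hypothetical figures only: an 8-hour outage, $2,500 in lost revenue per hour,
# $10,000 in unexpected expenses, and $20,000 spent on redundancy measures.
cost = outage_cost(hours_down=8, lost_revenue_per_hour=2500, unexpected_expenses=10000)
roi = redundancy_roi(avoided_cost=cost, redundancy_cost=20000)
print(cost)  # 30000
print(roi)   # 0.5
```

With these placeholder numbers, avoiding a single outage would return 50 percent on the redundancy investment; a board will likely also want to know how often you expect such an interruption to occur.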
If you run an in-house system, consider this your high availability shopping list. If you use on-line/ASP services, use the list to gauge your supplier's performance or to evaluate a new supplier. Either way, may all of your systems and services be highly available in 2004!
Randy Riesenberg is VP-e-Services and Mickey Hackett is Director of System Services at USERS Inc. They can be reached at 1-800-523-7282.