Are you one of those people who never deletes any email or files? You are not alone: large online companies like MySpace and Yahoo can’t set up hardware fast enough and are having trouble finding enough electricity to run everything. Even small businesses have trouble estimating how much space they need and keeping it all backed up safely in case disaster strikes.
The cost of hard disks is certainly not the issue. In 1956, IBM was charging around $10,000 per megabyte for five megabytes of storage. By 2004, a penny bought nearly nine megabytes. Inexpensive disk capacity has grown an average of more than 90% per year since the early 1990s. Unfortunately, that growth has slowed dramatically in the last five years, which may indicate that hard drive technology is reaching its limits. Redundant arrays of inexpensive disks (RAID), common on servers for years, could start appearing in desktops.
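To put the two prices above on the same scale, here is a small Python sketch that converts both to cents per megabyte and also shows what a steady 90% annual growth rate implies. The function names are illustrative, and the growth calculation assumes smooth compound growth, which real markets only approximate.

```python
import math
from fractions import Fraction

# Figures quoted in the text, converted to cents per megabyte.
CENTS_PER_MB_1956 = Fraction(10_000 * 100)  # $10,000 per megabyte
CENTS_PER_MB_2004 = Fraction(1, 9)          # nine megabytes per penny


def price_drop_factor(old_cents_per_mb, new_cents_per_mb):
    """Return how many times cheaper a megabyte became."""
    return old_cents_per_mb / new_cents_per_mb


def doubling_time_years(annual_growth):
    """Years for capacity to double at a steady compound growth rate.

    annual_growth is a fraction, so 0.90 means 90% per year.
    """
    return math.log(2) / math.log(1 + annual_growth)
```

With the quoted figures, `price_drop_factor(CENTS_PER_MB_1956, CENTS_PER_MB_2004)` works out to nine million, and `doubling_time_years(0.90)` comes to a little over one year.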
Servers themselves keep getting faster and gaining storage, but they have their limitations. Server hardware is designed as a balance between storage, access time, and application use. And what about backups? Magnetic tape has been around since the beginning. In 2001, Sony announced the development of terabyte-sized storage in 8mm tape cassettes. Even so, tape libraries remain expensive, labor-intensive, and complex, and the tapes themselves have a limited shelf life.
Before spending more money, first look at controlling users and their storage habits. Establish quotas and require users to clean up their storage space. Think about banning certain file types (music, for instance). Offsite archiving is also a good idea. Large external disks can store files that haven’t been opened in months or years. If the space crunch still exists, there are several options.
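One way to find archive candidates is to scan a share for files that have not been touched recently. The Python sketch below is a minimal illustration, not a finished tool. It assumes that the later of a file's last-access and last-modified timestamps is a fair proxy for "last opened"; many systems update access times lazily or not at all, so treat the results as a starting point. The function name and its parameters are hypothetical.

```python
import os
import time


def find_stale_files(root, max_age_days, now=None):
    """Return (path, size_in_bytes) for files under root that look unused.

    A file counts as stale when both its access time and its modification
    time are older than max_age_days. Access times are an assumption here:
    volumes mounted with options such as noatime will not update them.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            last_touched = max(info.st_atime, info.st_mtime)
            if last_touched < cutoff:
                stale.append((path, info.st_size))
    return stale
```

Calling `find_stale_files("/srv/shares", 180)` (the path is only an example) returns a list whose sizes can be summed to estimate how much space archiving would free.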
If your file access seems to be slowing down, or if your file servers seem to be constantly running out of space, it may be time to look into a Storage Area Network (SAN) or Network Attached Storage (NAS). Not only might these alleviate congestion, but they also make disk-based backups an attractive alternative to tape. In its most basic form, a NAS can be a writable CD tower plugged into the network. More sophisticated devices now make use of RAID and IP-based SCSI (iSCSI). The important part is that servers, regardless of their operating system or role, now have a central location to put all their files. Depending on the price tag, a NAS can hold anywhere from six terabytes up to 512 terabytes. These devices are easy to manage and can offer features like data snapshots for backups and replication to a remote location for disaster recovery. Best of all, if you run out of space, just buy another and plug it in.
Add Fibre Channel (FC) drives and switches and you have a more expensive and expansive SAN. If servers are clustered and one of them fails, the backup server can access the failed server’s volumes on the SAN without downtime. iSCSI provides the same kind of shared storage over an ordinary IP network, reducing the need for Fibre Channel switches. SANs also offer more flexible “pooling” of storage, so you are not locked into hard requirements per server as you are when using standalone storage on each server.
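The benefit of pooling is easy to see with a little arithmetic. The Python sketch below compares the same servers under standalone disks and under a shared pool. The server names and sizes are invented for illustration, and the model ignores RAID overhead and reserved space.

```python
def standalone_shortfalls(capacity_gb, demand_gb):
    """Return {server: missing_gb} for servers that outgrow their own disks."""
    shortfalls = {}
    for name, needed in demand_gb.items():
        available = capacity_gb.get(name, 0)
        if needed > available:
            shortfalls[name] = needed - available
    return shortfalls


def pooled_shortfall(capacity_gb, demand_gb):
    """Return the missing space, in GB, when all capacity is shared."""
    return max(0, sum(demand_gb.values()) - sum(capacity_gb.values()))


# Hypothetical figures: three servers with 500 GB of local disk each.
capacity = {"mail": 500, "files": 500, "web": 500}
demand = {"mail": 700, "files": 450, "web": 100}
```

With these numbers the mail server is 200 GB short on standalone storage even though the three servers together use well under their combined 1,500 GB. Pooled on a SAN, the same demand fits with room to spare.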
But how do you back up all this data? One of the frustrating aspects of backups is that they typically run nightly and on weekends, so a disaster can easily cost a full day’s work. If you have a NAS or SAN and implement running “snapshots,” data can be backed up as often as every 15 minutes. If the worst happens and there is a massive hardware failure, imaging software like Symantec’s Backup Exec System Recovery allows you to boot from a CD and restore the operating system to similar (or different) replacement hardware. Not only are backups and restores easier with external storage, but external storage also lets servers do what they do best: serve data from a hardware platform optimized for that purpose.
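The difference between a nightly backup and frequent snapshots comes down to how much work sits unprotected at the moment of failure. The following Python sketch measures that window for any schedule. The dates and times are made up for the example, and the helper names are not taken from any backup product.

```python
from datetime import datetime, timedelta


def schedule(start, interval, count):
    """Build a list of backup times spaced a fixed interval apart."""
    return [start + i * interval for i in range(count)]


def data_loss_window(backup_times, failure_time):
    """Return the time between the last backup and the failure.

    Returns None when no backup was taken before the failure.
    """
    earlier = [t for t in backup_times if t <= failure_time]
    if not earlier:
        return None
    return failure_time - max(earlier)


# Hypothetical day: backups start at midnight, the failure hits at 4:50 pm.
midnight = datetime(2007, 3, 1, 0, 0)
failure = datetime(2007, 3, 1, 16, 50)

nightly = schedule(midnight, timedelta(days=1), 1)
snapshots = schedule(midnight, timedelta(minutes=15), 96)
```

For this example, `data_loss_window(nightly, failure)` is 16 hours 50 minutes of lost work, while `data_loss_window(snapshots, failure)` is 5 minutes.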
As you consider NAS or SAN technology, ask yourself some basic questions:
- What are my existing storage demands, and what are the trends in demand for storage?
- How much data needs to be backed up and how often does it change?
- How quickly do I need to recover backed-up data? Do I need it immediately (on-line), soon (near-line), or not so soon (off-line)?
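The first two questions lend themselves to a rough estimate. The Python sketch below projects when current storage fills up and how much data a backup must move each day. It assumes steady compound growth and a constant daily change rate, both of which are simplifications, and the function names and sample figures are illustrative only.

```python
import math


def years_until_full(used_tb, capacity_tb, annual_growth):
    """Estimate years until used space reaches capacity.

    annual_growth is a fraction, so 0.5 means usage grows 50% per year.
    """
    if used_tb <= 0 or capacity_tb <= 0:
        raise ValueError("used_tb and capacity_tb must be positive")
    if used_tb >= capacity_tb:
        return 0.0  # already full
    if annual_growth <= 0:
        return math.inf  # usage is flat or shrinking
    return math.log(capacity_tb / used_tb) / math.log(1 + annual_growth)


def daily_change_gb(total_gb, daily_change_rate):
    """Estimate how much data changes per day and needs backing up."""
    return total_gb * daily_change_rate
```

For instance, 2 TB in use on an 8 TB system with usage doubling every year gives `years_until_full(2, 8, 1.0)` of about two years. If 5% of a 2,000 GB data set changes daily, `daily_change_gb(2000, 0.05)` suggests roughly 100 GB per incremental backup.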