
Which Backup Strategy Should You Use?

So incremental backups are great in that they use less tape, but you need ALL of them plus the full backup tape to do a restore. Differentials are great because you need only the latest differential tape plus the full backup tape to do a restore. But man, rewriting every change since the full backup night after night is a lot of wasted tape.

Wouldn't it be nice if you could use less tape and not need so many tapes to be able to do a current restore? You can, using both backup concepts together with backup levels in what's called the Tower of Hanoi backup strategy.

The Classic Elegance of Level-Based Backups

Some of the more expensive backup suites today have tried to make backups easier and prettier. They wrap a GUI around the backup engine, abstract you from what's really going on at the file/schedule level, and paste slick marketing terminology throughout the tech manual, mixing up or inverting terms like differential and incremental, and even inventing their own terms such as occasional incremental or true differential. Such muddying of terminology, marketing spin, and sales-speak only add to the confusion when you're planning your company's data recovery strategy. Rather than relying on these commercial suites, consider an established system used by serious UNIX administrators: backup levels.

Backup levels are a simple system of numbers from 0 to 9 that allow you to describe and arrange exactly what, how, and when you want your files put to tape. Level 0 is your full backup and levels 1 to 9 are "levels of incrementals" that back up the data changed since the next lower backup level used. See the bird's eye view in Table 5-1.

Table 5-1: Backup Levels

Backup Level | Description
Level 0 | A full backup; backs up all files.
Level 1 | The first incremental level; gets all files that have changed since the last level 0 backup. It acts like a differential by getting everything that has changed since the level 0/full backup.
Levels 2-9 | Backs up whatever files have changed since the next lower level backup. Can work as an incremental if the levels are used sequentially, or as a differential if the same level number is used every time. Can also be "mixed" to create hybrid backup strategies.

The idea here is that a backup level of 1 to 9 will back up everything that has changed since the next lower level that it can find. For example, a backup of level 3 will back up everything that has changed since the last level 2 backup. If there was no level 2 backup done, then it falls back to the last level 1; if no level 1, then level 0. Look at Figure 5-3 to see how you could use backup levels to provide a pure incremental backup strategy.

Figure 5-3: Using sequential levels to achieve incremental backups.

This level-based backup schedule looks just like the Figure 5-1 incremental backup configuration, right?

Now, knowing how these levels work, imagine what you would get if you ran the following level-based backup schedule:

S | S | M | T | W | T | F
0 | - | 2 | 2 | 2 | 2 | 2

Since a backup level seeks the file changes since the next level lower than itself, a 0 2 2 2 2 2 backup schedule would behave like a pure differential backup, backing up all changes since the last full backup every night.
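To make the rule concrete, here is a minimal sketch in Python of how a level picks its reference point. The function names (`reference_backup`, `describe`) are made up for illustration and are not part of dump; the sketch assumes only the rule described above, that a level above 0 looks back to the most recent backup of a lower level.

```python
def reference_backup(levels, i):
    """Return the index of the backup that backup i is taken relative to.

    A level-N backup (N > 0) captures changes since the most recent
    earlier backup with a strictly lower level. A level 0 (or a backup
    with nothing lower before it) has no reference, so None is returned.
    """
    for j in range(i - 1, -1, -1):
        if levels[j] < levels[i]:
            return j
    return None


def describe(levels):
    """Pair each backup's level with the level it is relative to."""
    result = []
    for i, level in enumerate(levels):
        ref = reference_backup(levels, i)
        result.append((level, None if ref is None else levels[ref]))
    return result


incremental = describe([0, 1, 2, 3, 4, 5])   # sequential levels
differential = describe([0, 2, 2, 2, 2, 2])  # same level every night
print(incremental)
print(differential)
```

In the sequential schedule each night refers back to the night before; in the 0 2 2 2 2 2 schedule every night refers back to the level 0, which is exactly the differential behavior.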

So, with numerical backup levels, we can control how much data actually goes to tape. Let's mix and match to see if we can get the most effective and efficient blend: full, incrementals, and differentials.

The Power of TOH

The classic "best scenario" for balancing the most efficient use of tape, data assurance, and file redundancy over time is referred to as the Tower of Hanoi (TOH) backup schedule. A formal description of the full TOH schedule can be found at www.pcmag.com/article2/0,1759,1155464,00.asp.

Implementing a full TOH schedule can be a real pain with a basic command-line tool like dump and no advanced index/tape pool tracking software. However, if you're on a budget and don't want to use the fancier backup suites that we discuss later, consider the simplified and modified TOH tape rotation schedule that we show here. It does nicely for a production server environment, using nothing but the included dump and restore backup utilities. The TOH schedule proposed here will keep your data safe and secure, and will keep you from having to run full/level 0 or level 1 backups on weekdays (which would slow your systems and possibly network access). I am representing our modified TOH schedule here in a 3-week cycle, which looks something like this (this schedule assumes an early morning start time, such as 1:34 A.M.):

Sat | Su | M | T | W | T | F |
0C | - | 3 | 2 | 5 | 4 | 7 |
1R | - | 3 | 2 | 5 | 4 | 7 |
1R | - | 3 | 2 | 5 | 4 | 7 | -> start over

The idea here is that you start off the 3-week cycle with a level-0 (full) backup, making a dated Copy (the C in the schedule) of this first level-0 for later. The rest of the first week follows the modified TOH schedule. On the second Saturday, you perform a level-1 and date it; this captures everything that has changed since the original level-0. At this point you can safely do a full restore with nothing more than your current level-1 and last Saturday's level-0. Next you Rotate (the R in the schedule) the first week's tape set, along with the dated copy of your level-0, off-site to a fire safe for disaster recovery. For the rest of the second week, follow the 1, 3, 2, 5, 4, 7 schedule. On the third Saturday you repeat the previous Saturday's tasks: complete a level-1, date it, and rotate the second week's tapes out for off-site storage. You could now do a full restore to date with your second level-1 and the original level-0. Continue with the TOH schedule through the end of the third week, then bring back the first week's tapes (minus the copy of the level-0) to do it all over again.
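If you script this rotation, the only real logic is mapping a calendar date to a level. The following is a rough sketch, not a finished tool: the names `CYCLE` and `level_for_date` are hypothetical, the cycle start date is an arbitrary Saturday chosen for the example, and it assumes a level 7 on every Friday, which is what the 1, 3, 2, 5, 4, 7 sequence described above suggests.

```python
from datetime import date

# Levels for Saturday through Friday; None means no backup (Sunday).
WEEK_ONE = [0, None, 3, 2, 5, 4, 7]
LATER_WEEKS = [1, None, 3, 2, 5, 4, 7]

# 21 days in all; after the third Friday the cycle starts over.
CYCLE = WEEK_ONE + LATER_WEEKS + LATER_WEEKS


def level_for_date(day, cycle_start):
    """Return the backup level for a date, or None if no backup runs.

    cycle_start must be the Saturday on which a level 0 was (or will be)
    taken; every later date is located by its offset within the cycle.
    """
    offset = (day - cycle_start).days % len(CYCLE)
    return CYCLE[offset]


start = date(2024, 1, 6)  # an example Saturday, chosen arbitrarily
print(level_for_date(date(2024, 1, 12), start))  # first Friday
print(level_for_date(date(2024, 1, 13), start))  # second Saturday
```

A nightly job could call something like this and pass the result to dump as the level, skipping the run when the result is None. The copy and off-site rotation steps on Saturdays still have to be done by hand.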

Tip 

I recommend rotating tape sets out like this and when bringing them back for reuse, keeping Saturday's 0-1-1 tapes all at the remote storage site. This allows you to go back to any week of the year and fetch files from that week's Saturday with no more than a copy of the level-0 that they kept, or the level-0 plus level-1. This method will take only around 70 tapes per year and will give you an unlimited archive ability. At the end of each year you can even reabsorb all the archive tapes for reuse except for a level-0 for each quarter.

While this is not a pure TOH implementation (it's much simpler), this modified TOH backup strategy has several advantages:

  • Less restore work and less to go wrong: If you need to do a restore on Friday of the first week, for example, then you only need tapes 0, 2, 4, and last night's 7, instead of 0, 1, 2, 3, 4, and 5 as would be required if doing a pure incremental (0 to 5 as in Figure 5-3).

  • Less tape waste: You're not copying the same files onto the tape night after night as with the differentials or nightly full backups. Less waste of tape and backup time means saved money and added capacity for growth.

  • Off-site rotation: Into the second week of the schedule, simply keep the original level-0 tape, and rotate a copy of tape 0 along with the Monday-Friday tapes off-site to a fire safe or vault for disaster recovery. You can still perform a full restore with the original level-0 and the first Saturday's level-1 (and the 3, 2, 5... moving forward).

  • Data security: Files that have changed since the level 0 make it to tape at least twice in this scheme.
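The tape counts in the first bullet can be checked mechanically. Here is a small sketch (the function name `restore_chain` is invented for this example) that walks backward from the most recent backup and keeps only the tapes a restore actually needs, using the same lower-level rule the levels themselves follow.

```python
def restore_chain(levels):
    """Return the levels of the tapes needed for a restore, oldest first.

    Start from the most recent backup and walk backward, keeping each
    backup whose level is strictly lower than the last one kept. Anything
    skipped was superseded by a later, lower-level backup.
    """
    if not levels:
        return []
    chain = [len(levels) - 1]
    for j in range(len(levels) - 2, -1, -1):
        if levels[j] < levels[chain[-1]]:
            chain.append(j)
    chain.reverse()
    return [levels[i] for i in chain]


print(restore_chain([0, 3, 2, 5, 4, 7]))   # modified TOH, first Friday
print(restore_chain([0, 1, 2, 3, 4, 5]))   # pure incremental
print(restore_chain([0, 2, 2, 2, 2, 2]))   # pure differential
```

For the first week of the modified schedule this yields tapes 0, 2, 4, and 7; the pure incremental needs all six tapes, and the differential needs only two at the cost of rewriting the same changes every night.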

Note 

After each level-0 backup, you need to make a "dated clone" of your level-0 tape to go with your off-site rotation tape set, keeping the original onsite for restores. This will guarantee that anyone needing to do a full recovery from the off-site tape set will be able to do so with your safe, clearly labeled and dated 0, (1), 2, 4, 7 off-site tapes.

The main disadvantage of a TOH/levels-based backup strategy is that it usually takes more than one night's tape(s) to do a full restore. And that's about it really.

If you walk into most enterprise environments, you'll find that even if they're running some high-end, GUI-based backup suite, this type of backup strategy is what the little software man behind the GUI curtain is actually doing.

You can take a closer look at how to implement this form of backup strategy in the Backup Tools and How to Use Them section.

Tip 

A 4- or 8-mm tape on a helical scanning backup drive can be used a couple of hundred times (for in-depth info, see www.datman.com/tbul/dmtb_035.htm), but after this you need to replace the tape with a new one before it starts encountering errors. Any time you introduce a new tape to your facility, you need some system to tell when it came into use and thus when it needs to be replaced. Most higher end backup suites do this for you automatically, but a simple manual tape-aging system that I've found useful is color-coded circle stickers from any office supply store. Each color represents a given month. When you put a tape into commission, write the year on the color dot so you'll know how long you've been using it (only put the dot on the front edge of the tape, never on the flat body). Starting out, to help determine your usage frequency, put a tick mark on the circle every time you back up to it. After a few months you'll know how many uses per month your tapes get for your given backup strategy, so you can figure out how long it takes for a tape in your backup schedule to "expire." After this, you can just put a color dot and year on all new tapes and know (ahead of time) when to dispose of them and get new tapes.
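The arithmetic behind the tick-mark method is simple enough to sketch. The function name below is hypothetical, and the default of 200 uses is only a stand-in for "a couple of hundred"; substitute whatever limit your tape manufacturer actually rates.

```python
def months_until_replacement(ticks, months_observed, max_uses=200):
    """Estimate how many more months a tape can stay in rotation.

    ticks           -- tick marks counted on the sticker so far
    months_observed -- how many months those ticks cover
    max_uses        -- assumed usable life of the tape, in backups
    """
    if ticks <= 0 or months_observed <= 0:
        raise ValueError("need at least one use over a positive period")
    uses_per_month = ticks / months_observed
    remaining_uses = max(max_uses - ticks, 0)
    return remaining_uses / uses_per_month


# A tape with 12 ticks after 3 months is used about 4 times a month.
remaining = months_until_replacement(ticks=12, months_observed=3)
print(remaining)
```

Once you know the uses-per-month figure for each slot in your rotation, the sticker's month and year are enough to tell when a tape is due for replacement.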

Backup Media Types and Hardware

You can find a huge array of modern tape drives and media types to choose from. No single medium is best for all applications; you need the right tool for the right job. For example, while a CD-RW might be fine for backing up your mom's My Documents folder, a few binaries, and config files on your server, it is really not considered to be a production-quality backup solution. A DVD+/-RW is fine for offloading your MP3 collection to archive or maybe even cloning your boot partition for doing emergency boot recoveries, but these forms of backup are limited and can be inherently expensive and too fragile for long-term use. They're not really suited for continual server backups, unless you're doing something special such as boot images or random access configuration file archiving. CD and DVD+/-RWs can be considered in some low-end backup scenarios, but are not considered a safe or viable solution for single production server or multiserver backup arrangements.

Some people prefer using hard disk drive based storage for disk-to-disk cloning, archiving, or nightly full backups. While this practice may be okay in the mind of new sysadmins, it really is not practicing safe backups. Unless something like external or hot swap drives are used, the media is not easily removable/storable; it usually stays in the machine, spun up and running. This means that problems such as electrical damage, fire, or flooding can all render both source and backup drives inoperable. Even if you use some form of external or removable media, if you were to try to employ a large-scale removable hard drive backup and rotation scenario, you would quickly find it to be cost prohibitive and, given the fragile nature of hard drives, dangerous. In fact, if you need rotated and secure copies of data, this is actually one of the most expensive ways of backing up large volumes of data over time, and it is also awkward and frustrating to implement. The only way drive-based backups tend to be done today is via an external SAN (storage area network), which we'll talk about later, but a SAN is not really considered to be a classic disk-to-disk solution.

Tip 

Some determinedly creative administrators create a hybrid disk and tape backup solution that can have all the benefits of online or "drag-n-drop" access (via the backup disk), while using a local or remote tape solution to back up the backup disk for tape rotation and off-site storage purposes. This offers a nice functional compromise, as the backup drive does not have the "open file" issues that more traditional live backups have. This type of backup can be easily scripted, and you can even do prebackup database dumps or "hot copies" (such as using mysqlhotcopy) to static files on the backup drive before the backup drive-to-tape portion of the script runs, thus backing up the database also.

So what type of backup media do you need? This varies a lot based on your data size, type of data, pocketbook, and the scale of your operation. Let's take a look at a few of the common business models out there, see which fits you best, and then, based on that and other variables, work out which media type you should consider.

Small to Medium Backups

Single or multitape drive backups are commonly used for small to medium server configurations. This is generally effective for a system of four to six servers; beyond that, you should consider using a tape library or mass robotic tape changer type arrangement. The up-front costs of implementing small, medium, or large tape backup arrangements tend to be higher than that of other less traditional forms of backup (such as DVD+/-R or disk-to-disk); however, tapes are used because they end up costing less per GB and are more robust than other "neat idea" solutions like disk-to-disk or CD/DVDs. When it comes down to it, the cost of tapes over the long haul is cheaper and they are easier to handle, rotate, and move. They also give you greater flexibility when growing into a large-scale backup, rotation, and off-site storage arrangement than do most other forms of backup. Table 5-2 compares the advantages and disadvantages of various tape drive and media types for small-, medium-, and large-scale backups.

Table 5-2: Backup Media Type Comparison

Media | Drive Price | Media Price | Capacity Nat./Comp. | $/GB | $/10TB

CD-RW | $50 | $0.45 | 0.700GB | $0.64 | $6,450
  Pro: Good for small workstation/personal data
  Con: Expensive and cumbersome for large/servers, fragile (scratches). Recent studies also indicate media degradation in as little as 18 months

DVD-RW | $100 | $1.50 | 4.7GB | $0.39 | $4,000
  Pro: Good for server config backups, DBs, and recovery data and boot tools
  Con: Still expensive, easy to outgrow, fragile (scratches)

Hard Disk | $150 | N/A | 150GB | $1.00 | $10,000
  Pro: Good for "instant recoveries," live/instant random access. Fast
  Con: Very expensive and awkward for archiving/rotation, easy to outgrow

DDS3 Tape | $400 | $4 | 12/24GB | $0.20 | $2,400
  Pro: Cheap drive and price/GB, good for small jobs, easy to find media
  Con: Outdated, limited growth potential, slow backup/restore speed

DDS4 | $500 | $8 | 20/40GB | $0.24 | $2,900
  Pro: Cheaper drive and media. Good current solution for small or multiple small servers, fairly common media
  Con: Represents last drive in "end of life" technology. 2-4 years remaining maximum

AIT-1 Tape | $700 | $40 | 35/70GB | $0.67 | $7,400
  Pro: Cheaper end of AIT drives, good for secondary and small-medium servers
  Con: Comparative ROI of media over time is low because of low capacity

AIT-2 | $1,100 | $45 | 50/100GB | $0.53 | $6,400
  Pro: Good middle road tape with growth potential. Good for changers
  Con: Media still expensive/GB, in decline of acceptance

AIT-3 | $3,000 | $50 | 100/200GB | $0.29 | $5,900
  Pro: Higher end of AIT drives, great for growth servers and changers, good tape/capacity ROI
  Con: Expensive initial cost of drive somewhat prohibitive

LTO-1 Tape | $2,800 | $35 | 100/200GB | $0.21 | $4,900
  Pro: Great cost/GB, good for mid-high end servers, and for streaming
  Con: Newer tech, hard to find media, not good for start/stop

LTO-2 Tape | $3,800 | $75 | 200/400GB | $0.22 | $6,000
  Pro: Fair costs, good future growth direction for changers, and for streaming
  Con: Expensive drive, newer tech, hard to find media, not good for start/stop

Note 

Price per gigabyte of the media is probably the most important figure to look at when starting production-grade backups. However, keep in mind that prices for newer, more expensive tapes (such as AIT-3 and LTO) often come down quickly 2 to 3 years after introduction. Plan ahead.

The rightmost column in Table 5-2 includes the price of the drive. This figure is not a linear comparison but just a ballpark figure to show how much your first 10TB of backups on this drive and media will cost you. You really need to account for your gigabytes-per-month needs and project them over the course of a year to get a useful comparison for your scenario.
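The tape and optical rows of Table 5-2 appear to follow a simple formula: the drive price plus 10,000GB of media at the listed $/GB. That reading is inferred from the published numbers rather than stated outright, so treat the sketch below as an approximation; the function name is made up for the example, and the same function can be pointed at your own projected yearly volume instead of 10TB.

```python
def first_backups_cost(drive_price, price_per_gb, gigabytes=10_000):
    """Ballpark cost of a drive plus enough media to hold `gigabytes`."""
    return drive_price + price_per_gb * gigabytes


# (drive price, $/GB) pairs copied from Table 5-2
table_rows = {
    "CD-RW": (50, 0.64),
    "DDS3": (400, 0.20),
    "AIT-3": (3000, 0.29),
    "LTO-1": (2800, 0.21),
    "LTO-2": (3800, 0.22),
}

for media, (drive, per_gb) in table_rows.items():
    print(media, round(first_backups_cost(drive, per_gb)))

# Projecting a hypothetical 150GB/month of backups over one year on DDS3
yearly = first_backups_cost(400, 0.20, gigabytes=150 * 12)
print(round(yearly))
```

The 150GB/month figure is only a placeholder. The point is that a drive with a high purchase price but cheap media can lose at small volumes and win at large ones, which is why the projection over your own volume matters more than the 10TB column.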

Note 

Streaming means that when you're taking a backup, you feed the tape drive an uninterrupted stream of data. If the tape drive is much faster than the data it is being fed, then the tape will have to slow, stop, reverse (while buffering the incoming data), pick up where it left off, and start going again. This causes unneeded wear and tear on the drive and tape and, depending on your scenario, can greatly slow your backup process. Fast linear tape drives such as LTO and DLT are a good fit for streaming backups, while helical scan tape drives such as DDS, AIT, or Mammoth are better for start/stop or nonstreaming scenarios.

More information on various Linux-compatible tape drive tests and specs can be found at the Linux Tape Device Certification Program site: www.linuxtapecert.org/drives.php.

Medium- to Large-Scale Backup Solutions

It's ironic that as the computing industry has gone from a centralized (mainframe) environment to a distributed one (PCs and servers) over the past 30 years, backups have moved in the opposite direction. Even 10 years ago, each new server usually came bundled with its own tape backup drive. Now, doing backups to a local tape drive on every server is considered wasteful. Even from a purely administrative perspective, backup administrators would rather have centralized backup control and tape management over multiple remote systems. Innovations in the networking arena in the past 10 years have made this natural shift toward remote centralized backups possible.

Some medium to large corporations find that implementing network backups is a good way of minimizing tape drive costs and the complexity of tracking dozens of tape pools. In a nutshell, each machine runs a network-backup client program that talks to a centralized backup server. This backup server is in turn hooked up to a single large robotic tape library (that is, a bunch of tapes, tape drives, and one or more robotic arms in a big cabinet). In this arrangement, your many servers all talk to the single centralized backup server daemon, which is controlled, scheduled, and monitored centrally. This is often implemented with either DDS3/4 4-mm tape (on the low end) or AIT/Mammoth/LTO tape and drives (at the high end). This type of setup can be configured with tapes and all for under $20,000.

Another larger enterprise backup solution is a Backup SAN (Storage Area Network), or what's sometimes called LAN-Free Backup. In the same way that a LAN allows distributed access to computing resources, a SAN allows for distributed access to storage resources. A SAN is a pricier arrangement in which all the production servers or machines to be backed up use a common (but "off-LAN") pool of tape storage, attached via high-speed Fibre Channel or iSCSI (Internet SCSI, which runs over a second, dedicated TCP/IP backup network), that forms a Backup SAN (see Figure 5-4). This is nice as it gives you centralized backup management without the normal network/LAN congestion and timing issues you have when backing up over the corporate LAN. A tape backup SAN is segmented for each of the production servers or machines. Each machine on the SAN has a Fibre Channel HBA (or a dedicated high-speed NIC with iSCSI) and the backup client software installed and configured. The backup server sees all of the tape SAN client machines, and schedules and directs the backup clients on those machines to back up to the tape SAN as if it were local.

The tape backup library's large set of parallel streaming tape drives and/or tape changer makes quick work of getting the data to tape, indexed, and accounted for. This type of arrangement (streaming or continuous writing of data) makes good use of high-speed, linear tape technologies such as DLT (Digital Linear Tape) or LTO (Linear Tape Open standard, a.k.a. "Ultrium") because the high speeds needed for nonstop streaming are guaranteed by the throughput of the SAN to the backup system. This is a higher end solution and can be pulled together for something on the scale of $80,000 to $120,000 on the low end to several million dollars on the high end.

Figure 5-4: A Linux server/machine running a backup client talking to a centralized backup SAN via high-speed fibre channel.

Other Backup Hardware Tips

Here are several words of wisdom regarding hardware-related backup issues, so you can use the pain and suffering of others in your pursuit of backup Zen.

SCSI or IDE Tape Drives?

Most serious administrators use SCSI tape drives. Nowadays, even higher end tape drives such as the AIT drives are coming with an optional cost-saving IDE/ATAPI interface. With blazing fast ATA-UDMA speeds being what they are these days, IDE/ATAPI tape drives are looking less vile than they once were, especially since ATA with UDMA no longer burdens the system with processor-taxing Programmed Input/Output (PIO) mode data movement. However, running ATAPI on Linux is still not an ideal setup. Sometimes ATAPI drives work, but they often fail or lock up. It's been my observation that administrators who choose ATAPI tape drives for backup tend to care more about the price of their system than they do about the quality of their backups. That being said, if you enjoy living on the edge and want ATAPI tape drive based backups anyway, just be sure to disable ide-tape kernel driver support (rmmod ide-tape) and use SCSI emulation instead (insmod scsi_mod; insmod ide-scsi; insmod st). This will change your tape device from /dev/ht0 to /dev/st0. And be sure to make these changes permanent in your /etc/modules.conf file. Good luck.

Tip 

If you're going to risk it and run an IDE/ATAPI interface tape drive, at least don't put it on the primary ATA interface with your system drive. Either put it by itself (or with a CD-ROM) on the secondary interface (cable) or get yourself a separate PCI ATA hard drive card for $20 and put it on that. You don't want your tape drive on the same controller as your production OS and data drives. Doing so will simply slow down your entire system.

Caution 

Feeling lucky with IDE/ATAPI tape drives? If you want to try these, the ATAPI AIT drives are fairly popular. Just do not use the low-end IDE/ATAPI (or even SCSI) versions of Travan- or QIC-based tape drives for production quality backups. These drives have caused many problems in the production backup arena and using them is begging for trouble. Besides lack of error feedback and high bit error rates, they require regular tape retensioning (via commands like mt -f /dev/st0 reten) and will plague you with frequent backup failures or, even worse, no "failures" at all, just bad backups! And did I mention that the average Travan tape costs $30 to $40 as compared to a DDS3/4 tape price of $4 to $8?

Dedicated Backup Interface

If you're running SCSI-based tape drives, be sure to put the SCSI tape drive on its own SCSI bus. I don't mean just external, or on the narrow external connector of a wide channel; I mean have more than one SCSI bus, usually another dedicated SCSI card or a separate SCSI bus on a multichannel SCSI controller. You definitely don't want to put a SCSI tape drive (or other slow r/w devices) on the same SCSI bus as your high-speed SCSI or RAID devices. Doing so will just slow down your whole system, especially during backups. Just get yourself a cheap little SCSI card that will work with your drive and use that. Be sure to get a narrow card if you have a narrow tape drive, or a wide card if you have a wide tape drive. Some server-grade motherboards actually have a secondary SCSI bus built in just for this purpose. Some newer SCSI tape drives even run high-speed SCSI LVD or UltraXXX. Be sure to check this out on the drive and card specs before making the purchase. Almost any decent (or even older) SCSI card will work with the Linux kernel nowadays, at least for tape backups.

Tape Drive Error Flash Codes

Be sure to get a good tape drive that has some type of "LED flash codes" or "error codes" so that you can notice problems just by looking at the tape drive. Problems such as incorrect tape type, dirty/cleaning required, and other r/w errors are most often detected and fixed when a drive communicates them directly to the backup administrator or tape rotation staff via direct visual cues or LED error codes. Many times when your tape drives are remote from the backup software, your backup software may not report on such errors correctly, or the guy changing the tapes may just leave a problem drive alone if it keeps spitting a tape out or the drive just locks up on him. These types of critical errors are best reported directly from the drive itself and error flash codes seem to be the best way to communicate them.

Tape Drive Cleaning

Oh, don't forget to keep your tape drive clean and failure-free! When you buy backup tapes, spend $60 or more and get yourself a ten-pack of high-quality cleaning tapes too. They're a must in a production backup environment. Always keep at least two on hand for your specific tape drive. Consult your tape drive manual for further cleaning and maintenance details.

Set up a policy-based system so that after every X number of backups you load a cleaning tape and let the drive take care of itself. Follow your specific tape drive's recommended cleaning schedule.

One more caution: Don't lend your cleaning tapes out! If you lend one to someone in another department, you never know where it has been or what it has cleaned when you get it back. You don't want an 8-mm cleaning tape that just cleaned some guy's coffee-soaked Hi8 camcorder coming back to your department to clean your server tape drives! Keep a couple of extras on hand to just give to people who want to borrow one of your production cleaning tapes.

