| 1 | .ie t .ds o \(bu |
| 2 | .el .ds o o |
| 3 | .de hP |
| 4 | .IP |
| 5 | \h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c |
| 6 | .. |
| 7 | .TH rsync-backup 8 "7 October 2012" rsync-backup |
| 8 | .SH NAME |
| 9 | rsync-backup \- back up files using rsync |
| 10 | .SH SYNOPSIS |
| 11 | .B rsync-backup |
| 12 | .RB [ \-nv ] |
| 13 | .RB [ \-c |
| 14 | .IR config-file ] |
| 15 | .SH DESCRIPTION |
| 16 | The |
| 17 | .B rsync-backup |
| 18 | script is a backup program of the currently popular |
| 19 | .RB ` rsync (1) |
| 20 | .BR \-\-link-dest ' |
| 21 | variety. It uses |
| 22 | .BR rsync 's |
| 23 | ability to create hardlinks from (apparently) similar existing local |
| 24 | trees to make incremental dumps efficient, even from remote sources. |
| 25 | Restoring files is easy because the backups created are just directories |
| 26 | full of files, exactly as they were on the source \(en and this is |
| 27 | verified using the |
| 28 | .BR fshash (1) |
| 29 | program. |
| 30 | .PP |
| 31 | The script does more than just run |
| 32 | .BR rsync . |
| 33 | It is also responsible for creating and removing snapshots of volumes to |
| 34 | be backed up, and expiring old dumps according to a user-specified |
| 35 | retention policy. |
| 36 | .SS Installation |
| 37 | The idea is that the |
| 38 | .B rsync-backup |
| 39 | script should be installed and run on a central backup server with local |
| 40 | access to the backup volumes. |
| 41 | .PP |
| 42 | The script should be run with full (root) privileges, so that it can |
| 43 | correctly record file ownership information. The server should also be |
| 44 | able to connect via |
| 45 | .BR ssh (1) |
| 46 | to the client machines, and run processes there as root. (This is not a |
| 47 | security disaster. Remember that the backup server is, in the end, |
| 48 | responsible for the integrity of the backup data. A dishonest backup |
| 49 | server can easily compromise a client which is being restored from |
| 50 | corrupt backup data.) |
| 51 | .SS Command-line options |
| 52 | Most of the behaviour of |
| 53 | .B rsync-backup |
| 54 | is controlled by a configuration file, described starting with the |
| 55 | section named |
| 56 | .B Configuration commands |
| 57 | below. |
| 58 | But a few features are controlled by command-line options. |
| 59 | .TP |
| 60 | .B \-h |
| 61 | Show a brief help message for the program, and exit successfully. |
| 62 | .TP |
| 63 | .B \-V |
| 64 | Show |
| 65 | .BR rsync-backup 's |
| 66 | version number and some choice pieces of build-time configuration, and |
| 67 | exit successfully. |
| 68 | .TP |
| 69 | .BI "\-c " conf |
| 70 | Read |
| 71 | .I conf |
| 72 | instead of the default configuration file (shown as |
| 73 | .B conf |
| 74 | in the |
| 75 | .B \-V |
| 76 | output). |
| 77 | .TP |
| 78 | .B \-n |
| 79 | Don't actually take a backup, or write proper logs: instead, write a |
| 80 | description of what would be done to standard error. |
| 81 | .TP |
| 82 | .B \-v |
| 83 | Produce verbose progress information on standard output while the backup |
| 84 | is running. This keeps one amused while running a backup |
| 85 | interactively. In any event, |
| 86 | .B rsync-backup |
| 87 | will report failures to standard error, and otherwise run silently, so |
| 88 | it doesn't annoy unnecessarily if run by |
| 89 | .BR cron (8). |
| 90 | .SS Backup process |
| 91 | Backing up a filesystem works as follows. |
| 92 | .hP \*o |
| 93 | Make a snapshot of the filesystem on the client, and ensure that the |
| 94 | snapshot is mounted. There are some `trivial' snapshot types which use |
| 95 | the existing mounted filesystem, and either prevent processes writing to |
| 96 | it during the backup, or just hope for the best. Other snapshot types |
| 97 | require the snapshot to be mounted somewhere distinct from the main |
| 98 | filesystem, so that the latter can continue being used. |
| 99 | .hP \*o |
| 100 | Run |
| 101 | .B rsync |
| 102 | to copy the snapshot to the backup volume \(en specifically, to |
| 103 | .IB host / fs / new \fR. |
| 104 | If this directory already exists, then it's presumed to be debris from a |
| 105 | previous attempt to dump this filesystem: |
| 106 | .B rsync |
| 107 | will update it appropriately, by adding, deleting or modifying the |
| 108 | files. This means that retrying a failed dump \(en after fixing whatever |
| 109 | caused it to go wrong, obviously! \(en is usually fairly quick. |
| 110 | .hP \*o |
| 111 | Run |
| 112 | .B fshash |
| 113 | on the client to generate a `digest' describing the contents of the |
| 114 | filesystem, and send this to the server as |
| 115 | .IB host / fs / new .fshash \fR. |
| 116 | .hP \*o |
| 117 | Release the snapshot: we don't need it any more. |
| 118 | .hP \*o |
| 119 | Run |
| 120 | .B fshash |
| 121 | over the new backup, writing the resulting digest to |
| 122 | .BI tmp/fshash. host . fs . date \fR. |
| 123 | This gives us a digest for what the backup volume actually stored. |
| 124 | .hP \*o |
| 125 | Compare the two |
| 126 | .B fshash |
| 127 | digests. If they differ then dump the differences to the log file and |
| 128 | report a backup failure. (Backups aren't any good if they don't |
| 129 | actually back up the right thing. And you stand a better chance of |
| 130 | fixing them if you know that they're going wrong.) |
| 131 | .hP \*o |
| 132 | Commit the backup, by renaming the dump directory to |
| 133 | .IB host / fs / date |
| 134 | and the |
| 135 | .B fshash |
| 136 | digest file to |
| 137 | .IB host / fs / date .fshash \fR. |
| 138 | .PP |
| 139 | The backup is now complete. |
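.PP
As a rough illustration of the last two steps, the following Bash
sketch compares two digest files and commits the dump only if they
match. It is not taken from the script: the function name
.BR verify_and_commit ,
its argument order, and the use of
.BR cmp (1)
and
.BR diff (1)
are assumptions made for the example.
.nf
```shell
# Sketch only: the function name and its arguments are hypothetical.
# verify_and_commit DIR LOCALDIGEST DATE LOG
#   DIR          the host/fs directory holding new and new.fshash
#   LOCALDIGEST  the digest computed over the backup volume
#   DATE         the name under which to commit the dump
#   LOG          the log file to which differences are appended
verify_and_commit () {
  local dir=$1 localdigest=$2 date=$3 log=$4

  # Compare the client's digest with the one computed on the server.
  if ! cmp -s "$dir/new.fshash" "$localdigest"; then
    # Record the differences and report failure.  The new tree is
    # left in place so that a later retry can update it.
    diff -u "$dir/new.fshash" "$localdigest" >>"$log" || true
    echo "backup of $dir failed: fshash mismatch" >&2
    return 1
  fi

  # Commit: rename the tree and its digest into place.
  mv "$dir/new" "$dir/$date" &&
    mv "$dir/new.fshash" "$dir/$date.fshash"
}
```
.fi
.PP
Leaving the
.B new
directory behind on failure matches the behaviour described above, where
a retried dump updates the existing debris rather than starting again.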
| 140 | .SS Configuration commands |
| 141 | The configuration file is simply a Bash shell fragment: configuration |
| 142 | commands are shell functions. |
| 143 | .TP |
| 144 | .BI "addhook " hook " " command |
| 145 | Arrange that the named |
| 146 | .I hook |
| 147 | runs the given |
| 148 | .IR command . |
| 149 | See |
| 150 | .B runhook |
| 151 | for more details. |
| 152 | .TP |
| 153 | .BI "backup " "fs\fR[:\fIfsarg\fR] ..." |
| 154 | Back up the named filesystems. The corresponding |
| 155 | .IR fsarg s |
| 156 | may be required by the snapshot type. |
| 157 | .TP |
| 158 | .BI "defhook " hook |
| 159 | Define a new hook named |
| 160 | .IR hook . |
| 161 | See |
| 162 | .B addhook |
| 163 | and |
| 164 | .B runhook |
| 165 | for more information. |
| 166 | .TP |
| 167 | .BI "host " host |
| 168 | Future |
| 169 | .B backup |
| 170 | commands will back up filesystems on the named |
| 171 | .IR host . |
| 172 | To back up filesystems on the backup server itself, use its hostname: |
| 173 | .B rsync-backup |
| 174 | will avoid inefficient and pointless messing about with |
| 175 | .BR ssh (1) |
| 176 | in this case. |
| 177 | This command clears the |
| 178 | .B like |
| 179 | list and the remote |
| 180 | .B user |
| 181 | name, and resets the retention policy to its default (i.e., the |
| 182 | policy defined prior to the first |
| 183 | .B host |
| 184 | command). |
| 185 | .TP |
| 186 | .BI "like " "host\fR ..." |
| 187 | Declare that subsequent filesystems are `similar' to like-named |
| 188 | filesystems on the named |
| 189 | .IR host s, |
| 190 | and that |
| 191 | .B rsync |
| 192 | should use those trees as potential sources of hardlinkable files. Be |
| 193 | careful when using this option without |
| 194 | .BR rsync 's |
| 195 | .B \-\-checksum |
| 196 | option: an erroneous hardlink will cause the backup to fail. (The |
| 197 | backup won't be left silently incorrect.) |
| 198 | .TP |
| 199 | .BI "retain " frequency " " duration |
| 200 | Define part of a backup retention policy: backup trees of the |
| 201 | .I frequency |
| 202 | should be kept for the |
| 203 | .IR duration . |
| 204 | The |
| 205 | .I frequency |
| 206 | can be |
| 207 | .BR daily , |
| 208 | .BR weekly , |
| 209 | .BR monthly , |
| 210 | or |
| 211 | .B annually |
| 212 | (or |
| 213 | .BR yearly , |
| 214 | which means the same); the |
| 215 | .I duration |
| 216 | may be any of |
| 217 | .BR week , |
| 218 | .BR month , |
| 219 | .BR year , |
| 220 | or |
| 221 | .BR forever . |
| 222 | Expiry considers each existing dump against the policy lines in order: |
| 223 | the last applicable line determines the dump's fate \(en so you should |
| 224 | probably write the lines in decreasing order of duration. |
| 225 | .RS |
| 226 | .PP |
| 227 | Groups of |
| 228 | .B retain |
| 229 | commands between |
| 230 | .B host |
| 231 | and/or |
| 232 | .B backup |
| 233 | commands collectively define a retention policy. Once a policy is |
| 234 | defined, subsequent |
| 235 | .B backup |
| 236 | operations use the policy. The first |
| 237 | .B retain |
| 238 | command after a |
| 239 | .B host |
| 240 | or |
| 241 | .B backup |
| 242 | command clears the policy and starts defining a new one. The policy |
| 243 | defined before the first |
| 244 | .B host |
| 245 | is the |
| 246 | .I default |
| 247 | policy: at the start of each |
| 248 | .B host |
| 249 | stanza, the policy is reset to the default. |
| 250 | .RE |
| 251 | .TP |
| 252 | .BI "retry " count |
| 253 | The |
| 254 | .B live |
| 255 | snapshot type (see below) doesn't prevent a filesystem from being |
| 256 | modified while it's being backed up. If this happens, the |
| 257 | .B fshash |
| 258 | pass will detect the difference and fail. If the filesystem in question |
| 259 | is relatively quiescent, then maybe retrying the backup will result in a |
| 260 | successful consistent copy. Following this command, a backup which |
| 261 | results in an |
| 262 | .B fshash |
| 263 | mismatch will be retried up to |
| 264 | .I count |
| 265 | times before being declared a failure. The default is to retry once, |
| 266 | clearing mismatching files from the |
| 267 | .BR fshash (1) |
| 268 | caches before the second attempt. |
| 269 | .TP |
| 270 | .BI "runhook " hook " " args\fR... |
| 271 | Invoke the named |
| 272 | .IR hook . |
| 273 | The individual commands on the hook are run, in order, as |
| 274 | .RS |
| 275 | .IP |
| 276 | .I command |
| 277 | .IR args ... |
| 278 | .PP |
| 279 | If any command fails (returns nonzero) then no further commands are run and |
| 280 | .B runhook |
| 281 | fails with the same exit code. |
| 282 | .RE |
| 283 | .TP |
| 284 | .BI "snap " type " " \fR[\fIargs\fR...] |
| 285 | Use the snapshot |
| 286 | .I type |
| 287 | for subsequent backups. Some snapshot types require additional |
| 288 | arguments, which may be supplied here. This command clears the |
| 289 | .B retry |
| 290 | counter. |
| 291 | .TP |
| 292 | .BI "user " name |
| 293 | Specify the user name on the remote host. Without this, calls to |
| 294 | .BR ssh (1) |
| 295 | and |
| 296 | .BR rsync (1) |
| 297 | won't specify any user name, so the default (probably from the |
| 298 | .BR ssh_config (5) |
| 299 | file) will apply. |
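.PP
The commands above might be combined as in the following configuration
fragment. This is a hypothetical example rather than a recommended
setup: the host names
.RB ( alpha ,
.BR beta ,
.BR gamma ),
the volume group
.BR vg0 ,
and the filesystem names and mount points are all invented.
.nf
```shell
# Hypothetical configuration: every host, volume group and
# filesystem name below is invented for illustration.

# Default retention policy, defined before the first host command
# and written in decreasing order of duration.
retain annually forever
retain monthly year
retain weekly month
retain daily week

# A client whose filesystems are LVM volumes in group vg0.
host alpha
snap lvm vg0
backup root home

# A similar machine: also look in alpha's trees for files which
# can be hardlinked.  The like list is cleared by host, so it is
# given afterwards.
host beta
like alpha
snap lvm vg0
backup root home

# A client backed up live, with an explicit remote user name.
# The retry counter is cleared by snap, so it is set afterwards.
host gamma
user root
snap live
retry 3
backup boot:/boot
```
.fi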
| 300 | .SS Configuration variables |
| 301 | The following shell variables may be overridden by the configuration |
| 302 | file. |
| 303 | .TP |
| 304 | .B HASH |
| 305 | The hash function to use for verifying archive integrity. This is |
| 306 | passed to the |
| 307 | .B \-H |
| 308 | option of |
| 309 | .BR fshash , |
| 310 | so it must name one of the hash functions supported by your Python's |
| 311 | .B hashlib |
| 312 | module. |
| 313 | The default is |
| 314 | .BR sha256 . |
| 315 | .TP |
| 316 | .B INDEXDB |
| 317 | The name of a SQLite database initialized by |
| 318 | .BR update-bkp-index (8) |
| 319 | in which an index is maintained of which dumps are on which backup |
| 320 | volumes. If the file doesn't exist, then no index is maintained. The |
| 321 | default is |
| 322 | .IB localstatedir /lib/bkp/index.db |
| 323 | where |
| 324 | .I localstatedir |
| 325 | is the state directory configured at build time. |
| 326 | .TP |
| 327 | .B MAXLOG |
| 328 | The number of log files to be kept for each filesystem. Old log files |
| 329 | are deleted to keep the total number below this bound. The default |
| 330 | value is 14. |
| 331 | .TP |
| 332 | .B METADIR |
| 333 | The metadata directory for the currently mounted backup volume. |
| 334 | The default is |
| 335 | .IB mntbkpdir /meta |
| 336 | where |
| 337 | .I mntbkpdir |
| 338 | is the backup mount directory configured at build time. |
| 339 | .TP |
| 340 | .B RSYNCOPTS |
| 341 | Command-line options to pass to |
| 342 | .BR rsync (1) |
| 343 | in addition to the basic set: |
| 344 | .B \-\-archive |
| 345 | .B \-\-hard-links |
| 346 | .B \-\-numeric-ids |
| 347 | .B \-\-del |
| 348 | .B \-\-sparse |
| 349 | .B \-\-compress |
| 350 | .B \-\-one-file-system |
| 351 | .B \-\-partial |
| 352 | .BR "\-\-filter=""dir-merge .rsync-backup""" . |
| 353 | The default is |
| 354 | .BR \-\-verbose . |
| 355 | .TP |
| 356 | .B SNAPDIR |
| 357 | LVM (and |
| 358 | .BR rfreezefs ) |
| 359 | snapshots are mounted on subdirectories below the |
| 360 | .B SNAPDIR |
| 361 | .IR "on backup clients" . |
| 362 | The default is |
| 363 | .IB mntbkpdir /snap |
| 364 | where |
| 365 | .I mntbkpdir |
| 366 | is the backup mount directory configured at build time. |
| 367 | .TP |
| 368 | .B SNAPSIZE |
| 369 | The volume size option to pass to |
| 370 | .BR lvcreate (8) |
| 371 | when creating a snapshot. The default is |
| 372 | .B \-l10%ORIGIN |
| 373 | which seems to work fairly well. |
| 374 | .TP |
| 375 | .B STOREDIR |
| 376 | Where the actual backup trees should be stored. See the section on |
| 377 | .B Archive structure |
| 378 | below. |
| 379 | The default is |
| 380 | .IB mntbkpdir /store |
| 381 | where |
| 382 | .I mntbkpdir |
| 383 | is the backup mount directory configured at build time. |
| 384 | .TP |
| 385 | .B VOLUME |
| 386 | The name of the current volume. If this is left unset, the volume name |
| 387 | is read from the file |
| 388 | .IB METADIR /volume |
| 389 | once at the start of the backup run. |
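.PP
Since the configuration file is a shell fragment, overriding these
variables is a matter of plain assignment. The values in this sketch
are examples only, not recommendations.
.nf
```shell
# Hypothetical overrides; the values are examples only.
HASH=sha512            # must be a hash known to Python's hashlib
MAXLOG=30              # keep more log files per filesystem
SNAPSIZE=-l20%ORIGIN   # allow more space for LVM snapshots
```
.fi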
| 390 | .SS Hooks |
| 391 | The configuration file can modify the behaviour of the backup in two |
| 392 | main ways: by adding commands to hooks (see the |
| 393 | .B addhook |
| 394 | command); and by redefining shell functions. |
| 395 | .PP |
| 396 | The following hooks are defined. |
| 397 | .TP |
| 398 | .BI "commit " host " " fs " " date |
| 399 | Called during the commit procedure. The backup tree and manifest have |
| 400 | been renamed into their proper places. Typically one would use this |
| 401 | hook to rename files created in a corresponding |
| 402 | .B precommit |
| 403 | command. |
| 404 | .TP |
| 405 | .BI "end " rc |
| 406 | The backup has completed; |
| 407 | .B rsync-backup |
| 408 | will exit with status |
| 409 | .IR rc . |
| 410 | .TP |
| 411 | .BI "precommit " host " " fs " " date |
| 412 | Called after a backup has been verified complete and is about to be |
| 413 | committed. The backup tree is in |
| 414 | .B new |
| 415 | in the current directory, and the |
| 416 | .B fshash |
| 417 | manifest is in |
| 418 | .BR new.fshash . |
| 419 | A typical action would be to create a digital signature on the |
| 420 | manifest. |
| 421 | .TP |
| 422 | .BI "setup " host " " fs " " date |
| 423 | Called when a backup of a particular filesystem is about to start. It |
| 424 | can return with code 99 to skip the backup. |
| 425 | .TP |
| 426 | .B "start" |
| 427 | Invoked before performing any actual dumps (the first time |
| 428 | .B host |
| 429 | is run). |
| 430 | .PP |
| 431 | The following shell functions can be redefined by users. |
| 432 | .TP |
| 433 | .BI "backup_commit_hook " host " " fs " " date |
| 434 | Called from the |
| 435 | .B commit |
| 436 | hook for compatibility. |
| 437 | .TP |
| 438 | .BI "backup_precommit_hook " host " " fs " " date |
| 439 | Called from the |
| 440 | .B precommit |
| 441 | hook for compatibility. |
| 442 | .TP |
| 443 | .BR "whine " [ \-n ] " " \fItext\fR... |
| 444 | Called to report `interesting' events when the |
| 445 | .B \-v |
| 446 | option is in force. The default action is to echo the |
| 447 | .I text |
| 448 | to (what was initially) standard output, followed by a newline unless |
| 449 | .B \-n |
| 450 | is given. |
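.PP
As an illustration of the
.B precommit
and
.B commit
hooks, the following configuration fragment signs the manifest before
the commit and renames the signature afterwards. It is a sketch under
several assumptions: the function names are invented, the use of
.BR gpg (1)
is only one possible way to make a signature, and the
.B commit
hook is assumed to run in the same directory as the
.B precommit
hook.
.nf
```shell
# Hypothetical hook functions; the names are invented.

# precommit: the manifest is new.fshash in the current directory.
sign_manifest () {
  local host=$1 fs=$2 date=$3
  gpg --batch --yes --detach-sign --output new.fshash.sig new.fshash
}
addhook precommit sign_manifest

# commit: the tree and manifest have already been renamed, so
# give the signature a matching name.  This assumes that the
# current directory is unchanged since the precommit hook ran.
rename_signature () {
  local host=$1 fs=$2 date=$3
  mv new.fshash.sig "$date.fshash.sig"
}
addhook commit rename_signature
```
.fi
.PP
If signing fails, the function returns nonzero, which causes the hook to
fail as described under
.BR runhook .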
| 451 | .SS Snapshot types |
| 452 | The following snapshot types are available. |
| 453 | .TP |
| 454 | .B live |
| 455 | A trivial snapshot type: attempts to back up a live filesystem. How |
| 456 | well this works depends on how active the filesystem is. If files |
| 457 | change while the dump is in progress then the |
| 458 | .B fshash |
| 459 | verification will likely fail. Backups using this snapshot type must |
| 460 | specify the filesystem mount point as the |
| 461 | .IR fsarg . |
| 462 | .TP |
| 463 | .B ro |
| 464 | A slightly less trivial snapshot type: make the filesystem read-only |
| 465 | while the dump is in progress. Backups using this snapshot type must |
| 466 | specify the filesystem mount point as the |
| 467 | .IR fsarg . |
| 468 | .TP |
| 469 | .BI "lvm " vg |
| 470 | Create snapshots using LVM. The snapshot argument is interpreted as the |
| 471 | relevant volume group. The filesystem name is interpreted as the origin |
| 472 | volume name; the snapshot will be called |
| 473 | .IB fs .bkp |
| 474 | and mounted on |
| 475 | .IB SNAPDIR / fs \fR; |
| 476 | space will be allocated to it according to the |
| 477 | .B SNAPSIZE |
| 478 | variable. |
| 479 | .TP |
| 480 | .BI "rfreezefs " client " " vg |
| 481 | This gets complicated. Suppose that a server has an LVM volume group, |
| 482 | and exports (somehow) a logical volume to a client. Examples are a host |
| 483 | providing a virtual disk to a guest, or a server providing |
| 484 | network-attached storage to a client. The server can create a snapshot |
| 485 | of the volume using LVM, but must synchronize with the client to ensure |
| 486 | that the filesystem image captured in the snapshot is clean. The |
| 487 | .BR rfreezefs (8) |
| 488 | program should be installed on the client to perform this rather |
| 489 | delicate synchronization. Declare the server using the |
| 490 | .B host |
| 491 | command as usual; pass the client's name as the |
| 492 | .I client |
| 493 | and the |
| 494 | server's volume group name as the |
| 495 | .I vg |
| 496 | snapshot arguments. Finally, backups using this snapshot type must |
| 497 | specify the filesystem mount point (or, actually, any file in the |
| 498 | filesystem) on the client, as the |
| 499 | .IR fsarg . |
| 500 | .PP |
| 501 | Additional snapshot types can be defined in the configuration file. A |
| 502 | snapshot type requires two shell functions. |
| 503 | .TP |
| 504 | .BI snap_ type " " snapargs " " fs " " fsarg |
| 505 | Create the snapshot, and write the mountpoint (on the client host) to |
| 506 | standard output, in a form suitable as an argument to |
| 507 | .BR rsync . |
| 508 | .TP |
| 509 | .BI unsnap_ type " " snapargs " " fs " " fsarg |
| 510 | Remove the snapshot. |
| 511 | .PP |
| 512 | There are a number of utility functions which can be used by snapshot |
| 513 | type handlers: please see the script for details. Please send the |
| 514 | author interesting snapshot handlers for inclusion in the main |
| 515 | distribution. |
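.PP
A minimal skeleton for a new snapshot type might look like the
following. The type name
.B plain
is invented, and the sketch assumes that
.I snapargs
arrives as a single argument. It takes no real snapshot and does not
use any of the script's utility functions, so a practical handler would
need to do rather more, in particular to run commands on the client.
.nf
```shell
# Hypothetical snapshot type named plain: no snapshot is taken.
# Arguments follow the documented order: snapargs, fs, fsarg.
snap_plain () {
  local snapargs=$1 fs=$2 fsarg=$3
  # Write the mount point to be backed up to standard output.
  echo "$fsarg"
}

unsnap_plain () {
  local snapargs=$1 fs=$2 fsarg=$3
  # Nothing was created, so there is nothing to remove.
  :
}
```
.fi
.PP
With these functions defined in the configuration file, the type would
be selected with
.B "snap plain"
and each filesystem given its mount point as the
.IR fsarg .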
| 516 | .SS Archive structure |
| 517 | Backup trees are stored in a fairly straightforward directory tree. |
| 518 | .PP |
| 519 | At the top level is one directory for each client host. There are also |
| 520 | some special entries: |
| 521 | .TP |
| 522 | .B \&.rsync-backup-store |
| 523 | This file must be present in order to indicate that a backup volume is |
| 524 | present (and not just an empty mount point). |
| 525 | .TP |
| 526 | .B fshash.cache |
| 527 | The cache database used for improving performance of local file |
| 528 | hashing. There may be other |
| 529 | .B fshash.cache-* |
| 530 | files used by SQLite for its own purposes. |
| 531 | .TP |
| 532 | .B lost+found |
| 533 | Part of the filesystem used on the backup volume. You don't want to |
| 534 | mess with this. |
| 535 | .TP |
| 536 | .B tmp |
| 537 | Used to store temporary files during the backup process. (Some of them |
| 538 | want to be on the same filesystem as the rest of the backup.) When |
| 539 | things go wrong, files are left behind in the hope that they might help |
| 540 | someone debug the mess. It's always safe to delete the files in here |
| 541 | when no backup is running. |
| 542 | .PP |
| 543 | So don't use those names for your hosts. |
| 544 | .PP |
| 545 | The next layer down contains a directory for each filesystem on the |
| 546 | given host. |
| 547 | .PP |
| 548 | The bottom layer contains a directory for each dump of that filesystem, |
| 549 | named with the date at which the dump was started (in ISO 8601 |
| 550 | .IB yyyy \- mm \- dd |
| 551 | format), together with associated files named |
| 552 | .IB date .* \fR. |
| 553 | There is also a symbolic link |
| 554 | .B last |
| 555 | referring to the most recent backup of the filesystem. |
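.PP
Because the archive is an ordinary directory tree, it can be inspected
with standard tools. The following Bash sketch lists the committed
dumps of one filesystem and finds the most recent one. The function
names are invented, and the store directory is passed explicitly rather
than taken from
.BR STOREDIR .
.nf
```shell
# Sketch only: the function names are hypothetical.

# list_dumps STORE HOST FS
# Print the dates of the committed dumps, oldest first.
list_dumps () {
  local store=$1 host=$2 fs=$3 d
  for d in "$store/$host/$fs"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]
  do
    # Skip the unexpanded pattern if nothing matched.
    if [ -d "$d" ]; then echo "${d##*/}"; fi
  done
  return 0
}

# latest_dump STORE HOST FS
# Print the target of the last symbolic link.
latest_dump () {
  readlink "$1/$2/$3/last"
}
```
.fi
.PP
The pattern matches only names of the form
.IB yyyy \- mm \- dd \fR,
so an uncommitted
.B new
directory and the associated
.B .fshash
files are not listed.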
| 556 | .SH SEE ALSO |
| 557 | .BR check-bkp-status (8), |
| 558 | .BR fshash (1), |
| 559 | .BR lvm (8), |
| 560 | .BR rfreezefs (8), |
| 561 | .BR rsync (1), |
| 562 | .BR ssh (1), |
| 563 | .BR update-bkp-index (8). |
| 564 | .SH AUTHOR |
| 565 | Mark Wooding, <mdw@distorted.org.uk> |