.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH rsync-backup 8 "7 October 2012" rsync-backup
.SH NAME
rsync-backup \- back up files using rsync
.SH SYNOPSIS
.B rsync-backup
.RB [ \-nv ]
.RB [ \-c
.IR config-file ]
.SH DESCRIPTION
The
.B rsync-backup
script is a backup program of the currently popular
.RB ` rsync (1)
.BR \-\-link-dest '
variety. It uses
.BR rsync 's
ability to create hardlinks from (apparently) similar existing local
trees to make incremental dumps efficient, even from remote sources.
Restoring files is easy because the backups created are just directories
full of files, exactly as they were on the source \(en and this is
verified using the
.BR fshash (1)
program.
.PP
The script does more than just run
.BR rsync .
It is also responsible for creating and removing snapshots of volumes to
be backed up, and expiring old dumps according to a user-specified
retention policy.
.SS Installation
The idea is that the
.B rsync-backup
script should be installed and run on a central backup server with local
access to the backup volumes.
.PP
The script should be run with full (root) privileges, so that it can
correctly record file ownership information. The server should also be
able to connect via
.BR ssh (1)
to the client machines, and run processes there as root. (This is not a
security disaster. Remember that the backup server is, in the end,
responsible for the integrity of the backup data. A dishonest backup
server can easily compromise a client which is being restored from
corrupt backup data.)
.SS Command-line options
Most of the behaviour of
.B rsync-backup
is controlled by a configuration file, described starting with the
section named
.B Configuration commands
below.
But a few features are controlled by command-line options.
.TP
.B \-h
Show a brief help message for the program, and exit successfully.
.TP
.B \-V
Show
.BR rsync-backup 's
version number and some choice pieces of build-time configuration, and
exit successfully.
.TP
.BI "\-c " conf
Read
.I conf
instead of the default configuration file (shown as
.B conf
in the
.B \-V
output).
.TP
.B \-n
Don't actually take a backup, or write proper logs: instead, write a
description of what would be done to standard error.
.TP
.B \-v
Produce verbose progress information on standard output while the backup
is running. This keeps one amused while running a backup
interactively. In any event,
.B rsync-backup
will report failures to standard error, and otherwise run silently, so
it doesn't annoy unnecessarily if run by
.BR cron (8).
.SS Backup process
Backing up a filesystem works as follows.
.hP \*o
Make a snapshot of the filesystem on the client, and ensure that the
snapshot is mounted. There are some `trivial' snapshot types which use
the existing mounted filesystem, and either prevent processes writing to
it during the backup, or just hope for the best. Other snapshot types
require the snapshot to be mounted somewhere distinct from the main
filesystem, so that the latter can continue being used.
.hP \*o
Run
.B rsync
to copy the snapshot to the backup volume \(en specifically, to
.IB host / fs / new \fR.
If this directory already exists, then it's presumed to be debris from a
previous attempt to dump this filesystem:
.B rsync
will update it appropriately, by adding, deleting or modifying the
files. This means that retrying a failed dump \(en after fixing whatever
caused it to go wrong, obviously! \(en is usually fairly quick.
.hP \*o
Run
.B fshash
on the client to generate a `digest' describing the contents of the
filesystem, and send this to the server as
.IB host / fs / new .fshash \fR.
.hP \*o
Release the snapshot: we don't need it any more.
.hP \*o
Run
.B fshash
over the new backup, writing the digest to
.BI tmp/fshash. host . fs . date \fR.
This gives us a digest for what the backup volume actually stored.
.hP \*o
Compare the two
.B fshash
digests. If they differ then dump the differences to the log file and
report a backup failure. (Backups aren't any good if they don't
actually back up the right thing. And you stand a better chance of
fixing them if you know that they're going wrong.)
.hP \*o
Commit the backup, by renaming the dump directory to
.IB host / fs / date
and the
.B fshash
digest file to
.IB host / fs / date .fshash \fR.
.PP
The backup is now complete.
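.PP
The final compare-and-commit steps above can be sketched in Bash as
follows. This is only an illustration, not code from the script: the
function name \fBverify_and_commit\fP, its arguments, and the use of
\fBcmp\fP(1) and \fBdiff\fP(1) are assumptions made for the example;
only the file names follow the description above.

```shell
#!/bin/bash
# Hypothetical sketch of the compare-and-commit steps; not taken from
# the rsync-backup script itself.
#
# usage: verify_and_commit DIR DATE SERVER-DIGEST LOG
#   DIR            the host/fs directory holding new and new.fshash
#   DATE           the name the dump is committed under
#   SERVER-DIGEST  the digest computed over the stored backup
#   LOG            file to which differences are appended
verify_and_commit () {
  local dir=$1 date=$2 server=$3 log=$4

  # Compare the client's digest with the one computed on the server.
  # On a mismatch, log the differences and fail without committing.
  if ! cmp -s "$dir/new.fshash" "$server"; then
    diff -u "$dir/new.fshash" "$server" >>"$log"
    echo "backup in $dir failed: fshash digests differ" >&2
    return 1
  fi

  # Commit: rename the dump directory and its digest file.
  mv "$dir/new" "$dir/$date" &&
    mv "$dir/new.fshash" "$dir/$date.fshash"
}
```

.PP
Because a failed run leaves \fBnew\fP in place, a later attempt can
reuse it, which matches the retry behaviour described above.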
.SS Configuration commands
The configuration file is simply a Bash shell fragment: configuration
commands are shell functions.
.TP
.BI "backup " "fs\fR[:\fIfsarg\fR] ..."
Back up the named filesystems. The corresponding
.IR fsarg s
may be required by the snapshot type.
.TP
.BI "host " host
Future
.B backup
commands will back up filesystems on the named
.IR host .
To back up filesystems on the backup server itself, use its hostname:
.B rsync-backup
will avoid inefficient and pointless messing about with
.BR ssh (1)
in this case.
This command clears the
.B like
list, the remote
.B user
name, and resets the retention policy to its default (i.e., the
policy defined prior to the first
.B host
command).
.TP
.BI "like " "host\fR ..."
Declare that subsequent filesystems are `similar' to like-named
filesystems on the named
.IR host s,
and that
.B rsync
should use those trees as potential sources of hardlinkable files. Be
careful when using this option without
.BR rsync 's
.B \-\-checksum
option: an erroneous hardlink will cause the backup to fail. (The
backup won't be left silently incorrect.)
.TP
.BI "retain " frequency " " duration
Define part of a backup retention policy: backup trees of the
.I frequency
should be kept for the
.IR duration .
The
.I frequency
can be
.BR daily ,
.BR weekly ,
.BR monthly ,
or
.B annually
(or
.BR yearly ,
which means the same); the
.I duration
may be any of
.BR week ,
.BR month ,
.BR year ,
or
.BR forever .
Expiry considers each existing dump against the policy lines in order:
the last applicable line determines the dump's fate \(en so you should
probably write the lines in decreasing order of duration.
.RS
.PP
Groups of
.B retain
commands between
.B host
and/or
.B backup
commands collectively define a retention policy. Once a policy is
defined, subsequent
.B backup
operations use the policy. The first
.B retain
command after a
.B host
or
.B backup
command clears the policy and starts defining a new one. The policy
defined before the first
.B host
is the
.I default
policy: at the start of each
.B host
stanza, the policy is reset to the default.
.RE
.TP
.BI "retry " count
The
.B live
snapshot type (see below) doesn't prevent a filesystem from being
modified while it's being backed up. If this happens, the
.B fshash
pass will detect the difference and fail. If the filesystem in question
is relatively quiescent, then maybe retrying the backup will result in a
successful consistent copy. Following this command, a backup which
results in an
.B fshash
mismatch will be retried up to
.I count
times before being declared a failure.
.TP
.BI "snap " type " " \fR[\fIargs\fR...]
Use the snapshot
.I type
for subsequent backups. Some snapshot types require additional
arguments, which may be supplied here. This command clears the
.B retry
counter.
.TP
.BI "user " name
Specify the user name on the remote host. Without this, calls to
.BR ssh (1)
and
.BR rsync (1)
won't specify any user name, so the default (probably from the
.BR ssh_config (5)
file) will apply.
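.PP
As an illustration, a small configuration file using the commands above
might look like the following. All host names, the volume group, the
filesystem names and the mount points are invented for the example. The
stub definitions at the top are not part of a real configuration:
\fBrsync-backup\fP supplies these commands itself, and they are defined
here only so that the fragment can be run on its own to show the order
in which the commands are issued.

```shell
#!/bin/bash
# Stubs so this fragment runs standalone; a real configuration file
# would omit this loop, because rsync-backup defines the commands.
for cmd in host user like retain retry snap backup; do
  eval "$cmd () { echo \"$cmd \$*\"; }"
done

# Default retention policy (defined before the first host command),
# written in decreasing order of duration as recommended above.
retain annually forever
retain monthly year
retain weekly month
retain daily week

# The backup server itself, using LVM snapshots in a volume group
# with the made-up name vg-backup.
host backup.example.org
snap lvm vg-backup
backup root home

# A remote client backed up live.  The retry command comes after snap
# because snap clears the retry counter; like and user come after host
# because host clears them.
host client.example.org
user root
like backup.example.org
snap live
retry 3
backup root:/ home:/home
```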
.SS Configuration variables
The following shell variables may be overridden by the configuration
file.
.TP
.B HASH
The hash function to use for verifying archive integrity. This is
passed to the
.B \-H
option of
.BR fshash ,
so it must name one of the hash functions supported by your Python's
.B hashlib
module.
The default is
.BR sha256 .
.TP
.B INDEXDB
The name of a SQLite database initialized by
.BR update-bkp-index (8)
in which an index is maintained of which dumps are on which backup
volumes. If the file doesn't exist, then no index is maintained. The
default is
.IB localstatedir /lib/bkp/index.db
where
.I localstatedir
is the state directory configured at build time.
.TP
.B MAXLOG
The number of log files to be kept for each filesystem. Old log files
are deleted to keep the total number below this bound. The default
value is 14.
.TP
.B METADIR
The metadata directory for the currently mounted backup volume.
The default is
.IB mntbkpdir /meta
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B RSYNCOPTS
Command-line options to pass to
.BR rsync (1)
in addition to the basic set:
.B \-\-archive
.B \-\-hard-links
.B \-\-numeric-ids
.B \-\-del
.B \-\-sparse
.B \-\-compress
.B \-\-one-file-system
.B \-\-partial
.BR "\-\-filter=""dir-merge .rsync-backup""" .
The default is
.BR \-\-verbose .
.TP
.B SNAPDIR
LVM (and
.BR rfreezefs )
snapshots are mounted on subdirectories below the
.B SNAPDIR
.IR "on backup clients" .
The default is
.IB mntbkpdir /snap
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B SNAPSIZE
The volume size option to pass to
.BR lvcreate (8)
when creating a snapshot. The default is
.B \-l10%ORIGIN
which seems to work fairly well.
.TP
.B STOREDIR
Where the actual backup trees should be stored. See the section on
.B Archive structure
below.
The default is
.IB mntbkpdir /store
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B VOLUME
The name of the current volume. If this is left unset, the volume name
is read from the file
.IB METADIR /volume
once at the start of the backup run.
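.PP
For example, a configuration file might override a few of these
variables as shown below. The values are arbitrary choices for
illustration, and the sketch assumes that \fBRSYNCOPTS\fP is an
ordinary space-separated string, which the text above does not state
explicitly.

```shell
# Example overrides for a hypothetical installation.
HASH=sha512              # must be a hash known to Python's hashlib
MAXLOG=30                # bound on the log files kept per filesystem
SNAPSIZE=-l20%ORIGIN     # size option handed to lvcreate(8)

# Assumption: extra rsync options are given as one string.  The
# --checksum option is the one mentioned under the like command.
RSYNCOPTS="--verbose --checksum"
```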
.SS Hook functions
The configuration file may define shell functions to perform custom
actions at various points in the backup process.
.TP
.BI "backup_precommit_hook " host " " fs " " date
Called after a backup has been verified complete and is about to be
committed. The backup tree is in
.B new
in the current directory, and the
.B fshash
manifest is in
.BR new.fshash .
A typical action would be to create a digital signature on the
manifest.
.TP
.BI "backup_commit_hook " host " " fs " " date
Called during the commit procedure. The backup tree and manifest have
been renamed into their proper places. Typically one would use this
hook to rename files created by the
.B backup_precommit_hook
function.
.TP
.BR "whine " [ \-n ] " " \fItext\fR...
Called to report `interesting' events when the
.B \-v
option is in force. The default action is to echo the
.I text
to (what was initially) standard output, followed by a newline unless
.B \-n
is given.
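.PP
A pair of hooks might be sketched as follows. Here a SHA-256 checksum
stands in for the digital signature suggested above, and the
\fB.sum\fP suffix is a made-up name. The commit hook assumes that it
runs in the same directory as the precommit hook, which the text above
does not guarantee.

```shell
#!/bin/bash
# Hypothetical hooks: record a checksum of the manifest before the
# commit, then rename it to sit beside the committed manifest.

backup_precommit_hook () {
  local host=$1 fs=$2 date=$3
  # The manifest is new.fshash in the current directory.  Reading it
  # on standard input keeps the file name out of the checksum file.
  sha256sum <new.fshash >new.fshash.sum
}

backup_commit_hook () {
  local host=$1 fs=$2 date=$3
  # Assumption: the current directory is unchanged since the precommit
  # hook ran, so the manifest is now DATE.fshash in this directory.
  mv new.fshash.sum "$date.fshash.sum"
}
```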
.SS Snapshot types
The following snapshot types are available.
.TP
.B live
A trivial snapshot type: attempts to back up a live filesystem. How
well this works depends on how active the filesystem is. If files
change while the dump is in progress then the
.B fshash
verification will likely fail. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.B ro
A slightly less trivial snapshot type: make the filesystem read-only
while the dump is in progress. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.BI "lvm " vg
Create snapshots using LVM. The snapshot argument is interpreted as the
relevant volume group. The filesystem name is interpreted as the origin
volume name; the snapshot will be called
.IB fs .bkp
and mounted on
.IB SNAPDIR / fs \fR;
space will be allocated to it according to the
.B SNAPSIZE
variable.
.TP
.BI "rfreezefs " client " " vg
This gets complicated. Suppose that a server has an LVM volume group,
and exports (somehow) a logical volume to a client. Examples are a host
providing a virtual disk to a guest, or a server providing
network-attached storage to a client. The server can create a snapshot
of the volume using LVM, but must synchronize with the client to ensure
that the filesystem image captured in the snapshot is clean. The
.BR rfreezefs (8)
program should be installed on the client to perform this rather
delicate synchronization. Declare the server using the
.B host
command as usual; pass the client's name as the
.I client
and the
server's volume group name as the
.I vg
snapshot arguments. Finally, backups using this snapshot type must
specify the filesystem mount point (or, actually, any file in the
filesystem) on the client, as the
.IR fsarg .
.PP
Additional snapshot types can be defined in the configuration file. A
snapshot type requires two shell functions.
.TP
.BI snap_ type " " snapargs " " fs " " fsarg
Create the snapshot, and write the mountpoint (on the client host) to
standard output, in a form suitable as an argument to
.BR rsync .
.TP
.BI unsnap_ type " " snapargs " " fs " " fsarg
Remove the snapshot.
.PP
There are a number of utility functions which can be used by snapshot
type handlers: please see the script for details. Please send the
author interesting snapshot handlers for inclusion in the main
distribution.
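.PP
A minimal pair of handler functions might look like the sketch below.
The type name \fBplain\fP is invented, and the handlers take no real
snapshot: they simply report the mount point given as the
\fIfsarg\fP. The sketch avoids the script's utility functions and
assumes only that the \fIfsarg\fP is the final argument, as in the
synopses above.

```shell
#!/bin/bash
# Hypothetical snapshot type "plain": nothing is snapshotted, and the
# filesystem is backed up from wherever it is already mounted.

snap_plain () {
  # Take the fsarg from the last positional parameter, so the sketch
  # does not depend on how the snapshot arguments are passed.
  local fsarg=${!#}
  echo "$fsarg"
}

unsnap_plain () {
  # Nothing was created, so there is nothing to remove.
  :
}
```

.PP
Presumably such a type would then be selected with \fBsnap plain\fP
and used with \fBbackup\fP arguments of the form
\fIfs\fP:\fImountpoint\fP.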
.SS Archive structure
Backup trees are stored in a fairly straightforward directory tree.
.PP
At the top level is one directory for each client host. There are also
some special entries:
.TP
.B \&.rsync-backup-store
This file must be present in order to indicate that a backup volume is
present (and not just an empty mount point).
.TP
.B fshash.cache
The cache database used for improving performance of local file
hashing. There may be other
.B fshash.cache-*
files used by SQLite for its own purposes.
.TP
.B lost+found
Part of the filesystem used on the backup volume. You don't want to
mess with this.
.TP
.B tmp
Used to store temporary files during the backup process. (Some of them
want to be on the same filesystem as the rest of the backup.) When
things go wrong, files are left behind in the hope that they might help
someone debug the mess. It's always safe to delete the files in here
when no backup is running.
.PP
So don't use those names for your hosts.
.PP
The next layer down contains a directory for each filesystem on the given host.
.PP
The bottom layer contains a directory for each dump of that filesystem,
named with the date at which the dump was started (in ISO8601
.IB yyyy \- mm \- dd
format), together with associated files named
.IB date .* \fR.
There is also a symbolic link
.B last
referring to the most recent backup of the filesystem.
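.PP
Because the layout is just directories, it can be inspected with
ordinary shell tools. The helper functions below are a sketch for
illustration and are not part of \fBrsync-backup\fP; their names and
arguments are assumptions, and the first argument is the store
directory (see \fBSTOREDIR\fP).

```shell
#!/bin/bash
# Hypothetical helpers for a store laid out as STORE/host/fs/date.

# Print the dump names for one filesystem, oldest first.  The pattern
# matches only yyyy-mm-dd names, so new, last and the date.* files
# are skipped; date order is the same as the shell's sorted order.
list_dumps () {
  local store=$1 host=$2 fs=$3 d
  for d in "$store/$host/$fs"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
    if [ -d "$d" ]; then echo "${d##*/}"; fi
  done
}

# Print the target of the last symbolic link for one filesystem.
latest_dump () {
  local store=$1 host=$2 fs=$3
  readlink "$store/$host/$fs/last"
}
```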
.SH SEE ALSO
.BR fshash (1),
.BR lvm (8),
.BR rfreezefs (8),
.BR rsync (1),
.BR ssh (1),
.BR update-bkp-index (8).
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>