.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH rsync-backup 8 "7 October 2012" rsync-backup
.SH NAME
rsync-backup \- back up files using rsync
.SH SYNOPSIS
.B rsync-backup
.RB [ \-nv ]
.RB [ \-c
.IR config-file ]
.SH DESCRIPTION
The
.B rsync-backup
script is a backup program of the currently popular
.RB ` rsync (1)
.BR \-\-link-dest '
variety. It uses
.BR rsync 's
ability to create hardlinks from (apparently) similar existing local
trees to make incremental dumps efficient, even from remote sources.
Restoring files is easy because the backups created are just directories
full of files, exactly as they were on the source \(en and this is
verified using the
.BR fshash (1)
program.
.PP
The script does more than just run
.BR rsync .
It is also responsible for creating and removing snapshots of volumes to
be backed up, and expiring old dumps according to a user-specified
retention policy.
.SS Installation
The idea is that the
.B rsync-backup
script should be installed and run on a central backup server with local
access to the backup volumes.
.PP
The script should be run with full (root) privileges, so that it can
correctly record file ownership information. The server should also be
able to connect via
.BR ssh (1)
to the client machines, and run processes there as root. (This is not a
security disaster. Remember that the backup server is, in the end,
responsible for the integrity of the backup data. A dishonest backup
server can easily compromise a client which is being restored from
corrupt backup data.)
.SS Command-line options
Most of the behaviour of
.B rsync-backup
is controlled by a configuration file, described starting with the
section named
.B Configuration commands
below.
But a few features are controlled by command-line options.
.TP
.B \-h
Show a brief help message for the program, and exit successfully.
.TP
.B \-V
Show
.BR rsync-backup 's
version number and some choice pieces of build-time configuration, and
exit successfully.
.TP
.BI "\-c " conf
Read
.I conf
instead of the default configuration file (shown as
.B conf
in the
.B \-V
output).
.TP
.B \-n
Don't actually take a backup, or write proper logs: instead, write a
description of what would be done to standard error.
.TP
.B \-v
Produce verbose progress information on standard output while the backup
is running. This keeps one amused while running a backup
interactively. In any event,
.B rsync-backup
will report failures to standard error, and otherwise run silently, so
it doesn't annoy unnecessarily if run by
.BR cron (8).
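.PP
As a rough illustration of how the options above combine, the following
invocations might be used when trying out a new configuration. The
configuration file path is an assumption made up for this example; only
the option letters come from this manual.

```shell
# Show the version number and build-time configuration, including the
# default configuration file.
rsync-backup -V

# Rehearse a run against a trial configuration: nothing is backed up,
# and a description of what would be done goes to standard error.
# (The path below is hypothetical.)
rsync-backup -n -c /etc/rsync-backup/trial.conf

# Run the same configuration for real, with progress reports on
# standard output.
rsync-backup -v -c /etc/rsync-backup/trial.conf
```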
.SS Backup process
Backing up a filesystem works as follows.
.hP \*o
Make a snapshot of the filesystem on the client, and ensure that the
snapshot is mounted. There are some `trivial' snapshot types which use
the existing mounted filesystem, and either prevent processes writing to
it during the backup, or just hope for the best. Other snapshot types
require the snapshot to be mounted somewhere distinct from the main
filesystem, so that the latter can continue being used.
.hP \*o
Run
.B rsync
to copy the snapshot to the backup volume \(en specifically, to
.IB host / fs / new \fR.
If this directory already exists, then it's presumed to be debris from a
previous attempt to dump this filesystem:
.B rsync
will update it appropriately, by adding, deleting or modifying the
files. This means that retrying a failed dump \(en after fixing whatever
caused it to go wrong, obviously! \(en is usually fairly quick.
.hP \*o
Run
.B fshash
on the client to generate a `digest' describing the contents of the
filesystem, and send this to the server as
.IB host / fs / new .fshash \fR.
.hP \*o
Release the snapshot: we don't need it any more.
.hP \*o
Run
.B fshash
over the new backup; specifically, to
.BI tmp/fshash. host . fs . date \fR.
This gives us a digest for what the backup volume actually stored.
.hP \*o
Compare the two
.B fshash
digests. If they differ then dump the differences to the log file and
report a backup failure. (Backups aren't any good if they don't
actually back up the right thing. And you stand a better chance of
fixing them if you know that they're going wrong.)
.hP \*o
Commit the backup, by renaming the dump directory to
.IB host / fs / date
and the
.B fshash
digest file to
.IB host / fs / date .fshash \fR.
.PP
The backup is now complete.
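.PP
The compare-and-commit steps at the end of this process can be mimicked
with ordinary files. The sketch below is only an illustration under
stated assumptions: the host, filesystem and date are invented, the two
digest files are stand-ins rather than real
fshash
output, and the script itself may well do things differently.

```shell
#!/bin/sh
# Sketch of the verify-then-commit steps using local stand-in files.
# All names and digest contents here are hypothetical.
set -e

store=$(mktemp -d)                      # stand-in for the backup volume
host=alpha fs=home date=2012-10-07      # invented for the example
dir=$store/$host/$fs
mkdir -p "$dir/new" "$store/tmp"

# Stand-ins for the two digests: one sent by the client, one computed
# over the new backup tree on the server.
echo "digest-of-tree" >"$dir/new.fshash"
echo "digest-of-tree" >"$store/tmp/fshash.$host.$fs.$date"

if cmp -s "$dir/new.fshash" "$store/tmp/fshash.$host.$fs.$date"; then
  # Digests agree: commit by renaming the tree and its digest.
  mv "$dir/new" "$dir/$date"
  mv "$dir/new.fshash" "$dir/$date.fshash"
  result=committed
else
  # Digests differ: show the differences and report failure.
  diff "$dir/new.fshash" "$store/tmp/fshash.$host.$fs.$date" >&2 || :
  result=failed
fi
echo "$result"
```

Because the commit is a pair of renames, an interrupted run leaves the
tree under its
new
name, which matches the retry behaviour described in the list above.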
.SS Configuration commands
The configuration file is simply a Bash shell fragment: configuration
commands are shell functions.
.TP
.BI "addhook " hook " " command
Arrange that the named
.I hook
runs the given
.IR command .
See
.B runhook
for more details.
.TP
.BI "backup " "fs\fR[:\fIfsarg\fR] ..."
Back up the named filesystems. The corresponding
.IR fsarg s
may be required by the snapshot type.
.TP
.BI "defhook " hook
Define a new hook named
.IR hook .
See
.B addhook
and
.B runhook
for more information.
.TP
.BI "host " host
Future
.B backup
commands will back up filesystems on the named
.IR host .
To back up filesystems on the backup server itself, use its hostname:
.B rsync-backup
will avoid inefficient and pointless messing about with
.BR ssh (1)
in this case.
This command clears the
.B like
list and the remote
.B user
name, and resets the retention policy to its default (i.e., the
policy defined prior to the first
.B host
command).
.TP
.BI "like " "host\fR ..."
Declare that subsequent filesystems are `similar' to like-named
filesystems on the named
.IR host s,
and that
.B rsync
should use those trees as potential sources of hardlinkable files. Be
careful when using this option without
.BR rsync 's
.B \-\-checksum
option: an erroneous hardlink will cause the backup to fail. (The
backup won't be left silently incorrect.)
.TP
.BI "retain " frequency " " duration
Define part of a backup retention policy: backup trees of the
.I frequency
should be kept for the
.IR duration .
The
.I frequency
can be
.BR daily ,
.BR weekly ,
.BR monthly ,
or
.B annually
(or
.BR yearly ,
which means the same); the
.I duration
may be any of
.BR week ,
.BR month ,
.BR year ,
or
.BR forever .
Expiry considers each existing dump against the policy lines in order:
the last applicable line determines the dump's fate \(en so you should
probably write the lines in decreasing order of duration.
.RS
.PP
Groups of
.B retain
commands between
.B host
and/or
.B backup
commands collectively define a retention policy. Once a policy is
defined, subsequent
.B backup
operations use the policy. The first
.B retain
command after a
.B host
or
.B backup
command clears the policy and starts defining a new one. The policy
defined before the first
.B host
is the
.I default
policy: at the start of each
.B host
stanza, the policy is reset to the default.
.RE
.TP
.BI "retry " count
The
.B live
snapshot type (see below) doesn't prevent a filesystem from being
modified while it's being backed up. If this happens, the
.B fshash
pass will detect the difference and fail. If the filesystem in question
is relatively quiescent, then maybe retrying the backup will result in a
successful consistent copy. Following this command, a backup which
results in an
.B fshash
mismatch will be retried up to
.I count
times before being declared a failure.
.TP
.BI "runhook " hook " " args\fR...
Invoke the named
.IR hook .
The individual commands on the hook are run, in order, as
.RS
.IP
.I command
.IR args ...
.PP
If any command fails (returns nonzero) then no other hooks are run and
.B runhook
fails with the same exit code.
.RE
.TP
.BI "snap " type " " \fR[\fIargs\fR...]
Use the snapshot
.I type
for subsequent backups. Some snapshot types require additional
arguments, which may be supplied here. This command clears the
.B retry
counter.
.TP
.BI "user " name
Specify the user name on the remote host. Without this, calls to
.BR ssh (1)
and
.BR rsync (1)
won't specify any user name, so the default (probably from the
.BR ssh_config (5)
file) will apply.
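.PP
To show how these commands fit together, here is a small configuration
sketch. Every host name, volume group and filesystem name in it is
invented for illustration, and it is not taken from any real
installation; only the command names and argument shapes come from the
descriptions above.

```shell
# Hypothetical rsync-backup configuration fragment.

# Default retention policy: defined before the first `host' command,
# written in decreasing order of duration.
retain annually forever
retain monthly year
retain weekly month
retain daily week

# First client: LVM snapshots from volume group `vg0' (invented name).
host alpha.example.org
user root
snap lvm vg0
backup root home

# Second client: its filesystems resemble the like-named ones on alpha,
# so those trees can supply hardlinkable files.
host beta.example.org
like alpha.example.org
snap lvm vg0
backup root

# A `retain' after `backup' starts a fresh policy for what follows.
retain weekly month
retain daily week
# `snap' clears the retry counter, so `retry' comes after it.
snap live
retry 3
backup boot:/boot
```

Note the ordering: since
host
resets the user name, the like list and the retention policy, anything
specific to a client has to follow its
host
line.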
.SS Configuration variables
The following shell variables may be overridden by the configuration
file.
.TP
.B HASH
The hash function to use for verifying archive integrity. This is
passed to the
.B \-H
option of
.BR fshash ,
so it must name one of the hash functions supported by your Python's
.B hashlib
module.
The default is
.BR sha256 .
.TP
.B INDEXDB
The name of a SQLite database initialized by
.BR update-bkp-index (8)
in which an index is maintained of which dumps are on which backup
volumes. If the file doesn't exist, then no index is maintained. The
default is
.IB localstatedir /lib/bkp/index.db
where
.I localstatedir
is the state directory configured at build time.
.TP
.B MAXLOG
The number of log files to be kept for each filesystem. Old log files
are deleted to keep the total number below this bound. The default
value is 14.
.TP
.B METADIR
The metadata directory for the currently mounted backup volume.
The default is
.IB mntbkpdir /meta
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B RSYNCOPTS
Command-line options to pass to
.BR rsync (1)
in addition to the basic set:
.B \-\-archive
.B \-\-hard-links
.B \-\-numeric-ids
.B \-\-del
.B \-\-sparse
.B \-\-compress
.B \-\-one-file-system
.B \-\-partial
.BR "\-\-filter=""dir-merge .rsync-backup""" .
The default is
.BR \-\-verbose .
.TP
.B SNAPDIR
LVM (and
.BR rfreezefs )
snapshots are mounted on subdirectories below the
.B SNAPDIR
.IR "on backup clients" .
The default is
.IB mntbkpdir /snap
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B SNAPSIZE
The volume size option to pass to
.BR lvcreate (8)
when creating a snapshot. The default is
.B \-l10%ORIGIN
which seems to work fairly well.
.TP
.B STOREDIR
Where the actual backup trees should be stored. See the section on
.B Archive structure
below.
The default is
.IB mntbkpdir /store
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B VOLUME
The name of the current volume. If this is left unset, the volume name
is read from the file
.IB METADIR /volume
once at the start of the backup run.
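.PP
Since the configuration file is a shell fragment, overriding one of
these variables is a plain assignment. The values below are arbitrary
choices for illustration, not recommendations.

```shell
# Hypothetical overrides in an rsync-backup configuration file.

# Use a different hashlib function for verification.
HASH=sha512

# Keep more log files per filesystem than the default of 14.
MAXLOG=30

# Allow LVM snapshots more room than the default -l10%ORIGIN.
SNAPSIZE=-l20%ORIGIN
```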
.SS Hooks
The configuration file can modify the behaviour of the backup in two
main ways: by adding commands to hooks (see the
.B addhook
command); and by redefining shell functions.
393The following hooks are defined.
f6b4ffdc 394.TP
9b1d71c6
MW
395.BI "commit " host " " fs " " date
396Called during the commit procedure. The backup tree and manifest have
397been renamed into their proper places. Typically one would use this
398hook to rename files created in a corresponding
399.B precommit
400command.
401.TP
402.BI "end " rc
403The backup has completed;
404.B rsync-backup
405will exit with status
406.IR rc .
407.TP
408.BI "precommit " host " " fs " " date
f6b4ffdc
MW
409Called after a backup has been verified complete and about to be
410committed. The backup tree is in
411.B new
412in the current directory, and the
413.B fshash
414manifest is in
415.BR new.fshash .
416A typical action would be to create a digital signature on the
417manifest.
418.TP
9b1d71c6
MW
419.BI "setup " host " " fs " " date
420Called when a backup of a particular filesystem is about to start. It
421can return with code 99 to skip the backup.
422.TP
423.B "start"
424Invoked before performing any actual dumps (the first time
425.B host
426is run).
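.PP
As an example of the signing use suggested for
precommit,
a configuration file might add a pair of hook commands along the
following lines. The function names, the use of
gpg
and the signature file names are all assumptions made for illustration,
as is the guess that the
commit
hook runs in the same directory as
precommit.

```shell
# Hypothetical hook commands for an rsync-backup configuration file.

sign_manifest () {
  # precommit: the manifest is `new.fshash' in the current directory.
  gpg --batch --detach-sign --output new.fshash.sig new.fshash
}
addhook precommit sign_manifest

rename_signature () {
  # commit: called as HOST FS DATE, after the tree and manifest have
  # been renamed, so give the signature a matching name.
  mv new.fshash.sig "$3.fshash.sig"
}
addhook commit rename_signature
```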
.PP
The following shell functions can be redefined by users.
.TP
.BI "backup_commit_hook " host " " fs " " date
Called from the
.B commit
hook for compatibility.
.TP
.BI "backup_precommit_hook " host " " fs " " date
Called from the
.B precommit
hook for compatibility.
.TP
.BR "whine " [ \-n ] " " \fItext\fR...
Called to report `interesting' events when the
.B \-v
option is in force. The default action is to echo the
.I text
to (what was initially) standard output, followed by a newline unless
.B \-n
is given.
.SS Snapshot types
The following snapshot types are available.
.TP
.B live
A trivial snapshot type: attempts to back up a live filesystem. How
well this works depends on how active the filesystem is. If files
change while the dump is in progress then the
.B fshash
verification will likely fail. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.B ro
A slightly less trivial snapshot type: make the filesystem read-only
while the dump is in progress. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.BI "lvm " vg
Create snapshots using LVM. The snapshot argument is interpreted as the
relevant volume group. The filesystem name is interpreted as the origin
volume name; the snapshot will be called
.IB fs .bkp
and mounted on
.IB SNAPDIR / fs \fR;
space will be allocated to it according to the
.I SNAPSIZE
variable.
.TP
.BI "rfreezefs " client " " vg
This gets complicated. Suppose that a server has an LVM volume group,
and exports (somehow) a logical volume to a client. Examples are a host
providing a virtual disk to a guest, or a server providing
network-attached storage to a client. The server can create a snapshot
of the volume using LVM, but must synchronize with the client to ensure
that the filesystem image captured in the snapshot is clean. The
.BR rfreezefs (8)
program should be installed on the client to perform this rather
delicate synchronization. Declare the server using the
.B host
command as usual; pass the client's name as the
.I client
and the
server's volume group name as the
.I vg
snapshot arguments. Finally, backups using this snapshot type must
specify the filesystem mount point (or, actually, any file in the
filesystem) on the client, as the
.IR fsarg .
.PP
Additional snapshot types can be defined in the configuration file. A
snapshot type requires two shell functions.
.TP
.BI snap_ type " " snapargs " " fs " " fsarg
Create the snapshot, and write the mountpoint (on the client host) to
standard output, in a form suitable as an argument to
.BR rsync .
.TP
.BI unsnap_ type " " snapargs " " fs " " fsarg
Remove the snapshot.
.PP
There are a number of utility functions which can be used by snapshot
type handlers: please see the script for details. Please send the
author interesting snapshot handlers for inclusion in the main
distribution.
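.PP
A minimal pair of handler functions might look like the sketch below.
The type name
plain
is invented, and the sketch assumes a type declared with no snapshot
arguments, so that each function receives only the filesystem name and
the
fsarg;
check the script before relying on that calling convention.

```shell
#!/bin/sh
# Hypothetical snapshot type `plain': uses the mounted filesystem as it
# stands, much as a trivial snapshot type would.

snap_plain () {
  fs=$1 fsarg=$2
  # Nothing to create: just report where the filesystem lives on the
  # client, for use as an rsync source.
  printf '%s\n' "$fsarg"
}

unsnap_plain () {
  # Nothing was created, so there is nothing to remove.
  :
}

# Exercise the handlers directly with invented arguments.
mnt=$(snap_plain home /srv/home)
unsnap_plain home /srv/home
echo "$mnt"
```

In a configuration file, such a type would presumably be selected with
a
snap
command naming it, followed by
backup
commands giving the mount point as the
fsarg.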
.SS Archive structure
Backup trees are stored in a fairly straightforward directory tree.
.PP
At the top level is one directory for each client host. There are also
some special entries:
.TP
.B \&.rsync-backup-store
This file must be present in order to indicate that a backup volume is
present (and not just an empty mount point).
.TP
.B fshash.cache
The cache database used for improving performance of local file
hashing. There may be other
.B fshash.cache-*
files used by SQLite for its own purposes.
.TP
.B lost+found
Part of the filesystem used on the backup volume. You don't want to
mess with this.
.TP
.B tmp
Used to store temporary files during the backup process. (Some of them
want to be on the same filesystem as the rest of the backup.) When
things go wrong, files are left behind in the hope that they might help
someone debug the mess. It's always safe to delete the files in here
when no backup is running.
.PP
So don't use those names for your hosts.
.PP
The next layer down contains a directory for each filesystem on the
given host.
.PP
The bottom layer contains a directory for each dump of that filesystem,
named with the date at which the dump was started (in ISO8601
.IB yyyy \(en mm \(en dd
format), together with associated files named
.IB date .* \fR.
There is also a symbolic link
.B last
referring to the most recent backup of the filesystem.
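.PP
The layout just described can be mocked up in a scratch directory to
see how the pieces relate. The host and filesystem names are invented,
the dates are written with plain hyphens, and the sketch assumes that
last
is a relative link; none of that is confirmed by this manual.

```shell
#!/bin/sh
# Mock-up of the archive layout in a temporary directory.
# Host, filesystem and dates are hypothetical.
set -e

store=$(mktemp -d)                      # stand-in for STOREDIR
touch "$store/.rsync-backup-store"      # marks a real backup volume
mkdir -p "$store/tmp"                   # temporary files live here

fsdir=$store/alpha/home                 # host layer, then filesystem layer
for date in 2012-10-06 2012-10-07; do
  mkdir -p "$fsdir/$date"               # one directory per dump
  touch "$fsdir/$date.fshash"           # associated digest file
done
ln -s 2012-10-07 "$fsdir/last"          # most recent dump

latest=$(readlink "$fsdir/last")
echo "most recent dump: $latest"
```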
.SH SEE ALSO
.BR check-bkp-status (8),
.BR fshash (1),
.BR lvm (8),
.BR rfreezefs (8),
.BR rsync (1),
.BR ssh (1),
.BR update-bkp-index (8).
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>