.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH rsync-backup 8 "7 October 2012" rsync-backup
.SH SYNOPSIS
.B rsync-backup
.RB [ \-nv ]
.RB [ \-c
.IR config-file ]
.SH DESCRIPTION
The
.B rsync-backup
script is a backup program of the currently popular
.RB ` rsync (1)
.BR \-\-link-dest '
variety. It uses
.BR rsync 's
ability to create hardlinks from (apparently) similar existing local
trees to make incremental dumps efficient, even from remote sources.
Restoring files is easy because the backups created are just directories
full of files, exactly as they were on the source \(en and this is
verified using the
.BR fshash (1)
program.
.PP
The script does more than just running
.BR rsync .
It is also responsible for creating and removing snapshots of volumes to
be backed up, and expiring old dumps according to a user-specified
retention policy.
.SS Installation
The idea is that the
.B rsync-backup
script should be installed and run on a central backup server with local
access to the backup volumes.
.PP
The script should be run with full (root) privileges, so that it can
correctly record file ownership information. The server should also be
able to connect via
.BR ssh (1)
to the client machines, and run processes there as root. (This is not a
security disaster. Remember that the backup server is, in the end,
responsible for the integrity of the backup data. A dishonest backup
server can easily compromise a client which is being restored from
corrupt backup data.)
.SS Command-line options
Most of the behaviour of
.B rsync-backup
is controlled by a configuration file, described starting with the
section named
.B Configuration commands
below.
But a few features are controlled by command-line options.
.TP
.B \-h
Show a brief help message for the program, and exit successfully.
.TP
.B \-V
Show
.BR rsync-backup 's
version number and some choice pieces of build-time configuration, and
exit successfully.
.TP
.BI "\-c " conf
Read
.I conf
instead of the default configuration file (shown as
.B conf
in the
.B \-V
output).
.TP
.B \-n
Don't actually take a backup, or write proper logs: instead, write a
description of what would be done to standard error.
.TP
.B \-v
Produce verbose progress information on standard output while the backup
is running. This keeps one amused while running a backup
interactively. In any event,
.B rsync-backup
will report failures to standard error, and otherwise run silently, so
it doesn't annoy unnecessarily if run by
.BR cron (8).
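.PP
For example, a new configuration can be tried out with a verbose dry run
before it is trusted to
.BR cron (8).
This is only a sketch: the path
.B /etc/rsync-backup/test.conf
is a hypothetical location, not a default taken from the program.
.RS
.nf
.ft B
rsync-backup \-nv \-c /etc/rsync-backup/test.conf
.ft P
.fi
.RE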
.SS Backup process
Backing up a filesystem works as follows.
.hP \*o
Make a snapshot of the filesystem on the client, and ensure that the
snapshot is mounted. There are some `trivial' snapshot types which use
the existing mounted filesystem, and either prevent processes writing to
it during the backup, or just hope for the best. Other snapshot types
require the snapshot to be mounted somewhere distinct from the main
filesystem, so that the latter can continue being used.
.hP \*o
Run
.B rsync
to copy the snapshot to the backup volume \(en specifically, to
.IB host / fs / new \fR.
If this directory already exists, then it's presumed to be debris from a
previous attempt to dump this filesystem:
.B rsync
will update it appropriately, by adding, deleting or modifying the
files. This means that retrying a failed dump \(en after fixing whatever
caused it to go wrong, obviously! \(en is usually fairly quick.
.hP \*o
Run
.B fshash
on the client to generate a `digest' describing the contents of the
filesystem, and send this to the server as
.IB host / fs / new .fshash \fR.
.hP \*o
Release the snapshot: we don't need it any more.
.hP \*o
Run
.B fshash
over the new backup; specifically, to
.BI tmp/fshash. host . fs . date \fR.
This gives us a digest for what the backup volume actually stored.
.hP \*o
Compare the two
.B fshash
digests. If they differ then dump the differences to the log file and
report a backup failure. (Backups aren't any good if they don't
actually back up the right thing. And you stand a better chance of
fixing them if you know that they're going wrong.)
.hP \*o
Commit the backup, by renaming the dump directory to
.IB host / fs / date
and the
.B fshash
digest file to
.IB host / fs / date .fshash \fR.
.PP
The backup is now complete.
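.PP
The final two steps amount to a comparison followed by a pair of renames.
The following shell fragment is an illustrative sketch only, not the
script's actual code: it assumes that the current directory is the top
of the backup store, that
.BR diff (1)
is an acceptable way to compare the digests, and that
.BR $host ,
.BR $fs ,
.B $date
and
.B $log
are hypothetical variables holding the obvious values.
.RS
.nf
.ft B
if diff \-u "$host/$fs/new.fshash" "tmp/fshash.$host.$fs.$date" >>"$log"
then
    # Digests agree: commit the dump under its date.
    mv "$host/$fs/new" "$host/$fs/$date"
    mv "$host/$fs/new.fshash" "$host/$fs/$date.fshash"
else
    # Digests differ: the differences have been written to the log.
    echo "backup of $host:$fs failed verification" >&2
fi
.ft P
.fi
.RE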
.SS Configuration commands
The configuration file is simply a Bash shell fragment: configuration
commands are shell functions.
.TP
.BI "backup " "fs\fR[:\fIfsarg\fR] ..."
Back up the named filesystems. The corresponding
.IR fsarg s
may be required by the snapshot type.
.TP
.BI "host " host
Future
.B backup
commands will back up filesystems on the named
.IR host .
To back up filesystems on the backup server itself, use its hostname:
.B rsync-backup
will avoid inefficient and pointless messing about with
.BR ssh (1)
in this case.
This command clears the
.B like
list, and resets the retention policy to its default (i.e., to the
policy defined prior to the first
.B host
command).
.TP
.BI "like " "host\fR ..."
Declare that subsequent filesystems are `similar' to like-named
filesystems on the named
.IR host s,
and that
.B rsync
should use those trees as potential sources of hardlinkable files. Be
careful when using this option without
.BR rsync 's
.B \-\-checksum
option: an erroneous hardlink will cause the backup to fail. (The
backup won't be left silently incorrect.)
.TP
.BI "retain " frequency " " duration
Define part of a backup retention policy: backup trees of the given
.I frequency
should be kept for the given
.IR duration .
The
.I frequency
can be
.BR daily ,
.BR weekly ,
.BR monthly ,
or
.B annually
(or
.BR yearly ,
which means the same); the
.I duration
may be any of
.BR week ,
.BR month ,
.BR year ,
or
.BR forever .
Expiry considers each existing dump against the policy lines in order:
the last applicable line determines the dump's fate \(en so you should
probably write the lines in decreasing order of duration.
.RS
.PP
Groups of
.B retain
commands between
.B host
and/or
.B backup
commands collectively define a retention policy. Once a policy is
defined, subsequent
.B backup
operations use the policy. The first
.B retain
command after a
.B host
or
.B backup
command clears the policy and starts defining a new one. The policy
defined before the first
.B host
is the
.I default
policy: at the start of each
.B host
stanza, the policy is reset to the default.
.RE
.TP
.BI "retry " count
The
.B live
snapshot type (see below) doesn't prevent a filesystem from being
modified while it's being backed up. If this happens, the
.B fshash
pass will detect the difference and fail. If the filesystem in question
is relatively quiescent, then maybe retrying the backup will result in a
successful consistent copy. Following this command, a backup which
results in an
.B fshash
mismatch will be retried up to
.I count
times before being declared a failure.
.TP
.BI "snap " type " " \fR[\fIargs\fR...]
Use the snapshot
.I type
for subsequent backups. Some snapshot types require additional
arguments, which may be supplied here. This command clears the
.B retry
counter.
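.PP
As an illustration, a small configuration file might look like the
following. This is a sketch under stated assumptions: the host names
.B alpha
and
.BR beta ,
the volume group
.BR vg0 ,
and the filesystem names are all hypothetical.
.RS
.nf
.ft B
# Default retention policy, in decreasing order of duration.
retain annually forever
retain monthly year
retain weekly month
retain daily week

# alpha: LVM snapshots of the volumes root and home in vg0.
host alpha
snap lvm vg0
backup root home

# beta: similar to alpha, but backed up live with up to two retries.
host beta
like alpha
snap live
retry 2
backup root:/ home:/home
.ft P
.fi
.RE
.PP
The ordering within each stanza matters:
.B host
clears the
.B like
list, and
.B snap
clears the
.B retry
counter, so
.B like
follows
.B host
and
.B retry
follows
.BR snap .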
.SS Configuration variables
The following shell variables may be overridden by the configuration
file.
.TP
.B MAXLOG
The number of log files to be kept for each filesystem. Old logfiles
are deleted to keep the total number below this bound. The default
value is 14.
.TP
.B RSYNCOPTS
Command-line options to pass to
.BR rsync (1)
in addition to the basic set:
.B \-\-archive
.B \-\-hard-links
.B \-\-numeric-ids
.B \-\-del
.B \-\-sparse
.B \-\-compress
.B \-\-one-file-system
.B \-\-partial
.BR "\-\-filter=""dir-merge .rsync-backup""" .
The default is
.BR \-\-verbose .
.TP
.B SNAPDIR
LVM (and
.BR rfreezefs )
snapshots are mounted on subdirectories below the
.B SNAPDIR
.IR "on backup clients" .
The default is
.IB mntbkpdir /snap
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B SNAPSIZE
The volume size option to pass to
.BR lvcreate (8)
when creating a snapshot. The default is
.B \-l10%ORIGIN
which seems to work fairly well.
.TP
.B STOREDIR
Where the actual backup trees should be stored. See the section on
.B Archive structure
below.
The default is
.IB mntbkpdir /store
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B HASH
The hash function to use for verifying archive integrity. This is
passed to the
.B \-H
option of
.BR fshash ,
so it must name one of the hash functions supported by your Python's
.B hashlib
module. The default is
.BR sha256 .
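.PP
Since the configuration file is a shell fragment, these variables are
set by ordinary assignments. The values below are illustrative
assumptions rather than recommendations;
.B /mnt/backup/store
in particular is a hypothetical path.
.RS
.nf
.ft B
MAXLOG=30
# Replaces the default \-\-verbose with a single extra rsync option.
RSYNCOPTS=\-\-checksum
SNAPSIZE=\-l20%ORIGIN
STOREDIR=/mnt/backup/store
HASH=sha512
.ft P
.fi
.RE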
.SS Hook functions
The configuration file may define shell functions to perform custom
actions at various points in the backup process.
.TP
.BI "backup_precommit_hook " host " " fs " " date
Called after a backup has been verified complete and is about to be
committed. The backup tree is in
.B new
in the current directory, and the
.B fshash
manifest is in
.BR new.fshash .
A typical action would be to create a digital signature on the
manifest.
.TP
.BI "backup_commit_hook " host " " fs " " date
Called during the commit procedure. The backup tree and manifest have
been renamed into their proper places. Typically one would use this
hook to rename files created by the
.B backup_precommit_hook
function.
.TP
.BR "whine " [ \-n ] " " \fItext\fR...
Called to report `interesting' events when the
.B \-v
option is in force. The default action is to echo the
.I text
to (what was initially) standard output, followed by a newline unless
.B \-n
is given.
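.PP
For instance, the two commit hooks can cooperate to keep a detached
signature beside each manifest. This sketch assumes that
.BR gpg (1)
is installed with a usable default signing key, and that
.B backup_commit_hook
runs in the same directory as
.BR backup_precommit_hook ;
the
.B \&.sig
file names are invented for the example.
.RS
.nf
.ft B
backup_precommit_hook () {
    # Arguments are host, fs and date; the manifest is new.fshash.
    gpg \-\-batch \-\-yes \-\-output new.fshash.sig \-\-detach-sign new.fshash
}

backup_commit_hook () {
    # The manifest is now called DATE.fshash: rename the signature too.
    local date=$3
    mv new.fshash.sig "$date.fshash.sig"
}
.ft P
.fi
.RE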
.SS Snapshot types
The following snapshot types are available.
.TP
.B live
A trivial snapshot type: attempts to back up a live filesystem. How
well this works depends on how active the filesystem is. If files
change while the dump is in progress then the
.B fshash
verification will likely fail. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.B ro
A slightly less trivial snapshot type: make the filesystem read-only
while the dump is in progress. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.BI "lvm " vg
Create snapshots using LVM. The snapshot argument is interpreted as the
relevant volume group. The filesystem name is interpreted as the origin
volume name; the snapshot will be called
.IB fs .bkp
and mounted on
.IB SNAPDIR / fs \fR;
space will be allocated to it according to the
.I SNAPSIZE
variable.
.TP
.BI "rfreezefs " client " " vg
This gets complicated. Suppose that a server has an LVM volume group,
and exports (somehow) a logical volume to a client. Examples are a host
providing a virtual disk to a guest, or a server providing
network-attached storage to a client. The server can create a snapshot
of the volume using LVM, but must synchronize with the client to ensure
that the filesystem image captured in the snapshot is clean. The
.BR rfreezefs (8)
program should be installed on the client to perform this rather
delicate synchronization. Declare the server using the
.B host
command as usual; pass the client's name as the
.I client
and the
server's volume group name as the
.I vg
snapshot arguments. Finally, backups using this snapshot type must
specify the filesystem mount point (or, actually, any file in the
filesystem) on the client, as the
.IR fsarg .
.PP
Additional snapshot types can be defined in the configuration file. A
snapshot type requires two shell functions.
.TP
.BI snap_ type " " snapargs " " fs " " fsarg
Create the snapshot, and write the mountpoint (on the client host) to
standard output, in a form suitable as an argument to
.BR rsync .
.TP
.BI unsnap_ type " " snapargs " " fs " " fsarg
Remove the snapshot.
.PP
There are a number of utility functions which can be used by snapshot
type handlers: please see the script for details. Please send the
author interesting snapshot handlers for inclusion in the main
distribution.
.SS Archive structure
Backup trees are stored in a fairly straightforward directory tree.
.PP
At the top level is one directory for each client host. There are also
some special entries:
.TP
.B \&.rsync-backup-store
This file must be present in order to indicate that a backup volume is
present (and not just an empty mount point).
.TP
.B fshash.cache
The cache database used for improving performance of local file
hashing. There may be other
.B fshash.cache-*
files used by SQLite for its own purposes.
.TP
.B lost+found
Part of the filesystem used on the backup volume. You don't want to
mess with this.
.TP
.B tmp
Used to store temporary files during the backup process. (Some of them
want to be on the same filesystem as the rest of the backup.) When
things go wrong, files are left behind in the hope that they might help
someone debug the mess. It's always safe to delete the files in here
when no backup is running.
.PP
So don't use those names for your hosts.
.PP
The next layer down contains a directory for each filesystem on the given host.
.PP
The bottom layer contains a directory for each dump of that filesystem,
named with the date at which the dump was started (in ISO8601
.IB yyyy \- mm \- dd
format), together with associated files named
.IB date .* \fR.
There is also a symbolic link
.B last
referring to the most recent backup of the filesystem.
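.PP
Putting this together, a store holding two dumps of the
.B root
filesystem of a (hypothetical) host
.B alpha
might look roughly like this. The dates are invented, the form of the
.B last
link target is an assumption, and other
.IB date .*
files may also be present.
.RS
.nf
.ft B
\&.rsync-backup-store
fshash.cache
lost+found/
tmp/
alpha/
    root/
        2012\-10\-06/
        2012\-10\-06.fshash
        2012\-10\-07/
        2012\-10\-07.fshash
        last \-> 2012\-10\-07
.ft P
.fi
.RE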
.SH SEE ALSO
.BR fshash (1),
.BR lvm (8),
.BR rfreezefs (8),
.BR rsync (1),
.BR ssh (1).
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>
457Mark Wooding, <mdw@distorted.org.uk>