.\" rsync-backup.in, rsync-backup.8: Retry backups which fail fshash check.
.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH rsync-backup 8 "7 October 2012" rsync-backup
.SH SYNOPSIS
.B rsync-backup
.RB [ \-v ]
.RB [ \-c
.IR config-file ]
.SH DESCRIPTION
The
.B rsync-backup
script is a backup program of the currently popular
.RB ` rsync (1)
.BR \-\-link-dest '
variety. It uses
.BR rsync 's
ability to create hardlinks from (apparently) similar existing local
trees to make incremental dumps efficient, even from remote sources.
Restoring files is easy because the backups created are just directories
full of files, exactly as they were on the source \(en and this is
verified using the
.BR fshash (1)
program.
.PP
The script does more than just run
.BR rsync .
It is also responsible for creating and removing snapshots of volumes to
be backed up, and expiring old dumps according to a user-specified
retention policy.
.SS Installation
The idea is that the
.B rsync-backup
script should be installed and run on a central backup server with local
access to the backup volumes.
.PP
The script should be run with full (root) privileges, so that it can
correctly record file ownership information. The server should also be
able to connect via
.BR ssh (1)
to the client machines, and run processes there as root. (This is not a
security disaster. Remember that the backup server is, in the end,
responsible for the integrity of the backup data. A dishonest backup
server can easily compromise a client which is being restored from
corrupt backup data.)
.SS Command-line options
Most of the behaviour of
.B rsync-backup
is controlled by a configuration file, described starting with the
section named
.B Configuration commands
below.
But a few features are controlled by command-line options.
.TP
.B \-h
Show a brief help message for the program, and exit successfully.
.TP
.B \-V
Show
.BR rsync-backup 's
version number and some choice pieces of build-time configuration, and
exit successfully.
.TP
.BI "\-c " conf
Read
.I conf
instead of the default configuration file (shown as
.B conf
in the
.B \-V
output).
.TP
.B \-v
Produce verbose progress information on standard output while the backup
is running. This keeps one amused while running a backup
interactively. In any event,
.B rsync-backup
will report failures to standard error, and otherwise run silently, so
it doesn't annoy unnecessarily if run by
.BR cron (8).
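.PP
For instance, an unattended nightly run might be arranged with an entry
in root's
.BR crontab (5)
along the following lines. This is only a sketch: the installation
path, the configuration file name and the schedule are assumptions
chosen for illustration.
.PP
.RS
.nf
# Hypothetical crontab entry: run the backup at 02:30 every night.
# Without \-v the script is silent unless something fails.
30 2 * * * /usr/local/sbin/rsync-backup \-c /etc/rsync-backup.conf
.fi
.RE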
.SS Backup process
Backing up a filesystem works as follows.
.hP \*o
Make a snapshot of the filesystem on the client, and ensure that the
snapshot is mounted. There are some `trivial' snapshot types which use
the existing mounted filesystem, and either prevent processes writing to
it during the backup, or just hope for the best. Other snapshot types
require the snapshot to be mounted somewhere distinct from the main
filesystem, so that the latter can continue being used.
.hP \*o
Run
.B rsync
to copy the snapshot to the backup volume \(en specifically, to
.IB host / fs / new \fR.
If this directory already exists, then it's presumed to be debris from a
previous attempt to dump this filesystem:
.B rsync
will update it appropriately, by adding, deleting or modifying the
files. This means that retrying a failed dump \(en after fixing whatever
caused it to go wrong, obviously! \(en is usually fairly quick.
.hP \*o
Run
.B fshash
on the client to generate a `digest' describing the contents of the
filesystem, and send this to the server as
.IB host / fs / new .fshash \fR.
.hP \*o
Release the snapshot: we don't need it any more.
.hP \*o
Run
.B fshash
over the new backup, writing the resulting digest to
.BI tmp/fshash. host . fs . date \fR.
This gives us a digest for what the backup volume actually stored.
.hP \*o
Compare the two
.B fshash
digests. If they differ then dump the differences to the log file and
report a backup failure. (Backups aren't any good if they don't
actually back up the right thing. And you stand a better chance of
fixing them if you know that they're going wrong.)
.hP \*o
Commit the backup, by renaming the dump directory to
.IB host / fs / date
and the
.B fshash
digest file to
.IB host / fs / date .fshash \fR.
.PP
The backup is now complete.
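.PP
As an illustration, these are the names involved when a filesystem
.B home
on a host
.B mail
is dumped on 7 October 2012. The host and filesystem names are
hypothetical; all paths are relative to the backup store.
.PP
.RS
.nf
# While the dump is in progress
mail/home/new/                       tree written by rsync
mail/home/new.fshash                 digest made on the client
tmp/fshash.mail.home.2012\-10\-07      digest made on the server
.sp
# After the commit
mail/home/2012\-10\-07/
mail/home/2012\-10\-07.fshash
.fi
.RE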
.SS Configuration commands
The configuration file is simply a Bash shell fragment: configuration
commands are shell functions.
.TP
.BI "backup " "fs\fR[:\fIfsarg\fR] ..."
Back up the named filesystems. The corresponding
.IR fsarg s
may be required by the snapshot type.
.TP
.BI "host " host
Future
.B backup
commands will back up filesystems on the named
.IR host .
To back up filesystems on the backup server itself, use its hostname:
.B rsync-backup
will avoid inefficient and pointless messing about with
.BR ssh (1)
in this case.
This command clears the
.B like
list.
.TP
.BI "like " "host\fR ..."
Declare that subsequent filesystems are `similar' to like-named
filesystems on the named
.IR host s,
and that
.B rsync
should use those trees as potential sources of hardlinkable files. Be
careful when using this option without
.BR rsync 's
.B \-\-checksum
option: an erroneous hardlink will cause the backup to fail. (The
backup won't be left silently incorrect.)
.TP
.BI "retain " frequency " " duration
Define part of a backup retention policy: backup trees of the
.I frequency
should be kept for the
.IR duration .
The
.I frequency
can be
.BR daily ,
.BR weekly ,
.BR monthly ,
or
.B annually
(or
.BR yearly ,
which means the same); the
.I duration
may be any of
.BR week ,
.BR month ,
.BR year ,
or
.BR forever .
Expiry considers each existing dump against the policy lines in order:
the last applicable line determines the dump's fate \(en so you should
probably write the lines in decreasing order of duration.
.TP
.BI "retry " count
The
.B live
snapshot type (see below) doesn't prevent a filesystem from being
modified while it's being backed up. If this happens, the
.B fshash
pass will detect the difference and fail. If the filesystem in question
is relatively quiescent, then maybe retrying the backup will result in a
successful consistent copy. Following this command, a backup which
results in an
.B fshash
mismatch will be retried up to
.I count
times before being declared a failure.
.TP
.BI "snap " type " " \fR[\fIargs\fR...]
Use the snapshot
.I type
for subsequent backups. Some snapshot types require additional
arguments, which may be supplied here. This command clears the
.B retry
counter.
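.PP
The commands above might be combined as in the following sketch of a
configuration file. The host names, the volume group
.B vg0
and the filesystem names are all invented for illustration; note that
.B like
follows
.B host
(which clears the
.B like
list) and
.B retry
follows
.B snap
(which clears the
.B retry
counter).
.PP
.RS
.nf
# Retention policy, longest duration first as suggested above.
retain annually forever
retain monthly year
retain weekly month
retain daily week
.sp
# Hypothetical server with logical volumes `root' and `home' in vg0.
host fileserver
snap lvm vg0
backup root home
.sp
# Hypothetical laptop: no snapshots, so dump the live filesystem and
# allow two retries if the fshash digests disagree.
host laptop
like fileserver
snap live
retry 2
backup root:/
.fi
.RE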
.SS Configuration variables
The following shell variables may be overridden by the configuration
file.
.TP
.B MAXLOG
The number of log files to be kept for each filesystem. Old log files
are deleted to keep the total number below this bound. The default
value is 14.
.TP
.B RSYNCOPTS
Command-line options to pass to
.BR rsync (1)
in addition to the basic set:
.B \-\-archive
.B \-\-hard-links
.B \-\-numeric-ids
.B \-\-del
.B \-\-sparse
.B \-\-compress
.B \-\-one-file-system
.B \-\-partial
.BR "\-\-filter=""dir-merge .rsync-backup""" .
The default is
.BR \-\-verbose .
.TP
.B SNAPDIR
LVM (and
.BR rfreezefs )
snapshots are mounted on subdirectories below the
.B SNAPDIR
.IR "on backup clients" .
The default is
.IB mntbkpdir /snap
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B SNAPSIZE
The volume size option to pass to
.BR lvcreate (8)
when creating a snapshot. The default is
.B \-l10%ORIGIN
which seems to work fairly well.
.TP
.B STOREDIR
Where the actual backup trees should be stored. See the section on
.B Archive structure
below.
The default is
.IB mntbkpdir /store
where
.I mntbkpdir
is the backup mount directory configured at build time.
.TP
.B HASH
The hash function to use for verifying archive integrity. This is
passed to the
.B \-H
option of
.BR fshash ,
so it must name one of the hash functions supported by your Python's
.B hashlib
module. The default is
.BR sha256 .
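.PP
Since the configuration file is a shell fragment, these variables are
overridden by ordinary assignments. The values below are arbitrary
examples rather than recommendations.
.PP
.RS
.nf
# Keep four weeks of nightly logs instead of the default 14.
MAXLOG=28
.sp
# Allow snapshots to grow to 20% of the origin volume.
SNAPSIZE=\-l20%ORIGIN
.sp
# Any hash known to Python's hashlib will do.
HASH=sha512
.fi
.RE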
.SS Hook functions
The configuration file may define shell functions to perform custom
actions at various points in the backup process.
.TP
.BI "backup_precommit_hook " host " " fs " " date
Called after a backup has been verified complete and is about to be
committed. The backup tree is in
.B new
in the current directory, and the
.B fshash
manifest is in
.BR new.fshash .
A typical action would be to create a digital signature on the
manifest.
.TP
.BI "backup_commit_hook " host " " fs " " date
Called during the commit procedure. The backup tree and manifest have
been renamed into their proper places. Typically one would use this
hook to rename files created by the
.B backup_precommit_hook
function.
.TP
.BR "whine " [ \-n ] " " \fItext\fR...
Called to report `interesting' events when the
.B \-v
option is in force. The default action is to echo the
.I text
to (what was initially) standard output, followed by a newline unless
.B \-n
is given.
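.PP
As a sketch of how the two backup hooks might cooperate, the following
pair signs the manifest before the commit and then renames the
signature to match the committed dump. It assumes that
.BR gpg (1)
is installed with a usable default key, and that
.B backup_commit_hook
runs in the same directory as
.BR backup_precommit_hook ;
the
.B .sig
file names are invented here.
.PP
.RS
.nf
backup_precommit_hook () {
  # Arguments are host, fs and date; none is needed here.
  # Make a detached signature of the verified manifest.
  gpg \-\-batch \-\-detach\-sign \-\-output new.fshash.sig new.fshash
}
.sp
backup_commit_hook () {
  date=$3
  # Assumption: we are still in the host/fs directory.
  mv new.fshash.sig "$date.fshash.sig"
}
.fi
.RE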
.SS Snapshot types
The following snapshot types are available.
.TP
.B live
A trivial snapshot type: attempts to back up a live filesystem. How
well this works depends on how active the filesystem is. If files
change while the dump is in progress then the
.B fshash
verification will likely fail. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.B ro
A slightly less trivial snapshot type: make the filesystem read-only
while the dump is in progress. Backups using this snapshot type must
specify the filesystem mount point as the
.IR fsarg .
.TP
.BI "lvm " vg
Create snapshots using LVM. The snapshot argument is interpreted as the
relevant volume group. The filesystem name is interpreted as the origin
volume name; the snapshot will be called
.IB fs .bkp
and mounted on
.IB SNAPDIR / fs \fR;
space will be allocated to it according to the
.B SNAPSIZE
variable.
.TP
.BI "rfreezefs " client " " vg
This gets complicated. Suppose that a server has an LVM volume group,
and exports (somehow) a logical volume to a client. Examples are a host
providing a virtual disk to a guest, or a server providing
network-attached storage to a client. The server can create a snapshot
of the volume using LVM, but must synchronize with the client to ensure
that the filesystem image captured in the snapshot is clean. The
.BR rfreezefs (8)
program should be installed on the client to perform this rather
delicate synchronization. Declare the server using the
.B host
command as usual; pass the client's name as the
.I client
and the
server's volume group name as the
.I vg
snapshot arguments. Finally, backups using this snapshot type must
specify the filesystem mount point (or, actually, any file in the
filesystem) on the client, as the
.IR fsarg .
.PP
Additional snapshot types can be defined in the configuration file. A
snapshot type requires two shell functions.
.TP
.BI snap_ type " " snapargs " " fs " " fsarg
Create the snapshot, and write the mountpoint (on the client host) to
standard output, in a form suitable as an argument to
.BR rsync .
.TP
.BI unsnap_ type " " snapargs " " fs " " fsarg
Remove the snapshot.
.PP
There are a number of utility functions which can be used by snapshot
type handlers: please see the script for details. Please send the
author interesting snapshot handlers for inclusion in the main
distribution.
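.PP
A minimal sketch of a user-defined snapshot type follows. The type name
.B plain
is invented, and the sketch assumes that the three arguments arrive as
separate positional parameters, as the synopses above suggest. It
behaves much like
.BR live :
no real snapshot is taken, and the mount point given as the
.I fsarg
is reported unchanged.
.PP
.RS
.nf
snap_plain () {
  # Arguments are snapargs, fs and fsarg; only the last is used.
  fsarg=$3
  # Nothing to create: report the existing mount point.
  echo "$fsarg"
}
.sp
unsnap_plain () {
  # Nothing was created, so there is nothing to remove.
  :
}
.fi
.RE
.PP
It would be selected with
.B "snap plain"
and used as, for example,
.BR "backup home:/home" .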
.SS Archive structure
Backup trees are stored in a fairly straightforward directory tree.
.PP
At the top level is one directory for each client host. There are also
some special entries:
.TP
.B \&.rsync-backup-store
This file must be present in order to indicate that a backup volume is
present (and not just an empty mount point).
.TP
.B fshash.cache
The cache database used for improving performance of local file
hashing. There may be other
.B fshash.cache-*
files used by SQLite for its own purposes.
.TP
.B lost+found
Part of the filesystem used on the backup volume. You don't want to
mess with this.
.TP
.B tmp
Used to store temporary files during the backup process. (Some of them
want to be on the same filesystem as the rest of the backup.) When
things go wrong, files are left behind in the hope that they might help
someone debug the mess. It's always safe to delete the files in here
when no backup is running.
.PP
So don't use those names for your hosts.
.PP
The next layer down contains a directory for each filesystem on the
given host.
.PP
The bottom layer contains a directory for each dump of that filesystem,
named with the date at which the dump was started (in ISO8601
.IB yyyy \- mm \- dd
format), together with associated files named
.IB date .* \fR.
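.PP
Putting the layers together, a store holding two dumps of a filesystem
.B home
from a host
.B mail
might look like this. The host name, filesystem name and dates are
hypothetical.
.PP
.RS
.nf
\&.rsync-backup-store
fshash.cache
lost+found/
tmp/
mail/
  home/
    2012\-10\-06/
    2012\-10\-06.fshash
    2012\-10\-07/
    2012\-10\-07.fshash
.fi
.RE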
.SH SEE ALSO
.BR fshash (1),
.BR lvm (8),
.BR rfreezefs (8),
.BR rsync (1),
.BR ssh (1).
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>