.TH rfreezefs 8 "October 2011" "rsync-backup"
.SH NAME
rfreezefs \- freeze a filesystem safely
.SH SYNOPSIS
.B rfreezefs
.RB [ \-n ]
.RB [ \-a
.IR address ]
.RB [ \-p
.IR loport [\fB\- hiport ]]
.I filesystem
\&...
.SH DESCRIPTION
The
.B rfreezefs
program freezes one or more mounted filesystems for a period of time,
and then thaws them. For more detail on what this means, why you'd want
to, and how you might go about using
.B rfreezefs
to do it, see below.
.PP
The following command-line options are recognized.
.TP
.B "\-h, \-\-help"
Writes a help message to standard output, and exits with status 0.
.TP
.B "\-v, \-\-version"
Writes the version number to standard output, and exits with status 0.
.TP
.B "\-u, \-\-usage"
Writes a command-line usage synopsis to standard output, and exits with
status 0.
.TP
.BI "\-a, \-\-address=" address
Listen only for incoming connections to the given
.IR address .
The default is to listen for connections to any local address.
.TP
.B "\-n, \-\-not-really"
Don't actually freeze or thaw any filesystems; instead, write messages
to standard error explaining what would be done.
.TP
.BI "\-p, \-\-port-range=" loport\fR[ \- hiport \fR]
Listen for incoming connections on a port between
.I loport
and
.IR hiport .
If
.I hiport
is omitted, listen for connections only on
.IR loport .
The default is to allow the kernel a free choice of local port number.
.PP
The
.I filesystem
arguments name the filesystems to be frozen. There must be at least one
such argument. It's conventional to name the filesystem mount points,
though actually any file or directory in the filesystem will do. The
files are opened read-only.
.PP
The
.B rfreezefs
program starts, parses its command line, opens the named files, and
creates a listening TCP socket according to the command-line options.
It then prints a sequence of lines to standard output, which may have
one of the following forms.
.TP
.BI "PORT " port
Announces the TCP
.I port
number on which
.B rfreezefs
is listening for incoming connections.
.TP
.BI "TOKEN " label " " token
Declares a `token': a randomly chosen string which is to be used in the
network connection. The token's value is
.IR token :
token values are a sequence of non-whitespace printable ASCII
characters, but their precise structure is not specified. The token
value will have the meaning given by the
.IR label ,
which is one of the token labels described below.
.TP
.B READY
Marks the end of the lines and announces that
.B rfreezefs
is ready to accept connections.
.PP
These lines may be sent in any order, except that
.B READY
is always last. There may be many
.B TOKEN
lines.
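.PP
As an illustration only, a caller might collect the announced port and
tokens as sketched below in Python. This is not part of
.BR rfreezefs :
the function name
.B parse_announcement
is hypothetical, and rejecting any line that does not match one of the
three forms above is an assumption made for the example.
.PP
.nf
```python
def parse_announcement(lines):
    """Parse the startup output of rfreezefs, up to and including READY.

    Returns (port, tokens), where tokens maps each label to its value.
    """
    port = None
    tokens = {}
    for line in lines:
        fields = line.split()
        if fields == ["READY"]:
            # READY is always last, so PORT must already have been seen.
            if port is None:
                raise ValueError("READY seen before any PORT line")
            return port, tokens
        if len(fields) == 2 and fields[0] == "PORT":
            port = int(fields[1])
        elif len(fields) == 3 and fields[0] == "TOKEN":
            tokens[fields[1]] = fields[2]
        else:
            # Assumption: treat anything else as an error.
            raise ValueError("unexpected announcement line: " + repr(line))
    raise ValueError("output ended before READY")
```
.fi
.PP
The lines would typically be read from the standard output of the
process through which
.B rfreezefs
was started.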
.PP
Network communications use a simple plain-text line-oriented protocol.
Each line consists of a token, optionally followed by a carriage return
(code 13), followed by a linefeed (code 10). No other whitespace is
permitted. The tokens allowed are precisely those announced in the
.B TOKEN
lines written to
.BR rfreezefs 's
standard output. Furthermore, only certain tokens are valid at
particular points in the protocol. For reference, the token labels, and
the meanings of the corresponding tokens, are as follows.
.TP
.B FREEZE
Sent by a client to freeze the filesystems. This must be the first
token transmitted by the client. On receipt,
.B rfreezefs
will close its listening socket and any other client connections. It
will then freeze the filesystems.
.TP
.B FROZEN
Sent by
.B rfreezefs
to indicate successful freezing of the filesystems.
.TP
.B KEEPALIVE
Sent periodically by the client to prevent filesystems being thawed due
to a timeout. No explicit acknowledgement is sent.
.TP
.B THAW
Sent by the client to request thawing of the filesystems.
.TP
.B THAWED
Sent by
.B rfreezefs
to indicate successful thawing of the filesystems in response to
.BR THAW .
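.PP
The line format described above (a token, an optional carriage return,
then a linefeed) can be sketched in Python as follows. The helper names
.B encode_line
and
.B decode_line
are hypothetical, and the
.I tokens
argument is assumed to be a dictionary mapping each label to the token
value announced for it.
.PP
.nf
```python
CR, LF = 13, 10  # carriage return and linefeed character codes


def encode_line(token):
    """Frame a token value as one protocol line: the token, then LF."""
    return token.encode("ascii") + bytes([LF])


def decode_line(raw, tokens):
    """Map one received line (bytes ending in LF) to its token label.

    Returns None for anything that is not exactly an announced token.
    """
    if not raw.endswith(bytes([LF])):
        return None
    body = raw[:-1]
    if body.endswith(bytes([CR])):
        body = body[:-1]  # the CR before the LF is optional
    for label, value in tokens.items():
        if body == value.encode("ascii"):
            return label
    return None
```
.fi
.PP
Because no other whitespace is permitted, the sketch compares the line
body with each token value exactly rather than trimming it.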
.PP
The high-level structure of the protocol is then as follows: the client
sends
.BR FREEZE ;
the server freezes and responds with
.BR FROZEN ;
the client optionally sends
.B KEEPALIVE
at intervals; the client finally sends
.BR THAW ;
and the server responds with
.B THAWED
and drops the connection.
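.PP
The exchange just described might look like this from the client's
side. This is a minimal Python sketch under stated assumptions, not a
reference client: the function
.B freeze_session
and its
.I take_snapshot
callback are hypothetical,
.I tokens
is assumed to map labels to announced token values, and
.B KEEPALIVE
is omitted on the assumption that the snapshot finishes before the
timeout.
.PP
.nf
```python
import socket


def freeze_session(sock, tokens, take_snapshot):
    """Run FREEZE, snapshot, THAW over an already connected socket.

    Returns True only if THAWED arrived in reply to THAW.
    """
    lf = bytes([10])
    reader = sock.makefile("rb")

    def send(label):
        sock.sendall(tokens[label].encode("ascii") + lf)

    def expect(label):
        # Strip the LF and the optional CR before comparing.
        line = reader.readline().rstrip(bytes([13, 10]))
        return line == tokens[label].encode("ascii")

    send("FREEZE")
    if not expect("FROZEN"):
        return False
    take_snapshot()  # hypothetical hook supplied by the caller
    send("THAW")
    return expect("THAWED")
```
.fi
.PP
A caller would connect a TCP socket to the announced port before
calling the function, and should treat a false result as meaning that
the snapshot cannot be trusted.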
.PP
If sufficient time passes without
.B rfreezefs
receiving either
.B THAW
or
.B KEEPALIVE
tokens, or an invalid token is received, or it receives one of a number
of signals \(en currently
.BR SIGINT ,
.BR SIGQUIT ,
.BR SIGTERM ,
.BR SIGHUP ,
.BR SIGALRM ,
.BR SIGILL ,
.BR SIGSEGV ,
.BR SIGBUS ,
.BR SIGFPE ,
or
.B SIGABRT
\(en then
.B rfreezefs
will thaw the filesystems and report a failure.
.PP
Diagnostics are reported to standard error. Exit statuses have specific
meanings:
.TP
.B 0
Successful completion. Filesystems were frozen and thawed as required.
.TP
.B 1
Problem with command-line arguments. No filesystems were frozen.
.TP
.B 2
Environmental problem, typically a system call failure: e.g., a file
failed to open, or there was a problem with the network communications.
Either no filesystems were frozen, or all filesystems were successfully
thawed again.
.TP
.B 3
Timeout or invalid data. Either no connections containing the
.B FREEZE
token were made in time, or no data was received for a long enough
period after the filesystems were frozen, or an invalid token was
received. In the first case, no filesystems were frozen; in the other
two cases, the filesystems were successfully thawed.
.TP
.B 4
Crash. The
.B rfreezefs
program received a fatal signal after it had started to freeze
filesystems. Under these circumstances, it thaws the filesystems,
removes the signal handler, and sends itself the signal again, but if
that doesn't work then
.B rfreezefs
exits with this status code. All frozen filesystems were successfully
thawed again.
.TP
.B 112
Failure during filesystem thaw (mnemonic: European emergency number).
Some filesystems
.I failed
to thaw, and are still frozen. You might have some joy with
.BR SysRq-j ,
though in the author's experience that doesn't work and you'll probably
have to reboot. At least your filesystems are consistent...
.SS Background
When frozen, a filesystem's backing block device is put in a consistent
state (as if unmounted), and write operations to it are delayed until
the filesystem is thawed again. In the meantime, it's possible to take
a consistent snapshot of the block device. When a filesystem is
directly mounted on an LVM logical volume, the kernel detects this
situation and automatically freezes the filesystem while the snapshot is
being prepared. If the logical volume and filesystem are on separate
hosts, though, the filesystem must be frozen manually, which is why
.B rfreezefs
is useful.
.PP
The idea is to run
.B rfreezefs
using
.BR ssh (1)
or
.BR userv (1),
or some other means of acquiring the necessary privilege level. You
read the port number and tokens, connect to the socket, and send the
.B FREEZE
token followed by a newline. You now wait to receive the
.B FROZEN
token from
.BR rfreezefs .
Once you have received this, the filesystems are frozen: you can safely
take snapshots. If this will take an extended amount of time, you
should send
.B KEEPALIVE
tokens to the connection at intervals in order to prevent
.B rfreezefs
from timing out and thawing the filesystems (but see the
.B "Security notes"
below). When your snapshot is prepared, send the
.B THAW
token, and wait for the
.B THAWED
token in response. If this is received, the snapshot was completed
successfully and the filesystems are properly thawed again. If you
don't receive the
.B THAWED
token then something bad might have happened (e.g., the filesystem might
have been prematurely thawed) and the snapshot is suspect. If the exit
status is 112 then at least one filesystem is still frozen and some
emergency action is needed. If you can't retrieve the exit status then
it's possible that your transport is blocked for trying to write to the
frozen filesystem (this is especially likely if
.B /
or
.B /var
is frozen) and you should react as if the status was 112.
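.PP
The decision rules in the preceding paragraph can be summarized as a
small Python sketch. The function
.B assess
and its three result strings are hypothetical labels chosen for the
example; checking for
.B THAWED
before looking at the exit status is one reading of the text, not
something
.B rfreezefs
prescribes.
.PP
.nf
```python
def assess(got_thawed, exit_status):
    """Classify the outcome of a freeze, snapshot and thaw attempt.

    got_thawed:  True if the THAWED token was received.
    exit_status: exit status of rfreezefs, or None if it could not be
                 retrieved.
    """
    if got_thawed:
        return "ok"  # snapshot complete, filesystems thawed again
    if exit_status is None or exit_status == 112:
        # An unobtainable status is treated exactly like 112.
        return "emergency"
    return "suspect"  # thawed, but possibly before the snapshot finished
```
.fi
.PP
For example, a missing
.B THAWED
token with exit status 3 yields the suspect result, while the same
situation with no retrievable status is treated as an emergency.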
.SS Security notes
The
.B rfreezefs
program uses randomly chosen tokens to form a simple code which is
revealed to the caller. It is assumed that this information is kept
secret from adversaries, e.g., by ensuring that it is only transmitted
over local pipes (as used by
.BR userv (1))
and/or secure network transports such as SSH (see
.BR ssh (1)).
The author believes that the worst possible outcome is that the host
wedges up because an important filesystem is frozen, and
.B rfreezefs
therefore strives to prevent that from happening. In particular,
cryptographic transport implementations such as SSH may attempt to log
messages to frozen filesystems or otherwise wedge themselves:
.B rfreezefs
deliberately uses only kernel-implemented transports for its
communication needs once the filesystems are frozen.
.PP
Most of the tokens are used at most once in the protocol. In
particular, the
.B FROZEN
token can't be sent by an adversary in advance of the filesystem being
frozen, since (under the assumption that the tokens are kept secret) it
is only revealed in the clear after a successful freeze. Similarly, the
.B THAWED
token is only transmitted if the filesystems are thawed as a result of a
.B THAW
request (rather than a dropped connection, timeout, or some other
problem). If the client only sends the
.B THAW
request once its snapshot is complete, then a
.B THAWED
response indicates that the filesystems remained frozen until the
snapshot was indeed completed and therefore the snapshot is consistent.
.PP
The exception is the
.B KEEPALIVE
token, which may be sent repeatedly. After it is first revealed, an
adversary can hijack the connection and replay the
.B KEEPALIVE
token to keep the filesystems frozen indefinitely. You can recover from
this by severing the connection somehow, or by sending
.B rfreezefs
a signal. It is therefore recommended that
.B KEEPALIVE
tokens not be sent unless necessary. The timeout is currently set to
60s, which ought to be adequate for most snapshot mechanisms.
.SH BUGS
There ought to be a better one-time-token protocol for keepalives. I
want to keep cryptography out of this program, though.
.SH SEE ALSO
.BR fsfreeze (8),
.BR random (4),
.BR lvm (8),
.BR ssh (1),
.BR userv (1).
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>