.TH rfreezefs 8 "October 2011" "rsync-backup"
.SH NAME
rfreezefs \- freeze a filesystem safely
.SH SYNOPSIS
.B rfreezefs
.RB [ \-n ]
.RB [ \-a
.IR address ]
.RB [ \-p
.IR loport [\fB\- hiport ]]
.I filesystem
\&...
.SH DESCRIPTION
The
.B rfreezefs
program freezes one or more mounted filesystems for a period of time,
and then thaws them.  For more detail on what this means, why you'd want
to, and how you might go about using
.B rfreezefs
to do it, see below.
.PP
The following command-line options are recognized.
.TP
.B "\-h, \-\-help"
Writes a help message to standard output, and exits with status 0.
.TP
.B "\-v, \-\-version"
Writes the version number to standard output, and exits with status 0.
.TP
.B "\-u, \-\-usage"
Writes a command-line usage synopsis to standard output, and exits with
status 0.
.TP
.BI "\-a, \-\-address=" address
Listen only for incoming connections to the given
.IR address .
The default is to listen for connections to any local address.
.TP
.B "\-n, \-\-not-really"
Don't actually freeze or thaw any filesystems; instead, write messages
to standard error explaining what would be done.
.TP
.BI "\-p, \-\-port-range=" loport\fR[ \- hiport \fR]
Listen for incoming connections on a port between
.I loport
and
.IR hiport .
If
.I hiport
is omitted, listen for connections only on
.IR loport .
The default is to allow the kernel a free choice of local port number.
.PP
The
.I filesystem
arguments name the filesystems to be frozen.  There must be at least one
such argument.  It's conventional to name the filesystem mount points,
though actually any file or directory in the filesystem will do.  The
files are opened read-only.
.PP
The
.B rfreezefs
program starts, parses its command line, opens the named files, and
creates a listening TCP socket according to the command-line options.
It then prints a sequence of lines to standard output, which may have
one of the following forms.
.TP
.BI "PORT " port
Announces the TCP
.I port
number on which
.B rfreezefs
is listening for incoming connections.
.TP
.BI "TOKEN " label " " token
Declares a `token': a randomly chosen string which is to be used in the
network connection.  The token's value is
.IR token :
token values are a sequence of non-whitespace printable ASCII
characters, but their precise structure is not specified.  The token
value will have the meaning given by the
.IR label ,
which is one of the token labels described below.
.TP
.B READY
Marks the end of the lines and announces that
.B rfreezefs
is ready to accept connections.
.PP
These lines may be sent in any order, except that
.B READY
is always last.  There may be many
.B TOKEN
lines.
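.PP
As an illustration only, a caller written in Python might collect the
announcement along the following lines.  The function name
.B parse_announcement
is hypothetical, and rejecting unrecognized lines is an assumption of
this sketch rather than documented behaviour.

```python
def parse_announcement(lines):
    """Parse announcement lines up to and including READY.

    `lines` is any iterable of text lines, e.g. the standard output
    of the rfreezefs process.  Returns (port, tokens), where tokens
    maps each token label to its token value.
    """
    port = None
    tokens = {}
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # tolerate blank lines (an assumption)
        if fields[0] == "PORT" and len(fields) == 2:
            port = int(fields[1])
        elif fields[0] == "TOKEN" and len(fields) == 3:
            tokens[fields[1]] = fields[2]
        elif fields == ["READY"]:
            # READY is always last, so PORT must have been seen by now.
            if port is None:
                raise ValueError("READY seen before any PORT line")
            return port, tokens
        else:
            raise ValueError("unexpected line: %r" % (raw,))
    raise ValueError("output ended before READY")
```

.PP
Because the lines may arrive in any order, the sketch accumulates
everything it sees and only returns once
.B READY
has been read.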
.PP
Network communications use a simple plain-text line-oriented protocol.
Each line consists of a token, optionally followed by a carriage return
(code 13), followed by a linefeed (code 10).  No other whitespace is
permitted.  The tokens allowed are precisely those announced in the
.B TOKEN
lines written to
.BR rfreezefs 's
standard output.  Furthermore, only certain tokens are valid at
particular points in the protocol.  For reference, the token labels, and
the meanings of the corresponding tokens, are as follows.
.TP
.B FREEZE
Sent by a client to freeze the filesystems.  This must be the first
token transmitted by the client.  On receipt,
.B rfreezefs
will close its listening socket and any other client connections.  It
will then freeze the filesystems.
.TP
.B FROZEN
Sent by
.B rfreezefs
to indicate successful freezing of the filesystems.
.TP
.B KEEPALIVE
Sent periodically by the client to prevent filesystems being thawed due
to a timeout.  No explicit acknowledgement is sent.
.TP
.B THAW
Sent by the client to request thawing of the filesystems.
.TP
.B THAWED
Sent by
.B rfreezefs
to indicate successful thawing of the filesystems in response to
.BR THAW .
.PP
The high-level structure of the protocol is then as follows: the client
sends
.BR FREEZE ;
the server freezes and responds with
.BR FROZEN ;
the client optionally sends
.B KEEPALIVE
at intervals; the client finally sends
.BR THAW ;
and the server responds with
.B THAWED
and drops the connection.
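.PP
The exchange just described can be sketched in Python as follows.  This
is a minimal illustration, not a reference client: the names
.B run_session
and
.B while_frozen
are hypothetical, and
.B tokens
is assumed to be a dictionary mapping token labels to the token values
announced on
.BR rfreezefs 's
standard output.

```python
import socket

def run_session(host, port, tokens, while_frozen):
    """Freeze, call while_frozen(), then thaw.

    Returns True only if both FROZEN and THAWED were received, which
    is the only outcome in which the snapshot can be trusted.
    """
    with socket.create_connection((host, port)) as sock, \
         sock.makefile("r", encoding="ascii", newline="\n") as rfile:

        def send(label):
            # One token per line, terminated by a linefeed.
            sock.sendall(tokens[label].encode("ascii") + b"\n")

        def expect(label):
            # An optional carriage return may precede the linefeed.
            # A dropped connection yields "", which never matches.
            return rfile.readline().rstrip("\r\n") == tokens[label]

        send("FREEZE")            # must be the first token sent
        if not expect("FROZEN"):
            return False
        while_frozen()            # e.g. take the snapshot here
        send("THAW")
        return expect("THAWED")
```

.PP
The sketch sends no
.B KEEPALIVE
tokens, so
.B while_frozen
is assumed to finish well within the timeout.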
.PP
If sufficient time passes without
.B rfreezefs
receiving either
.B THAW
or
.B KEEPALIVE
tokens, or an invalid token is received, or it receives one of a number
of signals \(en currently
.BR SIGINT ,
.BR SIGQUIT ,
.BR SIGTERM ,
.BR SIGHUP ,
.BR SIGALRM ,
.BR SIGILL ,
.BR SIGSEGV ,
.BR SIGBUS ,
.BR SIGFPE ,
or
.B SIGABRT
\(en then
.B rfreezefs
will thaw the filesystems and report a failure.
.PP
Diagnostics are reported to standard error.  Exit statuses have specific
meanings:
.TP
.B 0
Successful completion.  Filesystems were frozen and thawed as required.
.TP
.B 1
Problem with command-line arguments.  No filesystems were frozen.
.TP
.B 2
Environmental problem, typically a system call failure: e.g., a file
failed to open, or there was a problem with the network communications.
Either no filesystems were frozen, or all filesystems were successfully
thawed again.
.TP
.B 3
Timeout or invalid data.  Either no connection sending the
.B FREEZE
token was made in time, or no data was received for a long enough period
after the filesystems were frozen, or an invalid token was received.  In
the first case, no filesystems were frozen; in the other two cases, the
filesystems were successfully thawed.
.TP
.B 4
Crash.  The
.B rfreezefs
program received a fatal signal after it had started to freeze
filesystems.  Under these circumstances, it thaws the filesystems,
removes the signal handler, and sends itself the signal again, but if
that doesn't work then
.B rfreezefs
exits with this status code.  All frozen filesystems were successfully
thawed again.
.TP
.B 112
Failure during filesystem thaw (mnemonic: European emergency number).
Some filesystems
.I failed
to thaw, and are still frozen.  You might have some joy with
.BR SysRq-j ,
though in the author's experience that doesn't work and you'll probably
have to reboot.  At least your filesystems are consistent...
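.PP
A caller might encode the list above roughly as follows.  This is a
sketch: the names
.BR EXIT_MEANINGS ,
.BR describe_exit ,
and
.B may_still_be_frozen
are hypothetical, and treating any unlisted or unavailable status as
`possibly still frozen' is a deliberately cautious assumption rather
than documented behaviour.

```python
# Statuses after which, per the list above, nothing remains frozen.
THAWED_STATUSES = {0, 1, 2, 3, 4}

EXIT_MEANINGS = {
    0: "success: filesystems frozen and thawed as required",
    1: "bad command-line arguments; nothing was frozen",
    2: "environmental problem; nothing frozen, or everything thawed",
    3: "timeout or invalid data; nothing frozen, or everything thawed",
    4: "fatal signal; all frozen filesystems were thawed",
    112: "thaw FAILED: at least one filesystem is still frozen",
}

def describe_exit(status):
    """Return a short description of an rfreezefs exit status."""
    return EXIT_MEANINGS.get(
        status, "unrecognized exit status %r" % (status,))

def may_still_be_frozen(status):
    """Return True unless `status` is documented as leaving nothing
    frozen.  Pass None if the status could not be retrieved at all."""
    return status not in THAWED_STATUSES
```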
.SS Background
When frozen, a filesystem's backing block device is put in a consistent
state (as if unmounted), and write operations to it are delayed until
the filesystem is thawed again.  In the meantime, it's possible to take
a consistent snapshot of the block device.  When a filesystem is
directly mounted on an LVM logical volume, the kernel detects this
situation and automatically freezes the filesystem while the snapshot is
being prepared.  If the logical volume and filesystem are on separate
hosts, though, the filesystem must be frozen manually, which is why
.B rfreezefs
is useful.
.PP
The idea is to run
.B rfreezefs
using
.BR ssh (1)
or
.BR userv (1),
or some other means of acquiring the necessary privilege level.  You
read the port number and tokens, connect to the socket, and send the
.B FREEZE
token followed by a newline.  You now wait to receive the
.B FROZEN
token from
.BR rfreezefs .
Once you have received this, the filesystems are frozen: you can safely
take snapshots.  If this will take an extended amount of time, you
should send
.B KEEPALIVE
tokens to the connection at intervals in order to prevent
.B rfreezefs
from timing out and thawing the filesystems (but see the
.B "Security notes"
below).  When your snapshot is prepared, send the
.B THAW
token, and wait for the
.B THAWED
token in response.  If this is received, the snapshot was completed
successfully and the filesystems are properly thawed again.  If you
don't receive the
.B THAWED
token then something bad might have happened (e.g., the filesystem might
have been prematurely thawed) and the snapshot is suspect.  If the exit
status is 112 then at least one filesystem is still frozen and some
emergency action is needed.  If you can't retrieve the exit status then
it's possible that your transport is blocked for trying to write to the
frozen filesystem (this is especially likely if
.B /
or
.B /var
is frozen) and you should react as if the status was 112.
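.PP
For a snapshot that might outlast the timeout, the periodic
.B KEEPALIVE
transmission could be arranged along these lines.  This is only a
sketch: the name
.B with_keepalives
and the default 20-second interval are assumptions, and
.B send_keepalive
stands for whatever caller-supplied function writes the
.B KEEPALIVE
token and a newline to the connection.

```python
import threading

def with_keepalives(send_keepalive, work, interval=20.0):
    """Run work() while calling send_keepalive() every `interval`
    seconds from a helper thread; return whatever work() returns."""
    done = threading.Event()

    def pinger():
        # Event.wait() returns False on timeout and True once `done`
        # is set, so the loop ends promptly when work() finishes.
        while not done.wait(interval):
            send_keepalive()

    helper = threading.Thread(target=pinger, daemon=True)
    helper.start()
    try:
        return work()
    finally:
        done.set()
        helper.join()
```

.PP
No keepalive is sent if
.B work
finishes within the first interval, which fits the advice to send
.B KEEPALIVE
tokens only when necessary.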
.SS Security notes
The
.B rfreezefs
program uses randomly chosen tokens to form a simple code which is
revealed to the caller.  It is assumed that this information is kept
secret from adversaries, e.g., by ensuring that it is only transmitted
over local pipes (as used by
.BR userv (1))
and/or secure network transports such as SSH (see
.BR ssh (1)).
The author believes that the worst possible outcome is that the host
wedges up because an important filesystem is frozen, and
.B rfreezefs
therefore strives to prevent that from happening.  In particular,
cryptographic transport implementations such as SSH may attempt to log
messages to frozen filesystems or otherwise wedge themselves:
.B rfreezefs
deliberately uses only kernel-implemented transports for its
communication needs once the filesystems are frozen.
.PP
Most of the tokens are used at most once in the protocol.  In
particular, the
.B FROZEN
token can't be sent by an adversary in advance of the filesystem being
frozen, since (under the assumption that the tokens are kept secret) it
is only revealed in the clear after a successful freeze.  Similarly, the
.B THAWED
token is only transmitted if the filesystems are thawed as a result of a
.B THAW
request (rather than a dropped connection, timeout, or some other
problem).  If the client only sends the
.B THAW
request once its snapshot is complete, then a
.B THAWED
response indicates that the filesystems remained frozen until the
snapshot was indeed completed and therefore the snapshot is consistent.
.PP
The exception is the
.B KEEPALIVE
token, which may be sent repeatedly.  After it is first revealed, an
adversary can hijack the connection and replay the
.B KEEPALIVE
token to keep the filesystems frozen indefinitely.  You can recover from
this by severing the connection somehow, or by sending
.B rfreezefs
a signal.  It is therefore recommended that
.B KEEPALIVE
tokens not be sent unless necessary.  The timeout is currently set to
60 seconds, which ought to be adequate for most snapshot mechanisms.
.SH BUGS
There ought to be a better one-time-token protocol for keepalives.  I
want to keep cryptography out of this program, though.
.SH SEE ALSO
.BR fsfreeze (8),
.BR random (4),
.BR lvm (8),
.BR ssh (1),
.BR userv (1).
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>