| 1 | .ie t .ds o \(bu |
| 2 | .el .ds o o |
| 3 | .de hP |
| 4 | .IP |
| 5 | \h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c |
| 6 | .. |
| 7 | .TH fshash 1 "8 October 2012" rsync-backup |
| 8 | .SH SYNOPSIS |
| 9 | .B fshash |
| 10 | .RB [ \-a ] |
| 11 | .RB [ \-c |
| 12 | .IR cache ] |
| 13 | .RB [ \-f |
| 14 | .IR format ] |
| 15 | .RB [ \-C |
| 16 | .IR version ] |
| 17 | .RB [ \-H |
| 18 | .IR hash ] |
| 19 | .RI [ file |
| 20 | \&...] |
| 21 | .br |
| 22 | .B fshash |
| 23 | .RB \-u |
| 24 | .B \-c |
| 25 | .I cache |
| 26 | .RB [ \-H |
| 27 | .IR hash ] |
| 28 | .RI [ dir ] |
| 29 | .SH DESCRIPTION |
| 30 | The |
| 31 | .B fshash |
| 32 | program generates digests of filesystems. It's similar in concept to |
| 33 | (but somewhat different from) Ian Jackson's |
| 34 | .BR summer (1) |
| 35 | tool. |
| 36 | .PP |
| 37 | The idea is to capture everything interesting about a filesystem in a |
| 38 | file with the following properties: |
| 39 | .TP |
| 40 | .I Completeness |
| 41 | The digest file describes everything `interesting' about the filesystem, |
| 42 | such that two filesystems which are interestingly different will have |
| 43 | distinct digests. |
| 44 | .TP |
| 45 | .I Canonicalness |
| 46 | If two filesystems aren't different in any interesting way, then their |
| 47 | digests should be identical. |
| 48 | .TP |
| 49 | .I Readability |
| 50 | Given two subtly different filesystems, it's easy for a human equipped |
| 51 | with digests for them and |
| 52 | .BR diff (1) |
| 53 | to work out what the differences actually are. |
| 54 | .SS Command-line processing |
| 55 | The following command-line arguments are accepted. |
| 56 | .TP |
| 57 | .B \-h, \-\-help |
| 58 | Show a summary of the command-line syntax, and exit successfully. |
| 59 | .TP |
| 60 | .B \-\-version |
| 61 | Show the program's version number, and exit successfully. |
| 62 | .TP |
| 63 | .B \-a, \-\-all |
| 64 | Clear the cache of information about all files except those processed in |
| 65 | this run. |
| 66 | .TP |
| 67 | .B \-c, \-\-cache=\fIfile |
| 68 | Keep a cache of file hashes in the |
| 69 | .IR file . |
| 70 | The cache is keyed by inode and modification time: if a file has an |
| 71 | entry in the cache already then it won't be hashed again, which can |
| 72 | provide a valuable performance improvement on large filesystems. If the |
| 73 | .I file |
| 74 | doesn't exist, then it will be created. |
| 75 | .TP |
| 76 | .B \-f, \-\-files=\fIformat |
| 77 | Read a list of filenames on standard input in the given |
| 78 | .I format |
| 79 | and write digest lines for them. The |
| 80 | .I format |
| 81 | may be: |
| 82 | .B find0 |
| 83 | for simple NUL-terminated names, as produced by |
| 84 | .BR "find \-print0" ; |
| 85 | or |
| 86 | .B rsync |
| 87 | for file data as produced by |
| 88 | .BR rsync (1). |
| 89 | The latter is useful, since |
| 90 | .B rsync |
| 91 | has powerful file inclusion and exclusion capabilities \(en and a common |
| 92 | use case is generating a digest for a collection of files copied using |
| 93 | .BR rsync . |
| 94 | (The |
| 95 | .B find0 |
| 96 | format doesn't work well: see |
| 97 | .B BUGS |
| 98 | below.) |
| 99 | .TP |
| 100 | .B \-C, \-\-compat=\fIversion |
| 101 | Produce a manifest with the given compatibility |
| 102 | .IR version . |
| 103 | Alas, earlier versions of |
| 104 | .B fshash |
| 105 | had bugs in the way they produced manifests. Fixing the bugs makes the |
| 106 | output better, but it can no longer be compared with old manifests which |
| 107 | were made with the bugs. By default, |
| 108 | .B fshash |
| 109 | produces manifests in the most recent format, but this option will force |
| 110 | it to be compatible with old versions. The original version was 1; all |
| 111 | later versions print a comment reporting the version number at the start |
| 112 | of the manifest. The current version is 2. |
| 113 | .TP |
| 114 | .B \-H, \-\-hash=\fIhash |
| 115 | Use the |
| 116 | .I hash |
| 117 | function, which can be any hash function supported by Python's |
| 118 | .BR hashlib . |
| 119 | This option may be omitted: if it is, then the hash is read from the |
| 120 | cache file; if there is no cache file either, then an error is reported. |
| 121 | .TP |
| 122 | .B \-u, \-\-udiff |
| 123 | Rather than produce a manifest, read a unified |
| 124 | .BR diff (1) |
| 125 | from standard input, and clear from the cache all files mentioned as |
| 126 | being different. Filenames in the diff are considered relative to |
| 127 | .IR dir , |
| 128 | defaulting to the current working directory. |
| 129 | .PP |
| 130 | Positional arguments are interpreted as files and directories to be |
| 131 | processed, in order. A directory name which ends in |
| 132 | .RB ` / ' |
| 133 | is treated specially: |
| 134 | .B fshash |
| 135 | writes filenames relative to the given directory. |
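To illustrate the hash selection described under the -H option, here is a minimal Python sketch of hashing a file's content with a hashlib function chosen by name at run time. The helper name hash_file and the 64 KiB chunk size are assumptions made for this example; they are not taken from fshash itself.

```python
import hashlib


def hash_file(path, hash_name="sha256", chunk_size=64 * 1024):
    """Return the hex digest of a file's content (hypothetical helper).

    hash_name may be any name accepted by hashlib.new(); an unsupported
    name raises ValueError.
    """
    h = hashlib.new(hash_name)
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Reading in chunks keeps memory use constant regardless of file size, which matters when digesting a whole filesystem.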
| 136 | .SS Output format |
| 137 | Information about each filesystem object is written on a separate line. |
| 138 | These lines can be quite long, and consist of a number of fields: |
| 139 | .hP 1. |
| 140 | For regular files, a cryptographic hash of the file's content, in |
| 141 | hexadecimal. For other kinds of filesystem object, a description of the |
| 142 | object type and any special information about it, in square brackets, |
| 143 | and padded with spaces so as to take the same width as a hash; see |
| 144 | below for details. |
| 145 | .hP 2. |
| 146 | A `virtual inode identifier': a string which will be the same in two |
| 147 | lines if and only if they represent hard links to the same underlying |
| 148 | inode. Some care is taken so that files are assigned the same |
| 149 | identifier even if other parts of the filesystem are different, so as to |
| 150 | avoid spurious differences. |
| 151 | .hP 3. |
| 152 | The object's permissions and mode bits, in octal. |
| 153 | .hP 4. |
| 154 | The file's owner and group, in decimal, separated by a colon. |
| 155 | .hP 5. |
| 156 | The file's last-modified time, in UTC, in ISO 8601 format, i.e., |
| 157 | .IB yyyy \- mm \- dd T hh : mm : ss Z \fR. |
| 158 | .hP 6. |
| 159 | The file's size in bytes, in decimal. |
| 160 | .hP 7. |
| 161 | The file's name (relative to some appropriate parent directory). |
| 162 | Characters which |
| 163 | would cause ambiguity are escaped: tab, linefeed and carriage return are |
| 164 | printed as |
| 165 | .RB ` \et ', |
| 166 | .RB ` \en ', |
| 167 | and |
| 168 | .RB ` \er ', |
| 169 | respectively; |
| 170 | .RB ` ' ' |
| 171 | is printed as |
| 172 | .RB ` \e' '; |
| 173 | .RB ` \e ' |
| 174 | is printed as |
| 175 | .RB ` \e\e '; |
| 176 | and other codes outside the range 32\(en127 are printed as hex escapes, |
| 177 | in the form |
| 178 | .RB ` \ex\fIxx '. |
| 179 | Finally, the sequence |
| 180 | .RB ` \~\->\~ ' |
| 181 | is printed as |
| 182 | .RB ` \~\e\->\~ ' |
| 183 | so that symlink targets are presented unambiguously (see below). |
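The filename escaping rules in field 7 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the program's actual code: the function name escape_name is hypothetical, the name is treated as raw bytes, and the hex digits are assumed to be lower case and two characters wide.

```python
def escape_name(raw: bytes) -> str:
    """Escape a filename following the rules listed for field 7 (sketch)."""
    simple = {
        0x09: "\\t",   # tab
        0x0A: "\\n",   # linefeed
        0x0D: "\\r",   # carriage return
        0x27: "\\'",   # single quote
        0x5C: "\\\\",  # backslash
    }
    out = []
    i = 0
    while i < len(raw):
        # The symlink separator ' -> ' is escaped as ' \-> '.
        if raw[i:i + 4] == b" -> ":
            out.append(" \\-> ")
            i += 4
            continue
        b = raw[i]
        if b in simple:
            out.append(simple[b])
        elif b < 32 or b > 127:
            # Assumption: two lower-case hex digits per byte.
            out.append("\\x%02x" % b)
        else:
            out.append(chr(b))
        i += 1
    return "".join(out)
```

For example, a file literally named `a -> b` is written as `a \-> b`, so it cannot be mistaken for a symbolic link called `a` pointing at `b`.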
| 184 | .PP |
| 185 | For non-regular file objects, the first field is an information field |
| 186 | enclosed in square brackets, and some of the other fields provide other |
| 187 | information or are suppressed, as follows. |
| 188 | .TP |
| 189 | .I Errors |
| 190 | If there was an error reading the object's metadata then the information |
| 191 | field shows |
| 192 | .BI E nn |
| 193 | .IR message , |
| 194 | and the other fields, except the name, are printed as |
| 195 | .B error |
| 196 | rather than having any useful information. |
| 197 | .TP |
| 198 | .I Sockets |
| 199 | The information field shows |
| 200 | .BR socket . |
| 201 | .TP |
| 202 | .I Named pipes |
| 203 | The information field shows |
| 204 | .BR fifo . |
| 205 | .TP |
| 206 | .I Symbolic links |
| 207 | The information field shows |
| 208 | .BR symbolic-link . |
| 209 | The name is followed by |
| 210 | .RB ` \~\->\~ ' |
| 211 | and the link target (or |
| 212 | .BI <E nn \~ message > |
| 213 | if there was an error reading the link destination). |
| 214 | .TP |
| 215 | .I Directories |
| 216 | The information field shows |
| 217 | .BR directory , |
| 218 | and the size field shows |
| 219 | .B dir |
| 220 | (since directory sizes are not consistent across filesystem |
| 221 | implementations). The name is followed by |
| 222 | .RB ` / '. |
| 223 | .TP |
| 224 | .I Block and character devices |
| 225 | The information field shows |
| 226 | .B block-device |
| 227 | or |
| 228 | .BR character-device , |
| 229 | as appropriate, followed by the major and minor device numbers in |
| 230 | decimal, separated by a colon. |
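The classification of non-regular objects described above might be sketched in Python like this. The function name info_field is hypothetical. The sketch also assumes that nn in the error form is the errno value, and that a single space separates the device type from its major:minor numbers. The surrounding square brackets and the padding to hash width are left out.

```python
import os
import stat


def info_field(path):
    """Return the information-field text for path, or None for a regular file."""
    try:
        st = os.lstat(path)  # lstat, so symbolic links are not followed
    except OSError as e:
        # Assumption: nn is the errno value.
        return "E%d %s" % (e.errno, e.strerror)
    mode = st.st_mode
    if stat.S_ISREG(mode):
        return None
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISLNK(mode):
        return "symbolic-link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISBLK(mode) or stat.S_ISCHR(mode):
        kind = "block-device" if stat.S_ISBLK(mode) else "character-device"
        # Assumption: a space between the type and the major:minor pair.
        return "%s %d:%d" % (kind, os.major(st.st_rdev), os.minor(st.st_rdev))
    return "unknown"
```

Using os.lstat rather than os.stat matters here: a symbolic link must be reported as a link, not as whatever it points to.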
| 231 | .PP |
| 232 | .SH BUGS |
| 233 | No attempt is made to sort filenames read in |
| 234 | .B find0 |
| 235 | format, so they're not very likely to match digests produced any other |
| 236 | way. Indeed, they're not very likely to match digests produced from |
| 237 | .B find0 |
| 238 | input on other machines either. |
| 239 | .SH SEE ALSO |
| 240 | .BR find (1), |
| 241 | .BR rsync (1), |
| 242 | .BR sha256sum (1) |
| 243 | etc. |
| 244 | .SH AUTHOR |
| 245 | Mark Wooding, <mdw@distorted.org.uk> |