.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH fshash 1 "8 October 2012" rsync-backup
.SH NAME
fshash \- generate digests of filesystems
.SH SYNOPSIS
.B fshash
.RB [ \-a ]
.RB [ \-c
.IR cache ]
.RB [ \-f
.IR format ]
.RB [ \-H
.IR hash ]
.RI [ file
\&...]
.SH DESCRIPTION
The
.B fshash
program generates digests of filesystems. It's similar in concept to
(but somewhat different from) Ian Jackson's
.BR summer (1)
tool.
.PP
The idea is to capture everything interesting about a filesystem in a
file with the following properties:
.TP
.I Completeness
The digest file describes everything `interesting' about the filesystem,
such that two filesystems which are interestingly different will have
distinct digests.
.TP
.I Canonicalness
If two filesystems aren't different in any interesting way, then their
digests should be identical.
.TP
.I Readability
Given two subtly different filesystems, it's easy for a human equipped
with digests for them and
.BR diff (1)
to work out what the differences actually are.
.SS Command-line processing
The following command-line arguments are accepted.
.TP
.B \-h, \-\-help
Show a summary of the command-line syntax, and exit successfully.
.TP
.B \-\-version
Show the program's version number, and exit successfully.
.TP
.B \-a, \-\-all
Clear the cache of information about all files except those processed in
this run.
.TP
.B \-c, \-\-cache=\fIfile
Keep a cache of file hashes in
.IR file .
The cache is keyed by inode and modification time: if a file already has
an entry in the cache then it won't be hashed again, which can
provide a valuable performance improvement on large filesystems. If
.I file
doesn't exist, then it will be created.
.TP
.B \-f, \-\-files=\fIformat
Read a list of filenames on standard input in the given
.I format
and write digest lines for them. The
.I format
may be:
.B find0
for simple null-terminated names, as produced by
.BR "find \-print0" ;
or
.B rsync
for file data as produced by
.BR rsync (1).
The latter is useful, since
.B rsync
has powerful file inclusion and exclusion capabilities \(en and a common
use case is generating a digest for a collection of files copied using
.BR rsync .
(The
.B find0
format doesn't work well: see
.B BUGS
below.)
.TP
.B \-H, \-\-hash=\fIhash
Use the
.I hash
function, which can be any hash function supported by Python's
.BR hashlib .
If this option is omitted then the hash function is read from the cache
file; if there is no cache file either, then an error is reported.
.PP
Positional arguments are interpreted as files and directories to be
processed, in order. A directory name which ends in
.RB ` / '
is treated specially:
.B fshash
writes filenames relative to the given directory.
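.PP
As a rough illustration of the caching idea described under
.BR \-c ,
the following Python sketch memoizes content hashes under a key built
from the inode number and modification time. It is not
.BR fshash 's
actual code: the names
.B hash_file
and
.BR cached_hash ,
the in-memory dictionary standing in for the cache file, and the
inclusion of the device number in the key are all assumptions made for
the example.
.PP
.nf
.eo
```python
import hashlib
import os

def hash_file(path, algorithm="sha256", blocksize=1 << 16):
    """Return the hex digest of a file's content, read in blocks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()

def cached_hash(path, cache, algorithm="sha256"):
    """Look up or compute a file's hash, keyed by inode and mtime.

    `cache` is a plain dict here (an assumption); a real cache would
    be stored persistently in the cache file.
    """
    st = os.stat(path)
    key = (st.st_dev, st.st_ino, st.st_mtime_ns)
    if key not in cache:
        # Only hash the content when no matching entry exists.
        cache[key] = hash_file(path, algorithm)
    return cache[key]
```
.ec
.fi
.PP
Because the modification time is part of the key, a file that is
rewritten gets a new key and is hashed afresh, while untouched files
are served from the cache.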
.SS Output format
Information about each filesystem object is written on a separate line.
These lines can be quite long, and consist of a number of fields:
.hP 1.
For regular files, a cryptographic hash of the file's content, in
hexadecimal. For other kinds of filesystem object, a description of the
object type and any special information about it, in square brackets,
and padded with spaces so as to take the same width as a hash; see
below for details.
.hP 2.
A `virtual inode identifier': a string which will be the same in two
lines if and only if they represent hard links to the same underlying
inode. Some care is taken so that files are assigned the same
identifier even if other parts of the filesystem are different, so as to
avoid spurious differences.
.hP 3.
The object's permissions and mode bits, in octal.
.hP 4.
The file's owner and group, in decimal, separated by a colon.
.hP 5.
The file's last-modified time, in UTC, in ISO\~8601 format, i.e.,
.IB yyyy \- mm \- dd T hh : mm : ss Z \fR.
.hP 6.
The file's size in bytes, in decimal.
.hP 7.
The file's name (relative to some appropriate parent directory).
Characters which would cause ambiguity are escaped: tab, linefeed and
carriage return are printed as
.RB ` \et ',
.RB ` \en ',
and
.RB ` \er ',
respectively;
.RB ` ' '
is printed as
.RB ` \e' ';
.RB ` \e '
is printed as
.RB ` \e\e ';
and other codes outside the range 32\(en127 are printed as hexadecimal
escapes, in the form
.RB ` \ex\fIxx '.
Finally, the sequence
.RB ` \~\->\~ '
is printed as
.RB ` \~\e\->\~ '
so that symlink targets are presented unambiguously (see below).
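.PP
The escaping rules for the name field can be sketched in Python as
follows. This is an illustration only, not the program's own code: the
names
.B SIMPLE_ESCAPES
and
.B escape_name
are hypothetical, and working byte-by-byte on the encoded name and
using two lowercase hex digits are assumptions.
.PP
.nf
.eo
```python
import os
import re

# Single-character escapes listed in the text above.
SIMPLE_ESCAPES = {
    ord("\t"): "\\t",
    ord("\n"): "\\n",
    ord("\r"): "\\r",
    ord("'"): "\\'",
    ord("\\"): "\\\\",
}

def escape_name(name):
    """Escape a filename following the rules described above."""
    out = []
    for code in os.fsencode(name):
        if code in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[code])
        elif 32 <= code <= 127:
            # Codes inside the stated range are printed unchanged.
            out.append(chr(code))
        else:
            # Assumption: two lowercase hex digits per byte.
            out.append("\\x%02x" % code)
    text = "".join(out)
    # Escape the arrow so it cannot be mistaken for a symlink marker.
    return re.sub(r" ->(?= )", r" \\->", text)
```
.ec
.fi
.PP
Escaping the arrow last is safe here because none of the earlier
escapes can produce a space followed by a hyphen and a greater-than
sign.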
.PP
For objects other than regular files, the first field is an information
field enclosed in square brackets, and some of the other fields provide
other information or are suppressed, as follows.
.TP
.I Errors
If there was an error reading the object's metadata then the information
field shows
.BI E nn
.IR message ,
and the other fields, except the name, are printed as
.B error
rather than having any useful information.
.TP
.I Sockets
The information field shows
.BR socket .
.TP
.I Named pipes
The information field shows
.BR fifo .
.TP
.I Symbolic links
The information field shows
.BR symbolic-link .
The name is followed by
.RB ` \~\->\~ '
and the link target (or by
.BI <E nn \~ message >
if there was an error reading the link destination).
.TP
.I Directories
The information field shows
.BR directory ,
and the size field shows
.B dir
(since directory sizes are not consistent across filesystem
implementations). The name is followed by
.RB ` / '.
.TP
.I Block and character devices
The information field shows
.B block-device
or
.BR character-device ,
as appropriate, followed by the major and minor device numbers in
decimal, separated by a colon.
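.PP
The mapping from object type to information field described above
might be sketched in Python like this. The function name
.B info_field
is hypothetical, and several details are assumptions rather than
documented behaviour: that
.I nn
is the
.B errno
value, that a single space separates a device type from its numbers,
and the placeholder returned for any other object type. Padding to the
width of a hash is omitted.
.PP
.nf
.eo
```python
import os
import stat

def info_field(path):
    """Return the bracketed information field for a non-regular object,
    or None for a regular file (whose first field is a content hash)."""
    try:
        st = os.lstat(path)
    except OSError as exc:
        # Assumption: nn is the errno and message its description.
        return "[E%d %s]" % (exc.errno, exc.strerror)
    mode = st.st_mode
    if stat.S_ISREG(mode):
        return None
    if stat.S_ISDIR(mode):
        return "[directory]"
    if stat.S_ISLNK(mode):
        return "[symbolic-link]"
    if stat.S_ISFIFO(mode):
        return "[fifo]"
    if stat.S_ISSOCK(mode):
        return "[socket]"
    if stat.S_ISBLK(mode) or stat.S_ISCHR(mode):
        kind = "block-device" if stat.S_ISBLK(mode) else "character-device"
        # Assumption: one space separates the type from major:minor.
        numbers = (os.major(st.st_rdev), os.minor(st.st_rdev))
        return "[%s %d:%d]" % ((kind,) + numbers)
    return "[unknown]"  # not described in the text; placeholder only
```
.ec
.fi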
.SH BUGS
No attempt is made to sort filenames read in
.B find0
format, so the resulting digests are unlikely to match digests produced
any other way. Indeed, they're unlikely to match digests produced from
.B find0
input on other machines either.
.SH SEE ALSO
.BR find (1),
.BR rsync (1),
.BR sha256sum (1)
etc.
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>