.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH fshash 1 "8 October 2012" rsync-backup
.SH SYNOPSIS
.B fshash
.RB [ \-a ]
.RB [ \-c
.IR cache ]
.RB [ \-f
.IR format ]
.RB [ \-C
.IR version ]
.RB [ \-H
.IR hash ]
.RI [ file
\&...]
.br
.B fshash
.B \-u
.B \-c
.I cache
.RB [ \-H
.IR hash ]
.RI [ dir ]
.SH DESCRIPTION
The
.B fshash
program generates digests of filesystems.  It's similar in concept to
(but somewhat different from) Ian Jackson's
.BR summer (1)
tool.
.PP
The idea is to capture everything interesting about a filesystem in a
file with the following properties:
.TP
.I Completeness
The digest file describes everything `interesting' about the filesystem,
such that two filesystems which are interestingly different will have
distinct digests.
.TP
.I Canonicalness
If two filesystems aren't different in any interesting way, then their
digests should be identical.
.TP
.I Readability
Given two subtly different filesystems, it's easy for a human equipped
with digests for them and
.BR diff (1)
to work out what the differences actually are.
.SS Command-line processing
The following command-line arguments are accepted.
.TP
.B \-h, \-\-help
Show a summary of the command-line syntax, and exit successfully.
.TP
.B \-\-version
Show the program's version number, and exit successfully.
.TP
.B \-a, \-\-all
Clear from the cache all information about files other than those
processed in this run.
.TP
.B \-c, \-\-cache=\fIfile
Keep a cache of file hashes in
.IR file .
The cache is keyed by inode and modification time: if a file already has
an entry in the cache then it won't be hashed again, which can
provide a valuable performance improvement on large filesystems.  If the
.I file
doesn't exist, then it will be created.
.TP
.B \-f, \-\-files=\fIformat
Read a list of filenames on standard input in the given
.I format
and write digest lines for them.  The
.I format
may be:
.B find0
for simple null-terminated names, as produced by
.BR "find \-print0" ;
or
.B rsync
for file data as produced by
.BR rsync (1).
The latter is useful because
.B rsync
has powerful file inclusion and exclusion capabilities, and a common
use case is generating a digest for a collection of files copied using
.BR rsync .
(The
.B find0
format doesn't work well: see
.B BUGS
below.)
.TP
.B \-C, \-\-compat=\fIversion
Produce a manifest with the given compatibility
.IR version .
Alas,
.B fshash
has had bugs in the way it produces manifests.  Fixing these bugs
improves the output, but the new manifests can't be compared with old
ones which were made with the bugs in place.  By default,
.B fshash
produces manifests in the most recent format, but this option will force
it to be compatible with old versions.  The original version was 1; all
later versions print a comment reporting the version number at the start
of the manifest.  The current version is 2.
.TP
.B \-H, \-\-hash=\fIhash
Use the
.I hash
function, which can be any hash function supported by Python's
.BR hashlib .
This option may be omitted: if it is, then the hash is read from the
cache file; if there is no cache file either, then an error is reported.
.TP
.B \-u, \-\-udiff
Rather than produce a manifest, read a unified
.BR diff (1)
from standard input, and clear from the cache all files mentioned as
being different.  Filenames in the diff are considered relative to
.IR dir ,
defaulting to the current working directory.
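.PP
As a rough illustration of how a
.I hash
name given to
.B \-H
can be turned into a file digest, here is a minimal Python sketch.
It is not taken from the
.B fshash
source: the function name
.B hash_file
and the chunk size are assumptions made for the example; only
.B hashlib.new
is the real library call.

```python
import hashlib


def hash_file(path, hash_name="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path` using the named hash.

    `hash_file` and `chunk_size` are hypothetical names for this sketch.
    """
    # hashlib.new raises ValueError if the name is not supported.
    h = hashlib.new(hash_name)
    with open(path, "rb") as f:
        # Read in chunks so that large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

The hexadecimal string returned corresponds to the first field of a
digest line for a regular file.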
.PP
Positional arguments are interpreted as files and directories to be
processed, in order.  A directory name which ends in
.RB ` / '
is treated specially:
.B fshash
writes filenames relative to the given directory.
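.PP
The caching behaviour described under
.B \-c
can be sketched as follows.
This is only an illustration under stated assumptions: the class
.B HashCache
is hypothetical, it keeps its entries in memory rather than in a cache
file, and including the device number and using nanosecond timestamps in
the key are guesses rather than documented behaviour.

```python
import os


class HashCache:
    """In-memory stand-in for the cache file (hypothetical layout)."""

    def __init__(self):
        self._entries = {}

    def lookup(self, path, compute):
        """Return the cached hash for `path`, calling `compute(path)` on a miss."""
        st = os.stat(path)
        # Keyed by inode and modification time, as the option text says;
        # the device number is an extra assumption to keep inodes distinct
        # across filesystems.
        key = (st.st_dev, st.st_ino, st.st_mtime_ns)
        if key not in self._entries:
            self._entries[key] = compute(path)
        return self._entries[key]
```

A file whose modification time changes gets a new key, so it is hashed
again on the next run; an unchanged file is never reread.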
.SS Output format
Information about each filesystem object is written on a separate line.
These lines can be quite long, and consist of a number of fields:
.hP 1.
For regular files, a cryptographic hash of the file's content, in
hexadecimal.  For other kinds of filesystem object, a description of the
object type and any special information about it, in square brackets,
and padded with spaces so as to take the same width as a hash; see
below for details.
.hP 2.
A `virtual inode identifier': a string which will be the same in two
lines if and only if they represent hard links to the same underlying
inode.  Some care is taken so that files are assigned the same
identifier even if other parts of the filesystem are different, so as to
avoid spurious differences.
.hP 3.
The object's permissions and mode bits, in octal.
.hP 4.
The file's owner and group, in decimal, separated by a colon.
.hP 5.
The file's last-modified time, in UTC, in ISO 8601 format, i.e.,
.IB yyyy \- mm \- dd T hh : mm : ss Z \fR.
.hP 6.
The file's size in bytes, in decimal.
.hP 7.
The file's name (relative to some appropriate parent directory).
Characters which
would cause ambiguity are escaped: tab, linefeed and carriage return are
printed as
.RB ` \et ',
.RB ` \en ',
and
.RB ` \er ',
respectively;
.RB ` ' '
is printed as
.RB ` \e' ';
.RB ` \e '
is printed as
.RB ` \e\e ';
and other codes outside the range 32\(en127 are printed as hex escapes,
in the form
.RB ` \ex\fIxx '.
Finally, the sequence
.RB ` \~\->\~ '
is printed as
.RB ` \~\e\->\~ '
so that symlink targets are presented unambiguously (see below).
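.PP
The timestamp and name fields described above can be sketched in Python
as follows.
This is a reading of the rules in this section, not the program's own
code: the function names
.B escape_name
and
.BR format_mtime ,
the treatment of names as byte strings, and the use of lower-case hex
digits are all assumptions.

```python
import re
from datetime import datetime, timezone

# Bytes with a dedicated escape: tab, linefeed, carriage return,
# single quote and backslash.
_SIMPLE = {0x09: "\\t", 0x0A: "\\n", 0x0D: "\\r", 0x27: "\\'", 0x5C: "\\\\"}


def escape_name(name: bytes) -> str:
    """Escape a filename following the rules listed for field 7."""
    out = []
    for byte in name:
        if byte in _SIMPLE:
            out.append(_SIMPLE[byte])
        elif 32 <= byte <= 127:
            out.append(chr(byte))          # printed as-is
        else:
            out.append("\\x%02x" % byte)   # hex escape; digit case is a guess
    escaped = "".join(out)
    # Put a backslash before any '->' that has a space on both sides, so
    # the ' -> ' separator used for symlink targets stays unambiguous.
    return re.sub(r"(?<= )->(?= )", lambda m: "\\->", escaped)


def format_mtime(seconds: float) -> str:
    """Format a modification time as yyyy-mm-ddThh:mm:ssZ in UTC."""
    stamp = datetime.fromtimestamp(int(seconds), timezone.utc)
    return stamp.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Backslashes are escaped before the arrow rule is applied, so a name that
already contains a backslash followed by an arrow cannot be mistaken for
an escaped separator.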
.PP
For non-regular file objects, the first field is an information field
enclosed in square brackets, and some of the other fields provide other
information or are suppressed, as follows.
.TP
.I Errors
If there was an error reading the object's metadata then the information
field shows
.BI E nn
.IR message ,
and the other fields, except the name, are printed as
.B error
rather than having any useful information.
.TP
.I Sockets
The information field shows
.BR socket .
.TP
.I Named pipes
The information field shows
.BR fifo .
.TP
.I Symbolic links
The information field shows
.BR symbolic-link .
The name is followed by
.RB ` \~\->\~ '
and the link target (or
.BI <E nn \~ message >
if there was an error reading the link destination).
.TP
.I Directories
The information field shows
.BR directory ,
and the size field shows
.B dir
(since directory sizes are not consistent across filesystem
implementations).  The name is followed by
.RB ` / '.
.TP
.I Block and character devices
The information field shows
.B block-device
or
.BR character-device ,
as appropriate, followed by the major and minor device numbers in
decimal, and separated by a colon.
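.PP
The choice of information field for the object types listed above can be
sketched with Python's
.B stat
module.
The function
.B info_field
is hypothetical; the single space before the device numbers is a guess,
and the padding to the width of a hash and the error case are left out.

```python
import os
import stat


def info_field(st):
    """Return the bracketed information field for a non-regular object.

    `st` is an os.stat_result, ideally from os.lstat so that symbolic
    links are not followed.  Returns None for regular files, which get a
    content hash instead.
    """
    mode = st.st_mode
    if stat.S_ISREG(mode):
        return None
    if stat.S_ISSOCK(mode):
        return "[socket]"
    if stat.S_ISFIFO(mode):
        return "[fifo]"
    if stat.S_ISLNK(mode):
        return "[symbolic-link]"
    if stat.S_ISDIR(mode):
        return "[directory]"
    if stat.S_ISBLK(mode) or stat.S_ISCHR(mode):
        kind = "block-device" if stat.S_ISBLK(mode) else "character-device"
        # Major and minor numbers in decimal, separated by a colon; the
        # space after the type name is an assumption.
        return "[%s %d:%d]" % (kind, os.major(st.st_rdev), os.minor(st.st_rdev))
    raise ValueError("unrecognised file type in mode %o" % mode)
```

Passing the result of
.B os.lstat
rather than
.B os.stat
matters here, since otherwise a symbolic link would be reported as
whatever it points to.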
.SH BUGS
No attempt is made to sort filenames read in
.B find0
format, so they're not very likely to match digests produced any other
way.  Indeed, they're not very likely to match digests produced by
.B find0
on other machines either.
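.PP
One possible workaround, which
.B fshash
does not perform itself, is to sort the null-terminated names before
they reach its standard input.
The sketch below does this in Python; the function name
.B sort_find0
and the choice of byte-wise ordering are assumptions.
Sorting makes runs over
.B find0
input repeatable across machines, but nothing here guarantees that the
resulting order matches the one used in other modes.

```python
def sort_find0(data: bytes) -> bytes:
    """Sort a block of null-terminated filenames byte-wise.

    `data` is the raw output of a command such as `find -print0`.
    Empty entries (for example after the trailing terminator) are dropped.
    """
    names = [name for name in data.split(b"\0") if name]
    names.sort()
    # Re-terminate every name so the result is still valid find0 input.
    return b"".join(name + b"\0" for name in names)
```

The sorted bytes can then be written to the standard input of a process
reading
.B find0
format.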
.SH SEE ALSO
.BR find (1),
.BR rsync (1),
.BR sha256sum (1)
etc.
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>