.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH fshash 1 "8 October 2012" rsync-backup
.SH NAME
fshash \- generate digests of filesystems
.SH SYNOPSIS
.B fshash
.RB [ \-a ]
.RB [ \-c
.IR cache ]
.RB [ \-f
.IR format ]
.RB [ \-H
.IR hash ]
.RI [ file
\&...]
.br
.B fshash
.RB \-u
.B \-c
.I cache
.RB [ \-H
.IR hash ]
.RI [ dir ]
.SH DESCRIPTION
The
.B fshash
program generates digests of filesystems.  It's similar in concept to
(but somewhat different from) Ian Jackson's
.BR summer (1)
tool.
.PP
The idea is to capture everything interesting about a filesystem in a
file with the following properties:
.TP
.I Completeness
The digest file describes everything `interesting' about the filesystem,
such that two filesystems which are interestingly different will have
distinct digests.
.TP
.I Canonicalness
If two filesystems aren't different in any interesting way, then their
digests should be identical.
.TP
.I Readability
Given two subtly different filesystems, it's easy for a human equipped
with digests for them and
.BR diff (1)
to work out what the differences actually are.
.SS Command-line processing
The following command-line arguments are accepted.
.TP
.B \-h, \-\-help
Show a summary of the command-line syntax, and exit successfully.
.TP
.B \-\-version
Show the program's version number, and exit successfully.
.TP
.B \-a, \-\-all
Clear the cache of information about all files except those processed in
this run.
.TP
.B \-c, \-\-cache=\fIfile
Keep a cache of file hashes in the
.IR file .
The cache is keyed by inode and modification time: if a file has an
entry in the cache already then it won't be hashed again, which can
provide a valuable performance improvement on large filesystems.  If the
.I file
doesn't exist, then it will be created.
.TP
.B \-f, \-\-files=\fIformat
Read a list of filenames on standard input in the given
.I format
and write digest lines for them.  The
.I format
may be:
.B find0
for simple null-terminated names, as produced by
.BR "find \-print0" ;
or
.B rsync
for file data as produced by
.BR rsync (1).
The latter is useful, since
.B rsync
has powerful file inclusion and exclusion capabilities \(en and a common
use case is generating a digest for a collection of files copied using
.BR rsync .
(The
.B find0
format doesn't work well: see
.B BUGS
below.)
.TP
.B \-H, \-\-hash=\fIhash
Use the
.I hash
function, which can be any hash function supported by Python's
.BR hashlib .
This option may be omitted: if it is, then the hash is read from the
cache file; if there is no cache file either, then an error is reported.
.TP
.B \-u, \-\-udiff
Rather than produce a manifest, read a unified
.BR diff (1)
from standard input, and clear from the cache all files mentioned as
being different.  Filenames in the diff are considered relative to
.IR dir ,
defaulting to the current working directory.
.PP
Positional arguments are interpreted as files and directories to be
processed, in order.  A directory name which ends in
.RB ` / '
is treated specially:
.B fshash
writes filenames relative to the given directory.
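.PP
As an illustration only (the cache path, directory and output filenames
below are hypothetical, and the digest is assumed to be written to
standard output), a digest of a directory tree, with names relative to
that directory, might be produced like this:
.PP
.RS
.nf
.ft B
fshash \-c /var/tmp/fshash.cache \-H sha256 /srv/data/ >new.fshash
.ft R
.fi
.RE
.PP
On the same assumptions, if that digest is later compared against an
older one, the unified diff could be fed back to
.B fshash
so that the files it mentions are cleared from the cache:
.PP
.RS
.nf
.ft B
diff \-u old.fshash new.fshash | fshash \-u \-c /var/tmp/fshash.cache /srv/data
.ft R
.fi
.RE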
.SS Output format
Information about each filesystem object is written on a separate line.
These lines can be quite long, and consist of a number of fields:
.hP 1.
For regular files, a cryptographic hash of the file's content, in
hexadecimal.  For other kinds of filesystem object, a description of the
object type and any special information about it, in square brackets,
and padded with spaces so as to take the same width as a hash; see
below for details.
.hP 2.
A `virtual inode identifier': a string which will be the same in two
lines if and only if they represent hard links to the same underlying
inode.  Some care is taken so that files are assigned the same
identifier even if other parts of the filesystem are different, so as to
avoid spurious differences.
.hP 3.
The object's permissions and mode bits, in octal.
.hP 4.
The file's owner and group IDs, in decimal, separated by a colon.
.hP 5.
The file's last-modified time, in UTC, in ISO 8601 format, i.e.,
.IB yyyy \- mm \- dd T hh : mm : ss Z \fR.
.hP 6.
The file's size in bytes, in decimal.
.hP 7.
The file's name (relative to some appropriate parent directory).
Characters which
would cause ambiguity are escaped: tab, linefeed and carriage return are
printed as
.RB ` \et ',
.RB ` \en ',
and
.RB ` \er ',
respectively;
.RB ` ' '
is printed as
.RB ` \e' ';
.RB ` \e '
is printed as
.RB ` \e\e ';
and other codes outside the range 32\(en127 are printed as hex escapes,
in the form
.RB ` \ex\fIxx '.
Finally, the sequence
.RB ` \~\->\~ '
is printed as
.RB ` \~\e\->\~ '
so that symlink targets are presented unambiguously (see below).
.PP
For non-regular file objects, the first field is an information field
enclosed in square brackets, and some of the other fields provide other
information or are suppressed, as follows.
.TP
.I Errors
If there was an error reading the object's metadata then the information
field shows
.BI E nn
.IR message ,
and the other fields, except the name, are printed as
.B error
rather than having any useful information.
.TP
.I Sockets
The information field shows
.BR socket .
.TP
.I Named pipes
The information field shows
.BR fifo .
.TP
.I Symbolic links
The information field shows
.BR symbolic-link .
The name is followed by
.RB ` \~\->\~ '
and the link target (or
.BI <E nn \~ message >
if there was an error reading the link destination).
.TP
.I Directories
The information field shows
.BR directory ,
and the size field shows
.B dir
(since directory sizes are not consistent across filesystem
implementations).  The name is followed by
.RB ` / '.
.TP
.I Block and character devices
The information field shows
.B block-device
or
.BR character-device ,
as appropriate, followed by the major and minor device numbers in
decimal, separated by a colon.
.SH BUGS
No attempt is made to sort filenames read in
.B find0
format, so the resulting digests are unlikely to match digests produced
any other way.  Indeed, they're unlikely to match digests produced from
.B find0
input on other machines either.
.SH SEE ALSO
.BR find (1),
.BR rsync (1),
.BR sha256sum (1)
etc.
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>