.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH fshash 1 "8 October 2012" rsync-backup
.SH NAME
fshash \- generate digests of filesystems
.SH SYNOPSIS
.B fshash
.RB [ \-a ]
.RB [ \-c
.IR cache ]
.RB [ \-f
.IR format ]
.RB [ \-H
.IR hash ]
.RI [ file
\&...]
.br
.B fshash
.RB \-u
.B \-c
.I cache
.RB [ \-H
.IR hash ]
.RI [ dir ]
.SH DESCRIPTION
The
.B fshash
program generates digests of filesystems.  It's similar in concept to
(but somewhat different from) Ian Jackson's
.BR summer (1)
tool.
.PP
The idea is to capture everything interesting about a filesystem in a
file with the following properties:
.TP
.I Completeness
The digest file describes everything `interesting' about the filesystem,
such that two filesystems which are interestingly different will have
distinct digests.
.TP
.I Canonicalness
If two filesystems aren't different in any interesting way, then their
digests should be identical.
.TP
.I Readability
Given two subtly different filesystems, it's easy for a human equipped
with digests for them and
.BR diff (1)
to work out what the differences actually are.
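.PP
As a sketch of the intended workflow, the following compares two trees
by digest.  The mount points and output file names are hypothetical, and
.B sha256
is just one of the hashes offered by Python's
.BR hashlib .
.PP
.RS
.nf
# Hypothetical paths; a trailing `/' gives names relative to each tree.
fshash \-H sha256 /mnt/old/ >old.digest
fshash \-H sha256 /mnt/new/ >new.digest
# Each differing line names one filesystem object.
diff \-u old.digest new.digest
.fi
.RE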
.SS Command-line processing
The following command-line arguments are accepted.
.TP
.B \-h, \-\-help
Show a summary of the command-line syntax, and exit successfully.
.TP
.B \-\-version
Show the program's version number, and exit successfully.
.TP
.B \-a, \-\-all
Clear the cache of information about all files except those processed in
this run.
.TP
.B \-c, \-\-cache=\fIfile
Keep a cache of file hashes in
.IR file .
The cache is keyed by inode and modification time: if a file has an
entry in the cache already then it won't be hashed again, which can
provide a valuable performance improvement on large filesystems.  If the
.I file
doesn't exist, then it will be created.
.TP
.B \-f, \-\-files=\fIformat
Read a list of filenames on standard input in the given
.I format
and write digest lines for them.  The
.I format
may be:
.B find0
for simple null-terminated names, as produced by
.BR "find \-print0" ;
or
.B rsync
for file data as produced by
.BR rsync (1).
The latter is useful, since
.B rsync
has powerful file inclusion and exclusion capabilities, and a common
use case is generating a digest for a collection of files copied using
.BR rsync .
(The
.B find0
format doesn't work well: see
.B BUGS
below.)
.TP
.B \-H, \-\-hash=\fIhash
Use the
.I hash
function, which can be any hash function supported by Python's
.BR hashlib .
This option may be omitted: if it is, then the hash is read from the
cache file; if there is no cache file either, then an error is reported.
.TP
.B \-u, \-\-udiff
Rather than produce a digest, read a unified
.BR diff (1)
from standard input, and clear from the cache all files mentioned as
being different.  Filenames in the diff are considered relative to
.IR dir ,
defaulting to the current working directory.
.PP
Positional arguments are interpreted as files and directories to be
processed, in order.  A directory name which ends in
.RB ` / '
is treated specially:
.B fshash
writes filenames relative to the given directory.
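.PP
For instance, a cached run followed by a cache invalidation might look
like the sketch below.  The cache path, directory and digest file names
are hypothetical, and the sketch assumes that the diff fed to
.B \-u
is one taken between two digest files.
.PP
.RS
.nf
# First run: the cache file is created if it doesn't exist.
fshash \-c /var/cache/fshash.db \-H sha256 /srv/data/ >data.digest
# Later runs may omit \-H: the hash is read from the cache.
fshash \-c /var/cache/fshash.db /srv/data/ >data.digest.new
# Forget cached entries for files on which two digests disagree.
diff \-u expected.digest data.digest.new |
  fshash \-u \-c /var/cache/fshash.db /srv/data
.fi
.RE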
.SS Output format
Information about each filesystem object is written on a separate line.
These lines can be quite long, and consist of a number of fields:
.hP 1.
For regular files, a cryptographic hash of the file's content, in
hexadecimal.  For other kinds of filesystem object, a description of the
object type and any special information about it, in square brackets,
and padded with spaces so as to take the same width as a hash; see
below for details.
.hP 2.
A `virtual inode identifier': a string which will be the same in two
lines if and only if they represent hard links to the same underlying
inode.  Some care is taken so that files are assigned the same
identifier even if other parts of the filesystem are different, so as to
avoid spurious differences.
.hP 3.
The object's permissions and mode bits, in octal.
.hP 4.
The file's owner and group, in decimal, separated by a colon.
.hP 5.
The file's last-modified time, in UTC, in ISO\ 8601 format, i.e.,
.IB yyyy \- mm \- dd T hh : mm : ss Z \fR.
.hP 6.
The file's size in bytes, in decimal.
.hP 7.
The file's name (relative to some appropriate parent directory).
Characters which
would cause ambiguity are escaped: tab, linefeed and carriage return are
printed as
.RB ` \et ',
.RB ` \en ',
and
.RB ` \er ',
respectively;
.RB ` ' '
is printed as
.RB ` \e' ';
.RB ` \e '
is printed as
.RB ` \e\e ';
and other codes outside the range 32\(en127 are printed as hex escapes,
in the form
.RB ` \ex\fIxx '.
Finally, the sequence
.RB ` \~\->\~ '
is printed as
.RB ` \~\e\->\~ '
so that symlink targets are presented unambiguously (see below).
.PP
For non-regular file objects, the first field is an information field
enclosed in square brackets, and some of the other fields provide other
information or are suppressed, as follows.
.TP
.I Errors
If there was an error reading the object's metadata then the information
field shows
.BI E nn
.IR message ,
and the other fields, except the name, are printed as
.B error
rather than having any useful information.
.TP
.I Sockets
The information field shows
.BR socket .
.TP
.I Named pipes
The information field shows
.BR fifo .
.TP
.I Symbolic links
The information field shows
.BR symbolic-link .
The name is followed by
.RB ` \~\->\~ '
and the link target (or
.BI <E nn \~ message >
if there was an error reading the link destination).
.TP
.I Directories
The information field shows
.BR directory ,
and the size field shows
.B dir
(since directory sizes are not consistent across filesystem
implementations).  The name is followed by
.RB ` / '.
.TP
.I Block and character devices
The information field shows
.B block-device
or
.BR character-device ,
as appropriate, followed by the major and minor device numbers in
decimal, separated by a colon.
.SH BUGS
No attempt is made to sort filenames read in
.B find0
format, so the resulting digests are unlikely to match digests produced
any other way.  Indeed, they are unlikely to match digests produced from
.B find0
input on other machines either.
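.PP
One possible workaround, offered as a sketch rather than a fix, is to
sort the name list in a fixed locale before handing it to
.BR fshash .
This assumes a
.BR sort (1)
which supports
.B \-z
(as GNU sort does); the path is hypothetical.
.PP
.RS
.nf
# Sort the null-terminated names byte-wise, whatever the user's locale.
cd /srv/data &&
  find . \-print0 | LC_ALL=C sort \-z | fshash \-f find0 \-H sha256
.fi
.RE
.PP
This should make the order reproducible between machines, but it need
not agree with the order
.B fshash
uses when it walks a directory itself.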
.SH SEE ALSO
.BR find (1),
.BR rsync (1),
.BR sha256sum (1)
etc.
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>