.\" rsync-backup: fshash.1 (Release 1.1.2.)
.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH fshash 1 "8 October 2012" rsync-backup
.SH SYNOPSIS
.B fshash
.RB [ \-a ]
.RB [ \-c
.IR cache ]
.RB [ \-f
.IR format ]
.RB [ \-C
.IR version ]
.RB [ \-H
.IR hash ]
.RI [ file
\&...]
.br
.B fshash
.RB \-u
.B \-c
.I cache
.RB [ \-H
.IR hash ]
.RI [ dir ]
.SH DESCRIPTION
The
.B fshash
program generates digests of filesystems. It's similar in concept to
(but somewhat different from) Ian Jackson's
.BR summer (1)
tool.
.PP
The idea is to capture everything interesting about a filesystem in a
file with the following properties:
.TP
.I Completeness
The digest file describes everything `interesting' about the filesystem,
such that two filesystems which are interestingly different will have
distinct digests.
.TP
.I Canonicalness
If two filesystems aren't different in any interesting way, then their
digests should be identical.
.TP
.I Readability
Given two subtly different filesystems, it's easy for a human equipped
with digests for them and
.BR diff (1)
to work out what the differences actually are.
.SS Command-line processing
The following command-line arguments are accepted.
.TP
.B \-h, \-\-help
Show a summary of the command-line syntax, and exit successfully.
.TP
.B \-\-version
Show the program's version number, and exit successfully.
.TP
.B \-a, \-\-all
Clear the cache of information about all files except those processed in
this run.
.TP
.B \-c, \-\-cache=\fIfile
Keep a cache of file hashes in
.IR file .
The cache is keyed by inode and modification time: if a file has an
entry in the cache already then it won't be hashed again, which can
provide a valuable performance improvement on large filesystems. If the
.I file
doesn't exist, then it will be created.
.TP
.B \-f, \-\-files=\fIformat
Read a list of filenames on standard input in the given
.I format
and write digest lines for them. The
.I format
may be:
.B find0
for simple null-terminated names, as produced by
.BR "find \-print0" ;
or
.B rsync
for file data as produced by
.BR rsync (1).
The latter is useful, since
.B rsync
has powerful file inclusion and exclusion capabilities \(en and a common
use case is generating a digest for a collection of files copied using
.BR rsync .
(The
.B find0
format doesn't work well: see
.B BUGS
below.)
.TP
.B \-C, \-\-compat=\fIversion
Produce a manifest with the given compatibility
.IR version .
Alas, earlier versions of
.B fshash
had bugs in the way they produced manifests. Fixing the bugs makes the
output better, but new manifests can't be compared with old ones which
were made with the bugs. By default,
.B fshash
produces manifests in the most recent format, but this option will force
it to be compatible with old versions. The original version was 1; all
later versions print a comment reporting the version number at the start
of the manifest. The current version is 2.
.TP
.B \-H, \-\-hash=\fIhash
Use the
.I hash
function, which can be any hash function supported by Python's
.BR hashlib .
This option may be omitted: if it is, then the hash is read from the
cache file; if there is no cache file either, then an error is reported.
.TP
.B \-u, \-\-udiff
Rather than produce a manifest, read a unified
.BR diff (1)
from standard input, and clear from the cache all files mentioned as
being different. Filenames in the diff are considered relative to
.IR dir ,
defaulting to the current working directory.
.PP
Positional arguments are interpreted as files and directories to be
processed, in order. A directory name which ends in
.RB ` / '
is treated specially:
.B fshash
writes filenames relative to the given directory.
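.PP
The caching behaviour described under
.B \-c
can be pictured with the following minimal Python sketch. It is an
illustration only, not the program's own code: the function name, the
in-memory dictionary standing in for the cache file, and the inclusion
of the device number in the key are all assumptions.
.PP
.nf
```python
import hashlib
import os

def cached_hash(path, cache, algorithm="sha256"):
    """Return the hex digest of path, hashing only on a cache miss."""
    st = os.stat(path)
    # Assumed key: inode and modification time, plus the device number
    # so that inodes on different filesystems cannot collide.
    key = (st.st_dev, st.st_ino, st.st_mtime_ns)
    if key not in cache:
        h = hashlib.new(algorithm)
        with open(path, "rb") as f:
            # Read in chunks so large files need not fit in memory.
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        cache[key] = h.hexdigest()
    return cache[key]
```
.fi
.PP
A second call for an unchanged file finds the key already present and
returns the stored digest without reading the file again.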
.SS Output format
Information about each filesystem object is written on a separate line.
These lines can be quite long, and consist of a number of fields:
.hP 1.
For regular files, a cryptographic hash of the file's content, in
hexadecimal. For other kinds of filesystem object, a description of the
object type and any special information about it, in square brackets,
and padded with spaces so as to take the same width as a hash; see
below for details.
.hP 2.
A `virtual inode identifier': a string which will be the same in two
lines if and only if they represent hard links to the same underlying
inode. Some care is taken so that files are assigned the same
identifier even if other parts of the filesystem are different, so as to
avoid spurious differences.
.hP 3.
The object's permissions and mode bits, in octal.
.hP 4.
The file's owner and group, in decimal, separated by a colon.
.hP 5.
The file's last-modified time, in UTC, in ISO\~8601 format, i.e.,
.IB yyyy \- mm \- dd T hh : mm : ss Z \fR.
.hP 6.
The file's size in bytes, in decimal.
.hP 7.
The file's name (relative to some appropriate parent directory).
Characters which
would cause ambiguity are escaped: tab, linefeed and carriage return are
printed as
.RB ` \et ',
.RB ` \en ',
and
.RB ` \er ',
respectively;
.RB ` ' '
is printed as
.RB ` \e' ';
.RB ` \e '
is printed as
.RB ` \e\e ';
and any other character code outside the range 32\(en127 is printed as
a hex escape, in the form
.RB ` \ex\fIxx '.
Finally, the sequence
.RB ` \~\->\~ '
is printed as
.RB ` \~\e\->\~ '
so that symlink targets are presented unambiguously (see below).
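.PP
The timestamp and filename rules in fields 5 and 7 can be sketched in
Python as follows. This is a hedged illustration rather than the
program's own code: the function names are hypothetical, the name is
assumed to arrive as bytes, code 127 is passed through because the text
gives the range as 32\(en127, and lowercase hex digits are a guess.
.PP
.nf
```python
import time

BACKSLASH = chr(92)

# Characters with dedicated escapes: tab, linefeed, carriage return,
# single quote, and backslash itself.
SIMPLE = {
    9: BACKSLASH + "t",
    10: BACKSLASH + "n",
    13: BACKSLASH + "r",
    39: BACKSLASH + chr(39),
    92: BACKSLASH + BACKSLASH,
}

def escape_name(raw):
    """Escape a filename, given as bytes, using the rules for field 7."""
    out = []
    for code in raw:
        if code in SIMPLE:
            out.append(SIMPLE[code])
        elif 32 <= code <= 127:
            out.append(chr(code))
        else:
            # Two hex digits; lowercase is an assumption.
            out.append(BACKSLASH + "x%02x" % code)
    # Mark a literal arrow so it cannot be read as a symlink target.
    return "".join(out).replace(" -> ", " " + BACKSLASH + "-> ")

def format_mtime(mtime):
    """Render seconds since the epoch as yyyy-mm-ddThh:mm:ssZ in UTC."""
    return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(mtime))
```
.fi
.PP
The arrow is rewritten last; this is safe because the earlier escapes
never produce or alter spaces, hyphens or angle brackets.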
.PP
For non-regular file objects, the first field is an information field
enclosed in square brackets, and some of the other fields provide other
information or are suppressed, as follows.
.TP
.I Errors
If there was an error reading the object's metadata then the information
field shows
.BI E nn
.IR message ,
and the other fields, except the name, are printed as
.B error
rather than having any useful information.
.TP
.I Sockets
The information field shows
.BR socket .
.TP
.I Named pipes
The information field shows
.BR fifo .
.TP
.I Symbolic links
The information field shows
.BR symbolic-link .
The name is followed by
.RB ` \~\->\~ '
and the link target (or
.BI <E nn \~ message >
if there was an error reading the link destination).
.TP
.I Directories
The information field shows
.BR directory ,
and the size field shows
.B dir
(since directory sizes are not consistent across filesystem
implementations). The name is followed by
.RB ` / '.
.TP
.I Block and character devices
The information field shows
.B block-device
or
.BR character-device ,
as appropriate, followed by the major and minor device numbers in
decimal, separated by a colon.
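.PP
As a rough illustration of how the information field for non-regular
objects might be derived from an
.BR lstat (2)
result, consider the Python sketch below. The function name is
hypothetical, the single space before the device numbers is an
assumption, and the padding to hash width and the error case are left
out.
.PP
.nf
```python
import os
import stat

def info_field(st):
    """Return the bracketed field for a non-regular object, else None."""
    mode = st.st_mode
    if stat.S_ISSOCK(mode):
        text = "socket"
    elif stat.S_ISFIFO(mode):
        text = "fifo"
    elif stat.S_ISLNK(mode):
        text = "symbolic-link"
    elif stat.S_ISDIR(mode):
        text = "directory"
    elif stat.S_ISBLK(mode) or stat.S_ISCHR(mode):
        if stat.S_ISBLK(mode):
            kind = "block-device"
        else:
            kind = "character-device"
        # Assumed layout: one space, then major:minor in decimal.
        major = os.major(st.st_rdev)
        minor = os.minor(st.st_rdev)
        text = "%s %d:%d" % (kind, major, minor)
    else:
        # Regular files carry a content hash instead.
        return None
    return "[" + text + "]"
```
.fi
.PP
The argument should come from
.B os.lstat
rather than
.BR os.stat ,
so that symbolic links are reported as links and not as their targets.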
.SH BUGS
No attempt is made to sort filenames read in
.B find0
format, so they're not very likely to match digests produced any other
way. Indeed, they're not very likely to match digests produced by
.B find0
on other machines either.
.SH SEE ALSO
.BR find (1),
.BR rsync (1),
.BR sha256sum (1)
etc.
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>