.ie t .ds o \(bu
.el .ds o o
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.TH fshash 1 "8 October 2012" rsync-backup
.SH SYNOPSIS
.B fshash
.RB [ \-a ]
.RB [ \-c
.IR cache ]
.RB [ \-f
.IR format ]
.RB [ \-H
.IR hash ]
.RI [ file
\&...]
.br
.B fshash
.RB \-u
.B \-c
.I cache
.RB [ \-H
.IR hash ]
.RI [ dir ]
.SH DESCRIPTION
The
.B fshash
program generates digests of filesystems. It's similar in concept to
(but somewhat different from) Ian Jackson's
.BR summer (1)
tool.
.PP
The idea is to capture everything interesting about a filesystem in a
file with the following properties:
.TP
.I Completeness
The digest file describes everything `interesting' about the filesystem,
such that two filesystems which are interestingly different will have
distinct digests.
.TP
.I Canonicalness
If two filesystems aren't different in any interesting way, then their
digests should be identical.
.TP
.I Readability
Given two subtly different filesystems, it's easy for a human equipped
with digests for them and
.BR diff (1)
to work out what the differences actually are.
.SS Command-line processing
The following command-line arguments are accepted.
.TP
.B \-h, \-\-help
Show a summary of the command-line syntax, and exit successfully.
.TP
.B \-\-version
Show the program's version number, and exit successfully.
.TP
.B \-a, \-\-all
Clear the cache of information about all files except those processed in
this run.
.TP
.B \-c, \-\-cache=\fIfile
Keep a cache of file hashes in the
.IR file .
The cache is keyed by inode and modification time: if a file has an
entry in the cache already then it won't be hashed again, which can
provide a valuable performance improvement on large filesystems. If the
.I file
doesn't exist, then it will be created.
.TP
.B \-f, \-\-files=\fIformat
Read a list of filenames on standard input in the given
.I format
and write digest lines for them. The
.I format
may be:
.B find0
for simple null-terminated names, as produced by
.BR "find \-print0" ;
or
.B rsync
for file data as produced by
.BR rsync (1).
The latter is useful, since
.B rsync
has powerful file inclusion and exclusion capabilities \(en and a common
use case is generating a digest for a collection of files copied using
.BR rsync .
(The
.B find0
format doesn't work well: see
.B BUGS
below.)
.TP
.B \-H, \-\-hash=\fIhash
Use the
.I hash
function, which can be any hash function supported by Python's
.BR hashlib .
This option may be omitted: if it is, then the hash is read from the
cache file; if there is no cache file either, then an error is reported.
.TP
.B \-u, \-\-udiff
Rather than produce a manifest, read a unified
.BR diff (1)
from standard input, and clear from the cache all files mentioned as
being different. Filenames in the diff are considered relative to
.IR dir ,
defaulting to the current working directory.
.PP
Positional arguments are interpreted as files and directories to be
processed, in order. A directory name which ends in
.RB ` / '
is treated specially:
.B fshash
writes filenames relative to the given directory.
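.PP
As a rough illustration of how the
.B \-c
and
.B \-H
options fit together, the following Python sketch hashes a file with a
named
.B hashlib
function and skips the work when an entry keyed by inode and
modification time is already present. It is not taken from the
.B fshash
source: the function name
.BR file_digest ,
the in-memory dictionary standing in for the cache file, and the chunk
size are all assumptions made for the example.
.PP
.nf
```python
import hashlib
import os


def file_digest(path, hash_name="sha256", cache=None):
    """Return the hex digest of a file, consulting a cache if given.

    The cache is a plain dict keyed by (inode, mtime); this stands in
    for the on-disk cache file, whose real layout is not shown here.
    """
    st = os.stat(path)
    key = (st.st_ino, st.st_mtime)
    if cache is not None and key in cache:
        # Same inode and modification time: reuse the stored hash.
        return cache[key]
    h = hashlib.new(hash_name)
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large files need little memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    digest = h.hexdigest()
    if cache is not None:
        cache[key] = digest
    return digest
```
.fi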
.SS Output format
Information about each filesystem object is written on a separate line.
These lines can be quite long, and consist of a number of fields:
.hP 1.
For regular files, a cryptographic hash of the file's content, in
hexadecimal. For other kinds of filesystem object, a description of the
object type and any special information about it, in square brackets,
and padded with spaces so as to take the same width as a hash; see
below for details.
.hP 2.
A `virtual inode identifier': a string which will be the same in two
lines if and only if they represent hard links to the same underlying
inode. Some care is taken so that files are assigned the same
identifier even if other parts of the filesystem are different, so as to
avoid spurious differences.
.hP 3.
The object's permissions and mode bits, in octal.
.hP 4.
The file's owner and group, in decimal, separated by a colon.
.hP 5.
The file's last-modified time, in UTC, in ISO8601 format, i.e.,
.IB yyyy \(en mm \(en dd T hh : mm : ss Z \fR.
.hP 6.
The file's size in bytes, in decimal.
.hP 7.
The file's name (relative to some appropriate parent directory).
Characters which
would cause ambiguity are escaped: tab, linefeed and carriage return are
printed as
.RB ` \et ',
.RB ` \en ',
and
.RB ` \er ',
respectively;
.RB ` ' '
is printed as
.RB ` \e' ';
.RB ` \e '
is printed as
.RB ` \e\e ';
and other codes outside the range 32\(en127 are printed as hex escapes,
in the form
.RB ` \ex\fIxx '.
Finally, the sequence
.RB ` \~\->\~ '
is printed as
.RB ` \~\e\->\~ '
so that symlink targets are presented unambiguously (see below).
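.PP
Fields 3 to 6 can all be derived from a single
.BR lstat (2)
result. The Python sketch below shows one way to format them as
described above. It is an illustration only: the function name
.B metadata_fields
is hypothetical, the choice to report only the permission bits via
.B stat.S_IMODE
is an assumption, and the values are returned as a tuple because the
separator between fields is not specified here.
.PP
.nf
```python
import os
import stat
import time


def metadata_fields(path):
    """Return (mode, owner, mtime, size) strings for one object."""
    st = os.lstat(path)  # do not follow symbolic links
    # Field 3: permission bits in octal (assumed to exclude type bits).
    mode = "%o" % stat.S_IMODE(st.st_mode)
    # Field 4: owner and group in decimal, separated by a colon.
    owner = "%d:%d" % (st.st_uid, st.st_gid)
    # Field 5: last-modified time in UTC, as yyyy-mm-ddThh:mm:ssZ.
    mtime = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(st.st_mtime))
    # Field 6: size in bytes, in decimal.
    size = "%d" % st.st_size
    return mode, owner, mtime, size
```
.fi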
.PP
For non-regular file objects, the first field is an information field
enclosed in square brackets, and some of the other fields provide other
information or are suppressed, as follows.
.TP
.I Errors
If there was an error reading the object's metadata then the information
field shows
.BI E nn
.IR message ,
and the other fields, except the name, are printed as
.B error
rather than having any useful information.
.TP
.I Sockets
The information field shows
.BR socket .
.TP
.I Named pipes
The information field shows
.BR fifo .
.TP
.I Symbolic links
The information field shows
.BR symbolic-link .
The name is followed by
.RB ` \~\->\~ '
and the link target (or
.BI <E nn \~ message >
if there was an error reading the link destination).
.TP
.I Directories
The information field shows
.BR directory ,
and the size field shows
.B dir
(since directory sizes are not consistent across filesystem
implementations). The name is followed by
.RB ` / '.
.TP
.I Block and character devices
The information field shows
.B block-device
or
.BR character-device ,
as appropriate, followed by the major and minor device numbers in
decimal, and separated by a colon.
.PP
.SH BUGS
No attempt is made to sort filenames read in
.B find0
format, so they're not very likely to match digests produced any other
way. Indeed, they're not very likely to match digests produced by
.B find0
on other machines either.
.SH SEE ALSO
.BR find (1),
.BR rsync (1),
.BR sha256sum (1)
etc.
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>