hashsum.1: Fix counting error (left over from some previous edit).
Commit Line Data
4a3d0d52 1.\" -*-nroff-*-
2.de hP
3.IP
4.ft B
5\h'-\w'\\$1\ 'u'\\$1\ \c
6.ft P
7..
8.ie t .ds o \(bu
9.el .ds o o
d07dfe80 10.TH hashsum 1 "29 July 2000" "Straylight/Edgeware" "Catacomb cryptographic library"
4a3d0d52 11.SH NAME
12hashsum \- compute and verify cryptographic checksums of files
13.SH SYNOPSIS
14.B hashsum
43d1332f 15.RB [ \-f0ecbpv ]
4a3d0d52 16.RB [ \-a
17.IR algorithm ]
c65df279 18.RB [ \-E
19.IR encoding ]
4a3d0d52 20.IR files ...
21.SH DESCRIPTION
22The
23.B hashsum
24program generates and verifies cryptographic checksums (hashes) of
25files. A number of hashing algorithms are available.
26.PP
27The
28.B hashsum
29program's options and output are designed to be upwardly compatible with
30the GNU
31.BR md5sum (1)
32program.
33.PP
34Usually,
35.B hashsum
36generates checksums of a collection of files named either on the command
37line or read from standard input, and writes their hashes to standard
38output using a simple file format. However, given the
39.B \-c
40option, it will read in files in its usual output format and verify that
41the named files have the reported hashes.
42.SS "Options"
43The
44.B hashsum
45program understands the following options:
46.TP
47.B "\-h, \-\-help"
48Prints a help message to standard output and exits successfully.
49.TP
50.B "\-V, \-\-version"
51Prints the program's version number to standard output and exits
52successfully.
53.TP
54.B "\-u, \-\-usage"
55Prints a brief usage summary to standard output and exits successfully.
56.TP
c65df279 57.BR "\-l, \-\-list " [ \fIitem ...]
58Show lists of hash functions and encodings supported.
59.TP
4a3d0d52 60.BI "\-a, \-\-algorithm=" alg
61Use the hash algorithm
62.IR alg .
63If this option is not given, a default hashing algorithm is selected:
64see
65.B "Hashing algorithms"
66below.
67.TP
c65df279 68.BI "\-E, \-\-encoding=" encoding
69Use the given
70.I encoding
71to represent hashes in the output. This is not interoperable with other
72programs, but it's handy, e.g., for building sha1 URNs. The encodings
73recognized are
45c0fd36 74.B hex
c65df279 75(the default),
76.B base64
77and
78.BR base32 .
79Type
80.B hashsum \-\-list enc
81for a list of supported encodings.
4a3d0d52 82.TP
83.B "\-f, \-\-files"
84Each input file is considered to be a list of filenames which should be
85read and hashed. By default, the filenames are considered to be
86whitespace-separated, although control characters can be escaped (see
87.B "Escaping control characters"
88below).
89.TP
90.B "\-0, \-\-null"
91In conjunction with the
92.B \-f
93option above, reads null-terminated filenames, as emitted by GNU
94.BR find (1)'s
95.B \-print0
96option, rather than whitespace-delimited filenames. If the
97.B \-c
98option is also given, each file named in the list is a list of filename/hash
99pairs to be checked.
100.TP
101.B "\-e, \-\-escape"
102Escape control characters (see
103.B "Escaping control characters"
104below) in filenames when generating output. Escaped
105output is not compatible with
106.BR md5sum (1),
107but copes better with files containing newlines and other strange
108control characters.
109.TP
110.B "\-c, \-\-check"
111Check hashes. Each input file is assumed to be in
112.BR hashsum 's
113output format. It is read, and
114.B hashsum
115will verify that each named file has the correct hash. Assuming that
116the hash list is authentic (e.g., it has been digitally signed, or
117obtained via some secure medium), this provides strong assurance that
118the files listed have not been tampered with.
119.TP
120.B "\-b, \-\-binary"
121Assume that the files to be hashed are binary files. This doesn't make
122any difference on Unix systems, although it might on other platforms
123which draw a distinction.
124.TP
43d1332f 125.B "\-p, \-\-progress"
126Display a progress indicator while hashing large files. The progress
127indicator is written to standard error.
128.TP
4a3d0d52 129.B "\-v, \-\-verbose"
130In conjunction with the
131.B \-c
132option above, be verbose when checking files.
133.PP
134If no filenames are given on the command line, standard input is read.
135Standard input does not have a filename.
136.SS "Output format"
137There are three types of line in
138.BR hashsum 's
139output format:
140.IR directives ,
141.IR "file lines" ,
142and
143.IR rubbish .
144.PP
145A
146.I directive
147begins with a hash
148.RB (` # ')
7fb0660b 149character. These directives are currently understood:
4a3d0d52 150.TP
151.BI "#hash " alg
152Subsequent hashes in this file were generated using the algorithm
153.IR alg .
154.TP
c65df279 155.BI "#encoding " encoding
156Subsequent hashes in this file are represented using the named
157.IR encoding .
158.TP
4a3d0d52 159.BI "#escape"
160Filenames in subsequent lines are written using the `escaped' format,
161described below.
162.PP
163A
164.I "file line"
c65df279 165consists of a hash, in the requested encoding, followed by a space, a
4a3d0d52 166.IR flag ,
c65df279 167and the filename. The
4a3d0d52 168.I flag
169is either a star
170.RB (` * ')
171to indicate that the file should be read in binary mode, or a space.
172The rest of the line contains the filename.
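As a rough illustration of the file-line layout described above, the following Python sketch builds one line for data held in memory. It is not part of hashsum: the function name file_line is hypothetical, only the default hex encoding is handled, and Python's hashlib stands in for the Catacomb hash implementations.

```python
import hashlib


def file_line(data, filename, binary=False, algorithm="md5"):
    """Build one file line: hex hash, a space, the flag, then the filename.

    `algorithm` is a hashlib name (an assumption of this sketch); hashlib
    spells some algorithms differently from hashsum, e.g. "sha1" not "sha".
    """
    digest = hashlib.new(algorithm, data).hexdigest()
    flag = "*" if binary else " "  # star marks binary mode, otherwise a space
    return f"{digest} {flag}{filename}"


if __name__ == "__main__":
    print(file_line(b"abc", "example.txt"))
    print(file_line(b"abc", "example.bin", binary=True))
```

With the flag set to a space, the line has two spaces between the hash and the filename, which is the layout md5sum(1) users will recognize.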
173.PP
174A
175.I rubbish
176line is one which doesn't look like a directive or a file line. Rubbish
177lines are ignored. Hence, you can apply PGP clear-signing to a
178.B hashsum
179file without preventing it from being read.
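The three line types can be told apart with a few lines of Python. This is only a sketch under stated assumptions: the name classify is hypothetical, hashes are assumed to be hex-encoded, unrecognized lines starting with a hash character are treated as rubbish, and no check is made that the hash has the right length for the algorithm in force.

```python
import re

# Hex hash, one space, a flag (space or star), then the filename.
FILE_LINE = re.compile(r"^([0-9A-Fa-f]+) ([ *])(.+)$")
DIRECTIVES = ("#hash", "#encoding", "#escape")


def classify(line):
    """Return a tuple describing a line of a hash list."""
    line = line.rstrip("\n")
    if line.startswith("#"):
        parts = line.split(None, 1)
        if parts[0] in DIRECTIVES:
            argument = parts[1] if len(parts) > 1 else None
            return ("directive", parts[0][1:], argument)
        return ("rubbish", line)
    match = FILE_LINE.match(line)
    if match:
        digest, flag, filename = match.groups()
        return ("file", digest, flag == "*", filename)
    return ("rubbish", line)


if __name__ == "__main__":
    for text in ("#hash sha",
                 "900150983cd24fb0d6963f7d28e17f72  example.txt",
                 "-----BEGIN PGP SIGNED MESSAGE-----"):
        print(classify(text))
```

Because anything unrecognized falls through to the rubbish case, the armour lines added by clear-signing are skipped without special handling.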
180.SS "Escaping control characters"
181When reading filenames to hash from a list of files or an escaped hash
182list, the following rules are obeyed:
183.hP \*o
184An escaped string cannot contain unescaped, unquoted whitespace
185characters. If such a character is found, the string is considered to
186have ended.
187.hP \*o
188A backslash
189.RB (` \e ')
190escapes the following character. If the character is one of
191.RB ` a ',
192.RB ` b ',
193.RB ` f ',
194.RB ` n ',
195.RB ` r ',
196.RB ` t ',
197or
198.RB ` v ',
199it is replaced by the control character for an audible alert, backspace,
200form-feed, newline, carriage return, horizontal tab or vertical tab
201respectively; other escaped characters are unchanged, although they lose
202any special meaning they might have had.
203.hP \*o
204A section of text may be quoted by surrounding it with
205.BR ' ... ' ,
206.BR """" ... """" ,
207or
208.BR ` ... '
209pairs. Within a quoted section, whitespace characters may appear
210unescaped. The backslash may be used to quote control characters or the
211quoting characters as usual.
212.hP \*o
213A word beginning with a hash
214.RB (` # ')
215character is considered to begin a
216.I comment
217which extends to the end of the current line. The hash character may be
218escaped as usual.
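The escaping rules above can be sketched as a small Python tokenizer. This is an illustration rather than hashsum's own parser: the name split_escaped is hypothetical, and three behaviours are assumptions not stated in the rules, namely that a backslash at the very end of a line is kept literally, that a quote character in the middle of a word opens a quoted section, and that an unterminated quote runs to the end of the line.

```python
# Backslash escapes that map to control characters.
ESCAPES = {"a": "\a", "b": "\b", "f": "\f", "n": "\n",
           "r": "\r", "t": "\t", "v": "\v"}
# Opening quote character -> the character that closes it.
CLOSERS = {"'": "'", '"': '"', "`": "'"}


def split_escaped(line):
    """Split one line into words following the escaping rules."""
    words, current = [], []
    in_word = False
    closer = None  # closing quote we are waiting for, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            nxt = line[i + 1]
            current.append(ESCAPES.get(nxt, nxt))  # others lose special meaning
            in_word = True
            i += 2
            continue
        if closer is not None:
            if ch == closer:
                closer = None
            else:
                current.append(ch)  # whitespace is allowed inside quotes
        elif ch.isspace():
            if in_word:  # unescaped, unquoted whitespace ends the word
                words.append("".join(current))
                current, in_word = [], False
        elif ch == "#" and not in_word:
            break  # comment extends to the end of the line
        elif ch in CLOSERS:
            closer = CLOSERS[ch]
            in_word = True
        else:
            current.append(ch)
            in_word = True
        i += 1
    if in_word:
        words.append("".join(current))
    return words


if __name__ == "__main__":
    print(split_escaped(r"a\ b 'c d' `e f' # comment"))
```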
219.SS "Hashing algorithms"
220The
221.B hashsum
222program understands several hashing algorithms:
223.TP
2d3de78a 224.BR md2
225Designed by Ron Rivest in 1989 and described in
226RFC1319, MD2 is a really old and slow hash function. Its security is
227suspect too: only its checksum stands between it and collision-finding
228attacks. Use of MD2 is not recommended, though it's still used in
229various standards.
230.TP
4a3d0d52 231.BR md4 " and " md5
232Designed by Ron Rivest in 1990 and 1992 respectively and described in
233RFCs 1186, 1320 and 1321, these two early hash functions are efficient
234but cryptographically suspect: the MD4 algorithm has been shown not to
235be collision-resistant and there are `pseudo-collisions' in MD5.
236Despite this,
237.B md5
238has been used heavily since its introduction and is still popular. MD4
239is still useful when a fast non-cryptographic hash is wanted.
240.TP
241.B sha
242Designed by the US National Security Agency as part of the Digital
243Signature Standard, SHA-1 provides a longer output than
244.B md4
245and
246.BR md5 ,
247and is seen as being more secure.
248.TP
249.BR rmd128 ", " rmd160 ", " rmd256 " and " rmd320
250Designed by Antoon Bosselaers, Hans Dobbertin and Bart Preneel in 1996
251as a replacement for the earlier RIPEMD algorithm, RIPEMD160 provides
252the same length output as SHA-1, but has been designed in the open by
253experts. RIPEMD128 is a shortened version of RIPEMD160 designed as a
254drop-in replacement for MD4, MD5 and the old RIPEMD. The 256 and
255320-bit versions are efficient double-width extensions of the 128 and
256160-bit hashes, although they may not offer any additional security.
257.TP
258.B tiger
259Designed by Ross Anderson and Eli Biham to take advantage of 64-bit
260processors, Tiger seems to be an efficient and strong hash function.
4a3d0d52 261It's a relatively new algorithm, however, and should probably be
262approached with an open-minded caution.
2d3de78a 263.TP
bad16614 264.BR sha256 ", " sha384 " and " sha512
2d3de78a 265Designed by the US National Security Agency to provide security
266commensurate with the Advanced Encryption Standard, these hash functions
267provide long outputs. SHA-256 is fairly quick, though the longer
268variants are slower on 32-bit hardware since they require 64-bit
269arithmetic. They're all very new at the moment, and should be
270approached with an open-minded caution.
4a3d0d52 271.PP
272The default hashing algorithm is determined by looking at the name by
273which the program was invoked, as passed to it in
274.BR argv[0] :
275if it has the form
276.RI ` alg \c
277.BR sum '
278where
279.I alg
280is the name of a hash function, that hash becomes the default. (Hence,
281.B hashsum
282can be used as a drop-in replacement for
283.BR md5sum (1).)
284If the program name doesn't match an algorithm, then
285.B md5
286is selected for compatibility with files generated by
287.BR md5sum (1).
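The selection rule just described might be sketched in Python as follows. The function name default_algorithm is hypothetical, the set of names is copied from the list of algorithms in this section, and stripping any leading directory from the program name is an assumption of the sketch.

```python
import os

# Algorithm names as listed under "Hashing algorithms".
KNOWN = {"md2", "md4", "md5", "sha", "rmd128", "rmd160", "rmd256",
         "rmd320", "tiger", "sha256", "sha384", "sha512"}


def default_algorithm(argv0):
    """Pick the default hash from a program name of the form `algsum'."""
    name = os.path.basename(argv0)  # assumption: ignore any directory part
    if name.endswith("sum") and name[:-3] in KNOWN:
        return name[:-3]
    return "md5"  # fallback, for compatibility with md5sum(1)


if __name__ == "__main__":
    for prog in ("/usr/bin/sha256sum", "tigersum", "hashsum"):
        print(prog, "->", default_algorithm(prog))
```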
288.PP
289Note that the same default algorithm is used for both generating new
290output files and checking existing ones. If the algorithm is forced by
291the
292.B \-a
293option,
294.B hashsum
295will emit a
296.RB ` #hash '
297directive in its output.
298.SH "SEE ALSO"
fa54fe1e 299.BR md5sum (1),
300.BR dsig (1),
301.BR catsign (1),
302.BR catcrypt (1).
4a3d0d52 303.SH "AUTHOR"
f387fcb1 304Mark Wooding, <mdw@distorted.org.uk>