.\" -*-nroff-*-
.de hP
.IP
.ft B
\h'-\w'\\$1\ 'u'\\$1\ \c
.ft P
..
.ie t .ds o \(bu
.el .ds o o
.TH hashsum 1 "29 July 2000" "Straylight/Edgeware" "Catacomb cryptographic library"
.SH NAME
hashsum \- compute and verify cryptographic checksums of files
.SH SYNOPSIS
.B hashsum
.RB [ \-f0ecbv ]
.RB [ \-a
.IR algorithm ]
.IR files ...
.SH DESCRIPTION
The
.B hashsum
program generates and verifies cryptographic checksums (hashes) of
files. A number of hashing algorithms are available.
.PP
The
.B hashsum
program's options and output are designed to be upwardly compatible with
the GNU
.BR md5sum (1)
program.
.PP
Usually,
.B hashsum
generates checksums of a collection of files named either on the command
line or read from standard input, and writes their hashes to standard
output using a simple file format. However, given the
.B \-c
option, it will read in files in its usual output format and verify that
the named files have the reported hashes.
.SS "Options"
The
.B hashsum
program understands the following options:
.TP
.B "\-h, \-\-help"
Prints a help message to standard output and exits successfully.
.TP
.B "\-V, \-\-version"
Prints the program's version number to standard output and exits
successfully.
.TP
.B "\-u, \-\-usage"
Prints a brief usage summary to standard output and exits successfully.
.TP
.BI "\-a, \-\-algorithm=" alg
Use the hash algorithm
.IR alg .
If this option is not given, a default hashing algorithm is selected:
see
.B "Hashing algorithms"
below.
.TP
.B "\-l, \-\-list"
Prints a space-separated list of available hashing algorithms to
standard output and exits successfully.
.TP
.B "\-f, \-\-files"
Each input file is considered to be a list of filenames which should be
read and hashed. By default, the filenames are considered to be
whitespace-separated, although control characters can be escaped (see
.B "Escaping control characters"
below).
.TP
.B "\-0, \-\-null"
In conjunction with the
.B \-f
option above, reads null-terminated filenames, as emitted by GNU
.BR find (1)'s
.B \-print0
option, rather than whitespace-delimited filenames. If the
.B \-c
option is also given, each file named in the list is itself a list of
filename/hash pairs to be checked.
.TP
.B "\-e, \-\-escape"
Escape control characters (see
.B "Escaping control characters"
below) in filenames when generating output. Escaped
output is not compatible with
.BR md5sum (1),
but copes better with files containing newlines and other strange
control characters.
.TP
.B "\-c, \-\-check"
Check hashes. Each input file is assumed to be in
.BR hashsum 's
output format. It is read, and
.B hashsum
will verify that each named file has the correct hash. Assuming that
the hash list is authentic (e.g., it has been digitally signed, or
obtained via some secure medium), this provides strong assurance that
the files listed have not been tampered with.
.TP
.B "\-b, \-\-binary"
Assume that the files to be hashed are binary files. This doesn't make
any difference on Unix systems, although it might on other platforms
which draw a distinction.
.TP
.B "\-v, \-\-verbose"
In conjunction with the
.B \-c
option above, be verbose when checking files.
.PP
If no filenames are given on the command line, standard input is read.
Standard input does not have a filename.
.SS "Output format"
There are three types of line in
.BR hashsum 's
output format:
.IR directives ,
.IR "file lines" ,
and
.IR rubbish .
.PP
A
.I directive
begins with a hash
.RB (` # ')
character. Two directives are currently understood:
.TP
.BI "#hash " alg
Subsequent hashes in this file were generated using the algorithm
.IR alg .
.TP
.BI "#escape"
Filenames in subsequent lines are written using the `escaped' format,
described below.
.PP
A
.I "file line"
consists of a hash, in hexadecimal, followed by a space, a
.IR flag ,
and the filename. If the current hash algorithm produces
.IR n -bit
output, there must be
.IR n /4
hex digits of hash in a file line. The
.I flag
is either a star
.RB (` * ')
to indicate that the file should be read in binary mode, or a space.
The rest of the line contains the filename.
.PP
A
.I rubbish
line is one which doesn't look like a directive or a file line. Rubbish
lines are ignored. Hence, you can apply PGP clear-signing to a
.B hashsum
file without preventing it from being read.
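.PP
As a rough illustration of the three line types described above, the
following Python sketch classifies a single line of a
.B hashsum
file. It is not taken from the
.B hashsum
source: the function name
.B classify
and its
.B bits
parameter are invented for this example, and treating every line that
begins with a hash character as a directive is an assumption.
.nf

```python
import re

# Assumed shape of a file line, following the description above:
#   <hex digits> <one space> <flag: '*' or ' '> <filename>
FILE_LINE = re.compile("^([0-9a-fA-F]+) ([* ])(.*)$")

def classify(line, bits=128):
    """Classify one line as a directive, a file line, or rubbish.

    bits is the output width of the current hash algorithm, so a
    file line must carry exactly bits // 4 hex digits.
    """
    if line.startswith("#"):
        # Split '#hash alg' into its name and argument.
        name, _, arg = line[1:].partition(" ")
        return ("directive", name, arg)
    m = FILE_LINE.match(line)
    if m and len(m.group(1)) == bits // 4:
        binary = m.group(2) == "*"
        return ("file", m.group(1).lower(), binary, m.group(3))
    # Anything else (for example PGP armour) is ignored by a reader.
    return ("rubbish",)
```
.fi
.PP
A caller would typically update
.B bits
whenever it sees a
.B #hash
directive, and skip anything reported as rubbish.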
.SS "Escaping control characters"
When reading filenames to hash from a list of files or an escaped hash
list, the following rules are obeyed:
.hP \*o
An escaped string cannot contain unescaped, unquoted whitespace
characters. If such a character is found, the string is considered to
have ended.
.hP \*o
A backslash
.RB (` \e ')
escapes the following character. If the character is one of
.RB ` a ',
.RB ` b ',
.RB ` f ',
.RB ` n ',
.RB ` r ',
.RB ` t ',
or
.RB ` v ',
it is replaced by the control character for an audible alert, backspace,
form-feed, newline, carriage return, horizontal tab or vertical tab
respectively; other escaped characters are unchanged, although they lose
any special meaning they might have had.
.hP \*o
A section of text may be quoted by surrounding it with
.BR ' ... ' ,
.BR """" ... """" ,
or
.BR ` ... '
pairs. Within a quoted section, whitespace characters may appear
unescaped. The backslash may be used to quote control characters or the
quoting characters as usual.
.hP \*o
A word beginning with a hash
.RB (` # ')
character is considered to begin a
.I comment
which extends to the end of the current line. The hash character may be
escaped as usual.
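.PP
The four rules above can be sketched in Python as follows. This is
only one reading of the rules, not the parser used by
.BR hashsum :
the name
.B read_word
is invented for this example, and details the rules leave open (such
as a backslash at the very end of the input, which is kept literally
here) are assumptions. The backslash is spelled with
.B chr(92)
so that the listing itself needs no escaping.
.nf

```python
BACKSLASH = chr(92)  # written with chr() to keep the listing backslash-free

# Letters that map to control characters when escaped.
CONTROL = {"a": chr(7), "b": chr(8), "f": chr(12), "n": chr(10),
           "r": chr(13), "t": chr(9), "v": chr(11)}

# Opening quote character -> the character that closes it.
CLOSER = {"'": "'", '"': '"', "`": "'"}

def read_word(text):
    """Return the first escaped word in text, or None for a blank or comment line."""
    i = 0
    while i < len(text) and text[i].isspace():
        i += 1
    if i == len(text) or text[i] == "#":
        return None  # nothing there, or an unescaped '#' starts a comment
    out = []
    closing = None  # closing quote we are waiting for, if any
    while i < len(text):
        ch = text[i]
        if ch == BACKSLASH and i + 1 < len(text):
            # The next character is taken literally or mapped to a control code.
            i += 1
            out.append(CONTROL.get(text[i], text[i]))
        elif closing is not None:
            if ch == closing:
                closing = None
            else:
                out.append(ch)  # whitespace is allowed inside quotes
        elif ch in CLOSER:
            closing = CLOSER[ch]
        elif ch.isspace():
            break  # unescaped, unquoted whitespace ends the word
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```
.fi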
.SS "Hashing algorithms"
The
.B hashsum
program understands several hashing algorithms:
.TP
.BR md4 " and " md5
Designed by Ron Rivest in 1990 and 1991 respectively and described in
RFCs 1186, 1320 and 1321, these two early hash functions are efficient
but cryptographically suspect: the MD4 algorithm has been shown not to
be collision-resistant and there are `pseudo-collisions' in MD5.
Despite this,
.B md5
has been used heavily since its introduction and is still popular. MD4
is still useful when a fast non-cryptographic hash is wanted.
.TP
.B sha
Designed by the US National Security Agency as part of the Digital
Signature Standard, SHA-1 provides a longer output than
.B md4
and
.BR md5 ,
and is seen as being more secure.
.TP
.BR rmd128 ", " rmd160 ", " rmd256 " and " rmd320
Designed by Antoon Bosselaers, Hans Dobbertin and Bart Preneel in 1996
as a replacement for the earlier RIPEMD algorithm, RIPEMD160 provides
the same length output as SHA-1, but has been designed in the open by
experts. RIPEMD128 is a shortened version of RIPEMD160 designed as a
drop-in replacement for MD4, MD5 and the old RIPEMD. The 256- and
320-bit versions are efficient double-width extensions of the 128- and
160-bit hashes, although they may not offer any additional security.
.TP
.B tiger
Designed by Ross Anderson and Eli Biham to take advantage of 64-bit
processors, Tiger seems to be an efficient and strong hash function.
Its 192-bit output is wider than that of any other algorithm supported
by
.B hashsum
apart from
.B rmd256
and
.BR rmd320 .
It's a relatively new algorithm, however, and should probably be
approached with open-minded caution.
.PP
The default hashing algorithm is determined by looking at the name by
which the program was invoked, as passed to it in
.BR argv[0] :
if it has the form
.RI ` alg \c
.BR sum '
where
.I alg
is the name of a hash function, that hash becomes the default. (Hence,
.B hashsum
can be used as a drop-in replacement for
.BR md5sum (1).)
If the program name doesn't match an algorithm, then
.B md5
is selected for compatibility with files generated by
.BR md5sum (1).
.PP
Note that the same default algorithm is used for both generating new
output files and checking existing ones. If the algorithm is forced by
the
.B \-a
option,
.B hashsum
will emit a
.RB ` #hash '
directive in its output.
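.PP
The rule for choosing the default algorithm from the program name can
be sketched in Python as below. The function name
.B default_algorithm
is invented for this example, the algorithm list is simply the one
given in this section, and stripping any leading directory from
.B argv[0]
before matching is an assumption about how the name is examined.
.nf

```python
import os

# Algorithm names as listed in this manual page.
ALGORITHMS = ("md4", "md5", "sha", "rmd128", "rmd160",
              "rmd256", "rmd320", "tiger")

def default_algorithm(argv0):
    """Pick the default hash from the program name, falling back to md5."""
    name = os.path.basename(argv0)
    # A name of the form <alg>sum selects <alg> as the default.
    if name.endswith("sum") and name[:-3] in ALGORITHMS:
        return name[:-3]
    # Any other name, including hashsum itself, means md5.
    return "md5"
```
.fi
.PP
Under this sketch, a link to the program named
.B tigersum
would default to
.BR tiger ,
while the name
.B hashsum
would default to
.BR md5 .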
.SH "SEE ALSO"
.BR md5sum (1).
.SH "AUTHOR"
Mark Wooding, <mdw@nsict.org>