4a3d0d52 |
1 | .\" -*-nroff-*- |
2 | .de hP |
3 | .IP |
4 | .ft B |
5 | \h'-\w'\\$1\ 'u'\\$1\ \c |
6 | .ft P |
7 | .. |
8 | .ie t .ds o \(bu |
9 | .el .ds o o |
d07dfe80 |
10 | .TH hashsum 1 "29 July 2000" "Straylight/Edgeware" "Catacomb cryptographic library" |
4a3d0d52 |
11 | .SH NAME |
12 | hashsum \- compute and verify cryptographic checksums of files |
13 | .SH SYNOPSIS |
14 | .B hashsum |
15 | .RB [ \-f0ecbv ] |
16 | .RB [ \-a |
17 | .IR algorithm ] |
18 | .IR files ... |
19 | .SH DESCRIPTION |
20 | The |
21 | .B hashsum |
22 | program generates and verifies cryptographic checksums (hashes) of |
23 | files. A number of hashing algorithms are available. |
24 | .PP |
25 | The |
26 | .B hashsum |
27 | program's options and output are designed to be upwardly compatible with |
28 | the GNU |
29 | .BR md5sum (1) |
30 | program. |
31 | .PP |
32 | Usually, |
33 | .B hashsum |
34 | generates checksums of a collection of files named either on the command |
35 | line or read from standard input, and writes their hashes to standard |
36 | output using a simple file format. However, given the |
37 | .B \-c |
38 | option, it will read in files in its usual output format and verify that |
39 | the named files have the reported hashes. |
40 | .SS "Options" |
41 | The |
42 | .B hashsum |
43 | program understands the following options: |
44 | .TP |
45 | .B "\-h, \-\-help" |
46 | Prints a help message to standard output and exits successfully. |
47 | .TP |
48 | .B "\-V, \-\-version" |
49 | Prints the program's version number to standard output and exits |
50 | successfully. |
51 | .TP |
52 | .B "\-u, \-\-usage" |
53 | Prints a brief usage summary to standard output and exits successfully. |
54 | .TP |
55 | .BI "\-a, \-\-algorithm=" alg |
56 | Use the hash algorithm |
57 | .IR alg . |
58 | If this option is not given, a default hashing algorithm is selected: |
59 | see |
60 | .B "Hashing algorithms" |
61 | below. |
62 | .TP |
63 | .B "\-l, \-\-list" |
64 | Prints a space-separated list of available hashing algorithms to |
65 | standard output and exits successfully. |
66 | .TP |
67 | .B "\-f, \-\-files" |
68 | Each input file is considered to be a list of filenames which should be |
69 | read and hashed. By default, the filenames are considered to be |
70 | whitespace-separated, although control characters can be escaped (see |
71 | .B "Escaping control characters" |
72 | below). |
73 | .TP |
74 | .B "\-0, \-\-null" |
75 | In conjunction with the |
76 | .B \-f |
77 | option above, reads null-terminated filenames, as emitted by GNU |
78 | .BR find (1)'s |
79 | .B \-print0 |
80 | option, rather than whitespace-delimited filenames. If the |
81 | .B \-c |
82 | option is also given, each file named in the list is a list of filename/hash |
83 | pairs to be checked. |
84 | .TP |
85 | .B "\-e, \-\-escape" |
86 | Escape control characters (see |
87 | .B "Escaping control characters" |
88 | below) in filenames when generating output. Escaped |
89 | output is not compatible with |
90 | .BR md5sum (1), |
91 | but copes better with files containing newlines and other strange |
92 | control characters. |
93 | .TP |
94 | .B "\-c, \-\-check" |
95 | Check hashes. Each input file is assumed to be in |
96 | .BR hashsum 's |
97 | output format. It is read, and |
98 | .B hashsum |
99 | will verify that each named file has the correct hash. Assuming that |
100 | the hash list is authentic (e.g., it has been digitally signed, or |
101 | obtained via some secure medium), this provides strong assurance that |
102 | the files listed have not been tampered with. |
103 | .TP |
104 | .B "\-b, \-\-binary" |
105 | Assume that the files to be hashed are binary files. This doesn't make |
106 | any difference on Unix systems, although it might on other platforms |
107 | which draw a distinction. |
108 | .TP |
109 | .B "\-v, \-\-verbose" |
110 | In conjunction with the |
111 | .B \-c |
112 | option above, be verbose when checking files. |
113 | .PP |
114 | If no filenames are given on the command line, standard input is read. |
115 | Standard input does not have a filename. |
116 | .SS "Output format" |
117 | There are three types of line in |
118 | .BR hashsum 's |
119 | output format: |
120 | .IR directives , |
121 | .IR "file lines" , |
122 | and |
123 | .IR rubbish . |
124 | .PP |
125 | A |
126 | .I directive |
127 | begins with a hash |
128 | .RB (` # ') |
129 | character. Two directives are currently understood: |
130 | .TP |
131 | .BI "#hash " alg |
132 | Subsequent hashes in this file were generated using the algorithm |
133 | .IR alg . |
134 | .TP |
135 | .BI "#escape" |
136 | Filenames in subsequent lines are written using the `escaped' format, |
137 | described below. |
138 | .PP |
139 | A |
140 | .I "file line" |
141 | consists of a hash, in hexadecimal, followed by a space, a |
142 | .IR flag , |
143 | and the filename. If the current hash algorithm produces |
144 | .IR n -bit |
145 | output, there must be |
146 | .IR n /4 |
147 | hex digits of hash in a file line. The |
148 | .I flag |
149 | is either a star |
150 | .RB (` * ') |
151 | to indicate that the file should be read in binary mode, or a space. |
152 | The rest of the line contains the filename. |
153 | .PP |
154 | A |
155 | .I rubbish |
156 | line is one which doesn't look like a directive or a file line. Rubbish |
157 | lines are ignored. Hence, you can apply PGP clear-signing to a |
158 | .B hashsum |
159 | file without preventing it from being read. |
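.PP
As a rough illustration only, the three line types described above could
be told apart along the following lines in Python. The function name
parse_lines and the HASH_BITS table are hypothetical and are not part of
the program; the table lists only a few algorithms with their output
sizes in bits, and treating an unrecognized directive as rubbish is an
assumption of this sketch.
.PP
.nf
```python
import re

# Hypothetical table of output sizes in bits, for illustration only.
HASH_BITS = {"md5": 128, "sha": 160, "rmd160": 160, "sha256": 256}


def parse_lines(lines, alg="md5"):
    """Classify each line as a directive, a file line or rubbish."""
    out = []
    for line in lines:
        if line.startswith("#hash ") and line[6:] in HASH_BITS:
            # Later file lines must use this algorithm's hash length.
            alg = line[6:]
            out.append(("directive", "hash", alg))
        elif line == "#escape":
            out.append(("directive", "escape", None))
        else:
            # An n-bit hash needs n/4 hex digits, then a space, the
            # flag (star or space), and the filename.
            ndigits = HASH_BITS[alg] // 4
            pattern = "([0-9a-fA-F]{%d}) ([* ])(.+)" % ndigits
            m = re.fullmatch(pattern, line)
            if m:
                binary = m.group(2) == "*"
                out.append(("file", m.group(1), binary, m.group(3)))
            else:
                # Anything else (PGP armour, say) is ignored.
                out.append(("rubbish", line))
    return out
```
.fi
.PP
In this sketch a 32-digit MD5 line that follows a directive selecting
sha is classified as rubbish, because 40 hex digits are then expected.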
160 | .SS "Escaping control characters" |
161 | When reading filenames to hash from a list of files or an escaped hash |
162 | list, the following rules are obeyed: |
163 | .hP \*o |
164 | An escaped string cannot contain unescaped, unquoted whitespace |
165 | characters. If such a character is found, the string is considered to |
166 | have ended. |
167 | .hP \*o |
168 | A backslash |
169 | .RB (` \e ') |
170 | escapes the following character. If the character is one of |
171 | .RB ` a ', |
172 | .RB ` b ', |
173 | .RB ` f ', |
174 | .RB ` n ', |
175 | .RB ` r ', |
176 | .RB ` t ', |
177 | or |
178 | .RB ` v ', |
179 | it is replaced by the control character for an audible alert, backspace, |
180 | form-feed, newline, carriage return, horizontal tab or vertical tab |
181 | respectively; other escaped characters are unchanged, although they lose |
182 | any special meaning they might have had. |
183 | .hP \*o |
184 | A section of text may be quoted by surrounding it with |
185 | .BR ' ... ' , |
186 | .BR """" ... """" , |
187 | or |
188 | .BR ` ... ' |
189 | pairs. Within a quoted section, whitespace characters may appear |
190 | unescaped. The backslash may be used to quote control characters or the |
191 | quoting characters as usual. |
192 | .hP \*o |
193 | A word beginning with a hash |
194 | .RB (` # ') |
195 | character is considered to begin a |
196 | .I comment |
197 | which extends to the end of the current line. The hash character may be |
198 | escaped as usual. |
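.PP
The rules above might be sketched in Python roughly as follows. This is
an illustration under stated assumptions, not the program's actual
parser: the names read_words, BACKSLASH, CONTROLS and CLOSERS are
hypothetical, and keeping a trailing lone backslash as a literal
character is a choice made here. The backslash is written as chr(92) so
that the listing contains no literal backslashes.
.PP
.nf
```python
BACKSLASH = chr(92)
# Letters which, after a backslash, stand for control characters.
CONTROLS = {"a": chr(7), "b": chr(8), "f": chr(12), "n": chr(10),
            "r": chr(13), "t": chr(9), "v": chr(11)}
# Each opening quote mapped to the character which closes it.
CLOSERS = {"'": "'", '"': '"', "`": "'"}


def read_words(line):
    """Split one line into unescaped words."""
    words, word, closer, i = [], None, None, 0
    while i < len(line):
        ch = line[i]
        i += 1
        if ch == BACKSLASH and i < len(line):
            # Escape: map known letters, keep anything else as is.
            nxt = line[i]
            i += 1
            word = (word or "") + CONTROLS.get(nxt, nxt)
        elif closer is not None:
            # Inside quotes: whitespace is ordinary text.
            if ch == closer:
                closer = None
            else:
                word += ch
        elif ch in CLOSERS:
            closer = CLOSERS[ch]
            word = word or ""
        elif ch.isspace():
            # Unescaped, unquoted whitespace ends the word.
            if word is not None:
                words.append(word)
                word = None
        elif ch == "#" and word is None:
            # A word beginning with a hash starts a comment.
            break
        else:
            word = (word or "") + ch
    if word is not None:
        words.append(word)
    return words
```
.fi
.PP
Each returned word would then be treated as one filename.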
199 | .SS "Hashing algorithms" |
200 | The |
201 | .B hashsum |
202 | program understands several hashing algorithms: |
203 | .TP |
2d3de78a |
204 | .BR md2 |
205 | Designed by Ron Rivest in 1989 and described in |
206 | RFC 1319, MD2 is a really old and slow hash function. Its security is |
207 | suspect too: only its checksum stands between it and collision-finding |
208 | attacks. Use of MD2 is not recommended, though it's still used in |
209 | various standards. |
210 | .TP |
4a3d0d52 |
211 | .BR md4 " and " md5 |
212 | Designed by Ron Rivest in 1990 and 1992 respectively and described in |
213 | RFCs 1186, 1320 and 1321, these two early hash functions are efficient |
214 | but cryptographically suspect: the MD4 algorithm has been shown not to |
215 | be collision-resistant and there are `pseudo-collisions' in MD5. |
216 | Despite this, |
217 | .B md5 |
218 | has been used heavily since its introduction and is still popular. MD4 |
219 | is still useful when a fast non-cryptographic hash is wanted. |
220 | .TP |
221 | .B sha |
222 | Designed by the US National Security Agency as part of the Digital |
223 | Signature Standard, SHA-1 provides a longer output than |
224 | .B md4 |
225 | and |
226 | .BR md5 , |
227 | and is seen as being more secure. |
228 | .TP |
229 | .BR rmd128 ", " rmd160 ", " rmd256 " and " rmd320 |
230 | Designed by Antoon Bosselaers, Hans Dobbertin and Bart Preneel in 1996 |
231 | as a replacement for the earlier RIPEMD algorithm, RIPEMD160 provides |
232 | the same length output as SHA-1, but has been designed in the open by |
233 | experts. RIPEMD128 is a shortened version of RIPEMD160 designed as a |
234 | drop-in replacement for MD4, MD5 and the old RIPEMD. The 256 and |
235 | 320-bit versions are efficient double-width extensions of the 128 and |
236 | 160-bit hashes, although they may not offer any additional security. |
237 | .TP |
238 | .B tiger |
239 | Designed by Ross Anderson and Eli Biham to take advantage of 64-bit |
240 | processors, Tiger seems to be an efficient and strong hash function. |
4a3d0d52 |
241 | It's a relatively new algorithm, however, and should probably be |
242 | approached with an open-minded caution. |
2d3de78a |
243 | .TP |
bad16614 |
244 | .BR sha256 ", " sha384 " and " sha512 |
2d3de78a |
245 | Designed by the US National Security Agency to provide security |
246 | commensurate with the Advanced Encryption Standard, these hash functions |
247 | provide long outputs. SHA-256 is fairly quick, though the longer |
248 | variants are slower on 32-bit hardware since they require 64-bit |
249 | arithmetic. They're all very new at the moment, and should be |
250 | approached with an open-minded caution. |
4a3d0d52 |
251 | .PP |
252 | The default hashing algorithm is determined by the name under which the |
253 | program was invoked, as passed to it in |
254 | .BR argv[0] : |
255 | if it has the form |
256 | .RI ` alg \c |
257 | .BR sum ' |
258 | where |
259 | .I alg |
260 | is the name of a hash function, that hash becomes the default. (Hence, |
261 | .B hashsum |
262 | can be used as a drop-in replacement for |
263 | .BR md5sum (1).) |
264 | If the program name doesn't match an algorithm, then |
265 | .B md5 |
266 | is selected for compatibility with files generated by |
267 | .BR md5sum (1). |
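.PP
The selection of the default algorithm described above could be
sketched in Python like this. The function name default_algorithm and
the ALGORITHMS list are hypothetical, and stripping any leading
directory from the program name is an assumption of the sketch rather
than documented behaviour.
.PP
.nf
```python
import os

# Algorithm names as listed in this manual.
ALGORITHMS = ["md2", "md4", "md5", "sha", "rmd128", "rmd160",
              "rmd256", "rmd320", "tiger", "sha256", "sha384",
              "sha512"]


def default_algorithm(argv0):
    """Pick the default hash from the invoked program name."""
    # Assumption: only the final path component is examined.
    name = os.path.basename(argv0)
    if name.endswith("sum") and name[:-3] in ALGORITHMS:
        return name[:-3]
    # No match: fall back to md5 for md5sum compatibility.
    return "md5"
```
.fi
.PP
Under these assumptions, a link named sha256sum would select sha256,
while the name hashsum itself falls back to md5.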
268 | .PP |
269 | Note that the same default algorithm is used for both generating new |
270 | output files and checking existing ones. If the algorithm is forced by |
271 | the |
272 | .B \-a |
273 | option, |
274 | .B hashsum |
275 | will emit a |
276 | .RB ` #hash ' |
277 | directive in its output. |
278 | .SH "SEE ALSO" |
279 | .BR md5sum (1). |
280 | .SH "AUTHOR" |
281 | Mark Wooding, <mdw@nsict.org> |