| 1 | .\" -*-nroff-*- |
| 2 | .de hP |
| 3 | .IP |
| 4 | .ft B |
| 5 | \h'-\w'\\$1\ 'u'\\$1\ \c |
| 6 | .ft P |
| 7 | .. |
| 8 | .ie t .ds o \(bu |
| 9 | .el .ds o o |
| 10 | .TH hashsum 1 "29 July 2000" "Straylight/Edgeware" "Catacomb cryptographic library" |
| 11 | .SH NAME |
| 12 | hashsum \- compute and verify cryptographic checksums of files |
| 13 | .SH SYNOPSIS |
| 14 | .B hashsum |
| 15 | .RB [ \-f0ecbjpv ] |
| 16 | .RB [ \-a |
| 17 | .IR algorithm ] |
| 18 | .RB [ \-E |
| 19 | .IR encoding ] |
| 20 | .IR files ... |
| 21 | .SH DESCRIPTION |
| 22 | The |
| 23 | .B hashsum |
| 24 | program generates and verifies cryptographic checksums (hashes) of |
| 25 | files. A number of hashing algorithms are available. |
| 26 | .PP |
| 27 | The |
| 28 | .B hashsum |
| 29 | program's options and output were originally designed to be upwardly |
| 30 | compatible with the GNU |
| 31 | .BR md5sum (1) |
| 32 | program, but the two have diverged somewhat. See the |
| 33 | .B "COMPATIBILITY NOTES" |
| 34 | section of this manual for details. |
| 35 | .PP |
| 36 | Usually, |
| 37 | .B hashsum |
| 38 | generates checksums of a collection of files named either on the command |
| 39 | line or read from standard input, and writes their hashes to standard |
| 40 | output using a simple file format. However, given the |
| 41 | .B \-c |
| 42 | option, it will read in files in its usual output format and verify that |
| 43 | the named files have the reported hashes. |
| 44 | .SS "Options" |
| 45 | The |
| 46 | .B hashsum |
| 47 | program understands the following options: |
| 48 | .TP |
| 49 | .B "\-h, \-\-help" |
| 50 | Prints a help message to standard output and exits successfully. |
| 51 | .TP |
| 52 | .B "\-V, \-\-version" |
| 53 | Prints the program's version number to standard output and exits |
| 54 | successfully. |
| 55 | .TP |
| 56 | .B "\-u, \-\-usage" |
| 57 | Prints a brief usage summary to standard output and exits successfully. |
| 58 | .TP |
| 59 | .BR "\-l, \-\-list " [ \fIitem ...] |
| 60 | Show lists of hash functions and encodings supported. |
| 61 | .TP |
| 62 | .BI "\-a, \-\-algorithm=" alg |
| 63 | Use the hash algorithm |
| 64 | .IR alg . |
| 65 | If this option is not given, a default hashing algorithm is selected: |
| 66 | see |
| 67 | .B "Hashing algorithms" |
| 68 | below. |
| 69 | .TP |
| 70 | .BI "\-E, \-\-encoding=" encoding |
| 71 | Use the given |
| 72 | .I encoding |
| 73 | to represent hashes in the output. This is not interoperable with other |
| 74 | programs, but it's handy, e.g., for building sha1 URNs. The encodings |
| 75 | recognized are |
| 76 | .B hex |
| 77 | (the default), |
| 78 | .B base64 |
| 79 | and |
| 80 | .BR base32 . |
| 81 | Type |
| 82 | .B hashsum \-\-list enc |
| 83 | for a list of supported encodings. |
| 84 | .TP |
| 85 | .B "\-f, \-\-files" |
| 86 | Each input file is considered to be a list of filenames which should be |
| 87 | read and hashed. By default, the filenames are considered to be |
| 88 | whitespace-separated, although control characters can be escaped (see |
| 89 | .B "Escaping control characters" |
| 90 | below). |
| 91 | .TP |
| 92 | .B "\-0, \-\-null" |
| 93 | In conjunction with the |
| 94 | .B \-f |
| 95 | option above, reads null-terminated filenames, as emitted by GNU |
| 96 | .BR find (1)'s |
| 97 | .B \-print0 |
| 98 | option, rather than whitespace-delimited filenames. If the |
| 99 | .B \-c |
| 100 | option is also given, each file named in the list is a list of filename/hash |
| 101 | pairs to be checked. |
| 102 | .TP |
| 103 | .B "\-e, \-\-escape" |
| 104 | Escape control characters (see |
| 105 | .B "Escaping control characters" |
| 106 | below) in filenames when generating output. Escaped |
| 107 | output is not compatible with |
| 108 | .BR md5sum (1), |
| 109 | but copes better with files containing newlines and other strange |
| 110 | control characters. |
| 111 | .TP |
| 112 | .B "\-c, \-\-check" |
| 113 | Check hashes. Each input file is assumed to be in |
| 114 | .BR hashsum 's |
| 115 | output format. It is read, and |
| 116 | .B hashsum |
| 117 | will verify that each named file has the correct hash. Assuming that |
| 118 | the hash list is authentic (e.g., it has been digitally signed, or |
| 119 | obtained via some secure medium), this provides strong assurance that |
| 120 | the files listed have not been tampered with. |
| 121 | .TP |
| 122 | .B "\-j, \-\-junk" |
| 123 | Report files whose hashes have not been checked. This is most useful in |
| 124 | conjunction with |
| 125 | .RB ` \-c ', |
| 126 | though it's valid without. The program merely prints warnings about |
| 127 | junk files when computing hashes, but will exit nonzero if any are found |
| 128 | when checking them. |
| 129 | .TP |
| 130 | .B "\-b, \-\-binary" |
| 131 | Assume that the files to be hashed are binary files. This doesn't make |
| 132 | any difference on Unix systems, although it might on other platforms |
| 133 | which draw a distinction. |
| 134 | .TP |
| 135 | .B "\-p, \-\-progress" |
| 136 | Display a progress indicator while hashing large files. The progress |
| 137 | indicator is written to standard error. |
| 138 | .TP |
| 139 | .B "\-v, \-\-verbose" |
| 140 | In conjunction with the |
| 141 | .B \-c |
| 142 | option above, be verbose when checking files. |
| 143 | .PP |
| 144 | If no filenames are given on the command line, standard input is read. |
| 145 | Standard input does not have a filename. |
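.PP
As a rough illustration of the three encodings accepted by the
.B \-E
option above, the Python sketch below hashes a byte string and renders
the digest as hex, base64 or base32. The function name
.B encode_digest
is hypothetical, and the standard RFC 4648 alphabets and padding are an
assumption: the exact letter case and padding which
.B hashsum
itself emits may differ.

```python
import base64
import hashlib


def encode_digest(data: bytes, encoding: str = "hex", alg: str = "sha1") -> str:
    """Hash data with alg and render the digest in the named encoding."""
    digest = hashlib.new(alg, data).digest()
    if encoding == "hex":
        return digest.hex()
    if encoding == "base64":
        return base64.b64encode(digest).decode("ascii")
    if encoding == "base32":
        return base64.b32encode(digest).decode("ascii")
    raise ValueError(f"unknown encoding: {encoding}")


if __name__ == "__main__":
    # Show the same SHA-1 digest of the empty string in each encoding.
    for enc in ("hex", "base64", "base32"):
        print(enc, encode_digest(b"", enc))
```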
| 146 | .SS "Output format" |
| 147 | There are three types of line in |
| 148 | .BR hashsum 's |
| 149 | output format: |
| 150 | .IR directives , |
| 151 | .IR "file lines" , |
| 152 | and |
| 153 | .IR rubbish . |
| 154 | .PP |
| 155 | A |
| 156 | .I directive |
| 157 | begins with a hash |
| 158 | .RB (` # ') |
| 159 | character. These directives are currently understood: |
| 160 | .TP |
| 161 | .BI "#hash " alg |
| 162 | Subsequent hashes in this file were generated using the algorithm |
| 163 | .IR alg . |
| 164 | .TP |
| 165 | .BI "#encoding " encoding |
| 166 | Subsequent hashes in this file are represented using the named |
| 167 | .IR encoding . |
| 168 | .TP |
| 169 | .BI "#escape" |
| 170 | Filenames in subsequent lines are written using the `escaped' format, |
| 171 | described below. |
| 172 | .PP |
| 173 | A |
| 174 | .I "file line" |
| 175 | consists of a hash, in the requested encoding, followed by a space, a |
| 176 | .IR flag , |
| 177 | and the filename. The |
| 178 | .I flag |
| 179 | is either a star |
| 180 | .RB (` * ') |
| 181 | to indicate that the file should be read in binary mode, or a space. |
| 182 | The rest of the line contains the filename. |
| 183 | .PP |
| 184 | A |
| 185 | .I rubbish |
| 186 | line is one which doesn't look like a directive or a file line. Rubbish |
| 187 | lines are ignored. Hence, you can apply PGP clear-signing to a |
| 188 | .B hashsum |
| 189 | file without preventing it from being read. |
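.PP
The three line types described above can be told apart roughly as in the
following Python sketch. It is not the program's actual parser: the names
.B classify
and
.B HASH_CHARS
are hypothetical, the per-encoding character sets are an assumption, and
unrecognized lines beginning with
.RB ` # '
are simply treated as rubbish here. A real checker would presumably also
confirm that the hash has the right length for the algorithm.

```python
import re

# Characters permitted in a hash for each encoding (an assumption of this sketch).
HASH_CHARS = {
    "hex": r"[0-9a-fA-F]+",
    "base64": r"[A-Za-z0-9+/]+=*",
    "base32": r"[A-Za-z2-7]+=*",
}


def classify(line: str, encoding: str = "hex"):
    """Return a tuple describing one line of a hash list."""
    line = line.rstrip("\n")
    if line.startswith("#"):
        words = line[1:].split(None, 1)
        if len(words) == 2 and words[0] in ("hash", "encoding"):
            return ("directive", words[0], words[1].strip())
        if words == ["escape"]:
            return ("directive", "escape", None)
        return ("rubbish", line)
    # File line: hash, one space, a flag (star or space), then the filename.
    m = re.fullmatch(rf"({HASH_CHARS[encoding]}) ([ *])(.+)", line)
    if m:
        return ("file", m.group(1), m.group(2) == "*", m.group(3))
    return ("rubbish", line)
```

Lines such as the armour headers added by PGP clear-signing fall through
to the final case, which is why a signed hash list can still be read.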
| 190 | .SS "Escaping control characters" |
| 191 | When reading filenames to hash from a list of files or an escaped hash |
| 192 | list, the following rules are obeyed: |
| 193 | .hP \*o |
| 194 | An escaped string cannot contain unescaped, unquoted whitespace |
| 195 | characters. If such a character is found, the string is considered to |
| 196 | have ended. |
| 197 | .hP \*o |
| 198 | A backslash |
| 199 | .RB (` \e ') |
| 200 | escapes the following character. If the character is one of |
| 201 | .RB ` a ', |
| 202 | .RB ` b ', |
| 203 | .RB ` f ', |
| 204 | .RB ` n ', |
| 205 | .RB ` r ', |
| 206 | .RB ` t ', |
| 207 | or |
| 208 | .RB ` v ', |
| 209 | it is replaced by the control character for an audible alert, backspace, |
| 210 | form-feed, newline, carriage return, horizontal tab or vertical tab |
| 211 | respectively; other escaped characters are unchanged, although they lose |
| 212 | any special meaning they might have had. |
| 213 | .hP \*o |
| 214 | A section of text may be quoted by surrounding it with |
| 215 | .BR ' ... ' , |
| 216 | .BR """" ... """" , |
| 217 | or |
| 218 | .BR ` ... ' |
| 219 | pairs. Within a quoted section, whitespace characters may appear |
| 220 | unescaped. The backslash may be used to quote control characters or the |
| 221 | quoting characters as usual. |
| 222 | .hP \*o |
| 223 | A word beginning with a hash |
| 224 | .RB (` # ') |
| 225 | character is considered to begin a |
| 226 | .I comment |
| 227 | which extends to the end of the current line. The hash character may be |
| 228 | escaped as usual. |
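.PP
The rules above can be sketched as a small tokenizer for one line of
input. This is an illustration in Python, not the code used by
.BR hashsum ;
the names
.BR split_words ,
.B ESCAPES
and
.B CLOSERS
are hypothetical, and the treatment of edge cases (a trailing backslash,
an unterminated quote) is an assumption.

```python
# Control characters produced by the single-letter backslash escapes.
ESCAPES = {"a": "\a", "b": "\b", "f": "\f", "n": "\n",
           "r": "\r", "t": "\t", "v": "\v"}

# Each opening quote character and the character which closes it.
CLOSERS = {"'": "'", '"': '"', "`": "'"}


def split_words(line):
    """Split one line into words, honouring escapes, quotes and comments."""
    words, cur, in_word, closer = [], [], False, None
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            # A backslash escapes the following character, even inside quotes.
            i += 1
            cur.append(ESCAPES.get(line[i], line[i]))
            in_word = True
        elif closer is not None:
            # Inside a quoted section: only the closing quote is special.
            if ch == closer:
                closer = None
            else:
                cur.append(ch)
        elif ch in CLOSERS:
            closer = CLOSERS[ch]
            in_word = True
        elif ch.isspace():
            # Unescaped, unquoted whitespace ends the current word.
            if in_word:
                words.append("".join(cur))
                cur, in_word = [], False
        elif ch == "#" and not in_word:
            # A word beginning with '#' starts a comment.
            break
        else:
            cur.append(ch)
            in_word = True
        i += 1
    if in_word:
        words.append("".join(cur))
    return words
```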
| 229 | .SS "Hashing algorithms" |
| 230 | The |
| 231 | .B hashsum |
| 232 | program understands several hashing algorithms: |
| 233 | .TP |
| 234 | .BR md2 |
| 235 | Designed by Ron Rivest in 1989 and described in |
| 236 | RFC 1319, MD2 is a really old and slow hash function. Its security is |
| 237 | suspect too: only its checksum stands between it and collision-finding |
| 238 | attacks. Use of MD2 is not recommended, though it's still used in |
| 239 | various standards. |
| 240 | .TP |
| 241 | .BR md4 " and " md5 |
| 242 | Designed by Ron Rivest in 1990 and 1992 respectively and described in |
| 243 | RFCs 1186, 1320 and 1321, these two early hash functions are efficient |
| 244 | but cryptographically suspect: the MD4 algorithm has been shown not to |
| 245 | be collision-resistant and there are `pseudo-collisions' in MD5. |
| 246 | Despite this, |
| 247 | .B md5 |
| 248 | has been used heavily since its introduction and is still popular. MD4 |
| 249 | is still useful when a fast non-cryptographic hash is wanted. |
| 250 | .TP |
| 251 | .B sha |
| 252 | Designed by the US National Security Agency as part of the Digital |
| 253 | Signature Standard, SHA-1 provides a longer output than |
| 254 | .B md4 |
| 255 | and |
| 256 | .BR md5 , |
| 257 | and is seen as being more secure. |
| 258 | .TP |
| 259 | .BR rmd128 ", " rmd160 ", " rmd256 " and " rmd320 |
| 260 | Designed by Antoon Bosselaers, Hans Dobbertin and Bart Preneel in 1996 |
| 261 | as a replacement for the earlier RIPEMD algorithm, RIPEMD160 provides |
| 262 | the same length output as SHA-1, but has been designed in the open by |
| 263 | experts. RIPEMD128 is a shortened version of RIPEMD160 designed as a |
| 264 | drop-in replacement for MD4, MD5 and the old RIPEMD. The 256 and |
| 265 | 320-bit versions are efficient double-width extensions of the 128 and |
| 266 | 160-bit hashes, although they may not offer any additional security. |
| 267 | .TP |
| 268 | .B tiger |
| 269 | Designed by Ross Anderson and Eli Biham to take advantage of 64-bit |
| 270 | processors, Tiger seems to be an efficient and strong hash function. |
| 271 | It's a relatively new algorithm, however, and should probably be |
| 272 | approached with an open-minded caution. |
| 273 | .TP |
| 274 | .BR sha256 ", " sha384 " and " sha512 |
| 275 | Designed by the US National Security Agency to provide security |
| 276 | commensurate with the Advanced Encryption Standard, these hash functions |
| 277 | provide long outputs. SHA-256 is fairly quick, though the longer |
| 278 | variants are slower on 32-bit hardware since they require 64-bit |
| 279 | arithmetic. They're all very new at the moment, and should be |
| 280 | approached with an open-minded caution. |
| 281 | .PP |
| 282 | The default hashing algorithm is determined by looking at the name by |
| 283 | which the program was invoked, as passed to it in |
| 284 | .BR argv[0] : |
| 285 | if it has the form |
| 286 | .RI ` alg \c |
| 287 | .BR sum ' |
| 288 | where |
| 289 | .I alg |
| 290 | is the name of a hash function, that hash becomes the default. (Hence, |
| 291 | .B hashsum |
| 292 | can be used as a drop-in replacement for |
| 293 | .BR md5sum (1).) |
| 294 | If the program name doesn't match an algorithm, then |
| 295 | .B md5 |
| 296 | is selected for compatibility with files generated by |
| 297 | .BR md5sum (1). |
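.PP
The selection rule just described might look like this in Python. It is a
sketch only: the function name
.B default_algorithm
and the tuple
.B KNOWN
are hypothetical, and the list of names is copied from the algorithms
documented above rather than from the program itself.

```python
import os

# Algorithm names as listed in this manual (an assumption of this sketch).
KNOWN = ("md2", "md4", "md5", "sha", "rmd128", "rmd160", "rmd256",
         "rmd320", "tiger", "sha256", "sha384", "sha512")


def default_algorithm(argv0, known=KNOWN, fallback="md5"):
    """Pick the default hash from a program name of the form ALGsum."""
    name = os.path.basename(argv0)
    if name.endswith("sum"):
        alg = name[:-len("sum")]
        if alg in known:
            return alg
    # The name does not match a known algorithm: fall back to md5.
    return fallback


if __name__ == "__main__":
    for prog in ("hashsum", "/usr/bin/sha256sum", "tigersum"):
        print(prog, "->", default_algorithm(prog))
```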
| 298 | .PP |
| 299 | Note that the same default algorithm is used for both generating new |
| 300 | output files and checking existing ones. If the algorithm is forced by |
| 301 | the |
| 302 | .B \-a |
| 303 | option, |
| 304 | .B hashsum |
| 305 | will emit a |
| 306 | .RB ` #hash ' |
| 307 | directive in its output. |
| 308 | .SH "COMPATIBILITY NOTES" |
| 309 | Once upon a time, there was only the |
| 310 | .BR md5sum (1) |
| 311 | utility. As its name suggested, it calculated MD5 hashes of files. MD5 |
| 312 | was shown to be weak, so the author wrote |
| 313 | .B hashsum |
| 314 | to do the same job with other, hopefully stronger, hash functions. The |
| 315 | original |
| 316 | .B hashsum |
| 317 | program tried hard to be compatible with GNU |
| 318 | .BR md5sum (1), |
| 319 | but the latter has itself changed in incompatible ways since then; |
| 320 | .B hashsum |
| 321 | has intentionally not changed to match. |
| 322 | .PP |
| 323 | The following |
| 324 | .B hashsum |
| 325 | features are not found in the GNU Coreutils hashing utilities. |
| 326 | .hP \*o |
| 327 | Filename escaping (the |
| 328 | .B \-e |
| 329 | option). |
| 330 | .hP \*o |
| 331 | Magic comment lines in hash data to indicate algorithm selection, hash |
| 332 | encoding, and filename escaping. |
| 333 | .hP \*o |
| 334 | Base-64 and Base-32 output. |
| 335 | .PP |
| 336 | Other differences are as follows. |
| 337 | .hP \*o |
| 338 | Originally, if GNU |
| 339 | .B md5sum |
| 340 | was invoked without any filename arguments, it would print only the hash |
| 341 | of its stdin to stdout, which was very convenient for scripts which |
| 342 | manipulate hashes in nontrivial ways. This behaviour was later changed, |
| 343 | and now the GNU Coreutils hashing utilities always print a filename or |
| 344 | .RB ` \- ' |
| 345 | after the hash. The |
| 346 | .B hashsum |
| 347 | program follows the original |
| 348 | .B md5sum |
| 349 | behaviour, and doesn't print a filename if no files were listed on the |
| 350 | command line. |
| 351 | .SH "SEE ALSO" |
| 352 | .BR md5sum (1), |
| 353 | .BR dsig (1), |
| 354 | .BR catsign (1), |
| 355 | .BR catcrypt (1). |
| 356 | .SH "AUTHOR" |
| 357 | Mark Wooding, <mdw@distorted.org.uk> |