.\" -*-nroff-*-
.de hP
.IP
.ft B
\h'-\w'\\$1\ 'u'\\$1\ \c
.ft P
..
.ie t .ds o \(bu
.el .ds o o
.TH hashsum 1 "29 July 2000" "Straylight/Edgeware" "Catacomb cryptographic library"
.SH NAME
hashsum \- compute and verify cryptographic checksums of files
.SH SYNOPSIS
.B hashsum
.RB [ \-f0ecbv ]
.RB [ \-a
.IR algorithm ]
.RB [ \-E
.IR encoding ]
.IR files ...
.SH DESCRIPTION
The
.B hashsum
program generates and verifies cryptographic checksums (hashes) of
files. A number of hashing algorithms are available.
.PP
The
.B hashsum
program's options and output are designed to be upwardly compatible with
the GNU
.BR md5sum (1)
program.
.PP
Usually,
.B hashsum
generates checksums of a collection of files named either on the command
line or read from standard input, and writes their hashes to standard
output using a simple file format. However, given the
.B \-c
option, it will read in files in its usual output format and verify that
the named files have the reported hashes.
.SS "Options"
The
.B hashsum
program understands the following options:
.TP
.B "\-h, \-\-help"
Prints a help message to standard output and exits successfully.
.TP
.B "\-V, \-\-version"
Prints the program's version number to standard output and exits
successfully.
.TP
.B "\-u, \-\-usage"
Prints a brief usage summary to standard output and exits successfully.
.TP
.BR "\-l, \-\-list " [ \fIitem ...]
Show lists of hash functions and encodings supported.
.TP
.BI "\-a, \-\-algorithm=" alg
Use the hash algorithm
.IR alg .
If this option is not given, a default hashing algorithm is selected:
see
.B "Hashing algorithms"
below.
.TP
.BI "\-E, \-\-encoding=" encoding
Use the given
.I encoding
to represent hashes in the output. This is not interoperable with other
programs, but it's handy, e.g., for building sha1 URNs. The encodings
recognized are
.B hex
(the default),
.B base64
and
.BR base32 .
Type
.B hashsum \-\-list enc
for a list of supported encodings.
.TP
.B "\-f, \-\-files"
Each input file is considered to be a list of filenames which should be
read and hashed. By default, the filenames are considered to be
whitespace-separated, although control characters can be escaped (see
.B "Escaping control characters"
below).
.TP
.B "\-0, \-\-null"
In conjunction with the
.B \-f
option above, reads null-terminated filenames, as emitted by GNU
.BR find (1)'s
.B \-print0
option, rather than whitespace-delimited filenames. If the
.B \-c
option is also given, each file named in the list is a list of
filename/hash pairs to be checked.
.TP
.B "\-e, \-\-escape"
Escape control characters (see
.B "Escaping control characters"
below) in filenames when generating output. Escaped
output is not compatible with
.BR md5sum (1),
but copes better with files containing newlines and other strange
control characters.
.TP
.B "\-c, \-\-check"
Check hashes. Each input file is assumed to be in
.BR hashsum 's
output format. It is read, and
.B hashsum
will verify that each named file has the correct hash. Assuming that
the hash list is authentic (e.g., it has been digitally signed, or
obtained via some secure medium), this provides strong assurance that
the files listed have not been tampered with.
.TP
.B "\-b, \-\-binary"
Assume that the files to be hashed are binary files. This doesn't make
any difference on Unix systems, although it might on other platforms
which draw a distinction.
.TP
.B "\-v, \-\-verbose"
In conjunction with the
.B \-c
option above, be verbose when checking files.
.PP
If no filenames are given on the command line, standard input is read.
Standard input does not have a filename.
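.PP
As a rough illustration of what generating and checking involve, the
following Python sketch hashes a file, formats one output line, and
re-checks it. It is not the program's implementation: the function
names are hypothetical, the algorithm names are those of Python's
hashlib module rather than the names listed under
.BR "Hashing algorithms" ,
and the default hex encoding with the binary flag is assumed.
.PP
.nf
.eo
```python
import hashlib


def file_digest(path, alg="md5"):
    """Hash a file in chunks and return the hex digest."""
    h = hashlib.new(alg)  # alg is a hashlib name (an assumption)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def file_line(path, alg="md5"):
    """Format one line: hash, a space, the binary flag, the filename."""
    return "%s *%s" % (file_digest(path, alg), path)


def check_line(line, alg="md5"):
    """Return True if the file named on the line still has that hash."""
    digest, rest = line.rstrip("\n").split(" ", 1)
    name = rest[1:]  # skip the flag character
    return file_digest(name, alg) == digest.lower()
```
.ec
.fi
.PP
A line produced by the hypothetical file_line() is accepted by
check_line() until the file's contents change.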
.SS "Output format"
There are three types of line in
.BR hashsum 's
output format:
.IR directives ,
.IR "file lines" ,
and
.IR rubbish .
.PP
A
.I directive
begins with a hash
.RB (` # ')
character. Three directives are currently understood:
.TP
.BI "#hash " alg
Subsequent hashes in this file were generated using the algorithm
.IR alg .
.TP
.BI "#encoding " encoding
Subsequent hashes in this file are represented using the named
.IR encoding .
.TP
.BI "#escape"
Filenames in subsequent lines are written using the `escaped' format,
described below.
.PP
A
.I "file line"
consists of a hash, in the requested encoding, followed by a space, a
.IR flag ,
and the filename. The
.I flag
is either a star
.RB (` * ')
to indicate that the file should be read in binary mode, or a space.
The rest of the line contains the filename.
.PP
A
.I rubbish
line is one which doesn't look like a directive or a file line. Rubbish
lines are ignored. Hence, you can apply PGP clear-signing to a
.B hashsum
file without preventing it from being read.
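.PP
The three line types can be told apart roughly as in the Python sketch
below. This is an illustration only, not the program's parser: the
function name is hypothetical, hashes are assumed to use the default
hex encoding, and treating an unrecognized
.RB ` # '
line as rubbish is an assumption.
.PP
.nf
.eo
```python
import re

# Assumption: hex-encoded hashes; other encodings need another pattern.
FILE_LINE = re.compile(r"^([0-9a-fA-F]+) ([ *])(.+)$")
DIRECTIVES = ("hash", "encoding", "escape")


def classify(line):
    """Return a tuple describing one line of a hash list."""
    line = line.rstrip("\n")
    if line.startswith("#"):
        words = line[1:].split(None, 1)
        if words and words[0] in DIRECTIVES:
            arg = words[1] if len(words) > 1 else None
            return ("directive", words[0], arg)
        return ("rubbish", line)
    m = FILE_LINE.match(line)
    if m:
        digest, flag, name = m.groups()
        # A star flag means the file is to be read in binary mode.
        return ("file", digest, flag == "*", name)
    return ("rubbish", line)
```
.ec
.fi
.PP
Under these assumptions the armour lines added by PGP clear-signing
match neither pattern and are classified as rubbish.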
.SS "Escaping control characters"
When reading filenames to hash from a list of files or an escaped hash
list, the following rules are obeyed:
.hP \*o
An escaped string cannot contain unescaped, unquoted whitespace
characters. If such a character is found, the string is considered to
have ended.
.hP \*o
A backslash
.RB (` \e ')
escapes the following character. If the character is one of
.RB ` a ',
.RB ` b ',
.RB ` f ',
.RB ` n ',
.RB ` r ',
.RB ` t ',
or
.RB ` v ',
it is replaced by the control character for an audible alert, backspace,
form-feed, newline, carriage return, horizontal tab or vertical tab
respectively; other escaped characters are unchanged, although they lose
any special meaning they might have had.
.hP \*o
A section of text may be quoted by surrounding it by
.BR ' ... ' ,
.BR """" ... """" ,
or
.BR ` ... '
pairs. Within a quoted section, whitespace characters may appear
unescaped. The backslash may be used to quote control characters or the
quoting characters as usual.
.hP \*o
A word beginning with a hash
.RB (` # ')
character is considered to begin a
.I comment
which extends to the end of the current line. The hash character may be
escaped as usual.
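.PP
The rules above can be sketched in Python as a small word splitter.
This is an illustrative reading of the rules, not the program's own
code: the function name is hypothetical, and the handling of a trailing
backslash or an unterminated quote is an assumption.
.PP
.nf
.eo
```python
CONTROL = {"a": "\a", "b": "\b", "f": "\f", "n": "\n",
           "r": "\r", "t": "\t", "v": "\v"}
# Opening quote character -> matching closing quote character.
QUOTES = {"'": "'", '"': '"', "`": "'"}


def split_words(line):
    """Split one line into words following the rules listed above."""
    words, cur, in_word, closer = [], [], False, None
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            # Backslash: take the next character, mapping control letters.
            i += 1
            cur.append(CONTROL.get(line[i], line[i]))
            in_word = True
        elif closer is not None:
            # Inside quotes: whitespace and '#' are ordinary characters.
            if ch == closer:
                closer = None
            else:
                cur.append(ch)
        elif ch in QUOTES:
            closer = QUOTES[ch]
            in_word = True
        elif ch.isspace():
            # Unquoted, unescaped whitespace ends the current word.
            if in_word:
                words.append("".join(cur))
                cur, in_word = [], False
        elif ch == "#" and not in_word:
            break  # comment runs to the end of the line
        else:
            cur.append(ch)
            in_word = True
        i += 1
    if in_word:
        words.append("".join(cur))
    return words
```
.ec
.fi
.PP
In this sketch a hash character starts a comment only at the beginning
of a word, so an escaped or embedded hash is kept as part of the
filename.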
.SS "Hashing algorithms"
The
.B hashsum
program understands several hashing algorithms:
.TP
.BR md2
Designed by Ron Rivest in 1989 and described in
RFC 1319, MD2 is a really old and slow hash function. Its security is
suspect too: only its checksum stands between it and collision-finding
attacks. Use of MD2 is not recommended, though it's still used in
various standards.
.TP
.BR md4 " and " md5
Designed by Ron Rivest in 1990 and 1992 respectively and described in
RFCs 1186, 1320 and 1321, these two early hash functions are efficient
but cryptographically suspect: the MD4 algorithm has been shown not to
be collision-resistant and there are `pseudo-collisions' in MD5.
Despite this,
.B md5
has been used heavily since its introduction and is still popular. MD4
is still useful when a fast non-cryptographic hash is wanted.
.TP
.B sha
Designed by the US National Security Agency as part of the Digital
Signature Standard, SHA-1 provides a longer output than
.B md4
and
.BR md5 ,
and is seen as being more secure.
.TP
.BR rmd128 ", " rmd160 ", " rmd256 " and " rmd320
Designed by Antoon Bosselaers, Hans Dobbertin and Bart Preneel in 1996
as a replacement for the earlier RIPEMD algorithm, RIPEMD160 provides
the same length output as SHA-1, but has been designed in the open by
experts. RIPEMD128 is a shortened version of RIPEMD160 designed as a
drop-in replacement for MD4, MD5 and the old RIPEMD. The 256 and
320-bit versions are efficient double-width extensions of the 128 and
160-bit hashes, although they may not offer any additional security.
.TP
.B tiger
Designed by Ross Anderson and Eli Biham to take advantage of 64-bit
processors, Tiger seems to be an efficient and strong hash function.
It's a relatively new algorithm, however, and should probably be
approached with an open-minded caution.
.TP
.BR sha256 ", " sha384 " and " sha512
Designed by the US National Security Agency to provide security
commensurate with the Advanced Encryption Standard, these hash functions
provide long outputs. SHA-256 is fairly quick, though the longer
variants are slower on 32-bit hardware since they require 64-bit
arithmetic. They're all very new at the moment, and should be
approached with an open-minded caution.
.PP
The default hashing algorithm is determined by looking at the name by
which the program was invoked, as passed to it in
.BR argv[0] :
if it has the form
.RI ` alg \c
.BR sum '
where
.I alg
is the name of a hash function, that hash becomes the default. (Hence,
.B hashsum
can be used as a drop-in replacement for
.BR md5sum (1).)
If the program name doesn't match an algorithm, then
.B md5
is selected for compatibility with files generated by
.BR md5sum (1).
.PP
Note that the same default algorithm is used for both generating new
output files and checking existing ones. If the algorithm is forced by
the
.B \-a
option,
.B hashsum
will emit a
.RB ` #hash '
directive in its output.
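.PP
The selection of the default algorithm from the program name can be
sketched in Python as follows. The function name is hypothetical, the
set of names is taken from the list above, and stripping any leading
directory from the name is an assumption about how the name is examined.
.PP
.nf
.eo
```python
import os

# Algorithm names as listed under "Hashing algorithms".
ALGORITHMS = {"md2", "md4", "md5", "sha", "rmd128", "rmd160",
              "rmd256", "rmd320", "tiger", "sha256", "sha384", "sha512"}


def default_algorithm(argv0):
    """Pick the default hash from a program name of the form ALGsum."""
    name = os.path.basename(argv0)  # assumption: directory is ignored
    if name.endswith("sum") and name[:-3] in ALGORITHMS:
        return name[:-3]
    return "md5"  # fallback, for compatibility with md5sum
```
.ec
.fi
.PP
For instance, a link named sha256sum would select sha256 in this
sketch, while the plain name hashsum falls back to md5.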
.SH "SEE ALSO"
.BR md5sum (1).
.SH "AUTHOR"
Mark Wooding, <mdw@nsict.org>