.\" -*-nroff-*-
.de hP
.IP
.ft B
\h'-\w'\\$1\ 'u'\\$1\ \c
.ft P
..
.ie t .ds o \(bu
.el .ds o o
.TH hashsum 1 "29 July 2000" "Straylight/Edgeware" "Catacomb cryptographic library"
.SH NAME
hashsum \- compute and verify cryptographic checksums of files
.SH SYNOPSIS
.B hashsum
.RB [ \-f0ecbjpv ]
.RB [ \-a
.IR algorithm ]
.RB [ \-E
.IR encoding ]
.IR files ...
.SH DESCRIPTION
The
.B hashsum
program generates and verifies cryptographic checksums (hashes) of
files.  A number of hashing algorithms are available.
.PP
The
.B hashsum
program's options and output were originally designed to be upwardly
compatible with the GNU
.BR md5sum (1)
program, but the two have diverged somewhat.  See the
.B "COMPATIBILITY NOTES"
section of this manual for details.
.PP
Usually,
.B hashsum
generates checksums of a collection of files, named either on the
command line or in lists read from standard input, and writes their
hashes to standard output using a simple file format.  However, given
the
.B \-c
option, it will read in files in its usual output format and verify
that the named files have the reported hashes.
.SS "Options"
The
.B hashsum
program understands the following options:
.TP
.B "\-h, \-\-help"
Prints a help message to standard output and exits successfully.
.TP
.B "\-V, \-\-version"
Prints the program's version number to standard output and exits
successfully.
.TP
.B "\-u, \-\-usage"
Prints a brief usage summary to standard output and exits successfully.
.TP
.BR "\-l, \-\-list " [ \fIitem ...]
Show lists of hash functions and encodings supported.
.TP
.BI "\-a, \-\-algorithm=" alg
Use the hash algorithm
.IR alg .
If this option is not given, a default hashing algorithm is selected:
see
.B "Hashing algorithms"
below.
.TP
.BI "\-E, \-\-encoding=" encoding
Use the given
.I encoding
to represent hashes in the output.  This is not interoperable with
other programs, but it's handy, e.g., for building sha1 URNs.  The
encodings recognized are
.B hex
(the default),
.B base64
and
.BR base32 .
Type
.B hashsum \-\-list enc
for a list of supported encodings.
.TP
.B "\-f, \-\-files"
Each input file is considered to be a list of filenames which should be
read and hashed.  By default, the filenames are considered to be
whitespace-separated, although control characters can be escaped (see
.B "Escaping control characters"
below).
.TP
.B "\-0, \-\-null"
In conjunction with the
.B \-f
option above, reads null-terminated filenames, as emitted by GNU
.BR find (1)'s
.B \-print0
option, rather than whitespace-delimited filenames.  If the
.B \-c
option is also given, each file named in the list is a list of
filename/hash pairs to be checked.
.TP
.B "\-e, \-\-escape"
Escape control characters (see
.B "Escaping control characters"
below) in filenames when generating output.  Escaped output is not
compatible with
.BR md5sum (1),
but copes better with filenames containing newlines and other strange
control characters.
.TP
.B "\-c, \-\-check"
Check hashes.  Each input file is assumed to be in
.BR hashsum 's
output format.  It is read, and
.B hashsum
will verify that each named file has the correct hash.  Assuming that
the hash list is authentic (e.g., it has been digitally signed, or
obtained via some secure medium), this provides strong assurance that
the files listed have not been tampered with.
.TP
.B "\-j, \-\-junk"
Report files whose hashes have not been checked.  This is most useful
in conjunction with
.RB ` \-c ',
though it's valid without.  The program merely prints warnings about
junk files when computing hashes, but will exit nonzero if any are
found when checking them.
.TP
.B "\-b, \-\-binary"
Assume that the files to be hashed are binary files.  This doesn't make
any difference on Unix systems, although it might on other platforms
which draw a distinction.
.TP
.B "\-p, \-\-progress"
Display a progress indicator while hashing large files.  The progress
indicator is written to standard error.
.TP
.B "\-v, \-\-verbose"
In conjunction with the
.B \-c
option above, be verbose when checking files.
.PP
If no filenames are given on the command line, standard input is read.
Standard input does not have a filename.
.SS "Output format"
There are three types of line in
.BR hashsum 's
output format:
.IR directives ,
.IR "file lines" ,
and
.IR rubbish .
.PP
A
.I directive
begins with a hash
.RB (` # ')
character.  These directives are currently understood:
.TP
.BI "#hash " alg
Subsequent hashes in this file were generated using the algorithm
.IR alg .
.TP
.BI "#encoding " encoding
Subsequent hashes in this file are represented using the named
.IR encoding .
.TP
.BI "#escape"
Filenames in subsequent lines are written using the `escaped' format,
described below.
.PP
A
.I "file line"
consists of a hash, in the requested encoding, followed by a space, a
.IR flag ,
and the filename.  The
.I flag
is either a star
.RB (` * ')
to indicate that the file should be read in binary mode, or a space.
The rest of the line contains the filename.
.PP
A
.I rubbish
line is one which doesn't look like a directive or a file line.
Rubbish lines are ignored.  Hence, you can apply PGP clear-signing to a
.B hashsum
file without preventing it from being read.
.SS "Escaping control characters"
When reading filenames to hash from a list of files or an escaped hash
list, the following rules are obeyed:
.hP \*o
An escaped string cannot contain unescaped, unquoted whitespace
characters.  If such a character is found, the string is considered to
have ended.
.hP \*o
A backslash
.RB (` \e ')
escapes the following character.  If the character is one of
.RB ` a ',
.RB ` b ',
.RB ` f ',
.RB ` n ',
.RB ` r ',
.RB ` t ',
or
.RB ` v ',
it is replaced by the control character for an audible alert,
backspace, form-feed, newline, carriage return, horizontal tab or
vertical tab respectively; other escaped characters are unchanged,
although they lose any special meaning they might have had.
.hP \*o
A section of text may be quoted by surrounding it with
.BR ' ... ' ,
.BR """" ... """" ,
or
.BR ` ... '
pairs.  Within a quoted section, whitespace characters may appear
unescaped.  The backslash may be used to quote control characters or
the quoting characters as usual.
.hP \*o
A word beginning with a hash
.RB (` # ')
character is considered to begin a
.I comment
which extends to the end of the current line.  The hash character may
be escaped as usual.
.SS "Hashing algorithms"
The
.B hashsum
program understands several hashing algorithms:
.TP
.BR md2
Designed by Ron Rivest in 1989 and described in RFC 1319, MD2 is a
really old and slow hash function.  Its security is suspect too: only
its checksum stands between it and collision-finding attacks.  Use of
MD2 is not recommended, though it's still used in various standards.
.TP
.BR md4 " and " md5
Designed by Ron Rivest in 1990 and 1992 respectively and described in
RFCs 1186, 1320 and 1321, these two early hash functions are efficient
but cryptographically suspect: the MD4 algorithm has been shown not to
be collision-resistant and there are `pseudo-collisions' in MD5.
Despite this,
.B md5
has been used heavily since its introduction and is still popular.  MD4
is still useful when a fast non-cryptographic hash is wanted.
.TP
.B sha
Designed by the US National Security Agency as part of the Digital
Signature Standard, SHA-1 provides a longer output than
.B md4
and
.BR md5 ,
and is seen as being more secure.
.TP
.BR rmd128 ", " rmd160 ", " rmd256 " and " rmd320
Designed by Antoon Bosselaers, Hans Dobbertin and Bart Preneel in 1996
as a replacement for the earlier RIPEMD algorithm, RIPEMD160 provides
the same length output as SHA-1, but has been designed in the open by
experts.  RIPEMD128 is a shortened version of RIPEMD160 designed as a
drop-in replacement for MD4, MD5 and the old RIPEMD.
The 256- and 320-bit versions are efficient double-width extensions of
the 128- and 160-bit hashes, although they may not offer any additional
security.
.TP
.B tiger
Designed by Ross Anderson and Eli Biham to take advantage of 64-bit
processors, Tiger seems to be an efficient and strong hash function.
It's a relatively new algorithm, however, and should probably be
approached with open-minded caution.
.TP
.BR sha256 ", " sha384 " and " sha512
Designed by the US National Security Agency to provide security
commensurate with the Advanced Encryption Standard, these hash
functions provide long outputs.  SHA-256 is fairly quick, though the
longer variants are slower on 32-bit hardware since they require 64-bit
arithmetic.  They're all very new at the moment, and should be
approached with open-minded caution.
.PP
The default hashing algorithm is determined by looking at the name by
which the program was invoked, as passed to it in
.BR argv[0] :
if it has the form
.RI ` alg \c
.BR sum '
where
.I alg
is the name of a hash function, that hash becomes the default.  (Hence,
.B hashsum
can be used as a drop-in replacement for
.BR md5sum (1).)
If the program name doesn't match an algorithm, then
.B md5
is selected for compatibility with files generated by
.BR md5sum (1).
.PP
Note that the same default algorithm is used for both generating new
output files and checking existing ones.  If the algorithm is forced by
the
.B \-a
option,
.B hashsum
will emit a
.RB ` #hash '
directive in its output.
.SH "COMPATIBILITY NOTES"
Once upon a time, there was only the
.BR md5sum (1)
utility.  As its name suggested, it calculated MD5 hashes of files.
MD5 was shown to be weak, so the author wrote
.B hashsum
to do the same job with other, hopefully stronger, hash functions.  The
original
.B hashsum
program tried hard to be compatible with GNU
.BR md5sum (1),
but the latter has itself changed in incompatible ways since then;
.B hashsum
has intentionally not changed to match.
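.PP
The program-name rule described under
.B "Hashing algorithms"
is what allows
.B hashsum
to stand in for
.BR md5sum (1).
As a rough illustration only, the following Python sketch mimics that
rule.  It is not taken from the
.B hashsum
source: the names
.B default_algorithm
and
.B KNOWN_ALGORITHMS
are hypothetical, and stripping the directory part of the program name
is an assumption.
.PP
.nf
```python
import os

# Hypothetical table: the algorithm names listed in this manual.
KNOWN_ALGORITHMS = {
    "md2", "md4", "md5", "sha",
    "rmd128", "rmd160", "rmd256", "rmd320",
    "tiger", "sha256", "sha384", "sha512",
}


def default_algorithm(argv0, known=KNOWN_ALGORITHMS):
    """Pick the default hash implied by the program name argv0."""
    # Assumption: only the final path component is examined.
    name = os.path.basename(argv0)
    # A name of the form <alg>sum selects <alg> if it is a known hash.
    if name.endswith("sum"):
        alg = name[:-3]
        if alg in known:
            return alg
    # Otherwise fall back to md5, as this manual describes.
    return "md5"
```
.fi
.PP
Under these assumptions, a program name such as
.B sha256sum
selects
.BR sha256 ,
while
.B hashsum
itself falls back to
.BR md5 .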
.PP
The following
.B hashsum
features are not found in the GNU Coreutils hashing utilities.
.hP \*o
Filename escaping (the
.B \-e
option).
.hP \*o
Magic comment lines in hash data to indicate algorithm selection, hash
encoding, and filename escaping.
.hP \*o
Base-64 and Base-32 output.
.PP
Other differences are as follows.
.hP \*o
Originally, if GNU
.B md5sum
was invoked without any filename arguments, it would print only the
hash of its stdin to stdout, which was very convenient for scripts
which manipulate hashes in nontrivial ways.  This behaviour was later
changed, and now the GNU Coreutils hashing utilities always print a
filename or
.RB ` \- '
after the hash.  The
.B hashsum
program follows the original
.B md5sum
behaviour, and doesn't print a filename if no files were listed on the
command line.
.SH "SEE ALSO"
.BR md5sum (1),
.BR dsig (1),
.BR catsign (1),
.BR catcrypt (1).
.SH "AUTHOR"
Mark Wooding,