| 1 | %%% -*-latex-*- |
| 2 | %%% |
| 3 | %%% $Id: rand.tex,v 1.3 1999/10/15 21:05:56 mdw Exp $ |
| 4 | %%% |
| 5 | %%% Description of Catacomb's random number generator |
| 6 | %%% |
| 7 | %%% (c) 1999 Straylight/Edgeware |
| 8 | %%% |
| 9 | |
| 10 | %%%----- Licensing notice --------------------------------------------------- |
| 11 | %%% |
| 12 | %%% This file is part of Catacomb. |
| 13 | %%% |
| 14 | %%% Catacomb is free software; you can redistribute it and/or modify |
| 15 | %%% it under the terms of the GNU Library General Public License as |
| 16 | %%% published by the Free Software Foundation; either version 2 of the |
| 17 | %%% License, or (at your option) any later version. |
| 18 | %%% |
| 19 | %%% Catacomb is distributed in the hope that it will be useful, |
| 20 | %%% but WITHOUT ANY WARRANTY; without even the implied warranty of |
| 21 | %%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
| 22 | %%% GNU Library General Public License for more details. |
| 23 | %%% |
| 24 | %%% You should have received a copy of the GNU Library General Public |
| 25 | %%% License along with Catacomb; if not, write to the Free |
| 26 | %%% Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, |
| 27 | %%% MA 02111-1307, USA. |
| 28 | |
| 29 | %%%----- Revision history --------------------------------------------------- |
| 30 | %%% |
| 31 | %%% $Log: rand.tex,v $ |
| 32 | %%% Revision 1.3 1999/10/15 21:05:56 mdw |
| 33 | %%% Add a little more explanatory text for the pool and buffer sizes. |
| 34 | %%% |
| 35 | %%% Revision 1.2 1999/10/12 21:00:34 mdw |
| 36 | %%% Updated. Almost finished, in fact. ;-) |
| 37 | %%% |
| 38 | %%% Revision 1.1 1999/09/03 08:41:13 mdw |
| 39 | %%% Initial import. |
| 40 | %%% |
| 41 | |
| 42 | %%%----- Header ------------------------------------------------------------- |
| 43 | |
| 44 | \documentclass[a4paper, article, 10pt, notitlepage, numbering]{strayman} |
| 45 | \usepackage[palatino, helvetica, courier, maths=cmr]{mdwfonts} |
| 46 | \usepackage{mdwtab, mathenv} |
| 47 | \usepackage[T1]{fontenc} |
| 48 | \usepackage{cmtt, url} |
| 49 | \usepackage[tpic, all]{xy} |
| 50 | \usepackage{mathbbol} |
| 51 | % \usepackage{crypto} |
| 52 | |
| 53 | \def\mdw{{\normalfont[{\bfseries\itshape mdw}]}} |
| 54 | \urlstyle{tt} |
| 55 | \def\email{\begingroup\urlstyle{rm}\Url} |
| 56 | \urldef\myemail\email{mdw@nsict.org} |
| 57 | \def\Z{\mathbb{Z}} |
| 58 | \let\assign\leftarrow |
| 59 | \let\xor\oplus |
| 60 | \let\bigxor\bigoplus |
| 61 | \def\cat{\mathbin{\|}} |
| 62 | |
| 63 | \title{The Catacomb random number generator} |
| 64 | \author{Mark Wooding, \myemail} |
| 65 | |
| 66 | %%%----- The main document -------------------------------------------------- |
| 67 | |
| 68 | \begin{document} |
| 69 | |
| 70 | \maketitle |
| 71 | |
| 72 | \begin{abstract} |
| 73 | The author describes the random number generator used in the |
| 74 | Straylight/\-Edgeware `Catacomb' library. While the generator is |
| 75 | superficially similar to (for example) the Linux and OpenBSD random number |
| 76 | generators, it introduces a number of its own innovations which improve |
| 77 | both security and performance. |
| 78 | |
| 79 | The Catacomb generator uses an optional secret key, which can provide |
| 80 | additional security against forward state compromise extension. It uses a |
| 81 | catastrophic reseeding operation to prevent a compromise yielding |
| 82 | information about past generator states. This operation works on |
| 83 | arbitrary-sized blocks of data, so the generator's output buffer can be |
| 84 | large. This minimizes the effect of the reseeding overhead. |
| 85 | \end{abstract} |
| 86 | |
| 87 | \tableofcontents |
| 88 | |
| 89 | |
| 90 | \section{The one-way transformation} |
| 91 | \label{sec:oneway} |
| 92 | |
| 93 | The most novel part of the generator\footnote{I believe this construction to |
| 94 | be novel. If I'm wrong, let me know.} is the one-way transformation which is |
| 95 | used to allow pooled input data to affect the output buffer. |
| 96 | |
| 97 | Let $H$ be some collision-resistant hash function, and let $E_k$ be a |
| 98 | symmetric cipher with key $k$. Then I can define the one-way transformation |
| 99 | $T$ by |
| 100 | \[ T(x) = E_{H(x)}(x) \] |
| 101 | |
| 102 | I believe, although formal proof seems difficult, that an adversary in |
| 103 | posession of $T(x)$ and a portion of the original $x$ cannot reconstruct the |
| 104 | remainder of $x$ without breaking one of the cryptographic primitives (which |
| 105 | I assume is `difficult') or performing an exhaustive search of one of: the |
| 106 | space of the unknown portion of $x$, the range of the hash function $H$, or |
| 107 | the keyspace of the cipher $E$. |
| 108 | |
| 109 | A similar feat of cryptanalysis or exhaustive search seems necessary to work |
| 110 | in a forwards direction: given partial knowledge of both $x$ and $T(x)$, the |
| 111 | adversary cannot work out the remainder of either without trying every |
| 112 | possibility for one or the other unknown portions, or working through the |
| 113 | hash- or keyspace. |
| 114 | |
| 115 | A keyed version of $T$ may be defined, given a keyed hash (or MAC) $H_k$: |
| 116 | \[ T_k(x) = E_{H_k(x)}(x) \] |
| 117 | If this is done, the adversary cannot work forwards even with \emph{complete} |
| 118 | knowledge of $B$, or performing one of the obvious exhaustive searches. |
| 119 | |
| 120 | |
| 121 | \section{Abstract description of the generator} |
| 122 | |
| 123 | The generator is divided into two parts: an \emph{input pool} which |
| 124 | accumulates random input data from the environment, and an \emph{output |
| 125 | buffer} which contains data to be passed to clients of the generator on |
| 126 | request. |
| 127 | |
| 128 | \subsection{The input pool and mixing function} |
| 129 | |
| 130 | New information is contributed to the generator by mixing it with the input |
| 131 | pool, using a mixing function derived from the Linux random number generator |
| 132 | \cite{linux:devrandom}. The mixing function views the input pool as eight |
| 133 | parallel shift registers. Input data is added one octet at a time. Each bit |
| 134 | of an input octet is mixed with a different shift register. |
| 135 | |
| 136 | Formally, let $I$ be the input pool, with size $N_I$ bytes; let $P(x) = a_0 + |
| 137 | a_1 x + a_2 x^2 + \cdots + a_{N_I} x^{N_I}$ be a primitive polynomial in |
| 138 | $\mathrm{GF}(2^{N_I})$ with degree $N_I$; let $i$ be an integer such that $0 |
| 139 | \le i < N_I$, and $r$ be an integer such that $0 \le r < 8$; and let $x$ be |
| 140 | an input byte. The result of mixing $x$ with the pool $I$ is calculated as |
| 141 | follows: |
| 142 | \begin{eqlines*} |
| 143 | \begin{spliteqn*} |
| 144 | I'[8j + b] = |
| 145 | \begin{cases} |
| 146 | x\bigl[(r + b) \bmod 8\bigr] \xor |
| 147 | \bigxor_{0 \le k < N_I} |
| 148 | a_k I\bigl[8\bigl((j + k) \bmod N_I\bigr) + b\bigr] & if $i = j$ \\ |
| 149 | I[j + b] & otherwise |
| 150 | \end{cases} \\ |
| 151 | \textrm{for all integers $j$ and $b$ where $0 \le j < N_I$ and |
| 152 | $0 \le b < 8$} |
| 153 | \end{spliteqn*} |
| 154 | \\ |
| 155 | I \assign I' \qquad |
| 156 | i \assign (i + 1) \bmod N_I \qquad |
| 157 | r \assign (r + 5) \bmod 8 |
| 158 | \end{eqlines*} |
| 159 | Initially, $i$ and $r$ are both zero. The use of 8-bit bytes above is |
| 160 | arbitrary but convenient for modern computers. |
| 161 | |
| 162 | The mixing function isn't intended to be cryptographically strong. Its |
| 163 | purpose is just to hold data without letting too much of the randomness get |
| 164 | away. |
| 165 | |
| 166 | \subsection{The output buffer} |
| 167 | |
| 168 | Newly added data doesn't affect the output buffer until a `gating' operation |
| 169 | is performed. This uses the one-way transformation described earlier over |
| 170 | the entire generator state. |
| 171 | |
| 172 | Data requested by clients of the generator is read from the output buffer |
| 173 | $O$. Initially the buffer contains zeroes. The output buffer is large |
| 174 | enough for $N_O$ bits. |
| 175 | |
| 176 | When the generator estimates that there's enough entropy in the input pool, a |
| 177 | \emph{gating} operation is performed, using the one-way function described in |
| 178 | section~\ref{sec:oneway}. Hash both the input pool and output buffer, and |
| 179 | then encrypt both using the hash as the key: |
| 180 | \begin{eqlines*} |
| 181 | h = H(I \cat O) \\ |
| 182 | I \assign E_h(I) \qquad O \assign E_h(O) |
| 183 | \end{eqlines*} |
| 184 | If the output buffer is exhausted before the next gating operation, it is |
| 185 | \emph{stretched} using the one-way function: $O \assign E_{H(O)}(O)$. |
| 186 | |
| 187 | \subsection{Other tweaks} |
| 188 | |
| 189 | The first $N_S$ bits of this buffer take part in the output transformation |
| 190 | but are never actually output. They're there to make predicting further |
| 191 | output from the generator difficult. |
| 192 | |
| 193 | Also, optionally, the one-way functions can be keyed. This does, of course, |
| 194 | beg the question as to where the key comes from. This might be one of those |
| 195 | things best done the old-fashioned way with a bunch of coins or dice or |
| 196 | something. |
| 197 | |
| 198 | |
| 199 | \section{The actual implementation} |
| 200 | |
| 201 | The Catacomb implementation of the generator uses the following parameters: |
| 202 | \begin{itemize} |
| 203 | \item The hash function used in the one-way transformation is RIPEMD-160 |
| 204 | \cite{rmd160}; the block cipher is Blowfish, using a 160-bit key. |
| 205 | \item The input pool size $N_I$ is 128 bytes. The output buffer size $N_O$ |
| 206 | is 512 bytes. The size $N_S$ of the secret part of the output buffer |
| 207 | is 160 bits (20 bytes). |
| 208 | \item The polynomial $P(x)$ used for mixing in new input is |
| 209 | $1 + x + x^2 + x^7 + x^{128}$. |
| 210 | \end{itemize} |
| 211 | The hash and block cipher are well-known and respected cryptographic |
| 212 | primitives. |
| 213 | |
| 214 | The input pool is rater larger than it strictly needs to be to contain |
| 215 | `enough' entropy to bring the generator up to the strength of its |
| 216 | cryptographic primitives. The pool is large to reduce the effect of |
| 217 | asymptotic behaviour in the amount of entropy in the pool. |
| 218 | |
| 219 | The output buffer is large simply to improve performance: Blowfish has a |
| 220 | heavy key schedule, so it pays to perform fewer rekeyings per byte of data. |
| 221 | The precise size of 512 bytes was chosen empirically as being about where the |
| 222 | performance improvement stops being linear with the buffer size on my |
| 223 | machine. |
| 224 | |
| 225 | \begin{thebibliography}{99} |
| 226 | |
| 227 | \bibitem{cp:rand} |
| 228 | J.~Kelsey, B.~Schneier, D.~Wagner, and C.~Hall, ``Cryptographic Attacks on |
| 229 | Pseudorandom Number Generators'', \emph{Fast Software Encryption, Fifth |
| 230 | International Workshop Proceedings (March 1998)}, Springer-Verlag, 1998, |
| 231 | pp. 168--188, \url{http://www.counterpane.com/pseudorandom_number.html} |
| 232 | |
| 233 | \bibitem{linux:devrandom} |
| 234 | T.~Ts'o, ``A string random number generator'', Linux sources, |
| 235 | \path{drivers/char/random.c}. |
| 236 | |
| 237 | \bibitem{mdw:devrandom} |
| 238 | M.~Wooding, ``Linux \path{/dev/random} generator security'', Usenet article |
| 239 | posted to \mtt{sci.crypt}, July 1998. |
| 240 | |
| 241 | \end{thebibliography} |
| 242 | |
| 243 | %%%----- That's all, folks -------------------------------------------------- |
| 244 | |
| 245 | \end{document} |