Major memory management overhaul. Added arena support. Use the secure
[u/mdw/catacomb] / papers / rand.tex
1 %%% -*-latex-*-
2 %%%
3 %%% $Id: rand.tex,v 1.3 1999/10/15 21:05:56 mdw Exp $
4 %%%
5 %%% Description of Catacomb's random number generator
6 %%%
7 %%% (c) 1999 Straylight/Edgeware
8 %%%
9
10 %%%----- Licensing notice ---------------------------------------------------
11 %%%
12 %%% This file is part of Catacomb.
13 %%%
14 %%% Catacomb is free software; you can redistribute it and/or modify
15 %%% it under the terms of the GNU Library General Public License as
16 %%% published by the Free Software Foundation; either version 2 of the
17 %%% License, or (at your option) any later version.
18 %%%
19 %%% Catacomb is distributed in the hope that it will be useful,
20 %%% but WITHOUT ANY WARRANTY; without even the implied warranty of
21 %%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
22 %%% GNU Library General Public License for more details.
23 %%%
24 %%% You should have received a copy of the GNU Library General Public
25 %%% License along with Catacomb; if not, write to the Free
26 %%% Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
27 %%% MA 02111-1307, USA.
28
29 %%%----- Revision history ---------------------------------------------------
30 %%%
31 %%% $Log: rand.tex,v $
32 %%% Revision 1.3 1999/10/15 21:05:56 mdw
33 %%% Add a little more explanatory text for the pool and buffer sizes.
34 %%%
35 %%% Revision 1.2 1999/10/12 21:00:34 mdw
36 %%% Updated. Almost finished, in fact. ;-)
37 %%%
38 %%% Revision 1.1 1999/09/03 08:41:13 mdw
39 %%% Initial import.
40 %%%
41
42 %%%----- Header -------------------------------------------------------------
43
44 \documentclass[a4paper, article, 10pt, notitlepage, numbering]{strayman}
45 \usepackage[palatino, helvetica, courier, maths=cmr]{mdwfonts}
46 \usepackage{mdwtab, mathenv}
47 \usepackage[T1]{fontenc}
48 \usepackage{cmtt, url}
49 \usepackage[tpic, all]{xy}
50 \usepackage{mathbbol}
51 % \usepackage{crypto}
52
53 \def\mdw{{\normalfont[{\bfseries\itshape mdw}]}}
54 \urlstyle{tt}
55 \def\email{\begingroup\urlstyle{rm}\Url}
56 \urldef\myemail\email{mdw@nsict.org}
57 \def\Z{\mathbb{Z}}
58 \let\assign\leftarrow
59 \let\xor\oplus
60 \let\bigxor\bigoplus
61 \def\cat{\mathbin{\|}}
62
63 \title{The Catacomb random number generator}
64 \author{Mark Wooding, \myemail}
65
66 %%%----- The main document --------------------------------------------------
67
68 \begin{document}
69
70 \maketitle
71
72 \begin{abstract}
73 The author describes the random number generator used in the
74 Straylight/\-Edgeware `Catacomb' library. While the generator is
75 superficially similar to (for example) the Linux and OpenBSD random number
76 generators, it introduces a number of its own innovations which improve
77 both security and performance.
78
79 The Catacomb generator uses an optional secret key, which can provide
80 additional security against forward state compromise extension. It uses a
81 catastrophic reseeding operation to prevent a compromise yielding
82 information about past generator states. This operation works on
83 arbitrary-sized blocks of data, so the generator's output buffer can be
84 large. This minimizes the effect of the reseeding overhead.
85 \end{abstract}
86
87 \tableofcontents
88
89
90 \section{The one-way transformation}
91 \label{sec:oneway}
92
93 The most novel part of the generator\footnote{I believe this construction to
94 be novel. If I'm wrong, let me know.} is the one-way transformation which is
95 used to allow pooled input data to affect the output buffer.
96
97 Let $H$ be some collision-resistant hash function, and let $E_k$ be a
98 symmetric cipher with key $k$. Then I can define the one-way transformation
99 $T$ by
100 \[ T(x) = E_{H(x)}(x) \]
101
102 I believe, although formal proof seems difficult, that an adversary in
103 posession of $T(x)$ and a portion of the original $x$ cannot reconstruct the
104 remainder of $x$ without breaking one of the cryptographic primitives (which
105 I assume is `difficult') or performing an exhaustive search of one of: the
106 space of the unknown portion of $x$, the range of the hash function $H$, or
107 the keyspace of the cipher $E$.
108
109 A similar feat of cryptanalysis or exhaustive search seems necessary to work
110 in a forwards direction: given partial knowledge of both $x$ and $T(x)$, the
111 adversary cannot work out the remainder of either without trying every
112 possibility for one or the other unknown portions, or working through the
113 hash- or keyspace.
114
115 A keyed version of $T$ may be defined, given a keyed hash (or MAC) $H_k$:
116 \[ T_k(x) = E_{H_k(x)}(x) \]
117 If this is done, the adversary cannot work forwards even with \emph{complete}
118 knowledge of $B$, or performing one of the obvious exhaustive searches.
119
120
121 \section{Abstract description of the generator}
122
123 The generator is divided into two parts: an \emph{input pool} which
124 accumulates random input data from the environment, and an \emph{output
125 buffer} which contains data to be passed to clients of the generator on
126 request.
127
128 \subsection{The input pool and mixing function}
129
130 New information is contributed to the generator by mixing it with the input
131 pool, using a mixing function derived from the Linux random number generator
132 \cite{linux:devrandom}. The mixing function views the input pool as eight
133 parallel shift registers. Input data is added one octet at a time. Each bit
134 of an input octet is mixed with a different shift register.
135
136 Formally, let $I$ be the input pool, with size $N_I$ bytes; let $P(x) = a_0 +
137 a_1 x + a_2 x^2 + \cdots + a_{N_I} x^{N_I}$ be a primitive polynomial in
138 $\mathrm{GF}(2^{N_I})$ with degree $N_I$; let $i$ be an integer such that $0
139 \le i < N_I$, and $r$ be an integer such that $0 \le r < 8$; and let $x$ be
140 an input byte. The result of mixing $x$ with the pool $I$ is calculated as
141 follows:
142 \begin{eqlines*}
143 \begin{spliteqn*}
144 I'[8j + b] =
145 \begin{cases}
146 x\bigl[(r + b) \bmod 8\bigr] \xor
147 \bigxor_{0 \le k < N_I}
148 a_k I\bigl[8\bigl((j + k) \bmod N_I\bigr) + b\bigr] & if $i = j$ \\
149 I[j + b] & otherwise
150 \end{cases} \\
151 \textrm{for all integers $j$ and $b$ where $0 \le j < N_I$ and
152 $0 \le b < 8$}
153 \end{spliteqn*}
154 \\
155 I \assign I' \qquad
156 i \assign (i + 1) \bmod N_I \qquad
157 r \assign (r + 5) \bmod 8
158 \end{eqlines*}
159 Initially, $i$ and $r$ are both zero. The use of 8-bit bytes above is
160 arbitrary but convenient for modern computers.
161
162 The mixing function isn't intended to be cryptographically strong. Its
163 purpose is just to hold data without letting too much of the randomness get
164 away.
165
166 \subsection{The output buffer}
167
168 Newly added data doesn't affect the output buffer until a `gating' operation
169 is performed. This uses the one-way transformation described earlier over
170 the entire generator state.
171
172 Data requested by clients of the generator is read from the output buffer
173 $O$. Initially the buffer contains zeroes. The output buffer is large
174 enough for $N_O$ bits.
175
176 When the generator estimates that there's enough entropy in the input pool, a
177 \emph{gating} operation is performed, using the one-way function described in
178 section~\ref{sec:oneway}. Hash both the input pool and output buffer, and
179 then encrypt both using the hash as the key:
180 \begin{eqlines*}
181 h = H(I \cat O) \\
182 I \assign E_h(I) \qquad O \assign E_h(O)
183 \end{eqlines*}
184 If the output buffer is exhausted before the next gating operation, it is
185 \emph{stretched} using the one-way function: $O \assign E_{H(O)}(O)$.
186
187 \subsection{Other tweaks}
188
189 The first $N_S$ bits of this buffer take part in the output transformation
190 but are never actually output. They're there to make predicting further
191 output from the generator difficult.
192
193 Also, optionally, the one-way functions can be keyed. This does, of course,
194 beg the question as to where the key comes from. This might be one of those
195 things best done the old-fashioned way with a bunch of coins or dice or
196 something.
197
198
199 \section{The actual implementation}
200
201 The Catacomb implementation of the generator uses the following parameters:
202 \begin{itemize}
203 \item The hash function used in the one-way transformation is RIPEMD-160
204 \cite{rmd160}; the block cipher is Blowfish, using a 160-bit key.
205 \item The input pool size $N_I$ is 128 bytes. The output buffer size $N_O$
206 is 512 bytes. The size $N_S$ of the secret part of the output buffer
207 is 160 bits (20 bytes).
208 \item The polynomial $P(x)$ used for mixing in new input is
209 $1 + x + x^2 + x^7 + x^{128}$.
210 \end{itemize}
211 The hash and block cipher are well-known and respected cryptographic
212 primitives.
213
214 The input pool is rater larger than it strictly needs to be to contain
215 `enough' entropy to bring the generator up to the strength of its
216 cryptographic primitives. The pool is large to reduce the effect of
217 asymptotic behaviour in the amount of entropy in the pool.
218
219 The output buffer is large simply to improve performance: Blowfish has a
220 heavy key schedule, so it pays to perform fewer rekeyings per byte of data.
221 The precise size of 512 bytes was chosen empirically as being about where the
222 performance improvement stops being linear with the buffer size on my
223 machine.
224
225 \begin{thebibliography}{99}
226
227 \bibitem{cp:rand}
228 J.~Kelsey, B.~Schneier, D.~Wagner, and C.~Hall, ``Cryptographic Attacks on
229 Pseudorandom Number Generators'', \emph{Fast Software Encryption, Fifth
230 International Workshop Proceedings (March 1998)}, Springer-Verlag, 1998,
231 pp. 168--188, \url{http://www.counterpane.com/pseudorandom_number.html}
232
233 \bibitem{linux:devrandom}
234 T.~Ts'o, ``A string random number generator'', Linux sources,
235 \path{drivers/char/random.c}.
236
237 \bibitem{mdw:devrandom}
238 M.~Wooding, ``Linux \path{/dev/random} generator security'', Usenet article
239 posted to \mtt{sci.crypt}, July 1998.
240
241 \end{thebibliography}
242
243 %%%----- That's all, folks --------------------------------------------------
244
245 \end{document}