Updated. Almost finished, in fact. ;-)
[u/mdw/catacomb] / papers / rand.tex
1 %%% -*-latex-*-
2 %%%
3 %%% $Id: rand.tex,v 1.2 1999/10/12 21:00:34 mdw Exp $
4 %%%
5 %%% Description of Catacomb's random number generator
6 %%%
7 %%% (c) 1999 Straylight/Edgeware
8 %%%
9
10 %%%----- Licensing notice ---------------------------------------------------
11 %%%
12 %%% This file is part of Catacomb.
13 %%%
14 %%% Catacomb is free software; you can redistribute it and/or modify
15 %%% it under the terms of the GNU Library General Public License as
16 %%% published by the Free Software Foundation; either version 2 of the
17 %%% License, or (at your option) any later version.
18 %%%
19 %%% Catacomb is distributed in the hope that it will be useful,
20 %%% but WITHOUT ANY WARRANTY; without even the implied warranty of
21 %%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
22 %%% GNU Library General Public License for more details.
23 %%%
24 %%% You should have received a copy of the GNU Library General Public
25 %%% License along with Catacomb; if not, write to the Free
26 %%% Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
27 %%% MA 02111-1307, USA.
28
29 %%%----- Revision history ---------------------------------------------------
30 %%%
31 %%% $Log: rand.tex,v $
32 %%% Revision 1.2 1999/10/12 21:00:34 mdw
33 %%% Updated. Almost finished, in fact. ;-)
34 %%%
35 %%% Revision 1.1 1999/09/03 08:41:13 mdw
36 %%% Initial import.
37 %%%
38
39 %%%----- Header -------------------------------------------------------------
40
41 \documentclass[a4paper, article, 10pt, notitlepage, numbering]{strayman}
42 \usepackage[palatino, helvetica, courier, maths=cmr]{mdwfonts}
43 \usepackage{mdwtab, mathenv}
44 \usepackage[T1]{fontenc}
45 \usepackage{cmtt, url}
46 \usepackage[tpic, all]{xy}
47 \usepackage{mathbbol}
48 % \usepackage{crypto}
49
50 \def\mdw{{\normalfont[{\bfseries\itshape mdw}]}}
51 \urlstyle{tt}
52 \def\email{\begingroup\urlstyle{rm}\Url}
53 \urldef\myemail\email{mdw@nsict.org}
54 \def\Z{\mathbb{Z}}
55 \let\assign\leftarrow
56 \let\xor\oplus
57 \let\bigxor\bigoplus
58 \def\cat{\mathbin{\|}}
59
60 \title{The Catacomb random number generator}
61 \author{Mark Wooding, \myemail}
62
63 %%%----- The main document --------------------------------------------------
64
65 \begin{document}
66
67 \maketitle
68
69 \begin{abstract}
70 The author describes the random number generator used in the
71 Straylight/\-Edgeware `Catacomb' library. While the generator is
72 superficially similar to (for example) the Linux and OpenBSD random number
73 generators, it introduces a number of its own innovations which improve
74 both security and performance.
75
76 The Catacomb generator uses an optional secret key, which can provide
77 additional security against forward state compromise extension. It uses a
78 catastrophic reseeding operation to prevent a compromise yielding
79 information about past generator states. This operation works on
80 arbitrary-sized blocks of data, so the generator's output buffer can be
81 large. This minimizes the effect of the reseeding overhead.
82 \end{abstract}
83
84 \tableofcontents
85
86
87 \section{The one-way transformation}
88 \label{sec:oneway}
89
90 The most novel part of the generator\footnote{I believe this construction to
91 be novel. If I'm wrong, let me know.} is the one-way transformation which is
92 used to allow pooled input data to affect the output buffer.
93
94 Let $H$ be some collision-resistant hash function, and let $E_k$ be a
95 symmetric cipher with key $k$. Then I can define the one-way transformation
96 $T$ by
97 \[ T(x) = E_{H(x)}(x) \]
98
99 I believe, although formal proof seems difficult, that an adversary in
100 posession of $T(x)$ and a portion of the original $x$ cannot reconstruct the
101 remainder of $x$ without breaking one of the cryptographic primitives (which
102 I assume is `difficult') or performing an exhaustive search of one of: the
103 space of the unknown portion of $x$, the range of the hash function $H$, or
104 the keyspace of the cipher $E$.
105
106 A similar feat of cryptanalysis or exhaustive search seems necessary to work
107 in a forwards direction: given partial knowledge of both $x$ and $T(x)$, the
108 adversary cannot work out the remainder of either without trying every
109 possibility for one or the other unknown portions, or working through the
110 hash- or keyspace.
111
112 A keyed version of $T$ may be defined, given a keyed hash (or MAC) $H_k$:
113 \[ T_k(x) = E_{H_k(x)}(x) \]
114 If this is done, the adversary cannot work forwards even with \emph{complete}
115 knowledge of $B$, or performing one of the obvious exhaustive searches.
116
117
118 \section{Abstract description of the generator}
119
120 The generator is divided into two parts: an \emph{input pool} which
121 accumulates random input data from the environment, and an \emph{output
122 buffer} which contains data to be passed to clients of the generator on
123 request.
124
125 \subsection{The input pool and mixing function}
126
127 New information is contributed to the generator by mixing it with the input
128 pool, using a mixing function derived from the Linux random number generator
129 \cite{linux:devrandom}. The mixing function views the input pool as eight
130 parallel shift registers. Input data is added one octet at a time. Each bit
131 of an input octet is mixed with a different shift register.
132
133 Formally, let $I$ be the input pool, with size $N_I$ bytes; let $P(x) = a_0 +
134 a_1 x + a_2 x^2 + \cdots + a_{N_I} x^{N_I}$ be a primitive polynomial in
135 $\mathrm{GF}(2^{N_I})$ with degree $N_I$; let $i$ be an integer such that $0
136 \le i < N_I$, and $r$ be an integer such that $0 \le r < 8$; and let $x$ be
137 an input byte. The result of mixing $x$ with the pool $I$ is calculated as
138 follows:
139 \begin{eqlines*}
140 \begin{spliteqn*}
141 I'[8j + b] =
142 \begin{cases}
143 x\bigl[(r + b) \bmod 8\bigr] \xor
144 \bigxor_{0 \le k < N_I}
145 a_k I\bigl[8\bigl((j + k) \bmod N_I\bigr) + b\bigr] & if $i = j$ \\
146 I[j + b] & otherwise
147 \end{cases} \\
148 \textrm{for all integers $j$ and $b$ where $0 \le j < N_I$ and
149 $0 \le b < 8$}
150 \end{spliteqn*}
151 \\
152 I \assign I' \qquad
153 i \assign (i + 1) \bmod N_I \qquad
154 r \assign (r + 5) \bmod 8
155 \end{eqlines*}
156 Initially, $i$ and $r$ are both zero. The use of 8-bit bytes above is
157 arbitrary but convenient for modern computers.
158
159 The mixing function isn't intended to be cryptographically strong. Its
160 purpose is just to hold data without letting too much of the randomness get
161 away.
162
163 \subsection{The output buffer}
164
165 Newly added data doesn't affect the output buffer until a `gating' operation
166 is performed. This uses the one-way transformation described earlier over
167 the entire generator state.
168
169 Data requested by clients of the generator is read from the output buffer
170 $O$. Initially the buffer contains zeroes. The output buffer is large
171 enough for $N_O$ bits.
172
173 When the generator estimates that there's enough entropy in the input pool, a
174 \emph{gating} operation is performed, using the one-way function described in
175 section~\ref{sec:oneway}. Hash both the input pool and output buffer, and
176 then encrypt both using the hash as the key:
177 \begin{eqlines*}
178 h = H(I \cat O) \\
179 I \assign E_h(I) \qquad O \assign E_h(O)
180 \end{eqlines*}
181 If the output buffer is exhausted before the next gating operation, it is
182 \emph{stretched} using the one-way function: $O \assign E_{H(O)}(O)$.
183
184 \subsection{Other tweaks}
185
186 The first $N_S$ bits of this buffer take part in the output transformation
187 but are never actually output. They're there to make predicting further
188 output from the generator difficult.
189
190 Also, optionally, the one-way functions can be keyed. This does, of course,
191 beg the question as to where the key comes from. This might be one of those
192 things best done the old-fashioned way with a bunch of coins or dice or
193 something.
194
195
196 \section{The actual implementation}
197
198 The Catacomb implementation of the generator uses the following parameters:
199 \begin{itemize}
200 \item The hash function used in the one-way transformation is RIPEMD-160
201 \cite{rmd160}; the block cipher is Blowfish, using a 160-bit key.
202 \item The input pool size $N_I$ is 128 bytes. The output buffer size $N_O$
203 is also 128 bytes. The size $N_S$ of the secret part of the output buffer
204 is 160 bits (20 bytes).
205 \item The polynomial $P(x)$ used for mixing in new input is
206 $1 + x + x^2 + x^7 + x^{128}$.
207 \end{itemize}
208 The hash and block cipher are well-known and respected cryptographic
209 primitives.
210
211 \begin{thebibliography}{99}
212
213 \bibitem{cp:rand}
214 J.~Kelsey, B.~Schneier, D.~Wagner, and C.~Hall, ``Cryptographic Attacks on
215 Pseudorandom Number Generators'', \emph{Fast Software Encryption, Fifth
216 International Workshop Proceedings (March 1998)}, Springer-Verlag, 1998,
217 pp. 168--188, \url{http://www.counterpane.com/pseudorandom_number.html}
218
219 \bibitem{linux:devrandom}
220 T.~Ts'o, ``A string random number generator'', Linux sources,
221 \path{drivers/char/random.c}.
222
223 \bibitem{mdw:devrandom}
224 M.~Wooding, ``Linux \path{/dev/random} generator security'', Usenet article
225 posted to \mtt{sci.crypt}, July 1998.
226
227 \end{thebibliography}
228
229 %%%----- That's all, folks --------------------------------------------------
230
231 \end{document}