cleanup: Big pile of whitespace fixes, all at once.
[u/mdw/catacomb] / papers / rand.tex
CommitLineData
d03ab969 1%%% -*-latex-*-
2%%%
b817bfc6 3%%% $Id: rand.tex,v 1.4 2004/04/08 01:36:15 mdw Exp $
d03ab969 4%%%
5%%% Description of Catacomb's random number generator
6%%%
7%%% (c) 1999 Straylight/Edgeware
8%%%
9
10%%%----- Licensing notice ---------------------------------------------------
11%%%
12%%% This file is part of Catacomb.
13%%%
14%%% Catacomb is free software; you can redistribute it and/or modify
15%%% it under the terms of the GNU Library General Public License as
16%%% published by the Free Software Foundation; either version 2 of the
17%%% License, or (at your option) any later version.
45c0fd36 18%%%
d03ab969 19%%% Catacomb is distributed in the hope that it will be useful,
20%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
21%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
22%%% GNU Library General Public License for more details.
45c0fd36 23%%%
d03ab969 24%%% You should have received a copy of the GNU Library General Public
25%%% License along with Catacomb; if not, write to the Free
26%%% Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
27%%% MA 02111-1307, USA.
28
d03ab969 29%%%----- Header -------------------------------------------------------------
30
31\documentclass[a4paper, article, 10pt, notitlepage, numbering]{strayman}
32\usepackage[palatino, helvetica, courier, maths=cmr]{mdwfonts}
33\usepackage{mdwtab, mathenv}
34\usepackage[T1]{fontenc}
35\usepackage{cmtt, url}
36\usepackage[tpic, all]{xy}
37\usepackage{mathbbol}
38% \usepackage{crypto}
39
40\def\mdw{{\normalfont[{\bfseries\itshape mdw}]}}
41\urlstyle{tt}
42\def\email{\begingroup\urlstyle{rm}\Url}
43\urldef\myemail\email{mdw@nsict.org}
44\def\Z{\mathbb{Z}}
45\let\assign\leftarrow
46\let\xor\oplus
47\let\bigxor\bigoplus
25936864 48\def\cat{\mathbin{\|}}
d03ab969 49
50\title{The Catacomb random number generator}
51\author{Mark Wooding, \myemail}
52
53%%%----- The main document --------------------------------------------------
54
55\begin{document}
56
57\maketitle
58
59\begin{abstract}
60 The author describes the random number generator used in the
61 Straylight/\-Edgeware `Catacomb' library. While the generator is
62 superficially similar to (for example) the Linux and OpenBSD random number
63 generators, it introduces a number of its own innovations which improve
64 both security and performance.
45c0fd36 65
d03ab969 66 The Catacomb generator uses an optional secret key, which can provide
67 additional security against forward state compromise extension. It uses a
68 catastrophic reseeding operation to prevent a compromise yielding
69 information about past generator states. This operation works on
70 arbitrary-sized blocks of data, so the generator's output buffer can be
71 large. This minimizes the effect of the reseeding overhead.
72\end{abstract}
73
74\tableofcontents
75
76
77\section{The one-way transformation}
25936864 78\label{sec:oneway}
d03ab969 79
80The most novel part of the generator\footnote{I believe this construction to
81be novel. If I'm wrong, let me know.} is the one-way transformation which is
82used to allow pooled input data to affect the output buffer.
83
84Let $H$ be some collision-resistant hash function, and let $E_k$ be a
85symmetric cipher with key $k$. Then I can define the one-way transformation
86$T$ by
87\[ T(x) = E_{H(x)}(x) \]
88
89I believe, although formal proof seems difficult, that an adversary in
90posession of $T(x)$ and a portion of the original $x$ cannot reconstruct the
91remainder of $x$ without breaking one of the cryptographic primitives (which
92I assume is `difficult') or performing an exhaustive search of one of: the
93space of the unknown portion of $x$, the range of the hash function $H$, or
94the keyspace of the cipher $E$.
95
96A similar feat of cryptanalysis or exhaustive search seems necessary to work
97in a forwards direction: given partial knowledge of both $x$ and $T(x)$, the
98adversary cannot work out the remainder of either without trying every
99possibility for one or the other unknown portions, or working through the
100hash- or keyspace.
101
102A keyed version of $T$ may be defined, given a keyed hash (or MAC) $H_k$:
103\[ T_k(x) = E_{H_k(x)}(x) \]
104If this is done, the adversary cannot work forwards even with \emph{complete}
105knowledge of $B$, or performing one of the obvious exhaustive searches.
106
107
25936864 108\section{Abstract description of the generator}
d03ab969 109
110The generator is divided into two parts: an \emph{input pool} which
111accumulates random input data from the environment, and an \emph{output
112buffer} which contains data to be passed to clients of the generator on
113request.
114
25936864 115\subsection{The input pool and mixing function}
116
d03ab969 117New information is contributed to the generator by mixing it with the input
118pool, using a mixing function derived from the Linux random number generator
119\cite{linux:devrandom}. The mixing function views the input pool as eight
120parallel shift registers. Input data is added one octet at a time. Each bit
121of an input octet is mixed with a different shift register.
122
25936864 123Formally, let $I$ be the input pool, with size $N_I$ bytes; let $P(x) = a_0 +
124a_1 x + a_2 x^2 + \cdots + a_{N_I} x^{N_I}$ be a primitive polynomial in
125$\mathrm{GF}(2^{N_I})$ with degree $N_I$; let $i$ be an integer such that $0
126\le i < N_I$, and $r$ be an integer such that $0 \le r < 8$; and let $x$ be
d03ab969 127an input byte. The result of mixing $x$ with the pool $I$ is calculated as
128follows:
129\begin{eqlines*}
130 \begin{spliteqn*}
131 I'[8j + b] =
132 \begin{cases}
133 x\bigl[(r + b) \bmod 8\bigr] \xor
45c0fd36
MW
134 \bigxor_{0 \le k < N_I}
135 a_k I\bigl[8\bigl((j + k) \bmod N_I\bigr) + b\bigr] & if $i = j$ \\
d03ab969 136 I[j + b] & otherwise
137 \end{cases} \\
25936864 138 \textrm{for all integers $j$ and $b$ where $0 \le j < N_I$ and
d03ab969 139 $0 \le b < 8$}
140 \end{spliteqn*}
141 \\
142 I \assign I' \qquad
25936864 143 i \assign (i + 1) \bmod N_I \qquad
d03ab969 144 r \assign (r + 5) \bmod 8
145\end{eqlines*}
146Initially, $i$ and $r$ are both zero. The use of 8-bit bytes above is
25936864 147arbitrary but convenient for modern computers.
148
149The mixing function isn't intended to be cryptographically strong. Its
150purpose is just to hold data without letting too much of the randomness get
151away.
152
153\subsection{The output buffer}
d03ab969 154
155Newly added data doesn't affect the output buffer until a `gating' operation
156is performed. This uses the one-way transformation described earlier over
157the entire generator state.
158
159Data requested by clients of the generator is read from the output buffer
25936864 160$O$. Initially the buffer contains zeroes. The output buffer is large
161enough for $N_O$ bits.
162
163When the generator estimates that there's enough entropy in the input pool, a
164\emph{gating} operation is performed, using the one-way function described in
165section~\ref{sec:oneway}. Hash both the input pool and output buffer, and
166then encrypt both using the hash as the key:
167\begin{eqlines*}
168 h = H(I \cat O) \\
169 I \assign E_h(I) \qquad O \assign E_h(O)
170\end{eqlines*}
171If the output buffer is exhausted before the next gating operation, it is
172\emph{stretched} using the one-way function: $O \assign E_{H(O)}(O)$.
173
174\subsection{Other tweaks}
175
176The first $N_S$ bits of this buffer take part in the output transformation
177but are never actually output. They're there to make predicting further
178output from the generator difficult.
179
180Also, optionally, the one-way functions can be keyed. This does, of course,
181beg the question as to where the key comes from. This might be one of those
182things best done the old-fashioned way with a bunch of coins or dice or
183something.
184
185
186\section{The actual implementation}
187
188The Catacomb implementation of the generator uses the following parameters:
189\begin{itemize}
190\item The hash function used in the one-way transformation is RIPEMD-160
191 \cite{rmd160}; the block cipher is Blowfish, using a 160-bit key.
192\item The input pool size $N_I$ is 128 bytes. The output buffer size $N_O$
61383268 193 is 512 bytes. The size $N_S$ of the secret part of the output buffer
25936864 194 is 160 bits (20 bytes).
195\item The polynomial $P(x)$ used for mixing in new input is
196 $1 + x + x^2 + x^7 + x^{128}$.
197\end{itemize}
198The hash and block cipher are well-known and respected cryptographic
61383268 199primitives.
200
201The input pool is rater larger than it strictly needs to be to contain
202`enough' entropy to bring the generator up to the strength of its
203cryptographic primitives. The pool is large to reduce the effect of
204asymptotic behaviour in the amount of entropy in the pool.
205
206The output buffer is large simply to improve performance: Blowfish has a
207heavy key schedule, so it pays to perform fewer rekeyings per byte of data.
208The precise size of 512 bytes was chosen empirically as being about where the
209performance improvement stops being linear with the buffer size on my
210machine.
d03ab969 211
212\begin{thebibliography}{99}
45c0fd36 213
d03ab969 214\bibitem{cp:rand}
215 J.~Kelsey, B.~Schneier, D.~Wagner, and C.~Hall, ``Cryptographic Attacks on
216 Pseudorandom Number Generators'', \emph{Fast Software Encryption, Fifth
217 International Workshop Proceedings (March 1998)}, Springer-Verlag, 1998,
218 pp. 168--188, \url{http://www.counterpane.com/pseudorandom_number.html}
219
220\bibitem{linux:devrandom}
221 T.~Ts'o, ``A string random number generator'', Linux sources,
222 \path{drivers/char/random.c}.
223
224\bibitem{mdw:devrandom}
225 M.~Wooding, ``Linux \path{/dev/random} generator security'', Usenet article
226 posted to \mtt{sci.crypt}, July 1998.
227
228\end{thebibliography}
229
230%%%----- That's all, folks --------------------------------------------------
231
45c0fd36 232\end{document}