Rearrange the file tree.
[u/mdw/catacomb] / papers / rand.tex
CommitLineData
d03ab969 1%%% -*-latex-*-
2%%%
d03ab969 3%%% Description of Catacomb's random number generator
4%%%
5%%% (c) 1999 Straylight/Edgeware
6%%%
7
8%%%----- Licensing notice ---------------------------------------------------
9%%%
10%%% This file is part of Catacomb.
11%%%
12%%% Catacomb is free software; you can redistribute it and/or modify
13%%% it under the terms of the GNU Library General Public License as
14%%% published by the Free Software Foundation; either version 2 of the
15%%% License, or (at your option) any later version.
45c0fd36 16%%%
d03ab969 17%%% Catacomb is distributed in the hope that it will be useful,
18%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
19%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
20%%% GNU Library General Public License for more details.
45c0fd36 21%%%
d03ab969 22%%% You should have received a copy of the GNU Library General Public
23%%% License along with Catacomb; if not, write to the Free
24%%% Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
25%%% MA 02111-1307, USA.
26
d03ab969 27%%%----- Header -------------------------------------------------------------
28
29\documentclass[a4paper, article, 10pt, notitlepage, numbering]{strayman}
30\usepackage[palatino, helvetica, courier, maths=cmr]{mdwfonts}
31\usepackage{mdwtab, mathenv}
32\usepackage[T1]{fontenc}
33\usepackage{cmtt, url}
34\usepackage[tpic, all]{xy}
35\usepackage{mathbbol}
36% \usepackage{crypto}
37
38\def\mdw{{\normalfont[{\bfseries\itshape mdw}]}}
39\urlstyle{tt}
40\def\email{\begingroup\urlstyle{rm}\Url}
41\urldef\myemail\email{mdw@nsict.org}
42\def\Z{\mathbb{Z}}
43\let\assign\leftarrow
44\let\xor\oplus
45\let\bigxor\bigoplus
25936864 46\def\cat{\mathbin{\|}}
d03ab969 47
48\title{The Catacomb random number generator}
49\author{Mark Wooding, \myemail}
50
51%%%----- The main document --------------------------------------------------
52
53\begin{document}
54
55\maketitle
56
57\begin{abstract}
58 The author describes the random number generator used in the
59 Straylight/\-Edgeware `Catacomb' library. While the generator is
60 superficially similar to (for example) the Linux and OpenBSD random number
61 generators, it introduces a number of its own innovations which improve
62 both security and performance.
45c0fd36 63
d03ab969 64 The Catacomb generator uses an optional secret key, which can provide
65 additional security against forward state compromise extension. It uses a
66 catastrophic reseeding operation to prevent a compromise yielding
67 information about past generator states. This operation works on
68 arbitrary-sized blocks of data, so the generator's output buffer can be
69 large. This minimizes the effect of the reseeding overhead.
70\end{abstract}
71
72\tableofcontents
73
74
75\section{The one-way transformation}
25936864 76\label{sec:oneway}
d03ab969 77
78The most novel part of the generator\footnote{I believe this construction to
79be novel. If I'm wrong, let me know.} is the one-way transformation which is
80used to allow pooled input data to affect the output buffer.
81
82Let $H$ be some collision-resistant hash function, and let $E_k$ be a
83symmetric cipher with key $k$. Then I can define the one-way transformation
84$T$ by
85\[ T(x) = E_{H(x)}(x) \]
86
87I believe, although formal proof seems difficult, that an adversary in
88posession of $T(x)$ and a portion of the original $x$ cannot reconstruct the
89remainder of $x$ without breaking one of the cryptographic primitives (which
90I assume is `difficult') or performing an exhaustive search of one of: the
91space of the unknown portion of $x$, the range of the hash function $H$, or
92the keyspace of the cipher $E$.
93
94A similar feat of cryptanalysis or exhaustive search seems necessary to work
95in a forwards direction: given partial knowledge of both $x$ and $T(x)$, the
96adversary cannot work out the remainder of either without trying every
97possibility for one or the other unknown portions, or working through the
98hash- or keyspace.
99
100A keyed version of $T$ may be defined, given a keyed hash (or MAC) $H_k$:
101\[ T_k(x) = E_{H_k(x)}(x) \]
102If this is done, the adversary cannot work forwards even with \emph{complete}
103knowledge of $B$, or performing one of the obvious exhaustive searches.
104
105
25936864 106\section{Abstract description of the generator}
d03ab969 107
108The generator is divided into two parts: an \emph{input pool} which
109accumulates random input data from the environment, and an \emph{output
110buffer} which contains data to be passed to clients of the generator on
111request.
112
25936864 113\subsection{The input pool and mixing function}
114
d03ab969 115New information is contributed to the generator by mixing it with the input
116pool, using a mixing function derived from the Linux random number generator
117\cite{linux:devrandom}. The mixing function views the input pool as eight
118parallel shift registers. Input data is added one octet at a time. Each bit
119of an input octet is mixed with a different shift register.
120
25936864 121Formally, let $I$ be the input pool, with size $N_I$ bytes; let $P(x) = a_0 +
122a_1 x + a_2 x^2 + \cdots + a_{N_I} x^{N_I}$ be a primitive polynomial in
123$\mathrm{GF}(2^{N_I})$ with degree $N_I$; let $i$ be an integer such that $0
124\le i < N_I$, and $r$ be an integer such that $0 \le r < 8$; and let $x$ be
d03ab969 125an input byte. The result of mixing $x$ with the pool $I$ is calculated as
126follows:
127\begin{eqlines*}
128 \begin{spliteqn*}
129 I'[8j + b] =
130 \begin{cases}
131 x\bigl[(r + b) \bmod 8\bigr] \xor
45c0fd36
MW
132 \bigxor_{0 \le k < N_I}
133 a_k I\bigl[8\bigl((j + k) \bmod N_I\bigr) + b\bigr] & if $i = j$ \\
d03ab969 134 I[j + b] & otherwise
135 \end{cases} \\
25936864 136 \textrm{for all integers $j$ and $b$ where $0 \le j < N_I$ and
d03ab969 137 $0 \le b < 8$}
138 \end{spliteqn*}
139 \\
140 I \assign I' \qquad
25936864 141 i \assign (i + 1) \bmod N_I \qquad
d03ab969 142 r \assign (r + 5) \bmod 8
143\end{eqlines*}
144Initially, $i$ and $r$ are both zero. The use of 8-bit bytes above is
25936864 145arbitrary but convenient for modern computers.
146
147The mixing function isn't intended to be cryptographically strong. Its
148purpose is just to hold data without letting too much of the randomness get
149away.
150
151\subsection{The output buffer}
d03ab969 152
153Newly added data doesn't affect the output buffer until a `gating' operation
154is performed. This uses the one-way transformation described earlier over
155the entire generator state.
156
157Data requested by clients of the generator is read from the output buffer
25936864 158$O$. Initially the buffer contains zeroes. The output buffer is large
159enough for $N_O$ bits.
160
161When the generator estimates that there's enough entropy in the input pool, a
162\emph{gating} operation is performed, using the one-way function described in
163section~\ref{sec:oneway}. Hash both the input pool and output buffer, and
164then encrypt both using the hash as the key:
165\begin{eqlines*}
166 h = H(I \cat O) \\
167 I \assign E_h(I) \qquad O \assign E_h(O)
168\end{eqlines*}
169If the output buffer is exhausted before the next gating operation, it is
170\emph{stretched} using the one-way function: $O \assign E_{H(O)}(O)$.
171
172\subsection{Other tweaks}
173
174The first $N_S$ bits of this buffer take part in the output transformation
175but are never actually output. They're there to make predicting further
176output from the generator difficult.
177
178Also, optionally, the one-way functions can be keyed. This does, of course,
179beg the question as to where the key comes from. This might be one of those
180things best done the old-fashioned way with a bunch of coins or dice or
181something.
182
183
184\section{The actual implementation}
185
186The Catacomb implementation of the generator uses the following parameters:
187\begin{itemize}
188\item The hash function used in the one-way transformation is RIPEMD-160
189 \cite{rmd160}; the block cipher is Blowfish, using a 160-bit key.
190\item The input pool size $N_I$ is 128 bytes. The output buffer size $N_O$
61383268 191 is 512 bytes. The size $N_S$ of the secret part of the output buffer
25936864 192 is 160 bits (20 bytes).
193\item The polynomial $P(x)$ used for mixing in new input is
194 $1 + x + x^2 + x^7 + x^{128}$.
195\end{itemize}
196The hash and block cipher are well-known and respected cryptographic
61383268 197primitives.
198
199The input pool is rater larger than it strictly needs to be to contain
200`enough' entropy to bring the generator up to the strength of its
201cryptographic primitives. The pool is large to reduce the effect of
202asymptotic behaviour in the amount of entropy in the pool.
203
204The output buffer is large simply to improve performance: Blowfish has a
205heavy key schedule, so it pays to perform fewer rekeyings per byte of data.
206The precise size of 512 bytes was chosen empirically as being about where the
207performance improvement stops being linear with the buffer size on my
208machine.
d03ab969 209
210\begin{thebibliography}{99}
45c0fd36 211
d03ab969 212\bibitem{cp:rand}
213 J.~Kelsey, B.~Schneier, D.~Wagner, and C.~Hall, ``Cryptographic Attacks on
214 Pseudorandom Number Generators'', \emph{Fast Software Encryption, Fifth
215 International Workshop Proceedings (March 1998)}, Springer-Verlag, 1998,
216 pp. 168--188, \url{http://www.counterpane.com/pseudorandom_number.html}
217
218\bibitem{linux:devrandom}
219 T.~Ts'o, ``A string random number generator'', Linux sources,
220 \path{drivers/char/random.c}.
221
222\bibitem{mdw:devrandom}
223 M.~Wooding, ``Linux \path{/dev/random} generator security'', Usenet article
224 posted to \mtt{sci.crypt}, July 1998.
225
226\end{thebibliography}
227
228%%%----- That's all, folks --------------------------------------------------
229
45c0fd36 230\end{document}