cleanup: Big pile of whitespace fixes, all at once.
[u/mdw/catacomb] / manual / mp-mp.tex
CommitLineData
3471ebd1 1%%% -*-latex-*-
2
3\section{High-level multiprecision integer represention}
4\label{sec:mp}
5
6The high-level interface to multiprecision integers is exported by the header
7file \hdr{<catacomb/mp.h>}.
8
9
10\subsection{The \code{mp} structure}
11\label{sec:mp-struct}
12
13At the high level, a multiprecision integer is represented as an object of
14type \code{mp}. Most of the time, programs use pointers to \code{mp} objects
15which have been dynamically allocated. (See \code{mp_build},
16page~\pageref{fn:mp-build}, for the exception to this rule.)
17
18The \code{mp} type is a structure. Some of the members are considered
19internal to the implementation. However, the following may be used by client
20programs:
21\begin{description}
22 \def\makelabel#1{\code{#1}\hfil}
23\item[mpw *v, *vl] The base and limit of the multiprecision
24 integer array.
25\item[unsigned f] Various flags describing the current state of the integer.
26\end{description}
27Other members are used for keeping track of how much memory has been
28allocated to the number and for managing the reference counting system.
29These members should not be accessed directly by application code. None of
30the structure members should be modified directly.
31
32The following flags are defined:
33\begin{description*}
34 \let\makelabel\code
35\item[MP_NEG] The integer is negative.
36\item[MP_BURN] Clear the integer's storage after use.
37\item[MP_CONST] The integer may not be altered or freed.
38\item[MP_UNDEF] The integer currently has no interesting value.
39\item[MP_DESTROYED] The integer has been destroyed.
40\end{description*}
41
42The \code{MP_BURN} flag is inherited by the results of calculations. Thus,
43setting the flag on a value causes it and all other values calculated from it
44to be destroyed after use.\footnote{%
45 There are currently a small number of places where the \code{MP_BURN} flag
46 may be lost from a result. These usually result from performing complex
47 calculations with degenerate arguments (e.g., computing a GCD with zero).
48 Fixing this is not a high priority.} %
49
50The \code{MP_DESTROYED} flag is used by \code{mp_destroy}
51(page~\pageref{fn:mp-destroy}) to trap attempts to destroy the same integer
52more than once, which can otherwise lead to confusing bugs.
53
54
55\subsubsection{The \code{MP_LEN} macro}
56\label{fn:MP-LEN}
57
58\fsec{Synopsis}
59
60\begin{listinglist}
61|#include <catacomb/mp.h>| \\
62|ptrdiff_t MP_LEN(const mp *|$m$|);|
63\end{listinglist}
64
65\fsec{Returns}
66
674cd11e 67The number of words used in the representation of the integer~$m$. Note that
68the argument~$m$ may be evaluated multiple times by this macro.
3471ebd1 69
70
71\subsubsection{The \code{mp_burn} function}
72\label{fn:mp-burn}
73
74\fsec{Synopsis}
75
76\begin{listinglist}
77|#include <catacomb/mp.h>| \\
78|void mp_burn(mp *|$m$|);|
79\end{listinglist}
80
81\fsec{Description}
82
674cd11e 83Sets the \code{MP_BURN} flag on the integer~$m$. Whenever memory associated
84with~$m$, or results derived from it, is deallocated, it is overwritten with
3471ebd1 85zeros to prevent traces being found in swap files or core dumps.
86
87
88\subsection{Multiprecision integer constants}
89\label{sec:mp-const}
90
91There are a few commonly-used multiprecision integer constants provided. All
92are declared in \hdr{<catacomb/mp.h>}:
93
94\begin{tabular}[C]
95 {| >{\ttfamily}l | Mr @{\extracolsep{0pt plus 10000fill}\hskip\tabcolsep}
96 | c @{\extracolsep{0pt}} | >{\ttfamily}l | Mr |} \hlx{c{1,2,4,5}v}
97 \multicolumn{1}{|c|}{\textbf{Constant}} &
98 \multicolumn{1}{c|}{\textbf{Value}} & \qquad &
99 \multicolumn{1}{c|}{\textbf{Constant}} &
100 \multicolumn{1}{c|}{\textbf{Value}} \\ \hlx{vc{1,2,4,5}v}
45c0fd36
MW
101 MP_ZERO & 0 && MP_FOUR & 4 \\
102 MP_ONE & 1 && MP_FIVE & 5 \\
103 MP_TWO & 2 && MP_TEN & 10 \\
104 MP_THREE & 3 && MP_MONE & -1 \\ \hlx{vc{1,2,4,5}}
3471ebd1 105\end{tabular}
106\goodbreak
107
108All of the multiprecision constants are actually pointers to the real
109values. All have the \code{MP_CONST} flag set, so they cannot be modified or
110destroyed.
111
112
113\subsection{Creation and destruction}
114\label{sec:mp-create}
115
116Multiprecision integers are reference counted; i.e., each \code{mp} remembers
117how many references there are to it, and is destroyed only when the last
118reference disappears.
119
120New multiprecision integers are created using the functions \code{mp_create}
121(page~\pageref{fn:mp-create}) or \code{mp_build}
122(page~\pageref{fn:mp-build}). A newly created integer has one reference.
123Additional references are returned by the function \code{mp_copy}
124(page~\pageref{fn:mp-copy}). A reference may be removed from an integer by
125calling \code{mp_drop} (page~\pageref{fn:mp-drop}).
126
127It isn't usually a good idea to modify an integer to which there are multiple
128references: the owners of the other references will usually react badly if
129their number has changed. A modifiable copy may be made of an integer using
130the \code{mp_split} function (page~\pageref{fn:mp-split}). If there are no
131other references, the integer is returned unchanged; otherwise a real copy is
132made and returned, and the number of references to the original is
133decremented by one.
134
135Most of the arithmetic functions allow the caller to state a preferred
136location for the result. This may either be an existing integer or a
137completely new one, requested by the special value \code{MP_NEW}. This
138behaviour is provided by the \code{mp_modify} function
139(page~\pageref{fn:mp-modify}).
140
141Note that functions may not actually use the location requested, although if
142it decides not to overwrite the requested destination, a reference to it is
143removed.
144
145
146\subsubsection{The \code{mp_create} function}
147\label{fn:mp-create}
148
149\fsec{Synopsis}
150
151\begin{listinglist}
152|#include <catacomb/mp.h>| \\
153|mp *mp_create(size_t |$\sz$|);|
154\end{listinglist}
155
156\fsec{Description}
157
158Allocates a new multiprecision integer, with at least $\sz$ words of storage
159available for its representation. The flag \code{MP_UNDEF} is set initially;
160the other flags are cleared. The integer's array limit pointer \code{vl} is
161set to be $\sz$ higher than the base pointer \code{v}.
162
163\fsec{Returns}
164
165A pointer to a newly allocated multiprecision integer.
166
167\subsubsection{The \code{mp_destroy} function}
168\label{fn:mp-destroy}
169
170\fsec{Synopsis}
171
172\begin{listinglist}
173|#include <catacomb/mp.h>| \\
174|void mp_destroy(mp *|$m$|);|
175\end{listinglist}
176
177\fsec{Description}
178
674cd11e 179Destroys the multiprecision integer~$m$. All resouces allocated to the
3471ebd1 180integer are freed. It's most unusual for this to be the function you want:
181most of the time \code{mp_drop} (page~\pageref{fn:mp-drop}) is much more
182useful.
183
184If compiled with debugging support, the \code{mp_destroy} function will abort
185if requested to destroy an integer whose \code{MP_CONST} or
186\code{MP_DESTROYED} flags are set.
187
188\subsubsection{The \code{mp_build} function}
189\label{fn:mp-build}
190
191\fsec{Synopsis}
192
193\begin{listinglist}
194|#include <catacomb/mp.h>| \\
195|void mp_build(mp *|$m$|, mpw *|$v$|, mpw *|$\vl$|);|
196\end{listinglist}
197
198\fsec{Description}
199
200Creates a constant multiprecision integer from an array of \code{mpw} words.
201Storage for the \code{mp} object and the array is provided by the user --
202\code{mp_build} allocates no memory. Conversely, it's safe to dispose of the
203\code{mp} block and \code{mpw} array at any time, as long as there are no
204live references to them.
205
206The \code{mp_build} function is particularly useful for constructing small
207multiprecision integers, and for addressing parts of other integers. For
208example, if $x$ is a multiprecision integer, then
209\begin{listing}
210mp_build(&m, x->v + 5, x->vl);
211\end{listing}
212sets $m$ equal to $\lfloor x / 2^{5 \cdot \code{MPW_BITS}} \rfloor$ without
213copying any data.\footnote{%
214 This technique is used in the implementation of Barrett reduction
215 (page~\pageref{sec:mp-barrett}), for example. See the source file
216 \code{mpbarrett.c} for details.} %
217
218\subsubsection{The \code{mp_copy} function and \code{MP_COPY} macro}
219\label{fn:mp-copy}
220
221\fsec{Synopsis}
222
223\begin{listinglist}
224|#include <catacomb/mp.h>| \\
225|mp *mp_copy(mp *|$m$|);| \\
226|mp *MP_COPY(mp *|$m$|);|
227\end{listinglist}
228
229\fsec{Description}
230
674cd11e 231The function \code{mp_copy} adds a reference to the multiprecision
232integer~$m$. The integer will not be destroyed until all references to it
233have been dropped.
3471ebd1 234
235The macro \code{MP_COPY} performs exactly the same action as the
236\code{mp_copy} function, except that it uses inline code rather than a
237function call. It's probably smaller and faster than the function call too.
238
239\fsec{Returns}
240
241The address of the new reference. This will usually be the same as the
674cd11e 242original argument~$m$.
3471ebd1 243
244\subsubsection{The \code{mp_drop} function and \code{MP_DROP} macro}
245\label{fn:mp-drop}
246
247\fsec{Synopsis}
248
249\begin{listinglist}
250|#include <catacomb/mp.h>| \\
251|void mp_drop(mp *|$m$|);| \\
252|void MP_DROP(mp *|$m$|);|
253\end{listinglist}
254
255\fsec{Description}
256
257The function \code{mp_drop} removes (`drops') a reference to the
674cd11e 258multiprecision integer~$m$. If there are no more references to the integer,
3471ebd1 259it is destroyed, as if by \code{mp_destroy} (page~\pageref{fn:mp-destroy}).
260
261The macro \code{MP_DROP} performs exactly the same action as the
262\code{mp_drop} function, except that it uses inline code rather than a
263function call, so it's slightly faster but uses a little more code.
264
265\subsubsection{The \code{mp_split} function and \code{MP_SPLIT} macro}
266\label{fn:mp-split}
267
268\fsec{Synopsis}
269
270\begin{listinglist}
271|#include <catacomb/mp.h>| \\
272|mp *mp_split(mp *|$m$|);| \\
273|void MP_SPLIT(mp *|$m$|);|
274\end{listinglist}
275
276\fsec{Description}
277
674cd11e 278If the integer~$m$ has only one reference, the \code{mp_split} function
3471ebd1 279returns $m$ unchanged; otherwise, a copy is made and returned, and the
280reference count of $m$ is decremented. Thus, either way, \code{mp_split}
281returns a pointer to an integer with the same value as $m$ and a single
282refernce. Thus, the result of \code{mp_split} is therefore safe to modify.
283
284The macro \code{MP_SPLIT} performs the same action as the \code{mp_split}
285function except that it uses inline code rather than a function call, and it
286modifies its argument in place rather than returning the new reference. The
674cd11e 287macro \code{MP_SPLIT} may evaluate its argument~$m$ multiple times.
3471ebd1 288
289\fsec{Returns}
290
291The function \code{mp_split} returns a pointer to a copy of its argument $m$
292which is safe to modify.
293
294\subsubsection{The \code{mp_modify} function and \code{MP_MODIFY} macro}
295\label{fn:mp-modify}
296
297\fsec{Synopsis}
298
299\begin{listinglist}
300|#include <catacomb/mp.h>| \\
301|mp *mp_modify(mp *|$m$|, size_t |$\sz$|);| \\
302|void MP_MODIFY(mp *|$m$|, size_t |$\sz$|);|
303\end{listinglist}
304
305\fsec{Description}
306
307The function \code{mp_modify} returns a pointer to a suitable destination $d$
308for a result given a suggested destination $m$ and a required size $\sz$.
309The actual behaviour is complicated:
310
311\begin{itemize}
312\item If $m$ is equal to \code{MP_NEW}, or the \code{MP_CONST} flag is set on
674cd11e 313 $m$, then allocate a new integer~$d$ of size $\sz$ by calling
3471ebd1 314 \code{mp_create} (page~\pageref{fn:mp-create}).
315\item Otherwise, if $m$ has more than one reference, a reference is removed
316 asif by \code{mp_drop} (page~\pageref{fn:mp-drop}) and a new integer $d$ of
317 size $\sz$ is allocated as if by \code{mp_create}.
674cd11e 318\item Otherwise, the integer~$m$ is extended to $\sz$ words, as if by calling
319 \code{mp_ensure} (page~\pageref{fn:mp-ensure}), and $d$ is set equal
320 to~$m$.
3471ebd1 321\end{itemize}
322
323The resulting integer $d$ is returned from the function.
324
325The upshot of all of this is that $d$ has exactly one reference, space for
674cd11e 326$\sz$~words of data, and no particular value set. Any other references to
327the original integer~$m$ (whether counted or not) continue to refer to the
328same value. The total number of references to $d$ and~$m$ after
3471ebd1 329\code{mp_modify} returns is equal to the initial number of references to $m$
330before the call.
331
332The macro \code{MP_MODIFY} does the same job using inline code; it sets $m$
674cd11e 333to the new integer~$d$ when it's finished. The macro expands to a fairly
3471ebd1 334large chunk of code.
335
336\fsec{Returns}
337
674cd11e 338The function \code{mp_modify} returns the destination integer~$d$ as
3471ebd1 339described above.
340
341
342\subsection{Resizing multiprecision integers}
343\label{sec:mp-resize}
344
345There are a collection of useful functions and macros for changing the size
346of a multiprecision integer.
347
348\subsubsection{The \code{mp_resize} function and \code{MP_RESIZE} macro}
349\label{fn:mp-resize}
350
351\fsec{Synopsis}
352
353\begin{listinglist}
354|#include <catacomb/mp.h>| \\
355|void mp_resize(mp *|$m$|, size_t |$\sz$|);| \\
356|void MP_RESIZE(mp *|$m$|, size_t |$\sz$|);|
357\end{listinglist}
358
359\fsec{Description}
360
361The function \code{mp_resize} resizes the array used to hold the value of the
674cd11e 362integer~$m$ such that it has space for $\sz$ words. No check is made to see
3471ebd1 363whether $m$'s array is already the correct size. The integer's array limit
674cd11e 364pointer \code{vl} is set to point $\sz$~words higher than the base pointer
3471ebd1 365\code{v}.
366
367The macro \code{MP_RESIZE} performs exactly the same action as the
368\code{mp_resize} function, except that it uses inline code rather than a
369function call. It's slightly faster but uses quite a lot more code than the
370plain function call.
371
372\subsubsection{The \code{mp_ensure} function and \code{MP_ENSURE} macro}
373\label{fn:mp-ensure}
374
375\fsec{Synopsis}
376
377\begin{listinglist}
378|#include <catacomb/mp.h>| \\
379|void mp_ensure(mp *|$m$|, size_t |$\sz$|);| \\
380|void MP_ENSURE(mp *|$m$|, size_t |$\sz$|);|
381\end{listinglist}
382
383\fsec{Description}
384
385The function \code{mp_ensure} resizes the integer $m$ (as if by calling
386\code{mp_resize}) if its array is currently smaller than $\sz$ words; if not,
387the array is left alone. The integer's array limit pointer \code{vl} is set
388to point $\sz$ words higher than the base pointer \code{v}.
389
390The macro \code{MP_ENSURE} performs exactly the same action as the
391\code{mp_ensure} function, except that it uses inline code rather than a
392function call. It's slightly faster but uses considerably more code.
393
394\subsubsection{The \code{mp_shrink} function and \code{MP_SHRINK} macro}
395\label{fn:mp-shrink}
396
397\fsec{Synopsis}
398
399\begin{listinglist}
400|#include <catacomb/mp.h>| \\
401|void mp_shrink(mp *|$m$|);| \\
402|void MP_SHRINK(mp *|$m$|);|
403\end{listinglist}
404
405\fsec{Description}
406
407The function \code{mp_shrink} shrinks the representation of the integer $m$.
408It lowers the integer's array limit pointer so that the most significant word
409in the representation is nonzero. This improves the efficiency of arithmetic
410operations on $m$.
411
412Note that shrinking an integer doesn't release any memory. Use
413\code{mp_minimize} (below) for that.
414
415The macro \code{MP_SHRINK} performs exactly the same operation as
416\code{mp_shrink} using inline code rather than a function call.
417
418\subsubsection{The \code{mp_minimize} function}
419\label{fn:mp-minimize}
420
421\fsec{Synopsis}
422
423\begin{listinglist}
424|#include <catacomb/mp.h>| \\
425|void mp_minimize(mp *|$m$|);|
426\end{listinglist}
427
428\fsec{Description}
429
674cd11e 430Minimizes the amount of memory used by the represenation of the integer~$m$.
3471ebd1 431This isn't usually useful unless $m$ is going to remain constant for a long
432time, or has become extremely large for some reason.
433
434
435\subsection{Bit-scanning}
436\label{sec:mp-scan}
437
438\subsection{Loading and storing}
439\label{sec:mp-io}
440
441\subsection{Simple arithmetic}
442\label{sec:mp-arith}
443
444\subsection{More advanced algorithms}
445\label{sec:mp-adv}
446
674cd11e 447\subsubsection{The \code{mp_gcd} function}
448\label{fn:mp-gcd}
449
450\fsec{Synopsis}
451
452\begin{listinglist}
453|#include <catacomb/mp.h>| \\
454|void mp_gcd(mp **|$g$|, mp **|$a$|, mp **|$b$|, mp *|$x$|, mp *|$y$|);|
455\end{listinglist}
456
457\fsec{Description}
458
459The function \code{mp_gcd} implements an `extended GCD' algorithm. Given
460integers $x$ and~$y$, it computes integers $g$, $a$, and~$b$ such that $g =
461\gcd(x, y)$ is the greatest common divisor of $x$ and $y$, and $ax + by = g$.
462
463If the arguments $g$, $a$ or~$b$ are null pointers, the appropriate result is
464discarded; otherwise, the current value of $|*|g$, $|*|a$ or $|*|b$ is used
465as a suggested destination for the result, and replaced by the actual result.
466
467If both $a$ and~$b$ are null, the results $a$ and $b$ are not computed.
468Hence \code{mp_gcd} can efficiently perform a simple greatest-common-divisor
469computation.
470
471A major use for \code{mp_gcd} is the calculation of modular inverses. If
472$\gcd(x, n) = 1$ then \code{mp_gcd} computes integers $a$ and~$b$ such that
473$ax + bn = 1$, i.e., $ax \equiv 1 \pmod n$, so $a$ is the inverse of $x$,
474written $a \equiv x^{-1} \pmod n$.
475
476In order to better support this usage, \code{mp_gcd} implements a \emph{sign
477preference} system. If only $a$ is requested, it will either be zero or
478have the same sign as~$y$; otherwise, $b$ will be zero or have the same sign
479as~$x$. Thus, $x^{-1} \bmod n$ may be conveniently computed using the code
480\begin{listing}
481mp *xinv = MP_NEW;
482mp_gcd(0, 0, &xinv, n, x);
483\end{listing}
484(It's conventional to ensure that $x > y$, even though it doesn't actually
485make any difference.)
486
487The function \code{mp_gcd} works correctly for all integers $x$ and $y$. The
488following properties of the results are guaranteed:
489\begin{itemize}
490 \unverb\|
491\item $\gcd(x, y) = \gcd(y, x)$ for all integers $x$ and~$y$.
492\item $\gcd(x, 0) = \gcd(0, x) = x$ for all integers~$x$.
493\item $|a| < |y|$, unless $y = 0$; similarly, $|b| < |x|$ if $x \ne 0$.
494\item If $x = y = 0$, then $a = b = 0$. Otherwise, if $x = 0$, then $a = 0$
495 and $b = 1$; similarly, if $y = 0$, then $a = 1$ and $b = 0$.
496\end{itemize}
497
498\subsubsection{The \code{mp_jacobi} function}
499\label{fn:mp-jacobi}
500
501\fsec{Synopsis}
502
503\begin{listinglist}
504|#include <catacomb/mp.h>| \\
505|int mp_jacobi(mp *|$a$|, mp *|$n$|);|
506\end{listinglist}
507
508\fsec{Returns}
509
510The value of the Jacobi symbol $\jacobi{a}{n}$.\footnote{%
511 Schneier writes the Jacobi symbol as $J(a, n)$. The notation
512 $(\!\frac{a}{n}\!)$ seems more usual.} %
513The Jacobi symbol $\jacobi{a}{n}$ is defined for all integers~$a$ and odd
514integers~$n$ in terms of the Legendre symbol $\jacobi{a}{p}$ for prime~$p$,
515which is:
516\[
517 \jacobi{a}{p} = \begin{cases}
518 0 & if $a$ is a multiple of $p$ \\
519 -1 & if $a$ is a quadratic nonresidue mod $p$ \\
520 1 & if $a$ is a quadratic residue mod $p$
521 \end{cases}
522\]
523(An integer $x$ is a quadratic residue mod $n$ if and only if there exists an
524integer $y$ such that $y^2 \equiv x \pmod n$. Note that $\jacobi{a}{p}
525\equiv a^{(p - 1)/2} \pmod p$.)
526
527The Jacobi symbol is defined as follows: let $p_0^{e_0} p_1^{e_1} \ldots
528p_k^{e_k} = n$ be the prime factors of $n$. Then
529\[
530 \jacobi{a}{n} = \jacobi{a}{p_0}^{e_0} \jacobi{a}{p_1}^{e_1} \cdots
531 \jacobi{a}{p_k}^{e_k}
532\]
533Fortunately, the Jacobi symbol can be computed efficiently without factoring
534$n$ or performing modular exponentiations.
535
536The Jacobi symbol is used in the Solovay-Strassen primality test (not
537implemented in Catacomb -- the Rabin-Miller test is better), and in some
538subliminal channels in digital signature algorithms.
539
540The RSA public-key encryption algorithm leaks the Jacobi symbol of its
541plaintext.
542
45c0fd36 543%%% Local Variables:
3471ebd1 544%%% mode: latex
545%%% TeX-master: "catacomb"
45c0fd36 546%%% End: