More changes. Still embryonic.
[u/mdw/catacomb] / manual / mp-mp.tex
1 %%% -*-latex-*-
2
3 \section{High-level multiprecision integer represention}
4 \label{sec:mp}
5
6 The high-level interface to multiprecision integers is exported by the header
7 file \hdr{<catacomb/mp.h>}.
8
9
10 \subsection{The \code{mp} structure}
11 \label{sec:mp-struct}
12
13 At the high level, a multiprecision integer is represented as an object of
14 type \code{mp}. Most of the time, programs use pointers to \code{mp} objects
15 which have been dynamically allocated. (See \code{mp_build},
16 page~\pageref{fn:mp-build}, for the exception to this rule.)
17
18 The \code{mp} type is a structure. Some of the members are considered
19 internal to the implementation. However, the following may be used by client
20 programs:
21 \begin{description}
22 \def\makelabel#1{\code{#1}\hfil}
23 \item[mpw *v, *vl] The base and limit of the multiprecision
24 integer array.
25 \item[unsigned f] Various flags describing the current state of the integer.
26 \end{description}
27 Other members are used for keeping track of how much memory has been
28 allocated to the number and for managing the reference counting system.
29 These members should not be accessed directly by application code. None of
30 the structure members should be modified directly.
31
32 The following flags are defined:
33 \begin{description*}
34 \let\makelabel\code
35 \item[MP_NEG] The integer is negative.
36 \item[MP_BURN] Clear the integer's storage after use.
37 \item[MP_CONST] The integer may not be altered or freed.
38 \item[MP_UNDEF] The integer currently has no interesting value.
39 \item[MP_DESTROYED] The integer has been destroyed.
40 \end{description*}
41
42 The \code{MP_BURN} flag is inherited by the results of calculations. Thus,
43 setting the flag on a value causes it and all other values calculated from it
44 to be destroyed after use.\footnote{%
45 There are currently a small number of places where the \code{MP_BURN} flag
46 may be lost from a result. These usually result from performing complex
47 calculations with degenerate arguments (e.g., computing a GCD with zero).
48 Fixing this is not a high priority.} %
49
50 The \code{MP_DESTROYED} flag is used by \code{mp_destroy}
51 (page~\pageref{fn:mp-destroy}) to trap attempts to destroy the same integer
52 more than once, which can otherwise lead to confusing bugs.
53
54
55 \subsubsection{The \code{MP_LEN} macro}
56 \label{fn:MP-LEN}
57
58 \fsec{Synopsis}
59
60 \begin{listinglist}
61 |#include <catacomb/mp.h>| \\
62 |ptrdiff_t MP_LEN(const mp *|$m$|);|
63 \end{listinglist}
64
65 \fsec{Returns}
66
67 The number of words used in the representation of the integer~$m$. Note that
68 the argument~$m$ may be evaluated multiple times by this macro.
69
70
71 \subsubsection{The \code{mp_burn} function}
72 \label{fn:mp-burn}
73
74 \fsec{Synopsis}
75
76 \begin{listinglist}
77 |#include <catacomb/mp.h>| \\
78 |void mp_burn(mp *|$m$|);|
79 \end{listinglist}
80
81 \fsec{Description}
82
83 Sets the \code{MP_BURN} flag on the integer~$m$. Whenever memory associated
84 with~$m$, or results derived from it, is deallocated, it is overwritten with
85 zeros to prevent traces being found in swap files or core dumps.
86
87
88 \subsection{Multiprecision integer constants}
89 \label{sec:mp-const}
90
91 There are a few commonly-used multiprecision integer constants provided. All
92 are declared in \hdr{<catacomb/mp.h>}:
93
94 \begin{tabular}[C]
95 {| >{\ttfamily}l | Mr @{\extracolsep{0pt plus 10000fill}\hskip\tabcolsep}
96 | c @{\extracolsep{0pt}} | >{\ttfamily}l | Mr |} \hlx{c{1,2,4,5}v}
97 \multicolumn{1}{|c|}{\textbf{Constant}} &
98 \multicolumn{1}{c|}{\textbf{Value}} & \qquad &
99 \multicolumn{1}{c|}{\textbf{Constant}} &
100 \multicolumn{1}{c|}{\textbf{Value}} \\ \hlx{vc{1,2,4,5}v}
101 MP_ZERO & 0 && MP_FOUR & 4 \\
102 MP_ONE & 1 && MP_FIVE & 5 \\
103 MP_TWO & 2 && MP_TEN & 10 \\
104 MP_THREE & 3 && MP_MONE & -1 \\ \hlx{vc{1,2,4,5}}
105 \end{tabular}
106 \goodbreak
107
108 All of the multiprecision constants are actually pointers to the real
109 values. All have the \code{MP_CONST} flag set, so they cannot be modified or
110 destroyed.
111
112
113 \subsection{Creation and destruction}
114 \label{sec:mp-create}
115
116 Multiprecision integers are reference counted; i.e., each \code{mp} remembers
117 how many references there are to it, and is destroyed only when the last
118 reference disappears.
119
120 New multiprecision integers are created using the functions \code{mp_create}
121 (page~\pageref{fn:mp-create}) or \code{mp_build}
122 (page~\pageref{fn:mp-build}). A newly created integer has one reference.
123 Additional references are returned by the function \code{mp_copy}
124 (page~\pageref{fn:mp-copy}). A reference may be removed from an integer by
125 calling \code{mp_drop} (page~\pageref{fn:mp-drop}).
126
127 It isn't usually a good idea to modify an integer to which there are multiple
128 references: the owners of the other references will usually react badly if
129 their number has changed. A modifiable copy may be made of an integer using
130 the \code{mp_split} function (page~\pageref{fn:mp-split}). If there are no
131 other references, the integer is returned unchanged; otherwise a real copy is
132 made and returned, and the number of references to the original is
133 decremented by one.
134
135 Most of the arithmetic functions allow the caller to state a preferred
136 location for the result. This may either be an existing integer or a
137 completely new one, requested by the special value \code{MP_NEW}. This
138 behaviour is provided by the \code{mp_modify} function
139 (page~\pageref{fn:mp-modify}).
140
141 Note that functions may not actually use the location requested, although if
142 it decides not to overwrite the requested destination, a reference to it is
143 removed.
144
145
146 \subsubsection{The \code{mp_create} function}
147 \label{fn:mp-create}
148
149 \fsec{Synopsis}
150
151 \begin{listinglist}
152 |#include <catacomb/mp.h>| \\
153 |mp *mp_create(size_t |$\sz$|);|
154 \end{listinglist}
155
156 \fsec{Description}
157
158 Allocates a new multiprecision integer, with at least $\sz$ words of storage
159 available for its representation. The flag \code{MP_UNDEF} is set initially;
160 the other flags are cleared. The integer's array limit pointer \code{vl} is
161 set to be $\sz$ higher than the base pointer \code{v}.
162
163 \fsec{Returns}
164
165 A pointer to a newly allocated multiprecision integer.
166
167 \subsubsection{The \code{mp_destroy} function}
168 \label{fn:mp-destroy}
169
170 \fsec{Synopsis}
171
172 \begin{listinglist}
173 |#include <catacomb/mp.h>| \\
174 |void mp_destroy(mp *|$m$|);|
175 \end{listinglist}
176
177 \fsec{Description}
178
179 Destroys the multiprecision integer~$m$. All resouces allocated to the
180 integer are freed. It's most unusual for this to be the function you want:
181 most of the time \code{mp_drop} (page~\pageref{fn:mp-drop}) is much more
182 useful.
183
184 If compiled with debugging support, the \code{mp_destroy} function will abort
185 if requested to destroy an integer whose \code{MP_CONST} or
186 \code{MP_DESTROYED} flags are set.
187
188 \subsubsection{The \code{mp_build} function}
189 \label{fn:mp-build}
190
191 \fsec{Synopsis}
192
193 \begin{listinglist}
194 |#include <catacomb/mp.h>| \\
195 |void mp_build(mp *|$m$|, mpw *|$v$|, mpw *|$\vl$|);|
196 \end{listinglist}
197
198 \fsec{Description}
199
200 Creates a constant multiprecision integer from an array of \code{mpw} words.
201 Storage for the \code{mp} object and the array is provided by the user --
202 \code{mp_build} allocates no memory. Conversely, it's safe to dispose of the
203 \code{mp} block and \code{mpw} array at any time, as long as there are no
204 live references to them.
205
206 The \code{mp_build} function is particularly useful for constructing small
207 multiprecision integers, and for addressing parts of other integers. For
208 example, if $x$ is a multiprecision integer, then
209 \begin{listing}
210 mp_build(&m, x->v + 5, x->vl);
211 \end{listing}
212 sets $m$ equal to $\lfloor x / 2^{5 \cdot \code{MPW_BITS}} \rfloor$ without
213 copying any data.\footnote{%
214 This technique is used in the implementation of Barrett reduction
215 (page~\pageref{sec:mp-barrett}), for example. See the source file
216 \code{mpbarrett.c} for details.} %
217
218 \subsubsection{The \code{mp_copy} function and \code{MP_COPY} macro}
219 \label{fn:mp-copy}
220
221 \fsec{Synopsis}
222
223 \begin{listinglist}
224 |#include <catacomb/mp.h>| \\
225 |mp *mp_copy(mp *|$m$|);| \\
226 |mp *MP_COPY(mp *|$m$|);|
227 \end{listinglist}
228
229 \fsec{Description}
230
231 The function \code{mp_copy} adds a reference to the multiprecision
232 integer~$m$. The integer will not be destroyed until all references to it
233 have been dropped.
234
235 The macro \code{MP_COPY} performs exactly the same action as the
236 \code{mp_copy} function, except that it uses inline code rather than a
237 function call. It's probably smaller and faster than the function call too.
238
239 \fsec{Returns}
240
241 The address of the new reference. This will usually be the same as the
242 original argument~$m$.
243
244 \subsubsection{The \code{mp_drop} function and \code{MP_DROP} macro}
245 \label{fn:mp-drop}
246
247 \fsec{Synopsis}
248
249 \begin{listinglist}
250 |#include <catacomb/mp.h>| \\
251 |void mp_drop(mp *|$m$|);| \\
252 |void MP_DROP(mp *|$m$|);|
253 \end{listinglist}
254
255 \fsec{Description}
256
257 The function \code{mp_drop} removes (`drops') a reference to the
258 multiprecision integer~$m$. If there are no more references to the integer,
259 it is destroyed, as if by \code{mp_destroy} (page~\pageref{fn:mp-destroy}).
260
261 The macro \code{MP_DROP} performs exactly the same action as the
262 \code{mp_drop} function, except that it uses inline code rather than a
263 function call, so it's slightly faster but uses a little more code.
264
265 \subsubsection{The \code{mp_split} function and \code{MP_SPLIT} macro}
266 \label{fn:mp-split}
267
268 \fsec{Synopsis}
269
270 \begin{listinglist}
271 |#include <catacomb/mp.h>| \\
272 |mp *mp_split(mp *|$m$|);| \\
273 |void MP_SPLIT(mp *|$m$|);|
274 \end{listinglist}
275
276 \fsec{Description}
277
278 If the integer~$m$ has only one reference, the \code{mp_split} function
279 returns $m$ unchanged; otherwise, a copy is made and returned, and the
280 reference count of $m$ is decremented. Thus, either way, \code{mp_split}
281 returns a pointer to an integer with the same value as $m$ and a single
282 refernce. Thus, the result of \code{mp_split} is therefore safe to modify.
283
284 The macro \code{MP_SPLIT} performs the same action as the \code{mp_split}
285 function except that it uses inline code rather than a function call, and it
286 modifies its argument in place rather than returning the new reference. The
287 macro \code{MP_SPLIT} may evaluate its argument~$m$ multiple times.
288
289 \fsec{Returns}
290
291 The function \code{mp_split} returns a pointer to a copy of its argument $m$
292 which is safe to modify.
293
294 \subsubsection{The \code{mp_modify} function and \code{MP_MODIFY} macro}
295 \label{fn:mp-modify}
296
297 \fsec{Synopsis}
298
299 \begin{listinglist}
300 |#include <catacomb/mp.h>| \\
301 |mp *mp_modify(mp *|$m$|, size_t |$\sz$|);| \\
302 |void MP_MODIFY(mp *|$m$|, size_t |$\sz$|);|
303 \end{listinglist}
304
305 \fsec{Description}
306
307 The function \code{mp_modify} returns a pointer to a suitable destination $d$
308 for a result given a suggested destination $m$ and a required size $\sz$.
309 The actual behaviour is complicated:
310
311 \begin{itemize}
312 \item If $m$ is equal to \code{MP_NEW}, or the \code{MP_CONST} flag is set on
313 $m$, then allocate a new integer~$d$ of size $\sz$ by calling
314 \code{mp_create} (page~\pageref{fn:mp-create}).
315 \item Otherwise, if $m$ has more than one reference, a reference is removed
316 asif by \code{mp_drop} (page~\pageref{fn:mp-drop}) and a new integer $d$ of
317 size $\sz$ is allocated as if by \code{mp_create}.
318 \item Otherwise, the integer~$m$ is extended to $\sz$ words, as if by calling
319 \code{mp_ensure} (page~\pageref{fn:mp-ensure}), and $d$ is set equal
320 to~$m$.
321 \end{itemize}
322
323 The resulting integer $d$ is returned from the function.
324
325 The upshot of all of this is that $d$ has exactly one reference, space for
326 $\sz$~words of data, and no particular value set. Any other references to
327 the original integer~$m$ (whether counted or not) continue to refer to the
328 same value. The total number of references to $d$ and~$m$ after
329 \code{mp_modify} returns is equal to the initial number of references to $m$
330 before the call.
331
332 The macro \code{MP_MODIFY} does the same job using inline code; it sets $m$
333 to the new integer~$d$ when it's finished. The macro expands to a fairly
334 large chunk of code.
335
336 \fsec{Returns}
337
338 The function \code{mp_modify} returns the destination integer~$d$ as
339 described above.
340
341
342 \subsection{Resizing multiprecision integers}
343 \label{sec:mp-resize}
344
345 There are a collection of useful functions and macros for changing the size
346 of a multiprecision integer.
347
348 \subsubsection{The \code{mp_resize} function and \code{MP_RESIZE} macro}
349 \label{fn:mp-resize}
350
351 \fsec{Synopsis}
352
353 \begin{listinglist}
354 |#include <catacomb/mp.h>| \\
355 |void mp_resize(mp *|$m$|, size_t |$\sz$|);| \\
356 |void MP_RESIZE(mp *|$m$|, size_t |$\sz$|);|
357 \end{listinglist}
358
359 \fsec{Description}
360
361 The function \code{mp_resize} resizes the array used to hold the value of the
362 integer~$m$ such that it has space for $\sz$ words. No check is made to see
363 whether $m$'s array is already the correct size. The integer's array limit
364 pointer \code{vl} is set to point $\sz$~words higher than the base pointer
365 \code{v}.
366
367 The macro \code{MP_RESIZE} performs exactly the same action as the
368 \code{mp_resize} function, except that it uses inline code rather than a
369 function call. It's slightly faster but uses quite a lot more code than the
370 plain function call.
371
372 \subsubsection{The \code{mp_ensure} function and \code{MP_ENSURE} macro}
373 \label{fn:mp-ensure}
374
375 \fsec{Synopsis}
376
377 \begin{listinglist}
378 |#include <catacomb/mp.h>| \\
379 |void mp_ensure(mp *|$m$|, size_t |$\sz$|);| \\
380 |void MP_ENSURE(mp *|$m$|, size_t |$\sz$|);|
381 \end{listinglist}
382
383 \fsec{Description}
384
385 The function \code{mp_ensure} resizes the integer $m$ (as if by calling
386 \code{mp_resize}) if its array is currently smaller than $\sz$ words; if not,
387 the array is left alone. The integer's array limit pointer \code{vl} is set
388 to point $\sz$ words higher than the base pointer \code{v}.
389
390 The macro \code{MP_ENSURE} performs exactly the same action as the
391 \code{mp_ensure} function, except that it uses inline code rather than a
392 function call. It's slightly faster but uses considerably more code.
393
394 \subsubsection{The \code{mp_shrink} function and \code{MP_SHRINK} macro}
395 \label{fn:mp-shrink}
396
397 \fsec{Synopsis}
398
399 \begin{listinglist}
400 |#include <catacomb/mp.h>| \\
401 |void mp_shrink(mp *|$m$|);| \\
402 |void MP_SHRINK(mp *|$m$|);|
403 \end{listinglist}
404
405 \fsec{Description}
406
407 The function \code{mp_shrink} shrinks the representation of the integer $m$.
408 It lowers the integer's array limit pointer so that the most significant word
409 in the representation is nonzero. This improves the efficiency of arithmetic
410 operations on $m$.
411
412 Note that shrinking an integer doesn't release any memory. Use
413 \code{mp_minimize} (below) for that.
414
415 The macro \code{MP_SHRINK} performs exactly the same operation as
416 \code{mp_shrink} using inline code rather than a function call.
417
418 \subsubsection{The \code{mp_minimize} function}
419 \label{fn:mp-minimize}
420
421 \fsec{Synopsis}
422
423 \begin{listinglist}
424 |#include <catacomb/mp.h>| \\
425 |void mp_minimize(mp *|$m$|);|
426 \end{listinglist}
427
428 \fsec{Description}
429
430 Minimizes the amount of memory used by the represenation of the integer~$m$.
431 This isn't usually useful unless $m$ is going to remain constant for a long
432 time, or has become extremely large for some reason.
433
434
435 \subsection{Bit-scanning}
436 \label{sec:mp-scan}
437
438 \subsection{Loading and storing}
439 \label{sec:mp-io}
440
441 \subsection{Simple arithmetic}
442 \label{sec:mp-arith}
443
444 \subsection{More advanced algorithms}
445 \label{sec:mp-adv}
446
447 \subsubsection{The \code{mp_gcd} function}
448 \label{fn:mp-gcd}
449
450 \fsec{Synopsis}
451
452 \begin{listinglist}
453 |#include <catacomb/mp.h>| \\
454 |void mp_gcd(mp **|$g$|, mp **|$a$|, mp **|$b$|, mp *|$x$|, mp *|$y$|);|
455 \end{listinglist}
456
457 \fsec{Description}
458
459 The function \code{mp_gcd} implements an `extended GCD' algorithm. Given
460 integers $x$ and~$y$, it computes integers $g$, $a$, and~$b$ such that $g =
461 \gcd(x, y)$ is the greatest common divisor of $x$ and $y$, and $ax + by = g$.
462
463 If the arguments $g$, $a$ or~$b$ are null pointers, the appropriate result is
464 discarded; otherwise, the current value of $|*|g$, $|*|a$ or $|*|b$ is used
465 as a suggested destination for the result, and replaced by the actual result.
466
467 If both $a$ and~$b$ are null, the results $a$ and $b$ are not computed.
468 Hence \code{mp_gcd} can efficiently perform a simple greatest-common-divisor
469 computation.
470
471 A major use for \code{mp_gcd} is the calculation of modular inverses. If
472 $\gcd(x, n) = 1$ then \code{mp_gcd} computes integers $a$ and~$b$ such that
473 $ax + bn = 1$, i.e., $ax \equiv 1 \pmod n$, so $a$ is the inverse of $x$,
474 written $a \equiv x^{-1} \pmod n$.
475
476 In order to better support this usage, \code{mp_gcd} implements a \emph{sign
477 preference} system. If only $a$ is requested, it will either be zero or
478 have the same sign as~$y$; otherwise, $b$ will be zero or have the same sign
479 as~$x$. Thus, $x^{-1} \bmod n$ may be conveniently computed using the code
480 \begin{listing}
481 mp *xinv = MP_NEW;
482 mp_gcd(0, 0, &xinv, n, x);
483 \end{listing}
484 (It's conventional to ensure that $x > y$, even though it doesn't actually
485 make any difference.)
486
487 The function \code{mp_gcd} works correctly for all integers $x$ and $y$. The
488 following properties of the results are guaranteed:
489 \begin{itemize}
490 \unverb\|
491 \item $\gcd(x, y) = \gcd(y, x)$ for all integers $x$ and~$y$.
492 \item $\gcd(x, 0) = \gcd(0, x) = x$ for all integers~$x$.
493 \item $|a| < |y|$, unless $y = 0$; similarly, $|b| < |x|$ if $x \ne 0$.
494 \item If $x = y = 0$, then $a = b = 0$. Otherwise, if $x = 0$, then $a = 0$
495 and $b = 1$; similarly, if $y = 0$, then $a = 1$ and $b = 0$.
496 \end{itemize}
497
498 \subsubsection{The \code{mp_jacobi} function}
499 \label{fn:mp-jacobi}
500
501 \fsec{Synopsis}
502
503 \begin{listinglist}
504 |#include <catacomb/mp.h>| \\
505 |int mp_jacobi(mp *|$a$|, mp *|$n$|);|
506 \end{listinglist}
507
508 \fsec{Returns}
509
510 The value of the Jacobi symbol $\jacobi{a}{n}$.\footnote{%
511 Schneier writes the Jacobi symbol as $J(a, n)$. The notation
512 $(\!\frac{a}{n}\!)$ seems more usual.} %
513 The Jacobi symbol $\jacobi{a}{n}$ is defined for all integers~$a$ and odd
514 integers~$n$ in terms of the Legendre symbol $\jacobi{a}{p}$ for prime~$p$,
515 which is:
516 \[
517 \jacobi{a}{p} = \begin{cases}
518 0 & if $a$ is a multiple of $p$ \\
519 -1 & if $a$ is a quadratic nonresidue mod $p$ \\
520 1 & if $a$ is a quadratic residue mod $p$
521 \end{cases}
522 \]
523 (An integer $x$ is a quadratic residue mod $n$ if and only if there exists an
524 integer $y$ such that $y^2 \equiv x \pmod n$. Note that $\jacobi{a}{p}
525 \equiv a^{(p - 1)/2} \pmod p$.)
526
527 The Jacobi symbol is defined as follows: let $p_0^{e_0} p_1^{e_1} \ldots
528 p_k^{e_k} = n$ be the prime factors of $n$. Then
529 \[
530 \jacobi{a}{n} = \jacobi{a}{p_0}^{e_0} \jacobi{a}{p_1}^{e_1} \cdots
531 \jacobi{a}{p_k}^{e_k}
532 \]
533 Fortunately, the Jacobi symbol can be computed efficiently without factoring
534 $n$ or performing modular exponentiations.
535
536 The Jacobi symbol is used in the Solovay-Strassen primality test (not
537 implemented in Catacomb -- the Rabin-Miller test is better), and in some
538 subliminal channels in digital signature algorithms.
539
540 The RSA public-key encryption algorithm leaks the Jacobi symbol of its
541 plaintext.
542
543 %%% Local Variables:
544 %%% mode: latex
545 %%% TeX-master: "catacomb"
546 %%% End: