X-Git-Url: https://git.distorted.org.uk/u/mdw/catacomb/blobdiff_plain/836e0c19dceaad96151e7f7f42898d1263130ba4..3471ebd194145da52d419c6315459237b076e18d:/manual/mp-mpx.tex diff --git a/manual/mp-mpx.tex b/manual/mp-mpx.tex new file mode 100644 index 0000000..f775f6b --- /dev/null +++ b/manual/mp-mpx.tex @@ -0,0 +1,635 @@ +%%% -*-latex-*- + +\section{Low-level details: MPX} +\label{sec:mpx} + + +\subsection{Data types for low-level arithmetic} + +The header file \hdr{} defines two types, \code{mpw} and +\code{mpd}, and a collection of macros describing them. + +The `multiprecision word' type, \code{mpw}, is an unsigned integer type whose +representation uses \code{MPW_BITS}. The largest value representable in an +\code{mpw} is $\code{MPW_MAX} = 2^{\code{MPW_BITS}} - 1$. +The value of \code{MPW_BITS} is at least 16. Note that, on some +architectures, an \code{mpw} may be capable of representing values larger +than \code{MPW_MAX}. The expression \code{MPW($x$)} returns the value of $x +\bmod \code{MPW_MAX}$ cast to type \code{mpw}. + +The `multiprecision doubleword' type, \code{mpd}, is an unsigned integer type +with at least double the width of \code{mpw}. Hence, an \code{mpd} is +capable of representing the product of two \code{mpw}s, and is useful when +performing multiplications and divisions. The constants \code{MPD_BITS} and +\code{MPD_MAX} are defined to be the number of bits used in the +representation of an \code{mpd} and the largest value representable in an +\code{mpd} respectively. + +A few macros useful for manipulating \code{mpw}s are defined. + +\subsubsection{The \code{MPW} macro} +\label{fn:MPW} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|mpw MPW(|$x$|);| +\end{listinglist} + +\fsec{Returns} + +The value of $x \bmod 2^{\code{MPW_BITS}}$, as an \code{mpw}. This is a +frequently used abbreviation for |(mpw)(|$x$| & MPW_MAX)|. + +\subsubsection{The \code{MPWS} macro} +\label{fn:MPWS} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|size_t MPWS(size_t |$n$|);| +\end{listinglist} + +\fsec{Returns} + +The number of bytes occupied by an array of $n$ \code{mpw}s (i.e., $n \cdot +|sizeof(mpw)|$). + +\subsubsection{The \code{MPW_RQ} macro} +\label{fn:MPW-RQ} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|size_t MPW_RQ(size_t |$\sz$|);| +\end{listinglist} + +\fsec{Returns} + +The number of \code{mpw}s required to represent a multiprecision integer +occupying $\sz$ octets in an external representation. + + +\subsection{Low-level multiprecision integer representation} + +The low-level multiprecision definitions are exported in the header file +\hdr{}. This header includes \hdr{}, so +there's usually no need to include it explicitly. + +A multiprecision integer is represented as an array of \code{mpw}s, +least-significant first. The array's bounds are described by a a pair of +pointers to the array's \emph{base} (i.e., its first element) and +\emph{limit} (i.e., the location immediately following its last element). + +Let $v$ be the base and $\vl$ the limit of a multiprecision integer array. +The array itself is notated as $v .. \vl$. The array's size in words may be +computed as $\vl - v$ in the normal C fashion. The integer represented by +the array, denoted $\mp(v .. \vl)$, is defined to be +\[ + \mp(v .. \vl) = + \sum_{0 \le i < \vl - v} + 2^{\code{MPW_BITS} \cdot i} v[i] +\] +If the array is empty (i.e., $v = \vl$) then the number is zero. If the +array is empty, or the final word is nonzero, then the representation is said +to be \emph{shrunken}. Shrunken representations are more efficient, since +the arithmetic algorithms don't need to consider high-order words which make +no difference to the final result anyway. + +Whenever a result is too large to be represented in the memory allocated for +it, high-order bits are discarded. Thus, a result written to an array of $k$ +words is reduced modulo $2^{\code{MPW_BITS} \cdot k}$. + + +\subsection{Low-level macros} +\label{sec:mpx-macro} + +The following macros perform various simple operations useful for +manipulating multiprecision integers at the MPX level. + +\subsubsection{The \code{MPX_SHRINK} macro} +\label{fn:MPX-SHRINK} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|MPX_SHRINK(const mpw *|$v$|, const mpw *|$\vl$|);| +\end{listinglist} + +\fsec{Description} + +The \code{MPX_SHRINK} macro reduces the limit $\vl$ of a multiprecision +integer array so that either $v = \vl$ or $\vl[-1] \ne 0$. The $\vl$ +argument must be an lvalue, since the macro writes the result back when it +finishes. + +\subsubsection{The \code{MPX_BITS} macro} +\label{fn:MPX-BITS} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|MPX_BITS(unsigned long |$b$|, const mpw *|$v$|, const mpw *|$\vl$|);| +\end{listinglist} + +\fsec{Description} + +Determines the smallest number of bits which could represent the number +$\mp(v .. \vl)$, and writes the answer to $b$, which must therefore be an +lvalue. The result is zero if the number is zero; otherwise $b$ is the +largest integer such that +\[ \mp(v .. \vl) \ge 2^{(b - 1) \bmod \code{MPW_BITS}}.\] + +\subsubsection{The \code{MPX_OCTETS} macro} +\label{fn:MPX-OCTETS} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|MPX_OCTETS(size_t |$o$|,| + |const mpw *|$v$|, const mpw *|$\vl$|);| +\end{listinglist} + +\fsec{Description} + +Determines the smallest number of octets which could represent the number +$\mp(v .. \vl)$, and writes the answer to $o$, which must therefore be an +lvalue. This is useful for determining appropriate buffer sizes for the +results of \code{mpx_storel} and \code{mpx_storeb}. + +The result $o$ can be calculated from the number of bits $b$ reported by +\code{MPX_BITS} as $o = \lceil b / 8 \rceil$; the algorithm used by +\code{MPX_OCTETS} is more efficient than this, however. + +\subsubsection{The \code{MPX_COPY} macro} +\label{fn:MPX-COPY} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|MPX_COPY(mpw *|$\dv$|, mpw *|$\dvl$|,| + |const mpw *|$\av$|, const mpw *|$\avl$|);| +\end{listinglist} + +\fsec{Description} + +Copies a multiprecision integer from the source array $\av .. \avl$ to the +destination array $\dv .. \dvl$. If the destination array is large enough, +the result is equal to the source; otherwise, high-order bits are discarded +as usual. + +\subsubsection{The \code{MPX_ZERO} macro} +\label{fn:MPX-ZERO} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|MPX_ZERO(mpw *|$v$|, mpw *|$\vl$|);| +\end{listinglist} + +\fsec{Description} + +Sets the integer $\mp(v .. \vl)$ to zero. + + +\subsection{Transfer formats: loading and storing} +\label{sec:mpx-io} + +The MPX layer can translate between internal representations of integers as +arrays of \code{mpw}s and external formats where integers are stored as +arrays of octets. Both little- and big-endian orders are supported. + +If $a_0, a_1, \ldots, a_{k - 1}$ is an array of $k$ octets, with +$0 \le a_i < 256$ for all $0 \le i < k$, then the number $n$ represented by +$a$ in little-endian byte order is +\[ n = \sum_{0 \le i < k} 256^i a_i .\] +Similarly, the number $n'$ represented by $a$ in big-endian byte order is +\[ n' = \sum_{0 \le i < k} 256^{k - i - 1} a_i .\] + +If, in a store operation, the destination octet array is not large enough to +represent the number to be stored, high-order octets are discarded; hence, +the number is reduced modulo $2^{8k}$, where $k$ is the size of the +destination array in octets. + +Two useful macros help when working out how much memory is needed for +translation between internal and transfer formats for multiprecision +integers. The macro \code{MPX_OCTETS} (page~\pageref{fn:MPX-OCTETS}) +calculates the smallest octet array size which can represent a multiprecision +integer. The macro \code{MPW_RQ} (page~\pageref{fn:MPW-RQ}) calculates the +smallest \code{mpw} array which can represent a multiprecision integer held +in an octet array of a particular size. + +\subsubsection{The \code{mpx_storel} function} +\label{fn:mpx-storel} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|void mpx_storel(const mpw *|$v$|, const mpw *|$\vl$|,| + |void *|$p$|, size_t |$\sz$|);| +\end{listinglist} + +\fsec{Description} + +Stores the number held in the array $v .. \vl$ to the array of $\sz$ octets +starting at address $p$ in little-endian byte order (i.e., least significant +byte first). + +\subsubsection{The \code{mpx_loadl} function} +\label{fn:mpx-loadl} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|void mpx_loadl(mpw *|$v$|, mpw *|$\vl$|,| + |const void *|$p$|, size_t |$\sz$|);| +\end{listinglist} + +\fsec{Description} + +Loads into the array $v .. \vl$ the number represented in the array of $\sz$ +octets starting at address $p$ in little-endian byte order (i.e., least +significant byte first). + +\subsubsection{The \code{mpx_storeb} function} +\label{fn:mpx-storeb} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|void mpx_storeb(const mpw *|$v$|, const mpw *|$\vl$|,| + |void *|$p$|, size_t |$\sz$|);| +\end{listinglist} + +\fsec{Description} + +Stores the number held in the array $v .. \vl$ to the array of $\sz$ octets +starting at address $p$ in big-endian byte order (i.e., least significant +byte last). + +\subsubsection{The \code{mpx_loadb} function} +\label{fn:mpx-loadb} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|void mpx_loadb(mpw *|$v$|, mpw *|$\vl$|,| + |const void *|$p$|, size_t |$\sz$|);| +\end{listinglist} + +\fsec{Description} + +Loads into the array $v .. \vl$ the number represented in the array of $\sz$ +octets starting at address $p$ in big-endian byte order (i.e., least +significant byte last). + + +\subsection{Bit shifting operations} +\label{sec:mpx-shift} + +The MPX layer provides functions for efficient multiplication and division of +multiprecision integers by powers of two, performed simply by shifting the +binary representation left or right by a number of bits. Shifts by one bit +and by a multiple of \code{MPW_BITS} are particularly fast. + +There are two shift functions, one for left shifts (multiplications) and one +for right shifts (divisions). Each function is passed an array containing +the number to be shifted, a destination array for the result (which may be +the source array), and the number of bits to be shifted as an unsigned +integer. + + +\subsubsection{The \code{mpx_lsl} function} +\label{fn:mpx-lsl} + +\fsec{Synopsis} + +\begin{listinglist} +\begin{tabbing} +|#include | \\ +|void mpx_lsl(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\ + \>|const mpw *|$\av$|, const mpw *|$\avl$|, size_t |$n$|);| +\end{tabbing} +\end{listinglist} + +\fsec{Description} + +Stores in $\dv .. \dvl$ the result of shifting $\mp(\av .. \avl)$ left by $n$ +bits (i.e., multiplying it by $2^n$). + +\subsubsection{The \code{mpx_lsr} function} +\label{fn:mpx-lsr} + +\fsec{Synopsis} + +\begin{listinglist} +\begin{tabbing} +|#include | \\ +|void mpx_lsr(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\ + \>|const mpw *|$\av$|, const mpw *|$\avl$|, size_t |$n$|);| +\end{tabbing} +\end{listinglist} + +\fsec{Description} + +Stores in $\dv .. \dvl$ the result of shifting $\mp(\av .. \avl)$ right by +$n$ bits (i.e., dividing it by $2^n$, rounding towards zero). + + +\subsection{Low level arithmetic} + +The remaining functions perform standard arithmetic operations on large +integers. The form for the arguments is fairly well-standardized: +destination arrays are given first, followed by the source arrays in the +conventional order. (This ordering reflects the usual order in a C +assignment expression, the \code{strcpy} and \code{memcpy} functions, and the +order of operands in many assembly languages.) + +Some functions allow the destination array to be the same as one (or both) +source arrays; others forbid this. Under no circumstances may the +destination partially overlap the sources. + + +\subsubsection{The \code{mpx_2c} function} +\label{fn:mpx-2c} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|void mpx_2c(mpw *|$\dv$|, mpw *|$\dvl$|,| + |const mpw *|$v$|, const mpw *|$\vl$|);| +\end{listinglist} + +\fsec{Description} + +Computes the two's complement of the number $\mp(v .. \vl)$ and stores it in +$\dv .. \dvl$. The two arrays $v .. \vl$ and $\dv .. \dvl$ may be the same. + +\subsubsection{The \code{mpx_ucmp} function and \code{MPX_UCMP} macro} +\label{fn:mpx-ucmp} + +\fsec{Synopsis} +\begin{listinglist} +\begin{tabbing} +|#include | \\ +|int mpx_ucmp(|\=|const mpw *|$\av$|, const mpw *|$\avl$|,| \\ + \>|const mpw *|$\bv$|, const mpw *|$\bvl$|);| \\ +|int MPX_UCMP(|\=|const mpw *|$\av$|, const mpw *|$\avl$|, |\synt{rel-op}|,| + \\ \>|const mpw *|$\bv$|, const mpw *|$\bvl$|);| +\end{tabbing} +\end{listinglist} + +\fsec{Description} + +The function \code{mpx_ucmp} performs an unsigned comparison of two unsigned +multiprecision integers $a$ and $b$, passed in the arrays $\av .. \avl$ and +$\bv .. \bvl$ respectively. + +The macro \code{MPX_UCMP} provides a slightly more readable interface for +comparing integers. The \synt{rel-op} may be any C relational operator +(e.g., |!=|, or |<=|). The macro tests whether +$a \mathrel{\synt{rel-op}} b$. + +\fsec{Returns} + +The function \code{mpx_ucmp} returns a value less then, equal to, or +greater than zero depending on whether $a$ is less than, equal to or greater +than $b$. + +The macro \code{MPX_UCMP} returns a nonzero result if $a +\mathrel{\synt{rel-op}} b$ is true, and zero if false. + +\subsubsection{The \code{mpx_uadd} function} +\label{fn:mpx-uadd} + +\fsec{Synopsis} + +\begin{listinglist} +\begin{tabbing} +|#include | \\ +|void mpx_uadd(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| + |const mpw *|$\av$|, const mpw *|$\avl$|,| \\ + \>|const mpw *|$\bv$|, const mpw *|$\bvl$|);| +\end{tabbing} +\end{listinglist} + +\fsec{Description} + +Adds two multiprecision integers. The sum of the two arguments +$\mp(\av .. \avl) + \mp(\bv .. \bvl)$ is stored in $\dv .. \dvl$. The +destination array may be equal to either or both source arrays.\footnote{% + Adding a number to itself is a poor way of doubling. A call to + \code{mpx_lsl} (page~\pageref{fn:mpx-lsl}) is much more efficient.} % + +\subsubsection{The \code{mpx_uaddn} function and \code{MPX_UADDN} macro} +\label{fn:mpx-uaddn} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|void mpx_uaddn(mpw *|$\dv$|, mpw *|$\dvl$|, mpw |$n$|);| \\ +|void MPX_UADDN(mpw *|$\dv$|, mpw *|$\dvl$|, mpw |$n$|);| +\end{listinglist} + +\fsec{Description} + +The function \code{mpx_uaddn} adds a small integer $n$ (expressed as a single +\code{mpw}) to the multiprecision integer held in $\dv .. \dvl$. + +The macro \code{MPX_UADDN} performs exactly the same operation, but uses +inline code rather than calling a function. + +\subsubsection{The \code{mpx_usub} function} +\label{fn:mpx-usub} + +\fsec{Synopsis} + +\begin{listinglist} +\begin{tabbing} +|#include | \\ +|void mpx_usub(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| + |const mpw *|$\av$|, const mpw *|$\avl$|,| \\ + \>|const mpw *|$\bv$|, const mpw *|$\bvl$|);| +\end{tabbing} +\end{listinglist} + +\fsec{Description} + +Subtracts one multiprecision integer from another. The difference of the two +arguments $\mp(\av .. \avl) - \mp(\bv .. \bvl)$ is stored in $\dv .. \dvl$. +The destination array may be equal to either or both source +arrays.\footnote{% + Subtracting a number from itself is a particularly poor way of clearing an + integer to zero. A call to \code{MPX_ZERO} (page~\pageref{fn:MPX-ZERO}) is + much more efficient.} % + +Because overly large results are truncated to fit the destination array, if +the second argument is larger than the first, a two's-complement result is +obtained. + +\subsubsection{The \code{mpx_usubn} function and \code{MPX_USUBN} macro} +\label{fn:mpx-usubn} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|void mpx_usubn(mpw *|$\dv$|, mpw *|$\dvl$|, mpw |$n$|);| \\ +|void MPX_USUBN(mpw *|$\dv$|, mpw *|$\dvl$|, mpw |$n$|);| +\end{listinglist} + +\fsec{Description} + +The function \code{mpx_usubn} subtracts a small integer $n$ (expressed as a +single \code{mpw}) from the multiprecision integer held in $\dv .. \dvl$. + +The macro \code{MPX_USUBN} performs exactly the same operation, but uses +inline code rather than calling a function. + +\subsubsection{The \code{mpx_umul} function} +\label{fn:mpx-umul} + +\fsec{Synopsis} + +\begin{listinglist} +\begin{tabbing} +|#include | \\ +|void mpx_umul(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| + |const mpw *|$\av$|, const mpw *|$\avl$|,| \\ + \>|const mpw *|$\bv$|, const mpw *|$\bvl$|);| +\end{tabbing} +\end{listinglist} + +\fsec{Description} + +Multiplies two multiprecision integers. The product of the two arguments +$\mp(\av .. \avl) \times \mp(\bv .. \bvl)$ is stored in $\dv .. \dvl$. The +destination array may not be equal to either source array. + +\subsubsection{The \code{mpx_umuln} function and \code{MPX_UMULN} macro} +\label{fn:mpx-umuln} + +\fsec{Synopsis} + +\begin{listinglist} +\begin{tabbing} +|#include | \\ +|void mpx_umuln(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\ + \> |const mpw *|$\av$|, const mpw *|$\avl$|, mpw |$n$|);| \\ +|void MPX_UMULN(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\ + \> |const mpw *|$\av$|, const mpw *|$\avl$|, mpw |$n$|);| +\end{tabbing} +\end{listinglist} + +\fsec{Description} + +The function \code{mpx_umuln} multiplies the multiprecision integer passed in +$\av .. \avl$ by a small integer $n$ (expressed as a single \code{mpw}), +writing the product $n \cdot \mp(\av .. \avl)$ to the destination array $\dv +.. \dvl$. The destination array may be equal to the source array $\av +.. \avl$. + +The macro \code{MPX_UMULN} performs exactly the same operation, but uses +inline code rather than calling a function. + +\subsubsection{The \code{mpx_umlan} function and \code{MPX_UMLAN} macro} +\label{fn:mpx-umlan} + +\fsec{Synopsis} + +\begin{listinglist} +\begin{tabbing} +|#include | \\* +|void mpx_umlan(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\* + \> |const mpw *|$\av$|, const mpw *|$\avl$|, mpw |$n$|);| \\ +|void MPX_UMLAN(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\ + \> |const mpw *|$\av$|, const mpw *|$\avl$|, mpw |$n$|);| +\end{tabbing} +\end{listinglist} + +\fsec{Description} + +The function \code{mpx_umlan} multiplies the multiprecision integer passed in +$\av .. \avl$ by a small integer $n$ (expressed as a single \code{mpw}), and +adds it to the value already stored in the destination array $\dv .. \dvl$. +The destination array may be equal to the source array $\av .. \avl$, +although this isn't very useful. + +The macro \code{MPX_UMLAN} performs exactly the same operation, but uses +inline code rather than calling a function. + +\subsubsection{The \code{mpx_usqr} function} +\label{fn:mpx-usqr} + +\fsec{Synopsis} + +\begin{listinglist} +|#include | \\ +|void mpx_usqr(mpw *|$\dv$|, mpw *|$\dvl$|,| + |const mpw *|$\av$|, const mpw *|$\avl$|);| +\end{listinglist} + +\fsec{Description} + +Squares a multiprecision integer. The result $\mp(\av .. \avl)^2$ is stored +in $\dv .. \dvl$. The destination array may not be equal to the source +array. + +Squaring a number is approximately twice as fast as multiplying a number by +itself. + +\subsubsection{The \code{mpx_udiv} function} +\label{fn:mpx-udiv} + +\fsec{Synopsis} + +\begin{listinglist} +\begin{tabbing} +|#include | \\ +|void mpx_udiv(|\=|mpw *|$\qv$|, mpw *|$\qvl$|, mpw *|$\rv$|, mpw *|$\rvl$|,| +\\ \>|const mpw *|$\dv$|, const mpw *|$\dvl$|,| + |mpw *|$\mathit{sv}$|, mpw *|$\mathit{svl}$|);| +\end{tabbing} +\end{listinglist} + +\fsec{Description} + +Calculates the quotient and remainder when one multiprecision integer is +divided by another. + +The calling convention is slightly strange. The dividend and divisor are +passed in the arrays $\rv .. \rvl$ and $\dv .. \dvl$ respectively. The +quotient is written to the array $\qv .. \qvl$; the remainder is left in $\rv +.. \rvl$. + +The division function requires some workspace. The `scratch' array +$\mathit{sv} .. \mathit{svl}$ must be at least one word longer than the +divisor array $\dv .. \dvl$. The scratch array'sfinal contents are not +useful. Also, the remainder array $\rv .. \rvl$ must have at least two +words' headroom. + +\unverb\| +Given a dividend $x$ and a divisor $y$, the function calculates a quotient +$q$ and remainder $r$ such that $q = \lfloor x / y \rfloor$ and $x = qy + r$. +In particular, this definition implies that $r$ has the same sign as $y$, +which is a useful property when performing modular reductions. +\shortverb\| + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "catacomb" +%%% End: