+Multiprecision maths library
+
+
+ Catacomb comes with a fairly reasonable multiprecision maths
+ library. It's not stunningly fast, and doesn't do everything,
+ but it seems good for the sorts of things needed in
+ cryptographic algorithms (which, for example, GMP isn't[1]).
+
+ There are basically three layers to the system:
+
+ * The low-level `mpx' layer, which implements some basic
+ arithmetic but is fairly unpleasant to actually use.
+
+ * A high-level `mp' layer, which provides a useful `mp'
+ multiprecision type and implements useful arithmetic on
+ them.
+
+ * Everything else. A collection of miscellaneous routines for
+ performing particular operations efficiently or simply.
+
+ [1] In particular, GMP is truly rotten at converting numbers
+ into well-defined binary formats. It has its own format,
+ but it's not defined in the manual, has changed once before,
+ probably doesn't conform to any standards in the
+ cryptographic community, and can only be written to or read
+ from a file stream, so you can't use it in memory (for
+ example to encrypt it). Basically, it's a right pain.
+
+
+The low-level interface
+
+ The low level is described in `mpx.h'. It includes everything
+ else it needs.
+
+ An integer is represented in an array of words. Each word has
+ type `mpw' (multi precision word). The least significant word
+ comes first. An array is described by its `base' -- the address
+ of the first element -- and its `limit' -- the address *just
+ after* the last element. Thus, the length of the array is
+ `limit - base'. This representation is particularly convenient
+ for many things. An old version used base and length, which was
+ a nuisance.
+
+ The largest value which can be represented in a word is
+ MPW_MAX. The number of bits used in the representation of a
+ word is MPW_BITS. Actually, you might be able to use more bits
+ and store larger numbers, but everything will go wrong if you
+ try.
+
+ The macro call MPW(x) returns the low MPW_BITS bits of x. It's
+ very useful in low-level arithmetic.
+
+ The call MPWS(n) tells you the `sizeof' an array of n words.
+ It's also very useful. Try not to get MPW and MPWS confused
+ because they mean very different things.
+
+ The actual low-level macros and functions are documented
+ relatively well. They are given source and destination arrays
+ for things to be read and written to. Usually the sources
+ mustn't overlap the destinations, and destinations certainly
+ mustn't overlap other destinations; sometimes this restriction
+ is lifted, where it would be both convenient and easy to do so.
+ Sources can overlap each other without problems, because they're
+ not written to.
+
+ The difficult parts are the Karatsuba functions. Karatsuba
+ and Ofman discovered a neat recursive way to multiply large
+ numbers. It requires quite a bit of workspace, and adds some
+ overhead, but has lower asymptotic complexity than the standard
+ multiply algorithm. Usually, Karatsuba functions get used
+ automatically by the rest of the library when appropriate and
+ you don't have to care.
+
+
+The high-level interface
+
+ The high-level invents a type, `mp'. It's a structured type,
+ with the following members:
+
+ mpw *v, *vl Base and limit of the word array.
+ size_t sz Number of words actually allocated to the array.
+ unsigned f Various flags.
+ unsigned ref Number of references to the integer.
+
+ The flags are interesting. Here's what they mean:
+
+ MP_NEG The number is negative
+ MP_BURN Burn the number after using it
+ MP_CONST The number mustn't be modified or freed
+ MP_UNDEF The number contains no interesting value
+ MP_DESTROYED The number has been freed once already
+
+ The last one is handy for catching bugs. MP_NEG is obvious --
+ Catacomb uses a signed-magnitude representation because things
+ are usually easier that way. MP_CONST is set on various
+ constants provided for general convenience so they don't get
+ destroyed by accident. MP_UNDEF is used to avoid copying
+ useless data.
+
+ MP_BURN marks a `secret' number. Secret numbers are overwritten
+ with zeroes (`burnt') when they're freed. Secrecy is a viral
+ concept: anything calculated from a secret number is also
+ secret, so they get burnt too.
+
+ Most of the time, you refer to numbers by their addresses. Lots
+ of `mp *' pointers get used. You hardly ever see a real `mp',
+ just pointers to them.
+
+ There are a collection of utility functions for keeping the
+ system running -- for creating new numbers without interesting
+ values, for throwing old ones away, for extending the amount of
+ memory they use, and so on.
+
+ Throwing numbers away is simple. You pass the pointer to the
+ number to `mp_drop'. If someone else still has a reference to
+ the number, it stays where it is, but remembers that you're not
+ interested in it any more. If nobody's interested then the
+ number is disposed of.
+
+ You can add a reference using MP_COPY. It returns the new
+ reference. The new reference is the same as the old one, but
+ it's a useful fiction to pretend that it's different.
+
+ Real arithmetic happens in a fairly uniform way. There's an
+ important point to make about `destinations', though. Just
+ about all of the arithmetic functions take `destination'
+ arguments (one for each result), with the idea that, at least
+ conceptually, they store the results there. What actually
+ happens is that a reference is lost to whatever the destination
+ used to refer to, and a reference to the new result is gained.
+ The address might be different, and then again it might not.
+ Destinations acn be the same as sources -- that works fine.
+ Finally, there's the magic value `MP_NEW'. MP_NEW is a special
+ constant which, as a destination, means `create a new place and
+ put the answer there'.
+
+ The idea is that, sometimes it helps that memory is already
+ allocated for a destination, so it doesn't have to be obtained
+ specially, and, even more often, you want to throw away old
+ intermediate results while you're calculating new ones.
+
+ Here are some examples of how to do it right:
+
+ x = mp_add(MP_NEW, y, z);
+ Assumes x didn't have a useful value before.
+ Afterwards it refers to the sum of y and z. y and z
+ themselves are unchanged.
+
+ y = mp_add(y, y, z);
+ Add z to y. Now y is the sum of z and the old y. z is
+ unchanged.
+
+ q = MP_NEW;
+ mp_div(&q, &x, x, y);
+ Sets q (a new value) equal to x / y, and puts the
+ remainder back in x. y is unchanged.
+
+ z = mp_sub(z, y, z);
+ Subtract z from y, putting the result back in z. y is
+ unchanged.
+
+ x = mp_mul(y, y, z);
+ Multiply y by z, putting the result in x, but making y
+ no longer useful.
+
+ Here's some examples of how to do it wrong:
+
+ mp_mul(y, y, z);
+ mp_mul might choose somewhere other than b's storage to
+ write its result. (In fact, the current version
+ certainly will.) So y is trashed because there are no
+ references to it any more, and the result of the
+ multiplication is lost.
+
+ x = mp_mul(y, y, z);
+ x = mp_add(x, x, y);
+ Whoops. We stored the result in x, but y still got
+ trashed and might contain anything. Adding it to x is a
+ bad move.
+
+ It's not hard once you get the hang of it, although it might
+ feel a bit weird to start with.
+
+
+Modular multiplication and exponentiation
+
+ Lots of public-key crypto uses modular multiplication and
+ exponentation operations. Doing them efficiently is very
+ important. Multiplication on its own is easy, and Catacomb's
+ multiplication algorithms are fairly good (though not up with
+ the very best). Doing the modular reduction afterwards is the
+ tricky bit.
+
+ There are three approaches to modular reduction in Catacomb.
+ All of them have their good and bad points.
+
+ 1. Use mp_div
+
+ For a one-off calculation, this is a good idea. Division is a
+ slow operation -- that's a shame, but life's like that. But
+ when you're doing a one-shot operation, that's the best you can
+ manage.
+
+ 2. Use Barrett reduction
+
+ Barrett reduction is a system whereby you do a little
+ precomputation up-front (one division, in fact) and are then
+ able to perform modular reductions without resorting to
+ expensive division operations. To use it, you initialize a
+ `context' with the modulus, reduce some numbers, and then
+ destroy the context when you've finished.
+
+ Conceptually, Barrett reduction is simple. You give it x and it
+ gives you back x mod m. The mathematics behind it is fairly
+ clever, though -- the real manual has a page of explanation for
+ how it works.
+
+ Barrett reduction works on any positive modulus m. It will do a
+ modular reduction on any integer x which is at most twice as
+ long as m (in words). x can be larger than m^2, but it mustn't
+ be more than twice as long.
+
+ 3. Use Montgomery reduction
+
+ Montgomery reduction is a little more clever. It does a little
+ more work up-front than Barrett reduction, and understanding the
+ benefits is a little harder. However, it's actually slightly
+ more efficient for general use. There's also a disadvantage
+ that Montgomery reduction only works when the modulus is *odd*.
+ It all goes horribly wrong with an even modulus. Don't try it.
+
+ I'm not going to explain how Montgomery reduction actually
+ works. But I am going to explain how to use it.
+
+ The precomputation for Montgomery reduction, performed when you
+ initialize a context, computes a value called R, and also R^2
+ (both mod m). (It also computes some other stuff, but that's
+ not important for this discussion.) Note that R is dependent on
+ the modulus m.
+
+ A Montgomery reduction takes a number x, which should be less
+ than m^2 (it can actually be a bit larger) and computes the
+ value x R^{-1} mod m. That doesn't sound very useful, does it?
+
+ Things start looking more hopeful when you multiply your inputs
+ by R. (There's a clever way of doing that: see below.) To
+ compute xy mod m, you can first calculate xR and yR, multiply
+ them together to get xyR^2, and do a Montgomery reduction to get
+ xyR^2 R^{-1} mod m. Then the R^{-1} cancels one of the Rs and
+ you end up with xyR mod m. Essentially, you do your entire
+ calculation with an extra factor of R all the way through it.
+
+ Removing the factor of R at the end is easy. You just run the
+ Montgomery reduction algorithm again -- the R^{-1} cancels the R
+ and the answer pops out the end. Getting the factor of R in the
+ first place is easy -- you multiply by R^2 and do a reduction.
+
+ The multiply-and-reduce thing is so common that there's a
+ dedicated function which does them both in one go.
+
+ Here's a simple example of multiplying two numbers mod m:
+
+ mpmont mm;
+ mp *ar, *br, *p;
+ mpmont_create(&mm, m);
+ ar = mp_mul(MP_NEW, a, m.r2); /* aR mod m */
+ br = mp_mul(MP_NEW, b, m.r2); /* bR mod m */
+ p = mpmont_mul(&mm, MP_NEW, ar, br); /* abR mod m */
+ p = mpmont_reduce(&mm, p, p); /* ab mod m */
+ mpmont_destroy(&mm);
+ mp_drop(ar);
+ mp_drop(br);
+
+ Actually, it's not a very clever example. Here's a better way,
+ for such a simple calculation:
+
+ mpmont mm;
+ mp *p;
+ mpmont_create(&mm, m);
+ p = mpmont_mul(&mm, MP_NEW, a, b); /* abR^{-1} mod m */
+ p = mpmont_mul(&mm, p, p, mm.r2); /* ab mod m */
+ mpmont_destroy(&mm);
+
+ Of course, for a single multiply-and-reduce, you're better off
+ just using
+
+ mp *p = mp_mul(MP_NEW, a, b);
+ mp_div(0, &p, p, m);
+
+ the traditional division method.
+
+ Both Barrett and Montgomery reduction methods have modular
+ exponentiation functions. If m is a message, n is an RSA
+ modulus, and e is the public exponent, I can do an RSA
+ encryption simply by doing
+
+ mpmont mm;
+ mpmont_create(&mm, n); /* RSA modulus is odd */
+ c = mpmont_exp(&mm, MP_NEW, m, e);
+ mpmont_destroy(&mm);
+
+ (Yes, it's well worth creating a Montgomery context for a
+ full-blown exponentiation, unless someone's deliberately gone
+ out of their way to make it easy for you by choosing a very
+ small public exponent.)
+
+
+Finding prime numbers
+
+ Prime numbers are important too. Finding them is not ever-so
+ easy.
+
+ Catacomb has two useful facilities. There's a fast sequential-
+ search filtering system called `pgen', and a good (but
+ probabilistic) primality tester which implements the Rabin-
+ Miller test.
+
+ The idea is that you initialize a pgen object with a starting
+ place. It spends a short time initializing itself, and then
+ tells you whether the number you gave it is (a) definitely
+ composite, (b) definitely prime (only for small numbers), or (c)
+ it doesn't know. The advantage of the pgen system is that it's
+ good at stepping regularly. You can advance it by 2 or 4 very
+ rapidly, and it will give you the same sort of answer for the
+ next number.
+
+ When you find a number which pgen says it doesn't know about,
+ you must test it more thoroughly. This is time-consuming, but
+ pgen has weeded out many no-hopers so it's not so bad.
+
+ The Rabin-Miller test takes a number to test, and stores it in a
+ context. You then give it some random numbers, and it gives you
+ results. When you've finished, you destroy the context.
+
+ The Rabin-Miller test has two possible responses: (a) definitely
+ not prime, and (b) probably prime. Answer (a) is certain.
+ Answer (b) you may get with a composite number at most one time
+ in four. If you try n times, the probability of a composite
+ number going unnoticed is at most one in 4^n. In fact, the real
+ probability is *much* lower than this.
+
+ It's also important to bear in mind that this is examining the
+ probability that a number passes n Rabin-Miller tests given
+ that it's composite. What's actually interesting the the
+ probability that a number is composite given that it passed n
+ tests. This probability is lower still. For 1024-bit numbers,
+ which are about the right size by current standards of security,
+ you need only five tests to ensure a probability of less than 1
+ in 2^80 of a composite number slipping through.
+
+ There are specialized functions for finding various sorts of
+ prime numbers with particular properties. Read the header files
+ to find out what they do.
+
+
+Conclusion
+
+ This basically concludes the tour of Catacomb's multiprecision
+ maths library. I hope it's provided enough background that the
+ comments start making sense.
+
+--
+[mdw]
+
+\f
+Local variables:
+mode: text
+End: