X-Git-Url: https://git.distorted.org.uk/~mdw/catacomb/blobdiff_plain/674cd11ec63b56561980249cb19a0db54bacfa86..759513d1c2f79d94abe726682f43a28f363229cc:/README.mp

diff --git a/README.mp b/README.mp
new file mode 100644
index 00000000..a7909f2e
--- /dev/null
+++ b/README.mp
@@ -0,0 +1,368 @@
+Multiprecision maths library
+
+
+	Catacomb comes with a fairly reasonable multiprecision maths
+	library.  It's not stunningly fast, and doesn't do everything,
+	but it seems good for the sorts of things needed in
+	cryptographic algorithms (which, for example, GMP isn't[1]).
+
+	There are basically three layers to the system:
+
+	  * The low-level `mpx' layer, which implements some basic
+	    arithmetic but is fairly unpleasant to actually use.
+
+	  * A high-level `mp' layer, which provides a useful `mp'
+	    multiprecision type and implements useful arithmetic on
+	    them.
+
+	  * Everything else.  A collection of miscellaneous routines for
+	    performing particular operations efficiently or simply.
+
+	[1] In particular, GMP is truly rotten at converting numbers
+	    into well-defined binary formats.  It has its own format,
+	    but it's not defined in the manual, has changed once before,
+	    probably doesn't conform to any standards in the
+	    cryptographic community, and can only be written to or read
+	    from a file stream, so you can't use it in memory (for
+	    example to encrypt it).  Basically, it's a right pain.
+
+
+The low-level interface
+
+	The low level is described in `mpx.h'.  It includes everything
+	else it needs.
+
+	An integer is represented in an array of words.  Each word has
+	type `mpw' (multi precision word).  The least significant word
+	comes first.  An array is described by its `base' -- the address
+	of the first element -- and its `limit' -- the address *just
+	after* the last element.  Thus, the length of the array is
+	`limit - base'.  This representation is particularly convenient
+	for many things.  An old version used base and length, which was
+	a nuisance.
+
+	The largest value which can be represented in a word is
+	MPW_MAX.  The number of bits used in the representation of a
+	word is MPW_BITS.  Actually, you might be able to use more bits
+	and store larger numbers, but everything will go wrong if you
+	try.
+
+	The macro call MPW(x) returns the low MPW_BITS bits of x.  It's
+	very useful in low-level arithmetic.
+
+	The call MPWS(n) tells you the `sizeof' an array of n words.
+	It's also very useful.  Try not to get MPW and MPWS confused
+	because they mean very different things.
+
+	The actual low-level macros and functions are documented
+	relatively well.  They are given source and destination arrays
+	for things to be read and written to.  Usually the sources
+	mustn't overlap the destinations, and destinations certainly
+	mustn't overlap other destinations; sometimes this restriction
+	is lifted, where it would be both convenient and easy to do so.
+	Sources can overlap each other without problems, because they're
+	not written to.
+
+	The difficult parts are the Karatsuba functions.  Karatsuba
+	and Ofman discovered a neat recursive way to multiply large
+	numbers.  It requires quite a bit of workspace, and adds some
+	overhead, but has lower asymptotic complexity than the standard
+	multiply algorithm.  Usually, Karatsuba functions get used
+	automatically by the rest of the library when appropriate and
+	you don't have to care.
+
+
+The high-level interface
+
+	The high-level invents a type, `mp'.  It's a structured type,
+	with the following members:
+
+	mpw *v, *vl	Base and limit of the word array.
+	size_t sz	Number of words actually allocated to the array.
+	unsigned f	Various flags.
+	unsigned ref	Number of references to the integer.
+
+	The flags are interesting.  Here's what they mean:
+
+	MP_NEG		The number is negative
+	MP_BURN		Burn the number after using it
+	MP_CONST	The number mustn't be modified or freed
+	MP_UNDEF	The number contains no interesting value
+	MP_DESTROYED	The number has been freed once already
+
+	The last one is handy for catching bugs.  MP_NEG is obvious --
+	Catacomb uses a signed-magnitude representation because things
+	are usually easier that way.  MP_CONST is set on various
+	constants provided for general convenience so they don't get
+	destroyed by accident.  MP_UNDEF is used to avoid copying
+	useless data.
+
+	MP_BURN marks a `secret' number.  Secret numbers are overwritten
+	with zeroes (`burnt') when they're freed.  Secrecy is a viral
+	concept: anything calculated from a secret number is also
+	secret, so they get burnt too.
+
+	Most of the time, you refer to numbers by their addresses.  Lots
+	of `mp *' pointers get used.  You hardly ever see a real `mp',
+	just pointers to them.
+
+	There are a collection of utility functions for keeping the
+	system running -- for creating new numbers without interesting
+	values, for throwing old ones away, for extending the amount of
+	memory they use, and so on.
+
+	Throwing numbers away is simple.  You pass the pointer to the
+	number to `mp_drop'.  If someone else still has a reference to
+	the number, it stays where it is, but remembers that you're not
+	interested in it any more.  If nobody's interested then the
+	number is disposed of.
+
+	You can add a reference using MP_COPY.  It returns the new
+	reference.  The new reference is the same as the old one, but
+	it's a useful fiction to pretend that it's different.
+
+	Real arithmetic happens in a fairly uniform way.  There's an
+	important point to make about `destinations', though.  Just
+	about all of the arithmetic functions take `destination'
+	arguments (one for each result), with the idea that, at least
+	conceptually, they store the results there.  What actually
+	happens is that a reference is lost to whatever the destination
+	used to refer to, and a reference to the new result is gained.
+	The address might be different, and then again it might not.
+	Destinations acn be the same as sources -- that works fine.
+	Finally, there's the magic value `MP_NEW'.  MP_NEW is a special
+	constant which, as a destination, means `create a new place and
+	put the answer there'.
+
+	The idea is that, sometimes it helps that memory is already
+	allocated for a destination, so it doesn't have to be obtained
+	specially, and, even more often, you want to throw away old
+	intermediate results while you're calculating new ones.
+
+	Here are some examples of how to do it right:
+
+	x = mp_add(MP_NEW, y, z);
+		Assumes x didn't have a useful value before.
+		Afterwards it refers to the sum of y and z.  y and z
+		themselves are unchanged.
+
+	y = mp_add(y, y, z);
+		Add z to y.  Now y is the sum of z and the old y.  z is
+		unchanged.
+
+	q = MP_NEW;
+	mp_div(&q, &x, x, y);
+		Sets q (a new value) equal to x / y, and puts the
+		remainder back in x.  y is unchanged.
+
+	z = mp_sub(z, y, z);
+		Subtract z from y, putting the result back in z.  y is
+		unchanged.
+
+	x = mp_mul(y, y, z);
+		Multiply y by z, putting the result in x, but making y
+		no longer useful.
+
+	Here's some examples of how to do it wrong:
+
+	mp_mul(y, y, z);
+		mp_mul might choose somewhere other than b's storage to
+		write its result.  (In fact, the current version
+		certainly will.)  So y is trashed because there are no
+		references to it any more, and the result of the
+		multiplication is lost.
+
+	x = mp_mul(y, y, z);
+	x = mp_add(x, x, y);
+		Whoops.  We stored the result in x, but y still got
+		trashed and might contain anything.  Adding it to x is a
+		bad move.
+
+	It's not hard once you get the hang of it, although it might
+	feel a bit weird to start with.
+
+
+Modular multiplication and exponentiation
+
+	Lots of public-key crypto uses modular multiplication and
+	exponentation operations.  Doing them efficiently is very
+	important.  Multiplication on its own is easy, and Catacomb's
+	multiplication algorithms are fairly good (though not up with
+	the very best).  Doing the modular reduction afterwards is the
+	tricky bit.
+
+	There are three approaches to modular reduction in Catacomb.
+	All of them have their good and bad points.
+
+	1. Use mp_div
+
+	For a one-off calculation, this is a good idea.  Division is a
+	slow operation -- that's a shame, but life's like that.  But
+	when you're doing a one-shot operation, that's the best you can
+	manage.
+
+	2. Use Barrett reduction
+
+	Barrett reduction is a system whereby you do a little
+	precomputation up-front (one division, in fact) and are then
+	able to perform modular reductions without resorting to
+	expensive division operations.  To use it, you initialize a
+	`context' with the modulus, reduce some numbers, and then
+	destroy the context when you've finished.
+
+	Conceptually, Barrett reduction is simple.  You give it x and it
+	gives you back x mod m.  The mathematics behind it is fairly
+	clever, though -- the real manual has a page of explanation for
+	how it works.
+
+	Barrett reduction works on any positive modulus m.  It will do a
+	modular reduction on any integer x which is at most twice as
+	long as m (in words).  x can be larger than m^2, but it mustn't
+	be more than twice as long.
+
+	3. Use Montgomery reduction
+
+	Montgomery reduction is a little more clever.  It does a little
+	more work up-front than Barrett reduction, and understanding the
+	benefits is a little harder.  However, it's actually slightly
+	more efficient for general use.  There's also a disadvantage
+	that Montgomery reduction only works when the modulus is *odd*.
+	It all goes horribly wrong with an even modulus.  Don't try it.
+
+	I'm not going to explain how Montgomery reduction actually
+	works.  But I am going to explain how to use it.
+
+	The precomputation for Montgomery reduction, performed when you
+	initialize a context, computes a value called R, and also R^2
+	(both mod m).  (It also computes some other stuff, but that's
+	not important for this discussion.)  Note that R is dependent on
+	the modulus m.
+
+	A Montgomery reduction takes a number x, which should be less
+	than m^2 (it can actually be a bit larger) and computes the
+	value x R^{-1} mod m.  That doesn't sound very useful, does it?
+
+	Things start looking more hopeful when you multiply your inputs
+	by R.  (There's a clever way of doing that: see below.)  To
+	compute xy mod m, you can first calculate xR and yR, multiply
+	them together to get xyR^2, and do a Montgomery reduction to get
+	xyR^2 R^{-1} mod m.  Then the R^{-1} cancels one of the Rs and
+	you end up with xyR mod m.  Essentially, you do your entire
+	calculation with an extra factor of R all the way through it.
+
+	Removing the factor of R at the end is easy.  You just run the
+	Montgomery reduction algorithm again -- the R^{-1} cancels the R
+	and the answer pops out the end.  Getting the factor of R in the
+	first place is easy -- you multiply by R^2 and do a reduction.
+
+	The multiply-and-reduce thing is so common that there's a
+	dedicated function which does them both in one go.
+
+	Here's a simple example of multiplying two numbers mod m:
+
+		mpmont mm;
+		mp *ar, *br, *p;
+		mpmont_create(&mm, m);
+		ar = mp_mul(MP_NEW, a, m.r2);		/* aR mod m */
+		br = mp_mul(MP_NEW, b, m.r2);		/* bR mod m */
+		p = mpmont_mul(&mm, MP_NEW, ar, br);	/* abR mod m */
+		p = mpmont_reduce(&mm, p, p);		/* ab mod m */
+		mpmont_destroy(&mm);
+		mp_drop(ar);
+		mp_drop(br);
+
+	Actually, it's not a very clever example.  Here's a better way,
+	for such a simple calculation:
+
+		mpmont mm;
+		mp *p;
+		mpmont_create(&mm, m);
+		p = mpmont_mul(&mm, MP_NEW, a, b);	/* abR^{-1} mod m */
+		p = mpmont_mul(&mm, p, p, mm.r2);	/* ab mod m */
+		mpmont_destroy(&mm);
+
+	Of course, for a single multiply-and-reduce, you're better off
+	just using
+
+		mp *p = mp_mul(MP_NEW, a, b);
+		mp_div(0, &p, p, m);
+
+	the traditional division method.
+
+	Both Barrett and Montgomery reduction methods have modular
+	exponentiation functions.  If m is a message, n is an RSA
+	modulus, and e is the public exponent, I can do an RSA
+	encryption simply by doing
+
+		mpmont mm;
+		mpmont_create(&mm, n);		/* RSA modulus is odd */
+		c = mpmont_exp(&mm, MP_NEW, m, e);
+		mpmont_destroy(&mm);
+
+	(Yes, it's well worth creating a Montgomery context for a
+	full-blown exponentiation, unless someone's deliberately gone
+	out of their way to make it easy for you by choosing a very
+	small public exponent.)
+
+
+Finding prime numbers
+
+	Prime numbers are important too.  Finding them is not ever-so
+	easy.
+
+	Catacomb has two useful facilities.  There's a fast sequential-
+	search filtering system called `pgen', and a good (but
+	probabilistic) primality tester which implements the Rabin-
+	Miller test.
+
+	The idea is that you initialize a pgen object with a starting
+	place.  It spends a short time initializing itself, and then
+	tells you whether the number you gave it is (a) definitely
+	composite, (b) definitely prime (only for small numbers), or (c)
+	it doesn't know.  The advantage of the pgen system is that it's
+	good at stepping regularly.  You can advance it by 2 or 4 very
+	rapidly, and it will give you the same sort of answer for the
+	next number.
+
+	When you find a number which pgen says it doesn't know about,
+	you must test it more thoroughly.  This is time-consuming, but
+	pgen has weeded out many no-hopers so it's not so bad.
+
+	The Rabin-Miller test takes a number to test, and stores it in a
+	context.  You then give it some random numbers, and it gives you
+	results.  When you've finished, you destroy the context.
+
+	The Rabin-Miller test has two possible responses: (a) definitely
+	not prime, and (b) probably prime.  Answer (a) is certain.
+	Answer (b) you may get with a composite number at most one time
+	in four.  If you try n times, the probability of a composite
+	number going unnoticed is at most one in 4^n.  In fact, the real
+	probability is *much* lower than this.
+
+	It's also important to bear in mind that this is examining the
+	probability that a number passes n Rabin-Miller tests given
+	that it's composite.  What's actually interesting the the
+	probability that a number is composite given that it passed n
+	tests.  This probability is lower still.  For 1024-bit numbers,
+	which are about the right size by current standards of security,
+	you need only five tests to ensure a probability of less than 1
+	in 2^80 of a composite number slipping through.
+
+	There are specialized functions for finding various sorts of
+	prime numbers with particular properties.  Read the header files
+	to find out what they do.
+
+
+Conclusion
+
+	This basically concludes the tour of Catacomb's multiprecision
+	maths library.  I hope it's provided enough background that the
+	comments start making sense.
+
+--
+[mdw]
+
+
+Local variables:
+mode: text
+End: