git.distorted.org.uk Git - u/mdw/catacomb/blob - README.mp

   1 Multiprecision maths library
   2
   3
   4         Catacomb comes with a fairly reasonable multiprecision maths
   5         library.  It's not stunningly fast, and doesn't do everything,
   6         but it seems good for the sorts of things needed in
   7         cryptographic algorithms (which, for example, GMP isn't[1]).
   8
   9         There are basically three layers to the system:
  10
  11           * The low-level `mpx' layer, which implements some basic
  12             arithmetic but is fairly unpleasant to actually use.
  13
  14           * A high-level `mp' layer, which provides a useful `mp'
  15             multiprecision type and implements useful arithmetic on
  16             them.
  17
  18           * Everything else.  A collection of miscellaneous routines for
  19             performing particular operations efficiently or simply.
  20
  21         [1] In particular, GMP is truly rotten at converting numbers
  22             into well-defined binary formats.  It has its own format,
  23             but it's not defined in the manual, has changed once before,
  24             probably doesn't conform to any standards in the
  25             cryptographic community, and can only be written to or read
  26             from a file stream, so you can't use it in memory (for
  27             example to encrypt it).  Basically, it's a right pain.
  28
  29
  30 The low-level interface
  31
  32         The low level is described in `mpx.h'.  It includes everything
  33         else it needs.
  34
  35         An integer is represented in an array of words.  Each word has
  36         type `mpw' (multi precision word).  The least significant word
  37         comes first.  An array is described by its `base' -- the address
  38         of the first element -- and its `limit' -- the address *just
  39         after* the last element.  Thus, the length of the array is
  40         `limit - base'.  This representation is particularly convenient
  41         for many things.  An old version used base and length, which was
  42         a nuisance.
  43
  44         The largest value which can be represented in a word is
  45         MPW_MAX.  The number of bits used in the representation of a
  46         word is MPW_BITS.  Actually, you might be able to use more bits
  47         and store larger numbers, but everything will go wrong if you
  48         try.
  49
  50         The macro call MPW(x) returns the low MPW_BITS bits of x.  It's
  51         very useful in low-level arithmetic.
  52
  53         The call MPWS(n) tells you the `sizeof' an array of n words.
  54         It's also very useful.  Try not to get MPW and MPWS confused
  55         because they mean very different things.
  56
  57         The actual low-level macros and functions are documented
  58         relatively well.  They are given source and destination arrays
  59         for things to be read and written to.  Usually the sources
  60         mustn't overlap the destinations, and destinations certainly
  61         mustn't overlap other destinations; sometimes this restriction
  62         is lifted, where it would be both convenient and easy to do so.
  63         Sources can overlap each other without problems, because they're
  64         not written to.
  65
  66         The difficult parts are the Karatsuba functions.  Karatsuba
  67         and Ofman discovered a neat recursive way to multiply large
  68         numbers.  It requires quite a bit of workspace, and adds some
  69         overhead, but has lower asymptotic complexity than the standard
  70         multiply algorithm.  Usually, Karatsuba functions get used
  71         automatically by the rest of the library when appropriate and
  72         you don't have to care.
  73
  74
  75 The high-level interface
  76
  77         The high-level invents a type, `mp'.  It's a structured type,
  78         with the following members:
  79
  80         mpw *v, *vl     Base and limit of the word array.
  81         size_t sz       Number of words actually allocated to the array.
  82         unsigned f      Various flags.
  83         unsigned ref    Number of references to the integer.
  84
  85         The flags are interesting.  Here's what they mean:
  86
  87         MP_NEG          The number is negative
  88         MP_BURN         Burn the number after using it
  89         MP_CONST        The number mustn't be modified or freed
  90         MP_UNDEF        The number contains no interesting value
  91         MP_DESTROYED    The number has been freed once already
  92
  93         The last one is handy for catching bugs.  MP_NEG is obvious --
  94         Catacomb uses a signed-magnitude representation because things
  95         are usually easier that way.  MP_CONST is set on various
  96         constants provided for general convenience so they don't get
  97         destroyed by accident.  MP_UNDEF is used to avoid copying
  98         useless data.
  99
 100         MP_BURN marks a `secret' number.  Secret numbers are overwritten
 101         with zeroes (`burnt') when they're freed.  Secrecy is a viral
 102         concept: anything calculated from a secret number is also
 103         secret, so they get burnt too.
 104
 105         Most of the time, you refer to numbers by their addresses.  Lots
 106         of `mp *' pointers get used.  You hardly ever see a real `mp',
 107         just pointers to them.
 108
 109         There are a collection of utility functions for keeping the
 110         system running -- for creating new numbers without interesting
 111         values, for throwing old ones away, for extending the amount of
 112         memory they use, and so on.
 113
 114         Throwing numbers away is simple.  You pass the pointer to the
 115         number to `mp_drop'.  If someone else still has a reference to
 116         the number, it stays where it is, but remembers that you're not
 117         interested in it any more.  If nobody's interested then the
 118         number is disposed of.
 119
 120         You can add a reference using MP_COPY.  It returns the new
 121         reference.  The new reference is the same as the old one, but
 122         it's a useful fiction to pretend that it's different.
 123
 124         Real arithmetic happens in a fairly uniform way.  There's an
 125         important point to make about `destinations', though.  Just
 126         about all of the arithmetic functions take `destination'
 127         arguments (one for each result), with the idea that, at least
 128         conceptually, they store the results there.  What actually
 129         happens is that a reference is lost to whatever the destination
 130         used to refer to, and a reference to the new result is gained.
 131         The address might be different, and then again it might not.
 132         Destinations acn be the same as sources -- that works fine.
 133         Finally, there's the magic value `MP_NEW'.  MP_NEW is a special
 134         constant which, as a destination, means `create a new place and
 135         put the answer there'.
 136
 137         The idea is that, sometimes it helps that memory is already
 138         allocated for a destination, so it doesn't have to be obtained
 139         specially, and, even more often, you want to throw away old
 140         intermediate results while you're calculating new ones.
 141
 142         Here are some examples of how to do it right:
 143
 144         x = mp_add(MP_NEW, y, z);
 145                 Assumes x didn't have a useful value before.
 146                 Afterwards it refers to the sum of y and z.  y and z
 147                 themselves are unchanged.
 148
 149         y = mp_add(y, y, z);
 150                 Add z to y.  Now y is the sum of z and the old y.  z is
 151                 unchanged.
 152
 153         q = MP_NEW;
 154         mp_div(&q, &x, x, y);
 155                 Sets q (a new value) equal to x / y, and puts the
 156                 remainder back in x.  y is unchanged.
 157
 158         z = mp_sub(z, y, z);
 159                 Subtract z from y, putting the result back in z.  y is
 160                 unchanged.
 161
 162         x = mp_mul(y, y, z);
 163                 Multiply y by z, putting the result in x, but making y
 164                 no longer useful.
 165
 166         Here's some examples of how to do it wrong:
 167
 168         mp_mul(y, y, z);
 169                 mp_mul might choose somewhere other than b's storage to
 170                 write its result.  (In fact, the current version
 171                 certainly will.)  So y is trashed because there are no
 172                 references to it any more, and the result of the
 173                 multiplication is lost.
 174
 175         x = mp_mul(y, y, z);
 176         x = mp_add(x, x, y);
 177                 Whoops.  We stored the result in x, but y still got
 178                 trashed and might contain anything.  Adding it to x is a
 179                 bad move.
 180
 181         It's not hard once you get the hang of it, although it might
 182         feel a bit weird to start with.
 183
 184
 185 Modular multiplication and exponentiation
 186
 187         Lots of public-key crypto uses modular multiplication and
 188         exponentiation operations.  Doing them efficiently is very
 189         important.  Multiplication on its own is easy, and Catacomb's
 190         multiplication algorithms are fairly good (though not up with
 191         the very best).  Doing the modular reduction afterwards is the
 192         tricky bit.
 193
 194         There are three approaches to modular reduction in Catacomb.
 195         All of them have their good and bad points.
 196
 197         1. Use mp_div
 198
 199         For a one-off calculation, this is a good idea.  Division is a
 200         slow operation -- that's a shame, but life's like that.  But
 201         when you're doing a one-shot operation, that's the best you can
 202         manage.
 203
 204         2. Use Barrett reduction
 205
 206         Barrett reduction is a system whereby you do a little
 207         precomputation up-front (one division, in fact) and are then
 208         able to perform modular reductions without resorting to
 209         expensive division operations.  To use it, you initialize a
 210         `context' with the modulus, reduce some numbers, and then
 211         destroy the context when you've finished.
 212
 213         Conceptually, Barrett reduction is simple.  You give it x and it
 214         gives you back x mod m.  The mathematics behind it is fairly
 215         clever, though -- the real manual has a page of explanation for
 216         how it works.
 217
 218         Barrett reduction works on any positive modulus m.  It will do a
 219         modular reduction on any integer x which is at most twice as
 220         long as m (in words).  x can be larger than m^2, but it mustn't
 221         be more than twice as long.
 222
 223         3. Use Montgomery reduction
 224
 225         Montgomery reduction is a little more clever.  It does a little
 226         more work up-front than Barrett reduction, and understanding the
 227         benefits is a little harder.  However, it's actually slightly
 228         more efficient for general use.  There's also a disadvantage
 229         that Montgomery reduction only works when the modulus is *odd*.
 230         It all goes horribly wrong with an even modulus.  Don't try it.
 231
 232         I'm not going to explain how Montgomery reduction actually
 233         works.  But I am going to explain how to use it.
 234
 235         The precomputation for Montgomery reduction, performed when you
 236         initialize a context, computes a value called R, and also R^2
 237         (both mod m).  (It also computes some other stuff, but that's
 238         not important for this discussion.)  Note that R is dependent on
 239         the modulus m.
 240
 241         A Montgomery reduction takes a number x, which should be less
 242         than m^2 (it can actually be a bit larger) and computes the
 243         value x R^{-1} mod m.  That doesn't sound very useful, does it?
 244
 245         Things start looking more hopeful when you multiply your inputs
 246         by R.  (There's a clever way of doing that: see below.)  To
 247         compute xy mod m, you can first calculate xR and yR, multiply
 248         them together to get xyR^2, and do a Montgomery reduction to get
 249         xyR^2 R^{-1} mod m.  Then the R^{-1} cancels one of the Rs and
 250         you end up with xyR mod m.  Essentially, you do your entire
 251         calculation with an extra factor of R all the way through it.
 252
 253         Removing the factor of R at the end is easy.  You just run the
 254         Montgomery reduction algorithm again -- the R^{-1} cancels the R
 255         and the answer pops out the end.  Getting the factor of R in the
 256         first place is easy -- you multiply by R^2 and do a reduction.
 257
 258         The multiply-and-reduce thing is so common that there's a
 259         dedicated function which does them both in one go.
 260
 261         Here's a simple example of multiplying two numbers mod m:
 262
 263                 mpmont mm;
 264                 mp *ar, *br, *p;
 265                 mpmont_create(&mm, m);
 266                 ar = mp_mul(MP_NEW, a, m.r2);           /* aR mod m */
 267                 br = mp_mul(MP_NEW, b, m.r2);           /* bR mod m */
 268                 p = mpmont_mul(&mm, MP_NEW, ar, br);    /* abR mod m */
 269                 p = mpmont_reduce(&mm, p, p);           /* ab mod m */
 270                 mpmont_destroy(&mm);
 271                 mp_drop(ar);
 272                 mp_drop(br);
 273
 274         Actually, it's not a very clever example.  Here's a better way,
 275         for such a simple calculation:
 276
 277                 mpmont mm;
 278                 mp *p;
 279                 mpmont_create(&mm, m);
 280                 p = mpmont_mul(&mm, MP_NEW, a, b);      /* abR^{-1} mod m */
 281                 p = mpmont_mul(&mm, p, p, mm.r2);       /* ab mod m */
 282                 mpmont_destroy(&mm);
 283
 284         Of course, for a single multiply-and-reduce, you're better off
 285         just using
 286
 287                 mp *p = mp_mul(MP_NEW, a, b);
 288                 mp_div(0, &p, p, m);
 289
 290         the traditional division method.
 291
 292         Both Barrett and Montgomery reduction methods have modular
 293         exponentiation functions.  If m is a message, n is an RSA
 294         modulus, and e is the public exponent, I can do an RSA
 295         encryption simply by doing
 296
 297                 mpmont mm;
 298                 mpmont_create(&mm, n);          /* RSA modulus is odd */
 299                 c = mpmont_exp(&mm, MP_NEW, m, e);
 300                 mpmont_destroy(&mm);
 301
 302         (Yes, it's well worth creating a Montgomery context for a
 303         full-blown exponentiation, unless someone's deliberately gone
 304         out of their way to make it easy for you by choosing a very
 305         small public exponent.)
 306
 307
 308 Finding prime numbers
 309
 310         Prime numbers are important too.  Finding them is not ever-so
 311         easy.
 312
 313         Catacomb has two useful facilities.  There's a fast sequential-
 314         search filtering system called `pgen', and a good (but
 315         probabilistic) primality tester which implements the Rabin-
 316         Miller test.
 317
 318         The idea is that you initialize a pgen object with a starting
 319         place.  It spends a short time initializing itself, and then
 320         tells you whether the number you gave it is (a) definitely
 321         composite, (b) definitely prime (only for small numbers), or (c)
 322         it doesn't know.  The advantage of the pgen system is that it's
 323         good at stepping regularly.  You can advance it by 2 or 4 very
 324         rapidly, and it will give you the same sort of answer for the
 325         next number.
 326
 327         When you find a number which pgen says it doesn't know about,
 328         you must test it more thoroughly.  This is time-consuming, but
 329         pgen has weeded out many no-hopers so it's not so bad.
 330
 331         The Rabin-Miller test takes a number to test, and stores it in a
 332         context.  You then give it some random numbers, and it gives you
 333         results.  When you've finished, you destroy the context.
 334
 335         The Rabin-Miller test has two possible responses: (a) definitely
 336         not prime, and (b) probably prime.  Answer (a) is certain.
 337         Answer (b) you may get with a composite number at most one time
 338         in four.  If you try n times, the probability of a composite
 339         number going unnoticed is at most one in 4^n.  In fact, the real
 340         probability is *much* lower than this.
 341
 342         It's also important to bear in mind that this is examining the
 343         probability that a number passes n Rabin-Miller tests given
 344         that it's composite.  What's actually interesting the the
 345         probability that a number is composite given that it passed n
 346         tests.  This probability is lower still.  For 1024-bit numbers,
 347         which are about the right size by current standards of security,
 348         you need only five tests to ensure a probability of less than 1
 349         in 2^80 of a composite number slipping through.
 350
 351         There are specialized functions for finding various sorts of
 352         prime numbers with particular properties.  Read the header files
 353         to find out what they do.
 354
 355
 356 Conclusion
 357
 358         This basically concludes the tour of Catacomb's multiprecision
 359         maths library.  I hope it's provided enough background that the
 360         comments start making sense.
 361
 362 --
 363 [mdw]
 364
 365 \f
 366 Local variables:
 367 mode: text
 368 End: