1 Multiprecision maths library

4 Catacomb comes with a fairly reasonable multiprecision maths

5 library. It's not stunningly fast, and doesn't do everything,

6 but it seems good for the sorts of things needed in

7 cryptographic algorithms (which, for example, GMP isn't[1]).

9 There are basically three layers to the system:

11 * The low-level `mpx' layer, which implements some basic

12 arithmetic but is fairly unpleasant to actually use.

14 * A high-level `mp' layer, which provides a useful `mp'

15 multiprecision type and implements useful arithmetic on

16 them.

18 * Everything else. A collection of miscellaneous routines for

19 performing particular operations efficiently or simply.

21 [1] In particular, GMP is truly rotten at converting numbers

22 into well-defined binary formats. It has its own format,

23 but it's not defined in the manual, has changed once before,

24 probably doesn't conform to any standards in the

25 cryptographic community, and can only be written to or read

26 from a file stream, so you can't use it in memory (for

27 example to encrypt it). Basically, it's a right pain.

30 The low-level interface

32 The low level is described in `mpx.h'. It includes everything

33 else it needs.

35 An integer is represented in an array of words. Each word has

36 type `mpw' (multi precision word). The least significant word

37 comes first. An array is described by its `base' -- the address

38 of the first element -- and its `limit' -- the address *just

39 after* the last element. Thus, the length of the array is

40 `limit - base'. This representation is particularly convenient

41 for many things. An old version used base and length, which was

42 a nuisance.

44 The largest value which can be represented in a word is

45 MPW_MAX. The number of bits used in the representation of a

46 word is MPW_BITS. Actually, you might be able to use more bits

47 and store larger numbers, but everything will go wrong if you

48 try.

50 The macro call MPW(x) returns the low MPW_BITS bits of x. It's

51 very useful in low-level arithmetic.

53 The call MPWS(n) tells you the `sizeof' an array of n words.

54 It's also very useful. Try not to get MPW and MPWS confused

55 because they mean very different things.

57 The actual low-level macros and functions are documented

58 relatively well. They are given source and destination arrays

59 for things to be read and written to. Usually the sources

60 mustn't overlap the destinations, and destinations certainly

61 mustn't overlap other destinations; sometimes this restriction

62 is lifted, where it would be both convenient and easy to do so.

63 Sources can overlap each other without problems, because they're

64 not written to.

66 The difficult parts are the Karatsuba functions. Karatsuba

67 and Ofman discovered a neat recursive way to multiply large

68 numbers. It requires quite a bit of workspace, and adds some

69 overhead, but has lower asymptotic complexity than the standard

70 multiply algorithm. Usually, Karatsuba functions get used

71 automatically by the rest of the library when appropriate and

72 you don't have to care.

75 The high-level interface

77 The high-level invents a type, `mp'. It's a structured type,

78 with the following members:

80 mpw *v, *vl Base and limit of the word array.

81 size_t sz Number of words actually allocated to the array.

82 unsigned f Various flags.

83 unsigned ref Number of references to the integer.

85 The flags are interesting. Here's what they mean:

87 MP_NEG The number is negative

88 MP_BURN Burn the number after using it

89 MP_CONST The number mustn't be modified or freed

90 MP_UNDEF The number contains no interesting value

91 MP_DESTROYED The number has been freed once already

93 The last one is handy for catching bugs. MP_NEG is obvious --

94 Catacomb uses a signed-magnitude representation because things

95 are usually easier that way. MP_CONST is set on various

96 constants provided for general convenience so they don't get

97 destroyed by accident. MP_UNDEF is used to avoid copying

98 useless data.

100 MP_BURN marks a `secret' number. Secret numbers are overwritten

101 with zeroes (`burnt') when they're freed. Secrecy is a viral

102 concept: anything calculated from a secret number is also

103 secret, so they get burnt too.

105 Most of the time, you refer to numbers by their addresses. Lots

106 of `mp *' pointers get used. You hardly ever see a real `mp',

107 just pointers to them.

109 There are a collection of utility functions for keeping the

110 system running -- for creating new numbers without interesting

111 values, for throwing old ones away, for extending the amount of

112 memory they use, and so on.

114 Throwing numbers away is simple. You pass the pointer to the

115 number to `mp_drop'. If someone else still has a reference to

116 the number, it stays where it is, but remembers that you're not

117 interested in it any more. If nobody's interested then the

118 number is disposed of.

120 You can add a reference using MP_COPY. It returns the new

121 reference. The new reference is the same as the old one, but

122 it's a useful fiction to pretend that it's different.

124 Real arithmetic happens in a fairly uniform way. There's an

125 important point to make about `destinations', though. Just

126 about all of the arithmetic functions take `destination'

127 arguments (one for each result), with the idea that, at least

128 conceptually, they store the results there. What actually

129 happens is that a reference is lost to whatever the destination

130 used to refer to, and a reference to the new result is gained.

131 The address might be different, and then again it might not.

132 Destinations can be the same as sources -- that works fine.

133 Finally, there's the magic value `MP_NEW'. MP_NEW is a special

134 constant which, as a destination, means `create a new place and

135 put the answer there'.

137 The idea is that, sometimes it helps that memory is already

138 allocated for a destination, so it doesn't have to be obtained

139 specially, and, even more often, you want to throw away old

140 intermediate results while you're calculating new ones.

142 Here are some examples of how to do it right:

144 x = mp_add(MP_NEW, y, z);

145 Assumes x didn't have a useful value before.

146 Afterwards it refers to the sum of y and z. y and z

147 themselves are unchanged.

149 y = mp_add(y, y, z);

150 Add z to y. Now y is the sum of z and the old y. z is

151 unchanged.

153 q = MP_NEW;

154 mp_div(&q, &x, x, y);

155 Sets q (a new value) equal to x / y, and puts the

156 remainder back in x. y is unchanged.

158 z = mp_sub(z, y, z);

159 Subtract z from y, putting the result back in z. y is

160 unchanged.

162 x = mp_mul(y, y, z);

163 Multiply y by z, putting the result in x, but making y

164 no longer useful.

166 Here are some examples of how to do it wrong:

168 mp_mul(y, y, z);

169 mp_mul might choose somewhere other than y's storage to

170 write its result. (In fact, the current version

171 certainly will.) So y is trashed because there are no

172 references to it any more, and the result of the

173 multiplication is lost.

175 x = mp_mul(y, y, z);

176 x = mp_add(x, x, y);

177 Whoops. We stored the result in x, but y still got

178 trashed and might contain anything. Adding it to x is a

179 bad move.

181 It's not hard once you get the hang of it, although it might

182 feel a bit weird to start with.

185 Modular multiplication and exponentiation

187 Lots of public-key crypto uses modular multiplication and

188 exponentiation operations. Doing them efficiently is very

189 important. Multiplication on its own is easy, and Catacomb's

190 multiplication algorithms are fairly good (though not up with

191 the very best). Doing the modular reduction afterwards is the

192 tricky bit.

194 There are three approaches to modular reduction in Catacomb.

195 All of them have their good and bad points.

197 1. Use mp_div

199 For a one-off calculation, this is a good idea. Division is a

200 slow operation -- that's a shame, but life's like that. But

201 when you're doing a one-shot operation, that's the best you can

202 manage.

204 2. Use Barrett reduction

206 Barrett reduction is a system whereby you do a little

207 precomputation up-front (one division, in fact) and are then

208 able to perform modular reductions without resorting to

209 expensive division operations. To use it, you initialize a

210 `context' with the modulus, reduce some numbers, and then

211 destroy the context when you've finished.

213 Conceptually, Barrett reduction is simple. You give it x and it

214 gives you back x mod m. The mathematics behind it is fairly

215 clever, though -- the real manual has a page of explanation for

216 how it works.

218 Barrett reduction works on any positive modulus m. It will do a

219 modular reduction on any integer x which is at most twice as

220 long as m (in words). x can be larger than m^2, but it mustn't

221 be more than twice as long.

223 3. Use Montgomery reduction

225 Montgomery reduction is a little more clever. It does a little

226 more work up-front than Barrett reduction, and understanding the

227 benefits is a little harder. However, it's actually slightly

228 more efficient for general use. There's also a disadvantage

229 that Montgomery reduction only works when the modulus is *odd*.

230 It all goes horribly wrong with an even modulus. Don't try it.

232 I'm not going to explain how Montgomery reduction actually

233 works. But I am going to explain how to use it.

235 The precomputation for Montgomery reduction, performed when you

236 initialize a context, computes a value called R, and also R^2

237 (both mod m). (It also computes some other stuff, but that's

238 not important for this discussion.) Note that R is dependent on

239 the modulus m.

241 A Montgomery reduction takes a number x, which should be less

242 than m^2 (it can actually be a bit larger) and computes the

243 value x R^{-1} mod m. That doesn't sound very useful, does it?

245 Things start looking more hopeful when you multiply your inputs

246 by R. (There's a clever way of doing that: see below.) To

247 compute xy mod m, you can first calculate xR and yR, multiply

248 them together to get xyR^2, and do a Montgomery reduction to get

249 xyR^2 R^{-1} mod m. Then the R^{-1} cancels one of the Rs and

250 you end up with xyR mod m. Essentially, you do your entire

251 calculation with an extra factor of R all the way through it.

253 Removing the factor of R at the end is easy. You just run the

254 Montgomery reduction algorithm again -- the R^{-1} cancels the R

255 and the answer pops out the end. Getting the factor of R in the

256 first place is easy -- you multiply by R^2 and do a reduction.

258 The multiply-and-reduce thing is so common that there's a

259 dedicated function which does them both in one go.

261 Here's a simple example of multiplying two numbers mod m:

263 mpmont mm;

264 mp *ar, *br, *p;

265 mpmont_create(&mm, m);

266 ar = mpmont_mul(MP_NEW, a, m.r2); /* aR mod m */

267 br = mpmont_mul(MP_NEW, b, m.r2); /* bR mod m */

268 p = mpmont_mul(&mm, MP_NEW, ar, br); /* abR mod m */

269 p = mpmont_reduce(&mm, p, p); /* ab mod m */

270 mpmont_destroy(&mm);

271 mp_drop(ar);

272 mp_drop(br);

274 Actually, it's not a very clever example. Here's a better way,

275 for such a simple calculation:

277 mpmont mm;

278 mp *p;

279 mpmont_create(&mm, m);

280 p = mpmont_mul(&mm, MP_NEW, a, b); /* abR^{-1} mod m */

281 p = mpmont_mul(&mm, p, p, mm.r2); /* ab mod m */

282 mpmont_destroy(&mm);

284 Of course, for a single multiply-and-reduce, you're better off

285 just using

287 mp *p = mp_mul(MP_NEW, a, b);

288 mp_div(0, &p, p, m);

290 the traditional division method.

292 Both Barrett and Montgomery reduction methods have modular

293 exponentiation functions. If m is a message, n is an RSA

294 modulus, and e is the public exponent, I can do an RSA

295 encryption simply by doing

297 mpmont mm;

298 mpmont_create(&mm, n); /* RSA modulus is odd */

299 c = mpmont_exp(&mm, MP_NEW, m, e);

300 mpmont_destroy(&mm);

302 (Yes, it's well worth creating a Montgomery context for a

303 full-blown exponentiation, unless someone's deliberately gone

304 out of their way to make it easy for you by choosing a very

305 small public exponent.)

307 RSA decryption is fully covered by the library, because doing it

308 properly is complicated and difficult since it involves playing

309 with blinding factors and the Chinese Remainder Theorem.

311 (There's also a useful RSA parameter recovery system which can

312 determine all the useful bits of information for efficient RSA

313 decryption given a very small amount of information. For

314 example, given the modulus and both exponents, it can recover

315 the two secret factors. This is, by the way, a good reason not

316 to share a modulus among many users.)

319 Finding prime numbers

321 Prime numbers are important too. Finding them is not ever-so

322 easy. There's a huge variety of useful properties which are

323 needed, and it's basically impossible to cover everything.

325 Catacomb has two useful facilities. There's a fast sequential-

326 search filtering system called `pfilt', and a good (but

327 probabilistic) primality tester which implements the Rabin-

328 Miller test.

330 Over the top of this is a configurable plug-in-appropriate-bits

331 system called `pgen' which ties everything together. You're

332 much better off using `pgen' than grovelling about with the

333 filter and Rabin-Miller tests by hand. The low-level details

334 are much better used to create new `pgen' components.

337 Conclusion

339 This basically concludes the tour of Catacomb's multiprecision

340 maths library. I hope it's provided enough background that the

341 comments start making sense.

343 -- [mdw]

345 \f

346 Local variables:

347 mode: text

348 End: