Commit | Line | Data |
---|---|---|

759513d1 | 1 | Multiprecision maths library |

2 | ||

3 | ||

4 | Catacomb comes with a fairly reasonable multiprecision maths | |

5 | library. It's not stunningly fast, and doesn't do everything, | |

6 | but it seems good for the sorts of things needed in | |

7 | cryptographic algorithms (which, for example, GMP isn't[1]). | |

8 | ||

9 | There are basically three layers to the system: | |

10 | ||

11 | * The low-level `mpx' layer, which implements some basic | |

12 | arithmetic but is fairly unpleasant to actually use. | |

13 | ||

14 | * A high-level `mp' layer, which provides a useful `mp' | |

15 | multiprecision type and implements useful arithmetic on | |

16 | them. | |

17 | ||

18 | * Everything else. A collection of miscellaneous routines for | |

19 | performing particular operations efficiently or simply. | |

20 | ||

21 | [1] In particular, GMP is truly rotten at converting numbers | |

22 | into well-defined binary formats. It has its own format, | |

23 | but it's not defined in the manual, has changed once before, | |

24 | probably doesn't conform to any standards in the | |

25 | cryptographic community, and can only be written to or read | |

26 | from a file stream, so you can't use it in memory (for | |

27 | example to encrypt it). Basically, it's a right pain. | |

28 | ||

29 | ||

30 | The low-level interface | |

31 | ||

32 | The low level is described in `mpx.h'. It includes everything | |

33 | else it needs. | |

34 | ||

35 | An integer is represented in an array of words. Each word has | |

36 | type `mpw' (multi precision word). The least significant word | |

37 | comes first. An array is described by its `base' -- the address | |

38 | of the first element -- and its `limit' -- the address *just | |

39 | after* the last element. Thus, the length of the array is | |

40 | `limit - base'. This representation is particularly convenient | |

41 | for many things. An old version used base and length, which was | |

42 | a nuisance. | |

43 | ||

44 | The largest value which can be represented in a word is | |

45 | MPW_MAX. The number of bits used in the representation of a | |

46 | word is MPW_BITS. Actually, you might be able to use more bits | |

47 | and store larger numbers, but everything will go wrong if you | |

48 | try. | |

49 | ||

50 | The macro call MPW(x) returns the low MPW_BITS bits of x. It's | |

51 | very useful in low-level arithmetic. | |

52 | ||

53 | The call MPWS(n) tells you the `sizeof' an array of n words. | |

54 | It's also very useful. Try not to get MPW and MPWS confused | |

55 | because they mean very different things. | |

56 | ||

57 | The actual low-level macros and functions are documented | |

58 | relatively well. They are given source and destination arrays | |

59 | for things to be read and written to. Usually the sources | |

60 | mustn't overlap the destinations, and destinations certainly | |

61 | mustn't overlap other destinations; sometimes this restriction | |

62 | is lifted, where it would be both convenient and easy to do so. | |

63 | Sources can overlap each other without problems, because they're | |

64 | not written to. | |

65 | ||

66 | The difficult parts are the Karatsuba functions. Karatsuba | |

67 | and Ofman discovered a neat recursive way to multiply large | |

68 | numbers. It requires quite a bit of workspace, and adds some | |

69 | overhead, but has lower asymptotic complexity than the standard | |

70 | multiply algorithm. Usually, Karatsuba functions get used | |

71 | automatically by the rest of the library when appropriate and | |

72 | you don't have to care. | |

73 | ||

74 | ||

75 | The high-level interface | |

76 | ||

77 | The high-level invents a type, `mp'. It's a structured type, | |

78 | with the following members: | |

79 | ||

80 | mpw *v, *vl Base and limit of the word array. | |

81 | size_t sz Number of words actually allocated to the array. | |

82 | unsigned f Various flags. | |

83 | unsigned ref Number of references to the integer. | |

84 | ||

85 | The flags are interesting. Here's what they mean: | |

86 | ||

87 | MP_NEG The number is negative | |

88 | MP_BURN Burn the number after using it | |

89 | MP_CONST The number mustn't be modified or freed | |

90 | MP_UNDEF The number contains no interesting value | |

91 | MP_DESTROYED The number has been freed once already | |

92 | ||

93 | The last one is handy for catching bugs. MP_NEG is obvious -- | |

94 | Catacomb uses a signed-magnitude representation because things | |

95 | are usually easier that way. MP_CONST is set on various | |

96 | constants provided for general convenience so they don't get | |

97 | destroyed by accident. MP_UNDEF is used to avoid copying | |

98 | useless data. | |

99 | ||

100 | MP_BURN marks a `secret' number. Secret numbers are overwritten | |

101 | with zeroes (`burnt') when they're freed. Secrecy is a viral | |

102 | concept: anything calculated from a secret number is also | |

103 | secret, so they get burnt too. | |

104 | ||

105 | Most of the time, you refer to numbers by their addresses. Lots | |

106 | of `mp *' pointers get used. You hardly ever see a real `mp', | |

107 | just pointers to them. | |

108 | ||

109 | There are a collection of utility functions for keeping the | |

110 | system running -- for creating new numbers without interesting | |

111 | values, for throwing old ones away, for extending the amount of | |

112 | memory they use, and so on. | |

113 | ||

114 | Throwing numbers away is simple. You pass the pointer to the | |

115 | number to `mp_drop'. If someone else still has a reference to | |

116 | the number, it stays where it is, but remembers that you're not | |

117 | interested in it any more. If nobody's interested then the | |

118 | number is disposed of. | |

119 | ||

120 | You can add a reference using MP_COPY. It returns the new | |

121 | reference. The new reference is the same as the old one, but | |

122 | it's a useful fiction to pretend that it's different. | |

123 | ||

124 | Real arithmetic happens in a fairly uniform way. There's an | |

125 | important point to make about `destinations', though. Just | |

126 | about all of the arithmetic functions take `destination' | |

127 | arguments (one for each result), with the idea that, at least | |

128 | conceptually, they store the results there. What actually | |

129 | happens is that a reference is lost to whatever the destination | |

130 | used to refer to, and a reference to the new result is gained. | |

131 | The address might be different, and then again it might not. | |

4a2da2d1 | 132 | Destinations can be the same as sources -- that works fine. |

759513d1 | 133 | Finally, there's the magic value `MP_NEW'. MP_NEW is a special |

134 | constant which, as a destination, means `create a new place and | |

135 | put the answer there'. | |

136 | ||

137 | The idea is that, sometimes it helps that memory is already | |

138 | allocated for a destination, so it doesn't have to be obtained | |

139 | specially, and, even more often, you want to throw away old | |

140 | intermediate results while you're calculating new ones. | |

141 | ||

142 | Here are some examples of how to do it right: | |

143 | ||

144 | x = mp_add(MP_NEW, y, z); | |

145 | Assumes x didn't have a useful value before. | |

146 | Afterwards it refers to the sum of y and z. y and z | |

147 | themselves are unchanged. | |

148 | ||

149 | y = mp_add(y, y, z); | |

150 | Add z to y. Now y is the sum of z and the old y. z is | |

151 | unchanged. | |

152 | ||

153 | q = MP_NEW; | |

154 | mp_div(&q, &x, x, y); | |

155 | Sets q (a new value) equal to x / y, and puts the | |

156 | remainder back in x. y is unchanged. | |

157 | ||

158 | z = mp_sub(z, y, z); | |

159 | Subtract z from y, putting the result back in z. y is | |

160 | unchanged. | |

161 | ||

162 | x = mp_mul(y, y, z); | |

163 | Multiply y by z, putting the result in x, but making y | |

164 | no longer useful. | |

165 | ||

4a2da2d1 | 166 | Here are some examples of how to do it wrong: |

759513d1 | 167 | |

168 | mp_mul(y, y, z); | |

4a2da2d1 | 169 | mp_mul might choose somewhere other than y's storage to |

759513d1 | 170 | write its result. (In fact, the current version |

171 | certainly will.) So y is trashed because there are no | |

172 | references to it any more, and the result of the | |

173 | multiplication is lost. | |

174 | ||

175 | x = mp_mul(y, y, z); | |

176 | x = mp_add(x, x, y); | |

177 | Whoops. We stored the result in x, but y still got | |

178 | trashed and might contain anything. Adding it to x is a | |

179 | bad move. | |

180 | ||

181 | It's not hard once you get the hang of it, although it might | |

182 | feel a bit weird to start with. | |

183 | ||

184 | ||

185 | Modular multiplication and exponentiation | |

186 | ||

187 | Lots of public-key crypto uses modular multiplication and | |

6540d069 | 188 | exponentiation operations. Doing them efficiently is very |

759513d1 | 189 | important. Multiplication on its own is easy, and Catacomb's |

190 | multiplication algorithms are fairly good (though not up with | |

191 | the very best). Doing the modular reduction afterwards is the | |

192 | tricky bit. | |

193 | ||

194 | There are three approaches to modular reduction in Catacomb. | |

195 | All of them have their good and bad points. | |

196 | ||

197 | 1. Use mp_div | |

198 | ||

199 | For a one-off calculation, this is a good idea. Division is a | |

200 | slow operation -- that's a shame, but life's like that. But | |

201 | when you're doing a one-shot operation, that's the best you can | |

202 | manage. | |

203 | ||

204 | 2. Use Barrett reduction | |

205 | ||

206 | Barrett reduction is a system whereby you do a little | |

207 | precomputation up-front (one division, in fact) and are then | |

208 | able to perform modular reductions without resorting to | |

209 | expensive division operations. To use it, you initialize a | |

210 | `context' with the modulus, reduce some numbers, and then | |

211 | destroy the context when you've finished. | |

212 | ||

213 | Conceptually, Barrett reduction is simple. You give it x and it | |

214 | gives you back x mod m. The mathematics behind it is fairly | |

215 | clever, though -- the real manual has a page of explanation for | |

216 | how it works. | |

217 | ||

218 | Barrett reduction works on any positive modulus m. It will do a | |

219 | modular reduction on any integer x which is at most twice as | |

220 | long as m (in words). x can be larger than m^2, but it mustn't | |

221 | be more than twice as long. | |

222 | ||

223 | 3. Use Montgomery reduction | |

224 | ||

225 | Montgomery reduction is a little more clever. It does a little | |

226 | more work up-front than Barrett reduction, and understanding the | |

227 | benefits is a little harder. However, it's actually slightly | |

228 | more efficient for general use. There's also a disadvantage | |

229 | that Montgomery reduction only works when the modulus is *odd*. | |

230 | It all goes horribly wrong with an even modulus. Don't try it. | |

231 | ||

232 | I'm not going to explain how Montgomery reduction actually | |

233 | works. But I am going to explain how to use it. | |

234 | ||

235 | The precomputation for Montgomery reduction, performed when you | |

236 | initialize a context, computes a value called R, and also R^2 | |

237 | (both mod m). (It also computes some other stuff, but that's | |

238 | not important for this discussion.) Note that R is dependent on | |

239 | the modulus m. | |

240 | ||

241 | A Montgomery reduction takes a number x, which should be less | |

242 | than m^2 (it can actually be a bit larger) and computes the | |

243 | value x R^{-1} mod m. That doesn't sound very useful, does it? | |

244 | ||

245 | Things start looking more hopeful when you multiply your inputs | |

45c0fd36 | 246 | by R. (There's a clever way of doing that: see below.) To |

759513d1 | 247 | compute xy mod m, you can first calculate xR and yR, multiply |

248 | them together to get xyR^2, and do a Montgomery reduction to get | |

249 | xyR^2 R^{-1} mod m. Then the R^{-1} cancels one of the Rs and | |

250 | you end up with xyR mod m. Essentially, you do your entire | |

251 | calculation with an extra factor of R all the way through it. | |

252 | ||

253 | Removing the factor of R at the end is easy. You just run the | |

254 | Montgomery reduction algorithm again -- the R^{-1} cancels the R | |

255 | and the answer pops out the end. Getting the factor of R in the | |

256 | first place is easy -- you multiply by R^2 and do a reduction. | |

257 | ||

258 | The multiply-and-reduce thing is so common that there's a | |

259 | dedicated function which does them both in one go. | |

260 | ||

261 | Here's a simple example of multiplying two numbers mod m: | |

262 | ||

263 | mpmont mm; | |

264 | mp *ar, *br, *p; | |

265 | mpmont_create(&mm, m); | |

4a2da2d1 | 266 | ar = mpmont_mul(MP_NEW, a, m.r2); /* aR mod m */ |

267 | br = mpmont_mul(MP_NEW, b, m.r2); /* bR mod m */ | |

759513d1 | 268 | p = mpmont_mul(&mm, MP_NEW, ar, br); /* abR mod m */ |

269 | p = mpmont_reduce(&mm, p, p); /* ab mod m */ | |

270 | mpmont_destroy(&mm); | |

271 | mp_drop(ar); | |

272 | mp_drop(br); | |

273 | ||

274 | Actually, it's not a very clever example. Here's a better way, | |

275 | for such a simple calculation: | |

276 | ||

277 | mpmont mm; | |

278 | mp *p; | |

279 | mpmont_create(&mm, m); | |

280 | p = mpmont_mul(&mm, MP_NEW, a, b); /* abR^{-1} mod m */ | |

281 | p = mpmont_mul(&mm, p, p, mm.r2); /* ab mod m */ | |

282 | mpmont_destroy(&mm); | |

283 | ||

284 | Of course, for a single multiply-and-reduce, you're better off | |

285 | just using | |

286 | ||

287 | mp *p = mp_mul(MP_NEW, a, b); | |

288 | mp_div(0, &p, p, m); | |

289 | ||

290 | the traditional division method. | |

291 | ||

292 | Both Barrett and Montgomery reduction methods have modular | |

293 | exponentiation functions. If m is a message, n is an RSA | |

294 | modulus, and e is the public exponent, I can do an RSA | |

295 | encryption simply by doing | |

296 | ||

297 | mpmont mm; | |

298 | mpmont_create(&mm, n); /* RSA modulus is odd */ | |

299 | c = mpmont_exp(&mm, MP_NEW, m, e); | |

300 | mpmont_destroy(&mm); | |

301 | ||

302 | (Yes, it's well worth creating a Montgomery context for a | |

303 | full-blown exponentiation, unless someone's deliberately gone | |

304 | out of their way to make it easy for you by choosing a very | |

305 | small public exponent.) | |

306 | ||

0ca05eea | 307 | RSA decryption is fully covered by the library, because doing it |

308 | properly is complicated and difficult since it involves playing | |

309 | with blinding factors and the Chinese Remainder Theorem. | |

310 | ||

311 | (There's also a useful RSA parameter recovery system which can | |

312 | determine all the useful bits of information for efficient RSA | |

313 | decryption given a very small amount of information. For | |

314 | example, given the modulus and both exponents, it can recover | |

315 | the two secret factors. This is, by the way, a good reason not | |

316 | to share a modulus among many users.) | |

317 | ||

759513d1 | 318 | |

319 | Finding prime numbers | |

320 | ||

321 | Prime numbers are important too. Finding them is not ever-so | |

4a2da2d1 | 322 | easy. There's a huge variety of useful properties which are |

0ca05eea | 323 | needed, and it's basically impossible to cover everything. |

759513d1 | 324 | |

325 | Catacomb has two useful facilities. There's a fast sequential- | |

4a2da2d1 | 326 | search filtering system called `pfilt', and a good (but |

759513d1 | 327 | probabilistic) primality tester which implements the Rabin- |

328 | Miller test. | |

329 | ||

4a2da2d1 | 330 | Over the top of this is a configurable plug-in-appropriate-bits |

331 | system called `pgen' which ties everything together. You're | |

0ca05eea | 332 | much better off using `pgen' than grovelling about with the |

333 | filter and Rabin-Miller tests by hand. The low-level details | |

334 | are much better used to create new `pgen' components. | |

759513d1 | 335 | |

336 | ||

337 | Conclusion | |

338 | ||

339 | This basically concludes the tour of Catacomb's multiprecision | |

340 | maths library. I hope it's provided enough background that the | |

341 | comments start making sense. | |

342 | ||

ee62fa16 | 343 | -- [mdw] |

759513d1 | 344 | |

345 | \f | |

346 | Local variables: | |

347 | mode: text | |

348 | End: |