git.distorted.org.uk Git - u/mdw/putty/blame_incremental

... / ...

Commit	Line	Data
	1	/*
	2	* Elliptic curve arithmetic.
	3	*/
	4
	5	#include "bn-internal.h"
	6	#include "ssh.h"
	7
	8	/* Some background.
	9	*
	10	* These notes are intended to help programmers who are mathematically
	11	* inclined, but not well versed in the theory of elliptic curves. If more
	12	* detail is required, introductory texts of various levels of sophistication
	13	* are readily available. I found
	14	*
	15	* L. Washington, `Elliptic Curves: Number Theory and Cryptography', CRC
	16	* Press, 2003
	17	*
	18	* useful. Further background on algebra and number theory can be found in
	19	*
	20	* V. Shoup, `A Computational Introduction to Number Theory and Algebra',
	21	* version 2, Cambridge University Press, 2008; http://shoup.net/ntb/
	22	* [CC-ND-NC].
	23	*
	24	* We work in a finite field k = GF(p^m). To make things easier, we'll only
	25	* deal with the cases where p = 2 (`binary' fields) or m = 1 (`prime'
	26	* fields). In fact, we don't currently implement binary fields either.
	27	*
	28	* We start in two-dimensional projective space over k, denoted P^2(k). This
	29	* is the space of classes of triples of elements of k { (a x : a y : a z) \|
	30	* a in k^* } such that x y z /= 0. Projective points with z /= 0 are called
	31	* `finite', and correspond to points in the more familiar two-dimensional
	32	* affine space (x/z, y/z); points with z = 0 are called `infinite'.
	33	*
	34	* An elliptic curve is the set of projective points satisfying an equation
	35	* of the form
	36	*
	37	* y^2 z + a_1 x y z + a_3 y z^2 = x^3 + a_2 x^2 z + a_4 x z^2 + a_6 z^3
	38	*
	39	* (the `Weierstrass form'), such that the partial derivatives don't both
	40	* vanish at any point. (Notice that this is a homogeneous equation -- all
	41	* of the monomials have the same degree -- so this is a well-defined subset
	42	* of our projective space. Don't worry about the numbering of the
	43	* coefficients: maybe the later stuff about Jacobian coordinates will make
	44	* them clear.)
	45	*
	46	* Let's consider, for a moment, a straight line r x + s y + t z = 0 which
	47	* meets the curve at two points P and Q; then they must also meet again at
	48	* some other point R. (Here, any or all of P, Q, and R might be equal;
	49	* in this case, the simultaneous equations have repeated solutions.) Let's
	50	* go out on a limb and write
	51	*
	52	* P + Q + R = 0
	53	*
	54	* whenever three points P, Q, and R on the curve are collinear. Obviously
	55	* this notion of `addition' is commutative: three points are collinear
	56	* whichever order we write them. If we designate some point O as being an
	57	* `identity' then we can introduce a notion of inverses. And then some
	58	* magic occurs: this `addition' operation we've just invented is also
	59	* associative -- so it's an abelian group operation! (This can be
	60	* demonstrated by slogging through the algebra, but there's a fancy way
	61	* involving the Weil pairing. Washington's book has both versions.)
	62	*
	63	* So we have an abelian group. It's actually quite a fun group, for
	64	* complicated reasons I'm not going to do into here. For our purposes it's
	65	* sufficient to note that, in most elliptic curve groups, the discrete
	66	* logarithm problem -- finding x given x P -- seems really very hard. I
	67	* mean, much harder than in multiplicative subgroups of finite fields,
	68	* where index calculus techniques can give you the answer in subexponential
	69	* time. The best algorithms known for solving this problem in elliptic
	70	* curve groups are the ones which work in any old group G -- and they run in
	71	* O(sqrt(#G)) time.
	72	*
	73	* Staring at the Weierstrass equation some more, we can determine that in
	74	* fact there's exactly one infinite point on the curve: if z = 0 then we
	75	* must have x = 0, so (0 : 1 : 0) is precisely that point. That's kind of
	76	* handy.
	77	*
	78	* The Weierstrass form is pretty horrid, really. Fortunately, it's always
	79	* possible to apply a linear transformation to make it less nasty. If p > 3
	80	* (which it will be for the prime fields we care about), we can simplify
	81	* this to
	82	*
	83	* y^2 z = x^3 + a x z^2 + b z^3 (with 4 a^3 + 27 b^2 /= 0)
	84	*
	85	* and in the binary case we can simplify to
	86	*
	87	* y^2 z + x y z = x^3 + a x^2 z + b z^3 (with b /= 0)
	88	*
	89	* and it's these kinds of forms which we're actually going to work with.
	90	* (There are other forms of elliptic curves which have interesting benefits,
	91	* e.g., the Montgomery and Edwards forms, but they've not really caught on
	92	* in standards.) Linear transformations don't mess up the geometry, so the
	93	* group we constructed above still works -- and, indeed, the transformation
	94	* is an isomorphism of groups. And, finally, the point at infinity we found
	95	* above is still infinite. Let's call that point O, and declare it to be
	96	* our additive identity. That leaves the finite points (x : y : 1), which
	97	* correspond to the affine points
	98	*
	99	* y^2 = x^3 + a x + b or y^2 + x y = x^3 + a x^2 + b
	100	*
	101	* So, if we want to add two points P and Q, we draw the line between them;
	102	* if P = Q then the line in question is the tangent to the curve at P. (The
	103	* smoothness condition above is precisely what we need for this tangent to
	104	* be well-defined. Unsmooth curves aren't worth worrying about, because
	105	* they're cryptographically useless: they're isomorphic to some other group
	106	* in which discrete logs are very easy.) In prime fields, if this line is
	107	* vertical then it meets the curve again at O, and P + Q = O; otherwise it
	108	* meets the curve at some other finite point R = (x, y), and the answer we
	109	* want is P + Q = -R = (x, -y). In binary fields, the situation is similar
	110	* but a little messier because of the x y term we couldn't eliminate.
	111	*
	112	* It's not very hard to work out where this point R is. You get a cubic
	113	* equation out the far end which looks kind of daunting until you notice
	114	* that the negation of the x^2 coefficient is exactly the sum of the roots,
	115	* and you already know two of them. Unfortunately, the solution is still
	116	* somewhat unpleasant and involves a bunch of division -- and in finite
	117	* fields that involves computing a modular inverse, which is rather slow.
	118	*
	119	* Wouldn't it be really nice if we could somehow keep the denominators
	120	* separate, so we could just do a few `big' divisions at the end of some
	121	* calculation? Of course, we'd need to tweak the point-addition formulae so
	122	* that we could cope with input points which have separate denominators.
	123	* And so we enter the realm of fun coordinate systems.
	124	*
	125	* Perhaps an obvious choice of coordinates is the projective form above:
	126	* rather than working out (x/m, y/n), we could just keep track of a
	127	* projective point (x : y : m n). It turns out that there's a slightly
	128	* different variant which is a little better: `Jacobian coordinates' are
	129	* classes of triples (x :: y :: z), where x y z /= 0, and the point (x :: y
	130	* :: z) is equivalent to (a^2 x :: a^3 y :: a z).
	131	*
	132	* The (simplified) Wieierstrass equation, using Jacobian projective
	133	* coordinates, looks like this.
	134	*
	135	* y^2 = x^3 + a x z^2 + b z^6
	136	*
	137	* or
	138	*
	139	* y^2 + x y z = x^3 + a x^2 z + b z^3
	140	*
	141	* (Maybe this is a hint about the strange coefficient numbering in the
	142	* general Weierstrass form.) Anyway, if a point is infinite then we have
	143	* y^2 = x^3, which has the obvious solution x = y = 1.
	144	*
	145	* That's about all the theory we need for now. Let's get into the code.
	146	*/
	147
	148	/* Let's start by making an abstract interface for finite field arithmetic.
	149	* If we need to do clever things later then this will be a convenient
	150	* extension point (e.g., supporting binary fields, working in extension
	151	* fields as we'd need if we wanted to implement Tate or Weil pairings, or
	152	* just doing different kinds of optimizations for the prime case).
	153	*
	154	* This is quite cheesy, but it's enough for now.
	155	*/
	156
	157	union fieldelt {
	158	Bignum n;
	159	};
	160
	161	struct field {
	162	const struct field_ops ops; / Field operations table. */
	163	union fieldelt zero, one; /* Identity elements. */
	164	Bignum q; /* The field size. */
	165	};
	166
	167	struct field_ops {
	168	void (destroy)(struct field f); /* Destroy the field object. */
	169
	170	void (init)(struct field f, union fieldelt *x);
	171	/* Initialize x (to null). */
	172
	173	void (free)(struct field f, union fieldelt *x);
	174	/* Free an element x, leaving it
	175	* null.
	176	*/
	177
	178	void (copy)(struct field f, union fieldelt d, union fieldelt x);
	179	/* Make d be a copy of x. */
	180
	181	void (frombn)(struct field f, union fieldelt *d, Bignum n);
	182	/* Convert n to an element. */
	183
	184	Bignum (tobn)(struct field f, union fieldelt *x);
	185	/* Convert x to an integer. */
	186
	187	int (zerop)(struct field f, union fieldelt *x);
	188	/* Return nonzero if x is zero. */
	189
	190	void (dbl)(struct field f, union fieldelt d, union fieldelt x);
	191	/* Store 2 x in d. */
	192
	193	void (add)(struct field f, union fieldelt *d,
	194	union fieldelt x, union fieldelt y);
	195	/* Store x + y in d. */
	196
	197	void (sub)(struct field f, union fieldelt *d,
	198	union fieldelt x, union fieldelt y);
	199	/* Store x - y in d. */
	200
	201	void (mul)(struct field f, union fieldelt *d,
	202	union fieldelt x, union fieldelt y);
	203	/* Store x y in d. */
	204
	205	void (sqr)(struct field f, union fieldelt d, union fieldelt x);
	206	/* Store x^2 in d. */
	207
	208	void (inv)(struct field f, union fieldelt d, union fieldelt x);
	209	/* Store 1/x in d; it is an error
	210	* for x to be zero.
	211	*/
	212
	213	int (sqrt)(struct field f, union fieldelt d, union fieldelt x);
	214	/* If x is a square, store a square
	215	* root (chosen arbitrarily) in d
	216	* and return nonzero; otherwise
	217	* return zero leaving d unchanged.
	218	*/
	219
	220	/* For internal use only. The standard methods call on these for
	221	* detailed field-specific behaviour.
	222	*/
	223	void (_reduce)(struct field f, union fieldelt *x);
	224	/* Reduce an intermediate result so
	225	* that it's within whatever bounds
	226	* are appropriate.
	227	*/
	228
	229	/* These functions are only important for prime fields. */
	230	void (dbl)(struct field f, union fieldelt d, union fieldelt x);
	231	/* Store 2 x in d. */
	232
	233	void (tpl)(struct field f, union fieldelt d, union fieldelt x);
	234	/* Store 3 x in d. */
	235
	236	void (dql)(struct field f, union fieldelt d, union fieldelt x);
	237	/* Store 4 x in d. */
	238
	239	void (hlv)(struct field f, union fieldelt d, union fieldelt x);
	240	/* Store x/2 in d. */
	241	};
	242
	243	/* Some standard methods for prime fields. */
	244
	245	static void gen_destroy(struct field *f)
	246	{ sfree(f); }
	247
	248	static void prime_init(struct field f, union fieldelt x)
	249	{ x->n = 0; }
	250
	251	static void prime_free(struct field f, union fieldelt x)
	252	{ if (x->n) { freebn(x->n); x->n = 0; } }
	253
	254	static void prime_frombn(struct field f, union fieldelt d, Bignum n)
	255	{ f->ops->free(f, d); d->n = copybn(n); }
	256
	257	static Bignum prime_tobn(struct field f, union fieldelt x)
	258	{ return copybn(x->n); }
	259
	260	static Bignum prime_zerop(struct field f, union fieldelt x)
	261	{ return bignum_cmp(x->n, f->zero.n) == 0; }
	262
	263	static void prime_add(struct field f, union fieldelt d,
	264	union fieldelt x, union fieldelt y)
	265	{
	266	Bignum sum = bigadd(x, y);
	267	Bignum red = bigsub(sum, f->q);
	268
	269	f->ops->free(f, d);
	270	if (!red)
	271	d->n = sum;
	272	else {
	273	freebn(sum);
	274	d->n = red;
	275	}
	276	}
	277
	278	static void prime_sub(struct field f, union fieldelt d,
	279	union fieldelt x, union fieldelt y)
	280	{
	281	Bignum raise = bigadd(x, f->q);
	282	Bignum diff = bigsub(raise, y);
	283	Bignum drop = bigsub(diff, f->q);
	284
	285	f->ops->free(f, d); freebn(raise);
	286	if (!drop)
	287	d->n = diff;
	288	else {
	289	freebn(diff);
	290	d->n = drop;
	291	}
	292	}
	293
	294	static void prime_mul(struct field f, union fieldelt d,
	295	union fieldelt x, union fieldelt y)
	296	{
	297	Bignum prod = bigmul(x, y);
	298
	299	f->ops->free(f, d);
	300	d->n = prod;
	301	f->ops->_reduce(f, d);
	302	}
	303
	304	static void gen_sqr(struct field f, union fieldelt d, union fieldelt *x)
	305	{ f->ops->mul(f->ops, d, x, x); }
	306
	307	static void prime_inv(struct field f, union fieldelt d, union fieldelt *x)
	308	{
	309	Bignum inv = modinv(x->n, f->q);
	310
	311	f->ops->free(f, d);
	312	d->n = inv;
	313	}
	314
	315	static int prime_sqrt(struct field f, union fieldelt d, union fieldelt *x)
	316	{
	317	Bignum sqrt = modsqrt(x->n, f->q);
	318
	319	if (!sqrt) return 0;
	320	f->ops->free(f, d);
	321	d->n = sqrt;
	322	return 1;
	323	}
	324
	325	static void prime_dbl(struct field f, union_fieldelt d, union fieldelt *x)
	326	{ f->ops->add(f, d, x, x); }
	327
	328	static void prime_tpl(struct field f, union_fieldelt d, union fieldelt *x)
	329	{
	330	union_fieldelt t;
	331
	332	if (d != x) {
	333	f->ops->add(f, d, x, x);
	334	f->ops->add(f, d, d, x);
	335	} else {
	336	f->ops->init(f, &t);
	337	f->ops->copy(f, d, x);
	338	f->ops->add(f, d, &t, &t);
	339	f->ops->add(f, d, d, &t);
	340	f->ops->free(f, &t);
	341	}
	342	}
	343
	344	static void prime_qdl(struct field f, union fieldelt d,
	345	union fieldelt *x)
	346	{ f->ops->add(f, d, x, x); f->ops->add(f, d, d, d); }
	347
	348	static void prime_hlv(struct field f, union fieldelt d, union fieldelt *x)
	349	{
	350	Bignum t, u;
	351
	352	if (!x->n[0])
	353	f->ops->copy(f, d, x);
	354	else {
	355	/* The tedious answer is to multiply by the inverse of 2 in this
	356	* field. But there's a better way: either x or q - x is actually
	357	* divisible by 2, so the answer we want is either x >> 1 or p -
	358	* ((p - x) >> 1) depending on whether x is even.
	359	*/
	360	if (!(x->n[1] & 1)) {
	361	t = copybn(x->n);
	362	shift_right(t + 1, t[0], 1);
	363	} else {
	364	u = bigsub(f->q, x);
	365	shift_right(u + 1, u[0], 1);
	366	t = bigsub(f->q, u);
	367	freebn(u);
	368	}
	369	f->ops->free(f, d);
	370	d->n = t;
	371	}
	372
	373	/* Now some utilities for modular reduction. All of the primes we're
	374	* concerned about are of the form p = 2^n - d for `convenient' values of d
	375	* -- such numbers are called `pseudo-Mersenne primes'. Suppose we're given
	376	* a value x = x_0 2^n + x_1: then x - x_0 p = x_1 + x_0 d is clearly less
	377	* than x and congruent to it mod p. So we can reduce x below 2^n by
	378	* iterating this trick; if we're unlucky enough to have p <= x < 2^n then we
	379	* can just subtract p.
	380	*
	381	* The values of d we'll be dealing with are specifically convenient because
	382	* they satisfy the following properties.
	383	*
	384	* * We can write d = SUM_i d_i 2^i, with d_i in { -1, 0, +1 }. (This is
	385	* no surprise.)
	386	*
	387	* * d > 0.
	388	*
	389	* * Very few d_i /= 0. With one small exception, if d_i /= 0 then i is
	390	* divisible by 32.
	391	*
	392	* The following functions will therefore come in very handy.
	393	*/
	394
	395	static void add_imm_lsl(BignumInt *x, BignumInt y, unsigned shift)
	396	{
	397	BignumDbl c = y << (shift % BIGNUM_INT_BITS);
	398
	399	for (x += shift/BIGNUM_INT_BITS; c; x++) {
	400	c += *x;
	401	*x = c & BIGNUM_INT_MASK;
	402	c >>= BIGNUM_INT_BITS;
	403	}
	404	}
	405
	406	static void sub_imm_lsl(BignumInt *x, BignumInt y, unsigned shift)
	407	{
	408	BignumDbl c = y << (shift % BIGNUM_INT_BITS);
	409	BignumDbl t;
	410
	411	c--;
	412	for (x += shift/BIGNUM_INT_BITS; c; x++) {
	413	c ^= BIGNUM_INT_MASK;
	414	c += *x;
	415	*x = c & BIGNUM_INT_MASK;
	416	c >>= BIGNUM_INT_BITS;
	417	}
	418	}
	419
	420	static void pmp_reduce(struct field f, union fieldelt xx,
	421	void (add_mul_d_lsl)(BignumInt x,
	422	BignumInt y,
	423	unsigned shift))
	424	{
	425	/* Notation: write w = BIGNUM_INT_WORDS, and B = 2^w is the base we're
	426	* working in. We assume that p = 2^n - d for some d. The job of
	427	* add_mul_d_lsl is to add to its argument x the value y d 2^shift.
	428	*/
	429
	430	BignumInt top, p, x;
	431	unsigned plen, xlen, i;
	432	unsigned topbits, topclear;
	433	Bignum t;
	434
	435	/* Initialization: if x is shorter than p then there's nothing to do.
	436	* Otherwise, ensure that it's at least one word longer, since this is
	437	* required by the utility functions which add_mul_d_lsl will call.
	438	*/
	439	p = f->q + 1; plen = f->q[0];
	440	x = xx->n + 1; xlen = xx->n[0];
	441	if (xlen < plen) return;
	442	else if (xlen == plen) {
	443	t = newbn(plen + 1);
	444	for (i = 0; i < plen; i++) t[i + 1] = x[i];
	445	t[plen] = 0;
	446	freebn(xx->n); xx->n = t;
	447	x = t + 1; xlen = plen + 1;
	448	}
	449
	450	/* Preparation: Work out the bit length of p. At the end of this, we'll
	451	* have topbits such that n = w (plen - 1) + topbits, and topclear = w -
	452	* topbits, so that n = w plen - topclear.
	453	*/
	454	while (plen > 0 && !p[plen - 1]) plen--;
	455	top = p[plen - 1];
	456	assert(top);
	457	if (top & BIGNUM_TOP_BIT) topbits = BIGNUM_INT_BITS;
	458	else for (topbits = 0; top >> topbits; topbits++);
	459	topclear = BIGNUM_INT_BITS - topbits;
	460
	461	/* Step 1: Trim x down to the right number of words. */
	462	while (xlen > plen) {
	463
	464	/* Pick out the topmost word; call it y for now. */
	465	y = x[xlen - 1];
	466	if (!y) {
	467	xlen--;
	468	continue;
	469	}
	470
	471	/* Observe that d == 2^n (mod p). Clear that top word. This
	472	* effectively subtracts y B^{xlen-1} from x, so we must add back d
	473	* 2^{w(xlen-1)-n} in order to preserve congruence modulo p. Since d
	474	* is smaller than 2^n, this will reduce the absolute value of x, so
	475	* we make (at least some) progress. In practice, we expect that d
	476	* is a lot smaller.
	477	*
	478	* Note that w (xlen - 1) - n = w xlen - w - n = w (xlen - plen - 1)
	479	* + topclear.
	480	*/
	481	x[xlen - 1] = 0;
	482	add_mul_d_lsl(x + xlen - plen - 1, y, topclear);
	483	}
	484
	485	/* Step 2: Trim off the high bits of x, beyond the top bit of p. In more
	486	* detail: if x >= 2^n > p, then write x = x_0 + 2^n y; then we can clear
	487	* the top topclear bits of x, and add y d to preserve congruence mod p.
	488	*
	489	* If topclear is zero then there's nothing to do here: step 1 will
	490	* already have arranged that x < 2^n.
	491	*/
	492	if (topclear) {
	493	for (;;) {
	494	y = x[xlen - 1] >> topbits;
	495	if (!y) break;
	496	x[xlen - 1] &= (1 << topbits) - 1;
	497	add_mul_d_lsl(x, y, 0);
	498	}
	499	}
	500
	501	/* Step 3: If x >= p then subtract p from x. */
	502	for (i = plen; i-- && x[i] == p[i]; );
	503	if (i >= plen \|\| x[i] > p[i]) internal_sub(x, p, plen);
	504
	505	/* We're done now. */
	506	d->n[0] = xlen;
	507	bn_restore_invariant(d->n);
	508	}
	509
	510	/* Putting together a field-ops structure for a pseudo-Mersenne prime field.
	511	*/
	512
	513	static void pmp_destroy(struct field *f)
	514	{ freebn(f->q); sfree(f); }
	515
	516	#define DEFINE_PMP_FIELD(fname) \
	517	static const struct field_ops fname##_fieldops { \
	518	pmp_destroy, \
	519	prime_init, prime_free, prime_copy, \
	520	prime_frombn, prime_tobn, \
	521	prime_zerop, \
	522	prime_add, prime_sub, prime_mul, gen_sqr, prime_inv, prime_sqrt, \
	523	fname_reduce, \
	524	gen_dbl, gen_tpl, gen_qdl, \
	525	prime_hlv \
	526	}; \
	527	\
	528	static struct field *make_##fname##_field(void) \
	529	{ \
	530	struct field *f = snew(struct field); \
	531	f->ops = &fname##_fieldops; \
	532	f->zero.n = Zero; \
	533	f->one.n = One; \
	534	f->q = bigunm_from_bytes(fname##_p, sizeof(fname##_p)); \
	535	return f; \
	536	}
	537
	538	/* Some actual field definitions. */
	539
	540	static const unsigned char p256_p[] = {
	541	0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x01,
	542	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	543	0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
	544	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
	545	};
	546
	547	static void p256_add_mul_d_lsl(BignumInt *x, BignumInt y, unsigned shift)
	548	{
	549	/* p_{256} = 2^{256} - 2^{224} + 2^{192} + 2^{96} - 1 */
	550	add_imm_lsl(x, y, shift + 224);
	551	sub_imm_lsl(x, y, shift + 192);
	552	sub_imm_lsl(x, y, shift + 96);
	553	add_imm_lsl(x, y, shift + 0);
	554	}
	555
	556	static void p256_reduce(struct field f, union fieldelt d)
	557	{ pmp_reduce(f, d, p256_add_mul_d_lsl); }
	558
	559	DEFINE_PMP_FIELD(p256)
	560
	561	static const unsigned char p384_p[] = {
	562	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	563	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	564	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	565	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe,
	566	0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
	567	0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff
	568	};
	569
	570	static void p384_add_mul_d_lsl(BignumInt *x, BignumInt y, unsigned shift)
	571	{
	572	/* p_{384} = 2^{384} - 2^{128} - 2^{96} + 2^{32} - 1 */
	573	add_imm_lsl(x, y, shift + 128);
	574	add_imm_lsl(x, y, shift + 96);
	575	sub_imm_lsl(x, y, shift + 32);
	576	add_imm_lsl(x, y, shift + 0);
	577	}
	578
	579	static void p384_reduce(struct field f, union fieldelt d)
	580	{ pmp_reduce(f, d, p384_add_mul_d_lsl); }
	581
	582	DEFINE_PMP_FIELD(p384)
	583
	584	static const unsigned char p521_p[] = {
	585	0x01, 0xff,
	586	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	587	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	588	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	589	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	590	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	591	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	592	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	593	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
	594	};
	595
	596	static void p521_add_mul_d_lsl(BignumInt *x, BignumInt y, unsigned shift)
	597	{ add_imm_lsl(x, y, shift); } /* p_{521} = 2^{521} - 1 */
	598
	599	static void p521_reduce(struct field f, union fieldelt d)
	600	{ pmp_reduce(f, d, p521_add_mul_d_lsl); }
	601
	602	DEFINE_PMP_FIELD(p521)
	603
	604	/* And now, some actual elliptic curves.
	605	*
	606	* As with fields, we start by defining an abstract interface which may have
	607	* various implementations (e.g., different algorithms with different
	608	* performance properties, or based on different kinds of underlying fields.
	609	*/
	610
	611	/* An external-form elliptic curve point. */
	612	struct ecpt {
	613	unsigned f; /* Flags */
	614	#define ECPTF_INF 1u /* Point is at infinity */
	615	union fieldelt x, y; /* x- and y-coordinates */
	616	};
	617
	618	/* An internal-form elliptic curve point. This is where most of the action
	619	* happens.
	620	*/
	621	union iecpt {
	622	struct {
	623	union fieldelt x, y, z; /* Projective-form triple. */
	624	} jac; /* Jacobian projective coords. */
	625	};
	626
	627	struct ec {
	628	const struct ec_ops ops; / Curve operations table. */
	629	struct field f; / The underlying finite field. */
	630	struct iecpt p; /* The well-known generator point */
	631	Bignum r; /* The size of the generated group */
	632	union {
	633	struct { union fieldelt a, b; } w; /* Simple Weierstrass coeffs. */
	634	} u;
	635	};
	636
	637	struct ec_ops {
	638	void (destroy)(struct ec ec); /* Destroy the curve itself. */
	639
	640	void (init)(struct ec ec, union iecpt *p);
	641	/* Initialize p (to null). */
	642
	643	void (free)(struct ec ec, union iecpt *p);
	644	/* Free the point p. */
	645
	646	void (setinf)(struct ec ec, union iecpt *p);
	647	/* Set p to be the point at infinity.
	648	*/
	649
	650	int (infp)(struct ec ec, union iecpt *p);
	651	/* Answer whether p is infinite. */
	652
	653	void (copy)(struct ec ec, union iecpt d, union iecpt d);
	654	/* Make p be a copy of d. */
	655
	656	void (in)(struct ec ec, union iecpt d, struct ecpt p);
	657	/* Convert external-format p to
	658	* internal-format d.
	659	*/
	660
	661	void (out)(struct ec ec, struct ecpt d, union iecpt p);
	662	/* Convert internal-format p to
	663	* external-format d.
	664	*/
	665
	666	void (add)(struct ec ec, union iecpt *d,
	667	union iecpt p, union iecpt q);
	668	/* Store in d the sum of the points p
	669	* and q.
	670	*/
	671
	672	void (sub)(struct ec ec, union iecpt *d,
	673	union iecpt p, union iecpt q);
	674	/* Store in d the result of
	675	* subtracting q from p.
	676	*/
	677
	678	void (neg)(struct ec ec, union iecpt d, union iecpt p);
	679	/* Store in d the negation of p. */
	680	};
	681
	682	/* And now some common utilities. */
	683
	684	static void weier_destroy(struct ec *ec)
	685	{
	686	ec->f->ops(ec->f, &ec->u.w.a);
	687	ec->f->ops(ec->f, &ec->u.w.b);
	688	freebn(ec->r);
	689	f->ops->destroy(ec->f);
	690	sfree(ec);
	691	}
	692
	693	static void jac_init(struct ec ec, union iecpt p)
	694	{
	695	ec->f->ops->init(ec->f, &p->proj.x);
	696	ec->f->ops->init(ec->f, &p->proj.y);
	697	ec->f->ops->init(ec->f, &p->proj.z);
	698	}
	699
	700	static void jac_free(struct ec ec, union iecpt p)
	701	{
	702	ec->f->ops->free(ec->f, &p->proj.x);
	703	ec->f->ops->free(ec->f, &p->proj.y);
	704	ec->f->ops->free(ec->f, &p->proj.z);
	705	}
	706
	707	static void jac_setinf(struct ec ec, union iecpt d)
	708	{
	709	ec->f->ops->copy(ec->f, &d->proj.x, &ec->f->one);
	710	ec->f->ops->copy(ec->f, &d->proj.y, &ec->f->one);
	711	ec->f->ops->copy(ec->f, &d->proj.z, &ec->f->zero);
	712	}
	713
	714	static int jac_infp(struct ec ec, union iecpt d)
	715	{ return ec->f->ops->zerop(ec->f, &d->proj.z); }
	716
	717	static void jac_copy(struct ec ec, union iecpt d, union iecpt *d)
	718	{
	719	ec->f->ops->copy(ec->f, &d->proj.x, &p->proj.x);
	720	ec->f->ops->copy(ec->f, &d->proj.y, &p->proj.y);
	721	ec->f->ops->copy(ec->f, &d->proj.z, &p->proj.z);
	722	}
	723
	724	static void jac_in(struct ec ec, union iecpt d, struct ecpt *p)
	725	{
	726	ec->f->ops->copy(ec->f, &d->x, &p->proj.x);
	727	ec->f->ops->copy(ec->f, &d->y, &p->proj.y);
	728	ec->f->ops->copy(ec->f, &d->z, &ec->f->one);
	729	}
	730
	731	static void jac_out(struct ec ec, struct ecpt d, union iecpt *p)
	732	{
	733	union fieldelt iz, iz2, iz3;
	734
	735	if (ec->f->ops->zerop(ec->f, &p->proj.z)) {
	736	ec->f->ops->free(ec->f, d->x);
	737	ec->f->ops->free(ec->f, d->y);
	738	d->f = ECPTF_INF;
	739	} else {
	740	ec->f->ops->init(ec->f, &iz2);
	741	ec->f->ops->init(ec->f, &iz3);
	742
	743	ec->f->ops->inv(ec->f, &iz3, &p->proj.z);
	744	ec->f->ops->sqr(ec->f, &iz2, &iz3);
	745	ec->f->ops->mul(ec->f, &iz3, &iz3, &iz2);
	746
	747	ec->f->ops->mul(ec->f, &d->x, &p->proj.x, &iz2);
	748	ec->f->ops->mul(ec->f, &d->y, &p->proj.x, &iz3);
	749	d->f = 0;
	750
	751	ec->f->ops->free(ec->f, &iz2);
	752	ec->f->ops->free(ec->f, &iz3);
	753	}
	754	}
	755
	756	static void gen_sub(struct ec ec, union iecpt d,
	757	union iecpt p, union iecpt q)
	758	{
	759	union iecpt t;
	760
	761	ec->ops->init(ec, &t);
	762	ec->ops->neg(ec, &t, q);
	763	ec->ops->add(ec, d, p, &t);
	764	ec->ops->free(ec, &t);
	765	}
	766
	767	/* Finally, let's define some actual curve arithmetic functions. */
	768
	769	static void pwjac_dbl(struct ec ec, union iecpt d, union iecpt *p)
	770	{
	771	struct field *f = ec->f;
	772	union fieldelt m, s, t, u;
	773
	774	if (ec->ops->infp(ec, p)) {
	775	ec->setinf(ec, d);
	776	return;
	777	}
	778
	779	f->ops->init(f, &m);
	780	f->ops->init(f, &s);
	781	f->ops->init(f, &t);
	782	f->ops->init(f, &u);
	783
	784	f->ops->sqr(f, &u, &p->proj.z); /* z^2 */
	785	f->ops->sqr(f, &t, &u); /* z^4 */
	786	f->ops->mul(f, &u, &t, &ec->u.w.a); /* a z^4 */
	787	f->ops->sqr(f, &m, &p->proj.x); /* x^2 */
	788	f->ops->tpl(f, &m, &m); /* 3 x^2 */
	789	f->ops->add(f, &m, &m, &u); /* m = 3 x^2 + a z^4 */
	790
	791	f->ops->dbl(f, &t, &p->proj.y); /* 2 y */
	792	f->ops->mul(f, &d->proj.z, &t, &p->proj.z); /* z' = 2 y z */
	793
	794	f->ops->sqr(f, &u, &t); /* 4 y^2 */
	795	f->ops->mul(f, &s, &u, &p->proj.x); /* s = 4 x y^2 */
	796	f->ops->sqr(f, &t, &u); /* 16 y^4 */
	797	f->ops->hlv(f, &t, &t); /* t = 8 y^4 */
	798
	799	f->ops->dbl(f, &u, &s); /* 2 s */
	800	f->ops->sqr(f, &d->proj.x, &m); /* m^2 */
	801	f->ops->sub(f, &d->proj.x, &d->proj.x, &u); /* x' = m^2 - 2 s */
	802
	803	f->ops->sub(f, &u, &s, &d->proj.x); /* s - x' */
	804	f->ops->mul(f, &d->proj.y, &m, &u); /* m (s - x') */
	805	f->ops->sub(f, &d->proj.y, &d->proj.y, &t); /* y' = m (s - x') - t */
	806
	807	f->ops->free(f, &m);
	808	f->ops->free(f, &s);
	809	f->ops->free(f, &t);
	810	f->ops->free(f, &u);
	811	}
	812
	813	static void pwjac_add(struct ec ec, union iecpt d,
	814	union iecpt p, union iecpt q)
	815	{
	816	struct field *f = ec->f;
	817	union fieldelt m, r, s, ss, t, u, uu, w;
	818
	819	if (a == b) { pwjac_dbl(ec, d, p); return; }
	820	else if (ec->ops->infp(ec, p)) { ec->ops->copy(ec, d, q); return; }
	821	else if (ec->ops->infp(ec, q)) { ec->ops->copy(ec, d, p); return; }
	822
	823	f->ops->init(f, &m);
	824	f->ops->init(f, &r);
	825	f->ops->init(f, &s);
	826	f->ops->init(f, &ss);
	827	f->ops->init(f, &t);
	828	f->ops->init(f, &u);
	829	f->ops->init(f, &uu);
	830	f->ops->init(f, &w);
	831
	832	f->ops->sqr(f, &s, &p->proj.z); /* z_0^2 */
	833	f->ops->mul(f, &u, &s, &q->proj.x); /* u = x_1 z_0^2 */
	834	f->ops->mul(f, &s, &s, &p->proj.z); /* z_0^3 */
	835	f->ops->mul(f, &s, &s, &q->proj.y); /* s = y_1 z_0^3 */
	836
	837	f->ops->sqr(f, &ss, &q->proj.z); /* z_1^2 */
	838	f->ops->mul(f, &uu, &ss, &p->proj.x); /* uu = x_0 z_1^2 */
	839	f->ops->mul(f, &ss, &ss, &q->proj.z); /* z_1^3 */
	840	f->ops->mul(f, &ss, &ss, &p->proj.y); /* ss = y_0 z_1^3 */
	841
	842	f->ops->sub(f, &w, &uu, &u); /* w = uu - u */
	843	f->ops->sub(f, &r, &ss, &s); /* r = ss - s */
	844	if (f->ops->zerop(f, w)) {
	845	if (f->ops->zerop(f, r)) ec->ops->dbl(ec, d, p);
	846	else ec->ops->setinf(ec, d);
	847	goto cleanup;
	848	}
	849
	850	f->ops->add(f, &t, &u, &uu); /* t = uu + u */
	851	f->ops->add(f, &m, &s, &ss); /* m = ss + s */
	852
	853	f->ops->mul(f, &uu, &p->proj.z, &w); /* z_0 w */
	854	f->ops->mul(f, &d->proj.z, &uu, &q->proj.z); /* z' = z_0 z_1 w */
	855
	856	f->ops->sqr(f, &s, &w); /* w^2 */
	857	f->ops->mul(f, &u, &s, &t); /* t w^2 */
	858	f->ops->mul(f, &s, &s, &w); /* w^3 */
	859	f->ops->mul(f, &s, &s, &m); /* m w^3 */
	860
	861	f->ops->sqr(f, &m, &r); /* r^2 */
	862	f->ops->sub(f, &d->proj.x, &m, &u); /* x' = r^2 - t w^2 */
	863
	864	f->ops->dbl(f, &m, &d->proj.x); /* 2 x' */
	865	f->ops->sub(f, &u, &u, &m); /* v = t w^2 - 2 x' */
	866	f->ops->mul(f, &u, &u, &r); /* v r */
	867	f->ops->sub(f, &u, &u, &s); /* v r - m w^3 */
	868	f->ops->hlv(f, &d->proj.y, &m); /* y' = (v r - m w^3)/2 */
	869
	870	cleanup:
	871	f->ops->free(f, &m);
	872	f->ops->free(f, &r);
	873	f->ops->free(f, &s);
	874	f->ops->free(f, &ss);
	875	f->ops->free(f, &t);
	876	f->ops->free(f, &u);
	877	f->ops->free(f, &uu);
	878	f->ops->free(f, &w);
	879	}
	880