From: Mark Wooding Date: Thu, 26 May 2016 08:26:09 +0000 (+0100) Subject: symm/poly1305.c: Fix 16/32-bit `carry_reduce'. X-Git-Tag: 2.4.0~62 X-Git-Url: https://git.distorted.org.uk/~mdw/catacomb/commitdiff_plain/57e7040b318f0ffc5ab43c3fb62df9a2bef42ac7 symm/poly1305.c: Fix 16/32-bit `carry_reduce'. I managed to botch the bounds last time. --- diff --git a/symm/poly1305.c b/symm/poly1305.c index 99cc5797..55fe3711 100644 --- a/symm/poly1305.c +++ b/symm/poly1305.c @@ -230,9 +230,10 @@ static void carry_reduce(uint32 u[12]) uint32 c; /* Do sequential carry propagation (16-bit CPUs are less likely to benefit - * from instruction-level parallelism). Start at u_10; truncate it to 11 - * bits, and add the carry onto u_11. Truncate u_11 to 10 bits, and add - * five times the carry onto u_0. And so on. + * from instruction-level parallelism). Start at u_9; truncate it to 11 + * bits, and add the carry onto u_10. Truncate u10 to 11 bits, and add the + * carry onto u_11. Truncate u_11 to 10 bits, and add five times the carry + * onto u_0. And so on. * * The carry is larger than the pieces we're leaving behind. Let c_i be * the high portion of u_i, to be carried onto u_{i+1}. I claim that c_i < @@ -241,13 +242,15 @@ static void carry_reduce(uint32 u[12]) * carry is bounded above by 636*2^22 + 12785*2^10 < 2557*2^20. Hence, the * carry out is at most 2557*2^10, as claimed. * - * Once we reach u_10 for the second time, we start with u_10 < 2^11. The - * carry into u_10 is at most 2557*2^10 < 1279*2^11 as calculated above; so - * the carry out into u_11 is at most 1280. Since u_11 < 2^10 prior to - * this carry in, all of the u_i are now bounded above by 2^11. The final - * reduction therefore only needs a conditional subtraction. + * Once we reach u_9 for the second time, we start with u_9 < 2^11. The + * carry into u_9 is at most 2557*2^10 < 1279*2^11 as calculated above; so + * the carry out into u_10 is at most 1280. Since u_10 < 2^11 prior to + * this carry in, we now have u_10 < 2^11 + 1280 < 2^12; so the carry out + * into u_11 is at most 1. The final reduction therefore only needs a + * conditional subtraction. */ - { c = u[10] >> 11; u[10] &= M11; } + { c = u[9] >> 11; u[9] &= M11; } + { u[10] += c; c = u[10] >> 11; u[10] &= M11; } { u[11] += c; c = u[11] >> 10; u[11] &= M10; } { u[0] += 5*c; c = u[0] >> 11; u[0] &= M11; } for (i = 1; i < 5; i++) { u[i] += c; c = u[i] >> 11; u[i] &= M11; }