catacomb
6 years agoRelease 2.3.1. 2.3.1
Mark Wooding [Sat, 13 May 2017 14:21:43 +0000 (15:21 +0100)]
Release 2.3.1.

6 years agopub/bbs-gen.c, pub/rsa-gen.c: Remove the lower-bounding on q.
Mark Wooding [Thu, 11 May 2017 09:42:15 +0000 (10:42 +0100)]
pub/bbs-gen.c, pub/rsa-gen.c: Remove the lower-bounding on q.

It's unnecessary.  It was a bad idea because it biases q quite heavily,
but now `strongprime' generates primes in the right interval so that
getting the right bit length isn't a problem.

6 years agomath/strongprime.c: Clamp the starting point.
Mark Wooding [Thu, 11 May 2017 09:42:15 +0000 (10:42 +0100)]
math/strongprime.c: Clamp the starting point.

Now the result will be in the upper quarter of the `obvious' range, and
the product of two such values is guaranteed to have the desired number
of bits.  This saves callers from doing stupid things like trying to
clamp one of the factors by hand, which ends up significantly biasing
the second factor.  (This isn't very bad, because there's a /lot/ of
randomness in the chosen congruence class, but it's good to fix this
sort of thing.)
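
A minimal sketch of why the clamp is enough, assuming `upper quarter' means the top quarter of [0, 2^n), i.e. values of at least 3*2^(n-2).  The function names are hypothetical and this is word-sized arithmetic, not the `strongprime' code: two such values multiply to at least 9*2^(2n-4), which exceeds 2^(2n-1), so the product always has 2n bits.

```c
#include <assert.h>
#include <stdint.h>

/* Number of significant bits in x (0 for x == 0). */
unsigned bit_length(uint64_t x)
{
  unsigned n = 0;
  while (x) { n++; x >>= 1; }
  return n;
}

/* Smallest value in the top quarter of [0, 2^n): 2^n - 2^(n-2), which is
 * 3 * 2^(n-2).  Requires 2 <= n <= 32 so that products fit in 64 bits.
 */
uint64_t upper_quarter_min(unsigned n)
{
  assert(n >= 2 && n <= 32);
  return (uint64_t)3 << (n - 2);
}

/* Return 1 if the product of two clamped n-bit values has exactly 2n
 * bits.  Both inputs must be n-bit values at or above the clamp.
 */
int product_has_full_length(uint64_t p, uint64_t q, unsigned n)
{
  uint64_t lo = upper_quarter_min(n);
  assert(p >= lo && q >= lo);
  assert(bit_length(p) == n && bit_length(q) == n);
  return bit_length(p * q) == 2*n;
}
```

Without the clamp the guarantee fails: 0x8000 is a 16-bit value, but 0x8000 * 0x8000 has only 31 bits.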

6 years agomath/strongprime.c: Reduce failures by adding some more slop bits.
Mark Wooding [Thu, 11 May 2017 09:42:15 +0000 (10:42 +0100)]
math/strongprime.c: Reduce failures by adding some more slop bits.

In my experiments, failures were happening about 2--3% of the time,
which is way more than one is really willing to tolerate.

6 years agoprogs/catcrypt.c, progs/cc-sig.c: Compare MAC tags in constant time.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
progs/catcrypt.c, progs/cc-sig.c: Compare MAC tags in constant time.
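
A minimal sketch of a constant-time tag comparison, for illustration only.  The name `tag_equal_ct' is hypothetical and this is not the helper the programs actually call.  The point is that every byte is examined whatever the contents, so the running time doesn't reveal the position of the first mismatch the way an early-exit `memcmp' can.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if the n-byte buffers a and b are equal, else 0.  The loop
 * always runs to completion and the result is formed without branching
 * on the data.
 */
int tag_equal_ct(const unsigned char *a, const unsigned char *b, size_t n)
{
  unsigned diff = 0;
  size_t i;

  assert(n == 0 || (a != NULL && b != NULL));
  for (i = 0; i < n; i++) diff |= (unsigned)(a[i] ^ b[i]);

  /* diff lies in [0, 255].  Subtracting one wraps to all-bits-set only
   * when diff is zero, so bit 8 of the difference is the answer.
   */
  return (int)(((diff - 1u) >> 8) & 1u);
}
```

An optimizing compiler is in principle free to rewrite this, so real code may want to check the generated assembler.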

6 years agoprogs/cc-sig.c: Initialize hash context properly for RSA-PSS.
Mark Wooding [Mon, 17 Apr 2017 23:03:01 +0000 (00:03 +0100)]
progs/cc-sig.c: Initialize hash context properly for RSA-PSS.

Somehow this seemed to work anyway on my machine; but valgrind agrees
that it was wrong.

6 years agoprogs/cc-sig.c: Don't destroy an RSA context just after building it.
Mark Wooding [Mon, 17 Apr 2017 22:31:11 +0000 (23:31 +0100)]
progs/cc-sig.c: Don't destroy an RSA context just after building it.

It causes an assertion failure later.  Really embarrassing.

6 years agomath/g-bin.c, math/g-prime.c: Fix type incompatibility.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
math/g-bin.c, math/g-prime.c: Fix type incompatibility.

Callers of the abstract group API expect to pass in a pointer-to-
structure.  The binary and prime group implementations expected a
pointer-to-pointer, which looks different.  Change the way these work,
so that the group element is a structure holding a pointer, rather than
just a bare pointer.  This doesn't make any difference on targets with
sane ABIs, but it fixes a potentially nasty problem on weirder
platforms.

Add a macro explaining this change so that users of this unstable
interface can cope with both versions.
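
A rough sketch of the shape of the change, with hypothetical names throughout (`bignum', `elem', `ELEM_IS_STRUCT' and `ELEM_X' are not the library's identifiers).  C promises that all pointers to structure types share a representation, but makes no such promise relating a pointer-to-pointer to a pointer-to-structure, which is why wrapping the bare pointer in a structure helps.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a multiprecision integer; purely illustrative. */
typedef struct bignum { int value; } bignum;

/* New-style element: a structure holding the pointer.  Generic code
 * passes `elem *', a pointer-to-structure, and the implementation now
 * receives exactly that type.
 */
typedef struct elem { bignum *x; } elem;

/* Hypothetical feature-test macro and accessor, so that client code can
 * be written to cope with either layout.
 */
#define ELEM_IS_STRUCT 1
#if ELEM_IS_STRUCT
#  define ELEM_X(e) ((e)->x)
#else
#  define ELEM_X(e) (*(e))      /* old style: `e' is really `bignum **' */
#endif

/* Fetch the value behind an element through the accessor. */
int elem_value(const elem *e)
{
  assert(e != NULL && ELEM_X(e) != NULL);
  return ELEM_X(e)->value;
}
```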

6 years agomath/g-*.c: Group implementations include `group.h' via `group-guts.h'.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
math/g-*.c: Group implementations include `group.h' via `group-guts.h'.

And not directly.

6 years agokey/key-io.c: Produce valid key lines for empty keys.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
key/key-io.c: Produce valid key lines for empty keys.

If a key contains only an empty tree of structures, then `key_write'
returns an empty string, which breaks the whitespace-separated field
structure of the output key line.  Notice this and insert an empty
structure by hand as an unpleasant bodge.

The resulting key is still highly anomalous.  In particular, it doesn't
match any filter, because structure nodes don't have flags.  I don't
know what to do about this.

6 years agokey/key-io.c: Fix segfault opening `KOPEN_READ | KOPEN_NOFILE' key files.
Mark Wooding [Sat, 13 May 2017 11:27:31 +0000 (12:27 +0100)]
key/key-io.c: Fix segfault opening `KOPEN_READ | KOPEN_NOFILE' key files.

They're useless, but they shouldn't cause a crash.

7 years agosymm/salsa20.[ch]: Add missing LGPL notices.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
symm/salsa20.[ch]: Add missing LGPL notices.

7 years agomath/mpx-mul4-test.c: Set `dstr' length correctly in conversion function.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
math/mpx-mul4-test.c: Set `dstr' length correctly in conversion function.

(cherry picked from commit b00264d9e2ac2f2be2808e7ad663c35115519504)

7 years agosymm/chacha.c: Fix `tell' response.
Mark Wooding [Thu, 13 Apr 2017 13:47:28 +0000 (14:47 +0100)]
symm/chacha.c: Fix `tell' response.

7 years agosymm/chacha.[ch]: Fix comment headers.
Mark Wooding [Thu, 13 Apr 2017 14:50:46 +0000 (15:50 +0100)]
symm/chacha.[ch]: Fix comment headers.

7 years agosymm/{chacha.c,salsa20.c}: Fix random generator allocation sizes.
Mark Wooding [Thu, 13 Apr 2017 13:47:11 +0000 (14:47 +0100)]
symm/{chacha.c,salsa20.c}: Fix random generator allocation sizes.

This makes a real mess.

7 years agoRelease 2.3.0.1. 2.3.0.1
Mark Wooding [Wed, 5 Apr 2017 08:01:13 +0000 (09:01 +0100)]
Release 2.3.0.1.

7 years agobase/asm-common.h: Fix the sense of the `WANT_EXECUTABLE_STACK' check.
Mark Wooding [Wed, 5 Apr 2017 07:59:33 +0000 (08:59 +0100)]
base/asm-common.h: Fix the sense of the `WANT_EXECUTABLE_STACK' check.

Brown paper bag time.

7 years agomath/: Distribute the `mpx-mul4' test vectors, with the correct name.
Mark Wooding [Wed, 5 Apr 2017 08:05:59 +0000 (09:05 +0100)]
math/: Distribute the `mpx-mul4' test vectors, with the correct name.

7 years agomath/: Add low-level testing for accelerated `mpx-mul4' multiplier.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
math/: Add low-level testing for accelerated `mpx-mul4' multiplier.

7 years agoMakefile.am: Some reformatting.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
Makefile.am: Some reformatting.

7 years agovars.am: Some reformatting.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
vars.am: Some reformatting.

7 years agoRelease 2.3.0. 2.3.0
Mark Wooding [Mon, 3 Apr 2017 09:25:30 +0000 (10:25 +0100)]
Release 2.3.0.

7 years agomath/mpx-mul4-amd64-sse2.S: SSE2 multipliers for AMD64.
Mark Wooding [Wed, 4 Jan 2017 01:42:16 +0000 (01:42 +0000)]
math/mpx-mul4-amd64-sse2.S: SSE2 multipliers for AMD64.

Plus the various hangers on.

7 years agomath/mpx-mul4-x86-sse2.S: Maintain a local copy of the counter.
Mark Wooding [Wed, 4 Jan 2017 01:41:22 +0000 (01:41 +0000)]
math/mpx-mul4-x86-sse2.S: Maintain a local copy of the counter.

I've no idea whether one's allowed to mutate a parameter passed on the
stack.  Play it safe.

This means that (a) the counter is now in a fixed place in the frame so
that `testtail' doesn't need to be told where it is, and (b)
`testprologue' needs to initialize it from the caller's parameter, so it
needs to grow a macro argument.

7 years agomath/mpx-mul4-x86-sse2.S: Make stack alignment more standard.
Mark Wooding [Wed, 4 Jan 2017 01:35:50 +0000 (01:35 +0000)]
math/mpx-mul4-x86-sse2.S: Make stack alignment more standard.

This actually slightly reduces the amount of stack needed, but I don't
quite understand why.  There's a knock-on rearrangement of the stack
frame in the test wrappers and C-interface subroutines.

There's also a slightly sneaky introduction of space for a later change.
But there shouldn't be any externally observable difference.

7 years agomath/mpx-mul4-x86-sse2.S: Slightly reorder to reduce dependence.
Mark Wooding [Wed, 4 Jan 2017 01:36:56 +0000 (01:36 +0000)]
math/mpx-mul4-x86-sse2.S: Slightly reorder to reduce dependence.

Doesn't help much.

7 years agomath/mpx-mul4-x86-sse2.S: Fix comment formatting.
Mark Wooding [Wed, 4 Jan 2017 01:36:13 +0000 (01:36 +0000)]
math/mpx-mul4-x86-sse2.S: Fix comment formatting.

7 years agomath/mpx-mul4-x86-sse2.S: Additional piece of commentary.
Mark Wooding [Thu, 29 Dec 2016 15:24:56 +0000 (15:24 +0000)]
math/mpx-mul4-x86-sse2.S: Additional piece of commentary.

7 years agomath/mpx-mul4-x86-sse2.S: Use default arguments for macros.
Mark Wooding [Thu, 29 Dec 2016 15:24:26 +0000 (15:24 +0000)]
math/mpx-mul4-x86-sse2.S: Use default arguments for macros.

I'd muddled up my macro languages and misremembered that GNU as handles
omitted macro arguments sensibly.  So use default argument values
throughout.  Some of the macro arguments have been reordered to make
defaulting work better.  No functional change.

7 years agomath/mpx-mul4-x86-sse2.S: Use the correct vector-multiply instruction.
Mark Wooding [Thu, 29 Dec 2016 14:36:12 +0000 (14:36 +0000)]
math/mpx-mul4-x86-sse2.S: Use the correct vector-multiply instruction.

Not sure why GNU as let me get away with that.

7 years agomath/mpx-mul4-x86-sse2.S: Give `squash' an explicit destination argument.
Mark Wooding [Sat, 5 Nov 2016 21:28:22 +0000 (21:28 +0000)]
math/mpx-mul4-x86-sse2.S: Give `squash' an explicit destination argument.

Also, rearrange the arguments so the destination(s) are at the start.

7 years agomath/mpx-mul4-x86-sse2.S: Optimize `squash'.
Mark Wooding [Sat, 5 Nov 2016 21:28:22 +0000 (21:28 +0000)]
math/mpx-mul4-x86-sse2.S: Optimize `squash'.

We can use `punpckldq' to assemble the 32-bit pieces, rather than a lot
of shifting to clear bits and then `por'.

7 years agomath/mpx-mul4-x86-sse2.S: Use `movdqa' to move between XMM registers.
Mark Wooding [Sat, 5 Nov 2016 21:28:22 +0000 (21:28 +0000)]
math/mpx-mul4-x86-sse2.S: Use `movdqa' to move between XMM registers.

Not `movdqu'.  I don't think there's a performance difference (any
more), but it's better style.

7 years agomath/mpx-mul4-x86-sse2.S: Add an extra blank line to improve layout.
Mark Wooding [Sat, 5 Nov 2016 21:28:22 +0000 (21:28 +0000)]
math/mpx-mul4-x86-sse2.S: Add an extra blank line to improve layout.

7 years agomath/mpx-mul4-x86-sse2.S: Fix operand name in commentary.
Mark Wooding [Sat, 5 Nov 2016 21:28:22 +0000 (21:28 +0000)]
math/mpx-mul4-x86-sse2.S: Fix operand name in commentary.

7 years agomath/mpx-mul4-x86-sse2.S: `mmla4' only needs 48 bytes of stack.
Mark Wooding [Sat, 5 Nov 2016 21:28:22 +0000 (21:28 +0000)]
math/mpx-mul4-x86-sse2.S: `mmla4' only needs 48 bytes of stack.

7 years agosymm/salsa20-arm-neon.S: Improve output permutation still further.
Mark Wooding [Mon, 7 Nov 2016 12:24:35 +0000 (12:24 +0000)]
symm/salsa20-arm-neon.S: Improve output permutation still further.

7 years agosymm/rijndael-x86ish-aesni.S: Use `.extern' for external symbols.
Mark Wooding [Thu, 29 Dec 2016 14:35:06 +0000 (14:35 +0000)]
symm/rijndael-x86ish-aesni.S: Use `.extern' for external symbols.

Duh.  `.globl' is certainly the wrong thing here.

7 years agobase/asm-common.h, */*.S: New macros for making stack-unwinding tables.
Mark Wooding [Thu, 29 Dec 2016 15:21:08 +0000 (15:21 +0000)]
base/asm-common.h, */*.S: New macros for making stack-unwinding tables.

Previously, I only supported Microsoft SEH tables, because they're
basically essential to having a working 64-bit binary (because Microsoft
are crazy and throw asynchronous exceptions).  But there are three
variants of stack-unwinding tables which are useful to make:

  * Microsoft's SEH tables for AMD64, constructed using `.seh_...'
    directives;

  * ARM's `.ARM.exidx' and `.ARM.extab' tables; and

  * Dwarf `.eh_frame' and `.debug_frame' tables.

These are all quite similar in flavour, but different in detail.  Rather
than write lots of hairy conditional stuff around subroutine prologues
and epilogues, wrap the whole lot up in some target-specific macros.

7 years agobase/asm-common.h, *.S: Add `INTFUNC' macro for internal subroutines.
Mark Wooding [Sat, 5 Nov 2016 21:28:22 +0000 (21:28 +0000)]
base/asm-common.h, *.S: Add `INTFUNC' macro for internal subroutines.

This provides correct alignment, and scoping for Windows SEH
annotations.

7 years agobase/asm-common.h: Define `WORDSZ' appropriately for x86ish platforms.
Mark Wooding [Thu, 29 Dec 2016 14:15:40 +0000 (14:15 +0000)]
base/asm-common.h: Define `WORDSZ' appropriately for x86ish platforms.

Four for 32-bit, eight for 64-bit, obviously.

7 years agobase/asm-common.h: Use `_' consistently for ignored macro arguments.
Mark Wooding [Thu, 29 Dec 2016 14:14:45 +0000 (14:14 +0000)]
base/asm-common.h: Use `_' consistently for ignored macro arguments.

7 years agobase/asm-common.h, symm/*.S: New macros for register name decoration.
Mark Wooding [Sat, 5 Nov 2016 21:28:22 +0000 (21:28 +0000)]
base/asm-common.h, symm/*.S: New macros for register name decoration.

Enhance `base/asm-common.h' with new macros for translating between
various ways of describing pieces of machine registers.

The x86/AMD64 general-purpose registers are a complicated mess of
overlapping pieces, and trying to write code which works on both just
makes everything even more interesting.

The ARM NEON registers are somewhat complicated, and GNU as isn't as
good as it should be at coping with alternative ways of denoting pieces
of them.  (For example, it ought to allow {q0-q7} instead of {d0-d15},
but doesn't; and it ought to allow q2[2] instead of d5[0], but doesn't.)

Use these macros tastefully in the various pieces of assembler code.

7 years agobase/asm-common.h: Add some general C preprocessor utilities.
Mark Wooding [Sat, 5 Nov 2016 21:28:22 +0000 (21:28 +0000)]
base/asm-common.h: Add some general C preprocessor utilities.

7 years agobase/ct.c: Better constant-time algorithms from /Hacker's Delight/.
Mark Wooding [Thu, 29 Dec 2016 11:50:50 +0000 (11:50 +0000)]
base/ct.c: Better constant-time algorithms from /Hacker's Delight/.

Improve equality checking and ordering, and add detailed commentary.
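
For flavour, a sketch of the branch-free comparison formulas of the kind given in /Hacker's Delight/, on 32-bit words.  The names are hypothetical and this isn't claimed to match what `base/ct.c' contains.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if x == y, else 0.  With d = x ^ y, the word d | -d has its
 * top bit set exactly when d is nonzero.
 */
uint32_t eq32_ct(uint32_t x, uint32_t y)
{
  uint32_t d = x ^ y;
  return (uint32_t)~(d | (0u - d)) >> 31;
}

/* Return 1 if x < y as unsigned values, else 0.  The top bit of the
 * expression is the borrow out of the subtraction x - y.
 */
uint32_t lt32_ct(uint32_t x, uint32_t y)
{
  return (uint32_t)((~x & y) | ((~x | y) & (x - y))) >> 31;
}

/* A few spot checks, doubling as a usage example. */
void cmp32_ct_selfcheck(void)
{
  assert(eq32_ct(5, 5) == 1 && eq32_ct(5, 6) == 0);
  assert(lt32_ct(1, 2) == 1 && lt32_ct(2, 1) == 0 && lt32_ct(7, 7) == 0);
}
```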

7 years agobase/asm-common.h, symm/rijndael-x86ish-aesni.S: Better section switching.
Mark Wooding [Mon, 12 Sep 2016 21:32:37 +0000 (22:32 +0100)]
base/asm-common.h, symm/rijndael-x86ish-aesni.S: Better section switching.

Provide macros for changing section which handle (a) switching to the
right text subsection, and (b) a section for readonly data.

7 years agobase/asm-common.h: Include `.note.GNU-stack' section on ELF targets.
Mark Wooding [Sat, 5 Nov 2016 12:22:43 +0000 (12:22 +0000)]
base/asm-common.h: Include `.note.GNU-stack' section on ELF targets.

This will ensure that Catacomb doesn't force an executable stack on
processes using it.  Oops.

7 years agomath/mpx-mul4-x86-sse2.S: Use `SHUF' instead of hardwired constants.
Mark Wooding [Sat, 5 Nov 2016 20:38:31 +0000 (20:38 +0000)]
math/mpx-mul4-x86-sse2.S: Use `SHUF' instead of hardwired constants.

7 years agosymm/salsa20-*.S: Optimize the output permutations.
Mark Wooding [Tue, 1 Nov 2016 22:38:41 +0000 (22:38 +0000)]
symm/salsa20-*.S: Optimize the output permutations.

A little analysis, and a lot of trial and error, reveals that the
state permutation can be decomposed into some rotations of the rows, a
matrix transpose, and another rotation of the rows.  These steps can be
done moderately efficiently using the Intel and ARM SIMD instructions.

7 years agomath/mpx.h, math/mpmont.c: Retune the Karatsuba thresholds.
Mark Wooding [Sun, 2 Oct 2016 23:27:11 +0000 (00:27 +0100)]
math/mpx.h, math/mpmont.c: Retune the Karatsuba thresholds.

It seems like Karatsuba isn't especially worthwhile for Montgomery
multiplication at any cryptographically relevant modulus size.  It's
certainly a lose with the new SSE2 multipliers.

7 years agomath/ptab.in: Include the correct Oakley 2048 group!
Mark Wooding [Sun, 2 Oct 2016 23:26:28 +0000 (00:26 +0100)]
math/ptab.in: Include the correct Oakley 2048 group!

I'd mistakenly duplicated the 1536 group.  This is... unfortunate.

7 years agomath/: SSE2-based high-performance multipliers.
Mark Wooding [Mon, 12 Sep 2016 21:32:37 +0000 (22:32 +0100)]
math/: SSE2-based high-performance multipliers.

7 years agobase/asm-common.h: Add some debugging macros.
Mark Wooding [Mon, 12 Sep 2016 21:32:37 +0000 (22:32 +0100)]
base/asm-common.h: Add some debugging macros.

Currently only for 32-bit x86.  More will come when they seem useful...

7 years agovars.am: Don't delete `*.t' files after running tests.
Mark Wooding [Mon, 12 Sep 2016 21:32:37 +0000 (22:32 +0100)]
vars.am: Don't delete `*.t' files after running tests.

Silly GNU Make.  Of course I wanted to keep them.

7 years agomath/mpmont.c: Make REDC coefficient as long as the modulus.
Mark Wooding [Mon, 12 Sep 2016 21:32:37 +0000 (22:32 +0100)]
math/mpmont.c: Make REDC coefficient as long as the modulus.

We'll have trouble later if it's too short.

7 years agomath/mpmont.c: Factor out the computational core of the algorithm.
Mark Wooding [Mon, 12 Sep 2016 21:32:37 +0000 (22:32 +0100)]
math/mpmont.c: Factor out the computational core of the algorithm.

Surprisingly, this makes everything a little simpler.

7 years agomath/ec-test.c: Add in missing space in test failure reports.
Mark Wooding [Mon, 12 Sep 2016 21:32:37 +0000 (22:32 +0100)]
math/ec-test.c: Add in missing space in test failure reports.

7 years agovars.am: Associate more useful dependencies with test programs.
Mark Wooding [Sun, 11 Sep 2016 14:05:49 +0000 (15:05 +0100)]
vars.am: Associate more useful dependencies with test programs.

For a long time, probably forever, `make FOO.t' hasn't actually worked
to rebuild the test program because of deficiencies in make(1) suffix
rules.  Add GNU Make pattern rules, which can have dependencies, to
finally fix this.

7 years agovars.am, symm/Makefile.am: Associate built test vector files with logs.
Mark Wooding [Sun, 11 Sep 2016 14:03:59 +0000 (15:03 +0100)]
vars.am, symm/Makefile.am: Associate built test vector files with logs.

Previously they were associated with the test executables, but that's
not the right approach, and it's going to be a problem if we don't fix
it.

Unfortunately, Automake only allows dependencies on test log files if
the test-file suffix is listed in $(TEST_EXTENSIONS).

7 years agomath/mprand.[ch], rand/grand.c: Check range of arguments.
Mark Wooding [Sun, 11 Sep 2016 13:29:49 +0000 (14:29 +0100)]
math/mprand.[ch], rand/grand.c: Check range of arguments.

  * mprand: It doesn't make sense to ask for a zero-bit integer whose
    low bit is set; or, indeed, a four-bit integer whose fourth bit is
    set.  So check the mask against the bit length.

  * mprand: On the other hand, it /does/ make sense to ask for a
    zero-bit integer, and the answer is simply zero.  But the code used
    to segfault.  Fix this.

  * mprand_range, grand_defaultrange: It doesn't make sense to ask for
    an integer in [0, 0), because there aren't any.  Check before
    trying.
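
The three checks can be sketched on plain 64-bit words.  This is an illustration with hypothetical names, not the `mprand' interface: the mask must fit within the requested length, a zero-bit request is fine and yields zero, and an empty range is rejected up front.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if every set bit of or_mask lies below bit position `bits'. */
int mask_fits(unsigned bits, uint64_t or_mask)
{
  assert(bits <= 64);
  return bits == 64 || (or_mask >> bits) == 0;
}

/* Shape raw random input into a `bits'-bit value with the bits of
 * or_mask forced on.  The caller must have checked the mask first.
 */
uint64_t shape_random(uint64_t raw, unsigned bits, uint64_t or_mask)
{
  assert(mask_fits(bits, or_mask));
  if (bits == 0) return 0;      /* a zero-bit integer is simply zero */
  if (bits < 64) raw &= (UINT64_C(1) << bits) - 1;
  return raw | or_mask;
}

/* Return 1 if the half-open range [0, limit) contains any integers. */
int range_nonempty(uint64_t limit)
{
  return limit != 0;
}
```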

7 years agomath/mpint.h (MP_TOINT): Convert MP digits to target type before shifting.
Mark Wooding [Sun, 11 Sep 2016 13:24:34 +0000 (14:24 +0100)]
math/mpint.h (MP_TOINT): Convert MP digits to target type before shifting.

Otherwise conversions to types wider than `mpw' never actually work.
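
A small sketch of the pitfall, using a hypothetical 32-bit `digit' type as a stand-in for `mpw' rather than the real macro.  Shifting a 32-bit digit left by 32 before widening is undefined, and smaller shifts lose the high bits, so the digit has to be converted to the target type first.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t digit;         /* stand-in for `mpw'; an assumption */
#define DIGIT_BITS 32

/* Assemble a 64-bit value from up to two little-endian digits. */
uint64_t digits_to_u64(const digit *v, unsigned n)
{
  uint64_t acc = 0;
  unsigned i;

  assert(n <= 2);
  for (i = 0; i < n; i++)
    acc |= (uint64_t)v[i] << (i * DIGIT_BITS); /* widen, then shift */
  return acc;
}
```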

7 years agomath/{mpbarrett,mpmont}.h: Provide correctness proofs for these methods.
Mark Wooding [Fri, 9 Sep 2016 10:06:41 +0000 (11:06 +0100)]
math/{mpbarrett,mpmont}.h: Provide correctness proofs for these methods.

Add commentary explaining how these reduction algorithms actually work,
with proofs.
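
As a reminder of what is being proved, here is a textbook single-word Montgomery reduction with R = 2^32.  It is a sketch with hypothetical names and a modulus restricted to below 2^31 to keep the arithmetic in 64 bits, not the multiprecision code in `math/mpmont.c'.

```c
#include <assert.h>
#include <stdint.h>

/* Return -1/m mod 2^32 for odd m, by Newton iteration.  The starting
 * guess m is right to three bits because m*m == 1 (mod 8); each step
 * doubles that, so four steps cover 32 bits.
 */
uint32_t mont_neg_inverse(uint32_t m)
{
  uint32_t inv = m;
  int i;

  assert(m & 1);
  for (i = 0; i < 4; i++) inv *= 2u - m * inv;
  return 0u - inv;
}

/* Return t / 2^32 mod m.  Requires m odd, m < 2^31, t < m * 2^32, and
 * mi as computed by mont_neg_inverse(m).
 */
uint32_t mont_redc(uint64_t t, uint32_t m, uint32_t mi)
{
  uint32_t u;
  uint64_t s;

  assert((m & 1) && m < UINT32_C(0x80000000));
  assert(t < ((uint64_t)m << 32));
  u = (uint32_t)t * mi;                 /* t + u*m == 0 (mod 2^32) */
  s = (t + (uint64_t)u * m) >> 32;      /* exact division; s < 2*m */
  return (uint32_t)(s >= m ? s - m : s);
}
```

The bound s < 2*m is what makes a single conditional subtraction sufficient.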

7 years agosymm/rijndael-x86ish-aesni.S: Load destination pointer earlier on 32-bit.
Mark Wooding [Thu, 11 Aug 2016 09:07:04 +0000 (10:07 +0100)]
symm/rijndael-x86ish-aesni.S: Load destination pointer earlier on 32-bit.

We don't need EDX for anything for most of the code, so repurpose it
earlier ready for the final store.

7 years agosymm/rijndael-x86ish-aesni.S: Fix conflict in 32-bit register allocation.
Mark Wooding [Thu, 11 Aug 2016 09:02:56 +0000 (10:02 +0100)]
symm/rijndael-x86ish-aesni.S: Fix conflict in 32-bit register allocation.

Since 28321c9..., the context pointer in ECX was overwritten with the
GOT pointer, used to find the end-swapping table, resulting in an abort.
Reallocate so that the round count, which is loaded /after/ the endswap
pointer, ends up in ECX, and the context pointer goes in EAX, which
doesn't get clobbered.

This doesn't affect 64-bit targets.

7 years agobase/asm-common.h, *-x86ish-*.S: Centralize SSE shuffling constants.
Mark Wooding [Thu, 11 Aug 2016 08:15:12 +0000 (09:15 +0100)]
base/asm-common.h, *-x86ish-*.S: Centralize SSE shuffling constants.

Introduce a centrally defined `SHUF(D, C, B, A)' macro to make shuffling
constants for `pshufd' and friends, rather than defining inscrutable
`ROTL' etc. macros in each file.

There are lots of other shuffling instructions, which may need their own
magic macros, so this might prove to have been a bad name, but we'll
worry about that later.
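
A sketch of what such a macro might look like, together with a portable C model of `pshufd' to show how the constant is consumed.  The encoding is an assumption based on the argument order in the name above: the selector for the highest destination lane comes first, matching the two-bit fields of the immediate from high to low.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definition; D selects the source for lane 3, A for lane 0. */
#define SHUF(d, c, b, a) (64*(d) + 16*(c) + 4*(b) + (a))

/* Model of `pshufd': destination lane i takes the source lane named by
 * bits 2i and 2i+1 of the immediate.  dst and src may be the same array.
 */
void shuffle32x4(uint32_t dst[4], const uint32_t src[4], unsigned imm)
{
  uint32_t tmp[4];
  unsigned i;

  assert(imm < 256);
  for (i = 0; i < 4; i++) tmp[i] = src[(imm >> (2*i)) & 3];
  for (i = 0; i < 4; i++) dst[i] = tmp[i];
}
```

With this encoding `SHUF(3, 2, 1, 0)' is the identity shuffle, and `SHUF(0, 3, 2, 1)' rotates the lanes down by one place.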

7 years agosymm/{chacha,salsa20}-*.S: Indent the hoisted transposition instructions.
Mark Wooding [Mon, 8 Aug 2016 09:33:29 +0000 (10:33 +0100)]
symm/{chacha,salsa20}-*.S: Indent the hoisted transposition instructions.

This hopefully makes it clearer how the various interleaved strands of
computation work.

7 years agosymm/salsa20-x86ish-sse2.S: Cosmetic fixes.
Mark Wooding [Mon, 8 Aug 2016 09:32:04 +0000 (10:32 +0100)]
symm/salsa20-x86ish-sse2.S: Cosmetic fixes.

Fix a mangled comment, and remove a spurious blank line.

7 years agosymm/rijndael-arm-crypto.S: More aggressive loading of subkey data.
Mark Wooding [Sat, 30 Jul 2016 10:48:16 +0000 (11:48 +0100)]
symm/rijndael-arm-crypto.S: More aggressive loading of subkey data.

Rewrite the block-encryption primitives so that they load key data in
multiple round chunks.  There's now a separate prefix piece for each
number of rounds other than ten which does the extra rounds and flows
into the main sequence.  Because the code is now rather more complicated,
there's only one copy of it, in a macro, as for the AESNI version.

7 years agobase/asm-common.h, *.S: Include metadata for 64-bit Windows stack unwinding.
Mark Wooding [Wed, 13 Jul 2016 22:19:03 +0000 (23:19 +0100)]
base/asm-common.h, *.S: Include metadata for 64-bit Windows stack unwinding.

There are (annoyingly undocumented) assembler directives, which make
this fairly straightforward.  I've manually verified that they're
setting up the expected data structures correctly.  Under normal
circumstances, we don't expect these leaf functions to throw exceptions.

Note that the `endswap_block' subroutine of
`rijndael_setup_x86ish_aesni' is not currently properly described.

7 years agosymm/rijndael-x86ish-aesni.S: Move setup of endswap table after prologue.
Mark Wooding [Wed, 13 Jul 2016 22:16:38 +0000 (23:16 +0100)]
symm/rijndael-x86ish-aesni.S: Move setup of endswap table after prologue.

When we introduce metadata for Windows stack unwinding, it will be ugly
to have to count this code as part of the stack-frame establishment
prologue.  Move it later.

7 years agobase/asm-common.h, *.S: Introduce `AUXFN'/`ENDAUXFN'; abolish `gotaux'.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
base/asm-common.h, *.S: Introduce `AUXFN'/`ENDAUXFN'; abolish `gotaux'.

This change introduces a new macro pair `AUXFN' and `ENDAUXFN' which are
mostly useful in other macros.  They bracket an auxiliary function
definition which will be put somewhere convenient (at the end of the
text section), and defined exactly once.

This is exactly what we need to make the `_where_am_i.GOTREG' macros
automatically in `ldgot', so use this and abolish `gotaux' from the
codebase.

7 years agosymm/rijndael-*.S (rijndael_setup_*): Roll up the inner loop.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
symm/rijndael-*.S (rijndael_setup_*): Roll up the inner loop.

Reduce code size by tracking position in the main key-schedule loop in a
register and dispatching rather than tracking it in the program-counter.

7 years agosymm/rijndael-x86ish-aesni.S (rijndael_setup_x86ish_aesni): Label numbering.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
symm/rijndael-x86ish-aesni.S (rijndael_setup_x86ish_aesni): Label numbering.

Follow what appear to be my current conventions properly.

7 years agoRelease 2.2.5. 2.2.5
Mark Wooding [Tue, 12 Jul 2016 09:28:05 +0000 (10:28 +0100)]
Release 2.2.5.

7 years agosymm/rijndael-arm-crypto.S (rijndael_setup_arm_crypto): Avoid reload.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
symm/rijndael-arm-crypto.S (rijndael_setup_arm_crypto): Avoid reload.

Juggle the register allocation in the loop which copies over the first
key-data cycle, so as to arrange to leave the last copied key word in
r4.  Then we can elide the explicit load of r4 at the start of the main
key expansion loop, because it already has the right value, saving a
whole instruction.

7 years agosymm/rijndael-arm-crypto.S (rijndael_setup_arm_crypto): Renumber labels.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
symm/rijndael-arm-crypto.S (rijndael_setup_arm_crypto): Renumber labels.

Be more consistent about the label numbering.  Specifically, 0 is
usually a loop head, and 9 is usually a thing to do next.

7 years agosymm/rijndael-arm-crypto.S (rijndael_setup_arm_crypto): Fix missing label.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
symm/rijndael-arm-crypto.S (rijndael_setup_arm_crypto): Fix missing label.

The loop which copies the initial key data from a misaligned address
ends with a branch to the next label 9 which, due to an oversight on my
part, skipped out the setup for the main loop.  Introduce an extra label
9 to fix this.

7 years agoconfigure.ac: Use modern version of `AC_CHECK_TYPES'.
Mark Wooding [Mon, 11 Jul 2016 10:10:50 +0000 (11:10 +0100)]
configure.ac: Use modern version of `AC_CHECK_TYPES'.

Also, check for `socklen_t' in <sys/socket.h>, so that we can find it on
Android.  I don't expect this to be the last Android portability
failure, because I've not even tried building it there yet.

7 years agosymm/rijndael-arm-crypto.S: Outdent the `.rept/.endr' directives.
Mark Wooding [Mon, 11 Jul 2016 09:50:31 +0000 (10:50 +0100)]
symm/rijndael-arm-crypto.S: Outdent the `.rept/.endr' directives.

7 years agoconfigure.ac: Use new name for `AX_C_LONG_LONG'.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
configure.ac: Use new name for `AX_C_LONG_LONG'.

The correct name is in wheezy's version of the Autoconf archive, so I
guess it's not that new really.

7 years agoRelease 2.2.4. 2.2.4
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
Release 2.2.4.

7 years agoconfigure.ac, symm/rijndael*: Use ARMv8 AES instructions where available.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
configure.ac, symm/rijndael*: Use ARMv8 AES instructions where available.

This matches the x86 AESNI support, but is less mad.

7 years agobase/dispatch.c: Add notional support for `AT_HWCAP2' entry.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
base/dispatch.c: Add notional support for `AT_HWCAP2' entry.

Later ARM-based kernels provide one of these, at least.
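
A guarded sketch of reading the entry with `getauxval', assuming a Linux C library which provides <sys/auxv.h>.  The function name is hypothetical, and this is not the probing logic in `base/dispatch.c'.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__linux__)
#  include <sys/auxv.h>
#endif

/* Store the AT_HWCAP2 word in *out and return 1, or store zero and
 * return 0 if this platform or C library doesn't expose the entry.
 * Note that `getauxval' itself returns zero for a missing entry, so a
 * zero word with a return of 1 may still mean `not provided'.
 */
int probe_hwcap2(unsigned long *out)
{
  assert(out != NULL);
#if defined(__linux__) && defined(AT_HWCAP2)
  *out = getauxval(AT_HWCAP2);
  return 1;
#else
  *out = 0;
  return 0;
#endif
}
```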

7 years agoconfigure.ac: Check that the chosen assembler will actually work.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
configure.ac: Check that the chosen assembler will actually work.

If the system assembler doesn't like the GNUish directive syntax I'm
using, then the build will fail badly and be hard to fix.  Now, if the
assembler doesn't look like it's going to work, then declare the target
platform to be unknown so as to disable all of this fancy machinery.

7 years agoconfigure.ac: Segregate checks by source language better.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
configure.ac: Segregate checks by source language better.

Move the poking-about-for-CPU-features function checks in with the rest
of the C code probing.

7 years agosymm/rijndael-x86ish-aesni.S: Have `endswap_block' copy NKW to ECX.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
symm/rijndael-x86ish-aesni.S: Have `endswap_block' copy NKW to ECX.

Eliminate a tiny bit of code duplication.  It's not like anyone else
uses that subroutine.

7 years agobase/asm-common.h: Factor out `deposit fake literal pool' macro.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
base/asm-common.h: Factor out `deposit fake literal pool' macro.

This might be useful for debugging purposes.

7 years agoHave a small reformatting session.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
Have a small reformatting session.

  * Outdent `.macro' and `.endm' directives.  Firstly, this makes them
    more prominent, similar to `FUNC' and `ENDFUNC'.  Secondly, though,
    it has the effect of moving the macro name into the mnemonic column.

  * Remove the second `External definitions' banner from
    `symm/rijndael-x86ish-aesni.S'.

  * Reflow the various `CPU_DISPATCH' stanze.

7 years agosymm/rijndael-x86ish-aesni.S: Decorate `rijndael_rcon' correctly.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
symm/rijndael-x86ish-aesni.S: Decorate `rijndael_rcon' correctly.

I don't think I've tested this on 32-bit Windows, which is the only
platform I'm currently supporting which needs nontrivial symbol
decoration.

7 years agobase/dispatch.c: Fix list-macro invocation if we have `getauxval'.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
base/dispatch.c: Fix list-macro invocation if we have `getauxval'.

Caused hopeless build failure on ARM versions of jessie.

7 years agobase/dispatch.c: Just include all the auxvec-related headers we can.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
base/dispatch.c: Just include all the auxvec-related headers we can.

The necessary stuff will be in one of them.  It turns out that the
previous approach sometimes missed some important definitions.

7 years agobase/asm-common.h: Use the correct `CPUFAM_*' name for ARM.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
base/asm-common.h: Use the correct `CPUFAM_*' name for ARM.

7 years agoconfigure.ac: Quote `$ac_cv_search_clock_gettime' properly.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
configure.ac: Quote `$ac_cv_search_clock_gettime' properly.

This can expand to `none required', which confuses test(1).  My fault.

7 years agomath/pfilt.c (pfilt_jump): Fix off-by-one error in reduction.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
math/pfilt.c (pfilt_jump): Fix off-by-one error in reduction.

Oh, dear.  This is quite a bad one.  The loop adds the residues for the
jump to the candidate's residues, and reduces each only if the result is
strictly higher than the modulus, so a sum exactly equal to the modulus
is left as it is rather than becoming zero.  It then reports failure (or
immediate success) if any residue is zero, otherwise reporting a
candidate for subsequent testing.  Obviously, this is a stupid bug, with
the result that, effectively, every step reports a candidate for further
testing.

This bug has two bad consequences.

  * Candidates with small factors aren't weeded out, so prime searching
    takes an unnecessarily long time.  I'd spotted this, but didn't have
    a way in to investigate the problem.

  * Candidates which actually have small factors, but are in fact below
    the `smallenough' threshold, are reported as being verified as
    prime, so the overall procedure erroneously returns known
    composites.
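
The off-by-one is easy to see in miniature.  The following is a sketch with hypothetical names, not the `pfilt_jump' code: it steps one residue both ways, assuming small moduli so that the sum can't overflow.

```c
#include <assert.h>

/* Advance the residue r (candidate mod p) by j (jump mod p), giving the
 * new candidate's residue in [0, p).
 */
unsigned step_residue(unsigned r, unsigned j, unsigned p)
{
  assert(p > 0 && r < p && j < p);
  r += j;
  if (r >= p) r -= p;
  return r;
}

/* The same step with the strict comparison: a sum equal to p is left
 * alone, so the result is p rather than zero and a later test for a
 * zero residue misses the factor.
 */
unsigned step_residue_strict(unsigned r, unsigned j, unsigned p)
{
  assert(p > 0 && r < p && j < p);
  r += j;
  if (r > p) r -= p;
  return r;
}
```

For example, a candidate which is 3 mod 7 stepped by 4 is divisible by 7: the first function returns 0, the second returns 7.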

7 years agopub/bbs-gen.c, pub/rsa-gen.c: Fail if the generated key is the wrong length.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
pub/bbs-gen.c, pub/rsa-gen.c: Fail if the generated key is the wrong length.

7 years agopub/bbs-gen.c: Carefully generate numbers of the correct sizes.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
pub/bbs-gen.c: Carefully generate numbers of the correct sizes.

7 years agopub/bbs-gen.c: Return secret numbers for private keys.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
pub/bbs-gen.c: Return secret numbers for private keys.

7 years agomath/strongprime.c: Choose the smaller primes' sizes more carefully.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
math/strongprime.c: Choose the smaller primes' sizes more carefully.

The old code would indeed, as the warning in the comment said, produce
numbers which are larger than requested because the component primes'
sizes were chosen in a naïve manner.  I've now (eventually!) thought
about the issue some more and come up with a better approach.

The `BITSLOP' macro is now gone, replaced by a carefully chosen value
supported by some actual mathematics.  As a result, the warning comments
have been removed.  Also, `strongprime' will fail if it actually returns
a number of the wrong size.

7 years agomath/, pub/: Take a more consistent approach to prime-generation failures.
Mark Wooding [Thu, 26 May 2016 08:26:09 +0000 (09:26 +0100)]
math/, pub/: Take a more consistent approach to prime-generation failures.

  * Don't have `strongprime_setup' assert just because the requested
    size is too small.

  * Fix `strongprime' itself, so that it leaves its destination in a
    predictable state (specifically, it's unmolested) if it fails.

  * Remove the retry loops from `bbs_gen' and `rsa_gen'.  Now,
    downstream failures are consistently propagated.
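
The convention can be sketched with toy word-sized arithmetic.  All the names here are hypothetical and trial division stands in for the real prime search: a failing call leaves its destination alone, and the caller passes the failure upwards rather than retrying.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if n is prime, by trial division. */
int is_prime(unsigned n)
{
  unsigned d;
  if (n < 2) return 0;
  for (d = 2; d <= n / d; d++)
    if (n % d == 0) return 0;
  return 1;
}

/* Find the first prime in [start, limit).  On success store it in *out
 * and return 0; on failure return -1 and leave *out untouched.
 */
int find_prime_from(unsigned start, unsigned limit, unsigned *out)
{
  unsigned n;
  assert(out != NULL);
  for (n = start; n < limit; n++)
    if (is_prime(n)) { *out = n; return 0; }
  return -1;
}

/* Find two distinct primes in [start, limit).  Any failure is passed on
 * to the caller, and neither output is written unless both searches
 * succeed.
 */
int gen_pair(unsigned start, unsigned limit, unsigned *p, unsigned *q)
{
  unsigned tp, tq;
  assert(p != NULL && q != NULL);
  if (find_prime_from(start, limit, &tp)) return -1;
  if (find_prime_from(tp + 1, limit, &tq)) return -1;
  *p = tp; *q = tq;
  return 0;
}
```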