Mostly abolish inline assembler code in favour of dedicated files.
Move the fancy feature probing out of `dispatch.c'. This makes it easier
to understand because it's no longer covered in `%' sigils with its
operands backwards, and also simplifies things because we have better
machinery for papering over the differences between 32- and 64-bit
instruction sets.
Also move the `rdrand' code from `rand.c'. This makes things
significantly more complicated because it calls back into C, but it does
improve availability of a security feature, so that's good.
The only inline assembler left is a use of `rdtsc' in `perftest.c',
which is hardly critical, and the `rbit' in the ARM64 `gcm.c' code,
which has a slightly slower portable alternative.
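
For reference, a portable stand-in for `rbit' can be as simple as the
usual swap-and-mask bit reversal. The following is only a sketch: the
name `bitrev32' and the 32-bit width are assumptions for illustration,
and it is not necessarily what `gcm.c' actually does.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the order of the bits in a 32-bit word.
 * Each step swaps adjacent groups of bits, doubling the group size:
 * single bits, pairs, nibbles, bytes, and finally the two halfwords.
 */
static uint32_t bitrev32(uint32_t x)
{
  x = ((x >>  1) & 0x55555555u) | ((x & 0x55555555u) <<  1);
  x = ((x >>  2) & 0x33333333u) | ((x & 0x33333333u) <<  2);
  x = ((x >>  4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) <<  4);
  x = ((x >>  8) & 0x00ff00ffu) | ((x & 0x00ff00ffu) <<  8);
  x = (x >> 16) | (x << 16);
  return x;
}
```

This costs five shift-and-mask rounds where ARM64 needs a single
instruction, which is consistent with the portable version being only
slightly slower.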