// u v = SUM_{0<=i,j<n} u_i v_j t^{i+j}
//
// Suppose instead that we're given ũ = SUM_{0<=i<n} u_{n-i-1} t^i
- // and ṽ = SUM_{0<=j<n} v_{n-j-1} t^j, so the bits are backwards.
+ // and ṽ = SUM_{0<=j<n} v_{n-j-1} t^j, so the bits are backwards.
// Then
//
- // ũ ṽ = SUM_{0<=i,j<n} u_{n-i-1} v_{n-j-1} t^{i+j}
+ // ũ ṽ = SUM_{0<=i,j<n} u_{n-i-1} v_{n-j-1} t^{i+j}
// = SUM_{0<=i,j<n} u_i v_j t^{2n-2-(i+j)}
//
// which is almost the bit-reversal of u v, only it's shifted right
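Review note: the reversal identity in the comment above can be checked numerically. The following is a minimal Python sketch, not code from this patch; the helper names `clmul`, `bitrev` and `check_reversal_identity` are hypothetical, and integers stand in for polynomials over GF(2), with bit i holding the coefficient of t^i. It assumes the product u v is reversed as a 2n-bit value, in which case the shift works out to one bit.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of a and b, i.e. multiplication in GF(2)[t]."""
    r = 0
    while b:
        if b & 1:
            r ^= a          # addition in GF(2)[t] is XOR
        a <<= 1
        b >>= 1
    return r


def bitrev(x: int, width: int) -> int:
    """Reverse the low `width` bits of x."""
    r = 0
    for _ in range(width):
        r = (r << 1) | (x & 1)
        x >>= 1
    return r


def check_reversal_identity(u: int, v: int, n: int) -> bool:
    """Check that rev_n(u) * rev_n(v) == rev_2n(u * v) >> 1."""
    lhs = clmul(bitrev(u, n), bitrev(v, n))
    rhs = bitrev(clmul(u, v), 2 * n) >> 1
    return lhs == rhs


# Arbitrary 8-bit operands, chosen only for illustration.
assert check_reversal_identity(0b10110001, 0b01101101, 8)
```

The shift loses nothing because u v has degree at most 2n-2, so the lowest bit of its 2n-bit reversal is always zero.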
// Enter with u and v in the most-significant three words of q0 and
// q1 respectively, and zero in the low words, and zero in q15; leave
// with z = u v in the high three words of q0, and /junk/ in the low
- // word. Clobbers ???.
+ // word. Clobbers q1--q3, q8, q9.
// This is an inconvenient size. There's nothing for it but to do
// four multiplications, as if for the 128-bit case. It's possible