// latency, so arrange to start a new shuffle into a temporary as
// soon as we've written out the old value.
paddd xmm0, SAVE0