Commit | Line | Data |
---|---|---|

759513d1 | 1 | Catacomb's random number generators |

2 | ||

3 | ||

4 | Catacomb contains a large number of pseudorandom number | |

5 | generators. Some are dedicated to generating pseudorandom | |

6 | numbers; others are ciphers in output-feedback mode. Some are | |

7 | cryptographically strong, others are fast or simple. All have | |

8 | been statistically tested and come out OK. | |

9 | ||

10 | ||

11 | The `rand' generator | |

12 | ||

13 | Cryptographic applications need a source of true random data. | |

14 | The `rand' generator is Catacomb's true-random-data source. | |

15 | ||

16 | The design is my own. I wasn't happy with any existing designs, | |

17 | and I'm still not -- even Counterpane's Yarrow gives me an | |

18 | uneasy feeling[1]. I'm not aware of any cryptanalytic attack | |

19 | against my generator, and I'd very much like to hear if anyone | |

20 | else knows differently. | |

21 | ||

22 | The generator's state is held in an object of type `rand_pool'. | |

23 | Data can be added into the pool using the `rand_add' function. | |

24 | Random numbers can be extracted by calling `rand_get'. | |

25 | ||

26 | It's possible to attach a `noise acquisition' function to a | |

27 | rand_pool. The function `rand_getgood' extracts a number of | |

28 | bytes from the generator, calling the noise acquisition function | |

29 | if it thinks it's low on randomness. This shouldn't actually be | |

30 | necessary once the generator has been topped up once. It makes | |

31 | the generator very slow. | |

32 | ||

33 | It's also possible to key the generator. The output | |

34 | transformation then depends on the key, making it even harder | |

35 | for an adversary to work out anything useful about the | |

36 | generator. | |

37 | ||

38 | The underlying transformation used by the generator is | |

39 | ||

40 | x' = E_{H(x)}(x) | |

41 | ||

42 | i.e., hash, use the hash as a key, and encrypt. | |

43 | ||

44 | ||

45 | [1] The Yarrow paper states that it offers 160 bits-worth of | |

46 | security, and that an adversary can't distinguish its output | |

47 | from real random bits. I have an attack which takes about | |

48 | 2^64 64-bit outputs and tells the difference between | |

49 | Yarrow-160 and real random data, by analysing the frequency | |

50 | of adjacent 64-bit blocks being equal. | |

51 | ||

52 | ||

53 | Noncryptographic generators | |

54 | ||

55 | Two pseudorandom-number generators are supplied which have no | |

56 | cryptographic strength whatever. They are, respectively, a | |

57 | traditional linear-congruential generator (lcrand) and a | |

58 | Fibonacci generator (fibrand). | |

59 | ||

60 | The linear-congruential generator has three (fixed) parameters, | |

61 | a, c and p, and calculates | |

62 | ||

63 | x[i] = (a x[i - 1] + c) mod p | |

64 | ||

65 | The prime p is fairly large (2^32 - 5, in fact), and a and c are | |

66 | carefully chosen. The generator has a period of p - 1. There | |

67 | is one fixed point (i.e., it transforms to itself rather than | |

68 | being in the cycle) which must be avoided. | |

69 | ||

70 | The Fibonacci generator works by adding together old results: | |

71 | ||

72 | x[i] = (x[i - 24] + x[i - 55]) mod 2^32 | |

73 | ||

74 | It has a very long period, and is the fastest pseudorandom | |

75 | number generator in Catacomb. | |

76 | ||

77 | These two generators are very useful when the cryptographic | |

78 | strength of the output isn't important, for example when | |

79 | choosing random numbers for primality testing. | |

80 | ||

81 | ||

82 | The Blum-Blum-Shub generator | |

83 | ||

84 | This is by far the strongest and by even further the slowest | |

85 | pseudorandom-number generator in Catacomb. Its output has been | |

86 | proven to be indistinguishable from random data by *any* | |

87 | polynomial-time test, assuming that factoring large numbers is | |

88 | difficult. | |

89 | ||

90 | Use the BBS generator when you're feeling really paranoid, and | |

91 | you've got a few hours with nothing to do. | |

92 | ||

93 | ||

94 | Output-feedback ciphers | |

95 | ||

96 | Ciphers in output-feedback mode simply emit a pseudorandom | |

97 | stream. To construct a cipher, you XOR the stream with the | |

98 | plaintext. It's even easier just to use the output as random | |

99 | numbers. | |

100 | ||

101 | ||

102 | Generic interface | |

103 | ||

104 | There's a fairly comprehensive generic interface for Catacomb's | |

105 | various generators. | |

106 | ||

107 | Each generator provides a function which returns a pointer to a | |

108 | `generic pseudorandom-number generator', of type `grand'. | |

109 | ||

110 | A `grand' is a structured type containing one member, `ops', | |

111 | which is a pointer to various useful things. Let `r' be a | |

112 | pointer to a generic pseudorandom-number generator; then, | |

113 | ||

114 | r->ops->name The name of the generator. | |

115 | ||

116 | r->ops->max One greater than the maximum output of | |

117 | the `raw' function, or zero if it just | |

118 | emits words. | |

119 | ||

120 | r->ops->misc(r, op, ...) | |

121 | Performs a miscellaneous operation. The | |

122 | various standard `op' values are | |

123 | described below. | |

124 | ||

125 | r->ops->destroy(r) Destroy the generic generator. | |

126 | ||

127 | r->ops->raw(r) Return a raw output from the generator. | |

128 | For `lcrand', this is an integer in the | |

129 | range [0, p); for other generators, it's | |

130 | the same as `word'. | |

131 | ||

132 | r->ops->byte(r) Return an octet. The result is | |

133 | uniformly distributed over the integers | |

134 | in the interval [0, 256). | |

135 | ||

136 | r->ops->word(r) Returns a word. The result is uniformly | |

137 | distributed over the integers in the | |

138 | interval [0, 2^32). | |

139 | ||

140 | r->ops->range(r, l) Returns an integer unformly distributed | |

141 | in the interval [0, l), l < 2^32. | |

142 | ||

143 | r->ops->fill(r, p, sz) Fill the sz bytes at address p with | |

144 | uniformly distributed bytes (i.e., | |

145 | integers in the interval [0, 256). | |

146 | ||

147 | All of these operations are supported by all generators. Some | |

148 | may be inefficient, however, since they're synthesized using | |

149 | operations more suited to the particular generator. For | |

150 | example, `word' is not efficient on the linear-congruential | |

151 | generator, but is for the Fibonacci generator. | |

152 | ||

153 | The misc-ops are as follows: | |

154 | ||

155 | misc(r, GRAND_CHECK, op) Return nonzero if `op' is a | |

156 | recognized misc-op for this | |

157 | generator. | |

158 | ||

159 | misc(r, GRAND_SEEDINT, s) Seed the generator using the | |

160 | integer s. | |

161 | ||

162 | misc(r, GRAND_SEEDUINT32, s) Seed the generator using the | |

163 | 32-bit word s. | |

164 | ||

165 | misc(r, GRAND_SEEDBLOCK, p, sz) Seed the generator using the | |

166 | block of sz bytes starting at | |

167 | address p. | |

168 | ||

169 | misc(r, GRAND_SEEDMP, m) Seed the generator using the | |

170 | multiprecision integer m. | |

171 | ||

172 | misc(r, GRAND_SEEDRAND, r') Seed the generator using some | |

173 | output from the generic pseudo- | |

174 | random number generator r'. | |

175 | ||

176 | ||

177 | Statistical testing | |

178 | ||

179 | I generated 10M of data with each of Catacomb's random number | |

180 | generators, and tested them with George Marsaglia's DIEHARD | |

181 | suite. While the random data is not included here, the | |

182 | following trivial Bourne shell script will generate the exact | |

183 | output streams I used, assuming that the supplied `rspit' | |

184 | program is in the current PATH: | |

185 | ||

186 | for i in des 3des idea blowfish rc5 rc4; do | |

187 | rspit $i -kseed -z10m -o$i.rand | |

188 | done | |

189 | for i in lcrand fibrand; do | |

190 | rspit $i -p -s0 -z10m -o$i.rand | |

191 | done | |

192 | rspit rand -p -z10m -orand.rand | |

193 | nice rspit bbs -p -s2 -z10m -obbs.rand | |

194 | ||

195 | (Note that generating the BBS output data can take over an | |

196 | hour. Make sure you have a good book to read.) | |

197 | ||

198 | Looking at the results, I think that they all passed the tests | |

199 | relatively convincingly. | |

200 | ||

201 | I've not actually included the DIEHARD output files in the | |

202 | distribution, because they're quite large, and anyone can | |

203 | reproduce my results exactly using the publically available | |

204 | DIEHARD distribution and the code supplied. If you do actually | |

205 | want them, send me some mail and I'll send you a 60K tar.gz | |

206 | file by (approximate) return. | |

207 | -- | |

208 | [mdw] | |

209 | ||

210 | \f | |

211 | Local variables: | |

212 | mode: text | |

213 | End: |