* Design of new, multi-subnet secnet protocol

Like the first (1995/6) version, we're tunnelling IP packets inside
UDP packets. To defeat various restrictions which may be imposed on us
by network providers (like the prohibition of incoming TCP
connections) we're sticking with UDP for everything this time,
including key setup. This means we have to handle retries, etc.

Other new features include being able to deal with subnets hidden
behind changing 'real' IP addresses, and the ability to choose
algorithms and keys per pair of communicating sites.

** Configuration and structure

[The original plan]

The network is made up from a number of 'sites'. These are collections
of machines with private IP addresses. The new secnet code runs on
machines which have interfaces on the private site network and some
way of accessing the 'real' internet.

Each end of a tunnel is identified by a name. Often it will be
convenient for every gateway machine to use the same name for each
tunnel endpoint, but this is not vital. Individual tunnels are
identified by their two endpoint names.

[The new plan]

It appears that people want to be able to use secnet on mobile
machines like laptops as well as to interconnect sites. In particular,
they want to be able to use their laptop in three situations:

1) connected to their internal LAN by a cable; no tunnel involved
2) connected via wireless, using a tunnel to protect traffic
3) connected to some other network, using a tunnel to access the
   internal LAN.

They want the laptop to keep the same IP address all the time.

Case (1) is simple.

Case (2) requires that the laptop run a copy of secnet, and have a
tunnel configured between it and the main internal LAN default
gateway. secnet must support the concept of a 'soft' tunnel where it
adds a route and causes the gateway to do proxy-ARP when the tunnel is
up, and removes the route again when the tunnel is down.

The usual prohibition of packets coming in from one tunnel and going
out another must be relaxed in this case (in particular, the
destination address of packets from these 'mobile station' tunnels may
be another tunnel as well as the host).

(Quick sanity check: if chiark's secnet address was in
192.168.73.0/24, would this work properly? Yes, because there will be
an explicit route to it, and proxy ARP will be done for it. Do we want
packets from the chiark tunnel to be able to go out along other
routes? No. So, spotting a 'local' address in a remote site's list of
networks isn't sufficient to switch on routing for a site. We need an
explicit option. NB packets may be routed if the source OR the
destination is marked as allowing routing [otherwise packets couldn't
get back from e.g. chiark to a laptop at greenend].)
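
The routing rule in the sanity check above can be written down as a
one-line predicate. This is only an illustrative sketch in Python: the
Site class and its allow_routing flag are hypothetical names standing
in for whatever "explicit option" the configuration ends up using.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    # Hypothetical flag standing in for the "explicit option" in the text.
    allow_routing: bool = False


def may_forward(source: Site, destination: Site) -> bool:
    """Tunnel-to-tunnel forwarding is allowed if EITHER end opted in."""
    return source.allow_routing or destination.allow_routing
```

With a laptop site marked as allowing routing and chiark not marked,
packets can flow in both directions between the two, but chiark still
cannot reach any other unmarked site through the gateway.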

[The even newer plan]

secnet sites are configured to grant access to particular IP address
ranges to the holder of a particular public key. The key can certify
other keys, which will then be permitted to use a subrange of the IP
address range of the certifying key.

This means that secnet won't know in advance (i.e. at configuration
time) how many tunnels it might be required to support, so we have to
be able to create them (and routes, and so on) on the fly.
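
The "subrange" condition can be checked mechanically. The sketch below
uses Python's standard ipaddress module; the function name
may_delegate is made up for illustration, and how certificates are
actually represented is not specified in these notes.

```python
import ipaddress


def may_delegate(certifier_range: str, requested_range: str) -> bool:
    """True if requested_range lies entirely inside certifier_range."""
    parent = ipaddress.ip_network(certifier_range)
    child = ipaddress.ip_network(requested_range)
    # subnet_of() raises TypeError for mixed IPv4/IPv6, so compare versions first.
    return child.version == parent.version and child.subnet_of(parent)
```

For example, a key holding 192.168.73.0/24 could certify another key
for 192.168.73.128/25, but not for 192.168.74.0/24 or for the wider
192.168.0.0/16.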

** VPN-level configuration

At a high level we just want to be able to indicate which groups of
users can claim ownership of which ranges of IP addresses. Assuming
these users (or their representatives) all have accounts on a single
machine, we can automate the submission of keys and other information
to make up a 'sites' file for the entire VPN.

The distributed 'sites' file should be in a more restricted format
than the secnet configuration file, to prevent attackers who manage to
distribute bogus sites files from taking over their victims' machines.

The distributed 'sites' file is read one line at a time. Each line
consists of a keyword followed by other information. It defines a
number of VPNs; within each VPN it defines a number of locations;
within each location it defines a number of sites. These VPNs,
locations and sites are turned into a secnet.conf file fragment using
a script.

Some keywords are valid at any 'level' of the distributed 'sites'
file, indicating defaults.

The keywords are:

vpn n: we are now declaring information to do with VPN 'n'. Must come first.

location n: we are now declaring information for location 'n'.

site n: we are now declaring information for site 'n'.
endsite: we're finished declaring information for the current site

restrict-nets a b c ...: restrict the allowable 'networks' for the current
  level to those in this list.
end-definitions: prevent definition of further vpns and locations, and
  modification of defaults at VPN level

dh x y: the current VPN uses the specified group; x=modulus, y=generator

hash x: which hash function to use. Valid options are 'md5' and 'sha1'.

admin n: administrator email address for current level

key-lifetime n
setup-retries n
setup-timeout n
wait-time n
renegotiate-time n

address a b: a=dnsname, b=port
networks a b c ...
pubkey x y z: x=keylen, y=encryption key, z=modulus
mobile: declare this to be a 'mobile' site
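
To make the line-at-a-time, nested-level structure concrete, here is a
minimal Python sketch of a reader for such a file. It is not the real
conversion script: the function name parse_sites, the flat tuple
output, the skipping of blank lines, and every name and value in
SAMPLE are assumptions made for illustration. Only the keywords
themselves come from the list above.

```python
# Hypothetical example input; the names, address and port are invented.
SAMPLE = """\
vpn example
hash sha1
location home
site gateway
networks 192.168.73.0/24
address gw.example.org 51396
endsite
"""


def parse_sites(text):
    """Flatten a 'sites' file into (vpn, location, site, keyword, args) tuples."""
    vpn = location = site = None
    entries = []
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue  # skipping blank lines is an assumption of this sketch
        keyword, args = words[0], words[1:]
        if keyword != "vpn" and vpn is None:
            raise ValueError("'vpn' must come first")
        if keyword == "vpn":
            vpn, location, site = args[0], None, None
        elif keyword == "location":
            location, site = args[0], None
        elif keyword == "site":
            site = args[0]
        elif keyword == "endsite":
            site = None
        else:
            # Any other keyword applies to the innermost level currently open.
            entries.append((vpn, location, site, keyword, args))
    return entries
```

A keyword seen before any 'location' or 'site' line is recorded with
those fields set to None, which is one way to represent a VPN-level
default.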

** Logging etc.

There are several possible ways of running secnet:

'reporting' only: --version, --help, etc. command line options and the
--just-check-config mode.

'normal' run: perform setup in the foreground, and then background.

'failed' run: setup in the foreground, and terminate with an error
before going to background.

'reporting' modes should never output anything except to stdout/stderr.
'normal' and 'failed' runs output to stdout/stderr before
backgrounding, then thereafter output only to log destinations.

** Protocols

*** Protocol environment:

Each gateway machine serves a particular, well-known set of private IP
addresses (i.e. the agreement over which addresses it serves is
outside the scope of this discussion). Each gateway machine has an IP
address on the interconnecting network (usually the Internet), which
may be dynamically allocated and may change at any point.

Each gateway knows the RSA public keys of the other gateways with
which it wishes to communicate. The mechanism by which this happens is
outside the scope of this discussion. There exists a means by which
each gateway can look up the probable IP address of any other.

*** Protocol goals:

The ultimate goal of the protocol is for the originating gateway
machine to be able to forward packets from its section of the private
network to the appropriate gateway machine for the destination
machine, in such a way that it can be sure that the packets are being
sent to the correct destination machine, the destination machine can
be sure that the source of the packets is the originating gateway
machine, and the contents of the packets cannot be understood other
than by the two communicating gateways.

XXX not sure about the address-change stuff; leave it out of the first
version of the protocol. From experience, IP addresses seem to be
quite stable so the feature doesn't gain us much.

**** Protocol sub-goal 1: establish a shared key

Definitions:

A is the originating gateway machine name
B is the destination gateway machine name
A+ and B+ are the names with optional additional data, see below
PK_A is the public RSA key of A
PK_B is the public RSA key of B
PK_A^-1 is the private RSA key of A
PK_B^-1 is the private RSA key of B
x is the fresh private DH key of A
y is the fresh private DH key of B
k is g^xy mod m
g and m are generator and modulus for Diffie-Hellman
nA is a nonce generated by A
nB is a nonce generated by B
iA is an index generated by A, to be used in packets sent from B to A
iB is an index generated by B, to be used in packets sent from A to B
i? is the appropriate index for the receiver

Note that 'i' may be re-used from one session to the next, whereas 'n'
is always fresh.

The optional additional data after the sender's name consists of some
initial subset of the following list of items:
 * A 32-bit integer with a set of capability flags, representing the
   abilities of the sender.
 * In MSG3/MSG4: a 16-bit integer being the sender's MTU, or zero.
   (In other messages: nothing.) See below.
 * More data which is yet to be defined and which must be ignored
   by receivers.
The optional additional data after the receiver's name is not
currently used. If any is seen, it must be ignored.
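
A receiver therefore has to cope with zero, one or two known items
followed by arbitrary trailing bytes. The Python sketch below shows one
way to do that. The function name is hypothetical, and big-endian
(network) byte order is an assumption: these notes do not state the
byte order or how the name itself is framed on the wire.

```python
import struct


def parse_additional_data(data: bytes, is_msg3_or_msg4: bool):
    """Return (capabilities, mtu); an item that is absent is reported as None."""
    capabilities = mtu = None
    if len(data) >= 4:
        # Assumed big-endian 32-bit capability flags.
        (capabilities,) = struct.unpack_from(">I", data, 0)
        if is_msg3_or_msg4 and len(data) >= 6:
            # Assumed big-endian 16-bit MTU; only present in MSG3/MSG4.
            (mtu,) = struct.unpack_from(">H", data, 4)
    # Anything after the known items is deliberately ignored.
    return capabilities, mtu
```

Because the items form an "initial subset", the MTU is only looked for
once the capability word has been found, and unknown trailing data
never causes an error.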

Capability flag bits must be in one of the following two categories:

1. Early capability flags must be advertised in MSG1 or MSG2, as
   applicable. If MSG3 or MSG4 advertise any "early" capability bits,
   MSG1 or MSG2 (as applicable) must have advertised them too. Sadly,
   advertising an early capability flag will produce MSG1s which are
   not understood by versions of secnet which predate the capability
   mechanism.

2. Late capability flags are advertised in MSG2 or MSG3, as
   applicable. They may also appear in MSG1, but this is not
   guaranteed. MSG4 must advertise the same set as MSG2.

No capability flags are currently defined. Unknown capability flags
should be treated as late ones.


MTU handling

In older versions of secnet, secnet was not capable of fragmentation
or sending ICMP Frag Needed. Administrators were expected to configure
consistent MTUs across the network.

It is still the case in the current version that the MTUs need to be
configured reasonably coherently across the network: the allocated
buffer sizes must be sufficient to cope with packets from all other
peers.

However, provided the buffers are sufficient, all packets will be
processed properly: a secnet receiving a packet larger than the
applicable MTU for its delivery will either fragment it, or reject it
with ICMP Frag Needed.

The MTU additional data field allows secnet to advertise an MTU to the
peer. This allows the sending end to handle overlarge packets, before
they are transmitted across the underlying public network. This can
therefore be used to work around underlying network braindamage
affecting large packets.

If the MTU additional data field is zero or not present, then the peer
should use locally-configured MTU information (normally, its local
netlink MTU) instead.

If it is nonzero, the peer may send packets up to the advertised size
(and if that size is bigger than the peer's administratively
configured size, the advertiser promises that its buffers can handle
such a large packet).
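
The two rules above amount to a simple choice on the sending side. A
minimal Python sketch, with a hypothetical function name and with None
used to mean "field not present":

```python
def mtu_towards_peer(advertised_mtu, local_mtu):
    """Pick the largest packet size we may send to a peer.

    advertised_mtu is the value from the peer's MTU field (None if the
    field was absent); local_mtu is our locally configured value.
    """
    if not advertised_mtu:  # covers both None (absent) and 0
        return local_mtu
    # A nonzero advertisement wins even when it is larger than local_mtu:
    # the advertiser has promised its buffers can take packets of that size.
    return advertised_mtu
```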

A secnet instance should not assume that just because it has
advertised an MTU which is lower than usual for the VPN, the peer will
honour it, unless the administrator knows that the peers are
sufficiently modern to understand the MTU advertisement option. So
secnet will still accept packets which exceed the link MTU (whether
negotiated or assumed).


Messages:

1) A->B: *,iA,msg1,A+,B+,nA

The index written '*' must be encoded as 0. (However, it is permitted
for a site to use zero as its "index" for another site.)

2) B->A: iA,iB,msg2,B+,A+,nB,nA

(The order of B and A reverses in alternate messages so that the same
code can be used to construct them...)

3) A->B: {iB,iA,msg3,A+,B+,[chosen-transform],nA,nB,g^x mod m}_PK_A^-1

If message 1 was a replay then A will not generate message 3, because
it doesn't recognise nA.

If message 2 was from an attacker then B will not generate message 4,
because it doesn't recognise nB.

4) B->A: {iA,iB,msg4,B+,A+,nB,nA,g^y mod m}_PK_B^-1

At this point, A and B share a key, k. B must keep retransmitting
message 4 until it receives a packet encrypted using key k.

5) A: iB,iA,msg5,(ping/msg5)_k

6) B: iA,iB,msg6,(pong/msg6)_k

(Note that these are encrypted using the same transform that's used
for normal traffic, so they include sequence number, MAC, etc.)

The ping and pong messages can be used by either end of the tunnel at
any time, but using msg0 as the unencrypted message type indicator.
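
As a reminder of why both ends arrive at the same k after messages 3
and 4, here is the Diffie-Hellman arithmetic in Python. The numbers are
toy values chosen only so the result can be checked by hand; they are
far too small for real use, and the function names are illustrative.

```python
def dh_public(g: int, private: int, m: int) -> int:
    """The value sent on the wire: g^private mod m."""
    return pow(g, private, m)


def dh_shared(peer_public: int, private: int, m: int) -> int:
    """The shared key: (peer's public value)^private mod m."""
    return pow(peer_public, private, m)


# Toy parameters (NOT secure): modulus m, generator g, private keys x and y.
m, g = 23, 5
x, y = 6, 15

gx = dh_public(g, x, m)    # carried in message 3
gy = dh_public(g, y, m)    # carried in message 4
k_at_a = dh_shared(gy, x, m)
k_at_b = dh_shared(gx, y, m)
```

Both k_at_a and k_at_b equal g^xy mod m, which is the k of the
definitions above.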

**** Protocol sub-goal 2: end the use of a shared key

7) i?,i?,msg0,(end-session/msg7,A,B)_k

This message can be sent by either party. Once sent, k can be
forgotten. Once received and checked, k can be forgotten. No need to
retransmit or confirm reception. It is suggested that this message be
sent when a key times out, or the tunnel is forcibly terminated for
some reason.

**** Protocol sub-goal 3: send a packet

8) i?,i?,msg0,(send-packet/msg9,packet)_k

**** Other messages

9) i?,i?,NAK (NAK is encoded as zero)

If the link-layer can't work out what to do with a packet (session has
gone away, etc.) it can transmit a NAK back to the sender.

This can alert the sender to the situation where the sender has a key
but the receiver doesn't (e.g. because it has been restarted). The
sender, on receiving the NAK, will try to initiate a key exchange.

Forged (or overly delayed) NAKs can cause wasted resources due to
spurious key exchange initiation, but there is a limit on this because
of the key exchange retry timeout.

10) i?,i?,msg8,A,B,nA,nB,msg?

This is an obsolete form of NAK packet which is not sent by any even
vaguely recent version of secnet. (In fact, there is no evidence in
the git history of it ever being sent.)

This message number is reserved.

11) *,*,PROD,A,B

Sent in response to a NAK from B to A. Requests that B initiates a
key exchange with A, if B is willing and lacks a transport key for A.
(If B doesn't have A's address configured, implicitly supplies A's
public address.)

This is necessary because if one end of a link (B) is restarted while
a key exchange is in progress, the following bad state can persist:
the non-restarted end (A) thinks that the key is still valid and keeps
sending packets, but B either doesn't realise that a key exchange with
A is necessary or (if A is a mobile site) doesn't know A's public IP
address.

Normally in these circumstances B would send NAKs to A, causing A to
initiate a key exchange. However if A and B were already in the
middle of a key exchange then A will not want to try another one until
the first one has timed out ("setup-timeout" x "setup-retries") and
then the key exchange retry timeout ("wait-time") has elapsed.

However if B's setup has timed out, B would be willing to participate
in a key exchange initiated by A, if A could be induced to do so.
This is the purpose of the PROD packet.

We send no more PRODs than we would want to send data packets, to
avoid a traffic amplification attack. We also send them only in state
WAIT, as in other states we wouldn't respond favourably. And we only
honour them if we don't already have a key.
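
The three conditions in the previous paragraph can be summarised as two
small predicates. This Python sketch is illustrative only: the State
enum is a stand-in (WAIT is the only state named in these notes, and
OTHER is a placeholder for all the rest), and the function and
parameter names are invented.

```python
from enum import Enum, auto


class State(Enum):
    WAIT = auto()   # the only state named in the text
    OTHER = auto()  # placeholder for every other state


def should_send_prod(state: State, would_send_data_packet: bool) -> bool:
    """Send a PROD only in WAIT, and no more often than a data packet."""
    return state is State.WAIT and would_send_data_packet


def should_honour_prod(have_transport_key: bool) -> bool:
    """Act on a received PROD only if we lack a key for that peer."""
    return not have_transport_key
```

Tying PROD transmission to moments when a data packet would have been
sent anyway is what keeps an attacker from using PRODs to amplify
traffic.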

With PROD, the period of broken communication due to a key exchange
interrupted by a restart is limited to the key exchange total
retransmission timeout, rather than also including the key exchange
retry timeout.
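
Put as arithmetic, using the configuration keywords from the 'sites'
file, the bound looks like this. The Python function is a sketch with
invented names, and treating the values as milliseconds is an
assumption made only so the example has units.

```python
def outage_bound_ms(setup_timeout_ms: int, setup_retries: int,
                    wait_time_ms: int, with_prod: bool) -> int:
    """Upper bound on broken communication after an interrupted key exchange."""
    total_retransmission = setup_timeout_ms * setup_retries
    if with_prod:
        return total_retransmission
    # Without PROD the key exchange retry timeout is added on top.
    return total_retransmission + wait_time_ms
```

For instance, with a hypothetical setup-timeout of 1000, setup-retries
of 10 and wait-time of 20000, the bound drops from 30000 to 10000 once
PROD is in use.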