1 \xcalways\section{Basics}\x
2
3 \xcalways\subsection{Introduction and motivation}\x
4
5 \begin{slide}
6 \topic{joke}
7 \head{Mandatory joke}
8
9 An elderly Frenchman rises every morning at 5, goes out to the street in
10 front of his house, and sprinkles a white powder up and down the street.
11
12 One day, a neighbour, who has watched his routine for many years, confronts
13 him. `What is this powder you sprinkle on the street every morning,
14 Pierre?'
15
16 `It is elephant powder, \emph{mon ami},' the gentleman says. `It keeps the
17 elephants away.'
18
19 `But Pierre,' says the neighbour. `Everybody knows that there are no
20 elephants in France.'
21
22 Says the older man matter-of-factly, `I guess it must be working, then.'
23 \end{slide}
24
25 The joke has its mandatory corresponding serious message. Cryptography is
26 like elephant powder. If it seems to work, and keeps the attackers -- our
27 elephants -- away, then we trust it. This isn't really very satisfactory.
28 When the elephant powder fails, a 2-ton elephant turns up, with three of its
29 chums in a Mini, and you notice it hiding upside-down in your bowl of
30 custard. Let's face it: elephants aren't good at being surreptitious.
31
32 But if your cryptography is no good, you may never know.
33
34 \begin{slide}
35 \topic{serious message}
36 \head{Cryptography is elephant powder}
37
So, what can we do about the situation?
39 \begin{itemize}
40 \item Design simple cryptographic primitives, e.g., block ciphers, hash
41 functions. Develop techniques for analysing them, and attempt to stay
42 ahead of the bad guy.
43 \item Build useful constructions from trusted primitives in a modular way.
44 \emph{Prove that the constructions are secure.}
45 \end{itemize}
46 Here we look at this second part of the approach.
47 \end{slide}
48
49 \xcalways\subsection{Approach}\x
50
51 \begin{slide}
52 \head{The provable security approach}
53
54 \begin{itemize}
55 \item Define notions of security. Now we know what to aim for, and what to
56 expect of our components.
57 \item Prove how the security of a construction relates to the security of
58 its primitives. If it breaks, we know who to blame.
59 \end{itemize}
60 \end{slide}
61
62 \begin{slide}
63 \topic{adversaries}
64 \head{Adversaries}
65
66 We model our adversary as a \emph{probabilistic algorithm}; i.e., the
67 algorithm is allowed to \emph{flip coins} to make decisions. The
68 adversary's output (for some input) follows a probability distribution. We
69 define what it means for an adversary to \emph{break} our construction, and
70 examine the probability with which this happens.
71
72 We provide the adversary with \emph{oracles}:
73 \begin{itemize}
74 \item Oracles compute using secrets hidden from the adversary.
75 \item We count the number of \emph{queries} made to an oracle.
76 \item We can restrict the types of queries the adversary makes.
77 \end{itemize}
78
79 Oracles are written as superscripts. For example, an adversary given a
80 chosen-plaintext oracle might be written as $A^{E_K(\cdot)}$.
81 \end{slide}
82
83 \begin{slide}
84 \topic{the asymptotic approach}
85 \head{The asymptotic approach, 1}
86
87 A function $\nu\colon \N \to \R$ is \emph{negligible} if, for any integer
88 $c$, there exists an $n \in \N$ such that $0 \le \nu(k) < k^{-c}$ for all
89 $k \ge n$. That is, $\nu(k)$ is `eventually' less than any polynomial
90 function of $k$.
91
92 We examine families of constructions, with a \emph{security parameter} $k$.
We say that a family of constructions is (asymptotically) secure in some
sense if, for any polynomial $p(k)$, there is a negligible function
$\nu(k)$ such that no adversary which runs for time $p(k)$ against the
construction with parameter $k$ has success probability better than
$\nu(k)$.
97 \end{slide}
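For example, $\nu(k) = 2^{-k}$ is negligible, since $2^k$ eventually
outgrows $k^c$ for any fixed $c$.  On the other hand, $\nu(k) = k^{-3}$ is
\emph{not} negligible: taking $c = 4$, there is no point beyond which
$k^{-3} < k^{-4}$.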
98
99 \begin{slide}
100 \head{The asymptotic approach, 2}
101
102 Suppose we build an encryption scheme from a one-way function. We'd like
103 to prove that the encryption is good if the one-way function is secure. We
104 do this by contradiction:
105 \begin{enumerate}
106 \item Suppose an adversary $A$ breaks the encryption scheme with
107 better-than-negligible probability.
108 \item Show a polynomial-time \emph{reduction}: an algorithm which uses $A$
109 to break the one-way function, in polynomial time, and with
110 better-than-negligible probability,
111 \item Claim that this violates the assumption of a secure one-way function.
112 \end{enumerate}
113
This doesn't directly help with real, fixed-size constructions: we don't
know where the asymptotics set in, and they can conceal large constants.
It's still better than nothing.
117 \end{slide}
118
119 \begin{slide}
120 \topic{the concrete (quantitative) approach}
121 \head{The concrete (quantitative) approach}
122
123 We constrain the resources we allow the adversary:
124 \begin{itemize}
125 \item Running time (including program size).
126 \item Number of oracle queries.
127 \item Maximum size of oracle queries.
128 \end{itemize}
We say that something is \emph{$(t, q, \epsilon)$-secure} if no adversary
130 which runs in time $t$ and makes $q$ oracle queries can break it with
131 probability better than $\epsilon$.
132
133 We make statements like `foo is $(t, q, 2 q \epsilon)$-secure if bar is $(t
134 + O(q), 2 q, \epsilon)$-secure'.
135
136 This is a much more satisfactory approach. However, we still have to
137 \emph{assume} the security of our primitive operations.
138 \end{slide}
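To get a feel for such a statement, plug in some purely illustrative
numbers: if bar is $(t + O(q), 2 q, \epsilon)$-secure with $q = 2^{20}$
and $\epsilon = 2^{-60}$, then foo is $(t, 2^{20}, 2^{-39})$-secure, since
$2 q \epsilon = 2^{21} \cdot 2^{-60} = 2^{-39}$.  The concrete parameters
make the degradation in the bound explicit.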
139
140 \xcalways\subsection{Notation}\x
141
142 \begin{slide}
143 \topic{boolean operators}
144 \head{Notation, 1: boolean operators}
145
146 If $P$ and $Q$ are \emph{predicates} -- i.e., either true or false -- then:
147 \begin{itemize}
\item $P \land Q$ is true if both $P$ \emph{and} $Q$ are true;
149 \item $P \lor Q$ is true if either $P$ \emph{or} $Q$ (or both) is true;
150 \item $\lnot P$ is true if $P$ is false; and
151 \item $P \implies Q$ is true if $Q$ is true or $P$ is false.
152 \end{itemize}
153 \end{slide}
154
155 \begin{slide}
156 \topic{sets}
157 \head{Notation, 2: sets}
158
159 For our purposes, we can think of sets as being collections of objects.
160
161 We use the usual notations for set membership ($x \in X$), intersection ($X
162 \cap Y$), union ($X \cup Y$) and subset containment ($X \subseteq Y$). The
163 \emph{empty set}, which contains no elements, is written $\emptyset$.
164
165 The notation $\{f(x) \mid P(x)\}$ describes the set containing those items
166 $f(x)$ for which the predicate $P(x)$ is true.
167
168 The \emph{cardinality} $|X|$ of a (finite) set $X$ is the number of
169 elements in the set.
170
171 The power set $\powerset(X)$ of a set $X$ is the set of all subsets of $X$.
172 \end{slide}
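For example, $\{ x^2 \mid x \in \N \land x < 4 \} = \{0, 1, 4, 9\}$, and
$\powerset(\{0, 1\}) = \{\emptyset, \{0\}, \{1\}, \{0, 1\}\}$.  In
general, a finite set $X$ has $|\powerset(X)| = 2^{|X|}$ subsets.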
173
174 \begin{slide}
175 \head{Notation, 3: sets (cont.)}
176
The \emph{cartesian product} $X \times Y$ of two sets is the set of all
ordered pairs $\{ (x, y) \mid x \in X \land y \in Y \}$.  We use exponents
179 to indicate the product of a set with itself: hence, $X^2 = X \times X$.
180
181 A \emph{relation} $R$ is a subset of a cartesian product. We write $R(x,
182 y)$ if $(x, y) \in R$. Relations between two sets are often written as
183 infix symbols: e.g., $x \sim y$.
184 \end{slide}
185
186 \begin{slide}
187 \topic{functions}
188 \head{Notation, 4: functions}
189
190 A \emph{function} $f\colon X \to Y$ is a mapping which assigns every
191 element $x$ in the \emph{domain} $X$ a corresponding element $f(x)$ in the
192 \emph{range} (or sometimes \emph{codomain}) $Y$. The notation $\dom f$
193 describes the domain of an arbitrary function; $\ran f$ describes its
194 range.
195
196 We sometimes apply the function notation to sets, indicating that the
197 function should be applied pointwise; i.e., $f(Z) = \{ f(z) \mid z \in Z
198 \}$. The \emph{image} of a function $f$ is the set $f(\dom f)$.
199
If $f\colon X \to Y$ maps distinct elements to distinct elements, i.e.,
$f(x) = f(x') \implies x = x'$ for all $x, x' \in X$, then we say $f$ is
\emph{injective} (or
202 \emph{1-to-1}). If $f(X) = Y$ then we say that it is \emph{surjective} (or
203 \emph{onto}). If $f$ is both injective and surjective then we say that it
204 is \emph{bijective}. In this case, there is a well-defined inverse
205 $f^{-1}\colon Y \to X$ defined by $f(f^{-1}(y)) = y$ for all $y \in Y$.
206 \end{slide}
207
208 \begin{slide}
209 \head{Notation, 5: functions (cont.)}
210
211 We can consider a function $f\colon X \to Y$ to be a particular sort of
212 relation $f \subseteq X \times Y$, subject to the constraint that if $(x,
213 y) \in f$ and $(x, y') \in f$ then $y = y'$.
214
215 We shall use this view in some of the algorithms we present. In addition,
216 we shall use the \emph{maplet} notation $x \mapsto y$ rather than the
217 ordered pair notation $(x, y)$ to reinforce the notion of a mapping.
218
219 We might write, for example,
220 \begin{program}
221 $f \gets f \cup \{ x \mapsto y \}$;
222 \end{program}
223 to augment a function by the addition of a new mapping. This is clearly
224 only valid if $x \notin \dom f$ (or $f(x) = y$) initially.
225 \end{slide}
226
227 \begin{slide}
228 \topic{strings}
229 \head{Notation, 6: strings}
230
231 An \emph{alphabet} is a finite set of \emph{symbols}. The one we'll use
232 most of the time is the set $\Sigma = \{0, 1\}$ of \emph{bits}.
233
234 Suppose $A$ is an alphabet. The set of sequences of exactly $n$ symbols
235 from $A$ is written $A^n$. Hence, $\{0, 1\}^{64}$ is the set of all 64-bit
sequences.  The set of (finite) \emph{strings} over an alphabet $A$ is
$A^* = \bigcup_{i \in \N} A^i$.  The empty string is named $\emptystring$.
238
239 The \emph{length} of a string $a \in A^*$, written $|a|$, is the natural
240 number $n$ where $a \in A^n$.
241
242 If $x, y \in A^*$ are strings then their \emph{concatenation}, written $x
243 \cat y$, or sometimes just $x y$, is the result of writing the symbols in
244 $x$ followed by the symbols in $y$, in order. We have $|x y| = |x| + |y|$.
245 \end{slide}
246
247 \begin{slide}
248 \head{Notation, 7: strings (cont.)}
249
There are natural (bijective) mappings between bit strings and natural
251 numbers.
252
253 If $x = x_0 x_1 \ldots x_{n-1} \in \{0, 1\}^*$ then we can associate with
254 it the natural number
255 \[ \overrightarrow{x} = \sum_{0 \le i < n} 2^i x_i. \]
256
257 The other natural mapping is
258 \[ \overleftarrow{x} = \sum_{0 \le i < n} 2^{n-i-1} x_i. \]
259 It doesn't matter which you choose, as long as you're consistent.
260
261 For simplicity's sake, we shall tend to switch between strings and the
numbers (and occasionally more exotic objects) they represent implicitly.
263 \end{slide}
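As a quick example, take $x = 1011$.  Then $\overrightarrow{x} = 1 + 0
\cdot 2 + 1 \cdot 4 + 1 \cdot 8 = 13$ (the `little-endian' reading),
while $\overleftarrow{x} = 1 \cdot 8 + 0 \cdot 4 + 1 \cdot 2 + 1 \cdot 1 =
11$ (the `big-endian' reading).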
264
265 \begin{slide}
266 \topic{parsing}
267 \head{Notation, 8: parsing}
268
269 We'll find it useful to be able to break up strings in the algorithms we
270 present. We use the statement
271 \begin{program}
272 \PARSE $x$ \AS $n_0\colon x_0, n_1\colon x_1, \ldots, n_k\colon x_k$;
273 \end{program}
274 to mean that the string $x$ is broken up into individual strings $x_0$,
275 $x_1$, \ldots, $x_k$, such that
276 \begin{itemize}
277 \item $x = x_0 \cat x_1 \cat \cdots \cat x_k$; and
278 \item $|x_i| = n_i$ for all $0 \le i \le k$.
279 \end{itemize}
280 We may omit one of the $n_i$ length indicators, since it can be deduced
281 from the length of the string $x$ and the other $n_j$.
282 \end{slide}
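For example, if $|x| = 128$, then the statement
\begin{program}
  \PARSE $x$ \AS $32\colon x_0, x_1, 16\colon x_2$;
\end{program}
sets $x_0$ to the first 32 bits of $x$, $x_2$ to the last 16 bits, and
$x_1$ to the $128 - 32 - 16 = 80$ bits in between.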
283
284 \begin{slide}
285 \topic{vectors}
286 \head{Notation, 9: vectors}
287
288 A \emph{vector} $\vect{x}$ is a finite ordered collection of elements from
289 some set $X$. If $\vect{x}$ contains $n$ elements then we write $\vect{x}
290 \in X^n$, and that $|\vect{x}| = n$. We write the individual elements as
291 $\vect{x}[0], \vect{x}[1], \ldots, \vect{x}[n - 1]$.
292
293 We shall use set membership notation for vectors; i.e., we write $x \in
294 \vect{x}$ if there is an $i$ ($0 \le i < |\vect{x}|$) such that
295 $\vect{x}[i] = x$.
296
297 When we apply functions to vectors, we mean that they are applied
298 pointwise, as for sets. Thus, if we write that $\vect{y} =
299 f(\vect{x})$ then $|\vect{y}| = |\vect{x}|$ and $\vect{y}[i] =
300 f(\vect{x}[i])$ for all $0 \le i < |\vect{x}|$.
301 \end{slide}
302
303 \begin{slide}
304 \topic{distributions and randomness}
305 \head{Notation, 10: distributions and randomness}
306
307 A \emph{probability distribution} over a (countable) set $X$ is a
308 function $\mathcal{D}\colon X \to [0, 1]$ such that
309 \[ \sum_{x \in X} \mathcal{D}(x) = 1. \]
310
311 The \emph{support} of $\mathcal{D}$, written $\supp \mathcal{D}$, is the
312 set $\{ x \in X \mid \mathcal{D}(x) \ne 0 \}$; i.e., those elements of $X$
313 which occur with nonzero probability.
314
315 We write $x \getsr \mathcal{D}$ in algorithms to indicate that $x$ is to be
316 chosen independently at random, according to the distribution
$\mathcal{D}$.  The notation $x \inr \mathcal{D}$ indicates that $x$ has
been chosen independently at random according to $\mathcal{D}$.
319
320 The \emph{uniform distribution} over a (finite) set $X$ is
321 $\mathcal{U}_X\colon X \to [0, 1]$ defined by $\mathcal{U}_X(x) = 1/|X|$
322 for all $x \in X$. We shall write $x \getsr X$ and $x \inr X$ as
323 convenient shorthands, meaning $x \getsr \mathcal{U}_X$ and $x \inr
324 \mathcal{U}_X$ respectively.
325 \end{slide}
326
327 \xcalways\subsection{Background}\x
328
329 \begin{slide}
330 \topic{distinguishability}
331 \head{Distinguishability, 1}
332
333 Suppose that $\mathcal{X}$ and $\mathcal{Y}$ are two probability
334 distributions.
335
336 Let $X$ be a random variable distributed according to $\mathcal{X}$, and
337 let $Y$ be a random variable distributed according to $\mathcal{Y}$. We
say that $\mathcal{X}$ and $\mathcal{Y}$ are \emph{identically
339 distributed} if, for all possible values $x$ of $X$, we have
340 \[ \Pr[X = x] = \Pr[Y = x]. \]
341
342 Equivalently, we require that, for all $x \in \supp \mathcal{X}$ we have
\[ x \in \supp \mathcal{Y} \land \mathcal{X}(x) = \mathcal{Y}(x). \]
344 \end{slide}
345
346 \begin{slide}
347 \head{Distinguishability, 2: statistical closeness}
348
349 Now we generalize the setting slightly. Consider two \emph{families} of
350 distributions, $\{\mathcal{X}_k\}_{k \in \N}$ and $\{\mathcal{Y}_k\}_{k \in
351 \N}$, parameterized by a security parameter $k$.
352
353 Fix a value of $k$. Again, let $X$ be distributed according to
354 $\mathcal{X}_k$, and let $Y$ be distributed according to $\mathcal{Y}_k$.
355 We say that $\{\mathcal{X}_k\}_{k \in \N}$ and $\{\mathcal{Y}_k\}_{k \in
356 \N}$ are \emph{statistically close} if there is a negligible function
$\nu(\cdot)$ such that for any $k \in \N$, and for all $x \in
\supp \mathcal{X}_k \cup \supp \mathcal{Y}_k$,
359 \[ |{\Pr[X = x]} - \Pr[Y = x]| \le \nu(k). \]
360
(Reminder: Saying that $\nu\colon \N \to \R$ is \emph{negligible} means
that, for any $c \in \Z$ there is an $n \in \N$ such that $\nu(k) <
k^{-c}$ for all $k \ge n$.)
364 \end{slide}
365
366 \begin{slide}
367 \head{Distinguishability, 3: computational indistinguishability}
368
369 We say that two families of distributions are computationally
370 indistinguishable if no `efficient' algorithm can tell them apart with
371 better than `negligible' probability.
372
373 So, we say that $\{\mathcal{X}_k\}_{k \in \N}$ and $\{\mathcal{Y}_k\}_{k
374 \in \N}$ are \emph{computationally indistinguishable} if, for any
375 probabilistic polynomial-time algorithm $A$, there is a negligible function
376 $\nu(\cdot)$ such that, for any $k$:
\[ |{\Pr[x \getsr \mathcal{X}_k; b \gets A(x) : b = 1]} -
   {\Pr[x \getsr \mathcal{Y}_k; b \gets A(x) : b = 1]}| \le \nu(k). \]%
379 \end{slide}
380
381 \begin{slide}
382 \topic{collisions}
383 \head{Collisions}
384
385 Suppose we throw $q$ balls into $n$ bins at random (with $q \le n$). Let
386 $C_{q, n}$ be the event that, at the end of this, we have a bin containing
387 more than one ball -- a \emph{collision}.
388
389 Let $B_{q, n}$ be the event that the $q$-th ball collides with a
390 previous one. Obviously, the worst case for this is when none of the other
391 balls have collided, so
392 \[ \Pr[B_{q, n}] \le \frac{q - 1}{n}. \]
393 Then
394 \begin{eqnarray*}[rl]
395 \Pr[C_{q, n}]
396 &\le \Pr[B_{2, n}] + \Pr[B_{3, n}] + \cdots + \Pr[B_{q, n}] \\
&\le \frac{1}{n} + \frac{2}{n} + \cdots + \frac{q - 1}{n} \\
398 &= \frac{q(q - 1)}{2 n}.
399 \end{eqnarray*}
400 This is an extremely useful result, and we shall need it often.
401 \end{slide}
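The familiar `birthday paradox' is the case $q = 23$, $n = 365$: the bound
gives $\Pr[C_{23, 365}] \le (23 \cdot 22)/730 \approx 0.69$, comfortably
above the true collision probability of about $0.507$.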
402
403 \xcalways\subsection{Primitives}\x
404
405 \begin{slide}
406 \topic{one-way functions}
407 \head{One-way functions, 1: introduction}
408
409 Intuition: a one-way function is easy to compute but hard to invert.
410
411 Choose a function $f\colon X \to Y$. Let $I$ be a prospective inverter.
412 Now we play a game:
413 \begin{enumerate}
414 \item Choose $x \in X$ at random. Compute $y = f(x)$.
415 \item Let $x'$ be the output of $I$ when run on input $y$.
416 \item We say that $I$ `wins' if $f(x') = y$; otherwise it loses.
417 \end{enumerate}
418 Note that we don't care whether $x = x'$.
419
420 Examples: SHA-1, or $x \mapsto g^x \bmod p$.
421 \end{slide}
422
423 \begin{slide}
424 \head{One-way functions, 2: formalism}
425
426 The \emph{success probability} of an inverter $I$ against the function $f$
427 is
428 \[ \Succ{owf}{f}(I) =
429 \Pr[x \getsr X;
430 y \gets f(x);
431 x' \gets I(y) :
432 f(x') = y] \]%
433 where the probability is taken over all choices of $x$ and all coin flips
434 made by $I$.
435
436 We measure the \emph{insecurity} of a one-way function (OWF) by maximizing
437 over possible inverters:
438 \[ \InSec{owf}(f; t) = \max_I \Succ{owf}{f}(I) \]
439 where the maximum is taken over all $I$ running in time $t$.
440
441 If $\InSec{owf}(f; t) \le \epsilon$ then we say that $f$ is a \emph{$(t,
442 \epsilon)$-secure one-way function}.
443 \end{slide}
444
445 \begin{slide}
446 \head{One-way functions, 3: trapdoors}
447
448 Intuition: a \emph{trapdoor} is secret information which makes inverting a
449 one-way function easy. A trapdoor one-way function generator $\mathcal{T}
450 = (G, f, T)$ is a triple of algorithms:
451
452 \begin{itemize}
453 \item The probabilistic algorithm $G$ is given some parameter $k$ and
454 returns a pair $(P, K)$, containing the public parameters $P$ for
455 computing an instance of the one-way function, and the secret trapdoor
456 information $K$ for inverting it. We write $(P, K) \in G(k)$.
457
458 \item The algorithm $f$ implements the one-way function. That is, if $(P,
459 K) \in G(k)$ then $f(P, \cdot)$ (usually written $f_P(\cdot)$) is a
460 one-way function.
461
462 \item The algorithm $T$ inverts $f$ using the trapdoor information. That
463 is, if $(P, K) \in G(k)$ and $y = f_P(x)$ for some $x$, then $x =
464 T(K, y)$. We usually write $T_K(\cdot)$ instead of $T(K, \cdot)$.
465 \end{itemize}
466 \end{slide}
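A concrete instantiation, sketched here informally, is RSA, which we shall
meet again below: $G$ chooses two random primes $p$ and $q$ of roughly
$k/2$ bits each, sets $n = p q$, picks $e$ relatively prime to
$(p - 1)(q - 1)$, and computes $d$ with $e d \equiv 1 \pmod{(p - 1)(q -
1)}$; it returns $P = (n, e)$ and $K = (n, d)$.  Then $f_P(x) = x^e \bmod
n$, and $T_K(y) = y^d \bmod n$ inverts it, since $x^{ed} \equiv x
\pmod{n}$.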
467
468 \begin{slide}
469 \topic{pseudorandom generators (PRGs)}
470 \head{Pseudorandom generators (PRGs)}
471
472 A pseudorandom generator (PRG) `stretches' an input seed into a longer
473 string which `looks' random.
474
475 Let $G\colon \{0, 1\}^k \to \{0, 1\}^L$ be a function from $k$-bit strings
476 to $L$-bit strings. The \emph{advantage} of a distinguisher $D$ against
477 $G$ is:
478 \begin{eqnarray*}[rl]
479 \Adv{prg}{G}(D) = &
480 \Pr[x \getsr \{0, 1\}^k; y \gets G(x) : D(y) = 1] - {}\\
481 & \Pr[y \getsr \{0, 1\}^L : D(y) = 1].
482 \end{eqnarray*}
483 The \emph{insecurity} is simply the maximum advantage:
484 \[ \InSec{prg}(G; t) = \max_D \Adv{prg}{G}(D) \]
485 where the maximum is taken over all distinguishers $D$ running in time
486 $t$. If $\InSec{prg}(G; t) \le \epsilon$ then we also say that $G$ is a
487 $(t, \epsilon)$-secure PRG\@.
488 \end{slide}
489
490 \begin{exercise}
491 We say that a PRG $g\colon \{0, 1\}^k \to \{0, 1\}^L$ is \emph{trivial} if
492 $k \ge L$.
493 \begin{enumerate}
494 \item Show that trivial PRGs exist.
495 \item Show that if $g$ is nontrivial, then $g$ is also a one-way function,
496 with
497 \[ \InSec{owf}(g; t) \le \InSec{prg}(g; t) + 2^{k-L}. \]
498 \end{enumerate}
499 \answer%
500 \begin{parenum}
\item The identity function $I(x) = x$ is a trivial PRG, with
  $\InSec{prg}(I; t) = 0$, as is easily seen from the definition.
503 \item Suppose $A$ inverts $g$. Then consider adversary $B(y)$: \{ $x \gets
504 A(y)$; \IF $g(x) = y$ \THEN \RETURN $1$; \ELSE \RETURN $0$;~\}. If $y$
505 is the output of $g$, then $A$ inverts $y$ with probability
506 $\Succ{owf}{g}(A)$; if $y$ is random in $\{0, 1\}^L$ then there is a
507 probability at least $1 - 2^{k-L}$ that $y$ has \emph{no} inverse,
508 proving the result. Note that \cite{Wagner:2000:PSU} provides a better
509 security bound than this simplistic analysis.
510 \end{parenum}
511 \end{exercise}
512
513 \begin{exercise}
514 \label{ex:dbl-prg}
515 Suppose that we have a \emph{length-doubling} PRG $g\colon \{0, 1\}^k \to
\{0, 1\}^{2k}$.  Let $g_0(x)$ be the first $k$ bits of $g(x)$ and
$g_1(x)$ be the second $k$ bits.  Define the sequence of generators
$g^{(i)}$ (for $i \ge 1$) by
519 \[ g^{(1)}(x) = g(x); \qquad
520 g^{(i+1)}(x) = g_0(x) \cat g^{(i)}(g_1(x)). \]%
521 Relate the security of $g^{(i)}$ to that of $g$.
522 \answer%
523 Let $A$ be an adversary running in time $t$ and attacking $g^{(i+1)}$.
Firstly, we attack $g$: consider adversary $B(y)$: \{ \PARSE $y$ \AS $y_0,
k\colon y_1$; $z \gets g^{(i)}(y_1)$; $b \gets A(y_0 \cat z)$; \RETURN
$b$;~\}.  Then $\Adv{prg}{g^{(i+1)}}(A) = \Adv{prg}{g}(B) + \delta$, so
$\InSec{prg}(g^{(i+1)}; t) \le \InSec{prg}(g; t) + \delta$, where
\begin{eqnarray*}[rl]
  \delta = &\Pr[x_0 \getsr \{0, 1\}^k; x_1 \getsr \{0, 1\}^k;
    y \gets g^{(i)}(x_1) : A(x_0 \cat y) = 1] - \\
  &\Pr[y \getsr \{0, 1\}^{(i+2)k} : A(y) = 1].
532 \end{eqnarray*}
533 We attack $g^{(i)}$ to bound $\delta$: consider adversary $C(y)$: \{ $x_0
534 \getsr \{0, 1\}^k$; $b \gets A(x_0 \cat y)$; \RETURN $b$;~\}. Now $\delta
535 \le \Adv{prg}{g^{(i)}}(C) \le \InSec{prg}(g^{(i)}; t)$. So by induction,
536 \[ \InSec{prg}(g^{(i)}; t) \le i \cdot \InSec{prg}(g; t). \]
537 \end{exercise}
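Unrolling the recursion gives an iterative version of the generator; the
following sketch, in the same algorithm notation we use elsewhere,
computes $g^{(i)}(x)$:
\begin{program}
  Algorithm $g^{(i)}(x)$: \+ \\
    $y \gets \emptystring$; \\
    \WHILE $i > 1$ \DO \{
      $y \gets y \cat g_0(x)$; $x \gets g_1(x)$; $i \gets i - 1$;~\} \\
    \RETURN $y \cat g(x)$;
\end{program}
Each loop iteration outputs $k$ bits and carries the remaining $k$ bits of
seed material forward into the next stage.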
538
539 \begin{slide}
540 \topic{advantage}
541 \head{Notes about advantage}
542
543 Advantage is a concept used in many definitions of security notions:
544 \begin{eqnarray*}[rl]
545 \Adv{}{}(A) = &
546 \Pr[\text{$A$ returns 1 in setting $a$}] - {} \\
547 & \Pr[\text{$A$ returns 1 in setting $b$}].
548 \end{eqnarray*}
549 \begin{enumerate}
550 \item We have $-1 \le \Adv{}{}(A) \le +1$.
551 \item Zero means that the adversary couldn't distinguish.
552 \item Negative advantage means the adversary got them the wrong way
553 around. There is another adversary which uses the same resources but has
554 positive advantage.
555 \item \label{item:adv-guess} If $A$ is attempting to guess some hidden bit
556 $b^* \inr \{0, 1\}$, we have
557 \[ \Pr[b \gets A : b = b^*] = \frac{\Adv{}{}(A)}{2} + \frac{1}{2}. \]
558 \end{enumerate}
559 \end{slide}
560
561 \begin{proof}
562 Let $b$ be the bit that $A$ returns, and let $b^*$ be the experiment's
563 hidden bit. Then
564 \[ \Adv{}{}(A) = \Pr[b = 1 \mid b^* = 1] - \Pr[b = 1 \mid b^* = 0]. \]
565 Addressing the above claims in order:
566 \begin{enumerate}
\item By definition of probability, $0 \le \Pr[b = 1 \mid b^* = 1] \le 1$
  and $0 \le \Pr[b = 1 \mid b^* = 0] \le 1$, so their difference lies
  between $-1$ and $+1$.
570 \item This is a corollary of \ref{item:adv-guess}.
571 \item Consider the adversary $\bar{A}$ which runs $A$ and returns the
572 complement bit $\bar{b} = b \xor 1$. Then
573 \begin{eqnarray*}[rl]
574 \Adv{}{}(\bar{A})
575 &= \Pr[\bar{b} = 1 \mid b^* = 1] - \Pr[\bar{b} = 1 \mid b^* = 0] \\
576 &= \Pr[b = 0 \mid b^* = 1] - \Pr[b = 0 \mid b^* = 0] \\
577 &= (1 - \Pr[b = 1 \mid b^* = 1]) - (1 - \Pr[b = 1 \mid b^* = 0]) \\
578 &= \Pr[b = 1 \mid b^* = 0] - \Pr[b = 1 \mid b^* = 1] \\
579 &= -\Adv{}{}(A).
580 \end{eqnarray*}
581 \item Note that $\Pr[b^* = 1] = \Pr[b^* = 0] = \frac{1}{2}$. Then
582 \begin{eqnarray*}[rl]
583 \Pr[b = b^*]
584 &= \Pr[b = 1 \land b^* = 1] + \Pr[b = 0 \land b^* = 0] \\
&= \frac{1}{2}(\Pr[b = 1 \mid b^* = 1] + \Pr[b = 0 \mid b^* = 0]) \\
&= \frac{1}{2}(\Pr[b = 1 \mid b^* = 1] +
               (1 - \Pr[b = 1 \mid b^* = 0])) \\
&= \frac{1}{2}(1 + \Pr[b = 1 \mid b^* = 1] -
               \Pr[b = 1 \mid b^* = 0]) \\
590 &= \frac{\Adv{}{}(A)}{2} + \frac{1}{2}.
591 \end{eqnarray*}
592 \end{enumerate}
593 All present and correct.
594 \end{proof}
595
596 \begin{slide}
597 \topic{pseudorandom functions (PRFs)}
598 \head{Pseudorandom functions (PRFs)}
599
600 A \emph{pseudorandom function family} (PRF) is a collection of functions
601 $F_K\colon \{0, 1\}^l \to \{0, 1\}^L$, where $K$ is some index, typically
602 from a set of fixed-size strings $\{0, 1\}^k$.
603
604 We want to say that $F$ is a strong PRF if adversaries find it hard to
605 distinguish an instance $F_K$ from a function chosen completely at random
606 with the same `shape'.
607
We provide the adversary with an \emph{oracle}, either for a randomly
selected $F_K$, or for a completely random function, and ask it to say
which it was given.
611
We write $\Func{l}{L}$ for the set of \emph{all} functions from $\{0, 1\}^l$
613 to $\{0, 1\}^L$.
614 \end{slide}
615
616 \begin{slide}
\head{Pseudorandom functions (PRFs) (cont.)}
618
619 We define the advantage of a distinguisher $D$ against the PRF $F$ as
620 follows:
621 \begin{eqnarray*}[rl]
622 \Adv{prf}{F}(D) = &
623 \Pr[K \getsr \{0, 1\}^k : D^{F_K(\cdot)} = 1] - {}\\
624 & \Pr[R \getsr \Func{l}{L} : D^{R(\cdot)} = 1].
625 \end{eqnarray*}
626 The insecurity of the PRF is then measured as
627 \[ \InSec{prf}(F; t, q) = \max_D \Adv{prf}{F}(D) \]
628 where the maximum is taken over all distinguishers $D$ which run for time
629 $t$ and make $q$ oracle queries. As is usual, if $\InSec{prf}(F; t, q)
630 \le \epsilon$ then we say that $F$ is a $(t, q, \epsilon)$-secure PRF.
631 \end{slide}
632
633 \begin{slide}
634 \topic{pseudorandom permutations (PRPs)}
635 \head{Pseudorandom permutations (PRPs)}
636
637 We define a \emph{pseudorandom permutation family} (PRP) in a similar way
638 to the PRFs we've already seen. Suppose $F_K\colon \{0, 1\}^L \to \{0,
639 1\}^L$ is a family of permutations, indexed by elements of some finite set,
640 e.g., $\{0, 1\}^k$. Let $\Perm{L}$ be the set of \emph{all} permutations
641 over the set of $L$-bit strings $\{0, 1\}^L$.
642
643 The advantage of a distinguisher $D$ against the PRP $F$ is
644 \begin{eqnarray*}[rl]
645 \Adv{prp}{F}(D) = &
646 \Pr[K \getsr \{0, 1\}^k : D^{F_K(\cdot)} = 1] - {}\\
647 & \Pr[R \getsr \Perm{L} : D^{R(\cdot)} = 1].
648 \end{eqnarray*}
649
650 We define $\InSec{prp}(F; t, q) = \max_D \Adv{prp}{F}(D)$ exactly as for
651 PRFs, and the notion of $(t, q, \epsilon)$-security is the same.
652 \end{slide}
653
654 \begin{slide}
\head{Super pseudorandom permutations (super PRPs)}
656
657 PRPs are bijective. A \emph{super PRP} is a PRP which remains secure when
658 the distinguisher is allowed to make inverse queries:
659 \begin{eqnarray*}[rl]
\Adv{sprp}{F}(D) = &
  \Pr[K \getsr \{0, 1\}^k :
      D^{F_K(\cdot), F_K^{-1}(\cdot)} = 1] - {} \\
& \Pr[R \getsr \Perm{L} : D^{R(\cdot), R^{-1}(\cdot)} = 1].
664 \end{eqnarray*}
665 Since there are two oracles, we count queries to both when evaluating the
666 insecurity:
667 \[ \InSec{sprp}(F; t, q, q') = \max_D \Adv{sprp}{F}(D) \]
668 where the maximum is taken over all distinguishers $D$ which run for time
669 $t$, make $q$ queries to the standard oracle, and $q'$ queries to the
670 inverse oracle. If $\InSec{sprp}(F; t, q, q') \le \epsilon$ then we say
671 $F$ is a $(t, q, q', \epsilon)$-secure super PRP\@.
672 \end{slide}
673
674 \begin{exercise}
675 Note that the key length hasn't appeared anywhere in the definition of
676 insecurity for a PRP. Derive lower bounds for the insecurity of a PRP with
677 a $k$-bit key.
678 \answer%
Let $E\colon \{0, 1\}^k \times \{0, 1\}^L \to \{0, 1\}^L$ be a PRP.  Fix
$n$ and $c$.  Then consider adversary $S^{E(\cdot)}$: \{ \FOR $i = 0$ \TO
$c - 1$ \DO $y[i] \gets E(i)$; \FOR $K = 0$ \TO $n - 1$ \DO \{ $i \gets 0$;
$\id{good} \gets 1$; \WHILE $i < c \land \id{good} = 1$ \DO \{ \IF $E_K(i)
\ne y[i]$ \THEN $\id{good} \gets 0$; $i \gets i + 1$;~\} \IF $\id{good} =
1$ \THEN \RETURN $1$;~\} \RETURN $0$;~\}.  Then $\Adv{prp}{E}(S) \ge
n(2^{-k} - 2^{-Lc})$.
685 \end{exercise}
686
687 \begin{slide}
688 \head{Block ciphers: PRPs are PRFs}
689
690 We model block ciphers as families of PRPs (not super PRPs). Most of the
691 analysis works best on PRFs, though. We show that a PRP makes a `pretty
692 good' PRF, as long as it's not over-used.
693
694 Let $F$ be any PRP family. Then
695 \[ \InSec{prf}(F; t, q) \le
696 \InSec{prp}(F; t, q) + \frac{q(q - 1)}{2^{L+1}}. \]%
This is a useful result.  As long as $q^2$ is small compared to $2^L$,
where $L$ is the block size in bits, a PRP makes a good PRF.
699
The value $2^{L/2}$ is often called the \emph{birthday bound} (after the
701 birthday `paradox'). We shall meet it often when we examine modes of
702 operation.
703 \end{slide}
704
705 \begin{proof}
706 Let $F$ be a PRP family, and $K$ a randomly chosen key from the keyspace of
707 $F$; let $R \inr \Func{L}{L}$ be a random function, and let $P \inr
708 \Perm{L}$ be a random permutation.
709
710 Let $A$ be any distinguisher running in time $t$, and making $q$ oracle
711 queries. Then:
712 \begin{eqnarray*}[rl]
713 \Adv{prf}{F}(A)
714 & = \Pr[A^{F_K(\cdot)} = 1] - \Pr[A^{R(\cdot)} = 1] \\
715 & = \Pr[A^{F_K(\cdot)} = 1] - \Pr[A^{P(\cdot)} = 1] +
716 \Pr[A^{P(\cdot)} = 1] - \Pr[A^{R(\cdot)} = 1] \\
717 & = \Adv{prp}{F}(A) + \Pr[A^{P(\cdot)} = 1] - \Pr[A^{R(\cdot)} = 1] \\
718 & = \Adv{prp}{F}(A) + \Pr[A^{R(\cdot)} = 0] - \Pr[A^{P(\cdot)} = 0].
719 \end{eqnarray*}
720 Now we need to bound the quantity $\Pr[A^{R(\cdot)} = 0] - \Pr[A^{P(\cdot)}
721 = 0]$. Let $D$ be the event that all of the \emph{distinct} oracle queries
722 to $R$ yield distinct results. Then
723 \begin{eqnarray*}[rl]
724 \Pr[A^{R(\cdot)} = 0]
725 & = \Pr[A^{R(\cdot)} = 0 \mid D] \Pr[D] +
726 \Pr[A^{R(\cdot)} = 0 \mid \lnot D] \Pr[\lnot D] \\
727 & = \Pr[A^{P(\cdot)} = 0] \Pr[D] +
728 \Pr[A^{R(\cdot)} = 0 \mid \lnot D] \Pr[\lnot D] \\
729 & \le \Pr[A^{P(\cdot)} = 0] + \Pr[\lnot D] \\
730 & \le \Pr[A^{P(\cdot)} = 0] + \frac{q(q - 1)}{2^{L+1}}.
731 \end{eqnarray*}
732 Substituting this into the above equation yields
733 \[ \Adv{prf}{F}(A) \le \Adv{prp}{F}(A) + \frac{q(q - 1)}{2^{L+1}}. \]
734 Select $A$ such that $\Adv{prf}{F}(A) = \InSec{prf}(F; t, q)$. We know
735 $\Adv{prp}{F}(A) \le \InSec{prp}(F; t, q)$ by definition. The result
736 follows.
737 \end{proof}
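For a sense of scale: with a 64-bit block cipher ($L = 64$), the extra
term $q(q - 1)/2^{65} \approx q^2/2^{65}$ reaches $\frac{1}{2}$ at around
$q = 2^{32}$ queries, long before the $2^{64}$ possible plaintext blocks
are exhausted.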
738
739 \begin{slide}
740 \topic{hash functions}
741 \head{Hash functions, 1: properties}
742
743 Hash functions like MD5 and SHA-1 are extremely useful primitives. What
properties do we expect of them?  This turns out to be an extremely
difficult question to answer.
746 \begin{itemize}
747 \item One-wayness. We've seen a definition for this already. But it's not
748 enough.
749 \item Collision-resistance. This is the property usually claimed as the
750 requirement for use in digital signature systems. We'll look at this
751 later.
752 \item Randomness. What does this mean, when the function is completely
753 public? A distinguishability criterion is clearly hopeless.
754 \end{itemize}
755 \end{slide}
756
757 \begin{slide}
758 \head{Hash functions, 2: Merkle-Damg\aa{}rd iterated hashing
759 \cite{Damgaard:1990:DPH, Merkle:1991:FSE}}
760
761 Let $F\colon \{0, 1\}^{k+L} \to \{0, 1\}^k$ be a \emph{compression}
762 function. Now consider the function $H\colon \{0, 1\}^* \to \{0, 1\}^k$
763 which transforms an input string $x$ as follows:
764 \begin{enumerate}
765 \item Pad $x$ to a multiple of $L$ bits in some injective way. Divide the
766 padded message into $L$-bit blocks $x_0$, $x_1$, \ldots, $x_{n-1}$.
\item Fix some $k$-bit constant $I$.  Let $I_0 = I$.  Define $I_{i+1} =
768 F(I_i \cat x_i)$ for $0 \le i < n$.
769 \item The result $H(x) = I_n$.
770 \end{enumerate}
771
772 Suppose we have two strings $x \ne y$, such that $H(x) = H(y)$; i.e., a
773 \emph{collision}. Then \emph{either} we can find a collision for $F$
774 \emph{or} a string $z$ for which $F(z) = I$. (This is why initialization
775 vectors for hash functions have such obviously regular forms.)
776 \end{slide}
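The iteration is easy to transcribe into the algorithm notation we use
elsewhere; the following sketch merely restates the steps above:
\begin{program}
  Algorithm $H(x)$: \+ \\
    pad $x$ injectively and split it into $L$-bit blocks
      $x_0, x_1, \ldots, x_{n-1}$; \\
    $h \gets I$; \\
    \FOR $i = 0$ \TO $n - 1$ \DO $h \gets F(h \cat x_i)$; \\
    \RETURN $h$;
\end{program}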
777
\begin{proof}
  Let $x_0, x_1, \ldots, x_{n-1}$ and $x'_0, x'_1, \ldots, x'_{n'-1}$ be
  the $L$-bit blocks of two distinct (padded) messages, and without loss
  of generality suppose that $n \ge n'$.  Let $I_0 = I'_0 = I$, let
  $I_{i+1} = F(I_i \cat x_i)$, and $I'_{i+1} = F(I'_i \cat x'_i)$.  We
  have $I_n = I'_{n'}$.

  We prove the result by induction on $n$.  The case $n = 0$ is trivially
  true, since there is only one zero-block message.  Suppose, then, that
  the result is true for messages of fewer than $n$ blocks.  There are
  three cases to consider.  Firstly, if $n' = 0$ then $F(I_{n-1} \cat
  x_{n-1}) = I_n = I'_0 = I$, and we have found a preimage of $I$.
  Secondly, if $I_{n-1} \ne I'_{n'-1}$ or $x_{n-1} \ne x'_{n'-1}$ then we
  have a collision, for $F(I_{n-1} \cat x_{n-1}) = I_n = I'_{n'} =
  F(I'_{n'-1} \cat x'_{n'-1})$.  Finally, if $I_{n-1} = I'_{n'-1}$ and
  $x_{n-1} = x'_{n'-1}$ then we remove the final block from both messages;
  the shortened messages are still distinct, since otherwise the original
  messages would have been equal, so we can apply the inductive hypothesis
  to them, completing the proof.
\end{proof}
796
797 \begin{slide}
\head{Hash functions, 3: any-collision resistance}
799
The statement usually made about a `good' hash function $H$ is that it
801 should be `difficult' to find a collision: i.e., two preimages $x \ne y$
802 where $H(x) = H(y)$. How do we formalize this? Here's one attempt:
803 \begin{eqlines*}
804 \Succ{acr}{H}(A) = \Pr[(x, y) \gets A : x \ne y \land H(x) = H(y)]; \\
805 \InSec{acr}(H; t) = \max_A \Succ{acr}{H}(A).
806 \end{eqlines*}
But this doesn't work.  There clearly \emph{exists} an adversary which
already `knows' a collision for $H$ and just outputs the right answer.
It succeeds very quickly, and with probability 1.  So this definition is
810 impossible to satisfy.
811 \end{slide}
812
813 \begin{slide}
\head{Hash functions, 4: targeted collision resistance}
815
The concept of targeted collision resistance is relatively new, but quite
817 promising. It replaces a single hash function with a \emph{family} of hash
818 functions. They're not really keyed, because the indices aren't kept
819 secret.
820
821 When making a signature, an index $i$ is chosen at random, and the
signature for message $m$ is formed over the pair $(i, H_i(m))$.
823
824 TCR-hash functions are the subject of ongoing research. No practical
825 designs exist at the moment.
826 \end{slide}
827
828 \begin{slide}
\head{Hash functions, 5: targeted collision resistance (cont.)}
830
831 Consider the following experiment:
832 \begin{program}
833 $\Expt{tcr}{H}(A)$: \+ \\
834 $(x, s) \gets A(\cookie{find})$; \\
835 $i \getsr \keys H$; \\
836 $y \gets A(\cookie{collide}, i, s)$; \\
837 \IF $x \ne y \land H_i(x) = H_i(y)$
838 \THEN \RETURN $1$; \\
839 \ELSE \RETURN $0$;
840 \end{program}
841 The security of a TCR-hash function is measured as:
842 \[ \InSec{tcr}(H; t) = \max_A \Pr[\Expt{tcr}{H}(A) = 1] \]
843 where the maximum is taken over all adversaries running in time $t$. We
844 define $(t, \epsilon)$-security as usual.
845 \end{slide}
846
847 \begin{slide}
\head{Hash functions, 6: random oracles \cite{Bellare:1993:ROP}}
849
850 In practice, we expect much more than just collision resistance from hash
851 functions: we expect a certain amount of `random' behaviour. But this is
852 hard to quantify.
853
854 One approach is to model a hash function as a `random oracle', i.e., an
855 oracle containing a function chosen at random, used by the construction
856 under consideration, and made available to the adversary. The idea is that
857 reductions can `capture' the knowledge of an adversary by examining the
858 queries it makes to its random oracle.
859
860 Hash functions \emph{aren't} random oracles. But a random oracle proof is
861 better than nothing.
862 \end{slide}
863
864 \xcalways\subsection{Standard assumptions}\x
865
866 \begin{slide}
867 \head{Standard assumptions}
868
869 There are a number of `standard' assumptions that are made about the
870 difficulty of various problems:
871 \begin{itemize}
872 \item IFP, the Integer Factorization Problem;
873 \item QRP, the Quadratic Residuosity Problem;
874 \item DLP, the Discrete Logarithm Problem;
875 \item RSAP, the RSA Problem; and
\item CDH, the Computational Diffie-Hellman problem and its variants.
877 \end{itemize}
878 \cite{Menezes:1997:HAC} has excellent material on the above.
879 \end{slide}
880
881 \begin{slide}
882 \topic{integer factorization}
883 \head{The Integer Factorization Problem, 1}
884
885 We often assume that large integers of the form $n = p q$, where $p$ and
886 $q$ are primes of roughly the same size, are `hard' to factor.
887 Specifically, there is no algorithm which will factor such an $n$ in time
888 bounded by a polynomial function of $\log n$.
889
The difficulty of various other problems, e.g., Quadratic Residuosity, or
RSA, depends on the difficulty of factoring; however, it is not yet known
whether the ability to solve QRP or RSAP can be used to factor.
893 \end{slide}
894
895 \begin{slide}
896 \head{The Integer Factorization Problem, 2: square roots}
897
898 The problem of extracting square roots modulo $n = p q$ is provably as hard
899 as factoring. This is the basis of Rabin's public key encryption and
900 digital signature schemes. We shall analyse these later.
901
902 Suppose $Q(n, y)$ is an algorithm which returns an $x$ such that $x^2
903 \equiv y \pmod{n}$, provided such an $x$ exists. Then we can find a
904 nontrivial factor of $n$ as follows:
905 \begin{program}
906 Algorithm $\id{find-factor}(n)$: \+ \\
907 \REPEAT \\ \quad\=\+\kill
908 $x \getsr \{1, 2, \ldots, n - 1\}$; \\
909 $y \gets x^2 \bmod n$; \\
910 $x' \gets Q(n, y)$; \\
911 $p \gets \gcd(x + x', n)$; \\
912 \IF $p \notin \{1, n\}$ \THEN \RETURN $p$; \- \\
913 \FOREVER;
914 \end{program}
915 \end{slide}
916
\begin{proof}
  The program attempts to find two square roots of $y$ mod $n$.  It's easy
  to see that this might lead to factors of $n$: if $x^2 \equiv x'^2
  \pmod{n}$ then $x^2 - x'^2 = k n$ for some integer $k$, and $(x + x')(x
  - x')$ is then a factorization of $k n$.  It remains to show that
  $\gcd(x + x', n)$ is a nontrivial factor of $n$ whenever $x \not\equiv
  \pm x' \pmod{n}$.  (If $\gcd(x, n) = 1$, this happens with probability
  $\frac{1}{2}$ in each iteration: $y$ then has four square roots mod $n$,
  the chosen $x$ is uniformly distributed among them given $y$, and $Q$'s
  answer $x'$ depends only on $y$.  And if $\gcd(x, n) \ne 1$, it is
  itself a nontrivial factor of $n$.)

  So, let $n$ be an odd composite, and suppose that $\gcd(x, n) = 1$, that
  $x \not\equiv \pm x' \pmod{n}$, and that $x^2 \equiv x'^2 \pmod{n}$; we
  claim that $\gcd(x + x', n)$ is a nontrivial factor of $n$.

  Firstly, we claim that, if $p$ is an odd prime and $x^2 \equiv x'^2
  \not\equiv 0 \pmod{p}$, then $x \equiv \pm x' \pmod{p}$.  Let $g$ be
  primitive mod $p$, with $x = g^\alpha$ and $x' = g^\beta$.  Then $g^{2
  \alpha} \equiv g^{2 \beta} \pmod{p}$, so $2 \alpha \equiv 2 \beta
  \pmod{p - 1}$.  But $p - 1$ is even, by hypothesis, so $\alpha \equiv
  \beta \pmod{(p - 1)/2}$.  But $g^{(p-1)/2} \equiv -1 \pmod{p}$; hence $x
  \equiv \pm x' \pmod{p}$, proving the claim.

  There must exist distinct odd primes $p$, $q$ such that $p|n$ and $q|n$,
  and $x \equiv -x' \pmod{p}$ and $x \equiv x' \pmod{q}$, for if not, then
  $x \equiv \pm x' \pmod{n}$, contrary to hypothesis.  But then $x + x'
  \equiv 0 \pmod{p}$, so $p|(x + x')$; on the other hand, $x + x' \equiv
  2 x \not\equiv 0 \pmod{q}$, since $q$ is odd and $q \nmid x$.  Hence,
  $p$ divides $x + x'$, but $q$ does not, so $\gcd(x + x', n)$ is a
  nontrivial factor of $n$, completing the proof.
\end{proof}
943
944 \begin{slide}
945 \topic{quadratic residuosity}
946 \head{The Quadratic Residuosity Problem}
947
948 If there is an $x$ such that $x^2 \equiv y \pmod{n}$ then $y$ is a
949 \emph{quadratic residue modulo $n$}, and we write $y \in Q_n$; if there is
950 no such $x$ then $y$ is a \emph{quadratic nonresidue modulo $n$}.
951
952 If $p$ is prime, then we can use the \emph{Legendre symbol} to decide
953 whether $x$ is a quadratic residue mod $p$:
\[ \jacobi{x}{p} \equiv x^{(p-1)/2} \pmod{p}, \qquad
   \jacobi{x}{p} =
   \begin{cases}
     0 & if $p$ divides $x$ \\
     -1 & if $x$ is a quadratic nonresidue mod $p$ \\
     +1 & if $x$ is a quadratic residue mod $p$
   \end{cases}. \]%
960 The \emph{Jacobi symbol} (written the same way) is defined for odd~$n$: if
961 $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$ where the $p_i$ are prime, then
962 \[ \jacobi{x}{n} =
963 \jacobi{x}{p_1}^{a_1} \jacobi{x}{p_2}^{a_2} \cdots
964 \jacobi{x}{p_k}^{a_k}. \]%
965 This can be efficiently computed without knowledge of the factors of $n$
966 \cite[Section~2.4.5]{Menezes:1997:HAC}.
967 \end{slide}
968
969 \begin{slide}
970 \head{The Quadratic Residuosity Problem (cont.)}
971
972 If $\jacobi{x}{n} = -1$ then $x$ is certainly \emph{not} a quadratic
973 residue mod $n$; however, if $\jacobi{x}{n} = 1$ then $x$ might be a
974 quadratic residue or it might not; if not, then we say that $x$ is a
975 \emph{pseudosquare}.
976
977 If $n = p q$ is a product of two primes and $x \inr (\Z/n\Z)^*$ is chosen
978 at random, then
979 \[ \Pr\Bigl[x \in Q_n \Bigm| \jacobi{x}{n} = 1\Bigr] = \frac{1}{2}. \]
980
981 The problem of distinguishing pseudosquares from quadratic residues is
982 called the Quadratic Residuosity Problem (QRP). It is not known how to
983 solve this problem without factoring $n$.
984 \end{slide}
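A small worked example: take $n = 15 = 3 \cdot 5$ and $x = 2$.  Then
$\jacobi{2}{3} = -1$ (since $2^1 \equiv -1 \pmod{3}$) and $\jacobi{2}{5} =
-1$ (since $2^2 = 4 \equiv -1 \pmod{5}$), so $\jacobi{2}{15} = (-1)(-1) =
+1$; but $2$ is a nonresidue mod $3$, and hence a nonresidue mod $15$, so
$2$ is a pseudosquare.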
985
986 \begin{slide}
987 \topic{discrete logarithms}
988 \head{The Discrete Logarithm Problem}
989
990 The (Integer) Discrete Logarithm Problem asks for the solution $x$ given a
991 congruence of the form $g^x \equiv y \pmod{n}$. This seems to be about as
992 difficult as factoring. (The ability to solve discrete logs modulo $n$ is
993 sufficient to factor $n$. The best known algorithms for IFP and DLP have
994 the same running time.)
995
996 The problem generalizes to other cyclic groups, e.g., elliptic curves over
997 finite fields.
998 \end{slide}
999
1000 \begin{slide}
1001 \topic{self-reducibility}
1002 \head{Self-reducibility, 1}
1003
1004 The problems of square-root extraction, deciding quadratic residuosity, the
1005 RSA problem, and finding discrete logarithms share the property of being
1006 \emph{randomly self-reducible}; i.e., an instance of the problem can be
1007 transformed into many different derived instances \emph{without skewing the
1008 probability distribution of problem instances}, such that a solution to
1009 one of the derived instances yields a solution to the original one.
1010
1011 This is a good property to have. It implies that `most' problem instances
1012 are as hard as the hardest instances.
1013 \end{slide}
1014
1015 \begin{slide}
1016 \head{Self-reducibility, 2: the RSA problem \cite{Rivest:1978:MOD}}
1017
1018 The RSA problem is to compute $e$-th roots modulo $n = p q$, where $e$ is
relatively prime to $(p - 1)(q - 1)$.  Suppose that the algorithm $S(n, e, y)$ returns a
1020 value $x$ such that $x^e \equiv y \pmod{n}$ for `many' choices of $y$, or
1021 the special symbol $\bot$ otherwise. The following probabilistic algorithm
then solves the RSA problem for arbitrary $y$:
1023 \begin{program}
1024 Algorithm $\id{solve-rsa}(n, e, y)$: \+ \\
1025 \REPEAT \\ \quad\=\+\kill
1026 $x' \getsr \{1, 2, \ldots, n - 1\}$; \\
1027 \IF $\gcd(x', n) = 1$ \THEN \\ \quad\=\+\kill
1028 $y' \gets y x'^e \bmod n$; \\
1029 $x \gets S(n, e, y')$; \\
1030 \IF $x \ne \bot$ \THEN \RETURN $x x'^{-1} \bmod n$; \-\- \\
1031 \FOREVER;
1032 \end{program}
1033 \end{slide}
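To see that the reduction is correct, note that $x^e \equiv y' \equiv y
x'^e \pmod{n}$, so $(x x'^{-1})^e \equiv y \pmod{n}$, as required.
Moreover, if $y$ is itself a unit, then $y' = y x'^e \bmod n$ is uniformly
distributed over $(\Z/n\Z)^*$, since $x'$ is a uniformly chosen unit and
$e$-th powering is a bijection on $(\Z/n\Z)^*$: this is where the random
self-reducibility comes from.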
1034
1035 \begin{slide}
1036 \topic{the Diffie-Hellman problem}
1037 \head{The Diffie-Hellman problem \cite{Diffie:1976:NDC}}
1038
Let $G = \langle g \rangle$ be a cyclic group of order $q$.  Let $\alpha$
1040 and $\beta$ be indices, $\alpha, \beta \in \Z/q\Z$.
1041
1042 The (computational) \emph{Diffie-Hellman} problem is, given $g^\alpha$ and
1043 $g^\beta$, to find $g^{\alpha\beta}$.
1044
The (computational) Diffie-Hellman \emph{assumption} is that there is no
probabilistic algorithm which solves the computational Diffie-Hellman
problem in time polynomial in $\log q$.
1048
1049 Obviously, being able to compute discrete logs in $G$ efficiently would
1050 yield a solution to the Diffie-Hellman problem. But it may be the case
1051 that the Diffie-Hellman problem is easier than the discrete log problem.
1052
1053 The Diffie-Hellman problem is self-reducible.
1054 \end{slide}
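To see the self-reducibility: given an instance $(X, Y) = (g^\alpha,
g^\beta)$, choose $\mu, \nu \getsr \Z/q\Z$ and submit the derived instance
$(X g^\mu, Y g^\nu)$, which is uniformly distributed.  If we receive the
answer $Z = g^{(\alpha + \mu)(\beta + \nu)}$ then, since $(\alpha +
\mu)(\beta + \nu) = \alpha \beta + \alpha \nu + \beta \mu + \mu \nu$, we
can recover
\[ g^{\alpha\beta} = Z X^{-\nu} Y^{-\mu} g^{-\mu\nu}. \]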
1055
1056 \begin{slide}
1057 \head{The Decisional Diffie-Hellman assumption \cite{Boneh:1998:DDP}}
1058
1059 The computational Diffie-Hellman assumption makes a statement only about
1060 algorithms which compute the \emph{entire} answer $g^{\alpha\beta}$. Since
1061 Diffie-Hellman is frequently used for key-exchange, what can we say about
1062 the ability of an adversary to guess any of the bits?
1063
1064 The Decisional Diffie-Hellman (DDH) assumption asserts that, if you don't
1065 know $\alpha$ or $\beta$, then it's hard to tell $g^{\alpha\beta}$ from a
1066 random element of $G$; that is, that the distributions of the following
1067 experiments are computationally indistinguishable:
1068 \begin{program}
1069 $\alpha \getsr \Z/q\Z;$ \\
1070 $\beta \getsr \Z/q\Z;$ \\
1071 \RETURN $(g^\alpha, g^\beta, g^{\alpha\beta})$;
1072 \next
1073 $\alpha \getsr \Z/q\Z;$ \\
1074 $\beta \getsr \Z/q\Z;$ \\
1075 $\gamma \getsr \Z/q\Z;$ \\
1076 \RETURN $(g^\alpha, g^\beta, g^\gamma)$;
1077 \end{program}
1078 \end{slide}
1079
1080 \begin{slide}
1081 \head{The Decisional Diffie-Hellman assumption (cont.)}
1082
1083 If $A$ is an algorithm attempting to solve DDH in the group $G$, then we
1084 write its advantage as
1085 \begin{eqnarray*}[rl]
1086 \Adv{ddh}{G}(A) =
1087 & \Pr[\alpha \getsr \Z/q\Z; \beta \getsr \Z/q\Z :
1088 A(g^\alpha, g^\beta, g^{\alpha\beta}) = 1] - {} \\
1089 & \Pr[\alpha \getsr \Z/q\Z; \beta \getsr \Z/q\Z;
1090 \gamma \getsr \Z/q\Z :
1091 A(g^\alpha, g^\beta, g^\gamma) = 1]
1092 \end{eqnarray*}
1093 and the insecurity of DDH in $G$ as
1094 \[ \InSec{ddh}(G; t) = \max_A \Adv{ddh}{G}(A) \]
1095 with the maximum over all algorithms $A$ running in time $t$.
1096 \end{slide}
1097
1098 \begin{slide}
1099 \head{The Decisional Diffie-Hellman assumption (cont.)}
1100
1101 If you can solve the computational Diffie-Hellman problem, you can solve
1102 the decisional version. If you can compute discrete logs, you can solve
1103 both.
1104
There \emph{are} groups in which the computational problem is (believed to
1106 be) hard, but the decisional problem is easy. In particular, if the group
1107 order $q$ has small factors then the decisional problem isn't hard.
1108 \end{slide}
1109
1110 \endinput
1111
1112 %%% Local Variables:
1113 %%% mode: latex
1114 %%% TeX-master: "ips"
1115 %%% End: