\xcalways\section{Basics}\x

\xcalways\subsection{Introduction and motivation}\x

\begin{slide}
\topic{joke}
\head{Mandatory joke}

An elderly Frenchman rises every morning at 5, goes out to the street in
front of his house, and sprinkles a white powder up and down the street.

One day, a neighbour, who has watched his routine for many years, confronts
him. `What is this powder you sprinkle on the street every morning,
Pierre?'

`It is elephant powder, \emph{mon ami},' the gentleman says. `It keeps the
elephants away.'

`But Pierre,' says the neighbour. `Everybody knows that there are no
elephants in France.'

Says the older man matter-of-factly, `I guess it must be working, then.'
\end{slide}

The joke has its mandatory corresponding serious message. Cryptography is
like elephant powder. If it seems to work, and keeps the attackers -- our
elephants -- away, then we trust it. This isn't really very satisfactory.
When the elephant powder fails, a 2-ton elephant turns up, with three of its
chums in a Mini, and you notice it hiding upside-down in your bowl of
custard. Let's face it: elephants aren't good at being surreptitious.

But if your cryptography is no good, you may never know.

\begin{slide}
\topic{serious message}
\head{Cryptography is elephant powder}

So, what can we do about the situation?
\begin{itemize}
\item Design simple cryptographic primitives, e.g., block ciphers, hash
functions. Develop techniques for analysing them, and attempt to stay
ahead of the bad guy.
\item Build useful constructions from trusted primitives in a modular way.
\emph{Prove that the constructions are secure.}
\end{itemize}
Here we look at this second part of the approach.
\end{slide}

\xcalways\subsection{Approach}\x

\begin{slide}
\head{The provable security approach}

\begin{itemize}
\item Define notions of security. Now we know what to aim for, and what to
expect of our components.
\item Prove how the security of a construction relates to the security of
its primitives. If it breaks, we know who to blame.
\end{itemize}
\end{slide}

\begin{slide}
\topic{adversaries}
\head{Adversaries}

We model our adversary as a \emph{probabilistic algorithm}; i.e., the
algorithm is allowed to \emph{flip coins} to make decisions. The
adversary's output (for some input) follows a probability distribution. We
define what it means for an adversary to \emph{break} our construction, and
examine the probability with which this happens.

We provide the adversary with \emph{oracles}:
\begin{itemize}
\item Oracles compute using secrets hidden from the adversary.
\item We count the number of \emph{queries} made to an oracle.
\item We can restrict the types of queries the adversary makes.
\end{itemize}

Oracles are written as superscripts. For example, an adversary given a
chosen-plaintext oracle might be written as $A^{E_K(\cdot)}$.
\end{slide}

\begin{slide}
\topic{the asymptotic approach}
\resetseq
\head{The asymptotic approach, \seq}

A function $\nu\colon \N \to \R$ is \emph{negligible} if, for any integer
$c$, there exists an $n \in \N$ such that $0 \le \nu(k) < k^{-c}$ for all
$k \ge n$. That is, $\nu(k)$ is `eventually' less than any polynomial
function of $k$.

We examine families of constructions, with a \emph{security parameter} $k$.
We say that a family is (asymptotically) secure in some sense if, for any
polynomial $p(k)$ there is a negligible function $\nu(k)$ such that, for
any construction in the family, parameterized by $k$, no adversary which
runs for time $p(k)$ has success probability better than $\nu(k)$.
\end{slide}

\begin{slide}
\head{The asymptotic approach, \seq}

Suppose we build an encryption scheme from a one-way function. We'd like
to prove that the encryption is good if the one-way function is secure. We
do this by contradiction:
\begin{enumerate}
\item Suppose an adversary $A$ breaks the encryption scheme with
better-than-negligible probability.
\item Show a polynomial-time \emph{reduction}: an algorithm which uses $A$
to break the one-way function, in polynomial time, and with
better-than-negligible probability.
\item Claim that this violates the assumption of a secure one-way function.
\end{enumerate}

This doesn't work with real constructions. We don't know where the
asymptotics set in, and they can conceal large constants. It's still
better than nothing.
\end{slide}

\begin{slide}
\topic{the concrete (quantitative) approach}
\head{The concrete (quantitative) approach}

We constrain the resources we allow the adversary:
\begin{itemize}
\item Running time (including program size).
\item Number of oracle queries.
\item Maximum size of oracle queries.
\end{itemize}
We say that something is \emph{$(t, q, \epsilon)$-secure} if no adversary
which runs in time $t$ and makes $q$ oracle queries can break it with
probability better than $\epsilon$.

We make statements like `foo is $(t, q, 2 q \epsilon)$-secure if bar is $(t
+ O(q), 2 q, \epsilon)$-secure'.

This is a much more satisfactory approach. However, we still have to
\emph{assume} the security of our primitive operations.
\end{slide}

\xcalways\subsection{Notation}\x

\begin{slide}
\topic{boolean operators}
\resetseq
\head{Notation, \seq: boolean operators}

If $P$ and $Q$ are \emph{predicates} -- i.e., either true or false -- then:
\begin{itemize}
\item $P \land Q$ is true if both $P$ \emph{and} $Q$ are true;
\item $P \lor Q$ is true if either $P$ \emph{or} $Q$ (or both) is true;
\item $\lnot P$ is true if $P$ is false; and
\item $P \implies Q$ is true if $Q$ is true or $P$ is false.
\end{itemize}
\end{slide}

\begin{slide}
\topic{sets}
\head{Notation, \seq: sets}

For our purposes, we can think of sets as being collections of objects.

We use the usual notations for set membership ($x \in X$), intersection ($X
\cap Y$), union ($X \cup Y$) and subset containment ($X \subseteq Y$). The
\emph{empty set}, which contains no elements, is written $\emptyset$.

The notation $\{\, f(x) \mid P(x) \,\}$ describes the set containing those
items $f(x)$ for those $x$ for which the predicate $P(x)$ is true.

The \emph{cardinality} $|X|$ of a (finite) set $X$ is the number of
elements in the set.

The \emph{power set} $\powerset(X)$ of a set $X$ is the set of all subsets
of $X$.

The \emph{Cartesian product} of two sets $X \times Y$ is the set of all
ordered pairs $\{\, (x, y) \mid x \in X \land y \in Y \,\}$. We use
exponents to indicate the product of a set with itself: hence, $X^2 = X
\times X$.

A \emph{relation} $R$ is a subset of a Cartesian product. We write $R(x,
y)$ if $(x, y) \in R$. Relations between two sets are often written as
infix symbols: e.g., $x \sim y$.
\end{slide}

\begin{slide}
\head{Notation, \seq: sets (cont.)}

In addition to strings, defined later, we use the following standard sets:
\begin{itemize}
\item the set $\Z$ of integers;
\item the set $\N = \{\, x \in \Z \mid x \ge 0 \,\}$ of natural numbers;
\item the set $\R$ of real numbers;
\item closed intervals $[a, b] = \{\, x \in \R \mid a \le x \le b \,\}$;
\item the finite field $\F_q$ of $q$ elements, and its multiplicative
subgroup $\F_q^* = \F_q \setminus \{0\}$; and
\item the ring $\Z/n\Z$ of residue classes modulo $n$ (i.e., if $x \in
\Z/n\Z$ and $a, b \in x$ then $a \equiv b \pmod{n}$), and its
multiplicative subgroup $(\Z/n\Z)^* = \Z/n\Z \setminus \{\, x + n\Z \mid
\gcd(x, n) > 1 \,\}$.
\end{itemize}
\end{slide}

\begin{slide}
\topic{functions}
\head{Notation, \seq: functions}

A \emph{function} $f\colon X \to Y$ is a mapping which assigns to every
element $x$ in the \emph{domain} $X$ a corresponding element $f(x)$ in the
\emph{range} (or sometimes \emph{codomain}) $Y$. The notation $\dom f$
describes the domain of an arbitrary function; $\ran f$ describes its
range.

We sometimes apply the function notation to sets, indicating that the
function should be applied pointwise; i.e., $f(Z) = \{\, f(z) \mid z \in Z
\,\}$. The \emph{image} of a function $f$ is the set $f(\dom f)$.

If $f\colon X \to Y$ maps distinct elements to distinct elements, i.e.,
$f(x) = f(x') \implies x = x'$ for all $x, x' \in X$, then we say $f$ is
\emph{injective} (or \emph{1-to-1}). If $f(X) = Y$ then we say that it is
\emph{surjective} (or \emph{onto}). If $f$ is both injective and
surjective then we say that it is \emph{bijective}. In this case, there is
a well-defined inverse $f^{-1}\colon Y \to X$ defined by $f(f^{-1}(y)) = y$
for all $y \in Y$.

If $f\colon X \to X$ (i.e., its domain and range are the same set) is
bijective, then we say that $f$ is a \emph{permutation on $X$}.
\end{slide}

\begin{slide}
\head{Notation, \seq: functions (cont.)}

We can consider a function $f\colon X \to Y$ to be a particular sort of
relation $f \subseteq X \times Y$, subject to the constraint that if $(x,
y) \in f$ and $(x, y') \in f$ then $y = y'$.

We shall use this view in some of the algorithms we present. In addition,
we shall use the \emph{maplet} notation $x \mapsto y$ rather than the
ordered-pair notation $(x, y)$ to reinforce the notion of a mapping.

We might write, for example,
\begin{program}
$f \gets f \cup \{ x \mapsto y \}$;
\end{program}
to augment a function by the addition of a new mapping. This is clearly
only valid if $x \notin \dom f$ (or $f(x) = y$) initially.
\end{slide}
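The maplet update in the program above can be sketched in Python, modelling a partial function as a dictionary. The function name and the copy-on-write behaviour are illustrative choices, not part of the notes:

```python
def extend(f, x, y):
    """Return f extended with the maplet x |-> y.

    Valid only if x is not already in dom f, or f already maps x to y,
    matching the side condition in the text.
    """
    if x in f and f[x] != y:
        raise ValueError("inconsistent mapping for %r" % (x,))
    g = dict(f)          # work on a copy: g = f U { x |-> y }
    g[x] = y
    return g

f = {1: "a"}
f = extend(f, 2, "b")    # augment the function with 2 |-> "b"
```

Representing $f$ as a dict makes $\dom f$ simply the set of keys, and the consistency check enforces the ``at most one $y$ per $x$'' constraint of the relational view.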

\begin{slide}
\topic{strings}
\head{Notation, \seq: strings}

An \emph{alphabet} is a finite set of \emph{symbols}. The one we'll use
most of the time is the set $\Sigma = \{0, 1\}$ of \emph{bits}.

Suppose $A$ is an alphabet. The set of sequences of exactly $n$ symbols
from $A$ is written $A^n$. Hence, $\{0, 1\}^{64}$ is the set of all 64-bit
sequences. The set of (finite) \emph{strings} over an alphabet $A$ is $A^*
= \bigcup_{i \in \N} A^i$. The empty string is named $\emptystring$.

The \emph{length} of a string $a \in A^*$, written $|a|$, is the natural
number $n$ where $a \in A^n$.

If $x, y \in A^*$ are strings then their \emph{concatenation}, written $x
\cat y$, or sometimes just $x y$, is the result of writing the symbols in
$x$ followed by the symbols in $y$, in order. We have $|x y| = |x| + |y|$.
\end{slide}

\begin{slide}
\head{Notation, \seq: strings (cont.)}

There are natural (injective) mappings between bit strings and natural
numbers.

If $x = x_0 x_1 \ldots x_{n-1} \in \{0, 1\}^*$ then we can associate with
it the natural number
\[ \overrightarrow{x} = \sum_{0 \le i < n} 2^i x_i. \]

The other natural mapping is
\[ \overleftarrow{x} = \sum_{0 \le i < n} 2^{n-i-1} x_i. \]
It doesn't matter which you choose, as long as you're consistent.

For simplicity's sake, we shall tend to switch between strings and the
numbers (and occasionally more exotic objects) they represent implicitly.
\end{slide}
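The two mappings can be written directly in Python; representing a bit string as a Python str of '0'/'1' characters is a choice made for this sketch:

```python
def lsb_value(bits):
    """The mapping x-> : read x_i as the coefficient of 2^i."""
    return sum(int(b) << i for i, b in enumerate(bits))

def msb_value(bits):
    """The mapping x<- : read x_i as the coefficient of 2^(n-i-1)."""
    n = len(bits)
    return sum(int(b) << (n - 1 - i) for i, b in enumerate(bits))

# '1101' read low-bit-first gives 1 + 2 + 8 = 11;
# read high-bit-first it gives 8 + 4 + 1 = 13.
```

Note that one reading of a string equals the other reading of its reversal, which is one way to see that consistency is all that matters.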

\begin{slide}
\topic{parsing}
\head{Notation, \seq: parsing}

We'll find it useful to be able to break up strings in the algorithms we
present. We use the statement
\begin{program}
\PARSE $x$ \AS $n_0\colon x_0, n_1\colon x_1, \ldots, n_k\colon x_k$;
\end{program}
to mean that the string $x$ is broken up into individual strings $x_0$,
$x_1$, \ldots, $x_k$, such that
\begin{itemize}
\item $x = x_0 \cat x_1 \cat \cdots \cat x_k$; and
\item $|x_i| = n_i$ for all $0 \le i \le k$.
\end{itemize}
We may omit one of the $n_i$ length indicators, since it can be deduced
from the length of the string $x$ and the other $n_j$.
\end{slide}
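A minimal Python model of the parsing statement; the use of None to mark the single omitted length is this sketch's own convention:

```python
def parse(x, lengths):
    """Split x into pieces with the given lengths.

    At most one entry of `lengths` may be None; it is deduced from
    |x| and the other lengths, as in the text.
    """
    if lengths.count(None) > 1:
        raise ValueError("at most one length may be omitted")
    known = sum(n for n in lengths if n is not None)
    lengths = [len(x) - known if n is None else n for n in lengths]
    if sum(lengths) != len(x) or min(lengths) < 0:
        raise ValueError("lengths do not match |x|")
    pieces, i = [], 0
    for n in lengths:
        pieces.append(x[i:i + n])
        i += n
    return pieces
```

For example, `parse('abcdef', [2, None, 1])` yields the pieces `'ab'`, `'cde'`, `'f'`, with the middle length deduced as 3.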

\begin{slide}
\topic{vectors}
\head{Notation, \seq: vectors}

A \emph{vector} $\vect{x}$ is a finite ordered collection of elements from
some set $X$. If $\vect{x}$ contains $n$ elements then we write $\vect{x}
\in X^n$, and that $|\vect{x}| = n$. We write the individual elements as
$\vect{x}[0], \vect{x}[1], \ldots, \vect{x}[n - 1]$.

We shall abuse set membership notation for vectors; i.e., we write $x \in
\vect{x}$ if there is an $i$ ($0 \le i < |\vect{x}|$) such that
$\vect{x}[i] = x$.

When we apply functions to vectors, we mean that they are applied
pointwise, as for sets. Thus, if we write that $\vect{y} =
f(\vect{x})$ then $|\vect{y}| = |\vect{x}|$ and $\vect{y}[i] =
f(\vect{x}[i])$ for all $0 \le i < |\vect{x}|$.
\end{slide}

\begin{slide}
\topic{distributions and randomness}
\head{Notation, \seq: distributions and randomness}

A \emph{probability distribution} over a (countable) set $X$ is a
function $\mathcal{D}\colon X \to [0, 1]$ such that
\[ \sum_{x \in X} \mathcal{D}(x) = 1. \]

The \emph{support} of $\mathcal{D}$, written $\supp \mathcal{D}$, is the
set $\{\, x \in X \mid \mathcal{D}(x) \ne 0 \,\}$; i.e., those elements of
$X$ which occur with nonzero probability.

We write $x \getsr \mathcal{D}$ in algorithms to indicate that $x$ is to be
chosen independently at random, according to the distribution
$\mathcal{D}$. The notation $x \inr \mathcal{D}$ indicates that $x$ has
been chosen independently at random according to $\mathcal{D}$.

The \emph{uniform distribution} over a (finite) set $X$ is
$\mathcal{U}_X\colon X \to [0, 1]$ defined by $\mathcal{U}_X(x) = 1/|X|$
for all $x \in X$. We shall write $x \getsr X$ and $x \inr X$ as
convenient shorthands, meaning $x \getsr \mathcal{U}_X$ and $x \inr
\mathcal{U}_X$ respectively.
\end{slide}

\xcalways\subsection{Background}\x

\begin{slide}
\topic{distinguishability}
\resetseq
\head{Distinguishability, \seq}

Suppose that $\mathcal{X}$ and $\mathcal{Y}$ are two probability
distributions.

Let $X$ be a random variable distributed according to $\mathcal{X}$, and
let $Y$ be a random variable distributed according to $\mathcal{Y}$. We
say that $\mathcal{X}$ and $\mathcal{Y}$ are \emph{identically
distributed}, and write that $\mathcal{X} \equiv \mathcal{Y}$, if, for
all possible values $x$ of $X$, we have
\[ \Pr[X = x] = \Pr[Y = x]. \]

Equivalently, we require that, for all $x \in \supp \mathcal{X}$, we have
\[ x \in \supp \mathcal{Y} \land \mathcal{X}(x) = \mathcal{Y}(x). \]
\end{slide}

\begin{slide}
\head{Distinguishability, \seq: statistical closeness}

Now we generalize the setting slightly. Consider two \emph{families} of
distributions, $\{\mathcal{X}_k\}_{k\in\N}$ and
$\{\mathcal{Y}_k\}_{k\in\N}$, parameterized by a security parameter $k$,
where $\dom\mathcal{X}_k = \dom\mathcal{Y}_k$. To make the asymptotics
work, we require that $|x| \le p(k)$ for some polynomial $p(\cdot)$, for
all $x \in \dom\mathcal{X}_k$.

Fix a value of $k$. Again, let $X$ be distributed according to
$\mathcal{X}_k$, and let $Y$ be distributed according to $\mathcal{Y}_k$.
We say that $\{\mathcal{X}_k\}_{k \in \N}$ and $\{\mathcal{Y}_k\}_{k\in\N}$
are \emph{statistically close}, and write that $\{\mathcal{X}_k\}_{k\in\N}
\statclose \{\mathcal{Y}_k\}_{k\in\N}$, if there is a negligible function
$\nu(\cdot)$ such that, for any $k \in \N$,
\[ \sum_{x\in\dom{\mathcal{X}_k}}
     |{\Pr[X = x]} - \Pr[Y = x]| \le \nu(k). \]%
(Reminder: saying that $\nu\colon \N \to \R$ is \emph{negligible} means
that, for any $c \in \Z$, there is an $n \in \N$ such that $0 \le \nu(k) <
k^{-c}$ for all $k \ge n$.)
\end{slide}
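For small, explicitly given distributions the sum in this definition is easy to compute directly. A Python sketch, using dicts of probabilities as this example's own representation (note that it computes the slide's full sum, not the halved variation distance some texts use):

```python
def statistical_distance(X, Y):
    """The sum over x of |X(x) - Y(x)|, over the common domain."""
    support = set(X) | set(Y)
    return sum(abs(X.get(x, 0.0) - Y.get(x, 0.0)) for x in support)

# A fair coin versus a biased coin:
X = {0: 0.5, 1: 0.5}
Y = {0: 0.75, 1: 0.25}
```

Here the distance is $|0.5 - 0.75| + |0.5 - 0.25| = 0.5$; two disjoint distributions achieve the maximum value 2 under this convention.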

\begin{slide}
\head{Distinguishability, \seq: computational indistinguishability}

We say that two families of distributions are computationally
indistinguishable if no `efficient' algorithm can tell them apart with
better than `negligible' probability.

So, we say that $\{\mathcal{X}_k\}_{k\in\N}$ and
$\{\mathcal{Y}_k\}_{k\in\N}$ are \emph{computationally indistinguishable}
and write that $\{\mathcal{X}_k\}_{k\in\N} \compind
\{\mathcal{Y}_k\}_{k\in\N}$, if, for any probabilistic polynomial-time
algorithm $A$, there is a negligible function $\nu(\cdot)$ such that, for
any $k$:
\[ \Pr[x \getsr \mathcal{X}_k; b \gets A(x) : b = 1] -
   \Pr[y \getsr \mathcal{Y}_k; b \gets A(y) : b = 1] \le \nu(k). \]%
Statistical closeness implies computational indistinguishability.
\end{slide}

\begin{proof}
Let two statistically close distributions $\{\mathcal{X}_k\}_{k\in\N}
\statclose \{\mathcal{Y}_k\}_{k\in\N}$ be given. Fix some $k$, and let $Z
= \dom\mathcal{X}_k = \dom\mathcal{Y}_k$. Now note that the adversary's
advantage is
\[ \sum_{z\in Z} \Pr[b \gets A(z) : b = 1]
     (\mathcal{X}_k(z) - \mathcal{Y}_k(z))
   \le \sum_{z\in Z} |\mathcal{X}_k(z) - \mathcal{Y}_k(z)|
   \le \nu(k). \]%
Hence the two distributions are computationally indistinguishable.
\end{proof}

\begin{slide}
\topic{collisions}
\head{Collisions -- the Birthday `paradox'}

Suppose we throw $q$ balls into $n$ bins at random (with $q \le n$). Let
$C_{q, n}$ be the event that, at the end of this, we have a bin containing
more than one ball -- a \emph{collision}.

Let $B_{q, n}$ be the event that the $q$-th ball collides with a
previous one. Obviously, the worst case for this is when none of the other
balls have collided, so
\[ \Pr[B_{q, n}] \le \frac{q - 1}{n}. \]
Then
\begin{eqnarray*}[rl]
\Pr[C_{q, n}]
&\le \Pr[B_{2, n}] + \Pr[B_{3, n}] + \cdots + \Pr[B_{q, n}] \\
&\le \frac{1}{n} + \frac{2}{n} + \cdots + \frac{q - 1}{n} \\
&= \frac{q(q - 1)}{2 n}.
\end{eqnarray*}
This is an extremely useful result, and we shall need it often.
\end{slide}
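We can check the bound $q(q-1)/2n$ against the exact collision probability, computed from the standard product formula $\Pr[C_{q,n}] = 1 - \prod_{0 \le i < q}(1 - i/n)$, with a short Python calculation:

```python
from fractions import Fraction

def collision_prob(q, n):
    """Exact Pr[C_{q,n}] = 1 - prod_{0 <= i < q} (1 - i/n)."""
    p_distinct = Fraction(1)
    for i in range(q):
        p_distinct *= Fraction(n - i, n)
    return 1 - p_distinct

def birthday_bound(q, n):
    """The slide's union-bound estimate q(q-1)/(2n)."""
    return Fraction(q * (q - 1), 2 * n)
```

With the classic parameters $q = 23$, $n = 365$ the exact probability already exceeds $1/2$, while the bound gives roughly $0.69$: the bound is loose but always on the safe side.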

\xcalways\subsection{Primitives}\x

\begin{slide}
\topic{one-way functions}
\resetseq
\head{One-way functions, \seq: introduction}

Intuition: a one-way function is easy to compute but hard to invert.

Choose a function $f\colon X \to Y$. Let $I$ be a prospective inverter.
Now we play a game:
\begin{enumerate}
\item Choose $x \in X$ at random. Compute $y = f(x)$.
\item Let $x'$ be the output of $I$ when run on input $y$.
\item We say that $I$ `wins' if $f(x') = y$; otherwise it loses.
\end{enumerate}
Note that we don't care whether $x = x'$.

Examples: SHA-1, or $x \mapsto g^x \bmod p$.
\end{slide}

\begin{slide}
\head{One-way functions, \seq: formalism}

The \emph{success probability} of an inverter $I$ against the function $f$
is
\[ \Succ{owf}{f}(I) =
     \Pr[x \getsr X;
         y \gets f(x);
         x' \gets I(y) :
         f(x') = y] \]%
where the probability is taken over all choices of $x$ and all coin flips
made by $I$.

We measure the \emph{insecurity} of a one-way function (OWF) by maximizing
over possible inverters:
\[ \InSec{owf}(f; t) = \max_I \Succ{owf}{f}(I) \]
where the maximum is taken over all $I$ running in time $t$.

If $\InSec{owf}(f; t) \le \epsilon$ then we say that $f$ is a \emph{$(t,
\epsilon)$-secure one-way function}.
\end{slide}
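The inversion game can be run mechanically for toy parameters. The Python sketch below estimates the success probability by Monte Carlo; the example function $f(x) = x^2 \bmod 11$ and the brute-force inverter are purely illustrative (and, on a domain this small, obviously not one-way):

```python
import random

def success_prob(f, domain, inverter, trials=1000, rng=None):
    """Estimate Succ^owf_f(I) by playing the inversion game repeatedly."""
    rng = rng or random.Random(0)
    wins = 0
    for _ in range(trials):
        x = rng.choice(domain)      # x chosen at random from X
        y = f(x)
        xp = inverter(y)
        if f(xp) == y:              # any preimage wins; x' = x not required
            wins += 1
    return wins / trials

# Toy instance: squaring mod 11, inverted by exhaustive search.
f = lambda x: (x * x) % 11
inverter = lambda y: next(x for x in range(11) if (x * x) % 11 == y)
```

Since the inverter searches the whole domain, it always finds some preimage, so its estimated success probability is 1: exhaustive search breaks any one-way function given enough time, which is why the insecurity measure is parameterized by the time bound $t$.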

\begin{slide}
\head{One-way functions, \seq: trapdoors}

Intuition: a \emph{trapdoor} is secret information which makes inverting a
one-way function easy. This is most useful when the one-way function is a
permutation. A trapdoor one-way function generator $\mathcal{T} = (G, f,
T)$ is a triple of algorithms:

\begin{itemize}
\item The probabilistic algorithm $G$ is given some parameter $k$ and
returns a pair $(P, K)$, containing the public parameters $P$ for
computing an instance of the one-way function, and the secret trapdoor
information $K$ for inverting it. We write $(P, K) \in G(k)$.

\item The algorithm $f$ implements the one-way function. That is, if $(P,
K) \in G(k)$ then $f(P, \cdot)$ (usually written $f_P(\cdot)$) is a
one-way function.

\item The algorithm $T$ inverts $f$ using the trapdoor information. That
is, if $(P, K) \in G(k)$ and $y = f_P(x)$ for some $x$, then $y =
f_P(T(K, y))$. We usually write $T_K(\cdot)$ instead of $T(K, \cdot)$.
\end{itemize}
\end{slide}

\begin{slide}
\topic{pseudorandom generators (PRGs)}
\head{Pseudorandom generators (PRGs)}

A pseudorandom generator (PRG) `stretches' an input seed into a longer
string which `looks' random.

Let $G\colon \{0, 1\}^k \to \{0, 1\}^L$ be a function from $k$-bit strings
to $L$-bit strings. The \emph{advantage} of a distinguisher $D$ against
$G$ is:
\begin{eqnarray*}[rl]
\Adv{prg}{G}(D) = &
\Pr[x \getsr \{0, 1\}^k; y \gets G(x) : D(y) = 1] - {}\\
& \Pr[y \getsr \{0, 1\}^L : D(y) = 1].
\end{eqnarray*}
The \emph{insecurity} is simply the maximum advantage:
\[ \InSec{prg}(G; t) = \max_D \Adv{prg}{G}(D) \]
where the maximum is taken over all distinguishers $D$ running in time
$t$. If $\InSec{prg}(G; t) \le \epsilon$ then we also say that $G$ is a
$(t, \epsilon)$-secure PRG\@.
\end{slide}
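A distinguisher's advantage can be estimated empirically by running both experiments from the definition. This Python sketch pits the obvious distinguisher against a deliberately terrible `PRG' that merely repeats its seed; all names here are illustrative:

```python
import random

def prg_advantage(G, k, L, D, trials=2000, rng=None):
    """Estimate Adv^prg_G(D): Pr[D(G(x)) = 1] - Pr[D(y) = 1], y uniform."""
    rng = rng or random.Random(1)
    def bits(n):
        return ''.join(rng.choice('01') for _ in range(n))
    real = sum(D(G(bits(k))) for _ in range(trials)) / trials
    ideal = sum(D(bits(L)) for _ in range(trials)) / trials
    return real - ideal

# A bad 'PRG' doubling its seed, and the distinguisher that beats it
# by checking whether the two halves of its input agree.
k = 16
G = lambda x: x + x
D = lambda y: int(y[:len(y) // 2] == y[len(y) // 2:])
```

The distinguisher always outputs 1 on real PRG output, but a uniform $2k$-bit string has matching halves only with probability $2^{-k}$, so the estimated advantage is close to 1.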

\begin{exercise}
We say that a PRG $g\colon \{0, 1\}^k \to \{0, 1\}^L$ is \emph{trivial} if
$k \ge L$.
\begin{enumerate}
\item Show that trivial PRGs exist.
\item Show that if $g$ is nontrivial, then $g$ is also a one-way function,
with
\[ \InSec{owf}(g; t) \le \InSec{prg}(g; t) + 2^{k-L}. \]
\end{enumerate}
\answer%
\begin{parenum}
\item The identity function $I(x) = x$ is a trivial PRG, with
$\InSec{prg}(I; t) = 0$, as is easily seen from the definition.
\item Suppose $A$ inverts $g$. Then consider adversary $B(y)$: \{ $x \gets
A(y)$; \IF $g(x) = y$ \THEN \RETURN $1$; \ELSE \RETURN $0$;~\}. If $y$
is the output of $g$, then $B$ returns $1$ with probability
$\Succ{owf}{g}(A)$; if $y$ is random in $\{0, 1\}^L$ then there is a
probability at least $1 - 2^{k-L}$ that $y$ has \emph{no} preimage, so
$B$ returns $1$ with probability at most $2^{k-L}$. Hence
$\Adv{prg}{g}(B) \ge \Succ{owf}{g}(A) - 2^{k-L}$, proving the result.
Note that \cite{Wagner:2000:PSU} provides a better security bound than
this simplistic analysis.
\end{parenum}
\end{exercise}

\begin{exercise}
\label{ex:dbl-prg}
Suppose that we have a \emph{length-doubling} PRG $g\colon \{0, 1\}^k \to
\{0, 1\}^{2k}$. Let $g_0(x)$ be the first $k$ bits of $g(x)$ and
$g_1(x)$ be the second $k$ bits. Define the sequence of generators
$g^{(i)}$ (for $i \ge 1$) by
\[ g^{(1)}(x) = g(x); \qquad
g^{(i+1)}(x) = g_0(x) \cat g^{(i)}(g_1(x)). \]%
Relate the security of $g^{(i)}$ to that of $g$.
\answer%
Let $A$ be an adversary running in time $t$ and attacking $g^{(i+1)}$.
Firstly, we attack $g$: consider adversary $B(y)$: \{ \PARSE $y$ \AS $y_0,
k\colon y_1$; $z \gets g^{(i)}(y_1)$; $b \gets A(y_0 \cat z)$; \RETURN
$b$;~\}. Then $\Adv{prg}{g^{(i+1)}}(A) = \Adv{prg}{g}(B) + \delta$, so
$\InSec{prg}(g^{(i+1)}; t) \le \InSec{prg}(g; t) + \delta$, where
\begin{eqnarray*}[rl]
\delta = &\Pr[x_0 \getsr \{0, 1\}^k; x_1 \getsr \{0, 1\}^k;
y \gets g^{(i)}(x_1) : A(x_0 \cat y) = 1] - \\
&\Pr[y \getsr \{0, 1\}^{(i+2)k} : A(y) = 1].
\end{eqnarray*}
We attack $g^{(i)}$ to bound $\delta$: consider adversary $C(y)$: \{ $x_0
\getsr \{0, 1\}^k$; $b \gets A(x_0 \cat y)$; \RETURN $b$;~\}. Now $\delta
\le \Adv{prg}{g^{(i)}}(C) \le \InSec{prg}(g^{(i)}; t)$. So by induction,
\[ \InSec{prg}(g^{(i)}; t) \le i \cdot \InSec{prg}(g; t). \]
\end{exercise}
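The recursive construction $g^{(i)}$ can be sketched in Python. Since the exercise concerns an arbitrary length-doubling PRG, SHA-256 serves here only as a stand-in with the right shape (16 bytes in, 32 bytes out); treating it as a PRG is an assumption made purely for illustration:

```python
import hashlib

K = 16  # seed length in bytes (an illustrative parameter)

def g(x):
    """Stand-in length-doubling PRG: K bytes -> 2K bytes via SHA-256."""
    return hashlib.sha256(x).digest()

def g0(x): return g(x)[:K]   # first half of g(x)
def g1(x): return g(x)[K:]   # second half of g(x)

def g_i(i, x):
    """g^(1)(x) = g(x); g^(i+1)(x) = g0(x) || g^(i)(g1(x))."""
    if i == 1:
        return g(x)
    return g0(x) + g_i(i - 1, g1(x))
```

Each level emits one $K$-byte block and recurses on the other half, so $g^{(i)}$ outputs $(i+1)K$ bytes; the reduction in the answer peels off one level at a time, which is where the factor of $i$ in the insecurity bound comes from.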

\begin{slide}
\topic{advantage}
\head{Notes about advantage}

Advantage is a concept used in many definitions of security notions:
\begin{eqnarray*}[rl]
\Adv{}{}(A) = &
\Pr[\text{$A$ returns 1 in setting $a$}] - {} \\
& \Pr[\text{$A$ returns 1 in setting $b$}].
\end{eqnarray*}
\begin{enumerate}
\item We have $-1 \le \Adv{}{}(A) \le +1$.
\item Zero means that the adversary couldn't distinguish.
\item Negative advantage means the adversary got them the wrong way
around. There is another adversary which uses the same resources but has
positive advantage.
\item \label{item:adv-guess} If $A$ is attempting to guess some hidden bit
$b^* \inr \{0, 1\}$, we have
\[ \Pr[b \gets A : b = b^*] = \frac{\Adv{}{}(A)}{2} + \frac{1}{2}. \]
\end{enumerate}
\end{slide}

\begin{proof}
Let $b$ be the bit that $A$ returns, and let $b^*$ be the experiment's
hidden bit. Then
\[ \Adv{}{}(A) = \Pr[b = 1 \mid b^* = 1] - \Pr[b = 1 \mid b^* = 0]. \]
Addressing the above claims in order:
\begin{enumerate}
\item By definition of probability, $0 \le \Pr[b = 1 \mid b^* = 1] \le 1$
and $0 \le \Pr[b = 1 \mid b^* = 0] \le 1$, so their difference can be
at most $1$ in absolute value.
\item This is a corollary of \ref{item:adv-guess}.
\item Consider the adversary $\bar{A}$ which runs $A$ and returns the
complement bit $\bar{b} = b \xor 1$. Then
\begin{eqnarray*}[rl]
\Adv{}{}(\bar{A})
&= \Pr[\bar{b} = 1 \mid b^* = 1] - \Pr[\bar{b} = 1 \mid b^* = 0] \\
&= \Pr[b = 0 \mid b^* = 1] - \Pr[b = 0 \mid b^* = 0] \\
&= (1 - \Pr[b = 1 \mid b^* = 1]) - (1 - \Pr[b = 1 \mid b^* = 0]) \\
&= \Pr[b = 1 \mid b^* = 0] - \Pr[b = 1 \mid b^* = 1] \\
&= -\Adv{}{}(A).
\end{eqnarray*}
\item Note that $\Pr[b^* = 1] = \Pr[b^* = 0] = \frac{1}{2}$. Then
\begin{eqnarray*}[rl]
\Pr[b = b^*]
&= \Pr[b = 1 \land b^* = 1] + \Pr[b = 0 \land b^* = 0] \\
&= \frac{1}{2}(\Pr[b = 1 \mid b^* = 1] + \Pr[b = 0 \mid b^* = 0]) \\
&= \frac{1}{2}(\Pr[b = 1 \mid b^* = 1] +
   (1 - \Pr[b = 1 \mid b^* = 0])) \\
&= \frac{1}{2}(1 + \Pr[b = 1 \mid b^* = 1] -
   \Pr[b = 1 \mid b^* = 0]) \\
&= \frac{\Adv{}{}(A)}{2} + \frac{1}{2}.
\end{eqnarray*}
\end{enumerate}
All present and correct.
\end{proof}

\begin{slide}
\topic{pseudorandom functions (PRFs)}
\head{Pseudorandom functions (PRFs)}

A \emph{pseudorandom function family} (PRF) is a collection of functions
$F_K\colon \{0, 1\}^l \to \{0, 1\}^L$, where $K$ is some index, typically
from a set of fixed-size strings $\{0, 1\}^k$. We shall often consider a
PRF to be a single function $F\colon \{0, 1\}^k \times \{0, 1\}^l \to \{0,
1\}^L$.

We want to say that $F$ is a strong PRF if adversaries find it hard to
distinguish an instance $F_K$ from a function chosen completely at random
with the same `shape'.

We provide the adversary with an \emph{oracle}, either for a randomly
selected $F_K$, or for a completely random function, and ask it to try and
say which it is given.

We write $\Func{l}{L}$ for the set of \emph{all} functions from $\{0, 1\}^l$
to $\{0, 1\}^L$.
\end{slide}

\begin{slide}
\head{Pseudorandom functions (cont.)}

We define the advantage of a distinguisher $D$ against the PRF $F$ as
follows:
\begin{eqnarray*}[rl]
\Adv{prf}{F}(D) = &
\Pr[K \getsr \{0, 1\}^k : D^{F_K(\cdot)} = 1] - {}\\
& \Pr[R \getsr \Func{l}{L} : D^{R(\cdot)} = 1].
\end{eqnarray*}
The insecurity of the PRF is then measured as
\[ \InSec{prf}(F; t, q) = \max_D \Adv{prf}{F}(D) \]
where the maximum is taken over all distinguishers $D$ which run for time
$t$ and make $q$ oracle queries. As is usual, if $\InSec{prf}(F; t, q)
\le \epsilon$ then we say that $F$ is a $(t, q, \epsilon)$-secure PRF.
\end{slide}
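The PRF distinguishing experiment can be sketched in Python. HMAC-SHA256 stands in for the keyed family (an illustrative choice, not part of the notes), and the random function oracle is lazily sampled, memoizing a fresh random answer for each new query:

```python
import hmac, hashlib, random

def prf_experiment(D, F, k_bytes=16, rng=None):
    """Run D once against F_K and once against a lazily sampled random
    function, returning the difference of its outputs."""
    rng = rng or random.Random(2)

    K = bytes(rng.getrandbits(8) for _ in range(k_bytes))
    real = D(lambda x: F(K, x))          # the F_K(.) oracle

    table = {}
    def R(x):                            # lazy sample from Func(l, L)
        if x not in table:
            table[x] = bytes(rng.getrandbits(8) for _ in range(32))
        return table[x]
    ideal = D(R)                         # the R(.) oracle

    return real - ideal

# Illustrative instantiation: HMAC-SHA256 as F, and a (useless)
# distinguisher that merely checks the oracle is deterministic.
F = lambda K, x: hmac.new(K, x, hashlib.sha256).digest()
D = lambda O: int(O(b'a') == O(b'a'))
```

Both oracles answer repeated queries consistently, so this particular distinguisher learns nothing; estimating a real advantage means averaging this experiment over many independent runs.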

\begin{slide}
\topic{pseudorandom permutations (PRPs)}
\head{Pseudorandom permutations (PRPs)}

We define a \emph{pseudorandom permutation family} (PRP) in a similar way
to the PRFs we've already seen. A PRP is a family of permutations
$F_K\colon \{0, 1\}^L \to \{0, 1\}^L$, indexed by elements of some finite
set, e.g., $\{0, 1\}^k$. We shall often consider a PRP to be a single
function $F\colon \{0, 1\}^k \times \{0, 1\}^L \to \{0, 1\}^L$.

Let $\Perm{L}$ be the set of \emph{all} permutations over the set of
$L$-bit strings $\{0, 1\}^L$.

The advantage of a distinguisher $D$ against the PRP $F$ is
\begin{eqnarray*}[rl]
\Adv{prp}{F}(D) = &
\Pr[K \getsr \{0, 1\}^k : D^{F_K(\cdot)} = 1] - {}\\
& \Pr[R \getsr \Perm{L} : D^{R(\cdot)} = 1].
\end{eqnarray*}

We define $\InSec{prp}(F; t, q) = \max_D \Adv{prp}{F}(D)$ exactly as for
PRFs, and the notion of $(t, q, \epsilon)$-security is the same.
\end{slide}

\begin{slide}
\head{Super pseudorandom permutations}

PRPs are bijective. A \emph{super PRP} is a PRP which remains secure when
the distinguisher is allowed to make inverse queries:
\begin{eqnarray*}[rl]
\Adv{sprp}{F}(D) = &
\Pr[K \getsr \{0, 1\}^k : D^{F_K(\cdot), F_K^{-1}(\cdot)} = 1] - {} \\
& \Pr[R \getsr \Perm{L} : D^{R(\cdot), R^{-1}(\cdot)} = 1].
\end{eqnarray*}
Since there are two oracles, we count queries to both when evaluating the
insecurity:
\[ \InSec{sprp}(F; t, q, q') = \max_D \Adv{sprp}{F}(D) \]
where the maximum is taken over all distinguishers $D$ which run for time
$t$, make $q$ queries to the standard oracle, and $q'$ queries to the
inverse oracle. If $\InSec{sprp}(F; t, q, q') \le \epsilon$ then we say
$F$ is a $(t, q, q', \epsilon)$-secure super PRP\@.
\end{slide}

\begin{exercise}
Note that the key length hasn't appeared anywhere in the definition of
insecurity for a PRP. Derive lower bounds for the insecurity of a PRP with
a $k$-bit key.
\answer%
Let $E\colon \{0, 1\}^k \times \{0, 1\}^L \to \{0, 1\}^L$ be a PRP. Fix
$n$ and $c$. Then consider adversary $S^{E(\cdot)}$: \{ \FOR $i = 0$ \TO
$c - 1$ \DO $y[i] \gets E(i)$; \FOR $K = 0$ \TO $n - 1$ \DO \{ $i \gets 0$;
$\id{good} \gets 1$; \WHILE $i < c \land \id{good} = 1$ \DO \{ \IF $E_K(i)
\ne y[i]$ \THEN $\id{good} \gets 0$; $i \gets i + 1$;~\} \IF $\id{good} =
1$ \THEN \RETURN $1$;~\}~\}. Then $\Adv{prp}{E}(S) \ge n(2^{-k} -
2^{-Lc})$.
\end{exercise}
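To make the exercise concrete, here is a toy Python sketch of the key-search distinguisher from the answer. The `cipher' (XOR with the key) and the 8-bit sizes are assumptions purely for the demo, not a real construction:

```python
import random, secrets

K_BITS = 8                      # toy key size (assumption; real ciphers use >= 128)
L_BITS = 8                      # toy block size (assumption)
N_KEYS = 1 << K_BITS

def E(K, x):
    """Toy 'block cipher': XOR with the key.  A permutation for every K."""
    return x ^ K

def key_search(oracle, c=4):
    """The adversary S: query c points, then try every key in turn."""
    ys = [oracle(i) for i in range(c)]
    for K in range(N_KEYS):
        if all(E(K, i) == ys[i] for i in range(c)):
            return 1            # some key explains all the answers
    return 0

# Against the real cipher under a random key, S always returns 1.
K = secrets.randbelow(N_KEYS)
assert key_search(lambda x: E(K, x)) == 1

# Against a random function, S returns 1 only with probability
# about N_KEYS * 2^(-L*c) -- here 2^8 * 2^(-32).
table = {}
def R(x):
    if x not in table:
        table[x] = random.randrange(1 << L_BITS)
    return table[x]
print(key_search(R))            # almost certainly 0
```

With a full-size key space the same attack still works in principle; it is the $2^k$ running time, not any cleverness of the definition, that protects the cipher.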

\begin{slide}
\resetseq
\head{PRPs are PRFs, \seq}

We model block ciphers as families of PRPs (not super PRPs). Most of the
analysis works best on PRFs, though. We show that a PRP makes a `pretty
good' PRF, as long as it's not over-used.

Let $F$ be any PRP family. Then
\[ \InSec{prf}(F; t, q) \le
\InSec{prp}(F; t, q) + \frac{q(q - 1)}{2^{L+1}}. \]%
This is a useful result. As long as $q^2$ is small compared to $2^L$,
where $L$ is the block size, a PRP makes a good PRF.

The value $2^{L/2}$ is often called the \emph{birthday bound}. We shall
meet it often when we examine modes of operation. We shall examine the
proof, because it illustrates some useful techniques.
\end{slide}
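The $q(q-1)/2^{L+1}$ term is precisely the probability of an output collision, and it is achievable: a distinguisher that simply watches for colliding outputs separates a permutation (which never collides) from a random function (which collides at the birthday rate). A Python sketch, with a toy block size chosen purely for the demo:

```python
import random

L = 16                          # toy block size (assumption); 2^(L/2) = 256
DOMAIN = 1 << L

def collision_distinguisher(oracle, q):
    """Query q distinct points; return 1 iff two answers collide."""
    seen = set()
    for x in range(q):
        y = oracle(x)
        if y in seen:
            return 1            # guess 'random function'
        seen.add(y)
    return 0                    # guess 'permutation'

def random_function():
    table = {}
    def R(x):
        if x not in table:
            table[x] = random.randrange(DOMAIN)
        return table[x]
    return R

def random_permutation():
    perm = list(range(DOMAIN))
    random.shuffle(perm)
    return lambda x: perm[x]

q = 2048                        # well past the birthday bound of 256 queries
assert collision_distinguisher(random_permutation(), q) == 0
assert collision_distinguisher(random_function(), q) == 1   # w.h.p.
```

This shows the bound is essentially tight: once $q$ passes $2^{L/2}$, the collision term dominates and a PRP stops looking like a PRF.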

\begin{slide}
\head{Shoup's lemma}

This handy lemma states that the difference in the probability of some
outcome between two games is bounded above by the probability that the
games differ.

\begin{lemma}[Shoup \cite{Shoup:2001:OAEPR}]
\label{lem:shoup}
If $X$, $Y$ and $F$ are events, and $\Pr[X \land \lnot F] = \Pr[Y \land
\lnot F]$ then $|{\Pr[X]} - \Pr[Y]| \le \Pr[F]$.
\end{lemma}
\begin{proof}
We have:
\begin{eqnarray*}[rll]
\Pr[X] &= \Pr[X \land F] &+ \Pr[X \land \lnot F] \\
\Pr[Y] &= \Pr[Y \land F] &+ \Pr[Y \land \lnot F]
\end{eqnarray*}
Subtracting gives
\[ |{\Pr[X]} - \Pr[Y]| = |{\Pr[X \land F]} -
\Pr[Y \land F]| \le \Pr[F] \]%
as required.
\end{proof}
\end{slide}

\begin{slide}
\head{PRPs are PRFs, \seq: proof}

Let $F\colon \{0, 1\}^k \times \{0, 1\}^L \to \{0, 1\}^L$ be a pseudorandom
permutation. We aim to show that $F$ is also a pseudorandom function.

Let $A$ be an adversary which distinguishes~$F$ from a pseudorandom
function in time~$t$, after making $q$ oracle queries. We consider a
sequence of games $\G{i}$ played with the adversary. In each, let $S_i$ be
the event that the adversary returns~$1$ at the end of the game.

Game~$\G0$ is the `random function' game. We give $A$ an oracle
containing a random function $R \inr \Func{L}{L}$.

Game~$\G1$ is the `PRF' game. We give $A$ an oracle which computes
$F_K(\cdot)$ for some randomly chosen $K \inr \{0, 1\}^k$.

By definition, then,
\[ \Adv{prf}{F}(A) = \Pr[S_1] - \Pr[S_0]. \]
\end{slide}

\begin{slide}
\head{PRPs are PRFs, \seq: proof (cont.)}

Let $x_0, x_1, \ldots, x_{q-1}$ be the oracle queries made by $A$, and let
$y_0, y_1, \ldots, y_{q-1}$ be the corresponding responses.

Game~$\G2$ works in the same way as $\G0$, except that if there is a
\emph{collision} in the query replies (i.e., $y_i = y_j$ but $x_i \ne x_j$
for some $0 \le i, j < q$) then we stop the game immediately. Let $F_2$ be
the event that this occurs.

Because $\G2$ behaves exactly the same as $\G0$ unless $F_2$ occurs, we
must have
\[ \Pr[S_2 \land \lnot F_2] = \Pr[S_0 \land \lnot F_2] \]
so we invoke Lemma~\ref{lem:shoup} and discover that
\[ |{\Pr[S_2]} - \Pr[S_0]| \le \Pr[F_2]. \]
Using the earlier result on collisions, it's easy to see that
\[ \Pr[F_2] \le \frac{q(q - 1)}{2^{L+1}}. \]
\end{slide}

\begin{slide}
\head{PRPs are PRFs, \seq: proof (cont.)}

Game~$\G3$ works in the same way as $\G2$, except that we use a random
permutation $P \inr \Perm{L}$ instead of a random function $R \inr
\Func{L}{L}$. Firstly, note that $F_2$ can't occur in $\G3$. But, if
$F_2$ doesn't occur in $\G2$ (i.e., there is no collision), then the random
function is indistinguishable from a random permutation. So
\[ \Pr[S_3] = \Pr[S_2]. \]

By definition, we have
\[ \Adv{prp}{F}(A) = \Pr[S_1] - \Pr[S_3]. \]
We can now tie all of this together.
\end{slide}

\begin{slide}
\head{PRPs are PRFs, \seq: proof (cont.)}

A simple calculation shows that
\begin{eqnarray*}[rl]
\Adv{prf}{F}(A) &= \Pr[S_1] - \Pr[S_0] \\
&\le \Pr[S_1] - \Pr[S_2] + \Pr[F_2] \\
&= \Pr[S_1] - \Pr[S_3] + \Pr[F_2] \\
&= \Adv{prp}{F}(A) + \Pr[F_2] \\
&\le \InSec{prp}(F; t, q) + \frac{q(q - 1)}{2^{L+1}}.
\end{eqnarray*}
In the second line, we used the bound we computed on the absolute
difference between $\Pr[S_2]$ and $\Pr[S_0]$; in the third, we noted that
$\Pr[S_2] = \Pr[S_3]$; in the fourth, we used the definition of advantage
against a PRP; and in the fifth we used the definition of insecurity for a
PRP.
\end{slide}

\begin{slide}
\head{PRPs are PRFs, \seq: proof (cont.)}

Finally, we imposed no restrictions on $A$, except that it run in time $t$
and make $q$ oracle queries. So our bound
\[ \Adv{prf}{F}(A) \le \InSec{prp}(F; t, q) + \frac{q(q - 1)}{2^{L+1}} \]%
is true for \emph{any} such adversary $A$, and in particular, it's true for
the most successful adversary running with those resource bounds.

Hence, we can maximize, showing that
\[ \InSec{prf}(F; t, q) \le
\InSec{prp}(F; t, q) + \frac{q(q - 1)}{2^{L+1}} \]%
as required.
\end{slide}

\begin{slide}
\topic{hash functions}
\resetseq
\head{Hash functions, \seq: properties}

Hash functions like MD5 and SHA-1 are extremely useful primitives. What
properties do we expect of them? This turns out to be an extremely
difficult question to answer.
\begin{itemize}
\item One-wayness. We've seen a definition for this already. But it's not
enough.
\item Collision-resistance. This is the property usually claimed as the
requirement for use in digital signature systems. We'll look at this
later.
\item Randomness. What does this mean, when the function is completely
public? A distinguishability criterion is clearly hopeless.
\end{itemize}
\end{slide}

\begin{slide}
\head{Hash functions, \seq: Merkle-Damg\aa{}rd iterated hashing
\cite{Damgaard:1990:DPH, Merkle:1991:FSE}}

Let $F\colon \{0, 1\}^{k+L} \to \{0, 1\}^k$ be a \emph{compression}
function. Now consider the function $H\colon \{0, 1\}^* \to \{0, 1\}^k$
which transforms an input string $x$ as follows:
\begin{enumerate}
\item Pad $x$ to a multiple of $L$ bits in some injective way. Divide the
padded message into $L$-bit blocks $x_0$, $x_1$, \ldots, $x_{n-1}$.
\item Fix some $k$-bit constant $I$. Let $I_0 = I$. Define $I_{i+1} =
F(I_i \cat x_i)$ for $0 \le i < n$.
\item The result $H(x) = I_n$.
\end{enumerate}

Suppose we have two strings $x \ne y$, such that $H(x) = H(y)$; i.e., a
\emph{collision}. Then \emph{either} we can find a collision for $F$
\emph{or} a string $z$ for which $F(z) = I$. (This is why initialization
vectors for hash functions have such obviously regular forms.)
\end{slide}
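A minimal Python sketch of the iterated construction may help. The compression function (truncated SHA-256) and the block and chaining-value sizes are arbitrary choices for the demo; the padding appends a marker byte, zeros, and the message length, which makes it injective:

```python
import hashlib

K_BYTES = 16                 # chaining-value size k/8 (assumption for the demo)
L_BYTES = 32                 # block size L/8 (assumption for the demo)
IV = bytes(K_BYTES)          # the fixed constant I

def F(chain, block):
    """Toy compression function {0,1}^(k+L) -> {0,1}^k (truncated SHA-256)."""
    return hashlib.sha256(chain + block).digest()[:K_BYTES]

def pad(x):
    """Injective padding: marker byte, zeros, then the 8-byte message length."""
    n = len(x)
    x += b'\x80'
    x += bytes(-(len(x) + 8) % L_BYTES)
    return x + n.to_bytes(8, 'big')

def H(x):
    x = pad(x)
    I_i = IV                                # I_0 = I
    for off in range(0, len(x), L_BYTES):
        I_i = F(I_i, x[off:off + L_BYTES])  # I_{i+1} = F(I_i || x_i)
    return I_i                              # H(x) = I_n

assert len(H(b'hello')) == K_BYTES
assert H(b'hello') != H(b'hello!')
```

The argument on the slide then says: any collision for `H` traces back, block by block, to a collision for `F` or a preimage of `IV`.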

\begin{proof}
Let $x_0, x_1, \ldots, x_{n-1}$ and $x'_0, x'_1, \ldots, x'_{n'-1}$ be
the $L$-bit blocks of two distinct (padded) messages, and without loss
of generality suppose that $n \ge n'$. Let $I_0 = I'_0 = I$, let
$I_{i+1} = F(I_i \cat x_i)$, and $I'_{i+1} = F(I'_i \cat x'_i)$. We
have $I_n = I'_{n'}$.

We prove the result by induction on $n$. The case $n = 0$ is trivially
true, since there is only one zero-block message. Suppose, then, that the
result is true for messages of fewer than $n$ blocks. There are three
cases to consider. Firstly, if $n' = 0$ then $F(I_{n-1} \cat x_{n-1}) =
I$. Secondly, if $I_{n-1} \ne I'_{n'-1}$ or $x_{n-1} \ne x'_{n'-1}$, then
we have a collision, for $F(I_{n-1} \cat x_{n-1}) = I_n = I'_{n'} =
F(I'_{n'-1} \cat x'_{n'-1})$. Finally, if $I_{n-1} = I'_{n'-1}$ and
$x_{n-1} = x'_{n'-1}$ then we remove the final block from both messages,
and because the remaining messages must differ in at least one block, we
can apply the inductive hypothesis on these shorter messages to complete
the proof.
\end{proof}

\begin{slide}
\head{Hash functions, \seq: any-collision resistance}

The statement usually made about a `good' hash function $H$ is that it
should be `difficult' to find a collision: i.e., two preimages $x \ne y$
where $H(x) = H(y)$. How do we formalize this? Here's one attempt:
\begin{eqlines*}
\Succ{acr}{H}(A) = \Pr[(x, y) \gets A : x \ne y \land H(x) = H(y)]; \\
\InSec{acr}(H; t) = \max_A \Succ{acr}{H}(A).
\end{eqlines*}
But this doesn't work. There clearly \emph{exists} an adversary which
already `knows' a collision for $H$ and just outputs the right answer.
It succeeds very quickly, and with probability 1. So this definition is
impossible to satisfy.
\end{slide}

\begin{slide}
\head{Hash functions, \seq: targetted collision resistance}

The concept of targetted collision resistance is relatively new, but quite
promising. It replaces a single hash function with a \emph{family} of hash
functions. They're not really keyed, because the indices aren't kept
secret.

When making a signature, an index $i$ is chosen at random, and the
signature for message $m$ is formed over the pair $(i, H_i(m))$.

TCR-hash functions are the subject of ongoing research. No practical
designs exist at the moment.
\end{slide}

\begin{slide}
\head{Hash functions, \seq: targetted collision resistance (cont.)}

Consider the following experiment:
\begin{program}
$\Expt{tcr}{H}(A)$: \+ \\
$(x, s) \gets A(\cookie{find})$; \\
$i \getsr \keys H$; \\
$y \gets A(\cookie{collide}, i, s)$; \\
\IF $x \ne y \land H_i(x) = H_i(y)$
\THEN \RETURN $1$; \\
\ELSE \RETURN $0$;
\end{program}
The security of a TCR-hash function is measured as:
\[ \InSec{tcr}(H; t) = \max_A \Pr[\Expt{tcr}{H}(A) = 1] \]
where the maximum is taken over all adversaries running in time $t$. We
define $(t, \epsilon)$-security as usual.
\end{slide}
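The two-phase shape of the experiment -- the adversary commits to $x$ \emph{before} learning the index $i$ -- is the whole point. A Python sketch of the game's flow, with a toy hash family and a naive adversary (both are assumptions for illustration only):

```python
import hashlib, secrets

def H(i, x):
    """Toy hash family (assumption for the demo): truncated SHA-256 of
    index || message."""
    return hashlib.sha256(i + x).digest()[:8]

def tcr_experiment(adversary):
    x, s = adversary('find')                 # adversary commits to x first
    i = secrets.token_bytes(16)              # i <-$ keys(H), chosen afterwards
    y = adversary('collide', i, s)
    return 1 if x != y and H(i, x) == H(i, y) else 0

def naive_adversary(phase, i=None, s=None):
    if phase == 'find':
        return b'some message', None         # the committed target x, plus state
    return b'another message'                # a blind guess at a colliding y

assert tcr_experiment(naive_adversary) == 0  # fails except with probability 2^-64
```

Because $i$ is drawn after $x$ is fixed, a hard-wired collision for one particular index is useless, which is exactly what rescues the definition from the `known collision' objection on the previous slide.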

\begin{slide}
\head{Hash functions, \seq: random oracles \cite{Bellare:1993:ROP}}

In practice, we expect much more than just collision resistance from hash
functions: we expect a certain amount of `random' behaviour. But this is
hard to quantify.

One approach is to model a hash function as a `random oracle', i.e., an
oracle containing a function chosen at random, used by the construction
under consideration, and made available to the adversary. The idea is that
reductions can `capture' the knowledge of an adversary by examining the
queries it makes to its random oracle.

Hash functions \emph{aren't} random oracles. But a random oracle proof is
better than nothing.
\end{slide}

\xcalways\subsection{Standard assumptions}\x

\begin{slide}
\head{Standard assumptions}

There are a number of `standard' assumptions that are made about the
difficulty of various problems:
\begin{itemize}
\item IFP, the Integer Factorization Problem;
\item QRP, the Quadratic Residuosity Problem;
\item DLP, the Discrete Logarithm Problem;
\item RSAP, the RSA Problem; and
\item CDH, the Computational Diffie-Hellman problem and its variants.
\end{itemize}
\cite{Menezes:1997:HAC} has excellent material on the above.
\end{slide}

\begin{slide}
\topic{integer factorization}
\resetseq
\head{The Integer Factorization Problem, \seq}

We often assume that large integers of the form $n = p q$, where $p$ and
$q$ are primes of roughly the same size, are `hard' to factor.
Specifically, there is no algorithm which will factor such an $n$ in time
bounded by a polynomial function of $\log n$.

The difficulty of various other problems, e.g., Quadratic Residuosity, or
RSA, depends on the difficulty of factoring; however, it is not yet known
whether the ability to solve QRP or RSAP can be used to factor.
\end{slide}

\begin{slide}
\head{The Integer Factorization Problem, \seq: square roots}

The problem of extracting square roots modulo $n = p q$ is provably as hard
as factoring. This is the basis of Rabin's public key encryption and
digital signature schemes. We shall analyse these later.

Suppose $Q(n, y)$ is an algorithm which returns an $x$ such that $x^2
\equiv y \pmod{n}$, provided such an $x$ exists. Then we can find a
nontrivial factor of $n$ as follows:
\begin{program}
Algorithm $\id{find-factor}(n)$: \+ \\
\REPEAT \\ \quad\=\+\kill
$x \getsr \{1, 2, \ldots, n - 1\}$; \\
$y \gets x^2 \bmod n$; \\
$x' \gets Q(n, y)$; \\
$p \gets \gcd(x + x', n)$; \\
\IF $p \notin \{1, n\}$ \THEN \RETURN $p$; \- \\
\FOREVER;
\end{program}
\end{slide}
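The reduction can be demonstrated in Python. Here the oracle $Q$ is simulated \emph{using} the secret factors (which is precisely what an attacker couldn't do without such an oracle); the primes are toy values chosen $\equiv 3 \pmod 4$ so that square roots are easy to write down:

```python
import math, random

p, q = 1019, 1187            # toy primes, both congruent to 3 (mod 4)
n = p * q

def Q(n_, y):
    """Square-root oracle, simulated with the secret factors p and q."""
    rp = pow(y, (p + 1) // 4, p)            # a root of y mod p
    rq = pow(y, (q + 1) // 4, q)            # a root of y mod q
    # Stitch the two roots together with the Chinese Remainder Theorem.
    return (rp * q * pow(q, -1, p) + rq * p * pow(p, -1, q)) % n_

def find_factor(n_):
    """The algorithm from the slide: square a random x, ask the oracle for
    a root, and hope the oracle's root isn't +/- x."""
    while True:
        x = random.randrange(1, n_)
        xp = Q(n_, x * x % n_)              # some square root of x^2 mod n
        g = math.gcd(x + xp, n_)
        if g not in (1, n_):
            return g

assert find_factor(n) in (p, q)
```

Each pass succeeds with probability about $1/2$: the oracle's root agrees with $\pm x$ modulo exactly one of the primes half the time, and then the gcd splits $n$.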

\begin{proof}
The program attempts to find two square roots of $y$ mod $n$. It's easy to
see that this might lead to factors of $n$. If $x^2 \equiv x'^2 \pmod{n}$
then $x^2 - x'^2 = k n$ for some integer $k$, so $(x + x')(x - x')$ is a
factorization of $k n$. It remains to show when $\gcd(x + x', n)$ is a
nontrivial factor of $n$.

Let $n$ be an odd composite. If $x \not\equiv \pm y \pmod{n}$ but $x^2
\equiv y^2 \pmod{n}$, then $\gcd(x + y, n)$ is a nontrivial factor of $n$.
(We may assume $\gcd(x, n) = 1$, since otherwise $\gcd(x, n)$ is already a
nontrivial factor of $n$.)

Firstly, we claim that, if $p$ is an odd prime then the congruence $x^2
\equiv y \pmod{p}$ has precisely two solutions $x$, $x'$ such that $x
\equiv -x' \pmod{p}$. Let $g$ be primitive mod $p$, with $x = g^\alpha$,
$x' = g^\beta$. Then $g^{2 \alpha} \equiv g^{2 \beta} \pmod{p}$, so $2
\alpha \equiv 2 \beta \pmod{p - 1}$. But $p - 1$ is even, by hypothesis,
so $\alpha \equiv \beta \pmod{(p - 1)/2}$. But $g^{(p-1)/2} \equiv -1
\pmod{p}$; hence $x \equiv \pm x' \pmod{p}$, proving the claim.

There must exist odd primes $p$, $q$, such that $p|n$ and $q|n$, and $x
\equiv -y \pmod{p}$ and $x \equiv y \pmod{q}$, for if not, then $x \equiv
\pm y \pmod{n}$ contrary to hypothesis. But then $x + y \equiv 0
\pmod{p}$, so $p|(x + y)$; but $x + y \equiv 2 x \not\equiv 0 \pmod{q}$,
since $q$ is odd and $q \nmid x$. Hence, $p$ divides $x + y$, but $q$ does
not, so $\gcd(x + y, n)$ is a nontrivial factor of $n$, completing the
proof.
\end{proof}

\begin{slide}
\topic{quadratic residuosity}
\head{The Quadratic Residuosity Problem}

If there is an $x$ such that $x^2 \equiv y \pmod{n}$ then $y$ is a
\emph{quadratic residue modulo $n$}, and we write $y \in Q_n$; if there is
no such $x$ then $y$ is a \emph{quadratic nonresidue modulo $n$}.

If $p$ is prime, then we can use the \emph{Legendre symbol} to decide
whether $x$ is a quadratic residue mod $p$:
\[ \jacobi{x}{p} = x^{(p-1)/2} \bmod p =
\begin{cases}
0 & if $p$ divides $x$ \\
-1 & if $x$ is a quadratic nonresidue mod $p$ \\
+1 & if $x$ is a quadratic residue mod $p$
\end{cases}. \]%
The \emph{Jacobi symbol} (written the same way) is defined for odd~$n$: if
$n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$ where the $p_i$ are prime, then
\[ \jacobi{x}{n} =
\jacobi{x}{p_1}^{a_1} \jacobi{x}{p_2}^{a_2} \cdots
\jacobi{x}{p_k}^{a_k}. \]%
This can be efficiently computed without knowledge of the factors of $n$
\cite[Section~2.4.5]{Menezes:1997:HAC}.
\end{slide}
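Here is a sketch of that computation in Python, using the standard binary algorithm built from the supplementary law for $2$ and quadratic reciprocity; note that it never factors $n$:

```python
def jacobi(x, n):
    """Jacobi symbol (x/n) for odd n > 0, computed without factoring n."""
    assert n > 0 and n % 2 == 1
    x %= n
    result = 1
    while x:
        # Pull out factors of two: (2/n) = -1 iff n = 3 or 5 (mod 8).
        while x % 2 == 0:
            x //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: flip the sign iff both are 3 (mod 4).
        x, n = n, x
        if x % 4 == 3 and n % 4 == 3:
            result = -result
        x %= n
    return result if n == 1 else 0

# Sanity check against Euler's criterion for a small odd prime.
p = 1009
for a in range(1, 20):
    euler = pow(a, (p - 1) // 2, p)
    assert jacobi(a, p) == (1 if euler == 1 else -1)
```

The whole loop is gcd-shaped, so the cost is $O(\log^2 n)$ bit operations, versus the (conjectured) super-polynomial cost of factoring $n$.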

\begin{slide}
\head{The Quadratic Residuosity Problem (cont.)}

If $\jacobi{x}{n} = -1$ then $x$ is certainly \emph{not} a quadratic
residue mod $n$; however, if $\jacobi{x}{n} = 1$ then $x$ might be a
quadratic residue or it might not; if not, then we say that $x$ is a
\emph{pseudosquare}.

If $n = p q$ is a product of two primes and $x \inr (\Z/n\Z)^*$ is chosen
at random, then
\[ \Pr\Bigl[x \in Q_n \Bigm| \jacobi{x}{n} = 1\Bigr] = \frac{1}{2}, \]
since we have
\[ \jacobi{x}{p} = \jacobi{x}{q} = \pm 1 \]
with each occurring with equal probability.

The problem of distinguishing pseudosquares from quadratic residues is
called the Quadratic Residuosity Problem (QRP). It is not known how to
solve this problem without factoring $n$.
\end{slide}

\begin{slide}
\topic{discrete logarithms}
\head{The Discrete Logarithm Problem}

The (Integer) Discrete Logarithm Problem asks for the solution $x$ given a
congruence of the form $g^x \equiv y \pmod{n}$. This seems to be about as
difficult as factoring. (The ability to solve discrete logs modulo $n$ is
sufficient to factor $n$. The best known algorithms for IFP and DLP have
the same running time.)

The problem generalizes to other cyclic groups, e.g., elliptic curves over
finite fields.
\end{slide}
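For intuition about why only \emph{large} groups are useful, here is Shanks's baby-step giant-step method in Python: it finds discrete logs in $O(\sqrt q)$ time and space, which is instant for toy groups and hopeless at cryptographic sizes. (The prime and exponent below are toy choices for the demo.)

```python
import math

def bsgs(g, y, p, q):
    """Find x with g^x = y (mod p), 0 <= x < q, in O(sqrt(q)) steps."""
    m = math.isqrt(q) + 1
    baby = {pow(g, j, p): j for j in range(m)}   # baby steps: g^j
    giant = pow(g, -m, p)                        # giant step: g^(-m)
    gamma = y % p
    for i in range(m):
        if gamma in baby:                        # y * g^(-im) = g^j
            return i * m + baby[gamma]           # so x = i*m + j
        gamma = gamma * giant % p
    return None

p = 104729                   # toy prime (assumption); real groups are far larger
g, x = 2, 31337
y = pow(g, x, p)
assert pow(g, bsgs(g, y, p, p - 1), p) == y
```

Doubling the bit-length of $q$ squares the work, which is why group sizes of $2^{256}$ and up are chosen in practice.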

\begin{slide}
\topic{self-reducibility}
\resetseq
\head{Self-reducibility, \seq}

The problems of square-root extraction, deciding quadratic residuosity, the
RSA problem, and finding discrete logarithms share the property of being
\emph{randomly self-reducible}; i.e., an instance of the problem can be
transformed into many different derived instances \emph{without skewing the
probability distribution of problem instances}, such that a solution to
one of the derived instances yields a solution to the original one.

This is a good property to have. It implies that `most' problem instances
are as hard as the hardest instances.
\end{slide}

\begin{slide}
\head{Self-reducibility, \seq: the RSA problem \cite{Rivest:1978:MOD}}

The RSA problem is to compute $e$-th roots modulo $n = p q$, where $e$ is
relatively prime to $\phi(n) = (p - 1)(q - 1)$. Suppose that the algorithm
$S(n, e, y)$ returns a value $x$ such that $x^e \equiv y \pmod{n}$ for
`many' choices of $y$, or the special symbol $\bot$ otherwise. The
following probabilistic algorithm then solves the RSA problem for arbitrary
$y$:
\begin{program}
Algorithm $\id{solve-rsa}(n, e, y)$: \+ \\
\REPEAT \\ \quad\=\+\kill
$x' \getsr \{1, 2, \ldots, n - 1\}$; \\
\IF $\gcd(x', n) = 1$ \THEN \\ \quad\=\+\kill
$y' \gets y x'^e \bmod n$; \\
$x \gets S(n, e, y')$; \\
\IF $x \ne \bot$ \THEN \RETURN $x x'^{-1} \bmod n$; \-\- \\
\FOREVER;
\end{program}
\end{slide}
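A Python sketch of this blinding argument, with toy parameters: the partial oracle $S$ here is simulated with the secret exponent and (arbitrarily) refuses odd targets, so the solver must keep blinding $y$ until it gets lucky:

```python
import math, random

p_, q_ = 1009, 1013          # toy primes (assumption)
n, e = p_ * q_, 65537
d = pow(e, -1, (p_ - 1) * (q_ - 1))      # only the simulated oracle knows d

def S(n_, e_, y):
    """Partial oracle: answers for (say) even targets only, else bot (None)."""
    return pow(y, d, n_) if y % 2 == 0 else None

def solve_rsa(n_, e_, y):
    while True:
        r = random.randrange(1, n_)
        if math.gcd(r, n_) != 1:
            continue
        x = S(n_, e_, y * pow(r, e_, n_) % n_)   # blind: y' = y * r^e
        if x is not None:
            return x * pow(r, -1, n_) % n_       # unblind: x = x' * r^-1

y = 424242
assert pow(solve_rsa(n, e, y), e, n) == y
```

Because $y' = y r^e$ is uniformly distributed over the blinded targets, an oracle that works for `many' $y$ works here after a couple of tries in expectation, and $(x' r^{-1})^e \equiv y' r^{-e} \equiv y \pmod n$ recovers the answer.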

\begin{slide}
\topic{the Diffie-Hellman problem}
\head{The Diffie-Hellman problem \cite{Diffie:1976:NDC}}

Let $G = \langle g \rangle$ be a cyclic group of order $q$. Let $\alpha$
and $\beta$ be indices, $\alpha, \beta \in \Z/q\Z$.

The (computational) \emph{Diffie-Hellman} problem is, given $g^\alpha$ and
$g^\beta$, to find $g^{\alpha\beta}$.

The (computational) Diffie-Hellman \emph{assumption} is that there is no
probabilistic algorithm which solves the computational Diffie-Hellman
problem in time polynomial in $\log q$.

Obviously, being able to compute discrete logs in $G$ efficiently would
yield a solution to the Diffie-Hellman problem. But it may be the case
that the Diffie-Hellman problem is easier than the discrete log problem.

The Diffie-Hellman problem is self-reducible.
\end{slide}

\begin{slide}
\head{The Decisional Diffie-Hellman assumption \cite{Boneh:1998:DDP}}

The computational Diffie-Hellman assumption makes a statement only about
algorithms which compute the \emph{entire} answer $g^{\alpha\beta}$. Since
Diffie-Hellman is frequently used for key-exchange, what can we say about
the ability of an adversary to guess any of the bits?

The Decisional Diffie-Hellman (DDH) assumption asserts that, if you don't
know $\alpha$ or $\beta$, then it's hard to tell $g^{\alpha\beta}$ from a
random element of $G$; that is, that the distributions of the following
experiments are computationally indistinguishable:
\begin{program}
$\alpha \getsr \Z/q\Z;$ \\
$\beta \getsr \Z/q\Z;$ \\
\RETURN $(g^\alpha, g^\beta, g^{\alpha\beta})$;
\next
$\alpha \getsr \Z/q\Z;$ \\
$\beta \getsr \Z/q\Z;$ \\
$\gamma \getsr \Z/q\Z;$ \\
\RETURN $(g^\alpha, g^\beta, g^\gamma)$;
\end{program}
\end{slide}

\begin{slide}
\head{The Decisional Diffie-Hellman assumption (cont.)}

If $A$ is an algorithm attempting to solve DDH in the group $G$, then we
write its advantage as
\begin{eqnarray*}[rl]
\Adv{ddh}{G}(A) =
& \Pr[\alpha \getsr \Z/q\Z; \beta \getsr \Z/q\Z :
A(g^\alpha, g^\beta, g^{\alpha\beta}) = 1] - {} \\
& \Pr[\alpha \getsr \Z/q\Z; \beta \getsr \Z/q\Z;
\gamma \getsr \Z/q\Z :
A(g^\alpha, g^\beta, g^\gamma) = 1]
\end{eqnarray*}
and the insecurity of DDH in $G$ as
\[ \InSec{ddh}(G; t) = \max_A \Adv{ddh}{G}(A) \]
with the maximum over all algorithms $A$ running in time $t$.
\end{slide}

\begin{slide}
\head{The Decisional Diffie-Hellman assumption (cont.)}

If you can solve the computational Diffie-Hellman problem, you can solve
the decisional version. If you can compute discrete logs, you can solve
both.

There \emph{are} groups in which the computational problem is (believed to
be) hard, but the decisional problem is easy. In particular, if the group
order $q$ has small factors then the decisional problem isn't hard.
\end{slide}
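The classic example is $(\Z/p\Z)^*$, whose order $p - 1$ is even: Euler's criterion leaks the parity of the exponent, and that single bit already gives a DDH distinguisher with advantage about $1/2$. A Python sketch (the prime is a toy choice for the demo):

```python
import random

p = 1000003                     # toy prime (assumption); group order p - 1 is even

def is_qr(x):
    """Euler's criterion: x is a square mod p iff x^((p-1)/2) = 1."""
    return pow(x, (p - 1) // 2, p) == 1

g = next(x for x in range(2, p) if not is_qr(x))   # a quadratic nonresidue

def ddh_guess(ga, gb, gc):
    """g^{ab} is a square unless both a and b are odd; check gc against that."""
    return 1 if is_qr(gc) == (is_qr(ga) or is_qr(gb)) else 0

def estimate_advantage(trials=2000):
    wins = 0
    for _ in range(trials):
        a, b, c = (random.randrange(p - 1) for _ in range(3))
        wins += ddh_guess(pow(g, a, p), pow(g, b, p), pow(g, a * b % (p - 1), p))
        wins -= ddh_guess(pow(g, a, p), pow(g, b, p), pow(g, c, p))
    return wins / trials

print(estimate_advantage())     # about 1/2: a large, cheap advantage
```

This is why DDH-based schemes work in prime-order subgroups (or elliptic-curve groups of prime order), where no such small-factor leak exists.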

\endinput

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "ips"
%%% End: