[doc/ips] / basics.tex

\xcalways\section{Basics}\x

\xcalways\subsection{Introduction and motivation}\x

\begin{slide}
  \topic{joke}
  \head{Mandatory joke}
  
  An elderly Frenchman rises every morning at 5, goes out to the street in
  front of his house, and sprinkles a white powder up and down the street.
  
  One day, a neighbour, who has watched his routine for many years, confronts
  him.  `What is this powder you sprinkle on the street every morning,
  Pierre?'
  
  `It is elephant powder, \emph{mon ami},' the gentleman says.  `It keeps the
  elephants away.'
  
  `But Pierre,' says the neighbour.  `Everybody knows that there are no
  elephants in France.'
  
  Says the older man matter-of-factly, `I guess it must be working, then.'
\end{slide}

The joke has its mandatory corresponding serious message.  Cryptography is
like elephant powder.  If it seems to work, and keeps the attackers -- our
elephants -- away, then we trust it.  This isn't really very satisfactory.
When the elephant powder fails, a 2-ton elephant turns up, with three of its
chums in a Mini, and you notice it hiding upside-down in your bowl of
custard.  Let's face it: elephants aren't good at being surreptitious.

But if your cryptography is no good, you may never know.

\begin{slide}
  \topic{serious message}
  \head{Cryptography is elephant powder}

  So, what can we do about the situation?
  \begin{itemize}
  \item Design simple cryptographic primitives, e.g., block ciphers, hash
    functions.  Develop techniques for analysing them, and attempt to stay
    ahead of the bad guy.
  \item Build useful constructions from trusted primitives in a modular way.
    \emph{Prove that the constructions are secure.}
  \end{itemize}
  Here we look at this second part of the approach.
\end{slide}

\xcalways\subsection{Approach}\x

\begin{slide}
  \head{The provable security approach}

  \begin{itemize}
  \item Define notions of security.  Now we know what to aim for, and what to
    expect of our components.
  \item Prove how the security of a construction relates to the security of
    its primitives.  If it breaks, we know who to blame.
  \end{itemize}
\end{slide}

\begin{slide}
  \topic{adversaries}
  \head{Adversaries}

  We model our adversary as a \emph{probabilistic algorithm}; i.e., the
  algorithm is allowed to \emph{flip coins} to make decisions.  The
  adversary's output (for some input) follows a probability distribution.  We
  define what it means for an adversary to \emph{break} our construction, and
  examine the probability with which this happens.

  We provide the adversary with \emph{oracles}:
  \begin{itemize}
  \item Oracles compute using secrets hidden from the adversary.
  \item We count the number of \emph{queries} made to an oracle.
  \item We can restrict the types of queries the adversary makes.
  \end{itemize}

  Oracles are written as superscripts.  For example, an adversary given a
  chosen-plaintext oracle might be written as $A^{E_K(\cdot)}$.
\end{slide}

\begin{slide}
  \topic{the asymptotic approach}
  \resetseq
  \head{The asymptotic approach, \seq}
  
  A function $\nu\colon \N \to \R$ is \emph{negligible} if, for any integer
  $c$, there exists an $n \in \N$ such that $0 \le \nu(k) < k^{-c}$ for all
  $k \ge n$.  That is, $\nu(k)$ is `eventually' less than any polynomial
  function of $k$.
  
  We examine families of constructions, with a \emph{security parameter} $k$.
  We say that a function is (asymptotically) secure in some sense if, for any
  polynomial $p(k)$ there is a negligible function $\nu(k)$ such that, for
  any construction in the family, parameterized by $k$, no adversary which
  runs for time $p(k)$ has success probability better than $\nu(k)$.
\end{slide}

\begin{slide}
  \head{The asymptotic approach, \seq}
  
  Suppose we build an encryption scheme from a one-way function.  We'd like
  to prove that the encryption is good if the one-way function is secure.  We
  do this by contradiction:
  \begin{enumerate}
  \item Suppose an adversary $A$ breaks the encryption scheme with
    better-than-negligible probability.
  \item Show a polynomial-time \emph{reduction}: an algorithm which uses $A$
    to break the one-way function, in polynomial time, and with
    better-than-negligible probability,
  \item Claim that this violates the assumption of a secure one-way function.
  \end{enumerate}
  
  This doesn't work with real constructions.  We don't know where the
  asymptotics set in, and they can conceal large constants.  It's still
  better than nothing.
\end{slide}

\begin{slide}
  \topic{the concrete (quantitative) approach}
  \head{The concrete (quantitative) approach}

  We constrain the resources we allow the adversary:
  \begin{itemize}
  \item Running time (including program size).
  \item Number of oracle queries.
  \item Maximum size of oracle queries.
  \end{itemize}
  Write that something is \emph{$(t, q, \epsilon)$-secure} if no adversary
  which runs in time $t$ and makes $q$ oracle queries can break it with
  probability better than $\epsilon$.
  
  We make statements like `foo is $(t, q, 2 q \epsilon)$-secure if bar is $(t
  + O(q), 2 q, \epsilon)$-secure'. 
  
  This is a much more satisfactory approach.  However, we still have to
  \emph{assume} the security of our primitive operations.
\end{slide}

\xcalways\subsection{Notation}\x

\begin{slide}
  \topic{boolean operators}
  \resetseq
  \head{Notation, \seq: boolean operators}

  If $P$ and $Q$ are \emph{predicates} -- i.e., either true or false -- then:
  \begin{itemize}
  \item $P \land Q$ is true if both $P$ \emph{and} $Q$ is true;
  \item $P \lor Q$ is true if either $P$ \emph{or} $Q$ (or both) is true;
  \item $\lnot P$ is true if $P$ is false; and
  \item $P \implies Q$ is true if $Q$ is true or $P$ is false.
  \end{itemize}
\end{slide}

\begin{slide}
  \topic{sets}
  \head{Notation, \seq: sets}

  For our purposes, we can think of sets as being collections of objects.
  
  We use the usual notations for set membership ($x \in X$), intersection ($X
  \cap Y$), union ($X \cup Y$) and subset containment ($X \subseteq Y$).  The
  \emph{empty set}, which contains no elements, is written $\emptyset$.
  
  The notation $\{\, f(x) \mid P(x) \,\}$ describes the set containing those
  items $f(x)$ for those $x$ for which the predicate $P(x)$ is true.

  The \emph{cardinality} $|X|$ of a (finite) set $X$ is the number of
  elements in the set.
  
  The power set $\powerset(X)$ of a set $X$ is the set of all subsets of $X$.
  
  The \emph{Cartesian product} of two sets $X \times Y$ is the set of all
  ordered pairs $\{\, (x, y) \mid x \in X \land y \in Y \,\}$.  We use
  exponents to indicate the product of a set with itself: hence, $X^2 = X
  \times X$.
  
  A \emph{relation} $R$ is a subset of a Cartesian product.  We write $R(x,
  y)$ if $(x, y) \in R$.  Relations between two sets are often written as
  infix symbols: e.g., $x \sim y$.
\end{slide}

\begin{slide}
  \head{Notation, \seq: sets (cont.)}
  
  In addition to strings, defined later, we use the following standard sets:
  \begin{itemize}
  \item the set $\Z$ of integers;
  \item the set $\N = \{\, x \in \Z \mid x \ge 0 \,\}$ of natural numbers;
  \item the set $\R$ of real numbers;
  \item closed intervals $[a, b] = \{\, x \in \R \mid a \le x \le b \,\}$;
  \item the finite field $\F_q$ of $q$ elements, and its multiplicative
    subgroup $\F_q^* = \F_q \setminus \{0\}$; and
  \item the ring $\Z/n\Z$ of residue classes modulo $n$ (i.e., if $x \in
    \Z/n\Z$ and $a, b \in x$ then $a \equiv b \pmod{n}$), and its
    multiplicative subgroup $(\Z/n\Z)^* = \Z/n\Z - \{\, x + n\Z \mid \gcd(x,
    n) > 1 \,\}$.
  \end{itemize}
\end{slide}
  
\begin{slide}
  \topic{functions}
  \head{Notation, \seq: functions}

  A \emph{function} $f\colon X \to Y$ is a mapping which assigns every
  element $x$ in the \emph{domain} $X$ a corresponding element $f(x)$ in the
  \emph{range} (or sometimes \emph{codomain}) $Y$.  The notation $\dom f$
  describes the domain of an arbitrary function; $\ran f$ describes its
  range.

  We sometimes apply the function notation to sets, indicating that the
  function should be applied pointwise; i.e., $f(Z) = \{ f(z) \mid z \in Z
  \}$.  The \emph{image} of a function $f$ is the set $f(\dom f)$.
  
  If $f\colon X \to Y$ preserves equality, i.e., $f(x) = f(x') \implies x =
  x'$ for all $x, x' \in X$, then we say $f$ is \emph{injective} (or
  \emph{1-to-1}).  If $f(X) = Y$ then we say that it is \emph{surjective} (or
  \emph{onto}).  If $f$ is both injective and surjective then we say that it
  is \emph{bijective}.  In this case, there is a well-defined inverse
  $f^{-1}\colon Y \to X$ defined by $f(f^{-1}(y)) = y$ for all $y \in Y$.
  
  If $f\colon X \to X$ (i.e., its domain and range are the same set) is
  bijective, then we say that $f$ is a \emph{permutation on $X$}.
\end{slide}

\begin{slide}
  \head{Notation, \seq: functions (cont.)}

  We can consider a function $f\colon X \to Y$ to be a particular sort of
  relation $f \subseteq X \times Y$, subject to the constraint that if $(x,
  y) \in f$ and $(x, y') \in f$ then $y = y'$.

  We shall use this view in some of the algorithms we present.  In addition,
  we shall use the \emph{maplet} notation $x \mapsto y$ rather than the
  ordered pair notation $(x, y)$ to reinforce the notion of a mapping.

  We might write, for example,
  \begin{program}
    $f \gets f \cup \{ x \mapsto y \}$;
  \end{program}
  to augment a function by the addition of a new mapping.  This is clearly
  only valid if $x \notin \dom f$ (or $f(x) = y$) initially.
\end{slide}

\begin{slide}
  \topic{strings}
  \head{Notation, \seq: strings}
  
  An \emph{alphabet} is a finite set of \emph{symbols}.  The one we'll use
  most of the time is the set $\Sigma = \{0, 1\}$ of \emph{bits}.
  
  Suppose $A$ is an alphabet.  The set of sequences of exactly $n$ symbols
  from $A$ is written $A^n$.  Hence, $\{0, 1\}^{64}$ is the set of all 64-bit
  sequences.  The set of (finite) \emph{strings} over an alphabet $A$ is $A^*
  = \bigcup_{i \in \N} A^i$.  The empty string is named $\emptystring$.
  
  The \emph{length} of a string $a \in A^*$, written $|a|$, is the natural
  number $n$ where $a \in A^n$.
  
  If $x, y \in A^*$ are strings then their \emph{concatenation}, written $x
  \cat y$, or sometimes just $x y$, is the result of writing the symbols in
  $x$ followed by the symbols in $y$, in order.  We have $|x y| = |x| + |y|$.
\end{slide}

\begin{slide}
  \head{Notation, \seq: strings (cont.)}
  
  There are natural (injective) mappings between bit strings and natural
  numbers.
  
  If $x = x_0 x_1 \ldots x_{n-1} \in \{0, 1\}^*$ then we can associate with
  it the natural number
  \[ \overrightarrow{x} = \sum_{0 \le i < n} 2^i x_i. \]

  The other natural mapping is
  \[ \overleftarrow{x} = \sum_{0 \le i < n} 2^{n-i-1} x_i. \]
  It doesn't matter which you choose, as long as you're consistent.
  
  For simplicity's sake, we shall tend to switch between strings and the
  numbers (and occasionally more exotic objects) they represent implicitly.
\end{slide}

\begin{slide}
  \topic{parsing}
  \head{Notation, \seq: parsing}

  We'll find it useful to be able to break up strings in the algorithms we
  present.  We use the statement
  \begin{program}
    \PARSE $x$ \AS $n_0\colon x_0, n_1\colon x_1, \ldots, n_k\colon x_k$;
  \end{program}
  to mean that the string $x$ is broken up into individual strings $x_0$,
  $x_1$, \ldots, $x_k$, such that
  \begin{itemize}
  \item $x = x_0 \cat x_1 \cat \cdots \cat x_k$; and
  \item $|x_i| = n_i$ for all $0 \le i \le k$.
  \end{itemize}
  We may omit one of the $n_i$ length indicators, since it can be deduced
  from the length of the string $x$ and the other $n_j$.
\end{slide}

\begin{slide}
  \topic{vectors}
  \head{Notation, \seq: vectors}
  
  A \emph{vector} $\vect{x}$ is a finite ordered collection of elements from
  some set $X$.  If $\vect{x}$ contains $n$ elements then we write $\vect{x}
  \in X^n$, and that $|\vect{x}| = n$.  We write the individual elements as
  $\vect{x}[0], \vect{x}[1], \ldots, \vect{x}[n - 1]$.

  We shall abuse set membership notation for vectors; i.e., we write $x \in
  \vect{x}$ if there is an $i$ ($0 \le i < |\vect{x}|$) such that
  $\vect{x}[i] = x$.

  When we apply functions to vectors, we mean that they are applied
  pointwise, as for sets.  Thus, if we write that $\vect{y} =
  f(\vect{x})$ then $|\vect{y}| = |\vect{x}|$ and $\vect{y}[i] =
  f(\vect{x}[i])$ for all $0 \le i < |\vect{x}|$.
\end{slide}

\begin{slide}
  \topic{distributions and randomness}
  \head{Notation, \seq: distributions and randomness}
  
  A \emph{probability distribution} over a (countable) set $X$ is a
  function $\mathcal{D}\colon X \to [0, 1]$ such that
  \[ \sum_{x \in X} \mathcal{D}(x) = 1. \]
  
  The \emph{support} of $\mathcal{D}$, written $\supp \mathcal{D}$, is the
  set $\{ x \in X \mid \mathcal{D}(x) \ne 0 \}$; i.e., those elements of $X$
  which occur with nonzero probability.
  
  We write $x \getsr \mathcal{D}$ in algorithms to indicate that $x$ is to be
  chosen independently at random, according to the distribution
  $\mathcal{D}$.  The notation $x \inr \mathcal{D}$ indicates that $x$ has
  been chosen at independently at random according to $\mathcal{D}$.
  
  The \emph{uniform distribution} over a (finite) set $X$ is
  $\mathcal{U}_X\colon X \to [0, 1]$ defined by $\mathcal{U}_X(x) = 1/|X|$
  for all $x \in X$.  We shall write $x \getsr X$ and $x \inr X$ as
  convenient shorthands, meaning $x \getsr \mathcal{U}_X$ and $x \inr
  \mathcal{U}_X$ respectively.
\end{slide}

\xcalways\subsection{Background}\x

\begin{slide}
  \topic{distinguishability}
  \resetseq
  \head{Distinguishability, \seq}

  Suppose that $\mathcal{X}$ and $\mathcal{Y}$ are two probability
  distributions.
  
  Let $X$ be a random variable distributed according to $\mathcal{X}$, and
  let $Y$ be a random variable distributed according to $\mathcal{Y}$.  We
  say that $\mathcal{X}$ and $\mathcal{Y}$ are \emph{identically
    distributed}, and write that $\mathcal{X} \equiv \mathcal{Y}$, if, for
  all possible values $x$ of $X$, we have
  \[ \Pr[X = x] = \Pr[Y = x]. \]

  Equivalently, we require that, for all $x \in \supp \mathcal{X}$ we have
  \[ x \in \supp \mathcal{Y} \land \mathcal{X}(x) = \mathcal{Y}(y). \]
\end{slide}

\begin{slide}
  \head{Distinguishability, \seq: statistical closeness}
  
  Now we generalize the setting slightly.  Consider two \emph{families} of
  distributions, $\{\mathcal{X}_k\}_{k\in\N}$ and
  $\{\mathcal{Y}_k\}_{k\in\N}$, parameterized by a security parameter $k$,
  where $\dom\mathcal{X}_k = \dom\mathcal{Y}_k$.  To make the asymptotics
  work, we require that $|x| \le p(k)$ for some polynomial $p(\cdot)$, for
  all $x \in \dom\mathcal{X}_k$.
  
  Fix a value of $k$.  Again, let $X$ be distributed according to
  $\mathcal{X}_k$, and let $Y$ be distributed according to $\mathcal{Y}_k$.
  We say that $\{\mathcal{X}_k\}_{k \in \N}$ and $\{\mathcal{Y}_k\}_{k\in\N}$
  are \emph{statistically close}, and write that $\{\mathcal{X}_k\}_{k\in\N}
  \statclose \{\mathcal{Y}_k\}_{k\in\N}$, if there is a negligible function
  $\nu(\cdot)$ such that, for any $k \in \N$,
  \[ \sum_{x\in\dom{\mathcal{X}_k}}
     |{\Pr[X = x]} - \Pr[Y = x]| \le \nu(k). \]%
  (Reminder: Saying that $\nu\colon \N \to \R$ is \emph{negligible} means
  that, for any $c \in \Z$ there is an $n \in N$ such that $\nu(k) <
  k^{-c}$.)
\end{slide}

\begin{slide}
  \head{Distinguishability, \seq: computational indistinguishability}

  We say that two families of distributions are computationally
  indistinguishable if no `efficient' algorithm can tell them apart with
  better than `negligible' probability.
  
  So, we say that $\{\mathcal{X}_k\}_{k\in\N}$ and
  $\{\mathcal{Y}_k\}_{k\in\N}$ are \emph{computationally indistinguishable}
  and write that $\{\mathcal{X}_k\}_{k\in\N} \compind
  \{\mathcal{Y}_k\}_{k\in\N}$, if, for any probabilistic polynomial-time
  algorithm $A$, there is a negligible function $\nu(\cdot)$ such that, for
  any $k$:
  \[ \Pr[x \getsr \mathcal{X}_k; b \gets A(x) : b = 1] -
     \Pr[y \getsr \mathcal{Y}_k; b \gets A(y) : b = 1] \le \nu(k). \]%
  Statistical closeness implies computational indistinguishability.
\end{slide}

\begin{proof}
  Let two statistically close distributions $\{\mathcal{X}_k\}_{k\in\N}
  \statclose \{\mathcal{Y}_k\}_{y\in\N}$ be given.  Fix some $k$, and let $Z
  = \dom\mathcal{X}_k = \dom\mathcal{Y}_k$.  Now note that the adversary's
  advantage is given by $\sum_{z\in Z} \Pr[b \gets A(z) : b = 1]
  |\mathcal{X}_k(z) - \mathcal{Y}_k(z)| \le \sum_{z\in Z} |\mathcal{X}_k(z) -
  \mathcal{Y}_k(z)| \le \nu(k)$.  Hence the two distributions are
  computationally indistinguishable.
\end{proof}

\begin{slide}
  \topic{collisions}
  \head{Collisions -- the Birthday `paradox'}
  
  Suppose we throw $q$ balls into $n$ bins at random (with $q \le n$).  Let
  $C_{q, n}$ be the event that, at the end of this, we have a bin containing
  more than one ball -- a \emph{collision}.
  
  Let $B_{q, n}$ be the event that the $q$-th ball collides with a
  previous one.  Obviously, the worst case for this is when none of the other
  balls have collided, so
  \[ \Pr[B_{q, n}] \le \frac{q - 1}{n}. \]
  Then
  \begin{eqnarray*}[rl]
    \Pr[C_{q, n}]
    &\le \Pr[B_{2, n}] + \Pr[B_{3, n}] + \cdots + \Pr[B_{q, n}] \\
    &\le \frac{1}{n} + \frac{2}{n} + \cdots + \frac{q - 1}{n} \\
    &=   \frac{q(q - 1)}{2 n}.
  \end{eqnarray*}
  This is an extremely useful result, and we shall need it often.
\end{slide}

\xcalways\subsection{Primitives}\x

\begin{slide}
  \topic{one-way functions}
  \resetseq
  \head{One-way functions, \seq: introduction}

  Intuition: a one-way function is easy to compute but hard to invert.

  Choose a function $f\colon X \to Y$.  Let $I$ be a prospective inverter.
  Now we play a game:
  \begin{enumerate}
  \item Choose $x \in X$ at random.  Compute $y = f(x)$.
  \item Let $x'$ be the output of $I$ when run on input $y$.
  \item We say that $I$ `wins' if $f(x') = y$; otherwise it loses.
  \end{enumerate}
  Note that we don't care whether $x = x'$.

  Examples: SHA-1, or $x \mapsto g^x \bmod p$.
\end{slide}

\begin{slide}
  \head{One-way functions, \seq: formalism}

  The \emph{success probability} of an inverter $I$ against the function $f$
  is
  \[ \Succ{owf}{f}(I) =
     \Pr[x \getsr X;
         y \gets f(x);
         x' \gets I(y) :
         f(x') = y] \]%
  where the probability is taken over all choices of $x$ and all coin flips
  made by $I$.
  
  We measure the \emph{insecurity} of a one-way function (OWF) by maximizing
  over possible inverters:
  \[ \InSec{owf}(f; t) = \max_I \Succ{owf}{f}(I) \]
  where the maximum is taken over all $I$ running in time $t$.

  If $\InSec{owf}(f; t) \le \epsilon$ then we say that $f$ is a \emph{$(t,
  \epsilon)$-secure one-way function}.
\end{slide}

\begin{slide}
  \head{One-way functions, \seq: trapdoors}
  
  Intuition: a \emph{trapdoor} is secret information which makes inverting a
  one-way function easy.  This is most useful when the one-way function is a
  permutation.  A trapdoor one-way function generator $\mathcal{T} = (G, f,
  T)$ is a triple of algorithms:

  \begin{itemize}
  \item The probabilistic algorithm $G$ is given some parameter $k$ and
    returns a pair $(P, K)$, containing the public parameters $P$ for
    computing an instance of the one-way function, and the secret trapdoor
    information $K$ for inverting it.  We write $(P, K) \in G(k)$.
    
  \item The algorithm $f$ implements the one-way function.  That is, if $(P,
    K) \in G(k)$ then $f(P, \cdot)$ (usually written $f_P(\cdot)$) is a
    one-way function.
    
  \item The algorithm $T$ inverts $f$ using the trapdoor information.  That
    is, if $(P, K) \in G(k)$ and $y = f_P(x)$ for some $x$, then $y =
    f_P(T(K, y))$.  We usually write $T_K(\cdot)$ instead of $T(K, \cdot)$.
  \end{itemize}
\end{slide}

\begin{slide}
  \topic{pseudorandom generators (PRGs)}
  \head{Pseudorandom generators (PRGs)}
  
  A pseudorandom generator (PRG) `stretches' an input seed into a longer
  string which `looks' random.
  
  Let $G\colon \{0, 1\}^k \to \{0, 1\}^L$ be a function from $k$-bit strings
  to $L$-bit strings.  The \emph{advantage} of a distinguisher $D$ against
  $G$ is:
  \begin{eqnarray*}[rl]
    \Adv{prg}{G}(D) = &
       \Pr[x \getsr \{0, 1\}^k; y \gets G(x) : D(y) = 1] - {}\\
    &  \Pr[y \getsr \{0, 1\}^L : D(y) = 1].
  \end{eqnarray*}
  The \emph{insecurity} is simply the maximum advantage:
  \[ \InSec{prg}(G; t) = \max_D \Adv{prg}{G}(D) \]
  where the maximum is taken over all distinguishers $D$ running in time
  $t$.  If $\InSec{prg}(G; t) \le \epsilon$ then we also say that $G$ is a
  $(t, \epsilon)$-secure PRG\@.
\end{slide}

\begin{exercise}
  We say that a PRG $g\colon \{0, 1\}^k \to \{0, 1\}^L$ is \emph{trivial} if
  $k \ge L$.
  \begin{enumerate}
  \item Show that trivial PRGs exist.
  \item Show that if $g$ is nontrivial, then $g$ is also a one-way function,
    with
    \[ \InSec{owf}(g; t) \le \InSec{prg}(g; t) + 2^{k-L}. \]
  \end{enumerate}
  \answer%
  \begin{parenum}
  \item The identity function $I(x) = x$ is a trivial PRG, with
    $\InSec{prg}(I, t) = 0$, as is easily seen from the definition.
  \item Suppose $A$ inverts $g$.  Then consider adversary $B(y)$: \{ $x \gets
    A(y)$; \IF $g(x) = y$ \THEN \RETURN $1$; \ELSE \RETURN $0$;~\}.  If $y$
    is the output of $g$, then $A$ inverts $y$ with probability
    $\Succ{owf}{g}(A)$; if $y$ is random in $\{0, 1\}^L$ then there is a
    probability at least $1 - 2^{k-L}$ that $y$ has \emph{no} inverse,
    proving the result.  Note that \cite{Wagner:2000:PSU} provides a better
    security bound than this simplistic analysis.
  \end{parenum}
\end{exercise}

\begin{exercise}
  \label{ex:dbl-prg}
  Suppose that we have a \emph{length-doubling} PRG $g\colon \{0, 1\}^k \to
  \{0, 1\}^{2k}$.  Let $g_0(\cdot)$ be the first $k$ bits of $g(x)$ and
  $g_1(\cdot)$ be the second $k$ bits.  Define the sequence of generators
  $g^{(i)}$ (for $i >= 1$) by
  \[ g^{(1)}(x) = g(x); \qquad
     g^{(i+1)}(x) = g_0(x) \cat g^{(i)}(g_1(x)). \]%
  Relate the security of $g^{(i)}$ to that of $g$.
  \answer%
  Let $A$ be an adversary running in time $t$ and attacking $g^{(i+1)}$.
  Firstly, we attack $g$: consider adversary $B(y)$: \{ \PARSE $y$ \AS $y_0,
  k\colon y_1$; $z \gets g^{(i)}$; $b \gets A(y_0 \cat z)$; \RETURN $b$;~\}.
  Then $\Adv{prg}{g}(B) \ge \Adv{prg}{g^{(i+1)}}(A) + \delta$, so
  $\InSec{prg}(g^{(i+1)}; t) \le \InSec{prg}(g; t) + \delta$, where
  \begin{eqnarray*}[rl]
    \delta = &\Pr[x_0 \gets \{0, 1\}^k; x_1 \gets \{0, 1\}^k;
                  y \gets g^{(i)}(x) : A(x_0 \cat y) = 1] - \\
              &\Pr[y \getsr \{0, 1\}^{(i+2)k} : A(y) = 1].
  \end{eqnarray*}
  We attack $g^{(i)}$ to bound $\delta$: consider adversary $C(y)$: \{ $x_0
  \getsr \{0, 1\}^k$; $b \gets A(x_0 \cat y)$; \RETURN $b$;~\}.  Now $\delta
  \le \Adv{prg}{g^{(i)}}(C) \le \InSec{prg}(g^{(i)}; t)$.  So by induction,
  \[ \InSec{prg}(g^{(i)}; t) \le i \cdot \InSec{prg}(g; t). \]
\end{exercise}

\begin{slide}
  \topic{advantage}
  \head{Notes about advantage}

  Advantage is a concept used in many definitions of security notions:
  \begin{eqnarray*}[rl]
     \Adv{}{}(A) = &
        \Pr[\text{$A$ returns 1 in setting $a$}] - {} \\
     &  \Pr[\text{$A$ returns 1 in setting $b$}].
  \end{eqnarray*}
  \begin{enumerate}
  \item We have $-1 \le \Adv{}{}(A) \le +1$.
  \item Zero means that the adversary couldn't distinguish.
  \item Negative advantage means the adversary got them the wrong way
    around.  There is another adversary which uses the same resources but has
    positive advantage.
  \item \label{item:adv-guess} If $A$ is attempting to guess some hidden bit
    $b^* \inr \{0, 1\}$, we have
    \[ \Pr[b \gets A : b = b^*] = \frac{\Adv{}{}(A)}{2} + \frac{1}{2}. \]
  \end{enumerate}
\end{slide}

\begin{proof}
  Let $b$ be the bit that $A$ returns, and let $b^*$ be the experiment's
  hidden bit.  Then
  \[ \Adv{}{}(A) = \Pr[b = 1 \mid b^* = 1] - \Pr[b = 1 \mid b^* = 0]. \]
  Addressing the above claims in order:
  \begin{enumerate}
  \item By definition of probability, $0 \le \Pr[b = 1 \mid b^* = 1] \le 1$
    and $0 \le \Pr[b = 1 \mid b^* = 0]$, so their absolute difference can be
    at most 1.
  \item This is a corollary of \ref{item:adv-guess}.
  \item Consider the adversary $\bar{A}$ which runs $A$ and returns the
    complement bit $\bar{b} = b \xor 1$.  Then
    \begin{eqnarray*}[rl]
      \Adv{}{}(\bar{A})
      &= \Pr[\bar{b} = 1 \mid b^* = 1] - \Pr[\bar{b} = 1 \mid b^* = 0] \\
      &= \Pr[b = 0 \mid b^* = 1] - \Pr[b = 0 \mid b^* = 0] \\
      &= (1 - \Pr[b = 1 \mid b^* = 1]) - (1 - \Pr[b = 1 \mid b^* = 0]) \\
      &= \Pr[b = 1 \mid b^* = 0] - \Pr[b = 1 \mid b^* = 1] \\
      &= -\Adv{}{}(A).
    \end{eqnarray*}
  \item Note that $\Pr[b^* = 1] = \Pr[b^* = 0] = \frac{1}{2}$.  Then
    \begin{eqnarray*}[rl]
      \Pr[b = b^*]
      &= \Pr[b = 1 \land b^* = 1] + \Pr[b = 0 \land b^* = 0] \\
      &= \frac{1}{2}(\Pr[b = 1 \mid b^* = 1] + \Pr[b = 1 \mid b^* = 0]) \\
      &= \frac{1}{2}(\Pr[b = 1 \mid b^* = 1] +
                     (1 - \Pr[b = 0 \mid b^* = 0])) \\
      &= \frac{1}{2}(1 + \Pr[b = 1 \mid b^* = 1] -
                           \Pr[b = 0 \mid b^* = 0]) \\
      &= \frac{\Adv{}{}(A)}{2} + \frac{1}{2}.
    \end{eqnarray*}
  \end{enumerate}
  All present and correct.
\end{proof}

\begin{slide}
  \topic{pseudorandom functions (PRFs)}
  \head{Pseudorandom functions (PRFs)}
  
  A \emph{pseudorandom function family} (PRF) is a collection of functions
  $F_K\colon \{0, 1\}^l \to \{0, 1\}^L$, where $K$ is some index, typically
  from a set of fixed-size strings $\{0, 1\}^k$.  We shall often consider a
  PRF to be a single function $F\colon \{0, 1\}^k \times \{0, 1\}^l \to \{0,
  1\}^L$.

  We want to say that $F$ is a strong PRF if adversaries find it hard to
  distinguish an instance $F_K$ from a function chosen completely at random
  with the same `shape'.

  We provide the adversary with an \emph{oracle}, either for a randomly
  selected $F_K$, or for completely random function, and ask it to try and
  say which it is given.

  We write $\Func{l}{L}$ as the set of \emph{all} functions from $\{0, 1\}^l$
  to $\{0, 1\}^L$.
\end{slide}

\begin{slide}
  \head{Pseudorandom functions (cont.)}

  We define the advantage of a distinguisher $D$ against the PRF $F$ as
  follows:
  \begin{eqnarray*}[rl]
    \Adv{prf}{F}(D) = &
       \Pr[K \getsr \{0, 1\}^k : D^{F_K(\cdot)} = 1] - {}\\
    &  \Pr[R \getsr \Func{l}{L} : D^{R(\cdot)} = 1].
  \end{eqnarray*}
  The insecurity of the PRF is then measured as
  \[ \InSec{prf}(F; t, q) = \max_D \Adv{prf}{F}(D) \]
  where the maximum is taken over all distinguishers $D$ which run for time
  $t$ and make $q$ oracle queries.  As is usual, if $\InSec{prf}(F; t, q)
  \le \epsilon$ then we say that $F$ is a $(t, q, \epsilon)$-secure PRF.
\end{slide}

\begin{slide}
  \topic{pseudorandom permutations (PRPs)}
  \head{Pseudorandom permutations (PRPs)}
  
  We define a \emph{pseudorandom permutation family} (PRP) in a similar way
  to the PRFs we've already seen.  A PRP is a family $F_K\colon \{0, 1\}^L
  \to \{0, 1\}^L$ is a of permutations, indexed by elements of some finite
  set, e.g., $\{0, 1\}^k$.  We shall often consider a PRP to be a single
  function $F\colon \{0, 1\}^k \times \{0, 1\}^l \to \{0, 1\}^L$.
  
  Let $\Perm{L}$ be the set of \emph{all} permutations over the set of
  $L$-bit strings $\{0, 1\}^L$.

  The advantage of a distinguisher $D$ against the PRP $F$ is
  \begin{eqnarray*}[rl]
    \Adv{prp}{F}(D) = &
       \Pr[K \getsr \{0, 1\}^k : D^{F_K(\cdot)} = 1] - {}\\
    &  \Pr[R \getsr \Perm{L} : D^{R(\cdot)} = 1].
  \end{eqnarray*}

  We define $\InSec{prp}(F; t, q) = \max_D \Adv{prp}{F}(D)$ exactly as for
  PRFs, and the notion of $(t, q, \epsilon)$-security is the same.
\end{slide}

\begin{slide}
  \head{Super pseudorandom permutations}
  
  PRPs are bijective.  A \emph{super PRP} is a PRP which remains secure when
  the distinguisher is allowed to make inverse queries:
  \begin{eqnarray*}[rl]
    \Adv{sprp}{F}(D) = &
      \Pr[K \getsr \{0, 1\}^k : D^{F_K(\cdot), F_K^{-1}(\cdot)} = 1] - {} \\
    & \Pr[R \getsr \Perm{L} : D^{R(\cdot), R^{-1}(\cdot)} = 1].
  \end{eqnarray*}
  Since there are two oracles, we count queries to both when evaluating the
  insecurity:
  \[ \InSec{sprp}(F; t, q, q') = \max_D \Adv{sprp}{F}(D) \]
  where the maximum is taken over all distinguishers $D$ which run for time
  $t$, make $q$ queries to the standard oracle, and $q'$ queries to the
  inverse oracle.  If $\InSec{sprp}(F; t, q, q') \le \epsilon$ then we say
  $F$ is a $(t, q, q', \epsilon)$-secure super PRP\@.
\end{slide}

\begin{exercise}
  Note that the key length hasn't appeared anywhere in the definition of
  insecurity for a PRP.  Derive lower bounds for the insecurity of a PRP with
  a $k$-bit key.
  \answer%
  Let $E\colon \{0, 1\}^k \times \{0, 1\}^L \to \{0, 1\}^L$ be a PRP.  Fix
  $n$ and $c$.  Then consider adversary $S^{E(\cdot)}$: \{ \FOR $i = 0$ \TO
  $c - 1$ \DO $y[i] \gets E(i)$; \FOR $K = 0$ \TO $n - 1$ \DO \{ $i \gets 0$;
  $\id{good} \gets 1$; \WHILE $i < c \land \id{good} = 1$ \DO \{ \IF $E_K(i)
  \ne y[i]$ \THEN $\id{good} \gets 0$;~\} \IF $\id{good} = 1$ \THEN \RETURN
  $1$;~\}~\}.  Then $\Adv{prp}{E}(S) \ge n(2^{-k} - 2^{-Lc})$.
\end{exercise}

\begin{slide}
  \resetseq
  \head{PRPs are PRFs, \seq}

  We model block ciphers as families of PRPs (not super PRPs).  Most of the
  analysis works best on PRFs, though.  We show that a PRP makes a `pretty
  good' PRF, as long as it's not over-used.

  Let $F$ be any PRP family.  Then
  \[ \InSec{prf}(F; t, q) \le
     \InSec{prp}(F; t, q) + \frac{q(q - 1)}{2^{L+1}}. \]%
  This is a useful result.  As long as $q^2$ is small compared to $2^L$ --
  the block size -- then a PRP makes a good PRF.
  
  The value $2^{L/2}$ is often called the \emph{Birthday bound}.  We shall
  meet it often when we examine modes of operation.  We shall examine the
  proof, because it illustrates some useful techniques.
\end{slide}

\begin{slide}
  \head{Shoup's lemma}
  
  This handy lemma states that the difference in the probability of some
  outcome between the two games is bounded above by the probability that the
  games differ.

  \begin{lemma}[Shoup \cite{Shoup:2001:OAEPR}]
    \label{lem:shoup}
    If $X$, $Y$ and $F$ are events, and $\Pr[X \land \lnot F] = \Pr[Y \land
    \lnot F]$ then $|{\Pr[X]} - \Pr[Y]| \le \Pr[F]$.
  \end{lemma}
  \begin{proof}
    We have:
    \begin{eqnarray*}[rll]
      \Pr[X] &= \Pr[X \land F] &+ \Pr[X \land \lnot F] \\
      \Pr[Y] &= \Pr[Y \land F] &+ \Pr[Y \land \lnot F]
    \end{eqnarray*}
    Subtracting gives
    \[ |{\Pr[X]} - \Pr[Y]| = |{\Pr[X \land F]} -
       \Pr[Y \land F]| \le \Pr[F] \]%
    as required.
  \end{proof}
\end{slide}

\begin{slide}
  \head{PRPs are PRFs, \seq: proof}

  Let $F\colon \{0, 1\}^k \times \{0, 1\}^L \to \{0, 1\}^L$ be a pseudorandom
  permutation.  We aim to show that $F$ is also a pseudorandom function.
  
  Let $A$ be an adversary which distinguishes~$F$ from a pseudorandom
  function in time~$t$, after making $q$ oracle queries.  We consider a
  sequence of games $\G{i}$ played with the adversary.  In each game $\G{i}$,
  let $S_i$ be the event that the adversary returns~$1$ at the end of the
  game.

  Game~$\G0$ is the `random function' game.  We given $A$ an oracle
  containing a random function $R \inr \Func{L}{L}$.

  Game~$\G1$ is the `PRF' game.  We give $A$ an oracle which computes
  $F_K(\cdot)$ for some randomly chosen $K \inr \{0, 1\}^k$.

  By definition, then,
  \[ \Adv{prf}{F}(A) = \Pr[S_1] - \Pr[S_0]. \]
\end{slide}

\begin{slide}
  \head{PRPs are PRFs, \seq: proof (cont.)}

  Let $x_0, x_1, \ldots, x_{q-1}$ be the oracle queries made by $A$, and let
  $y_0, y_1, \ldots, y_{q-1}$ be the corresponding responses.
  
  Game~$\G2$ works in the same way as $\G0$, except that if there is a
  \emph{collision} in the query replies (i.e., $y_i = y_j$ but $x_i \ne x_j$
  for any $0 \le i, j < q$) then we stop the game immediately.  Let $F_2$ be
  the event that this occurs.

  Because $\G2$ behaves exactly the same as $\G0$ unless $F_2$ occurs, we
  must have
  \[ \Pr[S_2 \land \lnot F_2] = \Pr[S_0 \land \lnot F_2] \]
  so we invoke Lemma~\ref{lem:shoup} and discover that
  \[ |{\Pr[S_2]} - \Pr[S_0]| \le \Pr[F_2]. \]
  Using the earlier result on collisions, it's easy to see that
  \[ \Pr[F_2] \le \frac{q(q - 1)}{2^{L+1}}. \]
\end{slide}

\begin{slide}
  \head{PRPs are PRFs, \seq: proof (cont.)}
  
  Game~$\G3$ works in the same way as $\G2$, except that we use a random
  permutation $P \inr \Perm{L}$ instead of a random function $R \inr
  \Func{L}{L}$.  Firstly, note that $F_2$ can't occur in $\G3$.  But, if
  $F_2$ doesn't occur in $\G2$ (i.e., there is no collision), then the random
  function is indistinguishable from a random permutation.  So
  \[ \Pr[S_3] = \Pr[S_2]. \]

  By definition, we have
  \[ \Adv{prp}{F}(A) = \Pr[S_1] - \Pr[S_3]. \]
  We can now tie all of this together.
\end{slide}

\begin{slide}
  \head{PRPs are PRFs, \seq: proof (cont.)}
  
  A simple calculation shows that
  \begin{eqnarray*}[rl]
    \Adv{prf}{F}(A) &=   \Pr[S_1] - \Pr[S_0] \\
                    &\le \Pr[S_1] - \Pr[S_2] + \Pr[F_2] \\
                    &=   \Pr[S_1] - \Pr[S_3] + \Pr[F_2] \\
                    &=   \Adv{prp}{F}(A) + \Pr[F_2] \\
                    &\le \InSec{prp}(F; t, q) + \frac{q(q - 1)}{2^{L+1}}.
  \end{eqnarray*}
  In the second line, we used the bound we computed on the absolute
  difference between $\Pr[S_2]$ and $\Pr[S_0]$; in the third, we noted that
  $\Pr[S_2] = \Pr[S_3]$; in the fourth, we used the definition of advantage
  against a PRP; and in the fifth we used the definition of insecurity for a
  PRP.
\end{slide}

\begin{slide}
  \head{PRPs are PRFs, \seq: proof (cont.)}

  Finally, we imposed no restrictions on $A$, except that it run in time $t$
  and make $q$ oracle queries.  So our bound
  \[ \Adv{prf}{F}(A) \le \InSec{prp}(F; t, q) + \frac{q(q - 1)}{2^{L+1}} \]%
  is true for \emph{any} such adversary $A$, and in particular, it's true for
  the most successful adversary running with those resource bounds.

  Hence, we can maximize, showing that
  \[ \InSec{prf}(F; t, q) \le
     \InSec{prp}(F; t, q) + \frac{q(q - 1)}{2^{L+1}} \]%
  as required.
\end{slide}

\begin{slide}
  \topic{hash functions}
  \resetseq
  \head{Hash functions, \seq: properties}

  Hash functions like MD5 and SHA-1 are extremely useful primitives.  What
  properties do we expect of them?  This out to be an extremely difficult
  question to answer.
  \begin{itemize}
  \item One-wayness.  We've seen a definition for this already.  But it's not
    enough.
  \item Collision-resistance.  This is the property usually claimed as the
    requirement for use in digital signature systems.  We'll look at this
    later.
  \item Randomness.  What does this mean, when the function is completely
    public?  A distinguishability criterion is clearly hopeless.
  \end{itemize}
\end{slide}
  
\begin{slide}
  \head{Hash functions, \seq: Merkle-Damg\aa{}rd iterated hashing
    \cite{Damgaard:1990:DPH, Merkle:1991:FSE}}
  
  Let $F\colon \{0, 1\}^{k+L} \to \{0, 1\}^k$ be a \emph{compression}
  function.  Now consider the function $H\colon \{0, 1\}^* \to \{0, 1\}^k$
  which transforms an input string $x$ as follows:
  \begin{enumerate}
  \item Pad $x$ to a multiple of $L$ bits in some injective way.  Divide the
    padded message into $L$-bit blocks $x_0$, $x_1$, \ldots, $x_{n-1}$.
  \item Fix some $k$-bit constant $I$.  Let $I_0 = I$.  Define $I_{i+1} =
    F(I_i \cat x_i)$ for $0 \le i < n$.
  \item The result $H(x) = I_n$.
  \end{enumerate}
  
  Suppose we have two strings $x \ne y$, such that $H(x) = H(y)$; i.e., a
  \emph{collision}.  Then \emph{either} we can find a collision for $F$
  \emph{or} a string $z$ for which $F(z) = I$.  (This is why initialization
  vectors for hash functions have such obviously regular forms.)
\end{slide}

\begin{proof}
  Let $x_0, x_1, \ldots, x_{n-1}$ and $x'_0, x'_1, \ldots, x'_{n'-1}$ be
  the $l$-bit blocks of two distinct (padded) messages, and without loss
  of generality suppose that $n \ge n'$.  Let $I_0 = I'_0 = I$, let
  $I_{i+1} = F(I_i \cat x_i)$, and $I'_{i+1} = F(I'_i \cat x'_i)$.  We
  have $I_n = I'_{n'}$.
  
  We prove the result by induction on $n$.  The case $n = 0$ is trivially
  true, since there is only one zero-block message.  Suppose, then, that the
  result is true for $n$-block messages.  There are three cases to consider.
  Firstly, if $n' = 0$ then $F(I_n \cat x_n) = I$.  Secondly, if $I_n \ne
  I'_{n'}$ or $x_n \ne x'_{n'}$, then we have a collision, for $F(I_n \cat
  x_n) = I_n = I'_{n'} = F(I'_{n'} \cat x'_{n'})$.  Finally, if $I_n =
  I'_{n'}$ and $x_n = x'_{n'}$ then we remove the final block from both
  messages, and because the remaining messages must differ in at least one
  block, we can apply the inductive hypothesis on these shorter messages to
  complete the proof.
\end{proof}

\begin{slide}
  \head{Hash functions, \seq: any-collision resistance}
  
  The statement usually made about a `good' hash function $h$ is that it
  should be `difficult' to find a collision: i.e., two preimages $x \ne y$
  where $H(x) = H(y)$.  How do we formalize this?  Here's one attempt:
  \begin{eqlines*}
    \Succ{acr}{H}(A) = \Pr[(x, y) \gets A : x \ne y \land H(x) = H(y)]; \\
    \InSec{acr}(H; t) = \max_A \Succ{acr}{H}(A).
  \end{eqlines*}
  But this doesn't work.  There clearly \emph{exists} an adversary which
  already `knows' the a collision for $H$ and just outputs the right answer.
  It succeeds very quickly, and with probability 1.  So this definition is
  impossible to satisfy.
\end{slide}

\begin{slide}
  \head{Hash functions, \seq: targetted collision resistance}

  The concept of targetted collision resistance is relatively new, but quite
  promising.  It replaces a single hash function with a \emph{family} of hash
  functions.  They're not really keyed, because the indices aren't kept
  secret.

  When making a signature, an index $i$ is chosen at random, and the
  signature for message $m$ is formed over the pair $(i, H_i(M))$.

  TCR-hash functions are the subject of ongoing research.  No practical
  designs exist at the moment.
\end{slide}

\begin{slide}
  \head{Hash functions, \seq: targetted collision resistance (cont.)}

  Consider the following experiment:
  \begin{program}
    $\Expt{tcr}{H}(A)$: \+ \\
      $(x, s) \gets A(\cookie{find})$; \\
      $i \getsr \keys H$; \\
      $y \gets A(\cookie{collide}, i, s)$; \\
      \IF $x \ne y \land H_i(x) = H_i(y)$
      \THEN \RETURN $1$; \\
      \ELSE \RETURN $0$;
  \end{program}
  The security of a TCR-hash function is measured as:
  \[ \InSec{tcr}(H; t) = \max_A \Pr[\Expt{tcr}{H}(A) = 1] \]
  where the maximum is taken over all adversaries running in time $t$.  We
  define $(t, \epsilon)$-security as usual.
\end{slide}

\begin{slide}
  \head{Hash functions, \seq: random oracles \cite{Bellare:1993:ROP}}

  In practice, we expect much more than just collision resistance from hash
  functions: we expect a certain amount of `random' behaviour.  But this is
  hard to quantify.
  
  One approach is to model a hash function as a `random oracle', i.e., an
  oracle containing a function chosen at random, used by the construction
  under consideration, and made available to the adversary.  The idea is that
  reductions can `capture' the knowledge of an adversary by examining the
  queries it makes to its random oracle.

  Hash functions \emph{aren't} random oracles.  But a random oracle proof is
  better than nothing.
\end{slide}

\xcalways\subsection{Standard assumptions}\x

\begin{slide}
  \head{Standard assumptions}

  There are a number of `standard' assumptions that are made about the
  difficulty of various problems:
  \begin{itemize}
  \item IFP, the Integer Factorization Problem;
  \item QRP, the Quadratic Residuosity Problem;
  \item DLP, the Discrete Logarithm Problem;
  \item RSAP, the RSA Problem; and
  \item CDH, the Computational Diffie-Hellman problem and its variants
  \end{itemize}
  \cite{Menezes:1997:HAC} has excellent material on the above.
\end{slide}

\begin{slide}
  \topic{integer factorization}
  \resetseq
  \head{The Integer Factorization Problem, \seq}
    
  We often assume that large integers of the form $n = p q$, where $p$ and
  $q$ are primes of roughly the same size, are `hard' to factor.
  Specifically, there is no algorithm which will factor such an $n$ in time
  bounded by a polynomial function of $\log n$.

  The difficulty of various other problems, e.g., Quadratic Residuosity, or
  RSA, depend on the difficulty of factoring; however, it is not yet known
  whether the ability to solve QRP or RSAP can be used to factor.
\end{slide}

\begin{slide}
  \head{The Integer Factorization Problem, \seq: square roots}
  
  The problem of extracting square roots modulo $n = p q$ is provably as hard
  as factoring.  This is the basis of Rabin's public key encryption and
  digital signature schemes.  We shall analyse these later.
  
  Suppose $Q(n, y)$ is an algorithm which returns an $x$ such that $x^2
  \equiv y \pmod{n}$, provided such an $x$ exists.  Then we can find a
  nontrivial factor of $n$ as follows:
  \begin{program}
    Algorithm $\id{find-factor}(n)$: \+ \\
      \REPEAT \\ \quad\=\+\kill
        $x \getsr \{1, 2, \ldots, n - 1\}$; \\
        $y \gets x^2 \bmod n$; \\
        $x' \gets Q(n, y)$; \\
        $p \gets \gcd(x + x', n)$; \\
        \IF $p \notin \{1, n\}$ \THEN \RETURN $p$; \- \\
      \FOREVER;
  \end{program}
\end{slide}

\begin{proof}
  The program attempts to find two square roots of $y$ mod $n$.  It's easy to
  see that this might lead to factors of $n$.  If $x^2 \equiv x'^2 \pmod{n}$
  then $x^2 - x'^2 = k n$ for some constant $k$.  Then $(x + x')(x - x')$ is
  a factorization of $k n$.  It remains to prove the probability bound on $x
  + x'$ being a nontrivial factor of $n$.

  Let $n$ be an odd composite.  Then, if $x \not\equiv \pm y \pmod{n}$ but
  $x^2 \equiv y^2 \pmod{n}$, then $\gcd(x + y, n)$ is a nontrivial factor of
  $n$.
  
  Firstly, we claim that, if $p$ is an odd prime then the congruence $x^2
  \equiv y \pmod{p}$ has precisely two solutions $x$, $x'$ such that $x
  \equiv -x' \pmod{p}$.  Let $g$ be primitive mod $p$, with $x = g^\alpha$,
  $x' = g^\beta$.  Then $g^{2 \alpha} \equiv g^{2 \beta} \pmod{p}$, so $2
  \alpha \equiv 2 \beta \pmod{p - 1}$.  But $p - 1$ is even, by hypothesis,
  so $\alpha \equiv \beta \pmod{(p - 1)/2}$.  But $g^{(p-1)/2} \equiv -1
  \pmod{p}$; hence $x \equiv \pm x' \pmod{p}$, proving the claim.
  
  There must exist odd primes $p$, $q$, such that $p|n$ and $q|n$, and $x
  \equiv -y \pmod{p}$ and $x \equiv y \pmod{q}$, for if not, then $x \equiv
  \pm y \pmod{n}$ contrary to hypothesis.  But then $x + y \equiv 0
  \pmod{p}$, so $p|(x + y)$; but $x + y \equiv 2 x \not\equiv 0 \pmod{q}$,
  since $q$ is odd.  Hence, $p$ divides $x + y$, but $q$ does not, so $\gcd(x
  + y, n)$ is a nontrivial factor of $n$, completing the proof.
\end{proof}

\begin{slide}
  \topic{quadratic residuosity}
  \head{The Quadratic Residuosity Problem}

  If there is an $x$ such that $x^2 \equiv y \pmod{n}$ then $y$ is a
  \emph{quadratic residue modulo $n$}, and we write $y \in Q_n$; if there is
  no such $x$ then $y$ is a \emph{quadratic nonresidue modulo $n$}.

  If $p$ is prime, then we can use the \emph{Legendre symbol} to decide
  whether $x$ is a quadratic residue mod $p$:
  \[ \jacobi{x}{p} = x^{(p-1)/2} \bmod p =
     \begin{cases}
        0 & if $p$ divides $x$ \\
       -1 & if $x$ is a quadratic nonresidue mod $p$ \\
       +1 & if $x$ is a quadratic residue mod $p$
     \end{cases}. \]%
  The \emph{Jacobi symbol} (written the same way) is defined for odd~$n$: if
  $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$ where the $p_i$ are prime, then
  \[ \jacobi{x}{n} =
     \jacobi{x}{p_1}^{a_1} \jacobi{x}{p_2}^{a_2} \cdots
       \jacobi{x}{p_k}^{a_k}. \]%
  This can be efficiently computed without knowledge of the factors of $n$
  \cite[Section~2.4.5]{Menezes:1997:HAC}.
\end{slide}

\begin{slide}
  \head{The Quadratic Residuosity Problem (cont.)}

  If $\jacobi{x}{n} = -1$ then $x$ is certainly \emph{not} a quadratic
  residue mod $n$; however, if $\jacobi{x}{n} = 1$ then $x$ might be a
  quadratic residue or it might not; if not, then we say that $x$ is a
  \emph{pseudosquare}.
  
  If $n = p q$ is a product of two primes and $x \inr (\Z/n\Z)^*$ is chosen
  at random, then
  \[ \Pr\Bigl[x \in Q_n \Bigm| \jacobi{x}{n} = 1\Bigr] = \frac{1}{2}, \]
  since we have
  \[ \jacobi{x}{p} = \jacobi{x}{q} = \pm 1 \]
  with each occurring with equal probability.

  The problem of distinguishing pseudosquares from quadratic residues is
  called the Quadratic Residuosity Problem (QRP).  It is not known how to
  solve this problem without factoring $n$.
\end{slide}

\begin{slide}
  \topic{discrete logarithms}
  \head{The Discrete Logarithm Problem}
  
  The (Integer) Discrete Logarithm Problem asks for the solution $x$ given a
  congruence of the form $g^x \equiv y \pmod{n}$.  This seems to be about as
  difficult as factoring.  (The ability to solve discrete logs modulo $n$ is
  sufficient to factor $n$.  The best known algorithms for IFP and DLP have
  the same running time.)

  The problem generalizes to other cyclic groups, e.g., elliptic curves over
  finite fields.
\end{slide}

\begin{slide}
  \topic{self-reducibility}
  \resetseq
  \head{Self-reducibility, \seq}
  
  The problems of square-root extraction, deciding quadratic residuosity, the
  RSA problem, and finding discrete logarithms share the property of being
  \emph{randomly self-reducible}; i.e., an instance of the problem can be
  transformed into many different derived instances \emph{without skewing the
    probability distribution of problem instances}, such that a solution to
  one of the derived instances yields a solution to the original one.

  This is a good property to have.  It implies that `most' problem instances
  are as hard as the hardest instances.
\end{slide}

\begin{slide}
  \head{Self-reducibility, \seq: the RSA problem \cite{Rivest:1978:MOD}}
  
  The RSA problem is to compute $e$-th roots modulo $n = p q$, where $e$ is
  relatively prime to $n$.  Suppose that the algorithm $S(n, e, y)$ returns a
  value $x$ such that $x^e \equiv y \pmod{n}$ for `many' choices of $y$, or
  the special symbol $\bot$ otherwise.  The following probabilistic algorithm
  then solves the RSA problem for arbitrary $y$:
  \begin{program}
    Algorithm $\id{solve-rsa}(n, e, y)$: \+ \\
      \REPEAT \\ \quad\=\+\kill
        $x' \getsr \{1, 2, \ldots, n - 1\}$; \\
        \IF $\gcd(x', n) = 1$ \THEN \\ \quad\=\+\kill
          $y' \gets y x'^e \bmod n$; \\
          $x \gets S(n, e, y')$; \\
          \IF $x \ne \bot$ \THEN \RETURN $x x'^{-1} \bmod n$; \-\- \\
      \FOREVER;
  \end{program}
\end{slide}

\begin{slide}
  \topic{the Diffie-Hellman problem}
  \head{The Diffie-Hellman problem \cite{Diffie:1976:NDC}}
  
  Let $G = \langle g \rangle$ be a cyclic group or order $q$.  Let $\alpha$
  and $\beta$ be indices, $\alpha, \beta \in \Z/q\Z$.
  
  The (computational) \emph{Diffie-Hellman} problem is, given $g^\alpha$ and
  $g^\beta$, to find $g^{\alpha\beta}$.
  
  The (computational) Diffie-Hellman \emph{assumption} is that there is no
  probabilistic algorithm which solves the computational Diffie-Hellman
  problem in time polynomial in $\log q$.
  
  Obviously, being able to compute discrete logs in $G$ efficiently would
  yield a solution to the Diffie-Hellman problem.  But it may be the case
  that the Diffie-Hellman problem is easier than the discrete log problem.

  The Diffie-Hellman problem is self-reducible.
\end{slide}

\begin{slide}
  \head{The Decisional Diffie-Hellman assumption \cite{Boneh:1998:DDP}}

  The computational Diffie-Hellman assumption makes a statement only about
  algorithms which compute the \emph{entire} answer $g^{\alpha\beta}$.  Since
  Diffie-Hellman is frequently used for key-exchange, what can we say about
  the ability of an adversary to guess any of the bits?
  
  The Decisional Diffie-Hellman (DDH) assumption asserts that, if you don't
  know $\alpha$ or $\beta$, then it's hard to tell $g^{\alpha\beta}$ from a
  random element of $G$; that is, that the distributions of the following
  experiments are computationally indistinguishable:
  \begin{program}
    $\alpha \getsr \Z/q\Z;$ \\
    $\beta  \getsr \Z/q\Z;$ \\
    \RETURN $(g^\alpha, g^\beta, g^{\alpha\beta})$;
  \next
    $\alpha \getsr \Z/q\Z;$ \\
    $\beta  \getsr \Z/q\Z;$ \\
    $\gamma \getsr \Z/q\Z;$ \\
    \RETURN $(g^\alpha, g^\beta, g^\gamma)$;
  \end{program}
\end{slide}

\begin{slide}
  \head{The Decisional Diffie-Hellman assumption (cont.)}
  
  If $A$ is an algorithm attempting to solve DDH in the group $G$, then we
  write its advantage as
  \begin{eqnarray*}[rl]
    \Adv{ddh}{G}(A) =
    & \Pr[\alpha \getsr \Z/q\Z; \beta \getsr \Z/q\Z :
          A(g^\alpha, g^\beta, g^{\alpha\beta}) = 1] - {} \\
    & \Pr[\alpha \getsr \Z/q\Z; \beta \getsr \Z/q\Z; 
          \gamma \getsr \Z/q\Z :
          A(g^\alpha, g^\beta, g^\gamma) = 1]
  \end{eqnarray*}
  and the insecurity of DDH in $G$ as
  \[ \InSec{ddh}(G; t) = \max_A \Adv{ddh}{G}(A) \]
  with the maximum over all algorithms $A$ running in time $t$.
\end{slide}

\begin{slide}
  \head{The Decisional Diffie-Hellman assumption (cont.)}

  If you can solve the computational Diffie-Hellman problem, you can solve
  the decisional version.  If you can compute discrete logs, you can solve
  both.
  
  There \emph{are} groups in which the computation problem is (believed to
  be) hard, but the decisional problem is easy.  In particular, if the group
  order $q$ has small factors then the decisional problem isn't hard.
\end{slide}

\endinput

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "ips"
%%% End: