'\" e
.\" -*-nroff-*-
.ie t \{\
.  ds o \(bu
.  if \n(.g .fam P
.\}
.el \{\
.  ds o o
.  ie '\*(.T'utf8' .ds mo ∈
.  el .ds mo \fBin\fP
.  ie '\*(.T'utf8' .ds *l \(*l
.  el .ds *l L
.\}
.de hP
.IP
\h'-\w'\fB\\$1\ \fP'u'\fB\\$1\ \fP\c
..
.de VS
.sp 1
.RS
.nf
.ft B
..
.de VE
.ft R
.fi
.RE
.sp 1
..
.ds pr $
.EQ
delim $$
tdefine member 'type "relation" \(mo'
define lbrace 'roman "{"'
define rbrace 'roman "}"'
define FIELD 'bold F'
.EN
.TH shamir 1 "3 May 2015" "distorted.org.uk key management"
.de EQ
.PP
.ce
..
.de EN
.PP
..
.SH NAME
shamir \- split a secret into shares, and recombine them
.SH SYNOPSIS
.B shamir
.I command
.PP
where
.I command
is one of:
.PP
.B help
.IB [ command\fR... ]
.br
.B issue
.RB [ \-H
.IR hashfn ]
.IB thresh / count
.RI [ secret-file ]
.br
.B recover
.SH DESCRIPTION
The
.B shamir
program implements a simple secret sharing scheme.
Given a secret,
the
.B issue
command will split it
into a number of shares
.IR n,
any
.ie t $t <= n$
.el \{\
.I t
\(<=
.I n
.\}
of which
can be used to reconstruct it, but
fewer than
.I t
shares provide no information at all
about the secret
(other than its length).
.SH COMMAND-LINE OPTIONS
The following options are recognized.
.TP
.B \-h, \-\-help
Print an overview
of the options and commands
to standard output
and exit successfully.
.TP
.B \-\-version
Print the version number
of the
.B distorted-keys
package
to standard output
and exit successfully.
.SH COMMAND REFERENCE
.SS help
The
.B help
command prints information to standard output
about the
.B shamir
program as a whole
(similar to the
.B \-\-help
option),
or about the named
.IR command s.
.SS issue
The
.B issue
command reads a secret from
.I secret-file
(or from standard input,
if
.I secret-file
is omitted or
.RB ` \- '),
and writes a number of lines to standard output.
.PP
The
.I count
is the total number of shares issued;
any
.I thresh
of them can be used to reassemble the original secret.
It must be the case that
.ie t $1 <= "thresh" <= "count" <= 255$.
.el \{\
1
\(<=
.I thresh
\(<=
.I count
\(<=
255.
.\}
.PP
The first line contains secret-sharing parameters,
which will be required in any recovery operation.
These do not usually need to be kept secret
(though they contain a cryptographic hash of the original input)
but it's a good idea to ensure their integrity.
This can be done by storing the parameters in a safe place,
possibly protected by a digital signature,
or by issuing each share holder with a copy of the parameters
and checking that they match at recovery time.
.PP
The remaining lines contain share data, one per line, for
.I count
lines.
Each share holder should be given one of these lines.
.PP
The following options are accepted.
.TP
.BI "\-H, \-\-hash-function=" hashfn
The hash function to use when hashing the secret.
This may be any hash function name known to OpenSSL.
.PP
Example:
.VS
\*(pr shamir issue 3/5 secret
.ft R
shamir-params:n=5;t=3;f=sha256;h=lbcQW/lmV4z0ZKORE5Y0/g+thgWFKr4Kv3i1vEfNkUQ=
shamir-share:i=0;y=Ft3jhDQgTdMgEJ3Og3+OZ98NTSRGEWFjIN6mAQmzvHc=
shamir-share:i=1;y=zHpbi0h+mFVVv0vAo0djMK296b5XRCAYRV6yMsy74Rs=
shamir-share:i=2;y=FbSHVSqoXLVbxkuVZweLXacXmqAnWMEgLOP3qlSvfPc=
shamir-share:i=3;y=Q0bUOCUfgSlZSFPOSwhr2YjCSfXKRjwRDmwK8PZPB48=
shamir-share:i=4;y=mogI5kfJRclXMVObj0iDtIJoOuu6Wt0pZ9FPaG5bmmM=
.VE
.SS recover
The
.B recover
command reads parameters and shares from standard input,
computes the original secret,
and writes it to standard output.
.PP
There are no command-line options.
.PP
The first line of standard input must contain the sharing parameters.
Subsequent lines contain shares,
one per line.
The shares may be in any order.
Exactly the correct
.RI ( thresh )
number of shares must be provided.
The shares are checked for various kinds of errors
(e.g., duplicate shares, out-of-range indices),
and the secret is computed
and checked against the hash stored with the parameters.
If everything is correct,
the secret is written to standard output;
otherwise errors are reported to standard error,
and the program exits with a nonzero status
having written nothing to standard output.
.PP
Example:
.VS
\*(pr cat shares
.ft R
shamir-params:n=5;t=3;f=sha256;h=lbcQW/lmV4z0ZKORE5Y0/g+thgWFKr4Kv3i1vEfNkUQ=
shamir-share:i=2;y=FbSHVSqoXLVbxkuVZweLXacXmqAnWMEgLOP3qlSvfPc=
shamir-share:i=1;y=zHpbi0h+mFVVv0vAo0djMK296b5XRCAYRV6yMsy74Rs=
shamir-share:i=4;y=mogI5kfJRclXMVObj0iDtIJoOuu6Wt0pZ9FPaG5bmmM=
.ft B
\*(pr shamir recover <shares >secret2
\*(pr cmp secret secret2
.VE
.SH DATA FORMATS
The
.B shamir
program applies a simple encoding
to the various kinds of structured data it deals with,
specifically to
.IR secrets ,
.IR parameters ,
and
.IR shares .
.PP
An encoded object has the form
.IP
.IB label : \c
.IB slot = value \c
.RB [ ; ...]
.PP
Encodings do not contain whitespace characters.
.PP
The
.I label
is a token which identifies the kind of object encoded.
Each
.I slot
is named with a token
(usually a single letter),
and supplied with its
.I value
in a suitable encoding,
as described below.
.PP
The syntax is very picky.
The order of the slots is significant,
and all slots must be defined.
.SS Secets
A
.I secret
object describes a secret and its sharing parameters.
Secret objects are not input or output directly,
but are hashed when constructing parameters objects (described below).
.PP
The label for a secret object is
.BR shamir-secret .
Its slots are as follows.
.TP
.B n
The total number of shares issued,
i.e., the
.I count
argument to
.BR "shamir issue" ,
as a decimal integer.
.TP
.B t
The threshold required for recovery,
i.e., the
.I thresh
argument to
.BR "shamir issue" ,
as a decimal integer.
.TP
.B s
The secret itself, Base64-encoded,
with no whitespace.
.SS Parameters
A
.I parameters
object describes how a secret has been split into shares
and provides information about how it can be reassembled.
.PP
The label for a parameters object is
.BR shamir-params .
Its slots are as follows.
.TP
.B n
The total number of shares issued,
i.e., the
.I count
argument to
.BR "shamir issue" ,
as a decimal integer.
.TP
.B t
The threshold required for recovery,
i.e., the
.I thresh
argument to
.BR "shamir issue" ,
as a decimal integer.
.TP
.B f
The name of the hash function
used to compute the
.B h
slot.
This may be any of the hash functions known to OpenSSL.
.TP
.B h
The hash of a secret object,
made using the hash function named by the
.B f
slot,
and Base64 encoded.
No trailing newline is included when hashing the secret object.
.SS Shares
A
.I share
object describes a single share.
The label for a parameters object is
.BR shamir-share .
Its slots are as follows.
.TP
.B i
The share's index,
as a decimal integer;
.ie t $0 <= i < "count"$.
.el \{\
0 \(<=
.I i
<
.IR count .
.\}
.TP
.B y
The share data, Base64 encoded.
.PP
.SH MATHEMATICAL BACKGROUND
(This section has lots of mathematical notation
and doesn't look especially good in a terminal.)
.PP
Let
.I k
be a field.
Suppose we are given a secret
.ie t $s member k$
.el \{\
.I s
\*(mo
.I k
.\}
which we want to split into shares
such that any $t$ of them can be used to recover
.IR s .
Choose coefficients
.ie t $a sub i member k$
.el \{\
.IR a _ i
\*(mo
.I k
.\}
for
.ie t $1 <= i < t$
.el \{\
1 \(<=
.I i
<
.I t
.\}
at random.
Let 
.ie t $p(x) = s + a sub 1 x + ... + a sub {t-1} x sup {t-1}$.
.el \{\
.IR p ( x )
=
.I s
+
.IR a _1
.I x
+ ... +
.IR a _( t \-1)
.IR x ^( t \-1).
.\}
Note that
.ie t $p(0) = s$.
.el \{\
.IR p (0)
=
.IR s .
.\}
We can evaluate 
.I p
to obtain shares
.ie t $y sub i = p( x sub i )$
.el \{\
.IR y _ i
=
.IR p ( x _ i )
.\}
for various
.ie t $x sub i member k - lbrace ~ 0 ~ rbrace$.
.el \{\
.IR x _ i
\*(mo
.I k
\-
{ 0 }.
.\}
.PP
How do we recover the secret?
Suppose we are given
.ie t $y sub i = p( x sub i )$
.el \{\
.IR y _ i
=
.IR p ( x _ i )
.\}
for
.ie t $0 <= i < t$,
.el \{\
0 \(<=
.I i
<
.IR t .
.\}
where the
.ie t $x sub i$
.el .IR x _ i
are distinct.
Define
.ie t \{\
.EQ
lambda sub i (x) =
  sum from {pile {0 <= j < t  above j != i}}
  {x - x sub j} over {x sub i - x sub j}.
.EN
.\}
.el \{\
.IP
.nf
.\"           ---   x - x_j 
.\" L_i(x) =  >    ---------.
.\"           ---  x_i - x_j
.\"          0<j<t
.\"           j=i
          ---   \fIx\fP \- \fIx\fP_\fIj\fP 
\*(*l_\fIi\fP(\fIx\fP) =  >    ---------.
          ---  \fIx\fP_\fIi\fP - \fIx\fP_\fIj\fP
         0\(<=\fIj\fP<\fIt\fP
          \fIj\fP\(!=\fIi\fP
.fi
.PP
.\}
Then 
.ie t $lambda sub i$
.el .RI \*(*l_ i
is a
.ie t degree-$(t - 1)$
.el \{\
.RI degree-( t
\- 1)
.\}
polynomial
such that 
.ie t $lambda sub i ( x sub i ) = 1$,
.el \{\
.RI \*(*l_ i ( x _ i )
= 1,
.\}
and
.ie t $lambda sub i ( x sub j ) = 0$
.el \{\
.RI \*(*l_ i ( x _ j )
= 0
.\}
if
.ie t $j != i$.
.el \{\
.I j
\(!=
.IR i .
.\}
Define
.ie t \{\
.EQ
q(x) = sum from {0 <= i < t} lambda sub i (x) y sub i .
.EN
.\}
.el \{\
.IP
.nf
.\"         ---
.\" q(x) =  >    L_i(x) y_i.
.\"         ---
.\"        0<i<t
        ---
\fIq\fP(\fIx\fP) =  >    \*(*l_\fIi\fP(\fIx\fP) \fIy\fP_\fIi\fP.
        ---
       0\(<=\fIi\fP<\fIt\fP
.fi
.PP
.\}
Note that
.I q
has degree
.ie t $t - 1$,
.el \{\
.I t
\- 1,
.\}
and
.ie t $q( x sub i ) = y sub i = p( x sub i )$
.el \{\
.IR q ( x _ i )
=
.IR y _ i
=
.IR p ( x _ i )
.\}
for all
.ie t $0 <= i < t$.
.el \{\
0 \(<=
.I i
<
.IR t .
.\}
Hence
.ie t $p - q$
.el \{\
.I p
\-
.I q
.\}
is a polynomial with degree at most
.ie t $t - 1$
.el \{\
.I t
\- 1
.\}
but with at least
.I t
distinct zeros;
therefore
.ie t $q - p$
.el \{\
.I p
\-
.I q
.\}
must be identically zero
and hence
.ie t $q == p$
.el \{\
.I q
\(==
.I p
.\}
and
.ie t $s = q(0)$.
.el \{\
.I s
=
.IR q (0).
.\}
.PP
Suppose we are given
.ie t $t - 1$
.el \{\
.I t
\- 1
.\}
shares.
Then,
for any putative secret
.ie t $s'$,
.el .IR s ',
there exists a distinct possible value for each of the remaining shares
which can be determined using the above interpolation,
each of which is equally likely.
.PP
The polynomial interpolation technique is due to Lagrange;
its use as a secret-sharing scheme is due to Adi Shamir.
.PP
In theory, then,
this secret-sharing scheme is
.IR "information-theoretically secure" ;
in practice,
there's a cryptographic hash of the secret in the sharing parameters,
so security is limited to the computational difficulty
of inverting this hash.
Knowledge of fewer than
.I t
shares doesn't make this any easier;
and there's a security benefit to
knowing that a dishonest share holder
can't introduce errors into the recovered secret.
.PP
All of the above works in any field
.IR k .
Computation is easiest in a small finite field
(rather than, say, the rationals).
This program uses the field
.ie t $FIELD sub {2 sup 8}$
.el GF(2^8)
represented as the quotient ring
.ie t $FIELD sub 2 [u]/( u sup 8 + u sup 4 + u sup 3 + u sup 2 + 1 )$.
.el \{\
.RI GF(2)[ u ]/( u ^8
+
.IR u ^4
+
.IR u ^3
+
.IR u ^2
+
1).
.\}
Hence, a field element fits exactly into a single byte:
specifically, the field element
.ie t $x = a sub 0 + a sub 1 u + ... + a sub 7 u sup 7$
.el \{\
.I x
=
.IR a _0
+
.IR a _1
.I u
+ ... +
.IR a _7
.IR u ^7
.\}
corresponds to the byte
.ie t $B(x) = a sub 0 + 2 a sub 1 + ... + 2 sup 7 a sub 7$.
.el \{\
.IR B ( x )
=
.IR a _0
+
2
.IR a _1
+ ... +
2^7
.IR a _7.
.\}
Secrets longer than a single byte are shared bytewise independently,
which is why the shares leak information about the secret size.
.PP
The program represents share indices as small integers
.ie t $0 <= i < n$;
.el \{\
0 \(<=
.I i
<
.IR n ;
.\}
specifically, the share with index
.I i
is generated by
evaluating the polynomial
.ie t $p(x)$
.el .IR p ( x )
where
.ie t $i = B(x) - 1$.
.el \{\
.I i
=
.IR B ( x )
\- 1.
.\}
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>
.SH SEE ALSO
.BR distorted-keys (7),
.BR dgst (1ssl).