From: Mark Wooding <mdw@distorted.org.uk>
Date: Sun, 21 Dec 2008 20:28:01 +0000 (+0000)
Subject: README: Provide a handy overview and tutorial.
X-Git-Tag: 1.0.2~2
X-Git-Url: https://git.distorted.org.uk/~mdw/preload-hacks/commitdiff_plain/99413c3b70f13f34b66f1a4d528eed8d14349fd7

README: Provide a handy overview and tutorial.

It could probably do with building instructions and stuff.
---

diff --git a/Makefile b/Makefile
index c7afb3a..ba847c5 100644
--- a/Makefile
+++ b/Makefile
@@ -55,7 +55,7 @@ DISTTAR = $(DISTDIR).tar.gz
 distdir:
 	rm -rf $(DISTDIR)
 	mkdir $(DISTDIR)
-	ln $(SOURCES) $(MAN1) Makefile COPYING $(DISTDIR)
+	ln $(SOURCES) $(MAN1) Makefile COPYING README $(DISTDIR)
 	mkdir $(DISTDIR)/debian
 	ln debian/rules debian/copyright debian/changelog debian/control \
 	  debian/*.install $(DISTDIR)/debian
diff --git a/README b/README
new file mode 100644
index 0000000..980f5a4
--- /dev/null
+++ b/README
@@ -0,0 +1,268 @@
+PRELOAD-HACKS
+~~~~~~~~~~~~~
+
+What is it?
+
+	The preload-hacks distribution contains a couple of LD_PRELOAD-
+	able libraries which I find very useful.  Well, one useful one,
+	and one which is really handy in theory.
+
+	uopen	Traps when a process is trying to open(2) a Unix-domain
+		socket, and does the appropriate socket(2)/connect(2)
+		dance instead.  I've no idea why it doesn't work like
+		this in the first place.
+
+	noip	Traps when a process is trying to make an Internet
+		socket, and makes a Unix-domain socket instead.
+
+	The first one is the one which is useful in theory but which
+	I've not really made much use of in practice.
+
+
+uopen
+
+	The main use-case is variable signatures.  Many mail and news
+	clients nowadays have built-in sigmonsters, which choose a
+	.signature at random from a collection.  Some don't, of course,
+	which is a shame.  It would be nice if the sigmonster was
+	detachable, so you could just write a sigmonster and attach it
+	to your favourite newsreader.  It would be extra nice if
+	newsreaders (and mail clients) didn't have to use some kind of
+	weirdo sigmonster interface just to do this stupid thing with
+	.signatures.
+
+	All mail and news clients know how to read a .signature file.
+	It's why it's got that name.  So the right answer seems to be to
+	make this file magically have different contents each time it's
+	read.  Noticing when someone tries to read a regular file is
+	just awful so let's not think about that idea any more.  We
+	could make .signature be a named pipe; but named pipe servers
+	are very difficult to get right when there are multiple
+	simultaneous clients.  Sockets are, of course, the right answer
+	when client/server architectures come up.  And we've got a
+	convenient way of stashing sockets in the filesystem: PF_UNIX
+	sockets.
+
+	So, we write our sigmonster:
+
+	$ fwd -d from unix:$HOME/.signature to exec.fortune
+
+	And now we check to see whether it works.
+
+	$ cat ~/.signature
+	cat: /home/mdw/.signature: No such device or address
+
+	Hmm.  That blows.  Surely it's obvious how to read from a
+	socket.  But, no, the kernel won't do the socket/connect thing
+	for us.
+
+	Enter uopen.
+
+	$ uopen cat ~/.signature
+	Noise proves nothing.  Often a hen who has merely laid an egg 
+	cackles as if she laid an asteroid.
+                -- Mark Twain
+
+	Joy!
+
+	This isn't perfect.  The file is weird and not a proper file.
+	Emacs will refuse to visit it as a result.  But it /will/
+	happily insert the contents of the file into existing buffers.
+	Hopefully other editors are similar.  `less' wants the -f option
+	before it will bother.  But actually it works pretty well.
+
+	The right place for the functionality of uopen is in the kernel.
+	It shouldn't be difficult.  I even submitted a patch to the
+	Linux kernel list to do precisely that, once, back in the days
+	of 2.0.x.  It was ignored, and I gave up; the patch bitrotted
+	hopelessly.  My LD_PRELOAD hack still works.  There's no
+	configuration.  It just works.
+
+	My .signature has been `-- [mdw]' for years now, and that's
+	unlikely to change.  So I don't actually use uopen very much.
+	But it's cool to know that it exists.
+
+
+noip
+
+  The basic idea
+
+	This I use every day.  All the time.  Here's the use case.
+	We'll see some more examples later.
+
+	Some random program has a client/server split between the main
+	guts of the thing and its user interface, and the two
+	communicate over TCP sockets.  There are lots of examples: SLIME
+	(the Superior Lisp Interaction Mode for Emacs) runs a Common
+	Lisp system as a separate process.  The SAGE notebook runs a web
+	server and you're meant to use a JavaScript-supporting web
+	browser to drive it.  All sorts of stuff.  Usually the
+	programmer knew just enough to remember to bind the server's
+	listening socket to 127.0.0.1, to stop everyone on the Internet
+	from connecting, but often the security consciousness stops
+	there.  If you're very lucky, there's some sort of password
+	mechanism.
+
+	The problem, of course, occurs on a multi-user system.  Binding
+	to localhost doesn't stop any other user of the same machine
+	from connecting.  In the cases of SLIME and SAGE, this is a big
+	problem: both provide a full programming environment
+	(respectively Common Lisp and Python) which would let an
+	attacker do anything he likes in your name.
+
+	Passwords are wretched as a security mechanism.  Besides, I
+	shouldn't need a damned password to talk to one of my own
+	processes from another one of my own processes!  The operating
+	system should be able to ensure that processes owned by the same
+	user can communicate securely.  There's a whole filesystem with
+	access control and everything.
+
+	The right answer is to use Unix-domain sockets, which live in
+	the filesystem and have proper access control applied to them.
+	But programmers are lazy, and Unix-domain sockets don't exist on
+	Windows (well, unless you install Cygwin, but I can see why
+	that's an unpopular idea).
+
+	The noip LD_PRELOAD hack intercepts calls to socket(2).
+	If the process is asking for a PF_INET socket, then it hands out
+	a PF_UNIX socket instead.  If the process tries to bind(2) its
+	socket to 127.0.0.1:12345, say, then noip binds it to
+	/tmp/noip-USER/127.0.0.1:12345 instead (having previously
+	created the directory /tmp/noip-USER and made sure that nobody
+	else can get to it).  If the process tries to connect(2)
+	somewhere, noip fixes up the address.  The noip hack intercepts
+	14 different system calls in order to prevent its systematic
+	dishonesty from being discovered.
+
+  Configuration
+
+	Running a program under noip effectively only allows it to talk
+	to other programs running under noip.  This is sort of the idea,
+	but it's rather restrictive in practice.  I can happily run
+
+	$ noip emacs
+
+	and start up SLIME, and Emacs and SLIME will communicate
+	securely over a Unix-domain socket without either of them
+	noticing.  But now Emacs can't talk to anything other than
+	SLIME, which makes w3m-el less useful than it used to be, and,
+	worse, my Common Lisp programs can't talk to anything external
+	either, which may make writing network-aware Lisp programs
+	annoying.
+
+	It gets worse with SAGE.  I can run
+
+	$ noip sage -notebook
+
+	and in another window
+
+	$ noip iceweasel http://localhost:8000/
+
+	(or Firefox, on Ubuntu), and the two will communicate happily.
+	But now my Iceweasel is crippled and can't actually talk to the
+	rest of the Internet.  The point of the exercise was to make my
+	SAGE process secure, not to make me run two copies of Iceweasel
+	and have to cope with the inevitable profile fork.
+
+	So noip can be configured.  It still defaults to safety:
+	whenever the process asks for a new Internet socket, noip hands
+	it a fake plastic Unix-domain socket instead.  But when the
+	process tries to bind or connect its socket, noip will look the
+	address up in a list to decide what to do.  If the result comes
+	back `allow', then noip will do a three-card Monte, rustling up
+	a real PF_INET socket and replacing the plastic imitation; if
+	the result comes back `deny' then noip will continue with its
+	elaborate deception.
+
+	The configuration file lives in $HOME/.noip.  Mine says
+	something like this.
+
+	## standard configuration
+
+	## debug
+	realconnect +172.29.199.2:25
+	realconnect +172.29.199.2:53
+	realconnect +172.29.199.2:80
+	realconnect +172.29.199.2:3128
+	realconnect +127.0.0.1:6010-6020
+	realconnect -127.0.0.0/8
+	realconnect -local
+
+	(172.29.199.2 is the IP address of the machine I took this
+	from.)  What this says is as follows.
+
+	  * Don't produce debugging output, but let me turn it on easily
+	    if I feel the urge.
+
+	  * Allow direct connection to my SMTP server, on port 25.  (The
+	    `+' means `allow'.)
+
+	  * Allow conversations with my local DNS server.  (The noip
+	    hack is not particularly discriminating.  It replaces UDP
+	    sockets with Unix-domain datagram sockets, just as it
+	    replaces TCP sockets with Unix-domain stream sockets.)
+
+	  * Allow conversations with my local web server.
+
+	  * Allow conversations with my local squid proxy.
+
+	  * Allow conversations with SSH-forwarded X displays.
+
+	  * Don't allow any other communication with anything else on
+	    the loopback network 127.0.0.0/8.  (I've still no idea why
+	    each machine needs 16 million IP addresses for talking to
+	    itself.  The `-' means `deny'.)
+
+	  * Don't allow any other communication with any of my other
+	    local IP addresses.  (noip will work out which IP addresses
+	    are local from your network interface configuration.)
+
+	  * And finally, implicitly, allow anything else.
+
+	The rules follow the squid convention: the default is to do
+	whatever the last rule didn't do, so if the last rule says
+	`deny' then the default is `allow', and vice versa.
+
+	Armed with this configuration, I now routinely run both Emacs
+	and Iceweasel exclusively under the control of noip.  And I've
+	done this for several years.
+
+  SSH tricks
+
+	SSH is made of win.  Its X forwarding is lovely.  Its port
+	forwarding divine.  Almost.
+
+	Here's a common scenario.  I'm running on a multi-user server,
+	shared with several other people whom I don't necessarily trust.
+	I want to check some files out from my office's version-control
+	system.  Traditional answer:
+
+	$ ssh -L 12345:vcs.work.com:345 mdw@gateway.work.com
+
+	Now I can run
+
+	$ vcs -d localhost:12345 checkout ...
+
+	and all works well.  Of course, anyone else on the server can do
+	the same thing, so I've just leaked my company's secret sauce.
+	(I don't believe in secret sauce, but I ought to show willing.)
+
+	How do I fix this?  Easy!
+
+	$ noip ssh -L 12345:vcs.work.com:345 mdw@gateway.work.com
+
+	$ noip vcs -d localhost:12345 checkout ...
+
+	And it all works.  In this case, in particular, it's essential
+	that the /same/ SSH process binds a safe, plastic local end to
+	its forwarded VCS port, and is able to make a real, potentially
+	dangerous Internet connection to gateway.work.com.  Of course,
+	since I run Emacs under noip anyway, all the version control
+	stuff that Emacs does magically finds the SSH tunnel and works
+	without me having to care.
+
+
+Local variables:
+mode: text
+fill-column: 72
+End:
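
Note on the `uopen' section of the README above.  The socket/connect
dance it mentions can be sketched in Python, which makes both the
failure and the fallback easy to try out.  This is only an illustration
under stated assumptions, not uopen's actual code: the helper names
open_or_connect and demo are invented here, and the errno values are
what Linux (ENXIO, the `No such device or address' shown in the README)
and the BSDs (EOPNOTSUPP) report when open(2) meets a socket.

```python
import errno
import os
import socket
import tempfile


def open_or_connect(path):
    """Return a binary reader for `path', connecting if it is a socket."""
    try:
        return open(path, "rb")
    except OSError as exc:
        # ENXIO on Linux, EOPNOTSUPP on the BSDs: `path' names a socket.
        if exc.errno not in (errno.ENXIO, errno.EOPNOTSUPP):
            raise
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
    except OSError:
        sock.close()
        raise
    reader = sock.makefile("rb")
    sock.close()  # the descriptor stays open until `reader' is closed
    return reader


def demo():
    """Serve one canned signature from a temporary socket and read it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "signature")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            # connect(2) completes once the request is queued in the
            # listen backlog, so one thread can play both roles.
            reader = open_or_connect(path)
            conn, _ = server.accept()
            with conn:
                conn.sendall(b"-- [mdw]\n")
            with reader:
                return reader.read()
```

demo() reads the signature back through open_or_connect, much as
`uopen cat ~/.signature' does with the real library.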
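
Note on `The basic idea' under noip in the README above.  The address
rewriting described there (127.0.0.1:12345 becoming
/tmp/noip-USER/127.0.0.1:12345, inside a directory nobody else can
reach) might look roughly like this.  The function name socket_path,
its parameters, and the exact ownership and permission checks are
assumptions made for the sketch; the README does not say how noip
verifies the directory.

```python
import os
import stat
import tempfile


def socket_path(base, user, host, port):
    """Map an Internet address to a path in a private per-user directory."""
    directory = os.path.join(base, "noip-" + user)
    try:
        os.mkdir(directory, 0o700)
    except FileExistsError:
        pass  # fine, as long as the checks below pass
    info = os.lstat(directory)
    if not stat.S_ISDIR(info.st_mode):
        raise RuntimeError(directory + " is not a directory")
    # Assumed policy: owned by us, no access at all for group or others.
    if info.st_uid != os.getuid() or info.st_mode & 0o077:
        raise RuntimeError(directory + " is not private to this user")
    return os.path.join(directory, "%s:%d" % (host, port))
```

For example, socket_path("/tmp", "mdw", "127.0.0.1", 12345) returns
"/tmp/noip-mdw/127.0.0.1:12345", creating /tmp/noip-mdw first if it
does not exist yet.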
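
Note on the `Configuration' section of the README above.  The rule
matching it describes can be sketched in a few lines of Python.  This
only illustrates the behaviour the README states (`+' allows, `-'
denies, and the default is the opposite of the last rule); it is not
noip's parser.  The names Rule, decide and net are made up, and
first-match-wins ordering is assumed from the squid convention.  So is
staying with the fake socket when there are no rules at all.  The
`-local' rule is left out because working out the local addresses
needs the network interface configuration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One hypothetical parsed `realconnect' line."""
    allow: bool                      # True for `+', False for `-'
    network: ipaddress.IPv4Network   # a single host is a /32 network
    ports: range = range(0, 65536)   # every port unless narrowed


def decide(rules, host, port):
    """Return True if a real Internet socket should be used."""
    addr = ipaddress.IPv4Address(host)
    for rule in rules:
        if addr in rule.network and port in rule.ports:
            return rule.allow        # assumed: the first match wins
    if not rules:
        return False                 # assumed: no rules means stay safe
    # Nothing matched: do whatever the last rule did not do.
    return not rules[-1].allow


def net(text):
    return ipaddress.IPv4Network(text)


# The README's example configuration, minus `-local'.
RULES = [
    Rule(True, net("172.29.199.2/32"), range(25, 26)),      # SMTP
    Rule(True, net("172.29.199.2/32"), range(53, 54)),      # DNS
    Rule(True, net("172.29.199.2/32"), range(80, 81)),      # web server
    Rule(True, net("172.29.199.2/32"), range(3128, 3129)),  # squid
    Rule(True, net("127.0.0.1/32"), range(6010, 6021)),     # forwarded X
    Rule(False, net("127.0.0.0/8")),                        # rest of loopback
]
```

With these rules, decide(RULES, "127.0.0.1", 8000) comes back False,
so the SAGE notebook port stays on a Unix-domain socket.  An address
that matches no rule falls through to the implicit `allow', because
the last rule is a `deny'.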