# -*-org-*-
#+TITLE: ~runlisp~ -- run scripts written in Common Lisp
#+AUTHOR: Mark Wooding
#+LaTeX_CLASS: strayman
#+LaTeX_HEADER: \usepackage{tikz, gnuplot-lua-tikz}

~runlisp~ is a small C program intended to be run from a script ~#!~ line. It selects and invokes a Common Lisp implementation, so as to run the script. In this sense, ~runlisp~ is a partial replacement for ~cl-launch~.

Currently, the following Lisp implementations are supported:

+ Armed Bear Common Lisp (~abcl~),
+ Clozure Common Lisp (~ccl~),
+ GNU CLisp (~clisp~),
+ Carnegie--Mellon University Common Lisp (~cmucl~),
+ Embeddable Common Lisp (~ecl~), and
+ Steel Bank Common Lisp (~sbcl~).

I'm happy to take patches to support additional free Lisp implementations. I'm not interested in supporting non-free Lisp systems.

* Writing scripts in Lisp

** Basic use

The obvious way to use ~runlisp~ is in a shebang (~#!~) line at the top of a script. For example:

: #! /usr/local/bin/runlisp
: (format t "Hello from Lisp!~%")

Script interpreters must be named with absolute pathnames in shebang lines; if your ~runlisp~ is installed somewhere other than ~/usr/local/bin/~ then you'll need to write something different. Alternatively, a common hack involves abusing the ~env~ program as a script interpreter, because it will do a path search for the program it's supposed to run:

: #! /usr/bin/env runlisp
: (format t "Hello from Lisp!~%")

** Specific Lisps

Lisp implementations are not created equal -- for good reason. If your script depends on the features of some particular Lisp implementation, then you can tell ~runlisp~ that it must use that implementation to run your script using the ~-L~ option; for example:

: #! /usr/local/bin/runlisp -Lsbcl
: (format t "Hello from Steel Bank Common Lisp!~%")

If your script supports several Lisps, but not all, then list them all in the ~-L~ option, separated by commas:

: #! /usr/local/bin/runlisp -Lsbcl,ccl
: (format t #.(concatenate 'string
:                          "Hello from "
:                          #+sbcl "Steel Bank"
:                          #+ccl "Clozure"
:                          #-(or sbcl ccl) "an unexpected"
:                          " Common Lisp!~%"))

** Embedded options

If your script requires features of particular Lisp implementations /and/ you don't want to hardcode an absolute path to ~runlisp~, then you have a problem. Most Unix-like operating systems will parse a shebang line into the initial ~#!~, the pathname to the interpreter program, and a /single/ optional argument: any further spaces don't separate further arguments: they just get included in the first argument, all the way up to the end of the line. So

: #! /usr/bin/env runlisp -Lsbcl
: (format t "Hello from Steel Bank Common Lisp!~%")

won't work: it'll just try to run a program named ~runlisp -Lsbcl~, with a space in the middle of its name, and that's quite unlikely to exist.

To help with this situation, ~runlisp~ reads /embedded options/ from your script. Specifically, if the script's second line contains the token ~@RUNLISP:~ then ~runlisp~ will parse additional options from this line. So the following will work properly.

: #! /usr/bin/env runlisp
: ;;; @RUNLISP: -Lsbcl
: (format t "Hello from Steel Bank Common Lisp!~%")

Embedded options are split at spaces properly. Spaces can be escaped or quoted in (an approximation to) the usual shell manner, should that be necessary. See the manpage for the gory details.

** Common environment

~runlisp~ puts some effort into making sure that Lisp scripts get the same view of the world regardless of which implementation is running them. For example:

+ The ~asdf~ and ~uiop~ systems are loaded and ready for use.
+ The script's command-line arguments are available in ~uiop:*command-line-arguments*~. Its name can be found by calling ~(uiop:argv0)~ -- though it's probably also in ~*load-pathname*~.
+ The prevailing Unix standard input, output, and error files are available through the Lisp ~*standard-input*~, ~*standard-output*~, and ~*error-output*~ streams, respectively. (This is, alas, not a foregone conclusion.)
+ The keyword ~:runlisp-script~ is added to the ~*features*~ list. This means that your script can tell whether it's being run from the command line, and should therefore do its thing and then quit; or merely being loaded into a Lisp system, e.g., for debugging or development, and should sit still and not do anything until it's asked.

See the manual for the complete list of guarantees.

* Invoking Lisp implementations

** Basic use

A secondary use of ~runlisp~ is in build scripts for Lisp programs. If the entire project is just a Lisp library, then it's possibly acceptable to just provide an ASDF system definition and expect users to type ~(asdf:load-system "mumble")~ to use it. If it's a program, or there are things other than Lisp which ASDF can't or shouldn't handle -- significant pieces in other languages, or a Lisp executable image to make and install -- then it seems sensible to make the project's main build system be something language-agnostic, say Unix ~make~, and arrange for that to invoke ASDF at the appropriate time.

But how should that be arranged? It's relatively easy for a project's Lisp code to support multiple Lisp implementations; but each implementation wants different runes for evaluating Lisp forms from the command line, and some of them don't provide an ideal environment for integrating into a build system. So ~runlisp~ provides a simple common command-line interface for evaluating Lisp forms. For example:

: $ runlisp -e '(format t "~A~%" (+ 1 2))'
: 3

If your build script needs to get information out of Lisp, then wrapping ~format~, or even ~prin1~, around forms is annoying; so ~runlisp~ has a ~-p~ option which prints the values of the forms it evaluates.

: $ runlisp -p '(+ 1 2)'
: 3

If a form produces multiple values, then ~-p~ will print all of them separated by spaces, on a single line:

: $ runlisp -p '(floor 5 2)'
: 2 1

In addition to evaluating forms with ~-e~, and printing their values with ~-p~, you can also load a file of Lisp code using ~-l~.

When ~runlisp~ is acting on ~-e~, ~-p~, and/or ~-l~ options, it's said to be running in /eval/ mode, rather than its usual /script/ mode. In eval mode, it /doesn't/ set ~:runlisp-script~ in ~*features*~.

You can still insist that ~runlisp~ use a particular Lisp implementation, or one of a subset of implementations, using the ~-L~ option mentioned above.

: $ runlisp -Lsbcl -p "(lisp-implementation-type)"
: "SBCL"

** Command-line processing

When scripting a Lisp -- as opposed to running a Lisp script -- it's not necessarily the case that your script knows in advance exactly what it needs to ask Lisp to do. For example, it might need to tell Lisp to install a program in a particular directory, determined by Autoconf. While it's certainly /possible/ to quote such data and splice them into Lisp forms, it's more convenient to pass them in separately. So ~runlisp~ ensures that the command-line arguments are available to Lisp forms via ~uiop:*command-line-arguments*~, as they are to a Lisp script.

: $ runlisp -p "uiop:*command-line-arguments*" one two three
: ("one" "two" "three")

When running Lisp forms like this, ~(uiop:argv0)~ isn't very meaningful. (Currently, it reveals the name of the script which ~runlisp~ uses to implement this feature.)

* Configuring =runlisp=

** Where =runlisp= looks for configuration

You can influence which Lisp implementations are chosen by ~runlisp~ by writing a configuration file, and/or setting an environment variable.

~runlisp~ looks for configuration in ~~/.runlisprc~, and in ~~/.config/runlisprc~. You could put configuration in both, but that doesn't seem like a great idea.

A configuration file contains blank lines, comments, and command-line options, just as you'd write them to the shell. Simple quoting and escaping are provided: see the manual page for the full details. Each line is processed independently, so it doesn't work to write an option on one line and then its argument on the next.

The environment variable ~RUNLISP_OPTIONS~ is processed /after/ reading the configuration file(s), if any. Again, it should contain command-line options, as you'd write them to the shell.

** Deciding which Lisp implementation to use

The most useful option to use here is ~-P~, which builds up a /preference list/, in order. The argument to ~-P~ is a comma-separated list of Lisp implementation names, just like you'd give to ~-L~. If you provide multiple ~-P~ options (e.g., on different lines of your configuration file, or separately in the configuration file and environment variable), then the lists are concatenated. Since the environment variable is processed after the configuration file, this means that implementations named in ~RUNLISP_OPTIONS~ are added to the end of the list built from the configuration file.

When deciding which Lisp implementation to use, ~runlisp~ works as follows. It builds a list of /acceptable/ Lisp implementations from the ~-L~ options, and a list of /preferred/ Lisp implementations from the ~-P~ options. If there aren't any ~-L~ options, then it assumes that /all/ Lisp implementations are acceptable; but if there are no ~-P~ options then it assumes that /no/ Lisp implementations are preferred. It then works through the preferred list in order: if it finds an implementation which is installed and acceptable, then it uses that one. If that doesn't work, then it works through the acceptable implementations that it hasn't tried yet, in order, and if it finds one of those that's installed, then it runs that one. Otherwise it reports an error and gives up.

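
The selection procedure described above can be sketched in a few lines of Lisp. This is only an illustration of the rules as stated, not the actual implementation: ~runlisp~ itself is a C program. The names ~choose-lisp~ and ~*all-lisps*~ are hypothetical, and the default ordering of implementations in ~*all-lisps*~ is an assumption made up for the example.

```lisp
;;; Illustrative sketch only.  CHOOSE-LISP and *ALL-LISPS* are hypothetical
;;; names; runlisp's real logic lives in its C source.

(defparameter *all-lisps* '("sbcl" "ccl" "clisp" "ecl" "cmucl" "abcl")
  "Every supported implementation.  The order here is an assumption.")

(defun choose-lisp (acceptable preferred installedp)
  "Return the name of the implementation to run, or NIL if none will do.
ACCEPTABLE is the list built from `-L' options (NIL means `all of them');
PREFERRED is the list built from `-P' options (NIL means `none');
INSTALLEDP is a predicate which says whether an implementation is installed."
  (let ((acceptable (or acceptable *all-lisps*)))
    (flet ((usable-p (lisp)
             (and (member lisp acceptable :test #'string=)
                  (funcall installedp lisp))))
      (or
       ;; First pass: the preferred list, in order; take the first entry
       ;; which is both acceptable and installed.
       (find-if #'usable-p preferred)
       ;; Second pass: acceptable implementations not already tried,
       ;; in order; take the first one which is installed.
       (find-if (lambda (lisp)
                  (and (not (member lisp preferred :test #'string=))
                       (funcall installedp lisp)))
                acceptable)))))

;; Example: the script accepts SBCL or Clozure CL; the user prefers ECL,
;; then SBCL; only Clozure CL and ECL are installed.
(let ((installed '("ccl" "ecl")))
  (choose-lisp '("sbcl" "ccl")
               '("ecl" "sbcl")
               (lambda (lisp) (member lisp installed :test #'string=))))
;; => "ccl"
```

In the example, ECL is preferred but not acceptable to the script, and SBCL is acceptable but not installed, so the first pass finds nothing; the second pass then settles on Clozure CL. A ~NIL~ result corresponds to the case where ~runlisp~ reports an error and gives up.
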
** Clearing the preferred list

Since the environment variable is processed after the configuration files, it can only append more Lisp implementations to the end of the preferred list, which may well not be so helpful. There's an additional option ~-C~, which completely clears the preferred list. The idea is that you can write ~-C~ at the start of your ~RUNLISP_OPTIONS~ environment variable to temporarily override your usual configuration for some special effect.

* What's wrong with =cl-launch=?

The short version is that ~cl-launch~ is slow and inconvenient.

~cl-launch~ is a big, complicated Common Lisp/Bourne shell polyglot which tries to do everything but doesn't quite succeed.

** It's slow.

I took a trivial Lisp script:

: (format t "Hello from ~A!~%~
:            Script = `~A'~%~
:            Arguments = (~{`~A'~^, ~})~%"
:         (lisp-implementation-type)
:         (uiop:argv0)
:         uiop:*command-line-arguments*)

I timed how long it took to run on all of ~runlisp~'s supported Lisp implementations, and compared them to how long ~cl-launch~ took: the results are shown in table [[tab:runlisp-vanilla]]. ~runlisp~ is /at least/ two and a half times faster at running this script than ~cl-launch~ on all implementations except Clozure CL[fn:slow-ccl], and approaching four and a half times faster on SBCL.

#+CAPTION: ~cl-launch~ vs ~runlisp~ (with vanilla images)
#+NAME: tab:runlisp-vanilla
#+ATTR_LATEX: :float t :placement [tbp]
|------------------+-------------------+-----------------+----------------------|
| *Implementation* | *~cl-launch~ (s)* | *~runlisp~ (s)* | *~runlisp~ (factor)* |
|------------------+-------------------+-----------------+----------------------|
| ABCL             |            7.3036 |          2.6027 |                2.806 |
| Clozure CL       |            1.2769 |          0.9678 |                1.319 |
| GNU CLisp        |            1.2498 |          0.2659 |                4.700 |
| CMU CL           |            0.9665 |          0.3065 |                3.153 |
| ECL              |            0.8025 |          0.3173 |                2.529 |
| SBCL             |            0.3266 |          0.0739 |                4.419 |
|------------------+-------------------+-----------------+----------------------|
#+TBLFM: $4=$2/$3;%.3f

But this is using the `vanilla' Lisp images installed with the implementations. ~runlisp~ by default builds custom images for most Lisp implementations, which improves startup performance significantly; see table [[tab:runlisp-custom]]. (I don't currently know how to build a useful custom image for ABCL. ~runlisp~ does build a custom image for ECL, but it doesn't help significantly.) These results are summarized in figure [[fig:lisp-graph]].

#+CAPTION: ~cl-launch~ vs ~runlisp~ (with custom images)
#+NAME: tab:runlisp-custom
#+ATTR_LATEX: :float t :placement [tbp]
|------------------+-------------------+-----------------+----------------------|
| *Implementation* | *~cl-launch~ (s)* | *~runlisp~ (s)* | *~runlisp~ (factor)* |
|------------------+-------------------+-----------------+----------------------|
| ABCL             |            7.3036 |          2.5873 |                2.823 |
| Clozure CL       |            1.2769 |          0.0088 |              145.102 |
| GNU CLisp        |            1.2498 |          0.0146 |               85.603 |
| CMU CL           |            0.9665 |          0.0063 |              153.413 |
| ECL              |            0.8025 |          0.3185 |                2.520 |
| SBCL             |            0.3266 |          0.0077 |               42.416 |
|------------------+-------------------+-----------------+----------------------|
#+TBLFM: $4=$2/$3;%.3f

#+CAPTION: Comparison of ~runlisp~ and ~cl-launch~ times
#+NAME: fig:lisp-graph
#+ATTR_LATEX: :float t :placement [tbp]
[[file:doc/lisp-graph.tikz]]

Unlike ~cl-launch~, with some Lisp implementations at least, ~runlisp~ startup performance is usefully comparable to other popular scripting language implementations. I wrote similarly trivial scripts in a number of other languages, and timed them; the results are tabulated in table [[tab:runlisp-interp]] and graphed in figure [[fig:interp-graph]].

#+CAPTION: ~runlisp~ vs other interpreters
#+NAME: tab:runlisp-interp
#+ATTR_LATEX: :float t :placement [tbp]
|------------------------------+-------------|
| *Implementation*             | *Time (ms)* |
|------------------------------+-------------|
| Clozure CL                   |         8.8 |
| GNU CLisp                    |        14.6 |
| CMU CL                       |         6.3 |
| SBCL                         |         7.7 |
|------------------------------+-------------|
| Perl                         |         1.2 |
| Python                       |        10.3 |
|------------------------------+-------------|
| Debian Almquist shell (dash) |         1.4 |
| GNU Bash                     |         2.0 |
| Z Shell                      |         4.1 |
|------------------------------+-------------|
| Tiny C (compile & run)       |         1.2 |
| GCC (precompiled)            |         0.5 |
|------------------------------+-------------|

#+CAPTION: Comparison of ~runlisp~ and other script interpreters
#+NAME: fig:interp-graph
#+ATTR_LATEX: :float t :placement [tbp]
[[file:doc/interp-graph.tikz]]

(All the timings in this section were performed on the same 2020 Dell XPS13 laptop running Debian `buster'. The tools used to make the measurements are included in the source distribution, in the ~bench/~ subdirectory.)

[fn:slow-ccl] I don't know why Clozure CL shows such a small difference here.

** It's inconvenient

~cl-launch~ has this elaborate machinery which reads shell script fragments from various places and sets variables like ~$LISPS~, but it doesn't quite work.

Unlike other scripting languages such as Perl or Python, Common Lisp has lots of implementations, and they all have various unique features (and bugs) which a script might rely on (or need to avoid). Also, a user might have preferences about which Lisps to use. ~cl-launch~'s approach to this problem is a ~system_preferred_lisps~ shell function which can be used in ~~/.cl-launchrc~ to select a Lisp system for a particular `software system' (a notion which doesn't appear to be well defined); but this all works by editing a single ~$LISPS~ shell variable.

By contrast, ~runlisp~ has a ~-L~ option with which scripts can specify the Lisp systems they support (in a preference order), and a ~-P~ option with which users can express their own preferences (e.g., in the environment or a configuration file): ~runlisp~ will never choose a Lisp system which the script can't deal with, but it will respect the user's relative preferences.

** It doesn't establish a (useful) common environment

A number of Lisp systems are annoyingly deficient in their handling of scripts.

For example, when GNU CLisp's ~-x~ option is used, it rebinds ~*standard-input*~ to an internal string stream holding the expression passed in on the command line, leaving the process's actual stdin nearly impossible to access.

: $ date | cl-launch -l sbcl -i "(princ (read-line nil nil))" # expected
: Sun 9 Aug 14:39:10 BST 2020
: $ date | cl-launch -l clisp -i "(princ (read-line nil nil))" # bug!
: NIL

As another example, Armed Bear Common Lisp doesn't seem to believe in the stderr stream: when it starts up, ~*error-output*~ is bound to the standard output, just like ~*standard-output*~. Also, ~cl-launch~ loading ASDF causes a huge number of ~style-warning~ messages to be written to stdout, making ABCL pretty much useless for writing filter scripts.

: $ cl-launch -l sbcl -i '(progn
:     (format *standard-output* "output~%")
:     (format *error-output* "error~%"))' \
:   > >(sed 's/^/stdout: /') 2> >(sed 's/^/stderr: /')
: stdout: output
: stderr: error
: $ cl-launch -l abcl -i '(progn
:     (format *standard-output* "output~%")
:     (format *error-output* "error~%"))' \
:   > >(sed 's/^/stdout: /') 2> >(sed 's/^/stderr: /')
: [1813 lines of compiler warnings tagged `stdout:']
: stdout: output
: stdout: error

~runlisp~ takes care of all of this, providing a basic but useful common level of shell integration for all its supported Lisp implementations.
In particular:

+ It ensures that the standard Unix `stdin', `stdout', and `stderr' file descriptors are hooked up to the Lisp ~*standard-input*~, ~*standard-output*~, and ~*error-output*~ streams.
+ It ensures that starting a script doesn't write a deluge of diagnostic drivel.

The complete details are given in ~runlisp~'s manpage.

** Why might one prefer =cl-launch= anyway?

On the other hand, ~cl-launch~ is well established and full-featured.

~cl-launch~ compiles scripts before trying to run them, so they'll run faster on Lisps which use an interpreter by default. It has a caching feature so running a script a second time doesn't need to recompile it. If your scripts are compute-intensive and benefit from ahead-of-time compilation then maybe ~cl-launch~ is preferable.

~cl-launch~ supports more Lisp systems. I only have six installed on my development machine at the moment, so those are the ones that ~runlisp~ supports. If you want your scripts to be able to run on other Lisps, then ~cl-launch~ is the way to do that. Of course, I welcome patches to help ~runlisp~ support other free Lisp implementations. ~cl-launch~ also supports proprietary Lisps: I have very little interest in these, so if you want to run scripts using Allegro or LispWorks then ~cl-launch~ is your only choice.