# -*-org-*-
#+TITLE: ~runlisp~ -- run scripts written in Common Lisp
#+AUTHOR: Mark Wooding
#+LaTeX_CLASS: strayman
#+LaTeX_HEADER: \usepackage{tikz, gnuplot-lua-tikz}

~runlisp~ is a small C program intended to be run from a script ~#!~
line. It selects and invokes a Common Lisp implementation, so as to run
the script. In this sense, ~runlisp~ is a partial replacement for
~cl-launch~.

Currently, the following Lisp implementations are supported:

  + Armed Bear Common Lisp (~abcl~),
  + Clozure Common Lisp (~ccl~),
  + GNU CLisp (~clisp~),
  + Carnegie--Mellon University Common Lisp (~cmucl~),
  + Embeddable Common Lisp (~ecl~), and
  + Steel Bank Common Lisp (~sbcl~).

I'm happy to take patches to support additional free Lisp
implementations. I'm not interested in supporting non-free Lisp
systems.


* Writing scripts in Lisp

** Basic use

The obvious way to use ~runlisp~ is in a shebang (~#!~) line at the top
of a script. For example:

: #! /usr/local/bin/runlisp
: (format t "Hello from Lisp!~%")

Script interpreters must be named with absolute pathnames in shebang
lines; if your ~runlisp~ is installed somewhere other than
~/usr/local/bin/~ then you'll need to write something different.
Alternatively, a common hack involves abusing the ~env~ program as a
script interpreter, because it will do a path search for the program
it's supposed to run:

: #! /usr/bin/env runlisp
: (format t "Hello from Lisp!~%")

** Specific Lisps

Lisp implementations are not created equal -- for good reason. If your
script depends on the features of some particular Lisp implementation,
then you can tell ~runlisp~ that it must use that implementation to run
your script using the ~-L~ option; for example:

: #! /usr/local/bin/runlisp -Lsbcl
: (format t "Hello from Steel Bank Common Lisp!~%")

If your script supports several Lisps, but not all, then list them all
in the ~-L~ option, separated by commas:

: #! /usr/local/bin/runlisp -Lsbcl,ccl
: (format t #.(concatenate 'string
:                          "Hello from "
:                          #+sbcl "Steel Bank"
:                          #+ccl "Clozure"
:                          #-(or sbcl ccl) "an unexpected"
:                          " Common Lisp!~%"))

** Embedded options

If your script requires features of particular Lisp implementations
/and/ you don't want to hardcode an absolute path to ~runlisp~, then you
have a problem. Most Unix-like operating systems will parse a shebang
line into the initial ~#!~, the pathname to the interpreter program,
and a /single/ optional argument. Any further spaces don't separate
further arguments: they just get included in the first argument, all the
way up to the end of the line. So

: #! /usr/bin/env runlisp -Lsbcl
: (format t "Hello from Steel Bank Common Lisp!~%")

won't work: it'll just try to run a program named ~runlisp -Lsbcl~, with
a space in the middle of its name, and that's quite unlikely to exist.
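
To make the splitting rule concrete, here's a small shell sketch of it. The function name ~split_shebang~ is made up for illustration, and the real splitting is done by the kernel rather than by any shell; this only mimics the common behaviour described above, where everything after the interpreter name becomes one argument.

```shell
# Mimic how a typical Unix kernel splits a shebang line: the interpreter
# pathname first, then the whole remainder as ONE argument.
# (Hypothetical helper for illustration; not part of runlisp.)
split_shebang() {
    line=${1#'#!'}                      # drop the leading `#!'
    line=${line#"${line%%[! ]*}"}       # trim leading spaces
    interp=${line%% *}                  # everything up to the first space
    case $line in
        *' '*) arg=${line#* } ;;        # the remainder, spaces and all
        *)     arg= ;;                  # no argument at all
    esac
    arg=${arg#"${arg%%[! ]*}"}          # trim leading spaces from the argument
    printf 'interpreter=[%s] argument=[%s]\n' "$interp" "$arg"
}

split_shebang '#! /usr/bin/env runlisp -Lsbcl'
```

This prints ~interpreter=[/usr/bin/env] argument=[runlisp -Lsbcl]~: the ~env~ program is handed the single string ~runlisp -Lsbcl~ and goes looking for a program with exactly that name.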

To help with this situation, ~runlisp~ reads /embedded options/ from
your script. Specifically, if the script's second line contains the
token ~@RUNLISP:~ then ~runlisp~ will parse additional options from this
line. So the following will work properly.

: #! /usr/bin/env runlisp
: ;;; @RUNLISP: -Lsbcl
: (format t "Hello from Steel Bank Common Lisp!~%")

Embedded options are split at spaces properly. Spaces can be escaped or
quoted in (an approximation to) the usual shell manner, should that be
necessary. See the manpage for the gory details.
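
As a rough picture of what this means, the following shell sketch pulls the embedded options out of a script's second line. The helper name ~embedded_options~ is hypothetical, and it ignores quoting and escaping entirely; ~runlisp~ does the real parsing itself, and the manpage remains the authority on the details.

```shell
# Print whatever follows the `@RUNLISP:' token on line 2 of a script.
# Prints nothing if line 2 doesn't contain the token.
# (Hypothetical helper; a crude approximation with no quoting support.)
embedded_options() {
    sed -n '2s/^.*@RUNLISP:[[:space:]]*//p' "$1"
}

# Try it on a throwaway copy of the example script.
script=$(mktemp)
printf '%s\n' \
    '#! /usr/bin/env runlisp' \
    ';;; @RUNLISP: -Lsbcl' \
    '(format t "Hello from Steel Bank Common Lisp!~%")' >"$script"
embedded_options "$script"
rm -f "$script"
```

For the example script this prints ~-Lsbcl~. Because the token sits inside a Lisp comment, the Lisp reader never sees the options.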

** Common environment

~runlisp~ puts some effort into making sure that Lisp scripts get the
same view of the world regardless of which implementation is running
them.

For example:

  + The ~asdf~ and ~uiop~ systems are loaded and ready for use.

  + The script's command-line arguments are available in
    ~uiop:*command-line-arguments*~. Its name can be found by calling
    ~(uiop:argv0)~ -- though it's probably also in ~*load-pathname*~.

  + The prevailing Unix standard input, output, and error files are
    available through the Lisp ~*standard-input*~, ~*standard-output*~,
    and ~*error-output*~ streams, respectively. (This is, alas, not a
    foregone conclusion.)

  + The keyword ~:runlisp-script~ is added to the ~*features*~ list.
    This means that your script can tell whether it's being run from the
    command line, and should therefore do its thing and then quit; or
    merely being loaded into a Lisp system, e.g., for debugging or
    development, and should sit still and not do anything until it's
    asked.

See the manual for the complete list of guarantees.


* Invoking Lisp implementations

** Basic use

A secondary use of ~runlisp~ is in build scripts for Lisp programs. If
the entire project is just a Lisp library, then it's possibly acceptable
to just provide an ASDF system definition and expect users to type
~(asdf:load-system "mumble")~ to use it. If it's a program, or there
are things other than Lisp which ASDF can't or shouldn't handle --
significant pieces in other languages, or a Lisp executable image to
make and install -- then it seems sensible to make the project's main
build system be something language-agnostic, say Unix ~make~, and
arrange for that to invoke ASDF at the appropriate time.

But how should that be arranged? It's relatively easy for a project's
Lisp code to support multiple Lisp implementations; but each
implementation wants different runes for evaluating Lisp forms from the
command line, and some of them don't provide an ideal environment for
integrating into a build system. So ~runlisp~ provides a simple common
command-line interface for evaluating Lisp forms. For example:

: $ runlisp -e '(format t "~A~%" (+ 1 2))'
: 3

If your build script needs to get information out of Lisp, then wrapping
~format~, or even ~prin1~, around forms is annoying; so ~runlisp~ has a
~-p~ option which prints the values of the forms it evaluates.

: $ runlisp -p '(+ 1 2)'
: 3

If a form produces multiple values, then ~-p~ will print all of them
separated by spaces, on a single line:

: $ runlisp -p '(floor 5 2)'
: 2 1

In addition to evaluating forms with ~-e~, and printing their values
with ~-p~, you can also load a file of Lisp code using ~-l~.

When ~runlisp~ is acting on ~-e~, ~-p~, and/or ~-l~ options, it's said
to be running in /eval/ mode, rather than its usual /script/ mode. In
eval mode, it /doesn't/ set ~:runlisp-script~ in ~*features*~.

You can still insist that ~runlisp~ use a particular Lisp
implementation, or one of a subset of implementations, using the ~-L~
option mentioned above.

: $ runlisp -Lsbcl -p "(lisp-implementation-type)"
: "SBCL"

** Command-line processing

When scripting a Lisp -- as opposed to running a Lisp script -- it's not
necessarily the case that your script knows in advance exactly what it
needs to ask Lisp to do. For example, it might need to tell Lisp to
install a program in a particular directory, determined by Autoconf.
While it's certainly /possible/ to quote such data and splice them into
Lisp forms, it's more convenient to pass them in separately. So
~runlisp~ ensures that the command-line arguments are available to Lisp
forms via ~uiop:*command-line-arguments*~, as they are to a Lisp script.

: $ runlisp -p "uiop:*command-line-arguments*" one two three
: ("one" "two" "three")

When running Lisp forms like this, ~(uiop:argv0)~ isn't very
meaningful. (Currently, it reveals the name of the script which
~runlisp~ uses to implement this feature.)


* Configuring =runlisp=

** Where =runlisp= looks for configuration

You can influence which Lisp implementations are chosen by ~runlisp~ by
writing a configuration file, and/or setting an environment variable.

~runlisp~ looks for configuration in ~~/.runlisprc~, and in
~~/.config/runlisprc~. You could put configuration in both, but that
doesn't seem like a great idea. A configuration file just contains
blank lines, comments, and command-line options, just as you'd write
them to the shell. Simple quoting and escaping are provided: see the
manual page for the full details. Each line is processed independently,
so it doesn't work to write an option on one line and then its argument
on the next.

The environment variable ~RUNLISP_OPTIONS~ is processed /after/ reading
the configuration file(s), if any. Again, it should contain
command-line options, as you'd write them to the shell.

** Deciding which Lisp implementation to use

The most useful option to use here is ~-P~, which builds up a
/preference list/, in order. The argument to ~-P~ is a comma-separated
list of Lisp implementation names, just like you'd give to ~-L~.

If you provide multiple ~-P~ options (e.g., on different lines of your
configuration file, or separately in the configuration file and
environment variable), then the lists are concatenated. Since the
environment variable is processed after the configuration file, this
means that implementations named in the environment variable are added
to the end of the list built from the configuration file.

When deciding which Lisp implementation to use, ~runlisp~ works as
follows. It builds a list of /acceptable/ Lisp implementations from the
~-L~ options, and a list of /preferred/ Lisp implementations from the
~-P~ options. If there aren't any ~-L~ options, then it assumes that
/all/ Lisp implementations are acceptable; but if there are no ~-P~
options then it assumes that /no/ Lisp implementations are preferred.
It then works through the preferred list in order: if it finds an
implementation which is installed and acceptable, then it uses that one.
If that doesn't work, then it works through the acceptable
implementations that it hasn't tried yet, in order, and if it finds one
of those that's installed, then it runs that one. Otherwise it reports
an error and gives up.
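
The procedure just described can be sketched in shell roughly as follows. This is an illustration, not ~runlisp~'s actual code: the function names, the space-separated list arguments, and the order of ~ALL_LISPS~ (simply the order of the list at the top of this file) are all assumptions made for the example.

```shell
# Assumed default order when no -L option is given (hypothetical).
ALL_LISPS='abcl ccl clisp cmucl ecl sbcl'

# member NAME LIST: succeed if NAME is a word in the space-separated LIST.
member() {
    case " $2 " in *" $1 "*) return 0 ;; esac
    return 1
}

# choose_lisp ACCEPTABLE PREFERRED INSTALLED
# Each argument is a space-separated list of implementation names.
# An empty ACCEPTABLE list means that every implementation is acceptable.
choose_lisp() {
    acceptable=${1:-$ALL_LISPS} preferred=$2 installed=$3
    # Pass 1: the preferred list, in order.
    for lisp in $preferred; do
        if member "$lisp" "$acceptable" && member "$lisp" "$installed"; then
            echo "$lisp"; return 0
        fi
    done
    # Pass 2: acceptable implementations not already tried, in order.
    for lisp in $acceptable; do
        member "$lisp" "$preferred" && continue
        if member "$lisp" "$installed"; then
            echo "$lisp"; return 0
        fi
    done
    echo "no acceptable Lisp implementation found" >&2
    return 1
}

choose_lisp 'sbcl ccl' 'clisp ccl' 'ccl sbcl ecl'
choose_lisp '' '' 'ecl sbcl'
```

The first call prints ~ccl~: the user would rather have ~clisp~, but the script doesn't accept it. The second prints ~ecl~, because with no preferences the first installed implementation in the assumed default order wins.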

** Clearing the preferred list

Since the environment variable is processed after the configuration
files, it can only append more Lisp implementations to the end of the
preferred list, which may well not be so helpful. There's an additional
option ~-C~, which completely clears the preferred list. The idea is
that you can write ~-C~ at the start of your ~RUNLISP_OPTIONS~
environment variable to temporarily override your usual configuration
for some special effect.
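
Here's a small shell model of how the preferred list gets built up, which may make the effect of ~-C~ easier to see. The function ~build_preferred~ is hypothetical, and it only understands ~-C~ and the attached form ~-Pname,name~; it's a sketch of the behaviour described here, not of how ~runlisp~ actually parses its options.

```shell
# build_preferred OPTION...
# Fold options, in the order they'd be processed, into a preferred list:
# each -P appends its comma-separated names, and -C empties the list.
# Anything else is ignored.  (Hypothetical model, for illustration only.)
build_preferred() {
    preferred=
    for opt in "$@"; do
        case $opt in
            -C)  preferred= ;;
            -P*) names=$(printf '%s' "${opt#-P}" | tr ',' ' ')
                 preferred="${preferred:+$preferred }$names" ;;
        esac
    done
    printf '%s\n' "$preferred"
}

# Configuration file says `-Psbcl,ccl'; the environment adds `-Pclisp'.
build_preferred -Psbcl,ccl -Pclisp
# Same configuration file, but the environment says `-C -Pclisp'.
build_preferred -Psbcl,ccl -C -Pclisp
```

The first call prints ~sbcl ccl clisp~, so ~clisp~ only gets a look in if the others are unavailable or unacceptable. The second prints just ~clisp~.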


* What's wrong with =cl-launch=?

The short version is that ~cl-launch~ is slow and inconvenient.
~cl-launch~ is a big, complicated Common Lisp/Bourne shell polyglot
which tries to do everything but doesn't quite succeed.

** It's slow

I took a trivial Lisp script:

: (format t "Hello from ~A!~%~
:            Script = `~A'~%~
:            Arguments = (~{`~A'~^, ~})~%"
:         (lisp-implementation-type)
:         (uiop:argv0)
:         uiop:*command-line-arguments*)

I timed how long it took to run on all of ~runlisp~'s supported Lisp
implementations, and compared them to how long ~cl-launch~ took: the
results are shown in table [[tab:runlisp-vanilla]]. ~runlisp~ is /at least/
two and a half times faster at running this script than ~cl-launch~ on all
implementations except Clozure CL[fn:slow-ccl], and approaching four and
a half times faster on SBCL.

#+CAPTION: ~cl-launch~ vs ~runlisp~ (with vanilla images)
#+NAME: tab:runlisp-vanilla
#+ATTR_LATEX: :float t :placement [tbp]
|------------------+-------------------+-----------------+----------------------|
| *Implementation* | *~cl-launch~ (s)* | *~runlisp~ (s)* | *~runlisp~ (factor)* |
|------------------+-------------------+-----------------+----------------------|
| ABCL             |            7.3036 |          2.6027 |                2.806 |
| Clozure CL       |            1.2769 |          0.9678 |                1.319 |
| GNU CLisp        |            1.2498 |          0.2659 |                4.700 |
| CMU CL           |            0.9665 |          0.3065 |                3.153 |
| ECL              |            0.8025 |          0.3173 |                2.529 |
| SBCL             |            0.3266 |          0.0739 |                4.419 |
|------------------+-------------------+-----------------+----------------------|
#+TBLFM: $4=$2/$3;%.3f
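
The `factor' column is just the ~cl-launch~ time divided by the ~runlisp~ time, as the table formula says. If you want to check a row by hand, something like the following works; the function name ~speedup~ is made up for this example, and it assumes an ~awk~ is available.

```shell
# speedup CL_LAUNCH_SECONDS RUNLISP_SECONDS
# The same arithmetic as the table formula `$4=$2/$3;%.3f'.
# (Hypothetical helper for checking the table by hand.)
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.3f\n", a / b }'
}

speedup 7.3036 2.6027    # the ABCL row
speedup 0.3266 0.0739    # the SBCL row
```

These print ~2.806~ and ~4.419~ respectively, matching the table.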

But this is using the `vanilla' Lisp images installed with the
implementations. ~runlisp~ by default builds custom images for most
Lisp implementations, which improves startup performance significantly;
see table [[tab:runlisp-custom]]. (I don't currently know how to build a
useful custom image for ABCL. ~runlisp~ does build a custom image for
ECL, but it doesn't help significantly.) These results are summarized
in figure [[fig:lisp-graph]].

#+CAPTION: ~cl-launch~ vs ~runlisp~ (with custom images)
#+NAME: tab:runlisp-custom
#+ATTR_LATEX: :float t :placement [tbp]
|------------------+-------------------+-----------------+----------------------|
| *Implementation* | *~cl-launch~ (s)* | *~runlisp~ (s)* | *~runlisp~ (factor)* |
|------------------+-------------------+-----------------+----------------------|
| ABCL             |            7.3036 |          2.5873 |                2.823 |
| Clozure CL       |            1.2769 |          0.0088 |              145.102 |
| GNU CLisp        |            1.2498 |          0.0146 |               85.603 |
| CMU CL           |            0.9665 |          0.0063 |              153.413 |
| ECL              |            0.8025 |          0.3185 |                2.520 |
| SBCL             |            0.3266 |          0.0077 |               42.416 |
|------------------+-------------------+-----------------+----------------------|
#+TBLFM: $4=$2/$3;%.3f

#+CAPTION: Comparison of ~runlisp~ and ~cl-launch~ times
#+NAME: fig:lisp-graph
#+ATTR_LATEX: :float t :placement [tbp]
[[file:doc/lisp-graph.tikz]]

Unlike ~cl-launch~, with some Lisp implementations at least, ~runlisp~
startup performance is usefully comparable to other popular scripting
language implementations. I wrote similarly trivial scripts in a number
of other languages, and timed them; the results are tabulated in table
[[tab:runlisp-interp]] and graphed in figure [[fig:interp-graph]].

#+CAPTION: ~runlisp~ vs other interpreters
#+NAME: tab:runlisp-interp
#+ATTR_LATEX: :float t :placement [tbp]
|------------------------------+-------------|
| *Implementation*             | *Time (ms)* |
|------------------------------+-------------|
| Clozure CL                   |         8.8 |
| GNU CLisp                    |        14.6 |
| CMU CL                       |         6.3 |
| SBCL                         |         7.7 |
|------------------------------+-------------|
| Perl                         |         1.2 |
| Python                       |        10.3 |
|------------------------------+-------------|
| Debian Almquist shell (dash) |         1.4 |
| GNU Bash                     |         2.0 |
| Z Shell                      |         4.1 |
|------------------------------+-------------|
| Tiny C (compile & run)       |         1.2 |
| GCC (precompiled)            |         0.5 |
|------------------------------+-------------|

#+CAPTION: Comparison of ~runlisp~ and other script interpreters
#+NAME: fig:interp-graph
#+ATTR_LATEX: :float t :placement [tbp]
[[file:doc/interp-graph.tikz]]

(All the timings in this section were performed on the same 2020 Dell
XPS13 laptop running Debian `buster'. The tools used to make the
measurements are included in the source distribution, in the ~bench/~
subdirectory.)

[fn:slow-ccl] I don't know why Clozure CL shows such a small difference
here.

** It's inconvenient

~cl-launch~ has this elaborate machinery which reads shell script
fragments from various places and sets variables like ~$LISPS~, but it
doesn't quite work.

Unlike other scripting languages such as Perl or Python, Common Lisp has
lots of implementations, and they all have various unique features (and
bugs) which a script might rely on (or need to avoid). Also, a user
might have preferences about which Lisps to use. ~cl-launch~'s approach
to this problem is a ~system_preferred_lisps~ shell function which can
be used in ~~/.cl-launchrc~ to select a Lisp system for a particular
`software system' (though this notion doesn't appear to be
well-defined); but this all works by editing a single ~$LISPS~ shell
variable. By contrast, ~runlisp~ has a ~-L~ option with which scripts
can specify the Lisp systems they support (in a preference order), and a
~-P~ option with which users can express their own preferences (e.g., in
the environment or a configuration file): ~runlisp~ will never choose a
Lisp system which the script can't deal with, but it will respect the
user's relative preferences.

** It doesn't establish a (useful) common environment

A number of Lisp systems are annoyingly deficient in their handling of
scripts.

For example, when GNU CLisp's ~-x~ option is used, it rebinds
~*standard-input*~ to an internal string stream holding the expression
passed in on the command line, leaving the process's actual stdin nearly
impossible to access.

: $ date | cl-launch -l sbcl -i "(princ (read-line nil nil))" # expected
: Sun 9 Aug 14:39:10 BST 2020
: $ date | cl-launch -l clisp -i "(princ (read-line nil nil))" # bug!
: NIL

As another example, Armed Bear Common Lisp doesn't seem to believe in
the stderr stream: when it starts up, ~*error-output*~ is bound to the
standard output, just like ~*standard-output*~. Also, ~cl-launch~
loading ASDF causes a huge number of ~style-warning~ messages to be
written to stdout, making ABCL pretty much useless for writing filter
scripts.

: $ cl-launch -l sbcl -i '(progn
:     (format *standard-output* "output~%")
:     (format *error-output* "error~%"))' \
:     > >(sed 's/^/stdout: /') 2> >(sed 's/^/stderr: /')
: stdout: output
: stderr: error
: $ cl-launch -l abcl -i '(progn
:     (format *standard-output* "output~%")
:     (format *error-output* "error~%"))' \
:     > >(sed 's/^/stdout: /') 2> >(sed 's/^/stderr: /')
: [1813 lines of compiler warnings tagged `stdout:']
: stdout: output
: stdout: error

~runlisp~ takes care of all of this, providing a basic but useful common
level of shell integration for all its supported Lisp implementations.
In particular:

  + It ensures that the standard Unix `stdin', `stdout', and `stderr'
    file descriptors are hooked up to the Lisp ~*standard-input*~,
    ~*standard-output*~, and ~*error-output*~ streams.

  + It ensures that starting a script doesn't write a deluge of
    diagnostic drivel.

The complete details are given in ~runlisp~'s manpage.

** Why might one prefer =cl-launch= anyway?

On the other hand, ~cl-launch~ is well established and full-featured.

~cl-launch~ compiles scripts before trying to run them, so they'll run
faster on Lisps which use an interpreter by default. It has a caching
feature so running a script a second time doesn't need to recompile it.
If your scripts are compute-intensive and benefit from ahead-of-time
compilation then maybe ~cl-launch~ is preferable.

~cl-launch~ supports more Lisp systems. I only have six installed on my
development machine at the moment, so those are the ones that ~runlisp~
supports. If you want your scripts to be able to run on other Lisps,
then ~cl-launch~ is the way to do that. Of course, I welcome patches to
help ~runlisp~ support other free Lisp implementations. ~cl-launch~
also supports proprietary Lisps: I have very little interest in these,
so if you want to run scripts using Allegro or LispWorks then
~cl-launch~ is your only choice.
442then ~cl-launch~ is the way to do that. Of course, I welcome patches to
443help ~runlisp~ support other free Lisp implementations. ~cl-launch~
444also supports proprietary Lisps: I have very little interest in these,
445so if you want to run scripts using Allegro or LispWorks then
446~cl-launch~ is your only choice.