1 # -*-org-*-
2 #+TITLE: ~runlisp~ -- run scripts written in Common Lisp
3 #+AUTHOR: Mark Wooding
4 #+LaTeX_CLASS: strayman
5 #+LaTeX_HEADER: \usepackage{tikz, gnuplot-lua-tikz}
6
7 ~runlisp~ is a small C program intended to be run from a script ~#!~
8 line. It selects and invokes a Common Lisp implementation, so as to run
9 the script. In this sense, ~runlisp~ is a partial replacement for
10 ~cl-launch~.
11
12 Currently, the following Lisp implementations are supported:
13
14 + Armed Bear Common Lisp (~abcl~),
15 + Clozure Common Lisp (~ccl~),
16 + GNU CLisp (~clisp~),
17 + Carnegie--Mellon University Common Lisp (~cmucl~),
18 + Embeddable Common Lisp (~ecl~), and
19 + Steel Bank Common Lisp (~sbcl~).
20
21 I'm happy to take patches to support additional free Lisp
22 implementations. I'm not interested in supporting non-free Lisp
23 systems.
24
25
26 * Writing scripts in Lisp
27
28 ** Basic use
29
30 The obvious way to use ~runlisp~ is in a shebang (~#!~) line at the top
31 of a script. For example:
32
33 : #! /usr/local/bin/runlisp
34 : (format t "Hello from Lisp!~%")
35
36 Script interpreters must be named with absolute pathnames in shebang
37 lines; if your ~runlisp~ is installed somewhere other than
38 ~/usr/local/bin/~ then you'll need to write something different.
39 Alternatively, a common hack involves abusing the ~env~ program as a
40 script interpreter, because it will do a path search for the program
41 it's supposed to run:
42
43 : #! /usr/bin/env runlisp
44 : (format t "Hello from Lisp!~%")
45
46 ** Specific Lisps
47
48 Lisp implementations are not created equal -- for good reason. If your
49 script depends on the features of some particular Lisp implementation,
50 then you can tell ~runlisp~ that it must use that implementation to run
51 your script using the ~-L~ option; for example:
52
53 : #! /usr/local/bin/runlisp -Lsbcl
54 : (format t "Hello from Steel Bank Common Lisp!~%")
55
56 If your script supports several Lisps, but not all, then list them all
57 in the ~-L~ option, separated by commas:
58
59 : #! /usr/local/bin/runlisp -Lsbcl,ccl
60 : (format t #.(concatenate 'string
61 : "Hello from "
62 : #+sbcl "Steel Bank"
63 : #+ccl "Clozure"
64 : #-(or sbcl ccl) "an unexpected"
65 : " Common Lisp!~%"))
66
67 ** Embedded options
68
69 If your script requires features of particular Lisp implementations
70 /and/ you don't want to hardcode an absolute path to ~runlisp~, then you
71 have a problem. Most Unix-like operating systems will parse a shebang
72 line into the initial ~#!~, the pathname to the interpreter program,
73 and a /single/ optional argument: any further spaces don't separate
74 further arguments: they just get included in the first argument, all the
75 way up to the end of the line. So
76
77 : #! /usr/bin/env runlisp -Lsbcl
78 : (format t "Hello from Steel Bank Common Lisp!~%")
79
80 won't work: it'll just try to run a program named ~runlisp -Lsbcl~, with
81 a space in the middle of its name, and that's quite unlikely to exist.
82
83 To help with this situation, ~runlisp~ reads /embedded options/ from
84 your script. Specifically, if the script's second line contains the
85 token ~@RUNLISP:~ then ~runlisp~ will parse additional options from this
86 line. So the following will work properly.
87
88 : #! /usr/bin/env runlisp
89 : ;;; @RUNLISP: -Lsbcl
90 : (format t "Hello from Steel Bank Common Lisp!~%")
91
92 Embedded options are split at spaces properly. Spaces can be escaped or
93 quoted in (an approximation to) the usual shell manner, should that be
94 necessary. See the manpage for the gory details.
95
96 ** Common environment
97
98 ~runlisp~ puts some effort into making sure that Lisp scripts get the
99 same view of the world regardless of which implementation is running
100 them.
101
102 For example:
103
104 + The ~asdf~ and ~uiop~ systems are loaded and ready for use.
105
106 + The script's command-line arguments are available in
107 ~uiop:*command-line-arguments*~. Its name can be found by calling
108 ~(uiop:argv0)~ -- though it's probably also in ~*load-pathname*~.
109
110 + The prevailing Unix standard input, output, and error files are
111 available through the Lisp ~*standard-input*~, ~*standard-output*~,
112 and ~*error-output*~ streams, respectively. (This is, alas, not a
113 foregone conclusion.)
114
115 + The keyword ~:runlisp-script~ is added to the ~*features*~ list.
116 This means that your script can tell whether it's being run from the
117 command line, and should therefore do its thing and then quit; or
118 merely being loaded into a Lisp system, e.g., for debugging or
119 development, and should sit still and not do anything until it's
120 asked.
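
As a rough sketch of how a script might use the ~:runlisp-script~
feature: the function name ~main~ and the greeting are made up for
illustration, and only the feature keyword and
~uiop:*command-line-arguments*~ come from the list above.

: #! /usr/bin/env runlisp
:
: (defun main (args)
:   "Greet each name in ARGS, or the world if there are none."
:   (format t "Hello, ~{~A~^ ~}!~%" (or args '("world"))))
:
: ;; Only call MAIN when runlisp is running this file as a script.  When
: ;; the feature is absent, the reader skips the following form entirely.
: #+runlisp-script
: (main uiop:*command-line-arguments*)

Loaded into a running Lisp for development, this only defines ~main~,
which can then be called by hand with whatever arguments are wanted.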
121
122 See the manual for the complete list of guarantees.
123
124
125 * Invoking Lisp implementations
126
127 ** Basic use
128
129 A secondary use of ~runlisp~ is in build scripts for Lisp programs. If
130 the entire project is just a Lisp library, then it's possibly acceptable
131 to just provide an ASDF system definition and expect users to type
132 ~(asdf:load-system "mumble")~ to use it. If it's a program, or there
133 are things other than Lisp which ASDF can't or shouldn't handle --
134 significant pieces in other languages, or a Lisp executable image to
135 make and install -- then it seems sensible to make the project's main
136 build system be something language-agnostic, say Unix ~make~, and
137 arrange for that to invoke ASDF at the appropriate time.
138
139 But how should that be arranged? It's relatively easy for a project's
140 Lisp code to support multiple Lisp implementations; but each
141 implementation wants different runes for evaluating Lisp forms from the
142 command line, and some of them don't provide an ideal environment for
143 integrating into a build system. So ~runlisp~ provides a simple common
144 command-line interface for evaluating Lisp forms. For example:
145
146 : $ runlisp -e '(format t "~A~%" (+ 1 2))'
147 : 3
148
149 If your build script needs to get information out of Lisp, then wrapping
150 ~format~, or even ~prin1~, around forms is annoying; so ~runlisp~ has a
151 ~-p~ option which prints the values of the forms it evaluates.
152
153 : $ runlisp -p '(+ 1 2)'
154 : 3
155
156 If a form produces multiple values, then ~-p~ will print all of them
157 separated by spaces, on a single line:
158
159 : $ runlisp -p '(floor 5 2)'
160 : 2 1
161
162 In addition to evaluating forms with ~-e~, and printing their values
163 with ~-p~, you can also load a file of Lisp code using ~-l~.
164
165 When ~runlisp~ is acting on ~-e~, ~-p~, and/or ~-l~ options, it's said
166 to be running in /eval/ mode, rather than its usual /script/ mode. In
167 eval mode, it /doesn't/ set ~:runlisp-script~ in ~*features*~.
168
169 You can still insist that ~runlisp~ use a particular Lisp
170 implementation, or one of a subset of implementations, using the ~-L~
171 option mentioned above.
172
173 : $ runlisp -Lsbcl -p "(lisp-implementation-type)"
174 : "SBCL"
175
176 ** Command-line processing
177
178 When scripting a Lisp -- as opposed to running a Lisp script -- it's not
179 necessarily the case that your script knows in advance exactly what it
180 needs to ask Lisp to do. For example, it might need to tell Lisp to
181 install a program in a particular directory, determined by Autoconf.
182 While it's certainly /possible/ to quote such data and splice them into
183 Lisp forms, it's more convenient to pass them in separately. So
184 ~runlisp~ ensures that the command-line arguments are available to Lisp
185 forms via ~uiop:*command-line-arguments*~, as they are to a Lisp script.
186
187 : $ runlisp -p "uiop:*command-line-arguments*" one two three
188 : ("one" "two" "three")
189
190 When running Lisp forms like this, ~(uiop:argv0)~ isn't very
191 meaningful. (Currently, it reveals the name of the script which
192 ~runlisp~ uses to implement this feature.)
193
194
195 * Configuring =runlisp=
196
197 ** Where =runlisp= looks for configuration
198
199 You can influence which Lisp implementations are chosen by ~runlisp~ by
200 writing a configuration file, and/or setting an environment variable.
201
202 ~runlisp~ looks for configuration in ~~/.runlisprc~, and in
203 ~~/.config/runlisprc~. You could put configuration in both, but that
204 doesn't seem like a great idea. A configuration file just contains
205 blank lines, comments, and command-line options, just as you'd write
206 them to the shell. Simple quoting and escaping is provided: see the
207 manual page for the full details. Each line is processed independently,
208 so it doesn't work to write an option on one line and then its argument
209 on the next.
210
211 The environment variable ~RUNLISP_OPTIONS~ is processed /after/ reading
212 the configuration file(s), if any. Again, it should contain
213 command-line options, as you'd write them to the shell.
214
215 ** Deciding which Lisp implementation to use
216
217 The most useful option to use here is ~-P~, which builds up a
218 /preference list/, in order. The argument to ~-P~ is a comma-separated
219 list of Lisp implementation names, just like you'd give to ~-L~.
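
As a minimal illustration (the particular choice of implementations
here is only an example), a ~~/.runlisprc~ which prefers SBCL, and
failing that Clozure CL, could consist of this single line:

: -Psbcl,ccl

This assumes that ~-P~ accepts its argument attached directly, as ~-L~
does in the examples above.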
220
221 If you provide multiple ~-P~ options (e.g., on different lines of your
222 configuration file, or separately in the configuration file and
223 environment variable), then the lists are concatenated. Since the
224 environment variable is processed after the configuration file, this
225 means that its preferences are appended after those from the file.
226
227 When deciding which Lisp implementation to use, ~runlisp~ works as
228 follows. It builds a list of /acceptable/ Lisp implementations from the
229 ~-L~ options, and a list of /preferred/ Lisp implementations from the
230 ~-P~ options. If there aren't any ~-L~ options, then it assumes that
231 /all/ Lisp implementations are acceptable; but if there are no ~-P~
232 options then it assumes that /no/ Lisp implementations are preferred.
233 It then works through the preferred list in order: if it finds an
234 implementation which is installed and acceptable, then it uses that one.
235 If that doesn't work, then it works through the acceptable
236 implementations that it hasn't tried yet, in order, and if it finds one
237 of those that's installed, then it runs that one. Otherwise it reports
238 an error and gives up.
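
The procedure just described can be sketched in Lisp. This is only a
model of the prose, not a transcription of ~runlisp~'s C source: the
function name ~choose-lisp~, the idea of passing in the list of
installed implementations, and the default ordering used when there
are no ~-L~ options are all assumptions made for the sake of the
example.

: (defun choose-lisp (acceptable preferred installed
:                     &optional (all '("abcl" "ccl" "clisp"
:                                      "cmucl" "ecl" "sbcl")))
:   "Return the name of the implementation to run, or NIL if none will do.
: ACCEPTABLE models the -L options, PREFERRED the -P options, and
: INSTALLED the implementations actually present on the machine."
:   ;; No -L options at all: every implementation is acceptable.  (The
:   ;; order of ALL is an assumption.)
:   (let ((acceptable (or acceptable all)))
:     (flet ((usable-p (name)
:              (and (member name acceptable :test #'string=)
:                   (member name installed :test #'string=))))
:       (or
:        ;; First pass: the preferred list, in order.
:        (find-if #'usable-p preferred)
:        ;; Second pass: acceptable implementations not already tried.
:        (find-if #'usable-p
:                 (remove-if (lambda (name)
:                              (member name preferred :test #'string=))
:                            acceptable))))))

For example, ~(choose-lisp '("sbcl" "ccl") '("ccl" "clisp") '("sbcl" "ccl"))~
returns ~"ccl"~, because the user's preference decides between two
acceptable implementations; with ~'("clisp")~ as the preferred list and
only ~"sbcl"~ installed, it falls through to the second pass and returns
~"sbcl"~. A result of ~NIL~ corresponds to the case where ~runlisp~
reports an error and gives up.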
239
240 ** Clearing the preferred list
241
242 Since the environment variable is processed after the configuration
243 files, it can only append more Lisp implementations to the end of the
244 preferred list, which may well not be so helpful. There's an additional
245 option ~-C~, which completely clears the preferred list. The idea is
246 that you can write ~-C~ at the start of your ~RUNLISP_OPTIONS~
247 environment variable to temporarily override your usual configuration
248 for some special effect.
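
The way ~-P~ and ~-C~ interact can be modelled with a few lines of
Lisp. Again, this is a sketch under assumptions: ~split-commas~ and
~build-preferred~ are hypothetical names, the options are taken to be
already split into words with ~-P~'s argument attached, and the
quoting rules and any handling of duplicate names are left to the
manual page.

: (defun split-commas (string)
:   "Split STRING at commas, returning a list of substrings."
:   (loop for start = 0 then (1+ end)
:         for end = (position #\, string :start start)
:         collect (subseq string start end)
:         while end))
:
: (defun build-preferred (options)
:   "Fold OPTIONS, a list of option words in processing order, into a
: preference list: -P appends names and -C empties the list."
:   (let ((preferred '()))
:     (dolist (opt options preferred)
:       (cond ((string= opt "-C")
:              (setf preferred '()))
:             ((and (> (length opt) 2)
:                   (string= opt "-P" :end1 2))
:              (setf preferred
:                    (append preferred
:                            (split-commas (subseq opt 2)))))))))

With configuration-file options ~-Psbcl,ccl~ and ~-Pclisp~ followed by
an environment variable containing ~-Pecl~, the call
~(build-preferred '("-Psbcl,ccl" "-Pclisp" "-Pecl"))~ returns
~("sbcl" "ccl" "clisp" "ecl")~. Putting ~-C~ first in the environment
variable, as in ~(build-preferred '("-Psbcl,ccl" "-Pclisp" "-C" "-Pecl"))~,
leaves just ~("ecl")~.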
249
250
251 * What's wrong with =cl-launch=?
252
253 The short version is that ~cl-launch~ is slow and inconvenient.
254 ~cl-launch~ is a big, complicated Common Lisp/Bourne shell polyglot
255 which tries to do everything but doesn't quite succeed.
256
257 ** It's slow
258
259 I took a trivial Lisp script:
260
261 : (format t "Hello from ~A!~%~
262 : Script = `~A'~%~
263 : Arguments = (~{`~A'~^, ~})~%"
264 : (lisp-implementation-type)
265 : (uiop:argv0)
266 : uiop:*command-line-arguments*)
267
268 I timed how long it took to run on all of ~runlisp~'s supported Lisp
269 implementations, and compared them to how long ~cl-launch~ took: the
270 results are shown in table [[tab:runlisp-vanilla]]. ~runlisp~ is /at least/
271 two and a half times faster at running this script than ~cl-launch~ on all
272 implementations except Clozure CL[fn:slow-ccl], and approaching four and
273 a half times faster on SBCL.
274
275 #+CAPTION: ~cl-launch~ vs ~runlisp~ (with vanilla images)
276 #+NAME: tab:runlisp-vanilla
277 #+ATTR_LATEX: :float t :placement [tbp]
278 |------------------+-------------------+-----------------+----------------------|
279 | *Implementation* | *~cl-launch~ (s)* | *~runlisp~ (s)* | *~runlisp~ (factor)* |
280 |------------------+-------------------+-----------------+----------------------|
281 | ABCL | 7.3036 | 2.6027 | 2.806 |
282 | Clozure CL | 1.2769 | 0.9678 | 1.319 |
283 | GNU CLisp | 1.2498 | 0.2659 | 4.700 |
284 | CMU CL | 0.9665 | 0.3065 | 3.153 |
285 | ECL | 0.8025 | 0.3173 | 2.529 |
286 | SBCL | 0.3266 | 0.0739 | 4.419 |
287 |------------------+-------------------+-----------------+----------------------|
288 #+TBLFM: $4=$2/$3;%.3f
289
290 But this is using the `vanilla' Lisp images installed with the
291 implementations. ~runlisp~ by default builds custom images for most
292 Lisp implementations, which improves startup performance significantly;
293 see table [[tab:runlisp-custom]]. (I don't currently know how to build a
294 useful custom image for ABCL. ~runlisp~ does build a custom image for
295 ECL, but it doesn't help significantly.) These results are summarized
296 in figure [[fig:lisp-graph]].
297
298 #+CAPTION: ~cl-launch~ vs ~runlisp~ (with custom images)
299 #+NAME: tab:runlisp-custom
300 #+ATTR_LATEX: :float t :placement [tbp]
301 |------------------+-------------------+-----------------+----------------------|
302 | *Implementation* | *~cl-launch~ (s)* | *~runlisp~ (s)* | *~runlisp~ (factor)* |
303 |------------------+-------------------+-----------------+----------------------|
304 | ABCL | 7.3036 | 2.5873 | 2.823 |
305 | Clozure CL | 1.2769 | 0.0088 | 145.102 |
306 | GNU CLisp | 1.2498 | 0.0146 | 85.603 |
307 | CMU CL | 0.9665 | 0.0063 | 153.413 |
308 | ECL | 0.8025 | 0.3185 | 2.520 |
309 | SBCL | 0.3266 | 0.0077 | 42.416 |
310 |------------------+-------------------+-----------------+----------------------|
311 #+TBLFM: $4=$2/$3;%.3f
312
313 #+CAPTION: Comparison of ~runlisp~ and ~cl-launch~ times
314 #+NAME: fig:lisp-graph
315 #+ATTR_LATEX: :float t :placement [tbp]
316 [[file:doc/lisp-graph.tikz]]
317
318 Unlike ~cl-launch~, with some Lisp implementations at least, ~runlisp~
319 startup performance is usefully comparable to other popular scripting
320 language implementations. I wrote similarly trivial scripts in a number
321 of other languages, and timed them; the results are tabulated in table
322 [[tab:runlisp-interp]] and graphed in figure [[fig:interp-graph]].
323
324 #+CAPTION: ~runlisp~ vs other interpreters
325 #+NAME: tab:runlisp-interp
326 #+ATTR_LATEX: :float t :placement [tbp]
327 |------------------------------+-------------|
328 | *Implementation* | *Time (ms)* |
329 |------------------------------+-------------|
330 | Clozure CL | 8.8 |
331 | GNU CLisp | 14.6 |
332 | CMU CL | 6.3 |
333 | SBCL | 7.7 |
334 |------------------------------+-------------|
335 | Perl | 1.2 |
336 | Python | 10.3 |
337 |------------------------------+-------------|
338 | Debian Almquist shell (dash) | 1.4 |
339 | GNU Bash | 2.0 |
340 | Z Shell | 4.1 |
341 |------------------------------+-------------|
342 | Tiny C (compile & run) | 1.2 |
343 | GCC (precompiled) | 0.5 |
344 |------------------------------+-------------|
345
346 #+CAPTION: Comparison of ~runlisp~ and other script interpreters
347 #+NAME: fig:interp-graph
348 #+ATTR_LATEX: :float t :placement [tbp]
349 [[file:doc/interp-graph.tikz]]
350
351 (All the timings in this section were performed on the same 2020 Dell
352 XPS13 laptop running Debian `buster'. The tools used to make the
353 measurements are included in the source distribution, in the ~bench/~
354 subdirectory.)
355
356 [fn:slow-ccl] I don't know why Clozure CL shows such a small difference
357 here.
358
359 ** It's inconvenient
360
361 ~cl-launch~ has this elaborate machinery which reads shell script
362 fragments from various places and sets variables like ~$LISPS~, but it
363 doesn't quite work.
364
365 Unlike other scripting languages such as Perl or Python, Common Lisp has
366 lots of implementations, and they all have various unique features (and
367 bugs) which a script might rely on (or need to avoid). Also, a user
368 might have preferences about which Lisps to use. ~cl-launch~'s approach
369 to this problem is a ~system_preferred_lisps~ shell function which can
370 be used in ~~/.cl-launchrc~ to select a Lisp system for a particular
371 `software system', though this notion doesn't appear to be well-defined.
372 Either way, it all works by editing a single ~$LISPS~ shell variable. By
373 contrast, ~runlisp~ has a ~-L~ option with which scripts can specify the
374 Lisp systems they support (in a preference order), and a ~-P~ option
375 with which users can express their own preferences (e.g., in the
376 environment or a configuration file): ~runlisp~ will never choose a Lisp
377 system which the script can't deal with, but it will respect the user's
378 relative preferences.
379
380 ** It doesn't establish a (useful) common environment
381
382 A number of Lisp systems are annoyingly deficient in their handling of
383 scripts.
384
385 For example, when GNU CLisp's ~-x~ option is used, it rebinds
386 ~*standard-input*~ to an internal string stream holding the expression
387 passed in on the command line, leaving the process's actual stdin nearly
388 impossible to access.
389
390 : $ date | cl-launch -l sbcl -i "(princ (read-line nil nil))" # expected
391 : Sun 9 Aug 14:39:10 BST 2020
392 : $ date | cl-launch -l clisp -i "(princ (read-line nil nil))" # bug!
393 : NIL
394
395 As another example, Armed Bear Common Lisp doesn't seem to believe in
396 the stderr stream: when it starts up, ~*error-output*~ is bound to the
397 standard output, just like ~*standard-output*~. Also, ~cl-launch~
398 loading ASDF causes a huge number of ~style-warning~ messages to be
399 written to stdout, making ABCL pretty much useless for writing filter
400 scripts.
401
402 : $ cl-launch -l sbcl -i '(progn
403 : (format *standard-output* "output~%")
404 : (format *error-output* "error~%"))' \
405 : > >(sed 's/^/stdout: /') 2> >(sed 's/^/stderr: /')
406 : stdout: output
407 : stderr: error
408 : $ cl-launch -l abcl -i '(progn
409 : (format *standard-output* "output~%")
410 : (format *error-output* "error~%"))' \
411 : > >(sed 's/^/stdout: /') 2> >(sed 's/^/stderr: /')
412 : [1813 lines of compiler warnings tagged `stdout:']
413 : stdout: output
414 : stdout: error
415
416 ~runlisp~ takes care of all of this, providing a basic but useful common
417 level of shell integration for all its supported Lisp implementations.
418 In particular:
419
420 + It ensures that the standard Unix `stdin', `stdout', and `stderr'
421 file descriptors are hooked up to the Lisp ~*standard-input*~,
422 ~*standard-output*~, and ~*error-output*~ streams.
423
424 + It ensures that starting a script doesn't write a deluge of
425 diagnostic drivel.
426
427 The complete details are given in ~runlisp~'s manpage.
428
429 ** Why might one prefer =cl-launch= anyway?
430
431 On the other hand, ~cl-launch~ is well established and full-featured.
432
433 ~cl-launch~ compiles scripts before trying to run them, so they'll run
434 faster on Lisps which use an interpreter by default. It has a caching
435 feature so running a script a second time doesn't need to recompile it.
436 If your scripts are compute-intensive and benefit from ahead-of-time
437 compilation then maybe ~cl-launch~ is preferable.
438
439 ~cl-launch~ supports more Lisp systems. I only have six installed on my
440 development machine at the moment, so those are the ones that ~runlisp~
441 supports. If you want your scripts to be able to run on other Lisps,
442 then ~cl-launch~ is the way to do that. Of course, I welcome patches to
443 help ~runlisp~ support other free Lisp implementations. ~cl-launch~
444 also supports proprietary Lisps: I have very little interest in these,
445 so if you want to run scripts using Allegro or LispWorks then
446 ~cl-launch~ is your only choice.