Commit | Line | Data |
---|---|---|
e29834b8 MW |
1 | # -*-org-*- |
2 | #+TITLE: ~runlisp~ -- run scripts written in Common Lisp | |
3 | #+AUTHOR: Mark Wooding | |
4 | #+LaTeX_CLASS: strayman | |
5 | #+LaTeX_HEADER: \usepackage{tikz, gnuplot-lua-tikz} | |
6 | ||
7 | ~runlisp~ is a small C program intended to be run from a script ~#!~ | |
8 | line. It selects and invokes a Common Lisp implementation, so as to run | |
9 | the script. In this sense, ~runlisp~ is a partial replacement for | |
10 | ~cl-launch~. | |
11 | ||
12 | Currently, the following Lisp implementations are supported: | |
13 | ||
14 | + Armed Bear Common Lisp (~abcl~), | |
15 | + Clozure Common Lisp (~ccl~), | |
16 | + GNU CLisp (~clisp~), | |
17 | + Carnegie--Mellon University Common Lisp (~cmucl~), | |
18 | + Embeddable Common Lisp (~ecl~), and | |
19 | + Steel Bank Common Lisp (~sbcl~). | |
20 | ||
21 | I'm happy to take patches to support additional free Lisp | |
22 | implementations. I'm not interested in supporting non-free Lisp | |
23 | systems. | |
24 | ||
25 | ||
26 | * Writing scripts in Lisp | |
27 | ||
28 | ** Basic use | |
29 | ||
30 | The obvious way to use ~runlisp~ is in a shebang (~#!~) line at the top | |
31 | of a script. For example: | |
32 | ||
33 | : #! /usr/local/bin/runlisp | |
34 | : (format t "Hello from Lisp!~%") | |
35 | ||
36 | Script interpreters must be named with absolute pathnames in shebang | |
37 | lines; if your ~runlisp~ is installed somewhere other than | |
38 | ~/usr/local/bin/~ then you'll need to write something different. | |
39 | Alternatively, a common hack involves abusing the ~env~ program as a | |
40 | script interpreter, because it will do a path search for the program | |
41 | it's supposed to run: | |
42 | ||
43 | : #! /usr/bin/env runlisp | |
44 | : (format t "Hello from Lisp!~%") | |
45 | ||
46 | ** Specific Lisps | |
47 | ||
48 | Lisp implementations are not created equal -- for good reason. If your | |
49 | script depends on the features of some particular Lisp implementation, | |
50 | then you can tell ~runlisp~ that it must use that implementation to run | |
51 | your script using the ~-L~ option; for example: | |
52 | ||
53 | : #! /usr/local/bin/runlisp -Lsbcl | |
54 | : (format t "Hello from Steel Bank Common Lisp!~%") | |
55 | ||
56 | If your script supports several Lisps, but not all, then list them all | |
57 | in the ~-L~ option, separated by commas: | |
58 | ||
59 | : #! /usr/local/bin/runlisp -Lsbcl,ccl | |
60 | : (format t #.(concatenate 'string | |
61 | : "Hello from " | |
62 | : #+sbcl "Steel Bank" | |
63 | : #+ccl "Clozure" | |
64 | : #-(or sbcl ccl) "an unexpected" | |
65 | : " Common Lisp!~%")) | |
66 | ||
67 | ** Embedded options | |
68 | ||
69 | If your script requires features of particular Lisp implementations | |
70 | /and/ you don't want to hardcode an absolute path to ~runlisp~, then you | |
71 | have a problem. Most Unix-like operating systems will parse a shebang | |
72 | line into the initial ~#!~, the pathname to the interpreter program, | |
73 | and a /single/ optional argument: any further spaces don't separate | |
74 | further arguments: they just get included in the first argument, all the | |
75 | way up to the end of the line. So | |
76 | ||
77 | : #! /usr/bin/env runlisp -Lsbcl | |
78 | : (format t "Hello from Steel Bank Common Lisp!~%") | |
79 | ||
80 | won't work: it'll just try to run a program named ~runlisp -Lsbcl~, with | |
81 | a space in the middle of its name, and that's quite unlikely to exist. | |
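
To see why, it may help to mimic the splitting in shell. The following is only a sketch of the behaviour described above, not kernel code; ~split_shebang~ is a name made up for this illustration, and real kernels differ in details such as length limits.

```shell
#!/bin/sh
# Hypothetical helper: mimic how a Linux-style kernel splits a shebang
# line into an interpreter path and at most ONE argument.
split_shebang() {
    line=${1#"#!"}                       # drop the leading `#!'
    line=${line#"${line%%[! ]*}"}        # trim leading spaces
    interp=${line%% *}                   # everything up to the first space
    if [ "$interp" = "$line" ]; then
        arg=                             # no argument at all
    else
        rest=${line#"$interp"}
        arg=${rest#"${rest%%[! ]*}"}     # the rest, inner spaces included
    fi
    printf 'interpreter=[%s] argument=[%s]\n' "$interp" "$arg"
}

split_shebang '#! /usr/bin/env runlisp -Lsbcl'
split_shebang '#! /usr/local/bin/runlisp -Lsbcl'
```

For the first line, this reports ~/usr/bin/env~ as the interpreter and ~runlisp -Lsbcl~ as its single argument, which is why ~env~ goes looking for a program with a space in its name.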
82 | ||
83 | To help with this situation, ~runlisp~ reads /embedded options/ from | |
84 | your script. Specifically, if the script's second line contains the | |
85 | token ~@RUNLISP:~ then ~runlisp~ will parse additional options from this | |
86 | line. So the following will work properly. | |
87 | ||
88 | : #! /usr/bin/env runlisp | |
89 | : ;;; @RUNLISP: -Lsbcl | |
90 | : (format t "Hello from Steel Bank Common Lisp!~%") | |
91 | ||
92 | Embedded options are split at spaces properly. Spaces can be escaped or | |
93 | quoted in (an approximation to) the usual shell manner, should that be | |
94 | necessary. See the manpage for the gory details. | |
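
A rough shell model of the lookup is shown below. It only finds the ~@RUNLISP:~ token on the second line and prints whatever follows; it makes no attempt at the quoting and escaping rules, for which the manpage is the authority. The helper name ~embedded_options~ is made up for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: print the text following `@RUNLISP:' on the
# second line of a script, or nothing if the token is absent.
embedded_options() {
    sed -n '2s/^.*@RUNLISP:[[:space:]]*//p' "$1"
}

# Demonstration on a temporary script (only inspected, never executed).
script=$(mktemp)
cat >"$script" <<'EOF'
#! /usr/bin/env runlisp
;;; @RUNLISP: -Lsbcl,ccl
(format t "Hello!~%")
EOF
embedded_options "$script"
rm -f "$script"
```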
95 | ||
96 | ** Common environment | |
97 | ||
98 | ~runlisp~ puts some effort into making sure that Lisp scripts get the | |
99 | same view of the world regardless of which implementation is running | |
100 | them. | |
101 | ||
102 | For example: | |
103 | ||
104 | + The ~asdf~ and ~uiop~ systems are loaded and ready for use. | |
105 | ||
106 | + The script's command-line arguments are available in | |
107 | ~uiop:*command-line-arguments*~. Its name can be found by calling | |
108 | ~(uiop:argv0)~ -- though it's probably also in ~*load-pathname*~. | |
109 | ||
110 | + The prevailing Unix standard input, output, and error files are | |
111 | available through the Lisp ~*standard-input*~, ~*standard-output*~, | |
112 | and ~*error-output*~ streams, respectively. (This is, alas, not a | |
113 | foregone conclusion.) | |
114 | ||
115 | + The keyword ~:runlisp-script~ is added to the ~*features*~ list. | |
116 | This means that your script can tell whether it's being run from the | |
117 | command line, and should therefore do its thing and then quit; or | |
118 | merely being loaded into a Lisp system, e.g., for debugging or | |
119 | development, and should sit still and not do anything until it's | |
120 | asked. | |
121 | ||
122 | See the manual for the complete list of guarantees. | |
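
As a sketch of how the ~:runlisp-script~ feature might be used, the shell fragment below writes out a small Lisp script whose ~main~ function is only called when that keyword is present in ~*features*~. The function name ~main~, the helper ~write_demo_script~, and the temporary file are all made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: write a Lisp script which acts only when it is
# run through runlisp, and stays quiet when merely loaded.
write_demo_script() {
    cat >"$1" <<'EOF'
#! /usr/bin/env runlisp
(defun main (args)
  (format t "Arguments:~{ ~A~}~%" args))

;; Skipped by the reader unless :runlisp-script is in *features*.
#+runlisp-script
(main uiop:*command-line-arguments*)
EOF
    chmod +x "$1"
}

demo=$(mktemp)
write_demo_script "$demo"
if command -v runlisp >/dev/null 2>&1; then
    "$demo" one two three
else
    echo "runlisp not found; not running the demo" >&2
fi
rm -f "$demo"
```

Loading the same file into a running Lisp for development would define ~main~ but not call it, because the ~#+runlisp-script~ form is skipped there.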
123 | ||
124 | ||
125 | * Invoking Lisp implementations | |
126 | ||
127 | ** Basic use | |
128 | ||
129 | A secondary use of ~runlisp~ is in build scripts for Lisp programs. If | |
130 | the entire project is just a Lisp library, then it's possibly acceptable | |
131 | to just provide an ASDF system definition and expect users to type | |
132 | ~(asdf:load-system "mumble")~ to use it. If it's a program, or there | |
133 | are things other than Lisp which ASDF can't or shouldn't handle -- | |
134 | significant pieces in other languages, or a Lisp executable image to | |
135 | make and install -- then it seems sensible to make the project's main | |
136 | build system be something language-agnostic, say Unix ~make~, and | |
137 | arrange for that to invoke ASDF at the appropriate time. | |
138 | ||
139 | But how should that be arranged? It's relatively easy for a project's | |
140 | Lisp code to support multiple Lisp implementations; but each | |
141 | implementation wants different runes for evaluating Lisp forms from the | |
142 | command line, and some of them don't provide an ideal environment for | |
143 | integrating into a build system. So ~runlisp~ provides a simple common | |
144 | command-line interface for evaluating Lisp forms. For example: | |
145 | ||
146 | : $ runlisp -e '(format t "~A~%" (+ 1 2))' | |
147 | : 3 | |
148 | ||
149 | If your build script needs to get information out of Lisp, then wrapping | |
150 | ~format~, or even ~prin1~, around forms is annoying; so ~runlisp~ has a | |
151 | ~-p~ option which prints the values of the forms it evaluates. | |
152 | ||
153 | : $ runlisp -p '(+ 1 2)' | |
154 | : 3 | |
155 | ||
156 | If a form produces multiple values, then ~-p~ will print all of them | |
157 | separated by spaces, on a single line: | |
158 | ||
159 | : $ runlisp -p '(floor 5 2)' | |
160 | : 2 1 | |
161 | ||
162 | In addition to evaluating forms with ~-e~, and printing their values | |
163 | with ~-p~, you can also load a file of Lisp code using ~-l~. | |
164 | ||
165 | When ~runlisp~ is acting on ~-e~, ~-p~, and/or ~-l~ options, it's said | |
166 | to be running in /eval/ mode, rather than its usual /script/ mode. In | |
167 | eval mode, it /doesn't/ set ~:runlisp-script~ in ~*features*~. | |
168 | ||
169 | You can still insist that ~runlisp~ use a particular Lisp | |
170 | implementation, or one of a subset of implementations, using the ~-L~ | |
171 | option mentioned above. | |
172 | ||
173 | : $ runlisp -Lsbcl -p "(lisp-implementation-type)" | |
174 | : "SBCL" | |
175 | ||
176 | ** Command-line processing | |
177 | ||
178 | When scripting a Lisp -- as opposed to running a Lisp script -- it's not | |
179 | necessarily the case that your script knows in advance exactly what it | |
180 | needs to ask Lisp to do. For example, it might need to tell Lisp to | |
181 | install a program in a particular directory, determined by Autoconf. | |
182 | While it's certainly /possible/ to quote such data and splice them into | |
183 | Lisp forms, it's more convenient to pass them in separately. So | |
184 | ~runlisp~ ensures that the command-line arguments are available to Lisp | |
185 | forms via ~uiop:*command-line-arguments*~, as they are to a Lisp script. | |
186 | ||
187 | : $ runlisp -p "uiop:*command-line-arguments*" one two three | |
188 | : ("one" "two" "three") | |
189 | ||
190 | When running Lisp forms like this, ~(uiop:argv0)~ isn't very | |
191 | meaningful. (Currently, it reveals the name of the script which | |
192 | ~runlisp~ uses to implement this feature.) | |
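
A build-script step along these lines might look as follows. This is a sketch only: the ~bindir~ variable and the Lisp form are hypothetical stand-ins, and a real project would call its own installation function rather than just printing the directory.

```shell
#!/bin/sh
# Sketch of an install step from a build script.  The directory is
# passed as a separate argument rather than spliced into the Lisp form.
bindir=${bindir:-/usr/local/bin}    # hypothetical; e.g. set by configure

# The form merely reports what it was given via the argument list.
form='(format t "Would install into ~A~%" (first uiop:*command-line-arguments*))'

if command -v runlisp >/dev/null 2>&1; then
    runlisp -e "$form" "$bindir"
else
    echo "runlisp not found; skipping Lisp install step" >&2
fi
```

Passing the directory as an argument means the shell never has to quote it as a Lisp string.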
193 | ||
194 | ||
195 | * Configuring =runlisp= | |
196 | ||
197 | ** Where =runlisp= looks for configuration | |
198 | ||
199 | You can influence which Lisp implementations are chosen by ~runlisp~ by | |
200 | writing a configuration file, and/or setting an environment variable. | |
201 | ||
202 | ~runlisp~ looks for configuration in ~~/.runlisprc~, and in | |
203 | ~~/.config/runlisprc~. You could put configuration in both, but that | |
204 | doesn't seem like a great idea. A configuration file just contains | |
205 | blank lines, comments, and command-line options, just as you'd write | |
206 | them to the shell. Simple quoting and escaping is provided: see the | |
207 | manual page for the full details. Each line is processed independently, | |
208 | so it doesn't work to write an option on one line and then its argument | |
209 | on the next. | |
210 | ||
211 | The environment variable ~RUNLISP_OPTIONS~ is processed /after/ reading | |
212 | the configuration file(s), if any. Again, it should contain | |
213 | command-line options, as you'd write them to the shell. | |
214 | ||
215 | ** Deciding which Lisp implementation to use | |
216 | ||
217 | The most useful option to use here is ~-P~, which builds up a | |
218 | /preference list/, in order. The argument to ~-P~ is a comma-separated | |
219 | list of Lisp implementation names, just like you'd give to ~-L~. | |
220 | ||
221 | If you provide multiple ~-P~ options (e.g., on different lines of your | |
222 | configuration file, or separately in the configuration file and | |
223 | environment variable), then the lists are concatenated. The environment | |
224 | variable is processed after the configuration file, so implementations | |
225 | named in ~RUNLISP_OPTIONS~ come after those from the configuration file. | |
226 | ||
227 | When deciding which Lisp implementation to use, ~runlisp~ works as | |
228 | follows. It builds a list of /acceptable/ Lisp implementations from the | |
229 | ~-L~ options, and a list of /preferred/ Lisp implementations from the | |
230 | ~-P~ options. If there aren't any ~-L~ options, then it assumes that | |
231 | /all/ Lisp implementations are acceptable; but if there are no ~-P~ | |
232 | options then it assumes that /no/ Lisp implementations are preferred. | |
233 | It then works through the preferred list in order: if it finds an | |
234 | implementation which is installed and acceptable, then it uses that one. | |
235 | If that doesn't work, then it works through the acceptable | |
236 | implementations that it hasn't tried yet, in order, and if it finds one | |
237 | of those that's installed, then it runs that one. Otherwise it reports | |
238 | an error and gives up. | |
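
The rule above can be modelled in a few lines of shell. This is a sketch, not ~runlisp~'s actual code: the function names are made up, and the order assumed for `all' implementations is simply the order of the list at the top of this document.

```shell
#!/bin/sh
# Assumed order for `all implementations': as listed in this document.
ALL_LISPS="abcl ccl clisp cmucl ecl sbcl"

# contains LIST ITEM: succeed if ITEM is a word in space-separated LIST.
contains() {
    case " $1 " in *" $2 "*) return 0 ;; *) return 1 ;; esac
}

# choose_lisp ACCEPTABLE PREFERRED INSTALLED
# Each argument is a space-separated list; an empty ACCEPTABLE stands
# for `no -L options'.  Prints the chosen name, or fails.
choose_lisp() {
    acceptable=${1:-$ALL_LISPS}
    preferred=$2
    installed=$3
    # First pass: preferred implementations, in preference order.
    for lisp in $preferred; do
        if contains "$acceptable" "$lisp" && contains "$installed" "$lisp"; then
            echo "$lisp"; return 0
        fi
    done
    # Second pass: acceptable implementations not already tried.
    for lisp in $acceptable; do
        contains "$preferred" "$lisp" && continue
        if contains "$installed" "$lisp"; then
            echo "$lisp"; return 0
        fi
    done
    echo "no acceptable Lisp implementation found" >&2
    return 1
}

# Script accepts sbcl,ccl; user prefers clisp,ccl; all three installed.
choose_lisp "sbcl ccl" "clisp ccl" "ccl sbcl clisp"
```

In the final call, ~clisp~ is preferred but not acceptable to the script, so the model settles on ~ccl~, the first preferred implementation that the script also accepts.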
239 | ||
240 | ** Clearing the preferred list | |
241 | ||
242 | Since the environment variable is processed after the configuration | |
243 | files, it can only append more Lisp implementations to the end of the | |
244 | preferred list, which may well not be so helpful. There's an additional | |
245 | option ~-C~, which completely clears the preferred list. The idea is | |
246 | that you can write ~-C~ at the start of your ~RUNLISP_OPTIONS~ | |
247 | environment variable to temporarily override your usual configuration | |
248 | for some special effect. | |
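
The interplay of ~-P~ and ~-C~ can be modelled as follows. This is an illustrative sketch with made-up names; it handles only the attached forms ~-Pnames~ and ~-C~, and the sample configuration contents are hypothetical.

```shell
#!/bin/sh
# preferred_list OPTION...
# Process options in order: -Pa,b appends to the preferred list, -C
# clears it.  Prints the resulting space-separated list.
preferred_list() {
    preferred=
    for opt in "$@"; do
        case $opt in
            -C)  preferred= ;;
            -P*) names=$(printf '%s' "${opt#-P}" | tr ',' ' ')
                 preferred="${preferred:+$preferred }$names" ;;
        esac
    done
    echo "$preferred"
}

# Hypothetical configuration file contents, one option per line.
config_opts="-Psbcl,ccl -Pclisp"

# Left unquoted on purpose, so that each option is a separate word.
preferred_list $config_opts -Pecl       # as if RUNLISP_OPTIONS='-Pecl'
preferred_list $config_opts -C -Pecl    # as if RUNLISP_OPTIONS='-C -Pecl'
```

The first call leaves ~ecl~ at the very end of the list; the second discards the configured preferences, so that ~ecl~ is the only preferred implementation.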
249 | ||
250 | ||
251 | * What's wrong with =cl-launch=? | |
252 | ||
253 | The short version is that ~cl-launch~ is slow and inconvenient. | |
254 | ~cl-launch~ is a big, complicated Common Lisp/Bourne shell polyglot | |
255 | which tries to do everything but doesn't quite succeed. | |
256 | ||
257 | ** It's slow. | |
258 | ||
259 | I took a trivial Lisp script: | |
260 | ||
261 | : (format t "Hello from ~A!~%~ | |
262 | : Script = `~A'~%~ | |
263 | : Arguments = (~{`~A'~^, ~})~%" | |
264 | : (lisp-implementation-type) | |
265 | : (uiop:argv0) | |
266 | : uiop:*command-line-arguments*) | |
267 | ||
268 | I timed how long it took to run on all of ~runlisp~'s supported Lisp | |
269 | implementations, and compared them to how long ~cl-launch~ took: the | |
270 | results are shown in table [[tab:runlisp-vanilla]]. ~runlisp~ is /at least/ | |
271 | two and a half times faster at running this script than ~cl-launch~ on all | |
272 | implementations except Clozure CL[fn:slow-ccl], and approaching four and | |
273 | a half times faster on SBCL. | |
274 | ||
275 | #+CAPTION: ~cl-launch~ vs ~runlisp~ (with vanilla images) | |
276 | #+NAME: tab:runlisp-vanilla | |
277 | #+ATTR_LATEX: :float t :placement [tbp] | |
278 | |------------------+-------------------+-----------------+----------------------| | |
279 | | *Implementation* | *~cl-launch~ (s)* | *~runlisp~ (s)* | *~runlisp~ (factor)* | | |
280 | |------------------+-------------------+-----------------+----------------------| | |
281 | | ABCL | 7.3036 | 2.6027 | 2.806 | | |
282 | | Clozure CL | 1.2769 | 0.9678 | 1.319 | | |
283 | | GNU CLisp | 1.2498 | 0.2659 | 4.700 | | |
284 | | CMU CL | 0.9665 | 0.3065 | 3.153 | | |
285 | | ECL | 0.8025 | 0.3173 | 2.529 | | |
286 | | SBCL | 0.3266 | 0.0739 | 4.419 | | |
287 | |------------------+-------------------+-----------------+----------------------| | |
288 | #+TBLFM: $4=$2/$3;%.3f | |
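
The factor column is just the ~cl-launch~ time divided by the ~runlisp~ time, as the table formula says. A small shell helper (the name ~speedup~ is made up here) reproduces the arithmetic with ~awk~:

```shell
#!/bin/sh
# speedup CL_LAUNCH_SECONDS RUNLISP_SECONDS
# Print how many times faster runlisp was, to three decimal places,
# mirroring the table formula $4=$2/$3;%.3f.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.3f\n", a / b }'
}

speedup 0.3266 0.0739    # SBCL row
speedup 1.2769 0.9678    # Clozure CL row
```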
289 | ||
290 | But this is using the `vanilla' Lisp images installed with the | |
291 | implementations. ~runlisp~ by default builds custom images for most | |
292 | Lisp implementations, which improves startup performance significantly; | |
293 | see table [[tab:runlisp-custom]]. (I don't currently know how to build a | |
294 | useful custom image for ABCL. ~runlisp~ does build a custom image for | |
295 | ECL, but it doesn't help significantly.) These results are summarized | |
296 | in figure [[fig:lisp-graph]]. | |
297 | ||
298 | #+CAPTION: ~cl-launch~ vs ~runlisp~ (with custom images) | |
299 | #+NAME: tab:runlisp-custom | |
300 | #+ATTR_LATEX: :float t :placement [tbp] | |
301 | |------------------+-------------------+-----------------+----------------------| | |
302 | | *Implementation* | *~cl-launch~ (s)* | *~runlisp~ (s)* | *~runlisp~ (factor)* | | |
303 | |------------------+-------------------+-----------------+----------------------| | |
304 | | ABCL | 7.3036 | 2.5873 | 2.823 | | |
305 | | Clozure CL | 1.2769 | 0.0088 | 145.102 | | |
306 | | GNU CLisp | 1.2498 | 0.0146 | 85.603 | | |
307 | | CMU CL | 0.9665 | 0.0063 | 153.413 | | |
308 | | ECL | 0.8025 | 0.3185 | 2.520 | | |
309 | | SBCL | 0.3266 | 0.0077 | 42.416 | | |
310 | |------------------+-------------------+-----------------+----------------------| | |
311 | #+TBLFM: $4=$2/$3;%.3f | |
312 | ||
313 | #+CAPTION: Comparison of ~runlisp~ and ~cl-launch~ times | |
314 | #+NAME: fig:lisp-graph | |
315 | #+ATTR_LATEX: :float t :placement [tbp] | |
316 | [[file:doc/lisp-graph.tikz]] | |
317 | ||
318 | With some Lisp implementations at least, ~runlisp~'s startup performance | |
319 | is, unlike ~cl-launch~'s, usefully comparable to that of other popular | |
320 | scripting languages. I wrote similarly trivial scripts in a number | |
321 | of other languages, and timed them; the results are tabulated in table | |
322 | [[tab:runlisp-interp]] and graphed in figure [[fig:interp-graph]]. | |
323 | ||
324 | #+CAPTION: ~runlisp~ vs other interpreters | |
325 | #+NAME: tab:runlisp-interp | |
326 | #+ATTR_LATEX: :float t :placement [tbp] | |
327 | |------------------------------+-------------| | |
328 | | *Implementation* | *Time (ms)* | | |
329 | |------------------------------+-------------| | |
330 | | Clozure CL | 8.8 | | |
331 | | GNU CLisp | 14.6 | | |
332 | | CMU CL | 6.3 | | |
333 | | SBCL | 7.7 | | |
334 | |------------------------------+-------------| | |
335 | | Perl | 1.2 | | |
336 | | Python | 10.3 | | |
337 | |------------------------------+-------------| | |
338 | | Debian Almquist shell (dash) | 1.4 | | |
339 | | GNU Bash | 2.0 | | |
340 | | Z Shell | 4.1 | | |
341 | |------------------------------+-------------| | |
342 | | Tiny C (compile & run) | 1.2 | | |
343 | | GCC (precompiled) | 0.5 | | |
344 | |------------------------------+-------------| | |
345 | ||
346 | #+CAPTION: Comparison of ~runlisp~ and other script interpreters | |
347 | #+NAME: fig:interp-graph | |
348 | #+ATTR_LATEX: :float t :placement [tbp] | |
349 | [[file:doc/interp-graph.tikz]] | |
350 | ||
351 | (All the timings in this section were performed on the same 2020 Dell | |
352 | XPS13 laptop running Debian `buster'. The tools used to make the | |
353 | measurements are included in the source distribution, in the ~bench/~ | |
354 | subdirectory.) | |
355 | ||
356 | [fn:slow-ccl] I don't know why Clozure CL shows such a small difference | |
357 | here. | |
358 | ||
359 | ** It's inconvenient | |
360 | ||
361 | ~cl-launch~ has this elaborate machinery which reads shell script | |
362 | fragments from various places and sets variables like ~$LISPS~, but it | |
363 | doesn't quite work. | |
364 | ||
365 | Unlike other scripting languages such as Perl or Python, Common Lisp has | |
366 | lots of implementations, and they all have various unique features (and | |
367 | bugs) which a script might rely on (or need to avoid). Also, a user | |
368 | might have preferences about which Lisps to use. ~cl-launch~'s approach | |
369 | to this problem is a ~system_preferred_lisps~ shell function which can | |
370 | be used in ~~/.cl-launchrc~ to select a Lisp system for a particular | |
371 | `software system' -- a notion which doesn't appear to be well-defined -- | |
372 | and it all works by editing a single ~$LISPS~ shell variable. By | |
373 | contrast, ~runlisp~ has a ~-L~ option with which scripts can specify the | |
374 | Lisp systems they support (in a preference order), and a ~-P~ option | |
375 | with which users can express their own preferences (e.g., in the | |
376 | environment or a configuration file): ~runlisp~ will never choose a Lisp | |
377 | system which the script can't deal with, but it will respect the user's | |
378 | relative preferences. | |
379 | ||
380 | ** It doesn't establish a (useful) common environment | |
381 | ||
382 | A number of Lisp systems are annoyingly deficient in their handling of | |
383 | scripts. | |
384 | ||
385 | For example, when GNU CLisp's ~-x~ option is used, it rebinds | |
386 | ~*standard-input*~ to an internal string stream holding the expression | |
387 | passed in on the command line, leaving the process's actual stdin nearly | |
388 | impossible to access. | |
389 | ||
390 | : $ date | cl-launch -l sbcl -i "(princ (read-line nil nil))" # expected | |
391 | : Sun 9 Aug 14:39:10 BST 2020 | |
392 | : $ date | cl-launch -l clisp -i "(princ (read-line nil nil))" # bug! | |
393 | : NIL | |
394 | ||
395 | As another example, Armed Bear Common Lisp doesn't seem to believe in | |
396 | the stderr stream: when it starts up, ~*error-output*~ is bound to the | |
397 | standard output, just like ~*standard-output*~. Also, ~cl-launch~ | |
398 | loading ASDF causes a huge number of ~style-warning~ messages to be | |
399 | written to stdout, making ABCL pretty much useless for writing filter | |
400 | scripts. | |
401 | ||
402 | : $ cl-launch -l sbcl -i '(progn | |
403 | : (format *standard-output* "output~%") | |
404 | : (format *error-output* "error~%"))' \ | |
405 | : > >(sed 's/^/stdout: /') 2> >(sed 's/^/stderr: /') | |
406 | : stdout: output | |
407 | : stderr: error | |
408 | : $ cl-launch -l abcl -i '(progn | |
409 | : (format *standard-output* "output~%") | |
410 | : (format *error-output* "error~%"))' \ | |
411 | : > >(sed 's/^/stdout: /') 2> >(sed 's/^/stderr: /') | |
412 | : [1813 lines of compiler warnings tagged `stdout:'] | |
413 | : stdout: output | |
414 | : stdout: error | |
415 | ||
416 | ~runlisp~ takes care of all of this, providing a basic but useful common | |
417 | level of shell integration for all its supported Lisp implementations. | |
418 | In particular: | |
419 | ||
420 | + It ensures that the standard Unix `stdin', `stdout', and `stderr' | |
421 | file descriptors are hooked up to the Lisp ~*standard-input*~, | |
422 | ~*standard-output*~, and ~*error-output*~ streams. | |
423 | ||
424 | + It ensures that starting a script doesn't write a deluge of | |
425 | diagnostic drivel. | |
426 | ||
427 | The complete details are given in ~runlisp~'s manpage. | |
428 | ||
429 | ** Why might one prefer =cl-launch= anyway? | |
430 | ||
431 | On the other hand, ~cl-launch~ is well established and full-featured. | |
432 | ||
433 | ~cl-launch~ compiles scripts before trying to run them, so they'll run | |
434 | faster on Lisps which use an interpreter by default. It has a caching | |
435 | feature so running a script a second time doesn't need to recompile it. | |
436 | If your scripts are compute-intensive and benefit from ahead-of-time | |
437 | compilation then maybe ~cl-launch~ is preferable. | |
438 | ||
439 | ~cl-launch~ supports more Lisp systems. I only have six installed on my | |
440 | development machine at the moment, so those are the ones that ~runlisp~ | |
441 | supports. If you want your scripts to be able to run on other Lisps, | |
442 | then ~cl-launch~ is the way to do that. Of course, I welcome patches to | |
443 | help ~runlisp~ support other free Lisp implementations. ~cl-launch~ | |
444 | also supports proprietary Lisps: I have very little interest in these, | |
445 | so if you want to run scripts using Allegro or LispWorks then | |
446 | ~cl-launch~ is your only choice. |