.\" -*-nroff-*-
.\"
.\" Manual for the audio conversion gremlin
.\"
.\" (c) 2016 Mark Wooding
.\"
.
.\"----- Licensing notice ---------------------------------------------------
.\"
.\" This file is part of the `autoys' audio tools collection.
.\"
.\" `autoys' is free software; you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation; either version 2 of the License, or
.\" (at your option) any later version.
.\"
.\" `autoys' is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public License
.\" along with `autoys'; if not, write to the Free Software Foundation,
.\" Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
.
.TH gremlin 1 "13 February 2016" "Mark Wooding" "autoys"
.
.\"--------------------------------------------------------------------------
.SH NAME
gremlin \- batch audio file converter
.
.SH SYNOPSIS
.B gremlin
.RB [ \-in ]
.RB [ \-T
.IR timeout ]
.RB [ \-t
.IR timeout ]
.I config
.
.\"--------------------------------------------------------------------------
.SH DESCRIPTION
.
The
.B gremlin
program converts audio files
in an input `master' directory tree,
which presumably contains
high-quality (ideally lossless) encodings
of interesting audio,
writing corresponding converted files
to a collection of output directory trees.
It's non-interactive, idempotent, and restartable;
it never modifies its master tree.
It's exactly the sort of thing you want to
install as a daily cron job.
.PP
The
.B gremlin
reads a configuration file
which describes the conversion policy for each of the output trees.
The policy can say things like:
copy MP3 files up to 160kb/s,
or Ogg Vorbis files up to 128kb/s;
and convert everything else to 128kb/s Ogg Vorbis.
.PP
The
.B gremlin
can also convert image files, such as cover art.
.PP
Input files can be anything which
GStreamer and/or the Python Imaging Library can understand;
output files are more constrained,
because the
.B gremlin
has to be able to understand
their relevant properties.
The currently supported audio formats are
Ogg Vorbis and
MP3;
image formats are
JPEG,
PNG, and
BMP.
.PP
In a little more detail:
the
.B gremlin
works through its input master tree,
one directory at a time.
For each master directory,
it tries to write a converted version
to a corresponding output directory
in each of the output trees.
For each file in the master directory,
it determines which files should be made
in each output directory:
if those files exist,
and are not older than the master file,
then they're left alone on the assumption that they're up-to-date;
otherwise, the
.B gremlin
will make the output files by converting the master file.
.PP
Any other files or directories in the output directory
will be
.IR deleted .
The
.B gremlin
assumes that its output trees belong entirely to it,
to maintain according to its configuration,
and that unexpected files are either
debris left over from an earlier failure
or a result of a policy change,
and in either case the right thing to do is
to delete the offending files.
.
.SS "Command line syntax"
The following options are recognized.
.TP
.B "\-h, \-\-help"
Write a help message to standard output
describing the
.BR gremlin 's
command-line options,
and exit with status zero.
.TP
.B "\-\-version"
Write the
.BR gremlin 's
version number to standard output
and exit with status zero.
.TP
.B "\-i, \-\-interactive"
Write progress eyecandy to standard output while running.
While walking the master tree,
the
.B gremlin
shows which directory it's currently examining.
While converting audio files,
it displays a progress meter with
a bar chart of the job in progress,
the percentage of the job which is complete,
and an estimated time to completion.
(This last starts out rather inaccurate,
but seems to be pretty good after a couple of seconds.)
All this is done automatically if standard output is a terminal;
this option can be used to turn it on under other circumstances.
.TP
.B "\-n, \-\-no-act"
Don't actually modify the filesystem.
No files will be created or removed.
.TP
.BI "\-t, \-\-timeout=" timeout
Only run for about
.I timeout
seconds.
Once the timeout has expired, the
.B gremlin
will try to finish what it's doing
and then exit with status zero.
.IP
(This might seem a surprising choice of exit status.
The idea is that the
.B gremlin
was asked to spend some amount of time converting files,
and it has done that successfully.)
.TP
.BI "\-T, \-\-timeout-nasty=" timeout
If the timeout set by the
.B \-t
option (above) has expired,
and a further
.I timeout
seconds have elapsed
but the
.B gremlin
still hasn't managed to wrap things up,
then exit immediately with status 3,
possibly leaving partially converted files
or other incomplete output behind.
(A future run of the
.B gremlin
will notice this wreckage and clean it up.)
.
.\"--------------------------------------------------------------------------
.SH CONFIGURATION FILE
.
.SS "Lexical syntax"
The
.BR gremlin 's
configuration file has a simple token-oriented lexical syntax.
Whitespace acts to separate tokens but has no other meaning.
A hash sign
.RB ` # '
outside of a quoted string introduces a comment
which extends to the end of the line;
newlines otherwise separate tokens, just like other whitespace.
There are no `reserved words',
but some names have special meanings,
depending on the context.
.PP
Integers are written in decimal.
(There is no provision for entering numbers in hex or octal.)
.IP
.I int
::=
.I digit
\&...
.br
.I digit
::=
.B 0
|
.B 1
|
.B 2
|
.B 3
|
.B 4
|
.B 5
|
.B 6
|
.B 7
|
.B 8
|
.B 9
.PP
Strings (mostly used for pathnames and suchlike)
are enclosed in double quotes
.RB ` """" ';
quotes and backslashes to be included in the string
must be escaped by preceding them with a backslash
.RB ` \e '.
.IP
.I string
::=
.B """"
.IR string-char ...\&
.B """"
.br
.I string-char
::=
any character other than
.B """"
or
.B \e
.br
\h'4m'|
.B "\e"""
|
.B \e\e
.
.SS "Top-level syntax"
At a high level,
the configuration consists of a sequence of
.IR "top-level items" .
.IP
.I config
::=
.I toplevel-item
\&...
.
.SS "Global settings"
Miscellaneous configuration for the whole program
goes in a top-level
.B vars
section.
.IP
.I toplevel-item
::=
.I vars-section
.br
.I vars-section
::=
.B vars
.B {
.IR var-setting
\&...\&
.B }
.PP
There may be multiple such sections.
The same variable may be set more than once;
if that happens,
only the last such setting has any effect.
.IP
.I var-setting
::=
.B master
.B =
.I path
.br
.I path
::=
.I string
.PP
The
.B master
variable holds the pathname of the top of the master tree.
.PP
There are, at present, no other global settings.
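.PP
As an illustrative sketch
(the pathnames here are made up),
the following fragment sets
.B master
twice;
by the rule above, only the second setting should have any effect.
It also shows a comment
and a string containing escaped quotes.
.IP
.nf
.ft B
# Superseded by the later setting.
vars { master = "/srv/old-master" }
vars { master = "/srv/my \e"best\e" music" }
.fi
.ft P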
.
.SS "Target definitions"
The other kind of top-level configuration item
defines a target directory
to be constructed or updated
by the
.BR gremlin .
.IP
.I toplevel-item
::=
.I target-def
.br
.I target-def
::=
.B target
.I path
.B {
.I type-clause
\&...\&
.B }
.br
.I type-clause
::=
.B type
.I type
.B {
.I policy
\&...\&
.B }
.PP
A
.B target
definition tells the
.B gremlin
to populate a directory tree
rooted at the given
.IR path .
The body of the target definition consists of
a sequence of
.B type
clauses
which explain what to do with different kinds of file.
The possible
.I type
tokens are as follows.
.TP
.B audio
Encoded audio files,
which can be decoded by the GStreamer library.
.TP
.B image
Image files,
which can be decoded by the Python Imaging Library.
.PP
The body of the type clause defines a
.I policy
for converting files of that type.
.
.SS "Policy descriptions"
There are two kinds of
.I primitive
policies,
which are described in full below:
.BR accept ,
which copies (or links) a master file
if its format is appropriate,
or does nothing;
and
.BR convert ,
which converts a master file into a chosen format,
and (in principle) should always succeed.
There are also two ways to build up
.I compound
policies from simpler ones.
.IP
.I policy
::=
.B and
.B {
.I policy
\&...\&
.B }
.br
\h'4m'|
.B or
.B {
.I policy
\&...\&
.B }
.PP
The
.B and
policy applies
.I all
of its operand policies,
potentially producing multiple output files.
.PP
The body of a
.I type-clause
consists of a sequence of policies
which are implicitly combined in the same way,
as if enclosed in an
.B and
policy.
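.PP
For instance, the following sketch
(an illustration, not taken from a real configuration)
asks for both an Ogg Vorbis and an MP3 rendition of each master file.
.IP
.nf
.ft B
and {
  convert ogg-vorbis { bitrate = 128 }
  convert mp3 { bitrate = 160 }
}
.fi
.ft P
.PP
Written directly in the body of a
.B type audio
clause, the two
.B convert
policies would presumably behave the same way without the enclosing
.BR and .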
.PP
The
.B or
policy
tries its operand policies in turn,
in the order specified,
until one of them succeeds;
no more policies are tried after this.
.IP
.I policy
::=
.B accept
.I format-spec
.br
\h'4m'|
.B convert
.I format-spec
.IP
.I format-spec
::=
.I format-name
.br
\h'4m'|
.I format-name
.B {
.I format-prop
\&...\&
.B }
.PP
(The possible
.IR format-name s
and the corresponding
.IR format-spec s
are described in the sections below.)
.PP
The
.B convert
policy converts a file to the specified format.
More specifically:
if the file's format already matches the
.I format-spec
then it is copied to the target directory.
(Indeed, if possible,
the file is hard linked into the target directory.)
If the file's format doesn't match,
then the
.B gremlin
converts it,
producing an output file of the requested format.
.PP
The
.B accept
policy copies or links a file if its format matches the
.IR format-spec ,
just as
.B convert
does.
However, if the file doesn't match then
.B accept
fails.
.PP
The usual use of
.B accept
is within an
.B or
block.
For example, suppose that the master tree mostly contains
losslessly encoded files, such as FLAC,
and we usually want to produce Ogg Vorbis
for use on devices with limited storage capacity;
but some of the master files are only available as MP3,
and re-encoding MP3 as Ogg Vorbis won't be good for sound quality.
Therefore, you can say something like
.IP
.nf
.ft B
or {
  accept mp3 { bitrate = 160 }
  convert ogg-vorbis { bitrate = 128 }
}
.fi
.ft P
.PP
which means:
if a master file is an MP3 file with bitrate approximately 160kb/s or less,
then copy it;
otherwise, convert the file to Ogg Vorbis, at about 128kb/s.
.PP
It's possible that even a simple policy
acting on the files in a master directory
will come up with multiple ways
to produce the same output file.
The rule used to decide is as follows:
if the
.B gremlin
can make the output file by copying one of the master files
then it does that;
otherwise it converts one of the inputs chosen arbitrarily.
For example,
suppose that a policy for
.B audio
files says
.IP
.B convert ogg-vorbis
.PP
and the master directory contains
.B foo.flac
and
.BR foo.ogg ;
then it will copy
.B foo.ogg
and ignore
.BR foo.flac .
If, instead, the master contains
.B foo.flac
and
.BR foo.mp3 ,
then one of these will be converted,
but it's hard to predict which.
.
.SS "Audio formats"
Two audio
.IR format-name s
are defined.
.PP
All audio formats support a
.B bitrate
property.
.IP
.I format-prop
::=
.B bitrate
.B =
.I int
.PP
The bitrate is expressed in kilobits per second.
For an existing file to match a
.I format-spec
containing a
.B bitrate
property,
the file's bitrate must be less than
the specified bitrate times a fudge factor
(currently sqrt(2)).
(The
.B bitrate
property is notionally the desired
.I output
bitrate;
the
.B gremlin
assumes that it's better to make output files a bit larger
than to re-encode an already lossily compressed master file.)
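.PP
As a worked illustration,
assuming the fudge factor remains sqrt(2):
.IP
.nf
bitrate = 128: matches files below 128 * sqrt(2), about 181kb/s
bitrate = 160: matches files below 160 * sqrt(2), about 226kb/s
.fi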
.PP
At present, the audio formats define no other properties.
.TP
.B mp3
The MP3 format that everyone knows and loves.
For encoding, the
.B gremlin
uses Lame,
and stores metadata in an ID3v2 tag;
it also tries to store an ID3v1.1 tag,
but this can fail for a number of reasons
(e.g., if the genre can't be represented,
or text contains characters outside of the ISO 8859-1 character set
used in ID3v1 tags).
.TP
.B ogg-vorbis
Vorbis-encoded audio in an Ogg container,
as defined by the Xiph.Org Foundation.
On encoding, the
.B bitrate
parameter is actually mapped to a quality setting
chosen to produce approximately the right bitrate.
.
.SS "Image formats"
Three image
.IR format-name s
are defined.
.PP
All image formats support a
.B size
property.
.IP
.I format-prop
::=
.B size
.B =
.I int
.PP
The size provides an upper bound on the width and height of the image.
A master file will only match if
both its width and height are
less than the stated size.
On output, the image will be scaled to the right size,
preserving its aspect ratio.
.TP
.B jpeg
The JFIF format, defined by the Joint Photographic Experts Group.
The following additional properties can be set;
they affect output only.
.RS
.TP
.B optimize
Spend longer to select optimal encoder settings.
.TP
.B progressive
Make a progressively-rendering output file.
This isn't usually a good idea.
.TP
.BI "quality = " int
Set the image quality (at the expense of file size).
This is a percentage; the default is 75.
.RE
.TP
.B png
The Portable Network Graphics format,
originally defined in RFC 2083.
The following additional properties can be set;
they affect output only.
.RS
.TP
.B optimize
Spend longer to try to make the output file smaller.
.RE
.TP
.B bmp
The Windows BMP format.
There are no additional properties.
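.PP
As a sketch of how these properties might be combined
(the particular values are arbitrary,
and it assumes that properties are simply listed one after another,
as the grammar above suggests),
the following policy keeps PNG masters
whose width and height are both below 500,
and converts everything else to an optimized JPEG
scaled to fit that size.
.IP
.nf
.ft B
or {
  accept png { size = 500 }
  convert jpeg { size = 500 quality = 85 optimize }
}
.fi
.ft P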
.
.SS "Example file"
The following is the author's configuration file.
I have an archive which mostly consists of FLAC files,
with a few MP3 files where I've been unable to obtain physical CDs.
I generate two output trees.
One mostly contains Ogg Vorbis files,
but tolerates occasional MP3
rather than suffer the quality loss of re-encoding.
It also generates small BMP-format images from cover art,
because I have an old portable audio player
which runs the free RockBox firmware,
whose player is only capable of displaying such images.
.IP
.nf
.ft B
### -*-conf-*-

vars {
  master = "/mnt/jb/master"
}

target "/mnt/jb/gremlin/ogg-vorbis-128" {
  type audio {
    or {
      accept mp3 { bitrate = 160 }
      convert ogg-vorbis { bitrate = 128 }
    }
  }
  type image {
    or {
      accept png
      convert jpeg { quality = 7 }
    }
    convert bmp { size = 75 }
  }
}

target "/mnt/jb/gremlin/mp3-160" {
  type audio {
    convert mp3 { bitrate = 160 }
  }
  type image {
    or {
      accept png
      convert jpeg { quality = 7 }
    }
  }
}
.fi
.ft P
.
.\"--------------------------------------------------------------------------
.SH BUGS
.
The
.B gremlin
makes no effort to process more than one file at a time.
.PP
It should probably support more audio formats.
They're quite easy to add,
but I don't have a good feel for which formats are good.
Patches and advice are welcome.
.PP
The
.B and
and
.B or
policy names are possibly confusing.
They suggest that they work like the standard logical operators;
while
.B or
sort of does, if you squint a bit,
.B and
certainly doesn't;
on the other hand, it does try to do all of the things you ask of it.
.PP
.B gremlin
is a very unhelpful name for the program.
.
.\"--------------------------------------------------------------------------
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>
.
.SH SEE ALSO
.BR hush (1),
.BR rsync (1).
.
.\"----- That's all, folks --------------------------------------------------