When displaying sizes as a floating-point number (e.g. "123.4 Mb"),

[sgt/agedu] / TODO
diff --git a/TODO b/TODO
index f34f507..df84d4a 100644
--- a/TODO
+++ b/TODO
@@ -1,21 +1,18 @@
  TODO list for agedu
  ===================
  
- - adjust the default web server address selection.
-    + some systems (e.g. OS X) don't like us binding to random
-      localhost addresses. So if that fails, try falling back to
-      127.0.0.1 proper (and a randomly selected port) before giving
-      up.
-    + since binding to port 80 isn't generally feasible, we should
-      adjust the default behaviour when the user specifies --addr
-      with no port: it should select port zero, and then print the
-      port number on standard output. (Possibly also print the URL
-      as usual, in that situation: translate INADDR_ANY to
-      INADDR_LOOPBACK and then do the same as when we made the
-      entire address up ourself.)
+ - flexibility in the HTML report output mode: expose the internal
+   mechanism for configuring the output filenames, and allow the
+   user to request individual files with hyperlinks as if the other
+   files existed. (In particular, functionality of this kind would
+   enable other modes of use like the built-in --cgi mode, without
+   me having to anticipate them in detail.)
  
- - we should munmap in all operating modes where we mmapped,
-   otherwise chaining them will run out of address space
+ - add the option (and perhaps even make it default) to address HTML
+   subpages by pathname rather than by index number. More than just
+   cosmetic: it means that in a scenario where agedu --cgi is always
+   running but the index file is updated by cron, subsidiary
+   pathnames will remain valid across a change.
  
   - we could still be using more of the information coming from
     autoconf. Our config.h is defining a whole bunch of HAVE_FOOs for
@@ -30,16 +27,6 @@ TODO list for agedu
        controversial; IIRC it's all in POSIX, for one thing. So more
        likely this should simply wait until somebody complains.
  
- - it would be useful to support a choice of indexing strategies.
-   The current system's tradeoff of taking O(N log N) space in order
-   to be able to support any age cutoff you like is not going to be
-   ideal for everybody. A second more conventional mechanism which
-   allows the user to specify a number of fixed cutoffs and just
-   indexes each directory on those alone would undoubtedly be a
-   useful thing for large-scale users. This will require
-   considerable thought about how to make the indexers pluggable at
-   both index-generation time and query time.
-
   - IPv6 support in the HTTP server
      * of course, Linux magic auth can still work in this context; we
        merely have to be prepared to open one of /proc/net/tcp or
@@ -71,18 +58,42 @@ TODO list for agedu
        HTML. Then the former can be reused to produce very similar
        reports in coloured plain text.
  
- - http://msdn.microsoft.com/en-us/library/ms724290.aspx suggest
-   modern Windowses support atime-equivalents, so a Windows port is
-   possible in principle.
-    + For a full Windows port, would need to modify the current
-      structure a lot, to abstract away (at least) memory-mapping of
-      files, details of disk scan procedure, networking for httpd.
-      Unclear what the right UI would be on Windows, too;
-      command-line exactly as now might be considered just a
-      _little_ unfriendly. Or perhaps not.
-    + Alternatively, a much easier approach would be to write a
-      Windows version of just the --scan-dump mode, which does a
-      filesystem scan via the Windows API and generates a valid
-      agedu dump file on standard output. Then one would simply feed
-      that over the network connection of one's choice to the rest
-      of agedu running on Unix as usual.
+ - abstracting away all the Unix calls so as to enable a full
+   Windows port. We can already do the difficult bit on Windows
+   (scanning the filesystem and retrieving atime-analogues).
+   Everything else is just coding - albeit quite a _lot_ of coding,
+   since the Unix assumptions are woven quite tightly into the
+   current code.
+    + If nothing else, it's unclear what the user interface properly
+      ought to be in a Windows port of agedu. A command-line job
+      exactly like the Unix version might be useful to some people,
+      but would certainly be strange and confusing to others.
+
+ - it might conceivably be useful to support a choice of indexing
+   strategies. The current "continuous index" mechanism's tradeoff of
+   taking O(N log N) space in order to be able to support any age
+   cutoff you like is not going to be ideal for everybody. A second
+   more conventional "discrete index" mechanism which allows the
+   user to specify a number of fixed cutoffs and just indexes each
+   directory on those alone would undoubtedly be a useful thing for
+   large-scale users. This will require considerable thought about
+   how to make the indexers pluggable at both index-generation time
+   and query time.
+    * however, now we have the cut-down version of the continuous
+      index, the space saving is less compelling.
+
+ - A user requested what's essentially a VFS layer: given multiple
+   index files and a map of how they fit into an overall namespace,
+   we should be able to construct the right answers for any query
+   about the resulting aggregated hierarchy by doing at most
+   O(number of indexes * normal number of queries) work.
+
+ - Support for filtering the scan by ownership and permissions. The
+   index data structure can't handle this, so we can't build a
+   single index file admitting multiple subset views; but a user
+   suggested that the scan phase could record information about
+   ownership and permissions in the dump file, and then the indexing
+   phase could filter down to a particular sub-view - which would at
+   least allow the construction of various subset indices from one
+   dump file, without having to redo the full disk scan which is the
+   most time-consuming part of all.
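
The last item above (filtering a dump by ownership and permissions before indexing) might look roughly like the following C sketch. Everything in it is an assumption made for illustration: `struct dump_rec`, `rec_matches` and `filtered_total` are hypothetical names, and the `uid` and `mode` fields stand for the extra information the TODO item proposes recording; none of this is agedu's actual dump format or code. Summing the sizes of matching records stands in for handing them to the indexer.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical dump record. The uid and mode fields are the extra
 * per-file data the TODO item proposes the scan phase should record;
 * this layout is invented for the example. */
struct dump_rec {
    const char *path;
    unsigned long long size;   /* file size in bytes */
    long long atime;           /* last-access time, seconds */
    uid_t uid;                 /* owning user */
    mode_t mode;               /* permission bits */
};

/* Keep a record only if it is owned by `uid` and has every
 * permission bit in `need_mode` set. */
static int rec_matches(const struct dump_rec *r, uid_t uid, mode_t need_mode)
{
    return r->uid == uid && (r->mode & need_mode) == need_mode;
}

/* Walk the dump once and total the sizes of the records in the
 * requested sub-view. A real indexing phase would pass the matching
 * records on to the index builder instead of summing them. */
static unsigned long long filtered_total(const struct dump_rec *recs, size_t n,
                                         uid_t uid, mode_t need_mode)
{
    unsigned long long total = 0;
    size_t i;

    assert(recs != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (rec_matches(&recs[i], uid, need_mode))
            total += recs[i].size;
    return total;
}
```

Because the filter runs over the dump rather than the filesystem, several sub-views (one per user, say) could be built from a single scan, which is the saving the item describes.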