Literate Programs in Ubik ==============================================
Haldean Brown                                      First draft: Jul 2016
                                                  Last updated: Jul 2016

Status: Complete


Ubik supports literate programming as a first-class language entity.
Ubik has a leg up on many mainstream programming languages when it comes
to literate programming: the ordering of definitions is not significant
in Ubik, so one can order the statements in a module in whatever order
makes a better "narrative". That, combined with the fact that a literate
Ubik file is also valid reStructuredText (reST), means that writing
literate Ubik is both super-easy and barely dissimilar from normal Ubik.

Literate Ubik files have the extension ".ul" or ".rst"; when an input
file has one of these extensions, the Ubik compiler correctly skips the
lines that do not correspond to actual code. Since the file itself is
already valid reST, no processing of the file needs to happen to turn it
into a valid markup file (often called "weaving" in the literate
programming world).
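
The extension rule can be sketched in a few lines of Python. This is an
illustration only, not the compiler's actual check: the function name
is hypothetical, and it assumes that matching is case-sensitive, which
the text above does not specify.

```python
import os

# The two extensions the text names for literate Ubik files.
LITERATE_EXTENSIONS = {".ul", ".rst"}


def is_literate(path: str) -> bool:
    """Return True if the file name carries a literate Ubik extension.

    Assumption: the comparison is case-sensitive.
    """
    return os.path.splitext(path)[1] in LITERATE_EXTENSIONS
```

Under these assumptions, is_literate("the-void.ul") is true, while a
file with any other extension would be read as normal Ubik.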

Code blocks in a literate Ubik file are considered Ubik source code
unless they are marked otherwise. So this:

    Creating nothing from something
    A brief exposition on writing a programming language

    Something from nothing is a really cool trick, but nothing from
    something is even cooler. Check out what happens when I do something
    like this::

        : this ^ imp:MaybeThing -> imp:MaybeThing
        = \x -> imp:Nothing

    I did it. I made nothing from something. Check that out. And all I
    needed to do it were these things:

    .. code:: ubik
        . the-void
        + imp

    Which is not dissimilar from this Java code:

    .. code:: java
        package thevoid;
        import imp;

    Or this Go code::

    > package thevoid;
    > import imp;

is equivalent to this Ubik file:

    : this ^ imp:MaybeThing -> imp:MaybeThing
    = \x -> imp:Nothing
    . the-void
    + imp

Specifically, the following elements are assumed to be Ubik source:

    - Unquoted literal blocks of all stripes (any of the
      double-colon-delimited blocks, whether the double colon is on its
      own line or at the end of the previous).
    - Code directives where the name of the programming language is
      explicitly set to "ubik".

If you want code-formatted blocks that aren't treated as Ubik source
material, you can use a code block with the language set to something
else (or just unset entirely). You can also use a quoted literal block,
although you'll end up with the quoting character inside the result.

To avoid a dependency on an entire reST parser, the compiler also
accepts some markers that are not strictly valid reST when looking for
the ..code:: marker. A code directive must start on its own line and
match the following string, with whitespace allowed at any point in the
line:

    ..code::ubik


The code must start on a new line after the directive, and all lines are
taken up to the next line that has no indentation. This is more
permissive than the reST spec, but it significantly simplifies the
implementation of code extraction. The following markers are all valid:

    .. code :: ubik
    . . c o d e : : u b i k
    .               .code:         :ubik
    ..code::ubi k
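
One way to get this behaviour is to delete every whitespace character
from the line and compare what is left against the bare marker. The
Python sketch below assumes that the bare marker is "..code::ubik",
which is what each of the four examples above reduces to; the function
name is hypothetical.

```python
# Assumption: the target string is the marker with all whitespace removed.
UBIK_MARKER = "..code::ubik"


def is_ubik_code_marker(line: str) -> bool:
    """Return True if the line is a Ubik code directive, ignoring whitespace."""
    # str.split() with no arguments splits on any run of whitespace, so
    # joining the pieces removes spaces and tabs anywhere in the line.
    return "".join(line.split()) == UBIK_MARKER
```

A directive naming another language, such as ".. code:: java", does not
reduce to the marker and so is not treated as Ubik source.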

Internally, extracting the code from a literate file (often called
"tangling" in the literate programming world) is implemented by
commenting out each line that is not source material before the stream
is even passed to the lexer. This is nice and streamable; an
implementation of the stream that does this is present in
libubik/codegen/literate.c.
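
The commenting-out approach can be sketched as a line-by-line filter.
The Python below is an illustration under stated assumptions, not a
port of literate.c: the "# " comment prefix is a placeholder for Ubik's
real comment syntax, blank lines are passed through unchanged, and the
function name is hypothetical.

```python
from typing import Iterable, Iterator

# Assumption: a stand-in for Ubik's line-comment syntax.
COMMENT_PREFIX = "# "
# Assumption: the code directive with all whitespace removed.
UBIK_MARKER = "..code::ubik"


def comment_out_prose(lines: Iterable[str],
                      comment_prefix: str = COMMENT_PREFIX) -> Iterator[str]:
    """Yield one output line per input line, commenting out non-source lines."""
    armed = False  # True after a block opener and while inside that block
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # Blank lines pass through and never end a block.
            yield line
            continue
        if armed and line[0] in " \t":
            # Indented line after an opener: source material, left untouched.
            yield line
            continue
        # Anything else is prose, or a block for some other language.
        yield comment_prefix + line
        squashed = "".join(line.split())
        is_directive = squashed.startswith("..")
        # A Ubik code directive or a trailing double colon opens a block;
        # directives for other languages (or with no language) do not.
        armed = (squashed == UBIK_MARKER
                 or (squashed.endswith("::") and not is_directive))
```

Because the filter emits exactly one line for each line it reads, it
never needs to buffer, and line numbers seen downstream still match
the literate file. A quoted literal block is commented out like prose,
since its lines are not indented.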