Architecture of the Ubik runtime =======================================
Haldean Brown                                      First draft: Sep 2016
                                                  Last updated: Sep 2016

Status: Needs revision
        While the value encodings have remained the same, the scheduler
        (now called the evaluator) has been significantly reworked to
        improve performance; that section needs rewriting.

------------------------------------------------------------------------

The Ubik runtime has a relatively simple architecture; the basic idea is
that Ubik values are allocated by the GC system and flow through the
scheduler, which is responsible for producing a "minimal value" from an
existing value. The scheduler attempts to minimize all terminal values,
and in so doing evaluates the result of a computation.

Ubik values come in 9 types:

     STR: a UTF-8 encoded string
          encoded as byte-arrays with a length

     RAT: a rational number
          encoded as a 64-bit signed numerator and a 64-bit unsigned
          denominator

     BOO: a boolean
          encoded as a C11 _Boolean (which is stored in a byte)

     TUP: a fixed-length typed tuple
          encoded as a list of values, a list of types and a size

     FUN: a function
          encoded as a list of nodes, an optional evaluator, and the
          index of the node that represents the result of the function.
          More on node encodings in a second.

     PAP: a partially-applied function
          encoded as a pair of values and an integer; the first value
          represents the function, the second represents the argument,
          and the integer represents the number of arguments the result
          can take before it is fully applied.

     MUL: a polymorphic multimethod
          TODO

     TYP: a type
          TODO

     IMP: an implementation of an interface
          TODO

Encoding functions -----------------------------------------------------

Functions are encoded through an implicit directed acyclic graph of
nodes, each of which takes some number of inputs and computes a result.
These nodes perform very simple operations; there are currently 8 types:

     APPLY:  applies a function to an argument
             contains a node index to the function and a node index to
             the argument.

     VALUE:  represents a constant value
             contains a pair of values; one is the type and one is the
             value of the constant value.

     LOAD:   loads a value from the environment
             contains a URI to load from.

     STORE:  stores a value in the environment
             contains a node index whose value will be stored, and the
             URI at which the value will be stored.

     INPUT:  represents an argument to the function
             contains the argument index this node corresponds to (i.e.,
             if the node represents the first argument, it's argument
             index is zero).

     REF:    adopts the type and value of a referrent node
             contains the node index of the referrent node. These are
             used to make compilation easier; in an ideal world, the
             compiler would optimize these out entirely.

     COND:   branches based on a boolean
             takes the node indexes of three nodes: the condition, the
             value if true, and the value if false.

     NATIVE: stores the result of a native computation
             this is a little piece of magic that enables the graph
             representation of functions that are implemented by the
             runtime instead of in Ubik. If the graph has a custom
             evaluator (which we'll discuss in the scheduler section)
             then this node will be filled with the result of the
             evaluator when the evaluator completes, as if by magic. It
             has no subfields, as it is only a container for a result.

Each node also contains an ID (a word that is only used for debugging)
and a flag that determines whether the node is terminal or not. Terminal
nodes are enqueued as "goals" in the scheduler when an evaluation of the
function is requested. This makes it possible to force eventual STORE
effects without keeping them in the dependency chain of the graph's
result.

Scheduling computation -------------------------------------------------

The scheduler's job is to take a FUN, PAP, or MUL value and collapse
them to their minimal values (i.e., if the FUN, PAP, or MUL has
input-arity 0, it evaluates the function). This operation is called
"evaluation". A user of the runtime can request evaluation of a value or
of multiple values; this is called "enqueuing" the values.

When an unreducable value is enqueued (like a STR or a BOO), the
scheduler halts as the value has already been reduced. Additionally, if
a value is executable but is not fully applied (for example, you've
provided one of the two arguments to a function) the scheduler halts, as
the value has already been reduced.

When a graph is enqueued, it is wrapped in a graph executor
(ubik_exec_graph). The executor is responsible for encoding all of the
information specific to an invoking of the function. This means:

     - the calculated value of each node (including the values of the
       arguments, if any)
     - the calculated type of each node
     - the readiness of each node (i.e., whether the node is ready to
       evaluate or is still waiting on a piece of data or one of the
       nodes it depends on)
     - the value being evaluated
     - the environment in which the graph will be executed
     - an optional callback that will be called when the graph's result
       is done being evaluated

Following this, we walk the dependency tree of the terminal nodes to
determine their evaluation status. The evaluation status is determined
as follows:

     - if the node is a COND node, the node is marked as waiting on the
       evaluation of the condition.
     - if the node has any dependencies, set the WAIT flags for each
       dependency. The maximum number of dependencies for any node type
       is 3, so there are 3 such flags: WAIT_D1, WAIT_D2, and WAIT_D3.
       The mapping from field to dependency index is arbitrary but
       consistent for a given node type (i.e., D1 for a VALUE node is
       always the value, and D2 is always the type).
     - if the node has no dependencies, set the READY flag

These flags are set on the graph executor. All READY nodes are enqueued
in the "ready" queue; all WAIT nodes are enqueued in the "wait" queue.
The scheduler then begins the evalulation loop:

     - it takes a node executor off of the ready queue
     - it evalutates the node in the executor
     - it stores the resulting value and type in the graph executor
     - it finds the tasks dependent on the calculated value and clears
       the dependency-wait flag associated with the evaluated node in
       each task. If all wait flags have been cleared on the dependent
       task, the node is marked as ready.
     - the scheduler then does a "ready-sweep", in which it moves node
       executors that are marked as ready from the wait queue to the
       ready queue.

The process then repeats, until the ready queue is empty. At that point,
if there are still nodes in the wait queue, the scheduler is deadlocked
and is unable to evaluate the computation; this can happen due to
circular node dependencies, or waiting on data that never appears. If
the wait queue is empty, the computation has completed successfully.