## The key to higher-order Cartesian programming

In this post, I will outline the implementation of TransLucid functions. In the TransLucid language, multidimensional arrays of dimensionally-abstract functions can manipulate multidimensional arrays. The problem to be solved is that since TransLucid functions are dimensionally abstract and first-class values, the implementation must avoid any clashes in the use of dimensions.

This blog post will not be easy to read, without a clear exposition of the principles of TransLucid. A further blog post will be written giving a proper introduction to the language as it currently stands in the https://github.com/plaice/TLghc implementation. In the meantime, a cursory introduction to TransLucid and Cartesian programming is given in the next few lines.

In TransLucid, a variable is an array of arbitrary rank (number of dimensions) and extent (number of cells). To refer to a specific cell, the variable is accessed using the current context, which is a set of (dimension, ordinate) pairs. This is exactly the idea behind Descartes’ analytical geometry, so it is appropriate to call this style of programming Cartesian programming.

A useful way to visualise Cartesian programming, initially, is to consider a (hypothetical) spreadsheet based on these ideas. The entire spreadsheet could be considered to be a single TransLucid variable, with dimensions such as row, column, sheet and time, and possibly many others. One or more equations would define this variable, and these equations could refer, if necessary, to these dimensions to reach neighbouring cells.

Our first TransLucid example is the following:

var A = 2*#x + 3*#y

Variable A is defined to be a two-dimensional array which looks like the following table (here we only consider non-negative ordinates):

y/x 0  1  2  3 ...
0   0  2  4  6 ...
1   3  5  7  9 ...
2   6  8 10 12 ...
3   9 11 13 15 ...
...

In this example, the current ordinates for dimensions x and y are examined explicitly (extensionally). Most of the time, dimensions are manipulated implicitly (intensionally). For example, the Fibonacci series could be defined as follows:

var fib = if #d<2 then #d else next.d(fib) + next.d(next.d(fib))

where the next function shifts its actual argument in parentheses one cell “to the left” in the direction of its actual dimension argument. The definition of next is as follows:

fun next.d(X) = let d <- #d+1 in X

The Fibonacci series could also be written as follows:

var fib = fby.d (0, fby.d (1, fib + next.d fib))

where the fby function starts with the first actual argument in parentheses, and continues with the second actual argument in parentheses, shifted one cell “to the right” in the direction of its actual dimension argument. The definition for fib is as follows:

fun fby.d(X,Y) = if #d < 1 then X else let d <- #d-1 in Y

Assuming a bit of extra support from the parser (not yet implemented), the definition of fib could be written as:

var fib = 0 fby.d 1 fby.d (fib + next.d fib)

And now we can make a Fib function, using a locally-declared dimension:

fun Fib(n) = fib where dim d <- n var fib = 0 fby.d 1 fby.d (fib + next.d fib) end

TransLucid is a higher-order language, so arrays of functions can also be created. For example, in the following definition,

fun nextN.d(X) = F(X) where var F = (lambda X -> X) fby.d (lambda X -> next.d (F X)) end

the variable F is an array of functions, corresponding to id, next.d, next.d(next.d), and so on. Hence,

nextN.d(#d)

will give us

#d at the 0-ordinate, i.e., 0,
next.d(#d) at the 1-ordinate, i.e., 2,
next.d(next.d(#d)) at the 2-ordinate, i.e., 4,
next.d(next.d(next.d(#d))) at the 3-ordinate, i.e., 6
and so forth.

The difficulty in implementing the higher-order functions is that during execution, the context is queried dynamically. As a result, it is crucial to avoid dimension clashes. The potential for this problem can be illustrated with some small examples.

(\lambda() -> #d) ()
should be the same as
#d

(\lambda(X) -> #d + X) (#e)
should be the same as
#d + #e

So far, so good. Now, it gets a bit trickier.

(\lambda(X) -> #d + X) (#d + #e)
should be the same as
#d + (#d + #e)

That seems fine. But what if we fix the d-ordinate inside the lambda?

(let d <- 3 in (\lambda(X) -> #d + X)) (#d + #e)
should be the same as
3 + (#d + #e)

So it becomes clear that the two occurrences of #d in the previous example are not of the same nature. The first occurrence of #d belonged to the body of the function, and was subject to the context at the point of abstraction, while the second occurrence of #d belonged to the actual argument, and was subject to the context at the point of application.

Getting all this to work took a long time to understand. Initially, it was thought that a TransLucid expression was evaluated against an environment, mapping identifiers to dimensions, expressions or functions, and a context, mapping dimensions to ordinates. However, this approach fails in function calls, because one cannot distinguish the two different occurrences of #d in the example above.

The solution lies in understanding that dimensions in TransLucid play a rôle similar to that of automatic variables do in imperative languages. In those languages, when a function is called, information is placed on the runtime stack, and the automatic variables for the called function are placed on top of the stack. If the TransLucid context is understood to be something akin to the runtime stack in other languages, then the solution becomes clearer.

The Translucid environment and the TransLucid context together form a pair of stacks, one of environment parts, the other of context parts. Each environment part is a mapping from identifiers to dimensions, expressions or functions, while each context part is a mapping from dimensions to ordinates. As a result, in a given context, it becomes possible for the same dimension to be mapped to different ordinates at different levels.

When a function is called, we are combining two contexts: the abstraction context (when the function was created) and the application context (when the function is applied to arguments). If an identifier appearing inside the body of the function does not correspond to any of the formal arguments of the function, then only the abstraction context is of relevance for its evaluation; the application context should have no influence. Should a dimension formal parameter be manipulated inside the body of the function, then the corresponding dimension refers directly to the application context. Should a variable formal parameter be manipulated inside the body of the function, then the relevant context is the application context: the only dimensions that come in from the function call that can be manipulated are those passed in through dimension actual parameters.

When a function is called, the environment stack for the evaluation of the body of the function consists of placing a modified (explained below) version of the environment stack at the point of the application on top of the environment stack at the point of the abstraction. The modification of the top part consists of

• keeping the same depth;
• removing all mappings to variables and functions;
• adding a mapping from formal variable parameters to actual expression parameters;
• for each actual dimension parameter, at the correct level, replacing the mapping from the actual dimension parameter to a mapping from the formal dimension parameter;
• removing all other mappings to dimensions.

The context stack for the evaluation of the body of the function consists of placing the context stack at the point of the application on top of the context stack at the point of the abstraction.

The approach described above ensures that when the body of the function is being evaluated, both the abstraction context and the application context are available. However, the application context can only be manipulated through the formal dimension parameters, which have been judiciously placed in the upper part of the modified environment.

The https://github.com/plaice/TLghc interpreter implements functions as described above. There is ample room for optimisation. For example, most of the TransLucid programs that have been written to date do not take advantage of the abstraction context, so the implementation of function calls in the majority of cases can be simplified. Furthermore, the memoisation of computations, very useful for recursively-defined TransLucid variables, requires the development of a fully-working rank-analysis algorithm.

Given the importance of solving the problem described in this blog post, the release of version 0.0.1 of TLghc was warranted.