Petar Maymounkov

Escher: A black-and-white language for data and process representation

Sat, Mar 29, 2014

The end objective of the MEMEX program, envisioned by Vannevar Bush, is to elucidate designs of a device, broadly described as external memory for humans. Implicit in any such design would be a formal representation of human knowledge (akin to data), actions (akin to processes) and intentions (akin to programs).

While these three, definitively open-ended, concepts (data, processes and programs) are enlisted as necessary (and not necessarily sufficient) ingredients of the understanding that any successful MEMEX machine should have of a human, it is misleading to think that a good MEMEX design should treat them as separate internally and possibly even externally (on the User Interface side).

In fact—advised by discoveries and philosophical trends in Cognitive Science, Modern (Chomskian) Linguistics, “traditional” Evolutionary Linguistics, Computer Science, Neuroscience, User-Interface Design and some linguistic aspects of modern mathematical practices—on the computational limitations of the brain, we postulate two design principles for a MEMEX device, which principles can equivalently be viewed as a non-presumptuous model of the limitations of cognition: The user of the MEMEX device.

First Principle:

All interfacing of a MEMEX device with its user—regardless of its superficial syntactic appearance and modality (interactive or not, reactive or not, written or vocal, visual or tactile, and so on)—should be governed by a tiny set of simple semantic rules of interaction between the software and its user. This rule is to be the only “knitting technique” with which the user, assisted by the MEMEX device, is to “weave” their knowledge—partial or complete—on any subject into the device.

Second Principle:

The device is to store the “weave” without prejudice (other than the prejudice woven in by its user), both on input and on output.

While these principles might sound severely restrictive, they are motivated by preceding discoveries from disparate fields of science. Their intellectual merit is highlighted by our specific linguistic design, dubbed the Escher Language, that has proven more-than-adequate in a proof-of-concept manner in the following different domains:

Our aim in the MEMEX context would be to show that this versatile language is a good choice for representation of user knowledge within the MEMEX device.

To this end, we aim to develop three software components that integrate the Escher linguistic device in an end-to-end fashion between a human operator and the digital counterparts available to them for interactive exchanges, such as the World Wide Web.

These components, comprising the Escher MEMEX device, are:

Historical Perspective and Cognitive Constraints

It is a conjectural belief among many leading scientist-philosophers that the human mind utilizes a small, universal set of “operations” to weave its knowledge about the world and to form its responses to it. More important to our setting is the implication that this “set of instructions” is shared among all humans.

This starting point gives rigorous substance to the aim of building software that seamlessly assists its user. In particular, it advises that any user interaction (UX), regardless of its superficial representation—whether textual, graphical, musical and so on—should be underlied by a semantic (a formal language) whose generating set (akin to generators of an algebraic structure) is small (easy to remember), natural (easy to follow, understand and “believe”) and effective (adequate in any topical domain of application).

Acknowledging the underlying semantic of every UI ellucidates the aim of Cognitive Science—to reconstruct the algorithm of the mind through theorizing—and the aim of UI design—to build the most intuitive user interface—as two sides of the same coin. UI design (specifically its semantic) is plainly a guided-by-creativity trial-and-error stab at the “generator set”.

Our “generator set” assumption is present virtually explicitly in leading “theories of the mind”. Chomskian Modern Linguistics, for instance, substantiates the Universal Grammar Theory, which stipulates that all natural languages are underlyied by identical mechanics, regardless of the modality of expression (e.g. sign language and the TADOMA system are not written). Furthermore, Chomsky's argument of universality irrespective of modality is precisely the claim that all user interfaces have an underlying semantic, otherwise humans wouldn't understand them.

Another theory of the mind is laid out in rigorous, non-presumptuous detail in Leslie Valiant's “Nature’s Algorithms for Learning and Prospering in a Complex World”. There, the focus of the mind, which is dubbed the Mind's Eye, is akin to an empty dinner table, accommodating of a small number of plates and utensils from the cupboard, and allowing for a small number of ways to arrange, utilize and return them. These operations are the “generator set”.

In fact, the “search for the best generator set” can even be traced to the roots of ancient cultures. NASA researcher Rick Briggs, for instance, summarizes the grammatical and analytical evolution of and of the study of Sanskrit, wherein grammarians from different generations have focused on perfecting the art of proposing small sets of rigorous codification and interpretation rules for the Sanskrit language. Their efforts are precisely identical to those of contemporary UI design for general purpose (like MEMEX) human-computer interaction, with the singular distinction that by mere necessity of circumstance they were bound to look for solutions within the only available medium at the time: The written.

We are not set out to discover the mind's working. We are set out to acknowledge the link between cognition and UI design, specifically in insisting that the richness, “intelligence” and usefulness of any such design has to be underlied by an algebraic semantic with a small generator set. Furthermore, this semantic should clearly have a linguistic genus because ample evidence suggests that we build our natural and programming languages to our taste of organizing the world. To give flesh to this principle we set out to realize the Escher language which is one example of a general purpose language with less than four grammar rules (a number that varies with the choice of the language generator set, of which there are a few).

Usability Objectives

In addition to the proposed core design principles—which should be viewed as sanity constraints—we consider a set of usability criteria, which are to be viewed as design objectives (under the sanity constraints).

We follow a refined version of the notion of “continuity in design”. In this context, continuity generically refers to “(cognitive) ease in change” and at the syntactic level—which is the visual and/or interactive UI layer—continuity is related to object permanence, and techniques like visual morphing are common expressions of it in design.

On the equivalent semantic side (of any user-interacting software: a programming language and its IDE, in particular) a notion of continuity is needed as well. Still embodying the meaning of “ease in change”, we interpret semantic continuity in a more abstract fashion that can be applied to formal languages:

Our last point deserves a discussion. Every individual programmer has their own way of thinking up ideas and then acting on them by codifying them in a given programming language. According to the “generator set” assumption, there is an invisible semantic to the way each programmer works with each language: The programmer is merely a translator of mind semantics (the ideas) to programming semantics while considering editing hoops and other tangential medium-related difficulties. In doing whatever they are doing, the user's mind follows discrete formal states (at least from time to time) and our claim is that a good programming language should make it easier for the user to align their basic operating steps (which are indivisible in nature, i.e. not continuous) to those editing manipulations that the language allows. (A manipulation is a set of changes that takes a program from one valid state to another.)

Without knowing or understanding the programmer's own semantic, the best bet at accomplishing a fluid connection with them is to make sure that the nature of the programming language allows for a selection of small sets of simple program transformations, whereby each such set is sufficient in itself for programming any conceivable valid program.

Notably, we claim that Escher has the aforementioned property, similarly to some functional programming languages, but unlike all other languages (known to us) it is a general-purpose language in a very broad sense. And, furthermore, it has the simplest syntax of any programming language known to us, including the UNIX shell's command-line calculator bc. (Connecting wires and drawing boxes is simpler than delimiting arithmetic.)

The language

The basic unit of meaning in Escher is the reflex. A reflex is a named black-box computational device, which interfaces with other objects in the linguistic environment through a set of named valves. Metaphorically, the valves can be viewed as labeled communication pipes coming out of the black-box.

In this illustration, the name of the depicted reflex is “AND”. It has three valves, labeled as “X”, “Y” and “X and Y”. This could be the reflex for a boolean AND operation, for instance.

The internal algorithm of a reflex can either be implemented in an external technology (like the Go programming language), in which case we call them gates, or as a composition of interconnected pre-existing reflexes, which we call a circuit. Naturally, both circuits and gates are reusable designs. The following illustration is a valid Escher program which parses key/value pairs out of HTTP requests and stores them in a database.

This is a complete high-level circuit of a simple RESTful API for a key/value-storing database. The reflex names of the circuit elements' are depicted in white text within them. Some, not all, of the valves are marked with a blue dot and an accompanying label. Elements corresponding to high-level concepts, like “Key Value DB”, could either be other Escher circuits or they could be gate reflexes which bridge to legacy technologies like a Redis DB instance.

A circuit design which connects all valves of its elements—the reflex instances used within it—is called a closed circuit and it is executable. Alternatively, new circuit designs can postpone connecting some of its internal element valves by associating the not-connected valves to the interface valves of the circuit's reflex itself.

In this Escher circuit, the brush strokes are gate reflexes which have a side-effect in some external technology—a WebGL canvas in someone's browser, for instance—of drawing an actual brush stroke. The “Position Vector” valve of each brush stroke maintains a value, be definition, always equal to the stroke's canvas position. The arithmetic gates connecting the three strokes ensure that if and whatever changes to the strokes' positions occur (for instance, as a result of the user dragging one or pinch-dragging two brush strokes), one of the strokes will always be the midpoint of the other two.

While the positions of the strokes might be varied by the stroke's gate reflexes themselves, the colors are connected to the valves of this circuit design, so that they can be set from within Escher by the circuit using this circuit.

This circuit simply enforces an underspecified system of equations over the position values of the brush strokes, such that any attempted free movement can be projected to the manifold (a simple convex body in this case) of allowable states.

Syntacically, reflexes, circuits, gates and their valves is the entire essence of the Escher language. The semantic side of the language, described next, is where most of the substance takes place.

Semantics

The underlying meaning of reflexes, i.e. what they do at runtime, can be described in two ways which are dual and equivalent, much like the wave and particle representations of light in physics are. The two types of meaning that are simultaneously bestowed to Escher programs are the invariance-based and the event-based.

Invariance semantics

In the invariance-based interpretation, at runtime an Escher program is in a state, fully determined by the state of its wires. Each wire has a current value, which is visible to both endpoint reflexes. A reflex can be seen as a constraint which ensures that its incident wires can collectively be in only one of a set of allowable states.

In other words, each reflex is analogous to an instance of a Constraint Satisfaction Problem, in the sense used within Combinatorial Optimization. (Precisely speaking, we are alluding to the simple definition of “generalized graph problems” used in the field of Probabilistically-Checkable Proofs.) In particular, the reflex valves correspond to the constraint variables and the presence of the reflex in a program ensures that all wires adjacent to it will have values that are feasible for the constraint prescription.

Normally, the circuit of an Escher program will be in a still equilibrium in the sense that all wires have unchanging values and all reflex constraints are satisfied. A change in value, however, can be sparked by any gate at any time, at which point the Escher circuit will adjusts its wire state (possibly globally across the program circuit) to reflect the new values at the “disturbing” gate and fulfill all constraints at the same time.

For instance, the brush-stroke example Escher program (above), illustrates the positional interrelationship of brush strokes drawn using an experimental drawing tool. The arithmetic gates connecting the three brush-stroke gates impose a constraint which ensures that despite potential eventual changes in the positions of the brush strokes, one of the them will always be at the midpoint of the other two. (It is worth noting that constraints where the values are numbers are nothing more than multivariate systems of equations.)

In addition to acting as a constraint, a reflex can have side-effects that are external to the Escher linguistic environment. For instance, the brush stroke gates in the example above impose no constraints on the value of their position valve; Their job is to ensure that somewhere else, on a peripheral graphic display perhaps, a brush stroke is drawn at the given position (whatever that means).

When the values assumed by the valves of a reflex come from a finite set, a constraint is like a “truth table”: An exhaustive listing of all allowable assignments of values to all the valves of the reflex. The truth table equivalence is an instructional one. In practice, however, assignable values almost always come from infinite (e.g. unconstrained key/value dictionaries) or prohibitively-large finite domains (e.g. floating point numbers). For this reason, we introduce another semantic interpretation of reflexes which is actually utilized when gate reflexes are implemented in other languages.

Event semantics

In the world of event semantics, every reflex is modeled as an independent computational process whose valves represent named permanent duplex connections to its reflex neighbors in the containing circuit. This is akin to the users of a social network who are restricted to chat online only with their circle of friends.

Each Escher wire is a duplex communication link comprised of two channels—ordered, synchronized links—one in each direction. While otherwise not dissimilar to channels in Erlang or Go, Escher channels have a mediation device in the middle. When one reflex sends a value through a valve and along a channel, the channel does not propagate the change to the other side unless the value is different from the previously sent one.

This semantic allows for reflexes and gates to be developed without much constraint on behavior, while ensuring that waves of value changes along wires do not spark flooding circuit-wide updates.

This semantic further implies that only changes in value can be perceived or signaled through valves and wires, which lands the programmer in an environment that supports expression of concurrency without the threat of traditional deadlocks and race conditions.

Finally, the exclusive dependence of dynamic program state on change-in-value events alone ensures that all programs written in Escher are by definition idempotent end-to-end: Another pain point in program design.

From a programming standpoint, to implement a new gate the programmer provides a type with a singular recognition method. For instance, in the Go language this type is captured in the Recognizer interface:

type Recognizer interface {
	Recognize(change Sentence)
}

type Sentence map[string]interface{}

Whenever there is a change in values incoming on the wires connected to the user's gate reflex, the circuit runtime will invoke the user’s Recognize method, synchronously supplying all changed values and only them, keyed by the valve name they are incoming on. This information is enclosed in the Sentence argument, which is a key/value map from valve names to an unconstrained valve value type.

Additionally, through a side channel, the user gets a hold of a pointer to an identical Recognizer object, supplied to it by the runtime and associated with the specific gate. If the user gate wants to send values to any subset of its valves synchronously, it merely does so by invoking the reciprocal Recognize method and pre-loading the sentence argument with the new valve values.

Program and point-of-view inversion

Consider an Escher circuit—like the one for a boolean mutually-exclusive OR-function below—where the circuit designs of the top-level circuit's internal elements are drawn recursively, alternating the colors of wires and elements and the ambient circuit space.

It should then be possible to discern that if one positions their point of view inside one of the lower-level circuit elements, the world is inverted: That lower level circuit is the top-circuit and the rest of them follow in lower recursive levels. This is strikingly true both syntactically (geometrically) and semantically (the result is a valid dual circuit design). This remarkable feature of the language is also the reason for its name.

Reversability

Escher reflexes intentionally have undirected valves and duplex wires, in contrast to the directed acyclic graphcs of boolean circuits in Computer Science.

Boolean gates, like AND and OR (but unlike XOR and NOT), can be computed in one direction only. For instance, according to “X and Y = Z” one can compute Z given X and Y, but not Y from X and Z, say. On the other hand, addition over integers or other groups, as in “X + Y = Z”, is a reversible operation: It is an equation that can be solved under various combinations of knowns and unknowns.

This combinatorial variability between systems of constraints (equations) is depicted graphically below.

The invariance-based circuit semantics (which is the lack of directionality) ensure that the reflex metaphor can accommodate concepts with general reversibilities and/or symmetries, or lack thereof.

The event-based semantic provides a simple way to implement arbitrarily complex systems of constraints as responses to event disturbances.

The moderator memory on the wires between reflexes ensure that misbehaving gate reflex implementations—whose communication behavior does not represent a valid invariant system of valve constraints—do not affect or cause computation in the rest of the circuit program.

Local-checkability and fluidity in design

Notice that the invariance semantic view of Escher programs is in substance identical to the way electrical circuits are viewed: as a stationary system of equations (constraints) that describe a complete set of allowable states of the program state (the wires) at runtime.

Such semantics are easier to reason about, compared to the reasoning involved in most non-functional programming languages, because the meaning and function of any individual element depends on its own function and its neighbors: Not on the history that lead the program to this state at runtime.

In Theoretical Computer Science, this programming format property is called local-checkability. It refers to the fact that by looking only at a reflex and its neighbors (the “local” part), one is able to “check” that the reflex is connected as intended.

We conjecture that local-checkability “viewed backwards” implies fluidity in design thought when targeting a locally-checkable grammar. In particular, if a human operator translates ideas into a format that is locally-checkable, their focus of attention would have to do much fewer trips to remote “pieces of code” as is the case with, say, imperative programming where history matters to program meaning.

Rigor in incompleteness

Escher circuits (and larger programs) are syntactially correct even when some of the reflex interfaces, used as circuit elements, are not “implemented”. In other words, when there is no gate or circuit that implements the reflex interfaces.

This allows users to encode high-level intention and specification before low-level details have been taken care of. Take for instance the example circuit for the RESTful HTTP service. This high-level circuit could be specified by the CEO of an Internet company, and then passed down—in the form of a source file—to the engineering team to implement the actual function of the boxes. An Escher program can thus be completed by itself being communicated through the creative hierarchy of a developer team.