A syntax or labelled sub terms
Motivation
Prolog defines a syntax that can only express trees of sub terms. At runtime however, a Prolog term can be any directed graph where the nodes are compound terms and the leaves are atomic terms. Such terms cannot be written and read while preserving the structure. There are two cases where this is problematic.
- We cannot represent rational trees, also
called cyclic terms. Such terms can be created by
unification. The typical example is
?- X = f(X).. - We cannot transmit terms with internal
sharing. Using write_term/2 followed by read_term/2
creates a copy that is a clean tree. The original
term with sharing and the tree are equivalent under
==/2, but they have a different layout in memory.
While (1) is a clear omission, this is less clear for (2). Some notes on this:
- Internal sharing is used in practice. Consider
library(rbtrees)from YAP and ported to various systems. This library uses a nil node with the shapeblack('',_,_,''). This node is stored in the toplevel term and reused for creating new nodes. Using this nil-node simplifies the algorithms with only a minor space overhead. If, however, we write and read such a tree, a copy of the nil node ends in each node. - We could argue that read_term/2 could minimize the term by finding common sub terms and using only a single copy thereof.
- With labelled sub terms in the syntax we can reduce the amount of data transferred and improve the readability of large terms with a lot of sharing.
- Whether a term is a copy or not matters when using
setarg/3. This could be resolved using SISCtus support for mutable terms or by adding a fresh variable to a compound on which you want to performsetarg/3. A fresh variable avoids unwanted sharing by smart implementations ofcopy_term/2as well as smart versions ofread_term/2, garbage collection, etc.
Solutions
SICStus introduced write_term/2 using the
option cycles(true) which, if a term is
cyclic, writes @(Skeleton, Substitutions),
e.g. a cyclic term X = f(X) is written as
@(X, [X=f(X)]). The predicate
read_term/2 has the same option and on
reading such a term the Substitutions are
unified and the resulting Skeleton is
returned. This approach has been copied by SWI-Prolog. It
has two disadvantages:
- It is ambiguous.
- Cyclic terms cannot be embedded into larger
terms as
@(_,_)only has a special meaning if it is the top-term. We can of course generate cycles at any level, e.g.@(t(X), [X=f(X)]).
ECLiPSe introduced an extension to its attributed
variable syntax. ECLiPSe supports
Var{att1: Val1, attr2: Val2, ...} to denote
Var with the given attributes. ECLiPSe
introduces a reserved attribute =. When
present, the variable is unified to the corresponding
value. E.g., X = f(X) becomes
X{= : f(X)}. Note the space between the
= and : is required to satisfy
Prolog tokenization.
Note that SICStus does not write attributes, but supports copy_term/3 to obtain a copy of a term holding attributed variables. The copy has no attributes and the 3rd argument is unified to a list of goals that reinstantiate the constraints when called.
Danielle Church has proposed on the SWI-Prolog
Discourse group to allow for (Var) Term. The
equivalent ECLiPSe notation is Var{= : Term}.
The main disadvantage of this is the ambiguity. It was
proposed to allow for
p((X) hello(Y)) :- q(X).
Eventually it was rejected in favour of compiler
support to make the code below having all benefits of
hello(Y) appearing in the head.
p(X) :- X = hello(Y), q(X).
Alternative proposals
A few variations on the ECLiPSe syntax proposed are
X{_: f(X)}. The advantage is that it requires no reserved attribute. Otherwise it is considered less clean.X{X: f(X)}. Similar issues
Issues
How to handle a term contains multiple attribute constructs for the same variable?
Two proposals are
- One option is to make each
Var{...}create an attributed term and use unification if we find multiple instances ofVar. This implies we need to call the constraint solver insideread/1. Constraint solvers may fail, raise exceptions or be non-deterministic. It is undesirable to propagate this to read/1. - The current ECLiPSe and SWI-Prolog implementations
allow for multiple
Var{...}on the same variable, but:- If the same attribute is provided, the value must be
the same under
==/2. In that case the second instance is ignored. Otherwise a syntax error is raised.
- If the same attribute is provided, the value must be
the same under
Can
the = attribute be combined with other
attributes?
This is not meaningful and problematic as it would require constraint solving.
What about
Var{}?
This can be
- A syntax error (current ECLiPSe)
- The same as
Var(current SWI-Prolog) - Create an attributed variable without attributes (if the system supports this notion).
Note that <atom>{} is already valid
syntax, both for struct/argnames and for dynamic dicts
(currently SWI-Prolog only).
Implications
Providing this syntax poses some new challenges to the Prolog system. Consider the query below. Correctly executed it should print an infinite number of lines.
?- G{= : (writeln('Hello world'), G)}.
However, current SWI-Prolog’s toplevel goal analysis as well as term and goal expansion loop on body terms that contain cyclic control structures.

