Representing dicts in Prolog
Many programming languages have a data structure to represent a set of key-value pairs. Think of records, structs, property lists, objects, maps, etc. Prolog does not. For O(1) access it only has compound terms that use positional arguments, i.e., a compound is what could be known elsewhere as a named array, where the functor is the name.
The Prolog has a wide variety of ways to deal with this. To name a few:
- Lists of
Key-Value
- Lists of
Key=Value
- Lists of
Key:Value
- Lists of
Key(Value)
library(assoc)
and/orlibrary(rbtrees)
While the list based versions are comfortable to read and write, the tree based versions can only be created and inspected through the API.
A further complication occurs if we use any of this to
represent nested data structures. Now it may
become ambiguous whether the list is a key-value set or
just a list. In any case, the empty list []
is ambiguous. This implies we must wrap the key-value list
in a functor to disambiguate, making it harder to use the
Prolog list manipulation functionality directly.
We argue that we need an easy to read and write as well as standard representation for key-value sets to facilitate portable interaction with other languages and data formats such as JSON.
There are at least two Prolog systems with built-in support for sets of key-value pairs.
In ECLiPSe we can define named arguments for a compound, after which name{k1:v1, k2:v2} is mapped to name(…, v1, …, v2, …) where unspecified arguments are left unbound. This does not allow for dynamic key-value sets.
In SWI-Prolog, the same name{k1:v1, k2:v2} maps to a dynamic data type, represented as a compound with reserved functor:
(name, v1, k1, v2, k2). Where keys are ordered such that many operations can be completed in O(log(N)).
With the Janus Python interface, we settled using
{k1:v1, k2:v2, ...}
optionally wrapped in a functor py/1
. Note
that wrapping is necessary because {}
can
both be an empty key/value set and an atom, which may be
mapped to a string in the target languages. Note that this
notation is syntactically compatible with the SWI-Prolog
and ECLiPSe notation if the py
is defined as
a prefix operator.
We propose to use the
f({k1:v1, k2:v2, ...})
as the core of the
key-value set representation, where we need to decide on
f
. Options:
- Leave it to the user, optionally by letting the user declare admissible names.
- Specify per source/destination, e.g.,
py{...}
orjson{...}
The advantage of this is we deal with small differences. For example, Python hasNone
where JSON hasnull
. - Settle on a single, language neutral name. (In SWI-Prolog I tend to use #{…} for “anonymous” dicts).
Operations on dicts
While f{...}
is easy to read and write, it
needs an API to handle. Here are provisional
predicates:
is_dict(@Term) is semidet. Type test.
dict_get(+Path, +Dict, -Value) is semidet. Get a Value from Dict.
Path
is either an atom or a termKey1/Key2/...
to access values in tested dicts.dict_get(+Path, +Dict, ?Value, -NewDict, +NewValue) is semidet. Replaces Value inside Dict by NewValue, returning a new dict.
dict_get_ex(+Path, +Dict, ?Value) is det. As dict_get/3, but raise an exception of a key does not exist.
put_dict(+Key, +DictIn, +Value, -DictOut) is det. Add or replace the value associated to
Key
put_dict(+New, +DictIn, -DictOut) is det. Merge
New
into dict. DicttOut contains the union of the keys, with all keys that appear in both dicts with the associated the the value as it appears inNew
.del_dict(+Key, +DictIn, ?Value, -DictOut) is semidet. Remove
Key:Value
fromDictIn
dict_pairs(?Dict, ?Pairs) Similar to Univ (
=..
), translating between a dict and a list ofKey-Value Pairs
.
SWI-Prolog also defines
- Dict1 :< Dict2 is semidet. True when all keys in Dict1 also appear in Dict2 and the values unify. This is useful for fetching multiple values efficiently.
- Dict1 >:< Dict2 is semidet. True when the values of the shared keys can be unified. This implies the dicts are “compatible”.