Representing dicts in Prolog

Many programming languages have a data structure to represent a set of key-value pairs. Think of records, structs, property lists, objects, maps, etc. Prolog does not. For O(1) access it only has compound terms that use positional arguments, i.e., a compound is what could be known elsewhere as a named array, where the functor is the name.

The Prolog has a wide variety of ways to deal with this. To name a few:

Lists of Key-Value
Lists of Key=Value
Lists of Key:Value
Lists of Key(Value)
library(assoc) and/or library(rbtrees)

While the list based versions are comfortable to read and write, the tree based versions can only be created and inspected through the API.

A further complication occurs if we use any of this to represent nested data structures. Now it may become ambiguous whether the list is a key-value set or just a list. In any case, the empty list [] is ambiguous. This implies we must wrap the key-value list in a functor to disambiguate, making it harder to use the Prolog list manipulation functionality directly.

We argue that we need an easy to read and write as well as standard representation for key-value sets to facilitate portable interaction with other languages and data formats such as JSON.

There are at least two Prolog systems with built-in support for sets of key-value pairs.

In ECLiPSe we can define named arguments for a compound, after which name{k1:v1, k2:v2} is mapped to name(…, v1, …, v2, …) where unspecified arguments are left unbound. This does not allow for dynamic key-value sets.
In SWI-Prolog, the same name{k1:v1, k2:v2} maps to a dynamic data type, represented as a compound with reserved functor:

(name, v1, k1, v2, k2).

Where keys are ordered such that many operations can be completed in O(log(N)).

With the Janus Python interface, we settled using

{k1:v1, k2:v2, ...}

optionally wrapped in a functor py/1. Note that wrapping is necessary because {} can both be an empty key/value set and an atom, which may be mapped to a string in the target languages. Note that this notation is syntactically compatible with the SWI-Prolog and ECLiPSe notation if the py is defined as a prefix operator.

We propose to use the f({k1:v1, k2:v2, ...}) as the core of the key-value set representation, where we need to decide on f. Options:

Leave it to the user, optionally by letting the user declare admissible names.
Specify per source/destination, e.g., py{...} or json{...} The advantage of this is we deal with small differences. For example, Python has None where JSON has null.
Settle on a single, language neutral name. (In SWI-Prolog I tend to use #{…} for “anonymous” dicts).

Operations on dicts

While f{...} is easy to read and write, it needs an API to handle. Here are provisional predicates:

is_dict(@Term) is semidet. Type test.
dict_get(+Path, +Dict, -Value) is semidet. Get a Value from Dict. Path is either an atom or a term Key1/Key2/... to access values in tested dicts.
dict_get(+Path, +Dict, ?Value, -NewDict, +NewValue) is semidet. Replaces Value inside Dict by NewValue, returning a new dict.
dict_get_ex(+Path, +Dict, ?Value) is det. As dict_get/3, but raise an exception of a key does not exist.
put_dict(+Key, +DictIn, +Value, -DictOut) is det. Add or replace the value associated to Key
put_dict(+New, +DictIn, -DictOut) is det. Merge New into dict. DicttOut contains the union of the keys, with all keys that appear in both dicts with the associated the the value as it appears in New.
del_dict(+Key, +DictIn, ?Value, -DictOut) is semidet. Remove Key:Value from DictIn
dict_pairs(?Dict, ?Pairs) Similar to Univ (=..), translating between a dict and a list of Key-Value Pairs.

SWI-Prolog also defines

Dict1 :< Dict2 is semidet. True when all keys in Dict1 also appear in Dict2 and the values unify. This is useful for fetching multiple values efficiently.
Dict1 >:< Dict2 is semidet. True when the values of the shared keys can be unified. This implies the dicts are “compatible”.