Common API for Key-Value Maps

[Note: If you have comments please post them at the [Prolog Community Discourse for this PIP](https://discourse.prolog-lang.org/c/implementers-forum]

Abstract

Defines an API for Key-Value mappings, intended to serve as a common interface to various underlying implementations.

Notation and Terminology

Abbreviations for Prolog systems: Ciao (Ciao-Prolog), ECL (ECLiPSe), GP (GNU-Prolog), IF (IF-Prolog), SP (SICStus Prolog), SWI (SWI-Prolog), XSB (XSB-Prolog), ISO (strict ISO-Prolog).

Motivation and Rationale

Mapping keys to values is a common operation in Prolog programs. In this document, we assume that maps have the following properties: * a mapping is a (partial) function, i.e. every key maps to at most one value * a key is a ground term * a value can be any term * optionally, a total order may be defined on the set of keys

Various implementations of such mappings are in common use, such as * ad-hoc unordered lists of key-value pairs * ordered lists of key-value pairs, e.g. O’Keefe’s library(map) or ad-hoc * ordered arrays of keys and values, e.g. SWI’s dicts and variants thereof * unbalanced trees, e.g. O’Keefe’s library(assoc) * balanced trees, e.g. library(avl) (SP), library(rbtrees) (YAP, SWI), library(m_tree234) (ECL) * hash tables

In order to have a uniform API for many representations, the API proposed here * does not require keys to be ordered or orderable (to allow hash tables) * does not require a unique representation, in particular, maps with the same Key-Value set may have non-identical representations (e.g. when the entries to a balanced tree were made in different orders)

Specification

Operations on maps

Principles

The predicates are uniformly named map_xxx/N (with the exception of is_map/1).

The first argument is always a map. If a modified map is created, it is always the last argument.

Maps are logical data structures. Predicates that return an updated map leave their input map(s) unchanged.

Maps do not necessarily have a unique representation – maps that contain the same set of Key-Value entries may differ and not unify!

Considerations

With sufficiently expressive module systems, the map_ prefix could be omitted, and all API predicate references disambiguated by module qualification, as in m:contains(D,K). The qualification references the module that contains the map implementation.

Similarly, when an object-method invocation mechanism is employed, the method name can be chosen to be the API predicate name without the prefix, as in D.contains(K). In both cases, the map_ prefix is not needed to disambiguate the generic method name.

Basic Operations

  • is_map(@Term) is semidet
    Test whether Term is a map. This is meant to be an efficient operation, it will not fully test whether Term is a well-formed map data structure.

  • map_empty(-Dict) is det
    Create the empty map, or test for the empty map. Unlike maps in general, the empty map does have a unique representation, so that unification with it can be used to test for emptiness.

  • map_count(+Dict, -Size) is det
    Returns the number of entries in Dict as Size.

  • map_contains(+Dict, +Key) is semidet
    Test whether Dict has an entry for Key. Fail if no such entry.

  • map_get(+Dict, +Key, -Value) is semidet
    Get the Value stored in Dict under Key. Fail if there is no entry for Key.

  • map_member(+Dict, ?Key, ?Value) is nondet
    Nondeterministically enumerate all Key-Value entries in Dict. If the map is ordered, results will be in order of increasing keys. Fail for the empty map. If Key is known, it is more efficient to use map_get/3.

  • map_set(+Dict, +Key, +Value, -NewDict) is det
    Create an entry mapping Key to Value, returning NewDict. This will either create a new entry, or overwrite an existing one. Any keys other than Key have identical mappings in Dict and NewDict.

  • map_insert(+Dict, +Key, +Value, -NewDict) is semidet
    Create a new entry mapping Key to Value, returning NewDict. Fail if Dict already contains an entry for Key. This is operationally identical to

      map_insert(Dict, Key, Value, NewDict) :-
          \+ map_contains(Dict, Key),
          map_set(Dict, Key, Value, NewDict).
  • map_unify(+Dict, +Key, ?Value, -NewDict) is semidet
    If Dict already has an entry for Key, unify its value with Value. If there is no such entry, create a new one mapping Key to Value. This is operationally identical to

    map_unify(Dict, Key, Value, NewDict) :-
          ( map_get(Dict, Key, V) -> Value=V, NewDict=Dict
          ; map_set(Dict, Key, Value, NewDict)
          ).

    With this access predicate, the map behaves as a total function, as if every key were initially mapped to a free variable.

  • map_update(+Dict, +Key, -OldValue, +Value, -NewDict) is semidet
    Overwrite an existing entry (mapping Key to OldValue) with a new one mapping Key to Value, returning NewDict. Fails if Dict had no entry for Key. This is operationally identical to

    map_update(Dict, Key, OldValue, Value, NewDict) :-
      map_get(Dict, Key, OldValue),
      map_set(Dict, Key, Value, NewDict).
  • map_update(+Dict, +Key, -OldValue, +Value, +InitValue, -NewDict) is det
    Overwrite an existing entry (mapping Key to OldValue) with a new one mapping Key to Value, returning NewDict. If Dict had no entry for Key, create one mapping Key to Value, and unify OldValue with InitValue. This is operationally identical to

      map_update(Dict, Key, OldValue, Value, InitValue, NewDict) :-
          ( map_get(Dict, Key, V) -> OldValue = V ; OldValue = InitValue ),
          map_set(Dict, Key, Value, NewDict).

    This can be used to initialise or modify entries in a single step. One use case is a map of counters:

      increment(Dict, Key, NewDict) :-
          map_update(Dict, Key, Count0, Count1, 0, NewDict),
          Count1 is Count0 + 1.
  • map_delete(+Dict, +Key, -NewDict) is det
    Delete an entry for Key if one exists, returning NewDict. If Dict has no entry for Key, return it unchanged.

  • map_remove(+Dict, +Key, -Value, -NewDict) is semidet
    Remove (i.e. retrieve and delete) an existing entry (mapping Key to Value) from Dict, returning NewDict. Fail if no such entry.

Conversion from and to lists

  • map_from_pairs(+Pairs, -Dict) is det
    Return a new map with one entry for each Key-Value pair in Pairs. Pairs is an unordered, duplicate-free list of Key-Value pairs.

  • map_to_pairs(+Dict, -Pairs)
    Return the list of all Key-Value pairs in Dict. If Dict is ordered, the list is in increasing order of keys.

  • map_to_keys(+Dict, -Keys) is det
    Return the list of all Keys in Dict. If Dict is ordered, the list is in increasing order of keys.

  • map_to_values(+Dict, -Values)
    Return the list of all Values in Dict (one per Key). If Dict is ordered, the list is in increasing order of keys.

Iteration

We define just two primitives for deterministic iteration over a map (nondeterministic enumeration being provided by map_member/3). With these, any map-traversing operations (such as map_map/2/3, map_fold/4, …) can be trivially implemented.

An iterator is an abstract data structure indicating a particular position during traversal of a map. The only operation defined on an iterator is map_next/4 for advancing the position. The empty/exhausted iterator is guaranteed to be represented canonically as [], so that unification can be used to check for it.

  • map_iter(+Dict, -Iter) is det
    Create a new iterator, pointing to the first Key-Value entry in Dict. If Dict is empty, the iterator is [].

  • map_next(+Iter1, -Key, -Value, -Iter2) is semidet
    Return the Key-Value pair that the Iter1 points to, and return a new iterator (pointing to the next map entry) as Iter2. If Iter1 is empty ([]), the predicate fails.

Typical usage pattern (for further examples see below):

traverse(Dict) :-
    map_iter(Dict, Iter),
    traverse_loop(Iter).
    
traverse_loop(Iter1) :-
    ( map_next(Iter1, Key, Value, Iter2) ->
        ... process Key and/or Value ...
        traverse_loop(Iter2)
    ; true
    ).

Optional: Operations on two Dictionaries

These are definable using the above primitives.

  • map_same_keys(+D1, +D2) is semidet
    Dictionaries D1 and D2 have the same set of keys.

  • map_agree(+D1, +D2) is semidet
    True if the common keys of D1 and D2 have their values unified. Corresponds to map_agree/2 in O’Keefe’s library(map), and D1 >:< D2 in SWI. Can be defined as

      map_agree(D1, D2) :-
          map_iter(D1, It0),
          iter_map_agree(It0, D2).
    
      iter_map_agree(It1, D2) :-
          ( map_next(It1, K, V, It2) ->
              ( map_get(D2, K, V0) -> V = V0 ; true ),
              iter_map_agree(It2, D2)
          ;
              true
          ).
  • map_select(+Dsub, +D) is semidet
    True when all keys of Dsub are also in D and the values are unified. Corresponds to Dsub :< D and select_dict/3 in SWI. Can be defined as

      map_select(Dsub, D) :-
          map_iter(Dsub, It0),
          iter_map_select(It0, D).
    
      iter_map_select(It1, D) :-
          ( map_next(It1, K, V, It2) ->
              map_get(D, K, V),   % fail if no key, or value doesn't unify
              iter_map_select(It2, D)
          ;
              true
          ).
  • map_union(+D1, +D2, -Dnew) is semidet
    Return the union of D1’s and D2’s entries as Dnew. If D1 and D2 have common keys, their values are unified. Corresponds to map_union/3 in O’Keefe’s library(map). Can be defined as

      map_union(Dbase, Dover, D) :-
          map_iter(Dover, It0),
          iter_map_union(It0, Dbase, D).
    
      iter_map_union(It1, D1, D) :-
          ( map_next(It1, K, V, It2) ->
              map_unify(D1, K, V, D2),
              iter_map_union(It2, D2, D)
          ;
              D = D0
          ).
  • map_overlay(+Dbase, +Doverlay, -Dnew) is det
    Merge Dbase and Doverlay, preferring Doverlay’s value in case of common keys. Corresponds to map_update/3 in O’Keefe’s library(map), and overlay/3 in library(m_map) (ECL) and put_dict/3 in SWI. Can be defined as

      map_overlay(Dbase, Dover, D) :-
          map_iter(Dover, It0),
          iter_map_overlay(It0, Dbase, D).
    
      iter_map_overlay(It1, D1, D) :-
          ( map_next(It1, K, V, It2) ->
              map_set(D1, K, V, D2),
              iter_map_overlay(It2, D2, D)
          ;
              D = D0
          ).

Optional

  • map_get_ex(+Path, +Dict, ?Value) is det
    As map_get/3, but raise an exception if a key does not exist.

  • map_delete_ex(+D,+K,-Dnew) is det
    As map_delete/3, but raise an exception if a key does not exist.

  • map_remove_ex(+D,+K,-V,-Dnew) is det
    As map_remove/3, but raise an exception if a key does not exist.

Not included

  • map_pairs(?Dict, ?Pairs)
    Bidirectional alternative to map_from_pairs/2 and map_to_pairs/2. This should only be provided if the map representation is unique for a particular set of Key-Value pairs, and Pairs is restricted to be sorted and duplicate-free.

Syntactic Sugaring

It is convenient to have a simple syntax for concrete map instances. Such a syntax can

  • avoid the need for auxiliary data structures (such as key-value lists)
  • avoid the need to invoke a constructor at runtime
  • be independent of the underlying map implementation/representation
  • hide the (potentially complex) actual representation

Some options for a concrete syntax (where ki are ground terms, and vi arbitrary terms):

  • {k1:v1,...,kn:vn} — Python-like
  • #{k1:v1,...,kn:vn} — SWI-dict-like
  • #[k1:v1,...,kn:vn] — array/list-like
  • some other syntax that (unlike the above) is not valid Prolog

The Prolog system should transform this syntax into a chosen map representation (at parse time, or at least when compiling source), and reversely transform the internal map-representation into this syntax on output.

Unfortunately, the facilities for such transformations are poorly standardized. A basic implementation could use term_expansion and the portray-mechanism. With sufficiently expressive module/library system, these transformations can be packaged together with the implementation of the API predicates. A system may also choose to hardcode a mapping to a fixed map representation.

Sample Implementations

  • map_olist.pl: in strict ISO-Prolog, using ordered key-value lists.
  • map_arrays.pl: in strict ISO-Prolog, using key- and value-arrays.
  • map_assoc.pl: a wrapper for traditional library(assoc), which uses unbalanced binary trees.
  • map_rbtrees.pl: a wrapper for library(rbtrees) from YAP, using red-black trees.
  • map_234.ecl: a wrapper for library(m_tree234) from ECL, using 2-3-4-trees.
  • map_dict.pl: a wrapper for SWI dictionaries.


The Prolog Improvements Forum is a part of the "All Things Prolog" online Prolog community, an initiative of the Association for Logic Programming stemming from the Year of Prolog activities.