Options for write_term

[Note: If you have comments please post them at the Prolog Community Discourse for this PIP]

Abstract

This PIP proposes extending the set of options recognized by the standard write_term/2,3 predicate.

Notation and Terminology

Abbreviations for Prolog systems: Ciao (Ciao-Prolog), ECL (ECLiPSe), GP (GNU-Prolog), IF (IF-Prolog), SP (SICStus Prolog), SWI (SWI-Prolog), XSB (XSB-Prolog).

Unless otherwise stated, the default for each option is false.

Motivation and Rationale

We suggest a number of additional options for the write_term/2,3 built-in, as an extension of the basic set predefined by ISO-Prolog.

Main section(s)

The options required by the ISO-standard

The ISO-Prolog Standard [STD] defines the following required write-options in its section 7.10.4.

  • quoted(Bool)
    Iff Bool is true each atom and functor is quoted if this would be necessary for the term to be input by read_term/3. In addition, floats must be printed with sufficient precision to allow them to be read back exactly.
    NOTE: In systems with a string data type, print strings in string quotes.

  • ignore_ops(Bool)
    Iff Bool is true each compound term is output in functional notation. Neither operator notation nor list notation is used when this write-option is in force. In Corrigendum 3 this was extended to apply to the printing of {}/1 terms too.
    NOTE: SWI’s version does not affect lists and brace-terms.

  • numbervars(Bool)
    Iff Bool is true a term of the form '$VAR'(N), where N is an integer, is output as a variable name consisting of a capital letter possibly followed by an integer. The capital letter is the (i+1)th letter of the alphabet, and the integer is j, where i = N mod 26, j = N // 26. The integer j is omitted if it is zero.
    NOTE: the standard forbids the traditional feature of allowing N to be an atom which is used as the variable name.

  • variable_names(VNList)
    Each variable V is output as the sequence of characters defined by the syntax for the atom A iff a term A=V is an element of the list VNList. If more than one element applies, the leftmost is used. VNList is a list of terms A=T with A an atom and T any term, possibly a variable.
    NOTE: This option was added with Corrigendum 3 [STD3].
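
As an informal illustration of the four ISO options, the sketch below spells out the numbervars naming rule and exercises each option once. The predicate names var_name/2 and demo_iso_options/0 are made up for this example, and restricting N to non-negative integers is an assumption of the sketch. No output is shown for the unnamed variable in the last goal because its name is system-dependent.

```prolog
% var_name(+N, -Name): the variable name that numbervars(true) prints
% for '$VAR'(N), following the rule i = N mod 26, j = N // 26.
% Assumption of this sketch: N is a non-negative integer.
var_name(N, Name) :-
    integer(N), N >= 0,
    I is N mod 26,
    J is N // 26,
    Letter is 0'A + I,              % the (i+1)th capital letter
    (   J =:= 0
    ->  Codes = [Letter]            % the integer j is omitted if zero
    ;   number_codes(J, JCodes),
        Codes = [Letter|JCodes]
    ),
    atom_codes(Name, Codes).

% One goal per ISO write-option.
demo_iso_options :-
    write_term('hello world', [quoted(true)]), nl,      % 'hello world'
    write_term(1+2, [ignore_ops(true)]), nl,            % +(1,2)
    write_term(f('$VAR'(1),'$VAR'(27)),
               [numbervars(true)]), nl,                 % f(B,B1)
    write_term(g(X,Y,_),
               [variable_names(['Foo'=X,'Bar'=Y])]), nl.
```

For instance, var_name(0, N) gives N = 'A', and var_name(26, N) gives N = 'A1'.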

A processor may support one or more additional write-options as an implementation specific feature.

TBD: Treatment of unknown/unsupported options: error, silent ignore, warning, or? ISO requires error, SWI ignores silently.

TBD: SWI allows option_name as abbreviation for option_name(true), which seems convenient and sensible.
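
The two open points above can be made concrete with a small sketch. It expands a bare option_name into option_name(true) and raises the ISO domain_error(write_option, Opt) for anything it does not recognize. All predicate names here (known_option/1, normalize_options/2 and its helpers) are hypothetical, and the table of known options only lists the ISO ones; this is one possible behaviour, not a decision of this PIP.

```prolog
% known_option(?Option): the options this sketch accepts (ISO set only).
known_option(quoted(_)).
known_option(ignore_ops(_)).
known_option(numbervars(_)).
known_option(variable_names(_)).

% normalize_options(+Options0, -Options): expand abbreviations and
% reject unknown options with the error term required by ISO.
normalize_options([], []).
normalize_options([O0|Os0], [O|Os]) :-
    normalize_option(O0, O),
    normalize_options(Os0, Os).

normalize_option(O0, _) :-
    var(O0), !,
    throw(error(instantiation_error, write_term/2)).
normalize_option(O0, O) :-
    atom(O0), !,
    O =.. [O0, true],               % option_name -> option_name(true)
    check_known(O, O0).
normalize_option(O, O) :-
    check_known(O, O).

% check_known(+Option, +Original): report the option as the user wrote it.
check_known(O, _) :-
    known_option(O), !.
check_known(_, Original) :-
    throw(error(domain_error(write_option, Original), write_term/2)).
```

With these definitions, normalize_options([quoted, ignore_ops(false)], L) gives L = [quoted(true), ignore_ops(false)]. A system that prefers to ignore unknown options silently would replace the second clause of check_known/2 by one that simply succeeds.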

Proposed additional options

Whole term layout

  • max_depth(N) (SP,SWI,ECL,GP,Ciao,XSB)
    If N is a positive integer, print the term only up to a maximum nesting depth of N, and represent more deeply nested subterms as "...". If N is 0, impose no depth limit.

TBD: We have some disagreement over the default value: 0 (no limit) or inherit a context setting. Leave implementation-defined?

TBD: IF-Prolog also has maxdepth(N,TermAbbrev,ListAbbrev) for specifying the atoms that are used to abbreviate the omitted subterms.

TBD: SWI interprets the value specially for lists.

  • portrayed(Bool) (SP,ECL,IF,GP,Ciao,SWI)
    If true, call the user-defined predicate portray/1,2 in the way print/1,2 does.
    NOTE: This is for systems that support traditional print/portray/1 – ISO doesn’t define print/portray at all. ECLiPSe prefers print/portray/2 with stream argument.

  • priority(Prec) (SP,SWI,GP,Ciao,ECL,XSB)
    Prec is an integer between 0 and 1200 (default 1200), representing context operator precedence. Can be used to force correct parenthesizing when partial terms are written as arguments of operators. The written term will be enclosed in parentheses if its precedence is higher than Prec.
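
A typical use of priority is printing a term piecewise. The sketch below prints Name(Arg) by hand and passes priority(999) for the argument, since argument positions have context precedence 999. The predicate name write_call/2 is made up for this example, and the outputs in the comments assume a system that implements priority as described above.

```prolog
% write_call(+Name, +Arg): print the compound term Name(Arg) in pieces.
% A comma term has precedence 1000, which is higher than 999, so it is
% enclosed in parentheses; a+b has precedence 500 and is not.
write_call(Name, Arg) :-
    write_term(Name, [quoted(true)]),
    write('('),
    write_term(Arg, [quoted(true), priority(999)]),
    write(')').

% ?- write_call(f, (a,b)).     prints f((a,b))
% ?- write_call(f, a+b).       prints f(a+b)
```

Without priority(999), the first query would print f(a,b), which reads back as a term with two arguments instead of one.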

Term termination

  • flush(Bool) (ECL)
    If true, flush the stream (as with flush[_output]/1) after the term has been printed.

  • fullstop(Bool) (ECL,SWI)
    If true, terminate the term with a fullstop (a dot followed by blank space), so it can be read back. The blank space after the dot is a newline if the nl(true) option is present, otherwise a space character. If necessary, an extra space will be inserted before the fullstop, in order to separate it from the end of the term.

  • nl(Bool) (ECL,SWI)
    If true, print a newline sequence (as with nl/1) after the term. If this is used together with the fullstop(true) option, this newline serves as the blank space after the fullstop.

NOTE: Some of these seem redundant, since write_term(T,[nl(true)]) can be written as write_term(T,[]),nl. However, in a multi-threaded context, the two-goal sequence may be interrupted by other threads printing to the same stream, while a single write_term goal can easily be implemented as an atomic operation.
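
The difference between the two approaches can be sketched as follows. The predicate names save_term/2 and save_term_iso/2 are made up for this example; the first relies on the proposed fullstop and nl options (currently ECL and SWI), the second uses ISO options only.

```prolog
% save_term(+Stream, +Term): write Term so that read/2 can read it back.
% A single goal, so an implementation can make the whole output atomic.
save_term(Stream, Term) :-
    write_term(Stream, Term, [quoted(true), fullstop(true), nl(true)]).

% save_term_iso(+Stream, +Term): fallback using ISO options only.
% These are three separate goals, so output from other threads may be
% interleaved between them. The space before the dot is always written,
% which keeps the fullstop separate from a term ending in a symbol
% character.
save_term_iso(Stream, Term) :-
    write_term(Stream, Term, [quoted(true)]),
    write(Stream, ' .'),
    nl(Stream).
```

With fullstop(true) the system inserts the separating space only when it is needed, which the fallback cannot decide without inspecting the printed text.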

Functor-specific syntax

These options provide finer control than ignore_ops. Where options conflict, the more specific one should take precedence, e.g. dotlists over ignore_ops.

  • dotlists(Bool) (ECL,SWI)
    If false (default), write lists in the common square bracket notation, e.g. [1, 2]. If true, write lists in the dot functor notation, e.g. .(1,.(2,[])). This is subsumed by the ISO ignore_ops option.
    Note: SWI also has no_lists(Bool) for alternative list functor.

  • brace_terms(Bool) (SWI)
    If true (default), write {}(X) as {X}. This is subsumed by the ISO ignore_ops option.

TBD: this is irregular as it has a default of true - negate and rename?

  • operators(Bool) (ECL)
    This is like ISO ignore_ops, except that it does not affect lists (writing lists in functional notation is rarely desirable), but does affect all other syntactic sugaring (operators, braces, extensions such as array subscripts, apply syntax).

TBD: this is irregular as it has a default of true. It also controls syntax other than operators. Negate and rename to something like unsugared or functional?

TBD: Not all options are needed, choose a sensible set. Do we really need control for every individual feature, or is a single option to disable sugaring enough?

Option                 pre/in/postfix  lists          braces  other
<default>              a-b             [a,b]          {c}     sugared
ignore_ops(true)/ISO   -(a,b)          .(a,.(b,[]))   {}(c)   functional
operators(false)       -(a,b)                         {}(c)   functional
ignore_ops(true)/SWI   -(a,b)
dotlists(true)                         .(a,.(b,[]))
brace_terms(false)                                    {}(c)
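
To compare how a particular system treats these options, a small harness like the following may help. It writes one test term under each option list and catches the error raised for options the system rejects. The predicate names show_option_effects/2 and demo_sugar/0 are made up for this example. Note that a system which silently ignores unknown options will simply print the default form for them, and no outputs are listed here because they differ between systems.

```prolog
% show_option_effects(+Term, +OptionLists): print Term once per option
% list, labelled with the options used.
show_option_effects(_, []).
show_option_effects(Term, [Opts|Rest]) :-
    write_term(Opts, [quoted(true)]),
    write(':  '),
    catch(write_term(Term, Opts),
          error(_, _),
          write('<option not supported>')),
    nl,
    show_option_effects(Term, Rest).

% One infix term, one list and one brace term, as in the table columns.
demo_sugar :-
    show_option_effects(t(a-b, [a,b], {c}),
        [ [],
          [ignore_ops(true)],
          [operators(false)],
          [dotlists(true)],
          [brace_terms(false)]
        ]).
```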

Candidate options - TBD

Guidelines for adding options

The functionality of write_term overlaps with the functionality of format/printf. Trying to support all format/printf functionality in write_term may lead to a confusing proliferation of options, and multiple ways of doing the same thing. This might be kept in check by considering that

  • format/printf is about embedding one (or multiple) small (often atomic) Prolog terms into a printed text string. The focus is on how each embedded term is laid out individually. The reader is probably a human end user unfamiliar with Prolog terms.
  • write_term is about printing a single (possibly complex) Prolog term. The focus is on globally controlling the layout of an unknown number of subterms of varying types. The reader is probably a programmer or program familiar with Prolog terms.

Whole term layout

  • as(Kind) (ECL)
    Where Kind is clause, goal or term: assume that the printed term is of the given Kind.
    Note: In ECL, this selects the appropriate write transformations, but it could also be used to control portraying and indentation.

TBD: does this make sense for everybody?

  • cycles(Bool) (SP,SWI)
    If true (default), cyclic terms are written as @(Template, Substitutions), where Substitutions is a list Var=Value.

  • cycles(Style) (proposal)
    where Style is one of true or @ (select the SP/SWI style), false (don’t detect cycles), or some other constant such as detect.

TBD: since the @/2 syntax conflicts with other uses of this functor, it would be preferable to have cycles(Atom) to be able to specify different alternatives for representing cycles. The true-default is surprising, as the @-syntax is nonstandard. Default implementation-defined?
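
A quick way to see what a given system does with cyclic terms is sketched below. The predicate name demo_cycles/0 is made up for this example. It assumes that unification without occurs check creates a cyclic term (the ISO standard leaves X = f(X) undefined), and it adds max_depth(5) so that a system without cycle detection still terminates. No output is shown, since the representation is exactly what is under discussion.

```prolog
% demo_cycles: build a cyclic term and print it with cycle handling.
% Systems that reject one of the options raise an error, which is
% reported instead of aborting the demo.
demo_cycles :-
    X = f(X),                           % cyclic where no occurs check applies
    catch(write_term(X, [cycles(true), max_depth(5)]),
          error(_, _),
          write('<option not supported>')),
    nl.
```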

  • indented(Bool) (SP)
    The term is printed with the same indentation as is used by portray_clause/1 and listing/[0,1].

TBD: is this what we want? Why specific to clauses? Is it better to keep write_term basic, and leave complex formatting to specialised routines?

  • spacing(Atom) (SWI) or compact(Bool) (ECL) or space_args(Bool) (GP)
    Specifies whether and where to print extra spaces for readability.

  • spacing(SubOptions) (proposal)
    Where to print spaces (default: minimal spacing for parseability), with one or a list of sub-options in

    • next_argument: after the comma separating structure or list arguments
    • operators: always after prefix, around infix and before postfix operators

TBD: agree on name, default and possible values

Subterm-type-specific options

  • float(+SubOptions) variant A
    How to print floats, with one or a list of sub-options in
    • precision(+Precision): number of digits after the decimal point.
    • style(+Style): select either a C-printf-like style, one of f, e, g, or q (full precision for reading back)
    • upper: use upper case E for the exponent indicator instead of e.
  • float(+SubOptions) variant B
    How to print floats, with one or a list of sub-options in
    • style(+Style): where Style is one of f(Precision), e(Precision), g(Precision) or q (full precision for reading back).
    • upper: use upper case E for the exponent indicator instead of e.

TBD: decide on variant A or B

  • integer(+SubOptions)
    How to print integers, with one or a list of sub-options in
    • base(+Base): print integers in the given base. Base is either one of the atoms dec (default, without prefix), bin (with prefix 0b), oct (with prefix 0o), hex (with prefix 0x), or an integer in the range 2..36 (with prefix Base').
    • bare: suppress the base indicator prefix.
    • grouping(+Size): if 0 (default) no grouping. Otherwise size of digit groups.
    • separator(+Atom): text used to separate digit groups; the default is still to be decided.
    • upper: use upper case letters instead of the default lower case for printing in bases greater than 10.

TBD: decide on default group separator, and whether to support grouping at all.
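
The intended effect of base(+Base) can be sketched in plain Prolog. The predicate names integer_text/3, base_radix_prefix/3, digits/4 and digit_code/2 are made up for this example. The sketch covers only the base sub-option with its default lower-case digits, and placing the minus sign before the prefix is an assumption, not something the proposal specifies.

```prolog
% base_radix_prefix(+Base, -Radix, -Prefix): map a base(+Base) value to
% its numeric radix and the prefix described above.
base_radix_prefix(dec, 10, '').
base_radix_prefix(bin,  2, '0b').
base_radix_prefix(oct,  8, '0o').
base_radix_prefix(hex, 16, '0x').
base_radix_prefix(Base, Base, Prefix) :-
    integer(Base), Base >= 2, Base =< 36,
    number_codes(Base, BaseCodes),
    atom_codes(BaseAtom, BaseCodes),
    atom_concat(BaseAtom, '''', Prefix).    % Base followed by a quote

% integer_text(+Int, +Base, -Atom): the text printed for Int.
integer_text(Int, Base, Atom) :-
    integer(Int),
    base_radix_prefix(Base, Radix, Prefix), !,
    Abs is abs(Int),
    digits(Abs, Radix, [], DigitCodes),
    atom_codes(Digits, DigitCodes),
    atom_concat(Prefix, Digits, Unsigned),
    (   Int < 0
    ->  atom_concat('-', Unsigned, Atom)    % assumption: sign before prefix
    ;   Atom = Unsigned
    ).

% digits(+N, +Radix, +Acc, -Codes): digit codes, most significant first.
digits(N, Radix, Acc, Codes) :-
    D is N mod Radix,
    Q is N // Radix,
    digit_code(D, C),
    (   Q =:= 0
    ->  Codes = [C|Acc]
    ;   digits(Q, Radix, [C|Acc], Codes)
    ).

% digit_code(+Digit, -Code): 0..9 then a..z.
digit_code(D, C) :- D < 10, !, C is 0'0 + D.
digit_code(D, C) :- C is 0'a + D - 10.
```

For instance, integer_text(255, hex, A) gives A = '0xff', and integer_text(5, bin, A) gives A = '0b101'.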

  • atom(+SubOptions), string(+SubOptions), text(+SubOptions)
    How to print atoms or strings, with one or a list of sub-options in
    • max(+Length): truncate text after Length characters. Don’t truncate if 0 (default).
    • quote(+When): whether to print quotes. When is one of never (default), when_needed or always.
    • escape(+What): whether to print nonprintable characters as escape sequences in quoted text. What is one of all (default), most (all but newlines and tabs) or none.

TBD: discuss quote and escape options.

TBD: Is there a desire for an option to write codes/character lists as text?

Variable attributes

Attributes are not well standardized. The general consensus is to print them in curly braces after the variable they belong to. They may be qualified with an attribute name.

  • attributes(Atom) (ECL,SWI)
    Determines how variable attributes are printed. Options (default is inherited from a global setting) are:
    • none (ECL) or ignore (SWI): do not print attributes
    • dots (SWI): print {...}
    • full (ECL) or write (SWI): print the attributes as subterms surrounded by curly braces
    • pretty (ECL): use a per-attribute print handler mechanism to transform before printing, or suppress.
    • portray (SWI): use a per-attribute portray mechanism.

TBD: agree on the names.

Variables

In addition to ISO numbervars and variable_names:

  • legacy_numbervars(Bool) (SP)
    Like numbervars, but with the more permissive pre-ISO convention for printing variable names: if the argument of '$VAR'(N) is an atom or code list, these characters are written instead of the term.

  • namevars(Bool) (GP)
    A term of the form '$VARNAME'(Name), where Name is an atom respecting the syntax of variable names, is output as a variable name.

TBD: These two deal with shortcomings of ISO numbervars. Probably unnecessary when either numbervars(legacy) is supported, or legacy behaviour can be enabled by another ISO-compatibility switch mechanism.

  • variables(Method) (ECL)
    How to print variables:
    • default: print variables using their source name, if available. Otherwise print a system-generated name, which consists of an underscore and a number, e.g. _123. Note that this format cannot be reliably read back, because different variables may have the same source name.
    • raw: print all variables using a system-generated name, which consists of an underscore and a number, e.g. _123. This format is suitable when the term needs to be read back later. It makes sure that multiple occurrences of the same variable have the same name, and different variables have different names.
    • full: print variables using their source name, if available, followed by a unique number, e.g. Alpha_132. Variables without source name are printed in the raw format. Since variables with identical source names are named apart, this format is suitable when the term needs to be read back later.
    • anonymous: print every variable as a simple underscore. Any information about multiple occurrences of a variable is lost with this format. It is mainly useful to produce output that can be compared easily with the output of a different session.

TBD: Discuss options. Source names are probably ECL-specific. SWI can detect singletons and treat specially.

Subterms generally

  • portray_goal(:Goal) (SWI)
    Call call(Goal,SubTerm,WriteOptions) for every subterm. Like portray, if this fails, print normally, otherwise consider subterm printed. This can be used to implement an interface to format/3, for example.
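
A minimal sketch of such a hook, assuming SWI's portray_goal option and its format/2, might look like this. The predicate names short_float/2 and demo_portray_goal/0 are made up for this example.

```prolog
% short_float(+SubTerm, +WriteOptions): print floats with two decimals.
% Fails for every other subterm, so that it is printed normally.
short_float(SubTerm, _WriteOptions) :-
    float(SubTerm),
    format('~2f', [SubTerm]).

% Every float inside the term, at any depth, goes through the hook.
demo_portray_goal :-
    write_term(point(1.23456, [2.5, x]), [portray_goal(short_float)]),
    nl.
```

Because the hook receives the current option list, it can also call write_term/2 recursively on parts of the subterm with the same options.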

Unclear

  • float_width(Width) (XSB)
    Width must be an integer between 1 and 17, and this number determines the minimum width precision with which a floating point number is displayed. For instance, a width of 2 ensures that a floating point number is always displayed with a decimal value. The default value is 2.

  • module(Module) (SWI)
    Workaround for passing context module.

  • partial(+Bool) (SWI)
    If true, the token separation logic associated with the stream is not reset. This has consequences for inserting spaces and parentheses. Not precisely specified.

Probably Redundant/Compatibility only

  • maxdepth(N) (IF):
    The same as max_depth(N).

  • precedence(Prec) (ECL)
    A synonym for priority(Prec).

  • float_format(Atom) (SP)
    Subsumed by float(SubOptions). This is sensible, but defined in terms of non-standard format/2.

  • float_precision(Prec) (XSB)
    Subsumed by float(SubOptions). Prec must be an integer between 1 and 17, and this number determines the precision with which a floating point number is displayed (excluding trailing zeros). The default value is 15.

  • float_specifier(Spec) (XSB)
    Subsumed by float(SubOptions). Floats in XSB are printed using underlying C routines. In C a floating point specifier of f or F means that a floating point number is always printed with a certain precision, while a specifier of g or G truncates trailing zeros. The allowed values are g,G,f and F. The default value is g.

  • radix(Radix) (XSB)
    Subsumed by integer(SubOptions). Ensures that integers are printed with radix Radix. Radix can be decimal, hex or octal. The default is decimal.

Implementation

details for implementation, systems that support it, etc.

Related work

pointers to discussions and other PIPs

References

  1. [STD] International Standard ISO/IEC 13211-1 : 1995 Programming Languages - Prolog
  2. [STD3] ISO/IEC 13211-1:1995 TECHNICAL CORRIGENDUM 3 from 2017-07

Copyright

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


