💾 Archived View for schinkel.bevuta.com › fleng › MANUAL captured on 2024-08-18 at 19:24:21.
View Raw
More Information
⬅️ Previous capture (2024-05-12)
-=-=-=-=-=-=-
_______ _______ __ _ ______
|______ | |______ | \ | | ____
| |_____ |______ | \_| |_____|
User Manual
Version: 21
I. Introduction
This document describes an implementation of the "FLENG" programming
language as described in [1]. FLENG is a very minimal and low-level
committed-choice concurrent logic language descending from
Concurrent Prolog.
The implementation described here provides a FLENG compiler that
generates assembly language for x86-64, ARM and RISCV machines on UNIX
or BSD systems. Additionally front ends for "Flat Guarded Horn
Clauses" (FGHC)[2], "Strand"[5] and "PCN"[10] are provided, which are
higher level and more convenient languages.
II. Usage
The compiler is invoked by using the "fleng" driver script:
fleng [OPTION ...] SOURCEFILE
The script takes FLENG, Strand, PCN or FGHC source files and generates
assembler, binary object files or executables. Enter
man fleng
or
flengdoc fleng
to read about the available compiler options.
Binary object files can be linked together to produce executables
consisting of multiple compiled modules. The "fleng" driver accepts
- .o and *.a files on the command line and links them to any generated
executable. To compile an executable program, one main module must
be given when linking.
The file extension of the source file determines as in what language the
source code is to be written, it should be ".ghc" or ".st" for FGHC and
Strand, or ".pcn" for PCN, respectively. Alternatively, use the "-l"
option to specify explicitly in what language the code is written in.
The compiler options are also documented in the fleng(1) manual page.
Generated wrapper code using the "foreign/1" declaration is detected
and compiled automatically when binary files are produced. Separate
compilation is straightforward by compiling library modules into
object files that can then be linked with the main module.
All compiled executables accept a number of run-time command line
options to enable debugging output and override internal settings.
Enter
man 7 fleng
or
flengdoc options
to obtain more information about the available run-time options.
FLENG and FGHC programs usually spawn many processes that run in
parallel. This implementation can utilize native threads to partition
process pools into separately executing entities to take advantage
of multi-core CPUs.
Enter
flengdoc
to explore the available documentation interactively, or
flengdoc '<name>'
to see documentation for a particular primitive, library module or
syntactic element.
After you have installed the system you can try it out. If you
do not want to install it yet, just build it and set the FLENG_PREFIX
environment variable:
$ export FLENG_PREFIX=. [FGHC]
$ cat examples/hello.ghc
% the canonical first program
-initialization(main).
main :- writeln('hello, world.').
$ ./fleng examples/hello.ghc
$ ./a.out
hello, world.
A tool for building more complex programs and libraries is provided
with "flengmake". Finally, the "flengbundle" tool can be used to add
raw data to a program and access it from a program or library. For
more information, view the man pages or use "flengdoc".
III. The FLENG Language
For a full desription of FLENG, see [1]. This implementation provides
a superset including additional computation operators for numerical
calculations. A very short description of the language is given
here. Some familiarity with Prolog is helpful to understand the
syntax and concepts used, but Prolog has quite a different evaluation
model and one is not required to be proficient in it to learn
FLENG, which is in a sense considerably easier to understand.
If you prefer a more traditional syntax, skip to section "V. The PCN
Language" below, which describes "PCN" (Program Composition Notation),
which has more similarity with C-like languages and may be more
accessible for users that do not have a Prolog background.
A FLENG program consists of a set of "process definitions", where
each definition is a set of clauses of the same name and arity:
process_name(Argument1, ...) :- Goal1, ... .
Arguments are "matched" to the actual argument given when the process
is created. Goals are new processes to be invoked when the clause
is selected. When a process is invoked, one clause of its definition
that matches the given arguments is selected and executed, by
spawning a new process for each goal in the body, for example:
% write each element of the given list: [FGHC]
-initialization(main). % start execution at "main" goal
walk([]) :- true.
walk([_|Y]) :- walk(Y).
main :- walk([1, 2, 3]).
Each clause must have a body containing a list of goals, separated
by comma (","). Note that all body goals are truly executed in
parallel, not in the order in which they are specified. If no clause
matches the arguments, then the process "fails", resulting in a
run-time error.
Argument variable bindings take place when clause heads are matched
and variables are distinguished by starting with an uppercase letter
or the "_" character. A variable in a clause head matches whatever
is passed as argument to clause invocation. Logic variables passed
as arguments on the other hand can only be bound by explicit
unification in the body. Matching a non-variable argument with an
unbound variable passed by the caller, the process selection is
suspended until the variable has been bound. Here an example:
-initialization(main). [FGHC]
try([Y]) :- true.
main :- try(X), wait(X).
wait(X) :- X = 123.
Here "try" will suspend until the unification of "X" with 123 is
performed. Once a logic variable is bound, all goals suspended on
the variable will be made active and are scheduled to be executed
at some stage, after which they may continue or suspend on other
variables. The variable named "_" is always unique, multiple
occurrences of "_" represent different anonymous variables. These
are normally used as "don't care" patterns when matching.
Once the head of a clause is fully matched, the clause "commits"
and the body goals are spawned. All other alternative clauses will
be ignored, there is no "backtracking" as in Prolog.
One specific feature of FLENG is the "!" operator. When used in the
head of a clause, a variable prefixed with "!" will suspend until
the variable is bound, regardless of whether the argument is
subsequently used or not:
-initialization(main). [FGHC]
try1([X]) :- true.
try2([!X]) :- true.
main :- try([Z]).
try(Y) :- try1(Y), try2(Y).
"try1" will match and complete, but "try2" will suspend forever,
since it waits for the first element of the given list to be bound
to some value.
In FLENG the same argument variable may not appear more than once
in a clause head.
A FLENG program terminates once there are no more active goals. If
there are still suspended goals, then a deadlock has occurred and
the program aborts with an error message.
The only built-in primitives in FLENG are "compute/{3,4}", "unify/3",
"true/0", "call/1", "=/2" and "foreign_call/1", described later
in this manual. There are further undocumented primitives which are
used for translated FGHC code.
Declarations define additional properties of the compiled module
and have the following form:
-declaration(Argument, ...). [FGHC]
Currently supported declarations in FLENG are "include/1",
"initialization/1", "arguments/1", "exports/1", "module/1", "foreign/1"
and "uses/1", which are described in the "LIBRARY" file and via
flengdoc(1). Some additional declarations exist for internal use
but are not documented here. Declarations apply to the whole source
file and are processed before all other clauses, regardless of where
the declaration is located in the source file.
FLENG is intentionally minimal to make it easy to implement and
should be considered a suitable target language for compilers that
translate higher-level languages. For user programming it is
recommended to use the FGHC or PCN front ends described below.
IV. The FGHC Language
"FGHC" or "Flat Guarded Horn Clauses" is another dialect of Concurrent
Prolog, descended from "GHC" which is described in [2]. It provides
clause guards, arithmetic expression evaluation and some other
syntactically convenient features. See also [3] for a general
overview over the many concurrent logic language dialects and how
they relate to each other. FGHC is "flat", meaning that only built-in
guard-expressions are allowed in the guard part of a clause, no
user defined process definitions can be used as guards.
"Strand" can be considered a slightly restricted subset of FGHC,
so code written in Strand is compiled as FGHC and can take advantage
of FGHC features like general output unfication.
An FGHC clause has the following general format:
process_name(Argument1, ...) :- Guard1, ... | Goal1, ... .
The "Guard1, ... |" part may be omitted. If no guards and no goals
exist, the clause may be written as
process_name(Argument1, ...).
If guards exist but no goals need to be evaluated, use the following
form:
process_name(Argument1, ...) :- Guard1, ... | true.
A guard is one of a set of predefined guard expressions that are
executed to determine if the clause commits or not. If a guard
expression fails, then another clause is selected until one clause
commits or the whole process invocation fails.
Both FLENG and FGHC support the following data types:
signed integers 123 0xF00F 0'A
floating point numbers 1.23
strings (atoms) abc 'one\ntwo'
lists [1, 2, 3] [head|tail] "abc"
tuples foo(bar, baz) {x, y, z}
variables Abc _123 _
ports
modules
The atom "[]" designates the empty list. Anonymous variables ("_")
as in Prolog are supported. Character lists enclosed in double
quotes are represented as a list of UNICODE code points. Source code
may contain arbitrary UNICODE characters as long as the basic
syntactic requirements are fulfilled, e.g. variables must start with
"_" or an uppercase letter ("Lu" category).
"Ports" are identity-preserving stream containers, as described in [4],
Circular data structures are not allowed. This is
currently only superficially checked at run-time. Tuples of the
form "{...}' may not be empty. Note that "abc(def)" is equivalent
to "{abc, def}".
Quited strings and character lists may contain hexadecimally encoded
characters using the "\xXX" and "\uXXXX" escape sequences, in
addition to the usual "\t", "\n" and "\\".
Care must be taken when having multiple clauses that overlap in the
arguments they match: if multiple clauses match at the invocation
of the associated process definition, then it is unspecified which
of the candidate clauses will "fire" and be committed. You can
narrow the set of matching clauses by using guards if you want to
avoid this non-determinism:
big(N, F) :- N > 1000 | F = yes. [FGHC]
big(_, F) :- F = no.
In an invocation of "big(1001)" the second clause may match, since
matching is non-deterministic. Add another guard or "otherwise" to
remove this ambiguity:
big(N, F) :- N > 1000 | F = yes. [FGHC]
big(N, F) :- N =< 1000 | F = no. % alternatively: "otherwise/0"
Declarations as in FLENG are supported. FGHC also allows using the "!"
annotation. The restriction that the same variable may not occur more
than once in a clause head is lifted and equivalent to binding a fresh
variable and adding a "==/2" guard matching both variables.
Prolog's single line ("% ...") and block ("/* ... */") comments are
supported in FGHC/Strand and FLENG.
The main differences between FGHC and Prolog can be summarized
as follows:
- FGHC and FLENG are committed-choice language and has no backtracking:
once a clause head matches and the guards succeed, all remaining
solutions are abandoned.
- All goals of a clause are executed in parallel, in an arbitrary order.
- FGHC has no general unification - clause heads and guards only match
(and suspend if unbound variables are matched with non-variable
arguments), variables are assigned exclusively by "=/2", "unify/3",
"assign/3", "is/2" and primitives that unify results with argument
variables.
- Matching a non-variable with an unbound variable or using it in a
guard other than "unknown/1" suspends the current process until the
variable has been bound.
For a more in-depth description of concurrent logic languages, the
"Strand" book [5] is highly recommended as it explains the concepts
and techniques in detail and gives many examples (The book is available
online, see the "References" section for details).
Strand is a subset of FGHC, but only provides assignment instead
of output unification. FGHC is very similar to "KL1"[6], which was a
language used in Japan's Fifth Generation Computing project (FGCS)[9].
This implementation of FGHC also supports paired arguments, an extension
from KL1 that simplifies handling argument pairs. A preprocessing stage
transforms clause heads and goals of a certain form in a manner described
as follows:
(this description was taken from the "KLIC"[12] manual)
The head and goals in body parts of a clause can have argument pairs
specified by a single variable name attached to the head or goals
by a hyphen character. We call such a pseudo variable an "argument
pair name":
p(X,Y)-Pair :- q(X)-Pair, s(Z)-Pair, r(Pair,Y), t(Z)-Pair. [FGHC]
The pseudo-variable "Pair" is an argument pair name. Such a clause is
interpreted the same as the following clause:
p(X,Y,P0,P) :- q(X,P0,P1), s(Z,P1,P2), r(P2,Y), t(Z,P2,P). [FGHC]
Occurrences of argument pair names attached to the head or goals by a
hyphen character are interpreted as a pair of two different variables
added at the end of the argument lists. In what follows, we call the
two variables generated from an paired argument an "expanded pair".
The second of an expanded pair of a goal is the same as the first of
the expanded pair of the next goal with the same argument pair name.
In the example above, "P1" appearing as the third argument of the goal
of "q/3" also appears as the second argument of "s/3", as originally
they both have the same argument pair name "Pair".
The first of an expanded pair in the head will be the same as the
first of the expanded pair in the first goal in the clause with the same
argument pair name. The second of an expanded pair in the head will be
the same as the second of the expanded pair in the last goal with the
same argument pair name.
In the above example, the first of the expanded pair "P0" in the
head appears again as the second argument of the first goal calling
"q/3", and "P", the second of the expanded pair in the head, appears
again as the third argument of the last goal of "t/3".
If the argument pair name appears only in the head, two variables of
the expanded pair are unified in the body. For example, a clause:
p(X)-Y :- q(X). [FGHC]
is expanded into the following.
p(X,Y0,Y) :- Y0=Y, q(X). [FGHC]
An argument pair name may appear at a usual argument position rather
than being attached to the head or goals, as does the first argument
of the goal for "r/2" in the above example. In such a case, it is
expanded to a single variable. This variable is the same as the
second of the last expanded pair and is also the same as the first
of the next expanded pair. Thus, in the above example, "Pair"
appearing as the first argument of "r/2" is expanded into "P2",
which is the same as the third argument of "s/3" and the second
argument of "t/3".
Arbitrarily many argument pair names can be specified for a head
or a goal. For example, a clause such as:
p-X-Y :- q-X, r-Y, s-Y-X. [FGHC]
is interpreted as follows:
p(X0,X,Y0,Y) :- q(X0,X1), r(Y0,Y1), s(Y1,Y,X1,X). [FGHC]
Sometimes, specifying normal arguments after some argument pair
names is desirable. This can be done by connecting them with a
plus ("+") character. For example:
p-X+Y :- q-X+35, r(Y), s+Y-X. [FGHC]
is interpreted as follows:
p(X0,X,Y) :- q(X0,X1,35), r(Y), s(Y,X1,X). [FGHC]
Note that the expansion rules for paired arguments described above
are position sensitive for goals. However, this does not mean that
the execution order of body goals are constrained anyhow.
Also note that the argument pair notation is no more than macro
expansion of clauses. One predicate may have clauses some of which
written in the argument pair notation and others in the usual
notation.
V. The PCN Language
"PCN" (Program Composition Notation) is a language developed by the
designers of "Strand", having a more conventional C-derived syntax,
but providing the same level of parallelism as the more traditional
variants of concurrent logic languages. The language is described
in detail in the PCN manual[10] and in [11], a tutorial can be found
in the "doc" directory of the distribution in the file "PCN-tutorial.txt",
or enter
flengdoc PCN-tutorial
to view it at the command line.
The following statement types are supported:
"<procedure-name>(<term>, ...)" (procedure call)
Invokes the given procedure, passing zero or more arguments.
"<identifier> = <term>"
Defines a definitional variable.
"<identifier>[<expr>] = <term>"
Defines the definition variable in a tuple element.
"<identifier> := <expression>"
Assigns the result of an expression to a mutable variable.
"<identifier> ++ <term> ++ ..."
Pushes the terms or expressions onto an argument pair (see
the description of "variable pairs" below).
"<identifier>[<expr>] := <expression>"
Sets an array element to the result of an expression.
"<identifier> over <start> .. <end> :: <statement>"
"<identifier> over <list> :: <statement>"
Iterates over the range of numbers <start> to <end> (inclusive)
or over elements of a list, executing <statement>, with
<identifier> bound to the current value during the execution of
the statement.
"{<designator> <statement>, ...}"
Nested composition, where <designator> is one of ";", "||' or
"?".
"if(<guard>, ...) <statement> [else <statement>]"
Syntactic sugar for
"{? <guard>, ... -> <statement>, default -> <statement> }"
"let <guard>, ..."
Syntactic sugar for
"{? <guard>, ... -> <following statements in composition>,
default -> <fail with error> }"
"with <identifier> = <term> , ... -> <statement>"
Invokes <statement> in a task with dynamic bindings.
Task-bindings are inherited by sub-tasks and can be accessed with
"binding/{2,3}" and "call_handler/{1,2}".
"<identifier>.<identifier2> <-- <term>"
Field update notation for structs. <Identifier> must be
a paired argument.
"<identifier>[ <expr> ] <-- <term>"
Indexed update notation for lists and tuples. <Identifier>
must be a paired argument.
Guards are used to test conditions and execute the associated
statement when all guards in a rule are successful. The following
guard types are supported:
"<identifier> ?= <pattern>"
Match the contents of <identifier> with <pattern> and succeed
if the structure of the value matches.
"<type>(<term>)"
Succeeds when the given <term> is of the designated type. Valid
types are "int", "double", "tuple", "list", "number", "string",
"atomic", "remote", "module" or "data".
"known(<term>)"
Succeeds if <term> is a non-variable value or a bound variable.
"unknown(<term>)"
Succeeds if <term> is an unbound variable.
"<term> == <term>"
"<term> != <term>"
Test terms or expressions for equality. The arguments may be
numerical expressions.
"<expression> > <expression>"
"<expression> < <expression>"
"<expression> >= <expression>"
"<expression> <= <expression>"
Test expressions by numerical comparison.
"idle"
Succeeds when the current thread is inactive (i.e. no other
processes are running).
"default"
Succeeds when no other guard does. If another guard in the same
choice composition suspends, then the whole composition suspends
and the default guard waits until all other choices do not
succeed even after resumption.
The special guards "idle" and "default" can not be combined with other
guards in the same clause.
A single term or expression used as guard is equivalent to a guard that
compares that expression to a value other than the string "false".
A "term" stands for a structured data object or an arithmetic expression.
Data objects are
"[<element>, ...]"
Lists of arbitrary values.
"{<element>, ...}"
Tuples, fixed-length sequences of one or more values.
<number>
<string>
<character>
<identifier>
Term-elements inside "[...]" and "{...}" may also be expressions, but
these are not evaluated and represented as structured data in the form
of tuples, lists and atomic objects (strings, numbers and variables),
mostly identical to FGHC terms. Note that not all valid PCN
expressions are allowed inside terms, specifically "lambda"
expressions ("``(...) -> ...").
Elements may be integer or floating point numbers, following the usual
syntactic conventions of C:
- Numbers, optionally using numeric base prefixes "0x...", "0..." and
"0b...", for hexadecimal, octal or binary integer literals).
- characters, using the "'<char>'" notation, as in C.
- strings, character sequences enclosed in double quotes ("...").
- character lists, using the L"..." notation.
Note that strings are not equivalent to character arrays. Two or more
adjacent strings or character lists are combined during compile-time
into a single string or character list if both literals are of the
same type.
Numbers may contain "'" as a separator for readability if the preceding
and following character is a digit (in whatever base the number is given).
Expressions may be any of the following:
"<expr> + <expr>"
"<expr> - <expr>"
"<expr> * <expr>"
"<expr> / <expr>"
"<expr> % <expr>"
"<expr> & <expr>"
"<expr> | <expr>"
"<expr> ^ <expr>"
"<expr> >> <expr>"
"<expr> << <expr>"
Arithmetic and logic expressions and the "modulo" operator.
Operator precedence is as in C.
"abs(<expr>)"
Absolute value.
"sin(<expr>)"
"cos(<expr>)"
"tan(<expr>)"
"atan(<expr>)"
"log(<expr>)"
"exp(<expr>)"
"real_fractional_part(<expr>)"
"real_integer_part(<expr>)"
"floor(<expr>)"
"ceiling(<expr>)"
"round(<expr>)"
"truncate(<expr>)"
"sqrt(<expr>)"
Operations on real numbers and trigonometic functions.
"max(<expr>, <expr>)"
"min(<expr>, <expr>)"
Maximum and minimum, if either of the arguments is a floating-point value,
then the result will be as well.
"real(<expr>)"
"integer(<expr>)"
Convert reals to integers and vice versa.
"rnd(<expr>)"
"rnd()"
Return a random integer between 0 and <expr>-1 or a uniformly
distributed random real between 0 and 1.
"-<expr>"
Negation.
"~<expr>"
logical inverse.
"`(<identifier>, ...) -> <statement>"
"Lambda" expression: returns a string or tuple that can executed
using the metacall notation ("`...`") and then runs the statement
body on invocation via "call/{1,2,3}", "apply/2" or using the
"metacall" syntax ("`...`").
"``<identifier>``"
"``<identifier>(<term>, ...)``"
A callable reference to a procedure in the local module, equivalent
to a lambda expression that simply passes all arguments to the named
procedure, optionally with a number of implicit arguments.
"length(<expr>)"
Returns the length of an array, string, list or tuple, for
arguments of any other type 1 is returned.
"tail(<expr>")
Returns the tail element in a linked list.
"<identifier>"
A definitional or mutable variable.
"<identifier1>.<identifier2>"
Field-reference for structs.
"<identifier>[<expr>]"
References an array, list, string or tuple element. May be used
as an assignment target for arrays and as a definition target
for lists and tuples, if the indexed element refers to an
unassigned variable.
"<function-name>(<term>, ...)"
A function procedure call. Meta-calls using "`...`" are
supported.
"<literal>"
An integer, float or character literal.
The "`...`" notation allows constructing procedure calls dynamically,
the "@" notation allows executing a procedure in another thread. Code in
separately compiled modules is called using the "<module>:..." prefix,
as in FGHC. See the tutorial, the PCN manual[10] and the PCN Primer[11] for more
information.
Declarations of the form
-<declaration>(...) [PCN]
are available as in FGHC and have the same meaning. Note that
the "include" directive is passed on to the FLENG backend and
does not include PCN code, to do that use the "#include" form
of the C preprocessor.
PCN programs have access to the same primitives as FGHC code and
the "foreign_call" operation can be used in the same manner. PCN
code can call FGHC or Strand predicates and vice versa. Note that
arguments passed to library primitives and non-PCN modules follow
call-by-value semantics, as FGHC/Strand/FLENG code has no notion
of mutable variables as implemented by PCN. To ensure seamless
interoperability, use the
-fleng_library([module1, ...]) [PCN]
declaration for separately compiled modules that are not written
in PCN. The PCN compiler automatically declares the standard library
modules provided in this manner, so no declaration is needed.
Where primitives and standard library operations require strings
or character lists, char- and int-arrays are usually accepted as well
where character lists are taken as UTF-8 encoded byte arrays
ant int-arrays as sequences of UNICODE code points.
The main differences to the language described in [10] and [11] are
as follows:
- Single line comments ("// ...") are supported.
- Added support for closure-like meta-expressions ("`(...) -> <block>"
and "``<call>``".
- Quantification over lists has been added.
- Numeric literals in hex, octal, binary and character notation,
compatible to C are provided.
- Tuples may not be empty, the empty tuple is identical to the
empty list.
- The "in" operator is not supported. It's meaning is not yet clear
to me.
- Expressions support negation (unary "-") and all arithmetic, conversion
and trigonometric functions supported by FGHC.
- Lists and tuples are distinct types.
- The full meta call form ("`<id>`") takes a tuple as argument,
where the first element is the name of the procedure.
- Meta-calls may not refer to primitives.
- Remote calls (using the "@" notation) always pass mutable arguments
by value, with the exception of arrays.
- The PCN compiler detects multiple assignments to mutable variables
in parallel compositions, but not modifications caused when mutable
variables are passed to PCN procedures and modified subsequently.
- The "@" notation allows Strand peer names (the identifiers "fwd", "bwd",
"north", "east", "south" and "west") in addition to integer
expressions. The "nodes()" and "topology()" functions are not available.
Numerical peer indexes starts at 1.
- All FGHC type tests are available and have the same semantics as in FGHC.
- Mutable variables may have the "byte" type. "char" can hold a full
UNICODE code point and thus defines a 32-bit integer. "byte"
represents unsigned octets.
- The "port" type is not supported.
- Numerical guards fail with an error if given non-numerical arguments,
this stricter type discipline is considered more robust than the
behaviour given in [10] which makes the guards silently fail.
- Procedures of the same name but different arity (number of arguments)
are distinguished and can exist alongside each other.
- The argument to "length()" may be a term.
- Array declarations that do not refer to arguments may contain a
literal integer size specifier or refer to a mutable argument
variable, but not to a definitional or non-argument variable.
- Changes to mutable non-array function arguments are not visible in
the caller.
- Choices must always be enclosed in "{? ... }", even if they only
contain a single clause. This implementation provides "if...else..."
and "let" as a more convenient notations.
- Multiple "default" clauses in choice compositions may be present
and are combined into a single clause.
- Ordering guards (">", "<", ">=" and "<=") only compare numerically,
arrays are not compared element-wise, as specified in [11].
- The binary integer operators "&", "|", "^", ">>" and "<<" are
supported.
- In declarations of local array variables the size specifier may
refer to an argument, which must be a declared mutable variable.
- The primitives listed in [10] are only partially available, the
set of primitive operations that can be used is fully equivalent to
the primitives available for FGHC. See the reference documentation
(or "flengdoc") for details.
- Struct declarations and field-reference and -update.
- Variable pairs, the "++" operator and the update operators "+=", ...
are implementation-specific extensions to PCN.
- Embedding of C/C++ code sections.
A BNF definition of the syntax supported by this PCN compiler
can be found in the file "doc/pcn.bnf" of the source distribution.
VI. Builtin Primitives and Support Libraries
The separate document "LIBRARY" describes the available primitive
operations and library definitions available in FGHC code. The
flengdoc(1) tool can be used to search and extract reference
documentation from the command line. Enter
flengdoc modules
To see a list of the available library modules.
These primitives may be implemented as calls to the run-time system, as
calls to external C libraries, or as library predicates written in
FGHC, PCN or FLENG. Arguments with the suffix "^" will be unified with
a result value, arguments with the suffix "?" are expected to be
non-variable ("ground") values.
If a primitive is said to "signal an error", then the program will
report an error message and terminate.
Most library primitives "force" their arguments, if required, by
suspending the calling process if an argument is an unbound variable.
Notable exceptions are "compute/{3, 4}" and "unify/3".
Note that if primitives require arguments of type "string", they expect
an (interned) atom (FGHC: abc, 'a string', PCN: "..."), not a character
list. If a primitive accepts both a string or a character list in an
argument position, it specifically says so.
The library modules "$rt", "pcn", "lib" and "sys" are used internally and
provide support code for builtin operations and inter-thread
communication.
- Precedence and associativity of infix operators in FGHC:
The precedence and associativity of numerical operators follows the
usual ISO Prolog syntax:
Operator Associativity
+ - \ fy Highest precedence
** xfx
* / \\ << >> yfx
+ - /\ \/ >< yfx
: xfy
== =\= =:=
\=:= @< @> @=<
@>= > <
>= =< @
<= => <==
+= -= *= /= xfx
= := is xfx
& xfy
, xfy
-> xfy
; xfy
| xfy
:- xfx Lowest precedence
- Precedence and associativity of infix operators in PCN:
Operator Associativity
- ~ fy Highest precdence
* / % xfx
+ - yfx
<< >> xfx
& | ^ yfx
== != > < >=
<= := = xfx Lowest precedence
VII. FGHC Programming Tips
1. Sequencing process execution
If processes must run sequentially, there are three methods
for ensuring non-parallel execution, with different run-time
costs and partially different semantics. The cheapest method
is to assign a value to a dedicated unbound variable and
force the argument in a clause head to be bound:
seq :- first(A), second(A). [FGHC]
first(A) :- ..., A = []. % "[]" used by convention
second(A) :- data(A) | ... . % alternatively: "second(!_)"
The second option is to use "when/2":
seq :- first(A), when(A, second). [FGHC]
first(A) :- ..., A = [].
second :- ... .
"when/2" is basically equivalent to the first method, but
creates a compiler-generated intermediate predicate that
performs the dereferencing (and possible suspension) of the
assigned variable.
The third method is to use "&/2", which runs the first goal
as a child task and invokes the second goal when the task and
all processes spawned by the task have completed:
seq :- first & second. [FGHC]
Note that if "first" starts a recursive process, then "second"
will not be executed until the process finishes. The example
is mostly equivalent to the following:
seq :- call(first, A), when(A, second). [FGHC]
Task-creation and maintenance will have a slight run-time cost,
so performance critical code may try to avoid using tasks in
this situation.
In PCN, writing sequential code is straightforward, but it should
be kept in mind that parallel code is usually more efficient than
sequential code, as the compilation strategy and run-time system is
optimized towards parallel execution of code. Sequential execution
requires additional internal machinery to ensure proper order which
has a small, but non-trivial overhead.
2. Module prefixes in uses of "call" and "apply"
Dynamic calls assume the target predicates belongs to the
module that contains the "call" or "apply" form, unless the
invoked form is already prefixed with a module name using the
":/2" operator. Indirect calls and applications, which occur
for example in library code (notably in the "app" library)
do not have information about the calling module, so this
cases the module defaults to "" (the empty module name)
which is the main application module.
3. Runaway stream producers
When processes that produce and consume streams run at different
speeds, it may happen that the consumer is not able to process
elements from the stream fast enough to avoid the build up of an
ever larger number of input elements. As an example, consider the
following code:
io:read_lines(File, Lines), consume(Lines).
If "consume" runs slower than the library code that reads in the
stream of lines, a large file will create a stream holding most or
all of the contents of the file, while the consumer processes only
a small part of the data at a time, possibly filling up the available
heap space. In this example, "io:read_lines_bounded" can be used,
but the problem exists in every situation where data is produced
quicker than it can be consumed. In such situations the producer
must be slowed down artifically, by using bounded buffers (where
the consumer explicitly provides elements to the producer to be
filled) or by using some other method of signalling to the producer
that new data is required.
Another situation exists when processes are created at a high rate,
without requiring synchronization:
loop :- process_user_input, loop. [FGHC]
This will quickly exhaust the available goal space by spawning
processes excessively, since the call to "loop/0" can proceed
immediately, even though the intent is to read and process input
from the user. The remedy is to explicitly wait for data to be
read and processed fully before starting the next iteration of the
loop, for example by using "when/2":
loop :- process_user_input(Done), when(Done, loop). [FGHC]
where "process_user_input/1" unifies "Done" with some value when
it is ready for new input. Alternatively, you can use "&/2" and
run the processing in a separate task, if possible background
processes spawned in reaction to the input are ensured to be
completed:
loop :- process_user_input & loop. [FGHC]
4. Overlapping matches and "otherwise"
The order in which goal arguments are matched with clause heads is
not specified and may not follow strict left to right order or the
order of clauses in the source code. Heads of clause definitions
that overlap may cause unexpected behaviour. For example:
start :- perform_work(A), foo(A). [FGHC]
foo([A]) :- ... .
foo(A) :- ... .
Both clauses may match if "foo" is called with a one-element list,
as the order of matching and unification is not defined. It is
advisable to use the "otherwise/0" guard to identify fallback clauses
that should be matched if all previous clauses are guaranteed to
be failing. The FGHC compiler will detect overlapping clause heads
in some cases (but not all) and issue a warning.
Sometimes this overlapping is desired, for example when
simultaneously consuming more than one stream:
merge([X|S], S2) :- ... [FGHC]
merge(S, [Y|S2]) :- ... [FGHC]
Here we want to suspend on either stream and are sure that at least
one stream will be instantiated. Still, an otherwise clause will
not hurt and may also catch unexpected cases.
5. Variable matching in guards
Unbound variables in guards will cause a process to suspend. This may
be unintuitive for the user as unbound variables in clause heads do not
cause a suspension:
foo(A, B) :- B == x(_) | ... [FGHC]
In the example, "A" will not cause a suspension, even if the argument
passed to "foo/2" is unbound. But the match of "B" with will suspend
on "B" (if it is unbound) _and_ will also suspend forever, because
it will try to dereference the anonymous variable ("_"). This can
be considered a bug.
- Interprocess comunication ("detached ports")
Message exchange between threads normally takes place via shared
memory "message ports" where threads store and retrieve message
data passed between different execution contexts. These ports
can also be "detached" by placing them into files on disk that
allow shared access by different processes. The "sys" library
provides operations to detach a message port and to attach to
a port already detached by a different process. Messages can
only be sent to a detached port and must not contain unbound
variables or references to "live" data that can not be meaningfully
deserialized by the receiving process, like port objects. To
perform bidirectional communication, both processes have to detach
message ports and need to know the filename of respective peer's
port file.
Here is a simple example:
% a [FGHC]
-initialization(main).
main :- sys:detach_message_port('/tmp/a', _).
forwarded(Msg) :-
fmt:format('a received: ~q\n', [Msg]).
% b
-initialization(main).
main :-
sys:attach('/tmp/a', _, Addr),
command_line(Args),
sys:transmit(Addr, Args, _).
$ fleng a.ghc -o a.out
$ fleng b.ghc -o b.out
$ ./a.out &
$ ./b.out hello world
a received: [hello, world]
$ ./b.out and more
a received: [and, more]
Only the thread calling "sys:detach_message_port/2" will have its
port detached, all other threads still use a process-local message
port, but communication between threads with detached and undetached
ports still works as normal.
6. Networking
FLENG has no built-in networking facilities. It is recommended
to run tools like socat(1) [7] as subprocesses, using the "proc"
library module.
7. Cleaning up resources once they are no longer needed
Especially when interfacing to foreign code it may be necessary
to control when resources are being freed. A particularly useful
idiom is to use a "port" object, which closes the associated
stream once the port is not referenced anymore by a process:
create_resource(P) :- [FGHC]
open_port(P, S),
<create resource R>
wait(S, R).
wait([], R) :- <release resource R>.
create_resource(p) {|| [PCN]
open_port(p, s),
<create resource R>
{? s == [] -> <release resource R>}
}
Since ports are also useful as a message or command interface and
ensure serialization of requests, they are generally useful to
implement all accesses to specific resources.
8. Speeding up calls to other modules
Cross-module invocations are performed dynamically, using a cache
for saving the resolved target address in subsequent calls from
the same site. To further speed up direct calls to other modules,
you can generate a special "link" file holding labels designating
the location of the external call target. Use it like this:
% foo.ghc: [FGHC]
-initialization(main).
main :- bar:foo.
% bar.ghc:
-module(bar).
foo :- writeln(foo).
$ fleng -e bar.ghc # generates "bar.link"
$ fleng -c -link bar.link bar.ghc # generates "bar.o"
$ fleng foo.ghc -link bar.link bar.o
$ ./a.out
foo
/* foo.pcn: */ [PCN]
main(argc, argv, exit_code) {; bar:foo() }
/* bar.pcn: */
-module(bar)
foo() {; writeln("foo") }
$ fleng -e bar.pcn # generates "bar.link"
$ fleng -c -link bar.link bar.pcn # generates "bar.o"
$ fleng foo.pcn -link bar.link bar.o
$ ./a.out
foo
Note that compiling "bar.ghc"/"bar.pcn" to an object module needs to
see the link file so that additional globally visible entry-points
are produced in the output file.
VIII. Debugging
To analyse run-time errors, execution can be logged to standard error
or a dedicated file using the "+LOG" run-time option. If a FLENG or
FGHC file is compiled using the "-d" option, each entry into a
clause is written to the log-file.
Logging can be enabled for specific threads by using a suitable
bit-mask specifying the threads that should be logged. The log-file
is never deleted and new information is appended at the end, so you
can use "tail -f" to follow the log over mulitple runs. The "+LOGX"
option only logs explicit calls to "log/1". You can also use the
"trace/{1,2}" primitive to enable logging for specific tasks only.
Additionally, information about the overall status of the program
can be logged, when the "+STATS" or "+HEAPSTATS" run time options are
given. Statistics about memory use and process load are collected and
can be subsequently filtered and analyzed. Note that logging has a
severe performance impact due to file-system access and thread
locking.
Configuring the build with "--debug" adds debugging information
to the C run time library and enables additional events to be
logged.
Code execution can be profiled by using the built-in statistical
profiler. Compiling your programs and library modules with the "-p"
option will instrument the code to provide profiling information.
When you run a thus instrumented executable with the
"+PROFILE <filename>" run-time option, then on successful execution
timing information is written to the given file in textual format.
Finally, compiled programs can be interactively debugged using a
built-in read-eval-print loop, enabled by the run-time option
"+INTERACTIVE", which ignores the initialization goal of a program
and directly enters the library operation "eval:repl/0" (To use
this mode, the program must be compiled with the "-i" option).
In interactive mode you can enter and evaluate a restricted
subset of FGHC terms. All primitives are available augmented with
a number of convenience operations, enter
flengdoc eval
for more information about the available forms.
The "+LOAD" run-time option is similar to "+INTERACTIVE" but
accepts a filename and evaluates the FGHC terms in the file
until the end of the file is reached, using the library operation
"eval:load/{1,2}".
PCN code can be debugged in the same manner, but the terms allowed
in interactive mode must follow FGHC syntax. This basically means
strings are enclosed in single quotes ('...'), variables must
begin with an uppercase letter or underscore ("_") and argument
terms are not evaluated. PCN functions are internally prefixed
with "function " (note the whitespace) and receive an implicit
first argument for the result:
$ cat a.pcn
main() {
writeln(square(3))
}
function square(x) {return(x * x)}
$ fleng -i a.pcn
$ ./a.out +INTERACTIVE
> 'function square'(R, 3), q(R)
9
>
Variables used in the entered terms keep their value, once bound
and can be referenced in terms entered at a later point in time.
IX. Data Representation and Interfacing to C Code
Memory for all data objects is allocated from the "heap", which
holds fixed size "cells". All data is made up from cells and each
thread has its own private heap, data is (usually) never shared
between threads. Unused memory is reclaimed using a reference
counting scheme, where every store of a cell pointer increases the
reference count and any release of a cell reduces the count. Once
the count reaches zero, the cell is reclaimed and stored in the
"freelist" from which new allocations take cells for further use.
Cells have 3 fields: a "tag" and reference count cell, a head and
a tail. The tag specifies what type of object the cell is representing,
the head and tail point to other objects or contain integers.
A datum is either a cell-pointer (of type "FL_CELL *), an integer
of machine word size or a pointer to a string. Integers and strings
are distinguished from pointers by having the lowest and next to
lowest bit set, respectively, which means the cells are always
located at a word size boundary. Integers are representable in the
range of 63 (on 64 bit systems) or 31 bits (on 32 bit machines).
The base type on the C level for all objects is "FL_VAL". Cells
should never be modified destructively.
Strings are "interned" and unique, a string with a given name exists
only once in the entire process.
Depending on type, the fields of a cell have various meanings:
Head Tail
List List head List tail
Real IEEE double Unused*
Variable Value+ Suspensions
Tuple Head/name Arguments (list)
Port Stream|source Unused
Module Internal^ Unused
Ref Source% Unused
Array Box& Parent Array$
*: on 32 bit systems the head + tail fields contain the double
+: points to the variable if unbound
^: internal pointer to module structure, with lowest bit set
%: contains a pointer to a variable in another thread, with the
lowest bit set
&: a raw cell holding a pointer to the storage and length/type
information
$: either the empty list or a reference to the array for which
this array is a "view" (section)
A "ref" is a variable-like object referencing a variable in another
thread.
To call native code from FGHC, Strand or FLENG you can use the
"foreign_call" form, which generates a call to an external function
using the C ABI:
foreign_call(my_func(Argument, ...))
will invoke an externally visible C function with the following
prototype:
void my_func_<N>(FL_TCB *tcb, FL_VAL, ...))
"<N>" is the number of arguments passed to the foreign function,
at most 4 arguments can be passed. The "tcb" argument holds a pointer
to a "thread control block" designating the thread context in which
this foreign function is invoked.
Normally, user code should not manipulate the reference counts of
arguments or results, this is automatically handled by the compiler.
Cell-pointers stored in external memory must have their reference
count increased or may be invalidated when their reference count
is decremented to zero in other code using the value.
Note that while your foreign code executes, the runtime can not
process other processes, the thread is effectively blocked until
the foreign code body has finished and returns to the caller.
Therefore it is important to minimize the execution time and perform
polling of I/O and other resources with the means provided by the
FLENG runtime system ("listen/2", timers and clocks, for example).
Also, if foreign code requires data passed in arguments to be
dereferenced (possibly by waiting until variables are bound), this
should be done on the FGHC/FLENG side, see "deref/2".
To simplify the creation of wrapper code, the "foreign/1" declaration
can be used, allowing the automatic generation of FLENG stub
predicates and associated C wrapper code. "foreign/1" receives a
list of specifications (or a single specification) defining the
name and argument and result types that a foreign function accepts:
-foreign('#include <math.h>'). [FGHC]
-foreign([modf(real, -real, -real)]).
-initialization(main).
main :- modf(1.23, I, F), writeln([I, F]).
A specification can either be a tuple describing a foreign function,
a foreign struct specification, a foreign constant definition or
an atom holding C code that is inserted verbatim into the generated
wrapper code. A function specification gives the name and argument
types that the function expects, where results and "output" arguments
are preceded by a "-". The meaning of the types and their interpretation
follow these rules:
1. If an argument list ends with an output ("-...") argument, then
this argument is taken as the type of the function result.
2. If no output arguments are given, or if the final type is "-void",
then the function is expected to have no return type.
3. Output arguments are passed as pointers to the foreign function
(with the exception of the final return type).
4. Arguments of type "string", "list" and "tuple" are completely
dereferenced before they are passed to the foreign function.
5. All arguments, except output arguments and arguments of type "any"
are "forced", execution will block until they are bound.
6. If no result type is given and the function has no output arguments,
an implicit final argument is added and assigned the empty
list when the function returns.
The wrapper predicate will have the same name as the name given in the
function specification, unless the specification has the form
<localname>=<foreignname>(...)
In this case the FLENG predicate is named "<localname>".
Specification types are mapped to C types according to the following
table:
Type C
integer int
short short
long long
char char
real double also accepts integer as argument
string char *
charlist char * also accepts string as argument
tuple FL_VAL
list FL_VAL
any FL_VAL
pointer void *
pointer(T) T *
void - only allowed as return type
Foreign structs are defined like this:
-foreign(struct(NAME(SLOT1:TYPE1, ...))).
An example should make this clear:
-foreign(struct(vector(-x:real, -y:real, -z:real, moved:integer)).
The "-..." marks read-only slots, so the following predicates will be
generated:
vector_x(POINTER?, VALUE^)
vector_y(POINTER?, VALUE^)
vector_z(POINTER?, VALUE^)
vector_moved(POINTER?, FLAG?, DONE^)
vector_moved(POINTER?, FLAG^)
Setter predicates take a third argument that is unified with the empty
list when the operation is complete.
The "<localname>=<foreignname>(...)" syntax is allowed here as well to
allow using a specific name prefix for the stub predicates.
Finally, foreign constant are defined as follows:
-foreign(const(eof = 'EOF':integer)).
This defines a predicate "eof(EOF^)" that assigns the C EOF code to
the result variable "EOF". The "<Name> =" prefix can be omitted if
the native name shall be used.
Note that PCN does not use the "foreign" declaration, as you can
embed and call C/C++ code directly and also has access to C
preprocessor definitions.
In PCN calls to unknown procedures without explicit module prefix
will be treated as calls to externally linked C code, passing mutable
variables, arrays, strings and floating point values by reference.
The compiler and run-time system will try to ensure appropriate
"boxes" holding immediate values are created as needed, but very
little error checking is done. Passing mutable variables of the
wrong type or modifying non-mutable and non-array arguments
destructively is a recipe for disaster. The mapping of argument
types to C is as follows:
PCN C
string char * (UTF-8 encoding, zero-terminated)
integer long *
mutable variable or array:
long long *
int int *
short short *
char unsigned char *
double double *
term opaque pointer to FLENG object
C functions can modify mutable variables, arrays or references to
array elements, but the effect of changes to arguments that do
not represent is undefined (and is likely to be fatal).
In PCN verbatim code can be directly embedded into the source file
by using the "C" or "C++" code section markers, which must be given
in double quotes, directly at the beginning of the line:
main() [PCN]
long t; {
gettime(t),
writeln(t)
}
"C":
void gettime(long *t) {
*t = time(NULL);
}
Code markers may be "C", "C++" or "PCN". "C++" overrides any
other foreign-language section markers - the wrapper code will
be compiled as C++ if at least one "C++" section exists. "PCN"
will switch back to PCN code. Note that the C preprocessor
will already have processed the code in the foreign language
section.
Verbatim C code and generated C stubs are written to a file named
"NAME~foreign.c" where "NAME" is the basename of the compiled source
file (use the "-foreign" option to override the name). Compilation and
linking of this file is done by the "fleng" compiler script
automatically when producing an executable file. Should the target
file be an object module, then you need to link the
"NAME~foreign.o" file yourself into your final executable.
Compiled FGHC, Strand and PCN code always uses the C ABI and
does not respect C++ name-mangling. If you want to call embedded
C++ code, you must declare it to have C linkage using the
`extern "C"' qualifier.
X. Bugs and Limitations
Maximal number of threads: 64 (32 on 32-bit platforms)
Maximal number of local variables: 64
Maximal size of message sent to other threads: 4096 bytes
Number of uninterrupted goal calls: 10
These settings can be changed using run time options:
Default heap size per thread: 1000000 bytes
Default timeslice: 10
Maximal number of active or suspended goals: 10000
Number of ticks between logging of statistics: 50
- Known bugs and shortcomings:
- [PCN] The C preprocessor does not treat terms as distinct expressions,
so macro expansions will split arguments in surprising ways, e.g.:
#define foo(x) writeln(x)
main() { foo([1, 2]) }
This will fail, as "[1" and "2]" are treated as separate arguments.
As a workaround you can enclose the list in parentheses to make the
preprocessor treat it as a single macro argument. In many cases
you can also use variadic macros to circumvent this restriction:
#define foo(...) writeln(__VA_ARGS__)
:
- Argument pairs are not handled in PCN quantification bodies.
- The generated code is too slow and too big.
- There is currently not mechanism for catching errors, failed
run-time errors (with the exception of certain I/O related
system call failures) result in a termination of the program.
- Some attempt is made to prohibit the creation of circular data
structures, but not all edge cases are likely to be addressed.
XI. Code Repository
A git[8] repository hosting the sources can be found here:
https://gitlab.com/b2495/fleng/
XII. Further Reading
[1] "FLENG Prolog - The Language which turns Supercomputers into
Parallel Prolog Machines"
Martin Nilsson and Hidehiko Tanaka
[2] "Guarded Horn Clauses"
PhD Thesis, Kazunori Ueda
[3] "The Deevolution of Concurrent Logic Programming Languages"
Evan Tick
[4] "Ports of Objects in Concurrent Logic Languages"
https://people.kth.se/~johanmon/papers/port.pdf
[5] "Strand: New Concepts for Parallel Programming",
Ian Foster and Stephen Taylor, Prentice Hall, 1990
This book can also be downloaded here:
http://www.call-with-current-continuation.org/files/strand-book.pdf
[6] "Design of the Kernel Language for the Parallel Inference Machine",
U. Kazunori et al., 1990
[7] "socat"
http://www.dest-unreach.org/socat/
[8] "git"
https://git-scm.com
[9] https://en.wikipedia.org/wiki/Fifth_generation_computer
[10] "Parallel programming with PCN"
Ian Foster and Steven Tuecke
https://digital.library.unt.edu/ark:/67531/metadc1310344/
[11] "A Primer for Program Composition Notation."
Chandy, K. Mani and Taylor, Stephen (1990),
http://resolver.caltech.edu/CaltechCSTR:1990.cs-tr-90-10
[12] "KL1 to C compiler system: KLIC (version 3.011)"
Kazunori UEDA, Waseda University
Publicly available at: https://github.com/GunterMueller/KLIC
[13] "The Joy of Concurrent Logic Programming"
Felix Winkelmann
http://www.call-with-current-continuation.org/articles/the-joy-of-concurrent-logic-programming.txt
XIII. License
This software was written by Felix L. Winkelmann and has been released
into the public domain. Some parts of the code are authored by other
people, see the file "LICENSE" in the source distribution for more
information.
XIV. Contact Information
In case you need help, have suggestions or ideas, please don't
hesitate to contact the author at
felix AT call-with-current-continuation DOT org
Also visit the IRC channel "#fleng" on https://libera.chat/ if
you have questions or need advice.