Rationalized APL
I.P. Sharp Research Report #1

Certain aspects of conventional APL (as defined in the IBM publication APL Language [1]) violate some of the fundamental characteristics of the language, and the definitions of some other aspects are too limited to provide clear guidance for systematic extensions. The present paper proposes convenient replacements for the anomalous facilities, and a systematic scheme for extensions, a scheme that does not invalidate the facilities defined in Reference 1, nor, indeed, those incorporated in many other systems, specifically SHARP APL. The main aspects to be treated are:
Items in the foregoing list can be used to illustrate the main point. For example, the indexing a[i;j] cannot (without the use of ⍎ ) be applied to an array a of arbitrary rank, the list expression (i;j) must be treated as a monolithic whole (i.e., not as a sequence of normal APL functions), and a name cannot be assigned to the resulting list. The reduction and inner and outer product operators are limited to scalar functions and could be extended in a variety of ways. For example, the reduction f⌿a could (as it is in [2]) be assumed to apply f over each vector along the leading axis of a , or it could (as defined in [3]) apply f over the set of subarrays obtained by splitting a along the first axis. Both are consistent with the conventional definition, but only the latter allows the matrix product over a rank three array to be expressed as +.×⌿a . A section will be devoted to each of the eight items listed above, and additional sections will treat a number of new functions and operators defined according to the schemes developed in the earlier sections. Two appendices summarize the ranks assigned to existing primitive functions (to systematize all extensions to higher-rank arrays), and the definitions of dyadic operators. Any problems attendant upon the use of implicit arguments such as ⎕io are exacerbated by the more general uses of operators. Consequently, all new functions and operators to be defined will be independent of ⎕io , and will be defined in origin 0 . Moreover, 0-origin will be used throughout this paper. Consider, for example, the operator ⍥ which we will define so that the function g←f⍥i applies f to its argument after transposing to the rear end those axes specified by i . If the operator is independent of ⎕io (and uses fixed origin 0), then a function g that moves the leading axis to the rear is obtained from the expression g←f⍥0 . 
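The behaviour of such an axis-moving operator is easy to model outside APL. The following Python sketch (an illustration only; arrays are represented as nested lists, and the names shape, at, and to_rear are inventions of this sketch, not part of any proposal) performs the transposition that f⍥i would apply to its argument before invoking f, with fixed origin 0:

```python
def shape(a):
    """Shape of a nested-list array, e.g. [[0,1,2],[3,4,5]] has shape [2, 3]."""
    s = []
    while isinstance(a, list):
        s.append(len(a))
        a = a[0]
    return s

def at(a, idx):
    """Select the element of a at index vector idx."""
    for k in idx:
        a = a[k]
    return a

def to_rear(i, a):
    """Move the axes listed in i to the rear of a, as f⍥i would
    before applying f (fixed origin 0)."""
    s = shape(a)
    perm = [k for k in range(len(s)) if k not in i] + list(i)
    def build(d, fixed):
        if d == len(s):
            idx = [0] * len(s)
            for ax, k in zip(perm, fixed):
                idx[ax] = k
            return at(a, idx)
        return [build(d + 1, fixed + [k]) for k in range(s[perm[d]])]
    return build(0, [])

a = [[0, 1, 2], [3, 4, 5]]        # shape 2 3
print(to_rear([0], a))            # leading axis moved to the rear: shape 3 2
```

A function g behaving like f⍥0 is then simply `lambda w: f(to_rear([0], w))`; no ⎕io enters the definition.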
However, if dependence on ⎕io were to be adopted, which value of ⎕io should apply, that in effect when g is specified, or when it is used? If the former, what advantage does the dependence offer, and if the latter, how could one specify the desired behaviour in the unknown environment? The reader is assumed to be familiar with the functions enclose, disclose, and match (< > ≡) as defined in [4]. Moreover, we will assume that the symbol ∘ denotes the constant <'' ; that is ∘≡<'' . This provides a convenient notation for a “fill” element appropriate to a non-simple array, just as 0 and ' ' provide fill elements for simple arrays. Moreover, it rationalizes the use of the symbol ∘ in the outer product, so that the expression a.b may be used with a←∘ (or a←<'' ) and b←× , just as it may be used with a←+ and b←× . An interpreter for the language as extended here (written in APL) has proved invaluable in developing the design, and it has been used to produce the examples used herein. Any reader having access to this interpreter (in workspace 705 model on the I.P. Sharp system) will find it helpful in studying the present paper. An early version of the interpreter has been presented in [5]; except as it may prove useful in the treatment of language questions, the interpreter itself will not be discussed here. To obviate repetitions of phrases such as “if a is an array of rank 3 ”, we will adopt the following convention: a0 , b0 , etc., denote scalars, a1 , b1 , etc., denote vectors, and so forth. Appendices. The body of the paper is devoted to definitions of the proposed extensions, and does not include examples of their use. Such examples are provided in Appendix C, and many readers will find it more fruitful to go directly to this appendix before studying the complete definitions. In order to make such a course feasible, simplified definitions are included in the appendix. 
Appendices A and B show a table of ranks assigned to existing primitive functions, and a table of the new dyadic operators proposed. A final appendix (D) provides a general comparison of APL2 [3] with a restricted subset of the proposals in this paper. A. Indexing and Indexed Assignment The enclose function as defined in [4] has made it possible to produce by straightforward APL functions the “index lists” required in indexing expressions of the form a[i;j] , and therefore makes it possible to define a corresponding indexing function, which will be denoted by { and called from: i{a ←→ a[>i[0];>i[1]; ...] Since the disclose function > is permissive, the selection of any single element of a can be written without enclosures as, for example, 1 2 3{a3 . Moreover, the left rank of { is 1 and its right rank is infinite, so that (as discussed in Section F) a simple left argument i of rank greater than 1 produces an array of shape ¯1↓⍴i of elements chosen by the index vectors along its last axis, yielding what is sometimes called “scattered” indexing. For example: (3 2⍴⍳6){a2 ←→ a2[0;1],a2[2;3],a2[4;5] Each index >i[k] must be of the form s or <s , where s is a simple array. If >i[k] is of the form <s , then the indexing along axis k is complementary, selecting all elements except those whose indices occur in >>i[k] . In particular, if i[k]≡<<⍳0 , the entire axis is selected. In forming the left arguments of the indexing function, it will often be convenient to use the link function ⊃ defined as follows:
For example, (2 3⊃4⊃∘⊃5 6){a4 ←→ a4[2 3;4;;5 6] . The indexing function { as defined thus far provides all of the facilities provided by conventional indexing, and “scattered” and “complementary” indexing as well. Its power is further enhanced by allowing negative indexing (with the completely disclosed element e in position k , either >k{i or >>k{i , replaced by (k{⍴a)|e ) and abbreviated indexing (with i{a equivalent to (i,((⍴⍴a)-⍴i)⍴<∘){a ). For example, 2 3{a3 ←→ (2⊃3⊃<∘){a3 and, if ⍴a is 2 3 4 , then 1 ¯1 1 {a ←→ a[1;2;1] . Each index j along an axis of length n is limited by (j<n)∧(n≥-j) . Indexed assignment of the form a[j;k]←b is anomalous, or at least awkward; the result of the expression (as received by c in c←a[j;k]←b ) is merely b , and the merged result normally wanted must be obtained by a separate reference to a . What is desired is a merge of a and b controlled by an index i and by the particular selection function employed. A simple solution is provided by the with operator ¨ (described in [6] as a¨f x ←→ a f x ) combined with the observation (made by Arthur Whitney) that a dyadic case of the derived function i¨{ could be defined to produce the desired merge. Thus if: r←b i¨{ a then (⍴r)≡⍴a , and b≡i{r , and “the rest” of r agrees with the rest of a . Thus: a2[j;k]←b ⋄ a2←⍉a2 ←→ a2←⍉b (j⊃k)¨{a2 In order to avoid the questions raised by expressions such as x[1 1]←2 3 , it might be desirable to restrict the domain of the expression b i¨{a to a “proper” index i containing no repeated elements. On the other hand, a permissive definition may be harmless, and could probably be implemented more easily. B. Name Assignment In conventional APL, the arrow used for assigning a name to a variable is not permitted for naming functions. Instead, a function name is assigned indirectly by characters embedded in its representation, and there is no direct way for assigning a name to a derived function such as +.× . 
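The merge b i¨{ a can likewise be sketched in Python for simple integer indices: the result has the shape of a, the positions named by i hold b, and the rest agrees with a. The function names are again inventions of this sketch:

```python
import copy

def merge_at(r, iv, item):
    """Overwrite the position of r named by index vector iv with item."""
    for k in iv[:-1]:
        r = r[k]
    r[iv[-1]] = item

def merge(b, i, a):
    """Sketch of b i¨{ a : a copy of a with i{result matching b."""
    r = copy.deepcopy(a)
    if i and isinstance(i[0], list):          # several index vectors, with matching items of b
        for iv, item in zip(i, b):
            merge_at(r, iv, item)
    else:
        merge_at(r, i, b)
    return r

a2 = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print(merge(99, [1, 2], a2))                  # one element replaced
print(merge([97, 98], [[0, 0], [2, 1]], a2))  # "scattered" merge
```

Note that a itself is untouched; the merged array is the explicit result, which is just the behaviour wanted from c←b i¨{ a.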
The solution is to allow the arrow to assign a name to any entity — variable, function, or operator (as in x←⍳5 and ip←+.× and red←/ ) — and to provide means for function definition that do not embed the function name in its representation. This is discussed in Section D. C. Function Valence In conventional APL, functions are generally considered to be ambivalent, with the actual valence of application being determined by context. Many, however, are assigned a fixed valence — some primitives (e.g., monadic ~ and dyadic ∊ ), most derived functions (with one exception, ⌽[i] ), and all defined functions. The introduction of ambivalent defined functions in many APL systems has proved very convenient to the user, and the admission of ambivalent derived functions opens up many new possibilities, one of which has been exploited in the use of -.×m for the determinant [7]. It is now proposed that (except for niladic functions, which must be treated separately), all functions be assumed to be ambivalent, and that any undefined case (such as ∊x) be treated as a proper function which has an empty domain. An unfortunate consequence of this is that some change will occur in the behaviour of existing programs, but only in replacing a syntax error message by a domain error message, or (as in the case of ~/3 ) by a result. However, this is a single, general change that would prevent further changes occasioned by each function extension introduced. Niladic functions must, of course, retain their fixed valence. A further consequence of the ambivalence of all functions is that no operator can have its behaviour defined in terms of the valence of the function to which it is applied, since all have the same valence. 
This matter was anticipated in [4] in the definition of two distinct composition operators, one to apply a function f dyadically to the results of monadic applications of g (as in ⍺ f⍤g ⍵ ←→ (g⍺) f (g⍵) ), and one to apply f monadically to the result of a dyadic application of g (as in ⍺ f⍥g ⍵ ←→ f ⍺g⍵ ). D. Function Definition Although the operators are the entities generally used to produce functions in APL, the only language facility (as opposed to the ∇-editing facility based upon it but not usable in programs) provided for producing defined functions (i.e., ⎕fx ), is itself a function, and one which produces as an explicit result only the name of the function, producing the function itself as a side-effect. Moreover, although many APL systems allow ⎕fx to produce ambivalent functions, none provide true ambivalence, with, for example, independent localization of variables for the two cases. The proposed replacement for ⎕fx is a modification of the direct definition operator ∇ defined in [8], and is defined as follows:
A function produced by the operator ∇ may be assigned a name (as in f←m∇d or in a(f←m∇d)b ), but it may also be used without assigning a name, as in y←''∇'⍺+÷⍵'/x . E. Syntax and Order of Evaluation Although the order of execution of functions and operators in conventional APL is well-defined, the order of evaluation of names is not, and ambiguity therefore occurs not only in an expression such as (a←2)÷a←3 , but also (if a is a shared variable) in the simpler expression a÷a . The matter is further complicated by allowing the direct assignment of names, as in (f←3×⍳4)(f←×.-)f←⍳4 . Although the precedence of operators over functions is established in conventional APL, the fact that operators are not applied to derived functions implies that the syntax and order of execution of a sequence of operators is not established. The syntax and order of evaluation and execution that is now widely accepted (and is proposed here) may be summarized as follows:
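The combination of ambivalence (Section C) and definition from character representations can be imitated in Python, with eval standing in for the role of ∇ and the names alpha and omega standing for ⍺ and ⍵. This is a loose analogue, not a model of the proposed operator; the function define is an invention of this sketch:

```python
def define(monadic, dyadic):
    """Rough analogue of f←m∇d : one ambivalent function built from two
    expression strings. An empty string leaves that valence undefined,
    i.e. a proper function with an empty domain."""
    def f(omega, alpha=None):
        src = dyadic if alpha is not None else monadic
        if not src:
            raise ValueError("domain error: this valence is not defined")
        return eval(src, {}, {"alpha": alpha, "omega": omega})
    return f

f = define("-omega", "alpha + 1 / omega")   # in the spirit of '-⍵' ∇ '⍺+÷⍵'
print(f(4))             # monadic case: -4
print(f(4, alpha=3))    # dyadic case (alpha is the left argument): 3.25
```

As in the proposal, the function need not be named to be used: define("", "alpha + 1 / omega") may be applied directly, and its monadic case raises the analogue of a domain error.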
More precisely: 1. The tokens (i.e., the indivisible names such as abc or ¯2.4e5 or + or ¯2.4e5 3 2 ) are examined in order from right to left; each is evaluated immediately and the representation of the result (whether variable, function, or operator) is appended at the beginning of a (right) stack vector and is immediately dissociated from the name whose evaluation yielded it. 2. The evaluation of tokens may be interrupted by the execution of a function or operator in the stack applied to one or two adjacent arguments in the stack. 3. The syntax and order of execution are further specified by a set of rules that determine what segment of the stack is to be executed, or whether evaluation of tokens is to proceed. The rules are based upon the class (variable, operator, function, assignment arrow) of each of the first four elements of the stack. 4. A right parenthesis invokes the execution of the entire expression up to the matching left parenthesis, the result being then transferred to the right stack. 5. Because conventional APL uses name assignment in the form ab←x rather than in the possible, but more awkward, form 'ab'←x , rule 1 above must be modified as follows: if the leading element in the right stack is an assignment arrow, the next token is transferred directly without evaluation. 6. Items 4 and 5 conflict in the case (n)←3 , a conflict resolved by giving precedence to item 4. The consequence (as elaborated in [5]) is to provide “indirect” assignment. For example, if n←'asd' and (n)←3 , then n retains the value 'asd' , and asd becomes 3 . 7. 
The rules referred to in item 3 are summarized in the following table taken directly from the interpreter:

vmf←l vf d vf L   S(1↑R),(R[2]RE R[1 3]),4↓R
vmf←l v f v L     S(1↑R),(R[2]RE R[1 3]),4↓R
vmf←l vf m L      S(1↑R),(R[2]RE R[1]),3↓R
mf←l f v L        S(1↑R),(R[1]RE R[2]),3↓R
d v f v L         S(2↑R),(R[2]RE R[3]),4↓R
v ← vmdf r L      S,R[0] IS R[2]
l vmdf r          R[1]
vmdfr             (>''⍴Z)S(1↓Z←L LE 0),R
←                 (>''⍴Z)S(1↓Z←L LE 1),R   R not single∊

Each of the first four columns above specifies the classes to which the corresponding element of R may belong to invoke the corresponding action specified by the final column. The symbols v , m , d , f , ← , l , and r denote the classes variable, monadic operator, dyadic operator, function, assignment, left (a filler introduced when the left stack is exhausted), and a right filler implicitly introduced to extend a short right stack to four elements. Thus the first line specifies the action to be taken (replacing elements 1 , 2 , 3 of the stack by the result of applying element 2 [a dyadic operator] to elements 1 and 3 ) in the case that R[0] belongs to any one of the classes vmf←l , and R[1] belongs to vf and R[2] belongs to d , and R[3] belongs to vf . No special treatment is required for a niladic function (represented in the usual way) since it is executed (exactly as a variable is) before being transferred to the right stack. Further entries could be added at the head of the syntax table to catch invalid sequences in the right stack as soon as they occur, rather than allowing them all to appear as a single error due to the fact that the stack has the wrong form upon termination. F. Extensions to Higher Rank Arrays In conventional APL, the scalar functions (which apply to scalar elements and produce scalar results) extend to higher rank arrays according to simple general rules; no corresponding general rules exist for the remaining so-called mixed functions. 
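The essential point of the scheme, right-to-left examination of tokens with execution interrupting evaluation as soon as a reducible pattern appears at the head of the stack, can be sketched in a few lines of Python. The sketch below handles only variables and dyadic functions of equal precedence (the full rules, as in the table, inspect four stack elements to resolve monadic and dyadic cases; that refinement is omitted here), and the names are inventions of this sketch:

```python
def evaluate(tokens, fns):
    """Minimal sketch of right-to-left evaluation: tokens are examined
    from the right, and a dyadic function at the head of the stack is
    executed as soon as both its arguments are adjacent to it."""
    stack = []
    for tok in reversed(tokens):
        stack.insert(0, tok)                  # append at the beginning of the right stack
        while (len(stack) >= 3
               and isinstance(stack[0], (int, float))
               and stack[1] in fns
               and isinstance(stack[2], (int, float))):
            stack[:3] = [fns[stack[1]](stack[0], stack[2])]
    assert len(stack) == 1, "ill-formed expression"
    return stack[0]

fns = {'+': lambda a, b: a + b, '*': lambda a, b: a * b}
print(evaluate([2, '*', 3, '+', 4], fns))     # 2*(3+4) = 14, not (2*3)+4
```

The example shows the characteristic consequence of the scheme: all functions have equal precedence, and 2*3+4 evaluates as 2*(3+4).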
We will now define a systematic scheme of extensions based on two attributes of a function: rank, and conformance. Rank. For a general function f , the axes of an argument ⍵ will be split at some point k such that f is applied to each subarray of shape k↓⍴⍵ , producing a result of shape (k↑⍴⍵),sir , where sir is the (necessarily common) shape of the individual results produced by applying f to each subarray. Each function f is assigned a monadic rank mf which determines the value of k as follows:
Consequently, a positive or zero value of mf determines the rank of the subarrays to which f applies, and a negative value determines the complementary rank, the number of leading axes excluded in each application of f . Because of the definition of k , the rank of a function imposes only an upper limit on the rank of the subarrays to which f applies, and does not prohibit the application of f to degenerate cases of lower rank. For example, the function ⌹ has rank 2 , but if its argument is a vector it is applied to the vector. Because the function ⌹ is defined on vectors, it yields a result, although another rank 2 function, such as the determinant, might be defined to yield a domain error in such a degenerate case. Similar remarks apply to the dyadic case, with each function being assigned independent left and right ranks, and yielding outer shapes ol←kl↑⍴⍺ and or←kr↑⍴⍵ . The agreement required between these outer shapes will be discussed under the heading “conformance”. Certain functions, such as ravel, are of infinite, or unbounded rank; consistently with the foregoing definition, such a rank will be assigned an infinite value (denoted by ¯ ). The utility of the notion of rank will now be illustrated briefly by introducing the rank operator, which yields a derived function of specified rank. Thus f⍤2 ¯1 1 yields a derived function of monadic rank 2 , of complementary left rank 1 , and right rank 1 . Moreover, f⍤r ←→ f⍤(⌽3⍴⌽r) , so that abbreviated forms such as f⍤2 (all ranks are 2 ), and f⍤2 3 (left rank is 2 , and the monadic and right ranks are 3 ) may be used. For example, ,⍤2⍵ ravels the last two axes of the argument, ,⍤¯2⍵ ravels all but the first two, and ,⍤0⍵ adds a final unit axis to the axes of the argument. A necessary first step in establishing the use of ranks is to define the ranks of all existing functions. 
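The frame-and-cell behaviour of the monadic rank operator, for non-negative ranks, can be sketched in Python on nested-list arrays; the names rank_op and rank_of are inventions of this sketch:

```python
def rank_of(w):
    """Rank (number of axes) of a nested-list array."""
    n = 0
    while isinstance(w, list):
        n, w = n + 1, w[0]
    return n

def rank_op(f, r):
    """Sketch of the monadic case of f⍤r for non-negative r:
    apply f to each rank-r cell, keeping the frame of leading axes."""
    def apply(w, depth):
        if depth <= r or not isinstance(w, list):
            return f(w)                       # a cell (possibly degenerate, of lower rank)
        return [apply(x, depth - 1) for x in w]
    return lambda w: apply(w, rank_of(w))

a3 = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]     # shape 2 2 2
print(rank_op(sum, 1)(a3))                    # like +/⍤1 : sum each last-axis vector
```

Each rank-1 cell of the rank-3 argument is summed, and the frame of the two leading axes survives in the result, so the shape of the result is the frame shape 2 2 followed by the (empty) shape of each individual result.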
Because of the treatment of degenerate cases, it is possible to do this in a manner compatible with existing definitions by simply assigning infinite ranks to all. However, it is generally more useful to assign the lowest rank which will ensure compatibility with existing definitions, since this leads to simpler definitions of the functions, and greater utilization of the general scheme of extension. In particular, the scalar functions can, and will, be assigned rank 0 . The argument for assigning to a function f the lowest compatible rank is somewhat weakened by the fact that a corresponding function of lower rank r is so easily obtained from the expression f⍤r . Thus, although ⌽ could be assigned monadic rank 1 , it might instead be assigned infinite rank, so as not to make it dissimilar to ⊖ , which must be assigned infinite monadic rank. Appendix A shows a table of the proposed rank assignments, together with notes on the less obvious choices. The scalar functions are not included in the table. Conformance. In the application of a dyadic function f , the outer shapes ol and or are each split into two sets of axes (called bound and free) such that ol=bl,fl and or=br,fr ; the shape of the overall result is b,fl,fr,sir , where b is one of bl and br , and sir is the shape of the individual results of applying the function to its cells. A shape s is said to be single if 1=×/s ; if one of bl and br is single, then b equals the other; if both are, then b equals the one of greater length; if neither is single, then bl and br must agree, and b is chosen as either one. The number of bound axes (that is, the lengths of bl and of br ) is limited by the conformance of the function f ; all primitive functions have infinite conformance, and the conformance operator, denoted by f.k (where k is a non-negative integer), produces a derived function equivalent to f but having conformance k . For example:

      ⍴(a←2 3 4⍴⍳24)×. 0(b←2 3 5⍴⍳30)
2 3 4 2 3 5
      ⍴a ×. 1 b
2 3 4 3 5
      ⍴a ×. 2 b
2 3 4 5
      ⍴a ×. 3 b
length error
      ⍴a×. 1 (1 1 4⍴9)
2 3 4
      ⍴a×. 2 (3 4⍴9)
2 3 4
      ⍴(1 1 1⍴8)+1 1⍴9
1 1 1

The case of zero conformance ( f. 0 ) is equivalent to outer product ( ∘.f ). Except for the conformance operator itself, every operator produces derived functions having infinite conformance. If 1=×/s , then the shape s is said to be (the shape of) a singleton. Singleton bound shapes are given special treatment that formalizes, and extends to non-scalar functions, the use of scalar and other singleton arguments in expressions such as 2 3 + 4 and 2 3 + ,4 and 2 3 + 1 1 1 1⍴4 . Thus, if one of bl and br is a singleton, conformance is satisfied and the bound part of the shape of the result is determined by the other; if both are singletons, then the bound part of the shape is determined by the one of larger rank. For example, ⍴(1 1 1⍴4)+1 1⍴2 is 1 1 1 . G. Operators and Nonscalar Functions The reduction operator can, as remarked in the introduction, be consistently extended to non-scalar functions in at least two different ways, and similar variety is possible for the remaining conventional operators. We will now present a systematic scheme of extension (based upon the notion of function rank) which applies to all of the existing operators except the (bracket) axis operator. This operator is excluded because of its anomalous syntax, and because its purpose can be conveniently served by the rank operator, sometimes in conjunction with the transpose operator. The cases of reduction are straightforward: f⌿⍵ splits the argument ⍵ along the leading axis and applies f between the resulting subarrays; f/⍵ splits along the final axis. The outer and inner products are conveniently defined in terms of conformance:

      ⍺ ∘.g ⍵ ←→ ⍺ g. 0 ⍵
      ⍺ f.g ⍵ ←→ f⌿(¯1⍥⊢⍺)g. 1 ⍵

(The transpose operator ⍥ is fully defined in Section K; the effect of ¯1⍥⊢⍺ used above is to move the last axis of ⍺ to the leading position). The left and right ranks of the derived inner product are infinite. 
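For vector arguments the shape of this definition of inner product is easy to see in a Python sketch: g is applied itemwise (conformance 1 binding the single leading axis), and f is then applied between the items, grouping from the right as APL reductions do. The names are inventions of this sketch:

```python
from functools import reduce
import operator

def reduce_first(f, cells):
    """Sketch of f⌿ over a list of cells: f applied between the items,
    grouping from the right, so f⌿ a b c is a f (b f c)."""
    return reduce(lambda acc, x: f(x, acc), reversed(cells[:-1]), cells[-1])

def inner(f, g, alpha, omega):
    """Sketch of ⍺ f.g ⍵ for vectors: itemwise g, then f⌿."""
    return reduce_first(f, [g(a, w) for a, w in zip(alpha, omega)])

print(inner(operator.add, operator.mul, [1, 2, 3], [4, 5, 6]))   # +.× → 32
print(reduce_first(operator.sub, [1, 2, 3]))                     # -⌿ : 1-(2-3) = 2
```

The second example illustrates why the right-to-left grouping matters: -⌿ of 1 2 3 is 2, the alternating difference, not the left-to-right value ¯4.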
A notable result of this definition of inner product is that (because of the permissive treatment of the conformance of singletons) the conventional permissive definition of inner product is preserved. For example, if 3 2 4≡⍴a and 1 5 6≡⍴b , then a f.g b ←→ a f.g 4 5 6⍴b . A scan applies reduction over each of a set of partitions of one axis of the argument, and the result therefore includes a supernumerary axis corresponding to this set. The scan f\⍵ partitions along the last axis, and places the supernumerary axis last, after the (necessarily common) shape of the individual reductions. The scan f⍀⍵ partitions along the first axis, and places the supernumerary axis first, before the common shape of the individual reductions. Supernumerary axes are discussed more fully in Section K. H. Miscellaneous This section treats three matters: a rationalization of the jot (∘) used in the outer product and of the type attribute of an array used in expansion and overtake, and some illustrations of the assertion (made in Section G) that the rank operator can fulfill the functions of the anomalous bracket axis operator. The type attribute. For most purposes an APL array a is completely determined by its shape (which can be obtained by applying the function ⍴ ) and by the value of each of its elements (which can be obtained by indexing), and the array incorporates no other attributes which determine the result of applying functions to it. However, conventional APL includes two functions, the overtake (↑ with a sufficiently large left argument), and the derived function expansion (u\ , where u is boolean) which do depend on an attribute associated with neither the value of an element nor the shape: if n←⍳0 and c←'' , then n and c agree in shape and differ in no element, but 0\n and 0\c produce different results (,0 and ,' '), as do 1↑n and 1↑c . 
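The hidden dependence can be made visible by writing overtake with the filler as an explicit parameter, which is precisely the role the type attribute plays implicitly in conventional APL. A Python sketch for vectors (the name overtake is an invention of this sketch):

```python
def overtake(n, a, fill):
    """Sketch of n↑a for a vector, with the filler passed explicitly
    rather than drawn from a hidden 'type' attribute."""
    return (a + [fill] * n)[:n]

print(overtake(3, [], 0))              # 3↑⍳0 with numeric fill: [0, 0, 0]
print(overtake(3, list('ab'), ' '))    # 3↑'ab' with blank fill: ['a', 'b', ' ']
```

With the filler explicit, the empty numeric vector and the empty character vector need no distinguishing attribute at all; the caller supplies the information the type attribute was invented to preserve.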
This anomalous behaviour of overtake and expansion arose as follows: they are essentially “selection” functions, but they depend upon a parameter not explicitly specified by their arguments, namely, the filler quantity to be inserted in each of the “new positions” produced. Moreover the domain of the catenate function was limited to arguments of the same class (numeric or character) so as to preclude the introduction of heterogeneous arrays of mixed classes. Consequently, the filler could not be a fixed quantity, but had to be drawn from the same class as the argument of the overtake or expansion function. For any argument a that contains at least one element (i.e., 1≤×/⍴a ), the filler can be determined from the class of any one of the elements, but for an empty array, no element can provide this information. The definition of overtake and expansion on empty arrays therefore required either the choice of a single filler for all, or the introduction of an attribute, usually referred to as the type of an empty array. In conventional APL, arguments for the convenience provided by the type attribute (in maintaining the class of an argument in expressions such as u\u/a [for 0=∨/u ] and k↑j↓a [for j≥⍴a ]) won the day. Although compatibility requires that the type attribute be maintained to continue the present behaviour of expansion and overtake, it does not require any further extension of its use in the definition of new or existing functions and operators. There are two classes of argument against further extension of the use of the type attribute of empty arrays: the complexity of the rules required for extension, and the possibility of providing the same (or nearly the same, or greater) convenience aimed at in the design of conventional APL, by means not then available. Some of the complexity can be glimpsed from the following:
Further indications of the complexity can be seen in the rather extensive discussions of types and prototypes in APL systems [2, 3]. The introduction of dyadic cases for derived functions has made it easy to provide extra parameters, as in the merge produced by the expression b i¨{ a . Moreover, an associated dyadic case of compression as discussed in Section K provides for the specification of fillers for expansion. Not only would 0 u\ n and ' ' u\ c specify the normal fillers for vectors n and c (empty or not), but 1 u\ n and '*' u\ c would provide useful generalizations. Since take is a selection function, a dyadic case b j¨↑ a could be defined to provide a merge analogous to b j¨{ a . Moreover, the derived function ↑¨a (defined by ↑¨a i ←→ i↑a ) could be provided by an associated dyadic case: thus i ↑¨a b would employ a filler specified by b . Since the “new positions” need not even form a rectangular array, b should be a scalar, that is, the right rank of ↑¨a would be 0 . In conclusion:
The bracket axis operator. The following remarks about the bracket axis operator are based upon unpublished suggestions by Arthur Whitney:

      +⍀⍤¯2⍵ ←→ +⍀[2]⍵ ←→ +\[2]⍵
      ⊖⍤¯2⍵ ←→ ⊖[2]⍵ ←→ ⌽[2]⍵
      ⍺,¨⍉⍤¯2⍵ ←→ ⍺,[2]⍵
      ⍺,¨<⍤¯2⍵ ←→ ⍺,[1.5]⍵

In the foregoing identities, 0-origin indexing is assumed, and the identities do not hold for all degenerate cases of catenation that involve arguments of differing ranks. The function ,¨⍉ catenates along the leading axis, and ,¨< produces lamination along a new leading axis. Note that lamination to introduce a new axis at any position can be achieved without the use of fractional indices. I. New Operators This section treats only those new operators that will be used directly in the definition of new functions in Section J; others are deferred to Section K. To facilitate distinctions between the cases distinguished by argument types (function-function, function-variable, variable-function, and variable-variable), we will use the names f and g exclusively for functions. This section treats three dyadic operators (denoted by ⍤ , ⍥ , and ¨ , and called on, upon, and with), although the cases f¨i , i¨f , and i¨j are deferred to Section K. It also treats the monadic operator ⊂ , called con. The monadic operator ⊂ is defined as follows: ⍺f⊂⍵ ←→ ⍵f⍺ , and the right and left ranks of f⊂ are therefore the left and right ranks of f . Moreover, the monadic case f⊂⍵ is the function inverse to f ; the definition of this inverse (including its rank) is an attribute of the function f , that is, it is part of the definition of f . For example, the inverse of the scalar function * is the scalar function ⍟ , and the inverse of the infinite rank function < is the rank 0 function > . The name con (meaning in opposition to) is chosen to suggest both the function inversion of the monadic case, and the argument inversion of the dyadic case; the symbol ⊂ resembles the letter c, the initial letter of both con and commute. 
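Both cases of ⊂ are easy to model with higher-order functions. In the sketch below (Python; the names con and inverses are inventions of this sketch) the inverse is supplied by a small table standing in for the attribute that, in the proposal, is part of the definition of each function:

```python
import math

# In the proposal the inverse is an attribute of the function itself;
# this table stands in for that attribute (an assumption of the sketch).
inverses = {math.exp: math.log}

def con(f):
    """Sketch of f⊂ : applied dyadically, the arguments are exchanged
    (⍺ f⊂ ⍵ ←→ ⍵ f ⍺); applied monadically, the inverse of f is used."""
    def g(omega, alpha=None):
        if alpha is None:
            return inverses[f](omega)      # monadic: the function inverse to f
        return f(omega, alpha)             # dyadic: omega becomes the left argument
    return g

sub = lambda a, w: a - w                   # dyadic - , left argument first
print(con(sub)(2, alpha=10))               # 10 -⊂ 2 ←→ 2 - 10 = ¯8
print(con(math.exp)(1.0))                  # monadic *⊂ ←→ ⍟ : log 1 = 0
```

The single derived function thus carries both meanings of the name con: commuted arguments in the dyadic case, and the inverse function in the monadic case.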
The composition f⍤g is defined as in Section C, with the added proviso that the composition is “close” in the sense defined in [6]. In terms of function rank, this proviso may be stated very simply by declaring that the three ranks of the derived function f⍤g each equal the monadic rank of g . The “dual” f¨g has the same ranks as f⍤g , the definitions being (g⊂)f g⍵ and (g⊂)(g⍺)f g⍵ . The derived function f⍥g has the same ranks as g . The cases f⍤i and f⍥i and i⍥f will be defined so that together they provide the facility provided by the axis operator defined in Reference 4, but in a simpler and more flexible manner. The case f⍤i was defined in Section F, where it was referred to as the rank operator. The case f⍥i⍵ applies f to a transpose of ⍵ that moves axes i to the tail end. For example, if ⍴⍵ is 3 4 5 6 7 , then ⍴⊢⍥2 1⍵ is 3 6 7 5 4 , and ⍴⊢⍥0⍵ is 4 5 6 7 3 . (The identity function ⊢ is defined in Section J.) In the general case, the right argument of ⍥ has three elements, one to specify the transpose to be applied to each of the argument cases: monadic, left, and right. Each is enclosed, and a simple or abbreviated argument is extended as follows: f⍥i ←→ f⍥(⌽3⍴⌽⊃i) . Negative indexing of axes can be used, as, for example, in ⊢⍥¯2⍵ to interchange the last two axes. The case i⍥f is defined similarly, except that the indicated axes are moved to the front. In conjunction with the identity function, the operator ⍥ can, as illustrated above, provide transpositions that would be much more awkward to write using the transpose function ⍉ . Derived functions produced by the function definition operator ∇ are defined to have infinite rank. J. New Functions This section defines a number of new functions, most of which have been proposed in earlier papers. The main point is to illustrate the ease of definition provided by the joint use of direct definition and rank specification. The symbol ¯ is used to denote infinity:
The monadic case defined for ∊ is the distribution function defined in [6]; the function ≡ used in its definition is defined in [5], and yields a 1 if its arguments agree. The dyadic case of ~ is the set difference function. The monadic case of { is the Cartesian product over the enclosed elements of its argument, and yields a result whose shape is the catenation of the shapes of (⍴¨>⍵),<⍴⍵ . For example:

      { 1 2 ⊃ 3 4 ⊃ 5
1 3 5
1 4 5
2 3 5
2 4 5

The self-reference (∆) in the definition of the dyadic case is used to define the from function discussed in Section A; except for the complementary indexing provided by the use of doubly enclosed elements, this definition is complete. K. Further Operators The operators discussed in this section are not essential to the rationalization program for existing functions and operators carried out in Sections A-H. They do, however, constitute important extensions, and introduce one further notion, the placement of “supernumerary” axes produced in scans and similar “partitionings” along an axis. They also help to illuminate the mnemonic schemes employed in the choice of operator symbols and in the disposition of operator arguments. Because of the large number of functions required in a language, the adoption of mnemonic schemes (such as the use of a single symbol for related monadic and dyadic cases, for application to arrays of any rank and class, real or complex, as well as the use of a single operator with existing functions to obtain sums, products and maxima over arrays) can be an important aid to the user. Because a large number of operators will also be required, care should also be exercised in assigning symbols to them. 
Since each argument of a dyadic operator may belong to either of two classes (function or variable), and since each derived function produced is ambivalent, each operator symbol can produce as many as eight classes of functions; various kinds of similarities can be employed to group these functions in mnemonically useful ways. For example, the symbol ⍤ is called on, and the first entry in Appendix B shows the use of different interpretations of the meaning of “on”: the case f⍤g is the application of f on the result of g (called composition in mathematics); the case f⍤i is the application of f on each of the pieces obtained by splitting the argument into subarrays of ranks determined by i ; the case i⍤f is the application of f on pieces obtained by splitting the argument along an axis or axes in a manner determined by i ; and i⍤j is the constant function whose value is i , applied on subarrays of ranks determined by j . In cases such as f⍤i and i⍤f , it is preferable to assign f⍤i to the more commonly used case, because f itself will often be a derived function, and an expression such as i⍤(+.×) requires parentheses, whereas +.×⍤i does not. The symbol ⍥ in the second entry of Appendix B is called upon: the case f⍥g denotes application of the monadic function f upon the result of the dyadic function g , the larger circle (as compared to ⍤ ), indicating the larger valence in the application of g ; the cases f⍥i and i⍥f concern the application of f upon the result of a transpose (a function that also employs ○ in its symbol), the transpose moving the axes designated by i to the front if i precedes the operator, and to the back if it follows. Some of the relations among the cases of the operator denoted by the dot can be appreciated only by understanding that the inner product f.g is (except for a transposition of the left argument) a very special case of tensor contraction produced by combined use of the monadic and dyadic cases of the form f.i , as in +. 2 ⍺ ×. 
2 ⍵ , which “runs together” the first two axes of ⍺ and ⍵ (forming an outer product between the rest), and then applies + reduction over axis 1 and axis 0 , in that order. The equivalence of f. 0 and ∘.f has already been remarked. For example: r←(2 3 4⍴⍳24)×. 2(2 3 5⍴⍳30) ⍴r 2 3 4 5 -. 2 r ¯300 ¯312 ¯324 ¯336 ¯348 ¯315 ¯327 ¯339 ¯351 ¯363 ¯330 ¯342 ¯354 ¯366 ¯378 ¯345 ¯357 ¯369 ¯381 ¯393 -⌿-⌿⍤¯1 r ¯300 ¯312 ¯324 ¯336 ¯348 ¯315 ¯327 ¯339 ¯351 ¯363 ¯330 ¯342 ¯354 ¯366 ¯378 ¯345 ¯357 ¯369 ¯381 ¯393 Til. Appendix B shows two cases for the til operator, denoted by } . The case f}i ⍵ is the ith power of f , and f}(-j) ←→ f⊂}j , so that negative values of j give powers of the inverse function. Moreover, if i≥0 , then f}i has rank mf ; if i<0 it has the rank of the inverse function f⊂ . The second case f}g has ranks mg , rf , and mg , and is defined as follows: ⍺ f}g ⍵ ←→ (g⍵)f⍺ f}g ⍵ ←→ (g⍵)f⍵ As remarked in [9], the case a f}g}h ⍵ is equivalent to (g⍵)f(h⍵) and is therefore very useful. It was also stated that ⍺f}⊢⍵ is equivalent to ⍺f⊂⍵ ; although true in some instances, this is not true in general because the rank of the identity function ⊢ (which is infinite) may differ from the left rank of f . Dot. In Section F, the dot operator, conventionally defined only for the case f.g , was extended to the case f.k , but only the dyadic case of the derived function was defined. The monadic case f.k ⍵ is defined as k successive reductions of the form f⌿⍤r . For example: f. 3 ⍵ ←→ f⌿ f⌿⍤¯1 f⌿⍤¯2 ⍵ . It should be noted that this definition ensures that the reduction must take place over the k successive axes, whereas any one of a succession of reductions of the form f/⍥0 f/⍥1 f/⍥2 might (because of the characteristics of the arbitrary function f ) entirely change all axes of its argument. As noted earlier, the monadic and dyadic cases of f.k together provide the contraction used in tensor analysis. Supernumerary axes. 
Certain operators produce derived functions that may introduce one or more supernumerary axes in addition to the axes produced by the particular function to which the particular operator is applied. These supernumerary axes may be defined either to precede or follow the normal result axes of the function, the whole collection following, of course, the outer axes of the argument to which the derived function is applied. For example, scan introduces one extra axis contributed by applying reduction to each of a set of partitions of the argument, and a general derivative operator applied to a function of rank r introduces r extra axes (for differentiation with respect to each element of the argument). Scan. Each scan operator behaves like the corresponding reduction operator, and places the supernumerary axis first if the split (in the reduction) is along the first axis, and last if the split is along the last axis. This definition provides conformity with conventional APL. Cut Operator. With an integer left argument k , the operator ⍤ provides a family of derived functions such that k⍤f applies f to pieces cut from its right argument, the pieces being determined by the left argument or, in the monadic case, by the right argument itself. The right ranks of all cases are infinite. There are seven cases: k=0 provides the general case, in terms of which each of the others can be expressed rather easily. Its left rank is 2 , and f is applied to a rectangle of size |1{⍺ beginning at 0{⍺ or, more precisely, to the rectangle r selected as follows: r←((((¯1↑⍴⍺)↑⍴⍵)|0{⍺)+¨>⍳¨>|1{⍺){⍵ Before applying f , the piece r is reversed along each axis for which the specified size (that is, 1{⍺ ) is negative. Note that both negative and abbreviated indexing may be used in 0{⍺ to specify the position of the selected rectangle, and that reversals may be specified by negative elements in 1{⍺ . 
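The rectangle selection performed by the general cut can be sketched in Python; `cut0` below is a hypothetical helper (the name and argument layout are illustrative, not from the paper), taking the beginning point and signed sizes separately rather than as a 2-row matrix.

```python
# Select a rectangle of size |size| beginning at pos from a nested-list
# matrix, reversing along any axis whose specified size is negative.
def cut0(pos, size, mat):
    r0, c0 = pos
    nrows, ncols = abs(size[0]), abs(size[1])
    rect = [row[c0:c0 + ncols] for row in mat[r0:r0 + nrows]]
    if size[0] < 0:                      # negative size: reverse axis 0
        rect = rect[::-1]
    if size[1] < 0:                      # negative size: reverse axis 1
        rect = [row[::-1] for row in rect]
    return rect

m = [[4 * i + j for j in range(4)] for i in range(4)]   # like 4 4⍴⍳16
print(cut0((1, 1), (2, 2), m))    # [[5, 6], [9, 10]]
print(cut0((1, 1), (-2, 2), m))   # [[9, 10], [5, 6]]
```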
The degenerate case of a vector left argument ⍺ is treated as ⍉0,⍤0⍺ , so as to select a piece of size ⍺ beginning at the origin. The monadic case 0⍤f ⍵ is equivalent to ((⍴⍴⍵)⍴⌊/⍴⍵)0⍤f ⍵ , that is, it selects the “maximal cube” from ⍵ . The cases ⍺ 1⍤f ⍵ and ⍺ ¯1⍤f ⍵ both apply f to pieces obtained by splitting ⍵ , along the leading axis, into segments beginning at each 1 in the boolean vector left argument ⍺ , the element at each cutpoint being included in the case of 1⍤f , and excluded in the case ¯1⍤f . The supernumerary axis introduced is placed before the individual results. A “short” left argument ⍺ is extended cyclically by (1↑⍴⍵)⍴⍺ . The left ranks of the functions 1⍤f and ¯1⍤f are 1 , and extension to higher ranks is made in the usual way. The monadic cases are equivalent to the dyadics with a left argument of (0{⍵)≡⍤¯ ¯1 ⍵ , that is, splits begin at occurrences of the “delimiter” specified by the leading subarray of ⍵ . The cases 2⍤f and ¯2⍤f differ from the foregoing only in that each 1 in the left argument determines the end of a segment, and that the delimiter in the monadic case is the last subarray of ⍵ , namely, ¯1{⍵ . The remaining cases, 3⍤f and ¯3⍤f , provide (possibly overlapping) tessellations. The left rank is 2 , and the expression ⍺ ¯3⍤f ⍵ applies f to each complete rectangle of size |1{⍺ obtained by beginning at all positions obtained as integer multiples of (each element of) the movement vector 0{⍺ . As in the case 0⍤f , reversal of each piece occurs along the axis for which the “window” 1{⍺ is negative. The degenerate case of a vector ⍺ is equivalent to the left argument ⍉1,⍤0⍺ , and therefore provides a complete tessellation of size ⍺ . The monadic case provides tessellation by “maximal cubes” having the size (⍴⍴⍵)⍴⌊/⍴⍵ . The case 3⍤f differs from ¯3⍤f only in that any “shards” of sizes less than |1{⍺ are included.

With.
If the ranks of the arguments i and j in the expressions i¨f ⍵ and f¨j ⍵ are equal to the left and right ranks of f , then the definitions are straightforward:

i¨f ⍵ ←→ i f ⍵
f¨j ⍵ ←→ ⍵ f j

However, arguments i and j of greater rank introduce axes which are placed before the individual results. Consequently, the derived functions behave more like outer products of f than like direct applications of f . For example, if a←1 2 3 , then a¨+a ←→ a∘.+a .

Compression. The definitions of the derived functions i/ and i⌿ (called compression or replication) are based upon a vector argument i . The effect of i⌿⍵ is to split ⍵ along its first axis, select subarrays determined by i , and array them along a supernumerary axis that is placed first; the effect of i/⍵ is to split along the last axis and to place the supernumerary axis last. An argument i of rank greater than 1 will introduce further axes that are placed first in both i/⍵ and i⌿⍵ . Dyadic cases of i/ and i⌿ should also be defined, with negative values in i providing selection from the left argument [10].

L. Summary

The proposals essential to the rationalization program described in the introduction can be summarized as follows:
Appendix A. Ranks of Primitive Functions

The following table shows the proposed ranks for all primitive APL functions except the scalar functions, whose ranks are all zero. The ranks are shown in the order monadic, left, and right, with an overbar denoting infinite rank. The indices in the left column refer to notes following the table. The notes employ a0 and b0 to denote scalars, a1 and b1 to denote vectors, etc. The proposed ranks were chosen in the light of the following general considerations:
Ranks of Primitive Functions Notes
Appendix B. Table of Dyadic Operators

The following conventions are used: i and j denote variables, and f and g functions. mf , lf , and rf denote the monadic, left, and right ranks of f , and af denotes all ranks of f . The monadic derived function is shown to the left of the dyadic.
Appendix C. Examples and Brief Definitions This appendix provides loose definitions and numerous examples of use of the functions and operators defined more formally in the body of the paper. This appendix may be read first, although it is probably best to make concurrent reference to the body as desired. The ideas introduced in Sections B to E (Name Assignment, Function Valence, Function Definition, and Syntax and Order of Execution) will be treated only indirectly by using them in the discussion of other topics. The following functions will be used in examples:
The examples shown in this appendix were executed by the function apl in workspace 705 model . 0-origin indexing is assumed throughout, and infinity is denoted by the overbar ¯ .

Function Ranks and Disposition of Axes

Function rank is the most important notion needed to provide a simple and systematic basis for the uniform treatment of all “mixed” or non-scalar functions. It is also relatively difficult to grasp: although the rules for rank are simple, their consequences are neither obvious nor inconsequential, and the reader unfamiliar with function rank should carefully study the simple examples of its use, in preparation for the more complex and more rewarding uses in dyadic functions and in conjunction with operators such as composition and dual. If f has rank r , then f⍵ is determined by applying f to each of the “cells” of shape (-r)↑⍴⍵ , producing a common shape s for each, and assembling the whole into a result of shape ((-r)↓⍴⍵),s . In other words, the argument ⍵ is treated as an array of outer shape (-r)↓⍴⍵ of cells of shape (-r)↑⍴⍵ , and the (common) shape of the result of applying f to any cell is placed last in the shape of the overall result. Every function has an assigned rank. Thus < has rank 0 (i.e., applies to each scalar), and > has infinite, or unbounded rank, applying to its entire argument. For example:

      a←1 2 3 ⊃ 4 5 6
      a
|1 2 3| |4 5 6|
      b←>a
      b
1 2 3
4 5 6
      ⍴b
2 3
      <b
|1| |2| |3|
|4| |5| |6|

The rank operator ⍤ produces a function of rank specified by its right argument. Thus:

      <⍤1 b
|1 2 3| |4 5 6|

Moreover, the ravel function has infinite rank and:

      c←2 3 4⍴⍳24
      ,c
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
      ,⍤2 c
 0  1  2  3  4  5  6  7  8  9 10 11
12 13 14 15 16 17 18 19 20 21 22 23
      >,<⍤1 c
 0  1  2  3
 4  5  6  7
 8  9 10 11
12 13 14 15
16 17 18 19
20 21 22 23

A negative argument of the rank operator specifies the complementary rank, that is, the rank of the outer shape.
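The cell/frame decomposition that defines function rank can be sketched in Python over nested lists; `with_rank` below is a hypothetical illustration of the rule (the names are not from the paper): descend the argument until cells of the requested rank are reached, apply f there, and let the outer axes frame the results.

```python
def depth(x):
    # rank of a rectangular nested-list array
    d = 0
    while isinstance(x, list):
        d, x = d + 1, x[0]
    return d

def with_rank(f, r):
    # derived function of rank r: apply f to each rank-r cell
    def derived(x):
        if depth(x) <= r:
            return f(x)
        return [derived(cell) for cell in x]
    return derived

# c corresponds to 2 3 4⍴⍳24 as nested lists
c = [[[12 * k + 4 * i + j for j in range(4)] for i in range(3)]
     for k in range(2)]
ravel = lambda cell: [n for row in cell for n in row]
ravel2 = with_rank(ravel, 2)       # analogue of ,⍤2
print(ravel2(c))                   # two rows: 0…11 and 12…23
```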
For example:

      ,⍤¯1 c
 0  1  2  3  4  5  6  7  8  9 10 11
12 13 14 15 16 17 18 19 20 21 22 23

In some instances, a primitive in conventional APL extends usefully by simply making an explicit assignment of the ranks implicit in its definition. For example, ⌹ is presently limited to rank 2 arguments, but extends usefully according to the general rules. More interestingly, ⍎ is defined on vectors, and if assigned rank 1 , can be used in the manner illustrated below:

      ⍎ch←⍕2 2⍴⍳4
0 1
2 3

The Transpose Operator

It is sometimes necessary to move certain axes of the argument before applying a function. Although the dyadic transpose function can perform this, it is much more convenient to use the transpose operator defined as follows:

f⍥k ⍵   Applies f after moving axes k to the rear.
k⍥f ⍵   Applies f after moving axes k to the front.

For example:

      ,⍥0 c
0 12 1 13 2 14 3 15 4 16 5 17 6 18 7 19 8 20 9 21 10 22 11 23
      1 2⍥, c
0 12 1 13 2 14 3 15 4 16 5 17 6 18 7 19 8 20 9 21 10 22 11 23
      ,⍤2⍥0 c
 0 12  1 13  2 14  3 15
 4 16  5 17  6 18  7 19
 8 20  9 21 10 22 11 23
      <⍤2⍥0 c
| 0 12| | 4 16| | 8 20|
| 1 13| | 5 17| | 9 21|
| 2 14| | 6 18| |10 22|
| 3 15| | 7 19| |11 23|

The last two examples above involve a sequence of operators, and it should be noted that an operator has “long left scope”, applying to the result of the entire operator sequence to its left. Thus, ,⍤2⍥0 is equivalent to (,⍤2)⍥0 , not to ,⍤(2⍥0) . This may be emphasized by assigning a name to the intermediate (function) result produced:

      r2←,⍤2
      r2 c
 0  1  2  3  4  5  6  7  8  9 10 11
12 13 14 15 16 17 18 19 20 21 22 23
      r2⍥0 c
 0 12  1 13  2 14  3 15
 4 16  5 17  6 18  7 19
 8 20  9 21 10 22 11 23

For transpositions on an argument of arbitrary rank, the transpose operator used in conjunction with the identity function ⊢ can often be much more convenient than the transpose function.
For example, to move axes k to the rear:

⊢⍥k ⍵ ←→ (⍋((~a∊k)/a←⍳⍴⍴⍵),k)⍉⍵

Dyadic Functions

If the function g←f⍤r is to be applied dyadically as well as monadically (the only cases addressed in the preceding sections), then it is necessary that r specify three independent ranks, the monadic, the left, and the right. The general argument r is therefore a three-element vector that specifies the ranks in the order just indicated. Moreover, r is extended by reshape if necessary, so that f⍤r ←→ f⍤(⌽3⍴⌽r) . The following example shows the behaviour of a useful catenation function that has differing left and right ranks:

      g←,⍤¯ 0 1
      a←2 3⍴⍳6
      7 8 g a
7 0 1 2
8 3 4 5
      7 8 ∘.g a
7 0 1 2
7 3 4 5

8 0 1 2
8 3 4 5
      h←∘.g
      ⍴b←3 4 5 h 7 8 h a
3 2 2 5
      0 0{b
3 7 0 1 2
3 7 3 4 5

If h←f⍤r is a function of higher rank than f , then f and h may be indistinguishable when used in the expressions ⍺f⍵ and ⍺h⍵ , but may produce quite different results under the application of an operator. More precisely, if i and j are non-negative scalars and if the rank of f is i , then h←f⍤(i+j) agrees with f in direct application, but may differ from it under application of an operator. For example, the rank of the scalar function × is 0 , and:

      t1←×⍤1
      a←2 3⍴⍳6
      (a×a)≡a t1 a
1
      ⍴a∘.×a
2 3 2 3
      a ∘.t1 a
0  1  4
0  4 10

0  4 10
9 16 25

The left brace denotes a dyadic indexing function called from.
Its left and right ranks are 1 and ¯ , and:

1 2 3{a3 ←→ a3[1;2;3]

For example:

      a←2 3 4⍴⍳24
      1 2 3{a
23

If the left argument of { is of rank greater than 1 , then the normal rules of extension apply, giving the useful “scattered” indexing:

      i←?4 3⍴2
      i
1 1 0
0 0 0
1 1 1
1 1 0
      i{a
16 0 17 16

If i is a non-simple vector (i.e., including one or more elements produced by the enclose function < ), then the disclosed elements of i apply as follows:

i{a3 ←→ a3[>i[0];>i[1];>i[2]]

For example, using the link function ⊃ (that encloses its left argument and catenates it to the right, first enclosing the right if it is simple):

      i←1⊃1 2⊃0 3
      i{a
16 19 20 23

This generalization may also be written as:

⍺{⍵ ←→ ({⍺){⍵

using the monadic case of { , the cartesian product discussed in the following section. For example:

      {i
1 1 0
1 1 3
1 2 0
1 2 3
      ({i){a
16 19 20 23

The Cartesian Product

The asymmetric catenation function g←,⍤¯ 0 1 illustrated earlier can be used to add a unit final axis. Thus:

      8 9 ∘.g ''
8
9
      ⍴8 9 ∘.g ''
2 1

It can also be used to form a cartesian product of arrays:

      c←2+b←2+a←1 2
      r←a ∘.g b ∘.g c ∘.g ''
      r
1 3 5
1 3 6

1 4 5
1 4 6


2 3 5
2 3 6

2 4 5
2 4 6

Finally, if i is the vector i←a⊃b⊃c , then the reduction of i,∘ by ∘.g¨> produces the enclosed cartesian product of the elements of i . Thus:

      i←a⊃b⊃c
      r≡> ∘.g¨>/i,∘
1

The operator ¨ (dual) used in the foregoing expression is defined by:

f¨g⍵ ←→ g⊂ f g ⍵
⍺f¨g⍵ ←→ g⊂ (g⍺)f g ⍵

where ⊂ (con) denotes the inverse operator. The effect of f¨>⍵ is to apply f to (the disclose of) each of the ×/⍴⍵ enclosed elements of ⍵ , and to enclose each result. The cartesian product function will be defined as the monadic case of { . Formally:

{⍵ ←→ > ∘.(,⍤¯ 0 1)¨>/(<⍤>⍵),∘

Its use in the definition of the dyadic case was shown in the preceding section, namely:

⍺{⍵ ←→ ({⍺){⍵

Since { applied to a simple vector yields the vector unchanged, the definition applies (trivially) to a simple vector left argument as well as to a vector of enclosed elements.
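Both indexing styles just described — scattered indexing by a table of index vectors, and Cartesian-product indexing by a vector of enclosed index sets — can be imitated in Python (an illustration of the paper's examples; the nested-list array and names are assumptions):

```python
from itertools import product

# a corresponds to 2 3 4⍴⍳24
a = [[[12 * i + 4 * j + k for k in range(4)] for j in range(3)]
     for i in range(2)]

# scattered indexing: each row of ix is one complete index vector
ix = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [1, 1, 0]]
scattered = [a[p][q][r] for p, q, r in ix]
print(scattered)    # [16, 0, 17, 16]

# Cartesian-product indexing, as in (1 ⊃ 1 2 ⊃ 0 3){a
sel = [a[p][q][r] for p, q, r in product([1], [1, 2], [0, 3])]
print(sel)          # [16, 19, 20, 23]
```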
Note that it is the case of a simple argument that required the expression (<⍤>⍵) instead of ⍵ in the definition of {⍵ above. The definition of ⍺{⍵ is extended further to allow abbreviated, negative, and complementary indexing. For example:

      a←2 3 4⍴⍳24
      1 2{a
20 21 22 23
      1 ¯1{a
20 21 22 23
      i←1⊃(<2)⊃0 3
      i
|1| ||2|| |0 3|
      i{a
12 15
16 19

Because an element of the form <<'' will select all along the corresponding axis, an expression such as ((3⍴<<''),<k){a4 can be used to select an index k along the last axis. However, if it is also desirable to bring the last axis to the front, it is more convenient to use the “display” function d , defined and illustrated below:

      d←¯1⍥{
      1 d 3 3 3⍴⍳27
 1  4  7
10 13 16
19 22 25
      0 d 3 3 1⍴⍳9
0 1 2
3 4 5
6 7 8

The last example above shows the production of an easily readable display of an array with a final unit axis. This will be used later in the discussion of permutations.

Operators on Non-Scalar Functions

Some examples of the application of operators (such as the outer product and the rank) have already appeared in earlier sections. We will here provide further examples involving the reduction operators / and ⌿ , as well as the scans \ and ⍀ , which introduce supernumerary axes. The expression f⌿⍵ splits ⍵ along its leading axis, and applies f between the resulting subarrays. Thus, if a←?3 4 5⍴9 then:

f⌿a ←→ (0{a) f (1{a) f 2{a

For example:

      a←3 3⍴⍳9
      ∘.×⌿a
 0  0  0
 0  0  0
 0  0  0

18 21 24
24 28 32
30 35 40

36 42 48
48 56 64
60 70 80
      p←3 3 1⍴1 0 2 2 0 1 2 1 0
      d←¯1⍥{
      0 d p
1 0 2
2 0 1
2 1 0
      ,{⌿p
2 0 1

The last result above is the “product of the permutations” represented by the rows of p . The other reduction (f/) splits along the last axis.
For example:

      a←i∘.×10*i←⍳3
      a
0  0   0
1 10 100
2 20 200
      ⍴r←∘.(,⍤0)/a
3 3 3 2 2
      1 2 { r
1  20
1   0

1  20
1 100

1  20
1 200

A scan of the form f⍀⍵ applies f⌿⍵ to each of a set of subarrays (,⍤0⍳k){⍵ , for each k∊1+⍳1↑⍴⍵ , and the supernumerary axis (of length 1↑⍴⍵ ) is placed before the axes contributed by the individual results. This will be illustrated in the scan {⍀p , where p is the permutation matrix used above to illustrate the reduction {⌿ :

      0 (d←¯1⍥{) p
1 0 2
2 0 1
2 1 0

The subarrays involved in the scan {⍀ may be produced and displayed as follows:

      g←,⍤0⍤⍳
      0 d (g 1){p
1 0 2
      0 d (g 2){p
1 0 2
2 0 1
      0 d (g 3){p
1 0 2
2 0 1
2 1 0

The scan {⍀p therefore gives all of the permutations produced by reductions over these subarrays:

      0 d {⍀p
1 0 2
0 2 1
2 0 1

The alternate scan f\ is defined analogously, with the split occurring along the last axis, and with the supernumerary axis placed last. For example:

      ×\3 3⍴⍳9
0  0   0
3 12  60
6 42 336

Til, Cut, and Bind

The operators discussed in this section are not essential to the rationalization program, but they can be used to further exploit, and therefore to demonstrate, the power of the essential facilities. In what follows, we will use the fact that the con operator ⊂ is (in the dyadic use of the resulting derived function) a commutation, that is:

⍺ f⊂ ⍵ ←→ ⍵ f ⍺

The til operator, defined by

⍺ f}g ⍵ ←→ (g ⍵) f ⍺
f}g ⍵ ←→ (g ⍵) f ⍵

can be used in a variety of ways. For example:

      f←÷⍤1
      g←⌽
      a←2 4 5
      b←10 2 4
      a f}g b               Reverse b and divide by a
2 0.5 2
      a f}g⊂ b              Reverse a and divide by b
0.5 2 0.5                   (a “left composition” in which g
                            applies only to the left argument)
      a f⊂}g b              Right composition
0.5 2 0.5

If f is commutative, then right composition can be obtained more simply by the expression f}g .
For example:

      a +}÷ b
2.1 4.5 5.25
      +}÷/c←2 4 5 6
2.23846
      +}÷\c
2 2.25 2.2381 2.23846

The case +}÷/c is equivalent to 2+÷4+÷5+÷6 , and is called the continued fraction represented by c ; the case +}÷\c yields the “convergents” to the continued fraction. The most important property of the til operator remains to be stated, namely:

f}g}h ⍵ ←→ (g⍵) f (h⍵)

In other words, f}g}h ⍵ applies a dyadic function f to the results of the monadic functions g and h . For example:

      f←'⍵*2'∇∘⍤0
      g←'⍵*3'∇∘⍤0
      +}f}g 1 2 3           Often denoted by f+g without indicating
2 12 36                     that + is an operator rather than a function
      -}f}g 1 2 3
0 ¯4 ¯18
      i←'⍵=⌊⍵'∇∘⍤0
      p←'⍵>0'∇∘⍤0
      n←¯1 ¯.5 0 .5 1
      ∧}i}p n
0 0 0 0 1
      ∨}i}p n
1 0 1 1 1
      ≠}i}p n
1 0 1 1 0

The cut operator takes an integer left argument and a function right argument, and applies the function to pieces cut from the right argument of the derived function, putting the supernumerary axes contributed by the set of pieces in first place. The type of cut is determined by the left argument of the operator, and the points of cutting by the left argument of the derived function. The cuts are along the leading axis. For example:

      t←'abcdefghij'
      b←0 0 1 0 0 0 1 1 0 0
      b ¯1⍤< t              A ¯1-cut begins pieces at 1's
|def| || |ij|               and excludes cut points
      b 1⍤< t               A 1-cut begins at 1's
|cdef| |g| |hij|            and includes cut points
      b ¯2⍤< t              (2 ¯2)-cuts end at 1's
|ab| |def| ||
      b 2⍤< t
|abc| |defg| |h|
      b ¯1⍤(+/) ⍳10
12 0 17
      b 1⍤(+/) ⍳10
14 6 24
      m←⍉2 10⍴'abcdefghijk'
      m
ak
ba
cb
dc
ed
fe
gf
hg
ih
ji
      (4↑1) 1⍤< m
|ak| |ed| |ih|
|ba| |fe| |ji|
|cb| |gf|
|dc| |hg|
      s←'now it is'
      ¯1⍤< ' ',s            Monadic (1 ¯1)-cuts use the
|now| |it| |is|             leading element as a delimiter
      1⍤< ' ',s
| now| | it| | is|
      ¯2⍤< s,' '            Monadic (2 ¯2)-cuts use the
|now| |it| |is|             last element as delimiter
      2⍤< s,' '
|now | |it | |is |

The general cut 0⍤f has left rank 2 and infinite right rank, so that a left argument of rank greater than 2 produces a set of cuts.
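The continued-fraction reduction +}÷/c shown above is an ordinary right-to-left fold in which each step computes x + ÷y; a Python sketch (illustrative only, not the paper's notation):

```python
from functools import reduce

# 2+÷4+÷5+÷6, evaluated right to left as in the APL reduction
c = [2, 4, 5, 6]
cf = reduce(lambda acc, x: x + 1 / acc, reversed(c))
print(round(cf, 5))   # 2.23846
```

Replacing `reduce` with a running fold over successive prefixes gives the convergents produced by +}÷\c.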
The first row of the left argument specifies the beginning point, and the magnitudes of the last row specify the size of the window, with a negative value producing a reversal along the corresponding axis. For example (using the identity function ⊢ ):

      a←>1 1⊃2 2
      a
1 1
2 2
      a 0⍤⊢ m←4 4⍴⍳16
5  6
9 10
      (>1 1⊃¯2 2) 0⍤< m
| 9 10|
| 5  6|

The ¯3-cut ⍺ ¯3⍤f ⍵ provides a tessellation, with the movement specified by ⍺[0;] , and the window again by ⍺[1;] . For example (with ⎕ps set to ¯1 1 3 3 ):

      a←2 2⍴2
      a ¯3⍤< m
|0 1| |2 3|
|4 5| |6 7|

| 8  9| |10 11|
|12 13| |14 15|

      a←>1 1⊃2 2
      ⍴r←a ¯3⍤⊢ m
3 3 2 2
      1{r
4 5
8 9

 5  6
 9 10

 6  7
10 11

In the expression ⍺ f.k ⍵ , the operator causes f to be applied to the arguments with their first k axes “bound” (and necessarily agreeing), and with the remaining free axes (exclusive of those demanded by cells as determined by the ranks of f ) forming an outer product. The main point is best illustrated by a scalar function f in which no axes are demanded by cells:

      a←10×b←2 3⍴⍳6
      a +. 2 b
 0 11 22
33 44 55
      a +. 1 b
 0  1  2
10 11 12
20 21 22

33 34 35
43 44 45
53 54 55
      ⍴r←a +. 0 b
2 3 2 3
      r≡a∘.+b
1

The monadic case of f.k provides k successive reductions by f over the leading axes. For example:

      a←2 3 4 5⍴⍳120
      +. 2 a
300 306 312 318 324
330 336 342 348 354
360 366 372 378 384
390 396 402 408 414

For many functions (particularly for non-scalar functions) it is important to note that the reductions are carried out in the manner illustrated by f. 3 ⍵ ←→ f⌿ f⌿⍤¯1 f⌿⍤¯2 ⍵ .

Appendix D. APL2 versus a Comparable Subset

This section presents comparisons between the facilities of APL2 [3] and a roughly equivalent subset of the facilities defined in this paper. The restricted subset (to be referred to as RS) embraces:
The functions from ({) and merge (i¨{ ), and the rank and transpose operators obviate all uses of brackets and semicolons. The discussion of RS will therefore be based on the assumption that their use is excluded, even though, for reasons of compatibility, they will continue to function as in conventional APL. The comparisons made in this section are more theoretical than practical, concentrating on general questions such as implications for syntax, function classifications, and domains of operators. More practical comparisons of convenience, ease of learning, etc. are better made by the reader, either by drawing from his own applications, or from definitions and examples given in the APL2 manual [3] and the companion Introduction manual [10] as well as from this paper. For example, one might take examples of strands and of heterogeneous arrays from the APL2 publications and write them completely in RS, and take function definitions such as r←,⍤2 and write them completely in APL2. Syntax. A useful measure of the complexity of a set of syntax rules is the required scope, that is, the maximum number of elements in the execution stack which must either be involved in the determination of the next action to be taken, or be used in that action. This measure is meaningful only on the assumption that each element is a normal operator, function, or variable, that is, an element does not incorporate further information relevant to another element or elements. We will further assume that parentheses are handled by recursion on any segment enclosed by matching parentheses, and that parentheses are therefore not considered in assessing scope. However, this exclusion can not apply to an “unusual” use of parentheses, such as the usage (c\[a]v)←x in APL2, in which the result of the parenthesized expression is a “qualified” name. Under the foregoing assumptions, the scope in RS is 4 . In APL2:
Function Classes. In conventional APL, functions are, explicitly or implicitly, divided into a number of overlapping classes, and the behaviour of a function depends, in significant ways, on the classes to which it belongs. Thus:
In RS all such class distinctions are removed. However, each function is assigned a set of ranks, the ranks for all primitives of conventional APL being 0, 1, 2, or unbounded. In APL2, some of these distinctions are removed in that operators such as reduction and inner product apply to non-scalar and defined and derived functions. However, class 2 is not only retained, but extended (to scalar functions, to ↑ , etc.), and class 3 (fixed valence) is also retained and extended, as evidenced by entries under “monadic” and “dyadic” in the glossary, by references to dyadic functions in the definitions of certain operators, and by the fact that the new bracket axis operator (distinct from the conventional “bracket axis specification” referred to in the definition of class 2) produces a function which is specifically monadic or dyadic. Two further classes are introduced in APL2, called pervasive and selection: members of the former applying down through any structure of enclosures, and members of the latter being permitted to the left of assignment. No provision appears to be made for making a defined or derived function a member of either of these classes. Depth Functions. APL2 provides two significant facilities which apply at “depth” in the enclosure structure of an argument, the dyadic pick function, and the pervasive functions. RS provides no primitive corresponding to pick; it could be defined recursively by: pick←''∇('→2+0=⍴⍺'⊃'(>0{⍺){(1↓⍺)∆⍵'⊃'⍵') If frequent use demands greater convenience and efficiency, the pick function could be incorporated as a primitive. If so, the dyadic case of i¨pick should provide the merge corresponding to the “selective specification” of the form (l⊃r)←x provided in APL2. Since pervasiveness is a property assigned to functions, it would, in the framework of RS, be provided by an operator. 
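The recursive structure of pick — index by the head of the path, then recurse on the tail — can be sketched in Python (a hypothetical illustration of the recursion in the definition above, not the RS primitive):

```python
# Descend a nested structure by successive indices from path.
def pick(path, value):
    # an empty path yields the value itself; otherwise index and recurse
    return value if not path else pick(path[1:], value[path[0]])

nested = [['a', 'b'], [['c', 'd'], 'e']]
print(pick([1, 0, 1], nested))   # d
```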
Such an operator could be applied to any function (defined or derived as well as primitive) and, if defined to be dyadic, could provide greater variety. Thus a variable p in f op p could provide application at any depth from the top, or any distance from the leaves; a function p (that is, a proposition) could specify the condition at the level where f is to apply. Array Formation. If each essential space in an expression is counted as a character, then the link function and strand notation used to form non-simple vectors from simple vectors require expressions of nearly identical length. For example:
In conventional APL, the precedence of operators obviated parentheses around operator expressions, as in a+.×b , and in RS the precedence and scope rules for operators extend this convenience to expressions such as f⍤a b , regardless of whether a is a function or a variable. APL2 adopts the same rule for operators, but because strands take precedence over everything, an operator expression with a variable right argument must be parenthesized. In conventional APL, two vector constants are never juxtaposed. In both RS and APL2, however, juxtaposition may occur in the case of any operator having a variable right argument (necessarily a user-defined operator in the case of APL2). In that event, the divide between the constants must be indicated by some delimiter. Thus:

f⍤1 2 (3 4 5)
f⍤(1 2) 3 4 5
f⍤1 2 a←3 4 5
(f⍤1 2) 3 4 5
f⍤1 2⊢3 4 5

In RS, any one of these may be used, the first two being preferred because they clearly indicate the purpose of the parentheses, and the third being used if there are extraneous reasons for naming the right argument. In APL2, strands limit the choice to the fourth. Retention of a valence classification for functions may have some implication for strands. Thus, if f is strictly monadic, the expression a~b might be considered identical to a(~b) , since the strictly monadic function would presumably be executed first. Although the valence classification is made in conventional APL (with a~b yielding a syntax error rather than a domain error), the valence classification has no serious consequences, because the juxtaposition of the two variables occasions an immediate error. Paradoxically, the function ~ is actually treated as ambivalent in the sense that the name a in the expression a~b is evaluated before the syntax error is reported, as may be seen by making a a shared variable (perhaps ⎕ ) or a niladic function that specifies a global variable.

Domains of Operators.
In conventional APL, the domains of operators are defined by one general rule (which excludes all defined functions), and by specific lists: the list of “scalar primitives” for all but the bracket axis operator, and a second list (which includes some derived functions) for it. Moreover, the domains of the derived functions produced are limited in ways which depend upon the arguments of the operators, depending upon the normal definition of the argument (as in ÷/4 0 ), or upon an “attribute” that it may possess (as in ⍲/⍳0 ). In RS, the domains of operators are unrestricted, although the domains of the resulting derived functions continue to be restricted in ways analogous to those in conventional APL. Thus the domain of f¨g will be restricted by the domains of g , f , and g⊂ (that is, g inverse). In particular, the domain of f¨g is empty if the domain of g⊂ is empty. In APL2, the domains of most operators are unrestricted. However, the list of functions subject to (bracket) axis specification is not only maintained (as required for compatibility, even in RS), but is extended, as in a+[2]b and a↑[2]b . In the latter case of ↑ , the “default” axis used when the axis specification is absent is not the final axis used for similar cases, but all axes. Although the bracket notation used to represent it in conventional APL tends to obscure the matter, the axis operator is, in effect, dyadic, applying to a function “left argument” and a variable “right argument”. In APL2, the bracket axis operator is defined to be monadic, seeming to imply that there is an indeterminate number of such operators, such as [1;1] , [1;2] , etc. Prototypes. Prototypes in APL2 are an extension of the notion (implicit in the difference in the results of 1↑⍳0 and 1↑'' ) that an array a incorporates some information other than its shape and the values of each of its elements. In RS, compatibility with conventional APL is maintained, but no new distinctions are introduced.
Specifically, 1↑0⍴1j1 and 1↑0⍴(1 2 3⊃'ab') both yield 0 . This is a fundamental difference that cannot be eliminated (as can the lack of a pervasive class of functions) by the introduction of an appropriate operator. Heterogeneous Arrays. RS does not include the heterogeneous arrays of APL2, and the production of equivalent constructs requires greater use of enclosure. However, the structure of RS does not preclude their introduction. Permissive Enclose. The monadic enclose functions defined in RS (<) and in APL2 (⊂) differ in one respect: if s is a simple scalar, then s≡⊂s , but ~s≡<s . Although < can therefore produce some structures not producible by ⊂ , the differences between them (in the contexts of the respective systems) cannot, in most cases, be discerned. The function { in RS provides an example where the difference is significant: because a doubly enclosed element of the left argument provides complementary indexing, the value of <<k must be distinguishable from <k , even in the case of a scalar k . A similar situation would arise in the definition of an indexing function f in which a f i selects elements from a and incorporates them in the same “enclosure structure” as i . However, the difference would be discernible only in the case of some scalar element of i and a non-simple argument a . General Remarks. The following general contrasts may be noted:
References
Acknowledgements I am indebted to a number of my colleagues at I.P. Sharp Associates for many helpful discussions and suggestions, particularly to R. Bernecky, E.B. Iverson, E.E. McDonnell, R. Pesch, J.H. Schueler, D.L. Forkes, and L.H. Goldsmith. Errata
Originally appeared as I.P. Sharp Research Report #1, Revision 1, 1983-04-04.