💾 Archived View for soviet.circumlunar.space › zwatotem › snapshot › solution › visual.gmi captured on 2024-12-17 at 10:31:49. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Current document describes Visual Solution to the Expression Problem.

Motivation

Since its inception, the bulk of research focused on Expression Problem’s modularity aspect with solutions delivering methods of extending the existing data types and operations through additive means, such that it is possible to extend a domain contained in a compiled module. While research on this aspect can turn out very influential on the patterns of use of external libraries and evolution of code ecosystems, there are other aspects of Expression Problem that may prove beneficial, when tackled.

The aspect this work focuses on is code organization in a self-contained program, i.e. extension does not necessarily occur across module boundaries (although modularity is also considered as a secondary concern). As organization I will understand a property of code that mandates the placement of syntactic units (especially method implementation) in wider grouping structures and sequentially within these structures. Two programs can be equivalent in semantics, but differ in organization, such that one is subjectively more pleasant to work with, than the other. Organization can either be a result of language rules, conventions of a framework or design pattern (potentially one directed at solving the Expression Problem), or a design decision taken deliberately by the programmer.

If we look at organization schemes provided by existing solutions, we can distinguish three styles of organization.

Some solutions gather data and implementations concerned with a data type in the scope associated with that data type. Solutions of this style are naïve object-oriented implementation and Mix-ins. We will call this approach object oriented.
Other solutions gather implementations of a function for all types. Solutions that follow this style include naïve functional implementation, visitor pattern, object algebras, and final tagless implementation. These can be described as functional.
Solutions that do not impose any grouping for method implementations can be called as procedural . These include protocols and multimethods. The subcategory of these solutions are protocols, as implemented by Clojure. They allow the programmer to organize implementations either by object or by protocol, exchangeably in the same program. In other cases where syntax does not imply any grouping files will take the role of organizational structure, staying at programmer’s discretion and giving effectively the same organization as multimethods in Clojure.

Outcomes of these organization schemes may not be fully satisfactory in every scenario, even in the lifecycle of one software product. For illustration, let us construct such an example.

Let us assume a software with a type hierarchy and a set of operations on these types. As the software evolves, in arbitrary order new types and operations are added, and existing ones change their scope (e.g. User stores more personal detail, collision function calculates collision result instead of Boolean value).

For operations where data types are added or changed object-oriented organization would be optimal, as it would allow work on a contiguous fragment of code, without navigating to distant parts of the code and separate files. However, for all operations involving addition or change of operations object-oriented organization mandates navigation through all units or organization.

When an operation is added or modified, functional organization is optimal, because it localizes all implementation variants of a given method, contrary to object-oriented organization, however with a degradation of experience to changes and addition of data types.

Procedural organization would seem an appropriate choice for all additions. It allows to group the newly added implementations regardless whether they are connected by a concern of common data type, or common operation. This strength, however, causes a natural formation of an unpredictable organization pattern (see Figure 3.1), which complicates the process of modifying types/operations, as it is hard to localize implementations and navigate between them.

Figure 3.1: Illustration of example structure of procedural organization. As new data types (A-H) and new operations (a-i) are added, related implementations are grouped in contiguous blocks. This makes it unpredictable, which implementations can be found in which blocks. For example, all methods on type E are in block related to E, except h, i. However, unlike for E, for type D, methods f and g cannot be found in its dedicated block, due to them being defined after D and before E.

Example in Figure 3.1 demonstrates, how the organizational aspect of code is virtually unsolvable. While it is possible to align the code with some concern, at a later stage in the program development a cross-cutting concern will necessarily render the organization suboptimal.

Framework description

General description

To solve aforementioned problems this work takes inspiration from spreadsheet software, specifically a construct known as a pivot table, which allows to view the same data aggregated with different criteria in a visual way. While the data appears different to the user of the pivot table depending on the aggregation axis, filtering, sorting and grouping settings, the represented data stays constant throughout these operations of the pivot table. Similarly, presented Visual Solution will rely on the separation of the program semantics, kept in the underlying environment, and the program representation – visible to the programmer.

Previously implementations of visual programming for code organization appeared for Smalltalk language (Borne, 1993; Giffen & Tomek, 1997). Smalltalk as a dynamic language is not an object of the expression problem. However, these examples are still an invaluable study subject in visual programming, which is a sparse research direction.

The action, programmer can take in programming environment of Visual Solution, can be divided into two categories: transformative and non-transformative. The former are actions that change the semantics of the underlying program representation, causing it to behave differently in runtime or change external public interface in case of library development. The latter is any action of changing the view of the program, that does not change semantics of it. This distinction is important, as it can lead to confusion, if not communicated clearly to the programmer.

The specific implementation of the program representation in programming environment can differ, from a binary in-memory representation to a serial representation, such as programming language code, or low-level intermediate code. That representation is used as a compilation or interpretation target, depending on the implementation specifics. The Visual Solution is therefore a layer that can be applied to an existing solution, to improve its extensibility.

Programming model

The programming model of the solution consists of three main constructs: classes, methods and concepts. For the rest of the chapter, these will be italicized to avoid confusion with classes, methods and concepts in their colloquial meaning as well as in existing programming languages.

Class is defined as either a primitive, or a tuple of classes. This declaration makes classes, equivalent to product types in languages with algebraic type systems. A class, which is a primitive is an arbitrarily defined set of values. Definitions of specific primitive can differ depending on specific implementation of Visual Solution.

In this programming model classes represent data structures, just like classes in object-oriented programming.

Model does not contain an equivalent to union types, as their role of providing polymorphism is fulfilled through concepts (described later).

Function is a general recursive function from a tuple of classes to some set R. Method is a tuple (P,𝔽), where P is (p₁,…,pₙ) – a list of parameters (each parameter consisting of requirement and name) and 𝔽 is a set of functions. For each combination of classes (c₁,…,cₙ), such that cᵢ meets the requirement of parameter pᵢ for i=1,…,n exists exactly one function (c₁,…,cₙ)→R in 𝔽. A method is thus essentially equivalent to Multimethods.

It has not been defined yet, what it means for a class to meet a requirement of a parameter. This ambiguity will be resolved after defining the concept.

Role for a pair (m,p), where p is a parameter of the method m, is a predicate parametrized with a class c that tells whether c has a complete subset of implementations (functions) with class c at a position of p, i.e. for all possible choices of classes for other parameters of m.

This definition puts a role as a rough equivalent of a singleton trait or singleton interface, however whereas a singleton interface mandates, that a certain method is implemented for a class, role mandates that class has implementation for a certain method, for a certain parameter. This makes roles a more focused construct, better adapted for the use with multimethods. In Visual Solution, all roles are defined implicitly, following the definition of a method and its parameter.

Concept is either a role, or a logic conjunction of other concepts. Each concept can be then easily broken down into a conjunction of roles.

Concept corresponds to a trait/interface in existing programming languages. It is defined explicitly; by roles, but one can determine all the classes adhering to the concept and view it as a parent (generalization) of these classes.

A concept is a construct that can be used as a requirement of a parameter. As such it is de-facto a type in this programming model. Any expression in the program has either a type of a class, or a concept. Expression typed with a class shares member access. Expression typed with a concept shares availability to be used as input parameter in a method, if that parameter-method pair defines a role included in that concept.

Other aspects of the environment, such as encapsulation, state management & mutability, side effect management, data ownership, etc. are less important to the solution of the organizational Expression Problem and can be integrated into this model in various configurations.

These aspects may contribute to the way individual function implementations are being defined, however the only essential requirements for function definitions are syntax for object construction and method calling.

Views

The programming environment provides several perspectives of the program that help organizing it by a single concern at a time. Below these perspectives are presented and explained.

Implementation view

Figure 3.2/1: Implementation view with type switch for the Collides method.

Figure 3.2/2: Implementation view with object constructor for the Shrink method.

This view is an element of all views presented below. This view represents implementation of a function or multiple functions. It can represent an expression or a block of statements depending on the specifics of underlying programming model. In particular however, it should support constructs for object construction, method calling, and type matching, as these are parts of the programming model required by the Visual Solution.

Class view

Figure 3.3: Class view for Circle. Visible implementation for method Collides applies for Circle in the role of collider. A separate implementation with Circle as counter_collider is present in this view after scrolling down in the window, since Circle plays both roles in this method.

This view presents all information concerned with a single class. First subsection contains class’s name and allows to change it. Next, subsection contains class definition, that is: an editable list of all the fields; their names and types. Third subsection contains a list of implementations of methods, separately for different roles within each method. Programmer can edit the methods here. Implementation view inside a class view is context aware, and will omit parameter name, treating it like optional this in object-oriented languages. Tokens like object.radius, or collider.center can be simply written as radius, center (see Figure 3.3).

Method view

Figure 3.4: Method view for Collides method. Classes with missing implementations are marked with red. Dropdowns and “⇔” button on the top allow for selection and swapping of displayed parameters.

This view presents all information concerned with a single method. First subsection contains the name and return type of the method and allows to change them. The return type places a restriction on the set of returnable values. The implementations will be type-checked against that type.

Next subsection contains an editable list of parameters. For each parameter programmer can specify its requirement in a form of a concept and the name. As mentioned in Section 3.1.2.2, each parameter, together with its method defines a role. The role is implicit and not listed anywhere in this view, as this would cause unnecessary redundance. However, all the places, where one can specify a concept (for example in the parameter type dropdown) all the roles are also selectable. This has a consequence of emerging recursive design pattern, where one can put the role defined by a parameter as that parameter’s requirement. This results in parameter accepting any class, since by implementing the method that class immediately meets the requirement.

The third section gives an overview to the matrix of functions. Depending on the number of parameters this section has one of two variants. For one parameter, the functions are arranged in a list, each designated to the particular parameter type. For two and more parameters, the implementations are laid out on a two-dimensional matrix, columns of which discriminate types of one parameter, and rows discriminate types of a different parameter. In each cell a function is presented. If number of parameters is greater than two, the topmost construct in each cell will be a type switch, further subdividing the implementations. The programmer has control over which parameters are displayed in columns, rows, and inside cells, while the environment responds to these commands by changing the representation in cells. Programmer can also take a transformative action of editing the implementations and adding new implementations.

Concept view

Figure 3.5: Concept view for Shape concept. “⇔” button allows to swap roles and classes between rows and columns.

This view presents information for a single concept. It consists of two subsections. First allows to see and rename the concept. Second presents a matrix of implementations. Programmer can swap the rows and columns of the matrix, but for clarity of explanation we will assume one configuration. In columns are the roles, that define the concept. Programmer can add and remove these roles, effectively changing the definition of the concept. In rows of the matrix are the classes that qualify to the concept. In every cell at the crossing of row with class C and column with role R there is an implementation of the method where C plays the role R. If the method has multiple parameters, the varied implementation based on those different parameters is presented with a type switch inside the cell.

Programmer can edit the method implementations as well as add roles and classes. The addition of either is a multi-step process, as every implementation needs to be defined in order to achieve consistency. In the state, where a class is added, but not all methods are implemented that class is marked as a candidate for the concept, which will be marked in other views as well. During that process, one can toggle off the implementations to allow for compilation and evaluation of other areas of the program. That helps the programmer plan and manage a big multistep process between consistent states of a program and achieve the consistency transactionally. Similarly, when a role is added to the concept’s definition, all the existing classes are treated as candidate, until all implementations are added, and toggling that role off is also possible.

Solution description

Let us follow the running example solved visually. First, one may create an abstraction, concept Expression which is temporarily empty.

Figure 3.6: Empty concept view

Then programmer can add new classes to the concept, opening their respective editors. Since the concept is unconstrained by any methods yet, classes qualify immediately.

Figure 3.7: Class view for Literal with properties defined

Figure 3.8: Class view for Addition with properties defined

Next, a method is declared in the method view.

Figure 3.9: Method view for Evaluate with signature entered but no implementation present

The method is visible at the concept view for Expression. Literal and Addition are now suspended as members of Expression until the Evaluate method is implemented.

Figure 3.10: Concept view for Expression with single role as a defining sub-concept and two declared members.

One can then fill the implementations in either the concept view, the method view or the class views. After filling in the implementation, the basic expressions are implemented.

Figure 3.11: Concept view for Expression with all implementations entered.

After exporting the functionality as a compiled module, it can be loaded by an extending module. If the source is also shared, it can be displayed with the extended module. The boundary between modules is invisible in the environment.

The Negation class and Stringify method can be added analogically as previous constructs. Stringify is added to Expression with its parameter as new role contributing to the definition of Expression. Then Stringify is defined for existing classes as shown on Figure 3.12.

Figure 3.12: Method view for Stringify

Negation is added as a new candidate class for Expression. Then, in class view all details of Negation can be defined: fields and method implementations.

Figure 3.13: Class view for Negation

The final overview of Expression can be seen on concept view (Figure 3.14).

Figure 3.14: Concept view for extended Expression concept with old and new types and methods.

Programmer can alternatively (and equivalently) add Negation and Stringify in reversed order. Thanks lack of dominant organization scheme in Visual Solution these additions will take the same amount of view changes – only one per construct added.

Solution discussion

To properly assess the Visual Solution, the same points of critique will be applied, as was for the previous solutions. For consistency, solution is assumed to be backed by a low-level mechanism modelled after multimethods.

Static type safety

Visual Solution is statically typed, with no mechanism of type erasure or casting, which ensures static type safety. The environment also ensures that every method has a full implementation coverage, avoiding the problem of missing dispatch target.

Syntactical code organization

As has been discussed before, Visual Solution supports many styles of organization at the same time, minimizing scattering regardless of the concern of the programmer. Thanks to the two-dimensional character of visual programming, the solution can ensure coherence in two concerns at the same time, which is not possible with traditional, textual programming environments.

External extensibility

The extensibility of constructs from Visual Solution can depend on the underlying implementation. I will show, that through translation to constructs of YOMM2 it is possible to achieve extensibility. Again, constructs from Visual Solution will be italicized, while constructs from C++ will stay undecorated for easy distinction. Let us consider the following translation procedure:

Each concept, including roles is translated to an empty abstract class.
Each class is translated to a concrete class with a list of base classes corresponding to a list of concepts it adheres to.
Composition of concepts inside classes is translated into composition of the corresponding abstract class through a dynamic pointer.
Each method is translated to a YOMM2 multimethod with every parameter requirement being translated to a corresponding abstract class.

The question may arise, if it is type safe and semantic preserving to translate a concept – abstraction over classes defined implicitly by their method implementations – into an abstract class – an abstraction defined explicitly for a closed set of classes. To that, we must note, that a module can only compile, when all the implementations are provided. At this point, all the complying classes can be uniquely found and gathered for each concept, which makes the definition of corresponding abstract class type-safe and free of potential runtime dispatch errors.

In the dependant module, a newly defined class can derive from an external class, as is standard for C++. Even if concept, definition is extended, this concept is not retranslated. Instead, the original abstract class is reused, both for composition and for virtual types for multimethod parameters. Newly defined methods can be defined for external types, as shown in the example involving YOMM2 (see Listing 2.7.2).

Extensibility of Visual Solution is therefore at least as capable as that of multimethods.

External composability

Similarly, as for extensibility, we can show that Visual Solution’s composability is the same degree as Multimethods. That means, that modules’ diamond problem is solvable (see Figure 1.1).

However, a similar, but more restrictive “Y problem” is not solvable, or at least solution is not known yet. To swiftly describe such “Y problem”: let us consider module A, which defines a concept Ω, module B, which adds a new method m and new role R based on that method, and then uses R to extend the definition of Ω, and finally, an independent module X which defines a class c. In this scenario module B cannot implement R for c and register c as adhering to Ω. This is because cᵀ (means: c’s translation) is a baseless class, and m requires cᵀ to be a subclass of Ωᵀ in order to be registered in mᵀ (multimethod).

This problem may prove to be a major hinderance in real-world utility of Visual Solution, although it requires further research to determine, if it can actually be solved within the confines of Visual Solution, either by different translation procedure, or by more generalized multimethod implementation.

Polymorphism

In Visual Solution the concept plays the role of an abstraction, as it is used as a type for both parameters and return values, making it a de-facto compile time expression type. Based on this abstraction, runtime polymorphism is achievable. Methods (functionally equal to YOMM2 multimethods) build a virtual table of calls to dynamically dispatch functions based on concrete types of parameters in runtime.

At the same time static polymorphism is also possible based on static analysis. If at a call site a some of the parameters are statically known, such that a set of functions they correspond to have a common return type, narrower than the method return type, that common type can be used instead, as the actual return type of this call. This process could not only improve accuracy of the type system and reduce number of classes handled, but also improve performance, by reducing the dimensionality of the dispatch table, or potentially dispatching that call statically altogether.

Code overhead

Visual Solution’s primary aim is to minimize overhead. An observation, that many solutions create singleton abstractions, i.e. one-method interfaces led to introduction of roles, which have the same uses, but are defined implicitly.

Concepts with concept view serve as a macroscale perspective on the code, providing for a convenient table of contents; from which hyperlinked navigations can be performed to more focused: class and method views. They also serve as a basis for macroscale operations, like redefinitions of concept semantics.

These optimizations should lead to a smaller number of non-transformative operations (such as movement between code sites in textual programming languages, or navigations between views in Visual Solution) needed to perform a semantic change in a program, which in turn should increase developer experience and productivity.