Last edited on 24th of August 2020.
A collection of research questions, that may or may not have been answered already. I haven’t searched thoroughly for previous work on these. This is more a personal notebook of what I’d like to know, than a call to action like “Scientists! Research this!”. It’s possible, that these questions are of little actual value.
In short: Is there an appropriate algebraic structure that captures the “setting” of compiler correctness theorems?
Most (of the *very* few) texts I read on the topic of compiler correctness presume a notion of “observational equivalence” and then discuss how this equivalence shall be preserved by the compiler under particular circumstances. I want to know more about how appropriate definitions of observational equivalence look like and what role the semantics of the language play. I want to know more how the compiler correctness properties are preserved, when compilers are composed with eachother. Maybe there is an interesting category with languages as objects and (correct) compilers as morphisms. Maybe constructing Multi-languages has some nice properties in this category.
It’d be nice, if such a “Grand Unified Theory of compilers” would provide the appropriate vocabulary for compiler correctness theorems. Similar to point-set topology.
Maybe I just need to find a good textbook on the topic, or simply read more and more thoroughly about the topic.
https://dbp.io/pubs/2019/ccc.pdf
https://www.williamjbowman.com/blog/2017/03/24/what-even-is-compiler-correctness/
In short: What do compilers and the property of compiler correctness look like under the logical viewpoint of the Curry-Howard correspondence?
Curry-Howard correspondence connects programs and proofs. Compilers are functions that map programs of one programming language to programs of another programming language. Observational equivalences are relations on programs. Compiler correctness properties restrict the behaviour of compilers using observational equivalence. Now apply the Curry-Howard correspondence to the above sentences. What happens? What vocabulary is needed to describe compiler correctness from the logical/proof theoretic viewpoint? What compiler correctness properties are also interesting from a logical viewpoint?
I hope, that this way “natural” compiler correctness properties can be found and I expect “fully abstract” to be an especially important property.
Another topic is the following: Intuitionistic logic can be embedded in classical logic. Classical and intuitionistic logic can be embedded in linear logic. Do these embeddings correspond to compilers with nice properties?
The Wikipedia page has a lot of material linked and referenced.
https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence
In practice, a compiler like gcc is many formal compilers simultaneously, because each compiler flag changes how the mapping between programming languages behaves. For example enabling or disabling some optimisations is a possible change. It might be worth to think about this case, whether the formal situation changes.