Created 2022-08-21
Programming languages are fun to write. Sorta. Inspired by all the new languages coming out (Hare, Odin, Zig, etc.), I decided to have a go at writing my own. I'm calling it "zinc", a name inspired by the Antimony programming language. Antimony describes itself as "a bullshit-free (©) programming language that gets out of your way".
Antimony looks quite nice. It's like C but with a few niceties. It is written in Rust, so there were obviously a lot of dependencies to download. It emits JavaScript (boo!), with C output under development.
I have written my own half-working BASIC interpreter in the past, but I was thinking that I would make zinc a bit like Antimony. Except that I would write the front end in C/C++ and emit QBE (a lightweight LLVM alternative).
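To give a flavour of what "emitting QBE" means: the compiler writes out QBE's textual intermediate language and lets QBE turn it into assembly. Here is a minimal sketch in C, assuming a made-up helper called emit_add_function (the name and its parameters are mine, purely for illustration, and nothing like this exists in zinc yet). It formats a tiny function that adds two 32-bit words.

```c
#include <stdio.h>

/* Hypothetical helper, for illustration only: formats QBE IL for a
 * function that adds two 32-bit words ("w") into buf.
 * Returns what snprintf returns: the length the text needs,
 * or a negative value on error. */
int emit_add_function(char *buf, size_t size, const char *name) {
    return snprintf(buf, size,
        "export function w $%s(w %%a, w %%b) {\n" /* signature       */
        "@start\n"                                /* first block     */
        "\t%%c =w add %%a, %%b\n"                 /* c = a + b       */
        "\tret %%c\n"                             /* return c        */
        "}\n",
        name);
}
```

Called with the name "add", this fills the buffer with a five-line function definition that the qbe tool should accept as input. A real backend would generate text like this while walking a syntax tree rather than from a fixed template.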
I started by hand-writing a recursive-descent parser. Meh, you can do that, it's OK, but that's a lot of manual work. I decided to try lex and yacc again. These can be fiddly tools.
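For anyone who hasn't written one, a recursive-descent parser is roughly one function per grammar rule, each calling the others. The following is a minimal sketch in C for a toy arithmetic grammar, not zinc's actual parser. The grammar, the Parser struct and every function name here are assumptions made up for the example. It evaluates as it parses to stay short, where a real compiler would build a tree.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical toy grammar (not zinc's):
 *   expr   := term   (('+' | '-') term)*
 *   term   := factor (('*' | '/') factor)*
 *   factor := NUMBER | '(' expr ')'
 */
typedef struct {
    const char *pos; /* next unread character */
    bool ok;         /* cleared on the first error */
} Parser;

static long parse_expr(Parser *p);

static void skip_spaces(Parser *p) {
    while (isspace((unsigned char)*p->pos)) p->pos++;
}

/* Consume c if it is the next non-space character. */
static bool accept(Parser *p, char c) {
    skip_spaces(p);
    if (*p->pos != c) return false;
    p->pos++;
    return true;
}

static long parse_factor(Parser *p) {
    if (accept(p, '(')) {
        long v = parse_expr(p);
        if (!accept(p, ')')) p->ok = false; /* missing ')' */
        return v;
    }
    if (isdigit((unsigned char)*p->pos)) {
        char *end;
        long v = strtol(p->pos, &end, 10);
        p->pos = end;
        return v;
    }
    p->ok = false; /* expected a number or '(' */
    return 0;
}

static long parse_term(Parser *p) {
    long v = parse_factor(p);
    for (;;) {
        if (accept(p, '*')) {
            v *= parse_factor(p);
        } else if (accept(p, '/')) {
            long d = parse_factor(p);
            if (d == 0) { p->ok = false; return 0; } /* treat /0 as an error */
            v /= d;
        } else {
            return v;
        }
    }
}

static long parse_expr(Parser *p) {
    long v = parse_term(p);
    for (;;) {
        if (accept(p, '+')) v += parse_term(p);
        else if (accept(p, '-')) v -= parse_term(p);
        else return v;
    }
}

/* Returns true and stores the value in *out if src is a complete, valid expression. */
bool parse_expression(const char *src, long *out) {
    Parser p = { src, true };
    long v = parse_expr(&p);
    skip_spaces(&p);
    if (!p.ok || *p.pos != '\0') return false;
    *out = v;
    return true;
}
```

Calling parse_expression("1 + 2 * (3 - 1)", &v) returns true and sets v to 5, while an incomplete input such as "1 + " returns false. That is three rules in about sixty lines, which gives an idea of how the manual work piles up for a full language.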
So I decided that, as an experiment, I'd try to write a C-to-C parser. I downloaded some yacc and lex files that supposedly specified the grammar. No compilation instructions were given, so I had to figure out a Makefile. I also had to tweak the code a little to get the damned thing to compile. I added line-number tracking and sorted out a pesky shift-reduce warning on IF-ELSE statements.
The yacc file is 437 lines long. I consider that a fair old chunk, considering that it only specifies the rules and doesn't perform any actions. It also does not do any preprocessing. Code emitted by the preprocessor will also confuse the yacc file.
However, it will take a C file, ignoring includes and preprocessor stuff, and echo it to stdout if the input is valid, or report an error if it's wrong.
Have I gained anything? Sure. I have at least got the skeleton of working yacc and lex files, together with a working Makefile. My next step is to use it as a guide to writing my own compiler. I will start from a fresh yacc file, though, otherwise I'll be overwhelmed.
So you can all look forward to another fantabulous NIH programming language sometime in the future.
Just my 2¢.
ANSI C grammar, Lex specification
A Grammar for the C- [sic] Programming Language (Version S21)