BackDrops  Views  Lego  Megabloks  Puzz3D  Compilers  MUDs

The Toy Language and Compiler

I was fairly early in jumping onto the Amiga bandwagon, and there wasn't much in the way of programming tools available for it, initially. So it was soon obvious that making a version of my CP/M Draco compiler for the Amiga was a good idea. I did that, and managed to get it out fairly early. That lead to some interest in it before the later languages overtook it (since I didn't work on it all the time, it didn't keep pace with developments). I don't remember the details of how it happened, but I ended up writing a series of articles for an Amiga magazine about how to write a compiler. For those articles, I designed a subset of the Draco language, and implemented (in Draco of course) a compiler for it. I called the language "Toy". The compiler was standalone, needing no linker or include files or anything. It emitted simple MC68000 code directly into an executable AmigaDOS object file.

I ended up with a set of four articles, which were published in the July 1989 - October 1989 editions of the "Amiga Transactor". The appropriate subsets of the sources to the Toy compiler were included in the magazine and on the on-disk versions of the magazine. Unfortunately, the publisher closed that magazine down after the October issue, and I never got paid for any of the articles. So, I feel I still own them, and can thus publish them here on my web pages. Linked from this page are sample Toy programs, the articles, and the full Toy compiler sources. Note that all of these are straight text files - they contain no HTML. So, you will have to use your browser's BACK button to get back to this page from each of them. Having an article in one browser window at the same time as having a source file being discussed in another browser window would be useful.

First we have some sample Toy programs (I used the suffix '.t' for Toy source files, but Internet Explorer thinks that means something else, so I've changed to '.toy' here):

hello.toy This is the usual "hello there world" program.
factorial.toy Here is a simple program to calculate factorials using recursion. It shows using input as well as output.
hanoi.toy This example shows a recursive "Towers of Hanoi" program. This is the program I use as a standard example for many of my languages.
shoot.toy This program is the longest Toy program I ever wrote. It manages to implement a simple shoot-the-target game. It uses character graphics only. It also uses the Amiga's ANSI console driver to do colour changes, cursor positioning, etc.

The first article in the series contains some general discussion on writing compilers, describes the Toy language, gives some examples, and shows an example program going from input source, through lexical scanning and parsing, to final assembler instructions and binary file format.

The second article describes some of the data structures used by the Toy compiler. It then describes the code for the symbol table and for the lexical scanner (tokenizer). Accompanying it were 4 main source files:

toy.grc This file defines the data structures used by the compiler.
externs.grc This file simply declares all of the functions exported from one source file of the compiler to others.
symbol.drc This file contains the code for the symbol table. This includes looking up and entering symbols in the compiler's symbol table.
lex.drc This file contains the source for the lexical analyzer, or tokenizer, for the compiler. This is the part of the compiler which takes the raw characters of the input source file and interprets them as numbers, keywords, user identifiers, special punctuation characters, strings, etc. of the Toy language.

The third article in the series describes the parsing done in Toy. This includes the parsing of declarations, expressions and statements. Relevant source files are:

decl.drc This source file contains the code which parses variable, parameter and function declarations in the Toy language. Toy types are very simple - in compilers for larger languages, there would be quite a bit more code to parse types.
expression.drc This file contains the source code for parsing expressions. It is a simple recursive descent parser. One of the goals of the Toy project was to not use any external tools, so no parser generator was used. Personally, I've found recursive descent to be easier, and to provide much better control over error handling.
statement.drc This file contains the parsing code for the various statement forms that Toy has. This includes 'while' loops, 'if' statements, etc. It again uses direct recursive descent parsing.
parseUtil.drc This small source file includes some utilities that were generally useful during parsing.

The fourth and final article in the series talks about code generation. The Toy compiler treats the MC68000 as a stack machine, so doesn't generate very good code. However, it is certainly adequate for a demonstration compiler. The article briefly discusses some alternate code generation techniques, as well as the simple direct technique used here. The code generation process is illustrated with an example, and some present and possible optimizations are discussed. There is only one source file that is relevant here:

codeGen.drc This file contains the code generator for the Toy compiler.

There are two remaining source files which didn't fit in with the articles. They are:

system.drc This file contains the interface between the Toy compiler and the hosting operating system. Up until this point, the entire compiler has been independent of what operating system it is running on. This source file is for AmigaDOS. It would not be hard to create variants for other systems, since a compiler needs little from an OS.
runTime.drc This source file is very strange. You may want to look at it, but if you don't understand exactly what it is doing or how, don't worry about it. Basically, I wanted the Toy compiler to not need a linker in order to produce an executable AmigaDOS binary. So, I put the source to the runtime system into the compiler. That source is compiled along with the rest of the compiler. When running, the compiler takes the image of those routines from itself, modifies them slightly, and writes them out to the object file. Those routines are written in Draco, but contain some pseudo-assembler code to handle interfacing to Amiga system routines which use specific registers for parameter passing. The technique I used here is very fragile - changes to the code generation of the Draco compiler result in changes to offsets within these runtime routines. So, I'm not happy with the result, but I never thought of a better one.

The last article is missing a conclusion section. I had presented three alternative conclusions to the magazine editor, depending on whether or not they wanted me to continue with other programming language related subjects. The set of conclusions is here.


Up  Home