Zed

NOTE: the examples in Old are out of date. Beware. The docs in New/Language.html are up to date.

(2024-10-05: release 1.1 is available (go up one level, then into "New"). You now get only linker warnings for some of the test programs, and the binary is still created. Details in the entry below have changed.)

(2024-08-27: release 1 is really delayed - see the next entry. I've had some real-life issues that have slowed things down, but the real issue is that switching to generating IP-relative code has not gone well. There are many instruction formats. IP-relative addressing does not allow indexing. When referencing non-local entities, relocations are needed. So, if indexing is present for a non-local entity in IP-relative mode, I need to break the instruction into two. I've chosen that the first instruction will be an LEA that simply puts the address of the non-local into reg A. Then the second instruction works from there, not caring about IP-relative.

However, to do that, the information needed to create the relocation in the LEA needs to be passed in to all of the instruction creators that could use a memory addressing mode (many!). And that information needs to be updated to prevent the caller from also creating a relocation record. Also, instructions with immediate operands (1, 2, 4 or 8 bytes) have the immediate *after* the offset/address that needs to be relocated, so there is no easy way to do it all in one place. Yet. My current plan is to add the reference needed for relocation to a small structure that represents addressing, and hopefully avoid a lot of special-case changes.

Another fun aspect is that the CPU makes IP-relative offsets relative to the start of the next instruction. The relocation data that goes into '.o' files must work entirely with the relocation itself, because the linker doesn't understand instructions. So, it treats the target offset of the relocation as being relative to the address of the offset field itself. When a 4-byte offset field is the last thing in the instruction, the next instruction starts 4 bytes after the start of that field. That's why you see "-4" on the end of function names in X86-64 relocation data. For data references, the code generator also has to take into account the size of any immediate data, which comes *after* the relocatable value.)
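The relocation arithmetic above can be sketched in C. This is only an illustration, not the Zed code generator: the `Addr` structure, its field names, and all function names are hypothetical. It assumes a 32-bit PC-relative relocation that the linker resolves as S + A - P (S is the symbol address, A the addend, P the address of the offset field), and an instruction in which only immediate data follows the offset field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical addressing descriptor; the names are illustrative, not Zed's. */
typedef struct {
    const char *reloc_symbol; /* non-NULL when the operand is a non-local entity */
    int         has_index;    /* nonzero when an index register is requested */
    unsigned    imm_size;     /* bytes of immediate data after the offset field */
} Addr;

/* IP-relative addressing has no index register, so an indexed reference to a
   non-local entity is split: an LEA first, then the indexed instruction. */
static int needs_lea_split(const Addr *a)
{
    return a->reloc_symbol != 0 && a->has_index;
}

/* Addend for a 32-bit PC-relative relocation: the CPU measures from the start
   of the next instruction, which lies 4 bytes (the offset field) plus the
   immediate size past the start of the field being relocated. */
static int64_t pc32_addend(const Addr *a)
{
    return -(int64_t)(4 + a->imm_size);
}

/* Value the linker stores in the offset field: S + A - P. */
static int32_t linker_field(uint64_t S, int64_t A, uint64_t P)
{
    return (int32_t)(int64_t)(S + (uint64_t)A - P);
}

/* Address the CPU forms: start of the next instruction plus the stored offset. */
static uint64_t cpu_target(uint64_t P, unsigned imm_size, int32_t disp)
{
    return P + 4 + imm_size + (uint64_t)(int64_t)disp;
}
```

With no immediate the addend is -4, matching the "-4" seen in relocation listings; with a 4-byte immediate it becomes -8. In both cases `cpu_target` applied to the value from `linker_field` gives back S. In the split case, the LEA carries the relocation and has no immediate, so its addend is the plain -4.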

(2024-08-07: release 1 delayed. I decided that the stand-alone "zedc" compiler would be easier to do first, since it currently can't do byte-code and so wouldn't need all of my written-in-Zed code, libraries, etc. Mistake. I'm using WSL2 as my test target, and the "ld" there doesn't like 32-bit absolute addressing for globals of any kind. It wants IP-relative. I'm slowly switching to that (while keeping the old stuff, just in case), but it is proving stubborn.)

Zed is the current name for my long-term project of producing a new programming language, a full programming environment, full user libraries and applications, and a full operating system, all programmed in the same programming language. At some point I will of course require assistance in this project. However, I want to finalize an initial version of the programming language first. A recent discussion suggests that I should "publish" my preliminary notes and thoughts, in order to establish as "prior art" any new concepts and ideas I may have produced. The page here does that. I would like a few (5 or 6 should be enough) people who I don't know to take a snapshot of the 3 files on that page, and save them away, with a datestamp. Email me the location of the copy, so I can save that information.

As of April 2010, much of the Zed programming language is implemented. It is, however, a cumbersome system to use, since most programs require the processing of 40,000 lines of Zed code before user program code is encountered. That number will only increase. However, I plan on having all of that code processed into a "zedworld" database/file, so that it can be accessed by a running Zed program as needed. It would also be possible to have the system produce a traditional executable file for a Zed "program", and use a traditional shared library for the Zed libraries.

Overall Description

What I've done so far is to mostly stabilize on a programming language design, and to write notes on all sorts of other things. I have written a compiler for the language, and it executes using my own bytecode system under X86-64 Linux. Most of the compiler and libraries are written in Zed itself - however, the bytecode engine is in C. There are also C versions of the compiler, which are what usually run. I also haven't bothered doing the parser in Zed yet - it's pretty straightforward. The most complex test program right now uses a Zed version of a friend's object-oriented GUI library, built on top of the X Window System, to implement a bare-bones "package browser", similar to the file browsers on Windows and Linux.

The Zed programming language

is strongly typed
has both fixed array types and dynamic matrix/vector types
has interfaces and capsules (classes) much like Java
has allocated records, structs, enumeration types, etc.
has '@' types, similar to C++ refs
has a privileged mode with pointers, unions, casts, etc.
has generics, which have only one copy of code (might change slightly)
checks all array and matrix indexing
checks ranges on enumeration operations
checks for overflow/underflow/zero-divide
supports full compile-time execution
supports full run-time compilation (Zed is introspective)
allows explicit run-time type tests, but requires no implicit ones
is syntactically based on my previous languages, based on Algol68
will not interface with the current forms of any other programming languages, although other languages can be built on top of the publicly callable Zed semantic code, and then used with other Zed code
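
To make the checking items in the list above concrete, here is a rough C sketch of what checked indexing, checked addition and checked division amount to. It is not Zed code and not how the Zed implementation does it. The function names are made up, and returning a status flag on failure is an assumption chosen for the illustration; this page does not say how Zed reports a failed check.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Checked indexing: returns 0 and leaves *out untouched if i is out of range. */
static int checked_get(const int *a, size_t len, size_t i, int *out)
{
    if (i >= len)
        return 0;
    *out = a[i];
    return 1;
}

/* Checked addition: tests for overflow/underflow before adding, so the
   signed addition itself can never overflow. */
static int checked_add(int x, int y, int *out)
{
    if ((y > 0 && x > INT_MAX - y) || (y < 0 && x < INT_MIN - y))
        return 0;
    *out = x + y;
    return 1;
}

/* Checked division: rejects a zero divisor and the one overflowing case,
   INT_MIN / -1. */
static int checked_div(int x, int y, int *out)
{
    if (y == 0 || (x == INT_MIN && y == -1))
        return 0;
    *out = x / y;
    return 1;
}
```

Each operation costs a compare and a branch on top of the real work, which is why the number of such checks matters for execution speed.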

My goal is to have a completely safe programming system. Currently, privileged mode is only trivially protected, but my intent is that it be protected as much as possible. In my mind, the protection is crucial to the usefulness of the system. A programming language alone cannot provide full protection, of course. Even if all system calls are done directly via traps, it is still possible for modified host kernels to break Zed security. I believe the only true answer is for the OS to be written in Zed, with the usual minimum of assembler code (and I'm willing to add language features to Zed to allow it to replace more assembler), and for that OS to run on a secure, safely sealed CPU, with an unforgeable ID.

Another goal is to have the language execution be as efficient as possible. I'm sure I will be disappointed here, but I will do what I can. Although the system is currently bytecode-oriented, I plan to move to native execution at some point. First, I need to optimize out enough of the run-time checks to make native code relevant.
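
One common way of optimizing out run-time checks is to prove a check once, outside a loop, instead of repeating it on every iteration. The C sketch below illustrates that general idea only. It is not taken from the Zed compiler, and the function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Naive form: one bounds check per element accessed. */
static int sum_checked(const int *a, size_t len, size_t n, long *out)
{
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        if (i >= len)          /* check repeated on every iteration */
            return 0;
        s += a[i];
    }
    *out = s;
    return 1;
}

/* Hoisted form: inside the loop i < n always holds, so checking n <= len
   once before the loop implies i < len for every access. */
static int sum_hoisted(const int *a, size_t len, size_t n, long *out)
{
    if (n > len)               /* single check covering the whole loop */
        return 0;
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    *out = s;
    return 1;
}
```

Both functions accept and reject exactly the same inputs, but the second does one comparison in total where the first does one per element.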

Examples

(Hmm. Old version of this test - newer version is cleaner, not needing the "MappingAPI_t" record.) Here is an example of a generic mapping facility, and some testing of it. Output from running the tests is here.

Here is a debugging package, using the compile-time execution facilities. Here is a test program for the Debug package. You are not expected to follow this example in any detail. Here is the output from running the tests.

For the truly ambitious (or the very curious), here is the current source to the Fmt package (Zed's alternative to the C printf family). And here is a test program that I use to test the Fmt package. Challenge: explain what happens as test "x01" is executed. You don't have all of the source involved, but should be able to make intelligent guesses about what must be happening.

The Future

When will Zed be finished? Never, of course. Well, when will the programming language be stable? That's hard to answer. I'm currently trying to decide whether or not to merge the interface and capsule concepts, which would make capsules consume one pointer of extra memory for each object, but would allow something resembling C++'s multiple inheritance. There are several small features to be added, as well as a couple of larger ones: physical unit definition and analysis; and user-definable operators (which will not be the same as existing operators).

When can other people start helping? As soon as I get the language stabilized and completed, I can write language descriptions and commentary. At that point, I can post those here, and input from reviewers will be good. The current implementation is awkward - all Zed library sources are compiled on each run that needs them. I need to get to the point where pre-parsed linearized forms can be read from host files, before others would want to try it out. Also, I need at least one more round of cleanup - Zed sources are intended to be examples of how one should program in Zed. I may also continue and expand an early scheme of comment annotations, which is intended to allow automatic production of basic documentation, somewhat like literate programming.

I want the overall Zed system to be such that the various libraries, programs, etc. that make up the system will never be separated from good documentation (reference guides, tutorials, examples) of them. I want all components to have meaningful names, so that users totally unfamiliar with the system should be able to find what they need, and be able to get a decent non-programmer understanding of what the system does.