ALFE's closest ancestor is probably C++ (or possibly D, though I don't have much experience of that language), though it also draws inspiration from such disparate corners as C#, PHP and Haskell. But there are several C++ features I'm planning on leaving out of ALFE:
- C compatibility - this is the cause of a great many of the complexities of C++, though it is quite probably also the only reason why C++ has been as successful as it has.
- Separate compilation - another huge source of complexities in C++. The ALFE compiler will have a built-in linker, so it will be able to generate an executable binary in a single invocation. Actually I'm not entirely getting rid of separate compilation - you'll still be able to create and consume object files, but doing so will require explicitly specifying the ABI to use. The problems that are solved by separate compilation I'm planning to solve in different ways, which I'll write about in future posts.
- Declarations. Declaring something separately from its definition is useful if you have separate compilation, or if you're writing a compiler that can compile programs too big to fit in RAM. RAM is pretty cheap nowadays so this isn't a particularly important use case - certainly not important enough to make programmers duplicate information and have an extra place to modify things when they change. The only things which look like declarations in ALFE will be for things like calling a function declared in an external binary, such as for platform ABI calls. I go to great lengths to minimize the number of declarations in my C++ programs.
- The preprocessor. Well, macros specifically (anything you might want to use macros for in C or C++ will be done better with features of the core language in ALFE). Header files will still exist but will have rather different semantics. In particular, they won't be included until the code in them is needed - if you put an include statement inside a class definition and then no instance of that class is ever used, the file won't even be opened. That should drastically improve the time it takes to compile files while making it very simple to use the standard library (whose classes will all be included this way by default).
- The post-increment and post-decrement operators. These do have some merit in C - you can write things like "while (*p != 0) *q++ = *p++;" much more concisely than would otherwise be possible. The first chapter of Beautiful Code has a nice example. However, I don't think these bits of code should be the ones we're making as concise as possible - such a piece of code should be written once, wrapped up as a highly optimized library function and thoroughly documented.
- Digraphs and trigraphs. You'll need the full ASCII character set to write ALFE, which is probably not much of a hardship any more.
- Unexpected exceptions (and run-time exception specifications). Exception specifications are a great idea as a compile-time check but they make no sense as a run-time check, which is why most coding standards recommend against using them. But the reason that C++ needs exception specifications to be a run-time check instead of a compile-time check is separate compilation - there's no way to tell what exceptions will actually be propagated from any particular function. With compile-time exception specifications it's trivial to force the exception specification for destructors to be "nothrow" and thereby make propagating an error from a destructor a compiler error, which is a much better solution than the "unexpected exception" mechanism in C++.
- Multiple inheritance. I gave this one a lot of thought before deciding on it. Multiple implementation inheritance is something that seems to get used very little in proportion to the amount by which it complicates the language. I've only seen one even slightly compelling example (the BBwindow etc. classes in chapter 12 of The C++ Programming Language by Stroustrup) and the rationale for that goes away when you have control over all the parts of the program (i.e. you're not consuming code that you can't modify). ALFE actually goes one step further and disallows multiple interface implementation as well, on the grounds that if you have a single class implementing more than one interface, it's a violation of the separation of concerns. From an implementation point of view, inheritance is a special case of composition - the parent object is just in the first position. That suggests a good way to simulate multiple interface inheritance - compose the class out of inner classes which implement the various interfaces, with pointers to these inner classes retrieved via method calls rather than casts. I might even add some syntax to streamline this if it gets too verbose. And I might still change my mind and introduce multiple (interface) inheritance if my plan doesn't work.
- The main() function. Instead the statements in the file passed to the compiler are executed in order from the top - i.e. there is an implicit "main()" wrapped around the entire compilation unit. Command line arguments and environment variables are accessed via library functions (or possibly global variables), which means they don't have to be passed through the call chain from main() to whatever piece of code interprets the command line arguments. That greatly reduces the amount of boilerplate code.
- public/private/protected/friend. Instead I'm planning to implement "accessibility sections", which are kind of like public/private/protected but specify which classes are allowed to access the members in the section. That way you don't have to give all your friends access to all your private parts, which is possibly the most innuendo-laden complaint about C++.
- int to bool conversion. The condition in "while" and "if" statements must have boolean type. If you want C semantics you have to add "!= 0" explicitly. I've been doing this anyway in my own C/C++ code and I don't find it to be a source of excess verbosity - if anything it clarifies my intent.
- Assignments (and increment/decrement) as expressions. This solves the "if (a=b)" problem that C and C++ have, in a far more satisfactory way than the "if (1==b)" abomination that some recommend. I've also been avoiding using assignments as expressions in my C and C++ code and I don't miss them - I think the code I write is clearer without them.
- C strings (responsible for so many security bugs). ALFE will have a first-class String type, possibly consisting of a pointer to a reference counted buffer, an offset into that buffer and a length, but I'll probably try a few things to see what performs best, and the compiler might even decide to use different representations in different places if there's a performance advantage in doing so.
- References. I can see why these were necessary in C++ but I think for ALFE they will be unnecessary because of the lack of a fixed ABI, and the "as if" rule (the compiler is allowed to use whatever implementation strategy it likes as long as the semantics don't change). So the rule for deciding how to pass an argument is simple: pass by value if it's strictly an input, pass by pointer if the function might need to change it. The compiler will decide whether to actually pass by value or pointer depending on what is fastest on the target architecture for the kind of object being passed.
- Array to pointer decay. In ALFE, fixed-length arrays are first class types, so passing one to a function declared as (e.g.) "Int foo(Int[4] x)" passes by value, i.e. copies the entire array. Getting a pointer to the first element requires explicitly taking its address: "Int* p = &x[0];"
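
To make that last point concrete, here's a rough sketch in C++ rather than ALFE (there's no ALFE compiler to test against). I'm assuming std::array is a reasonable stand-in for ALFE's first-class "Int[4]", and the function names are invented purely for illustration. A built-in array parameter decays to a pointer, so the callee writes through to the caller's storage, while a std::array parameter is copied, which is the behaviour described above for ALFE.

```cpp
#include <array>
#include <cassert>

// C and C++ built-in arrays: the parameter "int x[4]" is really "int* x",
// so this function modifies the caller's array.
void zeroFirstDecayed(int x[4])
{
    x[0] = 0;
}

// std::array (standing in for ALFE's Int[4]): the whole array is copied,
// so changes made here are invisible to the caller.
int sumAfterZeroingFirst(std::array<int, 4> x)
{
    x[0] = 0;
    int total = 0;
    for (int v : x)
        total += v;
    return total;
}

// Hypothetical demonstration of the difference between the two.
void demonstrate()
{
    int raw[4] = {1, 2, 3, 4};
    zeroFirstDecayed(raw);
    assert(raw[0] == 0);              // the caller's array was changed

    std::array<int, 4> a = {1, 2, 3, 4};
    int s = sumAfterZeroingFirst(a);
    assert(s == 2 + 3 + 4);           // the callee zeroed its own copy
    assert(a[0] == 1);                // the caller's array is untouched

    int* p = &a[0];                   // a pointer must be asked for explicitly
    assert(*p == 1);
}
```

Nothing here says how ALFE will actually implement this. Under the "as if" rule the compiler could still pass a pointer behind the scenes, so long as the callee can't observably modify the caller's array.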