PeterO
Posts: 5024
Joined: Sun Jul 22, 2012 4:14 pm

Bootstrapping Languages

Mon May 13, 2019 3:25 pm

Somewhere in the last few days there was a discussion about bootstrapping languages, but I can't find it now.... It might have been deleted as such things are considered "off topic" by some people :roll:

Anyway, over the weekend I was reminded of the comments made by Tony Hoare in his Turing Award lecture, "The Emperor’s Old Clothes":
http://zoo.cs.yale.edu/classes/cs422/20 ... mperor.pdf

In it he writes: "[The Algol 60 compiler] was designed and documented in ALGOL 60, and then coded into decimal machine code using an explicit stack for recursion. Without the ALGOL 60 concept of recursion, at that time highly controversial, we could not have written this compiler at all."

So even without a working predecessor, the ALGOL 60 compiler still bootstrapped itself :-)
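
As an aside, here is a rough idea of what "using an explicit stack for recursion" means in practice. This is a minimal Python sketch, not anything from Hoare's compiler: the tuple-based expression tree and the function names `eval_recursive` and `eval_explicit_stack` are made up for illustration. Both functions compute the same result; the second manages its own stacks by hand, as you must when the target machine code has no recursive procedure calls.

```python
import operator

# Hypothetical expression format: an int, or a tuple (op, left, right).
OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}


def eval_recursive(node):
    """The design as you would write it in a language with recursion."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return OPS[op](eval_recursive(left), eval_recursive(right))


def eval_explicit_stack(root):
    """The same algorithm, hand-coded with explicit stacks and one loop."""
    work = [root]   # nodes still to visit, plus operator markers
    values = []     # intermediate results
    while work:
        item = work.pop()
        if isinstance(item, int):
            values.append(item)
        elif isinstance(item, str):
            # Operator marker: both operands have been evaluated by now.
            right = values.pop()
            left = values.pop()
            values.append(OPS[item](left, right))
        else:
            op, left, right = item
            work.append(op)     # applied last
            work.append(right)
            work.append(left)   # popped, and so evaluated, first
    return values.pop()


tree = ('-', 10, ('*', 2, 3))
print(eval_recursive(tree), eval_explicit_stack(tree))
```

The two functions should agree for any tree built from these three operators; the recursive version is the "documentation", the explicit-stack version is what actually gets coded.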

PeterO
Discoverer of the PI2 XENON DEATH FLASH!
Interests: C,Python,PIC,Electronics,Ham Radio (G0DZB),1960s British Computers.
"The primary requirement (as we've always seen in your examples) is that the code is readable. " Dougie Lawson

hippy
Posts: 5969
Joined: Fri Sep 09, 2011 10:34 pm
Location: UK

Re: Bootstrapping Languages

Mon May 13, 2019 4:46 pm

PeterO wrote:
Mon May 13, 2019 3:25 pm
So even without a working predecessor, the ALGOL 60 compiler still bootstrapped itself :-)
Eventually. Obviously not immediately when it was just an Algol 60 compiler written in Algol 60 in some source file.

There's always some precursor language needed in bootstrapping a new language.

One neat trick is to write a compiler in its own language but use macros so that the same source is also valid in some other language, and can therefore be compiled using that other language's tools. That makes for an interesting debate as to which language it is actually written in.

I'm not convinced a language which compiles itself is necessarily a great idea. The question has to be what happens if all executables, or those which can compile the latest source, were lost?

It's back to bootstrapping from the ground up and that's not always easy if the compiler has evolved to depend on things which could never have been in the initial versions.

I suppose that applies to all languages but it's more of an issue for those which lose favour.

Heater
Posts: 13367
Joined: Tue Jul 17, 2012 3:02 pm

Re: Bootstrapping Languages

Mon May 13, 2019 7:17 pm

hippy,
The question has to be what happens if all executables, or those which can compile the latest source, were lost?
This is the kind of question that wakes me up in the middle of the night.

Seems to me that if we somehow lost all the executables and/or source code of C++ compilers we would be screwed. So much depends on C++ now, even GCC. It would take thousands of software engineers three decades to recreate it all!

Hopefully they would think twice about making the same mistake over again though :)

Contrast that with the first ALGOL 60 compiler, written by Dijkstra: only 4000 lines of assembler, which he wrote out by hand on coding sheets. It has recently been tested and published in a paper.

In the face of such a catastrophe we could get that going again quickly enough, and then use ALGOL 60 to bootstrap ourselves back into the Second Age of Computing.
Memory in C++ is a leaky abstraction .

Andyroo
Posts: 4503
Joined: Sat Jun 16, 2018 12:49 am

Re: Bootstrapping Languages

Mon May 13, 2019 7:36 pm

Forth was a bit like that - you had a minimal language of words implemented in assembler and then built the rest in those and subsequent words.

I spent a few years messing with compilers and cross assemblers and am constantly amazed by the way languages have grown in power and complexity since my time share days, but I am not convinced they have got more readable (COBOL still wins I think).

Given the amount of documentation around now I do not think we are at risk of losing those skills BUT how close would the result be if they were re-created? What programmer would / could resist just making a little ‘tweak’ to ‘fix’ something they do not like in the language :lol:

The other risk is that there are too many languages - I hate having to pay for a software house to ‘upgrade’ someone else’s code as they do not use the same language! Web sites are classic for this - I have lost count of the number of offers to make the site ‘better’ with this framework rather than that one, and of the number of times commercial directors fall for it... Watch coders and designers run when you ask for an SLA and a refund if site numbers do not increase :lol: :roll: :lol:
Gone for good.

jahboater
Posts: 4690
Joined: Wed Feb 04, 2015 6:38 pm

Re: Bootstrapping Languages

Mon May 13, 2019 7:49 pm

hippy wrote:
Mon May 13, 2019 4:46 pm
I'm not convinced a language which compiles itself is necessarily a great idea.
Following on from something Heater said.

There are "systems programming" languages.
These, by definition, should be able to compile themselves. Mandatory. They are designed for writing operating systems, compilers, editors, assemblers, debuggers, you name it.

There are "general purpose" languages.
These ought to be able to compile themselves if they are truly general purpose. A compiler is a big complex program, and a weak language might have problems with "programming in the large". Speed may also be a problem. But if a language cannot cope with transforming source code text to assembler or machine code it is probably not up to much in the real world.

There are "specialist" languages.
R comes to mind for statistics, SQL for databases, but there are many. There is no reason at all why these should be self hosting. Perfectly reasonable to use a systems programming language like C to implement the translator.

Heater
Posts: 13367
Joined: Tue Jul 17, 2012 3:02 pm

Re: Bootstrapping Languages

Tue May 14, 2019 4:53 am

Certainly a "systems" programming language should be able to rebuild itself from source code written in the same language. After all we would like our systems to be able to clone themselves, and we would like to be able to mutate them to work on different architectures and hardware platforms. The Unix way.

Is it even reasonable to expect or demand that higher order language compilers/interpreters be written in the same higher order language, rather than in the lower order language they are currently written in?

For example, one might spend years writing compilers/interpreters/run-times in C/C++ for languages like Python, Javascript, Perl, etc. Why would one then want to reimplement all of that in Python, Javascript, Perl, etc?

It would be a lot of work.

It's not necessary because your low order language, C/C++, is everywhere anyway.

Compiler/interpreter performance and memory consumption will likely be worse.

About the only upside is that making a language "self hosting" demonstrates it has the qualities of a systems programming language and perhaps makes it actually useful as one. That might be the "boot strapping" we are talking about here. That might be a worthwhile goal for languages like Rust and D for example.

But perhaps that is not the point of the language. Perhaps it's designed to make other things easier. Like SQL as noted above.

What does bug me is to find that Javascript, as used in node.js, requires Python to rebuild it!

hippy
Posts: 5969
Joined: Fri Sep 09, 2011 10:34 pm
Location: UK

Re: Bootstrapping Languages

Tue May 14, 2019 3:27 pm

I have never been convinced that a programming language has to be able to compile itself to be worthy of acclaim. Nor am I convinced that being able to do so demonstrates as much about how capable or credible the language is as some may claim. While compilers aren't usually "trivial" they aren't necessarily complicated either. I think we sometimes get over-awed by how magical a compiler seems to be.
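
A rough illustration of how small the core of a compiler can be: a minimal Python sketch that compiles integer arithmetic into instructions for an imaginary stack machine, then runs them. The instruction names (`PUSH`, `ADD`, `SUB`, `MUL`) and the functions `compile_expr` and `run` are assumptions made up for this example, and there is no error handling. A real compiler has far more to deal with, but this shows the basic parse-then-emit shape.

```python
import re


def compile_expr(src):
    """Recursive-descent compiler: source text -> list of instructions."""
    tokens = re.findall(r'\d+|[-+*()]', src)
    pos = 0
    code = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == '(':
            expr()
            advance()  # consume the closing ')'
        else:
            code.append(('PUSH', int(tok)))

    def term():
        factor()
        while peek() == '*':
            advance()
            factor()
            code.append(('MUL',))

    def expr():
        term()
        while peek() in ('+', '-'):
            op = advance()
            term()
            code.append(('ADD',) if op == '+' else ('SUB',))

    expr()
    return code


def run(code):
    """Tiny interpreter for the imaginary stack machine."""
    stack = []
    for instr in code:
        if instr[0] == 'PUSH':
            stack.append(instr[1])
        else:
            b = stack.pop()
            a = stack.pop()
            stack.append({'ADD': a + b, 'SUB': a - b, 'MUL': a * b}[instr[0]])
    return stack.pop()


program = compile_expr('2 + 3 * (4 - 1)')
print(program)
print(run(program))
```

Operator precedence falls out of the grammar: `term` handles `*` before `expr` ever sees `+` or `-`, so no precedence table is needed.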

Heater
Posts: 13367
Joined: Tue Jul 17, 2012 3:02 pm

Re: Bootstrapping Languages

Tue May 14, 2019 5:14 pm

Exactly.

Any programming language that is "Turing Complete" as they say, provides the capabilities to write a compiler for that language in the same language.

Whether it makes sense to do so is another matter.

The language may be very convenient and clear for expressing some class of ideas, which doesn't happen to include writing compilers.
Last edited by Heater on Wed May 15, 2019 10:31 am, edited 1 time in total.

jahboater
Posts: 4690
Joined: Wed Feb 04, 2015 6:38 pm

Re: Bootstrapping Languages

Wed May 15, 2019 8:04 am

This paper written by Dennis Ritchie is interesting and relevant ...

https://www.bell-labs.com/usr/dmr/www/chist.html

r3d4
Posts: 967
Joined: Sat Jul 30, 2011 8:21 am
Location: ./

Re: Bootstrapping Languages

Thu May 23, 2019 9:23 am

(interesting and relevant ...)+

https://gra.mirror.cyberbits.eu/fosdem/ ... numes.webm - "GNU Mes Reduced Binary Seed bootstrap for GNU Guix."
Two years ago on FOSDEM'17 a minimalistic bootstrap was a dream; today it has started to become reality.
Bootstrapping GNU/Linux without use of the GNU toolchain (gcc, binutils, glibc) was our first milestone that we just reached.
Mes became a GNU project.
We will talk about what bootstrapping is and why it is important.
We will show how Mes' minimalistic Scheme has made this possible
and on future milestones (Scheme only bootstrap, other GNU/Linux distributions)
before reaching the holy grail: a full source bootstrap.

Heater
Posts: 13367
Joined: Tue Jul 17, 2012 3:02 pm

Re: Bootstrapping Languages

Thu May 23, 2019 3:32 pm

Incredible. I'm really glad there are nerds in the world crazy enough to even think of doing such a thing.

A Scheme run time in C, a C compiler in Scheme. Talk about meta-circular! Did I catch all that correctly?

Oddly enough, yesterday I was wondering what I might use with my home brew RISC V on FPGA project. It's nowhere near suitable to run Linux, and I want something other than cross compiling C for it. Something interactive, like the Javascript of Espruino, say. People like to use Forth for this kind of thing but that's not really me. So I started looking at small Scheme interpreters that might be usable.

Sounds like this bootstrap project might have just what I was looking for.
