Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/cxw42/do-not-self-host
A toolchain starting from assembly so you don't have to self-host your next programming language
assembler assembly bytecode bytecode-interpreter interpreter programming-language programming-language-development self-hosting virtual-machine vm
- Host: GitHub
- URL: https://github.com/cxw42/do-not-self-host
- Owner: cxw42
- License: other
- Created: 2018-04-25T14:03:39.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2020-06-05T00:09:32.000Z (over 4 years ago)
- Last Synced: 2024-11-16T03:52:41.622Z (3 months ago)
- Topics: assembler, assembly, bytecode, bytecode-interpreter, interpreter, programming-language, programming-language-development, self-hosting, virtual-machine, vm
- Language: Python
- Homepage:
- Size: 150 KB
- Stars: 2
- Watchers: 4
- Forks: 0
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# do-not-self-host
A development toolchain from the ground up, starting from assembly.
Don't self-host your next language! Make it possible for us to build from
source, from scratch, without needing a bootstrap package!

This is a long-term hobby project, so please do not expect regular
updates :) . However, I certainly welcome others who want to contribute.

Assumes a development environment that provides stdin/stdout and redirection.

## Current status
* ngb: VM (in C)
* ngbasm: assembler (in Python)

## Editor support

ngb assembly files have the extension `.nas`. A Vim syntax configuration
is available [here](https://github.com/cxw42/ngb-vim).

## I'm not the only one

The Facebook Buck build system also doesn't self-host by default (although
it can). The Buck [FAQ](https://buckbuild.com/concept/faq.html) says, in part:

> Q: Why is Buck built with Ant instead of Buck?
>
> A: Self-hosting systems can be more difficult to maintain and debug.
> If Buck built itself using Buck, then every time a change was made to Buck's source, the commit would have to include a new Buck binary that included that change. It would be easy to forget to include the binary, difficult to verify that it was the correct binary, and wasteful to bloat the Git history of the repository with binaries that could be rebuilt from source. Building Buck using Ant ensures we are always building from source, which is simpler to verify.

## Installation and testing

The code is currently C and Python, but the infrastructure runs in Perl.
Tests use Perl's [`prove`](https://metacpan.org/pod/prove).

### Building

- If you don't already have it, install Perl
  (e.g., using [perlbrew](https://perlbrew.pl/)).
- Install [cpanminus](https://github.com/miyagawa/cpanminus).

Then build using:

    perl Makefile.PL
    cpanm --installdeps .
    make
    cd mtok
    make

Once you have run the `perl` and `cpanm` steps, you shouldn't need to do so
again if you are only working on the C/Python/ngbasm sources. Just run
`make` as necessary.

### Testing

Once you have done the build steps, run `prove` or `make test` in the top
level of the repository.

## Older notes

Based on [crcx/Nga-Bootstrap](https://github.com/crcx/Nga-Bootstrap), which
provides:

* naje - a basic assembler (Python)
* nmfcx - a Machine Forth Cross Compiler (Retro)

In the pipeline:

* NGA+:
  - Implement NGA VM in x86 assembly (NASM?)
  - Read/write stdin/stdout (port-based, a la retro? Maybe not - that's
    flexible, but perhaps more than we need).
  - Add support for record blocks A and B - configurable number of fields
    per block; `aload`, `astore`, `bload`, `bstore`, `aread`, `awrite`,
    `bread`, `bwrite`
  - `.const`
* Minimal Infix High-Level Language (Minhi) - `::=+`, and
  everything else is an expression.
  - Why expressions? Because infix expressions are easy
    to parse based on a table, as described in
    [_A Retargetable C Compiler: Design and Implementation_](https://sites.google.com/site/lccretargetablecompiler/).
  - Lexer written in NGA+ that takes source and outputs token stream
  - Parser written in NGA+ that takes token stream (block A) and outputs
    AST (block B)
  - Compiler that produces NGA+ assembly
  - Later, a compiler that produces x86 assembly

Future: to be determined... (but possibly a C compiler written in Minhi)
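
As an illustration of why the Minhi notes above call infix expressions easy to parse from a table, here is a minimal Python sketch of a lexer feeding a table-driven parser (precedence climbing). It is not code from this repository: the operator set, the precedences, and the names `OPERATORS`, `tokenize`, and `parse` are all assumptions made for the example, and the planned lexer and parser would be written in NGA+ rather than Python.

```python
import re

# Hypothetical operator table: symbol -> (precedence, associativity).
# These operators are illustrative only; they are not Minhi's grammar.
OPERATORS = {
    "+": (10, "left"),
    "-": (10, "left"),
    "*": (20, "left"),
    "/": (20, "left"),
    "^": (30, "right"),
}

# One token: an integer, an identifier, an operator, or a parenthesis.
TOKEN_RE = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/^()])")


def tokenize(source):
    """Lexer stage: turn source text into a flat list of token strings."""
    source = source.rstrip()
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character near offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parser stage: turn a token list into a nested-tuple AST."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_primary():
        token = advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = parse_expression(0)
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token in OPERATORS or token == ")":
            raise SyntaxError(f"unexpected token {token!r}")
        return int(token) if token.isdigit() else token

    def parse_expression(min_prec):
        left = parse_primary()
        while True:
            op = peek()
            if op not in OPERATORS:
                break
            prec, assoc = OPERATORS[op]
            if prec < min_prec:
                break
            advance()
            # Left-associative operators require a strictly tighter right side.
            next_min = prec + 1 if assoc == "left" else prec
            right = parse_expression(next_min)
            left = (op, left, right)
        return left

    ast = parse_expression(0)
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return ast
```

In this sketch, adding an operator only means adding a row to `OPERATORS`; the parsing functions stay unchanged. For example, `parse(tokenize("1 + 2 * 3"))` groups the multiplication first because its table entry has the higher precedence.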