Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/LensPlaysGames/LensorCompilerCollection
A compiler we made just for fun :^)
https://github.com/LensPlaysGames/LensorCompilerCollection
compiler compiler-design compiler-optimization first-class-functions programming-language static-typed
Last synced: 2 months ago
JSON representation
A compiler we made just for fun :^)
- Host: GitHub
- URL: https://github.com/LensPlaysGames/LensorCompilerCollection
- Owner: LensPlaysGames
- License: mit
- Created: 2022-08-01T17:12:27.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-12-21T22:20:48.000Z (about 1 year ago)
- Last Synced: 2024-02-17T06:35:08.678Z (12 months ago)
- Topics: compiler, compiler-design, compiler-optimization, first-class-functions, programming-language, static-typed
- Language: C++
- Homepage:
- Size: 3.83 MB
- Stars: 174
- Watchers: 5
- Forks: 18
- Open Issues: 1
-
Metadata Files:
- Readme: README.org
- License: LICENSE
Awesome Lists containing this project
README
#+created: <2022-08-01 Mon>
* Lensor Compiler Collection
Now written in C++, LCC started as a compiler written in C for just one hobby language, ~Intercept~. Over the course of a few years, it has grown into a compiler collection.
** Usage
For convenience purposes, there is a single executable, ~lcc~, that can delegate between all of the different compilers in the collection. This is called the /compiler driver/, often shortened to just /driver/.
Running the driver executable with no arguments will display a usage message that contains compiler flags and options as well as command layout.
The driver, by default, uses file extension to determine which language to treat the source as. ~lcc examples/glint/SimpleFile.g~ will treat the contents as ~Glint~ source, while ~lcc tst/ir/roundtrip.lcc~ will treat the contents as LCC's Intermediate Representation, for example . You can use ~-x LANG~ to treat source files as a particular language, regardless of extension.
** Building
Dependencies:
- [[https://cmake.org/][CMake >= 3.20]]
- Any C++ Compiler (We like [[https://gcc.gnu.org/][GCC]])
NOTE: If on Windows *and* using Visual Studio, see [[file:docs/VISUAL_STUDIO.org][this document]] instead.
First, generate a build tree using CMake.
#+begin_src shell
cmake -B bld
#+end_srcFinally, build an executable from the build tree.
#+begin_src shell
cmake --build bld
#+end_src** Implemented Languages
- Glint | Low Level, Higher Order
In the future, we hope to support
- some of C
- YOUR language :) (it's easy, really)*** Targeting LCC IR
For those of you who already have a language /and/ a compiler implementation that you are happy with, you are able to extend it minimally to allow LCC to compile (and optimise!) your language. There are two main ways to do this: via text, or direct interaction.
**** Textual LCC IR
As a compiler developer, it is important for me to be able to understand the different data structures in the compiler, so, naturally, I print them out to the terminal. This results in a textual representation of the intermediate representation, which I have also written a parser for. =lcc= treats file extension =.lcc= as LCC IR source, which can be produced from the languages built into the compiler with the IR output code format command line option =-f ir=.
You could theoretically write an AST-visiting code generator that outputs text, much like you can with assembly (and like I did when this compiler first started), but you wouldn't have to worry about register allocation, or any of that complicated business. Just the underlying computations that you want to make happen. If you then fed this to =lcc=, it would know how to compile (and optimise!) it.
NOTE: If you target textual LCC IR, there is no guarantee of the syntax of the IR over time. That is, if we need to make changes, changes will be made, and that may mean the text your code generator generates is no longer valid LCC IR. Do also note that if you opt for direct interaction, you get the textual emission *for free*, while also not suffering from the invalidation problems. Also, if written in C++, you could maybe take advantage of LangTest to test your language and make sure both your parser and your LCC IR generator work properly.
**** Direct Interaction
Link with =liblcc= and use the code directly in your existing compiler implementation.
Generally, IR generation can be very wide but not very deep. That is, there can be quite a bit of code to write, but there is only a few things that it is doing. This is evident in the header of Glint's IR generation, at =include/glint/ir_gen.hh=---there are only a few important functions: =insert=, to insert an IR instruction at the current position, and =generate_expression= to insert instructions corresponding to the AST node it's given. From there, it's only as complex as your language's AST is.
Of course, if your language's compiler implementation /already/ produces it's own lower-level-than-an-AST representation, you could convert from that; there are no rules as long as you produce IR functions with IR blocks with IR instructions within them :).
This is just here for those curious to start probing around in the code that you would be using to generate LCC IR.
#+begin_example
#include
#include
#include
#include
#include
#include
#include
#include
#+end_example