{"id":21145400,"url":"https://github.com/etchedpixels/fuzix-compiler-kit","last_synced_at":"2025-07-09T07:31:24.939Z","repository":{"id":78756259,"uuid":"479790222","full_name":"EtchedPixels/Fuzix-Compiler-Kit","owner":"EtchedPixels","description":"Fuzix C Compiler Project","archived":false,"fork":false,"pushed_at":"2024-04-22T12:56:45.000Z","size":1609,"stargazers_count":33,"open_issues_count":3,"forks_count":6,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-04-22T14:02:28.237Z","etag":null,"topics":["c","compiler"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EtchedPixels.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":"support1802/1802.S","governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2022-04-09T17:00:10.000Z","updated_at":"2024-04-24T19:58:25.928Z","dependencies_parsed_at":"2023-10-27T00:29:43.824Z","dependency_job_id":"defac031-e792-4cef-9c26-77a1a0ceef83","html_url":"https://github.com/EtchedPixels/Fuzix-Compiler-Kit","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EtchedPixels%2FFuzix-Compiler-Kit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EtchedPixels%2FFuzix-Compiler-Kit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EtchedPixels%2FFuzix-Compiler-Kit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EtchedPixels%2FFuzix-Compiler-Kit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Etc
hedPixels","download_url":"https://codeload.github.com/EtchedPixels/Fuzix-Compiler-Kit/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225492617,"owners_count":17482924,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","compiler"],"created_at":"2024-11-20T08:39:57.237Z","updated_at":"2025-07-09T07:31:24.932Z","avatar_url":"https://github.com/EtchedPixels.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Working Development Tree For the Fuzix Compiler Kit\n\n## Design\n\ncc0 is a tool that tokenizes a C file and handles all the messy\nnumber conversions and string quoting to produce a token stream for a\ncompiler proper to consume. It also extracts all the identifiers and numbers\nthem, before writing them out in a table.\n\ncc1 takes the tokenized stream and generates an output stream that consists\nof descriptors of program structure (function/do while/statement etc) with\nexpression trees embedded within.\n\ncc2 will then turn this into code.\n\nIn theory it ought to also be possible to add a cc1b that further optimizes the\ntrees from cc1.\n\n## Status\n\nThe compiler is currently used to build the Fuzix OS for 8080, 8085 and Z80\nand can cross build itself to run natively on these systems. The core code\nshould be reasonably stable. There is a lot of performance work to do on\nthe compiler itself and there are still a couple of deviations from spec\nthat would be nice to fix. 
The backends for 8080/5/Z80 should be fairly\nstable but are being used to experiment with improvements.\n\nThe other processor trees are very much a work in progress.\n\n## Installation\n\nAs a cross compiler the front end expects it all to live in `/opt/fcc/`. The\ntool chain provides the compiler front end and phases. For cpp for now it\nuses the gcc preprocessor on Linux and DECUS cpp on Fuzix.\n\nEither make the `/opt/fcc` directory and make it owned by your user or do\nthe install phase with appropriate privileges.\n\nThe assembler and loader tools required live in the\n[Fuzix-Bintools repository](https://github.com/EtchedPixels/Fuzix-Bintools).\nTo build it all first clone the Fuzix-Bintools repository and `make install`.\nThen make sure `/opt/fcc` is on your path.\n\nNow clone this repository. In the Fuzix-Compiler-Kit directory do:\n\n```\nmake bootstuff\nmake install\n```\n\nThis will build a bootstrap then build the full tools and install them.\n\n## Requirements\n\nThe compiler kit makes use of strlcpy/strlcat. Glibc users will need glibc\n2.38 or later to catch up with this interface.\n\n## Intended C Subset\n\nThe goal is to support the following\n\n### Types\n\n* char, short, int, long, signed and unsigned\n* float, double\n* struct, union\n* enum\n* typedef\n\nCurrently the compiler requires that the target types all fit into the host\nunsigned long type.\n\nCurrently the compiler hardcodes assumptions that a char is 8bits, short\n16bit and long 32bits (see tree.c:constify and helpers). 
This needs to be\naddressed.\n\n### Storage classes\n\nauto, static, extern, typedef, register\n\nregister is dependent upon the backend.\n\n### C Syntax\n\n* standard keywords and flow control\n* labels, and goto\n* statements and expressions\n* declarations\n* ANSI C function declarations\n\n### Intentionally Omitted\n\nThings that add size and complexity or are just pointless.\n\n* K\u0026R function declarations\n* Most C95 stuff - wide char, digraph etc\n* Most C99 bloat by committee\n* C11 bloat by committee\n* struct/union passing, struct/union returns and other related badness\n* bitfields\n* const and volatile typing. To do these makes type handling really really tricky. They are accepted so that code with them can build and some magic tricks are done to get volatile right\n\n### Known Incompatibilities (some to be fixed)\n\n* The constant value -32768 does not always get typed correctly. The reason for this is a complicated story about how cc0/cc1 interact.\n* Many C compilers permit (void) to 'cast' the result of a call away; we do not.\n* Local variables have a single function wide scope not a block scope\n\n## Backend Status\n\n### 1802\n\nAn experimental bytecode engine for the 1802. The bytecode side of the\ngeneration appears to be functional (except for floats) and the bytecode\nsimulation passes the basic tests. The next steps are a bytecode format\nassembler for user bytecode pieces, and to start to build and debug the\nactual 1802 interpreter. It should also be a good basis for any other\nCPU needing this sort of treatment.\n\n### 6303/6803/68HC11\n\n6303 and 6803 pass the basic tests at this point. 68HC11 needs a little bit\nmore work to nail some remaining bugs. 6803 code will run on all three, 6303\ncode will run only on the 6303, and 68HC11 code only on the 68HC11.\n\n### 6502\n\nEarly development code for a 6502/65C02 backend. 
The main work at this point\nhas been adding compiler support for reducing operations down to byte size\nwhere possible.\n\n### 65C816\n\nAn initial 65C816 native port that passes the test suite but probably has some\nbugs left to find. As this port is designed for Fuzix and to run in any bank it\nuses Y as the C stack pointer and uses the CPU stack for temporary values\nduring expression evaluation and for all the actual call/return addresses. Split\ncode/data is supported but not multiple data or code banks in one application\n(that is, pointers are 16bit). Going beyond that gets very ugly very fast as on\n8086. Still needs float support finishing.\n\n### 6800\n\nThe 6800 backend passes the full tests. For size reasons the 6800 ABI\nis not the same as the 6803/6303.\n\n### 68HC08\n\nInitial sketches to help debug the byte reduction code on a big endian\nmachine.\n\n### 8070\n\nMinimal support for the INS807x series of processors. Passes the basic tests\nexcept for floating point. Needs register tracking and some smarts about\npicking p2 or p3 adding to get the code quality up. Code density is still\nreasonable thanks to the 16bit operations and stack relative load and store.\n\n### 8080/8085\n\nThe compiler generates reasonable 8080 code and knows how to use call stubs\nfor argument fetching/storing to get compact code at a performance cost if\nrequested. On the 8085 extensive use is made of LDSI, LHLX and SHLX to get\ngood compact code generation.\n\nLong maths is quite slow but is not trivial to optimize, particularly on the\n8080 processor. There is also no option to use RST calls for the most common\nbits of code for compactness (quite possibly worth 1Kb or more for some\nstuff). The code generator does not know the fancy tricks for turning\nconstant divides into shift/multiply sets.\n\nThe BC register is used as a register variable for either byte or word\nconstants, or a byte pointer. 
As there is no word sized load/store via BC or\neasy way to do it the BC register pair is not used for other pointer sizes.\n\nSigned comparison and sign extension are significantly slower than unsigned.\nThis is an instruction set limitation.\n\n### 8086\n\nInitial code only. This is awaiting some further work on the assembler end\nof the toolchain for debug.\n\n### DDP16\n\nInitial work only. This is primarily being used to experiment with some\nword machine optimisations and behaviour improvements.\n\n### EE200\n\nElectrodata EE200 / Warrex CPU4 backend. Early work only with a view to\ndeprecating the existing cc65 based project.\n\n### Nova\n\nAn initial port to the DG Nova series machines. This target generates pure\nDG Nova code for the original Nova series machines. It requires that the\nautoinc/autodec memory locations are present.\n\n### Nova3\n\nAn initial port to the DG Nova 3 and Nova 4. Autoinc/dec memory is not\nrequired. The Nova 4 byte operations are not currently supported or used.\nCurrently adding Nova 4 byte operations and the related DG Eclipse.\n\n### Super8\n\nEarly code only for the Zilog Super8 variant of the Z8 processor.\n\n### ThreadCode\n\nAn initial backend that turns the C input into a series of helper references\nand data. This can easily be tweaked to make them calls, and peephole rules\nused to clean up or re-arrange them a bit to suit any need or turn it into\nbytecode etc.\n\n### TMS7000\n\nA slightly modified Z8 target for the TMS7000 as the two are remarkably\nsimilar in terms of their compiler requirements. Working on an emulator\nintegration to begin proper testing.\n\n### Z8\n\nThis port now passes all of the self tests and the code coverage compile\ntests. It has not yet been used except on test sets so probably contains\na few bugs. Split I/D is supported. 
Size optimization support is now\nincluded and hugely reduces the code size with -Os but is not yet fully\ndebugged.\n\n### Z80 / Z180\n\nThe Z80 code generator will generate reasonable Z80 code. The processor\nitself is difficult to use for C as fetching objects from the stack is slow\nas on the 8080. The compiler will use BC, IX and IY for register variables\nand knows how to use offsets from IX or IY when working with structs.\n\nIf IX or IY are free they will be used as a frame pointer; if not the\ncompiler assumes the programmer knows what they are doing and will assign\nthem as register variables whilst using helpers for the locals.\n\nThe Z180 is not yet differentiated. This will only matter for the support\nlibrary code and maybe inlining a few specific multiplication cases.\n\n### Default\n\nThis is a simple test backend that just turns the input into a lot of calls.\nIt is intended as a reference only although it may be useful for processors\nthat require a threadcode implementation or to build an interpreted backend.\n\n## Internals\n\n### cc0\n\nTakes input from stdin and outputs tokens to stdout. The core of the logic\nis pretty basic; the only oddity is using strchr() in a few places because\nit's often hand optimized assembler. Tokens are 16bits. C has some specific\nrules on tokenizing which make it simple at the cost of producing unexpected\nresults from stuff like x+++++y; (x++ ++ +y).\n\nAll names are translated into a 16bit token number. So for example every\noccurrence of \"fred\" might be 0x8004. The cc0 stage has no understanding of\nC scoping so 0x8004 isn't tied to any kind of scope, merely a group of\nletters.\n\nAfter tokenizing it writes the symbol table out to disk as well. It turns\nout that the compiler phase has no use at all for symbol names and they\ntake a lot of space to store and slow down comparisons.\n\n### cc1\n\nThis is essentially a hand coded recursive descent parser. 
Higher level\nconstructs are described by headers and footers. Within these blocks the\ncompiler stores expression trees per statement. Trees do not span statements\nnor does the compiler do anything at a higher level. There is enough\ninformation to turn functions or even entire programs into a single tree if\nthe code generator or an optimizer pass wished.\n\nThe biggest challenge on a small machine is the memory management. To keep\nthings tight types are packed into 16bits. Where the type is complex it\ncontains an index to an object in the symbol table which describes the type\nin question (and if the type is named also has the type naming attached).\n\nVarious per object fields are packed into runs of 16bit values, such as\nstruct field information and array sizes.\n\nTo maximise memory efficiency without losing the checking the compiler packs\nall functions with the same signature into the same type. As most functions\nactually have one of a very small number of prototypes this saves a lot of\nroom.\n\n### cc2\n\nThis is at its heart a very simple left hand walking code generator. The\ncore backend allows targets to rewrite subtrees, to evaluate trees in other\norders when useful and also provides an interface that allows the target\nto shortcut the stack whenever it can access the second item of data for\nan operation without disturbing the working value.\n\nThis should suit simpler processors like the 6502, 680x, 8080, 8085 etc\nbut isn't a good model for register oriented ones. It's not clear there\nis a good model for register oriented processors that works well in 64K\nof memory.\n\nOn the other hand it's ludicrously easy to change it to produce fairly bad\ncode for any processor you want.\n\n## Credits\n\nThe expression parser was created by turning the public domain SmallC 3.0 one\ninto a more traditional tree building recursive parser and testing it in\nSmallC. 
The rest of the code is original although the design is influenced by\nseveral small C subset compilers and also ANSI pcc.\n\nThe wtest code and some 6809 work were contributed by Warren Toomey\n\nThe 6800 port was taken from an initial sketch to a working compiler by Zu2\n\u003chttp://www.zukeran.org/shin/d/\u003e who also contributed other bug fixes,\nincluding getting the floating point side of the compiler working.\n\nA considerable amount of coverage testing and a large number of test cases\nfor failures were provided by Yasuo Kuwahara.\n\n## Licence\n\nCompiler (not any runtime)\t:\tGPLv3\n\ncopt is from Z88DK. Z88DK is under the Clarified Artistic License\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fetchedpixels%2Ffuzix-compiler-kit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fetchedpixels%2Ffuzix-compiler-kit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fetchedpixels%2Ffuzix-compiler-kit/lists"}