{"id":15116729,"url":"https://github.com/howerj/bit-serial","last_synced_at":"2026-03-15T16:02:45.586Z","repository":{"id":147265482,"uuid":"181192714","full_name":"howerj/bit-serial","owner":"howerj","description":"A bit-serial CPU written in VHDL, with a simulator written in C.","archived":false,"fork":false,"pushed_at":"2024-09-01T21:03:13.000Z","size":831,"stargazers_count":116,"open_issues_count":0,"forks_count":9,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-09-27T01:53:40.414Z","etag":null,"topics":["16-bit","16-bit-cpu","1bit","bit-serial","cpu","forth","tiny","vhdl"],"latest_commit_sha":null,"homepage":"","language":"VHDL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/howerj.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-04-13T15:41:09.000Z","updated_at":"2024-08-25T16:30:53.000Z","dependencies_parsed_at":"2024-02-14T22:47:06.358Z","dependency_job_id":"faf4aff3-df4d-4d04-80d4-2e799bd2eed5","html_url":"https://github.com/howerj/bit-serial","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/howerj%2Fbit-serial","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/howerj%2Fbit-serial/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/howerj%2Fbit-serial/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/howerj%2Fbit-serial/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/owners/howerj","download_url":"https://codeload.github.com/howerj/bit-serial/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234461999,"owners_count":18837223,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["16-bit","16-bit-cpu","1bit","bit-serial","cpu","forth","tiny","vhdl"],"created_at":"2024-09-26T01:44:33.408Z","updated_at":"2025-09-27T22:31:04.233Z","avatar_url":"https://github.com/howerj.png","language":"VHDL","readme":"# BIT SERIAL CPU and TOOL-CHAIN\n\n*  Project:   Bit-Serial CPU in VHDL\n*  Author:    Richard James Howe\n*  Copyright: 2019-2020,2023-2024 Richard James Howe\n*  License:   MIT\n*  Email:     howe.r.j.89@gmail.com\n*  Website:   \u003chttps://github.com/howerj/bit-serial\u003e\n\n*Processing data one bit at a time, since 2019*.\n\n# TLDR\n\n* [Soft-Core][] 16-bit Accumulator Based Bit-serial CPU for an [FPGA][].\n* The processor runs a programming language called [Forth][].\n* The core is *tiny* at just about 23 Slices / 76 LUTs.\n* The [VHDL][] testbench is (optionally) *interactive* (but **slow**).\n* The [VHDL][] testbench is configurable without recompilation (it reads from a\n  [configuration file][]).\n\n# Introduction\n\nThis is a project for a [bit-serial CPU][], which is a CPU that has an architecture\nwhich processes a single bit at a time instead of in parallel like a normal\nCPU. This allows the CPU itself to be a lot smaller, the penalty is that it is\n*a lot* slower. The CPU itself is called *bcpu*. 
The test program includes\na fully working Forth interpreter.\n\nThe CPU is incredibly basic, lacking features required to support\nhigher level programming (such as function calls). Instead such features can\nbe emulated if they are needed. If such features are needed, or faster\nthroughput is needed (whilst still remaining quite small) \nother [Soft-Core][] CPUs are available, such as the [H2][].\n\nThe CPU also lacks interrupts, traps, byte addressability for load/storing,\na Memory Management Unit or Memory Protection Unit, and a whole host of\nother features that are in a modern core.\n\nThe core is *small* however, *very small*, here is the map report (edited\nto remove unneeded columns, this might not exactly match what is at the\nhead of the project).\n\n\tMax woosh/speed: 123.369MHz (can be improved with a few choice registers)\n\n\t+-----------------------------------------------------------------------------------+\n\t| Module                 | Slices* | Slice Reg | LUTs   | LUTRAM | BRAM/FIFO | BUFG |\n\t+-----------------------------------------------------------------------------------+\n\t| top/                   | 0/73    | 0/181     | 0/220  | 0/4    | 0/8       | 1/1  |\n\t| +cpu                   | 23/23   | 55/55     | 76/76  | 4/4    | 0/0       | 0/0  |\n\t| +peripheral            | 17/50   | 49/126    | 52/144 | 0/0    | 0/8       | 0/0  |\n\t| ++bram                 | 0/0     | 0/0       | 0/0    | 0/0    | 8/8       | 0/0  |\n\t| ++uart                 | 1/33    | 2/77      | 2/92   | 0/0    | 0/0       | 0/0  |\n\t| +++uart_rx_gen.baud_rx | 9/9     | 21/21     | 25/25  | 0/0    | 0/0       | 0/0  |\n\t| +++uart_rx_gen.rx_0    | 6/6     | 18/18     | 23/23  | 0/0    | 0/0       | 0/0  |\n\t| +++uart_tx_gen.baud_tx | 10/10   | 21/21     | 25/25  | 0/0    | 0/0       | 0/0  |\n\t| +++uart_tx_gen.tx_0    | 7/7     | 15/15     | 17/17  | 0/0    | 0/0       | 0/0  
|\n\t+-----------------------------------------------------------------------------------+\n\t* Not of pizza\n\t* No DSP48A1/PLL_ADV/DCM/BUFR/BUFIO used.\n\nNote that the UART (92 LUTs) is bigger than the CPU core (76 LUTs)! This\nis certainly one of the smallest soft microprocessors, and perhaps the\nsmallest 16-bit soft processor for FPGAs. The UART is actually quite big\nas it is far more general than it needs to be, perhaps later developments\nwill use a smaller one, like in my [SUBLEQ VHDL project](https://github.com/howerj/subleq-vhdl).\n\nTo build and run the C based simulator for the project, you will need a C\ncompiler and 'make'. To build and run the [VHDL][] simulator, you will need [GHDL][]\ninstalled.\n\nThe cross compiler requires [gforth][], although a pre-compiled image is\nprovided in case you do not have access to it, called '[bit.hex][]', this hex file\ncontains a working [Forth][] image. To run this:\n\n\tmake bit\n\t./bit bit.hex\n\nAn example session of the simulator running is:\n\n![C Simulator Running eForth](bit-sim.gif)\n\nYou should be greeted by a [Forth][] prompt, type 'words' and hit carriage\nreturn to get a list of defined functions.\n\nThe target [FPGA][] that the system is built for is a [Spartan-6][], for a\n[Nexys 3][] development board. [Xilinx ISE 14.7][] was used to build the\nproject.\n\nThe following 'make' targets are available:\n\n\tmake\n\nBy default the [VHDL][] test bench is built and simulated in [GHDL][]. This\nrequires [gforth][] to assemble the test program [bit.fth][] into a file\nreadable by the simulator (or you can use the already assembled [bit.hex][]\nfile). As mentioned, the VHDL testbench is optionally interactive, that\nis, input is read from STDIN and output from the CPU (via a simulated UART)\nwill be output to STDOUT. 
The options for this can be set in [tb.cfg][].\nOn my machine it takes a few minutes to print out \"eForth 3.3\" (you will\nneed to run the system for about 80 milliseconds just to be able to process\na \"bye\" input from the user).\n\n\tmake run\n\nThis target builds the C based simulator, assembles the test program\nand runs the simulator on the assembled program.\n\n\tmake synthesis implementation bitfile\n\nThis builds the project for the [FPGA][].\n\n\tmake upload\n\nThis uploads the project to the [Nexys 3][] board. This requires that\n'djtgcfg' (bless you) is installed, which is a tool provided by [Digilent][].\n\n\tmake documentation\n\nThis turns this 'readme.md' file into an HTML file.\n\n\tmake clean\n\nCleans up the project.\n\n# eForth\n\nThe tool-chain for the device is used to build an image for a Forth\ninterpreter, more specifically a Forth interpreter similar to a dialect of\nForth known as 'eForth', it differs from eForth in order to save on space\nwhich is at a premium. You should be greeted with an eForth prompt when running\nthe 'make run' target that looks something like this:\n\n\t$ make run\n\t./bit bit.hex\n\teForth 3.1\n\nYou can see all of the defined words (or functions) by typing in 'words' and\nhitting return.\n\n\t$ make run\n\t./bit bit.hex\n\teForth 3.3\n\twords\n\nArithmetic in Forth is done using Reverse Polish Notation:\n\n\t2 2 + . cr\n\nWill print out '4'. This is not the place for a Forth tutorial, the Forth\ninterpreter is mainly here to demonstrate that the bit-serial CPU is working\ncorrectly and can be used for useful purposes. 
No demonstration would be\ncomplete without a 'Hello, World' program, however:\n\n\t: hello cr .\" Hello, World!\" ;\n\thello\n\nGo use your favorite search engine to find a Forth tutorial.\n\nA more advanced eForth image exists for the\n[SUBLEQ](https://github.com/howerj/subleq) Single Instruction Set Computer,\nanother contender for a small CPU that could be implemented on an [FPGA][]\n(or in [7400 series ICs](https://en.wikipedia.org/wiki/7400-series_integrated_circuits)).\nIt would need porting to this system, which should not be too difficult, and\nincludes multitasking, a text editor, better numeric I/O, more control\nstructures, and is a much more well rounded Forth.\n\n# Use Case\n\nOften in an [FPGA][] design there is spare Dual Port Block RAM (BRAM) available,\neither because only part of the BRAM module is being used or because it is not\nneeded entirely. Adding a new CPU however is a bigger decision than using spare\nBRAM capacity, it can take up quite a lot of floor space, and perhaps other\nprecious resources. 
If this is the case then adding this CPU costs practically\nnothing in terms of floor space (or routing resources for connecting the device\nto other sections of the FPGA as the CPU interface is really tiny), the main cost \nwill be in development time.\n\nIn short, the project may be useful if:\n\n* FPGA Floor space is at a premium in your design.\n* You have spare memory for the program and storage.\n* You need a programmable CPU that supports a reasonable instruction set.\n* *Execution speed is not a concern*.\n\nThere were two use cases that the author had in mind when setting out to build\nthis system:\n\n* As a CPU driving a software defined low-baud UART.\n* As a controller for a VT100 terminal emulator that would control cursor\n  position and parse escape codes, setting colors and attributes in a hardware\n  based text-terminal (this was to replace an existing VHDL only system that\n  had spare capacity in the FPGA's dual-port block RAMs used to store the Font\n  and text in \u003chttps://github.com/howerj/forth-cpu\u003e).\n\n# Tool-chain\n\nThe tool-chain consists of a cross compiler written in Forth, it itself\nimplements a virtual machine on top of which a Forth interpreter is written.\nThe accumulator machine lacks call/returns, and a stack, so these have to be\nimplemented. 
The meta-compiler (a Forth specific term for what is\nmore widely known as a cross-compiler) is available in [bit.fth][].\n\nAs the instruction set is anemic and CPU features lacking it is better to target\nthe virtual machine and program in Forth than it is to program in assembly.\n\nDespite the inherently slow speed of the design and the further slow down\nexecuting code on top of a virtual machine the interpreter is plenty fast\nenough for interactive use, only slowing down noticeably when division has to be\nperformed.\n\n# CPU Specification\n\nThe CPU is a 16-bit design, in principle a normal bit parallel implementation\nof the CPU could be made, but in practice if you want a bit-parallel CPU\nyou would not make a CPU with the same instruction set and behavior if the\nbit-serial restriction is lifted.\n\nThe CPU has 16 operations, each instruction consists of a 4-bit operation field\nand a 12-bit operand. If the top bit of the 4-bit operation field is not set then\nan indirection is performed on the operand which is treated as an address to\nbe loaded. Addresses are word (16-bit) oriented and not byte oriented.\n\nThe CPU is an accumulator machine, all instructions either modify or use the\naccumulator to store operation results in it. 
The CPU has three registers\nincluding the accumulator, the other two are the program counter which is\nautomatically incremented after each instruction excluding the jump\ninstructions (the SET instruction is also excluded when setting the program\ncounter only) and a flags register.\n\nThe instructions are:\n\n\t| ----------- | ----------------------------- | ------------------------------ | -------------- |\n\t| Instruction | C Operation                   | Description                    | Cycles         |\n\t| ----------- | ----------------------------- | ------------------------------ | -------------- |\n\t| OR          | acc |= lop                    | Bitwise Or                     | 5*(N+1)        |\n\t| AND         | acc \u0026= lop                    | Bitwise And                    | 5*(N+1)        |\n\t| XOR         | acc ^= lop                    | Bitwise Exclusive Or           | 5*(N+1)        |\n\t| ADD         | acc += lop                    | Add with carry, sets carry     | 5*(N+1)        |\n\t| LSHIFT      | acc = acc \u003c\u003c bits(lop)        | Shift left or Rotate left      | 5*(N+1)        |\n\t| RSHIFT      | acc = acc \u003e\u003e bits(lop)        | Shift right or Rotate right    | 5*(N+1)        |\n\t| LOAD        | acc = memory(lop)             | Load                           | 6*(N+1)        |\n\t| STORE       | memory(lop) = acc             | Store                          | 6*(N+1)        |\n\t| LOADC       | acc = memory(op)              | Load from memory constant addr | 4*(N+1)        |\n\t| STOREC      | memory(op) = acc              | Store to memory constant addr  | 4*(N+1)        |\n\t| LITERAL     | acc = op                      | Load literal into accumulator  | 3*(N+1)        |\n\t| UNUSED      | N/A                           | Unused instruction             | 3*(N+1)        |\n\t| JUMP        | pc = op                       | Unconditional Jump             | 2*(N+1)        |\n\t| JUMPZ       | if(!acc){pc = op }         
   | Jump If Zero                   | [2 or 3]*(N+1) |\n\t| SET         | if(op\u00261){flg=acc}else{pc=acc} | Set Register                   | 3*(N+1)        |\n\t| GET         | if(op\u00261){acc=flg}else{acc=pc} | Get Register                   | 3*(N+1)        |\n\t| ----------- | ----------------------------- | ------------------------------ | -------------- |\n\n* pc     = program counter\n* acc    = accumulator\n* indir  = indirect flag\n* lop    = Load from operand, load the address specified by the operand\n* op     = instruction operand\n* flg    = flags register\n* N      = bit width, which is 16.\n* bits() = count of bits, population count\n\nThe number of cycles an instruction takes to complete depends on whether it\nperforms an indirection, or in the case of GET/SET it depends if it is setting\nthe program counter (2 cycles only) or the flags register (3 cycles), or performing\nan I/O operation (4 cycles), getting the flags or program counter always costs\n3 cycles.\n\nThe flags in the 'flg' register are:\n\n\t| ---- | --- | --------------------------------------- |\n\t| Flag | Bit | Description                             |\n\t| ---- | --- | --------------------------------------- |\n\t| Cy   |  0  | Carry flag, set by addition instruction |\n\t| Z    |  1  | Zero flag                               |\n\t| Ng   |  2  | Negative flag                           |\n\t| R    |  3  | Reset Flag - Resets the CPU             |\n\t| HLT  |  4  | Halt Flag - Stops the CPU               |\n\t| ---- | --- | --------------------------------------- |\n\n* The carry flag (Cy) is set by the ADD instruction, it can also be set and cleared\nwith the GET/SET instructions.\n* 'Z' is set whenever the accumulator is zero.\n* 'Ng' is set whenever the accumulator has its highest bit set, indicating that\n  the accumulator is negative.\n* 'R', Reset flag, this resets the CPU immediately, only the HLT flag takes\nprecedence.\n* 'HLT', The halt flag takes priority over 
everything else, sending the CPU\ninto a halt state.\n\nThere is really not much else to this CPU from the point of view of a user of\nthis core, it is a simple CPU. Integrating this core into another system is \nmore complicated however, you will need to be far more aware of timing of \nsignals and their enable lines. Much like the processor, a single bit bus \nin conjunction with an enable is used to communicate with the outside world.\n\nThe internal state of the CPU is minimal, to make a working system the memory\nand I/O controller will need (shift) registers to store the address and\ninput/output.\n\nThe CPU state-machine is:\n\n![CPU State Machine](bit-state.png)\n\nAnd the CPU bus timing diagram:\n\n![CPU Bus timing](bit-wave.png)\n\n\n# Peripherals\n\nThe system has a minimal set of peripherals; a bank of switches with LEDs next\nto each switch and a UART capable of transmission and reception, other\nperipherals could be added as needed. A timer would be useful, but not\nnecessary, the same could be said for many other peripherals.\n\n## Register Map\n\nThe I/O register map for the device is very small as there are very few\nperipherals.\n\n\t| ------- | -------------- |\n\t| Address | Name           |\n\t| ------- | -------------- |\n\t| 0x4000  | LED/Switches   |\n\t| 0x4001  | UART TX/RX     |\n\t| 0x4002  | UART Clock TX* |\n\t| 0x4003  | UART Clock RX* |\n\t| 0x4004  | UART Control*  |\n\t| ------- | -------------- |\n\tThese registers are turned off by default\n\tand will need to be enabled during synthesis if needed.\n\n* LED/Switches\n\nA bank of switches, non-debounced, with LED lights next to them.\n\n\t+---------------------------------------------------------------+\n\t| F | E | D | C | B | A | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |\n\t+---------------------------------------------------------------+\n\t|                           |   Switches 1 = on, 0 = off        | 
READ\n\t+---------------------------------------------------------------+\n\t|                           |   LED 1 = on, 0 = off             | WRITE\n\t+---------------------------------------------------------------+\n\n* UART TX/RX\n\nThe UART TX/RX register is used to read and write data bytes to the UART and\ncheck on the UART status. The UART has a FIFO that is used to capture the\nresults of the UART. The usage of which is non-optional.\n\n\t+---------------------------------------------------------------+\n\t| F | E | D | C | B | A | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |\n\t+---------------------------------------------------------------+\n\t|       |TFF|TFE|   |RFF|RFE|      RX DATA BYTE                 | READ\n\t+---------------------------------------------------------------+\n\t|   |TFW|       |RFR|       |      TX DATA BYTE                 | WRITE\n\t+---------------------------------------------------------------+\n\tRFE = RX FIFO EMPTY\n\tRFF = RX FIFO FULL\n\tRFR = RX FIFO READ ENABLE\n\tTFE = TX FIFO EMPTY\n\tTFF = TX FIFO FULL\n\tTFW = TX FIFO WRITE ENABLE\n\n* UART Clock TX\n\nThe UART Transmission clock, independent from the Reception Clock, is\ncontrollable via this register.\n\nDefaults are: 115200 Baud\n\n\t+---------------------------------------------------------------+\n\t| F | E | D | C | B | A | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |\n\t+---------------------------------------------------------------+\n\t|                                                               | READ\n\t+---------------------------------------------------------------+\n\t|             UART TX CLOCK DIVISOR                             | WRITE\n\t+---------------------------------------------------------------+\n\n* UART Clock RX\n\nThe UART Reception clock, independent from the Transmission Clock, is\ncontrollable via this register.\n\nDefaults are: 115200 Baud\n\n\t+---------------------------------------------------------------+\n\t| F | E | D | C | B | A | 9 | 
8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |\n\t+---------------------------------------------------------------+\n\t|                                                               | READ\n\t+---------------------------------------------------------------+\n\t|            UART RX CLOCK DIVISOR                              | WRITE\n\t+---------------------------------------------------------------+\n\n* UART Clock Control\n\nThis clock is used to control UART options such as the number of bits,\n\nDefaults are: 8N1, no parity\n\n\t+---------------------------------------------------------------+\n\t| F | E | D | C | B | A | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |\n\t+---------------------------------------------------------------+\n\t|                                                               | READ\n\t+---------------------------------------------------------------+\n\t|                               |   DATA BITS   |STPBITS|EPA|UPA| WRITE\n\t+---------------------------------------------------------------+\n\tUPA       = USE PARITY BITS\n\tEPA       = EVEN PARITY\n\tSTPBITS   = Number of stop bits\n\tDATA BITS = Number of data bits\n\n\n# Other Soft Microprocessors\n\nThis is a *very* specialized core, that cannot be emphasized enough. It\nexecutes slowly, but is small. Other, larger cores (but still relatively small)\nmay be useful for your needs. In terms of engineering trade offs this design\ntakes things to the extreme in one direction only.\n\nThe core should have be written to be portable to different [FPGA][]s, however the\nauthor only tests what they have available (Xilinx, Spartan-6).\n\n* The H2\n\nAnother small core, based on the J1. This core executes quite quickly (1\ninstruction per CPU cycle) and uses few resources, although much more than \nthis core. The instruction set is quite dense and allows for higher level \nprogramming than just using straight assembler. 
See \n\u003chttps://github.com/howerj/forth-cpu\u003e.\n\nThe H2 has deeper stacks, more instructions, and interrupts, which the\noriginal J1 core lacks. It is also written in VHDL instead of Verilog.\n\n* SUBLEQ system.\n\nThe One/Single Instruction Set Computer (OISC) based off of [SUBLEQ][] is another\ncandidate for making a small, niche, CPU. There are many OISC architectures,\n[SUBLEQ][] is the most popular. See \u003chttps://github.com/howerj/subleq-vhdl\u003e\nfor a fully working SUBLEQ CPU running on an FPGA that implements Forth, it has not \nbeen optimized as much as this CPU but it is also quite small. The Forth it runs is\na port of the one available here \u003chttps://github.com/howerj/subleq\u003e. In\nprinciple it should be fairly easy to implement such a CPU in discrete 7400\nseries Integrated Circuits.\n\n* Tiny CPU in a CPLD\n\nThis is an 8-bit CPU designed to fit in the limited resources of a CPLD:\n\nSee \u003chttps://www.bigmessowires.com/cpu-in-a-cpld/\u003e and\n\u003chttps://www.bigmessowires.com/tinycpufiles.zip\u003e.\n\nIt is written in Verilog, it is based on the 6502, implementing a subset of its\ninstructions. It is probably easier to directly program than this bit-serial\nCPU, and roughly the same size (although a direct comparison is difficult).\nIt can address less memory (1K) without bank-switching. There is also a\ndifferent version made with 7400 series logic gates\n\u003chttps://www.bigmessowires.com/nibbler/\u003e.\n\n* Leros and Lipsi\n\nSee \u003chttps://github.com/leros-dev/leros\u003e,\nalso \u003chttps://github.com/schoeberl/lipsi\u003e.\n\n# Future directions\n\nThere are infinite combinations of different features and CPU\ninstructions that could be played with, one which might be more\ncompact is described below (the best way to see if it is is to try\nto implement it in hardware which has not been done yet).\n\nEach instruction would be 16-bits, 4 bits for the instruction,\n12 bits for an operand. 
This would be another accumulator\nmachine, where all instructions would operate on the accumulator\n(apart from JMP and JMPZ).\n\nThe topmost bit is an indirection bit, selecting whether the contents of the\nmemory location specified by the operand are used instead of the\noperand itself.\n\n8 Instructions (16 if indirect variants included)\n\n* ADD [optional CARRY \u003c- ACC + OPERAND + CARRY]\n* AND [optional CARRY \u003c- 0]\n* XOR [optional CARRY \u003c- 0]\n* ROTATE (rotate left likely most efficient)\n* LOAD\n* STORE\n* JMP [optional ACC \u003c- PC]\n* JMPZ  [optional ACC \u003c- PC]\n\nMissing are:\n\n* A Timer\n* A way to reset the CPU (perhaps JMP to 0000)\n* A way to halt the CPU (perhaps JMP to 8XXX)\n\nHopefully the instruction set would be smaller than this one\nand allow for a more compact Forth (indirect adding would allow \nshorter stack increment routines so long as the carry option was \ndisabled).\n\nFeatures could be optionally enabled in the VHDL as needed.\n\n# References / Appendix\n\nThe state-machine diagram was made using [Graphviz][], and can be viewed and\nedited immediately by copying the following text into [GraphvizOnline][].\n\n\n\tdigraph bcpu {\n\t\treset -\u003e fetch [label=\"start\"]\n\t\tfetch -\u003e execute\n\t\tfetch -\u003e indirect [label=\"op \u003c 8\"]\n\t\tfetch -\u003e reset  [label=\"flag(RST) = '1'\"]\n\t\tfetch -\u003e halt  [label=\"flag(HLT) = '1'\"]\n\t\tindirect -\u003e operand\n\t\toperand -\u003e execute\n\t\texecute -\u003e advance\n\t\texecute -\u003e store   [label=\"op = 'store'\"]\n\t\texecute -\u003e load   [label=\"op = 'load'\"]\n\t\texecute -\u003e fetch [label=\"(op = 'jumpz' and acc = 0)\\n or op ='jump'\"]\n\t\tstore -\u003e advance\n\t\tload -\u003e advance\n\t\tadvance -\u003e fetch\n\t\thalt -\u003e halt\n\t}\n\n\nFor timing diagrams, use [Wavedrom][] with the following text:\n\n\n\t{signal: [\n\t  {name: 'clk',   wave: 'pp...p...p...p...p..'},\n\t  {name: 'cycle', wave: '22222222222222222222', data: 
['prev', 'init','0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', 'next', 'rest']},\n\t  {name: 'cmd',   wave: 'x2..................', data: ['HALT']},\n\t  {name: 'ie',    wave: 'x0..................'},\n\t  {name: 'oe',    wave: 'x0..................'},\n\t  {name: 'ae',    wave: 'x0..................'},\n\t  {name: 'o',     wave: 'x0..................'},\n\t  {name: 'i',     wave: 'x...................'},\n\t  {name: 'halt',  wave: 'x1..................'},\n\t  {},\n\n\t  {name: 'clk',   wave: 'pp...p...p...p...p..'},\n\t  {name: 'cycle', wave: '22222222222222222222', data: ['prev', 'init','0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', 'next', 'rest']},\n\t  {name: 'cmd',   wave: 'x2................xx', data: ['ADVANCE']},\n\t  {name: 'ie',    wave: 'x0.................x'},\n\t  {name: 'oe',    wave: 'x0.................x'},\n\t  {name: 'ae',    wave: 'x01...............0x'},\n\t  {name: 'o',     wave: 'x0================0x', data: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', 'F12', 'F13', 'F14', 'F15']},\n\t  {name: 'i',     wave: 'x.................xx'},\n\t  {name: 'halt',  wave: 'x0.................x'},\n\t  {},\n\n\t  {name: 'clk',   wave: 'pp...p...p...p...p..'},\n\t  {name: 'cycle', wave: '22222222222222222222', data: ['prev', 'init','0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', 'next', 'rest']},\n\t  {name: 'cmd',   wave: 'x2................xx', data: ['OPERAND or LOAD']},\n\t  {name: 'ie',    wave: 'x01...............0x'},\n\t  {name: 'oe',    wave: 'x0.................x'},\n\t  {name: 'ae',    wave: 'x0.................x'},\n\t  {name: 'o',     wave: 'x0.................x'},\n\t  {name: 'i',     wave: 'x.================xx', data: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15']},\n\t  {name: 'halt',  wave: 'x0.................x'},\n\t  {},\n\n\t  {name: 'clk',   wave: 
'pp...p...p...p...p..'},\n\t  {name: 'cycle', wave: '22222222222222222222', data: ['prev', 'init','0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', 'next', 'rest']},\n\t  {name: 'cmd',   wave: 'x2................xx', data: ['STORE']},\n\t  {name: 'ie',    wave: 'x0.................x'},\n\t  {name: 'oe',    wave: 'x01...............0x'},\n\t  {name: 'ae',    wave: 'x0.................x'},\n\t  {name: 'o',     wave: 'x0================0x', data: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15']},\n\t  {name: 'i',     wave: 'x.................xx'},\n\t  {name: 'halt',  wave: 'x0.................x'},\n\t  {},\n\n\t  {name: 'clk',   wave: 'pp...p...p...p...p..'},\n\t  {name: 'cycle', wave: '22222222222222222222', data: ['prev', 'init','0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', 'next', 'rest']},\n\t  {name: 'cmd',   wave: 'x2................xx', data: ['INDIRECT or EXECUTE: LOAD, STORE, JUMP, JUMPZ']},\n\t  {name: 'ie',    wave: 'x0.................x'},\n\t  {name: 'oe',    wave: 'x0.................x'},\n\t  {name: 'ae',    wave: 'x01...............0x'},\n\t  {name: 'o',     wave: 'x0================0x', data: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', 'F12', 'F13', 'F14', 'F15']},\n\t  {name: 'i',     wave: 'x.................xx'},\n\t  {name: 'halt',  wave: 'x0.................x'},\n\t  {},\n\n\t  {name: 'clk',   wave: 'pp...p...p...p...p..'},\n\t  {name: 'cycle', wave: '22222222222222222222', data: ['prev', 'init','0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', 'next', 'rest']},\n\t  {name: 'cmd',   wave: 'x2................xx', data: ['EXECUTE: NORMAL INSTRUCTION']},\n\t  {name: 'ie',    wave: 'x0.................x'},\n\t  {name: 'oe',    wave: 'x0.................x'},\n\t  {name: 'ae',    wave: 'x0.................x'},\n\t  {name: 'o',     wave: 'x0.................x'},\n\t  {name: 'i',     
wave: 'x.................xx'},\n\t  {name: 'halt',  wave: 'x0.................x'},\n\t  {},\n\n\t  {name: 'clk',   wave: 'pp...p...p...p...p..'},\n\t  {name: 'cycle', wave: '22222222222222222222', data: ['prev', 'init','0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', 'next', 'rest']},\n\t  {name: 'cmd',   wave: 'x2................xx', data: ['FETCH']},\n\t  {name: 'ie',    wave: 'x01...............0x'},\n\t  {name: 'oe',    wave: 'x0.................x'},\n\t  {name: 'ae',    wave: 'x0.................x'},\n\t  {name: 'o',     wave: 'x0.................x'},\n\t  {name: 'i',     wave: 'x.================xx', data: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15']},\n\t  {name: 'halt',  wave: 'x0.................x'},\n\t  {},\n\n\t  {name: 'clk',   wave: 'pp...p...p...p...p..'},\n\t  {name: 'cycle', wave: '22222222222222222222', data: ['prev', 'init','0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', 'next', 'rest']},\n\t  {name: 'cmd',   wave: 'x2................xx', data: ['RESET']},\n\t  {name: 'ie',    wave: 'x0.................x'},\n\t  {name: 'oe',    wave: 'x0.................x'},\n\t  {name: 'ae',    wave: 'x01...............0x'},\n\t  {name: 'o',     wave: 'x0.................x'},\n\t  {name: 'i',     wave: 'x.................xx'},\n\t  {name: 'halt',  wave: 'x0.................x'},\n\t  {},\n\n\t]}\n\n\nThat's all folks!\n\n[C]: https://en.wikipedia.org/wiki/C_%28programming_language%29\n[Digilent]: https://store.digilentinc.com/\n[FPGA]: https://en.wikipedia.org/wiki/Field-programmable_gate_array\n[Forth]: https://www.forth.com/forth/\n[GHDL]: http://ghdl.free.fr/\n[GraphvizOnline]: https://dreampuf.github.io/GraphvizOnline\n[Graphviz]: https://graphviz.org/\n[H2]: https://github.com/howerj/forth-cpu\n[Nexys 3]: https://store.digilentinc.com/nexys-3-spartan-6-fpga-trainer-board-limited-time-see-nexys4-ddr/\n[Soft-Core]: 
https://en.wikipedia.org/wiki/Soft_microprocessor#Core_comparison\n[Spartan-6]: https://www.xilinx.com/products/silicon-devices/fpga/spartan-6.html\n[VHDL]: https://en.wikipedia.org/wiki/VHDL\n[Wavedrom]: https://wavedrom.com/editor.html\n[Xilinx ISE 14.7]: https://www.xilinx.com/products/design-tools/ise-design-suite/ise-webpack.html\n[bit-serial CPU]: https://en.wikipedia.org/wiki/Bit-serial_architecture\n[bit.c]: bit.c\n[bit.fth]: bit.fth\n[bit.fth]: bit.fth\n[bit.hex]: bit.hex\n[bit.vhd]: bit.vhd\n[tb.cfg]: tb.cfg\n[configuration file]: tb.cfg\n[gforth]: https://gforth.org/\n[SUBLEQ]: https://en.wikipedia.org/wiki/One-instruction_set_computer#Subtract_and_branch_if_not_equal_to_zero\n\n\u003cstyle type=\"text/css\"\u003e\n\tbody{\n\t\tmax-width: 50rem;\n\t\tpadding: 2rem;\n\t\tmargin: auto;\n\t\tline-height: 1.6;\n\t\tfont-size: 1rem;\n\t\tcolor: #444;\n\t}\n\th1,h2,h3 {\n\t\tline-height:1.2;\n\t}\n\ttable {\n\t\twidth: 100%;\n\t\tborder-collapse: collapse;\n\t}\n\ttable, th, td{\n\t\tborder: 0.1rem solid black;\n\t}\n\timg {\n\t\tdisplay: block;\n\t\tmargin: 0 auto;\n    \t\tmargin-left: auto;\n    \t\tmargin-right: auto;\n\t}\n\tcode {\n\t\tcolor: #091992;\n\t\tdisplay: block;\n\t\tmargin: 0 auto;\n    \t\tmargin-left: auto;\n    \t\tmargin-right: auto;\n\n\t}\n\u003c/style\u003e\n\n","funding_links":[],"categories":["VHDL"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhowerj%2Fbit-serial","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhowerj%2Fbit-serial","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhowerj%2Fbit-serial/lists"}