https://github.com/robert-van-engelen/forth850
A fast Forth Standard system written in Z80 assembly for SHARP PC-G850 pocket computers
https://github.com/robert-van-engelen/forth850
forth forth-2012 minifloat pc-g850 pocketcomputer single-precision-floating-point z80 z80asm
Last synced: about 1 month ago
JSON representation
A fast Forth Standard system written in Z80 assembly for SHARP PC-G850 pocket computers
- Host: GitHub
- URL: https://github.com/robert-van-engelen/forth850
- Owner: Robert-van-Engelen
- License: bsd-3-clause
- Created: 2022-11-10T20:07:05.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-02-19T03:45:35.000Z (over 1 year ago)
- Last Synced: 2025-03-25T08:01:31.639Z (2 months ago)
- Topics: forth, forth-2012, minifloat, pc-g850, pocketcomputer, single-precision-floating-point, z80, z80asm
- Language: Assembly
- Homepage:
- Size: 1.86 MB
- Stars: 25
- Watchers: 5
- Forks: 3
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: changelog.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
README

A modern Forth 2012 standard compliant system for the vintage SHARP
PC-G850(V)(S) pocket computer or any Z80 system (with a few tweaks to port).Forth850 is under 8K and has [295 words](forth850-words).
A more complete 11K version forth850-full is also included with:
- [58 additional words](#additional-words-included-with-the-full-version)
- single precision floating point math words implemented with a new and
efficient [Z80 IEEE 754 floating point math](#z80-floating-point-math-routines)
library I wrote for Forth850. More floating point math functions that use this
library are defined in Forth in [examples/MATH.FTH](examples/MATH.FTH)- a more capable Forth line editor with replay back feature (cursor keys)
Forth850 includes stack under/overflow checks, dictionary overflow checks and
can be interrupted by pressing BREAK.You can write Forth source code in the PC-G850(V)(S) built-in TEXT editor and
compile it into Forth850 with the [TEXT](examples/TEXT.FTH) word included in
the full version.You can extend Forth850 as you wish, including assembly code written on the
machine itself in the PC-G850(V)(S) TEXT editor and assembled with its Z80
Assembler. See [ASMDEMO1.FTH](examples/ASMDEMO1.FTH) for an example with an
explanation. You can also use the Monitor to set breakpoints and run Forth850
from the Monitor with `G100` to trigger them.If you want to rebuild Forth850 from source code, you will need to install
the [asz80](https://shop-pdp.net/ashtml/asz80.htm) assembler part of the
[ASxxxx Cross Assemblers](https://shop-pdp.net/ashtml/asxget.php).If you plan to use parts of Forth850 and/or the optimized Z80 code in a project
that you plan to share or redistribute, then please give me credit for my work
as per [BCD-3 license](license.txt).Discuss: [HP Forum](https://www.hpmuseum.org/forum/thread-19085.html) and
[reddit](https://www.reddit.com/r/Forth/comments/ytat79/a_fast_forth_for_vintage_sharp_pcg850_pocket)## Performance
I've implemented Forth850 as efficiently as possible in direct threaded code
(DTC) with new Z80 code written from scratch, including faster Z80 integer and
float math routines compared to other Z80 Forth implementations. See the
[technical implementation sections](#Implementation) why Forth850 is fast for a
DTC implementation. The [Forth850 source code](forth850.asm) is included and
extensively documented.The [n-queens benchmark](https://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/articles.cgi?read=700)
is solved in 0.865 seconds, the fastest Forth implementation of the benchmarks.
Forth850 n-queens runs 5 times faster than the C n-queens benchmark on the
Sharp PC-G850VS.## Installation
In RUN MODE enter `MON` to enter the Monitor, then enter `USER3FFF` to reserve
16K memory space:*USER3FFF
FREE:0100-3FFFLoading via SIO (serial) requires a serial adapter. See my post on the
[HP Forum](https://www.hpmuseum.org/forum/thread-19431-post-169250.html#pid169250)
how to construct one as a DIY project. After reserving memory in the Monitor
as described above, use the `R` command to read the forth850.ihx file or the
forth850-full.ihx full version sent from your PC to your PC-G850(V)(S):*R100
The `R` command is used to transmit/receive data in Intel hex format over SIO.
This command is for receiving machine code from a personal computer or other
device. See the [Sharp PC-G850(V)(S) manual](https://pockemul.com/wp-content/uploads/2020/05/PC-G850VSEng_V3_0.pdf).To load via the cassette interface, press `BASIC` to return to RUN MODE. Load
forth850.wav using a cassette interface CE-126P or a CE-124:BLOADM
Load the forth850-full.wav "full version" to include many
[additional words](#additional-words-included-with-the-full-version) and
[floating point words](#floating-point-math-words-included-with-the-full-version).
The full version will continue to evolve with new features.## How to switch between Forth and BASIC
To return to Forth, enter `CALL256` in RUN MODE.
To return to BASIC from Forth, press the BASIC key. The TEXT key takes you to
the TEXT editor.To turn the machine off, press the OFF key. The machine will also power off
automatically after about 10 minutes waiting for user input at the prompt.## How to increase or decrease memory allocation for Forth850
Memory allocation can be adjusted without affecting the Forth dictionary.
In RUN MODE enter `MON` to enter the Monitor, then enter `USERaddr` with an
upper address `addr` larger than `23ff` (9K bytes.) If words are added to
Forth850, you must make sure that `addr` is large enough, i.e. equal or larger
than the hex value displayed with:HERE #708 + HEX . DECIMAL
23FFIn the Monitor specify `USERaddr` with the address displayed. This leaves
about 200 bytes free dictionary space plus 40 bytes for the "hold area" to run
Forth850. The largest size is `USER75FF` which gives about 21K free dictionary
space (but there won't be space left on the machine for files, BASIC or TEXT.)## Forth850 manual
Forth850 is 2012 standard compliant. For help, see the manual included with
[Forth for the Sharp PC-E500(S)](https://github.com/Robert-van-Engelen/Forth500)
and [Forth 2012 Standard](https://forth-standard.org/standard/intro).Forth850 implements a subset of the standard Forth words. A list of Forth850
words with an explanation is given below.## Forth850 words
List of Forth850 built-in words. Reference implementations in Forth are
included when applicable, although many words are implemented in Z80 code for
speed rather than in Forth.### (:)
_-- ; R: -- ip_
call colon definition;
runtime of the : compile-only word### (;)
_-- ; R: ip --_
return to caller from colon definition;
runtime of the ; compile-only word### (EXIT)
_-- ; R: ip --_
return to caller from colon definition;
runtime of the EXIT compile-only word### (;CODE)
_-- ; R: ip --_
set LASTXT cfa to ip and return from colon definition;
a runtime word compiled by the DOES> compile-only word### (DOES)
_addr -- addr ; R: -- ip_
calls the DOES> definition with pfa addr;
a runtime word compiled by the DOES> compile-only word coded as call dodoes### (VAR)
_-- addr_
leave parameter field address (pfa) of variable;
runtime word of a VARIABLE coded as call dovar### (VAL)
_-- x_
fetch value;
runtime word of a VALUE coded as call doval### (2VAL)
_-- dx_
fetch double value;
runtime word of a 2VALUE coded as call dotwoval### (CON)
_-- x_
fetch constant;
runtime word of a CONSTANT coded as call docon### (2CON)
_-- x_
fetch double constant;
runtime word of a 2CONSTANT coded as call dotwocon### (DEF)
_--_
execute deferred word;
runtime word of a DEFER coded as call dodef### (LIT)
_-- x_
fetch literal;
runtime word compiled by EVALUATE, INTERPRET and NUMBER### (2LIT)
_-- x1 x2_
fetch double literal;
runtime word compiled by EVALUATE, INTERPRET and NUMBER### (SLIT)
_-- c-addr u_
fetch literal string;
runtime word compiled by S" and ."### 0
_-- 0_
leave constant 00 CONSTANT 0
### 1
_-- 1_
leave constant 11 CONSTANT 1
### -1
_-- -1_
leave constant -1-1 CONSTANT -1
### BL
_-- 32_
leave constant 32 (space)#32 CONSTANT BL
### PAD
_-- c-addr_
leave address of the PAD;
the PAD is a free buffer space of 256 bytes not used by Forth850### TIB
_-- c-addr_
leave address of TIB;
the terminal input buffer used by Forth850### TMP
_-- c-addr_
leave address of the next temp string buffer;
switches between two string buffers of 256 free bytes each;
used by S" to store a string when interpreting### DROP
_x --_
drop TOS### DUP
_x -- x x_
duplicate TOS### ?DUP
_x -- x x or 0 -- 0_
duplicate TOS if nonzero### SWAP
_x1 x2 -- x2 x1_
swap TOS with 2OS### OVER
_x1 x2 -- x1 x2 x1_
copy 2OS over TOS### ROT
_x1 x2 x3 -- x2 x3 x1_
rotate cells: ROT >R SWAP R> SWAP ;
### -ROT
_x1 x2 x3 -- x3 x1 x2_
undo (or back, or left) rotate cells: -ROT ROT ROT ;
### NIP
_x1 x2 -- x2_
nip 2OS: NIP SWAP DROP ;
### TUCK
_x1 x2 -- x2 x1 x2_
tuck TOS under 2OS: TUCK SWAP OVER ;
### 2DROP
_xd1 xd2 -- xd1_
drop double TOS: 2DROP DROP DROP ;
### 2DUP
_xd -- xd xd_
duplicate double TOS: 2DUP OVER OVER ;
### 2SWAP
_xd1 xd2 -- xd2 xd1_
swap double TOS with double 2OS: 2SWAP ROT >R ROT R> ;
: 2SWAP 3 ROLL 3 ROLL ;### 2OVER
_xd1 xd2 -- xd1 xd2 xd1_
copy double 2OS over double TOS: 2OVER >R >R 2DUP R> R> 2SWAP ;
: 2OVER 3 PICK 3 PICK ;### DEPTH
_-- u_
parameter stack depth: DEPTH sp0 @ SP@ - 2- 2/ ;
### CLEAR
_... --_
purge parameter stack: CLEAR sp0 @ SP! ;
### .S
_--_
display parameter stack: .S DEPTH 0 ?DO sp0 @ I 2+ CELLS - ? LOOP ;
### SP@
_-- addr_
fetch stack pointer### SP!
_addr --_
store stack pointer### >R
_x -- ; R: -- x_
move TOS to the return stack### DUP>R
_x -- x ; R: -- x_
duplicate TOS to the return stack, a single word for DUP >R### R>
_R: x -- ; -- x_
move cell from the return stack### RDROP
_R: x -- ; --_
drop cell from the return stack, a single word for R> DROP### R@
_R: x -- x ; -- x_
fetch cell from the return stack### 2>R
_x1 x2 -- ; R: -- x1 x2_
move double TOS to the return stack, a single word for SWAP >R >R### 2R>
_R: x1 x2 -- ; -- x1 x2_
move double cell from the return stack, a single word for R> R> SWAP### 2R@
_R: x1 x2 -- x1 x2 ; -- x1 x2_
fetch double cell from the return stack### RP@
_-- addr_
fetch return stack pointer### RP!
_addr --_
store return stack pointer### PICK
_xu ... x0 u -- xu ... x0 xu_
pick u'th cell from the parameter stack;
0 PICK is the same as DUP;
1 PICK is the same as OVER: PICK 1+ CELLS SP@ + @ ;
### @
_addr -- x_
fetch from cell### C@
_c-addr -- char_
fetch char### 2@
_addr -- x1 x2_
fetch from double cell: 2@ DUP CELL+ @ SWAP @ ;
### !
_x addr --_
store in cell### (TO)
_x --_
store in value;
runtime of the TO compile-only word### C!
_char c-addr --_
store char### 2!
_x1 x2 addr --_
store in double cell: 2! TUCK ! CELL+ ! ;
### (2TO)
_dx --_
store in double value;
runtime of the TO compile-only word### +!
_n addr --_
increment cell### (+TO)
_n --_
increment value;
runtime of the +TO compile-only word### ON
_addr --_
store TRUE (-1) in cell: ON -1 SWAP ! ;
### OFF
_addr --_
store FALSE (0) in cell: OFF 0 SWAP ! ;
### +
_n1 n2 -- n3_
sum n1+n2### M+
_d1 n -- d2_
double sum d1+n### D+
_d1 d2 -- d3_
double sum d1+d2: D+ >R M+ R> + ;
### -
_n1 n2 -- n3_
difference n1-n2### D-
_d1 d2 -- d3_
double difference d1-d2: D- DNEGATE D+ ;
### UM*
_u1 u2 -- ud_
unsigned double product u1*u2### M*
_n1 n2 -- d_
signed double product n1*n2: M*
2DUP XOR >R
ABS SWAP ABS UM*
R> 0< IF DNEGATE THEN ;### *
_n1|u1 n2|u2 -- n3|u3_
signed and unsigned product n1*n2: * UM* DROP ;
### UMD*
_ud1 u -- ud2_
unsigned double product ud1*u: UMD* DUP>R UM* DROP SWAP R> UM* ROT + ;
### MD*
_d1 n -- d2_
signed double product d1*n: MD*
2DUP XOR >R
ABS -ROT DABS ROT
UMD*
R> 0< IF DNEGATE THEN ;### UM/MOD
_ud u1 -- u2 u3_
unsigned remainder and quotient ud/u1;
the result is undefined when u1 = 0### SM/REM
_d1 n1 -- n2 n3_
symmetric remainder and quotient d1/n1 rounded towards zero;
the result is undefined when n1 = 0: SM/REM
2DUP XOR >R
OVER >R
ABS -ROT DABS ROT
UM/MOD
R> 0< IF SWAP NEGATE SWAP THEN
R> 0< IF NEGATE THEN ;### FM/MOD
_d1 n1 -- n2 n3_
floored signed modulus and quotient d1/n1 rounded towards negative (floored);
the result is undefined when n1 = 0: FM/MOD
DUP>R
SM/REM
DUP 0< IF
SWAP R> + SWAP 1-
ELSE
RDROP
THEN ;### /MOD
_n1 n2 -- n3 n4_
symmetric remainder and quotient n1/n2;
the result is undefined when n2 = 0: /MOD SWAP S>D ROT SM/REM ;
### MOD
_n1 n2 -- n3_
symmetric remainder of n1/n2;
the result is undefined when n2 = 0: MOD /MOD DROP ;
### /
_n1 n2 -- n3_
quotient n1/n2;
the result is undefined when n2 = 0: / /MOD NIP ;
### */MOD
_n1 n2 n3 -- n4 n5_
product with symmetric remainder and quotient n1*n2/n3;
the result is undefined when n3 = 0: */MOD -ROT M* ROT SM/REM ;
### */
_n1 n2 n3 -- n4_
product with quotient n1*n2/n3;
the result is undefined when n3 = 0: */ */MOD NIP ;
### M*/
_d1 n1 n2 -- d2_
double product with quotient d1*n1/n2;
the result is undefined when n2 = 0: M*/ >R MD* R> SM/REM NIP ;
### AND
_x1 x2 -- x1&x2_
bitwise and x1 with x2### OR
_x1 x2 -- x1|x2_
bitwise or x1 with x2### XOR
_x1 x2 -- x1^x2_
bitwise xor x1 with x2### =
_x1 x2 -- flag_
true if x1 = x2### <>
_x1 x2 -- flag_
true if x1 <> x2### <
_n1 n2 -- flag_
true if n1 < n2 signed: <
2DUP XOR 0< IF
DROP 0<
EXIT
THEN
- 0< ;### >
_n1 n2 -- flag_
true if n1 > n2 signed: > SWAP < ;
### U<
_u1 u2 -- flag_
true if u1 < u2 unsigned: U<
2DUP XOR 0< IF
NIP 0<
EXIT
THEN
- 0< ;### U>
_u1 u2 -- flag_
true if u1 > u2 unsigned: U> SWAP U< ;
### 0=
_x -- flag_
true if x = 0### 0<
_n -- flag_
true if n < 0### D0=
_dx -- flag_
true if dx = 0: D0= OR 0= ;
### D0<
_d -- flag_
true if d < 0: D0< NIP 0< ;
### S>D
_n -- d_
widen single to double### D>S
_d -- n_
narrow double to single;
may throw -11 "result out of range" valid range is -32768 to 65535### MAX
_n1 n2 -- n3_
signed max of n1 and n2: MAX
2DUP < IF SWAP THEN
DROP ;### MIN
_n1 n2 -- n3_
signed min of n1 and n2: MIN
2DUP > IF SWAP THEN
DROP ;### UMAX
_u1 u2 -- u3_
unsigned max of u1 and u2: UMAX
2DUP U< IF SWAP THEN
DROP ;### UMIN
_u1 u2 -- u3_
unsigned min of u1 and u2: UMIN
2DUP U> IF SWAP THEN
DROP ;### WITHIN
_x1 x2 x3 -- flag_
true if x1 is within x2 up to x3 exclusive: WITHIN OVER - >R - R> U< ;
### INVERT
_x1 -- x2_
one's complement ~x1: INVERT 1+ NEGATE ;
: INVERT -1 XOR ;### NEGATE
_n1 -- n2_
two's complement -n1: NEGATE 0 SWAP - ;
: NEGATE INVERT 1+ ;### ABS
_n1 -- n2_
absolute value |n1|: ABS DUP 0< IF NEGATE THEN ;
### DNEGATE
_d1 -- d2_
two's complement -d1: DNEGATE SWAP INVERT SWAP INVERT 1 M+ ;
### DABS
_d1 -- d2_
absolute value |d1|: DABS DUP 0< IF DNEGATE THEN ;
### LSHIFT
_x1 u -- x2_
logical shift left x1 << u### RSHIFT
_x1 u -- x2_
logical shift right x1 >> u### 1+
_n1 -- n2_
increment n1+1: 1+ 1 + ;
### 2+
_n1 -- n2_
increment n1+2: 2+ 2 + ;
### 1-
_n1 -- n2_
decrement n1-1: 1- 1 - ;
### 2-
_n1 -- n2_
decrement n1-2: 2- 2 - ;
### 2*
_n1 -- n2_
arithmetic shift left n1 << 1: 2* 2 * ;
### 2/
_n1 -- n2_
arithmetic shift right n1 >> 1: 2/ 2 / ;
### COUNT
_c-addr1 -- c-addr2 u_
convert counted string to string: COUNT DUP 1+ SWAP C@ ;
### COMPARE
_c-addr1 u1 c-addr2 u2 -- -1|0|1_
compare strings, leaves -1 = less or 0 = equal or 1 = greater### S=
_c-addr1 u1 c-addr2 u2 -- flag_
true if strings match: S= COMPARE 0= ;
### SEARCH
_c-addr1 u1 c-addr2 u2 -- c-addr3 u3 flag_
true if the second string is in the first;
leaves matching address, remaining length and true;
or leaves the first string and false### CMOVE
_c-addr1 c-addr2 u --_
move u bytes from c-addr1 to c-addr2 (from begin): CMOVE
SWAP >R
BEGIN DUP WHILE
NEXT-CHAR R@ C!
R> 1+ >R
REPEAT
RDROP
2DROP ;### CMOVE>
_c-addr1 c-addr2 u --_
move u bytes from c-addr1 to c-addr2 up (from end)### MOVE
_c-addr1 c-addr2 u --_
move u bytes from c-addr1 to c-addr2: MOVE
-ROT
2DUP U< IF
ROT CMOVE>
ELSE
ROT CMOVE
THEN ;### FILL
_c-addr u char --_
fill memory with char### ERASE
_c-addr u --_
fill memory with zeros: ERASE 0 FILL ;
### BLANK
_c-addr u --_
fill memory with 0x20 (BL) chars: ERASE BL FILL ;
### CHOP
_c-addr u1 char -- c-addr u2_
truncate a string up to a matching char;
leaves the string if char not found;
char = 0x20 (BL) chops 0x00 to 0x20 (white space and control)### TRIM
_c-addr1 u1 char -- c-addr2 u2_
trim initial chars from a string;
char = 0x20 (BL) trims 0x00 to 0x20 (white space and control)### -TRIM
_c-addr u1 char -- c-addr u2_
trim trailing chars from a string;
char = 0x20 (BL) trims 0x00 to 0x20 (white space and control)### -TRAILING
_c-addr u1 -- c-addr u2_
trim trailing white space and control characters from a string: -TRAILING BL -TRIM ;
### /STRING
_c-addr1 u1 n -- c-addr2 u2_
slice n characters off the start of a string: /STRING ROT OVER + -ROT - ;
### NEXT-CHAR
_c-addr1 u1 -- c-addr2 u2 char_
get next char from a string;
increments the string address and decrements its length by one: NEXT-CHAR OVER C@ >R 1- SWAP 1+ SWAP R> ;
: NEXT-CHAR OVER C@ -ROT 1- SWAP 1+ SWAP ROT ;### X!
_u --_
set cursor column 0 to 23### Y!
_u --_
set cursor row 0 to 5### X@
_-- u_
fetch cursor column 0 to 23, or 24 when beyond the right window edge### Y@
_-- u_
fetch cursor row 0 to 5### AT-XY
_u1 u2 --_
set column x to u1 (0 to 23) and row y to u2 (0 to 5): AT-XY Y! X! ;
### EMIT
_char --_
emit char to screen;
supports the following control codes:
8 (BS backspace, cursor left),
9 (TAB),
10 (LF line feed),
11 (VT scroll),
12 (FF clear screen),
13 (CR carriage return),
28 (cursor right),
29 (cursor left),
30 (cursor up),
31 (cursor down)### TYPE
_c-addr u --_
type string to output; string may contain control codes, see EMIT: TYPE
BEGIN DUP WHILE
NEXT-CHAR EMIT
REPEAT
2DROP ;### CR
_--_
carriage return and line feed: CR $A EMIT ;
### SPACE
_--_
emit a space (BL): SPACE BL EMIT ;
### SPACES
_n --_
emit n spaces (zero or negative n does nothing): SPACES
DUP 0< IF
DROP
EXIT
THEN
0 ?DO SPACE LOOP ;### PAGE
_--_
clear screen: PAGE $C EMIT ;
### BASE
_-- addr_
variable with numeric base for conversionVARIABLE BASE
### DECIMAL
_--_
set BASE to 10: DECIMAL #10 BASE ! ;
### HEX
_--_
set BASE to 16: HEX #16 BASE ! ;
### HP
_-- addr_
hold pointer0 VALUE HP
### <#
_--_
begin pictured numeric output: <# HERE h_size + TO HP ;
### HOLD
_char --_
hold char for pictured numeric output: HOLD HP 1- DUP TO HP C! ;
### #
_ud1 -- ud2_
hold digit: #
0 BASE @ UM/MOD >R
BASE @ UM/MOD
SWAP DUP #9 > IF
#7 +
THEN
'0 + HOLD
R> ;### #S
_ud -- 0 0_
hold all remaining digits: #S BEGIN # 2DUP D0= UNTIL ;
### SIGN
_n --_
hold minus sign if n < 0: SIGN 0< IF '- HOLD THEN ;
### #>
_ud -- c-addr u_
end pictured numeric output, leave string: #> 2DROP HP HERE h_size + OVER - ;
### D.R
_d +n --_
output signed double d right aligned in field of +n chars wide: D.R -ROT TUCK DABS <# #S ROT SIGN #> ROT OVER - SPACES TYPE ;
### D.
_d --_
output signed double d with a trailing space: D. 0 D.R SPACE ;
### U.R
_u +n --_
output unsigned u right aligned in field of +n chars wide: U.R 0 SWAP D.R ;
### U.
_u --_
output unsigned u with a trailing space: U. 0 D. ;
### .R
_n +n --_
output signed n right aligned in field of +n chars wide: .R SWAP S>D ROT D.R ;
### .
_n --_
output signed n with a trailing space: . S>D D. ;
### ?
_addr --_
output signed cell stored at addr: ? @ . ;
### OUT
_u1 u2 --_
output byte u1 to port u2### INP
_u1 -- u2_
input from port u1### DRAW
_c-addr u --_
draw pixel patterns on screen at xy;
writes string c-addr u of pixel patterns at xy;
specify xy with AT-XY, xy not changed after DRAW### VIEW
_c-addr u --_
view screen pixels at xy;
read string of screen pixel patterns at xy into buffer c-addr u
specify xy with AT-XY, xy not changed after VIEW### REVERSE
_+n --_
reverse video of the +n characters displayed at xy;
specify xy with AT-XY### INKEY
_-- x_
check for key press and read key code of a key is pressed;
0x00 = no key pressed and 0x52 = multiple keys pressed### GETKEY
_-- char_
wait and read key;
leaves ASCII char or special key code:
ON =$05,
BS =$08,
DEL =$09,
CA =$0b,
CLS =$0c,
ENTER =$0d,
DIGIT =$0e,
F-E =$0f,
INS =$12,
ANS =$15,
CONST =$17,
RCM =$19,
M+ =$1a,
M- =$1b,
right =$1c,
left =$1d,
up =$1e,
down =$1f;
a space is produced for the TAB key by the GETCHR system call,
calc keys and BASIC keys produce BASIC tokens as key code $fe:
SIN =$fe register B = $95 BASIC token for SIN (ignored)### KEY
_-- char_
display cursor and wait to read key;
same as GETKEY leaves ASCII char or special key code### EDIT
_c-addr +n1 n2 n3 n4 -- c-addr +n5_
edit buffer c-addr;
buffer size +n1;
string in buffer has length n2;
place cursor at n3;
non-editable left margin n4;
leaves c-addr and length +n5### ACCEPT
_c-addr +n1 -- +n2_
accept user input into buffer c-addr +n1;
leaves length +n2: ACCEPT 0 0 0 EDIT NIP ;
### >IN
_-- addr_
variable with offset into input buffer (TIB)VARIABLE >IN
### SOURCE-ID
_-- 0|-1_
value with 0 = source input or -1 = string input0 VALUE SOURCE-ID
### SOURCE
_-- c-addr u_
double value with input sourceTIB 0 2VALUE SOURCE
### REFILL
_-- flag_
attempt to refill the input buffer;
leaves false when end of input### SKIPS
_char "" --_
skips chars in input when present, 0x20 (BL) skips 0x00 to 0x20 (white space and control): SKIPS SOURCE >IN @ /STRING ROT TRIM DROP SOURCE DROP - >IN ! ;
### PARSE
_char "ccc" -- c-addr u_
parse "ccc" up to char when present: PARSE SOURCE >IN @ /STRING ROT CHOP DUP 1+ >IN @ + SOURCE NIP UMIN >IN ! ;
### PARSE-WORD
_char "ccc" -- c-addr u_
parse char-delimited word;
may throw -18 "parsed string overflow": PARSE-WORD
DUP SKIPS PARSE
DUP tmp_size-1 U> IF -18 THROW THEN ;### CHECK-NAME
_c-addr u -- c-addr u_
check if name is valid;
may throw -16 "attempt to use a zero-length string as a name";
may throw -19 "definition name too long": CHECK-NAME
DUP 0= IF -16 THROW THEN
DUP length_bits U> IF -19 THROW THEN ;### PARSE-NAME
_"name" -- c-addr u_
parse space-delimited name;
check if name length is valid: PARSE-NAME BL PARSE-WORD CHECK-NAME ;
### (
_"ccc" --_
start a comment block;
parse and skip input up to the closing ): (
') PARSE
BEGIN
+ DROP
SOURCE + = IF
DROP REFILL
ELSE
C@ ') <> IF
REFILL
ELSE
FALSE
THEN
THEN
0= UNTIL ; IMMEDIATE### \
_"ccc" --_
start a comment line;
parse and skip input up to the end of line;
note that the PC-G850 symbol for \ is ¥: \ $A PARSE 2SROP ;
### .(
_"ccc" --_
emit CR then type "ccc" up to the closing ): .( ') PARSE CR TYPE ; IMMEDIATE
### >DIGIT
_char -- n_
convert char digit to numeric digit when within BASE;
leaves -1 if char is invalid### >NUMBER
_ud1 c-addr1 u1 -- ud2 c-addr2 u2_
convert string to number;
updates accumulated double ud1 to ud2;
leaves string with the remaining unconvertable chars or empty: >NUMBER
BEGIN DUP WHILE
NEXT-CHAR >DIGIT
DUP 0< IF
DROP -1 /STRING
EXIT
THEN
>R
2SWAP
BASE @ UMD*
R> M+
2SWAP
REPEAT ;### DBL
_-- flag_
true if >DOUBLE or NUMBER produced a double0 VALUE DBL
### >DOUBLE
_c-addr u -- d true | false_
convert string to signed double;
leaves the double and true if string is converted;
leaves false if string is unconvertable### L>NAME
_lfa -- nt_
convert link field address to name token (nfa)### NAME>STRING
_nt -- c-addr u_
convert name token (nfa) to string### NAME>
_nt -- xt_
convert name token (nfa) to execution token (cfa)### >NAME
_xt -- nt_
convert execution token (cfa) to name token (lfa);
may throw -24 "invalid numeric argument"### >BODY
_xt -- pfa_
convert execution token to parameter field address### FIND-WORD
_c-addr u -- c-addr 0 | xt 1 | xt -1_
search dictionary for matching word;
leaves execution token and 1 = immediate or -1 = not immediate;
leaves c-addr and 0 when not found### '
_"name" -- xt_
parse name and get execution token;
may throw -13 "undefined word": ' PARSE-NAME FIND-WORD 0= IF -13 THROW THEN ;
### WORDS
_--_
display context vocabulary words### HERE
_-- addr_
address of free memory after the dictionary;
new definitions are added here;
note that numeric output words use HERE for conversion### LASTXT
_-- xt_
leaves the last execution token defined0 VALUE LASTXT
### STATE
_-- addr_
compilation state;
STATE @ leaves TRUE when compiling;
STATE @ leaves FALSE when interpretingVARIABLE STATE
### [
_--_
switch state to interpreting: [ STATE OFF ;
### ]
_--_
switch state to compiling: ] STATE ON ;
### HIDE
_--_
hide the last definition: HIDE CURRENT @ L>NAME DUP C@ smudge_bits OR SWAP C! ;
### REVEAL
_--_
reveal the last definition: REVEAL CURRENT @ L>NAME DUP C@ ~smudge_bits AND SWAP C! ;
### IMMEDIATE
_--_
make the last definition immediate: IMMEDIATE CURRENT @ L>NAME DUP C@ immediate_bits OR SWAP C! ;
### ?COMP
_--_
check if compiling;
may throw -14 "interpreting a compile-only word"### ?SYS
_-- ; C: x --_
check if compiled control structure matches x;
may throw -14 "interpreting a compile-only word";
may throw -22 "control structure mismatch"### UNUSED
_-- u_
unused dictionary space: UNUSED top @ HERE - ;
### ALLOT
_n --_
allocate n bytes starting from HERE in the dictionary;
undo the last ALLOT with negative n to reclaim memory (only do this when no new words are defined);
may throw -8 "dictionary overflow"### COMPILE,
_xt --_
append execution token to dictionary;
may throw -8 "dictionary overflow": COMPILE, , ;
### ,
_x --_
append cell to dictionary;
may throw -8 "dictionary overflow"### C,
_char --_
append char to dictionary;
may throw -8 "dictionary overflow"### 2,
_x1 x2 --_
append double cell to dictionary;
may throw -8 "dictionary overflow": 2, , , ;
### NFA,
_"name" --_
parse name and append dictionary entry with name;
set LASTXT to HERE;
may throw -8 "dictionary overflow": NFA, PARSE-NAME HERE CURRENT @ , CURRENT ! DUP C, HERE SWAP DUP ALLOT CMOVE HERE TO LASTXT ;
### CFA,
_addr --_
append cfa call addr to dictionary;
may throw -8 "dictionary overflow"### CFA:,
_-- addr colon_sys_
append cfa colon definition to dictionary;
make CONTEXT the CURRENT vocabulary;
start compiling;
may throw -8 "dictionary overflow": CFA:, ] HERE colon_sys ['] (:) CFA, CURRENT TO CONTEXT ;
### POSTPONE
_"name" --_
postpone compile action of name;
if name is immediate, then compile name instead of executing it;
otherwise compile name into the current colon definition;
can be used to create macros, e.g. : TRUE POSTPONE -1 ; IMMEDIATE;
may throw -13 "undefined word";
may throw -14 "interpreting a compile-only word"### :
_-- ; C: "name" -- addr colon_sys_
define name and start compiling: : NFA, HIDE CFA:, ;
### ;
_-- ; C: addr colon_sys --_
end colon definition and stop compiling;
may throw -14 "interpreting a compile-only word";
may throw -22 "control structure mismatch": ; ?COMP colon_sys <> IF -22 THROW THEN DROP POSTPONE (;) REVEAL [ ; IMMEDIATE
### EXIT
_--_
exit colon definition: EXIT ?COMP POSTPONE (EXIT) ; IMMEDIATE
### CREATE
_"name" -- ; -- addr_
create name;
executing name leaves address (HERE addr after CREATE): NFA, ['] (VAR) CFA, ;
### DOES>
_-- ; ... -- ..._
change CREATE name behavior to execute code after DOES>: DOES> ?COMP POSTPONE (;CODE) ['] (DOES) CFA, ; IMMEDIATE
### VARIABLE
_"name" -- ; -- addr_
define a variable;
executing name leaves address of value (initialized to zero): VARIABLE CREATE 0 , ;
### 2VARIABLE
_"name" -- ; -- addr_
define a double variable;
executing name leaves address of double value (initialized to zero): 2VARIABLE CREATE 0 0 2, ;
### CONSTANT
_x "name" -- ; -- x_
define a constant;
executing name leaves x: CONSTANT NFA, ['] (CON) CFA, , ;
: CONSTANT CREATE , DOES> @ ;### 2CONSTANT
_x1 x2 "name" -- ; -- x1 x2_
define a double constant;
executing name leaves x1 x2: 2CONSTANT NFA, ['] (2CON) CFA, 2, ;
: 2CONSTANT CREATE 2, DOES> 2@ ;### VALUE
_x "name" -- ; -- x_
define a value;
executing name leaves x: VALUE NFA, ['] (VAL) CFA, , ;
### 2VALUE
_dx "name" -- ; -- dx_
define a double value;
executing name leaves dx: 2VALUE NFA, ['] (2VAL) CFA, 2, ;
### TO
_"name" -- ; x --_
assign value name;
may throw -32 "invalid name argument": TO
'
DUP VALUE? IF
>BODY
STATE @ IF
POSTPONE (TO)
,
EXIT
THEN
!
EXIT
THEN
DUP 2VALUE? IF
>BODY
STATE @ IF
POSTPONE (2TO)
,
EXIT
THEN
2!
EXIT
THEN
#-32 THROW ; IMMEDIATE### +TO
_"name" -- ; n --_
increment value name;
may throw -32 "invalid name argument": +TO
'
DUP VALUE? IF
>BODY
STATE @ IF
POSTPONE (+TO)
,
EXIT
THEN
+!
EXIT
THEN
#-32 THROW ; IMMEDIATE### DEFER
_"name" -- ; ... -- ..._
define a deferred name: DEFER NFA, ['] (DEF) CFA, ['] UNDEF , ;
### UNDEF
_--_
throw -256 "execution of an uninitialized deferred word": UNDEF -256 THROW ;
### DEFER!
_xt1 xt2 --_
store xt1 in deferred xt2: DEFER! >BODY ! ;
### DEFER@
_xt1 -- xt2_
fetch execution token from deferred xt1: DEFER@ >BODY @ ;
### IS
_xt "name" --_
assign execution token to deferred name;
may throw -32 "invalid name argument": IS
'
DUP DEFER? IF
STATE @ IF
LITERAL
POSTPONE DEFER!
EXIT
THEN
DEFER!
EXIT
THEN
#-32 THROW ; IMMEDIATE### ACTION-OF
_"name" -- xt_
fetch execution token of deferred name;
may throw -32 "invalid name argument": ACTION-OF
'
DUP DEFER? IF
STATE @ IF
LITERAL
POSTPONE DEFER@
EXIT
THEN
DEFER@
EXIT
THEN
#-32 THROW ; IMMEDIATE### LITERAL
_x -- ; -- x_
compile a literal: LITERAL ?COMP POSTPONE (LIT) , ; IMMEDIATE
### 2LITERAL
_x1 x2 -- ; -- x1 x2_
compile a double literal: 2LITERAL ?COMP POSTPONE (2LIT) 2, ; IMMEDIATE
### SLITERAL
_c-addr u -- ; -- c-addr u_
compile a string literal;
max literal string length is 255: SLITERAL
?COMP
DUP 255 U> IF -18 THROW THEN
POSTPONE (SLIT)
DUP C,
HERE OVER ALLOT SWAP CMOVE ; IMMEDIATE### ."
_"ccc" -- ; --_
type "ccc" (compiled): ." '" PARSE SLITERAL POSTPONE TYPE ; IMMEDIATE
### S"
_"ccc" -- ; -- c-addr u_
leave string "ccc" (compiled and interpreted): S"
'" PARSE
STATE @ IF
SLITERAL
EXIT
THEN
TMP SWAP
2DUP 2>R
CMOVE
2R> ; IMMEDIATE### VALUE?
_xt -- flag_
true if xt is a VALUE: VALUE? DUP C@ $CD = SWAP 1+ @ ['] (VAL) = AND ;
### 2VALUE?
_xt -- flag_
true if xt is a 2VALUE: 2VALUE? DUP C@ $CD = SWAP 1+ @ ['] (2VAL) = AND ;
### DEFER?
_xt -- flag_
true if xt is a DEFER word: DEFER? DUP C@ $CD = SWAP 1+ @ ['] (DEF) = AND ;
### FENCE
_-- addr_
only permit FORGET past the dictionary FENCE address0 VALUE FENCE
### FORGET
_"name" --_
delete name and all following definitions;
beware of vocabulary definitions crossings;
may throw -15 "invalid FORGET"### [']
_"name" -- ; -- xt_
compile xt of name as literal;
may throw -14 "interpreting a compile-only word": ['] ?COMP ' LITERAL ; IMMEDIATE
### RECURSE
_... -- ..._
recursively call the currently defined word;
may throw -14 "interpreting a compile-only word": RECURSE ?COMP LASTXT COMPILE, ; IMMEDIATE
### ?STACK
_--_
check parameter stack bounds;
may throw -3 "stack overflow";
may throw -4 "stack underflow"### (UNTIL)
_x --_
branch if x = 0;
runtime of the UNTIL compile-only word### (IF)
_x --_
branch if x = 0;
runtime of the IF and WHILE compile-only words### (AGAIN)
_--_
branch;
runtime of the AGAIN and REPEAT compile-only words### (AHEAD)
_--_
branch;
runtime of the AHEAD, ELSE and ENDOF compile-only words### (OF)
_x1 x2 -- x1 or x1 x2 --_
branch if x1 <> x2;
runtime of the OF compile-only word### (LOOP)
_--_
repeat loop unless loop counter crosses the limit;
runtime of the LOOP compile-only word### (+LOOP)
_--_
increment counter and repeat loop unless counter crosses the limit;
runtime of the +LOOP compile-only word### (?DO)
_n1|u1 n2|u2 --_
begin loop with limit n1|u1 and initial value n2|u2;
skip loop when zero trip loop;
runtime of the ?DO compile-only word### (DO)
_n1|u1 n2|u2 --_
begin loop with limit n1|u1 and initial value n2|u2;
loop at least once;
runtime of the DO compile-only word### (UNLOOP)
_R: ... --_
remove loop parameters;
runtime of the UNLOOP compile-only word### (LEAVE)
_--_
discard the loop parameters and exit the innermost do-loop;
runtime of the LEAVE compile-only word### AHEAD
_-- ; C: -- addr orig_
branch ahead to THEN;
may throw -14 "interpreting a compile-only word"### BEGIN
_-- ; C: -- addr dest_
begin WHILE REPEAT;
may throw -14 "interpreting a compile-only word"### AGAIN
_-- ; C: addr dest --_
branch back to BEGIN;
may throw -14 "interpreting a compile-only word";
may throw -22 "control structure mismatch"### UNTIL
_x -- ; C: addr dest --_
branch back to BEGIN if x = 0;
may throw -14 "interpreting a compile-only word";
may throw -22 "control structure mismatch"### IF
_x -- ; C: -- addr orig_
branch to closest ELSE or THEN if x = 0;
may throw -14 "interpreting a compile-only word"### THEN
_-- ; C: addr orig --_
close AHEAD, IF, ELSE;
may throw -14 "interpreting a compile-only word";
may throw -22 "control structure mismatch"### ELSE
_-- ; C: addr orig -- addr orig_
close IF and branch to THEN;
may throw -14 "interpreting a compile-only word";
may throw -22 "control structure mismatch"### WHILE
_x -- ; C: addr sys -- addr orig addr sys_
branch to exit REPEAT if x = 0;
may throw -14 "interpreting a compile-only word"### REPEAT
_-- ; C: addr orig addr dest --_
branch back to BEGIN after WHILE;
may throw -14 "interpreting a compile-only word";
may throw -22 "control structure mismatch"### DO
_n1|u1 n2|u2 -- ; C: -- addr do_sys_
begin loop from initial value n2|u2 to the limit n1|u1;
loop at least once;
may throw -14 "interpreting a compile-only word"### ?DO
_n1|u1 n2|u2 -- ; C: -- addr do_sys_
begin loop from initial value n2|u2 to the limit n1|u1;
skip loop when zero trip loop;
may throw -14 "interpreting a compile-only word"### LOOP
_-- ; C: addr do_sys --_
repeat loop unless loop counter crosses the limit;
may throw -14 "interpreting a compile-only word";
may throw -22 "control structure mismatch"### +LOOP
_n|u -- ; C: addr do_sys --_
increment counter and repeat loop unless counter crosses the limit;
may throw -14 "interpreting a compile-only word";
may throw -22 "control structure mismatch"### UNLOOP
_--_
remove loop parameters;
may throw -14 "interpreting a compile-only word"### LEAVE
_--_
exit the innermost do-loop;
may throw -14 "interpreting a compile-only word"### I
_-- n_
counter of innermost do loop### J
_-- n_
counter of outer (second) do loop### CASE
_x -- ; C: -- 0_
begin CASE ENDCASE switch;
may throw -14 "interpreting a compile-only word"### OF
_x1 x2 -- x1 or x1 x2 -- ; C: n1 -- orig n2_
take CASE arm if x1 = x2;
otherwise branch to next OF;
may throw -14 "interpreting a compile-only word"### ENDOF
_-- ; C: n -- orig n_
branch to ENDCASE;
may throw -14 "interpreting a compile-only word";
may throw -22 "control structure mismatch"### ENDCASE
_x -- ; C: n*orig n --_
close CASE;
may throw -14 "interpreting a compile-only word";
may throw -22 "control structure mismatch"### HANDLER
_-- addr_
variable with saved return stack pointerVARIABLE HANDLER
### EXECUTE
_... xt -- ..._
execute execution token xt### CATCH
_... xt -- ... 0 or xt -- n_
execute xt leaving nonzero exception code n or 0 when no exception occurred;
when an exception was caught, the parameter and return stacks are restored
to their state before execution of xt: CATCH
SP@ >R
HANDLER @ >R
RP@ HANDLER !
EXECUTE
R> HANDLER !
RDROP
0 ;### THROW
_0 -- or ... n -- ... n_
throw exception n if nonzero: THROW
?DUP IF
HANDLER @ ?DUP IF
RP!
R> HANDLER !
R> SWAP >R
SP!
DROP
R>
EXIT
THEN
>R CLEAR R>
ERROR
REPL
THEN ;### QUIT
_... -- ; R: ... --_
throw -56 "QUIT";
no exception error is displayed;
unlike ABORT, the parameter stack is not cleared: QUIT -56 THROW ;
### (ABORT")
_... flag c-addr u -- ; R: ... --_
if flag then abort with string message unless an active catch is present;
runtime of the ABORT" compile-only word;
throw -2 "ABORT"": (ABORT")
ROT IF
HANDLER @ IF
2DROP
ELSE
TYPE
THEN
-2 THROW
THEN
2DROP ;### ABORT"
_... flag -- ; C: "ccc" -- ; R: ... --_
if flag then abort with string message unless an active catch is present;
throw -2 "ABORT"";
clears the parameter stack unless caught with CATCH;
may throw -14 "interpreting a compile-only word": ABORT" ?COMP POSTPONE S" POSTPONE (ABORT") ; IMMEDIATE
### ABORT
_... -- ; R: ... --_
throw -1 "ABORT";
clears the parameter stack unless caught with CATCH: ABORT -1 THROW ;
### ERROR
_n --_
display exception n at the offending location in the input;
n = -1 ABORT and n = -2 ABORT" clear the stack;
n = -56 QUIT stays silent;
List of Forth850 errors:code | error
---- | ---------------------------------------------------------
-1 | ABORT
-2 | ABORT"
-3 | stack overflow
-4 | stack underflow
-8 | dictionary overflow
-10 | division by zero
-11 | result out of range
-13 | undefined word
-14 | interpreting a compile-only word
-15 | invalid FORGET
-16 | attempt to use zero-length string as a name
-18 | parsed string overflow
-19 | definition name too long
-22 | control structure mismatch
-24 | invalid numeric argument
-28 | user interrupt (BREAK was pressed)
-32 | invalid name argument (invalid TO name)
-42 | floating-point divide by zero
-43 | floating-point result out of range
-46 | floating-point invalid argument
-56 | QUIT
-256 | execution of an uninitialized deferred word### NUMBER
_c-addr u -- n|u|d|ud_
convert string to number;
value DBL is set to -1 when the number is a double;
may throw -13 "undefined word" when string is not numeric### INTERPRET
_--_
interpret input while input is available### EVALUATE
_... c-addr u -- ..._
evaluate string### REPL
_--_
read-evaluate-print loop: REPL
rp0 @ RP!
HANDLER OFF
0 TO SOURCE-ID
CR
[
BEGIN
BEGIN ['] REFILL CATCH ?DUP WHILE
ERROR CR
REPEAT
WHILE
SPACE
['] INTERPRET CATCH ?DUP IF
ERROR
REPL
THEN
STATE @ INVERT IF
." OK["
DEPTH 0 U.R
'] EMIT
THEN
CR
REPEAT
BYE ;### BYE
_--_
return to BASIC### CONTEXT
_-- addr_
leaves address of link of the last vocabulary context definition' FORTH VALUE CONTEXT
### CURRENT
_-- addr_
leaves address of link of the last current vocabulary definition' FORTH VALUE CURRENT
### DEFINITIONS
_--_
make CURRENT the CONTEXT vocabulary: DEFINITIONS CONTEXT TO CURRENT ;
### VOCABULARY
_"name" --_
define a new vocabulary: VOCABULARY CREATE , fig_kludge , DOES> TO CONTEXT ;
### FORTH
_--_
make FORTH the CONTEXT vocabularyVOCABULARY FORTH
## Additional words included with the full version
### FALSE
_-- 0_
leave 00 CONSTANT FALSE
### TRUE
_-- -1_
leave -1-1 CONSTANT TRUE
### 2ROT
_xd1 xd2 xd3 -- xd2 xd3 xd1_
rotate double cells: 2ROT 5 ROLL 5 ROLL ;
### ROLL
_xu x(u+1) ... x1 x0 u -- x(u+1) ... x1 x0 xu_
roll u cells on the parameter stack;
0 ROLL does nothing;
1 ROLL is the same as SWAP;
2 ROLL is the same as ROT### D*
_d1|ud1 d2|ud2 -- d3|ud3_
signed and unsigned double product d1*d2: D* >R ROT DUP>R -ROT MD* 2R> * 0 SWAP D+ ;
### UD/MOD
_ud1 ud2 -- ud3 ud4_
unsigned double remainder and quotient ud1/ud2;
the result is undefined when ud2 = 0### D/MOD
_d1 d2 -- d3 d4_
double symmetric remainder and quotient d1/d2;
the result is undefined when d2 = 0: D/MOD
DUP 3 PICK DUP>R XOR >R
DABS 2SWAP DABS 2SWAP
UD/MOD
R> 0< IF DNEGATE THEN
R> 0< IF 2SWAP DNEGATE 2SWAP THEN ;### DMOD
_d1 d2 -- d3_
double symmetric remainder of d1/d2;
the result is undefined when d2 = 0: DMOD D/MOD 2DROP ;
### D/
_d1 d2 -- d3_
double quotient d1/d2;
the result is undefined when d2 = 0: D/ D/MOD 2SWAP 2DROP ;
### D=
_d1 d2 -- flag_
true if d1 = d2: D= D- D0= ;
### D<
_d1 d2 -- flag_
true if d1 < d2: D<
DUP 3 PICK XOR 0< IF
2DROP D0<
EXIT
THEN
D- D0< ;### DU<
_du1 du2 -- flag_
true if ud1 < ud2: DU<
DUP 3 PICK XOR 0< IF
2SWAP 2DROP D0<
EXIT
THEN
D- D0< ;### DMAX
_d1 d2 -- d3_
signed double max of d1 and d2: DMAX
2OVER 2OVER D< IF 2SWAP THEN
2DROP ;### DMIN
_d1 d2 -- d3_
signed double min of d1 and d2: DMIN
2OVER 2OVER D< INVERT IF 2SWAP THEN
2DROP ;### CELL+
_addr -- addr_
increment to next cell: CELL+ 2+ ;
### CELLS
_n1 -- n2_
convert to cell unit: CELLS 2* ;
### CHAR+
_n1 -- n1_
increment to next char: CHAR+ 1+ ;
### CHARS
_n1 -- n2_
convert to char unit: CHARS ;
### DUMP
_c-addr u --_
dump memory in hex: DUMP
BASE @ >R
HEX
BEGIN DUP WHILE
NEXT-CHAR .
REPEAT
2DROP
R> BASE ! ;### HOLDS
_c-addr u --_
hold string for pictured numeric output: HOLDS
BEGIN DUP WHILE
1- 2DUP + C@ HOLD
REPEAT
2DROP ;### BEEP
_--_
sound the speaker for a short ~2KHz beep### KEY-CLEAR
_--_
wait until no keys are pressed: KEY-CLEAR BEGIN INKEY 0= UNTIL ;
### KEY?
_-- flag_
true if a key is pressed: KEY? INKEY 0= 0= ;
### WORD
_char "ccc" -- c-addr_
parse word as a counted string: WORD TMP DUP ROT PARSE-WORD ROT 2DUP C! 1+ SWAP CMOVE ;
### CHAR
_"name" -- char_
parse char;
note that the syntax 'char is preferred instead of this legacy word: CHAR PARSE-NAME DROP C@ ;
### FIND
_c-addr -- c-addr 0 | xt 1 | xt -1_
search dictionary for counted string;
see WORD, COUNT and FIND-WORD### BUFFER:
_n "name" -- ; -- addr_
define buffer with n bytes;
executing name leaves address of n bytes: BUFFER: CREATE ALLOT ;
### :NONAME
_-- xt_
colon definition without name;
leaves execution token of definition to be used or saved### C"
_"ccc" -- ; -- c-addr_
leave counted string "ccc" (compiled);
may throw -18 "parsed string overflow": C" ?COMP POSTPONE S" POSTPONE DROP POSTPONE 1- ;
### MARKER?
_xt -- flag_
true if xt is a MARKER word### MARKER
_"name" -- ; --_
define a dictionary marker;
executing name deletes marker and all definitions made after;
beware of vocabulary definitions crossings: MARKER
CURRENT
DUP @
HERE
CREATE
, 2,
DOES>
DUP CELL+ 2@
SWAP TO CONTEXT
DUP CONTEXT !
DEFINITIONS
L>NAME NAME> TO LASTXT
@ HERE - ALLOT ;### ANEW
_"name" -- ; --_
define a dictionary marker;
deletes previously defined name and all following definitions;
beware of vocabulary definitions crossings: ANEW
>IN @ >R
PARSE-NAME FIND-WORD
OVER MARKER?
AND IF
EXECUTE
ELSE
DROP
R> >IN !
MARKER ;### [CHAR]
_"char" -- ; -- char_
compile char as literal;
note that the syntax 'char is preferred instead of this legacy word;
may throw -14 "interpreting a compile-only word": [CHAR] ?COMP CHAR LITERAL ; IMMEDIATE
### [COMPILE]
_"name" -- ; ... -- ..._
compile name;
note that POSTPONE is preferred instead of this legacy word;
may throw -14 "interpreting a compile-only word": [COMPILE] ?COMP ' COMPILE, ; IMMEDIATE
### K
_-- n_
counter of outer (third) do loop### TEXT
_--_
read and evaluate TEXT editor area with Forth source code;
caveat: .( and ( in TEXT cannot span more than one line, they end at EOL: TEXT
$7973 @ 1+ >R
BEGIN
R> \ -- addr
DUP C@ $FF <> WHILE
2+ DUP C@ SWAP 1+ \ -- len addr
2DUP + >R
SWAP 1- EVALUATE
REPEAT
DROP ;## Floating point math words included with the full version
Floating point values are doubles on the stack. Double words, such as 2DUP,
can be used to manipulate floats. Floats can be stored in 2CONSTANT, 2VARIABLE
and 2VALUE assigned with TO (but not with +TO.)Beware that HEX prevents inputting floats and garbles the output of floats.
### F+
_r1 r2 -- r3_
sum r1+r2;
may throw -43 "floating-point result out of range"### F-
_r1 r2 -- r3_
difference r1-r2;
may throw -43 "floating-point result out of range"### F*
_r1 r2 -- r3_
product r1*r2;
may throw -43 "floating-point result out of range"### F/
_r1 r2 -- r3_
quotient r1/r2
may throw -42 "floating-point divide by zero";
may throw -43 "floating-point result out of range"### FTRUNC
_r1 -- r2_
truncate float towards zero### FLOOR
_r1 -- r2_
floor float towards negative infinity
may throw -43 "floating-point result out of range"### FROUND
_r1 -- r2_
round float to nearest;
may throw -43 "floating-point result out of range"### FNEGATE
_r1 -- r2_
negate float### FABS
_r1 -- r2_
absolute value |r1|: FABS 2DUP F0< IF FNEGATE THEN ;
### F=
_r1 r2 -- flag_
true if r1 = r2: F= D= ; ( works for IEEE 754 floating point without negative zero and inf/nan )
### F<
_r1 r2 -- flag_
true if r1 < r2: F<
DUP 3 PICK AND 0< IF
2SWAP
D< ; ( works for IEEE 754 floating point without negative zero and inf/nan )### F0=
_r -- flag_
true if r = 0.0e0: F0= D0= ; ( works for IEEE 754 floating point without negative zero and inf/nan )
### F0<
_r -- flag_
true if r < 0.0e0: F0< D0< ; ( works for IEEE 754 floating point without negative zero and inf/nan )
### FMAX
_r1 r2 -- r3_max of r1 and r2
: FMAX
2OVER 2OVER F< IF 2SWAP THEN
2DROP ;### FMIN
_r1 r2 -- r3_min of r1 and r2
: FMIN
2OVER 2OVER F< INVERT IF 2SWAP THEN
2DROP ;### D>F
_d -- r_
widen signed double to float### S>F
_n -- r_
widen signed single to float### F>D
_r -- d_
narrow float to a signed double;
may throw -11 "result out of range"### F>S
_r -- n_
narrow float to a signed single;
may throw -11 "result out of range"### >FLOAT
_c-addr u -- r true | false_
convert string to float;
leaves the float and true if string is converted;
leaves false if string is unconvertable### REPRESENT
_r c-addr u -- n flag true_
convert float to string;
store decimal digits of the float in the non-empty buffer c-addr u;
leaves decimal exponent n+1 and flag = true if negative### PRECISION
_-- +n_
floating point output precision, the number of decimal digits displayed is 7 by default7 VALUE PRECISION
### F.
_r --_
output float with a trailing space;
output fixed notation when 1e-1 <= |r| < 1e+7, otherwise output scientific notation: F.
HERE PRECISION REPRESENT DROP IF
'- EMIT
THEN
DUP 0 PRECISION 1+ WITHIN IF
HERE OVER TYPE
'. EMIT
HERE OVER +
PRECISION ROT - '0 -TRIM TYPE SPACE
EXIT
THEN
HERE C@ EMIT
'. HERE C!
HERE PRECISION '0 -TRIM TYPE
'E EMIT
1- . ;## Dictionary structure
The Forth850 dictionary is organized as follows:
low address
_________
+--->| $0000 | last link is zero (2 bytes)
^ |---------|
| | 3 | length of "(:)" (1 byte)
| |---------|
| | (:) | "(:)" word characters (3 bytes)
| |---------|
| | code | machine code
| |=========|
+<==>+ link | link to previous entry (2 bytes)
^ |---------|
: : :
: : :
: : :
| |=========|
+<==>| link | link to previous entry (2 bytes)
^ |---------|
| | $80+5 | length of "aword" (1 byte) with IMMEDIATE bit set
| |---------|
| | aword | "aword" word characters (5 bytes)
| |---------|
| | code | Forth code and/or data
| |=========|
+<---| link |<--- last link to previous entry (2 bytes)
|---------|
| 7 | length of "my-word" (1 byte)
|---------|
| my-word | "my-word" word characters (7 bytes)
|---------|
| code |<--- LASTXT points to code (last xt)
|=========|<--- HERE pointer
| hold | hold area for numerical output (40 bytes)
|---------|
| |
| free | unused dictionary space
| space |
| |
|=========|<--- dictionary limit
| |
| data | stack of 256 bytes (128 cells)
| stack | grows toward lower addresses
| |<--- SP stack pointer
|=========|
| |
| return | return stack of 256 bytes (128 cells/calls)
| stack | grows toward lower addresses
| |<--- RP return stack pointer
|_________|<--- USER address
<--- USER+1 addresshigh address set with USER in Monitor MON
A link field points to the previous link field. The last link field at the
lowest address of the dictionary is zero.`LASTXT` returns the execution token of the last definition, which is the
location where the machine code of the last word starts.Forth850 is a Direct Threaded Code Forth implementation. Code is either
machine code or starts with a jump or call machine code instruction of 3 bytes,
followed by Forth code (a sequence of execution tokens in a colon definition)
or data (constants, variables, values and other words created with `CREATE`.)Immediate words are marked with the length byte high bit 7 set ($80). Hidden
words have the "smudge" bit 6 ($40) set. A word is hidden until successfully
compiled. `HIDE` hides the last defined word by setting the smudge bit.
`REVEAL` reveals it. Incomplete colon definitions with compilation errors
should never be revealed.## Implementation
The following sections explain parts of the technical implementation of
Forth850. I will explain the new Forth system routines, the new Z80 math
routines and the string routines.Forth850 is built with the [asz80](https://shop-pdp.net/ashtml/asz80.htm)
assembler and aslink linker.## Z80 Forth system routines
Forth850 uses direct threaded code (DTC). Faster would be to use subroutine
threaded code (STC), but this would significantly increase the overall code
size and Forth compilation complexity, which are less desirable for a small
Z80-based system.The following Z80 Forth routines are inspired by the article
[Moving Forth](https://www.bradrodriguez.com/papers/moving1.htm). However,
I've decided to use a different Z80 register mapping that is more efficient:- BC: instruction pointer (IP)
- DE: top of stack (TOS)
- IY: address of the "next routine", for `jp (iy)`By contrast to the article, having the TOS in DE makes it quicker to perform
address arithmetic with the TOS, because we can exchange DE with HL with `ex
de,hl` in just 4 CPU cycles. Moving BC to HL takes 8 CPU cycles.I've placed the return stack pointer (RP) in RAM. There is no advantage to use
register IX for RP as the article suggests. In fact, the colon call and return
have the same cycle counts, but almost all of the return stack operations, such
as `>R`, require more cycles with the RP in IX compared to the RP in RAM.A jump to the "next routine" is with `jp (iy)` takes 8 CPU cycles, compared to
a `jp next` that takes 10 cycles. Inlining the "next routine" eliminates
this overhead, but increases the code size. Inlining should only be applied to
performance-critical words that are frequently used. See macros `NEXT` and
`JP_NEXT` defined in the section below.### Next fetch and execute
Fetching an execution token (xt) from the instruction pointer (IP) address,
incrementing IP and executing the token takes 38 CPU cycles in the "next
routine":.macro NEXT
ld a,(bc) ; 7 ;
ld l,a ; 4 ;
inc bc ; 6 ;
ld a,(bc) ; 7 ;
ld h,a ; 4 ;
inc bc ; 6 ; [ip++] -> hl with xt
jp (hl) ; 4(38); jump to hl
.endifThe "next routine" cycles contribute to the overhead of DTC, which cannot be
removed to speed up DTC execution. To improve speed by 10% on average, the
fetch and execute routine is inlined with the `NEXT` macro for
performance-critical words. When performance is not critical, a `JP_NEXT`
macro is used, which simply expands into `jp (iy)` with the IY register
pointing to the "next routine":.macro JP_NEXT
jp (iy) ; 8(46); jump to next routine
.endm### Colon call and return
Each colon definition in memory starts with a `call docol`. The `docol`
routine associated with the `(:)` word saves the instruction pointer in BC on
the return stack and pops the new instruction pointer from the parameter stack
(since `call docol` leaves the address after the call on the stack.) The
routine checks for ON/BREAK key and begins executing the colon definition with
the "next routine":docol: ld hl,(rp) ; 16 ; [rp] -> hl
dec hl ; 6 ;
ld (hl),b ; 7 ;
dec hl ; 6 ;
ld (hl),c ; 7 ; save bc -> [--rp] with caller ip on the return stack
ld (rp),hl ; 16 ; ip - 2 -> rp
pop bc ; 10(68); pop ip saved by call docol
; continue with ON/BREAK key check
cont: in a,(0x1f) ; 11 ; port 0x1f bit 7 is set if ON/BREAK is depressed
add a ; 4 ; test ON/BREAK key
jr c,break ; 7(22); if ON/BREAK pressed then break
; next
next: ld a,(bc) ; 7 ;
ld l,a ; 4 ;
inc bc ; 6 ;
ld a,(bc) ; 7 ;
ld h,a ; 4 ;
inc bc ; 6 ; [bc++] -> hl with xt
jp (hl) ; 4(38); jump to hlA return from a colon definition with the `(;)` word pops the return
instruction pointer off the return stack to continue executing the caller's
next instruction.doret: ld hl,(rp) ; 16 ; [rp] -> hl
ld c,(hl) ; 7 ;
inc hl ; 6 ;
ld b,(hl) ; 7 ;
inc hl ; 6 ;
ld (rp),hl ; 16(58); restore [rp++] -> bc with ip of the caller
NEXT ; 38 ; continueA colon call takes 145 cycles (17 + 68 + 22 + 38 cycles) and a colon return
takes 96 cycles (58 + 38 cycles.) This includes the 38 cycle overhead of the
"next routine" to fetch and execute the next token.### Variables
A Forth variable leaves its address on the parameter stack. A `call dovar` is
used to push the address on the stack, which is then retrieved to set the new
TOS:dovar: pop hl ; 10 ; pop hl with pfa addr saved by call dovar
push de ; 11 ; save TOS
ex de,hl ; 4(25); set new TOS to hl with pfa addr
NEXT ; 38 ; continueExecuting a word defined as a variable takes 80 cycles (17 + 25 + 38 cycles),
which includes the "next routine" overhead.### Constants and values
A Forth constant or value leaves its value on the parameter stack. A `call
doval` is used to push the address of the constant/value on the stack. The
constant/value is then fetched:doval: pop hl ; 10 ; pop hl with pfa addr saved by call doval
push de ; 11 ; save TOS
ld e,(hl) ; 7 ;
inc hl ; 6 ;
ld d,(hl) ; 7(41); set [hl] -> de as new TOS
NEXT ; 38 ; continueExecuting a word defined as a constant or value takes 96 cycles (17 + 41 + 38
cycles), which includes the "next routine" overhead.### Fetch and store
The `@` fetch and `!` store words make good use of `ex de,hl`:
fetch: ex de,hl ; 4 ; addr -> hl
ld e,(hl) ; 7 ;
inc hl ; 6 ;
ld d,(hl) ; 7(24); set [hl] -> de as new TOS
NEXT ; 38 ; continuestore: pop hl ; 10 ; pop addr -> hl
ex de,hl ; 4 ; x -> de, addr -> hl
ld (hl),e ; 7 ;
inc hl ; 6 ;
ld (hl),d ; 7 ; de -> [hl] with x
pop de ; 10(44); pop new TOS
NEXT ; 38 ; continue### CREATE with DOES>
A Forth definer word that uses CREATE with DOES> to define new words is
compiled to execute the `(;CODE)` token with label `doscode`, followed by a
`call dodoes` to start interpreting the DOES> code:doscode: ld hl,(lastxt+3) ; LASTXT -> hl with last defined word xt
inc hl ;
ld (hl),c ;
inc hl ;
ld (hl),b ; ip -> [LASTXT+1] overwrite call address
jr doret ; (;) return to callerdodoes: ld hl,(rp) ; 16 ; [rp] -> hl
dec hl ; 6 ;
ld (hl),b ; 7 ;
dec hl ; 6 ;
ld (hl),c ; 7 ;
ld (rp),hl ; 16 ; save bc -> [--rp] with old ip on the return stack
pop bc ; 10 ; pop bc with new ip of the DOES> routine saved by call dodoes
pop hl ; 10 ; pop pfa addr
push de ; 11 ; save TOS
ex de,hl ; 4(93); set new TOS to hl with pfa addr
NEXT ; 38 ; continueA word defined by a CREATE/DOES> definer makes a `call` to the `call dodoes`
routine. For example, suppose we define `CONSTANT` as follows:: CONSTANT CREATE , DOES> @ ;
123 CONSTANT Xthen `CONSTANT` and `X` are compiled as:
CONSTANT: call docol
.dw create
.dw comma
.dw doscode
CONSTANT_does: call dodoes
.dw fetch
.dw doretX: call CONSTANT_does
.dw 123Executing `X` takes 192 cycles (17 + 17 + 24 + 38 + 96 cycles.) When
more optimally defined as a `CONSTANT` in code, this takes 96 cycles.### Parsing
Forth words are parsed with my new `CHOP` and `TRIM` words that efficiently
parse and extract white-space-delimited words from the input.Entry:
- DE with TOS: a char to truncate the string with
- 2OS: string length u1
- 3OS: string address c-addrExit:
- DE with TOS: truncated string length u2
- 2OS: string address c-addrPerformance:
21 cycles per character for non-BL char to chop, 40 cycles per character for BL
to chop white spacechop: ld a,e ; char -> a
exx ; save bc with ip
ex af,af' ; save a with char
pop bc ; pop u1 -> bc
ld e,c ;
ld d,b ; u1 -> de
ld a,c ;
or b ; test bc = 0, 0 -> cf
jr z,2$ ; if bc = 0 then not found
pop hl ;
push hl ; c-addr -> hl
ex af,af' ; restore a with char
cp 0x20 ;
jr z,3$ ; if a = 0x20 then find white space
or a ; 0 -> cf not found
; find char in string
cpir ; 21/16 ; repeat until a = [hl++] or --bc = 0
jr nz,2$ ; if match then
1$: ccf ; complement to correct cpi bc--
2$: ex de,hl ; u1 -> hl
sbc hl,bc ; u1 - bc - cf -> hl
push hl ; save hl as TOS
exx ; restore bc with ip
pop de ; pop new TOS
JP_NEXT ; continue
; find white space in string
3$: cp (hl) ; 7 ; loop to compare a to [hl]
cpi ; 16 ; hl++, bc--
jr nc,1$ ; 7 ; if [hl] a
exx ; save bc with ip
pop bc ; u1 -> bc
pop hl ; c-addr1 -> hl
1$: ex af,af' ; 4 ; save a
ld a,c ; 4 ;
or b ; 4 ;
jr z,3$ ; 7 ; if bc <> 0 then
ex af,af' ; 4 ; restore a
2$: cpi ; 16 ; loop
jr nz,4$ ; 7/12 ; while a = [hl++], --bc
jp pe,2$ ; 10 ; until b = 0
3$: push hl ; save hl as 2OS
push bc ; save bc as TOS
exx ; restore bc with ip
pop de ; pop new TOS
JP_NEXT ; continue
4$: cp 0x20 ; 7 ;
jr nz,5$ ; 7 ; if char = 0x20 then
dec hl ; 6 ;
cp (hl) ; 7 ;
inc hl ; 6 ;
jr nc,1$ ; 12 ; if [hl-1] <= 0x20 then keep trimming
5$: inc bc ; correct bc++ for cpi match
dec hl ; correct hl-- for cpi match
jr 3$ ; finalize trimmingTo parse a white-space-delimited word is efficiently performed with `BL PARSE`
where the `PARSE` word is defined as:: PARSE ( char "ccc" -- c-addr u )
SOURCE
>IN @ /STRING
ROT CHOP
DUP 1+ >IN @ +
SOURCE NIP UMIN >IN ! ;To skip input until the next non-white-space character is efficiently performed
with `BL SKIPS`, where `SKIPS` is defined as:: SKIPS ( char "" -- )
SOURCE >IN @ /STRING
ROT TRIM
DROP
SOURCE DROP - >IN ! ;### Dictionary search with case insensitive string matching
The `FIND-WORD` word searches the dictionary starting with `CONTEXT` for a
matching word. The search is case insensitive. Smudged words are skipped.Entry:
- DE with TOS: size of the string to search u
- 2OS: address of the string to search c-addrExit:
- DE with TOS: 0 = not found, 1 = found immediate, -1 = found (not immediate)
- 2OS: string address if not found or execution token when foundPerformance:
95 cycles per dictionary entry,
51 or 102 cycles per character comparison when characters matchfindword: ld a,d ;
or a ; test d = 0 high order byte of u
jp nz,zero_next ; if u is too large then set new TOS to 0
sla e ; shift u to compare w/o immediate bit
jp c,zero_next ; if u is too large then set new TOS to 0
jp z,zero_next ; if u = 0 then set new TOS to 0
push de ; save de with 2*u
exx ; save bc with ip
pop bc ; pop 2 * u -> bc
pop de ; pop c-addr -> de
ld hl,(context+3) ; CONTEXT -> hl
jr 3$ ; start searching
; loop over dictionary
1$: pop de ; restore de with c-addr
2$: pop hl ; 10 ; loop, restore hl with lfa
3$: ld a,(hl) ; 7 ;
inc hl ; 6 ;
ld h,(hl) ; 7 ;
ld l,a ; 4 ; [hl] -> hl follow link at hl = lfa
or h ; 4 ;
jr z,6$ ; 7 ; if hl = 0 then not found
push hl ; 11 ; save hl with lfa
inc hl ; 6 ;
inc hl ; 6 ; hl + 2 -> hl with nt (nfa)
ld a,(hl) ; 7 ; word length
add a ; 4 ; shift away immediate bit
cp c ; 4 ; test a = c word length match (both shifted)
jr nz,2$ ; 12(95); if lengths differ then continue searching
; compare string to word
push de ; save de with c-addr
inc hl ; hl++ point to nfa chars
ld b,c ; 2 * u -> b
srl b ; u -> b word length (nonzero)
; loop over word chars
4$: ld a,(de) ; 7 ; loop
cp (hl) ; 7 ; compare [de] = [hl]
jr z,5$ ; 12/7 ; if mismatch then
and 0xdf ; 7 ; make upper case
cp 'A ; 7 ;
jr c,1$ ; 7 ; if a<'A' then continue search
cp 'Z+1 ; 7 ;
jr nc,1$ ; 7 ; if a>'Z' then continue search
xor (hl) ; 7 ;
and 0xdf ; 7 ; case insensitive compare [de] = [hl]
jr nz,1$ ; 7 ; if mismatch then continue search
5$: inc de ; 6 ; de++ point to next char of c-addr
inc hl ; 6 ; hl++ point to next char of word
djnz 4$ ; 13(51/102);until --b = 0
; found a matching word
pop de ; discard saved c-addr
ex (sp),hl ; save hl with xt as 2OS, restore hl with lfa
inc hl ;
inc hl ; hl + 2 -> hl with nt (nfa)
bit immediate_bit,(hl) ; test immediate bit of [hl] word length
exx ; restore bc with ip
jp nz,one_next ; set new TOS to 1 if word is immediate
jp mone_next ; set new TOS to -1
; not found
6$: push de ; save de with c-addr as 2OS
exx ; restore bc with ip
jp zero_next ; set new TOS to 0
JP_NEXT ; continue## Z80 integer math routines
I've written the following Z80 math routines. My objective was to make them
as efficient as possible. The second objective was to keep the code size small
by using tricks with CPU arithmetic and flags.### Fast signed/unsigned 16x16->16 bit multiplication
Entry:
- BC: signed or unsigned multiplier n1
- DE: signed or unsigned multiplicand n2Exit:
- HL: signed product or unsigned product n3Perfomance:
max 51 cycles x 16 iterations = 816 cycles or
max 51 cycles x 8 iterations + 45 x 8 = 768 cycles, excluding entry/exit overheadmult1616: ld hl,0 ; 0 -> hl
ld a,c ; c -> a low order byte of n1
ld c,b ; b -> c save high order byte of n1
ld b,8 ; 8 -> b loop counter
1$: rra ; 4 ; loop, a >> 1 -> a set cf
jr nc,2$ ; 7 ; if cf = 1 then
add hl,de ; 11 ; hl + de -> hl
2$: sla e ; 8 ;
rl d ; 8 ; de << 1 -> de
djnz 1$ ; 13(51); until --b = 0
ld a,c ; c -> a high order byte of n1
ld b,8 ; 8 -> b loop counter
3$: rra ; 4 ; loop, a >> 1 -> a set cf
jr nc,4$ ; 7 ; if cf = 1 then
add hl,de ; 11 ; hl + de -> hl
4$: sla e ; 8 ;
rl d ; 8 ; de << 1 -> de
djnz 3$ ; 13(51); until --b = 0
ret ; doneWe can make an additional speed improvement, which only costs us one more
instruction byte. To calculate the high order byte we do not need to iterate
over all 8 bits of the high order multiplier stored in register c, but only
over the nonzero bits. We also can ignore the lower order result stored in
register e. This reduces the max loop iteration cycle time to 32 and 33 per
bit. Furthermore, the second loop only runs until the last bit of register c
is shifted out. If register c is zero, the second loop does not execute
thereby saving hundreds of cycles. We also use jp instead of jr to improve and
balance the cycle time per bit:mult1616: ld hl,0 ; 0 -> hl
ld a,c ; c -> a low order byte of n1
ld c,b ; b -> c save high order byte of n1
ld b,8 ; 8 -> b loop counter
1$: rra ; 4 ; loop, a >> 1 -> a set cf
jr nc,2$ ; 7 ; if cf = 1 then
add hl,de ; 11 ; hl + de -> hl
2$: sla e ; 8 ;
rl d ; 8 ; de << 1 -> de
djnz 1$ ; 13(51); until --b = 0
ld a,h ; h -> a do high order, low order is done
jr 5$ ; jump to shift c and loop
3$: add d ; 4 ; loop, a + d -> d
4$: sla d ; 8 ; d << 1 -> d
5$: srl c ; 8 ; c >> 1 -> c set cf and z if no bits left
jr c,3$ ; 12/7(32); until cf = 0 repeat with addition
jp nz,4$ ; 10(33); until c = 0 repeat without addition
ret ; doneNote: unrolling the loops would improve the speed at the cost of a significant
code size increase, which is undesirable for small memory devices.### Fast unsigned 16x16->32 bit multiplication
Entry:
- DE: unsigned multiplicand u1
- BC: unsigned multiplier u2Exit:
- DE: low order unsigned product u3
- HL: high order unsigned product u3Perfomance:
max 64 cycles x 17 iterations = 1088 cycles, excluding entry/exit overheadumult1632: xor a ; 0 -> cf
ld l,a ;
ld h,a ; 0 -> hl
ld a,17 ; 17 -> a loop counter
1$: rr h ; 8 ; loop
rr l ; 8 ;
rr d ; 8 ;
rr e ; 8 ; hlde + cf >> 1 -> hlde
jr nc,2$ ; 7 ; if cf = 1 then
add hl,bc ; 11 ; hl + bc -> hl
2$: dec a ; 4 ;
jp nz,1$ ; 10(64); until --a = 0
ret ; doneNote: unrolling the loop would improve the speed at the cost of a significant
code size increase, which is undesirable for small memory devices.### Fast signed/unsigned 32x32->32 bit multiplication
Entry:
- BC': low order signed or unsigned multiplicand d1
- DE: high order signed or unsigned multiplicand d1
- DE': low order signed or unsigned multiplier d2
- HL': high order signed or unsigned multiplier d2Exit:
- HL': low order signed product or unsigned product d3
- HL: high order signed product or unsigned product d3Perfomance:
max 98 cycles x 32 iterations = 3136 cycles, excluding entry/exit overheadmult3232: ld hl,0 ; 0 -> hl high order d3, de with d2 high order
exx ; save bc with ip
ld a,h ;
push af ; save d1 high order byte 3
ld a,l ;
push af ; save d1 high order byte 2
ld a,b ;
push af ; save d1 low order byte 1
ld a,c ;
push af ; save d1 low order byte 0
ld hl,0 ; 0 -> hl' low order d3
ld c,4 ; 4 -> c outer loop counter
1$: pop af ; loop, [sp++] -> a next d1 byte
ld b,8 ; 8 -> b inner loop counter
2$: rra ; 4 ; loop, a >> 1 -> a set cf
jr nc,3$ ; 7 ; if cf = 1 then
add hl,de ; 11 ; hl' + de' -> hl add low order
exx ; 4 ;
adc hl,de ; 15 ; hl + de + cf -> hl add high order
exx ; 4 ;
3$: sla e ; 8 ;
rl d ; 8 ; de' << 1 -> de' shift low order
exx ; 4 ;
rl e ; 8 ;
rl d ; 8 ; de << 1 + cf -> de shift high order
exx ; 4 ;
djnz 2$ ; 13(98); until --b = 0
dec c ;
jr nz,1$ ; until --c = 0
ret ; doneThe same tricks as the 16x16->16 multiplication method can be used to reduce
the cycle time, but at a cost of increased code size. We also assign different
registers to the low and high order parts:Entry:
- BC': low order signed or unsigned multiplicand d1
- DE: high order signed or unsigned multiplicand d1
- HL': low order signed or unsigned multiplier d2
- DE': high order signed or unsigned multiplier d2Exit:
- HL: low order signed product or unsigned product d3
- HL': high order signed product or unsigned product d3Perfomance:
max 8 x (98+87+59+33) = 2216 cycles, excluding entry/exit overheadmult3232: ld hl,0 ; 0 -> hl low order d3, de with d2 low order
exx ; save bc with ip
ld a,h ;
push af ; save d1 high order byte 3
ld a,l ;
push af ; save d1 high order byte 2
ld a,b ;
push af ; save d1 low order byte 1
ld a,c ; d1 -> a low order byte 0
ld hl,0 ; 0 -> hl' high order d3
ld b,8 ; 8 -> b loop counter
1$: rra ; 4 ; loop, a >> 1 -> a set cf
jr nc,2$ ; 7 ; if cf = 1 then
exx ; 4 ;
add hl,de ; 11 ; hl + de -> hl add low order
exx ; 4 ;
adc hl,de ; 15 ; hl' + de' + cf -> hl add high order
2$: exx ; 4 ;
sla e ; 8 ;
rl d ; 8 ; de << 1 -> de shift low order
exx ; 4 ;
rl e ; 8 ;
rl d ; 8 ; de' << 1 + cf -> de' shift high order
djnz 1$ ; 13(98); until --b = 0
pop af ;
ld c,a ; d1 -> c low order byte1
exx ;
ld a,h ; h -> a low order d3
exx ;
ld b,8 ; 8 -> b loop counter
3$: rr c ; 8 ; loop, c >> 1 -> c set cf
jr nc,4$ ; 7 ; if cf = 1 then
exx ; 4 ;
add d ; 4 ; a + d -> a add low order
exx ; 4 ;
adc hl,de ; 15 ; hl' + de' + cf -> hl add high order
4$: exx ; 4 ;
sla d ; 8 ; d << 1 -> d shift low order
exx ; 4 ;
rl e ; 8 ;
rl d ; 8 ; de' << 1 + cf -> de' shift high order
djnz 3$ ; 13(87); until --b = 0
exx ;
ld h,a ; a -> h low order d3
exx ;
pop af ; d1 -> a high order byte 2
ld b,8 ; 8 -> b loop counter
5$: rra ; 8 ; loop, c >> 1 -> c set cf
jr nc,6$ ; 7 ; if cf = 1 then
add hl,de ; 15 ; hl' + de' + cf -> hl add high order
6$: rl e ; 8 ;
rl d ; 8 ; de' << 1 + cf -> de' shift high order
djnz 5$ ; 13(59); until --b = 0
pop af ;
ld c,a ; d1 -> c high order byte 3
ld a,h ; h -> a high order
jr 9$ ; jump to shift c and loop
7$: add d ; 4 ; loop, a + d -> a
8$: sla d ; 8 ; d << 1 -> d
9$: srl c ; 8 ; c >> 1 -> c set cf and z if no bits left
jr c,7$ ; 12/7(32); until cf = 0 repeat with addition
jp nz,8$ ; 10(33); until c = 0 repeat without addition
ld h,a ; a -> h high order
exx ;
ret ; doneNote: unrolling the inner loop would improve the speed at the cost of a
significant code size increase, which is undesirable for small memory devices.### Fast unsigned 32/16->16 bit division and remainder
This implementation is used by all division and remainder (modulo) Forth words
by calling `UM/MOD`. As such, it is an important and versatile algorithm.Entry:
- HL: high order dividend ud
- BC: low order dividend ud
- DE: divisor u1Exit:
- HL: remainder u2
- BC: quotient u3Performance:
max 85 cycles x 16 iterations = 1360 cycles, excluding entry/exit overheadudiv3216: xor a ;
sub e ;
ld e,a ;
sbc a ;
sub d ;
ld d,a ; -de -> de with -u1
ld a,b ; b -> a low order dividend in ac
ld b,16 ; 16 -> b loop counter
sla c ;
rla ; ac << 1 -> ac
1$: adc hl,hl ; 15 ; loop, hl << 1 + cf -> hl
jr nc,2$ ; 12/ 7 ; if cf = 1 then
add hl,de ; 11 ; hl + -u1 -> hl
scf ; 4 ; 1 -> cf
jr 3$ ; 12 ; else
2$: add hl,de ; 11 ; hl + -u1 -> hl
jr c,3$ ; 12/ 7 ; if cf = 0 then
sbc hl,de ; 15 ; hl - -u1 -> hl to undo, no carry
3$: rl c ; 8 ;
rla ; 4 ; ac << 1 + cf -> ac
djnz 1$ ; 13(85); until --b = 0
ld b,a ; a -> b quotient bc, remainder in hl
ret ; doneThe algorithm negates the divisor first to speed up subtraction by adding the
negative of the divisor instead. Another benefit of using the negated divisor
is that this produces the right carry value to shift into the quotient,
otherwise the carry should be inverted or the resulting quotient must be
inverted.By moving the first conditional block out of the loop, we can save 5 CPU cycles
on the critical path (the most expensive path through the loop) to reduce to
max 80 cycles per iteration at the cost of making the code more cluttered.Entry:
- HL: high order dividend ud
- BC: low order dividend ud
- DE: divisor u1Exit:
- HL: remainder u2
- BC: quotient u3Performance:
max 80 cycles x 16 iterations = 1280 cycles, excluding entry/exit overheadudiv3216: xor a ;
sub e ;
ld e,a ;
sbc a ;
sub d ;
ld d,a ; -de -> de with -u1
ld a,b ; b -> a low order dividend in ac
ld b,16 ; 16 -> b loop counter
sla c ;
rla ; ac << 1 -> ac
1$: adc hl,hl ; 15 ; loop, hl << 1 + cf -> hl
jr c,3$ ; 7/12 ; if cf = 1 then hl + -u1 -> hl, 1 -> cf else
add hl,de ; 11 ; hl + -u1 -> hl
jr c,2$ ; 12/ 7 ; if cf = 0 then
sbc hl,de ; 15 ; hl - -u1 -> hl to undo, no carry
2$: rl c ; 8 ;
rla ; 4 ; ac << 1 + cf -> ac
djnz 1$ ; 13(80); until --b = 0
ld b,a ; a -> b quotient bc, remainder in hl
ret ; done3$: add hl,de ; 11 ; hl + -u1 -> hl
scf ; 4 ; 1 -> cf
jr 2$ ; 12 ;By comparison, the CamelForth Z80 code is also fast, but slower than my
implemenation with 90 cycles x 16 iterations = 1440 cycles, excluding
entry/exit overhead:udiv3216: ld a,16 ; 16 -> a loop counter
sla e ;
rl d ; de << 1 -> de
1$: adc hl,hl ; 15 ; loop, hl << 1 + cf -> hl
jr nc,2$ ; 12/ 7 ; if cf = 1 then
or a ; 4 ; 0 -> cf
sbc hl,bc ; 15 ; hl - u1 -> hl
or a ; 4 ; 0 -> cf
jp 3$ ; 10 ; else
2$: sbc hl,bc ; 15 ; hl - u1 -> hl
jr nc,3$ ; 12/ 7 ; if cf = 1 then
add hl,bc ; 11 ; hl + u1 -> hl to undo sbc, sets cf
3$: rl e ; 8 ;
rl d ; 8 ; de << 1 + cf -> de with inverse cf we'll need
dec a ; 4 ;
jp nz,1$ ; 10(90); until --a = 0
ld a,e ;
cpl ;
ld e,a ;
ld a,d ;
cpl ;
ld d,a ; complement de, faster than ccf in loop
ret ; doneNote: unrolling the loop would improve the speed at the cost of a significant
code size increase, which is undesirable for small memory devices.### Fast unsigned 32/32->32 bit division and remainder
This algorithm uses the shadow registers BC', DE' and HL'. Because of this
register pressure, there is little room for further optimization. Registers
IX and IY cannot be used since they lack the necessary adc and sbc
instructions.Entry:
- BC: high order dividend ud1
- BC': low order dividend ud1
- DE': high order divisor ud2
- DE: low order divisor ud2Exit:
- HL': high order remainder ud3
- HL: low order remainder ud3
- BC: high order quotient ud4
- BC': low order quotient ud4Performance:
max 162 cycles x 32 iterations = 5184 cycles, excluding entry/exit overheadudiv3232: exx ;
xor a ;
ld h,a ;
ld l,a ; 0 -> hl'
rl c ;
rl b ;
exx ;
ld h,a ;
ld l,a ; 0 -> hl
ld a,b ; b -> a
rl c ;
rla ; ac << 1 -> ac
ld b,32 ; 32 -> b loop counter
1$: adc hl,hl ; 15 ;
exx ; 4 ;
adc hl,hl ; 15 ;
exx ; 4 ; hl'.hl << 1 + cf -> hl'.hl no carry
sbc hl,de ; 15 ;
exx ; 4 ;
sbc hl,de ; 15 ; hl'.hl - de'.de -> hl'.hl
jr nc,2$ ; 12/ 7 ; if cf = 1 then
exx ; 4 ;
add hl,de ; 11 ;
exx ; 4 ;
adc hl,de ; 15 ; hl'.hl + de'.de -> hl'.hl to undo, sets carry
2$: ccf ; 4 ; complement cf
rl c ; 8 ;
rl b ; 8 ;
exx ; 4 ;
rl c ; 8 ;
rla ; 4 ; ac.bc' << 1 + cf
djnz 1$ ; 13(162); until --b = 0
ld b,a ;
ld e,c ;
ret ;## Z80 floating point math routines
I've written a collection of Z80 IEEE 754 single precision floating point math
routines:
- [math.asm](math.asm) (960 bytes of code) a simple version with truncation
- [mathr.asm](mathr.asm) (1085 bytes of code) includes three IEEE 754 rounding
modes, where the default rounding mode is to round to nearest, ties to even;
- [mathri.asm](mathr.asm) (1296 bytes of code) includes the three IEEE 754
rounding modes, and inf/nan and signed zero. This version is not intended for
Forth850, because Forth850 raises floating point exceptions.My objective was to make the floating point routines as efficient as possible,
such as by using the shadow registers instead of memory. No memory is used,
except at most one push-pop pair to move a value between the (shadow)
registers. The second objective was to keep the code size small by using
tricks with CPU arithmetic and flags. The floating point library is about 1KB.Single precision floating point values are stored in registers BC (high order)
and DE (low order) to form a 32 bit float `bcde` and shadow float `bcde'`.- `fadd`: float `bcde` + `bcde'` -> `bcde`; cf set on overflow
- `fsubx`: float `bcde` - `bcde'` -> `bcde`; cf set on overflow
- `fsuby`: float `bcde'` - `bcde` -> `bcde`; cf set on overflow
- `fneg`: float - `bcde` -> `bcde`; no errors (cf reset)
- `fabs`: float |`bcde`| -> `bcde`; no errors (cf reset)
- `fmul`: float `bcde` * `bcde'` -> `bcde`; cf set on overflow
- `fdivx`: float `bcde` / `bcde'` -> `bcde`; cf set on overflow or when dividing by zero
- `fdivy`: float `bcde'` / `bcde` -> `bcde`; cf set on overflow or when dividing by zero
- `ftoi`: float `bcde` -> signed 32 bit integer `bcde` truncated towards zero; cf set when out of range
- `itof`: signed 32 bit integer `bcde` -> float `bcde`; no errors (cf reset)
- `ftrunc`: float trunc(`bcde`) -> `bcde`; no errors (cf reset)
- `ffloor`: float floor(`bcde`) -> `bcde`; cf set on overflow
- `fround`: float round(`bcde`) -> `bcde`; cf set on overflow
- `fpow10`: 10^`a` * `bcde` -> `bcde` for -128 <= `a` < 39; cf set on overflow
- `atof`: string [`hl`..`hl`+`a`-1] -> float `bcde`; cf set on parsing error and `hl` points after the char
- `ftoa`: float `bcde` -> [`hl`...`hl`+`a`-1] string of digits, exponent `e` and sign `d` bit 7; no errors (flags undefined)
- `fzero`: set `bcde` to 0.0[mathri.asm](mathr.asm) includes inf/nan and signed zero. In this version the
routines listed above may return signed inf or nan with cf set to indicate
overflow and errors. In addition, this version includes the following
routines:- `ftype`: float `bcde` -> `bcde` unchanged; cf set if `bcde` is nan, cf reset and z set if `bcde` is +/-inf
- `fnan`: set `bcde` to nan; cf set
- `finf`: set `bcde` to inf with sign in register A bit 7 (negative when set); cf set## Z80 string routines
I've written the following Z80 string routines. My objective was to make them
as efficient as possible, such as by making the obvious choice to use the `cpi`
and `cpir` Z80 instructions to minimize cycle count. The second objective was
to keep the code size small by using tricks with CPU arithmetic and flags.### Fast string comparison
Entry:
- IX: address of the first string c-addr1
- HL: size of the first string u1
- DE: address of the second string c-addr2
- BC: size of the second string u2Exit:
- A: -1 (less), 0 (equal), 1 (greater)
- F: zero flag set when equal, sign flag set when lessPerformance:
46 cycles per character comparison when characters matchcompare: push ix ; save c-addr1
push hl ; save u1
xor a ; 0 -> a flags u1 = u2, 0 -> cf
sbc hl,bc ;
jr z,1$ ; if u1 <> u2 then
inc a ; 1 -> a flags u1 > u2
jr nc,1$ ; if u1 < u2 then
pop bc ; pop u1 -> bc
push bc ; rebalance stack
ld a,-1 ; -1 -> a flags u1 < u2
1$: pop hl ; pop to discard u1
pop hl ; pop c-addr1 -> hl
ex af,af' ; save a with -1|0|1 flag
ld a,c ;
or b ;
jr z,3$ ; if bc <> 0 then
; compare chars
2$: ld a,(de) ; 7 ; loop
cpi ; 16 ; compare [hl++] to [de], --bc
jr nz,4$ ; 7 ; while characters [hl] and [de] are equal
inc de ; 6 ; de++
jp pe,2$ ; 10(46); until bc = 0
; chars match, check lengths
3$: ex af,af' ; restore a with -1|0|1 flag
ret ;
; strings differ
4$: dec hl ; hl-- to correct cpi overshoot
cp (hl) ; test a<[hl]
ccf ; complement cf, cf = 1 if [hl] cf
sbc ix,bc ; u1 - u2 -> ix
ret c ; if u2>u1 then impossible search, cf = 1
ld a,c ;
or b ;
ret z ; if u2 = 0 then done (found), cf = 0
push ix ;
push bc ;
pop ix ; u2 -> ix
pop bc ; u1 - u2 -> bc
inc bc ; u1 - u2 + 1 -> bc correct for cpir
push hl ; save c-addr1 on the stack
; find char match
1$: push de ; loop, save de with c-addr2
ld a,(de) ; [de] -> a
cpir ; 21/16 ; repeat until a = [hl++] or --bc = 0
jr nz,4$ ; if no match then not found
pop de ; restore de with c-addr2
push bc ;
push de ;
push hl ; save bc,de,hl
push ix ;
pop bc ; u2 -> bc
; compare substrings
dec bc ; u2 - 1 -> bc since u2 > 0
ld a,c ;
or b ;
jr z,3$ ; if bc<> 0 then
inc de ; de++ to start matching at c-addr2+1
2$: ld a,(de) ; 7 ; loop
cpi ; 16 ; compare [hl++] to [de], --bc
jr nz,3$ ; 7 ; while characters [hl] and [de] are equal
inc de ; 6 ; de++
jp pe,2$ ; 10(46); until bc = 0
3$: pop hl ;
pop de ;
pop bc ; restore bc,de,hl
jr nz,1$ ; repeat
; substrings match
dec hl ; hl-- to correct cpir overshoot
ret ; done, cf = 0
; not found
4$: scf ; 1 -> cf
ret ; done, cf = 1## Sharp PC-G850(V)(S)
- Manual:
- HP Forum thread:
- Sharp PC-G850(V)(S) software (Japanese site):## Forth resources
- Forth for the Sharp PC-E500(S):
- Forth 2012 Standard:
- Moving Forth:Forth850 benefits from the work done by many others to offer inspiration, but
the system does not include licensed code of the following implementations or
any other implementation not listed here. Some parts of Forth850 are derived
from freely available Forth resources listed above and the Z80 resources listed
further below:- CamelForth for the Z80:
- eForth:
- Jupiter Ace ROM listing:## Z80 resources
- asz80 assembler and linker: download
- Zilog Z80 CPU User Manual:
- Z80 Instruction Set:
- Z80 Instruction Set:## Z80 maths
- Z80 the 8-bit number cruncher:
- Z80 integer math routines:
- Z80 bits (integer math routines):
- Z80 advanced math:
- Z80 classic maths libraries:
- Z80 IEEE754 floating point library: