https://github.com/irmen/64tass
64tass - cross assembler for 6502 etc. microprocessors - by soci/singular - [git clone from the original sourceforge repo]
assembler assembly-6502 c64 commodore-64 retro retrocomputing
- Host: GitHub
- URL: https://github.com/irmen/64tass
- Owner: irmen
- License: gpl-2.0
- Created: 2017-10-18T21:39:51.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2026-03-04T22:08:37.000Z (3 months ago)
- Last Synced: 2026-03-28T01:54:28.271Z (3 months ago)
- Topics: assembler, assembly-6502, c64, commodore-64, retro, retrocomputing
- Language: C
- Homepage: http://sourceforge.net/projects/tass64/
- Size: 8.93 MB
- Stars: 48
- Watchers: 6
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README
- Changelog: NEWS
- License: LICENSE-GPL-2.0
README
64tass v1.60 r3243 reference manual
This is the manual for 64tass, the multi pass optimizing macro assembler for
the 65xx series of processors. Key features:
* Open source portable C with minimal dependencies
* Familiar syntax to Omicron TASS and TASM
* Supports 6502, 65C02, R65C02, W65C02, 65CE02, 65816, DTV, 65EL02, 4510,
45GS02
* Arbitrary-precision integers and bit strings, double precision floating
point numbers
* Character and byte strings, array arithmetic
* Handles UTF-8, UTF-16 and 8 bit RAW encoded source files, Unicode character
strings
* Supports Unicode identifiers with compatibility normalization and optional
case insensitivity
* Built-in `linker' with section support
* Various memory models, binary targets and text output formats (also Hex/
S-record)
* Assembly and label listings available for debugging or exporting
* Conditional compilation, macros, structures, unions, scopes
Contrary to what the length of this document suggests, 64tass can be used with
just basic 6502 assembly knowledge in simple ways, like any other assembler. If
some advanced functionality is needed then this document can serve as a
reference.
This is a development version. Features or syntax may change as a result of
corrections in non-backwards compatible ways in some rare cases. It's difficult
to get everything `right' first time.
Project page: https://sourceforge.net/projects/tass64/
The page hosts the latest and older versions with sources and a bug and a
feature request tracker.
-------------------------------------------------------------------------------
Table of Contents
* Table of Contents
* Usage tips
* Expressions and data types
+ Integer constants
+ Bit string constants
+ Floating point constants
+ Character string constants
+ Byte string constants
+ Lists and tuples
+ Dictionaries
+ Code
+ Addressing modes
+ Uninitialized memory
+ Booleans
+ Types
+ Symbols
o Regular symbols
o Local symbols
o Anonymous symbols
o Constant and re-definable symbols
o The star label
+ Built-in functions
o Mathematical functions
o Byte string functions
o Other functions
+ Expressions
o Operators
o Comparison operators
o Bit string extraction operators
o Conditional operators
o Address length forcing
o Compound assignment
o Slicing and indexing
* Compiler directives
+ Controlling the compile offset and program counter
+ Aligning data or code
+ Dumping data
o Storing numeric values
o Storing string values
+ Text encoding
+ Structured data
o Structure
o Union
o Combined use of structures and unions
+ Macros
o Parameter references
o Text references
+ Custom functions
+ Conditional assembly
o If, else if, else
o Switch, case, default
o Comment
+ Repetitions
+ Including files
+ Scopes
+ Sections
+ 65816 related
+ Controlling errors
+ Target
+ Misc
+ Printer control
* Pseudo instructions
+ Aliases
+ Generic instructions
+ Always taken branches
+ Long branches
* Original turbo assembler compatibility
+ How to convert source code for use with 64tass
+ Differences to the original turbo ass macro on the C64
+ Labels
+ Expression evaluation
+ Macros
+ Bugs
* Command line options
+ Output options
+ Operation options
+ Diagnostic options
+ Target selection on command line
+ Symbol listing
+ Assembly listing
+ Other options
+ Command line from file
* Messages
+ Warnings
+ Errors
+ Fatal errors
* Credits
* Default translation and escape sequences
+ Raw 8-bit source
o The none encoding for raw 8-bit
o The screen encoding for raw 8-bit
+ Unicode and ASCII source
o The none encoding for Unicode
o The screen encoding for Unicode
* Opcodes
+ Standard 6502 opcodes
+ 6502 illegal opcodes
+ 65DTV02 opcodes
+ Standard 65C02 opcodes
+ R65C02 opcodes
+ W65C02 opcodes
+ W65816 opcodes
+ 65EL02 opcodes
+ 65CE02 opcodes
+ CSG 4510 opcodes
+ 45GS02 opcodes
* Appendix
+ Assembler directives
+ Built-in functions
+ Built-in types
-------------------------------------------------------------------------------
Usage tips
64tass is a command line assembler; the source can be written in any text
editor. As a minimum the source filename must be given on the command line. The
`-a' command line option is highly recommended if the source is Unicode or
ASCII.
64tass -a src.asm
There are also some useful parameters which are described later.
For comfortable compiling I use a `Makefile' like this (for make):
demo.prg: source.asm macros.asm pic.drp music.bin
	64tass -C -a -B -i source.asm -o demo.tmp
	pucrunch -ffast -x 2048 demo.tmp >demo.prg
This way `demo.prg' is recreated by compiling `source.asm' whenever
`source.asm', `macros.asm', `pic.drp' or `music.bin' has changed.
Of course it's not much harder to create something similar for win32
(make.bat), however this will always compile and compress:
64tass.exe -C -a -B -i source.asm -o demo.tmp
pucrunch.exe -ffast -x 2048 demo.tmp >demo.prg
Here's a slightly more advanced Makefile example with default action as testing
in VICE, clean target for removal of temporary files and compressing using an
intermediate temporary file:
all: demo.prg
	x64 -autostartprgmode 1 -autostart-warp +truedrive +cart $<
demo.prg: demo.tmp
	pucrunch -ffast -x 2048 $< >$@
demo.tmp: source.asm macros.asm pic.drp music.bin
	64tass -C -a -B -i $< -o $@
.INTERMEDIATE: demo.tmp
.PHONY: all clean
clean:
	$(RM) demo.prg demo.tmp
It's useful to add a basic header to your source files like the one below, so
that the resulting file is directly runnable without additional compression:
* = $0801
.word (+), 2005 ;pointer, line number
.null $9e, format("%4d", start);will be sys 4096
+ .word 0 ;basic line end
* = $1000
start rts
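As a rough cross-check of what the header above assembles to, the following Python sketch builds the same bytes by hand. It assumes a load address of $0801, a start address of $1000 and that the digits are stored as their ASCII codes. The function name `basic_header` is made up for illustration and is not part of 64tass.

```python
import struct

def basic_header(load_addr=0x0801, start=0x1000, line_number=2005):
    """Build the bytes of a one-line 'SYS <start>' BASIC stub (illustrative only)."""
    # Body of the BASIC line: SYS token ($9e), the address as 4 decimal
    # characters (like format("%4d", start)), then the zero added by .null
    body = bytes([0x9E]) + ("%4d" % start).encode("ascii") + b"\x00"
    # The '+' label follows the line: 2 byte pointer + 2 byte line number + body
    next_line = load_addr + 4 + len(body)
    # .word stores 16 bit values low byte first
    return struct.pack("<HH", next_line, line_number) + body + struct.pack("<H", 0)

header = basic_header()
print(header.hex(" "))
```

The first two bytes are the link pointer to the final `.word 0`, which is why the source uses the forward reference `(+)` instead of a hard-coded address.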
A frequently asked question is how to automatically allocate memory without
hacks like *=*+1. Sure, there's .byte and friends for variables with
initial values but what about zero page, or RAM outside of program area? The
solution is to not use an initial value by using `?' or not giving a fill byte
value to .fill.
* = $02
p1 .addr ? ;a zero page pointer
temp .fill 10 ;a 10 byte temporary area
Space allocated this way is not saved in the output as there's no data to save
at those addresses.
What about some code running on zero page for speed? It needs to be relocated,
and the length must be known to copy it there. Here's an example:
ldx #size(zpcode)-1;calculate length
- lda zpcode,x
sta wrbyte,x
dex ;install to zero page
bpl -
jsr wrbyte
rts
;code continues here but is compiled to run from $02
zpcode .logical $02
wrbyte sta $ffff ;quick byte writer at $02
inc wrbyte+1
bne +
inc wrbyte+2
+ rts
.endlogical
The assembler supports lists and tuples, which may not seem interesting at
first as they sound like something only useful when heavy scripting is
involved. But as normal arithmetic operations also apply to all their elements
at once, they can spare quite some typing and repetition.
Let's take a simple example of a low/high byte jump table of return addresses.
This usually involves some unnecessary copy/pasting to create a pair of tables
with constructs like >(label-1).
jumpcmd lda hibytes,x ; selected routine in X register
pha
lda lobytes,x ; push address to stack
pha
rts ; jump, rts will increase pc by one!
; Build a list of jump addresses minus 1
_ := (cmd_p, cmd_c, cmd_m, cmd_s, cmd_r, cmd_l, cmd_e)-1
lobytes .byte <_ ; low bytes of jump addresses
hibytes .byte >_ ; high bytes
There are some other tips below in the descriptions.
-------------------------------------------------------------------------------
Expressions and data types
Integer constants
Integer constants can be entered as decimal digits of arbitrary length. An
underscore can be used between digits as a separator for better readability of
long numbers. The following operations are accepted:
Integer operators and functions
x + y    add x to y                         2 + 2 is 4
x - y    subtract y from x                  4 - 1 is 3
x * y    multiply x with y                  2 * 3 is 6
x / y    integer divide x by y              7 / 2 is 3
x % y    integer modulo of x divided by y   5 % 2 is 1
x ** y   x raised to power of y             2 ** 4 is 16
-x       negated value                      -2 is -2
+x       unchanged                          +2 is 2
~x       -x - 1                             ~3 is -4
x | y    bitwise or                         2 | 6 is 6
x ^ y    bitwise xor                        2 ^ 6 is 4
x & y    bitwise and                        2 & 6 is 2
x << y   logical shift left                 1 << 3 is 8
x >> y   arithmetic shift right             -8 >> 3 is -1
Integers are automatically promoted to floats as necessary in expressions.
Other types can be converted to integer using the integer type int.
Integer division is a floor division (rounding down) so 7 / 4 is 1 and not
1.75. If ceiling division is required (rounding up) that can be done by
negating both the dividend and the result. Typically it's done like 0 - -5 / 4
which results in 2.
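The negation trick can be checked outside the assembler. This is a minimal Python sketch, relying on the fact that Python's `//` operator is also a floor division. The helper name `ceil_div` is an assumption made for this example.

```python
def ceil_div(x, y):
    """Ceiling division built from floor division by negating dividend and result."""
    return 0 - (-x // y)

print(7 // 4)          # floor division, like 7 / 4 in 64tass
print(ceil_div(5, 4))  # like 0 - -5 / 4
```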
.byte 23 ; as unsigned
.char -23 ; as signed
; using negative integers as immediate values
ldx #-3 ; works as '#-' is signed immediate
num = -3
ldx #+num ; needs explicit '#+' for signed 8 bits
lda #((bitmap >> 10) & $0f) | ((screen >> 6) & $f0)
sta $d018
Bit string constants
Bit string constants can be entered in hexadecimal form with a leading dollar
sign or in binary with a leading percent sign. An underscore can be used
between digits as a separator for better readability of long numbers. The
following operations are accepted:
Bit string operators and functions
~x       invert bits           ~%101 is ~%101
y .. x   concatenate bits      $a .. $b is $ab
y x n    repeat                %101 x 3 is %101101101
x[n]     extract bit(s)        $a[1] is %1
x[s]     slice bits            $1234[4:8] is $3
x | y    bitwise or            ~$2 | $6 is ~$0
x ^ y    bitwise xor           ~$2 ^ $6 is ~$4
x & y    bitwise and           ~$2 & $6 is $4
x << y   bitwise shift left    $0f << 4 is $0f0
x >> y   bitwise shift right   ~$f4 >> 4 is ~$f
The length of a bit string constant is defined in bits and is calculated from
the number of digits used, including leading zeros.
Bit strings are automatically promoted to integer or floating point as
necessary in expressions. The higher bits are extended with zeros or ones as
needed.
Bit strings support indexing and slicing. This is explained in detail in
section `Slicing and indexing'.
Other types can be converted to bit string using the bit string type bits.
.byte $33 ; 8 bits in hexadecimal
.byte %00011111 ; 8 bits in binary
.text $1234 ; $34, $12 (little endian)
lda $01
and #~$07 ; 8 bits even after inversion
ora #$05
sta $01
lda $d015
and #~%00100000 ;clear a bit
sta $d015
Floating point constants
Floating point constants have a radix point in them and optionally an exponent.
A decimal exponent is `e' while a binary one is `p'. An underscore can be used
between digits as a separator for better readability. The following operations
can be used:
Floating point operators and functions
x + y    add x to y                         2.2 + 2.2 is 4.4
x - y    subtract y from x                  4.1 - 1.1 is 3.0
x * y    multiply x with y                  1.5 * 3 is 4.5
x / y    divide x by y                      7.0 / 2.0 is 3.5
x % y    modulo of x divided by y           5.0 % 2.0 is 1.0
x ** y   x raised to power of y             2.0 ** -1 is 0.5
-x       negated value                      -2.0 is -2.0
+x       unchanged                          +2.0 is 2.0
~x       almost -x                          ~2.1 is almost -2.1
x | y    bitwise or                         2.5 | 6.5 is 6.5
x ^ y    bitwise xor                        2.5 ^ 6.5 is 4.0
x & y    bitwise and                        2.5 & 6.5 is 2.5
x << y   logical shift left                 1.0 << 3.0 is 8.0
x >> y   arithmetic shift right             -8.0 >> 4 is -0.5
As usual comparing floating point numbers for (non) equality is a bad idea due
to rounding errors.
The only predefined constant is pi.
Floating point numbers are automatically truncated to integer as necessary.
Other types can be converted to floating point by using the type float.
Fixed point conversion can be done by using the shift operators. For example an
8.16 fixed point number can be calculated as (3.14 << 16) & $ffffff. The bitwise
operators work as if the floating point number were a fixed point one. This is
the reason for the strange definition of inversion.
.byte 3.66e1 ; 36.6, truncated to 36
.byte $1.8p4 ; 4:4 fixed point number (1.5)
.sint 12.2p8 ; 8:8 fixed point number (12.2)
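The fixed point conversion described above amounts to scaling, truncating and masking. The Python sketch below models that for non-negative values only. The helper `to_fixed` is hypothetical, and using `int()` for the truncation is an assumption about how the shifted value ends up as an integer.

```python
def to_fixed(value, frac_bits, total_bits):
    """Scale by 2**frac_bits, truncate to an integer and mask to total_bits."""
    scaled = int(value * (1 << frac_bits))   # like value << frac_bits, then truncation
    return scaled & ((1 << total_bits) - 1)  # like & $ffffff for 24 bits

print(hex(to_fixed(3.14, 16, 24)))  # 8.16 format, like (3.14 << 16) & $ffffff
print(hex(to_fixed(1.5, 4, 8)))     # 4:4 format, like $1.8p4
```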
Character string constants
Character strings are enclosed in single or double quotes and can hold any
Unicode character.
Operations like indexing or slicing are always done on the original
representation. The current encoding is only applied when it's used in
expressions as numeric constants or in context of text data directives.
Doubling the quotes inside string literals escapes them and results in a single
quote.
Character string operators and functions
y .. x   concatenate strings    "a" .. "b" is "ab"
y in x   is substring of        "b" in "abc" is true
a x n    repeat                 "ab" x 3 is "ababab"
a[i]     character from start   "abc"[1] is "b"
a[-i]    character from end     "abc"[-1] is "c"
a[:]     no change              "abc"[:] is "abc"
a[s:]    cut off start          "abc"[1:] is "bc"
a[:-s]   cut off end            "abc"[:-1] is "ab"
a[s]     reverse                "abc"[::-1] is "cba"
Character strings are converted to integers, byte and bit strings as necessary
using the current encoding and escape rules. For example when using a sane
encoding "z"-"a" is 25.
Other types can be converted to character strings by using the type str or by
using the repr and format functions.
Character strings support indexing and slicing. This is explained in detail in
section `Slicing and indexing'.
mystr = "oeU" ; character string constant
.text 'it''s' ; it's
.word "ab"+1 ; conversion result is "bb" usually
.text "text"[:2] ; "te"
.text "text"[2:] ; "xt"
.text "text"[:-1] ; "tex"
.text "reverse"[::-1]; "esrever"
Byte string constants
Byte strings are like character strings, but hold bytes instead of characters.
Quoted character strings prefixed by `b', `l', `n', `p', `s', `x' or `z'
characters can be used to create byte strings. The resulting byte string
contains what .text, .shiftl, .null, .ptext and .shift would create. Direct
hexadecimal entry can be done using the `x' prefix and `z' denotes a z85
encoded byte string. Spaces can be used between pairs of hexadecimal digits as
a separator for better readability.
Byte string operators and functions
y .. x   concatenate strings   x"12" .. x"34" is x"1234"
y in x   is substring of       x"34" in x"1234" is true
a x n    repeat                x"ab" x 3 is x"ababab"
a[i]     byte from start       x"abcd12"[1] is x"cd"
a[-i]    byte from end         x"abcd"[-1] is x"cd"
a[:]     no change             x"abcd"[:] is x"abcd"
a[s:]    cut off start         x"abcdef"[1:] is x"cdef"
a[:-s]   cut off end           x"abcdef"[:-1] is x"abcd"
a[s]     reverse               x"abcdef"[::-1] is x"efcdab"
Byte strings support indexing and slicing. This is explained in detail in
section `Slicing and indexing'.
Other types can be converted to byte strings by using the type bytes.
.enc "screen" ;use screen encoding
mystr = b"oeU" ;convert text to bytes, like .text
.enc "none" ;normal encoding
.text mystr ;text as originally encoded
.text s"p1" ;convert to bytes like .shift
.text l"p2" ;convert to bytes like .shiftl
.text n"p3" ;convert to bytes like .null
.text p"p4" ;convert to bytes like .ptext
Binary data may be embedded in source code by using hexadecimal byte strings.
This is more compact than using .byte followed by a lot of numbers. As expected
1 byte becomes 2 characters.
.text x"fce2" ;2 bytes: $fc and $e2 (big endian)
If readability is not a concern then the more compact z85 encoding may be used
which encodes 4 bytes into 5 characters. Data lengths not a multiple of 4 are
handled by omitting leading zeros in the last group.
.text z"FiUj*2M$hf";8 bytes: 80 40 20 10 08 04 02 01
For data lengths divisible by 4 any z85 encoder will do. Otherwise the
simplest way to encode a binary file into a z85 string is to create a source
file which reads it using the line `label = binary('filename')'. Now if the
labels are listed to a file then there will be a z85 encoded definition for
this label.
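For data whose length is a multiple of 4, the encoding can be sketched in a few lines of Python. This follows the standard Z85 alphabet and deliberately does not try to reproduce how 64tass handles a shorter last group. The names `Z85_ALPHABET` and `z85_encode` are made up for this example.

```python
Z85_ALPHABET = (
    "0123456789abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#"
)

def z85_encode(data):
    """Encode bytes whose length is a multiple of 4 into z85 text."""
    if len(data) % 4:
        raise ValueError("this sketch only handles lengths that are a multiple of 4")
    out = []
    for i in range(0, len(data), 4):
        value = int.from_bytes(data[i:i + 4], "big")  # 4 bytes as one 32 bit number
        group = []
        for _ in range(5):                            # 5 base-85 digits, lowest first
            value, digit = divmod(value, 85)
            group.append(Z85_ALPHABET[digit])
        out.append("".join(reversed(group)))          # emit highest digit first
    return "".join(out)

print(z85_encode(bytes([0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01])))
```

The printed string can be pasted between the quotes of a z"..." literal.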
Lists and tuples
Lists and tuples can hold a collection of values. Lists are defined from values
separated by comma between square brackets [1, 2, 3], an empty list is [].
Tuples are similar but are enclosed in parentheses instead. An empty tuple is
(), a single element tuple is (4,) to differentiate from normal numeric
expression parentheses. When nested they function similar to an array. Both
types are immutable.
List and tuple operators and functions
y .. x     concatenate lists      [1] .. [2] is [1, 2]
y in x     is member of list      2 in [1, 2, 3] is true
a x n      repeat                 [1, 2] x 2 is [1, 2, 1, 2]
a[i]       element from start     ("1", 2)[1] is 2
a[-i]      element from end       ("1", 2, 3)[-1] is 3
a[:]       no change              (1, 2, 3)[:] is (1, 2, 3)
a[s:]      cut off start          (1, 2, 3)[1:] is (2, 3)
a[:-s]     cut off end            (1, 2.0, 3)[:-1] is (1, 2.0)
a[s]       reverse                (1, 2, 3)[::-1] is (3, 2, 1)
*a         convert to arguments   format("%d: %s", *mylist)
... op a   left fold              ... + (1, 2, 3) is ((1+2)+3)
a op ...   right fold             (1, 2, 3) - ... is (1-(2-3))
Arithmetic operations are applied on all elements recursively, therefore
[1, 2] + 1 is [2, 3], and abs([1, -1]) is [1, 1].
Arithmetic operations between lists are applied one by one on their elements,
so [1, 2] + [3, 4] is [4, 6].
When lists form an array and columns/rows are missing the smaller array is
stretched to fill in the gaps if possible, so [[1], [2]] * [3, 4] is [[3, 4],
[6, 8]].
Lists and tuples support indexing and slicing. This is explained in detail in
section `Slicing and indexing'.
mylist = [1, 2, "whatever"]
mytuple = (cmd_e, cmd_g)
mylist = ("e", cmd_e, "g", cmd_g, "i", cmd_i)
keys .text mylist[::2] ; keys ("e", "g", "i")
call_l .byte <mylist[1::2]-1; routines (<cmd_e-1, <cmd_g-1, <cmd_i-1)
call_h .byte >mylist[1::2]-1; routines (>cmd_e-1, >cmd_g-1, >cmd_i-1)
Although list elements of variables can't be changed using indexing (at the
moment) the same effect can be achieved by combining slicing and concatenation:
lst := lst[:2] .. [4] .. lst[3:]; same as lst[2] := 4 would be
Folding is done on pairs of elements either forward (left) or reverse (right).
The list must contain at least one element. Here are some folding examples:
minimum = size([part1, part2, part3]) <? ...
maximum = size([part1, part2, part3]) >? ...
sum = size([part1, part2, part3]) + ...
xorall = list_of_numbers ^ ...
join = list_of_strings .. ...
allbits = sprites.(left, middle, right).bits | ...
all = [true, true, true, true] && ...
any = [false, false, false, true] || ...
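The difference between the two fold directions only shows with operators where the grouping matters, such as subtraction. Here is a small Python model of both directions. `left_fold` and `right_fold` are illustrative names, not 64tass functions.

```python
from functools import reduce
import operator

def left_fold(op, items):
    """((a op b) op c) and so on, like '... op items'."""
    return reduce(op, items)

def right_fold(op, items):
    """(a op (b op c)) and so on, like 'items op ...'."""
    return reduce(lambda acc, x: op(x, acc), reversed(items))

print(left_fold(operator.add, (1, 2, 3)))   # ((1+2)+3)
print(right_fold(operator.sub, (1, 2, 3)))  # (1-(2-3))
```

As in the assembler, both helpers need at least one element. `reduce` raises an error on an empty sequence.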
The range(start, end, step) built-in function can be used to create lists of
integers in a range with a given step value. At least the end must be given,
the start defaults to 0 and the step to 1. This may not sound very useful, so
here are a few examples:
;Bitmask table, 8 bits from left to right
.byte %10000000 >> range(8)
;Classic 256 byte single period sine table with values of 0-255.
.byte 128 + 127.5 * sin(range(256) * pi / 128)
;Screen row address tables
_ := $400 + range(0, 1000, 40)
scrlo .byte <_
scrhi .byte >_
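To see which values the three tables above should contain, the same calculations can be written as Python list comprehensions. This is only a model: `int()` stands in for the truncation that .byte applies to floating point values, and all variable names are chosen for this example.

```python
import math

# .byte %10000000 >> range(8)
bitmask = [0x80 >> i for i in range(8)]

# .byte 128 + 127.5 * sin(range(256) * pi / 128); int() models the truncation
sine = [int(128 + 127.5 * math.sin(i * math.pi / 128)) for i in range(256)]

# _ := $400 + range(0, 1000, 40), then the low and high bytes of each address
rows = [0x400 + offset for offset in range(0, 1000, 40)]
scrlo = [addr & 0xFF for addr in rows]
scrhi = [addr >> 8 for addr in rows]

print(bitmask)
print(sine[:4], min(sine), max(sine))
print([hex(b) for b in scrlo[:3]], [hex(b) for b in scrhi[:3]])
```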
Dictionaries
Dictionaries hold key and value pairs normally but can be used as sets too if
simple values are used. In the latter case the values are the keys for
themselves.
A dictionary is defined with comma separated values between curly brackets. An
empty one is {}. Key and value pairs are separated with a colon, like
{<key>:<value>}. A default value for missing items can be defined by leaving
out the key before the colon, like {:<default value>}. Simple values don't use
a colon, like {<value>}.
Looking up a non-existing key is an error unless a default value is given.
Dictionaries are immutable. There are limitations on what may be used as a key
but the value can be anything. As the keys are used for lookups these must be
unique.
Dictionary operators and functions
y .. x   combine dictionaries   {1:2, 3:4} .. {2:3, 3:1} is {1:2, 2:3, 3:1}
x[i]     value lookup           {"1":2}["1"] is 2
x.i      symbol lookup          {.ONE:1, .TWO:2}.ONE is 1
y in x   is a key               1 in {1:2} is true
; Simple lookup
.text {1:"one", 2:"two"}[2]; "two"
; 16 element "fader" table 1->15->12->11->0
.byte {1:15, 15:12, 12:11, :0}[range(16)]
; Variables can be used to build dictionaries incrementally.
md := {1:2}
md ..= {3:4}
The keys can be symbols as well, this allows simple definition of data
structures or enumerations.
; Symbol accessible values. May be useful as a function return value too.
coords = {.x: 24, .y: 50}
ldx #coords.x
ldy #coords.y
; Simple enumeration where red = 0, green = 1, blue = 2
colors = dict(.(red, green, blue), range(3))
lda #colors.green
; Enumerate register bits as %1, %10, %100, ...
irqbits = dict(.(ta, tb, tod, serial, flag), %1 << range(5))
and #irqbits.flag
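The default value lookup and the two-iterable form of dict map closely onto Python's `dict.get` and `dict(zip(...))`. The sketch below models the fader table and the two enumerations from the examples above. Plain strings stand in for the symbol keys, which is an assumption of this model.

```python
# {1:15, 15:12, 12:11, :0}[range(16)]: the entry without a key is the default
fader_map = {1: 15, 15: 12, 12: 11}
fader = [fader_map.get(i, 0) for i in range(16)]

# dict(.(red, green, blue), range(3)): keys from the first, values from the second
colors = dict(zip(("red", "green", "blue"), range(3)))

# dict(.(ta, tb, tod, serial, flag), %1 << range(5))
irqbits = dict(zip(("ta", "tb", "tod", "serial", "flag"), (1 << i for i in range(5))))

print(fader)
print(colors["green"], bin(irqbits["flag"]))
```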
Code
Code holds the result of compilation in binary and other enclosed objects. In
an arithmetic operation it's used as the numeric address of the memory where it
starts. The compiled content remains static even if later parts of the source
overwrite the same memory area.
Indexing and slicing of code to access the compiled content might be
implemented differently in future releases. Use this feature at your own risk
for now, you might need to update your code later.
Label operators and functions
a.b b member of a label.locallabel
.b in a if a has symbol b .locallabel in label
a[i] element from start label[1]
a[-i] element from end label[-1]
a[:] copy as tuple label[:]
a[s:] cut off start, as tuple label[1:]
a[:-s] cut off end, as tuple label[:-1]
a[s] reverse, as tuple label[::-1]
mydata .word 1, 4, 3
mycode .block
local lda #0
.endblock
ldx #size(mydata) ;6 bytes (3*2)
ldx #len(mydata) ;3 elements
ldx #mycode[0] ;lda instruction, $a9
ldx #mydata[1] ;2nd element, 4
jmp mycode.local ;address of local label
Addressing modes
Addressing mode values determine which addressing mode an instruction uses.
For indexing there must be no white space between the comma and the register
letter, otherwise the indexing operator is not recognized. On the other hand
put a space between the comma and a single letter symbol in a list to avoid it
being recognized as an operator.
Addressing mode operators
# immediate
#+ signed immediate
#- signed immediate
( ) indirect
[ ] long indirect
,b data bank indexed
,d direct page indexed
,k program bank indexed
,r data stack pointer indexed
,s stack pointer indexed
,x x register indexed
,y y register indexed
,z z register indexed
Parentheses are used for indirection and square brackets for long indirection.
These operations are only available after instructions and functions to not
interfere with their normal use in expressions.
Several addressing mode operators can be combined together. Currently the
complexity is limited to 4 operators. This is enough to describe all addressing
modes of the supported CPUs.
Valid addressing mode operator combinations
# immediate lda #$12
#+ signed immediate lda #+127
#- signed immediate lda #-128
#addr,#addr move mvp #5,#6
addr direct or relative lda $12 lda $1234 bne $1234
bit,addr direct page bit rmb 5,$12 smb #$80,$12
bit,addr,addr direct page bit relative jump bbs 5,$12,$1 bbr #$40,$12,$2
(addr) indirect lda ($12) jmp ($1234)
(addr),y indirect y indexed lda ($12),y
(addr),z indirect z indexed lda ($12),z
(addr,x) x indexed indirect lda ($12,x) jmp ($1234,x)
[addr] long indirect lda [$12] jmp [$1234]
[addr],y long indirect y indexed lda [$12],y
#addr,b data bank indexed lda #0,b
#addr,b,x data bank x indexed lda #0,b,x
#addr,b,y data bank y indexed lda #0,b,y
#addr,d direct page indexed lda #0,d
#addr,d,x direct page x indexed lda #0,d,x
#addr,d,y direct page y indexed ldx #0,d,y
(#addr,d) direct page indirect lda (#$12,d)
(#addr,d,x) direct page x indexed indirect lda (#$12,d,x)
(#addr,d),y direct page indirect y indexed lda (#$12,d),y
(#addr,d),z direct page indirect z indexed lda (#$12,d),z
[#addr,d] direct page long indirect lda [#$12,d]
[#addr,d],y direct page long indirect y indexed lda [#$12,d],y
#addr,k program bank indexed jsr #0,k
(#addr,k,x) program bank x indexed indirect jmp (#$1234,k,x)
#addr,r data stack indexed lda #1,r
(#addr,r),y data stack indexed indirect y indexed lda (#$12,r),y
#addr,s stack indexed lda #1,s
(#addr,s),y stack indexed indirect y indexed lda (#$12,s),y
addr,x x indexed lda $12,x
addr,y y indexed lda $12,y
Direct page, data bank, program bank indexed and long addressing modes of
instructions are intelligently chosen based on the instruction type, the
address ranges set up by .dpage, .databank and the current program counter
address. Therefore the `,d', `,b' and `,k' indexing is only used in very
special cases.
The immediate direct page indexed `#0,d' addressing mode is usable for direct
page access. The 8 bit constant is a direct offset from the start of actual
direct page. Alternatively it may be written as `0,d'.
The immediate data bank indexed `#0,b' addressing mode is usable for data bank
access. The 16 bit constant is a direct offset from the start of actual data
bank. Alternatively it may be written as `0,b'.
The immediate program bank indexed `#0,k' addressing mode is usable for program
bank jumps, branches and calls. The 16 bit constant is a direct offset from the
start of actual program bank. Alternatively it may be written as `0,k'.
The immediate stack indexed `#0,s' and data stack indexed `#0,r' accept 8 bit
constants as an offset from the start of (data) stack. These are sometimes
written without the immediate notation, but this makes it more clear what's
going on. For the same reason the move instructions are written with an
immediate addressing mode `#0,#0' as well.
The immediate (#) addressing mode expects unsigned values of byte or word size.
Therefore it only accepts constants of 1 byte or in the range 0-255, or of 2
bytes or in the range 0-65535.
The signed immediate (#+ and #-) addressing mode is to allow signed numbers to
be used as immediate constants. It accepts a single byte or an integer in range
-128-127, or two bytes or an integer of -32768-32767.
The use of signed immediate (like #-3) is seamless, but it needs to be
explicitly written out for variables or expressions (#+variable). In case the
unsigned variant is needed but the expression starts with a negation then it
needs to be put into parentheses (#(-variable)) or else it'll change the
address mode to signed.
Normally addressing mode operators are used in expressions right after
instructions. They can also be used for defining stack variable symbols when
using a 65816, or to force a specific addressing mode.
param = #1,s ;define a stack variable
const = #1 ;immediate constant
lda #0,b ;always "absolute" lda $0000
lda param ;results in lda #$01,s
lda param+1 ;results in lda #$02,s
lda (param),y ;results in lda (#$01,s),y
ldx const ;results in ldx #$01
lda #-2 ;negative constant, $fe
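The range rules for unsigned and signed immediates can be summarised with a small Python model. It covers the 8 bit case only, and the function `immediate_byte` is a hypothetical helper written for this illustration, not something 64tass provides.

```python
def immediate_byte(value, signed=False):
    """Return the 8 bit encoding of an immediate operand, or raise if out of range."""
    low, high = (-128, 127) if signed else (0, 255)
    if not low <= value <= high:
        kind = "signed" if signed else "unsigned"
        raise ValueError(f"{value} does not fit in {kind} 8 bits")
    return value & 0xFF  # two's complement encoding for negative values

print(hex(immediate_byte(-2, signed=True)))  # like lda #-2
print(hex(immediate_byte(200)))              # like lda #200
```

A negative value passed without `signed=True` is rejected, which mirrors why a variable holding -3 needs the explicit `#+` form.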
Uninitialized memory
There's a special value for uninitialized memory, it's represented by a
question mark. Whenever it's used to generate data it creates a `hole' where
the previous content of memory is visible.
Uninitialized memory holes without previous content are not saved unless it's
really necessary for the output format, in that case it's replaced with zeros.
It's not just data generation statements (e.g. .byte) that can create
uninitialized memory, but .fill, .align or address manipulation as well.
* = $200 ;bytes as necessary
.word ? ;2 bytes
.fill 10 ;10 bytes
.align 64 ;bytes as necessary
Booleans
There are two predefined boolean constant variables, true and false.
Booleans are created by comparison operators (<, <=, !=, ==, >=, >), logical
operators (&&, ||, ^^, !), the membership operator (in) and the all and any
functions.
Normally in numeric expressions true is 1 and false is 0, unless the
`-Wstrict-bool' command line option was used.
Other types can be converted to boolean by using the type bool.
Boolean values of various types
bits At least one non-zero bit
bool When true
bytes At least one non-zero byte
code Address is non-zero
float Not 0.0
int Not zero
str At least one non-zero byte after translation
Types
The various types mentioned earlier have predefined names. These can be used
for conversions or type checks.
Built-in type names
address Address type
bits Bit string type
bool Boolean type
bytes Byte string type
code Code type
dict Dictionary type
float Floating point type
gap Uninitialized memory type
int Integer type
list List type
str Character string type
symbol Symbol type
tuple Tuple type
type Type type
Bit and byte string conversions can take a second parameter to specify an exact
size. Values which can fit in shorter space will be padded but longer ones give
an error.
bits(<expression>[, <bit count>])
Convert to the specific number of bits. If the number of bits is negative
then it's signed.
bytes(<expression>[, <byte count>])
Convert to the specific number of bytes. If the number of bytes is negative
then it's signed.
Dictionaries can be built from a single iterable of key and value pairs, or
from two iterables where the keys come from the first and the values from the
second parameter.
dict(<iterable>[, <iterable>])
Build dictionary from iterables
.cerror type(var) != str, "Not a string!"
.text str(year) ; convert to string
Symbols
Symbols are used to reference objects. Regularly named, anonymous and local
symbols are supported. These can be constant or re-definable.
Scopes are where symbols are stored and looked up. The global scope is always
defined and it can contain any number of nested scopes.
Symbols must be uniquely named in a scope, therefore in big programs it's hard
to come up with useful and easy to type names. That's why local and anonymous
symbols exist. Grouping certain related symbols into a scope makes sense
sometimes too.
Scopes are usually created by .proc and .block directives, but there are a few
other ways. Symbols in a scope can be accessed by using the dot operator, which
is applied between the name of the scope and the symbol (e.g.
myconsts.math.pi).
Regular symbols
Regular symbol names start with a letter and contain letters, numbers and
underscores. Unicode letters are allowed if the `-a' command line option was
used. There's no restriction on the length of symbol names.
Care must be taken to not use duplicate names in the same scope when the symbol
is used as a constant as there can be only one definition for them.
Duplicate names in parent scopes are not a problem and this gives the ability
to override names defined in lower scopes. However this can just as well lead
to mistakes if a lower scoped symbol with the same name was meant so there's a
`-Wshadow' command line option to warn if such ambiguity exists.
Case sensitivity can be enabled with the `-C' command line option, otherwise
all symbols are matched case insensitive.
For case insensitive matching it's possible to check for consistent symbol name
use with the `-Wcase-symbol' command line option.
A regular symbol is looked up first in the current scope, then in lower scopes
until the global scope is reached.
f .block
g .block
n nop ;jump here
.endblock
.endblock
jsr f.g.n ;reference from a scope
f.x = 3 ;create x in scope f with value 3
Local symbols
Local symbols have their own scope between two regularly named code symbols and
are assigned to the code symbol above them.
Therefore they're easy to reuse without explicit scope declaration directives.
Not all regularly named symbols can be scope boundaries, just plain code
symbols with nothing or an opcode after them (no macros!). Symbols defined as
procedures, blocks, macros, functions, structures and unions are ignored. Also
symbols defined by .var, := or = don't apply, and there are a few more
exceptions, so stick to using plain code labels.
The name must start with an underscore (_), otherwise the same character
restrictions apply as for regular symbols. There's no restriction on the length
of the name.
Care must be taken not to use duplicate names in the same scope when the
symbol is used as a constant.
A local symbol is only looked up in its own scope and nowhere else.
incr inc ac
bne _skip
inc ac+1
_skip rts
decr lda ac
bne _skip
dec ac+1
_skip dec ac ;symbol reused here
jmp incr._skip ;this works too, but is not advised
Anonymous symbols
Anonymous symbols don't have a unique name and are always written as a single
plus or minus sign. They are also called forward (+) and backward (-)
references.
When referencing them `-' means the first backward, `--' means the second
backward and so on. It's the same for forward, but with `+'. In expressions it
may be necessary to put them into brackets.
ldy #4
- ldx #0
- txa
cmp #3
bcc +
adc #44
+ sta $400,x
inx
bne -
dey
bne --
Excessive nesting or long distance references create poorly readable code. It's
also very easy to copy-paste a few lines of code with these references into a
code fragment already containing similar references. The result is usually a
long debugging session to find out what went wrong.
These references are also useful in segments, but this can create a nice trap
when segments are copied into the code with their internal references.
bne +
#somemakro ;let's hope that this segment does
+ nop ;not contain forward references...
Anonymous symbols are looked up first in the current scope, then in lower
scopes until the global scope is reached.
Anonymous labels within conditionally assembled code are counted even if the
code itself is not compiled and the label won't get defined. This ensures that
anonymous labels are always at the same "distance" independent of the
conditions in between.
Constant and re-definable symbols
Constant symbols can be created with the equal sign. These are not
re-definable. Forward referencing of them is allowed as they retain the objects
over compilation passes.
Symbols in front of code or certain assembler directives are created as
constant symbols too. They are bound to the object following them.
Re-definable symbols can be created by the .var directive or := construct.
These are also called variables. They don't carry their content over from
the previous pass, therefore it's not possible to use them before their
definition.
If the variable already exists in the current scope it'll get updated. If an
existing variable needs to be updated in a parent scope then the ::= variable
reassign operator is able to do that.
Variables can be conditionally defined using the :?= construct. If the variable
was defined already then the original value is retained otherwise a new one is
created with this value.
WIDTH = 40 ;a constant
lda #WIDTH ;lda #$28
variabl .var 1 ;a variable
var2 := 1 ;another variable
variabl .var variabl + 1;update it verbosely
var2 += 1 ;compound assignment (add one)
var3 :?= 5 ;assign 5 if undefined
The star label
The `*' symbol denotes the current program counter value. When accessed its
value is the program counter at the beginning of the line. Assigning to it
changes the program counter and the compiling offset.
Built-in functions
Built-in functions are pre-assigned to the symbols listed below. If you reuse
these symbols in a scope for other purposes then they become inaccessible, or
can perform a different function.
Built-in functions can be assigned to symbols (e.g. sinus = sin), and the new
name can be used as the original function. They can even be passed as
parameters to functions.
Mathematical functions
floor()
Round down. E.g. floor(-4.8) is -5.0
round()
Round to nearest away from zero. E.g. round(4.8) is 5.0
ceil()
Round up. E.g. ceil(1.1) is 2.0
trunc()
Round towards zero. E.g. trunc(-1.9) is -1
frac()
Fractional part. E.g. frac(1.1) is 0.1
sqrt()
Square root. E.g. sqrt(16.0) is 4.0
cbrt()
Cube root. E.g. cbrt(27.0) is 3.0
log10()
Common logarithm. E.g. log10(100.0) is 2.0
log([, ])
Logarithm, natural by default. E.g. log(1) is 0.0
exp()
Exponential. E.g. exp(0) is 1.0
pow(, )
A raised to power of B. E.g. pow(2.0, 3.0) is 8.0
sin()
Sine. E.g. sin(0.0) is 0.0
asin()
Arc sine. E.g. asin(0.0) is 0.0
sinh()
Hyperbolic sine. E.g. sinh(0.0) is 0.0
cos()
Cosine. E.g. cos(0.0) is 1.0
acos()
Arc cosine. E.g. acos(1.0) is 0.0
cosh()
Hyperbolic cosine. E.g. cosh(0.0) is 1.0
tan()
Tangent. E.g. tan(0.0) is 0.0
atan()
Arc tangent. E.g. atan(0.0) is 0.0
tanh()
Hyperbolic tangent. E.g. tanh(0.0) is 0.0
rad()
Degrees to radian. E.g. rad(0.0) is 0.0
deg()
Radian to degrees. E.g. deg(0.0) is 0.0
hypot([, ...])
Euclidean distance, any dimensions. E.g. hypot(4.0, 3.0) is 5.0
atan2(, )
Polar angle in -pi to +pi range. E.g. atan2(0.0, 3.0) is 0.0
abs()
Absolute value. E.g. abs(-1) is 1
sign()
Returns the sign of value as -1, 0 or 1 for negative, zero and positive.
E.g. sign(-5) is -1
Byte string functions
These functions return byte strings of various lengths for signed numbers,
unsigned numbers and addresses.
The naming of functions is not a coincidence: they return the bytes that the
data directives with the same names normally emit.
byte()
char()
Return a single byte string from an 8 bit unsigned (0-255) or signed number
(-128-127). E.g. byte(0) is x"00" and char(-1) is x"ff"
word()
sint()
Return a little endian byte string of 2 bytes from a 16 bit unsigned
(0-65535) or signed number (-32768-32767). E.g. word(1024) is x"0004" and
sint(-1) is x"ffff"
long()
lint()
Return a little endian byte string of 3 bytes from a 24 bit unsigned
(0-16777215) or signed number (-8388608-8388607). E.g. long(123456) is
x"40E201" and lint(-1) is x"ffffff"
dword()
dint()
Return a little endian byte string of 4 bytes from a 32 bit unsigned
(0-4294967295) or signed number (-2147483648-2147483647). E.g. dword(
123456789) is x"15CD5B07" and dint(-1) is x"ffffffff"
addr()
Return a little endian byte string of 2 bytes from an address in the
current program bank. E.g. addr(start) is x"0d08"
rta()
Return a little endian byte string of 2 bytes from a return address in the
current program bank. E.g. rta(4096) is x"ff0f"
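The ranges and byte order described above can be checked with a small model. This is a minimal Python sketch, not 64tass code. The helper names (le_bytes, long_ and so on) are made up for this illustration, and the rta() behaviour (address minus one) is inferred from the rta(4096) example.

```python
def le_bytes(value, size, signed=False):
    """Little-endian byte string of `size` bytes.

    Raises OverflowError when the value does not fit, which stands in
    for the range check described in the manual.
    """
    return value.to_bytes(size, "little", signed=signed)


# Hypothetical stand-ins for the built-in functions of the same name.
def word(value):
    return le_bytes(value, 2)


def sint(value):
    return le_bytes(value, 2, signed=True)


def long_(value):          # trailing underscore only to keep the name distinct
    return le_bytes(value, 3)


def lint(value):
    return le_bytes(value, 3, signed=True)


def dword(value):
    return le_bytes(value, 4)


def rta(address):
    # Assumption: a return address is the target minus one, because RTS
    # adds one to the address it pulls from the stack.
    return le_bytes((address - 1) & 0xFFFF, 2)


print(long_(123456).hex())   # hex digits of the three stored bytes
```

Passing 16777216 to long_() raises OverflowError in this model, because the largest 24 bit unsigned value is 16777215.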
Other functions
all()
Return truth for various definitions of `all'.
All function
all bits set or no bits at all             all($f) is true
all characters non-zero or empty string    all("c") is true
all bytes non-zero or no bytes             all(x"ac24") is true
all elements true or empty list            all([true, true, false]) is false
Only booleans in a list are accepted with the `-Wstrict-bool' command line
option.
any()
Return truth for various definitions of `any'.
Any function
at least one bit set any(~$f) is false
at least one non-zero character any("c") is true
at least one non-zero byte any(x"ac24") is true
at least one true element any([true, true, false]) is true
Only booleans in a list are accepted with the `-Wstrict-bool' command line
option.
binary([, [, ]])
Returns the binary file content as bytes.
This function reads the content of a binary file as a byte string. It also
accepts optional offset and length parameters.
Binary function invocation types
Read everything binary(name)
Skip starting bytes binary(name, offset)
Some bytes from offset binary(name, offset, length)
sid = binary("music.sid"); read in the SID file as bytes
offs := sid[[$7, $6]] ; data offset (big endian)
load := sid[[$9, $8]] ; load address (big endian)
init = sid[[$b, $a]] ; init address (big endian)
play = sid[[$d, $c]] ; play address (big endian)
; if load address is zero then it's the first 2 bytes of data
.if load == 0
load := sid[offs:offs+2] ; load address (little endian)
offs += 2 ; skip load address bytes
.endif
* = load ; set pc to load address
.text sid[offs:] ; dump music data
format([, , ...])
Create string from values according to a format string.
The format function converts a list of values into a character string. The
converted values are inserted in place of the % sign. Optional conversion
flags and minimum field length may follow, before the conversion type
character. These flags can be used:
Formatting flags
# alternate form (-$a, ~$a, -%10, ~%10, -10.)
* width/precision from list
. precision
0 pad with zeros
- left adjusted (default right)
(space) blank when positive or minus sign
+ sign even if positive
~ binary and hexadecimal as bits
The following conversion types are implemented:
Formatting conversion types
b binary
c Unicode character
d decimal
e E exponential float (uppercase)
f F floating point (uppercase)
g G exponential/floating point
s string
r representation
x X hexadecimal (uppercase)
% percent sign
.text format("%#04x bytes left", 1000); $03e8 bytes left
len()
Returns the number of elements.
Length of various types
bit string length in bits len($034) is 12
character string number of characters len("abc") is 3
byte string number of bytes len(x"abcd23") is 3
tuple, list number of elements len([1, 2, 3]) is 3
dictionary number of elements len({1:2, 3:4}) is 2
code number of elements len(label)
random([, ...])
Returns a pseudo random number.
The sequence does not change across compilations and is the same every
time. Different sequences can be generated by seeding with .seed.
Random function invocation types
floating point number 0.0 <= x < 1.0 random()
integer in range of 0 <= x < e random(e)
integer in range of s <= x < e random(s, e)
integer in range of s <= x < e, step t random(s, e, t)
.seed 1234 ; default is boring, seed the generator
.byte random(256); a pseudo random byte (0-255)
.byte random([16] x 8); 8 pseudo random bytes (0-15)
range([, , ...])
Returns a list of integers in a range, with optional stepping.
Range function invocation types
integers from 0 to e-1 range(e)
integers from s to e-1 range(s, e)
integers from s to e (not including e), step t range(s, e, t)
.byte range(16) ; 0, 1, ..., 14, 15
.char range(-5, 6); -5, -4, ..., 4, 5
mylist = range(10, 0, -2); [10, 8, 6, 4, 2]
repr()
Returns a string representation of value.
.warn repr(var) ; pretty print value, for debugging
size()
Returns the size of code, structure or union in bytes.
var .word 0, 0, 0
ldx #size(var) ; 6 bytes
var2 = var + 2 ; start 2 bytes later
ldx #size(var2) ; what remains is 4 bytes
sort()
Returns a sorted list or tuple.
If the original list contains further lists then these must be all of the
same length. In this case the order of lists is determined by comparing
their elements from the start until a difference is found. The sort is
stable.
; sort IRQ routines by their raster lines
sorted = sort([(60, irq1), (50, irq2)])
lines .byte sorted[:, 0] ; 50, 60
irqs .addr sorted[:, 1] ; irq2, irq1
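The element by element comparison of nested rows can be tried out in Python, whose built-in sorted() is also stable and compares tuples from the first element on. This is only an analogy, not 64tass code. The function name sort_rows is made up, and the routine names are plain strings standing in for the addresses irq1 and irq2.

```python
def sort_rows(rows):
    """Sort rows by comparing their elements from the start (stable)."""
    return sorted(rows)


# (raster line, routine) pairs as in the example above
ordered = sort_rows([(60, "irq1"), (50, "irq2")])

lines = [row[0] for row in ordered]   # like sorted[:, 0]
irqs = [row[1] for row in ordered]    # like sorted[:, 1]
print(lines, irqs)
```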
Expressions
Operators
The following operators are available. Not all are defined for all types of
arguments and their meaning might slightly vary depending on the type.
Unary operators
- negative + positive
! not ~ invert
* convert to arguments ^ decimal string
The `^' decimal string operator will be changed to mean the bank byte soon.
Please update your sources to use format("%d", xxx) instead! This is done to be
in line with its use in most other assemblers.
Binary operators
+ add - subtract
* multiply / divide
% modulo ** raise to power
| binary or ^ binary xor
& binary and << shift left
>> shift right . member
.. concat x repeat
in contains !in excludes
Spacing must be used for the `x' and `in' operators or else they won't be
recognized as such. For example the expression `[1,2]x2' should be written as
`[1,2]x 2' instead.
Parentheses (( )) can be used to override operator precedence. Don't forget
that they also denote indirect addressing mode for certain opcodes.
lda #(4+2)*3
Comparison operators
Traditional comparison operators give false or true depending on the result.
The compare operator (<=>) gives -1 for less, 0 for equal and 1 for more.
Comparison operators
<=> compare
== equals != not equal
< less than >= more than or equals
> more than <= less than or equals
=== identical !== not identical
Bit string extraction operators
These unary operators extract 8 or 16 bits. Usually they are used to get parts
of a memory address.
Bit string extraction operators
< lower byte > higher byte
<> lower word >` higher word
>< lower byte swapped word ` bank byte
lda #<label ; low byte of address
ldy #>label ; high byte of address
jsr $ab1e
ldx #<>source ; word extraction
ldy #<>dest
lda #size(source)-1
mvn #`source, #`dest; bank extraction
Please note that these prefix operators are not strongly binding like negation
or inversion. Instead they apply to the whole expression to the right. This may
be unexpected but is required for compatibility with old sources which expect
this behaviour.
.cerror (>start) != (>end)
.cerror >start != >end;Effectively this is >(start != (>end))
Conditional operators
Boolean conditional operators give false or true or one of the operands as the
result.
Logical and conditional operators
x || y if x is true then x otherwise y
x ^^ y if both false or true then false otherwise x || y
x && y if x is true then y otherwise x
!x if x is true then false otherwise true
c ? x : y if c is true then x otherwise y
c ?? x : y if c is true then x otherwise y (broadcasting)
x <? y if x is smaller then x otherwise y
x >? y if x is greater then x otherwise y
;Silly example for 1=>"simple", 2=>"advanced", else "normal"
.text MODE == 1 && "simple" || MODE == 2 && "advanced" || "normal"
.text MODE == 1 ? "simple" : MODE == 2 ? "advanced" : "normal"
;Limit result to 0 .. 8
light .byte 0 >? range(-16, 101)/6 <? 8
Please note that these are not short circuiting operations and both sides are
calculated even if thrown away later.
With the `-Wstrict-bool' command line option booleans are required as arguments
and only the `?' operator may return something else.
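Python's `and', `or', min() and max() return one of their operands in much the same way, so the selection idioms above can be modelled with them. This is a hedged sketch for illustration only. The function names are made up, and unlike the operators described here Python's `and' and `or' do short circuit.

```python
def select_and_or(mode):
    # Mirrors: MODE == 1 && "simple" || MODE == 2 && "advanced" || "normal"
    # Works only because the selected strings are never "false" themselves.
    return mode == 1 and "simple" or mode == 2 and "advanced" or "normal"


def select_ternary(mode):
    # Mirrors: MODE == 1 ? "simple" : MODE == 2 ? "advanced" : "normal"
    return "simple" if mode == 1 else "advanced" if mode == 2 else "normal"


def clamp(value, low, high):
    # Mirrors the smaller/greater operators used to limit a result
    return max(low, min(value, high))


for mode in (1, 2, 3):
    print(mode, select_and_or(mode), select_ternary(mode))
```

The ternary form is the safer of the two, as the and/or chain picks the wrong branch whenever a selected value is itself false (an empty string or zero).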
Address length forcing
Special addressing length forcing operators in front of an expression can be
used to make sure the expected addressing mode is used. Only applicable when
used directly at the mnemonic.
Address size forcing
@b to force 8 bit address
@w to force 16 bit address
@l to force 24 bit address (65816)
lda @w $0000 ; force the use of 2 byte absolute addressing
bne @b label ; prevent upgrade to beq+jmp with long branches in use
lda @w #$00 ; use 2 bytes independent of accumulator size
Compound assignment
These assignment operators are shorthands for updating variables. Constants
can't be changed of course.
The variables on the left must be defined beforehand by `:=' or `.var'.
Compound assignment operators can modify variables defined in parent scopes as
well.
Compound assignments
+= add -= subtract
*= multiply /= divide
%= modulo **= raise to power
|= binary or ^= binary xor
&= binary and ||= logical or
&&= logical and <<= shift left
>>= shift right ..= concat
<?= smaller >?= greater
x= repeat .= member
v += 1 ; same as 'v ::= v + 1'
Slicing and indexing
Lists, character strings, byte strings and bit strings support various slicing
and indexing possibilities through the [] operator.
Indexing elements with positive integers is zero based. Negative indexes are
transformed to positive by adding the number of elements to them, therefore -1
is the last element. Indexing with a list of integers is possible as well, so
[1, 2, 3][(-1, 0, 1)] is [3, 1, 2].
Slicing is an operation where part of a sequence is extracted from a start
position to an end position with a step value. These parameters are separated
with colons enclosed in square brackets and are all optional. Their default
values are [start:maximum:step=1]. Negative start and end positions are
converted to positive internally by adding the length of the sequence to them.
Negative step operates in reverse direction, non-single steps will jump over
elements.
This is quite powerful and therefore a few examples will be given here:
Positive indexing a[x]
It simply extracts a numbered element. It is zero based, therefore
"abcd"[1] results in "b".
Negative indexing a[-x]
This extracts an element counted from the end, -1 is the last one. So
"abcd"[-2] results in "c".
Cut off end a[:to]
Extracts a continuous range stopping before `to'. So [10,20,30,40][:-1]
results in [10,20,30].
Cut off start a[from:]
Extracts a continuous range starting from `from'. So [10,20,30,40][-2:]
results in [30,40].
Slicing a[from:to]
Extracts a continuous range starting from element `from' and stopping
before `to'. The two end positions can be positive or negative indexes. So
[10,20,30,40][1:-1] results in [20,30].
Everything a[:]
Giving no start or end will cover everything and therefore results in a
complete copy.
Reverse a[::-1]
This gives everything in reverse, so "abcd"[::-1] is "dcba".
Stepping through a[from:to:step]
Extracts every `step'th element starting from `from' and stopping before
`to'. So "abcdef"[1:4:2] results in "bd". The `from' and `to' can be
omitted in case it starts from the beginning or ends at the end. If the
`step' is negative then it's done in reverse.
Extract multiple elements a[list]
Extract elements based on a list. So "abcd"[[1,3]] will be "bd".
The fun starts with nested lists and tuples, as these can be used to create a
matrix. The examples will be given for a two dimensional matrix for easier
understanding, but this also works in higher dimensions.
Extract row a[x]
Given a [(1,2),(3,4)] matrix [0] will give the first row which is (1,2)
Extract row range a[from:to]
Given a [(1,2),(3,4),(5,6),(7,8)] matrix [1:3] will give [(3,4),(5,6)]
Extract column a[:,x]
Given a [(1,2),(3,4)] matrix [:,0] will give the first column of all rows
which is [1,3]
Extract column range a[:,from:to]
Given a [(1,2,3,4),(5,6,7,8)] matrix [:,1:3] will give [(2,3),(6,7)]
And it works for list of indexes, negative indexes, stepped ranges, reversing,
etc. on all axes in too many ways to show all possibilities.
Basically it's just the indexing and slicing applied on nested constructs,
where each nesting level is separated by a comma.
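Plain indexing and slicing with [start:stop:step] behave like their Python counterparts, so the examples above can be replayed there. Python has no list index or comma separated axis syntax for built-in sequences, so the sketch below adds three small helpers. Their names (pick, column, column_range) are made up for this illustration and are not part of 64tass.

```python
def pick(seq, indexes):
    """Index with a list of integers, like a[[1, 3]]."""
    picked = [seq[i] for i in indexes]
    return "".join(picked) if isinstance(seq, str) else picked


def column(matrix, x):
    """Like a[:, x]: element x of every row."""
    return [row[x] for row in matrix]


def column_range(matrix, start, stop):
    """Like a[:, start:stop]: the same slice of every row."""
    return [row[start:stop] for row in matrix]


letters = "abcdef"
print(letters[1:4:2])                        # every 2nd element of 1..3
print(pick("abcd", [1, 3]))
print(column([(1, 2), (3, 4)], 0))
print(column_range([(1, 2, 3, 4), (5, 6, 7, 8)], 1, 3))
```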
-------------------------------------------------------------------------------
Compiler directives
Controlling the compile offset and program counter
Two counters are used while assembling.
The compile offset is where the data and code ends up in memory (or in image
file).
The program counter is what labels get set to and what the special star label
refers to.
Normally both are the same (code is compiled to the location it runs from) but
it does not need to be.
*=
The compile offset is adjusted so that the program counter will match the
requested address in the expression.
;Offset ;PC ;Hex ;Monitor ;Source
* = $0800
.0800 label1
.logical $1000
.0800 1000 label2
* = $1200
.0a00 1200 label3
.endlogical
.0a00 label4
.offs
Sets the compile offset relative to the program counter.
Popular in old TASM code where this was the only way to create relocated
code. Otherwise its use is not recommended as there are easier to use
alternatives below.
;Offset ;PC ;Hex ;Monitor ;Source
* = $1000
.1000 ea nop nop
.offs 100
.1065 1001 ea nop nop
.logical
Starts a relocation block
.here
.endlogical
Ends a relocation block
Changes the program counter only, the compile offset is not changed. When
finished all continues where it was left off before.
The naming is not logical at all for relocated code, but that's how it was
named in old 6502tass.
It's used for code copied to its proper location at runtime. Can be nested
of course.
;Offset ;PC ;Hex ;Monitor ;Source
* = $1000
.logical $300
.1000 0300 a9 80 lda #$80 drive lda #$80
.1002 0302 85 00 sta $00 sta $00
.1004 0304 4c 00 03 jmp $0300 jmp drive
.endlogical
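How the two counters move relative to each other can be followed with a toy model. This is a Python sketch whose class and method names are invented for this illustration. It ignores address wrapping and the .offs directive, and it only claims to reproduce the offset and program counter columns of the listings in this section, not the assembler's internals.

```python
class Counters:
    """Toy model of the compile offset and the program counter."""

    def __init__(self):
        self.offset = 0
        self.pc = 0
        self._saved = []   # PC minus offset, one entry per open .logical

    def star(self, address):
        # "* = address": the offset moves by the same amount as the PC
        self.offset += address - self.pc
        self.pc = address

    def logical(self, address):
        # ".logical address": only the PC changes
        self._saved.append(self.pc - self.offset)
        self.pc = address

    def endlogical(self):
        # ".endlogical": the PC continues where it was left off
        self.pc = self.offset + self._saved.pop()

    def emit(self, count):
        # storing bytes advances both counters
        self.offset += count
        self.pc += count


c = Counters()
c.star(0x1000)
c.logical(0x300)
c.emit(2 + 2 + 3)            # lda #$80 / sta $00 / jmp drive
print(hex(c.offset), hex(c.pc))
c.endlogical()
print(hex(c.offset), hex(c.pc))
```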
.virtual []
Starts a virtual block
.endv
.endvirtual
Ends a virtual block
Changes the program counter to the expression (if given) and discards the
result of compilation. This is useful to define structures to fixed
addresses.
.virtual $d400 ; base address
sid .block
freq .word ? ; frequency
pulsew .word ? ; pulse width
control .byte ? ; control
ad .byte ? ; attack/decay
sr .byte ? ; sustain/release
.endblock
.endvirtual
Or to define stack "allocated" variables on 65816.
.virtual #1,s
p1 .addr ? ; at #1,s
tmp .byte ? ; at #3,s
.endvirtual
lda (p1),y ; lda ($01,s),y
Aligning data or code
Alignment is about constraining data/code placement in memory.
The processor architecture doesn't have hard constraints on instruction or data
placement. Still, pages (256 bytes) come up quite often in instruction cycle
time tables, or even in errata like the indirect JMP bug, which happens only if
the word of the vector crosses such a page boundary.
Other components like video chips can only display objects if they're placed at
an address divisible by 64, for example.
For code, half of an address table might be spared if it's known that all the
addresses have the same high byte. Or if all interrupt routines are on the
same page then it's enough to change the low byte of the vector when selecting
another one.
Now it shouldn't come as a surprise that the following directives are mainly
concerned about how dividing the program counter address gives a certain
remainder.
The divisor in this context is called the alignment interval and is usually a
number which is a power of two. Quite often 256, so that's the default.
The remainder is called offset and is by default 0. Negative offsets are a
convenience feature and are internally corrected by adding the interval to them.
An interval sized memory area is called a page. Its boundary is at its start.
If data spans more than one page it's known as a page boundary cross.
Having a non-zero offset effectively shifts the boundary of a page in memory
further up or down (if negative). An interval of 256 with offset of 8 gives
page boundaries of $1008, $1108 or $1208 for example.
If the alignment is not good enough some alignment directives might try to
correct it by adding padding. This is by default uninitialized (skip forward)
but may be a fixed byte or anything more complex similarly to what the .fill
directive accepts.
When alignment is done within named structures then it's relative to the start
of the structure. This means the structure layout will always be the same
independent of which address it's instantiated at. Anonymous structures do not
change the way the alignment works.
The `-Walign' command line option can be used to emit warnings on where and how
much padding was necessary for alignment.
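The interval and offset arithmetic described above fits in a few lines. This is a minimal Python sketch under the assumption that padding is simply the distance to the next address with the requested remainder. The function names are made up, and alignment inside structures is not modelled.

```python
def align_padding(pc, interval=256, offset=0):
    """Bytes to skip so that (pc + padding) % interval == offset % interval."""
    # Python's % is never negative for a positive interval, which also
    # covers the "negative offset plus interval" correction.
    return (offset - pc) % interval


def page_start(address, interval=256, offset=0):
    """Boundary (start) of the page that contains `address`."""
    return address - (address - offset) % interval


print(hex(align_padding(0x1001)))               # to the next 256 byte page
print(hex(align_padding(0x1000, 0x400, -8)))    # last 8 bytes of 1024
print(hex(page_start(0x1100, 256, 8)))          # shifted page boundary
```

With an interval of 256 and an offset of 8 the model yields boundaries such as $1008 and $1108, matching the example in the text.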
.page [[, ]]
Start of page check block
.endp
.endpage
End of page check block
This directive is a passive assertion and checks for a page difference or
page crossing.
By default or with a negative interval parameter it verifies that the start
and end directives are on the same page. This is what's needed to guard
relative branches against jumping across pages:
ldx #3
.page ;now this will execute
- dex ;in 14 cycles for sure
bne -
.endpage
With a positive size parameter it verifies that there's no page cross in
the memory range between the directives. This is what's needed to guard
against indexed access page cross cycle penalties:
* = $10c0
.page 256
table .fill $40 ;table within the same page
.endpage ;different page here but no crossing
Normally a page check results in an error but the `-Wno-error=page' command
line option can reduce it into a warning.
Once this directive reports an error it's time to rearrange the source in a
way that the check passes. Or alternatively the alignment directives below
can be used to avoid violating the assertion.
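The difference between the two kinds of check can be stated as two predicates. This is a Python sketch with made-up function names. It assumes an offset of zero and takes the addresses of the start and end directives as plain integers, where the end address is the first byte after the block.

```python
def same_page(start, end, interval=256):
    """Default or negative interval: both directives are on the same page."""
    return start // interval == end // interval


def no_page_cross(start, end, interval=256):
    """Positive interval: the bytes from start up to end-1 share one page."""
    return end <= start or start // interval == (end - 1) // interval


# The table example above: $40 bytes starting at $10c0 end right at $1100.
start, end = 0x10C0, 0x10C0 + 0x40
print(same_page(start, end))       # the end directive is on the next page
print(no_page_cross(start, end))   # but no byte of the table crosses
```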
.align [[, [, ]]]
Align the program counter to a page boundary
This directive is useful when code/data needs to be placed exactly to a
page boundary. If that's not already the case sufficient padding is added
until the next one is reached.
.align $40 ;sprite bitmap (64 byte aligned)
sprite .fill 63
.align $400 ;screen memory (1024 byte aligned)
screen .fill 1000
.align $400, ?, -8;sprite pointers (last 8 bytes)
spritep .fill 8
.align ; page sized buffer at page boundary
sendbuf .fill 256 ;to avoid indexing penalty cycles
.alignblk [[, [, ]]]
Starts alignment block.
.endalignblk
Ends alignment block.
Often the start address is not important, only avoiding the page boundary
matters.
This often can be achieved without any padding at all. If padding is
necessary then this directive works the same as .align including alignment
within structures.
It's typically used to place tables so that absolute indexed read accesses
won't suffer page crossing cycle penalties.
.alignblk ;avoid page cross
table .byte 0, 1, 2, 3, 4, 5, 6, 7
.endalignblk
lda table,x ;no cycles wasted on access
In case the stronger guarantee of having both the start and the end
directives in the same page is required then the alignment interval needs
to be given as a negative number (e.g. -256). This may be necessary for
aligning code with relative branches.
If the block size varies based on its memory location then doing the
alignment may become impossible.
.alignpageind [, [, [, ]]]
Alignment of a page block indirectly.
Using .alignblk in the middle of executable code is usually problematic as
the alignment is done there as well. This directive can do the alignment
padding outside of the execution flow.
rts
.alignpageind pageblk;add alignment padding here
wait ldx #3
pageblk .page ;now this will execute
- dex ;in 14 cycles for sure
bne -
.endpage
By default and with a negative interval it tries to avoid page
differences, with positive intervals page crosses. This is the same as for
the .page assertion block.
It is assumed that the padding inserted will move the target block as if
it'd be right in front of it. If this isn't the case the alignment will
fail.
If the block size varies based on its memory location then doing the
alignment may become impossible.
.alignind [, [, [, ]]]
Align the target location to a page boundary indirectly
This directive tries to align the target to a page boundary. If not already
on one then sufficient padding will be added until the next one is reached.
;Align "pos" to page boundary. It must come right after "neg".
.alignind pos
neg .fill 8
pos .fill 8
.cerror (
Usually the .fill directive is used to reserve space but it may be useful
to do alignments as well.
;replacement for a .cerror overrun check and *= combo
.fill start_address - *
;align the vectors "block" so it ends at end_address
.fill end_address - size(vectors) - *
vectors .logical * ;dummy non-scoped block for size()
...
;screen memory is needed but if at $9xxx then take $a000 instead
.align $400 ;next 1024 byte alignment
.fill (* >> 12) == $9 ? ($a000 - *) : 0
screen .fill 1000
Dumping data
Storing numeric values
Multi byte numeric data is stored in the little-endian order, which is the
natural byte order for 65xx processors. Numeric ranges are enforced depending
on the directives used. Signed numbers are stored as two's complement.
When using lists or tuples their content will be used one by one. Uninitialized
data (`?') creates holes of different sizes. Character string constants are
converted using the current encoding.
Please note that multi character strings usually don't fit into 8 bits and
therefore the .byte directive is not appropriate for them. Use .text instead
which accepts strings of any length.
.byte [, , ...]
Create bytes from 8 bit unsigned constants (0-255)
.char [, , ...]
Create bytes from 8 bit signed constants (-128-127)
>1000 ff 03 .byte 255, $03
>1002 41 .byte "a"
>1003 .byte ? ; reserve 1 byte
>1004 fd .char -3
;Store 4.4 signed fixed point constants
>1005 c8 34 32 .char (-3.5, 3.25, 3.125) * 1p4
;Compact computed jumps using self modifying code
.1008 bd 10 10 lda $1010,x lda jumps,x
.100b 8d 0f 10 sta $100f sta smod+1
.100e d0 fe bne $100e smod bne *
;Routines nearby (-128 to 127 bytes)
>1010 23 49 jumps .char (routine1, routine2)-smod-2
.word [, , ...]
Create bytes from 16 bit unsigned constants (0-65535)
.sint [, , ...]
Create bytes from 16 bit signed constants (-32768-32767)
>1000 42 23 55 45 .word $2342, $4555
>1004 .word ? ; reserve 2 bytes
>1006 eb fd 51 11 .sint -533, 4433
;Store 8.8 signed fixed point constants
>100a 80 fc 40 03 20 03 .sint (-3.5, 3.25, 3.125) * 1p8
.1010 bd 19 10 lda $1019,x lda texts,x
.1013 bc 1a 10 ldy $101a,x ldy texts+1,x
.1016 4c 1e ab jmp $ab1e jmp $ab1e
>1019 33 10 59 10 texts .word text1, text2
.addr [, , ...]
Create 16 bit address constants for addresses (in current program bank)
.rta [, , ...]
Create 16 bit return address constants for addresses (in current program
bank)
* = $12000
.012000 7c 03 20 jmp ($012003,x) jmp (jumps,x)
>012003 50 20 32 03 92 15 jumps .addr $12050, routine1, routine2
;Computed jumps by using stack (current bank)
* = $103000
.103000 bf 0c 30 10 lda $10300c,x lda rets+1,x
.103004 48 pha pha
.103005 bf 0b 30 10 lda $10300b,x lda rets,x
.103009 48 pha pha
.10300a 60 rts rts
>10300b ff ef a1 36 f3 42 rets .rta $10f000, routine1, routine2
.long [, , ...]
Create bytes from 24 bit unsigned constants (0-16777215)
.lint [, , ...]
Create bytes from 24 bit signed constants (-8388608-8388607)
>1000 56 34 12 .long $123456
>1003 .long ? ; reserve 3 bytes
>1006 eb fd ff 51 11 00 .lint -533, 4433
;Store 8.16 signed fixed point constants
>100c 5d 8f fc 66 66 03 1e 85 .lint (-3.44, 3.4, 3.52) * 1p16
>1014 03
;Computed long jumps with jump table (65816)
.1015 bd 2a 10 lda $102a,x lda jumps,x
.1018 8d 11 03 sta $0311 sta ind
.101b bd 2b 10 lda $102b,x lda jumps+1,x
.101e 8d 12 03 sta $0312 sta ind+1
.1021 bd 2c 10 lda $102c,x lda jumps+2,x
.1024 8d 13 03 sta $0313 sta ind+2
.1027 dc 11 03 jmp [$0311] jmp [ind]
>102a 32 03 01 92 05 02 jumps .long routine1, routine2
.dword [, , ...]
Create bytes from 32 bit unsigned constants (0-4294967295)
.dint [, , ...]
Create bytes from 32 bit signed constants (-2147483648-2147483647)
>1000 78 56 34 12 .dword $12345678
>1004 .dword ? ; reserve 4 bytes
>1008 5d 7a 79 e7 .dint -411469219
;Store 16.16 signed fixed point constants
>100c 5d 8f fc ff 66 66 03 00 .dint (-3.44, 3.4, 3.52) * 1p16
>1014 1e 85 03 00
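The fixed point listings above can be reproduced by scaling, dropping the fraction and storing the result as little endian two's complement. This is a Python sketch with a made-up function name. Truncation towards zero is an assumption inferred from the listed bytes, not a documented rule.

```python
def fixed(value, frac_bits, size):
    """Scale by 2**frac_bits, truncate, store as `size` signed LE bytes."""
    scaled = int(value * (1 << frac_bits))   # int() truncates towards zero
    return scaled.to_bytes(size, "little", signed=True)


def store(values, frac_bits, size):
    """Hex dump of several constants, one after the other."""
    return b"".join(fixed(v, frac_bits, size) for v in values).hex(" ")


print(store((-3.5, 3.25, 3.125), 4, 1))    # 4.4 with .char
print(store((-3.5, 3.25, 3.125), 8, 2))    # 8.8 with .sint
print(store((-3.44, 3.4, 3.52), 16, 3))    # 8.16 with .lint
print(store((-3.44, 3.4, 3.52), 16, 4))    # 16.16 with .dint
```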
.text bits([, ])
Create bytes from arbitrary precision unsigned and signed numbers.
.text bytes([, ])
Create bytes from arbitrary precision unsigned and signed numbers.
For cases not covered by the numeric store directives above it's possible
to convert numbers to byte or bit strings and store the resulting string.
If the count expression of bytes() and bits() is negative then the stored
number is signed otherwise unsigned.
>1000 74 65 78 74 00 00 00 00 .text bytes("text", 8);pad up to 8 bytes
>1008 f4 ff ff ff ff ff ff ff .text bytes(-12, -8) ;8 bytes signed
>1010 00 04 00 00 00 00 .text bits(1024, 48) ;48 bits unsigned
>1016 f4 ff ff ff ff ff .text bits(-12, -48) ;48 bits signed
Storing string values
The following directives store strings of characters, bytes or bits as bytes.
Small numeric constants can be mixed in to represent single byte control
characters.
When using lists or tuples their content will be used one by one. Uninitialized
data (`?') creates byte sized holes. Character string constants are converted
using the current encoding.
.text [, , ...]
Assemble strings into 8 bit bytes.
>1000 4f 45 d5 .text "oeU"
>1003 4f 45 d5 .text 'oeU'
>1006 17 33 .text 23, $33 ; bytes
>1008 0d 0a .text $0a0d ; $0d, $0a, little endian!
>100a 1f .text %00011111; more bytes
.fill [, ]
Reserve space (using uninitialized data), or fill with repeated bytes.
>1000 .fill $100 ;no fill, just reserve $100 bytes
>1100 00 00 00 .fill $4000, 0 ;16384 bytes of 0
...
>5100 55 aa 55 .fill 8000, [$55, $aa];8000 bytes of alternating $55, $aa
...
>7040 ff ff ff .fill $8000 - *, $ff;fill up rest of EPROM with $ff
...
.shift [, , ...]
Assemble strings of 7 bit bytes and mark the last byte by setting its most
significant bit.
Any byte which already has the most significant bit set will cause an
error. The last byte can't be uninitialized or missing of course.
The naming comes from old TASM and is a reference to setting the high bit
of alphabetic letters which results in its uppercase version in PETSCII.
.1000 a2 00 ldx #$00 ldx #0
.1002 bd 10 10 lda $1010,x loop lda txt,x
.1005 08 php php
.1006 29 7f and #$7f and #$7f
.1008 20 d2 ff jsr $ffd2 jsr $ffd2
.100b e8 inx inx
.100c 28 plp plp
.100d 10 f3 bpl $1002 bpl loop
.100f 60 rts rts
>1010 53 49 4e 47 4c 45 20 53 txt .shift "single", 32, "string"
>1018 54 52 49 4e c7
.shiftl [, , ...]
Assemble strings of 7 bit bytes shifted to the left once with the last
byte's least significant bit set.
Any byte which already has the most significant bit set will cause an error
as this is cut off on shifting. The last byte can't be uninitialized or
missing of course.
The naming is a reference to left shifting.
.1000 a2 00 ldx #$00 ldx #0
.1002 bd 0d 10 lda $100d,x loop lda txt,x
.1005 4a lsr a lsr a
.1006 9d 00 04 sta $0400,x sta $400,x ;screen memory
.1009 e8 inx inx
.100a 90 f6 bcc $1002 bcc loop
.100c 60 rts rts
.enc "screen"
>100d a6 92 9c 8e 98 8a 40 a6 txt .shiftl "single", 32, "string"
>1015 a8 a4 92 9c 8f .enc "none"
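A Python model of the left shift and of the matching decode loop may make the
bit layout easier to see. It is a sketch only; shiftl and unshiftl are
illustrative names, and the 7 bit input used below is simply what the bytes of
the listing decode to.

```python
def shiftl(data):
    """Model of .shiftl: shift left once, set bit 0 of the last byte."""
    if not data:
        raise ValueError("the last byte is missing")
    if any(b & 0x80 for b in data):
        raise ValueError("bit 7 would be cut off by the shift")
    out = [(b << 1) & 0xFF for b in data]
    out[-1] |= 0x01
    return bytes(out)

def unshiftl(stored):
    """Mirror of the lsr/bcc loop: stop after the byte with bit 0 set."""
    result = []
    for b in stored:
        result.append(b >> 1)
        if b & 1:               # carry set after lsr, end of string
            break
    return bytes(result)

stored = bytes.fromhex("a6 92 9c 8e 98 8a 40 a6 a8 a4 92 9c 8f")
plain = unshiftl(stored)
```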
.null <expression>[, <expression>, ...]
Same as .text, but adds a zero byte to the end. An existing zero byte is an
error as it'd cause a false end marker.
.1000 a9 07 lda #$07 lda #<txt
.1002 a0 10 ldy #$10 ldy #>txt
.1004 20 1e ab jsr $ab1e jsr $ab1e
>1007 53 49 4e 47 4c 45 20 53 txt .null "single", 32, "string"
>100f 54 52 49 4e 47 00
.ptext <expression>[, <expression>, ...]
Same as .text, but prepends the number of bytes in front of the string
(Pascal style string). Therefore the string can't be longer than 255 bytes.
.1000 a9 1d lda #$1d lda #<txt
.1002 a2 10 ldx #$10 ldx #>txt
.1004 20 08 10 jsr $1008 jsr print
.1007 60 rts rts
.1008 85 fb sta $fb print sta $fb
.100a 86 fc stx $fc stx $fc
.100c a0 00 ldy #$00 ldy #0
.100e b1 fb lda ($fb),y lda ($fb),y
.1010 f0 0a beq $101c beq null
.1012 aa tax tax
.1013 c8 iny - iny
.1014 b1 fb lda ($fb),y lda ($fb),y
.1016 20 d2 ff jsr $ffd2 jsr $ffd2
.1019 ca dex dex
.101a d0 f7 bne $1013 bne -
.101c 60 rts null rts
>101d 0d 53 49 4e 47 4c 45 20 txt .ptext "single", 32, "string"
>1025 53 54 52 49 4e 47
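The two conventions of .null and .ptext differ only in where the length
information goes. The following Python sketch models both under the rules
stated above; the function names mirror the directives but are otherwise just
illustrative helpers.

```python
def null(data):
    """Model of .null: append a zero byte, reject embedded zeros."""
    if 0 in data:
        raise ValueError("an existing zero byte would be a false end marker")
    return bytes(data) + b"\x00"

def ptext(data):
    """Model of .ptext: one length byte in front of the string."""
    if len(data) > 255:
        raise ValueError("the length does not fit into a single byte")
    return bytes([len(data)]) + bytes(data)

text = b"SINGLE" + bytes([32]) + b"STRING"   # byte values from the listings
terminated = null(text)
counted = ptext(text)
```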
Text encoding
64tass supports sources written in UTF-8, UTF-16 (be/le) and RAW 8 bit
encoding. To take advantage of this capability, custom encodings can be defined
to map Unicode characters to 8 bit values in strings. Even in plain ASCII
sources it could be useful to define escape sequences for control codes.
.enc
Selects text encoding by a character string name or from an encoding object.
Predefined encoding names are `none' and `screen' (screen code), anything
else is user defined. All user encodings start without any character or
escape definitions, add some as required. Please note that the encoding
names are global.
This directive changes the text encoding after it, therefore it's usually
placed somewhere at the beginning of the source to make sure everything is
covered.
While it is possible to juggle with multiple encodings throughout the
source code using the .enc directive, this is not recommended. For such use
cases .encode is better suited.
In the past the .enc directive accepted an unquoted string but currently it
needs to be an expression.
.enc "screen";screen code mode
>1000 13 03 12 05 05 0e 20 03 .text "screen codes"
>1008 0f 04 05 13
.100c c9 15 cmp #$15 cmp #"u" ;compare screen code
.enc "none" ;normal mode again
.100e c9 55 cmp #$55 cmp #"u" ;compare PETSCII
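To see why the same "u" compares as $15 in one mode and $55 in the other, the
screen code conversion can be approximated in Python. This sketch covers only
lowercase letters and the space, with the mapping inferred from the listing
above; it is not the complete `screen' encoding and to_screen is a made-up
helper name.

```python
def to_screen(text):
    """Partial model of the `screen' encoding (a-z and space only)."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(ord(ch) - ord("a") + 1)   # a..z -> $01..$1a
        elif ch == " ":
            out.append(0x20)
        else:
            raise ValueError(f"not covered by this sketch: {ch!r}")
    return bytes(out)

codes = to_screen("screen codes")
```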
.encode []
Encoding area start
.endencode
Encoding area end
This directive either creates a new text encoding (if used without a
parameter) or makes the one in the parameter effective within the enclosed
area.
The text encoding can be assigned to a symbol in front of the directive so
it can be reused whenever it's needed. This symbol can also act as a
conversion function which converts a character string to a byte string
using the encoding.
.encode ;starts anonymous local encoding scope
.enc "titlefont";special character set
.text "game title"
.endencode ;restores original encoding
vt100 .encode ;define custom encoding
.cdef " ~", 32
.edef "{esc}", 27;add escape codes
.edef "{moff}", [27, "[", "m"]
.edef "{bold}", [27, "[", "1", "m"]
.endencode
.encode vt100 ;use custom encoding from here
.text "{bold}bold{moff} text"
lda #"{esc}"
.endencode ;restores original encoding
cmp #vt100("{esc}");conversion when not in scope
.enc vt100 ;select custom encoding (at start of source)
.cdef <start>, <end>, <coded> [, <start>, <end>, <coded>, ...]
.cdef "<start><end>", <coded> [, "<start><end>", <coded>, ...]
Assigns characters in a range to single bytes.
This is a simple single character to byte translation definition. It's
useful to map a range of Unicode characters to a range of bytes. The start
and end positions are Unicode character codes either by numbers or by
typing them. Overlapping ranges are not allowed.
.enc "ascii" ;define an ascii encoding
.cdef " ~", 32 ;identity mapping for printable
.tdef <expression>, <expression> [, <expression>, <expression>, ...]
Assign single characters to byte values.
Similar to .cdef it is a single character to byte translation definition.
It's easier to use when the character codes are not consecutive.
Overlapping ranges with the former and itself are not allowed.
It tries to assign Unicode character codes from the first expression to
byte values from the second. More than one pair of such assignments can be
given.
If the byte value expression is not iterable then it will get incremented
for each character definition. This allows easy assignment of randomly
scattered Unicode values to a consecutive range of bytes values.
.tdef "A", 65 ;A -> 65
.tdef "ACX", 65 ;A -> 65, C-> 66, X -> 67
.tdef "ACX", [65, 33, 11];A -> 65, C-> 33, X -> 11
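The difference between an iterable and a non-iterable byte value expression
can be shown with a small Python model. It is a sketch with a hypothetical
helper named tdef; rejecting mismatched lengths is an assumption of the sketch
and not taken from the manual.

```python
def tdef(chars, values):
    """Return the {character: byte} pairs one .tdef assignment would add."""
    if isinstance(values, int):
        # not iterable: incremented for each character definition
        values = range(values, values + len(chars))
    if len(values) != len(chars):
        raise ValueError("sketch assumption: lengths have to match")
    return dict(zip(chars, values))

single = tdef("A", 65)
counted = tdef("ACX", 65)
listed = tdef("ACX", [65, 33, 11])
```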
.edef "<escapetext>", <value> [, "<escapetext>", <value>, ...]
Assigns strings to byte sequences as a translated value.
When these substrings are found in a text they are replaced by bytes
defined here. When strings with common prefixes are used the longest match
wins. Useful for defining non-typeable control code aliases, or as a simple
tokeniser.
.edef "\n", 13 ;one byte control codes
.edef "{clr}", 147
.edef "{crlf}", [13, 10];two byte control code
.edef "", [];replace with no bytes
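How character ranges and escape sequences interact during conversion can be
sketched in Python. The class below is an illustration only: its name and
methods are invented, it assumes escape sequences are tried before single
characters, and it writes the bytes of the escape values as plain numbers.

```python
class Encoding:
    """Toy model of a text encoding built from .cdef and .edef."""

    def __init__(self):
        self.chars = {}      # single character -> byte
        self.escapes = {}    # escape text -> list of bytes

    def cdef(self, first, last, start):
        for offset, code in enumerate(range(ord(first), ord(last) + 1)):
            ch = chr(code)
            if ch in self.chars:
                raise ValueError("overlapping ranges are not allowed")
            self.chars[ch] = start + offset

    def edef(self, text, values):
        self.escapes[text] = [values] if isinstance(values, int) else list(values)

    def encode(self, text):
        out, pos = [], 0
        while pos < len(text):
            found = [e for e in self.escapes if e and text.startswith(e, pos)]
            if found:
                best = max(found, key=len)       # the longest match wins
                out += self.escapes[best]
                pos += len(best)
            else:
                out.append(self.chars[text[pos]])
                pos += 1
        return bytes(out)

vt100 = Encoding()
vt100.cdef(" ", "~", 32)
vt100.edef("{esc}", 27)
vt100.edef("{moff}", [27, ord("["), ord("m")])
vt100.edef("{bold}", [27, ord("["), ord("1"), ord("m")])
encoded = vt100.encode("{bold}bold{moff} text")
```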
The example below shows how all this fits together:
petscii .namespace
common .segment;common definitions
.cdef " @", $20;32-64 is identical
.tdef "[?]??", $5b, "???", $db
.edef "{clr}", 147, "{cr}", 13
.endsegment
upper .encode;uppercase PETSCII
#common
.cdef "AZ", $41
.tdef "????????????????????????????????", $a1
.tdef "??????????????????????????", $c1
.tdef "????", [$df, $ff, $c0, $dd]
.endencode
lower .encode;lowercase PETSCII
#common
.cdef "az", $41, "AZ", $c1;the easy ranges
.tdef "????????????????????????????????", $a1
.tdef "????", [$df, $ff, $c0, $dd];random one to ones
.endencode
.endnamespace
.encode petscii.lower
>1000 93 d4 45 58 54 20 49 4e .text "{clr}Text in PETSCII{cr}"
>1008 20 d0 c5 d4 d3 c3 c9 c9 0d
.endencode
Structured data
Structures and unions can be defined to create complex data types. The offsets
of fields are available by using the definition's name, the fields themselves
by using the instance name.
The initialization method is very similar to macro parameters, the difference
is that unset parameters always return uninitialized data (`?') instead of an
error.
Structure
Structures are for organizing sequential data, so the length of a structure is
the sum of lengths of all items.
.struct [][=]][, [][=] ...]
Begins a structure block
.ends [][, ...]
.endstruct [][, ...]
Ends a structure block
Structure definition, with named parameters and default values
.dstruct [, ]
. []
Create instance of structure with initialization values
.struct ;anonymous structure
x .byte 0 ;labels are visible
y .byte 0 ;content compiled here
.endstruct ;useful inside unions
nn_s .struct col, row;named structure
x .byte \col ;labels are not visible
y .byte \row ;no content is compiled here
.endstruct ;it's just a definition
nn .dstruct nn_s, 1, 2;structure instance (within label)
lda nn.x ;direct field access
ldy #nn_s.x ;get offset of field
lda nn,y ;and use it indirectly
nnarray .brept 4 ;4 element "array" here
.dstruct nn_s ;fields directly here (without a label)
.endrept
lda nnarray[0].y;access of "array" field
coords2 .bfor x2, y2 in [(1,3),(4,2),(7,5)]
.dstruct nn_s, x2, y2
.endfor ;initialized "array" from list
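For readers more familiar with high level languages, the offset and length
rules of a structure can be compared to Python's ctypes module. This is an
analogy only, not 64tass code; the class name NnS is made up to resemble nn_s
above.

```python
import ctypes

class NnS(ctypes.Structure):
    # two consecutive byte sized fields, like x and y in nn_s
    _fields_ = [("x", ctypes.c_uint8), ("y", ctypes.c_uint8)]

# roughly what the coords2 loop above lays out in memory
coords2 = (NnS * 3)(NnS(1, 3), NnS(4, 2), NnS(7, 5))
```

NnS.x.offset and NnS.y.offset play the role of nn_s.x and nn_s.y, while
coords2[0].y corresponds to accessing a field through an instance.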
Union
Unions can be used for overlapping data as the compile offset and program
counter remain the same on each line. Therefore the length of a union is the
length of its longest item.
.union [][=]][, [][=] ...]
Begins a union block
.endu
.endunion
Ends a union block
Union definition, with named parameters and default values
.dunion [, ]
. []
Create instance of union with initialization values
.union ;anonymous union
x .byte 0 ;labels are visible
y .word 0 ;content compiled here
.endunion
nn_u .union ;named union
x .byte ? ;labels are not visible
y .word \1 ;no content is compiled here
.endunion ;it's just a definition
nn .dunion nn_u, 1 ;union instance here
lda nn.x ;direct field access
ldy #nn_u.x ;get offset of field
lda nn,y ;and use it indirectly
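The same ctypes analogy works for unions: all fields start at offset zero and
the size is that of the longest one. Again this is Python used for
illustration, with NnU as a made-up counterpart of nn_u.

```python
import ctypes

class NnU(ctypes.Union):
    # a byte and a word sharing the same starting address
    _fields_ = [("x", ctypes.c_uint8), ("y", ctypes.c_uint16)]

union_size = ctypes.sizeof(NnU)
```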
Combined use of structures and unions
The example below shows how to define a structure for a binary include.
.union
.binary "pic.drp", 2
.struct
color .fill 1024
screen .fill 1024
bitmap .fill 8000
backg .byte ?
.endstruct
.endunion
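Since both union members start at the same address, the labels of the
structure end up as offsets into the included data. The arithmetic can be
written out in Python as a quick check; the field sizes are those of the
example, everything else is scaffolding for the illustration.

```python
from itertools import accumulate

fields = [("color", 1024), ("screen", 1024), ("bitmap", 8000), ("backg", 1)]
names = [name for name, _ in fields]
sizes = [size for _, size in fields]

# each field starts where the previous ones end
offsets = dict(zip(names, accumulate([0] + sizes[:-1])))
total = sum(sizes)
```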
Anonymous structures and unions in combination with sections are useful for
overlapping memory assignment. The example below shares zero page allocations
for two separate parts of a bigger program. The common subroutine variables are
assigned afterwards in the `zp' section.
* = $02
.union ;spare some memory
.struct
.dsection zp1 ;declare zp1 section
.endstruct
.struct
.dsection zp2 ;declare zp2 section
.endstruct
.endunion
.dsection zp ;declare zp section
Macros
Macros can be used to reduce typing of frequently used source lines. Each
invocation is a copy of the macro's content with parameter references replaced
by the parameter texts.
.segment [][=]][, [][=] ...]
Start of segment block
.endsegment [][, ...]
End of segment block
Copies the code segment as it is, so symbols can be used from outside, but
this also means repeated use can result in double defines unless anonymous
labels are used.
.macro [][=]][, [][=] ...]
Start of macro block
.endmacro [][, ...]
End of macro block
The code is enclosed in its own block so symbols inside are
non-accessible, unless a label is prefixed at the place of use, then local
labels can be accessed through that label.
A side effect of this is that trying to use a local symbol's name as a
parameter will likely fail. The reason is it'll be looked up in the macro's
own scope but that's not where it is. To avoid such surprises it's advised
to use .segment instead whenever possible.
# [][[,][] ...]
. [][[,][] ...]
Invoke the macro after `#' or `.' with the parameters. Normally the name of
the macro is used, but it can be any expression.
.endm [][, ...]
Closing directive of .macro and .segment for compatibility.
;A simple macro
copy .macro
ldx #size(\1)
lp lda \1,x
sta \2,x
dex
bpl lp
.endmacro
#copy label, $500
;Use macro as an assembler directive
lohi .macro
lo .byte <(\@)
hi .byte >(\@)
.endmacro
var .lohi 1234, 5678
lda var.lo,y
ldx var.hi,y
Parameter references
The first 9 parameters can be referenced by `\1'-`\9'. The entire parameter
list including separators is `\@'.
name .macro
lda #\1 ;first parameter 23+1
.endmacro
#name 23+1 ;call macro
Parameters can be named, and it's possible to set a default value after an
equal sign which is used as a replacement when the parameter is missing.
These named parameters can be referenced by \name or \{name}. Names must match
completely; if unsure, use the quoted name reference syntax.
name .macro first, b=2, , last
lda #\first ;first parameter
lda #\b ;second parameter
lda #\3 ;third parameter
lda #\last ;fourth parameter
.endmacro
#name 1, , 3, 4 ;call macro
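Parameter references are plain text substitution, which can be imitated with a
regular expression. The Python sketch below is a simplified model and not how
64tass parses macros: the function expand is hypothetical, and `\@' is
approximated by joining the parameters with a comma and a space.

```python
import re

PARAM = re.compile(r"\\(?:\{(\w+)\}|(@|[1-9]|[A-Za-z_]\w*))")

def expand(body, names, defaults, args):
    """Replace parameter references in body by the parameter texts."""
    count = max(len(names), len(args))
    values = []
    for i in range(count):
        text = args[i] if i < len(args) else ""
        name = names[i] if i < len(names) else ""
        if text == "" and name in defaults:
            text = defaults[name]           # missing parameter, use default
        values.append(text)
    by_name = {n: v for n, v in zip(names, values) if n}

    def substitute(match):
        key = match.group(1) or match.group(2)
        if key == "@":
            return ", ".join(args)          # approximation of the raw list
        if key.isdigit():
            return values[int(key) - 1]
        return by_name[key]

    return PARAM.sub(substitute, body)

body = "lda #\\first\nlda #\\b\nlda #\\3\nlda #\\{last}"
result = expand(body, ["first", "b", "", "last"], {"b": "2"}, ["1", "", "3", "4"])
```

The call corresponds to `#name 1, , 3, 4' above, where the empty second
parameter falls back to the default value 2.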
Text references
In the original turbo assembler normal references are passed by value and can
only appear in place of one. Text references on the other hand can appear
everywhere and will work in place of e.g. quoted text or opcodes and labels.
The first 9 parameters can be referenced as text by @1-@9.
name .macro
jsr print
.null "Hello @1!";first parameter
.endm
#name "wth?" ;call macro
Custom functions
Beyond the built-in functions mentioned earlier it's possible to define custom
ones for frequently used calculations.
.sfunction [[:][=], ...][*,]
Defines a simple function to return the result of a parametrised expression
.function [:][=]], [=] ...][, *
]
Defines a multi line function
.endf [][, ...]
.endfunction [][, ...]
End of a multi line function
# [][[,][] ...]
. [][[,][] ...]
[][[,][] ...]
Invoke a multi line function like a macro, directive or pseudo instruction
Function parameters are assigned to comma separated variable names on
invocation. These variables are visible in the function scope.
Parameter values may be converted using a function whose name can be given
after a colon following the variable name.
Default values may be supplied for each parameter after an equal sign. These
values are calculated at function definition time only and are used when a
parameter was not specified.
Extra parameters are not accepted, unless the last parameter symbol is preceded
by a star, in which case these parameters are collected into a tuple.
Only those external variables and functions are available which were accessible
at the place of definition, but not those at the place of invocation.
vicmem .sfunction _font, _scr=0, ((_font >> 10) & $0f) | ((_scr >> 6) & $f0)
lda #vicmem($2000, $0400); calculate constant
sta $d018
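The calculation done by vicmem can be verified by hand or with an equivalent
Python function. This is only a cross check written in another language; the
second function, collect, is a made-up example showing that a starred last
parameter behaves much like Python's *args.

```python
def vicmem(font, scr=0):
    """Same expression as the .sfunction above, in Python syntax."""
    return ((font >> 10) & 0x0F) | ((scr >> 6) & 0xF0)

def collect(first, *rest):
    """Extra parameters end up in a tuple."""
    return first, rest

value = vicmem(0x2000, 0x0400)
```

Python also evaluates default values once at definition time, which matches
the behaviour described above.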
If a multi line function is used in an expression only the returned result is
used. If multiple values are returned these will form a tuple.
If a multi line function is used as macro, directive or pseudo instruction and
there's a label in front then the returned value is assigned to it. If nothing
is returned then it's used as a regular label.
mva .function value, target
lda value
sta target
.endfunction
mva #1, label
Conditional assembly
To prevent parts of the source from compiling, conditional constructs can be
used. This is useful when multiple slightly different versions need to be
compiled from the same source.
Anonymous labels are still recognized in the non-compiling parts even if they
won't get defined. This ensures consistent relative referencing across
conditionally compiled areas with such labels.
If, else if, else
.if
Compile if condition is true
.elsif
Compile if previous conditions were not met and the condition is true
.else
Compile if previous conditions were not met
.ifne
Compile if value is not zero
.ifeq
Compile if value is zero
.ifpl
Compile if value is greater than or equal to zero
.ifmi
Compile if value is less than zero
The .ifne, .ifeq, .ifpl and .ifmi directives exist for compatibility only,
in practice it's better to use comparison operators instead.
.if wait==2 ;2 cycles
nop
.elsif wait==3 ;3 cycles
bit $ea
.elsif wait==4 ;4 cycles
bit $eaea
.else ;else 5 cycles
inc $2
.endif
.fi
.endif
End of conditional compilation.
.elif
Same as .elsif because it's a popular typo and it's difficult to notice.
Switch, case, default
Similar to the .if, .elsif, .else, .endif construct, but the compared value
needs to be written only once in the switch statement.
.switch
Evaluate expression and remember it
.case [, ...]
Compile if the previous conditions were all skipped and one of the values
equals
.default
Compile if the previous conditions were all skipped
.switch wait
.case 2 ;2 cycles
nop
.case 3 ;3 cycles
bit $ea
.case 4 ;4 cycles
bit $eaea
.default ;else 5 cycles
inc $2
.endswitch
.endswitch
End of .switch conditional compilation block.
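The selection rule of .switch (first matching .case wins, .default otherwise)
can be expressed in a few lines of Python. This is a model for illustration
with an invented function named switch; it returns source lines instead of
compiling anything.

```python
def switch(value, cases, default=()):
    """Return the lines of the first case listing value, else the default."""
    for values, lines in cases:
        if value in values:
            return lines
    return default

cases = [
    ((2,), ["nop"]),            # 2 cycles
    ((3,), ["bit $ea"]),        # 3 cycles
    ((4,), ["bit $eaea"]),      # 4 cycles
]
chosen = switch(3, cases, default=["inc $2"])
```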
Comment
.comment
Never compile.
.comment
lda #1 ;this won't be compiled
sta $d020
.endcomment
.endc
.endcomment
End of .comment block.
Repetitions
There are multiple directives which can be used to repeat lines of code.
The regular non-scoped variants cover most cases except when normal labels are
required as those will be double defined.
Scoped variants (those starting with the letter `b') create a new scope for
each iteration. This allows normal labels without collision but it's a bit more
resource intensive.
If the scoped variant is prefixed with a label then the list of individual
scopes for each iteration will be assigned to it. This allows accessing labels
within.
.for [<assignment>], [<condition>], [<assignment>]
.bfor [<assignment>], [<condition>], [<assignment>]
Assign initial value, loop while the condition is true and modify value.
First a variable is set, usually this is used for counting. This is
optional, the variable may be set already before the loop.
Then the condition is checked and the enclosed lines are compiled if it's
true. If there's no condition then it's an infinite loop and .break must be
used to terminate it.
After an iteration the second assignment is calculated, usually it's
updating the loop counter variable. This is optional as well.
All three expressions are free form and can be almost anything which makes
this style of loop very flexible. A good example is this loop which
processes a multipart data structure having a word sized length field in
front of each part:
.for pos := 0, pos < len(data), pos += data[pos : pos + 2]
...
.endfor
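The traversal done by the multipart loop can be followed step by step in
Python. The sketch assumes the word sized length field is little endian and
counts its own two bytes, because the position is advanced by the field value
alone; parts is a made-up name.

```python
def parts(data):
    """Walk through length prefixed parts like the .for loop above."""
    pos = 0
    while pos < len(data):
        step = int.from_bytes(data[pos:pos + 2], "little")
        if step < 2:
            raise ValueError("length field must cover its own two bytes")
        yield data[pos + 2:pos + step]      # payload after the length word
        pos += step

data = bytes([5, 0]) + b"abc" + bytes([3, 0]) + b"z"
payloads = list(parts(data))
```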
Still in its most typical application it just increments a counter from
start to end by constant steps:
.for counter := start, counter < end, counter += step
...
.endfor
If the loop counter only runs through a range of integers then this
iterative form is recommended instead:
;Iterative for loop (start to end with steps, not including end)
.for counter in range(start, end, step)
;Can be shorter if step is 1 (start to end-1)
.for counter in range(start, end)
;Or even shorter if start is 0 (0 to end-1)
.for counter in range(end)
That's not only more compact, it also has the advantage that the counter
name doesn't need to be repeated 3 times.
If the counter isn't used in the loop body then a simple repetition is
better:
.rept count
...
.endrept
A for loop can also act as a while loop when the assignment expressions are
left empty. That was useful in the past but there's a .while now.
.for <variable>[, <variable>, ...] in <expression>[, <expression>, ...]
.bfor <variable>[, <variable>, ...] in <expression>[, <expression>, ...]
Assign variable(s) to values in sequence one-by-one in order.
Usually one variable is used to loop through all values. The values can be
supplied by range function or some sort of list.
;loop over an iterable or over comma separated values
.for col in 0, 11, 12, 15, 1
lda #col ;0, 11, 12, 15 and 1
sta $d020
.endfor
It's also possible to use more than one variable for each iteration. These
can be assigned to a collection of values each time (row oriented) or a
single value from each collection (column oriented).
A row oriented for loop expects collections of the same number of values as
the number of variables as each variable gets assigned to one of them. The
loop iteration count depends on how many such collections were supplied.
;row oriented iterating for loop, on list of tuples
.for dest, val in [($d011, $3b), ($d020, 0), ($d018, $18)]
lda #val
sta dest
.endfor
A column oriented for loop expects the same number of collections (comma
separated) as the number of variables. On each iteration a single value is
taken from each and is assigned to the matching variable. All collections
should have the same length so that all variables can be assigned. This
length also determines the loop iteration count.
;column oriented iterating for loop, one iterable for each variable.
.for dest, val in ($d011, $d020, $d018), ($3b, 0, $18)
lda #val
sta dest
.endfor
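The row and column oriented forms produce the same iterations, which is easy
to confirm in Python where zip turns columns into rows. The snippet is an
illustration only; emit is a made-up helper that prints what each loop body
would assemble.

```python
rows = [(0xD011, 0x3B), (0xD020, 0x00), (0xD018, 0x18)]
columns = (0xD011, 0xD020, 0xD018), (0x3B, 0x00, 0x18)

def emit(pairs):
    """List the source lines generated for each (dest, val) pair."""
    lines = []
    for dest, val in pairs:
        lines.append(f"lda #${val:02x}")
        lines.append(f"sta ${dest:04x}")
    return lines

row_code = emit(rows)                   # row oriented: one tuple per iteration
column_code = emit(zip(*columns))       # column oriented: one value from each
```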
.endfor
End of a .for or .bfor loop block
.rept
.brept
Repeat enclosed lines the specified number of times.
This style of loop is for simple repetitions where no loop variable is
needed.
lda pos
.rept 3 ;multiply pos by 8
asl a
rol pos+1
.endrept
adc #