A Motorola 6809 assembler—there are many like it, but this is mine

I think it's time I start talking about some of the software I write, and I might as well start with my latest project, one I've been having way too much fun writing: a 6809 assembler written in C [1].

Yes, I could use an existing 6809 assembler, but most of the ones available as source seem to be based on one written in 1993 by L. C. Benschop. And the code quality there is … of its time … which I think is the most charitable thing I can say about it. Here's the code to convert text to a decimal number:

short scandecimal()
{
 char c;
 short t=0;
 c=*srcptr++;
 while(isdigit(c)) {   
  t=t*10+c-'0';
  c=*srcptr++;
 }
 srcptr--;
 return t;
} 

Lots of globals, lots of “magic” numbers (at least they're described in comments), and vwl mprd variable names. It's not a pleasant code base to work in.

Besides, it's something I've been wanting to do since college. So why not?

So I have a standard two-pass assembler with a few features I haven't seen in other 6809 assemblers. And that's what I'll be describing here. The first feature is small but decidedly nice: the ability to have underscores (“_”) in numeric literals. It's most useful for binary literals, such as %10_00_01_11 or %000_01001_0_100_0010, but it can be used for decimal, octal or hexadecimal numbers as well.
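
As a rough illustration of the idea (not code taken from a09), skipping separators while accumulating digits could look like the sketch below. The function name parse_number, its signature and the choice to reject values above 16 bits are all assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: parse digits in the given base, ignoring '_'
   separators that follow at least one digit.  On success the value is
   stored in *out and a pointer to the first unparsed character is
   returned.  NULL is returned if there is no digit at all or if the
   value does not fit in 16 bits. */
const char *parse_number(const char *s, unsigned base, unsigned long *out)
{
    unsigned long value = 0;
    bool seen_digit = false;

    assert(s != NULL && out != NULL && base >= 2 && base <= 16);

    for (;; s++) {
        unsigned char c = (unsigned char)*s;
        unsigned digit;

        if (c == '_' && seen_digit)
            continue;                 /* separator, carries no value */
        if (isdigit(c))
            digit = (unsigned)(c - '0');
        else if (isxdigit(c))
            digit = (unsigned)(tolower(c) - 'a') + 10;
        else
            break;                    /* end of the literal */
        if (digit >= base)
            break;                    /* not a digit in this base */
        value = value * base + digit;
        if (value > 0xFFFF)
            return NULL;              /* too big for a 6809 value */
        seen_digit = true;
    }

    if (!seen_digit)
        return NULL;
    *out = value;
    return s;
}
```

The caller is assumed to have already consumed any base prefix such as % or $. A trailing underscore is silently accepted by this sketch, which a stricter parser might want to reject.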

Another simple feature is the ability to generate a dependency list for make. Since I support the inclusion of multiple assembly files, it makes sense to support this feature as well. I'm not trying to make an assembler that runs on a 6809 system (I think it's way too small a system for that), but one that makes it nice to write code for a 6809 system.
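
The exact output format isn't shown here, so the following is only a sketch that assumes the usual make rule shape: a target, a colon, and the included files as prerequisites. The function name format_make_rule and the file names in the usage note are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sketch: format "target: dep1 dep2 ..." into buf.
   Returns the number of characters written (excluding the NUL),
   or -1 if buf is too small. */
int format_make_rule(char *buf, size_t size, const char *target,
                     const char *const deps[], size_t ndeps)
{
    size_t used;
    int n;

    assert(buf != NULL && target != NULL);

    n = snprintf(buf, size, "%s:", target);
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < ndeps; i++) {
        n = snprintf(buf + used, size - used, " %s", deps[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Given a target of main.bin and the files main.asm, video.asm and sound.asm, this would produce the line "main.bin: main.asm video.asm sound.asm", which make can pull in with an include directive.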

I also have local labels that work similarly to NASM (Netwide Assembler) [2]. As an example:

clear_bytes	clra
.loop		sta	,x+
		decb
		bne	.loop
		rts

clear_words	stb	,-s
		clra
		clrb
.loop		std	,x++
		dec	,s
		bne	.loop
		rts

Internally, the assembler will merge the local labels with the previous non-local label, and thus, we get the labels clear_bytes, clear_bytes.loop, clear_words and clear_words.loop. I find it makes for cleaner code. Which is easier to understand, this?
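
The merging step can be sketched in a few lines of C. This is an illustration under assumptions, not the a09 source: qualify_label and LABEL_MAX are hypothetical names, and the 63 is the label limit mentioned later in this post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define LABEL_MAX 63   /* label length limit described in the post */

/* Hypothetical sketch: build the full name of a label in out.  A label
   starting with '.' is local and gets appended to the most recent
   non-local label; anything else is copied as-is.  Returns false when
   the result had to be truncated to LABEL_MAX characters. */
bool qualify_label(char out[LABEL_MAX + 1], const char *last_global,
                   const char *label)
{
    int n;

    assert(out != NULL && last_global != NULL && label != NULL);

    if (label[0] == '.')
        n = snprintf(out, LABEL_MAX + 1, "%s%s", last_global, label);
    else
        n = snprintf(out, LABEL_MAX + 1, "%s", label);

    return n >= 0 && n <= LABEL_MAX;
}
```

With last_global set to "clear_bytes" and label set to ".loop", out ends up holding "clear_bytes.loop". Passing an empty last_global leaves a local label unchanged.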

;********************************************************************
;	Music Synthesizer
;Entry:	$3FF0	Freq delay count
;	$3FF1	Envelope table address
;	$3FF3	Envelope delay count
;	$3FF5	Volume, 1 to 255
; NOTE:	from _TRS-80 Color Computer Assembly Language Programming_,
;	page 252
;********************************************************************

		org	$3F00

mussyn		lda	$FF01		; select sound out
		anda	#$F7		; reset MUX bit
		sta	$FF01
		lda	$FF03		; select sound out
		anda	#$F7		; reset MUX bit
		sta	$FF03
		lda	$FF23		; get PIA
		ora	#8		; set 6-bit sound enable
		sta	$FF23
		ldu	#$3FF0		; point to block
		ldx	1,u		; get envelope address
		stx	envptr		; save in envptr
		ldx	3,u		; get envelope delay
mus005		lda	[envptr]	; get value
		beq	mus090		; if 0, done
		ldb	5,u		; get volume
		mul			; adjust volume
		anda	#$FC		; reset RS-232-C (?)
		sta	$FF20		; set on
		ldb	,u		; get frequency delay count
mus010		leax	-1,x		; decrement envelope count
		bne	mus020		; go if not 0
		ldy	envptr		; increment envelope ptr
		leay	1,y
		sty	envptr
		ldx	3,u		; get envelope delay
mus020		decb			; decrement frequency count
		bne	mus010		; go if not 0
		lda	[envptr]	; DUMMY
		brn	*+2		; DUMMY
		ldb	5,u		; DUMMY
		mul			; DUMMY
		clr	$FF20		; set off
		ldb	,u		; get frequency delay
mus030		leax	-1,x		; decrement envelope count
		bne	mus040		; go if not 0
		ldy	envptr		; increment envelope ptr
		leay	1,y
		sty	envptr
		ldx	3,u		; get envelope delay
mus040		decb			; decrement frequency count
		bne	mus030		; go if not 0
		bra	mus005		; keep on playing
mus090		rts
envptr		fdb	0

		end	mussyn

Or this?

;********************************************************************
;	Music Synthesizer
;Entry:	$3FF0	Freq delay count
;	$3FF1	Envelope table address
;	$3FF3	Envelope delay count
;	$3FF5	Volume, 1 to 255
; NOTE:	from _TRS-80 Color Computer Assembly Language Programming_,
;	page 252
;********************************************************************

		org	$3F00

mussyn		lda	$FF01		; select sound out
		anda	#$F7		; reset MUX bit
		sta	$FF01
		lda	$FF03		; select sound out
		anda	#$F7		; reset MUX bit
		sta	$FF03
		lda	$FF23		; get PIA
		ora	#8		; set 6-bit sound enable
		sta	$FF23
		ldu	#$3FF0		; point to block
		ldx	1,u		; get envelope address
		stx	.envptr		; save in envptr
		ldx	3,u		; get envelope delay
.next_byte	lda	[.envptr]	; get value
		beq	.exit		; if 0, done
		ldb	5,u		; get volume
		mul			; adjust volume
		anda	#$FC		; reset RS-232-C (?)
		sta	$FF20		; set on
		ldb	,u		; get frequency delay count
.sound_on	leax	-1,x		; decrement envelope count
		bne	.check_freq_on	; go if not 0
		ldy	.envptr		; increment envelope ptr
		leay	1,y
		sty	.envptr
		ldx	3,u		; get envelope delay
.check_freq_on	decb			; decrement frequency count
		bne	.sound_on	; go if not 0
		lda	[.envptr]	; DUMMY
		brn	*+2		; DUMMY
		ldb	5,u		; DUMMY
		mul			; DUMMY
		clr	$FF20		; set off
		ldb	,u		; get frequency delay
.sound_off	leax	-1,x		; decrement envelope count
		bne	.check_freq_off	; go if not 0
		ldy	.envptr		; increment envelope ptr
		leay	1,y
		sty	.envptr
		ldx	3,u		; get envelope delay
.check_freq_off	decb			; decrement frequency count
		bne	.sound_off	; go if not 0
		bra	.next_byte	; keep on playing
.exit		rts
.envptr		fdb	0

		end	mussyn

It helps that I allow 63 characters for a label, which is way more than any 6809 assembler I've ever used.

The last feature is warnings. Given the following code:

.start		lda	<<b16,x
		ldb	#$FF12
		std	foobar
		lda	b5,u
		ldb	b8,s
		tfr	a,x
		lbsr	a_really_long_label_that_exceeds_the_internal_limit_its_quite_long

		sta	[<<b5,y]
		bra	another_long_label_that_is_good

a_really_long_label_that_exceeds_the_internal_limit_its_quite_long
		rts

another_long_label_that_is_good
		clra
.but_this_makes_it_too_long_to_use
		decb
		bne	.but_this_makes_it_too_long_to_use

		bra	next8
next8		lbra	next16
next16		brn	next8b
next8b		lbrn	next16b
next16b		rts

foobar		equ	$20
b16		equ	$8080
b5		equ	3
b8		equ	25

The assembler will generate the following warnings (yes, this code is used to test all the warnings in the assembler):

warn.asm:1: warning: W0010: missing initial label
warn.asm:6: warning: W0008: ext/tfr mixed sized registers
warn.asm:7: warning: W0001: label 'a_really_long_label_that_exceeds_the_internal_limit_its_quite_l' exceeds 63 characters
warn.asm:12: warning: W0001: label 'a_really_long_label_that_exceeds_the_internal_limit_its_quite_l' exceeds 63 characters
warn.asm:17: warning: W0001: label 'another_long_label_that_is_good.but_this_makes_it_too_long_to_u' exceeds 63 characters
warn.asm:19: warning: W0001: label 'another_long_label_that_is_good.but_this_makes_it_too_long_to_u' exceeds 63 characters
warn.asm:1: warning: W0003: 16-bit value truncated to 5 bits
warn.asm:2: warning: W0004: 16-bit value truncated to 8 bits
warn.asm:3: warning: W0005: address could be 8-bits, maybe use '<'?
warn.asm:4: warning: W0006: offset could be 5-bits, maybe use '<<'?
warn.asm:5: warning: W0007: offset could be 8-bits, maybe use '<'?
warn.asm:7: warning: W0009: offset could be 8-bits, maybe use short branch?
warn.asm:9: warning: W0011: 5-bit offset upped to 8 bits for indirect mode
warn.asm:21: warning: W0012: branch to next location, maybe remove?
warn.asm:22: warning: W0012: branch to next location, maybe remove?
warn.asm:1: warning: W0002: symbol '.start' defined but not used

So, in order of appearance:

W0010

What happens if you give a local label sans a non-local label? Well, I decided to allow it, but at least warn about it. The resulting label is just .start, but it could be hard to reference. I could see making this an error, but for now, it's just a warning.

W0008

This is the only warning about undefined behavior. The 6809 doesn't specify what happens when you transfer (or exchange) an 8-bit register with a 16-bit register [3] (or vice versa). The CPU (Central Processing Unit) just keeps running, but the results are just that—undefined. Again, this could be an error, but for now, I'm letting it slide as a warning.
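
A check like this is cheap because of how the 6809 encodes registers in the TFR/EXG postbyte: the 16-bit registers use codes 0 to 5 and the 8-bit registers use codes 8 to 11, so bit 3 of the code gives the size. The sketch below relies on that encoding; the enum and function names are hypothetical and not taken from a09.

```c
#include <assert.h>
#include <stdbool.h>

/* 6809 register codes as they appear in each nibble of the TFR/EXG
   postbyte.  Bit 3 is clear for 16-bit registers, set for 8-bit ones. */
enum reg {
    REG_D = 0x0, REG_X = 0x1, REG_Y  = 0x2, REG_U  = 0x3,
    REG_S = 0x4, REG_PC = 0x5,
    REG_A = 0x8, REG_B = 0x9, REG_CC = 0xA, REG_DP = 0xB
};

/* True when one register is 8 bits wide and the other is 16. */
bool tfr_mixed_size(enum reg src, enum reg dst)
{
    assert(src <= REG_DP && dst <= REG_DP);
    return (((unsigned)src ^ (unsigned)dst) & 0x8u) != 0;
}

/* Postbyte for TFR/EXG: source in the high nibble, destination low. */
unsigned char tfr_postbyte(enum reg src, enum reg dst)
{
    assert(src <= REG_DP && dst <= REG_DP);
    return (unsigned char)(((unsigned)src << 4) | (unsigned)dst);
}
```

For the tfr a,x in the test file, tfr_mixed_size(REG_A, REG_X) is true and the postbyte would be $81.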

W0001

Internally, the assembler just truncates labels to 63 characters, but otherwise, it just keeps going.

W0003

This is related to the nature of a two-pass assembler and forward references. Here, I'm forcing the given offset to 5 bits (which doesn't take an additional byte of space, unlike an 8-bit offset (one additional byte) or a 16-bit offset (two additional bytes)), but the assembler has to assume it's okay on pass one. By the time pass two comes around, b16 is defined, but its value exceeds 5 bits (which covers -16 to 15, for the record). This warning just lets the user know the value doesn't fit into 5 bits.

W0004

Pretty much the same as W0003 except for an 8-bit value.
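
The range test behind both warnings is the same, just with a different width. A minimal sketch, assuming the offset is treated as a signed two's-complement value (fits_signed is a hypothetical name):

```c
#include <assert.h>
#include <stdbool.h>

/* True when value fits in a signed two's-complement field of the
   given width: -16..15 for 5 bits, -128..127 for 8 bits, and so on. */
bool fits_signed(long value, unsigned bits)
{
    long lo, hi;

    assert(bits >= 1 && bits <= 16);

    lo = -(1L << (bits - 1));
    hi = (1L << (bits - 1)) - 1;
    return value >= lo && value <= hi;
}
```

With the values from the test file, b5 (3) fits in 5 bits, b8 (25) needs 8 bits, and b16 ($8080) fits in neither, which is why forcing it to 5 bits draws a warning.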

W0005

Again, due to the nature of a two-pass assembler. This time, no hint is given to the size of the label, and on pass one, the assembler assumes the worst: a 16-bit value. Only on pass two does it have enough information to know it could be an 8-bit address, but by then it can't switch to an 8-bit address, as that would throw all the other addresses off (ask me how I know).

W0006

Similar to W0005, but for an offset that can fit in 5-bits.

W0007

Similar to W0006 but for an 8-bit value.
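
The logic behind W0005 through W0007 can be sketched for the offset case as follows. This is an illustration with hypothetical names, not the a09 implementation: pass one picks a size (worst case when the symbol is not yet defined), and pass two may only report that a smaller size would have worked, since changing it would move every later address.

```c
#include <assert.h>
#include <stdbool.h>

enum offset_size { OFF_5 = 5, OFF_8 = 8, OFF_16 = 16 };

/* Smallest signed offset size that can hold value. */
enum offset_size smallest_offset(long value)
{
    if (value >= -16 && value <= 15)
        return OFF_5;
    if (value >= -128 && value <= 127)
        return OFF_8;
    return OFF_16;
}

/* Pass one: an undefined (forward) reference gets the worst case. */
enum offset_size pass_one_size(bool defined, long value)
{
    return defined ? smallest_offset(value) : OFF_16;
}

/* Pass two: the size is frozen, so all we can do is report that a
   smaller encoding would have been enough. */
bool could_be_smaller(enum offset_size chosen, long value)
{
    assert(chosen == OFF_5 || chosen == OFF_8 || chosen == OFF_16);
    return smallest_offset(value) < chosen;
}
```

In the test file b5 and b8 are defined after their use, so pass one assumes 16 bits for both, and pass two then finds that 3 would fit in 5 bits and 25 in 8 bits.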

W0009

This time, the assembler has determined that the target falls within the range of an 8-bit relative branch, but it was given a 16-bit relative branch instruction. This can happen when refactoring shrinks the distance between the branch instruction and its target.
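
A sketch of the check, with hypothetical function names. On the 6809 a relative offset is measured from the address of the instruction that follows the branch, so the instruction size has to be taken into account. Testing the long branch's own offset against the 8-bit range is a conservative choice made for this example: if that offset fits, the shorter instruction's offset fits as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Relative offset of a branch located at addr and occupying size
   bytes, measured from the address of the next instruction.
   Wraparound at the 64K boundary is ignored in this sketch. */
long branch_offset(unsigned addr, unsigned size, unsigned target)
{
    assert(addr <= 0xFFFF && target <= 0xFFFF && size >= 2 && size <= 4);
    return (long)target - ((long)addr + (long)size);
}

/* True when a long branch has an offset that an 8-bit relative
   branch could also express. */
bool long_branch_could_be_short(unsigned addr, unsigned size,
                                unsigned target)
{
    long off = branch_offset(addr, size, target);
    return off >= -128 && off <= 127;
}
```

An LBSR is three bytes long, so one at $1000 that targets $1010 has an offset of 13 and would trigger the warning, while one that targets $1100 would not.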

W0011

One of the features of the 6809 is its support of indirect indexing. Instead of the index register pointing at the data, it points at the address of the data (in C parlance, LDA ,X is A = *X and LDA [,X] is A = **X). The 6809 doesn't support this mode for 5-bit offsets, but it does for 8-bit and 16-bit offsets. This is just a warning that you can't use a 5-bit offset for this. I'm on the fence about keeping or removing this, and I'm keeping it for now.

W0012

This detects when you branch to the following instruction, except if the instruction is BRN, which is “branch never” (or the long branch version, LBRN). The 6809 is unique among 8-bit CPUs in having such an instruction. And despite its apparent uselessness (why would you have a branch that is never taken?), it is useful for padding out timing loops when talking to hardware.
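
Since a relative offset is measured from the next instruction, a branch to the next location is simply one whose offset is zero. A sketch of the test, assuming lowercase mnemonics and a hypothetical function name:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True when a relative branch lands on the instruction right after
   it.  BRN and LBRN never branch, so they are exempt; they are
   typically there on purpose as timing padding. */
bool warn_branch_to_next(const char *mnemonic, long offset)
{
    assert(mnemonic != NULL);

    if (strcmp(mnemonic, "brn") == 0 || strcmp(mnemonic, "lbrn") == 0)
        return false;
    return offset == 0;
}
```

This matches the test file, where the BRA and LBRA are flagged and the BRN and LBRN right after them are not.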

W0002

The label wasn't referenced by any other code. And if the label is not referenced, why have the label in the first place? It could also mean an unused variable whose removal could save some space.
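
One way to get this information is to count references per symbol while assembling and scan the table at the end. The structure and function below are hypothetical and only meant to show the idea; a09 may track this differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol table entry: refs is bumped each time the
   symbol is used as an operand. */
struct symbol {
    const char *name;
    unsigned    value;
    unsigned    refs;
};

/* Collect pointers to every symbol that was never referenced.
   unused must have room for n entries.  Returns how many were found. */
size_t collect_unused(const struct symbol syms[], size_t n,
                      const struct symbol *unused[])
{
    size_t count = 0;

    assert(syms != NULL && unused != NULL);

    for (size_t i = 0; i < n; i++)
        if (syms[i].refs == 0)
            unused[count++] = &syms[i];
    return count;
}
```

Running this after pass two, once every operand has been seen, would report .start from the test file, since nothing ever refers to it.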

As you can see, most of the warnings are about code sequences that could be shorter, and I'm not aware of any assembler that gives such warnings. I could be wrong, but of the 6809 assemblers I've used, I haven't seen anything like this.

I also have a way to suppress a given warning (they're all enabled by default; I'm opinionated about this, and you're stuck with my opinion if you want to use this assembler).

So that's it for the unique features I have in my assembler. I don't expect many people to use this, but I don't care; I'm having fun developing it. And that's what counts.

[1] https://github.com/spc476/a09

[2] https://nasm.us/

[3] https://tlindner.macmess.org/?p=945
