Friday, 20 April 2012

In Which We Tighten Up The Graphics


I had occasion, the other day, to glance over some old code I wrote about four years ago, and there were bits of it that made me almost hang my head in shame and ask myself what the hell I'd been thinking when I created it. I had the same reaction as I looked over the dirty-row refresh code - most of it, I think, is OK, but there are a couple of bits that I really, really needed to revisit.

Just for starters, there was a stupid bit of duplication in the glyph-copy logic, where I had two loops and actually only needed one. There were a few places where I was doing work that really isn't necessary - like stripping the LSB off the column index, even though it will only ever be an even number - and a few places where I think I could make better use of registers without having to re-initialise them. Plus, of course, there were the dependency errors that needed fixing, and that nasty requirement for both index registers in the last loop - so there's a little bit of work to do here before I move on, I think.
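As a quick sanity check on the LSB point, here is a small Python sketch (not part of the routine). It assumes the column loop in the listing below, which starts at 38 (#$26) and steps down by two, and it guesses that the removed instruction was something like AND #$FE; the original instruction isn't shown in the post.

```python
# Column indices visited by the draw loop: 38, 36, ..., 2, 0
# (assumed from LDY #$26 and the two DEYs at the end of each pass).
columns = list(range(0x26, -1, -2))

# Clearing bit 0 (hypothetically AND #$FE on the 6502) leaves every
# even value untouched, so the masking step does no useful work.
unchanged = all((col & 0xFE) == col for col in columns)

print(len(columns), unchanged)
```

Every index the loop can produce is already even, so dropping the mask saves two cycles per character pair without changing behaviour.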

Here's the revised version of drawdrty, complete with corrections and some optimisations:

drawdrty SUBROUTINE
        LDY #$00            ; [2] set index for indirect in .checkrow
.nextrow
        DEC _DRAWROW        ; [5] ZP decrement draw row index
        BPL .checkrow       ; [3/2] check row unless we dropped below zero
        LDA #$19            ; [2] reset dirty row counter (#25)
        STA _DRAWROW        ; [3] ZP save it
        RTS                 ; [6]

; scan first attribute byte on each row looking for dirty row bit
.checkrow
        LDA _DRAWROW        ; [3] ZP get screen row
        ASL                 ; [2] multiply by 2 for table index
        TAX                 ; [2] set index
        LDA _ABUFTAB,X      ; [5] get attribute buffer address lo-byte
        STA _ATTRADDR       ; [3] ZP save for dirty row check address
        LDA _ABUFTAB+1,X    ; [5] get attribute buffer address hi-byte
        STA _ATTRADDR+1     ; [3] ZP save for dirty row check address
        LDA (_ATTRADDR),Y   ; [5] get attribute byte
        BPL .nextrow        ; [3/2] if bit 7 not set loop for next row
        AND #%01111111      ; [2] clear dirty row bit
        STA (_ATTRADDR),Y   ; [6] write attribute byte back to buffer

; get text buffer row address
        LDA _TBUFTAB,X      ; [4] get text buffer row address lo-byte
        STA _TEXTADDR       ; [3] ZP save for text buffer address
        LDA _TBUFTAB+1,X    ; [4] get text buffer row address hi-byte
        STA _TEXTADDR+1     ; [3] ZP save for text buffer address

; determine bitmap address for dirty row/column (columns always even)
        LDY #$26            ; [2] set column index (#38)
.nextpair
        STY _DRAWCOL        ; [3] ZP set target screen column
        LDA _DRAWROW        ; [3] ZP get draw row index
        CMP #$18            ; [2] is this line 25?
        BEQ .line25         ; [2/3] yep, skip stuff for lines 1-24
        ASL                 ; [2] multiply row by 8
        ASL                 ; [2]
        ASL                 ; [2]
        ADC _COLADDRS,Y     ; [4] add bitmap column address lo-byte
        STA _DRAWADDR       ; [3] ZP set bitmap draw address lo-byte
        LDA _COLADDRS+1,Y   ; [4] get bitmap column address hi-byte
        ADC #$00            ; [2] add Carry
        BNE .sethi          ; [3/3] can never be zero, always branch
.line25
        TYA                 ; [2] move column for multiply
        ASL                 ; [2] multiply for line 25 offset
        ASL                 ; [2]
        STA _DRAWADDR       ; [3] ZP set bitmap draw address lo-byte
        LDA #$00            ; [2] line 25 hi-byte is zero
.sethi
        STA _DRAWADDR+1     ; [3] ZP set bitmap draw address hi-byte

; determine glyph data addresses
        LDA #$10            ; [2] glyph address hi-byte (#$80, right-shifted 3 bits)
        STA _GLYPADD1+1     ; [3] ZP set glyph address hi-bytes
        STA _GLYPADD2+1     ; [3]
        LDX #$02            ; [2] glyph address offset (relies on _GLYPADD1 = _GLYPADD2+2)
.nextaddr
        LDA (_TEXTADDR),Y   ; [5] get ASCII char from text buffer
        ASL                 ; [2] shift upper 3 bits out...
        ROL _GLYPADD2+1,X   ; [6] ...and rotate into address hi-byte
        ASL                 ; [2]
        ROL _GLYPADD2+1,X   ; [6]
        ASL                 ; [2]
        ROL _GLYPADD2+1,X   ; [6]
        STA _GLYPADD2,X     ; [4] ZP set glyph address lo-byte
        INY                 ; [2] increment column for next glyph address
        DEX                 ; [2] decrement address offset
        DEX                 ; [2]
        BPL .nextaddr       ; [3/2] loop for second glyph

; copy glyph data to Zero Page
        LDY #$07            ; [2] glyph data bytes to copy
.nxtglyph
        LAX (_GLYPADD1),Y   ; [5] get glyph data byte (undocumented opcode, loads .A and .X)
        STX _GLYPDAT1,Y     ; [4] ZP store it
        LAX (_GLYPADD2),Y   ; [5] get glyph data byte
        STX _GLYPDAT2,Y     ; [4] ZP store it
        DEY                 ; [2] decrement byte counter
        BPL .nxtglyph       ; [3/2] loop for next byte

; apply underline, strikethrough and inverse-mode
        LDY _DRAWCOL        ; [3] ZP get target screen column
        LDA (_ATTRADDR),Y   ; [5] get attribute byte for the column pair
        BEQ .merge          ; [3/2] if attribute byte contains zero, skip to the merge
        LDY #$FF            ; [2] glyph underline/strikethrough value
        LSR                 ; [2] shift first glyph strikethrough bit out
        BCC .g2st           ; [3/2] skip if not set
        STY _GLYPDAT1+3     ; [3] ZP store it
.g2st
        LSR                 ; [2] shift second glyph strikethrough bit out
        BCC .g1ul           ; [3/2] skip if not set
        STY _GLYPDAT2+3     ; [3] ZP store it
.g1ul
        LSR                 ; [2] shift first glyph underline bit out
        BCC .g2ul           ; [3/2] skip if not set
        STY _GLYPDAT1+7     ; [3] ZP store it
.g2ul
        LSR                 ; [2] shift second glyph underline bit out
        BCC .invertg1       ; [3/2] skip if not set
        STY _GLYPDAT2+7     ; [3] ZP store it
.invertg1
        LSR                 ; [2] shift first glyph inverse-mode bit out
        BCC .invertg2       ; [3/2] skip if not set
        LDX #$07            ; [2] glyph data byte counter
        TAY                 ; [2] stash attribute byte in .Y
.g1inv
        LDA _GLYPDAT1,X     ; [4] ZP get glyph data byte
        EOR #$FF            ; [2] invert it
        STA _GLYPDAT1,X     ; [4] ZP set glyph data byte
        DEX                 ; [2] decrement byte counter
        BPL .g1inv          ; [3/2] loop for next byte
        TYA                 ; [2] get attribute byte back
.invertg2
        LSR                 ; [2] shift second glyph inverse-mode bit out
        BCC .merge          ; [3/2] skip if not set
        LDX #$07            ; [2] glyph data byte counter
.g2inv
        LDA _GLYPDAT2,X     ; [4] ZP get glyph data byte
        EOR #$FF            ; [2] invert it
        STA _GLYPDAT2,X     ; [4] ZP set glyph data byte
        DEX                 ; [2] decrement byte counter
        BPL .g2inv          ; [3/2] loop for next byte

; merge and copy glyph data to bitmap
.merge
        LDY #$07            ; [2] glyph data byte counter
.nextbyte
        LDA _GLYPDAT1,Y     ; [4] get first glyph data byte
        AND #$F0            ; [2] mask off lower nybble
        STA _GLYPDAT1,Y     ; [5] set first glyph data byte
        LDA _GLYPDAT2,Y     ; [4] get second glyph data byte
        AND #$0F            ; [2] mask off upper nybble
        ORA _GLYPDAT1,Y     ; [4] merge glyph data bytes
        STA (_DRAWADDR),Y   ; [6] write it to bitmap
        DEY                 ; [2] decrement byte counter
        BPL .nextbyte       ; [3/2] loop until all glyph bytes merged and written
        LDY _DRAWCOL        ; [3] ZP get screen column
        DEY                 ; [2] decrement for both columns we just drew
        DEY                 ; [2]
        BMI .alldone        ; [2/3] exit when done
        JMP .nextpair       ; [3] loop for next character pair
.alldone
        RTS                 ; [6]
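
For anyone who finds the bit-shuffling easier to follow in a high-level language, the per-pair work in the listing can be modelled in Python. This is only a sketch for checking the arithmetic, not part of the routine: the function names are invented for illustration, the attribute bit layout is read off the order of the LSRs in the listing, and the $8000 glyph base is simply what the #$10 seed works out to after three rotates.

```python
GLYPH_HI_SEED = 0x10  # the LDA #$10 seed: $80 right-shifted three bits


def glyph_address(char_code):
    """Mimic the three ASL / ROL pairs in .nextaddr for one character."""
    lo = char_code & 0xFF
    hi = GLYPH_HI_SEED
    for _ in range(3):
        carry = lo >> 7                  # bit shifted out of .A by ASL
        lo = (lo << 1) & 0xFF
        hi = ((hi << 1) | carry) & 0xFF  # ROL pulls that bit into the hi-byte
    return (hi << 8) | lo


def apply_attributes(g1, g2, attr):
    """Apply strikethrough, underline and inverse in the listing's order."""
    g1, g2 = list(g1), list(g2)
    if attr & 0x01:
        g1[3] = 0xFF                     # first glyph strikethrough (row 3)
    if attr & 0x02:
        g2[3] = 0xFF                     # second glyph strikethrough
    if attr & 0x04:
        g1[7] = 0xFF                     # first glyph underline (row 7)
    if attr & 0x08:
        g2[7] = 0xFF                     # second glyph underline
    if attr & 0x10:
        g1 = [b ^ 0xFF for b in g1]      # first glyph inverse-mode
    if attr & 0x20:
        g2 = [b ^ 0xFF for b in g2]      # second glyph inverse-mode
    return g1, g2


def merge_pair(g1, g2):
    """Upper nybble from the first glyph, lower nybble from the second."""
    return [(a & 0xF0) | (b & 0x0F) for a, b in zip(g1, g2)]


if __name__ == "__main__":
    print(hex(glyph_address(0x41)))      # 0x8208
    left, right = apply_attributes([0xAA] * 8, [0x55] * 8, 0x00)
    print([hex(b) for b in merge_pair(left, right)])
```

The shift-and-rotate sequence comes out as base + 8 * character code for every possible byte, which is why a single seed value can be shared by both glyph pointers. Note also that inversion runs after the underline and strikethrough rows are set, so an inverted, underlined glyph ends up with an empty bottom row.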

It's a few bytes shorter, a few cycles faster, and fixes all the issues. It's still not quite as compact as I think it can be, as there are a couple of places where I'm sure I can restructure it, but for the moment it'll do. I was intending to write a high-level string-plotting routine to wrap up the mechanics of printing stuff to the screen, but I've had an idea for a profiler using a VIA timer, so I'm going to investigate that first...
