Friday, 13 April 2012

In Which The Screen Refresh Is Working


I confess I seriously underestimated the complexity of the work needed to get a refreshable bitmapped text screen working - it seemed pretty simple in principle, and I suppose it is, but it took a lot of careful thought and design to make it work in a way which didn't overwhelm the CPU but still gave a satisfactory user experience. It's up and running now, and I'm fairly pleased with it - I think it might be some of the most intricate 6502 code I've ever written. There might still be room for improvement, as there are a couple of places where I'm sure I can either reduce the code size by a few bytes or speed things up by a few cycles, but overall I'm happy with it.

I've added the logic to handle the three 'effects' bits too - I was calling them Control bytes, but I realised they didn't really 'control' anything, they were simply flags to indicate that a particular attribute should be rendered over the glyph as it was plotted into the screen bitmap. Accordingly, the Control Buffer has been renamed the Attribute Buffer, and simultaneously cut in half - since I now process characters in pairs, and each attribute byte only uses 3 bits (discounting the first byte in a row which also holds the dirty-row bit) I combined pairs of attributes into a single byte. This made it easier and cheaper to process them during the line-refresh drawing stage because I only needed to read a single byte for the two glyphs I was rendering, and also meant that the Attribute Buffer dropped from 1000 bytes to 500. Nice.
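As a rough illustration of the packing, here is a minimal Python sketch. The bit order is inferred from the order in which the drawing code further down shifts bits out of the attribute byte; the function and constant names are made up for this example and are not part of the project.

```python
# Bit layout inferred from the refresh routine's LSR sequence (an assumption):
#   bit 0/1 = strike-through (first/second glyph of the pair)
#   bit 2/3 = underline      (first/second glyph)
#   bit 4/5 = inverse-mode   (first/second glyph)
#   bit 7   = dirty-row flag, only meaningful in the first byte of a row
FIRST_GLYPH_BIT = {"strike": 0, "underline": 2, "inverse": 4}
DIRTY_ROW = 0x80

def pack_attributes(first, second, dirty=False):
    """Pack the attribute names for two adjacent characters into one byte."""
    value = DIRTY_ROW if dirty else 0
    for name in first:
        value |= 1 << FIRST_GLYPH_BIT[name]
    for name in second:
        value |= 1 << (FIRST_GLYPH_BIT[name] + 1)
    return value

# 25 rows of 40 characters, one attribute byte per character pair.
ROWS, COLUMNS = 25, 40
ATTRIBUTE_BUFFER_SIZE = ROWS * COLUMNS // 2
```

With one byte per pair, the buffer size works out to the 500 bytes mentioned above.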

I also revisited the mechanism which renders the attributes (strike-through, underline and inverse-mode) because I saw a faster, smarter way to do it. Originally I had created specific glyphs in the character set which I'd intended to mask into the target glyph row-by-row, but I realised that for strike-through and underline I'd actually be wasting cycles on seven data rows out of eight - I only need to replace a line in the glyph with a row of 'on' pixels at row 4 and/or row 8 (depending on which attributes were selected) and I could do that in the code itself. Equally, inverse-mode became a simple matter of applying EOR #$FF to the glyph data just before plotting it, so again that was a task better accomplished as bespoke code.

Here then is the finished version of the whole 'find-a-dirty-row-and-refresh-it' routine. If you look carefully, you'll see that the chunk of code I talked about before - which used an undocumented instruction - has actually been replaced with the other version I described. When I finalised the logic and optimised it, the LAX version turned out to be about 25% slower overall - not because of LAX itself, but because I needed to use the X-register as a Zero Page offset, and so I had to add extra code to swap data between registers. In the end, the other version of the routine was a better fit and worked faster in a two-iteration loop, so it won the day. Even so, LAX is still in there as part of the code that copies the glyph data to Zero Page - it neatly avoids the requirement for two index registers to do ZP and Indirect addressing in the same loop.

So here's the beginning, where we tickle a ZP counter so that we remember where we got to the next time the routine is called from the IRQ handler:

drawdrty SUBROUTINE
    LDY #$00               ; [2]   set index for indirect
.nextrow
    DEC _DRAWROW           ; [5]   ZP decrement draw row index
    BPL .checkrow          ; [3/2] check row unless we dropped below zero
    LDA #$19               ; [2]   reset dirty row counter (#25)
    STA _DRAWROW           ; [3]   ZP save it
    RTS                    ; [6]

Pretty simple, right? Decrement the counter, and go check the row if it's positive. If it drops below zero, reset it and exit - so that next time we'll restart at the end of the screen. Let's look at .checkrow next:

.checkrow
    LDA _DRAWROW           ; [3]   ZP get screen row
    ASL                    ; [2]   multiply by 2 for table index
    TAX                    ; [2]   set index
    LDA _ABUFTAB,X         ; [4]   get attribute buffer address lo-byte
    STA _DRAWADDR          ; [3]   ZP save for dirty row check address
    LDA _ABUFTAB+1,X       ; [4]   get attribute buffer address hi-byte
    STA _DRAWADDR+1        ; [3]   ZP save for dirty row check address
    LDA (_DRAWADDR),Y      ; [5]   get attribute byte
    BPL .nextrow           ; [3/2] if bit 7 not set loop for next row
    AND #%01111111         ; [2]   clear dirty row bit
    STA (_DRAWADDR),Y      ; [6]   write attribute byte back to buffer

Here we're looking at whatever row the counter is pointing at, and using it as an index into the Attribute Buffer table to get the address for the first Attribute Byte of the row. Because we use bit 7 of that first byte as the dirty-row marker, we can just do a simple positivity-test on it - if the byte is positive, bit 7 is clear and the row isn't dirty (so we skip back to the top of the routine to hit the counter again). Conversely, if the byte is negative then bit 7 is set, which means we've found a dirty row - so we clear the bit, and fall through into the exciting stuff...
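The scan-and-clear logic can be modelled in a few lines of Python. This is only a sketch of the control flow described above: `state["draw_row"]` stands in for _DRAWROW, `attr_rows[r][0]` stands in for the first Attribute Byte of row r, and the function name is invented for the example.

```python
ROWS = 25
DIRTY = 0x80

def find_dirty_row(state, attr_rows):
    """Return the next dirty row (clearing its flag), or None when the scan wraps."""
    while True:
        state["draw_row"] -= 1            # DEC _DRAWROW
        if state["draw_row"] < 0:         # BPL not taken: dropped below zero
            state["draw_row"] = ROWS      # reset counter, give up until the next call
            return None
        row = state["draw_row"]
        if attr_rows[row][0] & DIRTY:     # bit 7 set, so the byte tests as negative
            attr_rows[row][0] &= 0x7F     # clear the dirty-row bit
            return row                    # this is the row that gets redrawn
```

Because the counter persists between calls, each call resumes the scan just above the last row it drew, and at most one row is redrawn per call.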

    LDY #$27               ; [2]   set indirect index (#39)
.nextpair
    STY _DRAWCOL           ; [3]   ZP set screen column
    TYA                    ; [2]   get target column in .A
    AND #%11111110         ; [2]   mask LSB off for table index
    TAX                    ; [2]   stash column index in .X
    LDA _DRAWROW           ; [3]   ZP get target row number
    CMP #$18               ; [2]   is this line 25?
    BEQ .line25            ; [2/3] yep, skip stuff for lines 1-24
    ASL                    ; [2]   multiply row by 8
    ASL                    ; [2]
    ASL                    ; [2]
    ADC _COLADDRS,X        ; [4]   add bitmap column address lo-byte
    STA _DRAWADDR          ; [3]   ZP set bitmap draw address lo-byte
    LDA _COLADDRS+1,X      ; [4]   get bitmap column address hi-byte
    ADC #$00               ; [2]   add Carry
    BNE .done              ; [3/3] can never be zero, always branch
.line25
    TXA                    ; [2]   get column index
    ASL                    ; [2]   multiply column index for line 25 offset
    ASL                    ; [2]
    STA _DRAWADDR          ; [3]   ZP set bitmap draw address lo-byte
    LDA #$00               ; [2]
.done
    STA _DRAWADDR+1        ; [3]   ZP set bitmap draw address hi-byte

Remember this? It's the routine I wrote a couple of weeks or so back which takes a text row/column pair and turns it into the equivalent address in the bitmap; we don't need this anywhere else (at least, not yet) so it's now inlined here - having established that there's a dirty row, this tells us where in the bitmap we're going to be redrawing stuff. Now we have to figure out what we're going to draw:
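For anyone who prefers to see the address arithmetic without the register juggling, here is a hedged Python model of it. The `col_addrs` argument is a stand-in for the _COLADDRS table, whose real contents aren't shown in this post, so any values passed in are assumptions.

```python
def bitmap_address(row, column, col_addrs):
    """Model of the inlined row/column to bitmap-address calculation.

    col_addrs stands in for _COLADDRS: one 16-bit base address per
    character pair (20 entries for a 40-column row).
    """
    pair = column >> 1                  # two characters share one bitmap byte
    if row == 24:                       # line 25 takes the special-case path
        return (pair * 2) * 4           # TXA / ASL / ASL, hi-byte forced to $00
    return col_addrs[pair] + row * 8    # eight bitmap bytes per text row
```

The 16-bit addition in the second branch is what the ADC / ADC #$00 pair does across the lo-byte and hi-byte.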

    LDX #$02               ; [2]   glyph address offset
.nextaddr
    LDA #$10               ; [2]   glyph address hi-byte (#$80, right-shifted 3 bits)
    STA _GLYPADD1+1,X      ; [4]   ZP set glyph address hi-byte
    LDA (_TBUFADDR),Y      ; [5]   get ASCII char from text buffer
    ASL                    ; [2]   shift upper 3 bits out...
    ROL _GLYPADD1+1,X      ; [6]   ...and rotate into address hi-byte
    ASL                    ; [2]
    ROL _GLYPADD1+1,X      ; [6]
    ASL                    ; [2]
    ROL _GLYPADD1+1,X      ; [6]
    STA _GLYPADD1,X        ; [4]   ZP set glyph address lo-byte
    DEY                    ; [2]   decrement column for next glyph address
    DEX                    ; [2]   decrement address offset
    DEX                    ; [2]
    BPL .nextaddr          ; [3/2] loop for second glyph

OK, this might be starting to seem a little complicated now. What this bit of code does is read a pair of ASCII characters from the Text Buffer and then use their values to generate two addresses into the Character Generator ROM at $8000, each pointing to the first of the eight pixel-data bytes for its character. Interestingly, as I'm explaining this, I've just noticed a glaring error that's going to bite me hard as soon as I start doing lots of screen updates from other places - the routine relies on _TBUFADDR as the pointer into the Text Buffer for where to get the ASCII characters from, but this is set up and used by the routine which copies data into the buffer - and it could have changed at any point between when the dirty-row bit was set and when we read it here. I'm going to have to modify this so that we derive the Text Buffer address independently rather than using this pointer. Argh. Anyway, once we have the two glyph data addresses...
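The shift-and-rotate trick amounts to computing $8000 + (character code x 8). A small Python sketch that mimics the ASL/ROL sequence step by step makes that easy to check; the function name is illustrative only.

```python
CHARGEN_BASE = 0x8000   # Character Generator ROM base, as stated above

def glyph_address(char_code):
    """Mimic the three ASL/ROL steps that build a glyph data address."""
    hi = CHARGEN_BASE >> 11               # $10: ROM hi-byte pre-shifted right 3 bits
    lo = char_code & 0xFF
    for _ in range(3):
        carry = lo >> 7                   # ASL pushes the top bit of .A into Carry
        lo = (lo << 1) & 0xFF
        hi = ((hi << 1) | carry) & 0xFF   # ROL pulls Carry into the hi-byte
    return (hi << 8) | lo
```

Seeding the hi-byte with $10 means the three rotates both restore the $80 page and fold in the three bits shifted out of the character code, so no separate addition is needed.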

    LDA _CURSORR           ; [3]   ZP get cursor row
    ASL                    ; [2]   multiply by 2 for table index
    TAX                    ; [2]   set index
    LDA _ABUFTAB,X         ; [4]   get attribute buffer row address lo-byte
    STA _ATTRADDR          ; [3]   ZP save for attribute buffer address
    LDA _ABUFTAB+1,X       ; [4]   get attribute buffer row address hi-byte
    STA _ATTRADDR+1        ; [3]   ZP save for attribute buffer address
    LDY #$00               ; [2]   set index for indirect
    LDA (_ATTRADDR),Y      ; [5]   get attribute byte
    STA _ATTRBYTE          ; [3]   ZP save it

...we can get the corresponding Attribute Byte from the Attribute Buffer, ready for later when we want to apply these attributes to the glyphs as we draw them. Again, argh and double-argh, I've made an assumption about where the screen cursor position is, thinking that it would be wherever we just added text to the screen - but of course lots of other things might have happened to the screen since then, and by the time we get here the cursor position might be anywhere. So a little surgery is going to be needed here too, to eliminate that dependency. Nevertheless, we can now copy the glyph pixel-data to Zero Page so that we can work on it:

    LDY #$07               ; [2]   glyph data bytes to copy
.nxtglyp1
    LAX (_GLYPADD1),Y      ; [5]   get glyph data byte (undocumented opcode, loads .A and .X)
    STX _GLYPDAT1,Y        ; [4]   ZP store it
    DEY                    ; [2]   decrement byte counter
    BPL .nxtglyp1          ; [3/2] loop for next byte
    LDY #$07               ; [2]   glyph data bytes to copy
.nxtglyp2
    LAX (_GLYPADD2),Y      ; [5]   get glyph data byte (undocumented opcode, loads .A and .X)
    STX _GLYPDAT2,Y        ; [4]   ZP store it
    DEY                    ; [2]   decrement byte counter
    BPL .nxtglyp2          ; [3/2] loop for next byte

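And there's our new best friend LAX, who comes to the rescue: because it loads the X-register as well as the accumulator from an Indirect,Y address, the byte can go straight back out with STX using Zero Page,Y addressing (STA has no Y-indexed Zero Page mode), so the Y-register serves as the index for both the read and the write - how cool is that? This is a dead-simple bit of code that just copies the 16 glyph data bytes down into ZP so that we can work on them using the fast addressing mode, thus: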

    LDA _ATTRBYTE          ; [3]   ZP get attribute byte
    LSR                    ; [2]   shift first glyph strikethrough bit out
    BCC .g2st              ; [3/2] skip if not set
    STY _GLYPDAT1+3        ; [3]   ZP set row to $FF (.Y is $FF after the copy loop)
.g2st
    LSR                    ; [2]   shift second glyph strikethrough bit out
    BCC .g1ul              ; [3/2] skip if not set
    STY _GLYPDAT2+3        ; [3]   ZP set row to $FF
.g1ul
    LSR                    ; [2]   shift first glyph underline bit out
    BCC .g2ul              ; [3/2] skip if not set
    STY _GLYPDAT1+7        ; [3]   ZP set row to $FF
.g2ul
    LSR                    ; [2]   shift second glyph underline bit out
    BCC .invertg1          ; [3/2] skip if not set
    STY _GLYPDAT2+7        ; [3]   ZP set row to $FF
.invertg1
    LSR                    ; [2]   shift first glyph inverse-mode bit out
    BCC .invertg2          ; [3/2] skip if not set
    LDX #$07               ; [2]   glyph data byte counter
    TAY                    ; [2]   stash attribute byte in .Y
.g1inv
    LDA _GLYPDAT1,X        ; [4]   ZP get glyph data byte
    EOR #$FF               ; [2]   invert it
    STA _GLYPDAT1,X        ; [4]   ZP set glyph data byte
    DEX                    ; [2]   decrement byte counter
    BPL .g1inv             ; [3/2] loop for next byte
    TYA                    ; [2]   get attribute byte back
.invertg2
    LSR                    ; [2]   shift second glyph inverse-mode bit out
    BCC .merge             ; [3/2] skip if not set
    LDX #$07               ; [2]   glyph data byte counter
.g2inv
    LDA _GLYPDAT2,X        ; [4]   ZP get glyph data byte
    EOR #$FF               ; [2]   invert it
    STA _GLYPDAT2,X        ; [4]   ZP set glyph data byte
    DEX                    ; [2]   decrement byte counter
    BPL .g2inv             ; [3/2] loop for next byte

So here we just LSR each attribute bit out of the byte, and if it's set we apply the appropriate embellishment to the glyph. For underline and strike-through, that's just a solid row of pixels written over the glyph in the relevant place (data byte 3 for strike-through and data byte 7 for underline, counting from zero - in other words rows 4 and 8 of the glyph). The STY instructions can do this because the Y-register is left holding $FF when the copy loops finish. The last two attribute bits for inverse-mode trigger a post-processing loop across each glyph as required, zapping each byte with an EOR #$FF to invert it. The code is additive, so you can apply all three attributes if you want (or any two, or just one) and it all still works. Which means the only thing left to do at this point is merge the two glyphs and copy them into the bitmap so we can see something:
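The effect of the attribute pass can be sketched in Python as follows. This is a simplified model rather than a translation of the routine: it tests the bits with masks instead of shifting them out, and the function name is invented for the example.

```python
def apply_attributes(glyph1, glyph2, attr):
    """Return copies of two 8-byte glyphs with the attribute bits applied."""
    g1, g2 = list(glyph1), list(glyph2)
    if attr & 0x01:
        g1[3] = 0xFF                    # strike-through, first glyph
    if attr & 0x02:
        g2[3] = 0xFF                    # strike-through, second glyph
    if attr & 0x04:
        g1[7] = 0xFF                    # underline, first glyph
    if attr & 0x08:
        g2[7] = 0xFF                    # underline, second glyph
    if attr & 0x10:
        g1 = [b ^ 0xFF for b in g1]     # inverse-mode, first glyph
    if attr & 0x20:
        g2 = [b ^ 0xFF for b in g2]     # inverse-mode, second glyph
    return g1, g2
```

Because the inversion runs last, an underlined or struck-through glyph in inverse-mode ends up with a dark line on a lit background, which is why the attributes combine cleanly.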

.merge
    LDX #$07               ; [2]   glyph data byte counter
    LDY #$07               ; [2]   glyph data byte counter
.nextbyte
    LDA _GLYPDAT1,X        ; [4]   ZP get glyph data byte
    ASL _GLYPDAT2,X        ; [6]   ZP shift four bits from glyph 2...
    ROL                    ; [2]   ...and rotate into glyph 1
    ASL _GLYPDAT2,X        ; [6]
    ROL                    ; [2]
    ASL _GLYPDAT2,X        ; [6]
    ROL                    ; [2]
    ASL _GLYPDAT2,X        ; [6]
    ROL                    ; [2]
    STA (_DRAWADDR),Y      ; [6]   write it to bitmap
    DEX                    ; [2]   decrement for next glyph byte
    DEY                    ; [2]
    BPL .nextbyte          ; [3/2] loop until all glyph bytes merged and written
    LDY _DRAWCOL           ; [3]   ZP get screen column
    DEY                    ; [2]   decrement indirect index for both columns
    DEY                    ; [2]
    BMI .alldone           ; [2/3] exit when done
    JMP .nextpair          ; [3]   loop for next character pair
.alldone
    RTS                    ; [6]

The two glyphs are merged by simply rotating four bits of one glyph byte into the other (each byte contains two copies of the glyph, so we don't have to do any masking) and then the composite result is dropped into the bitmap. We then decrement the column counter by two because we've just processed two characters from the Text Buffer, and if we haven't got to the start of the line yet we jump back to do it all again. I'm not entirely happy with having to use both the X- and Y-registers in the merge/draw loop, so I'll be looking for a way to improve that - but it works as intended, even if it feels a little clunky.
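Stripped of the shifting, the merge is just a nibble splice. Here is a minimal Python sketch of it, assuming (as described above) that every glyph data byte carries the same pixel pattern in both its upper and lower four bits; the function name is hypothetical.

```python
def merge_pair(glyph1, glyph2):
    """Combine two 8-byte glyphs into the 8 bitmap bytes for a character pair.

    Four ASL/ROL steps leave the low nibble of glyph 1 in the upper half of
    the result and the high nibble of glyph 2 in the lower half.
    """
    return [((left << 4) & 0xF0) | (right >> 4)
            for left, right in zip(glyph1, glyph2)]
```

Since both nibbles of a source byte are identical, it doesn't matter which half of each glyph survives the shift, and that is what lets the assembly version skip any AND masking.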

I'm going to go away and fix those two dependency errors, and then I'm taking the weekend off before returning to tackle the next item - the high-level 'print this' routine which will let me put text anywhere on the screen using an x/y co-ordinate, and will result in proper start-up messages appearing.
