Wow, Russian Dolls indeed. I've been furiously cutting code to get the foundations of the whole screen-editor concept up-and-running, and it's been a blast - let's start with the simple act of pasting a string into the text buffer. The buffer exists as two nearly-consecutive 1000-byte areas starting at $0400 and $0800, describing a 40x25 Data area and a 40x25 Control area - there's a 24-byte gap between them just because it makes things really simple when calculating the Control Byte buffer address; since the buffers are exactly 1024 bytes ($0400) apart, turning the Data address into a Control address needs nothing more than adding $04 to the address hi-byte (an ASL on the hi-byte gives the same result only for the first page, where $04 becomes $08). The Data area holds the actual text characters which get plotted on the screen bitmap, and the Control area holds the customisation flags for how the text should look (at the moment this just supports inverse mode, underline and strike-through). So we need a memcopy routine to splat a string into the text screen buffer Data area, and a setcntl routine to set the Control byte(s).
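To make the address arithmetic concrete: the two areas are $0400 bytes apart, so a Data address in $0400-$07E7 maps to its Control twin by adding $04 to the hi-byte and leaving the lo-byte alone. Here's a minimal sketch of that step. The routine name setctlad and the zero-page pointer _CTLADDR are hypothetical names invented for illustration; _BUFADDR is the zero-page Data pointer that settxtad fills in further down.

```asm
; Sketch only: setctlad and _CTLADDR are hypothetical names
setctlad SUBROUTINE
    LDA _BUFADDR     ; [3] ZP lo-byte is the same in both buffers
    STA _CTLADDR     ; [3] ZP save Control address lo-byte
    LDA _BUFADDR+1   ; [3] ZP Data hi-byte, $04-$07
    CLC              ; [2] make sure Carry doesn't sneak in
    ADC #$04         ; [2] +$0400 gives Control hi-byte, $08-$0B
    STA _CTLADDR+1   ; [3] ZP save Control address hi-byte
    RTS              ; [6]
```

For comparison, a plain ASL would map hi-bytes $04, $05, $06, $07 to $08, $0A, $0C, $0E, so it only lands in the right place for the first 256 bytes of the buffer.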
Now memcopy will need to know the address in the _TXTBUFFD Data buffer for where to insert the string, so we'll just use the current cursor row/column position (even though we don't actually have a cursor on-screen yet) and call a new routine called settxtad which does a similar thing to setbitad, except that instead of doing a moderately complex bit of processing to find the bitmap draw address, we just convert the cursor position into a simple linear text buffer address with the help of the row address table at _BUFADDRS:
settxtad SUBROUTINE
    LDA _CURSORR ; [3] ZP get cursor row
    ASL ; [2] multiply by 2 for table index (leaves Carry clear, as row < 128)
    TAX ; [2] set index
    LDA _BUFADDRS,X ; [4] get buffer row address lo-byte
    ADC _CURSORC ; [3] ZP add cursor column (no CLC needed, see ASL above)
    STA _BUFADDR ; [3] ZP save it
    LDA _BUFADDRS+1,X ; [4] get buffer row address hi-byte
    ADC #$00 ; [2] add Carry
    STA _BUFADDR+1 ; [3] ZP save it
    RTS ; [6]
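For reference, the row address table that settxtad indexes is just 25 little-endian words, one per text row, each 40 bytes on from the last. The table itself isn't listed in this post, so treat the following as a sketch of what it presumably contains, assuming the Data area starts at $0400:

```asm
; Sketch: one word per text row, value = $0400 + (row * 40)
_BUFADDRS
    .word $0400,$0428,$0450,$0478,$04A0 ; rows 0-4
    .word $04C8,$04F0,$0518,$0540,$0568 ; rows 5-9
    .word $0590,$05B8,$05E0,$0608,$0630 ; rows 10-14
    .word $0658,$0680,$06A8,$06D0,$06F8 ; rows 15-19
    .word $0720,$0748,$0770,$0798,$07C0 ; rows 20-24
```

With that table, a cursor at row 12, column 5 gives a table index of 24, a row address of $05E0, and a final _BUFADDR of $05E5.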
Dead simple, eh? Thus, given a row/column coordinate, we can now find the address in the Data buffer at which to insert* the string. We don't need a separate lookup for the Control buffer address, because the two buffers sit exactly $0400 bytes apart: it's simply the Data address with $04 added to the hi-byte (a one-bit left-shift of the hi-byte only gives the right answer on the first page, where $04 becomes $08). We also need a little bit of code to set the 'dirty' row bit, because that's what tells the OS that something in the text buffer changed and so needs a redraw - we'll plug a routine into the IRQ handler to check for dirty bits later. I've already added a code fragment to the clrscrn routine so that in addition to zeroing the bitmap it now also nukes the text buffer, control buffer and dirty-row bytes. Setting the dirty-row bits is also pretty simple - the row whose bit should be set is passed in the Accumulator to setdrty:
setdrty SUBROUTINE
    TAY ; [2] move row number to .Y for later
    LSR ; [2] divide row by 8 for dirty bitmap byte
    LSR ; [2]
    LSR ; [2]
    TAX ; [2] move to .X
    TYA ; [2] get row number from .Y
    AND #$07 ; [2] mask-off byte number leaving bit number
    TAY ; [2] move to .Y
    LDA _DRTYROW0,X ; [4] ZP get dirty row bitmap byte
    ORA _MASKTAB,Y ; [4] set appropriate row bit
    STA _DRTYROW0,X ; [4] ZP set dirty row bitmap byte
    RTS ; [6]
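The _MASKTAB lookup that setdrty leans on is nothing more than eight single-bit patterns. The bit ordering below (entry n has bit n set) is an assumption, since the table itself isn't listed here:

```asm
; Sketch: assumed ordering, entry n = 1 << n
_MASKTAB
    .byte %00000001 ; bit 0
    .byte %00000010 ; bit 1
    .byte %00000100 ; bit 2
    .byte %00001000 ; bit 3
    .byte %00010000 ; bit 4
    .byte %00100000 ; bit 5
    .byte %01000000 ; bit 6
    .byte %10000000 ; bit 7
```

Under that assumption, flagging row 19 works out as byte index 2 (19 divided by 8) and bit number 3 (19 AND 7), so setdrty ORs %00001000 into _DRTYROW0+2.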
There are several ways to decide which bit in the byte to set, but for speed I just defined a table called _MASKTAB which holds eight binary patterns corresponding to each bit being on, with all others off. I could of course construct the appropriate mask on-the-fly, but we're in an area of code that absolutely has to be as fast as possible (because we don't want any unnecessary lag when doing screen updates) so I consider the cost of the 8-byte table to be a fair exchange for the simplicity and speed of the lookup. If you happen to know a neat way of doing it in fewer cycles, by all means let me know! Oh, here's the generic memcopy routine, by the way - it copies 'n'-byte chunks of memory from point to point, to a maximum of 256 bytes. Note that the Y register carries the length minus one, because the loop copies from offset .Y down to offset zero inclusive:
memcopy SUBROUTINE
    LDA (_STRADDR),Y ; [5] get byte from source
    STA (_BUFADDR),Y ; [6] store it at destination
    DEY ; [2] decrement index
    CPY #$FF ; [2] decremented past zero?
    BNE memcopy ; [3/2] loop for next byte
    RTS ; [6]
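Putting the pieces so far together, pasting a string might look something like the sketch below. The routine name pastestr, the hello label and the choice of row 3, column 10 are all invented for the example, and it assumes the string is already stored in whatever character encoding the Data area expects.

```asm
; Sketch only: pastestr and hello are hypothetical names
pastestr SUBROUTINE
    LDA #3           ; pretend the cursor is at row 3...
    STA _CURSORR
    LDA #10          ; ...column 10
    STA _CURSORC
    JSR settxtad     ; _BUFADDR = row 3 address + 10
    LDA #<hello      ; point _STRADDR at the source string
    STA _STRADDR
    LDA #>hello
    STA _STRADDR+1
    LDY #4           ; five characters, so .Y = length - 1
    JSR memcopy      ; copy the string into the Data area
    LDA _CURSORR     ; row number for setdrty
    JMP setdrty      ; flag the row, return via setdrty's RTS
hello
    .byte "HELLO"
```

The cursor row is reloaded before the jump to setdrty because memcopy leaves the last copied character in the Accumulator.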
So, we can now insert a string into the buffer and set the appropriate bit in the dirty-row bitmap - what about those entries in the Control table though? This is pretty easy to get working, as we can just write a little bit of code to take the Data Buffer address, add $04 to its hi-byte to yield the Control Buffer address (the buffers being $0400 bytes apart), and slap the appropriate bit-pattern into each byte corresponding to the string we just placed in the Data Buffer. Therefore all that's left to write at this stage is the line-redraw routine, which will be called from the ISR by a little bit of code that checks for dirty-row bits. Actually, this 'little bit of code' turned out to be quite complicated and I'm having a think about a different way to do it. Here's what it looks like today:
finddrty SUBROUTINE
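    LDX #$03 ; [2] byte index
    LDY #$07 ; [2] bit counter (starts at 7 so row 24, which setdrty puts in bit 0 of byte 3, gets tested; assumes _MASKTAB,Y has bit Y set)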
.nextbyte
    LDA _DRTYROW0,X ; [4] ZP get dirty row bitmap byte
    BEQ .skipbyte ; [3/2] don't bother bit-shifting if zero
.nextbit
    ASL ; [2] shift bit 7 to Carry
    BCC .skipdrty ; [3/2] not set, skip processing of dirty row
    PHA ; [3] stack bitmap byte for later
    TYA ; [2] move bit counter to .A
    PHA ; [3] stack it for later
    TXA ; [2] move byte index to .A
    PHA ; [3] stack it for later
    ASL ; [2] multiply byte index by 8
    ASL ; [2]
    ASL ; [2]
    STY _DRAWROW ; [3] ZP set bitmap draw row (intermediate)
    ADC _DRAWROW ; [3] ZP add bit counter
    STA _DRAWROW ; [3] ZP set bitmap draw row (final)
    JSR rowdraw ; [6] redraw the row
    PLA ; [4] get byte index from stack
    TAX ; [2] move back to .X
    PLA ; [4] get bit counter from stack
    TAY ; [2] move index back to .Y
    PLA ; [4] get bitmap byte from stack
    BEQ .skipbyte ; [3/2] don't bother bit-shifting if zero
.skipdrty
    DEY ; [2] decrement bit counter
    BPL .nextbit ; [3/2] loop for next bit
.skipbyte
    LDY #$07 ; [2] reset bit counter
    DEX ; [2] decrement byte index
    BPL .nextbyte ; [3/2] loop for next byte
    RTS ; [6]
It works, but just feels a bit ... clunky. I think it's because I have to stash registers on the Stack so that I can recover their values after doing the line refresh, although to be honest the mechanism for detecting dirty bits also gives me that not quite right sensation - you know, that feeling you get when you look at a perfectly serviceable bit of code and it just feels like there's a better way to do it...
So this routine will be called by the ISR on every IRQ trigger, but actually I'm not sure this is going to fly. If you remember, we have 22150 PAL cycles from one IRQ to the next (i.e. per frame) and fewer for NTSC, and what's worrying me is the possibility that finding dirty rows and refreshing them could easily blow that limit. From an aesthetic point of view this wouldn't be a problem - after all, the user is hardly likely to notice if the refresh takes two or three IRQs-worth of cycles to complete instead of one. However, if we do overrun the IRQ cycle limit, that means we miss an entire IRQ iteration (or indeed as many as it takes to do the refresh) and of course there are other things that happen in the ISR that would notice the missed executions - the clock update, for example, relies on the countdown being updated every IRQ, so if we miss one or two every now and then, the clock will drift. And there are likely to be other issues like that as the OS matures and more stuff gets plugged-in to the ISR logic.
I'm thinking about staggering dirty-row refreshes, so that the finddrty routine only processes one dirty row per IRQ instead of all that might be flagged, and thereby stays within the ISR cycle limit each time. In the worst-case scenario where all 25 rows are flagged as dirty, this would obviously take 25 IRQ iterations to process them all - so that's about half a second at 50 IRQs per second. But there's a downside to this, which is that the ISR might get, say, 50% of the way through - and then the user's code updates the half of the screen that's already been checked and sets a dirty bit, so then the ISR refreshes that line again ... if this type of thing happened quickly enough, the ISR might never get to update the second half of the screen. So the logic is going to have to remember where it got to from one IRQ to the next to ensure the whole screen gets a chance at a refresh in a reasonable amount of time.
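One way the 'remember where it got to' idea could be sketched is a round-robin scan that redraws at most one row per call and always resumes from the row after the last one it tested. This is only an illustration, not the final design: nextdrty, _NEXTROW and _SCANCNT are hypothetical names (two extra zero-page bytes, with _NEXTROW initialised to 0), and it assumes that rowdraw takes its row in _DRAWROW as finddrty does and that rowdraw is what clears the row's dirty bit. The bit lookup mirrors setdrty, so it agrees with whatever ordering _MASKTAB uses.

```asm
; Sketch only: nextdrty, _NEXTROW and _SCANCNT are hypothetical
nextdrty SUBROUTINE
    LDA #25          ; test each of the 25 rows at most once
    STA _SCANCNT
.scan
    LDA _NEXTROW     ; candidate row = where we left off
    STA _DRAWROW     ; rowdraw expects its row here
    LSR              ; divide row by 8 for dirty bitmap byte
    LSR
    LSR
    TAX              ; byte index to .X
    LDA _DRAWROW
    AND #$07         ; bit number within that byte
    TAY              ; bit number to .Y
    INC _NEXTROW     ; advance the remembered position...
    LDA _NEXTROW
    CMP #25          ; ...wrapping from row 24 back to row 0
    BCC .nowrap
    LDA #$00
    STA _NEXTROW
.nowrap
    LDA _DRTYROW0,X  ; get dirty row bitmap byte
    AND _MASKTAB,Y   ; isolate this row's bit
    BNE .found       ; dirty, go and redraw it
    DEC _SCANCNT     ; clean, try the next row
    BNE .scan
    RTS              ; nothing dirty this time round
.found
    JMP rowdraw      ; redraw one row, return via its RTS
```

Because the position moves past every row it tests, a row that gets re-dirtied behind the scan has to wait for the wrap-around, so rows further down the screen can't be starved.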
Hmm. Can open, worms everywhere.
* Note: at the moment, the term 'insert' really just means 'overtype' in terms of text layout - as the high-level editor functionality emerges later, we'll have a true 'insert' mode which will shuffle things along in the buffer to make room for a new string.