Monday, 19 March 2012

In Which We Extend The Display


Having decided upon a 4-bit-wide font (or rather, having had it dictated by the constraints of what the VIC can do), I went back to LCE and spent an hour or so drawing glyphs. The resultant font is surprisingly clear even at such a restrictive size - 4 bits wide actually means only 3 bits to draw the character shapes, leaving the 4th bit clear for spacing - and I'm quite pleased with how it looks. There are numerous tiny fonts out there on the 'net, but I fancied having a go at designing one from scratch, and it looks like it'll do the job. I've mapped the whole standard ASCII set, plus a few extras from the extended ASCII range and a couple of additional 'special' mask shapes for things like inverse mode, underline and strike-through for emphasis (there's no way I can do legible bold or italic typefaces in 3 bits), which should be enough to get things working - I might go back and add a bunch of extra graphical glyphs later for fun.

I dropped the data as raw hex into a DASM assembly file with the origin set to $8000, so I can now tinker with it and rebuild the Character Generator ROM at will rather than faffing about with hex editors and suchlike. It also means I can really easily drop additional code and/or data into that 4K ROM block if and when the OS grows beyond the 16K limit. I'm not storing the inverse-mode glyphs individually like the original Commodore ROM does; instead I'll have a little bit of code in the bitmap display routine to invert stuff on-the-fly. I'm also thinking about using 2K for the font - currently it packs 2 glyphs per byte, which occupies 1K (if I map 256 characters), but I'm wondering if it might give me a speed boost during the bitmap-copy process to give each glyph its own byte and store it twice, once in each nybble. This would mean I wouldn't have to shift glyphs around in their 'frame' when I want to plot them into the screen bitmap... Well, we'll see.
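To make the trade-off between the two font layouts concrete, here is a small model of both. It is only a sketch in Python rather than 6502 assembly, and the nybble order (even character codes in the high nybble), the 8-row glyph height and every function name are assumptions made for illustration, not the layout actually used in the VIC++ ROM.

```python
ROWS = 8  # assumed glyph height in pixel rows


def pack_pairs(glyphs):
    """1K layout: two glyphs share a byte (assumed: even code in the high nybble)."""
    out = bytearray()
    for i in range(0, len(glyphs), 2):
        left, right = glyphs[i], glyphs[i + 1]
        for row in range(ROWS):
            out.append(((left[row] & 0x0F) << 4) | (right[row] & 0x0F))
    return bytes(out)


def pack_doubled(glyphs):
    """2K layout: each glyph row is stored in both nybbles of its own byte."""
    out = bytearray()
    for glyph in glyphs:
        for row in range(ROWS):
            nyb = glyph[row] & 0x0F
            out.append((nyb << 4) | nyb)
    return bytes(out)


def row_from_pairs(font, code, row, column):
    """Fetch a glyph row from the 1K layout; may need a shift in either direction."""
    byte = font[(code >> 1) * ROWS + row]
    nyb = (byte >> 4) if code % 2 == 0 else (byte & 0x0F)
    return (nyb << 4) if column % 2 == 0 else nyb


def row_from_doubled(font, code, row, column):
    """Fetch a glyph row from the 2K layout; a mask is enough, no shifting."""
    byte = font[code * ROWS + row]
    return byte & (0xF0 if column % 2 == 0 else 0x0F)


def invert(row_byte, column):
    """Invert one plotted glyph row on the fly, touching only its own nybble."""
    return row_byte ^ (0xF0 if column % 2 == 0 else 0x0F)
```

Both lookups return the same byte for any character and screen column; the doubled layout simply trades 1K of ROM for the shifts, which are comparatively expensive on a 6502 because it can only shift one bit per instruction.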

The other thing I've done is make use of a little IRQ/raster technique to give me an extra text line on the screen, making 25 in total. The best the VIC can do without additional help is 24 lines, because it simply can't address enough memory to hold a bitmap larger than 4K - using double-height characters to fill the screen, a 20x24 screen is 160x192 pixels (3840 bytes, no problem) but the next step up, 20x26, is 160x208 (4160 bytes, too big). But if you time things exactly right by syncing the IRQ to a specific raster line on the screen, then just as the 24th text line has been drawn you can switch to single-height characters and tell the VIC to look somewhere else for a bitmap for that one extra line (160x8, 160 bytes). After the line has been drawn, you switch back to double-height mode and revert to the original bitmap ready for the next screen refresh.
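The byte counts quoted above follow directly from the pixel dimensions at one bit per pixel. A quick check, written in Python purely as a worked example (the function and constant names are made up for this illustration):

```python
VIC_BITMAP_LIMIT = 4096  # the 4K ceiling mentioned above


def bitmap_bytes(columns, text_lines, cell_width=8, line_height=8):
    """Bytes needed for a 1-bit-per-pixel bitmap covering the given character grid."""
    return columns * cell_width * text_lines * line_height // 8


for lines in (24, 26):
    size = bitmap_bytes(20, lines)
    verdict = "fits" if size <= VIC_BITMAP_LIMIT else "too big"
    print(f"20x{lines}: {size} bytes, {verdict}")

print("extra line:", bitmap_bytes(20, 1), "bytes")
```

The jump from 24 straight to 26 lines is because double-height characters cover two 8-pixel text lines each, so the bitmap can only grow in steps of 320 bytes.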

It requires a little preparation, but by coincidence I'd already done most of it. Firstly the IRQ has to be synced to the screen refresh so that it fires at exactly the same place on the screen every frame - and we did that a while ago for the CPU load raster effect. All I had to do here was to alter the raster line number with which the IRQ synchronises, and add a short delay so that the ISR fires in the border at the end of the line (so that we don't do the VIC bitmap switch whilst data is still being drawn). The other thing is to carefully lay out where in memory the screen matrix and bitmaps will be, since the technique will only work when the VIC has specific memory areas to play with. Again, my earlier choices of where to put the screen matrix and bitmap already suit the technique, and I just needed to allocate a little extra memory for the 25th line. In a standard VIC-20 configuration this is somewhat complicated, and requires some memory contents in the first 1K to be copied for safekeeping during the ISR - but since VIC++ uses memory quite differently, these considerations do not apply (I just needed to re-arrange things a little). Here's a simplified diagram of memory usage:


The Primary screen matrix, colour matrix and bitmap are where the main 40-column, 24-line screen is constructed: a 20x12 array of unique character codes from 0 to 239 goes in the screen matrix at $0200, each code pointing to a double-height 16-byte block of pixel data in the 160x192 bitmap at $1000, and the first 240 nybbles of the colour matrix at $9600 align with this. The Secondary screen, one 40-column line to be drawn as line 25, has 20 unique character codes in the matrix at $02F0, aligned with 20 colour nybbles at $96F0, and the corresponding 160x8 bitmap is stored at $0000 (it has to be here because the VIC can't see any other memory). We can re-use the first 20 unique character codes (0-19) from the Primary matrix because we're switching the bitmap location twice within the same video frame - once to look at $0000, and then again to revert to $1000 - so there's no conflict.
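The addresses in that description can be cross-checked with a few lines of arithmetic. This is just a sketch in Python using the figures from the paragraph above; the region names and helper functions are invented for the example and do not correspond to labels in the VIC++ source.

```python
# (name, start address, length); colour RAM holds one nybble per address
REGIONS = [
    ("secondary bitmap",        0x0000, 160),
    ("primary screen matrix",   0x0200, 240),
    ("secondary screen matrix", 0x02F0, 20),
    ("primary bitmap",          0x1000, 3840),
    ("primary colour matrix",   0x9600, 240),
    ("secondary colour matrix", 0x96F0, 20),
]


def end(region):
    """First address after the region."""
    _, start, length = region
    return start + length


def overlaps(regions):
    """Return pairs of neighbouring regions that collide (empty if the map is clean)."""
    ordered = sorted(regions, key=lambda r: r[1])
    return [(a[0], b[0]) for a, b in zip(ordered, ordered[1:]) if end(a) > b[1]]


def primary_cell(code):
    """Address of the 16-byte double-height block for a Primary character code."""
    return 0x1000 + code * 16


def secondary_cell(code):
    """Address of the 8-byte single-height block for a Line 25 character code."""
    return 0x0000 + code * 8


for name, start, length in REGIONS:
    print(f"${start:04X}-${start + length - 1:04X}  {name}")
print("overlaps:", overlaps(REGIONS))
```

Character code 239 lands at $1EF0, so the Primary bitmap ends just short of $1F00, and code 19 on Line 25 occupies $0098-$009F, the last 8 of the 160 bytes borrowed from Zero Page.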

Here's the slightly modified synchronisation logic from the initvias subroutine to set the IRQ line (just a different value in the data table) and wait for 48 cycles to get into the border on the right of the screen:

initvias SUBROUTINE
    SEI                 ; [2] disable CPU interrupts
    LDX #%01111111      ; [2] disable VIA interrupts (#$7F)
    STX _V1IER          ; [4] set VIA #1 interrupt enable register
    STX _V2IER          ; [4] set VIA #2 interrupt enable register
    LDX #%01000000      ; [2] timer #1 free run, timer #2 clock Ø2, sr disabled, a/b latches disabled (#$40)
    STX _V2ACR          ; [4] set VIA #2 auxiliary control register
    LDX #%11000000      ; [2] enable interrupts for timer #1 only (#$C0)
    STX _V2IER          ; [4] set VIA #2 interrupt enable register
    LDX _VIADATA        ; [4] get timer #1 frequency lo-byte
    STX _V2T1LL         ; [4] set VIA #2 timer #1 lo-byte latch
    LDX _VIADATA+1      ; [4] get timer #1 frequency hi-byte
    LDA _NTSCPAL+1      ; [4] get raster sync line number
.raswait1
    CMP _RASTER         ; [4] wait for line
    BNE .raswait1       ; [3/2] loop until we hit the desired line
    LDY #$0F            ; [2] delay loop for raster position
.raswait2
    DEY                 ; [2] countdown
    BPL .raswait2       ; [3/2] loop until raster is in border
    STX _V2T1LH         ; [4] set VIA #2 timer #1 hi-byte latch (synced to raster)
    CLI                 ; [2] enable CPU interrupts
    RTS                 ; [6]

And here's the modified ISR logic, which now does the VIC register switch, waits for the 25th line to be drawn, and then switches back ready for the start of the next frame (all other IRQ processing happens after this):

cpuirq SUBROUTINE
    CLD                 ; [2] ensure Decimal mode disabled
    PHA                 ; [3] push .A to stack
    TXA                 ; [2] move .X to .A
    PHA                 ; [3] push .A to stack
    TYA                 ; [2] move .Y to .A
    PHA                 ; [3] push .A to stack
    IFCONST _DEBUG = "Y"
    LDX _VICDATA+15     ; [4] get standard screen colour
    DEX                 ; [2] change for IRQ time
    STX _SCRNCOL        ; [4] set colour
    STX _IRQTIME        ; [4] set IRQ busy flag
    ENDIF
    LDA _VICDATA+16     ; [4] get Line25 VIC rows & char-height register value
    STA _ROWCNT         ; [4] turn double-height chars off
    LDA _VICDATA+17     ; [4] get Line25 VIC screen & character memory register value
    STA _CHARMEM        ; [4] switch screen to $0000
    LDA _NTSCPAL+2      ; [4] get raster sync line number
.raswait
    CMP _RASTER         ; [4]
    BNE .raswait        ; [3/2] loop until Line25 finished drawing
    LDA _VICDATA+3      ; [4] get original VIC rows & char-height register value
    STA _ROWCNT         ; [4] turn double-height chars back on
    LDA _VICDATA+5      ; [4] get original VIC screen & character memory register value
    STA _CHARMEM        ; [4] switch screen back to $1000
    BIT _V2T1LL         ; [4] acknowledge VIA IRQ
    TSX                 ; [2] move .SP to .X
    LDA _STACK+4,X      ; [4] get .SR from stack (IRQ pushes .PCH/.PCL/.SR)
    AND #$10            ; [2] mask BRK flag
    BNE .brkvec         ; [2/3] flag set, do BRK processing
    JMP (_IRQVEC)       ; [5] jump through IRQ vector
.brkvec
    JMP (_BRKVEC)       ; [5] jump through BRK vector
_irqmain
    JSR updtcdc         ; [6] update countdown clock
    PLA                 ; [4] pull .A from stack
    TAY                 ; [2] move .A to .Y
    PLA                 ; [4] pull .A from stack
    TAX                 ; [2] move .A to .X
    PLA                 ; [4] pull .A from stack
    RTI                 ; [6]

So we now have a 40x25 text screen using about 3% CPU time and 160 bytes of Zero Page to generate that 25th line, which I think is a reasonable trade. I still have 94 bytes of ZP available, plus 252 bytes in Page 3 after the Primary screen matrix and 256 bytes at $1F00 after the bitmap - enough for OS variables and suchlike, but VIC++ will need a minimum of the 3K BLK0 expansion populated for user code to have anywhere to live (ideally 8K BLK1/2/3/4 would be filled with RAM too, of course).

I tried a couple of variations of the technique for displaying 26 lines, either using the 'wait' time on line 25 to copy another 160-byte buffer into line 1 for display as line 26, or leaving double-height mode enabled and allocating 320 bytes for the line 25/26 secondary bitmap - but they require increasingly large chunks of CPU time. In the latter case, the IRQ has to start much earlier in the frame so that it can copy 320 bytes out of $1000, copy 320 bytes in to $1000 from a secondary bitmap, wait for lines 25 and 26 to be drawn, and then copy the original 320 bytes back before the next frame starts. This actually also works for 27/28-line screens, but CPU load approaches 100%, which is pretty useless if you want to do anything with the machine other than display images with greater vertical resolution.

I also determined that multicolour mode just can't work with a 40-column display - the VIC is hard-wired to interpret each pair of bits in the glyph as a colour index in this mode, which means you lose half of your horizontal resolution and at 4-bit glyph sizes that's a disaster. So VIC++ will have to concede that characters on the 40-column text display can only be coloured in pairs - which is a bit of a shame, but not a calamity.
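To see why that hurts so much at this glyph size, compare how the same 4-bit glyph row is read in the two modes. This is a toy model in Python with made-up function names; it only illustrates the bit-pairing described above, not the VIC's actual colour lookup.

```python
def hires_pixels(nybble):
    """Hi-res: each of the 4 bits is one pixel (0 = background, 1 = foreground)."""
    return [(nybble >> bit) & 1 for bit in (3, 2, 1, 0)]


def multicolour_pixels(nybble):
    """Multicolour: each pair of bits is one double-width pixel holding a colour index 0-3."""
    return [(nybble >> shift) & 0b11 for shift in (2, 0)]


row = 0b1010  # one row of a 4-bit-wide glyph
print(hires_pixels(row))        # four narrow pixels
print(multicolour_pixels(row))  # two wide pixels
```

With only two wide pixels per glyph row, and one of those columns needed for spacing, there is nothing left to draw a character shape with.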

I'm just working out the algorithm for taking an ASCII character string, doing the glyph-data lookup for each character, and plotting that into the bitmap - so there's nothing much to see yet. But as I was testing the Line25 routine, I did a fast hack to plot 40-column glyphs into that line as well as display IRQ time in the border. Wanna see? Oh alright, but only because you asked nicely.
