Ideas for enhanced CRTC replacement with hardware sprites

Discuss both original and modern hardware for the BBC Micro/Electron

Ideas for enhanced CRTC replacement with hardware sprites

Post by Rich Talbot-Watkins » Thu Dec 01, 2016 4:49 pm

Inspired by the amazing stuff RobC's been doing with his CPLD replacement for the Video ULA with a 4096 colour palette and (hopefully!) fine-grained horizontal scrolling, I was wondering how we might design an enhanced but backward-compatible CRTC which offered some kind of hardware sprite facilities for the Beeb.

Because of the division of work between the CRTC and the Video ULA, I imagine any hardware sprite facility would be rather more basic than the sort of thing offered by the C64, NES or Sega Master System. The CRTC is basically an address generator which feeds bytes to the ULA, so I find it hard to see how we could use it to overlay sprites with different palettes, at non-byte-aligned offsets, or horizontally/vertically flipped, without it needing some understanding of the pixel format the data represents - which is entirely the ULA's job. But I think it could still do some useful stuff.

My tentative feature list would be as follows:
  • Plot sprites at any horizontal byte position, and at any vertical pixel line (the CRTC design makes character rows the more obvious option, but I think this can be worked around).
  • No fixed size for sprites: they can be any width or height, but the only limitation would be that each screen row would have a maximum number of bytes of sprite data (more later).
  • Ability to set a horizontal stride for each sprite, to allow clipping off the screen edge.
  • Some way to describe how screen data and sprite data are combined.
Sprite data would be held in main RAM, as per screen data. My thought is that it would be addressed through the CRTC's MA0-MA13 as normal, which would mean it would need to be 8-byte aligned. The natural format for sprites would therefore be the same as screen memory, character block by character block. This would also mean that sprite sizes would be a multiple of character sizes horizontally and vertically. I think arbitrary sprite sizes are an unusual feature, not seen much back then, which could maybe be implemented fairly simply.
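
To illustrate the layout, here's a rough C sketch (names made up, purely illustrative) of how a byte within a sprite stored "screen style" would be found - 8 scanline bytes per character block, blocks running left to right and then down:

Code: Select all

#include <stdint.h>

/* Illustrative sketch only: byte offset within a sprite stored the same
   way as screen memory - 8 scanline bytes per character block, blocks
   left to right, then the next block row.  width_blocks is the sprite's
   horizontal data stride in character blocks. */
uint16_t sprite_byte_offset(uint8_t col_block, uint8_t row_block,
                            uint8_t scanline, uint8_t width_blocks)
{
    return (uint16_t)((row_block * width_blocks + col_block) * 8 + scanline);
}

/* e.g. a 2x2-block sprite occupies 2*2*8 = 32 bytes, and its base would
   sit on an 8-byte boundary so the MA lines can address it */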

Sprites would be positioned by specifying a CRTC address (i.e. character block aligned) and a "scanline adjust" which would offset the sprite by that amount within a row, in order to provide fine vertical positioning. Offsetting the sprite across character rows would be tricky, but this kind of feature seems like the only way to work around the CRTC's chunky addressing - I haven't really figured out how this might work yet, nor how it could be used to clip off the top/bottom of the screen. You would be able to specify a horizontal and vertical size (in character blocks) and also a horizontal stride which needn't be the same as the horizontal size, so you could plot clipped sprites at the left and right edges.

The hardware would need to cache sprite data from RAM in advance - given that there are cycles in the horizontal border where screen memory is fetched and not used, this could be used instead to fill internal registers with sprite data for the upcoming scanline. In MODEs 0-2, there'd be a maximum of 48 cycles to do this (128 characters minus 80 displayed), hence that could be the maximum number of sprite bytes per row. That's the same as the C64 I think (8 sprites x 12 pixels) - and that's possibly why. Fetching data a line in advance might be a bit tricky, but I guess there are ways.
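
As a very rough model of that budget (optimistically assuming every non-displayed character slot on a line could be turned into a sprite fetch, which real hardware probably couldn't quite manage):

Code: Select all

/* Optimistic per-scanline budget for cached sprite data: 128 character
   slots per line in the 2MHz modes, minus those spent on displayed
   screen fetches.  Purely illustrative. */
static int sprite_bytes_per_line(int displayed_chars)
{
    int spare = 128 - displayed_chars;
    return spare > 0 ? spare : 0;
}

/* sprite_bytes_per_line(80) == 48 for MODEs 0-2; a narrower
   64-character screen would push the budget up to 64 bytes */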

Determining how screen and sprite data are combined - this is the tricky one. There are two desirable cases:
  1. Plot sprite on top of background, treating sprite pixels in colour 0 as a transparent colour, and anything else as 'opaque' which overwrites background.
  2. Plot sprite behind background, treating background pixels in colour 0 as transparent and any other colour masking out the sprite data.
The problem with this is that we need to be able to specify how the byte is broken into pixels in order to identify which ones are colour 0 (strictly speaking, a function of the Video ULA). The only thought I have on this is to specify a pixel bitmask which describes how to extract the leftmost pixel of the byte:
  • 1bpp modes: %10000000
  • 2bpp modes: %10001000
  • 4bpp modes: %10101010
From that, we need to generate a mask for the whole byte, marking which pixels are colour 0. Starting with the data byte: AND it with the pixel bitmask, and if the result is zero, OR the bitmask into the output. Then rotate the bitmask right and repeat the process, eight times in all.

Code: Select all

Inputs: data, pixel_mask

Set result to 0
Do 8 times:
  if ((data AND pixel_mask) = 0): result = result OR pixel_mask
  rotate pixel_mask right

Output: result
That's a bit complicated, but we'd need something like that in order to have useful sprite plotting - I don't think EORed sprites are really acceptable! I guess that algorithm can be expanded out so that each bit of the result is a boolean expression with 16 input bits.
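
Just to sanity-check the logic, here's that algorithm written out as a plain C model (only a model of what the logic computes, not a suggestion for how the hardware would actually be built):

Code: Select all

#include <stdint.h>

/* Returns a byte with 1s covering every pixel of 'data' that is colour 0,
   given the per-mode pixel mask (%10000000, %10001000 or %10101010). */
static uint8_t mask(uint8_t data, uint8_t pixel_mask)
{
    uint8_t result = 0;
    for (int i = 0; i < 8; i++) {
        if ((data & pixel_mask) == 0)
            result |= pixel_mask;
        /* 8-bit rotate right */
        pixel_mask = (uint8_t)((pixel_mask >> 1) | (pixel_mask << 7));
    }
    return result;
}

In hardware this would presumably get flattened into combinational logic, one expression per output bit, as suggested above.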

Still once we've got that, we can perform both operations really easily! Let's call the function implementing that above algorithm 'mask'. Then the two sprite plotting operations described above are:
  1. (mask(sprite) AND screen) OR sprite
  2. (mask(screen) AND sprite) OR screen
That could be condensed down to a single bit to specify which way round the inputs are, or could be made more generic if we wanted to be able to allow stuff like:
  • (mask(sprite) AND sprite) OR sprite
which would give the sprite total opacity over the background. There are other ways it could be extended: for example, allowing the mask function to return all 1s, and doing an EOR instead of an OR, would give us hardware-plotted EOR'd sprites. Might be useful for something, I guess.
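
Continuing the C model from the previous snippet (the plot-mode numbering here is invented purely for illustration), the combine stage would be something like:

Code: Select all

#include <stdint.h>

static uint8_t mask(uint8_t data, uint8_t pixel_mask);  /* from the sketch above */

/* Illustrative combine stage for one byte of output. */
static uint8_t combine(uint8_t screen, uint8_t sprite,
                       uint8_t pixel_mask, int plot_mode)
{
    switch (plot_mode) {
    case 0:  /* sprite over background; sprite colour 0 is transparent */
        return (mask(sprite, pixel_mask) & screen) | sprite;
    case 1:  /* sprite behind background; background colour 0 is transparent */
        return (mask(screen, pixel_mask) & sprite) | screen;
    case 2:  /* (mask(sprite) AND sprite) OR sprite - a fully opaque sprite */
        return sprite;
    case 3:  /* mask forced to all 1s, OR replaced by EOR */
        return screen ^ sprite;
    default:
        return screen;
    }
}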

I haven't thought too much about the register model for all of this. I guess we'd need a bunch of new CRTC registers (probably better than putting a sprite table in RAM and needing extra bandwidth). The limit really is the amount of data that can be cached each row, rather than a concrete number of sprites. But maybe it'd still be better presented as 8 hardware sprites, CRTC regs &8x...&Fx, where you can set for each one:
  • Sprite data address (lo/hi)
  • Screen address to plot (lo/hi)
  • Scanline adjust
  • Sprite width
  • Sprite height
  • Sprite horizontal data stride
  • Plot mode
Plus the pixel bitmask (CRTC reg &7F?) which wouldn't be a per-sprite thing.
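
Just to make that concrete, here is one possible shape for the per-sprite register slots, assuming each sprite gets a 16-register block at &80 + 16*n - every name and field width here is speculative:

Code: Select all

#include <stdint.h>

#define REG_PIXEL_BITMASK 0x7F    /* global, not per-sprite */

/* Speculative layout for sprite slot n at CRTC registers &80 + 16*n. */
struct sprite_slot {
    uint8_t data_addr_lo;     /* sprite data address, CRTC-style (MA0-MA13) */
    uint8_t data_addr_hi;
    uint8_t screen_addr_lo;   /* screen address to plot at */
    uint8_t screen_addr_hi;
    uint8_t scanline_adjust;  /* fine vertical offset within a character row */
    uint8_t width;            /* in character blocks */
    uint8_t height;           /* in character blocks */
    uint8_t data_stride;      /* horizontal data stride, for edge clipping */
    uint8_t plot_mode;        /* which combine operation to apply */
    uint8_t reserved[7];      /* pads the slot to 16 registers */
};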

Well that's an outline of my thoughts on this - what does anyone think? Does this sound like a feasible hardware project one day? I don't think it's easy by any means, and the details would need to be fleshed out properly, but it's a start to spark some discussion.

Here's where to throw some ideas around, it'd be amazing to see a hardware extension like this! This is the "fantasy Beeb graphics hardware" thread :)


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by RobC » Thu Dec 01, 2016 5:38 pm

Some great ideas there Rich! (I've been ill this week so have yet to complete the horizontal scrolling...)

Not that I'm very familiar with them but one point of reference might be the Amstrad CPC+ models. Don't they effectively add sprites (and other features) to a 6845-based machine?

Also, an idea I saw mooted by Mike Cook was to have more than one 6845. If one ran from main memory and the other from its own memory (or shadow RAM), you could then overlay the output of one on the other. It wouldn't be as good as dedicated sprite hardware but it should be fairly easy to implement.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by Rich Talbot-Watkins » Thu Dec 01, 2016 6:04 pm

The CPC+ seems to have an ASIC which emulates the 6845 and the Gate Array (the equivalent of VIDPROC) among other things, which makes it easier to add this kind of feature, as they can just reinvent it as they want. It holds the hardware sprite data in its own internal memory which solves the contention issue, but also makes it far more inflexible IMO. Effectively the hardware sprites get added at the final stage while generating pixels, something it can do because it can know exactly where in the screen it's rasterising - we don't have that luxury unfortunately.

Get better soon!

Edit: actually I've just read you can page the ASIC RAM into CPU address space, so you can manipulate it without having to go through hardware registers, which is much better. Still a pretty big change of hardware though!


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by Rich Talbot-Watkins » Thu Dec 01, 2016 6:23 pm

And talking of the CPC+:

http://www.cpcwiki.eu/index.php/Arnold_ ... l_facility
The lower four bits (D3-D0) of the SSCR define a horizontal delay of between 0 and 15 bits i.e. high resolution (mode 2) pixels. This shifts the screen image to the right by the value programmed, "losing" pixels behind the right border and instead displaying random data on the left. It is left to the programmer to ensure that the delay value is always a multiple of the number of bits per pixel.
The most significant bit (D7), when set, causes the border to extend over the first two bytes (16 high resolution pixels) of each scan line, masking out the bad data caused by the horizontal soft scroll. Software which intends to use horizontal soft scroll should have this bit always set, so that the screen width does not keep changing.
That's pretty much exactly like the horizontal scrolling specs we have for your ULA enhancement!
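
For reference, the SSCR byte described in that quote would be assembled like this (just a sketch of the bit layout; how and where it actually gets written to the ASIC is left out):

Code: Select all

#include <stdint.h>

/* Per the cpcwiki text above: bits 3-0 = horizontal delay in mode 2
   pixels (0-15), bit 7 = extend the border over the first two bytes. */
static uint8_t sscr_value(unsigned delay_mode2_pixels, int extend_border)
{
    return (uint8_t)((delay_mode2_pixels & 0x0F) | (extend_border ? 0x80 : 0x00));
}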


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by tricky » Thu Dec 01, 2016 7:18 pm

How about just adding it to the video ula?
Give the ula a big chunk of sprite ram, add some definitions and then point the sprites at definitions.
You could then either have a screen mask to show the background, and/or have colour 0 be transparent. This would allow any combination of resolutions and colour palettes.
The only drawback would be that updating sprite graphics would be a little slow, but with enough memory you could just upload all the data at the start.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by ThomasHarte » Thu Dec 01, 2016 7:44 pm

How about going in a different direction and dropping the CRTC aspects in favour of just trying to be the most convenient address generator possible?

Of the 128 cycles per line, designate that 80 are pixel fetches. Fixed width, linear. That leaves 48 accesses around the outside. Two of those can be the start address for that line — each line can start anywhere in RAM, independently.

You could use the other 46 for a variety of schemes that specify sprites; you could do those as address windows, switching to counting through sprite data addresses on appropriate columns.

But how about this for a hack: to allow the address generator to generate data directly for the video ULA, at appropriate times have it generate a known, fixed address (say 7FFF) and then put what it really wants onto the data bus regardless, ignoring its chip select line. The programmer would just need to put a single FF at 7FFF. Then the address generator, whether CRTC-like or not, could load a full sprite table during retrace and reproduce such portions as are relevant upon request.
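
A toy model of that addressing scheme might look like this (the table layout and names are invented just to illustrate the idea):

Code: Select all

#include <stdint.h>

/* Each scanline starts at an address taken from a per-line table (in the
   real scheme, fetched with two of the 48 spare accesses), followed by
   80 linear pixel fetches.  The 32K wrap is an assumption. */
void generate_line_addresses(const uint16_t line_start[], int line,
                             uint16_t out_addr[80])
{
    uint16_t addr = line_start[line];
    for (int i = 0; i < 80; i++)
        out_addr[i] = (uint16_t)((addr + i) & 0x7FFF);
}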


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by Rich Talbot-Watkins » Thu Dec 01, 2016 7:51 pm

tricky wrote:How about just adding it to the video ula?
The ULA has no access to the CRTC's internal counters, or indeed registers, so it can't possibly know when it needs to start overlaying a sprite. The ASIC approach worked in the CPC because they could implement it exactly as they wanted with all the internals visible to all. Best you could do is count VSyncs and HSyncs (at the cost of two extra pins, making it pin incompatible), but even then you'd have to set sprite positions relative to the VSync.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by ThomasHarte » Thu Dec 01, 2016 7:54 pm

Rich Talbot-Watkins wrote:
tricky wrote:How about just adding it to the video ula?
Best you could do is count VSyncs and HSyncs (at the cost of two extra pins, making it pin incompatible), but even then you'd have to set sprite positions relative to the VSync.
If you were replacing both, could you subvert the meaning of the cursor line and use it as a 1-bit bus for synchronisation?


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by Rich Talbot-Watkins » Thu Dec 01, 2016 8:05 pm

Don't quite follow you Thomas. Sounds like that would be vastly incompatible with the 6845 unless I'm misunderstanding. I was trying to find ways to spec a drop-in replacement for the CRTC which would give some hardware sprite type capabilities without requiring a lot of extra components, and whilst still being able to operate as a normal Beeb.
ThomasHarte wrote:If you were replacing both, could you subvert the meaning of the cursor line and use it as a 1-bit bus for synchronisation?
Wouldn't that mean having to reprogram the cursor register per sprite so the ULA could see it? I pretty much discarded the idea of using the ULA for this when I was thinking about how it could be done, it just didn't really seem suited.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by ThomasHarte » Thu Dec 01, 2016 8:17 pm

Rich Talbot-Watkins wrote:Don't quite follow you Thomas. Sounds like that would be vastly incompatible with the 6845 unless I'm misunderstanding. I was trying to find ways to spec a drop-in replacement for the CRTC which would give some hardware sprite type capabilities without requiring a lot of extra components, and whilst still being able to operate as a normal Beeb.
Yeah — when you enabled the enhanced mode you'd switch to a completely different addressing scheme. So it's two distinct modes combined into one chip. Kicking-off point was: if you're adding a new feature that new software must be written to use, and have a practically unlimited transistor budget, does a complete break yield more benefit than a partial one?

At the very least, a per-line start address, even if the lines are then n, n+8, n+16, etc, would be very useful I think. It'd mean independent horizontal and vertical scrolling of regions without excessive processor cost.
Rich Talbot-Watkins wrote:
ThomasHarte wrote:If you were replacing both, could you subvert the meaning of the cursor line and use it as a 1-bit bus for synchronisation?
Wouldn't that mean having to reprogram the cursor register per sprite so the ULA could see it? I pretty much discarded the idea of using the ULA for this when I was thinking about how it could be done, it just didn't really seem suited.
If you've replaced both the CRTC and the video ULA then you don't need to program anything. When both are switched into sprite mode, the CRTC knows e.g. to produce a single pulse on the cursor line to mark the end of vertical sync, a double pulse to mark the end of horizontal. If the video ULA is switched to sprite mode, it knows to treat the cursor input as that, rather than actually as cursor input. This was only in response to the argument that there would be no way to synchronise the two while being pin compatible — you can synchronise at the cost of replacing both.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by kieranhj » Thu Dec 01, 2016 9:31 pm

This is a really interesting thread Rich, thanks for starting it! I am absolutely not a hardware guy, so have no suggestions in that department. Instead I would ask how other 8-bit systems of the day operated and whether any of these approaches are applicable or adaptable to the Beeb 6845 + ULA setup for improvements, enhancement, or just curiosity. Questions would be:

How does the VIC2 sprite system work for overlaying on C64? (I will now go look this up on Wikipedia or YouTube.) What stops this from being added to the Beeb video setup (at the end of the chain as an optional extra?)

Is there any mileage in a true character-based mode for the Beeb (not Teletext), in which the screen is a map of characters (so small in memory to address) but the character definitions can be changed easily, along with colour attributes for character cells? Obviously MODEs 3 & 6 are billed as text only, but they are still bitmapped, albeit character addressed as always.

Is there any advantage to breaking the Beeb's character mapped display memory addressing and making it linear? (Why is it this way in the first place?) Obviously we'd need a new OS revision but as far as I can tell the Beeb OS is extremely well architected for the time so there are probably few changes that would need to be made. People have MultiOS installed so it's not impossible for us to create a new & well (community) supported version of the OS that's open source in GitHub say.

If we're breaking addressing then how about the interleaved pixel arrangement? I know none of this helps with hardware sprites or being backwards compatible, but it does raise the question of getting more out of a machine that, theoretically, could have existed in the 80s - a Master++ if you like - without going into "copro running at 125MHz" territory.

There are other things like Amiga-style separation of bitplanes, but they seem less important. Also, the original IBM PC CGA used a 6845 CRTC - is there anything we can learn from that? There is a 4096 colour demo for that card which only works on real hardware (Flynnjs and I tried, unsuccessfully, to replicate that technique on a Beeb back in August at the Cambridge ABUG!)

As I said, great conversation. Not sure if I'm helping or confusing! :D
Bitshifters Collective | Retro Code & Demos for BBC Micro & Acorn computers | https://bitshifters.github.io/


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by tricky » Thu Dec 01, 2016 10:28 pm

I'm not familiar with the hardware, but it seems to me that if the ULA can see the sync pulse times, then specifying sprite positions in hsyncs since vsync and 16 MHz cycles since hsync should be possible.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by kieranhj » Thu Dec 01, 2016 10:44 pm

Just read up on VIC-II as promised. Let's not just implement that à la BeebSID - it sounds like a ballache. Good primer here: http://dustlayer.com/index-vic-ii. The idea of sprite collision detection is interesting though...


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by jgharston » Thu Dec 01, 2016 11:34 pm

Rich Talbot-Watkins wrote:
tricky wrote:How about just adding it to the video ula?
The ULA has no access to the CRTC's internal counters, or indeed registers, so it can't possibly know when it needs to start overlaying a sprite.
Also, we only have two "windows" into the Video ULA, &FE22 and &FE23. There are a theoretical 230 spare registers available by extending the number of CRTC registers, plus six I/O addresses at &FE02-&FE07 that can be picked up with one or two flying leads.
kieranhj wrote:Is there any advantage to breaking the Beeb's character mapped display memory addressing and making it linear? (Why is it this way in the first place?)
For fast character writing. A 2-colour character is a super-fast LDY #7:.loop:LDA (src),Y:STA (dst),Y:DEY:BPL loop
kieranhj wrote:Obviously we'd need a new OS revision but as far as I can tell the Beeb OS is extremely well architected for the time so there are probably few changes that would need to be made. People have MultiOS installed so it's not impossible for us to create a new & well (community) supported version of the OS that's open source in GitHub say.
Nothing should be done that would stop a bog standard Beeb with an unchanged bog standard MOS from powering on and functioning. Otherwise, you're breaking the entire world. The extra functionality should only come into play when you explicitly turn it on.



Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by Rich Talbot-Watkins » Fri Dec 02, 2016 7:49 am

kieranhj wrote:How does the VIC2 sprite system work for overlaying on C64? (I will now go look this up on Wikipedia or YouTube.) What stops this from being added to the Beeb video setup (at the end of the chain as an optional extra?)
I think pretty much all the 8 bit systems with hardware sprites had video hardware with combined address generation and pixel output. The VIC-II does everything itself, with the limitation that the screen dimensions are fixed and even certain memory addresses are fixed. This makes scrolling on the C64 more difficult than you might expect, because you have to perform a software scroll (shifting all the characters and color RAM attributes one position along) after you've moved 8 pixels. Not having a dedicated address generator makes plumbing hardware sprites easier as everything's on one chip, but at the cost of less flexibility with screen size / layout / addresses.

On the Beeb, we have this separation of concerns, which is why my first instinct was to turn to the CRTC for hardware sprite generation, it being the video hardware interface to main memory. To me, it just seems more fitting with the Beeb architecture. As already pointed out, one alternative would be to put this into the Video ULA, measuring sprite positions in terms of 16 MHz cycles from the HSync and scanlines from the VSync, and adding a load of internal RAM for holding sprite data which would have to be accessed through memory mapped I/O somehow (slowly). Another possibility would be to take the CPC+ route and combine the 6845 and Video ULA into a single chip where one could see the other's internals.
jgharston wrote:
kieranhj wrote:Is there any advantage to breaking the Beeb's character mapped display memory addressing and making it linear? (Why is it this way in the first place?)
For fast character writing. A 2-colour character is a super-fast LDY #7:.loop:LDA (src),Y:STA (dst),Y:DEY:BPL loop
The CRTC pretty much forces it on you - it's designed to work with characters, and has separate concepts of scanlines and character addresses. Mode 7 on the Beeb is pretty much the intended use of the CRTC. On the CPC, the screen is divided up differently - it's broken into 8 blocks, one block per character scanline. So first all the scanline 0s for all the characters in the screen, then all the scanline 1s, etc. To get from a screen byte to the scanline below (within the same character block) you have to add &800, which works well on the Z80, but on the 6502 would be horrible.
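
To make the difference concrete, the address arithmetic for the two layouts is roughly as follows (assuming an 80-byte-wide bitmap in both cases, and ignoring base addresses and wrap):

Code: Select all

#include <stdint.h>

/* Beeb: 8 scanline bytes per character cell, cells left to right. */
uint16_t beeb_offset(int char_row, int char_col, int scanline)
{
    return (uint16_t)(char_row * 80 * 8 + char_col * 8 + scanline);
}

/* CPC: the screen is split into 8 blocks of &800 bytes, one per
   character scanline, so the scanline below is always +&800. */
uint16_t cpc_offset(int char_row, int char_col, int scanline)
{
    return (uint16_t)(scanline * 0x800 + char_row * 80 + char_col);
}

So on the Beeb the next scanline down within a cell is just +1, which is what makes the LDA/STA loop above so cheap.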


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by kieranhj » Fri Dec 02, 2016 8:22 am

Rich Talbot-Watkins wrote:
jgharston wrote:
kieranhj wrote:Is there any advantage to breaking the Beeb's character mapped display memory addressing and making it linear? (Why is it this way in the first place?)
For fast character writing. A 2-colour character is a super-fast LDY #7:.loop:LDA (src),Y:STA (dst),Y:DEY:BPL loop
The CRTC pretty much forces it on you - it's designed to work with characters, and has separate concepts of scanlines and character addresses. Mode 7 on the Beeb is pretty much the intended use of the CRTC. On the CPC, the screen is divided up differently - it's broken into 8 blocks, one block per character scanline. So first all the scanline 0s for all the characters in the screen, then all the scanline 1s, etc. To get from a screen byte to the scanline below (within the same character block) you have to add &800, which works well on the Z80, but on the 6502 would be horrible.
Ah, thank you, lots of interesting knowledge. Yes, the CPC arrangement is definitely not 6502 friendly.

So, I know this doesn't address your hardware sprite question, but would there be mileage in attaching a different (additional) character generator to the CRTC + ULA setup alongside the Teletext chip? So we could have the one byte per 8x8 character write arrangement from a large (or multiple) character map but each cell having multiple colour attributes that were generally flexible. Am I describing an approximation of PC CGA?

I guess without hardware sprites these character mapped modes all suffer from the problem of drawing at non-character alignment being horrendously expensive (as we've discovered with our MODE 7 demo tinkerings.)


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by Rich Talbot-Watkins » Fri Dec 02, 2016 8:55 am

I actually love those kind of tile-based screens. Most C64 games use this kind of mode - so you have 40x25 characters and a separate area of "color RAM" - and each character either represents 8x8 pixels in 1bpp, or 4x8 in 2bpp, with the palette coming from color RAM. Combine this with pixel-by-pixel scrolling in both directions, and - importantly - hardware sprites which can be plotted at any pixel position, and you have pretty much all the ingredients you need for an arcade-style game. That said, if you want to do something a bit less generic (e.g. a racing game), it doesn't work very well, which is pretty much why all C64 games had the same kind of look and feel (and those which didn't kinda sucked!).

The NES and Sega Master System also work like this. The NES had 2bpp tiles and the option to select one of 4 different palettes for every 2x2 block of tiles. The SMS was even better and had 4bpp tiles which could also be horizontally or vertically flipped, have a layer assigned (above or below sprites), and have a choice of two palettes.

The problem with adding this kind of setup to the Beeb is the timing. Displaying a tile based screen requires two reads - one to read the tile index, and another to read the data. So to avoid contention, the tile data would need to come from internal RAM, which is then pretty much the same problem as putting hardware sprites on the ULA. Or you could read tile indices in the preceding border and cache them, and then read out the tile data as it's rastered, but that doesn't fit with the CRTC model so you may as well discard the 6845 then (this is what the NES does).

(Note that MODE 7 works precisely because its character definitions are in internal memory. The extra clock delay required in MODE 7 is because of the extra time needed to read the character definition before the ULA can get it.)

Edit: actually we potentially need three reads: one to read the index, one to read the attribute data, and one to read the definition. That'll never work without some big changes.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by kieranhj » Fri Dec 02, 2016 10:29 am

Rich Talbot-Watkins wrote:(Note that MODE 7 works precisely because its character definitions are in internal memory. The extra clock delay required in MODE 7 is because of the extra time needed to read the character definition before the ULA can get it.)
Ahhh, I'm finally understanding what the CRTC vs ULA does and what you're proposing. Please excuse my slowness and ignore the tipsy wish list from last night. :)

Having the ability for the CRTC to overlay sprite data from a predetermined memory location at arbitrary points during the scanline would be immensely useful even without the blending mode and scanline offset. Some thoughts then:
Rich Talbot-Watkins wrote:That's a bit complicated, but we'd need something like that in order to have useful sprite plotting - I don't think EORed sprites are really acceptable!
Would EORed sprites be acceptable in MODE 2 using your cunning palette arrangement that separates foreground and background colours?
Rich Talbot-Watkins wrote:The hardware would need to cache sprite data from RAM in advance - given that there are cycles in the horizontal border where screen memory is fetched and not used, this could be used instead to fill internal registers with sprite data for the upcoming scanline. In MODEs 0-2, there'd be a maximum of 48 cycles to do this (128 characters minus 80 displayed), hence that could be the maximum number of sprite bytes per row. That's the same as the C64 I think (8 sprites x 12 pixels) - and that's possibly why.
Reducing the visible width of the screen to enable more sprites (because there's more time during hblank to fetch & cache the data) would also be a useful feature - think of Tricky's arcade conversions that are narrow (200+ pixels) particularly vertical scrolling shooters. We might finally be able to make a decent conversion of 1942! :)
Rich Talbot-Watkins wrote:Edit: actually we potentially need three reads: one to read the index, one to read the attribute data, and one to read the definition. That'll never work without some big changes.
On the topic of character mapped modes - if there are 48 cycles in hblank surely that would be enough time to read and cache the character attribute data for a 40 character row? We actually have 8 x 48 cycles as there are 8 scanlines per character row.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by Rich Talbot-Watkins » Fri Dec 02, 2016 10:50 am

kieranhj wrote:Would EORed sprites be acceptable in MODE 2 using your cunning palette arrangement that separates foreground and background colours?
Perhaps, but it limits the available colours, and using a trick like I described to define a pixel bitmask would mean we could do it 'properly', and even underlay as well as overlay sprites relative to the screen data. Depending on how flexible you made inputs to the combining function, you could probably still get EOR as well, but I'm not convinced it'd be at all useful. The only reason Beeb sprites are EORed is so they can be erased as easily as plotted, but in the case of a hardware overlay, there's no need to erase anything!
kieranhj wrote:Reducing the visible width of the screen to enable more sprites (because there's more time during hblank to fetch & cache the data) would also be a useful feature - think of Tricky's arcade conversions that are narrow (200+ pixels) particularly vertical scrolling shooters. We might finally be able to make a decent conversion of 1942! :)
Yeah totally! I wouldn't particularly go with a conventional representation where there are 8 defined hardware sprites - I'd prefer it to be a bit more free-form, a bit more like a blitter list, but I can't think of a decent way to express that in registers at the moment.

Example case: suppose you have a sprite which is straddling two background blocks: it should pass behind one and in front of the other. If this could be split into two sprites with different combine modes, that'd be easy, but not at the cost of using another hardware sprite. All that should matter is that the sprite data for the upcoming scanline can be cached in the preceding border.
kieranhj wrote:On the topic of character mapped modes - if there are 48 cycles in hblank surely that would be enough time to read and cache the character attribute data for a 40 character row? We actually have 8 x 48 cycles as there are 8 scanlines per character row.
Yeah, but you wouldn't be using the CRTC in the way it's supposed to work. Remember, ordinarily the CRTC is generating the addresses for the tile indices during the displayed screen - but in this case we already need those addresses by then so you'd be duplicating that logic elsewhere, and then ignoring it when fetching addresses for displayed tile data. So you might as well not use the CRTC for address generation, and use a totally custom solution. And for that, you may as well just chuck a VIC-II in there! And for that you may as well just use a C64! :lol:

The thing I had in mind when devising this was to really maintain a lot of the Beeb's character while keeping the extension flexible and not too complicated. I imagine a game like Blurp running in MODE 2 with hardware scrolling and hardware sprites, flicker free, at 25 or 50 fps, and it would be totally feasible with this type of hardware. But it would still very much be a 'Beeb'!


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by 1024MAK » Fri Dec 02, 2016 12:22 pm

Lots of interesting and exciting ideas in this thread :D.

I do have a question. The 6845 CRTC does all the timing and address generation but, ignoring the cursor for the moment, does not actually generate screen data or read data from DRAM to form the serial data stream that gets sent to the display. The Video ULA / VidProc generates the serial data streams (for the red, green and blue signals) by taking the data byte from DRAM (which gets its address from the CRTC) and using it to index the colour palette look-up registers. So if an enhanced CRTC is produced that includes sprites, how and where are the sprite data and the normal screen data combined and processed before being converted to the serial data streams?

Or have I missed something?

Mark


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by Rich Talbot-Watkins » Fri Dec 02, 2016 12:37 pm

I'd assumed that the CRTC fetched the data from DRAM itself and put it on D0..D7 where it could be read by VIDPROC. If that's not the case then none of this is going to work!

And of course now I realise that D0..D7 on the CRTC will be the interface between the CRTC and the CPU. Probably this whole plan is scuppered!


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by jms2 » Fri Dec 02, 2016 12:42 pm

I was wondering exactly the same thing.

My simple understanding (which is a total assumption on my part) is that the addresses being spouted out of the CRTC would not be in a linear sequence. So for example in MODE2, instead of being:

&3000, &3001, &3002, &3003 ... etc

if there was a sprite starting at the third character position along the line, with its data in SWR at &9000, it would be:

&3000, &3001, &9000, &9001 .... etc .... reverting back to screen memory after the sprite has been drawn.

I can't see how the palette could be altered for each sprite.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by Rich Talbot-Watkins » Fri Dec 02, 2016 1:06 pm

The idea here was to cache sprite data for the upcoming scanline during the border period, which would give a maximum of 48 bytes of sprite data per line in MODEs 0-2. But that doesn't work given that combining screen and sprite data has to happen on the Video ULA - we'd need a cache shared by both, which means probably the only way to do this is by combining their operation into a single unit. That's basically the approach used by the CPC+, so it has precedent, but is not such a modular replacement as I had in mind.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by 1024MAK » Fri Dec 02, 2016 1:23 pm

Rich Talbot-Watkins wrote:I'd assumed that the CRTC fetched the data from DRAM itself and put it on D0..D7 where it could be read by VIDPROC. If that's not the case then none of this is going to work!

And of course now I realise that D0..D7 on the CRTC will be the interface between the CRTC and the CPU. Probably this whole plan is scuppered!
As far as address generation for the DRAM is concerned, there are two bus-masters, the CPU and the CRTC. However, the CRTC is connected to the CPU data bus, not the Video ULA / VidProc DRAM data bus. IC14 (a 74LS245 bi-directional tri-state buffer) links (or isolates) the two data buses. The direction (read/write) control comes from the CPU R/W pin. The enable control comes from the glue logic including the CPU clock.

My best suggestion, for adding sprites, is to have an enhanced CRTC supplying extra timing information via the cursor line (this goes straight from the CRTC to the Video ULA / VidProc) as a fast serial data line. It would be clocked and synchronised with the 1 MHz or 2 MHz clock from the Video ULA / VidProc to the CRTC. The enhanced CRTC can then tell an enhanced VidProc where and when the sprites should be displayed. So the position and timing data would be the responsibility of the enhanced CRTC. But the screen display data for the sprites would have to be stored in the enhanced VidProc, or in a SRAM connected to whatever programmable logic chip is used for the enhanced VidProc.

As the enhanced CRTC and enhanced VidProc would now have a unidirectional communications link, this could be used to supply control information telling the enhanced VidProc to load and store data read from main DRAM as sprite data instead of data to be converted into serial colour information for immediate display. Hence, sprite data could be loaded from main DRAM during the field or line "flyback" time and stored in the enhanced VidProc or SRAM.

So I don't see this as a dead end. Just that an upgrade will need both a new CRTC AND a new VidProc.

Mark


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by ThomasHarte » Fri Dec 02, 2016 1:41 pm

1024MAK wrote:As far as address generation for the DRAM is concerned, there are two bus-masters, the CPU and the CRTC. However, the CRTC is connected to the CPU data bus, not the Video ULA / VidProc DRAM data bus. IC14 (a 74LS245 bi-directional tri-state buffer) links (or isolates) the two data buses. The direction (read/write) control comes from the CPU R/W pin. The enable control comes from the glue logic including the CPU clock.
Would it be possible to go a different way, keep the CRTC's address generation exactly as it is but have it act as a blitter during the non-display cycles? R/W is only an input to the CRTC at present but W is active low so if you drained that at the CRTC, would you be able to force a write rather than a read to RAM?

As ever, I'm at the limits of my negligible schematic reading abilities; please don't worry about being polite if I'm talking rubbish.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by Rich Talbot-Watkins » Fri Dec 02, 2016 1:52 pm

1024MAK wrote:My best suggestion, for adding sprites, is to have an enhanced CRTC supplying extra timing information via the cursor line
Right, this echoes what Thomas was saying upthread. It's all starting to sound more complicated than I anticipated!

It seems a shame to have to reimplement the 6845 just for a few small changes to allow synchronisation. I guess in theory, a custom Video ULA could determine the anatomy of a frame by timing changes in DISEN (would have at least a frame's latency though, and I don't know how it'd respond to "gap" modes or interlace). It could still perform the trick of caching sprite data from RAM during borders. It could also allow different palettes and arbitrary sprite positioning, so that's good.


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by simonm » Fri Dec 02, 2016 9:35 pm

Fascinating thread.
It occurred to me that if either of these fantasy hardware mods (RobC's 4096 colour ULA and some type of enhanced CRTC) became real, most folks would surely want both upgrades, so maybe that could inform design decisions, letting the two projects work hand in glove with each other.

In other thoughts, I'm no hardware guy, but I wondered if some sort of additional DMA processor component might be an option? Effectively it would compute a composite screen buffer image that is written directly to main video RAM at a speed greater than or equal to the 6845/ULA read rate.
This approach would maintain the "Beeb-ness", plus it would mean system RAM actually contains the byte data of the final image.

Since we're in fantasy-hardware mode, and I am unconstrained by any knowledge of hardware, I'll bung in an assumption that this DMA GPU widget would have a bunch of its own video RAM (for sprite/attribute data etc.) and sufficient control registers to allow plenty of scope for fun stuff like clipped sprites/tile maps/bg/fg/masks to be composited before being spat out to main RAM at 50Hz. A primitive pixel shader, I guess...

Anyway, combine any of these ideas with RobC's super ULA palette mod and we'll almost have a 16-bit-quality games console on our hands!


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by kieranhj » Fri Dec 02, 2016 10:37 pm

Rich Talbot-Watkins wrote: Yeah, but you wouldn't be using the CRTC in the way it's supposed to work. Remember, ordinarily the CRTC is generating the addresses for the tile indices during the displayed screen - but in this case we already need those addresses by then so you'd be duplicating that logic elsewhere, and then ignoring it when fetching addresses for displayed tile data. So you might as well not use the CRTC for address generation, and use a totally custom solution. And for that, you may as well just chuck a VIC-II in there! And for that you may as well just use a C64! :lol:
So reading up (briefly) on the CRTC (i.e. Wikipedia) suggests that the 6845 is in fact designed / intended for character mapped displays. Leaving aside the fact that Acorn did an awesome job ~not~ limiting the hardware to just character maps, I still don't understand what is stopping us from (theoretically) using the 6845 in its intended manner, generating 14-bit addresses and 5-bit lookups into a character map in either ROM or RAM? What's the difference that I'm missing vs how CGA works, for instance?

Sorry, this isn't anything to do with hardware sprites but I'm just curious to learn about the architecture of how the Beeb generates its display (and I guess the trade offs that Acorn made) vs what could have been theoretically possible back in the day (or is possible now that we can effectively develop our own chips in software.)


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by kieranhj » Fri Dec 02, 2016 10:45 pm

simonm wrote:In other thoughts, I'm no hardware guy, but I wondered if some sort of additional DMA processor component might be an option? Effectively it would be able to compute a composite screen buffer image that is written directly to main video RAM at a speed greater than or equal to the 6845/ULA read.
This approach would maintains the "Beeb-ness", plus it would mean system ram actually contains the byte data of the final image.
An alternative to this would be a mechanism to enable a copro to write directly into main RAM (or shadow RAM) - then you could fit it out with sufficient power to composite and blit whatever you wanted into screen memory.
simonm wrote:Anyway, combine any of these ideas with RobC's super ULA palette mod and we'll almost have a 16-bit-quality games console on our hands!
Except without the 16-bit compute to do anything really involved. 8-bit maths FTW! :lol: Probably closer to a NES than anything else. I think RobC did say in one thread that he wanted to look at interfacing the NES PPU one day...


Re: Ideas for enhanced CRTC replacement with hardware sprites

Post by jgharston » Fri Dec 02, 2016 11:38 pm

Rich Talbot-Watkins wrote:
jgharston wrote:For fast character writing. A 2-colour character is a super-fast LDY #7:.loop:LDA (src),Y:STA (dst),Y:DEY:BPL loop
(...) On the CPC, the screen is divided up differently - it's broken into 8 blocks, one block per character scanline. So first all the scanline 0s for all the characters in the screen, then all the scanline 1s, etc. To get from a screen byte to the scanline below (within the same character block) you have to add &800, which works well on the Z80, but on the 6502 would be horrible.
Ha. I just typed up a detailed reply, then re-reading found you'd written exactly the same! :)
kieranhj wrote:Ahhh, I'm finally understanding what the CRTC vs ULA does and what you're proposing. Please excuse my slowness and ignore the tipsy wish list from last night. :)
The simplest way to understand it is that the Video ULA is just a palette: all it does is translate the bytes fed to it into video colour signals. It doesn't know what the bytes it is fed represent, and it has no control over what they are or where they have come from.

