Compressed ROMs

Post by davidb » Sun Dec 13, 2015 12:09 am

While exploring ways to package programs for the Mega Games Cartridge, I wrote some simple tools to compress data on a modern computer and some routines to decompress it on the Electron. I've just finished tidying up the files for a release which you can find in this Mercurial repository.

Currently, the tools embed the compressed data into ROM images, which I realise isn't really useful for most people. The ROM code also checks that it's running on an Electron because it was originally intended for use with Elkulator snapshots.

Anyway, I thought I'd put it all up in a public place so that others can experiment with it. :)

Post by jgharston » Sun Dec 13, 2015 12:37 am

Code: Select all

; Version string
.byte "1.0", 0
The version string should be four characters: "1.00" or 18 characters: "1.00 (01 Jan 2001)".

Code: Select all

copyright_string:
.byte "(C) 2015 David Boddie", 0
There should be no space between the (C) and the following string, viz: "(C)2015 David Boddie".

Code: Select all

; Check the system in use.
    lda #129
    ldx #0
    ldy #255
    jsr $fff4
    cpx #1
    beq electron_os
    bne exit_rom
Better to use OSBYTE 0,1. INKEY-256 gives the variant within the set that OSBYTE 0 gives:

Code: Select all

; Check the system in use.
    lda #0
    ldx #1
    jsr $fff4
    txa      ; &00=Electron
    beq electron_os
    bne exit_rom

Code: Select all

service_command:
    tya                         ; push Y and X registers onto the stack
    pha
    txa
    pha
You don't need to stack X as X is always the ROM number in &F4, so you can restore it at the end of the service call by just loading it from &F4.

Code: Select all

    service_entry_exit:
    clc
    rts
The state of the flags is irrelevant on exit from the service handler. Just RTS is sufficient.

Code: Select all

    clc
    tya         ; Store the address of the command line in a new address that
    adc $f2     ; can be used with zero-based post-indexed addressing.
    sta $74
    lda $f3
    adc #0
    sta $75
You MUSTN'T MUSTN'T MUSTN'T MUSTN'T MUSTN'T MUSTN'T MUSTN'T MUSTN'T MUSTN'T use language workspace when servicing service calls as you are not the current language, you are servicing service calls. If you are a *command you can use the *command workspace at &A8-&AF, but you MUST NOT trample over memory that you do not own. And why copy the command line pointer from &F2/3 to somewhere else? Just use &F2/3; that's what it's provided to you for.
Language ROMs generated using the language_template.oph do not currently work correctly.
Possibly because I can't work out what the language code actually does after calling the decoder. Shouldn't it jump to the extracted code or something? All I can see it doing is running into an RTS, which won't do anything useful as the language startup has reset the stack so there's nothing to RTS to, and even without resetting the stack there's nothing to return to from the ROM being JMP'ed to a language.

Code: Select all

$ bbcbasic
PDP11 BBC BASIC IV Version 0.25
(C) Copyright J.G.Harston 1989,2005-2015
>_

Post by davidb » Sun Dec 13, 2015 1:26 am

jgharston wrote:The version string should be four characters: "1.00" or 18 characters: "1.00 (01 Jan 2001)".
OK, that's useful to know. Thanks.
jgharston wrote:There should be no space between the (C) and the following string, viz: "(C)2015 David Boddie".
The EAUG implies this but doesn't actually make this a requirement.
jgharston wrote:Better to use OSBYTE 0,1. INKEY-256 gives the variant within the set that OSBYTE 0 gives:

Code: Select all

; Check the system in use.
    lda #0
    ldx #1
    jsr $fff4
    txa      ; &00=Electron
    beq electron_os
    bne exit_rom
Thanks. I'll probably update my other ROMs to use this. The machine check will be removed from this code.
jgharston wrote:

Code: Select all

service_command:
    tya                         ; push Y and X registers onto the stack
    pha
    txa
    pha
You don't need to stack X as X is always the ROM number in &F4, so you can restore it at the end of the service call by just loading it from &F4.
Right.
jgharston wrote:The state of the flags is irrelevant on exit from the service handler. Just RTS is sufficient.
I'll drop the CLC.
jgharston wrote:

Code: Select all

    clc
    tya         ; Store the address of the command line in a new address that
    adc $f2     ; can be used with zero-based post-indexed addressing.
    sta $74
    lda $f3
    adc #0
    sta $75
You MUSTN'T MUSTN'T MUSTN'T MUSTN'T MUSTN'T MUSTN'T MUSTN'T MUSTN'T MUSTN'T use language workspace when servicing service calls as you are not the current language, you are servicing service calls. If you are a *command you can use the *command workspace at &A8-&AF, but you MUST NOT trample over memory that you do not own. And why copy the command line pointer from &F2/3 to somewhere else? Just use &F2/3; that's what it's provided to you for.
I hadn't read about the *command workspace. I'll use the memory range you suggest for my temporary variables. I add Y to the address in &F2/3 so that I can address the string starting with an offset of zero. I think I was doing that to make it easier to test for more than one command, though I don't implement more than one in this ROM.
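For anyone following the address arithmetic, here is a rough Python sketch of what the quoted 6502 sequence computes (CLC, TYA, ADC &F2 for the low byte, then LDA &F3, ADC #0 for the high byte). The function name and the sample values are invented for illustration; only the carry handling mirrors the code above.

```python
def add_y_to_pointer(lo, hi, y):
    """Return (lo, hi) of the 16-bit pointer hi:lo plus the 8-bit offset y."""
    total = lo + y                 # CLC / TYA / ADC &F2
    new_lo = total & 0xFF          # STA low byte
    carry = total >> 8             # carry out of the low-byte addition (0 or 1)
    new_hi = (hi + carry) & 0xFF   # LDA &F3 / ADC #0 / STA high byte
    return new_lo, new_hi

# Hypothetical values: a pointer of &0700 with Y=5, then a case that carries.
print(add_y_to_pointer(0x00, 0x07, 5))
print(add_y_to_pointer(0xFE, 0x07, 5))
```

The second call shows why the ADC #0 on the high byte is needed: the low byte wraps and the carry has to propagate.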
jgharston wrote:
Language ROMs generated using the language_template.oph do not currently work correctly.
Possibly because I can't work out what the language code actually does after calling the decoder. Shouldn't it jump to the extracted code or something? All I can see it doing is running into an RTS, which won't do anything useful as the language startup has reset the stack so there's nothing to RTS to, and even without resetting the stack there's nothing to return to from the ROM being JMP'ed to a language.
The initial ROM was being used to unpack a compressed Elkulator snapshot into memory, so there should be a stack present, but the encodesnap.py tool inserts code to jump into the program in any case. Maybe once I've fixed the workspace issue you mentioned it will have a better chance of working, though I'm not actively pursuing getting snapshots to work since there are other disadvantages to using them.

Thanks for the feedback. :)

Post by sweh » Sun Dec 13, 2015 6:10 pm

jgharston wrote:

Code: Select all

; Version string
.byte "1.0", 0
The version string should be four characters: "1.00" or 18 characters: "1.00 (01 Jan 2001)".
This is not a requirement for the Beeb/Electron/Master series of machines. It's common, but not required.

Code: Select all

copyright_string:
.byte "(C) 2015 David Boddie", 0
There should be no space between the (C) and the following string, viz: "(C)2015 David Boddie".
This is also not a requirement. Only the (C) is actually needed; everything afterwards is not specified.

Many ROMs from BITD violate both of these so-called rules. eg VIEW3.0 has "(C) 1982 Acornsoft"; BASIC EDITOR has " 1.32" as the version and "(C) 1984 Acornsoft" as the copyright.

Even some of the ROM images that came with the BBC Master violated this (eg Viewsheet, View, ADFS).
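To make the "only the (C) is needed" point concrete, here is a small Python sketch of the check as I understand it. It assumes the usual sideways ROM header layout, in which byte 7 holds the offset of a zero byte that is immediately followed by "(C)". The helper that builds a header is purely hypothetical test scaffolding with dummy entry points, not output from any real tool.

```python
def has_copyright_marker(rom: bytes) -> bool:
    """True if the offset stored in byte 7 points at a zero byte followed by "(C)"."""
    if len(rom) < 8:
        return False
    offset = rom[7]
    return rom[offset:offset + 4] == b"\x00(C)"

def build_header(title: bytes, version: bytes, copyright_text: bytes) -> bytes:
    """Build a minimal, made-up ROM header for testing (entry points are dummies)."""
    body = title + b"\x00" + version
    offset = 9 + len(body)  # index of the zero byte before the copyright string
    header = bytes([0, 0, 0, 0x4C, 0x00, 0x80, 0x82, offset, 1])
    return header + body + b"\x00" + copyright_text + b"\x00"

with_space = build_header(b"Test", b"1.0", b"(C) 2015 David Boddie")
without_space = build_header(b"Test", b"1.00", b"(C)2015 David Boddie")
print(has_copyright_marker(with_space), has_copyright_marker(without_space))
```

Under that assumption, both spellings of the copyright string pass, and the length of the version string makes no difference to the check.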
Rgds
Stephen

Post by davidb » Wed Sep 21, 2016 10:16 pm

In an attempt to violate even more rules and conventions, I've been revisiting compression in ROMFS ROMs.

I started by creating wrappers around compressed files that simply replace the original files in the ROMFS section of the ROM, but that meant including the same decompression code for multiple compressed files and finding suitable memory locations for compressed data to be placed before it is decompressed into the correct place. This works fine for files that are *RUN because the decompression code is executed instead of the original code, and the decompression code jumps into the original code after decompressing it. However, for code that is CHAINed or *LOADed, some patching is required.

Ideally, the code to feed bytes from ROM should be able to do some decompression but the algorithm I use needs to be able to refer to decompressed data, and that isn't something the ROM code has information about. It occurred to me that I could cheat by adding load addresses to the ROM, and that would help in-ROM decompression, but it also occurred to me that I could just circumvent ROMFS entirely and speed up loading in the process. What I do is use addresses in the ROM as triggers that cause the decompression code to unpack data into memory as files in ROMFS are traversed and I simply use zero-length placeholder files to orchestrate this. This means that merely cataloguing the ROM will cause decompression to be performed, but that's not really a problem for our use case.
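To illustrate why a decoder that uses back-references needs access to the data it has already produced, here is a generic LZ77-style sketch in Python. The token format is invented for this example and is not the format used by the tools in the repository.

```python
def decode(tokens):
    """Decode a list of ("lit", byte) and ("copy", distance, length) tokens."""
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        else:
            _, distance, length = token
            if distance < 1 or distance > len(out):
                raise ValueError("back-reference outside decoded data")
            # Copy byte by byte so that overlapping references repeat correctly.
            for _ in range(length):
                out.append(out[-distance])
    return bytes(out)

tokens = [("lit", ord("A")), ("lit", ord("B")), ("copy", 2, 4), ("lit", ord("C"))]
print(decode(tokens))
```

The "copy" token can only be resolved by reading back from `out`. A routine that hands over one byte at a time without knowing where earlier bytes were stored has nothing to copy from, which is the limitation described above.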

Post by jgharston » Thu Sep 22, 2016 11:52 am

davidb wrote:In an attempt to violate even more rules and conventions, I've been revisiting compression in ROMFS ROMs.
A compressed ROMFS service handler has been implemented, see the mailing list from 2006.

Essentially, it's irrelevant to the "outside world" what service calls 13 and 14 (ROMFS_Init and ROMFS_GetByte) do as long as they supply back to the caller what the caller is wanting. How service calls 13/14 get that data is entirely up to them.

I was chatting with David H a few days ago about this, and discussed how loading from ROMFS could be speeded up by the first call to ROMFS_GetByte examining the ROMFS workspace and dumping all the data across in one go and then munging the workspace to make the caller think it had got to the end of the load process, instead of the slow byte by byte loop.

Post by davidb » Sat Sep 24, 2016 12:34 pm

Thanks for the link to your BBC mailing list archives. It seems that the official archives don't quite go back that far.

It would be good to peek inside the ROMFS workspace to avoid problems with games that (re)load level files, skipping over other files in the process, since you don't want to dump things into memory that are not supposed to be loaded. In any case, I have avoided that in some cases by storing level files as uncompressed data (not using the loading triggers). It seems to work reasonably well for some games, which is about as much as I expected. :)

Post by davidb » Mon Sep 26, 2016 3:03 pm

I added some information about various recipes I used to convert UEFs to ROMs to the MGC thread. I thought it might be interesting to mention some of the limitations here.

Software that needs to fit on more than two ROMs isn't covered by my tools. I can imagine a system like the Slogger T2P* ROMs that uses ROMFS service handlers to feed bytes from a storage device (such as an SD card) to the filing systems, but this isn't too practical here. I also considered that it might be possible to drop code into RAM that pages out the first two ROMs and pages in two more from the MGC. However, that requires that we create ROMs that know the locations of other ROMs in the larger flash memory, or that we extend the menu program to keep some code in RAM where it can be called by ROMs that need to page themselves out.

Individual files in ROMs can be compressed or left uncompressed. Rather than try to implement transparent decompression as part of the ROMFS service handlers, I drop dummy blocks in the ROMFS stream and call out to a decompression routine to expand the data directly into RAM at predefined locations. Not all software is happy about this, and we don't always know where the data should go without disassembling code and seeing what it does (or using snapshots in Elkulator to see where we think the data should be placed). In addition, not all files are loaded in as blocks of data; some are read and processed serially. I think some games actually decompress data on the fly as they load it. If a file is problematic, it is left uncompressed and is loaded by ROMFS in the usual way. Level files for certain games are amongst those that need to be left uncompressed.

A lot of the games for the Electron, while probably protected using simpler methods than for the BBC Micro, are fairly hostile to expansions like the Plus 1. They perform *TAPE and try to disable the Plus 1 for no really good reason. We have to trap calls to *TAPE, like the T2P* ROMs do, so that loading doesn't grind to a halt. Unfortunately, some games use protection code that checks the vector table for "unusual" addresses. This means that those games fail to load - intentionally hanging or resetting the vectors - if they see addresses in RAM. However, for some of these, we can place our replacement for *TAPE in memory and still put an address in the vector table that refers to a location in ROM. This works because the Electron's ROM contains data (at &EF97) that just happens to be the valid 6502 instructions for a jump to a reasonable location in RAM (&D44). If we put our *TAPE override code in that location and set the BYTEV vector to refer to the convenient data in the OS, protection code that checks for addresses in RAM will be fooled and loading can continue from ROM. There are, however, later protection methods that are more thorough in their checks, and I didn't find a workaround for those. :(

It is possible that improvements to the compression routine will cause games that are just over two ROMs to squeeze under the space limit. I'd like to see that because some multi-load games would be worth getting onto ROM.

Post by davidb » Sat Jan 27, 2018 12:06 am

davidb wrote:However, for some of these, we can place our replacement for *TAPE in memory and still put an address in the vector table that refers to a location in ROM. This works because the Electron's ROM contains data (at &EF97) that just happens to be the valid 6502 instructions for a jump to a reasonable location in RAM (&D44).
Just clarifying this because I confused myself while talking to crj at the ABUG meet-up. The data in the OS ROM at &EF80 (&2F80 from the start of the ROM) looks like this:

Code: Select all

0000:2F80 | 55 54 52 41  44 43 48 41  49 4E 4C 49  53 54 4D 4F | UTRADCHAINLISTMO
0000:2F90 | 44 45 4E 45  58 54 4F 4C  44 0D 50 4C  4F 54 4C 4F | DENEXTOLD.PLOTLO
0000:2FA0 | 43 41 4C 52  55 4E 0D 53  54 45 50 54  48 45 4E 55 | CALRUN.STEPTHENU
For some reason it's part way through a collection of BASIC keywords, which seems odd, but in any case we can see that the data at &EF97 (or &2F97 from the start of the ROM) is 4C 44 0D. This could be interpreted as machine code, which would make it an absolute jump instruction:

Code: Select all

JMP &0D44
So we reuse data in ROM as code to sidestep the check for RAM addresses in the vector table. Fortunately, &D44 isn't a (very) bad place to stash code. ;)
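As a quick cross-check of that reading, here is a small Python sketch that decodes the three bytes as a 6502 absolute JMP (opcode &4C followed by a little-endian address). The function name is made up for the example; the row of bytes is the &2F90 line from the dump above.

```python
def decode_jmp_absolute(data: bytes, offset: int):
    """Return the target address if data[offset:offset+3] is JMP abs, else None."""
    chunk = data[offset:offset + 3]
    if len(chunk) < 3 or chunk[0] != 0x4C:
        return None
    return chunk[1] | (chunk[2] << 8)  # low byte first, then high byte

# Row &2F90 of the dump: "DENEXTOLD.PLOTLO"
row = bytes.fromhex("44454E4558544F4C440D504C4F544C4F")
target = decode_jmp_absolute(row, 0x2F97 - 0x2F90)
print(hex(target))
```

The "L", "D" and carriage return at the end of "OLD" are what supply the opcode and the two address bytes.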

Post by jgharston » Sat Jan 27, 2018 12:30 am

davidb wrote:Just clarifying this because I confused myself while talking to crj at the ABUG meet-up. The data in the OS ROM at &EF80 (&2F80 from the start of the ROM) looks like this:

Code: Select all

0000:2F80 | 55 54 52 41  44 43 48 41  49 4E 4C 49  53 54 4D 4F | UTRADCHAINLISTMO
0000:2F90 | 44 45 4E 45  58 54 4F 4C  44 0D 50 4C  4F 54 4C 4F | DENEXTOLD.PLOTLO
0000:2FA0 | 43 41 4C 52  55 4E 0D 53  54 45 50 54  48 45 4E 55 | CALRUN.STEPTHENU
For some reason it's part way through a collection of BASIC keywords, which seems odd
Why would it be odd? How else would the keyboard handler supply the characters that are printed in red on the Electron keyboard keys?

The Electron MOS does include a service call "function+letter pressed, give me an expansion string", but BASIC is not a service ROM, so in the absence of anybody responding the MOS reverts to using its own expansion strings in that table. If BASIC was a service ROM, or the ROM header had some way of pointing to a keypress expansion table, that wouldn't be needed, but BASIC was written before the Electron was invented, and it's very difficult to include an API into something for something that hasn't even occurred to anybody to think about. And BASIC was written very much as being a non-service ROM and the only non-service ROM, with it doing nothing that the MOS can't do. There are some fossils in OS 0.10 that suggest an intention that ROM languages would have been selected by a *command matching the ROM title, with the ROM title being the first word in a ROM command table.

As to using the keyboard table as code to jump into, have you confirmed that it's at the same place in all Electron MOSes? RAMCount has to parse the Electron MOS to find the acorn bitmap as it is in a different place in different MOSes.

Ah. I've just checked, and the keytable *isn't* in the same place in different MOSes. In Elk OS 1.00 it is at &EF18 and in 64K MOS 3 it is at &EF01. I haven't spotted any simple way of finding where the table is other than, eg, scanning for the string "AUTO". I thought it might be at an offset from the keypress table at OSBYTE &AC/D, but it doesn't seem to be.
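On a modern machine, the scanning approach could be sketched in Python roughly as follows. This assumes a 16K OS ROM image mapped at &C000 (consistent with &2F80 in the image appearing at &EF80). The test image is synthetic: a blank 16K buffer holding only the three rows from the dump quoted above, so it says nothing about where the table sits in any real MOS version.

```python
OS_ROM_BASE = 0xC000  # assumed address of the first byte of the OS ROM image

def find_addresses(rom: bytes, pattern: bytes, base: int = OS_ROM_BASE):
    """Return the mapped address of every occurrence of pattern in the image."""
    addresses = []
    start = rom.find(pattern)
    while start != -1:
        addresses.append(base + start)
        start = rom.find(pattern, start + 1)
    return addresses

# Synthetic image containing only the three dump rows at offset &2F80.
rows = bytes.fromhex(
    "5554524144434841494E4C4953544D4F"
    "44454E4558544F4C440D504C4F544C4F"
    "43414C52554E0D535445505448454E55"
)
image = bytearray(0x4000)
image[0x2F80:0x2F80 + len(rows)] = rows
image = bytes(image)

print([hex(a) for a in find_addresses(image, bytes([0x4C, 0x44, 0x0D]))])
print([hex(a) for a in find_addresses(image, b"OLD\r")])
```

Running the same search over real ROM images of each MOS version would show whether the byte sequence turns up at a common address.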

Post by davidb » Sat Jan 27, 2018 1:11 am

jgharston wrote:Why would it be odd? How else would the keyboard handler supply the characters that are printed in red on the Electron keyboard keys?
Aha! I knew there would be a good reason for the table to be present. I'd forgotten about the keyword shortcuts. Thanks for clearing that up.
jgharston wrote:As to using the keyboard table as code to jump into, have you confirmed that it's at the same place in all Electron MOSes? RAMCount has to parse the Electron MOS to find the acorn bitmap as it is in a different place in different MOSes.

Ah. I've just checked, and the keytable *isn't* in the same place in different MOSes. In Elk OS 1.00 it is at &EF18 and in 64K MOS 3 it is at &EF01. I haven't spotted any simple way of finding where the table is other than, eg, scanning for the string "AUTO". I thought it might be at an offset from the keypress table at OSBYTE &AC/D, but it doesn't seem to be.
Hmm. That might break a few games on 64K Electrons. I'd have to use &EF80 on those machines.

I might look at other sequences earlier in the ROMs where I can find identical data at the same positions.

Post by crj » Sat Jan 27, 2018 1:43 am

jgharston wrote:Why would it be odd? How else would the keyboard handler supply the characters that are printed in red on the Electron keyboard keys?
I thought the current language was supposed to provide them in response to calls to the language entry point with A=2,3? And, after all, BASIC does have to contain the text of its tokens anyway.

Or is this yet another example of BBC BASIC being special and different?

Post by daveejhitchins » Sat Jan 27, 2018 9:31 am

davidb wrote:Hmm. That might break a few games on 64K Electrons.
Would there be any games playable at Turbo speeds? Recommendations would always be to run at normal speeds! A check for this could be carried out . . .

Dave H :D

Post by jms2 » Sat Jan 27, 2018 9:39 am

daveejhitchins wrote:Would there be any games playable at Turbo speeds?
It varies depending on the game. For some it made no difference, others became unplayable and a few were actually improved.

Post by jms2 » Sat Jan 27, 2018 9:43 am

crj wrote:
jgharston wrote:Why would it be odd? How else would the keyboard handler supply the characters that are printed in red on the Electron keyboard keys?
I thought the current language was supposed to provide them in response to calls to the language entry point with A=2,3? And, after all, BASIC does have to contain the text of its tokens anyway.

Or is this yet another example of BBC BASIC being special and different?
Jonathan's post does explain this. All of your statements above are correct, but the key point is that BASIC 2 was written before the Electron was designed, so it knows nothing of the Elk-specific soft key system. It was important for the Elk to run exactly the same BASIC ROM as the Beeb, so BASIC doesn't do its own expansions.

Post by davidb » Sat Jan 27, 2018 12:00 pm

daveejhitchins wrote:
davidb wrote:Hmm. That might break a few games on 64K Electrons.
Would there be any games playable at Turbo speeds? Recommendations would always be to run at normal speeds! A check for this could be carried out . . .
I think I would just add a bit of code to those ROMs to look for the correct byte sequence before changing the vector table, so it shouldn't be hard to make them compatible with 64K Electrons.

Post by davidb » Sat Jan 27, 2018 5:03 pm

davidb wrote:I think I would just add a bit of code to those ROMs to look for the correct byte sequence before changing the vector table, so it shouldn't be hard to make them compatible with 64K Electrons.
I just made sure that extra code is added for those ROMs that rely on the OS ROM to provide useful byte sequences. It simply checks for a JMP instruction and, if not found, uses an address a bit earlier in the ROM instead. Some of the games this hack tries to fix won't actually run on a 64K Electron, anyway, but it fixes a few.
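The logic of that check is simple enough to model in a few lines of Python. This is only a host-side sketch of the decision, not the 6502 code that ends up in the ROMs. The addresses in the example (&EF97 as the first choice, &EF80 as the alternative) are taken from earlier in this thread for illustration, and the ROM images are synthetic.

```python
JMP_ABS = 0x4C        # 6502 opcode for JMP absolute
OS_ROM_BASE = 0xC000  # assumed address of the first byte of the OS ROM image

def choose_vector_target(rom: bytes, primary: int, fallback: int) -> int:
    """Return primary if a JMP opcode sits at that address, else fallback."""
    if rom[primary - OS_ROM_BASE] == JMP_ABS:
        return primary
    return fallback

# Synthetic images: one with JMP &0D44 at &EF97, one with nothing there.
with_jmp = bytearray(0x4000)
with_jmp[0x2F97:0x2F9A] = bytes([0x4C, 0x44, 0x0D])
without_jmp = bytes(0x4000)

print(hex(choose_vector_target(bytes(with_jmp), 0xEF97, 0xEF80)))
print(hex(choose_vector_target(without_jmp, 0xEF97, 0xEF80)))
```

A stricter version could also confirm the two address bytes after the opcode, since a lone &4C byte does not guarantee a jump to the expected location.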

Are there other variants of the Electron OS out there? What about the German Electrons, or the Elektuur variants?
