Level Data decoding techniques

bbc micro/electron/atom/risc os coding queries and routines
Post Reply
User avatar
sa_scott
Posts: 154
Joined: Wed Feb 09, 2011 11:34 pm
Location: Witley, Surrey, UK
Contact:

Level Data decoding techniques

Post by sa_scott » Wed Aug 19, 2020 3:47 pm

Hi all,

I've been recently looking at how best to improve Androidz, but the current listing gives me very little room to improve things, before running out of memory. Even with PAGE set to &1100, I know I can - among other things - compress the level data. The data is uncompressed, and is frankly in dire need of compression.

I came up (or rather, cribbed) this approach, which takes every two numbers, and adds 49 to the value, which then outputs the ASCII character. This approach halves the length of each data string, but I know better techniques are available.

Code: Select all

   10 MODE7
   20 N$=""
   30 FOR Z%=1 TO 11
   40   READ C$
   50   oldstring$=C$
   60   FOR A%=0 TO 19 STEP 2
   70     num$=MID$(oldstring$,A%,2)
   80     PRINT "num$="num$;
   90     N%=VAL(num$)+49
  100     PRINT "N%="N%;
  110     N$=N$+CHR$N%
  120   NEXT
  130   N$=N$+","
  140 NEXT
  150 PRINT "600 DATA ";N$
  160 DATA 000000000000000000
  170 DATA 011221120021122110
  180 DATA 020000000000000020
  190 DATA 020212121212121020
  200 DATA 010100000000001010
  210 DATA 010000000000000010
  220 DATA 010100000000001010
  230 DATA 020212121212121020
  240 DATA 020000000000000020
  250 DATA 011221120021122110
  260 DATA 000000000000000000
I've looked at listings from games such as Danger Dog, Dickie Brickie which all use more advanced variations of the above approach. This Twitter thread was quite interesting (https://twitter.com/nemo20000/status/12 ... 8229805057), which I got involved with, the suggestion to use 4 digits to one byte is certainly interesting (https://twitter.com/nemo20000/status/12 ... 6136659978), but I've developed brain fry trying to figure it all out!

Does anyone have any techniques that would be of help?

Thanks in advance!

Stephen
--
Stephen Scott, Digital Media Professional
www.sassquad.net

User avatar
geraldholdsworth
Posts: 642
Joined: Tue Nov 04, 2014 9:42 pm
Location: Inverness, Scotland
Contact:

Re: Level Data decoding techniques

Post by geraldholdsworth » Wed Aug 19, 2020 4:30 pm

The Repton series uses similar techniques - it uses 5 bits for a character and these are stored as 5 bits, so a byte will contain part of another character.
Have a look at Decoding Repton where I've, hopefully, explained it better.
Gerald Holdsworth
Repton Resource Page
www.reptonresourcepage.co.uk

User avatar
sa_scott
Posts: 154
Joined: Wed Feb 09, 2011 11:34 pm
Location: Witley, Surrey, UK
Contact:

Re: Level Data decoding techniques

Post by sa_scott » Wed Aug 19, 2020 4:46 pm

geraldholdsworth wrote:
Wed Aug 19, 2020 4:30 pm
The Repton series uses similar techniques - it uses 5 bits for a character and these are stored as 5 bits, so a byte will contain part of another character.
Have a look at Decoding Repton where I've, hopefully, explained it better.
Thanks for your suggestion. I'm still not seeing it make sense, but to be fair, I've spent all week having to test webpage forms on Browserstack, which if anyone has used before - sucks away your brain!

If anyone can mangle my code snippet into something useful, I would be so grateful!
--
Stephen Scott, Digital Media Professional
www.sassquad.net

User avatar
ChrisB
Posts: 78
Joined: Wed Oct 05, 2011 10:37 pm
Location: Surrey
Contact:

Re: Level Data decoding techniques

Post by ChrisB » Wed Aug 19, 2020 6:45 pm

Firstly I think your code has some issues - you start searching the string from position 0 but MID$ is indexed from position 1. You're also counting to 19 - but there are only 18 characters so you For loop should be
FOR A%=1 TO 17 STEP 2

Secondly - the first and last lines are all zeros - as are the first and last characters. If you assume this then you can not store this data (obviously the decode requires more code but if there are lots of levels you should save this.)

So - you level data is 198 bytes. Your technique takes this down to 99 bytes. Assuming zeros for the edges reduces this to 72 bytes.

Lastly if I understand the requirement correctly here it's to have some easily decompressed data that fits into DATA statements. As such using characters outside of the normal printable range isn't useful (so we can't use the whole value of a byte - 0 to 255). However - you digits are only 0-2 so you only need 2 bits to store them. This means we can store 3 digits in one byte easily with space to spare and keep within the printable range. Some pseudo-code.

Code: Select all

for each group of 3 digits.
C = value of first digit.
C = C + value of 2nd digit * 4
C = C + value of 3rd digit * 16
Store character C+48
Where 48 is "0" but you could use other values. That would bring you down to 48 bytes assuming the edges are zero or 66 otherwise.

User avatar
sa_scott
Posts: 154
Joined: Wed Feb 09, 2011 11:34 pm
Location: Witley, Surrey, UK
Contact:

Re: Level Data decoding techniques

Post by sa_scott » Wed Aug 19, 2020 11:46 pm

ChrisB wrote:
Wed Aug 19, 2020 6:45 pm
Firstly I think your code has some issues - you start searching the string from position 0 but MID$ is indexed from position 1. You're also counting to 19 - but there are only 18 characters so you For loop should be
FOR A%=1 TO 17 STEP 2

Secondly - the first and last lines are all zeros - as are the first and last characters. If you assume this then you can not store this data (obviously the decode requires more code but if there are lots of levels you should save this.)
Thanks for pointing out those errors. I've addressed them in the revised code block below.This now prints out a DATA statement of the revised level data. I've adjusted the value of what is taken from each ASCII code to give some alternate characters. I noticed Matthew Eastmond's games such as Danger Dog use this technique to better obfuscate the map. I believe his routine decodes a value of how many of map item X it should print, rather than in a verbose manner like my data.

I'll have another go at this, taking into account your other suggestions. But thanks again. If anyone else has any alternative techniques, would be appreciated!

Code: Select all

   10 MODE7
   20 N$=""
   30 FOR Z%=1 TO 11
   40 READ C$
   50 oldstring$=C$
   60 FOR A%=1 TO 18 STEP 2
   70 num$=MID$(oldstring$,A%,2)
   80 REM PRINT "num$="num$;
   90 N%=VAL(num$)+46
  100 REM PRINT "N%="N%;
  110 N$=N$+CHR$N%
  120 NEXT
  130 IF Z%<11 N$=N$+","
  140 NEXT
  150 PRINT "600 DATA ";N$
  160 DATA 000000000000000000
  170 DATA 011221120021122110
  180 DATA 020000000000000020
  190 DATA 020212121212121020
  200 DATA 010100000000001010
  210 DATA 010000000000000010
  220 DATA 010100000000001010
  230 DATA 020212121212121020
  240 DATA 020000000000000020
  250 DATA 011221120021122110
  260 DATA 000000000000000000
  
--
Stephen Scott, Digital Media Professional
www.sassquad.net

User avatar
ChrisB
Posts: 78
Joined: Wed Oct 05, 2011 10:37 pm
Location: Surrey
Contact:

Re: Level Data decoding techniques

Post by ChrisB » Thu Aug 20, 2020 8:50 am

So - given the sample of level data that you have has lots of repeating zeros here's a very simplistic run length encoding implementation that stores the number of zeros as well as storing 2 byte pairs. Hopefully the comments are self explanatory. This makes assumptions about the level data including that there will a leading and final string of zeros.

Code: Select all

   10 MODE7
   20 REM Concatenate level data into one string.
   30 FOR Z%=1 TO 11
   40 READ C$
   50 L$=L$+C$
   60 NEXT
   70 REM Print original string for debug
   80 PRINT L$
   90 REM Output string
  100 O$=""
  110 REM Current string offset
  120 P%=1
  130 REM Flag for whether we are doing rle or not
  140 S%=FALSE
  150 REM Current length of zeros
  160 L%=0
  170 REPEAT
  180 IF S%=FALSE PROCrle ELSE PROCenc
  190 REM Increment pointer
  200 P%=P%+1
  210 UNTIL P%>LEN(L$)
  220 REM Add final rle section
  230 O$=O$+CHR$(ASC("A")+L%)
  240 PRINT "Original length = ";LEN(L$)
  250 REM Decode
  260 D$=""
  270 FOR X%=1 TO LEN(O$)
  280 C%=ASC(MID$(O$,X%,1))
  290 REM I have left "asc(" in here for clarity - probably faster to use the actual values.
  300 IF C%>ASC("a") THEN D$=D$+RIGHT$("00"+STR$(C%-ASC("a")),2) ELSE D$=D$+STRING$(C%-ASC("A"),"0")
  310 NEXT
  320 PRINT D$
  330 IF D$=L$ THEN PRINT "Successfully decoded" ELSE PRINT "Decode error!"
  340 PRINT "Encoded length = ";LEN(O$)
  350
  360
  370 PRINT "1000 DATA ";O$
  380 END
  390
  400 DEFPROCrle
  410 REM RLE encode the zeros - or if next byte is not zero add the RLE encoded length
  420 IF VAL(MID$(L$,P%,1))=0 THEN L%=L%+1 ELSE O$=O$+CHR$(ASC("A")+L%):P%=P%-1:S%=TRUE
  430 ENDPROC
  440 DEFPROCenc
  450 REM Encode 2 bytes into 1 character
  460 O$=O$+CHR$(VAL(MID$(L$,P%,2))+ASC("a"))
  470 REM Check if next two bytes are zeros and if so start RLE
  480 IF VAL(MID$(L$,P%+2,2))=0 THEN S%=FALSE:L%=0
  490 P%=P%+1
  500 ENDPROC
  510
  520 DATA 000000000000000000
  530 DATA 011221120021122110
  540 DATA 020000000000000020
  550 DATA 020212121212121020
  560 DATA 010100000000001010
  570 DATA 010000000000000010
  580 DATA 010100000000001010
  590 DATA 020212121212121020
  600 DATA 020000000000000020
  610 DATA 011221120021122110
  620 DATA 000000000000000000

User avatar
sa_scott
Posts: 154
Joined: Wed Feb 09, 2011 11:34 pm
Location: Witley, Surrey, UK
Contact:

Re: Level Data decoding techniques

Post by sa_scott » Thu Aug 20, 2020 12:37 pm

ChrisB wrote:
Thu Aug 20, 2020 8:50 am
So - given the sample of level data that you have has lots of repeating zeros here's a very simplistic run length encoding implementation that stores the number of zeros as well as storing 2 byte pairs. Hopefully the comments are self explanatory. This makes assumptions about the level data including that there will a leading and final string of zeros.
Thank you so much for the code sample. The curveball is that my level data doesn't always have leading and final strings of zeros!

https://github.com/sassquad/androidz/bl ... ROID2#L120 is the level data from which my initial code sample was taken. The lines after this are for other levels, and you'll see that some levels don't have that assumption.

Apologies, I appreciate the time you spent putting this together.

Stephen
--
Stephen Scott, Digital Media Professional
www.sassquad.net

User avatar
ChrisB
Posts: 78
Joined: Wed Oct 05, 2011 10:37 pm
Location: Surrey
Contact:

Re: Level Data decoding techniques

Post by ChrisB » Thu Aug 20, 2020 7:12 pm

It's ok - was just a bit of a distraction. Here's a routine to store 4 digits to a character as mentioned in the original thread. Each level is not 50 bytes (last byte only encodes 2 locations)

Code: Select all

   10 MODE7
   20 FOR Z%=1 TO 11:READ C$:L$=L$+C$:NEXT
   30 PRINT L$
   40 O$=""
   50 FOR X%=1 TO LEN(L$) STEP 4
   60 O$=O$+FNpack(MID$(L$,X%,4))
   70 NEXT
   80 PRINT '"Encoded length = ";LEN(O$)
   90 FOR Z%=1 TO LEN(O$)-1
  100 PRINT FNunpack(MID$(O$,Z%,1));
  110 NEXT
  120 PRINT LEFT$(FNunpack(RIGHT$(O$,1)),2)
  130 PRINT "1000 DATA ";O$
  140 END
  150
  160 DEFFNpack(D$)
  170 LOCAL P%
  180 P%=VAL(MID$(D$,1,1))
  190 P%=P%+VAL(MID$(D$,2,1))*3
  200 P%=P%+VAL(MID$(D$,3,1))*9
  210 P%=P%+VAL(MID$(D$,4,1))*27
  220 PRINT CHR$(P%+46);
  230 =CHR$(P%+46)
  240
  250 DEFFNunpack(D$)
  260 LOCAL P$,D%,Y%
  270 D%=ASC(D$)-46
  280 P$=""
  290 FOR Y%=0 TO 3
  300 P$=P$+CHR$((D% MOD 3)+48)
  310 D%=D% DIV 3
  320 NEXT
  330 =P$
  340
  350 DATA 000000000000000000
  360 DATA 011221120021122110
  370 DATA 020000000000000020
  380 DATA 020212121212121020
  390 DATA 010100000000001010
  400 DATA 010000000000000010
  410 DATA 010100000000001010
  420 DATA 020212121212121020
  430 DATA 020000000000000020
  440 DATA 011221120021122110
  450 DATA 000000000000000000

User avatar
sa_scott
Posts: 154
Joined: Wed Feb 09, 2011 11:34 pm
Location: Witley, Surrey, UK
Contact:

Re: Level Data decoding techniques

Post by sa_scott » Thu Aug 20, 2020 11:36 pm

ChrisB wrote:
Thu Aug 20, 2020 7:12 pm
It's ok - was just a bit of a distraction. Here's a routine to store 4 digits to a character as mentioned in the original thread. Each level is not 50 bytes (last byte only encodes 2 locations)
You're a star! I'll have to play with this when I get the chance. It's great to have had your assistance.

If anyone has any other techniques, please feel free to share!

Stephen
--
Stephen Scott, Digital Media Professional
www.sassquad.net

User avatar
sa_scott
Posts: 154
Joined: Wed Feb 09, 2011 11:34 pm
Location: Witley, Surrey, UK
Contact:

Re: Level Data decoding techniques

Post by sa_scott » Sun Aug 23, 2020 1:16 am

ChrisB wrote:
Thu Aug 20, 2020 7:12 pm
It's ok - was just a bit of a distraction. Here's a routine to store 4 digits to a character as mentioned in the original thread. Each level is not 50 bytes (last byte only encodes 2 locations)
I've used the supplied techniques to spit out a screen, as per the code sample below.

Code: Select all

   10 MODE5
   20 VDU23,224,255,129,129,129,129,129,129,129
   30 VDU23,225,129,129,129,129,129,129,129,255
   40 VDU23,226,170,0,85,0,170,0,85,0
   50 READ O$
   60 XX%=1:YY%=2
   70 FOR Z%=1 TO LEN(O$)-1
   80 D%=ASC(MID$(O$,Z%,1))-46
   90 FOR Y%=0 TO 3
  100 BL%=(D% MOD 3)+48
  110 IF BL%=48 B$=CHR$17+CHR$3+CHR$226+CHR$10+CHR$8+CHR$226
  120 IF BL%=49 B$=CHR$17+CHR$1+CHR$224+CHR$10+CHR$8+CHR$225
  130 IF BL%>49 B$=CHR$17+CHR$2+"X"+CHR$10+CHR$8+"X"+CHR$11
  140 D%=D% DIV 3
  150 PRINTTAB(XX%,YY%)B$;
  160 XX%=XX%+1:IF XX%>18 XX%=1:YY%=YY%+2
  170 NEXT
  180 NEXT
  190 PRINTTAB(0,25);
  200 END
  210 DATA ....Ib5r<4...fsttAL..7J...7L..7esttA4...Kb5r<.....
  
It all appears to work fine, except the last two blocks are never printed. I'm wondering what is causing this, and I am stumped. The inner loop is not exiting cleanly for some reason. Here's the screen output.
screengrab.png
chrisb - you mentioned in your post about the last byte only encodes 2 locations. I'm therefore wondering if my loop is interfering with this fact?

Thanks in advance.

Stephen
--
Stephen Scott, Digital Media Professional
www.sassquad.net

User avatar
ChrisB
Posts: 78
Joined: Wed Oct 05, 2011 10:37 pm
Location: Surrey
Contact:

Re: Level Data decoding techniques

Post by ChrisB » Sun Aug 23, 2020 9:22 am

Yes - your level data is 18 digits by 11 lines = 198 bytes. 198/4=49.5 - which is the problem. My encode routine will pad the last byte with 00. So Line 70 in your code

Code: Select all

   70 FOR Z%=1 TO LEN(O$)-1
is taken from my decode where I skipped the last byte and made it a special case.

But if you go to LEN(O$) without the -1 then you get 2 extra characters at the end. :(

You could add a condition to the main loop body to get round this or you can do this by changing line 90 to

Code: Select all

   90   FOR Y%=0 TO 3+2*(LEN(O$)=Z%)
LEN(O$)=Z% will do the comparison and return TRUE or FALSE which are represented by -1 and 0 respectively. So when Z% is the last character of the string Y% the calculation will be 0 TO 3+2*-1 which is 2 skipping the last 2 characters.

Code: Select all

   10 MODE5
   20 VDU23,224,255,129,129,129,129,129,129,129
   30 VDU23,225,129,129,129,129,129,129,129,255
   40 VDU23,226,170,0,85,0,170,0,85,0
   50 READ O$
   60 XX%=1:YY%=2
   70 FOR Z%=1 TO LEN(O$)
   80 D%=ASC(MID$(O$,Z%,1))-46
   90 FOR Y%=0 TO 3+2*(Z%=LEN(O$))
  100 BL%=(D% MOD 3)+48
  110 IF BL%=48 B$=CHR$17+CHR$3+CHR$226+CHR$10+CHR$8+CHR$226
  120 IF BL%=49 B$=CHR$17+CHR$1+CHR$224+CHR$10+CHR$8+CHR$225
  130 IF BL%>49 B$=CHR$17+CHR$2+"X"+CHR$10+CHR$8+"X"+CHR$11
  140 D%=D% DIV 3
  150 PRINTTAB(XX%,YY%)B$;
  160 XX%=XX%+1:IF XX%>18 XX%=1:YY%=YY%+2
  170 NEXT
  180 NEXT
  190 PRINTTAB(0,25);
  200 END
  210 DATA ....Ib5r<4...fsttAL..7J...7L..7esttA4...Kb5r<.....
  

User avatar
sa_scott
Posts: 154
Joined: Wed Feb 09, 2011 11:34 pm
Location: Witley, Surrey, UK
Contact:

Re: Level Data decoding techniques

Post by sa_scott » Sun Aug 23, 2020 12:22 pm

ChrisB wrote:
Sun Aug 23, 2020 9:22 am
Yes - your level data is 18 digits by 11 lines = 198 bytes. 198/4=49.5 - which is the problem. My encode routine will pad the last byte with 00. So Line 70 in your code

Code: Select all

   70 FOR Z%=1 TO LEN(O$)-1
is taken from my decode where I skipped the last byte and made it a special case.
You see, this is why I'm not left near back end code, but left to deal with CSS in my day job!

Yes, indeed 198 cannot divide evenly by 4. I spent a few hours on this last night but clearly midnight coding is something best left left in my younger years!

I'll take a look at this later, but thanks again!
--
Stephen Scott, Digital Media Professional
www.sassquad.net

User avatar
sa_scott
Posts: 154
Joined: Wed Feb 09, 2011 11:34 pm
Location: Witley, Surrey, UK
Contact:

Re: Level Data decoding techniques

Post by sa_scott » Tue Aug 25, 2020 2:14 pm

I'm still stumped with how to resolve things. I've decided to re-introduce the left and right walls, in effect having 20 characters per data row. I did try to include the top and bottom 'walls' but this makes the total string length 260 characters, 5 characters too long (13 * 20 = 260). By making it still 11 rows, but now 20 characters, this makes it 220 characters (11 * 20). This *is* divisible by 4.

I've amended this post - as I've been able to get the level data to get encoded and decoded after amending the level criteria. I needed to revise it to do the following:
  • I needed to include the left and right walls. This meant that each row was increased from 18 to 20 characters.
    I needed to include the top and bottom walls. This couldn't be done before, as the resulting string would be too long (13 * 20 = 260 - a string cannot be longer than 256 characters)
The upshot of the above is that 260 is divisible by 4, so Chrisb's routine could be modified to not have to take that anomaly into account as before.

I've now attached the two revised listings, one for encoding, the other for decoding and displaying the screen in Mode 5. The encoder has to patch the top and bottom walls onto either end of the 'packed' string, as I already knew that a row of 20 wall bricks would encode as 'VVVVV'. Again, this was due to the 256 character limit in assembling the level data string.

As I write, I realise this could probably be overcome by doing a partial encode of half the data to one string, then encoding the last half of the data to another string, then concatenating them together, but this will do for now.

Thanks again chrisb for your help in earlier stages of this thread, and I hope it is useful to others out there!

Encoder

Code: Select all

   10 MODE7
   20 FOR Z%=1 TO 11:READ C$:L$=L$+C$:NEXT
   30 PRINT L$
   40 O$=""
   50 FOR X%=1 TO LEN(L$) STEP 4
   60 O$=O$+FNpack(MID$(L$,X%,4))
   70 NEXT
   80 F$="VVVVV"+O$+"VVVVV"
   90 PRINT '"Encoded length = ";LEN(F$)'"(including 10 extra characters for top and bottom walls)"
  100 PRINT '"Outputting revised decoded level data:":FOR Z%=1 TO LEN(F$)
  110 PRINT FNunpack(MID$(F$,Z%,1));
  120 NEXT
  130 PRINT '"Final DATA statement of encrypted level data:"
  140 PRINT "1000 DATA ";F$;
  150 END
  160
  170 DEFFNpack(D$)
  180 LOCAL P%
  190 P%=VAL(MID$(D$,1,1))
  200 P%=P%+VAL(MID$(D$,2,1))*3
  210 P%=P%+VAL(MID$(D$,3,1))*9
  220 P%=P%+VAL(MID$(D$,4,1))*27
  230 PRINT CHR$(P%+46);
  240 =CHR$(P%+46)
  250
  260 DEFFNunpack(D$)
  270 LOCAL P$,D%,Y%
  280 D%=ASC(D$)-46
  290 P$=""
  300 FOR Y%=0 TO 3
  310 P$=P$+CHR$((D% MOD 3)+48)
  320 D%=D% DIV 3
  330 NEXT
  340 =P$
  350
  360 DATA 10000000000000000001
  370 DATA 10112211200211221101
  380 DATA 10200000000000000201
  390 DATA 10202121212121210201
  400 DATA 10101000000000010101
  410 DATA 10100000000000000101
  420 DATA 10101000000000010101
  430 DATA 10202121212121210201
  440 DATA 10200000000000000201
  450 DATA 10112211200211221101
  460 DATA 10000000000000000001
Decoder and display of level

Code: Select all

   10 MODE5
   20 VDU23,224,255,129,129,129,129,129,129,129
   30 VDU23,225,129,129,129,129,129,129,129,255
   40 VDU23,226,170,0,85,0,170,0,85,0
   50 READ O$
   60 XX%=0:YY%=0
   70 FOR Z%=1 TO LEN(O$)
   80 D%=ASC(MID$(O$,Z%,1))-46
   90 FOR Y%=0 TO 3
  100 BL%=(D% MOD 3)+48
  110 IF BL%=48 B$=CHR$17+CHR$3+CHR$226+CHR$10+CHR$8+CHR$226
  120 IF BL%=49 B$=CHR$17+CHR$1+CHR$224+CHR$10+CHR$8+CHR$225
  130 IF BL%>49 B$=CHR$17+CHR$2+"X"+CHR$10+CHR$8+"X"+CHR$11
  140 D%=D% DIV 3
  150 PRINTTAB(XX%,YY%)B$;
  160 XX%=XX%+1:IF XX%>19 XX%=0:YY%=YY%+2
  170 NEXT
  180 NEXT
  190 PRINTTAB(0,25);
  200 END
  210 DATA VVVVV/...ISZfzMA...OA```O8/.IL8...L8/.ILA```OA...OSZfzM/...IVVVVV
Attachments
final-screen.png
--
Stephen Scott, Digital Media Professional
www.sassquad.net

User avatar
sa_scott
Posts: 154
Joined: Wed Feb 09, 2011 11:34 pm
Location: Witley, Surrey, UK
Contact:

Re: Level Data decoding techniques

Post by sa_scott » Wed Aug 26, 2020 11:15 pm

Just a quick note, I wasn't aware that editing a post does not amend its updated time. I managed to do some further work and address the issues I had. Do check out my previous post for more info.
--
Stephen Scott, Digital Media Professional
www.sassquad.net

Post Reply

Return to “programming”