basictool - a command-line tool to tokenise, de-tokenise, pack and analyse BBC BASIC

handy tools that can assist in the development of new software
Richard Russell
Posts: 2523
Joined: Sun Feb 27, 2011 10:35 am
Location: Downham Market, Norfolk

Re: basictool - a command-line tool to tokenise, de-tokenise, pack and analyse BBC BASIC

Post by Richard Russell »

lurkio wrote:
Sat Aug 14, 2021 4:21 pm
I hadn't thought of placing the ON ERROR in that precise spot.
A trapped error always resets the stack pointer (in early versions of BBC BASIC anyway, before the introduction of ON ERROR LOCAL) so any 'memory' of being inside a loop or FN/PROC is lost. Placing the ON ERROR statement outside any FOR or REPEAT loop, and not in a FN/PROC, emphasises that, and means that if you allow execution to continue there is no danger of BASIC getting confused because the stack pointer and the position within the program's structure have got 'out of phase'.
I now think that that's probably the best way to do error handling if you're trying to avoid using line-numbers in the sort of BASIC program that has a main loop, such as a text adventure.
There may be multiple places where you might want to put such an ON ERROR, depending on which stage of a program you are in, so long as they are all at the 'lowest level' of the structure (outside of any loops). For example:

Code: Select all

      REM. First stage, e.g. select options:
      ON ERROR ... REM Code to handle a recoverable error in the first stage
      REPEAT
        REM
      UNTIL first_stage_complete
      REM. Second stage, e.g. game loop:
      ON ERROR ... REM Code to handle a recoverable error in the second stage
      REPEAT
        REM
      UNTIL second_stage_complete
      REM. Third stage, e.g. update list of scores:
      ON ERROR ... REM Code to handle a recoverable error in the third stage
      REPEAT
        REM
      UNTIL third_stage_complete
I am suffering from 'cognitive decline' and depression. If you have a comment about the style or tone of this message please report it to the moderators by clicking the exclamation mark icon, rather than complaining on the public forum.
lurkio
Posts: 3783
Joined: Wed Apr 10, 2013 12:30 am
Location: Doomawangara

Post by lurkio »

SteveF wrote:
Sat Aug 14, 2021 4:34 pm
I did find myself thinking last night that a tweaked basiclabel.py could be used to implement "compile-time constants" …
I like it!

:D

Post by lurkio »

Steve, this is an edge case that seems to fail in the latest version of basiclabel.py:

Code: Select all

READ s$
PRINT s$
END
%%string_start%%:DATA %%hello%%
:idea:

EDIT: On second thoughts, this is such an unlikely scenario, and, if the user does actually want to include a DATA string that literally starts and ends with %%, they can already do so by simply enclosing the string in quotes, so I don’t think it’s worth changing the script to recognise this edge case after all — especially if you do decide to implement “compile-time constants”, Steve, because then there might conceivably be circumstances in which the user does want the script to interpret the constant, rather than leave it as a literal, when it appears as a DATA value:

Code: Select all

%%antechamber=1%%
%%library=2%%
…
REPEAT
READ r, desc$
IF r=n PRINT "You are in "; desc$:UNTIL TRUE ELSE r=r+1:UNTIL r>max_r:…
…
DATA %%antechamber%%, "a small, dark antechamber"
DATA %%library%%, "a large room, lined with bookshelves"
…
SteveF
Posts: 1111
Joined: Fri Aug 28, 2015 9:34 pm

Post by SteveF »

Hi lurkio,

Thanks for this post, as it has nudged me to go ahead and implement this - I've pushed the change to github.

Simple example of literal labels:

Code: Select all

%%meaning_of_life=42%%:
%%preamble="The meaning of life is "%%:
%%Forever%%:PRINT %%preamble%%;%%meaning_of_life%%;"! ";
GOTO %%Forever%%
Note the colon at the end of the first two lines - this is what makes it a label definition instead of a label reference. I initially missed that off and thought I'd found a bug; this is perhaps not as friendly as it could be.

The latest version of basiclabel.py also generates an error if you try to redefine a label, which should reduce hair-pulling.

I think your edit makes a good point. I could also vaguely imagine clever and/or horrific (delete as applicable :-)) constructions like this which use non-constant labels in DATA statements:

Code: Select all

RESTORE %%base%%+i*%%INCREMENT%%
READ line_number
RESTORE line_number
READ s$
...
%%base%%:DATA %%foo%%
DATA %%bar%%
%%foo%%:DATA "foo string"
%%bar%%:DATA "bar string"
where it's important that un-quoted labels are substituted.

Technical note: When parsing for labels, the regular expression used isn't perfect (what regular expression is? :-) ). It should work 99% of the time but if you try to define a "constant label" (one using =) and then write something after the colon which tries to use a label reference, it will probably go wrong:

Code: Select all

%%mylabel=491%%:PRINT %%mylabel%%:PRINT 1
This kind of goes with the territory when you're attempting to implement a preprocessor without parsing properly. In practice I don't think this is likely to be a big problem but I'm open to tweaking it. (It doesn't actually have to be a regular expression, that was just convenient when I first put this together.)

Edited to add: some of the support for the new "user defined constant" labels is probably a bit shoddy. I'm open to feedback on this. Some miscellaneous notes:
  • Before - when labels just referred to line numbers (ignoring the INCREMENT special case) - a label was recognised as %% followed by one or more alphanumeric characters followed by %%; it was a definition if it appeared at the beginning of a line and was followed by a colon, otherwise it was a reference to an existing label. If you wrote "%%^£"^!£&*%%" it would not be recognised as a label and would be silently passed through to the output program - this is still the case. I did this deliberately to try to avoid breaking things but perhaps "%%...%%" is impossible enough outside of quoted strings in normal BASIC that basiclabel.py should decide anything with %% in is its business and complain if it sees something it doesn't understand. In particular, if at the moment you write "%%foo=42%%" and miss the colon off, this is ignored, because it's not a label definition (there's no trailing colon) and it's not a valid label reference (it contains non-alphanumeric characters, i.e. the "=").
  • In adding support for user-defined labels with an equals sign in them, I perhaps tried to be over-clever and only recognise the "=" when looking for label definitions but otherwise leave things alone. Maybe if an "=" appears between %% delimiters, it should be treated as a label definition without needing a trailing colon - or is that another special case which will confuse things?
  • User-defined labels also generate a corresponding line in the BASIC program containing just a colon - this was important when labels referred to line numbers, as the line had to contain *something* in order for its line number to exist. This behaviour should perhaps be changed for user-defined labels in order to avoid adding unnecessary bloat to the program.
  • Currently user-defined labels can be strings, but this probably raises some potential parsing issues and it would be cleaner to say they can only be numeric - but it would cut down on flexibility.
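To make the rules in these notes concrete, here is a rough Python sketch of the recognition logic as described in this post. It is not the code in basiclabel.py: the two regular expressions, the classify function and its return values are all illustrative assumptions, and the underscore is allowed in names only because the examples in this thread use it.

```python
import re

# Hypothetical patterns reconstructed from the description in this post.
# A definition sits at the start of a line and ends with a colon; it may
# carry "=value" to make it a user-defined constant.
DEFINITION = re.compile(r'^%%([A-Za-z0-9_]+)(?:=(.*?))?%%:')
# A reference is %%, one or more name characters, then %%.
REFERENCE = re.compile(r'%%([A-Za-z0-9_]+)%%')


def classify(line):
    """Return ('definition', name, value), ('reference', names) or ('ignored',)."""
    m = DEFINITION.match(line)
    if m:
        # value is None for a plain line-number label such as %%Forever%%:
        return ('definition', m.group(1), m.group(2))
    names = REFERENCE.findall(line)
    if names:
        return ('reference', names)
    # Anything else containing %% is passed through untouched.
    return ('ignored',)


print(classify('%%meaning_of_life=42%%:'))
print(classify('GOTO %%Forever%%'))
print(classify('%%foo=42%%'))
```

The value group is deliberately non-greedy. With a greedy `.*`, a line that has more text after the colon could have its value run on to a later `%%:`, which is one plausible way a definition followed by references might go wrong; whether that is what actually happens in basiclabel.py is a guess.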

Post by SteveF »

I've now tagged up a v0.06 release, which you can get from github here. This is just v0.06-pre3 with the test suite fixed up and the man page tweaked, so nothing too exciting for anyone who's been following this thread. You can see the official changelog here as always.

Post by lurkio »

So it looks like I’ve come up with another feature suggestion for basiclabel.py. (Do I hear groaning..?)

Would it be possible to enable basiclabel.py to optionally strip all REMs? The idea is that you’d be able to use a modern text-editor to comment your BASIC code liberally and extensively — so extensively, in fact, that you’d even be able to exceed the memory limit of basictool’s virtual machine.

I seem to recall that I once added so many comments to a 6502 BASIC program that it got too big to be loaded into the RAM of a Model B or even a Master, and there was simply no way for PRES ABE Pack to get its hands on the program to strip it of REMs automatically. So I ended up selectively introducing line-numbers to my previously number-unencumbered source code — so that only non-REM lines had numbers. Then, when I pasted the code into BeebEm, the REMs just “vanished”. But I wouldn’t have had to faff around like that at all if only I’d had the use of basiclabel.py with optional REM-stripping…

(REM-stripping might even be a candidate feature for basictool rather than basiclabel, but perhaps there’s the potential for confusion (from the end-user’s point of view) with PRES ABE Pack’s own REM-stripping option..?)

:?:

Post by SteveF »

That seems like a sensible suggestion. Although it doesn't quite fit with basictool's "bringing 6502 utilities to the modern command line" ethos I think it would ultimately be a reasonable special case addition. It's probably best to prototype this in basiclabel.py first though. Are you currently suffering from this problem or is it something that happened a while back? If it's the former I'll see if I can knock out a hacky version ASAP, otherwise I'll probably sit on it for a few days and perhaps try to tidy up some of the ad-hoc parsing in basiclabel.py as part of this.

The ultimate crude hack in an emergency - which would not treat quoted strings specially, but you'd probably get away with it - would just be to do something like:

Code: Select all

sed -e 's/\<REM .*//g' mybigfile.txt | basictool -t - out.tok
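If the quoted-string caveat matters, a slightly less crude version can be sketched in Python. This is only an illustration and not a planned basiclabel.py feature: strip_rem is a made-up name, it only looks for upper-case REM at the start of a word, and because it works on plain text it knows nothing about DATA statements, where unquoted text is data rather than a comment.

```python
def strip_rem(line):
    """Remove a trailing REM comment, leaving REM inside "quotes" alone."""
    in_string = False
    for i, ch in enumerate(line):
        if ch == '"':
            # BASIC's doubled "" inside a string toggles twice, so it is safe.
            in_string = not in_string
        elif (not in_string and line.startswith('REM', i)
              and (i == 0 or not (line[i - 1].isalnum() or line[i - 1] == '_'))):
            # Cut from the REM onwards; any statement before it is kept.
            return line[:i].rstrip()
    return line


print(strip_rem('PRINT "REM is fine" : REM say so'))
print(strip_rem('PREMIUM=1'))
```

A line that consists only of a comment comes back empty (or as a bare line number), so a real tool would still have to decide what to do with such lines, especially if anything refers to them by number.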

Post by lurkio »

SteveF wrote:
Mon Sep 06, 2021 7:01 pm
Are you currently suffering from this problem
No.
SteveF wrote:
Mon Sep 06, 2021 7:01 pm
is it something that happened a while back?
Yes.
SteveF wrote:
Mon Sep 06, 2021 7:01 pm
I'll probably sit on it for a few days and perhaps try to tidy up some of the ad-hoc parsing in basiclabel.py as part of this.
No rush. Many thanks!

:)

Post by lurkio »

Just ran into some strange errors with PRES ABE Pack and basictool.

If you use PRES ABE Pack in BeebEm (the old-fashioned way!) to compact the program EGYPT on the attached .SSD — saying yes to all Pack options — you'll find that the compacted program is broken: every instance of "GOTO70" is mangled and seems to cause erroneous line-concatenations and all sorts of merry hell. You get the same result if you use basictool, instead of BeebEm, to do the packing.

The workaround is to load the program in BeebEm, RENUMBER it on the standard BASIC commandline, and only then invoke PRES ABE Pack. The program then gets compacted successfully. Not sure why. (Another, less satisfactory, workaround is not to RENUMBER at all, but to go straight to PRES ABE Pack and say yes to all Pack options except "Use unused singles". Again, I'm not sure why that succeeds.)

However, if, instead, you try to use basictool to do the renumbering, passing it the attached untokenised textfile as input and requesting a tokenised output file (basictool -r -t -v -v "decrypted prog Disc999-EgyptianAdventure.bas" EATOK) — the renumbering doesn't seem to work. There are at least a couple of errors. One of the errors involves the parsing of quoted Teletext control codes, which seems to be buggy. But that can't be the only problem with basictool's renumbering algorithm because the renumbered program also crashes when run.

:!:
Attachments
decrypted prog Disc999-EgyptianAdventure.bas.txt (18.66 KiB)
decrypted prog Disc999-EgyptianAdventure.ssd (200 KiB)

Post by SteveF »

Hi lurkio,

I've tried to have a look at this but I'm afraid I'm getting really confused.

I've renamed the .bas file to "decrypted.bas" for ease of reference. Just as a sanity check, I think it has this md5sum:

Code: Select all

$ md5sum decrypted.bas
aed3ccf1949fb0c72effc46a874f07e1  decrypted.bas
I then just tokenise that with basictool:

Code: Select all

$ ./basictool -vt decrypted.bas decrypted.tok
info: input auto-detected as ASCII text (non-tokenised) BASIC
$ md5sum decrypted.tok
f6510e58b6432dc8b535a0bfde002d09  decrypted.tok
I also tokenise-and-renumber with basictool:

Code: Select all

$ ./basictool -vtr decrypted.bas decrypted-r.tok
info: input auto-detected as ASCII text (non-tokenised) BASIC
$ md5sum decrypted-r.tok
3299fcbcdf12cf28163bf679dd341ee0  decrypted-r.tok
I then put decrypted.tok onto a .ssd, load it into an emulated Master 128 in b-em, RENUMBER it at the BASIC prompt and SAVE the resulting file as DEC128. Copying that file back to my main PC:

Code: Select all

$ md5sum z.ssd.DEC128
3299fcbcdf12cf28163bf679dd341ee0  z.ssd.DEC128
decrypted-r.tok and z.ssd.DEC128 have the same md5sum, i.e. they're the same.

So it looks to me as though basictool's *renumber* is fine - it does exactly the same as a real Master 128 running BASIC 4 does. I am not saying there isn't a basictool bug, but if there is it would seem to be unrelated to renumbering.

I do wonder if basictool is doing something dodgy when processing top-bit-set characters in the text input and it's working fine on my machine but not on yours. Could you please try the above as a sanity check, i.e. do:

Code: Select all

basictool -vt decrypted.bas decrypted.tok
basictool -vtr decrypted.bas decrypted-r.tok
md5sum decrypted.bas decrypted.tok decrypted-r.tok
and let me have the results.

I have a horrible feeling I'm completely missing the point here - apologies if I am - but maybe if I post this I will suddenly realise my mistake. :-)

Edited: I am not sure how else to approach this. It looks as though you've edited decrypted.bas yourself so I can't really compare it with the EGYPT file on the .ssd you've posted. Do you have some other way of tokenising decrypted.bas which *doesn't* involve basictool which provides a point of comparison? Are you doing this by pasting decrypted.bas into BeebEm using copy-and-paste, for example?

Edited again: I guess there are two separate issues here - obviously related in the sense they all come out of this one .ssd, but otherwise distinct.
  1. There's the possibility of bugs in handling decrypted.bas. This is what I've tried to address above.
  2. There's the possibility that on an actual 8-bit machine doing LOAD "EGYPT":RENUMBER:*PACK:SAVE "PACKED" does *not* give the same results as doing basictool -rptv egypt packed, when it should - I'm wrong, see the next paragraph. I have *not* tried this yet, I will give it a go tomorrow though...
Edited yet again :-) : I don't think this is causing your problems, but what I wrote in the previous paragraph is wrong. basictool renumbers *after* packing, not before, so to get the equivalent of LOAD "EGYPT":RENUMBER:*PACK:SAVE "PACKED" using basictool you need to do something like basictool -rtv egypt intermediate; basictool -ptv intermediate packed. I will try doing both of those tomorrow and see if I get different results. [It's probably possible to pipe the output of the first basictool into the second without using an intermediate file, but in the presence of possible bugs it seems safest to avoid risking it...]

Post by Richard Russell »

lurkio wrote:
Sun Sep 19, 2021 1:48 am
One of the errors involves the parsing of quoted Teletext control codes, which seems to be buggy.
As far as the built-in RENUMBER command is concerned, BASIC 2 is buggy in this regard but BASIC 4 is fixed. In BASIC 2 a double-height control code in a quoted string will be recognised as the line-number token, which can cause problems such as spurious 'Failed at' errors or, if you are unlucky, program corruption. In BASIC 4 the scan for line numbers ignores quoted strings.
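The clash described here comes from the byte value &8D having two roles: inside a tokenised program it introduces an encoded line number (after GOTO, GOSUB, RESTORE and so on), and in MODE 7 it is the double-height control code. The Python sketch below only illustrates the difference between a scan that skips quoted strings and one that does not. It is not derived from either BASIC ROM, and the function name and sample bytes are made up for the example.

```python
LINE_NUMBER_TOKEN = 0x8D  # also the MODE 7 double-height control code


def line_number_token_offsets(body, skip_strings):
    """Offsets of &8D bytes that a RENUMBER-style scan would act on."""
    offsets = []
    in_string = False
    for i, b in enumerate(body):
        if b == ord('"'):
            in_string = not in_string
        elif b == LINE_NUMBER_TOKEN and not (skip_strings and in_string):
            offsets.append(i)
    return offsets


# PRINT token (&F1), then a quoted string that starts with double height.
sample = bytes([0xF1]) + b'"' + bytes([0x8D]) + b'TITLE"'
print(line_number_token_offsets(sample, skip_strings=False))  # [2]
print(line_number_token_offsets(sample, skip_strings=True))   # []
```

With skip_strings=False the control code inside the string is reported as if it were a line-number reference, which is the kind of false match that could lead to 'Failed at' errors or worse.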

Post by SteveF »

Richard Russell wrote:
Sun Sep 19, 2021 5:24 am
As far as the built-in RENUMBER command is concerned, BASIC 2 is buggy in this regard but BASIC 4 is fixed. In BASIC 2 a double-height control code in a quoted string will be recognised as the line-number token, which can cause problems such as spurious 'Failed at' errors or, if you are unlucky, program corruption. In BASIC 4 the scan for line numbers ignores quoted strings.
Thanks Richard, that's helpful. I did have vague recollections of embedding teletext codes in BASIC programs when I first upgraded to a BBC B from an Electron (having mode 7 felt like magic!) and *something* going wrong when I did it, so it's reassuring to know this is fixed in BASIC 4 (which, luckily, is what I'm using here - I don't know if any of lurkio's "real machine" work is being done with BASIC 2 though).

I've now had a go at renumbering and packing the $.EGYPT program from the .ssd lurkio posted using both an emulated Master 128 in b-em and basictool. Fortunately/unfortunately, they both seem to be behaving identically. I suspect I am either missing the point, or there is some bug in basictool around top-bit-set characters which is manifesting itself only for lurkio (since IIRC he's using a Mac and I'm on Linux). I'll explain what I've done and then lurkio/anyone else interested can a) tell me what point I'm missing b) try it for themselves and see if this works for them as it does for me.

I extracted $.EGYPT and called it EGYPT.orig on my PC. I then did various combinations of renumbering and/or packing using basictool:

Code: Select all

$ ../basictool -vrt EGYPT.orig EGYPT-r.tok
info: input auto-detected as tokenised BASIC
$ ../basictool -vpt EGYPT-r.tok EGYPT-rp.tok
info: input auto-detected as tokenised BASIC
Bytes saved= 776
$ ../basictool -vrt EGYPT-rp.tok EGYPT-rpr.tok
info: input auto-detected as tokenised BASIC
$ ../basictool -vpt EGYPT.orig EGYPT-p.tok
info: input auto-detected as tokenised BASIC
Bytes saved= 794
The suffix on the file uses "r" for renumbering, "p" for packing and the sequence of letters indicates the sequence of operations, so rpr is renumber-pack-renumber.

I then generated equivalent files (with 128 or 12 in the names - I didn't plan ahead enough to always be able to fit "128" in...) on a Master 128 (non-shadow mode 7, ABE 1.00 - the same version used by basictool) using BASIC's RENUMBER and ABE's PACK and extracted them from the .ssd I saved them on in the emulator into the egypt-from-m128-extract directory on my PC. Using md5sum to compare them:

Code: Select all

$ md5sum EGYPT.orig EGYPT-p.tok EGYPT-r.tok EGYPT-rp.tok EGYPT-rpr.tok egypt-from-m128-extract/egypt-out.ssd.{EG128R,EG128P,EG128RP,EG12RPR}|sort
3c8a5c8b77c2d98fc11a4b113a54128f  egypt-from-m128-extract/egypt-out.ssd.EG128P
3c8a5c8b77c2d98fc11a4b113a54128f  EGYPT-p.tok
3e38c7a5c06b90e7e375307327b8350e  egypt-from-m128-extract/egypt-out.ssd.EG128RP
3e38c7a5c06b90e7e375307327b8350e  EGYPT-rp.tok
432089033934a04fb1a002ac7421c956  egypt-from-m128-extract/egypt-out.ssd.EG128R
432089033934a04fb1a002ac7421c956  EGYPT-r.tok
73e6eb42fb0b2492f9ba8977c890d87c  EGYPT.orig
d72771318ba629feac900378365bb716  egypt-from-m128-extract/egypt-out.ssd.EG12RPR
d72771318ba629feac900378365bb716  EGYPT-rpr.tok
In all cases the file I got from the Master is identical to the one basictool is creating. (This isn't to say the file's contents are *valid*, e.g. based on lurkio's post above EG128P/EGYPT-p.tok will be broken because it was packed without being renumbered first. Just that basictool is breaking things in the same way as a real Master. :-) )
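For anyone repeating this comparison on a machine without md5sum to hand, roughly the same check can be done with Python's hashlib. The helper name group_by_md5 is invented for this sketch; pass it whichever files you want to compare.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_md5(paths):
    """Map each MD5 hex digest to the sorted list of paths that have it."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.md5(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    return {digest: sorted(names) for digest, names in groups.items()}
```

Files that end up in the same list have the same digest, so a basictool output and its emulator counterpart should share an entry, while a file with no match sits in a list on its own.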

lurkio - if you're screaming at your monitor that I've just comically missed the point, apologies, and please have another go at explaining it to me. If I haven't missed the point, it would probably be helpful if you tried to reproduce what I've done here and posted your md5sums (and the actual files, if your md5sums are different) - that would point to a basictool bug which is only manifesting on your particular OS/compiler/whatever and which I can then investigate.

Post by lurkio »

Sorry for the confusion! I'm a bit confused myself. I now realise that what I said in my last post wasn't quite right. Let me try again.

If I use BeebEm in Model B mode (BASIC 2) and LOAD the BASIC program $.EGYPT from the attached .SSD (it's the same $.EGYPT as before), and if I then invoke PRES ABE Pack to compact the program (still in BeebEm), saying yes to all Pack options, the resulting compacted program is corrupted and crashes when RUN. The workaround, which results in a successfully compacted, working, non-crashing program, is to first LOAD "EGYPT" (in BeebEm), and then RENUMBER the program from the standard BASIC commandline (in BeebEm), and only then invoke PRES ABE Pack (in BeebEm) to compact the program.

However, if, instead, I use basictool to try to implement the workaround, the result is failure because although the uncompacted renumbered program works, the Teletext control codes are corrupted: see the program file EATOK on the attached .SSD, which was created by entering basictool -r -t -v -v "decrypted prog Disc999-EgyptianAdventure.bas" EATOK. Moreover, if I try to renumber and compact the source code using basictool only — basictool -r -p -t -v -v "decrypted prog Disc999-EgyptianAdventure.bas" EATOK2 — the resulting renumbered compacted program is corrupted and crashes when run: see the program file EATOK2 on the attached .SSD.

So the point is that you can RENUMBER and compact EGYPT and get an error-free program as a result if you work entirely inside BeebEm (using RENUMBER on the BASIC2 commandline, followed by PRES ABE Pack) — but you can't achieve the same result if you use only basictool.

In other words, the only way to compact the program EGYPT is to manually RENUMBER it on the commandline in BeebEm, and then manually invoke PRES ABE Pack in BeebEm. I can't use basictool to achieve the same result. Which is a shame.

Does that clarify the problem? (Hope I haven't made any silly mistakes somewhere along the way..?!)

:?:

decrypted prog Disc999-EgyptianAdventure.ssd (200 KiB)

EDIT: More confusion! Aargh. It turns out that I can also use the Renumber feature of PRES ABE Pack, in BeebEm, to renumber the program ($.EGYPT) before Packing it. I could have sworn that ABE's Renumber feature actually corrupted $.EGYPT when I tried it in BeebEm previously! But now it works. So I must have screwed up somehow. However, that doesn't change the fact that I still can't use basictool by itself to renumber and compact the program.

:!:
Last edited by lurkio on Mon Sep 20, 2021 12:18 am, edited 1 time in total.

Post by SteveF »

Thanks lurkio. I think I see the problem(s) now - please bear with me as this is perhaps a rather long and pedantic post...

I am able to reproduce your EATOK and EATOK2 files exactly, so I think we can probably rule out any weird differences between basictool running on my machine and running on yours.

I took $.EGYPT and used both an emulated model B and an emulated Master 128 (i.e. BASIC 2 and 4 respectively) to:
  • just RENUMBER
  • just pack
  • RENUMBER then pack
  • RENUMBER then pack then RENUMBER
$.EGYPT. I got identical output from both emulated machines for each test, and I was able to reproduce that output byte-for-byte using basictool, doing everything a step at a time.

When basictool is used to pack and renumber (i.e. -p -r options specified), it does the pack first and the renumber afterwards. [0] This means that if it's necessary to renumber first in order to stop pack corrupting this particular program, using the '-r' option in basictool won't help. You can work around this by doing things in two steps - if $.EGYPT is egypt.orig on your PC (note that we start with the original pre-tokenised BASIC program here):

Code: Select all

basictool -tvvr egypt.orig egypt-renumbered.tok
basictool -tvvp egypt-renumbered.tok egypt-renumbered-then-packed.tok # feel free to add '-r' here too if you want renumbering after packing
The reason it works like this is that I was thinking of renumbering as a purely "cosmetic" operation, so there's no point tidying up *before* the pack as it's probably going to change all the line numbers anyway. Since, as it turns out, renumbering is useful to force BASIC to fix up this program [1], there's definitely an argument to be made that if you specify the -r option, basictool should renumber before packing as well as after packing. Does that seem like a good idea to you? I don't *think* there's any downside (except a microscopic performance hit) - if the line number references in the program are complex and not RENUMBER-safe, doing a RENUMBER after packing would already break things, so an extra RENUMBER first shouldn't hurt. (Edit: we could add some kind of --pre-renumber option to control this explicitly, but unless there's a downside to just making -r do two renumbers, I'd rather go with that to keep the interface a little bit simpler.)

I think the problems you are having with "decrypted prog Disc999-EgyptianAdventure.bas" (which I have renamed lurkio.bas for ease of reference, having extracted it from the zip you attached first) are caused by your text editor altering the encoding for non-ASCII characters.

Looking at the raw bytes in $.EGYPT:

Code: Select all

$ xxd -g1 lurkio-EGYPT | head -27 | tail -3
00000180: 8a 33 2c 31 30 29 22 84 9d 83 50 6c 65 61 73 65  .3,10)"...Please
00000190: 20 77 61 69 74 2c 20 64 61 74 61 20 6c 6f 61 64   wait, data load
000001a0: 69 6e 67 2e 2e 2e 20 9c 22 3b 3a 44 54 3d 30 3a  ing... .";:DT=0:
You can see that the control codes just before "Please wait" are &84, &9D and &83 respectively. Looking at the corresponding raw bytes in lurkio.bas:

Code: Select all

$ xxd -g1 lurkio.bas | head -34 | tail -3
000001f0: 2c 31 30 29 22 c3 91 c3 b9 c3 89 50 6c 65 61 73  ,10)"......Pleas
00000200: 65 20 77 61 69 74 2c 20 64 61 74 61 20 6c 6f 61  e wait, data loa
00000210: 64 69 6e 67 2e 2e 2e 20 c3 ba 22 3b 3a 44 54 3d  ding... ..";:DT=
you can see the control codes are now &C3, &91, &C3, &B9, &C3, &89. [2] These have been passed through (reasonably enough, I think) by basictool and are equally present in EATOK:

Code: Select all

$ xxd -g1 lurkio-EATOK | head -32 | tail -3
000001d0: 30 3b 3a f1 8a 33 2c 31 30 29 22 c3 91 c3 b9 c3  0;:..3,10)".....
000001e0: 89 50 6c 65 61 73 65 20 77 61 69 74 2c 20 64 61  .Please wait, da
000001f0: 74 61 20 6c 6f 61 64 69 6e 67 2e 2e 2e 20 c3 ba  ta loading... ..
This is why the output when running EATOK is corrupt.

On a different note: lurkio.bas contains REM-ed out lines, with a REM before the line number. basictool is treating these as lines which you didn't give a line number for and is auto-numbering them, rather than discarding them as you perhaps intended. If text input to basictool is:

Code: Select all

10PRINT "Hello ";
REM20PRINT "wurld"
30PRINT "world!"
the output it generates will be:

Code: Select all

10PRINT "Hello ";
11REM20PRINT "wurld"
30PRINT "world!"
This isn't a problem as such, but it does make the basictool output from lurkio.bas look like it has duplicate lines when compared to the $.EGYPT original and confused me a little bit. I appreciate you may have known it worked exactly like this, though. We can discuss changing this if you like; my current preference is not to (because it would mean treating REM as a special case for auto line numbering), but I am open to debate, although probably best to do so after we've cleared up these other issues.
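The numbering behaviour described here can be mimicked with a short Python sketch. The rule it uses (an unnumbered line gets the previous line's number plus one) is inferred from the single three-line example in this post and is an assumption, as are the function name and the choice of 1 for an unnumbered first line; basictool's real behaviour may differ in other cases.

```python
import re


def auto_number(lines):
    """Give a number to every line that does not already start with one."""
    numbered = []
    previous = 0
    for line in lines:
        m = re.match(r'\s*(\d+)', line)
        if m:
            # The line carries its own number: keep it and remember it.
            previous = int(m.group(1))
            numbered.append(line)
        else:
            # No leading digits (e.g. "REM20..."): assume previous + 1.
            previous += 1
            numbered.append(f'{previous}{line}')
    return numbered


source = ['10PRINT "Hello ";', 'REM20PRINT "wurld"', '30PRINT "world!"']
for line in auto_number(source):
    print(line)
```

Because the REM comes before the digits, the second line has no leading number as far as this rule is concerned, which is why it picks up 11 rather than being treated as line 20.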

I hope all this makes sense and addresses all the different aspects of this; if I've missed anything or anything I've said seems wrong, please let me know!

[0] Edit: it doesn't make any difference what order -p and -r are on the command line, just FWIW.

[1] I don't think we actually know why this is necessary yet, do we? It would be interesting to understand this, but from a narrow basictool point of view I don't think it's relevant so I'm trying to avoid being distracted by this question. :-)

[2] I tried to see exactly why we ended up with these particular bytes, but they don't seem to be the UTF-8 encoding of the original three bytes interpreted as an ISO-8859-1 encoding as I had half expected. It doesn't really matter - it's obvious from the hex dumps they have been corrupted somehow - but if anyone can explain why we have these particular bytes it would be interesting...
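One explanation that fits the dumps exactly: the original bytes were read as Mac OS Roman rather than ISO-8859-1, and the resulting characters were then saved as UTF-8. In Mac OS Roman, &84, &9D, &83 and &9C are Ñ, ù, É and ú, whose UTF-8 encodings are C3 91, C3 B9, C3 89 and C3 BA. That a Mac text editor or clipboard did the conversion is an assumption; the Python below only checks the byte arithmetic, using the standard mac_roman codec.

```python
original = bytes([0x84, 0x9D, 0x83, 0x9C])   # control codes in $.EGYPT

# Misread the bytes as Mac OS Roman text, then store that text as UTF-8.
damaged = original.decode('mac_roman').encode('utf-8')
print(damaged.hex(' '))   # c3 91 c3 b9 c3 89 c3 ba

# The same two steps in reverse recover the original bytes.
repaired = damaged.decode('utf-8').encode('mac_roman')
print(repaired == original)   # True
```

If that is what happened, running the reverse conversion over the whole text file before handing it to basictool should restore the control codes, since plain ASCII characters are unchanged by both encodings.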

Edit: I've just seen your edit, but I don't think it changes anything substantially - as I said, I was able to reproduce your EATOK and EATOK2 files so if you did get confused at some point I don't think you got confused in an important way. :-)

Post by lurkio »

SteveF wrote:
Sun Sep 19, 2021 9:56 pm
Since, as it turns out, renumbering is useful to force BASIC to fix up this program [1], there's definitely an argument to be made that if you specify the -r option, basictool should renumber before packing as well as after packing. Does that seem like a good idea to you?
Yes.

SteveF wrote:
Sun Sep 19, 2021 9:56 pm
I don't *think* there's any downside
I can't think of one.

SteveF wrote:
Sun Sep 19, 2021 9:56 pm
we could add some kind of --pre-renumber option to control this explicitly, but unless there's a downside to just making -r do two renumbers, I'd rather go with that to keep the interface a little bit simpler.
Agreed.

SteveF wrote:
Sun Sep 19, 2021 9:56 pm
I think the problems you are having with "decrypted prog Disc999-EgyptianAdventure.bas" (which I have renamed lurkio.bas for ease of reference, having extracted it from the zip you attached first) are caused by your text editor altering the encoding for non-ASCII characters.
I thought that might be the case. The reason that it wasn't a problem for me before basictool came along is that the way in which I used to get a program listing out of BeebEm and into BBEdit was by copying it from Mac BeebEm (with Command-C) and then pasting the copied listing into BBEdit. The procedure was reversible (copy from BBEdit and paste into Mac BeebEm), and any embedded Teletext control codes were always somehow translated correctly, even though, as you point out, they are, strictly speaking, incorrect when you examine the raw bytes in the textfile. So, that's not a huge problem but it is something I need to be aware of. Is there another way to get a listing out of BeebEm and into BBEdit that would preserve the raw-byte values of any embedded Teletext codes faithfully?

SteveF wrote:
Sun Sep 19, 2021 9:56 pm
On a different note: lurkio.bas contains REM-ed out lines, with a REM before the line number.
That was intentional: those REMs disappear when you copy the listing from BBEdit and paste it into BeebEm. So it's a handy "memory-free" way of keeping old copies of an edited line hanging around as a quick reminder of what you've changed. (But, obviously, if you use basictool then those REM lines will be retained and numbered.)

SteveF wrote:
Sun Sep 19, 2021 9:56 pm
basictool is treating these as lines which you didn't give a line number for and is auto-numbering them ... This isn't a problem as such, but it does make the basictool output from lurkio.bas look like it has duplicate lines when compared to the $.EGYPT original and confused me a little bit. I appreciate you may have known it worked exactly like this, though. We can discuss changing this if you like
Perhaps the best solution would be to implement that cheeky "REM-stripping" idea I came up with for basiclabel.py..?

SteveF wrote:
Sun Sep 19, 2021 9:56 pm
I tried to see exactly why we ended up with these particular bytes, but they don't seem to be the UTF-8 encoding of the original three bytes interpreted as an ISO-8859-1 encoding as I had half expected. It doesn't really matter - it's obvious from the hex dumps they have been corrupted somehow - but if anyone can explain why we have these particular bytes it would be interesting...
Yes, I too am quite keen to know what's going on here!

:idea:

Post by SteveF »

lurkio wrote:
Sun Sep 19, 2021 11:48 pm
SteveF wrote:
Sun Sep 19, 2021 9:56 pm
Since, as it turns out, renumbering is useful to force BASIC to fix up this program [1], there's definitely an argument to be made that if you specify the -r option, basictool should renumber before packing as well as after packing. Does that seem like a good idea to you?
Yes.
Cool. I've pushed v0.07-pre to github with this change - can you (and anyone else interested) please give this a try and see how you get on? If it works out OK I'll tag this up as 0.07 proper.
lurkio wrote:
Sun Sep 19, 2021 11:48 pm
SteveF wrote:
Sun Sep 19, 2021 9:56 pm
I think the problems you are having with "decrypted prog Disc999-EgyptianAdventure.bas" (which I have renamed lurkio.bas for ease of reference, having extracted it from the zip you attached first) are caused by your text editor altering the encoding for non-ASCII characters.
I thought that might be the case. The reason that it wasn't a problem for me before basictool came along is that the way in which I used to get a program listing out of BeebEm and into BBEdit was by copying it from Mac BeebEm (with Command-C) and then pasting the copied listing into BBEdit. The procedure was reversible (copy from BBEdit and paste into Mac BeebEm), and any embedded Teletext control codes were always somehow translated correctly, even though, as you point out, they are, strictly speaking, incorrect when you examine the raw bytes in the textfile. So, that's not a huge problem but it is something I need to be aware of. Is there another way to get a listing out of BeebEm and into BBEdit that would preserve the raw-byte values of any embedded Teletext codes faithfully?
I don't use BBEdit and this might be a bit patronising but it looks like it has some relatively sophisticated support for multiple character sets (just based on seeing https://superuser.com/questions/353280/ ... -in-bbedit). Do you get an option to select something like "8-bit ASCII" or "ISO-8859-1" or "raw" when you create a new file? That might stop the translation from the 8-bit teletext control codes coming into play, although BeebEm itself might also be playing a part here.

I appreciate it's not an answer to the question you actually asked, but if you were to use something like MMB_Utils to extract the tokenised BASIC program directly from the .ssd, you can then use that file directly as input to basictool, taking BeebEm out of the loop altogether. This might not be enough though, since BBEdit would still potentially get a chance to swizzle the codes when you're editing the de-tokenised output from basictool.

It's probably worth posting a question about this elsewhere, as there's probably someone on stardot who does know how to make this work but isn't interested in basictool.
lurkio wrote:
Sun Sep 19, 2021 11:48 pm
SteveF wrote:
Sun Sep 19, 2021 9:56 pm
basictool is treating these as lines which you didn't give a line number for and is auto-numbering them ... This isn't a problem as such, but it does make the basictool output from lurkio.bas look like it has duplicate lines when compared to the $.EGYPT original and confused me a little bit. I appreciate you may have known it worked exactly like this, though. We can discuss changing this if you like
Perhaps the best solution would be to implement that cheeky "REM-stripping" idea I came up with for basiclabel.py..?
I think you're right - I haven't forgotten about that, I've just been a bit distracted, but it would neatly allow these lines to be removed (if desired) immediately without adding a special case.

Post by Richard Russell »

SteveF wrote:
Sun Sep 19, 2021 9:56 pm
if anyone can explain why we have these particular bytes it would be interesting...
It doesn't actually need much detective work. The UTF-8 conversions which you have noted happening on the Mac are as follows:

Code: Select all

&83 → &C3 &89 (U+00C9  LATIN CAPITAL LETTER E WITH ACUTE)
&84 → &C3 &91 (U+00D1  LATIN CAPITAL LETTER N WITH TILDE)
&9D → &C3 &B9 (U+00F9  LATIN SMALL LETTER U WITH GRAVE)
So the question is: to what encoding do these conversions correspond? Google easily finds that it's the Mac OS Roman character set. Here's an extract from that table:

Code: Select all

0x83	0x00C9 #	LATIN CAPITAL LETTER E WITH ACUTE
0x84	0x00D1 #	LATIN CAPITAL LETTER N WITH TILDE
0x9D	0x00F9 #	LATIN SMALL LETTER U WITH GRAVE
There's your explanation. :)
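The same mapping can be checked mechanically. Here is a minimal Python sketch, assuming only that the three bytes above were read as Mac OS Roman and then saved as UTF-8. The function name is made up for illustration; `mac_roman` is Python's built-in codec for that character set.

```python
# Teletext control codes as they appear (raw bytes) in the BBC BASIC listing.
RAW_BYTES = bytes([0x83, 0x84, 0x9D])


def misread_as_mac_roman(raw):
    """Decode each byte as Mac OS Roman, then re-encode the character as UTF-8."""
    rows = []
    for value in raw:
        char = bytes([value]).decode("mac_roman")
        rows.append((value, char, char.encode("utf-8")))
    return rows


for value, char, utf8 in misread_as_mac_roman(RAW_BYTES):
    # e.g. "&83 -> U+00C9 -> C3 89"
    print(f"&{value:02X} -> U+{ord(char):04X} -> {utf8.hex(' ').upper()}")
```

Encoding the resulting characters back with `mac_roman` returns the original three bytes, which would be consistent with the copy-and-paste round trip having been reversible.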

Post by lurkio »

Thanks, Richard and Steve!

Yes, the problem was indeed the encoding I had been using in BBEdit: UTF-8. I've changed the encoding to "Western (Mac OS Roman)" [1], and the inline Teletext codes in $.EGYPT are now retained, the raw bytes unaltered, after I copy the listing from BeebEm and paste it into BBEdit and save the file. Simple, really. :oops:

SteveF wrote:
Sun Sep 19, 2021 9:56 pm
I've pushed v0.07-pre to github with this change - can you (and anyone else interested) please give this a try and see how you get on? If it works out OK I'll tag this up as 0.07 proper.
That works perfectly. I transferred the $.EGYPT listing to BBEdit, saved it (having first manually abbreviated the GOTO on line 5 to "G."), and then ran it through basictool v0.07-pre — basictool -tvvpr 1.bas EATOK3 — and the result was a renumbered, compacted, uncorrupted, working program! See EATOK3 on the attached .SSD. Many thanks!

decrypted prog Disc999-EgyptianAdventure.ssd
(200 KiB)

[1] "8-bit ASCII", "ISO-8859-1" and "raw" aren't in the list of available encodings. And "Western (ISO Latin 1)" doesn't work: the raw bytes are altered. [EDIT: "Western (ISO Latin 1)" doesn't work if I copy and paste from BeebEm into BBEdit with "Western (ISO Latin 1)" pre-selected, but it does work if I first save the file in "Western (Mac OS Roman)" in BBEdit and then (re)open it, explicitly specifying the "Western (ISO Latin 1)" encoding in the "Reopen using encoding..." menu!] "Western (ASCII)" doesn't work either: BBEdit complains, "Unmappable character(s) detected".

EDIT2:
SteveF wrote:
Sun Sep 19, 2021 9:56 pm
as it turns out, renumbering is useful to force BASIC to fix up this program ... I don't think we actually know why this is necessary yet, do we?
No, but can we assume this is a bug in PRES ABE Pack? Surely Pack ought to be able to compact any (uncorrupted) program, whether or not you renumber it first?

:?:

Post by SteveF »

lurkio wrote:
Mon Sep 20, 2021 1:20 pm
:roll: Thanks, Richard and Steve!
Thanks to both of you for your help sorting this out. I'm glad you got it going.
lurkio wrote:
Mon Sep 20, 2021 1:20 pm
"8-bit ASCII", "ISO-8859-1" and "raw" aren't in the list of available encodings.
I was just guessing at some possible encoding that might be offered and might work, so I'm not too surprised I guessed wrong... :-)
lurkio wrote:
Mon Sep 20, 2021 1:20 pm
SteveF wrote:
Sun Sep 19, 2021 9:56 pm
as it turns out, renumbering is useful to force BASIC to fix up this program ... I don't think we actually know why this is necessary yet, do we?
No, but can we assume this is a bug in PRES ABE Pack? Surely Pack ought to be able to compact any (uncorrupted) program, whether or not you renumber it first?
You are probably right but I'm not 100% happy saying that. It seems to me that it's *possible* (I don't say probable...) that $.EGYPT is subtly invalid in some way and that RENUMBERing causes BASIC to fix up that problem. That's perhaps a bit of a fine point though - if I happened to be right, it would be a bit unfair to say this is a bug in ABE's pack, but it would still be better if it *could* cope with this broken input.

I've been trying to reduce $.EGYPT down to something a bit more practical to investigate. The attached SSD contains two programs:
  • REDUCE1 is $.EGYPT with lines deleted until I got it as small as possible while still getting corrupted after a full pack. (It doesn't generate a "looping" listing as $.EGYPT does, but some sort of screen clear control codes and other random junk BASIC are present when listing after packing.)
  • REDUCE2 is REDUCE1 with a couple of dummy lines added so there are no dangling line number references, in case that helped - it doesn't seem to.
I did wonder if it might be the teletext control codes causing the problem but I deleted the lines containing them straight away and it didn't help.

I'd appreciate any further thoughts. Is REDUCE1 a valid BBC BASIC program for our purposes? It obviously doesn't run any more, which makes it a bit of an "unfair" test, but it does seem to show *similar* problems to the original and much larger $.EGYPT.

Incidentally I see some corruption of the first line number sometimes, but I think that's just an artefact of using "OLD" to recover the program on entry to BASIC when the first line number is >255 and isn't a fundamental problem.
Attachments
reduced.zip
(284 Bytes)

Post by Richard Russell »

SteveF wrote:
Mon Sep 20, 2021 9:50 pm
You are probably right but I'm not 100% happy saying that. It seems to me that it's *possible* (I don't say probable...) that $.EGYPT is subtly invalid in some way and that RENUMBERing causes BASIC to fix up that problem.
The only way that I can think of in which RENUMBER could "fix" a problem is if somehow the program has line numbers which are not in an ascending sequence. Although that situation should not arise if you enter and edit a BASIC program using only immediate-mode commands (AUTO, DELETE etc.) it may if you use a different editor.

For example the program editors which come with my versions of BBC BASIC will allow you to give any line any number (in the range 0-65535) even if that means the resulting program won't run because they're not in sequence. Because I don't use GOTO, GOSUB or RESTORE (other than the relative form) that doesn't trouble me.

But even in that case I can't think of a situation in which RENUMBER could interact with a cruncher/packer such that the original program runs properly but a crunched version only runs if renumbered first. That seems very strange and makes me suspect a bug in the cruncher/packer. I know from bitter experience that writing a bug-free cruncher is challenging!

Post by daveejhitchins »

Re the possibility of a BUG in the ABE cruncher: I can say there IS, as I've encountered it at least once. Unfortunately I can't remember the exact details, sorry. I may be able to recreate it, as I noticed it while packing the original MGC listing.

I've always wanted someone, anyone, to take on the maintenance of the ABE; however, that's probably asking too much? I'd love to add a lot more features and make it a lot more intuitive to use e.g. some of the Electron function key mapping!

One final note: I have now (re)created a single socket PLD ROM solution for anyone looking!? This will work on the BBC, Electron and, with the Master you can also have two ROMs (in the socket that supports the 32K ROMs) - I can pre-program or let the user re-program the Winbond 27E512 that I'll be using.

Dave H.
Last edited by daveejhitchins on Wed Sep 22, 2021 8:55 am, edited 1 time in total.

Post by lurkio »

SteveF wrote:
Mon Sep 20, 2021 9:50 pm
I've been trying to reduce $.EGYPT down to something a bit more practical to investigate. The attached SSD contains two programs: ... REDUCE1 is $.EGYPT with lines deleted until I got it as small as possible while still getting corrupted after a full pack. (It doesn't generate a "looping" listing as $.EGYPT does, but some sort of screen clear control codes and other random junk BASIC are present when listing after packing.)
Interesting! It seems you can further reduce REDUCE1 —

Code: Select all

168 GOTO 70
3640 IF DF=0 OR RDF=r THEN 3660
— and still get some corruption (albeit not the screen-clear) when you invoke PRES ABE Pack and say yes to all Pack options:

Code: Select all

168GOTO31296IFD=0ORR=rTHEN3660

SteveF wrote:
Mon Sep 20, 2021 9:50 pm
Incidentally I see some corruption of the first line number sometimes, but I think that's just an artefact of using "OLD" to recover the program on entry to BASIC when the first line number is >255 and isn't a fundamental problem.
Oh! I wasn't really aware of that bug. But, just for the record, here's another mention of it, in the middle of an amazing thread about all sorts of other BASIC hackery:

viewtopic.php?f=12&t=3489&p=30604&hilit ... old#p30604

:idea:

Post by Richard Russell »

lurkio wrote:
Tue Sep 21, 2021 1:06 pm
Oh! I wasn't really aware of that bug.
OLD not restoring the MS byte of the line number isn't a bug, it's a feature - that byte has been lost forever (it's overwritten with &FF by NEW) so can't be restored. The slightly different internal format that I use for my BASICs is better in that regard, and doesn't suffer from that feature (mind you, I don't even implement OLD in my 'modern' versions).
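To illustrate why the byte can't be recovered, here is a small Python model. It assumes the usual Acorn tokenised layout (each line stored as &0D, line number high byte, line number low byte, length byte, then the line body, with &0D &FF marking the end of the program) and assumes OLD simply writes zero in place of the lost byte. The helper names are hypothetical.

```python
END_MARKER = bytes([0x0D, 0xFF])


def tokenised_line(number, body):
    """Build one line record: CR, number high, number low, length, body."""
    return bytes([0x0D, number // 256, number % 256, len(body) + 4]) + body


def new(program):
    """Model NEW: the byte after the first CR is overwritten with &FF."""
    return program[:1] + b"\xff" + program[2:]


def old(program):
    """Model OLD (assumption): the lost byte is replaced with zero."""
    return program[:1] + b"\x00" + program[2:]


def first_line_number(program):
    return program[1] * 256 + program[2]


prog = tokenised_line(1000, b"\xf1") + END_MARKER  # 1000 PRINT
recovered = old(new(prog))
print(first_line_number(prog), "->", first_line_number(recovered))  # 1000 -> 232
```

A first line number of 255 or less survives the round trip in this model, because its high byte was zero to begin with.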

Post by SteveF »

Thanks guys, it seems we can call this a bug in ABE PACK then. It would be great if someone would take a look at fixing this, but unfortunately I don't think it's going to be me. In the meantime, ABE's PACK is still amazingly useful and we do at least have a workaround for this specific problem.

Post by caspian »

lurkio wrote:
Thu Aug 12, 2021 2:53 am
SteveF wrote:
Tue Aug 10, 2021 11:12 pm
I guess it wouldn't be too hard to operate on the tokenised representation of the program in memory to strip leading/trailing spaces.
If you're going down the road of more involved parsing, I'm tempted to mention a feature I've kind of been wishing for for a while now: GOTO <label>, GOSUB <label>, and RESTORE <label>.

Would it be possible for those constructs to be implemented either as an option in basictool or in some sort of separate pre-processor for basictool? To be clear, what I'm proposing is that in a modern text-editor the user would be able to mark a subroutine by prefacing it with (say) %%myroutine%%, and then call it with something like GOSUB %%myroutine%% -- and then basictool or the pre-processor would take the plaintext BASIC program as input and the first thing that would be done is that the program would be given line-numbers and the labels would be converted to numbers as appropriate.

But actually, the more I think about it, the more I feel that maybe that sort of functionality really falls outside the scope of basictool, which I would summarise roughly as "bringing BASIC ROM utilities to the modern commandline". So this sort of thing probably isn't a good fit for basictool after all...

Still, it would be a useful thing to be able to do in one way or another because when you're writing and editing 8-bit BBC BASIC in a modern text-editor the one stumbling block that prevents you from being able to avoid using line-numbers entirely is the pesky ON ERROR problem. At some point -- usually when you're using filesystem commands, e.g. to load or save a savegame -- you'll want to trap a filesys error, and you'll then have to use ON ERROR, which clears the BASIC stack and forces you to use GOTO in order to recover without crashing. Therefore, you'll end up faffing around and adding line-numbers to code that otherwise could have been structured beautifully with PROCs instead of GOSUBs, and with nary a line-number in sight. Which is a shame. (I know Jonathan Harston has actually implemented ON ERROR LOCAL for 8-bit BBC BASIC, but you may not always have the space to include the machine-code patch.)

And adding line-numbers makes it harder to re-order blocks of code because doing so now requires you to remove the line-numbers in your modern text-editor, rearrange the code, let BASIC tokenise and renumber it, reimport it into the text-editor, and then change all the arguments of any GOTOs and GOSUBs, etc., to their new renumbered values. And repeat the whole procedure every time you need to restructure parts of the prog. Quite a pain.

:idea:
I was thinking about how it would be possible to get a program with out-of-order line numbers from the text file into basic, so basic can then renumber it. I was looking at the thread lurkio linked to talking about making programs with the line numbers out of order.

viewtopic.php?f=12&t=3489&p=30604&hilit ... old#p30604

If basictool could keep track of the out-of-order line numbers, it could enter the BASIC program using in-order line numbers, then change each line back to its out-of-order line number (maybe by entering a BASIC command like the one from that thread, but designed to change only one line number; it would probably also need modifying to allow line numbers over 255 and to work on arbitrary programs). BASIC's RENUMBER should then work after that.

Screenshot: me testing basic renumber on the out-of-order program from that thread
bbc-basic-renumber-out-of-order.png
[update] I think I worked out a good set of BASIC commands for this.
Initialise with:
X%=PAGE
pick a line number and store it in L%:
L%=10000
Check X% isn't at the end of the program, change the line number, then search for the next line:
IF X%?1<>255 THEN X%?1=L% DIV 256:X%?2=L% MOD 256:X%=X%+2:REPEAT:X%=X%+1:UNTIL ?X%=13
e.g.
bbc-basic-renumber-out-of-order-2.png
(note: the last L%, 60, was not used, as X% was already at the end of the program. It was just to make sure that end-of-program detection worked)

[further update] It's probably good to also have a command to skip over a program line without changing its number, e.g. leaving off the "X%?1=L% DIV 256:X%?2=L% MOD 256" part.
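The same walk over the program can be sketched off-line in Python, operating on a copy of the tokenised bytes rather than on BASIC's memory. This is only an illustration, assuming the usual Acorn layout (&0D, line number high, line number low, length byte, body, with &0D &FF at the end); the function names are hypothetical. It steps to the next line using the length byte instead of scanning for the next &0D, since a length byte could itself happen to be 13. Passing None for a line skips it without changing its number.

```python
END_MARKER = bytes([0x0D, 0xFF])


def tokenised_line(number, body):
    """Build one line record: CR, number high, number low, length, body."""
    return bytes([0x0D, number // 256, number % 256, len(body) + 4]) + body


def line_numbers(program):
    """Return the stored line numbers in the order they appear in memory."""
    numbers, pos = [], 0
    while program[pos + 1] != 0xFF:
        numbers.append(program[pos + 1] * 256 + program[pos + 2])
        pos += program[pos + 3]
    return numbers


def set_line_numbers(program, new_numbers):
    """Overwrite line numbers in stored order; None leaves a line unchanged."""
    data = bytearray(program)
    pos = 0
    for number in new_numbers:
        if data[pos + 1] == 0xFF:   # end-of-program marker, nothing left to patch
            break
        if number is not None:
            data[pos + 1] = number // 256
            data[pos + 2] = number % 256
        pos += data[pos + 3]        # length byte gives the offset to the next line
    return bytes(data)


prog = b"".join(tokenised_line(n, b"\xf1") for n in (10, 20, 30)) + END_MARKER
patched = set_line_numbers(prog, [20, None, 10000, 60])
print(line_numbers(patched))  # [20, 20, 10000]
```

As in the screenshot, the final value (60) is never used because the walk has already reached the end of the program.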

Post by Richard Russell »

caspian wrote:
Thu Sep 23, 2021 4:01 am
lurkio wrote:
Thu Aug 12, 2021 2:53 am
and you'll then have to use ON ERROR, which clears the BASIC stack and forces you to use GOTO in order to recover without crashing.
Just to avoid any potential for confusion, the comment you quoted ("forces you to use GOTO") isn't correct, as I explained earlier in the thread. Because the stack is cleared, it's OK (indeed positively desirable, in my view) to allow the ON ERROR statement to fall through to the next line.

Post by SteveF »

caspian wrote:
Thu Sep 23, 2021 4:01 am
I was thinking about how it would be possible to get a program with out-of-order line numbers from the text file into basic, so basic can then renumber it. I was looking at the thread lurkio linked to talking about making programs with the line numbers out of order.
Hi caspian,

I've created a Python wrapper around basictool which I think will do what you want, although I may have got the wrong end of the stick. If you check out the latest version of the github repo there is a file out-of-order.py in the utils directory. Given this input:

Code: Select all

$ cat testoo1.bas
20PRINT "Hello ";
22RESTORE 287
23READ a$
942PRINT a$
10PRINT "world";
15PRINT "!"
400DATA cruel
287DATA fine
300DATA there
we can run out-of-order.py on it:

Code: Select all

$ python out-of-order.py -t testoo1.bas testoo1.tok
and although you'd probably want to copy testoo1.tok to an SSD for use with BBC BASIC, for the purposes of showing what's happened here we can use basictool to de-tokenise testoo1.tok:

Code: Select all

$ basictool testoo1.tok
   10PRINT "Hello ";
   20RESTORE 80
   30READ a$
   40PRINT a$
   50PRINT "world";
   60PRINT "!"
   70DATA cruel
   80DATA fine
   90DATA there
The order of the lines in the input has been preserved but the mixed-up line numbers were retained internally long enough for the RESTORE to be renumbered correctly.
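As a rough illustration of that behaviour (this is not the out-of-order.py code, which wraps basictool; it is a plain-Python model of the result only), the sketch below numbers lines 10, 20, 30... in file order and rewrites references using a map from old to new numbers. The function name and regular expressions are hypothetical, and it makes simplifying assumptions: only a single literal number after GOTO, GOSUB or RESTORE is rewritten, text inside strings is not protected, and for a duplicated line number the first occurrence is taken as the target.

```python
import re

LINE_RE = re.compile(r"^\s*(\d+)?(.*)$")                  # optional line number, then body
REF_RE = re.compile(r"\b(GOTO|GOSUB|RESTORE)(\s*)(\d+)")  # simple literal references only


def renumber_in_file_order(lines, start=10, step=10):
    """Renumber lines in file order, fixing up simple line-number references."""
    parsed = []
    for raw in lines:
        match = LINE_RE.match(raw)
        old = int(match.group(1)) if match.group(1) else None
        parsed.append((old, match.group(2)))

    # Map each original number to the new number of its first occurrence.
    mapping = {}
    for index, (old, _) in enumerate(parsed):
        if old is not None:
            mapping.setdefault(old, start + index * step)

    def fix(ref):
        target = int(ref.group(3))
        return f"{ref.group(1)}{ref.group(2)}{mapping.get(target, target)}"

    return [
        f"{start + index * step:5d}{REF_RE.sub(fix, body)}"
        for index, (_, body) in enumerate(parsed)
    ]


source = [
    '20PRINT "Hello ";', "22RESTORE 287", "23READ a$", "942PRINT a$",
    '10PRINT "world";', '15PRINT "!"', "400DATA cruel", "287DATA fine",
    "300DATA there",
]
print("\n".join(renumber_in_file_order(source)))
```

Lines without a number simply get no entry in the map, so they are renumbered by position like any other line.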

Let me know what you think; if I've got the wrong end of the stick could you please give me an example input file and the output you'd expect from it?

Cheers.

Steve

Post by caspian »

SteveF wrote:
Thu Sep 23, 2021 5:54 pm
I've created a Python wrapper around basictool which I think will do what you want ... there is a file out-of-order.py in the utils directory.
Yes! That does the thing I was thinking about, it looks really good. I just tried it today.

It could be useful to also support unnumbered lines in input, but that could be a bit trickier.
As an example, modifying your example (just by taking out some of the line numbers):

Code: Select all

$ cat >testoo2.bas
20PRINT "Hello ";
22RESTORE 287
23READ a$
PRINT a$
PRINT "world";
PRINT "!"
400DATA cruel
287DATA fine
300DATA there
Ideally, running out-of-order.py on it would give the same result as if the line numbers were there.

Another thing I just tested: it also supports fixing duplicate line numbers (at least when they are not GOTO/RESTORE targets), which is handy.

Code: Select all

$ cat duplicates.bas 
20PRINT "Hello ";
10PRINT "world";
15PRINT "!"
20PRINT "Hello ";
10PRINT "world";
15PRINT "!"
$ python ../utils/out-of-order.py duplicates.bas >duplicates.tok
$ basictool ./duplicates.tok
   10PRINT "Hello ";
   20PRINT "world";
   30PRINT "!"
   40PRINT "Hello ";
   50PRINT "world";
   60PRINT "!"

Post by caspian »

A screenshot, copy/pasting from basictool plain text output to BeebEm emulator.

Code: Select all

test$ python ../utils/out-of-order.py testoo1.bas
   10PRINT "Hello ";
   20RESTORE 80
   30READ a$
   40PRINT a$
   50PRINT "world";
   60PRINT "!"
   70DATA cruel
   80DATA fine
   90DATA there
Attachments
Screen Shot 2 basictool.png

Post by SteveF »

Thanks, I'm glad it worked for you.
caspian wrote:
Fri Sep 24, 2021 2:57 am
It could be useful to also support unnumbered lines in input, but that could be a bit trickier.
I've just pushed a tweaked version to github which should allow this. Thanks to your observation that duplicate line numbers are OK it mostly just works without any extra effort; we let basictool auto-assign an incremented-by-one line number for the unnumbered lines, then we just don't swizzle their line number back to match the original input (because there is no line number for them) before we renumber.

Using your example:

Code: Select all

$ cat testoo2.bas 
20PRINT "Hello ";
22RESTORE 287
23READ a$
PRINT a$
PRINT "world";
PRINT "!"
400DATA cruel
287DATA fine
300DATA there
$ python out-of-order.py -t testoo2.bas testoo2.tok
$ basictool testoo2.tok
   10PRINT "Hello ";
   20RESTORE 80
   30READ a$
   40PRINT a$
   50PRINT "world";
   60PRINT "!"
   70DATA cruel
   80DATA fine
   90DATA there