BBC Basic Keywords - Mac updates, questions

Post by jay » Sat Jun 06, 2020 5:44 pm

I've done some experimentation on a few implementations of BBC BASIC to understand J. G. Harston's token table and have found the following (comparison is with version 1.16 of the document):
  • fastvars from BBC BASIC for Windows in the range &18 to &1F aren't mentioned.
  • The &C7 and &C8 extension maps implemented on BBC BASIC for (Classic) Macintosh differ from the mapping for ARM; see below.
The &C7 extension map looks like this for (Classic) Mac:

       +----------+-------+-----+------------+------------+-------------+
       |Bytes     | 6502  |Z80  | ARM        | Mac        | Windows     |
       +----------+-------+-----+------------+------------+-------------+
       |0xC7      |  "DELETE"   |  Extension, see below   | "WHILE"     |
       +----------+-------------+-------------------------+-------------+
       |0xC7 0x8E |             |        "APPEND"         |             |
       +----------+             +-------------------------+             |
       |0xC7 0x8F |             |         "AUTO"          |             |
       +----------+             +------------+------------+             |
       |0xC7 0x90 |             | "CRUNCH"   | "DELETE"   |             |
       +----------+             +------------+------------+             |
       |0xC7 0x91 |             | "DELETE"   | "EDIT"     |             |
       +----------+             +------------+------------+             |
       |0xC7 0x92 |             | "EDIT"     | "HELP"     |             |
       +----------+             +------------+------------+             |
       |0xC7 0x93 |             | "HELP"     | "LIST"     |             |
       +----------+             +------------+------------+             |
       |0xC7 0x94 |             | "LIST"     | "LOAD"     |             |
       +----------+             +------------+------------+             |
       |0xC7 0x95 |             | "LOAD"     | "LVAR"     |             |
       +----------+             +------------+------------+             |
       |0xC7 0x96 | Interpreted | "LVAR"     | "NEW"      | Interpreted |
       +----------+ separately  +------------+------------+ separately  |
       |0xC7 0x97 |             | "NEW"      | "OLD"      |             |
       +----------+             +------------+------------+             |
       |0xC7 0x98 |             | "OLD"      | "RENUMBER" |             |
       +----------+             +------------+------------+             |
       |0xC7 0x99 |             | "RENUMBER" | "SAVE"     |             |
       +----------+             +------------+------------+             |
       |0xC7 0x9A |             | "SAVE"     | "TWIN"     |             |
       +----------+             +------------+------------+             |
       |0xC7 0x9B |             | "TEXTLOAD" | "TWINO"    |             |
       +----------+             +------------+------------+             |
       |0xC7 0x9C |             | "TEXTSAVE" |            |             |
       +----------+             +------------+            |             |
       |0xC7 0x9D |             | "TWIN"     |            |             |
       +----------+             +------------+ invalid    |             |
       |0xC7 0x9E |             | "TWINO"    |            |             |
       +----------+             +------------+            |             |
       |0xC7 0x9F |             | "INSTALL"  |            |             |
       +----------+-------------+------------+------------+-------------+
Also, for the &C8 extension map, codes &99 through to &A6 inclusive appear not to be valid for the Classic Mac implementation. When it reads an input program containing these, the tokens expand to nothing - or, if you prefer, are ignored.
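For anyone writing a detokeniser, the ARM and Mac columns of the &C7 table above can be written down as lookup tables. This is only a sketch in Python: the names `C7_ARM`, `C7_MAC` and `expand_c7` are made up for illustration and are not taken from any existing tool, and returning `None` for a code the table marks as invalid is an assumption about how a decoder might report it.

```python
# Second-byte keywords of the &C7 extension map, transcribed from the table.
# Both columns start at second byte &8E.
ARM_NAMES = [
    "APPEND", "AUTO", "CRUNCH", "DELETE", "EDIT", "HELP", "LIST", "LOAD",
    "LVAR", "NEW", "OLD", "RENUMBER", "SAVE", "TEXTLOAD", "TEXTSAVE",
    "TWIN", "TWINO", "INSTALL",
]
MAC_NAMES = [
    "APPEND", "AUTO", "DELETE", "EDIT", "HELP", "LIST", "LOAD", "LVAR",
    "NEW", "OLD", "RENUMBER", "SAVE", "TWIN", "TWINO",
]

C7_ARM = {0x8E + i: name for i, name in enumerate(ARM_NAMES)}
C7_MAC = {0x8E + i: name for i, name in enumerate(MAC_NAMES)}


def expand_c7(second_byte, dialect):
    """Return the keyword for &C7 <second_byte>, or None if the table has none.

    `dialect` is "arm" or "mac" (hypothetical labels used only in this sketch).
    """
    table = {"arm": C7_ARM, "mac": C7_MAC}[dialect]
    return table.get(second_byte)


def differing_codes():
    """Second bytes that both dialects define but map to different keywords."""
    return sorted(c for c in C7_MAC if C7_ARM[c] != C7_MAC[c])


print([hex(c) for c in differing_codes()])
```

In this transcription the two columns agree only on &8E and &8F, and the Mac column is the ARM column with CRUNCH, TEXTLOAD, TEXTSAVE and INSTALL left out.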

I find it interesting that there's an off-by-one shift between the Mac and ARM BASIC token schemes, and I wonder how it originated. Does anybody know?

I haven't been able to check these findings against an ARM implementation of BASIC V, as I haven't got an Arc or a working Arc emulator.

BTW, does anybody know how the PDP-11 implementation interprets tokenised programs? Is it identical to any of the other existing implementations?
I recently released beebtools, https://github.com/jamesyoungman/beebtools - feedback very welcome!

Post by jgharston » Sat Jun 06, 2020 6:04 pm

jay wrote (Sat Jun 06, 2020 5:44 pm):
> I find it interesting that there's an off-by-one shift between the Mac and ARM BASIC token schemes, and I wonder how it originated. Does anybody know?

I've never had access to a working Mac system to check the various BBC BASIC implementations (there are three that I know of), so had to go from documentation. Or maybe I disassembled the code, and got the token, string, token, string order the wrong way around - I can't remember.

> BTW, does anybody know how the PDP-11 implementation interprets tokenised programs? Is it identical to any of the other existing implementations?

Yes, identical to other 1-byte tokenised programs. It makes an explicit check for C8,98 for QUIT, but otherwise it is entirely 1-byte tokens.

Edit: I've found my BBC BASIC for the Macintosh User Guide (1987) and, while it lists all the keywords, it doesn't list the token values. I must have dug through the binary image to find them.

$ bbcbasic
PDP11 BBC BASIC IV Version 0.32
(C) Copyright J.G.Harston 1989,2005-2020
>_

Post by jay » Sat Jun 06, 2020 6:31 pm

If you PM me the details of how to pass it on, I can pass you instructions for getting an emulator image containing a Mac implementation of BBC BASIC.

Post by jay » Sat Jun 06, 2020 6:35 pm

jgharston wrote (Sat Jun 06, 2020 6:04 pm):
> Yes, identical to other 1-byte tokenised programs. It makes an explicit check for C8,98 for QUIT, but otherwise it is entirely 1-byte tokens.

I think this might be implicit in your comment, but I'm not sufficiently familiar with this stuff to figure it out. If &C8 followed by &98 encodes QUIT in PDP-11 BBC BASIC, how does that dialect encode the LOAD keyword?

Post by Richard Russell » Sat Jun 06, 2020 7:09 pm

jay wrote (Sat Jun 06, 2020 5:44 pm):
> fastvars from BBC BASIC for Windows in the range &18 to &1F aren't mentioned.

If you want the details, a 'fast' variable is stored as a token in the range &18 to &1F, followed by two bytes to be interpreted as a 16-bit number (little-endian). This number indexes a 4-byte 'slot' pre-allocated at the start of the heap; some variables use more than one slot. So to find the memory address of a fast variable multiply the 16-bit index by 4 and add it to LOMEM. Some slots are never used and are 'wasted', because their index would include a disallowed byte.

The tokens are:

&18		Fast FN/PROC (see below); 2 slots
&19		Fast byte variable, e.g. v& or v&(); 1 slot (2 slots for array)
&1A		Fast 32-bit integer, e.g. v% or v%(); 1 slot (2 slots for array)
&1B		Fast double, e.g. v# or v#(); 2 slots
&1C		Fast numeric variant, e.g. v or v(); 3 slots (2 slots for array)
&1D		Fast structure, e.g. v{} or v{()}; 4 slots (2 slots for array)
&1E		Fast 64-bit integer, e.g. v%% or v%%(); 2 slots
&1F		Fast string, e.g. v$ or v$(); 2 slots
The format of a fast FN/PROC is slightly different. It consists of the regular token for FN or PROC followed by &18 and the 16-bit index, so 4 bytes in all rather than 3 for the other fast variables.
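The address calculation described above can be sketched in Python as follows. The function name `fast_variable_address` and the example LOMEM value are hypothetical; only the token range, the little-endian 16-bit index and the 4-byte slot size come from the description.

```python
import struct

# Fast-variable tokens as listed above.
FAST_TOKENS = {
    0x18: "FN/PROC",
    0x19: "byte",
    0x1A: "32-bit integer",
    0x1B: "double",
    0x1C: "numeric variant",
    0x1D: "structure",
    0x1E: "64-bit integer",
    0x1F: "string",
}
SLOT_SIZE = 4  # bytes per pre-allocated slot at the start of the heap


def fast_variable_address(data, offset, lomem):
    """Decode the fast-variable reference at data[offset].

    Returns (kind, address) where address = LOMEM + index * 4.
    For a fast FN/PROC, `offset` must point at the &18 byte that
    follows the ordinary FN or PROC token.
    """
    token = data[offset]
    if token not in FAST_TOKENS:
        raise ValueError(f"&{token:02X} is not a fast-variable token")
    (index,) = struct.unpack_from("<H", data, offset + 1)  # little-endian
    return FAST_TOKENS[token], lomem + index * SLOT_SIZE


# Example: token &1A followed by index &1234, with an assumed LOMEM of &10000.
kind, address = fast_variable_address(bytes([0x1A, 0x34, 0x12]), 0, 0x10000)
print(kind, hex(address))
```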

Everything above applies equally to BBC BASIC for Windows and BBC BASIC for SDL 2.0. The difference is that the BB4W cruncher creates fast variables only when one or more REM!Fast directives are present in the program; the BBCSDL cruncher will use fast variables whenever it can, irrespective of the presence of a REM!Fast.

Post by jgharston » Sat Jun 06, 2020 10:07 pm

jay wrote (Sat Jun 06, 2020 6:35 pm):
> I think this might be implicit in your comment, but I'm not sufficiently familiar with this stuff to figure it out. If &C8 followed by &98 encodes QUIT in PDP-11 BBC BASIC, how does that dialect encode the LOAD keyword?

The standard &C8 token. In cmdLOAD in PDP BASIC it simply checks if the next character is &98 (i.e. LOADASN, which would be a Type mismatch error) and branches off to cmdQUIT. It's simply there to assist in testing. I've got notes on implementing two-byte tokens, but the implementation of BASIC has so far not needed them. I've occasionally thought about adding SYS, but haven't been able to think of a consistent syntax, and use PROCSYS() instead.

The tokens used in PDP11 BASIC are the standard 6502 ones, see the 'Token' file in bbcpdp.zip. The intention has always been that PDP BASIC could load program files saved by other BBC BASICs, and specifically 6502 BASIC, so it has to have the same tokens unless it was to have lots of additional code translating loaded files. (An outstanding issue is that it doesn't load Russell-format correctly yet.)
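To make the LOAD/QUIT special case concrete, here is a minimal Python sketch of the check described above. It is not the PDP-11 code itself: `classify_load` is a hypothetical name, and the only facts taken from the post are that &C8 is LOAD, that &98 is ASN, and that LOAD followed immediately by &98 is treated as QUIT.

```python
TOKEN_LOAD = 0xC8  # LOAD in the standard 6502 token set
TOKEN_ASN = 0x98   # ASN; "LOAD ASN" would otherwise be a Type mismatch


def classify_load(statement):
    """Decide whether a tokenised statement starting with &C8 is LOAD or QUIT.

    `statement` is the tokenised bytes of one statement, starting at &C8.
    """
    if not statement or statement[0] != TOKEN_LOAD:
        raise ValueError("statement does not start with the LOAD token")
    if len(statement) > 1 and statement[1] == TOKEN_ASN:
        return "QUIT"
    return "LOAD"


print(classify_load(bytes([0xC8, 0x98])))        # the two-byte QUIT form
print(classify_load(bytes([0xC8]) + b'"PROG"'))  # an ordinary LOAD
```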

Post by Richard Russell » Sat Jun 06, 2020 10:51 pm

jgharston wrote (Sat Jun 06, 2020 10:07 pm):
> I've got notes on implementing two-byte tokens, but the implementation of BASIC has so far not needed them.

Can you use &01 to &1F as keyword tokens, as BB4W and BBCSDL do, or are they used for something else (in BBC BASIC Z80 I used them as 'tokens' for commonly-used parts of error messages)?

Post by jgharston » Sun Jun 07, 2020 5:46 am

Richard Russell wrote (Sat Jun 06, 2020 10:51 pm):
> jgharston wrote (Sat Jun 06, 2020 10:07 pm):
> > I've got notes on implementing two-byte tokens, but the implementation of BASIC has so far not needed them.
>
> Can you use &01 to &1F as keyword tokens, as BB4W and BBCSDL do, or are they used for something else (in BBC BASIC Z80 I used them as 'tokens' for commonly-used parts of error messages)?

They're currently not used for anything; &8D is used for "Missing " as in 6502 BASIC. The only keywords that could come in handy would be QUIT and SYS, and I always ensure any HostIO module I write has *Quit, and SYS is implemented on PDP11 BASIC via a library function which has been much more flexible than making it part of the interpreter.

Post by Richard Russell » Sun Jun 07, 2020 10:36 am

jgharston wrote (Sun Jun 07, 2020 5:46 am):
> &8D is used for "Missing " as in 6502 BASIC.

I didn't know that. In Z80 BBC BASIC they are:

&01 "Missing "
&02 "No such "
&03 "Bad "
&04 " range"
&05 "variable"
&06 "Out of "
&07 "No "
&08 " space"
So "No such variable" is just &02 &05.
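As an illustration of how such fragment tokens expand, here is a small Python sketch built from the list above. The function name `expand_error` is hypothetical, and treating every other byte as a literal character is an assumption made for the example.

```python
# Error-message fragment tokens for Z80 BBC BASIC, as listed above.
Z80_ERROR_FRAGMENTS = {
    0x01: "Missing ",
    0x02: "No such ",
    0x03: "Bad ",
    0x04: " range",
    0x05: "variable",
    0x06: "Out of ",
    0x07: "No ",
    0x08: " space",
}


def expand_error(raw):
    """Expand fragment tokens in a stored error message into plain text."""
    return "".join(Z80_ERROR_FRAGMENTS.get(b, chr(b)) for b in raw)


print(expand_error(bytes([0x02, 0x05])))  # two bytes for a 16-character message
```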

As far as I know, every common CPU has a signed 8-bit comparison so the 'token' range &80 to &1F (-128 to +31) is contiguous and can be tested in one instruction. Very strangely, in my opinion, Sophie used &7F as a token in ARM BASIC which meant that she needed to use an unsigned comparison for the token range &7F to &FF (+127 to +255) and at a stroke eliminated &00 to &1F as possible contiguous token values.
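The signed-comparison point can be checked with a few lines of Python. This is only a model of the arithmetic, not interpreter code: the function names are invented for the sketch, and it deliberately ignores that some low byte values (such as the end-of-line marker) have other meanings in a real program.

```python
def as_signed8(b):
    """Reinterpret an unsigned byte (0..255) as a signed 8-bit value."""
    return b - 256 if b >= 0x80 else b


def in_token_range_signed(b):
    """One signed comparison covers &80..&FF and &00..&1F (-128..+31)."""
    return as_signed8(b) < 0x20


def in_token_range_unsigned_7f(b):
    """With &7F used as a token, the range &7F..&FF needs an unsigned test."""
    return b >= 0x7F


# Printable ASCII (&20..&7E) is never in the signed token range.
print(all(not in_token_range_signed(b) for b in range(0x20, 0x7F)))
```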

Post by jay » Thu Jul 16, 2020 11:26 pm

I've finally made an open-source release of the code I was working on when I started this thread. It has some DFS functionality, but relevantly to this thread, it reads BBC BASIC program files and converts them to ASCII. It covers, I believe, nearly all published BBC BASIC implementations -
  • 6502 (e.g. BBC Micro), 32016
  • Z80 (e.g. Sinclair Z88)
  • ARM
  • "Classic" Mac
  • Windows, Mac OS X (i.e. R. T. Russell's modern implementation), SDL (e.g. Linux)
  • 8086
  • PDP-11
I understand there's also a 6809 implementation. If it uses the same tokens and line encoding as any of the above, it will also be supported.

The code includes full documentation of the on-disc format of BBC BASIC programs covering all of the above: bbcbasic.pdf
There are some small corrections to J. G. Harston's token table (see the attached PDF).

The code is at https://github.com/jamesyoungman/beebtools

Post by Richard Russell » Fri Jul 17, 2020 1:30 am

jay wrote (Thu Jul 16, 2020 11:26 pm):
> The code includes full documentation of the on-disc format of BBC BASIC programs covering all of the above: bbcbasic.pdf

This is an extremely useful reference document, congratulations. The only (very minor) things I might want to argue with are:
  1. You describe bytes 0x11-0x17 as "representing themselves". I would rather say that (in my dialects at least) they are 'unused tokens' reserved for future expansion. It's not likely that I would be wanting to add new keywords at this stage, but if I did they are the values that would be allocated as tokens.

  2. You say that "Some dialects (for example Mac) also accept COLOR when a program is being entered". I think the majority do (certainly the ARM and 'Windows' dialects do).

  3. I wonder if the keyword BY deserves more attention. You mention it in the context of byte value 0x0F, which is the token I allocate to this keyword in my dialects, but in ARM BASIC it's not tokenised (or, if you prefer, it has the double-byte token 0x42 0x59!).

Post by jay » Fri Jul 17, 2020 10:21 am

Thanks for the suggestions. I've updated the code accordingly:
bbcbasic.pdf

Post by jgharston » Fri Jul 17, 2020 12:21 pm

It's worth pointing out the logical structure that &C6 xx are extended functions, &C7 xx are extended immediate commands and &C8 xx are extended commands, rather than them being just arbitrarily thrown together.
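That structure maps naturally onto a small lookup. The Python sketch below is illustrative only: `describe_token` is a hypothetical helper, and it applies to dialects that use the extension maps (in the table earlier in the thread, &C7 on its own is DELETE on 6502 and Z80).

```python
# Meaning of the three extension prefixes in dialects that use them.
EXTENSION_KINDS = {
    0xC6: "extended function",
    0xC7: "extended immediate command",
    0xC8: "extended command",
}


def describe_token(data, offset=0):
    """Return (kind, length_in_bytes) for the token at data[offset]."""
    kind = EXTENSION_KINDS.get(data[offset])
    if kind is None:
        return ("single byte", 1)
    if offset + 1 >= len(data):
        raise ValueError("extension prefix without a second byte")
    return (kind, 2)


print(describe_token(bytes([0xC7, 0x8E])))
```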

And BasConv is updated to tokenise BY where appropriate.

Post by Richard Russell » Fri Jul 17, 2020 1:48 pm

jgharston wrote (Fri Jul 17, 2020 12:21 pm):
> And BasConv is updated to tokenise BY where appropriate.

Take care with that. Because it's not tokenised, BY behaves peculiarly in Acorn's dialects: it doesn't have to be followed by a space (or other delimiter), but it can be used at the start of a variable name (with every other keyword these two attributes are mutually exclusive).

I was particularly keen to make BY a standard keyword in my dialects: I couldn't (and can't) see the logic of it being treated any differently from the other two-letter keywords (IF, FN, LN, OF, ON, OR, PI, TO). But that meant I couldn't make it fully compatible with Acorn's BASICs: I had to choose between it needing to be followed by a delimiter or not being able to be used at the start of a variable name.

I decided on the former, reasoning that this would achieve compatibility with the greater number of programs (using BY as a keyword is rare, in my experience, but variable names like BYTE are relatively more common). So for BY to be tokenised in my dialects it must be followed by a space or a delimiter.
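The rule chosen above (BY is a keyword only when followed by a space or a delimiter) can be sketched in Python. This is a guess at the shape of the test, not the actual tokeniser: `by_is_keyword` is a hypothetical name, treating letters, digits and underscore as the only name characters is an assumption, end of line is assumed to count as a delimiter, and the sketch does not look at what precedes BY or at strings and REM statements.

```python
TOKEN_BY = 0x0F  # token value given for BY earlier in the thread


def _is_name_char(ch):
    # Assumption: these are the characters that can continue a variable name.
    return ch.isalnum() or ch == "_"


def by_is_keyword(text, pos):
    """True if the BY at text[pos] would be tokenised under the rule above."""
    if text[pos:pos + 2] != "BY":
        return False
    following = text[pos + 2:pos + 3]  # empty string at end of line
    return following == "" or not _is_name_char(following)


for line, pos in [("MOVE BY 10,10", 5), ("BYTE=1", 0), ("MOVE BY10,10", 5)]:
    print(repr(line), by_is_keyword(line, pos))
```

Under this rule `BYTE` stays a variable name, at the cost that `BY10` is not read as the keyword BY followed by 10.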
