Correctly handling EOF when reading with OSGBPB

hjalfi
Posts: 119
Joined: Sat May 13, 2017 10:17 pm
Location: Zürich, Switzerland

Post by hjalfi » Wed Feb 21, 2018 10:15 pm

I'm trying to implement an API which wants to read blocks off a disk. (Cowgol's file_getblock(), actually.) The API definition is: if there's any data, you get something; if you're at the end of the file and there's nothing to read, you get an EOF response.

The MOS's EOF handling appears to be a bit weird. There seems to be an internal per-channel EOF flag, but this is only set if you try to read past the end-of-file, and if you try to read past the end-of-file when the EOF flag is set then an error is thrown. This means that to implement my API, I need to:

Code:

call OSBYTE &7F to check EOF (in case the last read set the flag)
  if yes, give up now so that OSGBPB doesn't throw an error
call OSGBPB 4
if it returned carry set:
  is the bytes-remaining value different from the number of bytes I originally asked for?
    if yes, it's not EOF (because we read something)
    if no, then it is EOF (because we read nothing)
This seems oddly complex, and it doesn't even take into account unsetting the internal EOF flag when seeking backwards in the file. Is there a simpler way that I'm missing?
David Given
http://cowlark.com

crj
Posts: 834
Joined: Thu May 02, 2013 4:58 pm

Post by crj » Thu Feb 22, 2018 2:09 am

I'm in danger of being corrected, here, but by my recollection...

The Beeb's OS isn't like Unix/Posix. OSGBPB is only intended for use on files, not any kind of stream device, and it's only intended for blocking accesses.

That means, when you ask to read N bytes, the only reason you might receive fewer than N is that you reached the end of the file.

A further subtlety is that there is a difference between an open handle being at end of file, and having notified end of file. You only get an error if you try to read after having been notified of end of file (without subsequently repositioning yourself in the file).
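
To make that distinction concrete, here is a small Python model of the behaviour as I've described it. It's only a sketch: `ToyChannel`, `read` and `ChannelEOF` are made-up names, and the semantics are assumed from this description rather than taken from any MOS or filing system source.

```python
class ChannelEOF(Exception):
    """Stands in for the "EOF" error raised by the filing system."""


class ToyChannel:
    """Toy model of an open file (hypothetical, not real MOS code)."""

    def __init__(self, data):
        self.data = data
        self.ptr = 0
        self.notified = False  # "EOF has already been reported" flag

    def at_eof(self):
        # Assumed to be a positional check: pointer is at the end of the file.
        return self.ptr >= len(self.data)

    def read(self, count):
        """Return (bytes_read, bytes_not_transferred, carry)."""
        if self.notified:
            raise ChannelEOF("read after EOF was already reported")
        chunk = self.data[self.ptr:self.ptr + count]
        self.ptr += len(chunk)
        remaining = count - len(chunk)
        carry = remaining > 0  # a short read means we hit the end
        if carry:
            self.notified = True
        return chunk, remaining, carry


ch = ToyChannel(b"ABCD")
first = ch.read(4)  # reads exactly up to the end: no carry yet
state_after_first = (ch.at_eof(), ch.notified)  # at EOF, but not yet notified
second = ch.read(4)  # zero bytes, carry set: now we have been notified
try:
    ch.read(4)  # a further read without repositioning is an error
    third = "no error"
except ChannelEOF:
    third = "error"
```

In this model the handle is "at EOF" after the first read, but it only becomes "notified" after the second, and only the third read raises.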

This means there are two sensible idioms for reading from a file.

The first is to read structurally complete values of data from the file (e.g. a 32-bit integer), and check EOF between each and the next. If there's an EOF in the middle of a "value" then the file is corrupt and an error is appropriate.

The second is to ask repeatedly for as much data as you can cope with at once (which, with OSBGET, might be a single byte). At some point your OSBGET or OSGBPB call will indicate EOF. So you stop. Note, in particular, that if an OSGBPB call reads exactly up to the end of the file, the next call will return zero data rather than an error.

So, using the first strategy, you would do:

Code:

WHILE not EOF
  Try to read data
  See how much data you were given and process it
...but the second strategy is somewhat cleaner and more efficient when using OSGBPB:

Code:

REPEAT
  Try to read some data
  See how much data you were given (maybe none) and process it
UNTIL the updated count returned by OSGBPB is non-zero
Or, equivalently:

Code:

REPEAT
  Try to read some data
  Stash the carry flag on return from OSGBPB
  See how much data you were given (maybe none) and process it
UNTIL the stashed carry flag was set
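
For anyone who wants to poke at that loop, here is a rough Python rendering of it. `ToyChannel` and `gbpb_read` are hypothetical stand-ins for an open file and an OSGBPB 4 call, with behaviour assumed from the description above, so treat it as a sketch rather than a reference.

```python
class ToyChannel:
    """Hypothetical stand-in for an open file; not real MOS behaviour."""

    def __init__(self, data):
        self.data, self.ptr, self.notified = data, 0, False

    def gbpb_read(self, count):
        """Model of OSGBPB 4: returns (data, bytes_not_transferred, carry)."""
        if self.notified:
            raise RuntimeError("EOF")  # reading again after EOF was reported
        chunk = self.data[self.ptr:self.ptr + count]
        self.ptr += len(chunk)
        remaining = count - len(chunk)
        self.notified = remaining > 0
        return chunk, remaining, remaining > 0


def read_all(channel, block_size):
    """REPEAT ... UNTIL carry set, collecting whatever each call returned."""
    blocks = []
    while True:
        chunk, _remaining, carry = channel.gbpb_read(block_size)
        if chunk:  # may be empty on the final call
            blocks.append(chunk)
        if carry:  # the stashed carry flag: stop here and never read again
            return blocks
```

Because the loop stops on the first call that reports carry, it never issues the extra read that would raise the error, whether or not the file length is a multiple of the block size.
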
Checking for EOF while using the second strategy normally indicates some level of confusion. Only one thing can be delimited by the end of the file; every previous thing has to have its length denoted in some other way. If your algorithm is looking for data when a previous step has already tried to step beyond the end of the file, that may be a bug.

hjalfi
Posts: 119
Joined: Sat May 13, 2017 10:17 pm
Location: Zürich, Switzerland

Post by hjalfi » Thu Feb 22, 2018 10:33 pm

The issue is that I don't want to have to remember whether I've hit EOF or not while I process the data. The API I have to implement doesn't allow it, so I'd need to find a way to associate another kind of EOF flag with a channel, and that way lies madness.

The tricky bit is that it's possible to reach the end of the file without actually setting the end-of-file flag, which causes the next read to return no data... but if I manage to read past the end of the file in a block, the next read will error out. In Posix you can read as often as you like from an EOF-state file descriptor.

Is there a way to unset the EOF flag? Then I could safely perform the read. That'd play nice with disk seeks, too, which my current approach doesn't.
David Given
http://cowlark.com

crj
Posts: 834
Joined: Thu May 02, 2013 4:58 pm

Post by crj » Fri Feb 23, 2018 12:12 am

If you really must, you can check for EOF and then, if you're not at the end of the file, perform an OSGBPB, which is then guaranteed to return at least one byte. That should cover all circumstances.
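
Roughly, in Python terms, and with the caveat that this is a sketch against a made-up model: `ToyChannel`, `at_eof`, `gbpb_read` and `getblock` are hypothetical names, `at_eof` stands in for OSBYTE &7F and is assumed to be a positional check, and `gbpb_read` stands in for OSGBPB 4.

```python
class ToyChannel:
    """Hypothetical stand-in for an open file; not real MOS behaviour."""

    def __init__(self, data):
        self.data, self.ptr, self.notified = data, 0, False

    def at_eof(self):
        # Assumption: the EOF check compares the pointer with the file length.
        return self.ptr >= len(self.data)

    def gbpb_read(self, count):
        """Model of OSGBPB 4: returns (data, bytes_not_transferred, carry)."""
        if self.notified:
            raise RuntimeError("EOF")  # reading again after EOF was reported
        chunk = self.data[self.ptr:self.ptr + count]
        self.ptr += len(chunk)
        remaining = count - len(chunk)
        self.notified = remaining > 0
        return chunk, remaining, remaining > 0


def getblock(channel, size):
    """Return a non-empty block, or None once the end of the file is reached."""
    if channel.at_eof():
        return None  # never issue a read that could raise the EOF error
    chunk, _remaining, _carry = channel.gbpb_read(size)
    return chunk  # at least one byte, because we were not at EOF


ch = ToyChannel(b"ABCDE")
results = [getblock(ch, 4) for _ in range(4)]
```

The caller can keep calling `getblock` after the end of the file and just gets `None` each time, which is the POSIX-like behaviour being asked for, at the cost of one extra call per block.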

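On the other hand, it'll be slightly inefficient, and very inefficient on NFS (the Econet network filing system), where the extra EOF check is another call to the file server for every block. And it still feels as though there has to be something wrong with your I/O model for you to run into such a difficulty.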

If you want to clear the "EOF has already been notified to the client once" flag, you can use OSARGS to read the file pointer then immediately set it back to the same value. But that will only work on random-access filesystems. Far better avoided.
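
As a sketch of that trick, again in Python against a made-up model: `ToyChannel`, `read_ptr`, `write_ptr` and `clear_eof_notification` are hypothetical names, the two pointer methods stand in for the OSARGS read-pointer and write-pointer calls, and the model simply assumes that writing the pointer clears the flag.

```python
class ToyChannel:
    """Hypothetical stand-in for an open file; not real MOS behaviour."""

    def __init__(self, data):
        self.data, self.ptr, self.notified = data, 0, False

    def read_ptr(self):
        return self.ptr  # stands in for OSARGS "read sequential pointer"

    def write_ptr(self, value):
        self.ptr = value  # stands in for OSARGS "write sequential pointer"
        self.notified = False  # assumption: repositioning clears the flag

    def gbpb_read(self, count):
        """Model of OSGBPB 4: returns (data, bytes_not_transferred, carry)."""
        if self.notified:
            raise RuntimeError("EOF")  # reading again after EOF was reported
        chunk = self.data[self.ptr:self.ptr + count]
        self.ptr += len(chunk)
        remaining = count - len(chunk)
        self.notified = remaining > 0
        return chunk, remaining, remaining > 0


def clear_eof_notification(channel):
    """Set the pointer to where it already is, purely to reset the flag."""
    channel.write_ptr(channel.read_ptr())


ch = ToyChannel(b"AB")
short_read = ch.gbpb_read(4)  # two bytes, carry set, EOF now notified
clear_eof_notification(ch)
after_clear = ch.gbpb_read(4)  # no error: zero bytes and carry again
```

Note that the read after clearing sets the flag again, so the pointer dance would have to be repeated before every read.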
