Fastest way to arrange data on disc for database-type applications?

Post by SteveF » Wed Jul 29, 2020 6:33 pm

If this has been discussed before I'd appreciate a pointer...

If I have an application which has a big (200K-ish) data file where data accessed together is often close together in the file but where there will also be random access to different parts of the file, what's likely to be fastest? Let's assume 80 track DFS here.
  • Put the data sequentially on a single side of the disc: the first 2.5K on track 0 of side 0, the second 2.5K on track 1 of side 0, etc
  • Interleave the data across the two sides of the disc: the first 2.5K on track 0 of side 0, the second 2.5K on track 0 of side 1, the third 2.5K on track 1 of side 0, etc
I'd expect the second option to be fastest, because there's going to be less head movement, and playing around in b-em with drive noises on seems to back that up, but I wondered if real hardware might behave differently. For example, do all drives load the heads on both surfaces simultaneously or could some drives keep loading and unloading the heads as we switch between side 0 and 1?
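
To make the two layouts concrete, here is a minimal Python sketch of the mapping from 2.5K chunk number to (side, track) for each option, plus a count of head steps for a purely sequential read. The function names are made up for illustration, and the model ignores settle time, rotational latency and the space taken by the catalogue.

```python
TRACKS_PER_SIDE = 80      # 80 track DFS, as assumed in the question
TRACK_BYTES = 10 * 256    # 10 sectors of 256 bytes = 2.5K per track


def sequential_location(chunk):
    """Option 1: chunk n lives on track n of side 0."""
    if not 0 <= chunk < TRACKS_PER_SIDE:
        raise ValueError("chunk does not fit on a single side")
    return 0, chunk


def interleaved_location(chunk):
    """Option 2: alternate sides, stepping the head every second chunk."""
    track, side = divmod(chunk, 2)
    if not 0 <= track < TRACKS_PER_SIDE:
        raise ValueError("chunk does not fit on the disc")
    return side, track


def head_steps(locate, n_chunks):
    """Total tracks stepped when reading chunks 0..n_chunks-1 in order."""
    tracks = [locate(c)[1] for c in range(n_chunks)]
    return sum(abs(b - a) for a, b in zip(tracks, tracks[1:]))
```

For an 80-chunk (200K) file read front to back, `head_steps(sequential_location, 80)` gives 79 and `head_steps(interleaved_location, 80)` gives 39. The longest possible seek is roughly halved as well, since the file spans tracks 0 to 39 instead of 0 to 79.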

Edited: Having posted this, I wonder if I should have put it in the software forum instead. I'll leave it to the mods to decide now I've posted it...

Post by kfro » Sun Aug 02, 2020 1:07 pm

Hi,

From memory, and this is a very long time ago, both heads will be loaded at the same time. The reason being that in a floppy drive the head is in contact with the media. The media is a thin mylar sheet that would deform if not supported. On a double sided drive support is provided by the other head and they move in unison with one another.

I think you are going down the right route focusing on minimising mechanical movement as much as possible. But you also need to think about sector interleave. Here's the problem: by the time you've read a sector, you've probably already missed the start of the next sector (no idea if this is true for the beeb BTW), so you have to wait a whole revolution before you can read the next sector. The normal solution is to interleave the sectors, so instead of numbering the sectors 0, 1, 2, 3, 4, etc. you might number them 0, 2, 1, 4, 3. We used to do this with hard disks years ago, but nowadays they have track buffering so it's not necessary any more.

Post by Kazzie » Sun Aug 02, 2020 1:55 pm

You're right to say both heads are loaded at the same time: single-sided drives have a dummy head to rest on the disc.

On the sector interleave issue, whatever arrangement you find to suit, you might want to have sector 0 of the opposite side offset from the last sector on the first side, and so on.

Post by Coeus » Sun Aug 02, 2020 3:22 pm

kfro wrote (Sun Aug 02, 2020 1:07 pm):
"From memory, and this is a very long time ago, both heads will be loaded at the same time. The reason being that in a floppy drive the head is in contact with the media. The media is a thin mylar sheet that would deform if not supported. On a double sided drive support is provided by the other head and they move in unison with one another."

That's a very good point. I didn't answer this from memory as I don't trust my memory but that makes perfect sense.

kfro wrote (Sun Aug 02, 2020 1:07 pm):
"I think you are going down the right route focusing on minimising mechanical movement as much as possible. But, you also need to think about sector interleave. Here's the problem, by the time you've read a sector, you've probably already missed the start of the next sector (no idea if this is true for the beeb BTW)..."

Checking DFS 0.9 (8271), DFS does not use sector interleave and I think this is the main reason OSFILE is so fast compared to other filing system calls, and also compared to other machines that were around at the time. When loading a whole file, with no need to return to the requesting application in between sectors, DFS can indeed keep up with non-interleaved sectors, so all the sectors for a file on the same track can be read in one disc rotation. At least one of the double-density DFSes went even further and issued a single FDC command to read all the sectors in the track.

Once the filing system has to return to the application, and the application only requests more data some time later, it is much more likely that the start of the next sector will already have been missed. Then reading the whole track will take one revolution per sector instead of one for the whole track. That's likely to be the situation when reading/writing with OSBGET and OSBPUT, which BASIC uses for INPUT# and PRINT#.
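
As a rough illustration of the size of that difference, the sketch below compares the two cases for one track. It assumes a 300 rpm drive (200 ms per revolution) and 10 sectors per track; the function name is hypothetical, and seek, settle and spin-up times are ignored.

```python
REV_MS = 60_000 / 300      # assumed 300 rpm drive: 200 ms per revolution
SECTORS_PER_TRACK = 10     # single-density DFS track


def track_read_ms(sectors, missed_each_time):
    """Rough time to read `sectors` consecutive sectors of one track.

    missed_each_time=False: each sector is caught as it arrives, so the
    read costs one sector time per sector.
    missed_each_time=True: after every sector the start of the next one
    has just been missed, adding a full revolution before it comes round.
    """
    sector_ms = REV_MS / SECTORS_PER_TRACK
    total = sectors * sector_ms
    if missed_each_time:
        total += (sectors - 1) * REV_MS
    return total
```

Under these assumptions `track_read_ms(10, False)` is 200 ms and `track_read_ms(10, True)` is 2000 ms, so a track read sector by sector with a miss each time takes about ten times as long.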

DFS does use a skew between the tracks. The idea here is so that after reading the last sector of one track and stepping the head to the adjacent track, the head is in position and settled before the first sector of the new track arrives. That does mean the skew is dependent on the seek time of the drive but I am not sure how you would tune this with the Acorn-provided formatter.

Back to the point about reading all the surfaces for a given position of the head before stepping to the next track, I think that should always be faster if you're accessing the hardware directly or using OSWORD &7F. I assume the reason DFS does not do this as standard is to make single-sided compatibility easier. I don't know if you would need a skew between the sectors on the first side and the ones on the second side; switching heads should be much faster than stepping to the next track.

If you're using DFS and opening one file on each side, the big question is whether DFS will insist on re-reading the catalogue when you switch sides. If it does, that will cause a seek to track 0 and completely swamp any gains.

Post by Coeus » Sun Aug 02, 2020 3:24 pm

And here is the skew for DFS 0.9:

00: 07 08 09 00 01 02 03 04 05 06
01: 04 05 06 07 08 09 00 01 02 03
02: 01 02 03 04 05 06 07 08 09 00
03: 08 09 00 01 02 03 04 05 06 07
04: 05 06 07 08 09 00 01 02 03 04
05: 02 03 04 05 06 07 08 09 00 01
06: 09 00 01 02 03 04 05 06 07 08
07: 06 07 08 09 00 01 02 03 04 05
08: 03 04 05 06 07 08 09 00 01 02
09: 00 01 02 03 04 05 06 07 08 09
0A: 07 08 09 00 01 02 03 04 05 06
So that's tracks &00 to &0A and the order of the sector IDs on each. Track &0A matches track &00, so I assume the pattern repeats every ten tracks.
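
The table above fits a simple rule: each track starts three sector IDs earlier (modulo 10) than the one before, with track 0 starting at ID 07. A minimal Python sketch that regenerates it follows. The rule is inferred from the rows shown, not taken from the formatter's code, and `track_ids` is a made-up name.

```python
SECTORS = 10


def track_ids(track, first_id_track0=7, skew=3):
    """Sector IDs in physical order for `track`, per the inferred rule."""
    first = (first_id_track0 - skew * track) % SECTORS
    return [(first + i) % SECTORS for i in range(SECTORS)]


# Print the same rows as the table, tracks &00 to &0A.
for t in range(11):
    print(f"{t:02X}: " + " ".join(f"{s:02d}" for s in track_ids(t)))
```

Because 3 and 10 share no common factor, the starting ID only returns to 07 after ten tracks, which is consistent with the repeat seen at track &0A.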

Post by jgharston » Sun Aug 02, 2020 4:38 pm

Coeus wrote (Sun Aug 02, 2020 3:22 pm):
"Checking DFS 0.9 (8271), DFS does not use sector interleave"

You wouldn't find out by examining the DFS, whatever version. You'd find it out by examining the formatter you use. I originally used the formatter in DISCDOCTOR which uses a sector interleave of 1 and a track-to-track skew of 3, which is what I put into my *FORMxx command in HADFS, and my BASIC type-in formatter: FormDFS.

From experimentation I found that there was no advantage to interleaving: the BBC is fast enough to be ready to collect sector 2 as soon as it's finished with sector 1. Even the smallest interleave of 2 (where you format the track as 0 5 1 6 2 7 3 8 4 9) is significantly slower than an interleave of 1 (0 1 2 3 4 5 6 7 8 9), because once the BBC has processed (eg) sector 1, all of sector 6 has to go past before it can deal with sector 2.
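
The two layouts mentioned here can be generated, and their cost compared, with a small Python model. It is only a sketch: it assumes the machine is ready for the next sector the instant the previous one ends (which is what the experiment suggests), and the function names are illustrative.

```python
def interleaved_track(n_sectors, interleave):
    """Physical order of sector IDs, with consecutive IDs `interleave` slots apart."""
    layout = [None] * n_sectors
    pos = 0
    for sector_id in range(n_sectors):
        while layout[pos] is not None:   # slot taken: slide to the next free one
            pos = (pos + 1) % n_sectors
        layout[pos] = sector_id
        pos = (pos + interleave) % n_sectors
    return layout


def sector_times_to_read(layout):
    """Sector times needed to read IDs 0..n-1 in order, starting at ID 0."""
    n = len(layout)
    slot = {sid: i for i, sid in enumerate(layout)}
    elapsed, here = 0, slot[0]
    for sid in range(n):
        elapsed += (slot[sid] - here) % n + 1   # wait for the sector, then read it
        here = (slot[sid] + 1) % n
    return elapsed
```

`interleaved_track(10, 2)` gives 0 5 1 6 2 7 3 8 4 9, matching the layout quoted above. In this model the 1:1 track takes 10 sector times (one revolution) to read in order, while the 2:1 track takes 20.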

The same experimentation found that a skew of 3 gave the best speed. While the disk head steps to the next track after dealing with the end of sector 9, slightly under three sectors go past, so that's the optimum point to start dealing with the next track's sector 0, viz:

track n+0: 0 1 2 3 4 5 6 7 8 9
track n+1: 7 8 9 0 1 2 3 4 5 6
track n+2: 4 5 6 7 8 9 0 1 2 3
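
The link between step time and skew can be sketched in Python as below. This is an illustration under assumptions, not a measurement: it assumes a 300 rpm drive (20 ms per sector with 10 sectors per track), the 55 ms step-and-settle figure in the usage note is hypothetical, and both function names are made up.

```python
import math

REV_MS = 200.0            # assumed 300 rpm drive
SECTORS = 10


def best_skew(step_and_settle_ms):
    """Smallest whole-sector skew that lets the head arrive before sector 0."""
    sector_ms = REV_MS / SECTORS
    return math.ceil(step_and_settle_ms / sector_ms) % SECTORS


def skewed_track(k, skew):
    """Sector IDs in physical order on the k-th track after an unskewed one."""
    first = (-skew * k) % SECTORS
    return [(first + i) % SECTORS for i in range(SECTORS)]
```

With a hypothetical 55 ms step and settle, "slightly under three sectors", `best_skew(55)` returns 3, and `skewed_track(1, 3)` and `skewed_track(2, 3)` reproduce the n+1 and n+2 rows above.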

One thing I never tested for is if track 0 should start with a skew. The catalog is on track 0, and so is accessed very often. Most of the time when the catalog is read it is with the disk drive stationary, and it has to spin up, and often also has to step back to track zero, so in all that action several whole tracks have spun past, so I think that outweighs any effect from the position of sector 0 on track 0.


Post by Coeus » Sun Aug 02, 2020 4:57 pm

jgharston wrote (Sun Aug 02, 2020 4:38 pm):
"Coeus wrote (Sun Aug 02, 2020 3:22 pm): 'Checking DFS 0.9 (8271), DFS does not use sector interleave'
You wouldn't find out by examining the DFS, whatever version...."

Good point. I should clarify that, then. I used B-Em with the model set to "Model B with 8271" (which happens to use DFS 0.9) with debugging enabled on the disc implementation. Then I ran the FORM40 program supplied on the Welcome disc and watched what values were being provided to the 8271 format command.

Post by SteveF » Mon Aug 03, 2020 11:55 pm

Thanks everyone, that's really helpful. As I'm working on an emulator I won't try playing around with sector interleave/track skew (are any emulators that realistic, as far as timing goes?) but it's good to know splitting the data across the two surfaces isn't going to cause problems on real hardware. I'm using OSWORD &7F so I don't have to worry about DFS seeking back to track 0 to check the catalogue.
