Transferring data across the Tube

jregel
Posts: 77
Joined: Fri Dec 20, 2013 6:39 pm
Location: Gloucestershire

Post by jregel » Sun Mar 18, 2018 10:07 pm

Just curious if anyone knows how much data per second can be transferred from a parasite second processor to the host processor, and what the pros/cons of the different methods are?

I've read on the SpectROM thread that this can be done using bespoke OSWORD routines, extending the VDU protocol, getting the parasite to provoke the host into initiating a memory transfer, and possibly others that I haven't deciphered from the thread (I'm just learning all this stuff!).

The use case I'm thinking of is to have the second processor send array data to the host, which then uses it to draw graphical tiles/cells on screen, but I'm not sure at this point if the Tube is fast enough.

Thanks!
BBC Master Turbo
Retroclinic External Datacentre
VideoNuLA
PiTubeDirect with Pi Zero

kieranhj
Posts: 701
Joined: Sat Sep 19, 2015 10:11 pm
Location: Farnham, Surrey, UK

Post by kieranhj » Sun Mar 18, 2018 10:23 pm

I did some prototyping on something similar to this last year but ended up coming unstuck because of a bug in the Tube implementation in b-em (now fixed BTW.)

I started with the OSWRCH/OSRDCH approach as in Elite, which is simple enough to implement, but it all depends on how much data you want to transfer. This is all from memory, but IIRC you need a 24us wait between each read from the Tube using this approach, which limits your throughput. I moved to using the 256 byte transfer protocol, initiated on the host side, which only requires a 10us wait between reads, but was stymied by the aforementioned bug.

Theoretical Tube throughput is around 100KB/s, which sounds like a lot, but that assumes 100% CPU time on the host doing reads, i.e. you don’t have any time to actually do anything useful with the data. It all comes down to balancing the amount of work done on the host side per byte sent from the parasite over the Tube interface. Elite hits a perfect sweet spot - sending X,Y coordinates for line drawing only requires a few bytes over the Tube and keeps the host nice & busy plotting to the screen whilst the parasite is doing lots of 3D maths operations (not a 6502 strong point.)
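To put rough numbers on the figures above, here is a small sketch that turns a per-byte wait into a throughput ceiling and then estimates how many bytes fit into one 50 Hz frame once the host also spends time using each byte. The function names and the 40us-per-byte workload are illustrative assumptions, not measurements.

```python
# Back-of-envelope Tube throughput estimates (illustrative assumptions only).

def max_bytes_per_second(wait_us: float) -> float:
    """Upper bound on throughput if the host does nothing but wait and read."""
    return 1_000_000 / wait_us


def bytes_per_frame(wait_us: float, work_us_per_byte: float,
                    frame_hz: float = 50.0) -> int:
    """Bytes the host can both read and process within one video frame.

    work_us_per_byte is a hypothetical cost of using each byte
    (for example plotting it to the screen).
    """
    frame_us = 1_000_000 / frame_hz
    return int(frame_us // (wait_us + work_us_per_byte))
```

With these assumptions, `max_bytes_per_second(10)` gives the 100KB/s ceiling mentioned above and `max_bytes_per_second(24)` comes out at roughly 42KB/s. `bytes_per_frame(10, 0)` is 2000 bytes per frame, but with an assumed 40us of work per byte `bytes_per_frame(10, 40)` drops to 400, which is the balancing act described above.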

I believe you can implement your own protocols - I think Sarah did this for her recent Tube game thingy? Not sure what that entails but sounds like even more work to me. :D
Bitshifters Collective | Retro Code & Demos for BBC Micro & Acorn computers | https://bitshifters.github.io/

kieranhj
Posts: 701
Joined: Sat Sep 19, 2015 10:11 pm
Location: Farnham, Surrey, UK

Post by kieranhj » Sun Mar 18, 2018 10:28 pm

PS. Tom Seddon wrote a fab article that helped me get started with my Tube programming a while back: http://ffe3.com/tom/tube.html

jregel
Posts: 77
Joined: Fri Dec 20, 2013 6:39 pm
Location: Gloucestershire

Post by jregel » Sun Mar 18, 2018 10:40 pm

Thanks for the comments, Kieran, and for that link which is *very* informative!

crj
Posts: 834
Joined: Thu May 02, 2013 4:58 pm

Post by crj » Mon Mar 19, 2018 12:57 am

That does look like a good article, yes.

I assume you're using a "real" second processor, by the way? If you're using PiTubeDirect or a similar modern implementation that's vastly faster than the BBC Micro, you can write code that relies on the FIFO being saturated. That is, you're certain there will be another byte waiting for you each time you read, or that the parasite can accept another byte each time you write. In that circumstance, you can save the overhead of either an artificial delay or polling status registers, an optimisation that could nudge you up into 250Kbytes/sec territory.
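One way to see where a figure of that order could come from is simple cycle counting on the host. The sketch below is only an estimate under stated assumptions: a 2 MHz host 6502, standard 6502 instruction timings, Tube registers accessed at full speed, no page crossings, and (for the polled case) the status flag being ready on the first poll. It is not necessarily how the figure above was derived.

```python
# Rough cycle counting for the host side (assumptions in the text above).
CYCLES = {
    "LDA abs": 4,           # read the Tube data register
    "STA abs": 4,           # store to a fixed address (fully unrolled code)
    "STA abs,X": 5,         # store to an indexed address
    "BIT abs": 4,           # poll the Tube status register
    "branch not taken": 2,
    "branch taken": 3,
    "INX": 2,
}


def bytes_per_second(instructions, clock_hz=2_000_000):
    """Bytes per second if the listed instructions move exactly one byte."""
    cycles = sum(CYCLES[name] for name in instructions)
    return clock_hz / cycles


# No delay and no polling: just load from the Tube and store.
unrolled_no_poll = ["LDA abs", "STA abs"]

# Poll status once, read, store indexed, then loop.
polled_loop = ["BIT abs", "branch not taken", "LDA abs",
               "STA abs,X", "INX", "branch taken"]
```

Under these assumptions the unrolled, unpolled pair costs 8 cycles per byte (250,000 bytes/s) while the polled loop costs 20 cycles per byte (100,000 bytes/s), so dropping the poll and loop overhead is where the extra headroom would come from.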

jgharston
Posts: 3129
Joined: Thu Sep 24, 2009 11:22 am
Location: Whitby/Sheffield

Post by jgharston » Mon Mar 19, 2018 3:08 am

You can get faster if you use no delays and each side does a tight loop of WaitRxRdy/FetchByte or WaitTxRdy/SendByte, which is essentially what the VDU channel uses, and what one of the 80x86 OSWORDs does. E.g. (untested, too cold):

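one side (sender):
  LDX #0
.Loop
  LDA address,X      \ next byte to send
.Wait
  BIT TUBExS         \ V = status bit 6 (not full)
  BVC Wait           \ spin until there is room
  STA TUBExD         \ write the byte to the data register
  INX
  BNE Loop           \ 256 bytes in total

other side (receiver):
  LDX #0
.Loop
.Wait
  BIT TUBExS         \ N = status bit 7 (data available)
  BPL Wait           \ spin until a byte has arrived
  LDA TUBExD         \ read the byte from the data register
  STA address,X
  INX
  BNE Loop           \ 256 bytes in total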
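The handshake in the two loops above can be modelled in a few lines to check that nothing is lost or duplicated when both sides simply poll. This is a toy model, not the real Tube ULA: `ToyTubeRegister`, its `depth` parameter and the round-robin scheduling are all assumptions made for illustration.

```python
from collections import deque


class ToyTubeRegister:
    """Toy one-direction FIFO exposing the two flags the loops poll."""

    def __init__(self, depth: int = 1):
        self.depth = depth          # assumed depth; real registers differ
        self.fifo = deque()

    def not_full(self) -> bool:        # what the sender's BIT/BVC tests
        return len(self.fifo) < self.depth

    def data_available(self) -> bool:  # what the receiver's BIT/BPL tests
        return len(self.fifo) > 0

    def write(self, byte: int) -> None:
        assert self.not_full()
        self.fifo.append(byte & 0xFF)

    def read(self) -> int:
        assert self.data_available()
        return self.fifo.popleft()


def sender(reg, data):
    for b in data:
        while not reg.not_full():
            yield                   # spinning in the .Wait loop
        reg.write(b)
        yield


def receiver(reg, count, out):
    for _ in range(count):
        while not reg.data_available():
            yield                   # spinning in the .Wait loop
        out.append(reg.read())
        yield


def transfer(data, depth=1):
    """Run both sides in lock-step until each has finished."""
    reg = ToyTubeRegister(depth)
    out = []
    tasks = [sender(reg, data), receiver(reg, len(data), out)]
    while tasks:
        for task in list(tasks):
            try:
                next(task)
            except StopIteration:
                tasks.remove(task)
    return out
```

For example, `transfer(list(range(256)))` returns the same 256 bytes in order, whatever `depth` is chosen, because each side only touches the data register after its own flag says it is safe.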

$ bbcbasic
PDP11 BBC BASIC IV Version 0.25
(C) Copyright J.G.Harston 1989,2005-2015
>_

yellowpig
Posts: 38
Joined: Sat Apr 08, 2006 6:28 pm
Location: Nottingham, UK

Post by yellowpig » Mon Mar 19, 2018 10:23 am

I did quite a bit of experimenting with the Tube a few years ago (so I could use it the "wrong way round"). I found that the interface works as fast as the two processors can handle it, with one exception: Two-byte transfers can be a bit unreliable if you are putting bytes on / pulling bytes off the Tube too quickly. (This was using a Master 512 set-up. I was never able to tie down exactly what the problem was, but it is something to do with the timing of the DRQ line on the parasite side.) There is a bit of an explanation of this on my web page at: http://www.cowsarenotpurple.co.uk/bbcco ... /tube.html

RobC
Posts: 2225
Joined: Sat Sep 01, 2007 9:41 pm

Post by RobC » Mon Mar 19, 2018 10:00 pm

As others have said, if you're using a modern implementation of a 2nd processor (e.g. PiTubeDirect) then you'll be able to achieve much higher transfer speeds.

With SpectROM, I found that the Pi could provide data as quickly as the host could consume it provided that I transferred the data in a tight loop. So, given that the host just needs to do a series of "load absolutes" to get the data, the limiting factor is how quickly the host can write/store the data to where it needs to go. This very much depends on the nature of the data you're sending across...

E.g. if each byte of data needs to be written to an arbitrary address, then you'll use up a lot of the bandwidth telling the host where each byte needs to go. On the other hand, if you're sending a whole load of bytes to a sequence of addresses (as you may well want to do with tile data), you can get away with specifying a single address and the host can infer the rest.
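As a rough illustration of that trade-off, the sketch below compares the bytes sent over the Tube when every data byte carries its own address with the case where a single header precedes a run of data. The 2-byte address, 1-byte length field and 32-byte tile are hypothetical choices, not SpectROM's actual format.

```python
def per_byte_addressed(payload_bytes: int, address_bytes: int = 2) -> int:
    """Bytes over the Tube if every data byte carries its own address."""
    return payload_bytes * (address_bytes + 1)


def block_addressed(payload_bytes: int, address_bytes: int = 2,
                    length_bytes: int = 1) -> int:
    """Bytes over the Tube if one header (address + length) precedes the data."""
    return address_bytes + length_bytes + payload_bytes


def efficiency(payload_bytes: int, wire_bytes: int) -> float:
    """Fraction of the transferred bytes that is actual payload."""
    return payload_bytes / wire_bytes
```

For an assumed 32-byte tile, per-byte addressing sends 96 bytes (about 33% payload), while a single 3-byte header sends 35 bytes (about 91% payload).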

To implement the transfers, I copied hoglet's Game of Life method: redirect OSWRCH, send the data through the VDU queue, and use a special control code for each type of transfer. Happy to share the code but it's only designed for the Pi as it assumes no delays etc.
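The general shape of a control-code scheme like this can be sketched as a small state machine on the host side. This is not the SpectROM or Game of Life code: the class name, the control code value and the packet layout (address low, address high, length, then data) are all hypothetical, and the model ignores timing entirely.

```python
# Host-side sketch: intercept the character stream and divert special codes.
# The code value and packet layout are hypothetical, chosen for illustration.
CODE_BLOCK_WRITE = 0x01   # assumed: addr_lo, addr_hi, length, then data bytes


class HostDispatcher:
    def __init__(self, memory_size: int = 0x10000):
        self.memory = bytearray(memory_size)
        self.passthrough = []   # bytes handed on to the normal VDU driver
        self._header = None     # header bytes collected so far, None when idle
        self._addr = 0
        self._remaining = 0

    def oswrch(self, byte: int) -> None:
        """Stand-in for a redirected OSWRCH entry point."""
        byte &= 0xFF
        if self._remaining:                    # inside a block payload
            self.memory[self._addr] = byte
            self._addr = (self._addr + 1) & 0xFFFF
            self._remaining -= 1
        elif self._header is not None:         # collecting the 3-byte header
            self._header.append(byte)
            if len(self._header) == 3:
                lo, hi, length = self._header
                self._addr = lo | (hi << 8)
                self._remaining = length
                self._header = None
        elif byte == CODE_BLOCK_WRITE:         # start of a block transfer
            self._header = []
        else:                                  # ordinary output
            self.passthrough.append(byte)
```

Feeding it the bytes `0x41, 0x01, 0x00, 0x30, 0x03, 0xAA, 0xBB, 0xCC, 0x42` passes `0x41` and `0x42` through as ordinary output and stores `AA BB CC` at address `&3000` in the model's memory. A second type of transfer would be another control code with its own header handling.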
