Pi-based Co-Pro on the cheap - 100MHz 6502 for £10? (now 274MHz)

hoglet
Posts: 7807
Joined: Sat Oct 13, 2012 6:21 pm
Location: Bristol

Re: Pi-based Co-Pro on the cheap - 100MHz 6502 for £10?

Post by hoglet » Tue Oct 25, 2016 5:08 pm

Hi Guys,

Time for an update...

For a while now we've had some tricky reliability issues on the single-core Pis. There are more details a few posts back:
http://www.stardot.org.uk/forums/viewto ... 70#p150565
hoglet wrote: What does the future hold?

There is a plan afoot to shift the 6502 request handling to a process on the GPU, which we believe will really help. But this is a pretty big change, and Dominic (dp11) who is driving this work hasn't had much spare time. I'd very much like to see this happen, so Dominic if there's anything I can do to help, then please just ask.
Over the last several weeks, Dominic, Ed and I have been working hard towards this goal.

It turns out the GPU on the Pi is actually far more than just a GPU. It also includes a general purpose ARM-like processor, and the details of its instruction set have only recently been discovered (through a herculean black-box reverse engineering process). For the brave who want to know more, take a look at this page:
https://github.com/hermanhermitage/vide ... ers-Manual

What we (mostly Dominic) have done is take the ARM assembler code we had for emulating the front end of the tube chip, and port it to run on the GPU. This code is massively time critical - when the Beeb reads a tube register this code needs to run and provide the right data within ~300ns (six squares on the scope piccies below). Meeting this requirement 100% of the time proved to be impossible when sharing a single ARM core, hence the desire to offload this to the GPU. Again, if anyone is interested in what this actually looks like, then take a look here:
https://github.com/hoglet67/PiTubeDirec ... e/tubevc.s

Just to give an idea of the benefit of this, here is a before and after picture showing the variation in tube-read times:

Original ARM-Only version:
IMG_0744.JPG
New GPU version:
IMG_0745.JPG
The Beeb needs tube read data to have settled by the grid line with "50ns" just above it.

I think the pictures speak for themselves!

So now even the lowly Pi Zero can effectively devote one (GPU) core to the tube interface and the other (ARM) core to the various Co Pro emulators. A side effect of offloading the tube handling onto the GPU is that the "fast" 65C02 Co Pro now goes even faster - 225.62MHz on the Pi Zero.

Anyway we're pleased to announce the availability of a "beta" release for testing that includes this new GPU code:
PiTubeDirect_20161023_1834_dmb.zip (826.04 KiB)
I have tested this on the Pi Zero, Pi One, Pi Two and Pi Three. See the wiki for the recommended config.txt settings:
https://github.com/hoglet67/PiTubeDirec ... xt-options

We're particularly interested in people trying out the non-6502 cores, as these were the ones that were unreliable before on the Pi Zero/One.

But any feedback (positive or negative) is most welcome :D

Dave

sirmorris
Posts: 755
Joined: Wed Feb 11, 2009 12:18 pm
Location: oxfordshire uk

Post by sirmorris » Tue Oct 25, 2016 7:36 pm

=D> 8) =D>

SteveF
Posts: 510
Joined: Fri Aug 28, 2015 8:34 pm

Post by SteveF » Tue Oct 25, 2016 8:15 pm

I don't have anything useful to add, but congratulations, this sounds excellent!

boba
Posts: 92
Joined: Mon Jul 16, 2012 7:43 pm
Location: The Kingdom of Fife

Post by boba » Tue Oct 25, 2016 8:19 pm

Quite a remarkable reverse-engineering task, interesting reading!

I've had a quick play with the new build on a Pi2 on a Master128.
First observation is that the processors all seem to boot more quickly on startup or after an FX151,230. On the previous build it generally took 3 or 4 ctrl-breaks before the processor kicked in after a cold start. It now seems to start reliably on the 2nd. It still doesn't start straight away, I guess because the Master gets up quicker. In particular the 32016 seems to come up quicker and more reliably.
Combined timing on the fast 6502 is 241.98, almost twice as fast as before!
Had a play with both Dos+ on the 286 and Pandora/Panos on the 32016. Both seem stable. No noticeable difference in loading times (IO-bound though). The Fortran check on Panos seems about the same for timing (on my watch!) but probably also IO bound. Nothing convenient for the 286 to hand but running the BASIC timing test on Pandora was a bit of a disappointment - 24.84 for the new build whereas the old build was 31.95 combined speed.

Post by hoglet » Tue Oct 25, 2016 8:29 pm

boba wrote: Nothing convenient for the 286 to hand but running the BASIC timing test on Pandora was a bit of a disappointment - 24.84 for the new build whereas the old build was 31.95 combined speed.
Yes, the 32016 will be running a bit slower. It's still 6x faster than the original was, I believe. It's the price to be paid for increased reliability, I'm afraid, particularly the issue of not being recognised on BREAK. Apart from on the initial power up, do you find the 32016 is now recognised on every BREAK, or is it still sometimes missed?

Anyway, it's possible we can claw back some of the performance in the future.

Dave

Post by SteveF » Tue Oct 25, 2016 8:30 pm

Actually I thought of a question now. It's not important, I'm just curious - does this remove all timing constraints on the soft core, or does it still have to respond relatively quickly? If you sprinkled busy loops which just spin for (say) 5 seconds at random throughout the soft core code (not the new bit running on the GPU), would everything still "work"?

Post by hoglet » Tue Oct 25, 2016 8:56 pm

SteveF wrote:Actually I thought of a question now. It's not important, I'm just curious - does this remove all timing constraints on the soft core, or does it still have to respond relatively quickly? If you sprinkled busy loops which just spin for (say) 5 seconds at random throughout the soft core code (not the new bit running on the GPU), would everything still "work"?
This is a very interesting and pertinent question. The answer is no, the current GPU code doesn't (yet) remove all timing constraints. But it's a step in the right direction...

Tube register reads are handled by the GPU, but writes are still passed through to the ARM core. Some latency is tolerable, but if the Beeb host does a tube register write, followed immediately by a read, then that read needs to return the latest state, or things can go wrong. One of the most challenging cases is detection of the Co Pro on a tube reset, where the Beeb writes the Tube control register, then immediately reads it back again (about 2us later). This sequence currently needs to propagate from the GPU, through the ARM core and back again. So on pretty much all the Co Processors we are emulating a single instruction, then polling a GPU->ARM hardware mailbox for tube requests to process. The Fast 6502 Co Pro is a little different, in that it uses an ARM interrupt from the mailbox, but in a way that's not easy to generalise to the other Co Pros.
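
To make the "emulate one instruction, then poll" structure concrete, here is a minimal Python sketch (the real code is C and assembler). The `Mailbox` class, `run_copro` and the request tuple are hypothetical stand-ins, and draining every pending request after each instruction is an assumption for illustration, not a description of what PiTubeDirect actually does.

```python
from collections import deque


class Mailbox:
    """Hypothetical stand-in for the GPU->ARM hardware mailbox."""

    def __init__(self):
        self._pending = deque()

    def post(self, request):
        # In this model, the GPU side calls this when the host writes a tube register.
        self._pending.append(request)

    def poll(self):
        # Return the oldest pending request, or None if the mailbox is empty.
        return self._pending.popleft() if self._pending else None


def run_copro(step, handle_request, mailbox, instructions):
    """Emulate `instructions` instructions, polling the mailbox after each one."""
    handled = 0
    for _ in range(instructions):
        step()  # emulate exactly one Co Pro instruction
        request = mailbox.poll()
        while request is not None:  # assumption: drain everything that is pending
            handle_request(request)
            handled += 1
            request = mailbox.poll()
    return handled


mailbox = Mailbox()
mailbox.post(("write", 0, 0x40))  # made-up request: (kind, register, value)
seen = []
count = run_copro(lambda: None, seen.append, mailbox, instructions=3)
print(count, seen)
```

In a loop shaped like this, the worst-case delay before a write is processed is roughly one emulated instruction plus the handler time, which is why a slow instruction (or a busy loop) on the ARM side still matters.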

Improving this is really one of the next things we need to work on, either by:

- on the ARM side allow the GPU->ARM hardware mailbox to generate an interrupt, and have this process the tube request (in C).

- or, move the C code that handles tube requests into the GPU. This part of the Tube chip emulation is written in C (actually borrowed from B-Em - thanks Sarah!). So this either needs porting to GPU assembler, or we need to find a C compiler for the GPU. There is a port of GCC, but I haven't tried it yet: https://github.com/puppeh/vc4-toolchain

If we can do one or other of these things, then that will further relax the real time constraints on the emulator side, although not by a huge amount. Certain things on the tube (e.g. NMIs that are part of tube data transfers) need to happen within 10us or data is lost, because there is no host side flow control. But actually 10us is an age when the ARM code is running at 1000MHz!
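
For a feel for the numbers, this small Python sketch converts the deadlines mentioned in this thread into ARM clock cycles. The 1000MHz clock, the 2us read-back window and the 10us NMI window are the figures quoted above; the ~300ns figure is the tube read deadline from earlier in the thread, included only for comparison since reads are now handled on the GPU. The function and dictionary names are just for illustration.

```python
def cycles_available(deadline_ns, clock_mhz):
    """Whole clock cycles that fit inside a deadline given in nanoseconds."""
    return deadline_ns * clock_mhz // 1000


ARM_CLOCK_MHZ = 1000

# Deadlines taken from the thread, in nanoseconds.
DEADLINES_NS = {
    "tube register read": 300,
    "control register write then read-back": 2_000,
    "NMI during a tube data transfer": 10_000,
}

budgets = {name: cycles_available(ns, ARM_CLOCK_MHZ) for name, ns in DEADLINES_NS.items()}
for name, cycles in budgets.items():
    print(f"{name}: about {cycles} ARM cycles")
```

So the 10us case leaves on the order of ten thousand ARM cycles, while the old 300ns read deadline left only a few hundred.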

Dave

dp11
Posts: 869
Joined: Sun Aug 12, 2012 8:47 pm

Post by dp11 » Tue Oct 25, 2016 9:13 pm

I did try to build the gpu c compiler but it failed to build on my machine.

Post by SteveF » Tue Oct 25, 2016 9:15 pm

Thanks Dave, interesting stuff. I was wondering whether this might open the door (just a crack, even :-) ) to possibly running a lib6502-jit-based emulator on the Pi, but it looks like that's a fair way down the road if it's even remotely feasible.

(I don't know if you can run the LLVM JIT on bare metal, but I would guess in single-threaded mode it's at least a theoretical possibility. But if there are any remotely tight deadlines involved you'd need two threads - one for JITting and one for standard lib6502-style interpreting to keep the worst case response time down - and that would probably be harder still on bare metal.)

I guess a 200MHz+ 6502 is plenty fast enough anyway. :-)

Post by hoglet » Tue Oct 25, 2016 9:16 pm

dp11 wrote:I did try to build the gpu c compiler but it failed to build on my machine.
Mine also. I think it needs a much more recent version of GCC than the one that comes with Ubuntu 14.04. Maybe it's time to update to 16.04.

Dave

Post by boba » Tue Oct 25, 2016 9:17 pm

hoglet wrote:Apart from on the initial power up, do you find the 32016 is now recognised on every BREAK, or is it still sometimes missed?
After *fx151,230,13 with the 32016 recognised and running it reliably restarts on ctrl-break.
There's an interesting oddity:
If the current copro is either the fast or slow 6502, then after *fx151,230,13 the 32016 always takes 2 ctrl-breaks to start.
If the current copro is anything else, then after *fx151,230,13 the 32016 always starts on the first ctrl-break.
(within experimental error because my fingers got tired).

Post by hoglet » Tue Oct 25, 2016 9:21 pm

SteveF wrote: I guess a 200MHz+ 6502 is plenty fast enough anyway. :-)
I honestly don't think lib6502-jit would come close to the 225MHz we are currently getting.

But then I would think that, I guess, and it would be nice to be proved wrong :D

What sort of a speed up do you typically see compared to lib6502?

Dave

BigEd
Posts: 2312
Joined: Sun Jan 24, 2010 10:24 am
Location: West

Post by BigEd » Tue Oct 25, 2016 9:27 pm

hoglet wrote:
SteveF wrote:I guess a 200MHz+ 6502 is plenty fast enough anyway. :-)
I honestly don't think lib6502-jit would come anywhere close to the 225MHz we are currently getting.
That could be checked on a Pi running linux - no need for bare metal to benchmark it. I'm inclined to agree though, having seen the micro-optimisations being piled on!

Post by dp11 » Tue Oct 25, 2016 9:29 pm

I did wonder about JIT, but with self-modifying code making a mess of the cache I'm not sure it would help. I have thought about using another core to try and compile code in advance for the other core, but that sounds a mess.

If you look in our GitHub there is a faster 6502 coming. It currently reaches 250MHz on a Raspberry Pi Zero running at 1GHz. Now, that doesn't give you a lot of ARM cycles to play with.
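
A quick back-of-envelope in Python, taking the 250MHz and 1GHz figures at face value. As far as I can tell the MHz numbers in this thread come from a timing benchmark, so treat this as a rough average only; the helper name is made up for the example.

```python
def arm_cycles_per_emulated_cycle(arm_mhz, emulated_mhz):
    """Average ARM clock cycles available per emulated 6502 clock cycle."""
    return arm_mhz / emulated_mhz


per_cycle = arm_cycles_per_emulated_cycle(1000, 250)

# A real 6502 instruction takes at least 2 clock cycles,
# so this is the average budget for the shortest instructions.
per_shortest_instruction = per_cycle * 2

print(per_cycle, per_shortest_instruction)
```

That works out to about 4 ARM cycles per emulated cycle, or roughly 8 for the shortest instructions, covering fetch, decode, execute and any tube polling.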

Post by hoglet » Tue Oct 25, 2016 9:32 pm

fast_6502 is what's coming...

Post by dp11 » Tue Oct 25, 2016 9:38 pm

On the web someone tried static compilation of a new game. They did get it going but it was a lot of work to deal with the tricks programmers used.

Post by SteveF » Tue Oct 25, 2016 9:43 pm

BigEd wrote:
hoglet wrote:
SteveF wrote:I guess a 200MHz+ 6502 is plenty fast enough anyway. :-)
I honestly don't think lib6502-jit would come anywhere close to the 225MHz we are currently getting.
That could be checked on a Pi running linux - no need for bare metal to benchmark it. I'm inclined to agree though, having seen the micro-optimisations being piled on!
I did actually try this on raspbian a while back, but unfortunately there seemed to be a bug in the llvm JIT on that platform - the example toy JIT didn't work either, so at least lib6502-jit was off the hook :-). (https://bugs.launchpad.net/raspbian/+bug/1527421) I ought to check at some point and see if there's a newer llvm for raspbian, just as a matter of interest; I suspect the technical obstacles to using it as a co-pro are too great even if the performance was better.

Post by boba » Tue Oct 25, 2016 9:55 pm

Out of interest I tried out the Z80 on the new build, and it seems completely OK.

Couldn't drum up the energy to port the timing program (yet) but with BASIC on a simple mix of instructions the new build seems to be about 25% faster.

fordp
Posts: 971
Joined: Sun Feb 12, 2012 9:08 pm
Location: Kent, England

Post by fordp » Wed Oct 26, 2016 7:08 am

Well done guys. I built my boards this week. I will have to get back to some hacking.

=D> =D> =D> =D> =D> =D> =D> =D>
FordP (Simon Ellwood)
Time is an illusion. Lunchtime, doubly so!

iainjh
Posts: 307
Joined: Mon May 14, 2012 11:18 am

Post by iainjh » Thu Oct 27, 2016 11:46 pm

Just set up my Master, Pi Zero and Kjell's Master level shifter board, and it's working wonderfully. Cheers everyone, you are amazing! :):)

jms2
Posts: 2141
Joined: Mon Jan 08, 2007 6:38 am
Location: Derby, UK

Post by jms2 » Fri Oct 28, 2016 5:42 am

Top work guys =D>
I can't wait to test mine (including a test to see if it can coexist with the internal turbo board), but I'm away this weekend.

theandylaird
Posts: 2
Joined: Sun Oct 30, 2016 9:07 pm

Post by theandylaird » Sun Oct 30, 2016 9:36 pm

Hi everyone - long time lurker, first time poster here.

I've just got PiTubeDirect up and running in my Master with a Pi 3.

All works great except for the 80286 core - when I switch to it and CTRL Break I get the startup header, followed by Bad Command error and I then get dropped to a * prompt instead of it booting off the floppy.

I've tried various versions of the M512 boot disks - occasionally I can get one to start booting if I type DOS at the star prompt but they always crash part way through the boot. I've created the boot disks from the images using IMGTODISK and a formatted 640k ADFS floppy.

What on earth am I doing wrong?

Andy

Post by hoglet » Sun Oct 30, 2016 9:41 pm

theandylaird wrote: All works great except for the 80286 core - when I switch to it and CTRL Break I get the startup header, followed by Bad Command error and I then get dropped to a * prompt instead of it booting off the floppy.

I've tried various versions of the M512 boot disks - occasionally I can get one to start booting if I type DOS at the star prompt but they always crash part way through the boot. I've created the boot disks from the images using IMGTODISK and a formatted 640k ADFS floppy.

What on earth am I doing wrong?
Probably you are doing nothing wrong.

There is a bug in this core that seems to prevent it booting on Ctrl-Break.

(If you look on the Pi serial console, you'll see a warning about an illegal instruction).

You should be able to reliably boot DOS Plus from the * prompt by typing DOS.

What are your overclocking settings?

I'll double-check tomorrow that this is still working correctly for me.

Dave

Post by theandylaird » Tue Nov 01, 2016 1:29 pm

I've now got it working - turns out that the DOS+ boot disk that I created wasn't quite up to scratch. I created a new one using another method and managed to boot to a DOS prompt.

jgharston
Posts: 3409
Joined: Thu Sep 24, 2009 11:22 am
Location: Whitby/Sheffield

Post by jgharston » Tue Nov 01, 2016 6:40 pm

Any chance of the PDP-11 and 6809 being added to the Pi-CoPro?

$ bbcbasic
PDP11 BBC BASIC IV Version 0.25
(C) Copyright J.G.Harston 1989,2005-2015
>_

Post by jgharston » Tue Nov 01, 2016 6:44 pm

hoglet wrote:
theandylaird wrote:All works great except for the 80286 core - when I switch to it and CTRL Break I get the startup header, followed by Bad Command error and I then get dropped to a * prompt instead of it booting off the floppy.
(...)
There is a bug in this core that seems to prevent it booting on Ctrl-Break.
(...)
You should be able to reliably boot DOS Plus from the * prompt by typing DOS.
I get that sometimes when booting 80x86 DOS on B-Em. I never got around to digging away and seeing what circumstances replicated it, but it seems very similar to that.


Post by hoglet » Tue Nov 01, 2016 6:59 pm

jgharston wrote:
hoglet wrote:
theandylaird wrote:All works great except for the 80286 core - when I switch to it and CTRL Break I get the startup header, followed by Bad Command error and I then get dropped to a * prompt instead of it booting off the floppy.
(...)
There is a bug in this core that seems to prevent it booting on Ctrl-Break.
(...)
You should be able to reliably boot DOS Plus from the * prompt by typing DOS.
I get that sometimes when booting 80x86 DOS on B-Em. I never got around to digging away and seeing what circumstances replicated it, but it seems very similar to that.
What 80x86 Emulator does B-Em use? We're using Fake86 in PiTubeDirect.

Dave

Post by hoglet » Tue Nov 01, 2016 7:00 pm

jgharston wrote:Any chance of the PDP-11 and 6809 being added to the Pi-CoPro?
These are both on my TODO list, probably 6809 first.

Do you have any recommendations for CPU emulations to leverage? It needs to be vanilla C (or C++ that is close enough).

Dave

Post by jms2 » Tue Nov 01, 2016 9:32 pm

I've just tried out the latest build (using the GPU to deal with Tube transfers) on my Master Turbo. This is using an RPi2.

It all works brilliantly. I get 219MHz combined average speed on the 6502. All the CPUs start pretty much perfectly first time, and (this is a big improvement) all of them are able to "beat" the internal co-pro and start up first. =D> Obviously I'm using *Configure Extube (otherwise I'd get the internal copro all the time).

One question: I put copro=14 into the cmdline.txt file, but it doesn't seem to work - I still get the (external) 6502 by default. Is there no "null processor" on this setup?
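
For reference, a rough Python sketch of how a `copro=N` option could be picked out of the command line text. This is not PiTubeDirect's actual parser (that lives in the C sources); `parse_copro` and the example strings are hypothetical, and it assumes options are whitespace-separated `key=value` tokens with the last one winning.

```python
def parse_copro(cmdline, default=None):
    """Return N from the last 'copro=N' token in cmdline, or `default` if none is valid."""
    selected = default
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Only accept a plain decimal number after 'copro='.
        if key == "copro" and sep and value.isdecimal():
            selected = int(value)
    return selected


# Made-up command lines, just to exercise the function.
print(parse_copro("foo=1 copro=14"))
print(parse_copro("foo=1"))
```

A helper like this makes it easy to confirm that the option is spelled and spaced the way a token-based parser would expect before suspecting anything else.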

Post by hoglet » Tue Nov 01, 2016 9:55 pm

jms2 wrote: One question: I put copro=14 into the cmdline.txt file, but it doesn't seem to work - I still get the (external) 6502 by default. Is there no "null processor" on this setup?
Hmmm, I just tested that, and I have a different issue. It starts up in Co Pro 14 (the Null Co Pro), but then I can't change away from this.

I've added an issue into github:
https://github.com/hoglet67/PiTubeDirect/issues/13
