MEMC1A timeline and documentation

Post by myelin » Tue Oct 16, 2018 9:43 pm

Update: The MEMC1A datasheet is available as part of the VTI ARM Databook 1990 at the Internet Archive :D

I've been looking around for a datasheet for the MEMC1A, but haven't had any luck. I have the MEMC datasheet, and an "ARM Family Data Manual" PDF which mentions the VL86C110, but is missing the pages for its section. Is this around somewhere? I'm curious to see how the master/slave memory behaviour is documented. Is the MEMC chip even aware of this, or is it just a matter of leaving the Vidak, Sndak, Vidw, Sirq, Iorq, Romcs, Phi1, and Phi2 pins disconnected? I can't find anything that tells a MEMC whether it's master or slave. The MEMC datasheet says that one MEMC will occupy the entire memory map; is this what changed with the MEMC1A perhaps?

The best resource I have for this was the A500/R200 service manual, which contains schematics for the A540. It looks like two PALs, IC39 (0286,023) and IC71 (0286,020), generate all the signals: IC39 generates a separate A7 and A22 signal for each MEMC1A (documented in Theo Markettos' email), while IC71 generates four 'enable' signals, which are used to gate the Clk24m line for each chip. I assume IC71 is just for reset timing (keeping the slave MEMCs from interfering with the boot process, maybe?)... if anyone knows, do tell!

The MEMC1A (VL86C110) datasheet explains what I was looking for. From page 4-8:
"A single MEMC will control up to 4 Mbytes of DRAM. A second MEMC can be built into a system to extend the maximum addressable DRAM to 8 Mbytes. The two MEMCs are configured as a Master and a Slave, where the Slave acts purely as a DRAM driver (all DMA operations, I/O Controller interactions, etc. are handled by the Master).

The ~B/W input is sampled as RES goes low, and its state determines whether the MEMC will operate in Master (~B/W = 1) or Slave (~B/W = 0) mode. In a single MEMC system, VL86C010 holds ~B/W high during reset, so the MEMC is always configured as a Master."

~B/W is the byte/word select, so it looks like IC71's job is to generate Sreset*, which holds ~B/W low during reset and ensures that the MEMC1A chips on the RAM cards come up in slave mode.

From page 4-13:
"In a dual MEMC system, the physical RAM is effectively doubled to 256 physical pages, and the Logical to Physical Address Translators in both the Master and Slave MEMCs must be programmed. When programming the Address Translators, A(7) specifies whether the Master or Slave Address translator is being accessed."

Reading more about how the address translator works, it seems that each MEMC has a 128-entry lookup table in which each entry contains a logical page number. Whenever you access memory in the 32MB logical space, the MEMC looks for a table entry containing that logical page number and maps the access to the physical page corresponding to that entry.

This feels backwards, but I guess it saves memory over having an entry for every logical page; it means 128 13-bit entries (1664 bits) rather than 8192 7-bit entries (57344 bits). This will cause undefined behaviour if more than one table entry contains the same logical page number, because that means you're telling the MEMC to map one logical page to two different locations. If you do this on two different MEMCs you'll end up with a bus conflict.
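
To make the "backwards" lookup concrete, here is a minimal Python sketch of a CAM indexed by physical page, plus the storage arithmetic quoted above. The class and function names are made up for illustration, and the linear search only stands in for what the hardware does with 128 parallel comparators; it is not taken from the datasheet.

```python
# Sketch of a MEMC-style CAM: one entry per PHYSICAL page, each holding the
# logical page number mapped there (None = unmapped). All names hypothetical.
PHYSICAL_PAGES = 128
LOGICAL_PAGES = 8192  # figure used in the post (13-bit logical page number)


class Cam:
    def __init__(self):
        self.entries = [None] * PHYSICAL_PAGES

    def map_page(self, physical_page, logical_page):
        """Record that physical_page holds logical_page."""
        if not 0 <= logical_page < LOGICAL_PAGES:
            raise ValueError("logical page out of range")
        self.entries[physical_page] = logical_page

    def translate(self, logical_page):
        """Return the physical page holding logical_page, or None (abort)."""
        matches = [p for p, lp in enumerate(self.entries) if lp == logical_page]
        if len(matches) > 1:
            # Undefined behaviour on the real chip; flagged as an error here.
            raise RuntimeError("logical page mapped more than once")
        return matches[0] if matches else None


def storage_bits(entries, bits_per_entry):
    """Total bits needed for a table of fixed-width entries."""
    return entries * bits_per_entry


cam = Cam()
cam.map_page(5, 4000)
print(cam.translate(4000))      # 5
print(cam.translate(17))        # None: no entry matches
print(storage_bits(128, 13))    # 1664 (CAM, one entry per physical page)
print(storage_bits(8192, 7))    # 57344 (forward table, one entry per logical page)
```

Mapping the same logical page into two entries makes `translate` raise, which is this sketch's stand-in for the undefined behaviour (or bus conflict across chips) described above.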

From Theo's email, it looks like the master responds in the physical memory space when A22=0, and the slave responds when A22=1. The datasheet says that the RAM image is repeated every 128 pages, and later that having a second MEMC effectively doubles the physical RAM to 256 pages, but I don't believe the master MEMC is aware of the existence of a second one, so I'm a bit confused here.
Last edited by myelin on Wed Oct 17, 2018 1:57 am, edited 6 times in total.
SW/EE from New Zealand, now in Mountain View, CA, making BBC/Electron hardware projects for fun.
Most interesting: Arcflash, FX2+PiTubeDirect Tube/Cartridge adapter, USB cart interface.

Post by Phlamethrower » Wed Oct 17, 2018 1:13 pm

I'd be interested in seeing a summary of the differences between MEMC and MEMC1A. It looks like some of the differences are:
myelin wrote:
This feels backwards, but I guess it saves memory over having an entry for every logical page; it means 128 13-bit entries (1664 bits) rather than 8192 7-bit entries (57344 bits). This will cause undefined behaviour if more than one table entry contains the same logical page number, because that means you're telling the MEMC to map one logical page to two different locations. If you do this on two different MEMCs you'll end up with a bus conflict.
It took me a long time to realise that when the RISC OS sources talk about the "CAM", "soft CAM", etc., "CAM" is actually a fairly standard acronym for another type of memory: Content-addressable memory, i.e. the system that MEMC used for its page tables. It does feel a bit backwards (and it's a pain for emulators compared to a "normal" logical -> physical page table system), but I can imagine that it made things a lot simpler for the hardware designers, while also keeping performance high.

Post by SarahWalker » Wed Oct 17, 2018 4:35 pm

The improved performance is entirely down to eliminating the need for workarounds. MEMC1a and MEMC1 w/o workarounds perform the same until you hit one of the bugs.
Phlamethrower wrote:
It does feel a bit backwards (and it's a pain for emulators compared to a "normal" logical -> physical page table system), but I can imagine that it made things a lot simpler for the hardware designers, while also keeping performance high.
It does eat a huge amount of die space though! (since you need 128 logical addresses + 128 comparators)

Acorn probably didn't like the alternatives though - a proper page table walker + TLB would have involved much more logic, and would have required a data bus on MEMC, while a smaller CAM + software page table walking would have had a heavy page table miss penalty. MEMC's CAM setup is at least fast, even though it limits you to 4 MB per chip + 32kB pages.

Post by SarahWalker » Wed Oct 17, 2018 4:50 pm

The A540 schematics may be useful here - http://chrisacorns.computinghistory.org ... awings.zip - showing how the quad MEMC system was wired up (interesting that the MEMC1a datasheet doesn't show this as a supported configuration, did an Acorn engineer just hack this up?). Essentially all MEMCs run on different phases of the memory clock (derived from a 72 MHz oscillator). The abort signals are ANDed together, so an abort only occurs if all fail to translate the address. Curiously the DMA signals are wired on all MEMCs; I can't imagine this is actually doing anything on the slave chips. And as far as communal wiring goes that seems to be about it! Presumably MEMC won't enable the RAMs on an abort, so ANDing the abort signals is enough to keep everything working.
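
As a rough illustration of the ANDed abort wiring, the sketch below models each MEMC as just the set of logical pages it has CAM entries for, and treats "abort" as a plain boolean. The function names and the set representation are assumptions for the example; signal polarity and timing on the real board are not modelled.

```python
def memc_aborts(mapped_pages, logical_page):
    """A single MEMC aborts when it has no CAM entry for the page."""
    return logical_page not in mapped_pages


def system_abort(memcs, logical_page):
    """The CPU sees an abort only if every MEMC aborts (logical AND)."""
    return all(memc_aborts(pages, logical_page) for pages in memcs)


# Four MEMCs, each mapping a disjoint set of logical pages (made-up values).
memcs = [{0, 1}, {2, 3}, {4}, set()]
print(system_abort(memcs, 3))   # False: the second MEMC translates it
print(system_abort(memcs, 9))   # True: nobody translates it
```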
Last edited by SarahWalker on Wed Oct 17, 2018 4:50 pm, edited 1 time in total.

Post by SarahWalker » Wed Oct 17, 2018 4:58 pm

Oh, and the A540 uses a PAL to alter A22 and A7 for each MEMC. This is using nR/W, A7, A12-14 and A22-25. I'm guessing this is a) selecting MEMC responses to writes in 3600000-3ffffff (I believe 3600000-37fffff is written to all MEMCs, 3800000-3ffffff are selecting a MEMC based on what looks like A7 and A12), and b) locating the physical RAM mappings in 2000000-2ffffff.

Edit: just saw Theo's email documents this bit better than I did :)
Last edited by SarahWalker on Wed Oct 17, 2018 4:59 pm, edited 1 time in total.

Post by SteveBagley » Wed Oct 17, 2018 11:33 pm

There's also the A680 schematics which might help since they only have two MEMCs to deal with…
myelin wrote:
Tue Oct 16, 2018 9:43 pm
From Theo's email, it looks like the master responds in the physical memory space when A22=0, and the slave responds when A22=1. The datasheet says that the RAM image is repeated every 128 pages, and later that having a second MEMC effectively doubles the physical RAM to 256 pages, but I don't believe the master MEMC is aware of the existence of a second one, so I'm a bit confused here.
Some thoughts on this…

The only difference I can see between a MEMC in a 4MB machine ... The page translation table in each MEMC must be set so that it doesn't respond to logical addresses that map to physical addresses on other MEMCs, which seems to be done by A7 when setting up the page table: if it is 0 then the Master MEMC handles it, otherwise (if 1) the Slave MEMC deals with it. I'd hazard a guess that seeing a write to the page table with A7 set causes the MEMC to repeat physical pages after 256 pages, rather than 128. Although, the master MEMC will see writes between 0x2800000 and 0x2C00000 as being between 0x2C00000 and 0x3000000 (due to the twiddling of A22 by the PAL), so will never actually see an address that causes the pages to be repeated.

Then there's the interesting question about how the MEMC page tables are set up by the OS when there is more than one MEMC… I think, looking at Theo's notes, that it would only be necessary to write to the page table once, using A12 and A7 to specify the MEMC addressing the physical page. Writes to the Master MEMC (A12 and A7 both 0) will cause the master MEMC to be updated (and this will be seen as a write to the Master MEMC page table by all slave MEMCs). Writes to any of the slave MEMCs will cause the Master MEMC to see it as a write to a slave MEMC, and the other two MEMCs to see it as a write to the Master MEMC and so become aware that they aren't handling that page (although two of the three slave MEMCs would think that page was handled by the Master MEMC, rather than one of the other slaves).

Steve

Post by myelin » Thu Oct 18, 2018 7:11 am

SarahWalker wrote:
Wed Oct 17, 2018 4:50 pm
The A540 schematics may be useful here - http://chrisacorns.computinghistory.org ... awings.zip - showing how the quad MEMC system was wired up (interesting that the MEMC1a datasheet doesn't show this as a supported configuration, did an Acorn engineer just hack this up?). Essentially all MEMCs run on different phases of the memory clock (derived from a 72 MHz oscillator).
Oh wow... I just realized that the clocking must be really interesting here. The master MEMC generates Phi1/Phi2, presumably defaulting to 175ns Phi1 + 55ns Phi2, and switching to 55ns Phi1 + 55ns Phi2 when SEQ is high on the previous cycle.

I wonder if all the DMA lines go to all MEMCs so they can keep their clocks in sync.

Post by BigEd » Fri Oct 19, 2018 2:48 pm

myelin wrote:
Tue Oct 16, 2018 9:43 pm
... whenever you access memory in the 32MB logical space, the MEMC looks for a lookup table entry containing the logical page number, and maps it to the physical page corresponding to the table entry.

This feels backwards...
As noted by Steve Furber:
...people were beginning to adopt fairly complex memory controllers. These were things that did memory address translation through two layers of tables and they produced quite complex hardware. And I thought about this and decided I could find a much – a much simpler way of doing this. If I sort of inverted the problem... And unbeknown to me, I’d effectively just reinvented the very first memory management hardware that was developed for the Manchester Atlas machine, which again was based on associative memories.

Post by SarahWalker » Fri Sep 13, 2019 7:38 pm

Bit of a bump, but I've been reworking the ARM/MEMC timing in Arculator, and I feel the need to rectify my statement above that 'MEMC1a and MEMC1 w/o workarounds perform the same'. I've identified three performance differences between MEMC1 and MEMC1a systems, only one of which is the result of a workaround:

1) Consecutive I-cycles are not allowed
This is 'the workaround', and is described in section 7.5 of the ARM2 datasheet. Essentially, it appears MEMC1 does not handle back-to-back I-cycles correctly, messing up the merged instruction fetch* that follows. The workaround essentially means that in a chain of consecutive I-cycles, all but the first are converted into N-cycles.

This affects MUL and MLA instructions, which will run at around half speed as a result, as well as being susceptible to delays caused by DMA. As a side effect of this, MEMC won't merge the following instruction fetch, but _will_ (pointlessly) merge the I-cycle with the first of the converted N-cycles.
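
A small sketch of the cycle conversion described in 1), assuming a bus trace is represented as a list of "N", "S" and "I" strings. That representation and the function name are invented for the example; it only shows the rewrite rule, not where in the hardware it is applied.

```python
def apply_memc1_workaround(cycles):
    """In each run of consecutive I-cycles, keep the first and turn the
    rest into N-cycles, as the workaround is described above."""
    out = []
    prev_was_i = False
    for cycle in cycles:
        if cycle == "I" and prev_was_i:
            out.append("N")   # second and later I-cycles in a run
        else:
            out.append(cycle)
        prev_was_i = (cycle == "I")  # track the ORIGINAL cycle type
    return out


print(apply_memc1_workaround(["S", "I", "I", "I", "S"]))
# ['S', 'I', 'N', 'N', 'S']
```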

2) Merging is not allowed when A2 and A3 are set
This is actually documented in the MEMC datasheet, but is largely missed. MEMC will not merge a fetch where the target address has both A2 and A3 set. Therefore any instruction eligible for a merge will not merge if 16-byte aligned (as the merged fetch would be from PC+12). The datasheet states that this is a result of the logic limiting back-to-back S-cycles.

This also affects the early merged cycle in MUL/MLA mentioned above, adding an additional quirk to those instructions.

This is also mentioned in the VTI Databook 1990, which is ostensibly describing MEMC1a. However timings measured off an ARM2/MEMC1a machine suggest that this doesn't actually apply to the later chip.
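
The A2/A3 restriction in 2) boils down to a bit test on the fetch address. A hedged sketch follows; the function names are hypothetical, and "PC+12" is taken here to mean the address of the instruction plus 12, which is how the 16-byte-aligned case above works out.

```python
def merged_fetch_address(instruction_address):
    """Address of the fetch that would be merged (instruction address + 12)."""
    return instruction_address + 12


def memc1_can_merge(fetch_address):
    """MEMC1 refuses to merge when A2 and A3 are both set in the target."""
    return (fetch_address & 0xC) != 0xC


for instr in (0x8000, 0x8004, 0x8008, 0x800C):
    fetch = merged_fetch_address(instr)
    print(hex(instr), hex(fetch), memc1_can_merge(fetch))
# Only the 16-byte-aligned instruction (0x8000) fetches from an address
# ending in 0xC, so it is the only one here that cannot merge.
```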

3) I-cycles run at the speed of the previous cycle

This one's a bit more speculative, but appears to be supported by the available data and does seem to show an additional genuine (and not documented) bug in MEMC1. It could probably be confirmed with an MEMC1 machine using a scope on the PH1 and MREQ pins.

I-cycles should take 1 cycle of MCLK (8 MHz in all affected machines), however if the previous cycle was an N-cycle then it actually seems to take 2 cycles. This has several effects - MUL, MLA and data processing instructions with register shift will take an additional cycle if the previous instruction fetch (of PC+8) was 16-byte aligned. LDR will always take one cycle longer than it should as the previous cycle is the data fetch, which is always an N-cycle. LDM will take one cycle longer if the last memory read was an N-cycle; this is always the case for a single register LDM (same as LDR), but may not be the case for a multi-register load, hence the observation that loading a second register in LDM is 'free'.

Merged fetches do _not_ count as N-cycles here, and an I-cycle following a merged fetch will run at full speed.

I don't _think_ the slowed I-cycles are performing memory accesses, but I could be wrong.
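
Rule 3) is speculative, but as stated it can be written down in a couple of lines. In this sketch the previous cycle is one of "N", "S", "I" or "merged"; those labels and the function name are assumptions for illustration, not something taken from Arculator's source.

```python
def memc1_i_cycle_mclks(previous_cycle):
    """MCLK cycles taken by an I-cycle on MEMC1 under the rule above:
    2 after a plain N-cycle, otherwise 1 (merged fetches do not count)."""
    return 2 if previous_cycle == "N" else 1


print(memc1_i_cycle_mclks("N"))       # 2: e.g. after LDR's data fetch
print(memc1_i_cycle_mclks("S"))       # 1
print(memc1_i_cycle_mclks("merged"))  # 1: merged fetch, full speed
```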

You may notice that all three issues impact on MUL/MLA - clearly a difficult instruction to get right!


I hope this was of interest to someone! The upshot of implementing all this is that Arculator's !Si instruction timings in MEMC1 mode now match those of a real machine.


* An I-cycle will break any chain of sequential accesses that may be taking place, guaranteeing that the next memory cycle (always an instruction fetch) will have to be an N-cycle. To mitigate this, MEMC implements a 'merge' optimisation where an I-cycle followed by an S-cycle will be 'merged' into a single N-cycle. This occurs at the end of LDR, LDM, MUL, MLA, and data processing instructions with register specified shifting, and saves a cycle.

Post by myelin » Fri Sep 13, 2019 10:08 pm

Nice sleuthing!

This is really interesting; I'm hoping to eventually design an FPGA MEMC replacement, to allow all socketed MEMC boards to be upgraded to 16MB, and it would be nice to give it the ability to replicate original timings (plus all the expected stuff: configurable clock output, doing all accesses in a single cycle if the clock is slow enough, etc.)