Why did CISC prevail?

Coeus
Posts: 963
Joined: Mon Jul 25, 2016 11:05 am

Why did CISC prevail?

Post by Coeus » Wed Feb 07, 2018 10:19 pm

When talking about designing the ARM, Steve Furber said that the other processors of the day could not make effective use of the available memory bandwidth, and that this was part of the rationale for simpler, faster instructions, i.e. RISC. RISC took off - all the Unix workstation manufacturers went for it; it was only the PC that didn't, presumably for reasons of backwards compatibility.

Now, the majority of "big" computers are CISC again. There's IBM's z Series, and we have server farms running Linux (which is perfectly capable of running on various RISC processors, and seems to run on just about anything) on standard PC (CISC) processors.

Is it that the situation has turned around, in that memory has not got much faster while CPUs have shrunk, and hence got faster? Or is it just that they're cheap?

And yet RISC is alive and well in ARM, which is almost certainly the most widely deployed processor ever made, with mobile phones, routers, set-top boxes, and lots of IoT to come.

1024MAK
Posts: 7787
Joined: Mon Apr 18, 2011 4:46 pm
Location: Looking forward to summer in Somerset, UK...

Re: Why did CISC prevail?

Post by 1024MAK » Thu Feb 08, 2018 12:25 am

Memory (in the form of DRAM) has not actually got very much faster. It's just that more "tricks" have been developed to get more data out of it within the same memory cycle / access. Meanwhile, cache memory has allowed CPU speeds to increase, and the CISC CPUs have become far more sophisticated. And now, of course, we have processors with multiple computing cores.

This has happened due to most computer companies jumping on the IBM-compatible bandwagon.

Mark
For a "Complete BBC Games Archive" visit www.bbcmicro.co.uk NOW!
BeebWiki - for answers to many questions...

crj
Posts: 834
Joined: Thu May 02, 2013 4:58 pm

Re: Why did CISC prevail?

Post by crj » Thu Feb 08, 2018 1:26 am

Lots of complicated technical and political things have happened in the decades since RISC was first suggested.

However, two strands stand out for me.

Firstly, the Pentium was much more RISC-like than the 486. The superscalar dispatch only worked on a RISCy subset of the instruction set, aligned in RISCy ways in memory. They might almost have billed it as a RISC CPU which could also - slowly - execute legacy x86 code... if their marketing department hadn't spent the previous few years dissing RISC.

Secondly, one of the fundamental premises of RISC doesn't hold as firmly as it did in the eighties. RISC asserts that workloads are so diverse that it's futile to try to provide an effective selection of special-purpose instructions. It reasons that if you instead simplify the instruction set and microarchitecture you can reduce the cycles per instruction and increase the clock rate, thereby achieving a higher instruction throughput for every workload. Trouble is, workloads aren't that diverse any more. As a result, even RISC systems have dedicated support for graphics rendering, vector processing, DSP, crypto, etc. these days. I'd argue that ARM hasn't truly been RISC for well over a decade.

So I don't think CISC prevailed. I think CISC became more RISC-like and RISC became more CISC-like. Right now, ARM and x86, the world's two dominant CPU architectures, are way more similar than they were in 1988.

fordp
Posts: 958
Joined: Sun Feb 12, 2012 9:08 pm
Location: Kent, England

Re: Why did CISC prevail?

Post by fordp » Thu Feb 08, 2018 7:55 am

Coeus wrote:When talking about designing the ARM, Steve Furber said that the other processors of the day could not make effective use of the available memory bandwidth, and that this was part of the rationale for simpler, faster instructions, i.e. RISC. RISC took off - all the Unix workstation manufacturers went for it; it was only the PC that didn't, presumably for reasons of backwards compatibility.

Now, the majority of "big" computers are CISC again. There's IBM's z Series, and we have server farms running Linux (which is perfectly capable of running on various RISC processors, and seems to run on just about anything) on standard PC (CISC) processors.

Is it that the situation has turned around, in that memory has not got much faster while CPUs have shrunk, and hence got faster? Or is it just that they're cheap?

And yet RISC is alive and well in ARM, which is almost certainly the most widely deployed processor ever made, with mobile phones, routers, set-top boxes, and lots of IoT to come.
The advances from the RISC chips were put into the CISC chips. The big boost in speed came from better memory access, pipelining, register-to-register operation and better compilers. Those techniques were applied to all modern chips!
FordP (Simon Ellwood)
Time is an illusion. Lunchtime, doubly so!

SteveBagley
Posts: 156
Joined: Sun Mar 15, 2015 8:44 pm

Re: Why did CISC prevail?

Post by SteveBagley » Thu Feb 08, 2018 8:51 am

crj wrote:So I don't think CISC prevailed. I think CISC became more RISC-like and RISC became more CISC-like. Right now, ARM and x86, the world's two dominant CPU architectures, are way more similar than they were in 1988.
Absolutely, my understanding is that modern ARM CPUs have microcode and crack instructions into micro-ops just as much as x64 does…

In fact, you could probably build an argument that ARM hasn’t been truly RISC since they added the MUL instruction ;)

Steve

Richard Russell
Posts: 459
Joined: Sun Feb 27, 2011 10:35 am
Location: Downham Market, Norfolk

Re: Why did CISC prevail?

Post by Richard Russell » Thu Feb 08, 2018 9:54 am

Something else I find interesting is that the 64-bit ARM architecture has ditched one of the key features of the original 32-bit instruction set: that every instruction can be conditional. There are just a few conditional instructions left in the 64-bit set, such as conditional jumps. This is much more in line with the way CISC processors typically work, and reading between the lines it's because the internal CPU architecture is now more CISC-like too: "IT blocks are a useful feature of T32, enabling efficient sequences that avoid the need for short forward branches around unexecuted instructions. However, they are sometimes difficult for hardware to handle efficiently".
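As a rough illustration (the assembly in the comments is just the sort of sequence involved, not any particular compiler's output), the same source-level select can use a predicated instruction on 32-bit ARM but needs one of the few surviving conditional instructions, a conditional select, on 64-bit ARM:

/* A branchless select.  On 32-bit ARM any instruction can be predicated:
 *     CMP   r0, r1
 *     MOVLT r0, r1          ; executes only when a < b
 * On 64-bit ARM only a handful of conditional instructions remain, so the
 * same idiom becomes a conditional select:
 *     CMP   w0, w1
 *     CSEL  w0, w1, w0, LT
 * (Illustrative sequences only.)
 */
int max_of(int a, int b)
{
    if (a < b)
        a = b;
    return a;
}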

Richard.

SteveBagley
Posts: 156
Joined: Sun Mar 15, 2015 8:44 pm

Re: Why did CISC prevail?

Post by SteveBagley » Thu Feb 08, 2018 11:21 am

Richard Russell wrote:Something else I find interesting is that the 64-bit ARM architecture has ditched one of the key features of the original 32-bit instruction set: that every instruction can be conditional. There are just a few conditional instructions left in the 64-bit set, such as conditional jumps. This is much more in line with the way CISC processors typically work, and reading between the lines it's because the internal CPU architecture is now more CISC-like too: "IT blocks are a useful feature of T32, enabling efficient sequences that avoid the need for short forward branches around unexecuted instructions. However, they are sometimes difficult for hardware to handle efficiently".
My understanding was that the branch prediction and speculative execution in 64-bit ARM CPUs effectively make conditional instruction execution redundant, since most of the time the branch predictor will give the same benefit. Removing the 4 bits of the opcode used to encode the condition code allows more registers to be added to the CPU, and that has huge speed benefits.

A quick search yields this paper analysing the effect of removing the condition codes and adding more registers:
http://ieeexplore.ieee.org/document/570 ... 3498&tag=1

Steve

Richard Russell
Posts: 459
Joined: Sun Feb 27, 2011 10:35 am
Location: Downham Market, Norfolk

Re: Why did CISC prevail?

Post by Richard Russell » Thu Feb 08, 2018 11:47 am

SteveBagley wrote:My understanding was that the branch prediction and speculative execution in 64-bit ARM CPUs effectively make conditional instruction execution redundant
I suppose the challenge for ARM is how to maintain its edge in terms of power efficiency, which is presumably why it has captured most of the 'mobile' market, whilst introducing such 'CISC-like' complexities into the CPU.

Personally I'm sad that Intel threw in the towel and no longer compete in that market (e.g. with their Atom CPUs); my first 'smart' phone was an Intel Android device and I've also got an Intel Android tablet (the Tesco Hudl2). We won't see the likes of those again.

Richard.

crj
Posts: 834
Joined: Thu May 02, 2013 4:58 pm

Re: Why did CISC prevail?

Post by crj » Thu Feb 08, 2018 1:46 pm

SteveBagley wrote:In fact, you could probably build an argument that ARM hasn’t been truly RISC since they added the MUL instruction ;)
The ARM was never a purist RISC architecture. If you want to see what "true" RISC looks like, read up on the early MIPS chips.

Even before MUL, the ARM had register-specified shift amounts, LDM/STM, etc. And ARM chose pipeline stalls over branch delay slots for loads, stores and branches. While the early ARMs didn't contain microcode per se, there was a two-bit feedback channel which turned the instruction decoder into a state machine.

Embedding the PC in the general-purpose register bank was also both non-RISC-like and, with hindsight, probably a mistake. The MIPS instead embeds a hardwired-zero register, and I gather the 64-bit ARM has gone that route, too.


It's worth noting, incidentally, that every instruction being conditional is an ARM thing. It's well within the RISC ethos, but other RISC architectures don't do it, and ARM hasn't become less RISCy by removing it.

SteveBagley
Posts: 156
Joined: Sun Mar 15, 2015 8:44 pm

Re: Why did CISC prevail?

Post by SteveBagley » Thu Feb 08, 2018 2:23 pm

Richard Russell wrote:I suppose the challenge for ARM is how to maintain its edge in terms of power efficiency, which is presumably why it has captured most of the 'mobile' market, whilst introducing such 'CISC-like' complexities into the CPU.
I think it is as much a case of making design assumptions that are relevant for the time you are designing the CPU -- conditional instructions make sense with in-order CPUs where memory is small and faster than the CPU, since they enable you to write tight assembly (although I wonder how much use compilers actually made of them!).

Fast forward to now, and memory access is 200x slower than register access, but you have lots of it, so you don't need to worry so much about instruction compactness. Couple that with a branch predictor and instructions being cracked into micro-ops that are executed out of order, and conditional instructions start to make less sense (certainly compared to using those opcode bits to allow more registers instead).

Steve

crj
Posts: 834
Joined: Thu May 02, 2013 4:58 pm

Re: Why did CISC prevail?

Post by crj » Thu Feb 08, 2018 2:45 pm

Everybody who's designing CPUs is engaged in a complex multi-dimensional tradeoff. When you improve one thing, you make something else worse.

Code density is an issue for cache occupancy and instruction fetch bandwidth as well as memory usage, for example. Allowing more registers will make the CPU's register file slower as well as bigger; worse, it will slow down context switches and possibly also impair interrupt latency.


As for how good a target conditional instructions are for compilers: as of the mid nineties, both Norcroft and GCC used them, mainly as a peephole optimisation. ISTR that one of the reasons Norcroft outperformed GCC was that it was marginally more shrewd at generating instruction sequences that were a good target for such peepholing. Back then, at least, neither of them did everything an experienced human could do in terms of adjusting register usage to make more use of conditional instructions possible.
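For what it's worth, the classic peephole target is a guarded statement with a tiny body, something like the sketch below (the assembly comment shows the kind of sequence such a peephole aims for, not the output of either compiler):

/* Clamp an accumulator.  Instead of a compare, a conditional branch and a
 * branch target, the guarded statement can peephole into a compare plus a
 * single predicated instruction on 32-bit ARM:
 *     ADD   r0, r0, r1
 *     CMP   r0, r2
 *     MOVHI r0, r2          ; takes effect only when total > limit
 */
unsigned clamped_add(unsigned total, unsigned step, unsigned limit)
{
    total += step;
    if (total > limit)
        total = limit;
    return total;
}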

Richard Russell
Posts: 459
Joined: Sun Feb 27, 2011 10:35 am
Location: Downham Market, Norfolk

Re: Why did CISC prevail?

Post by Richard Russell » Thu Feb 08, 2018 2:55 pm

SteveBagley wrote:memory access is 200x slower than register access, but you have lots of it, so you don't need to worry so much about instruction compactness.
On-chip caches are relatively small, however, so 'instruction compactness' is still relevant - hence the Thumb instruction set (although it's noteworthy that there's no 64-bit equivalent to Thumb).

Richard.

BigEd
Posts: 2060
Joined: Sun Jan 24, 2010 10:24 am
Location: West

Re: Why did CISC prevail?

Post by BigEd » Thu Feb 08, 2018 4:34 pm

(I found myself reading this presentation by Richard Grisenthwaite of ARM, from 2009 I think, on the evolution of ARM offerings and the difficulties of designing a successful architecture.)

While SPARC and POWER might still have a little life in them, and even MIPS too, I think we should look to RISC-V for a possible future RISC contender.

crj
Posts: 834
Joined: Thu May 02, 2013 4:58 pm

Re: Why did CISC prevail?

Post by crj » Thu Feb 08, 2018 4:51 pm

Richard Russell wrote:hence the Thumb instruction set
The Thumb instruction set is a major bugbear of mine. I could rant a lot about it, but shall restrain myself.

The one broad point I'll make is that if you look at it carefully and think about what's really going on, you'll likely spot that the technical advantages were rather weaker than the marketing might have suggested. To pick just one example, cast your mind back to the mid nineties and figure out just how much cache memory you could fit in the same silicon area as the Thumb decoder...

Rich Talbot-Watkins
Posts: 1339
Joined: Thu Jan 13, 2005 5:20 pm
Location: Palma, Mallorca

Re: Why did CISC prevail?

Post by Rich Talbot-Watkins » Thu Feb 08, 2018 4:57 pm

Also the Thumb instruction set didn't provide access to the floating point instructions, so you had to switch back to regular mode each time you wanted to execute floating point (which, increasingly, was all the time).

crj
Posts: 834
Joined: Thu May 02, 2013 4:58 pm

Re: Why did CISC prevail?

Post by crj » Thu Feb 08, 2018 5:01 pm

BigEd wrote:I think we should look to RISC-V for a possible future RISC contender.
I don't think there will ever be a future RISC contender. At least, not one that contends on the basis of being RISC. Such a large proportion of code is now in relatively high-level languages that the hurdle for creating a new CPU isn't huge, and there are a lot of minor contenders like the Xtensa and Nios. But ARM now has market position, and the licensing fees are apparently not large, considering how gigantic the other costs inherent to making custom silicon are.

I don't know what the next big thing will be. Asynchronous logic looked like it might be big, then VLIW, now people are muttering a little more than they used to about quantum computing. If I had to guess, I'd say the way to become as big a player as Intel or ARM would be to come up with a good practical way to beat Amdahl's Law and speed up compute tasks that are hard to parallelise.

crj
Posts: 834
Joined: Thu May 02, 2013 4:58 pm

Re: Why did CISC prevail?

Post by crj » Thu Feb 08, 2018 5:46 pm

Rich Talbot-Watkins wrote:Also the Thumb instruction set didn't provide access to the floating point instructions
Better ideas than Thumb: number 1 in a series of umpteen ad nauseam, by crj


Define a 12-bit encoding for the most frequently used simple instructions. Define a 16-bit encoding for even more of the most frequently-used simple instructions, arranging that the 12-bit encodings form a subset with the top four bits set.

Now, say that a 32-bit "instruction" with the NV condition code is actually a 12-bit instruction followed by a 16-bit instruction.

There's no need for a separate processor mode. The two instructions are both simple, so you can treat them as indivisible from an interrupt-handling point of view without affecting latency. That completely eliminates any complications for PC/PSR values. There's no overhead for freely mixing two-instruction words with normal ARM instructions, so no need for horrible bodges like Thumb's branches.

If you tweak the bus protocol just a little, you can arrange for the ARM to pre-fetch and start executing the high-order sub-instruction from a 32-bit word before the low-order sub-instruction has arrived from the memory system.
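Purely to make the packing concrete - the field positions here are one plausible choice, not a specification - the decode step amounts to something like:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decode of the scheme above: a 32-bit word whose condition
 * field is NV (0xF) carries a 12-bit sub-instruction and a 16-bit
 * sub-instruction instead of being executed as a normal ARM instruction.
 * The exact field layout is illustrative only.
 */
static bool split_packed_word(uint32_t word, uint16_t *first, uint16_t *second)
{
    if ((word >> 28) != 0xFu)         /* condition field is not NV...        */
        return false;                 /* ...so it is an ordinary instruction */

    *first  = (uint16_t)((word >> 16) & 0x0FFFu);   /* 12-bit encoding */
    *second = (uint16_t)(word & 0xFFFFu);           /* 16-bit encoding */
    return true;
}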

*sigh* I said I wouldn't rant. I should get a room.

Diminished
Posts: 92
Joined: Fri Dec 08, 2017 9:47 pm

Re: Why did CISC prevail?

Post by Diminished » Sun Feb 18, 2018 8:12 pm

Interesting stuff in this thread.

+1 for the Thumb dislike here too. There is a joke in here somewhere about "sucking your Thumb" but I'll pass - although the 32-bit "Thumb-2" edition of it on the Cortex parts is a lot better.

I liked the conditional instructions in the older 32-bit ARM architectures. A few years ago I was messing about trying to get an ARM microcontroller to bit-bang VGA into a resistor DAC. Since your main loop in this application has to be kept perfectly synchronised to the raster, your code has to be strictly time invariant regardless of what it actually does. It would have required a lot more cycle counting and general messy hackery to make it work without those conditional instructions.
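To give a flavour of what I mean (a sketch, not my actual code): a branch-free select executes the same instructions every pixel, and with the old conditional instructions the equivalent is a fixed-length sequence, so the raster timing never wobbles.

#include <stdint.h>

/* Sketch only: choose between foreground and background levels without a
 * branch, so the pixel loop takes the same number of cycles whichever
 * value wins.  With 32-bit ARM conditional instructions the equivalent is
 * a fixed three-instruction sequence, e.g.
 *     MOV   r3, r_bg
 *     CMP   r_on, #0
 *     MOVNE r3, r_fg
 * (register names are placeholders).
 */
static inline uint8_t select_level(uint32_t pixel_on, uint8_t fg, uint8_t bg)
{
    uint32_t mask = 0u - (uint32_t)(pixel_on != 0);   /* all ones or all zeros */
    return (uint8_t)((fg & mask) | (bg & ~mask));
}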

fordp
Posts: 958
Joined: Sun Feb 12, 2012 9:08 pm
Location: Kent, England

Re: Why did CISC prevail?

Post by fordp » Mon Feb 19, 2018 1:51 pm

crj wrote:
Richard Russell wrote:hence the Thumb instruction set
The Thumb instruction set is a major bugbear of mine. I could rant a lot about it, but shall restrain myself.

The one broad point I'll make is that if you look at it carefully and think about what's really going on, you'll likely spot that the technical advantages were rather weaker than the marketing might have suggested. To pick just one example, cast your mind back to the mid nineties and figure out just how much cache memory you could fit in the same silicon area as the Thumb decoder...
Thumb became Thumb-2 and is now the only instruction set on the fantastically successful Cortex-M ARM microcontrollers. It certainly made for more compact code, and as such fits more instructions into any cache that you have.

The first-generation Thumb was a compromise, however.
FordP (Simon Ellwood)
Time is an illusion. Lunchtime, doubly so!

crj
Posts: 834
Joined: Thu May 02, 2013 4:58 pm

Re: Why did CISC prevail?

Post by crj » Wed Feb 21, 2018 5:40 pm

Ah, but did Thumb become Thumb-2?

Or did they design something new from scratch and call it Thumb-2?

BigEd
Posts: 2060
Joined: Sun Jan 24, 2010 10:24 am
Location: West

Re: Why did CISC prevail?

Post by BigEd » Wed Feb 21, 2018 6:49 pm

crj wrote:Ah, but did Thumb become Thumb-2?

Or did they design something new from scratch and call it Thumb-2?
This might, or might not, help:
ARM instructions have fixed-width 4-byte encodings which require 4-byte alignment. Thumb instructions have variable-length (2- or 4-byte, now known as "narrow" and "wide") encodings requiring 2-byte alignment - most instructions have 2-byte encodings, but bl and blx have always had 4-byte encodings*.

The really confusing bit came in ARMv6T2, which introduced "Thumb-2 Technology". Thumb-2 encompassed not just adding a load more instructions to Thumb (mostly with 4-byte encodings) to bring it almost to parity with ARM, but also extending the execution state to allow for conditional execution of most Thumb instructions, and finally introducing a whole new assembly syntax (UAL, "Unified Assembly Language") which replaced the previous separate ARM and Thumb syntaxes and allowed writing code once and assembling it to either instruction set without modification.

crj
Posts: 834
Joined: Thu May 02, 2013 4:58 pm

Re: Why did CISC prevail?

Post by crj » Thu Feb 22, 2018 2:46 am

BigEd wrote:This might, or might not, help:
Sorry - I was being mainly rhetorical, bordering on bitterly sarcastic. Tone doesn't carry very well online. (-8

Executing Thumb code by mapping it into ARM instructions may have been the thing they managed to patent, but it's a lousy idea in practice. ARM claimed that the Thumb-to-ARM translation came for free, but that wasn't true: although the ARM2/3/6/7 internal architecture is traditionally described as having a three-stage pipeline, it would be more accurate to call it a five-phase one - the three stages, each spanning two clock phases, give six phase slots, of which only five were used. Thumb decode was slotted into the spare sixth phase.

Adding Thumb even to StrongARM or ARM8 in the same way was out of the question, let alone any more modern implementation.
