ARM Offers First Clockless Processor Core
Sam Haine '95 writes "EETimes is reporting that ARM Holdings have developed an asynchronous processor based on the ARM9 core. The ARM996HS is thought to be the world's first commercial clockless processor. ARM announced they were developing the processor back in October 2004, along with an unnamed lead customer, which it appears could be Philips. The processor is especially suitable for automotive, medical and deeply embedded control applications. Although reduced power consumption, due to the lack of clock circuitry, is one benefit, the clockless design also produces a low electromagnetic signature because of the diffuse nature of digital transitions within the chip. Because clockless processors consume zero dynamic power when there is no activity, they can significantly extend battery life compared with clocked equivalents."
Synchronisation? (Score:5, Interesting)
timing (Score:2, Interesting)
The next palm pilot? (Score:1, Interesting)
I worked for ARM... (Score:5, Interesting)
Truly wonderful and very special company for the first two of those years, then it slowly and surely went downhill - these days, it's just another company. ARM's culture didn't manage to survive its rapid growth in those few years from less than two hundred to more than seven hundred.
VAX 8600 (Score:5, Interesting)
Re:The next palm pilot? (Score:5, Interesting)
Basically a good asynchronous chip would draw almost no power while it's waiting for something (like I/O events from network, keyboard, timers, etc). And it would instantly ramp up and handle the event as fast as it possibly could. The speed is generally a function of voltage and temperature. It's how fast the gates can switch and perform interlocks under current conditions, rather than what rate a clock is driving everything.
It's going to be interesting to see what performance metric is used on these "clockless" chips by the industry and by the marketing/sales types. MIPS? FLOPS? SPECmark? Not that MHz was ever a good benchmark, but things like MIPS are a lot easier to manipulate to make your product appear faster than your competitors'.
Other Uses (Score:2, Interesting)
Asynchronous == Clockless (Score:5, Interesting)
But your assertion about critical path is slightly off. Asynch processors still have a critical path. If you imagine the components as a bucket brigade and the data as the buckets, then they may not all be heaving the buckets at exactly the same time anymore, but they will still be slowed down by the slowest man in the line. The difference is that the critical path is now dynamic. You don't have to time everything to the static, worst-case component on your chip. If you consistently don't use the slowest components (say, the multiply unit), then you will get a faster IPT (instructions per time) on average.
And yes, you don't have clock skew any more which is nice, but you now have to handshake data back-and-forth across the chip. Of course putting decoupling circuitry in can help.
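To put rough numbers on the bucket-brigade point, here is a minimal sketch. The unit names and latencies are made up for illustration, the model runs one instruction at a time with no pipelining, and it ignores the handshake overhead mentioned above, so treat it as a toy and not a model of any real core.

```python
# Hypothetical per-unit latencies in nanoseconds (illustrative values only).
UNIT_LATENCY_NS = {"alu": 2.0, "load": 3.0, "multiply": 5.0}

def synchronous_time_ns(instructions):
    """Clocked model: every instruction takes one cycle sized for the slowest unit."""
    clock_period = max(UNIT_LATENCY_NS.values())
    return clock_period * len(instructions)

def asynchronous_time_ns(instructions):
    """Self-timed model: each instruction takes only as long as the unit it uses."""
    return sum(UNIT_LATENCY_NS[op] for op in instructions)

# An instruction mix that never touches the slow multiply unit.
no_multiply_mix = ["alu"] * 8 + ["load"] * 2
# A mix that uses nothing but the slowest unit.
all_multiply_mix = ["multiply"] * 10

print(synchronous_time_ns(no_multiply_mix), asynchronous_time_ns(no_multiply_mix))
print(synchronous_time_ns(all_multiply_mix), asynchronous_time_ns(all_multiply_mix))
```

With the multiply-free mix the self-timed model finishes well ahead of the clocked one; with the all-multiply mix the two are identical, which is the "slowest man in the line" case.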
Not That Difficult (Score:5, Interesting)
One of the neatest things about asynch processors is their ability to run in a large range of voltages. You don't have to worry that lowering the voltage will make you miss gate setup timing since the thing just slows down. Increasing the voltage shortens rise time and propagation delay and speeds the thing up. The grad students had a great demo where they powered one of their CPUs using a potato with some nails in it (like from elementary school science class.) They called it the 'potato chip'.
How fast is it? (Score:3, Interesting)
Clockless chip overview (Score:5, Interesting)
This seems to be a good overview of clockless chips. I can't vouch for its accuracy (not my area), but the source - IEEE Computer Magazine - should be good. The article was published March 2005.
http://csdl2.computer.org/comp/mags/co/2005/03/r3018.pdf [computer.org] (warning: PDF)
Re:Not That Difficult (Score:5, Interesting)
Transmeta's Crusoe was supposed to be clockless (Score:5, Interesting)
What did I miss? I remember the hype, the early diagrams of how it was all supposed to weave through without the need for a clock. Would someone care to elaborate on the post-mortem of what was supposed to be the first clockless processor, 4 years ago?
Re:Horrible summary (Score:3, Interesting)
The most glaring is that you assume that synchronous processors can only have one clock - that's incorrect. While the clock tick is of fixed length (by design), the global clock (as seen by external parties) may run at a different speed than internal clocks.
If a path of logic takes 5ns to complete, and its clock matches exactly, then you are perfectly optimized. You are hampered not by the clock, but by the transistor's switching speed. This path will have the same delay, regardless of whether it is driven by a clock.
You might be getting confused because you are thinking about pipelining, where the longest stage dictates stage length. If everything is driven by one clock, you create waste because some partitions will finish sooner than others, and are therefore idle. However, modern designs now employ skew-tolerant clocking [amazon.com]. By using multiple clocks, the issues created by clock skew can be entirely avoided. The walls between pipeline stages are destroyed and skewing delay negated.
Your issue with propagation delay of the clock is also not of great concern, in most cases. Synchronous chips can employ distributed clocks, islands of asynchronous logic, and the Pentium-4 actually has stages to help propagate the clock. However, most processors are unlike the speed demon design of the P4, and clock speed is limited by issues other than clock propagation. Currently, that limiting factor is power. In dynamic logic, frequency has a direct relationship to power consumption.
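For anyone who wants the frequency/power relationship spelled out, the usual textbook approximation for CMOS switching power is P = a * C * V^2 * f (activity factor, switched capacitance, supply voltage, frequency). A quick sketch follows; the component values are invented for illustration and leakage is ignored.

```python
def dynamic_power_watts(activity_factor, capacitance_farads, supply_volts, frequency_hz):
    """Textbook CMOS switching power: P = a * C * V^2 * f (leakage not included)."""
    return activity_factor * capacitance_farads * supply_volts ** 2 * frequency_hz

# Illustrative numbers only: 10% activity, 1 nF switched capacitance, 1.2 V supply.
base = dynamic_power_watts(0.1, 1e-9, 1.2, 1e9)
doubled_clock = dynamic_power_watts(0.1, 1e-9, 1.2, 2e9)
# With nothing switching, the dynamic term drops to zero.
idle = dynamic_power_watts(0.0, 1e-9, 1.2, 1e9)

print(base, doubled_clock, idle)
```

Power scales linearly with frequency and with the square of the voltage, and the zero-activity case is the "zero dynamic power when there is no activity" claim from the summary.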
Re:VAX 8600 (Score:3, Interesting)
The VAX 8600 was produced by a team at DEC that had a heritage doing large computers (PDP-10, DECSYSTEM-20). It was competing, internally, with a different group with a "midrange" (VAX) heritage, who produced the VAX 8800 and some other machines. There was no love lost between the groups. They had very different design philosophies, and the 8600 crowd was rather amazed that the 8800 actually worked.
Intel has rival groups too, of course. ISTM that the ones who produced the NetBurst machines (Pentium IV) had the upper hand for a while, but the Israelis who put out Pentium M proved the value of the older Pentium III base, and that evolved into the new Intel Core. Both are clocked, of course, but Pentium IV was designed to have the highest advertised clock speed, as if it mattered. It was one hot chip, much too literally. Async processors move even farther away from that, of course.
You are confused (Score:5, Interesting)
Unfortunately, self-clocked design (like the reported ARM core uses) is also sometimes called "asynchronous" logic design; however, this is a completely different kind of thing than the "asynchronous" combinatorial logic used in clock-based design. Self-clocked design also does combinatorial logic in latched stages, but uses a self-timed asynchronous protocol to run the latches instead of a synchronous clock. Basically, the combinatorial logic figures out when it's finished, and tells both the next stage ("data's ready, latch it") and the input latch from the previous stage ("I'm done; gimme some more data").
To close the loop, each stage can wait until there's new data ready at its inputs, and space to put the output data. Thus, in absence of some bottleneck, your chip will simply run as fast as it can.
To overclock a self-timed design, you simply increase the voltage. No need to screw around with clock multipliers; as long as your oxide holds up, your traces don't migrate, and the chip doesn't melt...
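If it helps to see the "data's ready" / "gimme more" loop in action, here is a rough software sketch of it. The assumptions are mine: each stage has a fixed latency and a single output latch, the source always has data, and the sink always has room. Real self-timed logic is far messier; this only shows the firing rule.

```python
def simulate_self_timed_pipeline(stage_latencies, num_items):
    """Return the time at which each item leaves the last stage.

    A stage starts work on an item only when (a) the previous stage has
    finished that item and (b) its own output latch is free again.
    """
    n = len(stage_latencies)
    start = [[0.0] * num_items for _ in range(n)]
    done = [[0.0] * num_items for _ in range(n)]
    for k in range(num_items):
        for i in range(n):
            # "Data's ready, latch it": upstream has produced item k.
            input_ready = done[i - 1][k] if i > 0 else 0.0
            if k == 0:
                output_free = 0.0
            elif i + 1 < n:
                # "Gimme some more data": downstream latched our previous result.
                output_free = start[i + 1][k - 1]
            else:
                # Last stage: the sink takes results as soon as they are done.
                output_free = done[i][k - 1]
            start[i][k] = max(input_ready, output_free)
            done[i][k] = start[i][k] + stage_latencies[i]
    return done[-1]

print(simulate_self_timed_pipeline([1, 3, 1], 3))
```

Once the pipeline fills, results come out spaced by the latency of the slowest stage, with no clock anywhere in the model.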
Re:Why is async good (Score:2, Interesting)
Re:Why is async good (Score:5, Interesting)
You're right about that. I research side channel attacks on crypto hardware, and my first response to this was: well, this would make EM analysis more complicated. For those not familiar with the general approach, in side channel attacks you don't try to do anything as complicated as breaking the underlying math of the crypto. Instead you observe the hardware for emissions that can give some clues as to the instructions being carried out. If your observations help give you any info about what the chip is processing, you might learn parts of keys or gain a statistical advantage in other attacks. So if it's harder to observe signals emitted electromagnetically from the chip, then attacking the hardware is harder.
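As a toy illustration of the general approach (not an attack on any real chip), the sketch below assumes the observed emission is roughly proportional to the number of set bits in a value the device handles, plus noise. That Hamming-weight leakage model, the XOR "cipher", the noise level and the key value are all assumptions chosen to keep the example small.

```python
import random

def hamming_weight(value):
    return bin(value).count("1")

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate_traces(secret_key, num_traces, noise_sigma, rng):
    """Device side: leak the Hamming weight of (input XOR key) plus Gaussian noise."""
    inputs = [rng.randrange(256) for _ in range(num_traces)]
    leaks = [hamming_weight(p ^ secret_key) + rng.gauss(0.0, noise_sigma) for p in inputs]
    return inputs, leaks

def recover_key_byte(inputs, leaks):
    """Attacker side: pick the key guess whose predicted leakage correlates best."""
    def score(guess):
        predicted = [hamming_weight(p ^ guess) for p in inputs]
        return pearson(predicted, leaks)
    return max(range(256), key=score)

rng = random.Random(1)
inputs, leaks = simulate_traces(0x3C, 2000, 1.0, rng)
print(hex(recover_key_byte(inputs, leaks)))
```

The attacker never touches the math of the cipher; the only input is the noisy measurements. Anything that weakens the link between the measurements and the data being processed makes the scoring step harder.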
Re:Livin' large and in charge (Score:3, Interesting)
Re:Transmeta's Crusoe was supposed to be clockless (Score:3, Interesting)
Re:Not That Difficult (Score:3, Interesting)
It didn't really work out. While we could easily get prototypes to work well over rated temperature ranges, getting the production version to work reliably was an order of magnitude more effort than the clocked version. As the complexity of the logic increases, the number of potential race conditions increases exponentially. So every nth board had to be scrapped in the early production runs.
It turns out, for TTL and its successors the same manufacturer can produce two copies of the same part that are an order of magnitude different in speed. We would get situations where a signal would propagate through five gates faster than a single gate on another path, so if you missed a path during the design phase you were sure to see a failure eventually. Also, there were no commercial async design tools available at the time, so simulation was definitely "roll-your-own".
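The "five gates faster than one gate" problem is easy to check on paper once you stop using typical delays and use the min/max spread instead. A minimal sketch, with invented delay figures (the 1 ns to 10 ns spread just mirrors the order-of-magnitude variation described above):

```python
def path_delay_bounds(num_gates, gate_min_ns, gate_max_ns):
    """Fastest and slowest possible delay for a chain of identical gates."""
    return num_gates * gate_min_ns, num_gates * gate_max_ns

def ordering_is_guaranteed(first_path, second_path):
    """True only if the first path's slowest case beats the second path's fastest case."""
    _, first_max = first_path
    second_min, _ = second_path
    return first_max < second_min

# Order-of-magnitude part-to-part spread: 1 ns to 10 ns per gate (hypothetical).
single_gate = path_delay_bounds(1, 1.0, 10.0)
five_gates = path_delay_bounds(5, 1.0, 10.0)
print(ordering_is_guaranteed(single_gate, five_gates))   # race is possible

# Tightly binned parts: 1 ns to 1.5 ns per gate (hypothetical).
single_tight = path_delay_bounds(1, 1.0, 1.5)
five_tight = path_delay_bounds(5, 1.0, 1.5)
print(ordering_is_guaranteed(single_tight, five_tight))
```

With the wide spread, a slow single gate (10 ns) can lose to a fast chain of five (5 ns), so any design that depends on that ordering will eventually fail on some board.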
Another problem we didn't even consider early on was the inability of the repair technicians to understand the circuit, so getting a board repaired required the assistance of an engineer.
We would have been fine for onesey-twosey production, but for large commercial runs? The benefits just didn't outweigh the extra hassle.
I'm curious to see if they have any more trouble than normal (for a CPU) when they ramp up to production volumes.
Re:You are confused (Score:3, Interesting)
That sounds a bit like a dataflow language [wikipedia.org]. Maybe you could make a program that automatically converts a program made in such a language into a chip design? Then we'd only need desktop chip manufacturing to make true open-sourced computing a reality...
But no, such chips would be illegal, since they wouldn't necessarily have DRM.
Re:Imminent (Score:3, Interesting)
Re:only talk (Score:4, Interesting)
Sun has clockless chips up and running (real silicon, not sims) and they have done some interesting things, but they don't have a complete system that's ready to ship. And there are other components out there that use the clockless philosophy to do certain things, but they're not CPUs in any sense. To give credit where credit is due, as the parent post points out, ARM beat Sun out the door with a clockless CPU that is a drop-in replacement (to some degree, anyway -- not clear how much) for an existing, established architecture. But that wasn't/isn't Sun's goal (although perhaps it should be...). They're pushing in new directions, not using this to reimplement current architectures.
Re:Synchronous vs. Asynchoronous (Score:3, Interesting)
Re:Transmeta's Crusoe was supposed to be clockless (Score:3, Interesting)
1997 - Intel develops an asynchronous, Pentium-compatible test chip that runs three times as fast, on half the power, as its synchronous equivalent. The device never makes it out of the lab."
So why didn't Intel's chip make it out of the lab? "It didn't provide enough of an improvement to justify a shift to a radical technology," Tristram says. "An asynchronous chip in the lab might be years ahead of any synchronous design, but the design, testing and manufacturing systems that support conventional microprocessor production still have about a 20-year head start."
certainly not (Score:3, Interesting)
Linux is more portable. Linux runs on the original 68000. Linux was just ported to the Blackfin DSP. There seem to be about a dozen crappy little no-MMU processors that can run Linux.
Linux requires a gcc-like compiler, but not necessarily gcc. IBM and Intel have both produced non-gcc compilers that are able to compile Linux.