ARM Offers First Clockless Processor Core
Sam Haine '95 writes "EETimes is reporting that ARM Holdings have developed an asynchronous processor based on the ARM9 core. The ARM996HS is thought to be the world's first commercial clockless processor. ARM announced they were developing the processor back in October 2004, along with an unnamed lead customer, which it appears could be Philips. The processor is especially suitable for automotive, medical and deeply embedded control applications. Although reduced power consumption, due to the lack of clock circuitry, is one benefit, the clockless design also produces a low electromagnetic signature because of the diffuse nature of digital transitions within the chip. Because clockless processors consume zero dynamic power when there is no activity, they can significantly extend battery life compared with clocked equivalents."
Horrible summary (Score:5, Informative)
Re:That's odd (Score:3, Informative)
of course (Score:3, Informative)
ARM, followed by PowerPC, is the most common core for embedded Linux, and embedded Linux boxes far outnumber servers and desktops (where x86 rules).
Re:timing (Score:4, Informative)
Re:Synchronisation? (Score:5, Informative)
This is not really any different from the way a clocked core synchronises with peripherals. These days, devices like the PXA255 used in PDAs run independent clocks for the peripherals and the CPU. This allows for things like speed stepping to save power.
Re:Soooo... (Score:2, Informative)
According to http://www.arm.com/products/CPUs/ARM996HS.html [arm.com], 50-70 MHz. Plenty fast for embedded applications.
Re:Soooo... (Score:1, Informative)
Re:That's odd (Score:5, Informative)
Re:Horrible summary (Score:5, Informative)
http://en.wikipedia.org/wiki/CPU_design#Clockless
Yes, they are based on asynchronous digital logic, but calling them clockless is ok. They do NOT have a clock signal.
One of the top problems in CPU design is distributing the clock signal to every gate. It is very wasteful. Clockless CPUs are a revolution waiting to happen, and it will happen. The idea is just better in every respect. It will take effort to re-engineer design tools and retrain designers, but clockless designs are far superior (now that we really know how to make them, which is a recent development).
Re:Horrible summary (Score:4, Informative)
Re:The next palm pilot? (Score:3, Informative)
Computer Science has lost its history somewhere... (Score:5, Informative)
Gads. Now that I'm "overqualified" to write software (i.e., employers don't seem to think experience is worth paying any extra for), the geek world has completely forgotten that it even has a history.
Re:That's odd (Score:3, Informative)
No kidding. When I took a digital systems lab class, we had to do one simple asynchronous circuit. The corresponding state machine only had four states (compared to a computer processor, which might have a hundred states or more), but it was probably the most difficult circuit to design. Basically, you have to make sure that as you're transitioning between states, you always end up in the correct one, no matter where you may be in between.
Re:How fast is it? (Score:5, Informative)
So this core wouldn't be designed for speed.
Also, for many embedded platforms the CPU speed is less important than power consumption and bus contention.
Tom
Why is async good (Score:5, Informative)
1. It gives good power consumption characteristics, i.e. low power consumed, not just because of the built-in power-down mode, but also because of the voltage the chips can run at. By pulling the voltage lower than a synchronous equivalent, it is simpler to achieve greater power savings. This becomes possible if you are willing to sacrifice speed, and in async devices the switching speed adapts dynamically, as each block waits until the previous one is done, not until some outside clock has ticked.
2. Security: async designs give protection against side-channel power-analysis attacks. Because all gates must switch (standard async design usually uses dual-rail encoding, so "most gates" means all gates along both the +ve and -ve rails switch), differential power attacks become much harder. Thus async designs are well suited to crypto chips (hardware AES, anyone?).
3. Elegance of solution: the world is generally async. Key presses are, memory accesses are, so why not the processor?
But they have several points of disadvantage:
1. They are hard to do, especially using the synchronous design flow that most of the world uses. Synchronous tools assume, especially in RTL, that the world is combinational and that sequential bits are simply registers that update once per clock cycle (not true for full-custom designs like Intel's and AMD's, but it holds at slightly lower levels, especially ASIC design).
2. The tools that exist now can either produce good implementations using only a few gates (i.e. small functions), or bad implementations of larger functions that in the worst case are as slow as their synchronous equivalents. Tools like Petrify (http://www.lsi.upc.edu/~jordicf/petrify/ [upc.edu]) exist, but these become unusable for circuits with more than ~50 gates.
3. Async designs are usually large. This is not always true, but standard async designs are usually implemented as dual-rail or using 1-of-M encoding on the wires. The main overhead, though, comes from the handshaking circuitry. For really fine-grained pipelining, the output of each stage must be acknowledged to the previous stage. This adds a massive overhead, as it necessitates the use of a device called the Muller C-element, which sets its output to the inputs' common value only when the inputs are the same, and retains the previous value if not (see the sketch below). Many copies of this element are usually required, and it's this that adds area: for example, a simple 1-bit OR gate, which would usually have 4 transistors, takes 16 transistors in a dual-rail async implementation.
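Since the C-element keeps coming up, here's a minimal behavioural model of one in Python (my own illustrative sketch, not anything from ARM's design): the output follows the inputs only when they agree and holds its last value when they disagree.

```python
# Minimal behavioural model of a Muller C-element (illustrative sketch only).
# The output takes the inputs' common value when they agree and holds its
# previous value when they disagree -- exactly the behaviour described above.

class CElement:
    def __init__(self, initial=0):
        self.out = initial

    def update(self, a, b):
        if a == b:              # inputs agree: output follows them
            self.out = a
        return self.out         # inputs disagree: hold the previous value


if __name__ == "__main__":
    c = CElement()
    for a, b in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]:
        print(a, b, "->", c.update(a, b))
    # output sequence: 0, 0, 1, 1, 0 -- it only changes when both inputs agree
```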
For the time being, I think they will find a lot of use in low-power applications - such as embedded microcontrollers/processors in things like wireless sensor networks, and security processors. However, I believe that full processor designs are still very far off.
Re:That's odd (Score:5, Informative)
Hence, large-scale async work is often based on every data transfer between modules being sent along with a PULSE or READY signal. Of course, every module has to be designed so that its output is ready when it propagates the pulse... otherwise there's bogus output going into the next module. Basically, one module whose propagation delay is timed incorrectly can kill the whole system. BUT, with fast logic, your system will simply run as fast as the hardware can handle...
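To illustrate the point about the pulse having to lag the data (a toy sketch with made-up delay numbers, not any real design): if the READY/PULSE path isn't matched to the worst-case data-path delay, the next stage latches stale data.

```python
# Toy sketch of bundled-data signalling with a matched delay (made-up numbers).
# The sender routes its READY pulse through a delay that must exceed the data
# path's worst-case propagation time; if it doesn't, the receiver latches the
# previous (stale) value instead of the new result.

def latch_on_ready(data_delay_ns, ready_delay_ns, new_value, old_value):
    """Return what the receiving latch captures when the READY pulse arrives."""
    data_settles_at = data_delay_ns     # when the computed result is actually valid
    ready_arrives_at = ready_delay_ns   # when the READY pulse reaches the latch
    return new_value if ready_arrives_at >= data_settles_at else old_value


if __name__ == "__main__":
    # Correctly matched delay: READY arrives after the data has settled.
    print(latch_on_ready(data_delay_ns=7, ready_delay_ns=9, new_value=42, old_value=0))  # 42
    # Mis-timed module: READY outruns the data -- bogus output into the next stage.
    print(latch_on_ready(data_delay_ns=7, ready_delay_ns=5, new_value=42, old_value=0))  # 0
```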
Commercial async processors have been around for AGES [multicians.org] -- but modern logic IC-based async processors are rarely built and sold on a large scale, being mostly experimental designs.
Re:Horrible summary (Score:5, Informative)
Uhhh, this isn't ARM's first clockless offering... (Score:3, Informative)
Not really (Score:4, Informative)
Re:timing (Score:3, Informative)
On the PIC series of microcontrollers, you can time any code simply by adding up the clock cycles taken by each instruction and figuring in your clock rate. There's even a nice tool to do this for you. This is often handy for simple delays; sometimes you're using all the timers or you don't want to stick stuff into a bunch of configuration registers just to slow down a loop. I don't see this sort of timing being as easy when there's no such thing as a clock cycle.
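For comparison, here's the kind of back-of-the-envelope arithmetic that comment describes, as a little Python sketch. It assumes a classic mid-range PIC (one instruction cycle = four oscillator clocks, DECFSZ = 1 cycle or 2 when it skips, GOTO = 2 cycles); the loop itself is a hypothetical example, shown purely for illustration.

```python
# Back-of-the-envelope delay calculation in the cycle-counting style described
# above.  Assumes a classic mid-range PIC: one instruction cycle = 4 oscillator
# clocks, DECFSZ = 1 cycle (2 when it skips), GOTO = 2 cycles.

def delay_loop_seconds(loop_count, f_osc_hz=4_000_000):
    instr_cycle = 4 / f_osc_hz                 # seconds per instruction cycle
    # Each pass: DECFSZ (1) + GOTO (2) = 3 cycles, except the last pass where
    # DECFSZ skips (2 cycles) and the GOTO is not executed.
    cycles = 3 * (loop_count - 1) + 2
    return cycles * instr_cycle


if __name__ == "__main__":
    print(f"{delay_loop_seconds(100) * 1e6:.1f} us")   # ~299 us with a 4 MHz oscillator
```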
Re:All I want to know... (Score:2, Informative)
Re:That's odd (Score:5, Informative)
When I read the article, what popped into mind was low consumption while doing nothing, which is what made me think of it. So now I've shown my age and made quite the ass of myself, but what else is new?
So, not the same thing. Sorry for the ruckus.
Synchronous vs. Asynchoronous (Score:5, Informative)
Clocks help by allowing the designer to effectively freeze the state of the logical circuit on a regular basis. This way, all the signals in a chip can propagate to where they are supposed to go, then the logical operations occur. This process repeats on every clock pulse.
The problems with using clocks are pretty significant, however. First, you need to add a lot of additional circuitry to implement a clock. Another problem is that, generally, A LOT of changes happen on every clock tick, which means a large spike in electrical current (because you need electrical current to actually change the state of all of the digital circuits). This spike also causes what is known as noise in electronics, and with higher-frequency circuits the noise can actually cause interference with other, unconnected electronics (this is known as EMI). A further problem with a clock is that you generally need to keep it running all of the time for it to be useful, which means using electrical power even when no changes are occurring.
So, the asynchronous CPU is a significant engineering feat. It is very difficult to design, but it is probably much smaller and more efficient than any equivalent clocked ARM core. That said, I wonder how you actually evaluate the performance? With synchronous CPUs, it is simply a function of the clock speed and architecture. In addition, all of these devices need to be tested so that they are guaranteed to work - I wonder how they do that.
Re:That's odd (Score:3, Informative)
PDP-6 pre-dates all so far (Score:1, Informative)
Digital had it then...
Origins of this technology (Score:3, Informative)
Re:It would, of course, need static memory (Score:2, Informative)
Re:Livin' large and in charge (Score:3, Informative)
> their investors eventually demanded that they make a profit. Was that about the time
> when you left?
Your view in this matter is utterly unlike the reality of events.
ARM was exceedingly hard-working and, to begin with, something like half the staff had PhDs. What (IMHO) happened was that, with rapid growth, the quality of lower and middle management in particular was diluted, and politics, the rot of all companies, set in; more and more decisions were made because of *politics* rather than good sense.
When that happens, the inevitable happens.
Re:Horrible summary (Score:2, Informative)
That being said, asynchronous design does present problems in terms of performance evaluation and circuit-design complexity. When you have any form of processing pipeline with loop-backs (and almost any complex circuit has them), you have to be able to evaluate how fast data will move through that pipeline, since it will take a variable amount of time for certain combinations of data input bits to produce results at the output lines. Optimizing then becomes much more difficult than running a static timing analysis tool, finding the critical path, and optimizing it.
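A concrete (if simplified) illustration of that data-dependent timing, using a ripple-carry add: the latency of an async adder with completion detection tracks the longest carry-propagation run for the actual operands, which on average is far shorter than the worst case a clocked design must always allow for. This is my own sketch, not anything from the article.

```python
# Illustrative sketch: data-dependent completion time of a ripple-carry add.
# The longest run of "propagate" bit positions sets how far a carry must ripple;
# an asynchronous adder with completion detection finishes as soon as that run
# has settled, while a clocked adder must always budget for the full width.

import random

def longest_propagate_run(a, b, width=32):
    longest = run = 0
    for i in range(width):
        if ((a >> i) & 1) ^ ((b >> i) & 1):   # propagate position: carry ripples through
            run += 1
            longest = max(longest, run)
        else:                                  # generate or kill: the carry chain stops here
            run = 0
    return longest


if __name__ == "__main__":
    random.seed(0)
    samples = [longest_propagate_run(random.getrandbits(32), random.getrandbits(32))
               for _ in range(10_000)]
    print("average longest carry chain:", sum(samples) / len(samples))  # only a handful of bits
    print("worst case a clocked design must assume:", 32)
```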
System Clock (Score:3, Informative)
Essentially (as an example), when a processor wants to copy something from a register to memory, it puts a signal on a control bus to tell the memory controller to address a specific location in memory. The memory controller reads this and starts the action, knowing that it takes a fixed number of clock cycles to complete (think of memory timings). After that time has passed, the processor routes the data in question onto the bus. So the signal is being produced, the memory controller has the memory addressed and charged properly, and the processor holds the data there long enough for the write to complete. That data needs to be there for as long as the memory is attached and set to write.
Now imagine this type of situation (which applies to all devices, and within the processor's internal actions) if the timing were slightly off between the devices. Not very effective, is it? The memory controller may still be reading while the processor stops writing, leading to corrupt data. Essentially, the clock syncs up the talking, listening, and computing.
This is true within each device as well, such as making sure that all elements of a function are performed at the same time, so you don't end up with half of your answer arriving after you actually need it.
Async means that there's no clock; rather, the timing is established before each communication, using a few control signals, and is regularly adjusted accordingly.
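As a toy illustration of control-signal-based timing (my own sketch, with made-up names, not ARM's actual bus protocol), here's a four-phase request/acknowledge handshake for a memory write; neither side assumes anything about the other's speed, which is what takes the place of the shared clock.

```python
# Toy four-phase request/acknowledge handshake for a memory write (names and
# structure are illustrative only).  The writer raises REQ alongside the data,
# the memory captures the data and raises ACK, then both lines return to zero.
# No shared clock: each side reacts only to the other's control wire.

class HandshakeMemory:
    def __init__(self):
        self.cells = {}
        self.ack = 0

    def observe(self, req, addr, data):
        if req and not self.ack:        # REQ seen: capture the data, raise ACK
            self.cells[addr] = data
            self.ack = 1
        elif not req and self.ack:      # REQ withdrawn: drop ACK, transfer complete
            self.ack = 0
        return self.ack


def write(mem, addr, data):
    mem.observe(req=1, addr=addr, data=data)   # phases 1-2: REQ up, wait for ACK up
    assert mem.ack == 1
    mem.observe(req=0, addr=addr, data=data)   # phases 3-4: REQ down, wait for ACK down
    assert mem.ack == 0


if __name__ == "__main__":
    m = HandshakeMemory()
    write(m, addr=0x20, data=0xAB)
    print(hex(m.cells[0x20]))   # 0xab
```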
-M
Re:Horrible summary (Score:3, Informative)
P(ave) = C(eff) x V(dd)^2 x f
Which of course means my original comment was correct. Rather than citing confidential material and skirting the issues, how about backing up your assertions with real facts?
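Plugging some made-up numbers into that formula makes the point about where the savings come from: halving f halves the dynamic power, while halving Vdd cuts it to a quarter.

```python
# P(ave) = C(eff) x V(dd)^2 x f, evaluated with purely illustrative numbers.

def dynamic_power(c_eff_farads, vdd_volts, f_hz):
    return c_eff_farads * vdd_volts ** 2 * f_hz


if __name__ == "__main__":
    base = dynamic_power(1e-9, 1.8, 50e6)   # 1 nF effective capacitance, 1.8 V, 50 MHz
    slow = dynamic_power(1e-9, 1.8, 25e6)   # halve the switching frequency
    lowv = dynamic_power(1e-9, 0.9, 50e6)   # halve the supply voltage instead
    print(f"{base*1e3:.1f} mW, {slow*1e3:.1f} mW, {lowv*1e3:.2f} mW")
    # 162.0 mW, 81.0 mW, 40.50 mW: frequency scales power linearly, voltage quadratically
```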
Re:Synchronisation? (Score:3, Informative)