ARM Offers First Clockless Processor Core

Sam Haine '95 writes "EETimes is reporting that ARM Holdings have developed an asynchronous processor based on the ARM9 core. The ARM996HS is thought to be the world's first commercial clockless processor. ARM announced they were developing the processor back in October 2004, along with an unnamed lead customer, which it appears could be Philips. The processor is especially suitable for automotive, medical and deeply embedded control applications. Although reduced power consumption, due to the lack of clock circuitry, is one benefit, the clockless design also produces a low electromagnetic signature because of the diffuse nature of digital transitions within the chip. Because clockless processors consume zero dynamic power when there is no activity, they can significantly extend battery life compared with clocked equivalents."
  • Horrible summary (Score:5, Informative)

    by Raul654 ( 453029 ) on Saturday April 08, 2006 @08:44PM (#15093052) Homepage
    I read the summary and cringed. (1) Don't call them clockless -- they're called asynchronous, because (unlike in a synchronous processor, one with a clock) all the parts of the processor aren't constantly starting and stopping at the same time. A typical synchronous processor can only run at a maximum frequency inversely proportional to the delay of its critical path - so if it takes up to 5 nanoseconds for information to propagate from one part of the chip to the other, the clock cannot tick any faster than once every 5 nanoseconds. (2) One very serious problem in modern processors is clock skew [wikipedia.org] - if you have one central clock, the parts closest to the clock get the 'tick' signal sooner than the parts farther away, so the processor doesn't run perfectly synchronously.
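
    A quick worked instance of that frequency bound (illustrative numbers only, restating the 5 ns figure above):

        \[
        f_{\max} \le \frac{1}{t_{\text{critical}}}, \qquad
        t_{\text{critical}} = 5\,\text{ns} \;\Rightarrow\;
        f_{\max} \le \frac{1}{5 \times 10^{-9}\,\text{s}} = 200\,\text{MHz}
        \]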
  • Re:That's odd (Score:3, Informative)

    by ScrewMaster ( 602015 ) on Saturday April 08, 2006 @08:45PM (#15093054)
    Nope. All the pre-Mac Apple machines were based upon the MCS6502 and its derivatives. All were clocked, the original Apple ][ standard at 1 MHz. The Apple ///'s selling point was that it had a hardware real-time clock, which was removed in later revisions because of quality issues.
  • of course (Score:3, Informative)

    by EmbeddedJanitor ( 597831 ) on Saturday April 08, 2006 @08:48PM (#15093076)
    There are probably more Linux boxes running on ARM than on x86.

    ARM, followed by PowerPC, are the most common cores for embedded Linux, and embedded Linux boxes far outnumber servers and desktops (where x86 rules).

  • Re:timing (Score:4, Informative)

    by ScrewMaster ( 602015 ) on Saturday April 08, 2006 @08:49PM (#15093080)
    The fact that the CPU itself has no master clock is absolutely irrelevant to timing applications. You can bet your bottom dollar that the processor will handle interrupts, and that there will be a timer/counter component on the chip. Timing won't be a problem.
  • Re:Synchronisation? (Score:5, Informative)

    by EmbeddedJanitor ( 597831 ) on Saturday April 08, 2006 @08:52PM (#15093092)
    The peripherals (serial ports, sound, LCD,...) are still clocked. The core is synchronised with peripherals by peripheral bus interlocks.

    This is not really any different from the way a clocked core synchronises with peripherals. These days devices like the PXA255 etc. used in PDAs run independent clocks for the peripherals and the CPU. This allows for things like speed stepping to save power, etc.

  • Re:Soooo... (Score:2, Informative)

    by Anonymous Coward on Saturday April 08, 2006 @08:53PM (#15093095)
    Google is your friend.

    According to http://www.arm.com/products/CPUs/ARM996HS.html [arm.com], 50-70 MHz. Plenty fast for embedded applications.

  • Re:Soooo... (Score:1, Informative)

    by DarKry ( 847943 ) <darkry@noSPAm.darkry.net> on Saturday April 08, 2006 @08:56PM (#15093102) Homepage Journal
    did no one else get this joke :( What a sad world we live in. double plus funny
  • Re:That's odd (Score:5, Informative)

    by temojen ( 678985 ) on Saturday April 08, 2006 @09:00PM (#15093110) Journal
    Many current CPUs don't have built-in clocks, but still need them. This architecture is very different. It doesn't need a clock at all. All the timing is based on the propagation delay through the gates. This is extremely difficult to do right.
  • Re:Horrible summary (Score:5, Informative)

    by Peter_Pork ( 627313 ) on Saturday April 08, 2006 @09:06PM (#15093135)
    Too late, they ARE called clockless CPUs:
        http://en.wikipedia.org/wiki/CPU_design#Clockless_CPUs [wikipedia.org]
    Yes, they are based on asynchronous digital logic, but calling them clockless is OK. They do NOT have a clock signal.

    One of the top problems in CPU design is distributing the clock signal to every gate. It is very wasteful. Clockless CPUs are a revolution waiting to happen. And it will. The idea is just better in every respect. It will take effort to reengineer design tools and retrain designers, but clockless designs are far superior (now that we really know how to make them, which is a recent development).
  • Re:Horrible summary (Score:4, Informative)

    by Raul654 ( 453029 ) on Saturday April 08, 2006 @09:12PM (#15093156) Homepage
    If memory serves, the big problem with these chips was the possibility of 'state explosion' - as the chip got bigger, the number of possible states increased exponentially. Has this been resolved?
  • by gabebear ( 251933 ) on Saturday April 08, 2006 @09:21PM (#15093176) Homepage Journal
    Responsiveness of a CPU is never really a problem; humans generally perceive anything that happens in less than 1/10th of a second as happening instantaneously. The only real problem with using this chip in a PDA is that it isn't very fast: the article says the chip is comparable to a 77 MHz ARM9, which is several times slower than anything you'd find in a PDA today. I would love to see a Palm OS PDA based around this chip because of its EXTREMELY low power consumption; we could be looking at the same kind of battery life as the original Palms.
    This is certainly not the first commercial processor without a clock. The PDP-8 operated using a series of delay lines arranged in a loop so that the end of an instruction triggered the next one. One of the EE courses I took (back when EE majors still had to use real test equipment and soldering irons) involved a design of a clocked version of a PDP-8 as a class project.

    Gads. Now that I'm "overqualified" to write software (i.e., employers don't seem to think experience is worth paying any extra for), the geek world has completely forgotten that it even has a history.

  • Re:That's odd (Score:3, Informative)

    by Manchot ( 847225 ) on Saturday April 08, 2006 @09:49PM (#15093247)
    This is extremely difficult to do right.

    No kidding. When I took a digital systems lab class, we had to do one simple asynchronous circuit. The corresponding state machine only had four states (compared to a computer processor, which might have a hundred states or more), but it was probably the most difficult circuit to design. Basically, you have to make sure that as you're transitioning between states, you always end up in the correct one, no matter where you may be in between.
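
    One standard way to keep such a machine from straying into a wrong intermediate state is to choose a state encoding in which every transition changes exactly one state bit, so the state variables can never race each other. A tiny illustrative check (Python; the states and transitions are made up for the example, not taken from the lab class above):

        def is_unit_distance(assignment, transitions):
            """True if every transition changes exactly one state bit, so two
            state variables never race each other during a transition."""
            def hamming(a, b):
                return bin(assignment[a] ^ assignment[b]).count("1")
            return all(hamming(src, dst) == 1 for src, dst in transitions)

        # Hypothetical 4-state fundamental-mode machine, Gray-coded:
        gray = {"A": 0b00, "B": 0b01, "C": 0b11, "D": 0b10}
        hops = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
        print(is_unit_distance(gray, hops))   # True: each hop flips a single bit

        # A naive binary assignment fails: the B->C hop flips two bits at once.
        naive = {"A": 0b00, "B": 0b01, "C": 0b10, "D": 0b11}
        print(is_unit_distance(naive, hops))  # False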
  • Re:How fast is it? (Score:5, Informative)

    by tomstdenis ( 446163 ) <tomstdenis AT gmail DOT com> on Saturday April 08, 2006 @09:53PM (#15093254) Homepage
    ARM doesn't typically target super-fast designs. They go for low power first, and then reach for efficiency.

    So this core wouldn't be designed for speed.

    Also, for many embedded platforms the CPU speed is less important than power consumption and bus contention.

    Tom
  • Why is async good (Score:5, Informative)

    by quo_vadis ( 889902 ) on Saturday April 08, 2006 @10:05PM (#15093281) Journal
    I know typing this out will be useless, and it will get overlooked by the mods, but I might as well say this. Asynchronous designs have several advantages :

    1. It will give good power consumption characteristics, i.e. low power consumed, not just because of the built-in power-down mode, but also because of the voltage the chips will be running at. By pulling the voltage lower than a synchronous equivalent, it will be simpler to get greater power savings. This becomes possible if you are willing to sacrifice speed, and in async devices the speed of switching can be dynamically altered, as each block will wait till the previous one is done, not until some outside clock has ticked.

    2. Security: async designs give security against side-channel power analysis attacks. As all gates must switch (standard async design usually uses dual-rail encoding, so "most gates switch" really means gates along both the +ve and -ve rails switch), differential power attacks become much harder. Thus async designs are perfect for crypto chips (hardware AES, anyone?)

    3. Elegance of solution: the world is generally async. Key presses are, memory accesses are, so why not the processor :). (Yes, I know busses are clocked, before you start, but if they were not....)

    But they have several points of disadvantage:

    1. They are hard to do, especially using the synchronous design flow that most of the world uses. Synchronous tools assume, especially in RTL, that the world is combinational and that sequential bits are simply registers that update once per clock cycle (not true for full-custom designs like Intel's and AMD's, but it holds at the slightly lower level, especially ASIC design).

    2. The tools that exist now can either do good implementations of only a few gates, i.e. small functions, or bad implementations of larger functions that are in the worst case as slow as their synchronous equivalents. Tools exist, like Petrify http://www.lsi.upc.edu/~jordicf/petrify/ [upc.edu], but these become unusable for circuits with more than ~50 gates.

    3. Async designs are usually large. This is not always true, but standard async designs are usually implemented as dual-rail or using 1-of-M encoding on the wires. But the main overhead comes from the handshaking circuitry. For really fine-grained pipelining, the output of each stage must be acknowledged to the previous stage. This adds a massive overhead, as it necessitates the use of a device called the Muller C-element, which drives its output to the inputs' value only if the inputs are the same, and retains the previous value if not (see the small behavioral sketch at the end of this comment). Many copies of this element are usually required, and it's this that adds area; for example, a simple 1-bit OR gate, which would usually have 4 transistors, has 16 transistors in the dual-rail async implementation.

    For the time being, I think they will find a lot of use in low-power applications - such as embedded microcontrollers/processors in things like wireless sensor networks, and security processors. However, I believe that a full processor design is very far off.
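
    A minimal behavioral sketch of the Muller C-element described in point 3 above (Python, purely illustrative; real C-elements are transistor-level circuits, and the class and names here are made up for the example):

        class CElement:
            """Muller C-element: the output follows the inputs when they agree,
            and holds its previous value when they disagree."""
            def __init__(self, initial=0):
                self.out = initial

            def update(self, a, b):
                if a == b:           # both inputs agree -> output follows them
                    self.out = a
                return self.out      # otherwise hold the previous value

        c = CElement()
        print(c.update(0, 0))  # 0
        print(c.update(1, 0))  # 0 (inputs disagree, hold)
        print(c.update(1, 1))  # 1 (both high, output rises)
        print(c.update(0, 1))  # 1 (hold until both fall)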
  • Re:That's odd (Score:5, Informative)

    by orangesquid ( 79734 ) <`orangesquid' `at' `yahoo.com'> on Saturday April 08, 2006 @10:08PM (#15093288) Homepage Journal
    Async work is very annoying when the whole system is one state machine.

    Hence, large-scale async work is often based on every data transfer between modules being sent along with a PULSE or READY signal. Of course, every module has to be designed so that its output is ready when it propagates the pulse... otherwise there's bogus output into the next module. Basically, one module having the propagation delay timed incorrectly can kill the whole system. BUT, with fast logic, your system will simply run as fast as the hardware can handle...

    Commercial async processors have been around for AGES [multicians.org] -- but modern logic-IC-based processors are rarely built and sold on a large scale, being mostly experimental designs.
  • Re:Horrible summary (Score:5, Informative)

    by Raul654 ( 453029 ) on Saturday April 08, 2006 @10:27PM (#15093349) Homepage
    It's not just direct - power consumption is proportional to the *cube* of the frequency (according to the research paper I just peer-reviewed). But there are all kinds of ways to vastly reduce that, using voltage scaling, frequency scaling, and power-down-during-idle techniques.
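
    One way to reconcile that cube figure with the standard dynamic power equation quoted further down in the thread: if the supply voltage is scaled roughly in proportion to the clock frequency, then

        \[
        P_{\text{dyn}} = C_{\text{eff}}\, V_{dd}^{2}\, f, \qquad
        V_{dd} \propto f \;\Rightarrow\; P_{\text{dyn}} \propto f^{3}
        \]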
  • by Praetor11 ( 512322 ) on Saturday April 08, 2006 @11:27PM (#15093503)
    ARM made a clockless chip in 1994 for cellphones. Couldn't find an amazing reference, but a quick google turned up http://www1.cs.columbia.edu/async/misc/technologyreview_oct_01_2001.html [columbia.edu] where they briefly mention it... The last time I heard of this stuff being used was in 2001-- I actually wrote an English paper about it purely to see if I could bore my professor :-p
  • Not really (Score:4, Informative)

    by anirudhvr ( 923609 ) on Saturday April 08, 2006 @11:35PM (#15093534)
    It's a regressive step if you look at the speed at which it can push things across, but these days the power consumed is just as important an issue. Active research is going on in the area of Globally Asynchronous, Locally Synchronous (GALS, it's called ;) processors, where each module (like, say, the caches, execution units, reservation stations etc.) runs its own clock (and hence is synchronous within the module), and the modules communicate with each other using asynchronous protocols (known as delay-insensitive protocols). Such a design greatly reduces the need for clock wiring, which in turn reduces area and saves power (at the cost of some processing speed, of course). Google for Globally Asynchronous... if interested.
  • Re:timing (Score:3, Informative)

    by sketerpot ( 454020 ) <sketerpot&gmail,com> on Sunday April 09, 2006 @12:39AM (#15093689)
    Since the PICs have interrupts and several timers, I doubt he was talking about that.

    On the PIC series of microcontrollers, you can time any code simply by adding up the clock cycles taken by each instruction and figuring in your clock rate. There's even a nice tool to do this for you. This is often handy for simple delays; sometimes you're using all the timers or you don't want to stick stuff into a bunch of configuration registers just to slow down a loop. I don't see this sort of timing being as easy when there's no such thing as a clock cycle.
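
    As a rough sketch of that cycle-counting arithmetic (Python; it assumes a classic mid-range PIC where one instruction cycle is four oscillator periods, and the per-iteration cycle count is a made-up figure):

        def loop_delay_seconds(iterations, cycles_per_iteration, fosc_hz):
            """Estimate the wall-clock time of a busy-wait loop on a classic PIC,
            where one instruction cycle = 4 oscillator clocks."""
            t_instruction_cycle = 4.0 / fosc_hz     # seconds per instruction cycle
            return iterations * cycles_per_iteration * t_instruction_cycle

        # e.g. 250 iterations of a 4-cycle loop body with a 4 MHz oscillator:
        print(loop_delay_seconds(250, 4, 4_000_000))   # 0.001 -> a 1 ms delay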

  • by hereticmessiah ( 416132 ) on Sunday April 09, 2006 @12:51AM (#15093722) Homepage
    You obviously haven't read up on the ARM, have you? You should, if only to learn what a truly elegant instruction set looks like. The ARM3 was a thing of pure beauty...
  • Re:That's odd (Score:5, Informative)

    by tinkertim ( 918832 ) * on Sunday April 09, 2006 @01:24AM (#15093776)
    AHA! Found it. It was the 65CE02, which had an on-chip clock that you could send a trigger to stop, causing the processor to go into a suspended but wake-able state (and from 5 V to 1.5 V consumption). When the clock resumed via an external trigger, so did the processor, without having to go through its full start-up cycle. They never did much with it, oddly.

    When I read the article what popped into mind was low consumption while doing nothing, which is what made me think of it. So now I've shown my age and made quite the ass of myself, but what else is /. for?

    So, not the same thing. Sorry for the ruckus :) Hey, I'm amazed I even remembered it :P

  • by chriso11 ( 254041 ) on Sunday April 09, 2006 @02:31AM (#15093941) Journal
    Most digital logic has at least one repeating signal called a clock, which is used to sequence the logical changes (e.g. from 1 to 0) in the circuit. By limiting changes of state to a periodic time, you can simplify a digital design. One of the major challenges in digital design (besides errors in logic) is dealing with timing-related issues such as race conditions. Race conditions occur when a logical operation uses the results of earlier operations. Because of the finite speed of signals inside a chip, sometimes a signal arrives too late for a proper operation to occur. Such an error is considered a race condition.

    Clocks help by allowing the designer to effectively freeze the state of the logical circuit on a regular basis. This way, all the signals in a chip can propagate to where they are supposed to go, then the logical operations occur. This process repeats on every clock pulse.

    The problems with using clocks are pretty significant, however. First, you need to add a lot of additional circuitry to implement a clock. Another problem is that generally, A LOT of changes happen on every clock tick, which means a large spike in electrical current (because you need to use the electrical current to actually change the state of all of the digital circuits). This spike also causes what is known as noise in electronics, and with higher-frequency circuits, the noise can actually cause interference with other unconnected electronics (this is known as EMI). And another problem with a clock is that you generally need to keep it running all of the time for it to be useful, which means using electrical power even when no changes are occurring.

    So, the asynchronous CPU is a significant engineering feat. It is very difficult to design, but it is probably much smaller and more efficient than any equivalent clocked ARM core. That said, I wonder how you actually evaluate the performance. With synchronous CPUs, it is simply a function of the clock speed and architecture. In addition, all of these devices need to be tested so that they are guaranteed to work - I wonder how they do that.
  • Re:That's odd (Score:3, Informative)

    by butlerm ( 3112 ) on Sunday April 09, 2006 @04:31AM (#15094143)
    The original 6502 had dynamic register storage (using capacitors, similar to dynamic RAM) that would lose information if the clock was held off for very long. So the CPU clock had to be run at a certain minimum frequency (a few hundred kilohertz IIRC) and though it could be stopped briefly, more than a few microseconds would cause the CPU to crash hard.
  • by Anonymous Coward on Sunday April 09, 2006 @05:31AM (#15094244)
    The 166 CPU of the PDP-6 was asynchronous back in 1964. This predates the Venus, the 8600, and even the PDP-8 mentioned by others.

    Digital had it then...
  • by Shadowlawn ( 903248 ) on Sunday April 09, 2006 @06:57AM (#15094367)
    ARM is actually building this chip with Handshake Solutions [handshakesolutions.com], a Philips incubator. The work stems from Philips Research as early as 1986 (yes, that's 20 years from research to product), and has matured a great deal over the years. We used to have courses at our university explaining the basics behind these asynchronous designs. All in all I'm excited to see this technology finally in a product, and hope it will make my PDA last yet a little bit longer.
  • by xerxesdaphat ( 767728 ) <xerxesdaphat&gmail,com> on Sunday April 09, 2006 @07:33AM (#15094422)
    WTF? Why not just have an external chip pumping syncs in? Not everything has to be done from the CPU you know...
  • by Toby The Economist ( 811138 ) on Sunday April 09, 2006 @07:35AM (#15094424)
    > So basically, when it was a startup, you enjoyed your nerf tournaments, but then
    > their investors eventually demanded that they make a profit. Was that about the time
    > when you left?

    Your view in this matter is utterly unlike the reality of events.

    ARM was an exceedingly hard-working place, and to begin with something like half the staff had PhDs. What (IMHO) happened was that with rapid growth the quality of lower and middle management in particular was diluted, and politics, the rot of all companies, set in; more and more decisions were made because of *politics* rather than good sense.

    When that happens, the inevitable happens.

  • Re:Horrible summary (Score:2, Informative)

    by imgod2u ( 812837 ) on Sunday April 09, 2006 @10:40AM (#15094753) Homepage
    Yes and no. Timing problems are really something that occur in a synchronous design, since you have a limited window of time (between clock edges) in which your combinatorial circuit must stabilize. This is not the case in asynchronous design. You have all the time in the world to stabilize your stages of logic. When a stage has stabilized, and only when it has stabilized, does it send a signal to the next stage that it is done and that the incoming data should be latched. So essentially, your design will time itself (as the method implies) and you won't have timing violations because a certain stage took too long to stabilize.

    That being said, asynchronous design does present problems in terms of performance evaluation and circuit design complexity. When you have any form of processing pipeline with loop-backs (and almost any complex circuit does), you have to be able to evaluate how fast data will move through that pipeline, since it will take a variable amount of time for certain combinations of data input bits to produce results at the output lines. Optimizing then becomes much more difficult than simply running a static timing analysis tool, finding the critical path, and optimizing it.
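
    A toy software model of that "stabilize, then signal done, then latch" discipline (Python; the structure and names are illustrative only, not how self-timed hardware is actually implemented):

        class SelfTimedStage:
            """Toy pipeline stage: it accepts new input only when the previous
            stage reports completion, and raises its own 'done' flag only after
            its result is ready for the next stage to latch."""
            def __init__(self, func):
                self.func = func
                self.output = None
                self.done = False

            def receive(self, data, prev_done):
                if not prev_done:              # upstream not finished: don't latch
                    return False
                self.done = False              # new computation in flight
                self.output = self.func(data)  # result "stabilizes" here
                self.done = True               # only now may the next stage latch
                return True

        # Two chained stages; data advances only as completion signals allow.
        s1 = SelfTimedStage(lambda x: x + 1)
        s2 = SelfTimedStage(lambda x: x * 2)
        s1.receive(3, prev_done=True)
        s2.receive(s1.output, prev_done=s1.done)
        print(s2.output)   # (3 + 1) * 2 = 8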
  • System Clock (Score:3, Informative)

    by PhYrE2k2 ( 806396 ) on Sunday April 09, 2006 @11:23AM (#15094893)
    A clock is a timer, measured in Hz (oscillations per second). Generally the actions within each device, such as your processor or video card, operate on their own clock (this is the GHz number), while devices communicate with each other over the bus at the speed of the bus (more distance, mismatched components, and the possibility of interference cause slower speeds, closer to 800 MHz-1 GHz these days).

    Essentially (as an example), when a processor wants to copy something from a register to memory, it puts a signal on a control bus to tell the memory controller to charge a specific address of memory. The memory controller is reading this, and starts the action, knowing that it takes a fixed number of clock cycles to do this (think the timings of memory). After that time has passed, the processor routes the data in question to the bus. So the signal is being produced, the memory controller has it attached to the memory and charged properly, and the processor keeps it there long enough to write to the memory. That signal needs to be there as long as the memory is attached and set to write.

    Now- imagine this type of situation (which applies to all devices, and within the processor's internal actions) should the timing be slightly off between all devices. Not very effective is it? The memory controller may still be reading while the processor stops writing, leading to corrupt data. Essentially, it syncs up the talking, listening, and computing.

    This is true within each device as well, such as making sure that all elements of a function are performed at the same time and you don't end up with half of your answer after you actually need it.

    Async means that there's no clock; rather, the timing is established before communication, using a few control signals, and adjusted regularly as needed.

    -M
  • Re:Horrible summary (Score:3, Informative)

    by NovaX ( 37364 ) on Sunday April 09, 2006 @01:45PM (#15095291)
    Because your comments seem full of nonsense, I finally looked up the dynamic power equation I was thinking of and that the grandparent had just learned.

    P(ave) = C(eff) x V(dd)^2 x f

    Which of course means my original comment was correct. Rather than citing confidential material and skirting the issues, how about backing up your assertions with real facts?
  • Re:Synchronisation? (Score:3, Informative)

    by maraist ( 68387 ) * <{michael.maraist ... mail.n0spam.com}> on Sunday April 09, 2006 @10:22PM (#15097161) Homepage
    There has to be a "clock" in the system. What asynchronous means is that the abilities to do add, sub, mul, if, jump, next-instruction, etc. are not keyed to the clock. They are instead keyed to command signals, and instead of implicitly completing their request at the end of X clocks, they also have to generate an "I'm complete" signal. The controller continuously monitors them and acts as a flow-control manager for each segment of the CPU. So sound modules, etc. will use a clock only for those things that require it.
