Itanium Update 297

NegaMaxAlphaBeta writes: "For those of you interested in Intel's Itanium 64-bit processor, EETimes has a nice update article to let us know what's happening with this beast. With an 8-stage pipeline, as opposed to the 20-stage pipeline in the P4, clock frequencies are obviously not as high (~1 GHz). Other notable numbers extracted from the article: 130 watts power consumption, 328 registers, 6 MB of on-chip L3 cache ... quite nice (well, not the power thing). I'm sure many people can appreciate 64-bit integer ops; for me, it means a single-instruction XOR for the 64-bit hash codes used in chess transposition tables."
  • What a dog (Score:1, Flamebait)

    by nagora ( 177841 )
    This thing is garbage. The power draw and the insane complexity of writing a decent compiler for its instruction set just make me wonder what Intel were thinking. Not to mention the speed.
    • Re:What a dog (Score:3, Informative)

      I'm sure it is proprietary, but Intel has written its own optimizing compiler for the IA64 instruction set.

      It is an interesting solution to the performance problem: Rather than just increase clock speed again, figure out the performance details at compile time and arrange the code to help the processor run it more efficiently.

      For example, if you have an if statement and the compiler can determine that 95% of the time the TRUE block will be executed, the code can be arranged so the branch prediction will choose the more frequent route and the pipeline penalty won't need to be paid as frequently. (This is just a simple case of optimization; the IA64 will require insanely complex optimizations, but that is just expanding on what compiler writers have been doing for years.)

      It makes the compiler orders of magnitude more complex, but it could potentially increase execution speed by a couple orders of magnitude too.
        • It is an interesting solution to the performance problem: Rather than come up with an efficient design, come up with a bad one that wastes huge amounts of energy (where do you think all that heat is coming from?) and try to make up for it by adding loads more complexity to the instruction decode.

        but that is just expanding on what compiler writers have been doing for years.

        What they've been doing wrong for years.

        Simplicity is the correct answer, Intel clearly didn't understand the question.


        • Simplicity is the correct answer, Intel clearly didn't understand the question.

          That's assuming they were listening in the first place.

          If Intel had big plans for the long run, they'd create a "simple" processor, let's take the original Pentium as a bad example:

          Add MMX. Customers upgrade.
          Change processor form factor. Upgrades galore.
          Add SSE. More upgrades.
          Change processor form factor again. Upgrade.
          Change form factor, add SSE2 and slap on a few marketing terms. Further upgrades.

          The advantage is each time you can say the processor is "new and improved" so people will buy new ones. Does it really matter that a Pentium III 600 is more than enough power for 90% of computer owners? Of course not.

          What makes me laugh, though, is how Intel switched to the Slot 1 form factor so it would be easier for customers to install processors (how often does that really happen?) and then switched back. I'll bet they were planning it all along.
          • the best one yet is cache

            put more cache on the chip (after all, it's the same structure on the die) and get huge gains in performance
            (as long as your MMU and cache lines are done right)
            but it takes up more die, so it's more expensive

            Charge an ARM and a LEG

            "oh come on, you aren't serious?" I hear you cry

            ask an Intel engineer the diff between a Xeon Px and a plain Px

            answer: cache
            (yes, I know that designing a decent cache is hard, but compare it to a real change in arch ;-)

            result: (foolish) customers upgrade

            fun of the fair


            john jones

      • > It is an interesting solution to the performance
        > problem: Rather than just increase clock speed
        > again, figure out the performance details at
        > compile time and arrange the code to help the
        > processor run it more efficiently.

        That is neither interesting nor a solution. People (i.e. compiler writers) have been working on this for forty years with some (limited) success.

        > the IA64 will require insanely complex
        > optimizations, but that is just expanding on
        > what compiler writers have been doing for years.

        Just because the IA64 demands heroic compiler optimization to make up for its shortcomings doesn't mean that the ability to write such compilers will suddenly spring out of nowhere.

        Compiler researchers haven't just been sitting on their butts for the last forty years.

        > For example, if you have an if statement and the
        > compiler can determine that 95% of the time the
        > TRUE block will be executed, the code can be
        > arranged so the branch prediction will choose
        > the more frequent route and the pipeline penalty
        > won't need to be paid as frequently.

        This was a bad example. Dynamic branch predictors (such as you find in any modern fast CPU) do a great job in practice, better than any known static predictors.
    • Am I responding to flamebait?

      Anyway, this thing is not garbage. I've wondered for a long time why chip designers couldn't do what Intel is calling "hyperthreading". It will soon become a reality. I'm excited about it.
      • It's already a reality. What Intel calls hyperthreading is coming in the next generation Alpha, is already shipping in POWER-based AS/400 systems, and is also in some specialized network processors.
    • Intel has a lot of smart people working for it (smarter than either you or I). They have done some dumb things, but overall they've been on the mark a surprising number of times (even the P4 looks pretty impressive now that clock speeds have ramped up). It would be a serious mistake to underestimate them like that. If Intel is putting this product on the market, you can bet that they've fixed the compiler problem. Initial benchmarks of the Itanium seem to show that it can keep right up there with the Alphas and SPARCs in terms of performance (fp, at least).

      As for compilers, don't discount Intel so easily. They make incredible compilers. The features of ICL for x86 make compiler designers cream their pants. Read this article [intel.com] for some info about Itanium's compiler design.
      • I don't care if you have the smartest people in the world, which is likely with Intel; if you mismanage and drive products the way Intel has, it does not matter.

        Sure, you have a ton of smart people, but I just have a lack of faith in the whole architecture. You can have the brightest bunch of people in the world, but if you make them cook burgers does it matter? Not to discredit the rest of your post, but saying "Intel has smart people" is kinda silly. They also have stupid people by the same token; does that make them less likely to succeed?

        • My point was that you can't outright say "why is Intel doing such a dumb thing?" (which the original poster did). Intel is not a stupid company (like MS). They have been very successful, not based only on their monopoly status (AMD has been making significant inroads into their turf) but because of the quality of their products. Within the constraints of the x86 architecture, Intel's chips have been incredible. Even now, the P4 looks really promising in its 2GHz+ variants. You just can't discount such a company so easily.
          • Groups of people often act much more stupidly than their constituent members. Intel has certainly made a few stupid moves over recent years:
            -- IA64
            -- Rambus
            -- The home wireless network standard they pushed that got beaten by 802.11
            • -- IA64 cannot officially be called "dumb" yet.
              -- RDRAM was a mistake, but it wasn't just Intel. Nintendo bought into RDRAM, as did Sony and several graphics card makers. RDRAM fizzling was NOT something that could have been predicted. I kept up with the reporting back when RDRAM was still called nDRAM (as in unknown), and nobody expressed any objections.

      • > If Intel is putting this product on the market,
        > you can bet that they've fixed the compiler
        > problem.

        Your faith is touching. Another possibility is that the Itanium project was way behind schedule and that they had to ship something, anything, because their competitors and the rest of the industry were laughing at them. And so they shipped a CPU with the worst SpecInt number in the industry and even warned their customers that this was really just a development chip and 'real' hardware would have to wait for the next generation.
  • Did I read that right? 328 registers?

    If that's what I think it is... that's an AWESOME improvement over previous x86 incarnations :-) Just imagine the extent of freedom your C++ compiler will have with register allocation ... this will cut down memory accesses by at least an order of magnitude!

    Of course, this all depends on whether these registers are general purpose. They'd better be, 'cos I can't imagine needing 300+ registers for special purposes while still giving you the klunky ole EAX, EBX, ... & co. registers.

    • Actually, this is technically inaccurate. The IA-64 architecture [intel.com] (click on the link, it's the assembly language reference for the Itanium) has 128 integer registers and 128 floating point registers - on each side, 127 real ones, and R0, which is fixed to return 0.

      What's not commonly known is that the P3 and P4 also have dozens if not hundreds of registers. The trick is register renaming: the P3 and P4 speculatively execute instructions as fast as they can, assigning the results to temporary registers. If the processor needs those results, it reassigns them back to the real registers like EAX, EBX, and so on.

      So, overall, I'm not sure where the 328 number comes from. :P

      • 328 *physical* registers, not logical (ISA accessible). with 128 context switches will hurt big time ia64. yet another bad design decision of the itanic.
        • Re:328 registers??? (Score:3, Interesting)

          by VAXman ( 96870 )
          328 *physical* registers, not logical (ISA accessible). with 128 context switches will hurt big time ia64. yet another bad design decision of the itanic.

          A context switch happens once in a blue moon. Fast context switches are not going to make up for sluggish performance on the real work the machine is doing between context switches. Registers are considerably faster than cache; the absolute fastest cache in the world is the P4's L1 cache, which has a load latency of 2 cycles, and on most architectures it is 3 cycles. Putting 128 qwords into registers is an absolutely dramatic speedup for programs which have a working set of more than 8 dwords (all that IA-32 gives you).
      • So, overall, I'm not sure where the 328 number comes from.

        Probably 128 integer regs + 128 float regs + 64 predicate regs + 8 branch regs (note: NOT general-purpose), which comes to exactly 328, not counting miscellaneous regs like the IP.

        While the P3/P4 have lots of registers, they aren't registers in the sense most people think about them. They solve the dynamic antidependency problem. The static data allocation problem is a separate beast. Those renaming registers aren't visible to the compiler, so you'll still have the same number of memory operations in the program.

        Same deal with SMT/Hyperthreading. More registers are needed, but they aren't the sorts of registers the compiler can use.

        It's interesting that a write to R0 is defined to fault. Is this just for Itanium or is it an IA64 architectural decision? If so, it seems like a very poor one to me.

      • So where do I download a free reader that runs on Linux for that file of binary garbage?

    • Yea, they are. As I remember it, 128 general purpose integer and 128 general purpose fp. Download the Intel C++ docs for some info about the ASM-viewpoint architecture of the Itanium.

      My only question is for the OS guys out there. How does an OS handle context switches with 328 registers? With 8 bytes per register, that's more than 2K of data to dump out every context switch!
  • As someone with a few friends that recently made the move from Compaq's Alpha division over to Intel, what I'm most curious about is in what revision of the chip we'll see any improvements being incorporated from the Alpha design. I can't imagine Intel would want to let out any news on work that they bought instead of engineering themselves, but I think it'd be interesting to hear what exactly was directly ported over in the designs, if anything, as well as a detailed comparison of the two processors. Any info, anyone? Perhaps the second big revision of the IA64 chips?
  • by Bodero ( 136806 ) on Monday September 03, 2001 @01:34PM (#2248524)
    Hyperthreading, as implemented, exists in the Pentium 4 line.

    Right. And there's no indication that something similar will appear in IA64 until at least 2006 (which is the *earliest* that the Alpha team could likely add it to so complex - or, if you prefer, messy - an architecture if the hooks for it weren't already built in).

    It's a weak second to SMT. With HT, as I understood it, if a processor happens to have a floating point op and an integer op on hand at the same time, it can run both of 'em at once, instead of sequentially. That's the limit to the HT magic. It can't do two FP or integer ops at once.

    Well, real-world server applications could be sped up by 30%, which would mean that HT could execute multiple *non*-FP instructions at once (and the article doesn't say it can't, just that it can't execute two FP ops at once).

    It actually seems to look quite a bit like EV8's SMT, except that we don't know if it currently adds more execution units to the P4 architecture and whether all execution units can be applied to service a single thread if multiple threads aren't present. And, of course, it only supports two concurrent threads rather than four.

    Intel stole and then implemented Alpha technologies for its Pentium, and only much later did it negotiate with Digital to get the official right to use that stuff.

    No: I'm assessing the situation, unlike your propensity for drawing conclusions based on vague speculation and no data.

    IA64 has to all appearances been developed with zero attention paid to things like out-of-order execution (in fact, it was developed explicitly to *avoid* out-of-order execution). OOO and SMT are intimately intertwined in EV8's SMT design, and apparently also in HT's. There's no indication that Intel has until now given any thought toward incorporating SMT/HT technology in EPIC, and every indication that it will thus take at least close to 5 years before such IA64 technology hits the street (especially as incorporating it into EPIC will almost certainly involve radically different internal approaches than those used to incorporate it into EV8 and P4).

  • Did anyone notice that in the middle of the article it says that the Pentium 4 chip has hardware multithreading, yet it was disabled "until the company comes out with its first Xeon processor with multithreading."

    Shades of the whole 486SX [ic.ac.uk] debacle?

    • No, it's just that the technology probably isn't mature enough to release. It's different from the 486SX, which is closer to the whole Celeron thing.
    • Not really. On the 486 the FP unit was a discrete part of the chip, which could potentially have a defect, and thus be disabled. It was done only to increase yield. MT on P4 is spread completely through the chip, and it is unlikely that a defect would prevent MT from working but let the chip run in single-thread mode (since almost all parts of the chip are shared in MT). The reason MT is not enabled is not a manufacturing issue (like on 486SX) but mostly for paranoia about pioneering a totally new feature.
  • I'm running a Coppermine 850 right now, and at the time it was a sensible upgrade from my Katmai 450, just as the 450 was a good upgrade from the PII 233 before it. But right now I'm scratching my head. My next CPU upgrade will most likely require a new motherboard as well, so I now have the freedom to go to a completely NEW architecture. (The ABIT BX6 will probably go in our media box.) But I don't really want a P4, the Itanium definitely isn't for what I do, and I have never really been an AMD fan; I just don't know their stuff. So where is Intel's next chip for ME?

    • I've been extremely reluctant about going the AMD route. My first AMD processor was a 133MHz 486, which was branded so as to resemble the Pentium 75 (on the premise that it was as fast). I put it into my firewall, which was not seeing heavy duty at the time.

      The thing sucked eggs, and I threw the motherboard in the trash and used the CPU as a paperweight.

      At some stage, I needed a faster CPU, needed a motherboard to go with it, so I made the jump to a PentiumIII/450. I needed to revive my firewall, so I bought a decent ASUS 486 mobo at a fair. On a hunch, I put my paperweight AMD133 in. I was pleasantly surprised, and I only replaced the thing when I got a real cheap 300MHz Cyrix mobo.

      Bottom line, it's the motherboard (or rather the chipset on it) that makes or breaks the CPU. I'm now running an ASUS A7V-E with a 1GHz Athlon, and I've been a happy camper. I'm not an overclocker (matter of fact, I underclock some machines, just because I don't need CPU power for anything other than video recoding, and some machines are on the other end of the globe, so I don't want to lose sleep over fan failure).

      My main gripe with the VIA KT133A chipset is the fact that I have to sign a $#@#%$#% NDA to get decent specs on it. FreeBSD doesn't seem to grok its I2C-based hardware monitoring, and without those docs I'm SOL. Apart from that, it's working great. Even under Carmageddon^WWindows.

  • This isn't so much about the CPU itself, but the chipset it fits to:

    The BIOS on all Itanium chipsets (AFAIK) is set up to have a small kernel onboard. I.e., you can boot the system with limited functionality even if there's no floppy, HDD, or other boot medium present. If you do have a filesystem present, the "BIOS-boot" will even give you access to it.

    Not the biggest feature on the block, but helpful nonetheless.
  • Anybody know if Itanium or that 64-bit AMD processor will have better support for CPU virtualization?
  • I will now translate the thoughts of an average American regarding this article:

    "Wow, a 64-bit processor with 6MB? I can finally have a computer more powerful than my N64! I hope it doesn't let little Billy access all of that satanic-internet-porn any faster, though...."
  • I would laugh if Intel eventually decided to sell these impressive-looking chips for desktop systems and had to do a big campaign about how clock speed is not terribly relevant to how the chip performs, in hopes of silencing Athlon owners saying "Ha ha, a whole gigahertz!? How much did you pay?"
  • by Ghoser777 ( 113623 ) <fahrenba@@@mac...com> on Monday September 03, 2001 @01:45PM (#2248562) Homepage
    So when most people go out and buy a computer, they see a lot of MHz and think it's really fast. So if they're used to 2GHz+ Pentiums, why would they even think of buying a 1GHz Itanium? Sure, I know it'll probably be faster, but how does Intel plan to market these? Will they also drop MHz ratings like AMD? Or will they go on some major re-education campaign, like Apple [apple.com]?


    • And Intel will, of course, be working upstream against its own past marketing push which is largely responsible for the MHz Myth. Nice.
    • Remember that Itanium is marketed solely toward IT people, who know that gigahertz does not equal performance, and who do real performance studies before deploying a system. Look at the success of HP and Compaq, whose chips are reasonably fast yet have low megahertz ratings (or Sun and IBM, whose chips are slower and have low megahertz ratings, but sell very well).

      It is only the consumer market which looks at gigahertz. Which means that Intel will have to make a high megahertz version if it expects Itanium to enter the consumer market.
  • by ral ( 93840 ) on Monday September 03, 2001 @01:46PM (#2248563)
    ...the Itanium product line will see its speed increase from 800 MHz to 1 GHz, which is half the frequency of the company's fastest 2-GHz Pentium 4....Intel contends, however, that the faster front-side bus, more on-chip memory and redundant logic resources will more than make up for the processor's lag in clock speed.

    We can only hope that this chip helps move the media away from using clock speed as the primary (often only) measure of performance.
    • But clock speeds are often an indicator of performance - anyone who has played Rocky's Boots and knows anything about clocking transistors knows that.
    • Intel will most likely not face the same problems AMD has had with regard to marketing of their CPU, due to the target market.

      When you buy a machine with $2000-$5000 CPUs, you tend to do real research on the performance of the system you are buying.
      • > When you buy a machine with $2000-$5000 CPUs, you
        > tend to do real research on the performance of
        > the system you are buying.

        Which makes you wonder who would possibly buy an Itanium (especially for non-FPU-intensive servers where Intel's pushing it).
  • by doorbot.com ( 184378 ) on Monday September 03, 2001 @01:47PM (#2248568) Journal
    for me, it means single instruction xor for the 64 bit hash codes used in chess transposition tables

    Watch where you say that, or you'll be using that nifty Itanium to repel the hordes of women instinctively flocking to you like the salmon of Capistrano.
  • by dpilot ( 134227 ) on Monday September 03, 2001 @01:50PM (#2248579) Homepage Journal
    Is anyone else so completely stunned as me, that essentially everyone (except AMD) has rolled over and allowed the IA64 to be crowned heir apparent as the new high-end microprocessor? The Alpha is dead by acquisition, HPPA is dead by partnership, MIPS is lost somewhere in the low end, and Sparc and Power4 are both retreating upstream.

    It's amazing that ANYONE can field the number of mistakes that Intel has, and get away with it. For some time now, their first-outs have been essentially flops:

    Pentium: Remember the 5V room heaters?

    Pentium: Then the 3.3V units with floating point bugs?

    Pentium Pro: The ancestor of the Pentium II/III line was a good CPU in its own right, and worked well for Unix and OS/2. But it completely missed the market, performing terribly on 16 bit code.

    Celeron: DeCeleron, until they put the cache back on. From another point of view, the whole Celeron program has been a disaster, either by its own crippling, or by revealing how overpriced the PII/PIII line is.

    Pentium III: CPUID - A 'workstation idea' that once again missed its market. Maybe if they'd found a way to node-lock software that can't be used for machine tracing. Maybe that's not what they were after.

    Pentium 4: Let's face it, this CPU is just plain uneven and imbalanced. After a round of redesign to even it out, just like with the others, it could very well be an excellent CPU. Tame the prefetch, expand the trace cache, etc.

    Itanium: Didn't even make it out the door before spin-doctoring began. "Just wait for McKinley!" I've already heard one set of rumors that McKinley isn't going to *really* do it either, so just wait for IA64-III.

    Is all this any better than the "Just wait for this new release!" that Microsoft keeps pulling? Though I guess Intel does generally get each family right on the second shot.

    AMD has a good product, I just wish they were a little less mum, and had a better response than warmed-over P-numbers. I also wish we could hear a bit more noise about the Hammers.
    • MIPS is lost somewhere in the low end

      Umm... have you ever used a MIPS chip? The R10k and R12k are beautiful processors and very fast. Don't let the low MHz rating fool you. The SGI compilers are also very good -- they do a lot of optimization, and the profiling tools are some of the best around. There are lots of hardware counters on the R10k (32, I believe) that make it easy to find out where in your code all your FLOPS are, the secondary cache misses, branch mispredictions, ...

      I wish SGI/MIPS would continue along with these chips. They are a wonderful platform to develop on.

    • Sounds like you haven't heard about a little chip called the Power4 [realworldtech.com]

      Plus Sun sure hasn't rolled over either, Sparc performance has always been subpar, but they make up for it with a good OS (Solaris) and tons of applications.

    • Pentium III: CPUID - A 'workstation idea' that once again missed its market. Maybe if they'd found a way to node-lock software that can't be used for machine tracing. Maybe that's not what they were after.

      I think you're confusing CPUID with the Processor Serial Number (PSN). PSN, IMHO, was a good idea, but the privacy zealots cried foul and ruined an otherwise good way to lock software to a specific individual's CPU. (YES, I know there are work-arounds that pirates can use, from simply hex-editing the instructions that check for the PSN to writing drivers that return false info.) I really wish Intel hadn't backed down on PSN and had included it in the P4 (after all, for those naysayers that don't want PSN, or their identity, revealed to websites or software, you can disable it in the BIOS).

      Oh well. Thought I'd clear that up. CPUID is GOOD. PSN is BAD (to the privacy folk, anyways).

      • You're right, I had a brain blip on that. I meant PSN instead of CPUID.

        I merely wish they had looked into some PSN-type technique that would let software be nodelocked without being usable for tracking. I don't believe PSN must be bad, at least not to anyone other than a fanatical Free Software type, who believes NO software should need to be paid for. I'm sure a technique can be used which will not alarm privacy advocates.
      • Realizing now that I never explained what CPUID really is -- CPUID is an instruction introduced in the original Pentium (and some late-model 486's, though undocumented and unsupported) that returns a plethora of useful info on what kind of processor is being used, as well as what features it has. AMD and Intel share a lot of the same info (as far as the data layout), but diverge on others. In Intel's incarnation, a bit-flag is returned that exposes the status of such features as an on-board FPU, MMX, SSE and SSE2, as well as some individual instructions such as CMPXCHG8B. Both AMD and Intel reveal Family, Model and Stepping information, as well as a 12-character ASCII vendor ID string (on Intel CPUs, "GenuineIntel"). Even newer processors tell you their name and speed in an ASCIIZ string.

        Quite useful, and it pretty much does away with arcane checks to see what processor the code is really being run on (like the various methods of checking to see if you're running on an 8086 or 286, or 386 vs. 486, for example).. =) Unfortunately, if you want to run on these golden oldies, you still have to do those arcane checks, but once you establish that you're working with a Pentium or higher processor, you simply do a CPUID and you're done.
    • by Anonymous Coward
      You didn't mention the i860 and the iapx432, the processors Intel wants you to forget about, and it seems that they succeeded.
    • Hey, how could you forget Rambus!
      • Didn't. But that was mostly a chipset, motherboard, and support chip issue. I presume you mean the last-minute delay of the 820 launch.

        Another aspect of Rambus is the untamed prefetch on the P4. It's so aggressive that only Rambus can provide enough bandwidth to keep it running, at least until dual-channel DDR. According to the reviews, though, most of that bandwidth is simply wasted, yet needed to keep the processor fully fed.
    • While you have certainly kept track of Intel, I believe you've completely ignored the rest of the world's history.

      That is, all these companies have had their share of problems.

      When the Alpha was first released it ran *HOT*. I had one of the early DEC 3000/300s on my desk. DEC had other problems with the Alpha; the CPU itself was denied its future because of the poor-quality boxes it was put into.

      I'm not quite as familiar with Sparc or PowerPC, but we shouldn't forget that Sun was having difficulties with the Sparc found in the E10000 not too long ago. To the companies who had paid millions for these boxes, it was a bigger deal than the Pentium floating point problem.

      AMD has had their share of flops. The early 386 and 486 designs were good, but should we all forget the K5 and the early K6?

      I had a Cyrix 486DX/50 clone back in '94, and it wouldn't work with a variety of software under Linux such as ghostscript. Cyrix replaced it, reluctantly... I had to argue with them on the phone despite Infoworld articles reporting the problem.

      I don't see Intel as having a significantly worse track record than the others. Their products are certainly used in higher numbers, and thus the failures are higher profile.
      • The difference is in what happens after failure. Intel has had a string of failures, and manages to come back, be forgiven, and continue to dominate the market. The others aren't so lucky. Even with as strong a product as the K7, AMD just hasn't cracked the higher-profit markets. Intel would like us to believe that the heyday of the K7 is fading, and if they do a good respin on the P4, they'll be right.
    • Someone needs to defend the SPARC chipset, and what [I see] Sun Microsystems is doing, so here I am.

      Sure, the single-processor, or even up-to-8-processor, results are not the greatest thing out there. In the single- through four-processor units Intel beats them, and above that the Power series takes over. What one tends to forget is, for a processor that is designed for SMP, A) scaling to 1024 processors nearly linearly is damn good, and B) it is relatively cheap for a server-class processor. Also, the SPARC line is known to have the fewest hardware bugs of any major processor out there.

      Sun really doesn't need a sports car of a chip anyway. Servers and workstations need uptime. They don't need to attack the user market yet. First they seem to be more actively attacking the workstation market with the sub-$1000 SunBlades. With a Sun solution the workstation only needs to be moderately fast, but the server needs to be DAMN fast because the most intensive processes run on the server and display over the network. Small steps.
  • The article states that the Itanium pulls 130 watts of power. That seems rather high, even for the space heaters that we like to call cpu's nowadays. Is the Itanium using the new all-copper .13 micron process, or an older technology?
  • by deranged unix nut ( 20524 ) on Monday September 03, 2001 @01:55PM (#2248594) Homepage
    This is a rather odd quote from the article:
    (bolding is my emphasis)

    To protect against heat-related system meltdowns, McKinley includes a programmable thermal trip that can throttle processor performance by 40 percent to cut power consumption. But the company sees that more as a safety net, not as an answer to thermal issues. "This should never be needed in a properly designed system," said Naffziger.

  • Fat pipe (Score:1, Funny)

    by Anonymous Coward
    Could it be that Intel(tm) is learning that it's not how long your pipe is but how you use it?
  • With an 8 stage pipeline, as opposed to the 20 stage pipeline in the P4, clock frequencies are obviously not as high (~1 GHz).

    What??? That's totally false, not to mention counter-intuitive. The whole reason for the shorter pipeline is to increase throughput. Think of Henry Ford and the classic assembly line. If you have stages that involve scheduling instructions to be fed into different (parallel) pipelines, as opposed to DIRECTLY COPYING instructions from cache into the appropriate pipeline, which do you think should be faster?

  • 328 registers? (Score:2, Interesting)

    My god. I'll never learn assembly on a modern chip. I tried on the 386/486, but gave up, and opted for the 65c02 (a fine little chip). I'm getting to the point where it's time to move on, and I was going to attempt the 68k or even PPC (no altivec though). I think I might actually manage to learn that, but I can't even begin to imagine 328 registers. Especially arranged the way intel tends to arrange them...

    Will anyone outside of cpu engineers and compiler authors even learn asm on this monster? Or have we truly moved past the point where programmers understand the cpu?
    • In its heyday, the 6502 was an eight bit RISC processor avant la lettre. It featured a whopping 256 memory locations that could be accessed with near-zero overhead. The famous page zero.

      Needless to say, this great concept had gone to the dogs before the first consumer laid his/her hands on the device. Oblivious to the CPU design, a major manufacturer of operating systems (we called them BASIC interpreters at the time, by the way) decided that most of page zero should be allocated to the OS^WBasic interpreter. I'll leave it to our collective conscience to name the perpetrator of this gruesome mistake.

      I have long gotten over the idea of using assembly as a faster programming language. The number of times I have beaten an assembly program with something hacked up in Perl, I don't even want to remember. Not because Perl is the best thing since sliced bread, but because humans are so poor at dealing with complexity. Get it working first, and leave optimization to the compiler. Then, if you have a bottleneck, analyze it, and fix it in a targeted piece of code (whether C, assembly, or something else).

    • 256 of them are numbered general-purpose registers (so instead of EAX, EBX, ECX and EDX, you get r0 through r127, plus another 128 floating-point registers, f0 through f127). They're also 64 bits wide, vs. 32-bit on x86-based CPUs.

      The other 72 are 64 predicate registers, one-bit registers that hold the 1-or-0 result of compare instructions and gate whether other instructions take effect (they can't be used as general-purpose registers, and wouldn't be useful that way even if they could), plus 8 branch registers that hold branch-target addresses.
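      What those predicate registers buy can be sketched in plain C (this is an illustration, not actual IA-64 code): the compare produces a one-bit flag, and the result is selected without a taken branch, which is roughly what IA-64 if-conversion does.

```c
#include <stdint.h>

/* Sketch of predication (if-conversion) in plain C -- not IA-64 assembly.
 * The compare yields a 1-bit flag, playing the role of an IA-64 predicate
 * register, and the result is selected without a conditional branch. */
static int64_t select_max(int64_t a, int64_t b) {
    int pred = (a > b);  /* analogous to a compare setting a predicate */
    return pred ? a : b; /* both arms are cheap; the predicate picks one */
}
```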
  • Damn, I guess a large-scale SMP machine will be a dual-use convection oven then. Oh wait, by the time I add in FSB buffering and memory, maybe that will be true of the workstations :-).
    • by OmegaDan ( 101255 )
      Mattel offered "barbie" and "hot wheels" computers earlier this year ... maybe intel could go in with Mattel and offer an Easy Bake Itanium computer.
  • I think I've figured out what the whole 64-bit thing is about. It means that each instruction (right term?) has more capacity to carry data. This doesn't necessarily mean that it will be twice as fast, of course, because not all instructions are that large.

    What I'm confused about is how it affects programming. Does this mean that everything will need to be optimized for you to take advantage of the wider word size? How will programs that are written for 32-bit systems handle it; can they handle it? How about backwards compatibility?

    Do any other people read these sort of threads even though they know that it will be over their heads most of the time?
    • I think I've figured out what the whole 64-bit thing is about. It means that each instruction (right term?) has more capacity to carry data. This doesn't necessarily mean that it will be twice as fast, of course, because not all instructions are that large.

      Yup, exactly right. It means that the CPU tends to deal mainly with 64-bit (8 byte) chunks of data at a time, instead of the more common 32-bit chunks. As far as programming goes, not everything needs larger instructions. For example, to program a user interface, 32 bit integers are quite sufficient for most purposes (unless you have over 4 billion items in a listbox or something). If you only need to store a number from 1 to 10, using 8 bytes instead of 4 is a waste of memory. (This happens a lot.) However, it is useful for many operations, such as multimedia, games, DSP applications, crypto, etc. etc. These applications would run faster on a 64-bit processor because they can use 1 instruction to manipulate a 64 bit number instead of 2 or more that are necessary to do the same thing on a 32-bit processor.
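      The one-instruction-vs-several point can be made concrete. Both functions below compute the same 64-bit XOR; a 32-bit machine effectively has to take the second route (a sketch in C -- in practice the compiler does the splitting for you).

```c
#include <stdint.h>

/* A 64-bit XOR done whole vs. done in two 32-bit halves.
 * On a 64-bit CPU the first form is a single instruction. */
static uint64_t xor64(uint64_t a, uint64_t b) {
    return a ^ b; /* one op on a 64-bit machine */
}

static uint64_t xor64_via32(uint64_t a, uint64_t b) {
    uint32_t lo = (uint32_t)a ^ (uint32_t)b;                   /* low halves  */
    uint32_t hi = (uint32_t)(a >> 32) ^ (uint32_t)(b >> 32);   /* high halves */
    return ((uint64_t)hi << 32) | lo; /* two XORs plus reassembly */
}
```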

      The other reason to use 64-bit processors is that it makes it easier to use 64-bit memory addressing. (For various reasons, it's a little easier to program if memory addresses are the same size as integers.) If you have more than 4 GB of RAM, (or you want more than a 4GB address space more precisely) then you need larger pointers. At the moment x86 programs use 32 bit pointers, but the Pentiums and above actually have 36 address lines, so they can use up to 64GB of RAM. Anyway, a 4 GB address space will be fairly cramped in about 10 years, so it's time they bumped that up a bit.

      Intel has an emulation mode in the IA-64 series to allow people to run existing 32-bit programs, but at the moment it's dog slow. (It runs at about the speed of a Pentium 133, if that, when the processor is running at around 700 MHz.) The IA-64 architecture is completely different from the current IA-32 (x86) stuff. I get the impression that the 32 bit emulation doesn't use as many tricks as the existing processors to get programs to run faster. They're also overhauling the motherboard/BIOS stuff that's been around for a long while. (Some of it since the original IBM PC.)

      Of course, just because a processor can do 64-bit operations, it doesn't mean that it's actually faster than its predecessors. For instance, IA-64 has a few weaknesses:

      • It doesn't have an integer multiply instruction. You have to convert to floats and back if you don't want to program the multiply using shifts or something.
      • It doesn't support a floating-point type with better precision than 64 bits (called "double" in many programming languages). This makes it unsuitable for high-precision calculations. Current IA-32 chips can use up to 80 bit floating point values.
      • Intel seems to have tried to include every feature (except see above) but the kitchen sink in the instruction set. Loads of processor hints about instruction grouping, branch prediction, cache hints, and heaps of other stuff. This makes quite a complex design that could be difficult to implement and write really good compilers for. (Then again, Intel could always sell their own...)
      • And all of the space-heater comments.
      Anyway, it remains to be seen what effect the above points will have on its acceptance.
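      The missing integer multiply is less exotic than it sounds; besides routing the operands through the floating-point unit, a compiler can fall back on classic shift-and-add. A minimal sketch of that fallback (not what Intel's compiler actually emits):

```c
#include <stdint.h>

/* Integer multiply built from shifts and adds, for an ISA without a
 * plain integer multiply instruction. Adds a shifted copy of a for
 * each set bit of b; wraps modulo 2^64 just like hardware multiply. */
static uint64_t mul_shift_add(uint64_t a, uint64_t b) {
    uint64_t result = 0;
    while (b != 0) {
        if (b & 1)      /* this bit of b contributes a << position */
            result += a;
        a <<= 1;
        b >>= 1;
    }
    return result;
}
```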
  • email from intel (Score:2, Interesting)

    by xted ( 125437 )
    I received this from one of my Intel comrades; it was sent to all Intel employees.

    Speed is important. On Monday, Intel launched the Intel® Pentium® 4 processor at 2 GHz. Tuesday, during his keynote at the Intel Developer Forum, Paul Otellini, executive vice president and general manager, Intel Architecture Group, demonstrated a processor operating at fully 3.5 GHz.

    But that's not the half of it. Otellini went on to note that the Pentium 4 microarchitecture is expected to scale to a whopping 10 GHz.

    Now that's a "Wow!"

    But, exciting as speed is, it isn't everything. While it is important, "it is not sufficient to drive the levels of growth and innovation that will allow our industry to prosper," Otellini said.

    Speaking before an audience of 4,000 developers, designers, and executives Tuesday, Otellini noted that as the computing industry has grown and new technologies have evolved, purchasing criteria are changing. "We all need to change the pattern of our investments," he cautioned the crowd. "We need to think beyond gigahertz and build substantially better computers."

    Buyers now look to a variety of features, noted Otellini: style, form factor, security, power consumption, reliability, communications functions, price, and overall user experience. Combinations of these and other features are driving end-user technology requirements in individual market segments. Intel plans to develop technologies that will help address these changing requirements in each of the key market segments.

    Here are just a few of the ways Intel plans to go beyond gigahertz, as Otellini revealed in his keynote address:

    It's like multiple processors on a single chip
    Otellini introduced the audience to a breakthrough in processor design called hyper-threading. This technology allows microprocessors to handle more information at the same time by sharing computing resources more efficiently. The technology provides a 30 percent performance boost in certain server and workstation applications and will first appear next year in the Intel® Xeon™ processor family.
  • 130 Watts. (Score:3, Interesting)

    by istartedi ( 132515 ) on Monday September 03, 2001 @03:04PM (#2248801) Journal

    This makes me wonder, how many Crusoe processors could you put in a box (all other components equal) and equal this power consumption? Would the performance of such a box meet or exceed the performance of an Itanium box for real-world servers?

    • Well, the Crusoe processor uses about 1-2 watts [transmeta.com]. So, you're talking about 65 Crusoe processors to eat up the power of a single Itanium. If you're going by the entire motherboard with its components, an RLX 324 uses 15.7 watts of power.
  • by Cylix ( 55374 ) on Monday September 03, 2001 @04:14PM (#2248957) Homepage Journal
    With an 8 stage pipeline, as opposed to the 20 stage pipeline in the P4, clock frequencies are obviously not as high (~1 GHz).
    This beast has a small wang... it's not the size that counts, but how you use it. (no giggling from the girls, dammit)

    130 Watts power consumption...
    Who needs space heaters anyway?

    ...6 MB of on-die cache...
    OY! Hold your wallet tight; this one's not for the light of bank account!

    I'm sure many people can appreciate 64 bit integer ops; for me, it means single instruction xor for the 64 bit hash codes used in chess transposition tables.

    Not quite what the Intel boys will be using in their next commercial. However, the wizards in marketing will be stressing the enhanced features of porn browsing. The fourth blue Intel commando will be a scantily clad woman... further emphasizing the need for this processor, which will not just make the internet faster, but will speed up your favorite pron sites.
  • Why do I have visions of new computers plugging into a 230V AC socket, like dryers and ovens? 130 watts is an awful lot of juice when you consider that the CPU is fed from a supply rail of only around 5 volts DC or so.

    For those that don't remember their EE or physics courses: watts = volts * amps. And one amp through your torso is enough to kill just about anybody.
  • I'm sure many people can appreciate 64 bit integer ops; for me, it means single instruction xor for the 64 bit hash codes used in chess transposition tables.

    Yes, 64-bit operations have a handful of general uses, but when you weigh the benefits against the huge increases in transistor count, power consumption, and memory usage, are they worth it? I argue that they aren't. Doubling the size of almost every unit on the chip is a steep price to pay.
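    For reference, the chess use case the submitter mentions looks like this: a position's hash key for the transposition table is maintained by XORing 64-bit codes in and out, one per (piece, square) pair. This is a generic Zobrist-hashing sketch (the table sizes and the rand()-based key setup are illustrative assumptions); on a 64-bit CPU each update is a single XOR instruction.

```c
#include <stdint.h>
#include <stdlib.h>

enum { PIECES = 12, SQUARES = 64 };
static uint64_t zobrist[PIECES][SQUARES]; /* one random code per (piece, square) */

/* Crude 64-bit random, for demo purposes only. */
static uint64_t rand64(void) {
    return ((uint64_t)rand() << 32) ^ (uint64_t)rand();
}

static void init_zobrist(void) {
    for (int p = 0; p < PIECES; p++)
        for (int s = 0; s < SQUARES; s++)
            zobrist[p][s] = rand64();
}

/* Moving a piece: XOR out its old square's code, XOR in the new one.
 * Each XOR here is one instruction on a 64-bit CPU. */
static uint64_t apply_move(uint64_t key, int piece, int from, int to) {
    return key ^ zobrist[piece][from] ^ zobrist[piece][to];
}
```

Because XOR is its own inverse, undoing a move restores the original key, which is what makes incremental updates this cheap.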
  • The Itanium is not a clear replacement for the x86 line by any means. If we're going to toss the x86 architecture completely, then there are lots of options: PowerPC, StrongARM, Alpha, SPARC, something else. Now switching the entire PC world to a SPARC chip sounds crazy, but it's not any crazier than switching to Itanium.

    For the record, Intel has cooked up x86 "replacements" before, like the i860 and i960.
