Intel

Inside the Itanium 135

vanguard writes: "Extreme Tech has a detailed overview of the Itanium. It's fairly long but it's worth your time if such things interest you."
This discussion has been archived. No new comments can be posted.

Inside the Itanium

Comments Filter:
  • Better link (Score:3, Informative)

    by jeroenb ( 125404 ) on Sunday February 10, 2002 @09:25AM (#2981872) Homepage
    Go here [extremetech.com] to view the entire article on a single page without all the annoying crap around it :)
    • Re:Better link (Score:3, Interesting)

      by Segfault 11 ( 201269 )
      ... and for a recent read on Hammer, go here: http://www.hardwaremania.com/reviews_eng/hammer/hammer1.shtml [hardwaremania.com]

      Of course, for some perspective on the nature of processor speculation, I point you to nearly any issue in Byte's print archive [byte.com].

      • Wow, that is so poorly written!


        The first article that i wrote about the processor architecture was approximately five years ago in order to estimate the present situation for the 64 bit processor market. At those days, the name "Willamette" was implicitly whispered on the sites, which were close to the chip giant, Intel. Intel had bought the production sites of Digital Corp. which was a firm with its product lines beyond its time. On the same period, we were discussing the upcoming announcement of the exciting chip K6-2. And the dream of a 64 bit desktop was even existent on the minds of IT market. I just wrote my first processor article on a 486 DX-2. At that time I was discussing the nonexistent and superior processors when measured with the glorious benchmark criteria of those days, Spec95 than my computer; lots of our crew, were just asking "... why do we need that much processing power on a desktop" scornfully. Now, I can barely put up hauling myself back from laughing, when I looked back to those swirling and blazingly fast years for processor market.



        Woof! Someone help that guy get his GRE...
    • Hang on, there is no $64,000 dollar question, it is an urban myth. There was, however, a $64 question.
      • Hang on, there is no $64,000 dollar question, it is an urban myth.

        I believe that you are mistaken. There are many people still living who remember it, and you can buy video tape copies [cduniverse.com] of the show. This seems a little much for "an urban myth".

        -- MarkusQ

        • I beg to differ, but you are partly right. Look here [shu.ac.uk]
          • I beg to differ, but you are partly right.

            I fail to see how you conclude that I am only "partly right" when the link you posted mentions the existence of the very TV show that you initially claimed was "an urban legend". If you recall, nobody said that there wasn't a $64 question on the radio first; but I do claim that the existence of the radio show doesn't magically make the TV show "an urban legend".

            -- MarkusQ

  • Wow, 328 registers. Does it mean that it will be more efficient to declare all variables in a single function than using multiple functions (due to scope of variables that restricts the efficient usage of so many registers)?


    • Real programmers had to deal with a lack of registers. For instance, take the infamous Z-80. It had only 7 general-purpose 8-bit registers, A, B, C, D, E, H and L, and you could use BC, DE and HL as 16-bit register pairs.

      It also had 2 16-bit index registers, a 16-bit stack pointer and a 16-bit program counter. Which, of course, shouldn't be used for calculations.

      So you could count the registers on your hand. Ye good ol' times.
      • 7! What luxury! In my day, we had the 6502. It had only 4 registers (A, X, Y, & S), and we liked it...

        Kids these days...
        • What is this "4 registers" you speak about? I am interested and would like to subscribe to your newsletter!

          My 8 hamsters-on-wheels-pushing-buttons seem to be VERY interested in this "register" technology.

          I wonder why....

          PS- Yes, that was a Simpsons quote.
        • Sorry this is such a late reply, I wasn't around on Sunday.

          Anyway, I'm glad to see other people out there remember the 'ole 6502. Just to clear things up a bit, the 6502 had 5 registers. You forgot the Processor status (P) register which held the test flags. On the other hand though there was no way to access this register directly, just through test instructions.
      • Old times?! I'm having to develop to it now. Of course, it is a hardware assessment, but it's got some god-awful quirks to it.
    • I would expect to get one hell of a compiler to optimize my code for something like that for me if it's possible.

      I do not divide my code into little functions to save some register swapping, but to make it easier to read, manage, and of course to code.

      Although I can see how not having to fight for your registers is a big help at the assembler level, I would be very, very afraid of someone who codes a function/method/procedure with 300 variables "to take advantage of the registers", be it C, Assembly or whatever. Fear does not always imply respect.

      I would think all those registers would be much more useful in context switches and/or pipelining issues.

      Say, allocate 30 registers to each context and keep 10 contexts running on the processor without much penalty. Prepare your variables for your next few operations for each conditional outcome in the registers. Stuff like that would be more useful (and less visible) in most cases.
      • would think all those registers would be much more useful in context switches and/or pipelining issues.

        Say, allocate 30 registers to each context and keep 10 contexts running on the processor without much penalty. Prepare your variables for your next few operations for each conditional outcome in the registers. Stuff like that would be more useful (and less visible) in most cases.


        That said, the new AMIGA TAO OS has Virtual Registers (not that I've seen diddly squat in the way of programs for it).
    • Most of them are special-purpose, though... for use of things like SSE.
  • by T1girl ( 213375 ) on Sunday February 10, 2002 @09:29AM (#2981876) Homepage
    There's no guarantee that any calculations you did won't have to be redone with valid data. It's a bit of a gamble, but can pay handsomely if you speculate wisely. Architecture imitates life.

    It was worth reading a long article to come across this nugget of wisdom. I think i'll embroider it on a sampler and hang it in my cube.
  • Dell launched this [dell.com] 64-bit machine mid-2001. It comes with a single Intel Itanium processor at 733 MHz, 1 GB SDRAM, Matrox Millenium G450 graphics card, 18 GB SCSI hard disk drive.

    The price? $7,999 at the time.

  • by PoiBoy ( 525770 )
    ...we'll evaluate the pros and cons of the "other" 64-bit processors used in workstations and servers, such as SPARC, Power, MIPS, and Alpha.

    What is so special about the Itanium other than the fact that it's from Intel? We've had 64-bit processors for years now. Moreover, it's not like everyone is going to ditch the IA32 architecture overnight and go to Itanium. It seems to me that anyone who needs/wants 64-bit computing already has it.

    • This is the opportunity to get your 64-bit computing in a beige box instead of a proprietary system. Supposedly, it will be cheaper, easier to use, and mainstream. For most people, one killer argument is "Does it run Windows?" - and you can't natively run Windows on Sparc or HP/UX... (whether you can run Win64 on Itanic is another story ;-)
      • Who wants to run windows anyway? But even if, what about Alpha?
          1. 90% of MS-brainwashed CIOs want Windows. Blame them all you want, but they have the budgets, and they don't ask you or me what to buy.
          2. Win2K on Alpha is dead (see Compaq's Web site). In fact, WinNT on anything but IA32 never *really* existed, from a consumer's point of view. Did you ever see any commercial app compiled for Alpha or MIPS?
          3. Alpha is not a mass market platform. Itanic will be (or at least Intel hopes so).
          4. Alpha has no x86 compatibility, Itanic promises some. Big deal for legacy apps - many can't even be recompiled with modern tools!
          • Alpha has no x86 compatibility, Itanic promises some.
            The FX!32 [compaq.com] emulation/recompilation system actually provided very good transparent x86 compatibility on Alpha WinNT systems. This definitely fits high on my list of Really Cool Technologies; for most programs I could just forget about whether or not they ran native. The biggest drawback was a marketing one: it wasn't hardware x86, so as far as perceptions were concerned it wasn't real. I used it on some pretty challenging apps, though, and it worked quite well and transparently.

            This is actually an interesting perception issue. Itanium has hardware support for x86 and software support for PA-RISC. The original article attributes this to a lesser priority being given to PA-RISC. While that may be true, it may also be due to the PA-RISC customers having less of a "real PA-RISC" hardware vs. software hangup than x86 customers.

    • What is so special about the Itanium other than the fact that it's from Intel?

      As far as I am aware, the IA-64 is the first instruction set to use explicit parallelism. This removes much work from the CPU when determining which instructions can be executed in parallel. I believe the IA-64 is also extensible--it is extremely easy to add more pipelines or make other significant changes to the architecture without a lot of redesign.

      From the article:
      An assembly programmer would call [4 instruction types] nuts. You'd think that Itanium's designers would have been satisfied with 241 different opcodes, but no?

      The Intel engineer I spoke to about IA-64 said that it would be virtually impossible to write good assembly for IA-64 because humans don't think about explicit parallelism in their heads very well. Itanium/IA-64 relies heavily on good compilers.

    • Note that the story says it is the first of a series on 64-bit processors. I assume IA-64/Itanium was first because it is the one most likely to show up on mere mortals' desktops a few years from now...
  • Is there ANY Itanium hardware that actually delivers good performance for the money yet?

    Is there ANYONE doing ANYTHING useful with them yet?

    As far as I can tell, current Itanium stuff is a mere curiosity.

    Intel is making an enormous gamble with IA-64, it is a huge investment and the whole thing may blow up in their face.

    AMD is taking the 'safe road' -- bolting 64bit onto an existing design, taking advantage of the huge momentum of the x86 architecture.

    Intel on the other hand is going to have to spend a huge amount of effort getting the Itanic moving.

    Will be interesting to see how things develop. Will AMD's 64bit products derail Intel's efforts? Can they crack the 64bit server market fast enough and make IA-64 irrelevant? Stay tuned!!
    • As far as I can tell, current Itanium stuff is a mere curiosity.

      Well right there you have already invalidated the rest of your comment. Intel is not going to have to do anything to get their IA64 architecture moving.

      As announced, HP's new PA-RISC chipset will support both McKinley and their RISC processor. Sounds crazy but HP is doing it.

      Furthermore, nobody, and I mean nobody, is developing anything for AMD's 64-bit systems. On the other hand, I have seen significant interest in McKinley. True enough, Merced is just considered a development platform and is pretty much dead in the water. I work for a test and measurement company and I deal directly with our front-side bus solutions. You are wrong. AMD is facing the same problems in the IA64 arena that they are in the IA32 arena. Intel's chips say Intel and theirs say AMD. So everybody is ignoring them, for now.

      This is not to say that AMD can't come in and beat them. I am just saying that Intel's solutions are already in development in many of the high-end server labs. I haven't seen a single group working on AMD's 64-bit proc, nor have I been asked if we plan on supporting it.
      • > Furthermore nobody and I mean nobody is
        > developing anything for AMD's 64bit systems.

        It doesn't matter. AMD will sell plenty of Hammer chips in their usual markets, who will just use it as a faster 32bit x86 chip. (Just like the 386 was used as a faster 16bit chip when it came out.)

        BUT one day someone will wake up and realize that Hammer makes for fast, cheap, cool and backwards compatible servers. And then Intel will release their secret x86-64-compatible CPU, and IA64 will be cancelled.
  • by jeroenb ( 125404 ) on Sunday February 10, 2002 @09:49AM (#2981908) Homepage
    I've been looking into the IA-64 for the past year or so, and I'm convinced that on the technology side, both the architecture and the implementation are a good thing. What surprises me is that it's still taking quite some time for it to start popping up in actual production environments. Not sure what the reasons for this are.

    First of all, with HP being a co-developer of the entire architecture, they are a big backer of the Itanium. So is Compaq, who sold their entire Alpha technology to Intel to focus on implementing the Itanium in all their high-end products (makes you think, was this all decided because they already knew they were going to merge with HP? Probably...) Dell is still sticking with 100% Intel, so the Itanium will be their bet for capturing more of the high-end segment. Even SGI is selling Itanium workstations (although, with the recent announcement of the MIPS-only, IRIX-only Fuel workstation, they might abandon the Itanium as well.)

    So what's holding it back? I think that although there's now Linux available for it as well as a prerelease version of Windows Server along with some other systems (like HP UX) we still need to see more applications. Databases alone just aren't enough - and with the high prices of Itanium machines (the cheapest dual-Itanium 733 is around $22K at Dell, everyone else is probably more expensive) developers are not really happy about buying a couple of those machines to start hacking. So I think that because we don't see the Itanium much, developers are not investing in writing the software and businesses are not investing in buying the hardware.

    Maybe Intel should start giving out IA-64 machines to opensource hackers and watch it fly? Where can I submit my address info? :)

    Oh and about the subject of this post, the fact that the Itanium is 64 bits is not really all that important - the fact that a processor is 64 instead of 32 bits doesn't say anything about how fast it is. If you think it does, you can buy my R4400 Indigo2 for $10K :)
    • Oh and about the subject of this post, the fact that the Itanium is 64 bits is not really all that important - the fact that a processor is 64 instead of 32 bits doesn't say anything about how fast it is. If you think it does, you can buy my R4400 Indigo2 for $10K :)

      Or my R10000 Indigo2 for $20K... at least it runs "IRIX64", the 64-bit kernel, and the 64-bit ABIs.

      elwood 6# uname -aR
      IRIX64 elwood 6.5 6.5.15m 01091821 IP28

      AFAIK, the only SGIs that use the R4400 in a 64-bit manner are the Challenge L, Challenge XL, and (original) Onyx. R4400 in desktop machines is limited to 32-bit support for memory constraint issues. 64-bit on the desktop from SGI requires an R8000/R10000 based Indigo2, R10K/R12K/R14K Octane/Octane2, or R14K Fuel. All other desktop configurations are limited to O32 and N32.
    • What surprises me is that it's still taking quite some time for it to start popping up in actual production environments. Not sure what the reasons for this are.

      Huge installed 32 bit codebase.

      $1200 price tag for little speed gains.

      Unproven platform.

      I think those are good enough reasons. Also, many hacks are in place to allow 32-bit systems to do a lot of stuff 64-bit ones can do, for example, creating files > 2GB.
    • by Anonymous Coward
      The most important thing about 64-bit chips is that they allow an address space of over 4GB. This is a real fully addressable >4GB space, not some 36-bit hack like the current intel 32-bit chips allow.

      In the EDA industry (Electronic Design Automation i.e. tools for making computer chips) we routinely hit the 4GB memory limit. 99% of EDA tools run on Solaris but EDA companies are slowly recompiling their apps to be 64-bit clean on Solaris.

      Meanwhile Linux is picking up steam in the EDA world, but the 4GB limit is holding it back. We're forced into complex partitioning of our chips to break them into small enough chunks to fit under 4GB.

      We need cheap (non-SPARC) 64-bit chips, say like oh AMD Hammer? :)
    • I see Itanium's arrival as being as good for Intel as the PowerPC chip line was for Apple. It is a step up. However, Itanium is slow to arrive because the PC hardware industry is highly resistant to change. It's a vicious symbiotic relationship, or, more simply put, an eternal tug-of-war.

      The big changes in PC hardware are successful only when both Intel and Microsoft marketing convince others (both resellers and users) that having the new tech is the Big Thing--even if neither group really understands the tech and why they should buy into it. I still get questions about USB, which Microsoft and Intel began to support better after the debut of the first iMacs and their USB-only support.

      Another factor may be cost. Are Itanium chips much more expensive than P4s or AMDs? If so, adoption will be further slowed. This is not a game of better tech (IMHO, other companies typically win that game over classic Intel motherboard architecture), but of commodity pricing. The best tech is NEVER achieved by the lowest bidder.
      • Intel required that their motherboard manufacturers include USB for a long time before Apple's iMac. Their really old LX chipset had USB on board. It was just that no one really cared about it until the iMac created a market for USB devices.
    • So what's holding it back? [...] we still need to see more applications.
      What applications, though? In particular, what applications for which Itanium would have an advantage? Whether or not the architecture is a good idea long-term, the current implementation doesn't offer a compelling performance advantage for any one application. What kind of software could make the current hardware sell?
      • One particular kind of application that will benefit enormously from the 64bit integer registers is chess engines. Today most chess engines use several 64bit integers (bitboards) to hold the state of the board (one for black knights, one for white pawns etc).

        With this data structure, for instance, obtaining the free or occupied squares is easily done by ORing them together. On today's 32-bit CPUs this requires a few instructions; on the Itanium (or any 64-bit CPU) it is only one.

        Play a little chess on the net at http://mobilsjakk.no. This service (my employer's service in fact) would definitely benefit from being run on a server with dual itaniums :)
    • So what's holding it back?

      For real-world numerical applications, using state-of-the-art Intel compilers, the Pentium 4 is faster than the Itanium. Of course, people still use the Itanium because of its substantially larger address space, which is a very, very significant issue. And we can expect the IA-64 architecture to catch up as compilers improve, as is common with such architectures.
    • How about the fact that VLIW (or EPIC, if you prefer) compiler technology really isn't there yet for general purpose problems? Maybe you can write a program for EPIC and get it to scream, but simply recompiling the ray-tracer (or what-have-you) you already have just won't show much. Take a look on comp.arch for more, especially under X86-64 for evolution vs revolution opinions.
      • cache hits (Score:3, Interesting)

        by johnjones ( 14274 )
        the real problem for intel is that the arch does badly for programs that are not cache hitters

        they took one look at the people trying to do predictive memory loads and decided not to. this was a LONG time ago and now people have solved the problem so that most of the time you can get things from cache

        IA64 fails to get things from cache too well (one of the reasons why they stuck such a large one on) so it suffers from the latency problems more than most

        simple

        regards

        john 'try running spec marks on it' jones
    • by Anonymous Coward
      the cheapest dual-Itanium 733 is around $22K at Dell, everyone else is probably more expensive


      $7995 single / $14995 dual
      Check the prices yourself:
      http://www.hp.com/workstations/products/itanium/i2000/summary.html
      link [hp.com]

  • by Anonymous Coward
    Just the other day, I sat in on a seminar given by one of the Itanium's compiler developers. Needless to say, most of the group went to sleep the moment he started talking about the instructions. The assembly seems awfully arcane, so it's likely someone could go insane before speeding up anything :).

    Anyway, to all those who think its performance is rather low: it seems more like a proof-of-concept chip than something intended for mass production. :)
  • by redelm ( 54142 ) on Sunday February 10, 2002 @10:04AM (#2981932) Homepage
    A nice architectural overview, but there's no mention of power. IIRC, Itanium sucks back 125 Watts!


    Power/heat this high gives system designers problems, plus it can't be easy getting ~100 Amps to and from a chip.


    Otherwise, AFAIK, Linux has working ia64 so code size can be compared. I'd expect 4x x86.

    • Yeah, it is a power hog. For Merced there is a power pod that goes next to each "processor". That pod provides power to the processor only. From there, the power is distributed to the L3, PAL, Core, etc. I'd imagine on the processor's core chip the power is distributed by a significant number of pins. Otherwise it'd be a bear to get the power evenly to different areas of the die.
  • I don't feel that any of the current reports are aimed at people such as myself, and don't feel that I'm getting the real deal in terms that I understand.

    I understand what a register is, the advantages of 64bit, 128bit etc., even what a pipeline is.

    What I would like to see is a bullet-pointed list of advantages put in executive-summary, dumbed-down style!

    Also, I read a few years ago about Elbrus, who have some pretty neat claims here: http://www.elbrus.ru/mcst_e/proect_e/e2k_arch_e.htm [elbrus.ru]

    You may be interested in their claims.

    • What I would like to see is a bullet-pointed list of advantages put in executive-summary, dumbed-down style!
      Advantages over other existing 64-bit processors:
      • It's got the Intel name on it, and will be marketed accordingly.
      • It'll run Windows.
      • It hasn't been dead-ended by its manufacturer.
      Advantages over x86 processors:
      • It's slightly better for high-performance floating-point computation.
      • It makes a better space heater.
      • It'll keep determined assembly-language hackers out of your hair for a while.
      I think that about covers it. Not all of those will be advantages for every consumer, of course, nor do all the advantages apply to every competitor.
  • What's the point? (Score:3, Insightful)

    by 4im ( 181450 ) on Sunday February 10, 2002 @10:54AM (#2982037)

    Except for the Itanium coming from Intel, what's the point? This is a prototype for a new architecture (IA64), prototype proven to be seriously lacking in speed, stability etc. I got to see a dual Itanium prototype from HP a few months ago, and all the comments I got about it were that it essentially sucked.

    Really, if you need 64 bit, why not just go and get yourself some UltraSparc, Alpha etc.? I have gotten myself a used Ultra 30, will soon get a used AlphaServer, and I sure don't need to go buy an expensive, unstable processor that hasn't even got decent compiler support yet.

    And if it has to be IA64, at the very least wait for McKinley - HP's engineers are supposed to be doing a much better job of IA64 than Intel did. Or even wait for the version after McKinley, which is supposed to profit from good ol' Alpha.

    • And if it has to be IA64, at the very least wait for McKinley - HP's engineers are supposed to be doing a much better job of IA64 than Intel did. Or even wait for the version after McKinley, which is supposed to profit from good ol' Alpha

      The point of this article is to introduce you to the IA64 architecture, which McKinley and Madison are going to be based on. The point of developing for Itanium/Merced is to learn how IA64 is different from IA32 so that when you do a McKinley platform you are ready and not fighting with as many "what the hell is it doing now?" moments.

      I've seen many prototypes as well, from HP and others, and the response is not "it sucks." The response is "wow, we've got a lot of work ahead of us."
  • wrong direction? (Score:2, Insightful)

    by markj02 ( 544487 )
    Itanium is built around the notion that many of the hard decisions about how to execute code efficiently should be handled in software (mostly by compiler back-ends). I think this is not the right direction to go in. It means that every single compiler back-end will have to re-invent the wheel. Most likely, there will only be a small number of compilers that will do a decent job, and a lot of languages won't even try. That's fine if you think the world consists only of a bunch of SPEC benchmarks implemented in C/C++, Fortran, and Java, but it will make life even harder for non-standard languages or non-standard applications. And Itanium's implementation of VLIW seems particularly complex.

    Software is by far more costly and complex than processors these days, and we just don't need extra complications in the form of processors that shift even more complexity into software.

    I can't pretend to know what a "good" 64bit architecture should look like. But for the time being, something like Alpha or AMD Hammer seems like a better choice to me. And even Intel seems to be reconsidering and keeping a 64bit version of the Pentium as a backup strategy.

    • No way.

      The main bottleneck of modern microprocessors is, in fact, the extra space and heat produced by the complex logic you defend.

      RISC processors were born because CISC complexity was badly impairing performance. CISC was good when the processing bottleneck was instruction fetching: every slice of clock saved by reading hard-to-decode but compact opcodes was worth the pain.

      But now the bottleneck has shifted inward. Nowadays a CISC processor wastes more time translating opcodes and executing microcode than anything else. In this landscape Itanium did the right thing: getting rid of complex opcodes. The really bad drawback (and yes, you are right on this) is the huge increase in complexity of coding at such a crippled machine-language level.

      But think: how many compilers have been written in computer history, and how many applications have been written with those compilers? That ratio will prove that the trade is worth it. Of course compilers will be more complex and harder to code, but once that damn thing is done, it is done.

      That's the way MIPS, SPARC and Alpha were done, and they did very well in the past.

      • Re:wrong direction? (Score:2, Interesting)

        by Mydron ( 456525 )
        You've missed the point.

        Complex compiling issues are NOT a result of CISC or RISC in this case. In fact, RISC is far easier to write an efficient compiler for than CISC. The instructions offered by RISC more closely mimic the kinds of basic operations compilers manipulate in the very back end of compilation. Register sets are usually general and very orthogonal, compared to CISC (Intel in particular) where you have very few registers and they all have special meaning depending on context.

        The complexity in building compiling tools with respect to the itanium is all about VLIW, parallelization and scheduling. These are incredibly complex topics with many subtle features that make optimization and analysis very difficult.

        Also, think again if you think compilers are written once for an architecture and then set in stone -- 'once the damn thing is done' it definitely will not be done. It will probably be buggy and poor at doing these new complicated tasks compiler writers have never had to do before. It will likely take a few iterations before the compiler tools start to show off the architecture. The question is which will come first: mature compilers, or industry's frustration with poor performance out of expensive silicon.
      • The main bottleneck of modern microprocessors is, in fact, the extra space and heat produced by the complex logic you defend.

        The problem doesn't magically go away by shifting it into software. A static compiler cannot do instruction scheduling and parallelization correctly--you need runtime instrumentation and JITting. The end result is something that likely performs more poorly than if you had let the processor do this.

        Nowadays a CISC processor wastes more time translating opcodes and executing microcode than anything else

        CISC vs. RISC has nothing to do with it.

        But think: how many compilers have been written in computer history, and how many applications have been written with those compilers?

        Not nearly enough compilers have been written.

        That ratio will prove that the trade is worth it. Of course compilers will be more complex and harder to code, but once that damn thing is done, it is done.

        Yes, if you are happy muddling through with C and C++. Itanium will further cement the dominance of languages that we already know to be absolutely lousy from a software engineering point of view, because almost nobody will be able to make the investment to write a competitive compiler for any other language.

  • Itanium is making the compiler do a lot of work! This presents a gigantic challenge for compiler writers. My concern is with GCC. We all know that GCC does not produce the tightest, fastest code. For IA-32 this is not a big deal, because there is only so much optimization that can be done with that ISA (instruction set architecture). However, with Itanium the compiler will probably affect runtime execution speed by 100% or more. Will the GCC people have the resources and expertise to handle IA-64 (Itanium)?
    • I remember hearing some years back that Intel was thinking of or had committed resources to make open source contributions to GCC for IA64. Does anyone have any more details / information about that?
  • Let me try and get this point across again: writing compilers for the Itanium architecture will be a lot harder than writing compilers for other architectures. The reason is that the compilers need to make even more guesses about what kind of parallelism is possible and what instructions can be scheduled together. The programmer can't really help either, since trying to hand-optimize code for VLIW is just too tedious. This means that Intel will make the investment, and we will likely get decent compilers for C/C++, Fortran, Java/JVM, and C#/CLR, but for any other language, it will be even more of an uphill struggle. And even in those languages where good compilers exist, if you do something that the compiler doesn't quite understand, it may make the wrong assumptions and generate poor code.

    Altogether, that can't be good for the industry in the long run. We need more, not less, support for new software architectures and languages. Instruction scheduling and parallelization are things that a processor can handle much more easily than a static compiler, because the processor can efficiently keep runtime statistics on what a program actually does.

    A potentially better approach, it seems to me, is hyperthreading, which redefines the problem. No, individual threads won't get very high performance, but code generation is pretty easy, and (unlike VLIW) the programmer can take explicit control of parallelism at a higher level through threads. To me, that seems like an overall better approach.
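    To make the contrast concrete, here is a minimal sketch (in C with POSIX threads; the function and struct names are made up for illustration) of what "explicit control of parallelism at a higher level" looks like: the programmer, not the compiler, decides what runs in parallel.

```c
#include <pthread.h>
#include <stddef.h>

/* Each thread sums one chunk of the array. */
struct chunk { const int *data; size_t n; long sum; };

static void *sum_chunk(void *arg)
{
    struct chunk *c = arg;
    c->sum = 0;
    for (size_t i = 0; i < c->n; i++)
        c->sum += c->data[i];
    return NULL;
}

/* parallel_sum: split the work across two threads explicitly.
   No heroic compiler scheduling is needed; the parallelism is
   visible right there in the source. */
long parallel_sum(const int *data, size_t n)
{
    struct chunk lo = { data, n / 2, 0 };
    struct chunk hi = { data + n / 2, n - n / 2, 0 };
    pthread_t t;

    pthread_create(&t, NULL, sum_chunk, &lo); /* first half, new thread */
    sum_chunk(&hi);                           /* second half, this thread */
    pthread_join(t, NULL);
    return lo.sum + hi.sum;
}
```

    Whether the two halves actually execute simultaneously is up to the hardware (e.g. a hyperthreaded core); the correctness of the code doesn't depend on it.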

  • by Pedrito ( 94783 ) on Sunday February 10, 2002 @01:04PM (#2982531)
    I've done my share of assembly language, from 8-bit Ataris to 32-bit Intels, to IBM/390 mainframes.

    I can't even conceive of having to write assembly code for these monsters. Anyone happen to browse through the instruction set reference? All 900+ pages of it? It's all cryptic as hell. I could sooner build a rocket bound for Pluto than write a simple recursive factorial program in IA-64 assembly.

    I sure hope someone can figure it out. I doubt I'll be doing any assembly optimizations in the future.
    • I sure hope someone can figure it out. I doubt I'll be doing any assembly optimizations in the future.

      I've had to do worse: our logic analyzers have to decode the bus traffic to figure out what is going on. As you can imagine, the fact that 1 op code could mean 4 different things depending on when it shows up (or what happened before it) poses a significant challenge. You might say "big deal." Sure, big deal, if you are developing a system. We have to be able to provide tools that are bug free BEFORE the first silicon hits the market. That way, when a guy in a lab encounters what looks like a bug, he isn't fighting our mistake.

      I fear one day we are going to get to a point where we can't provide a solution for hardware engineers to probe their systems with.
  • The first thing that occurred to me is that the arch is a little weird in places. No integer multiply, wtf? Of course, then it occurred to me: who uses integer multiply? The main reason to use it would be in rasterization code in a graphics pipeline. Since integer-based rasterization went out with MMX, this doesn't matter. Overall, the arch seems pretty clean, a hell of a step up from x86, anyway. And the initial performance seems pretty good, so it seems to work. While the whole EPIC thing could mean hell for compiler writers, remember that Intel's ICC has been doing parallelization on x86 procs for years now, and does a damn good job of it. I can't wait for Deerfield to come out so I can choose between that and a K8 for my next PC ;)
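  For what it's worth, a "missing" integer multiply instruction isn't fatal: a compiler can synthesize a general multiply out of shifts and adds, much as compilers already do for multiplies by constants. A sketch in C (the function name is invented for illustration):

```c
#include <stdint.h>

/* Shift-and-add multiply: for each set bit of b, add the
   correspondingly shifted copy of a. This is the classic
   expansion a compiler can emit when no dedicated integer
   multiply instruction is available (or when it is slow). */
uint64_t mul_shift_add(uint64_t a, uint64_t b)
{
    uint64_t product = 0;
    while (b) {
        if (b & 1)
            product += a;  /* this bit of b contributes a, shifted */
        a <<= 1;
        b >>= 1;
    }
    return product;
}
```

  (As I understand it, Itanium actually routes integer multiplies through the FP unit's multiply-add hardware, but the expansion above shows why a dedicated integer multiplier is optional in the first place.)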
  • by HiredMan ( 5546 ) on Sunday February 10, 2002 @01:49PM (#2982726) Journal
    HP decided as early as 1996 that the then "Merced" project would not overtake their PA-RISC arch and essentially walked away from the project.

    Years late, the "Itanium" finally ships (although no one buys it), as Intel says, "But wait for McKinley! Then it will really work!"

    The McKinley is the product of the "rethought" Merced project. McKinley is shipping later this year - with a completely different socket system, so even the arch surrounding the "Itanium" is dead in the water.

    Let's compare this to the REAL competition:

    IBM Power4 1.3GHz - shipping for a while now:
    SPECint2000 = 814 SPECint_base2000 = 790
    SPECfp2000 = 1169 SPECfp_base2000 = 1098

    Sun UltraSparc III Cu 1.05GHz:
    SPECint2000 = 610 SPECint_base2000 = 537
    SPECfp2000 = 827 SPECfp_base2000 = 701

    Even the best Itanium 800Mhz reported int numbers are:
    SPECint2000 = 365 SPECint_base2000 = 358
    (Same box) SPECfp2000 = 610 SPECfp_base2000 = 526

    Even if the McKinley (which doesn't ship for 6 months or so) produces double the Itanium numbers, it'll still lag the currently shipping Power4 chips.

    Remember the hype and FUD surrounding the launch of the "Itanium" chip, which eventually hasn't even caused a ripple in the marketplace? Intel has sunk billions into this EPIC project and refuses to let it go, even though it's years late and so far hasn't produced the clear advantage over the RISC archs it was supposed to make obsolete. In many cases the "consumer" chips continue to post better results than the "server" chip series - and with AMD knocking on Intel's door, throttling back production/performance of the consumer chip is not an option.

    Will the McKinley be better than the Itanium? Certainly.
    Will it be compelling? Without Intel behind it - probably not. (Alpha was the clear performance winner for so long but couldn't get any traction.)
    Is VLIW^H^H^H^H EPIC the future of computing? "Answer unclear... ask again later." ;)

    =tkk
    • Intel knew the Merced wasn't a speed demon. An exec actually admitted this in 2000. McKinley is one of the first "real" chips they'll release.

      As for your SPEC figures, you could have at least made your post worthwhile by not fudging the numbers to make the Itanium worse than it actually is (although it still isn't very good).

      Source: http://www.aceshardware.com/read_news.jsp?id=30000 281
      800MHz Itanium
      SPECint2000 base: 403 (Your number was 358)
      SPECfp2000 base: 701 (Your number was 526)

      The McKinley should be much faster. Hell, it's got an 8-stage pipeline instead of a 10-stage one, an additional 2 integer units (6 total), a MUCH more efficient cache, on-die L3 cache, more L2 cache, a MUCH faster system bus, etc.

      Don't write off an entire architecture because you didn't like how the experimental implementation came out. Itanium was just to get a product out there for IA-64 early adopters to start getting code working on.
      • As for your SPEC figures, you could have at least made your post worthwhile by not fudging the numbers to make the Itanium worse than it actually is

        I didn't. These are the numbers from official submissions to the spec organization [spec.org]. (If they can't bother to submit results then they don't count.) I took the machine with the HIGHEST Int performance - as I said in my post. The FP is the result for the same machine - as I said in my post.
        There is a >700 SpecFP machine claimed by Dell but there is no corresponding SpecInt submission. I think Intel claimed nearly 800 Spec2000FP for the Itanium but no one else has been able to re-create those results. That's why non-submitted results don't count.

        Don't write off an entire architecture because you didn't like how the experimental implementation came out.

        But it wasn't supposed to be a proof of concept chip. It was supposed to be the future of computing.

        An exec actually admitted this in 2000.

        Which is at least 3 years after they knew it. Intel instead spread FUD around while refusing to talk performance numbers.

        From Intel Press Release:
        SANTA CLARA, Calif., Oct. 4, 1999 - Intel Corporation today announced it has selected Itanium(TM) as the new brand name for the first product in its IA-64 family of processors, formerly code-named Merced. The Itanium brand extends Intel's reach into the highest level of computing enabling powerful servers and high-performance workstations which will address the increasing demands that the Internet economy places on e-Businesses. "The Intel Itanium processor represents a new level of processor capability that will be the driving force for the Internet economy,"

        Ummmm... okay. I see, by "highest level of computing" and "new level of processor capability" they meant "proof-of-concept place-holder chip". It's all clear to me now...
        The full text is here on Intel's site [intel.com] since you seem to think I make this stuff up.

        Now McKinley is supposed to be the next, big thing.

        From an article about McKinley [com.com] previously on /.
        "Applications will be about one and a half to two times faster than what you get on a (current) Itanium," said John Crawford, an Intel fellow in the enterprise platforms group.

        The additional bus and processor speed and 3 megs of on-chip cache running at core speed should deliver nearly a 50% boost all by itself. If the "new" features of the McKinley don't add much more beyond that, then where are they going?

        Forgive me if I appear skeptical...

        =tkk

  • by Anonymous Coward
    It sucks, nuf said.

    To be a bit more specific: look at the bottom of the article, where it mentions "use Merced for development, McKinley will actually sell". And the fact that even Dell no longer sells Itanium.

    Sanity check:
    system                   specint/specfp   cost
    HP server rx4610         342/701          $23k
    AMD XP2000+ epox 8HK+    734/642          $1k

    Keep in mind that the Itanics are supposed to be for the server market, so the specint figures will more likely track actual performance. Intel has been claiming that McKinley will be a vast improvement (the actual claims seem to have been steadily downgraded from "dominate the market" to "actually sell a few" to "won't make management look like complete idiots"...). Present claims of McKinley performance are 1.5 to 2.0 times Itanic performance, i.e. unlikely to keep up with Athlon, let alone the Hammer.

    Why does it suck?
    While Intel does know how to design processors, the architectures are another story. Aside from the 8086 kludge, Intel has produced such "successors" as the 432 and the 860. The 432 was even slower relative to the competition than the Itanic, and the 860 was even harder to write a compiler for (and impossible to write an interrupt handler for, let alone an OS).

    EPIC is supposed to be VLIW with enough "extras" to allow the compiler to write code that won't require out-of-order execution. It is also supposed to allow Intel to create several generations of compatible chips (something hard to do with pure VLIW). Somewhere along the way they forgot that the point of VLIW is to make execution simple. The lesson of RISC and the Cray machines is that the simpler and cleaner an architecture is, the faster it can go. Check out how long it takes to explain this architecture, then examine Alpha and ARM. Granted, ease of explaining does not always translate into ease of design, but it usually does, and it certainly did in this case.

    What now?
    It looks like the architecture of the future is x86-64. Hammer should appear this year (maybe not for sale, but at least samples). Intel is claimed to have a project called Yamhill that adds AMD compatibility to the next generation of x86. Right now, any support at Intel for x86-64 appears to be a CLM (Career Limiting Move - and no, it is still a CLM to support it now, even after Yamhill is enabled). After McKinley goes down in flames (there doesn't seem to be a chance of anything else), it will be interesting to see how long it takes Intel to produce anything AMD compatible. Assuming that the next generation of x86 was started after the Pentium 4 finished, that would place it about 4 years from now. By the time the politics get straightened out, that is likely the earliest option (rushing the job will just make it happen later).

    Why x86-64?
    Anyone familiar with the x86 architecture usually runs away in horror at the thought of actually using/designing one. Having said that, the 386 architecture fixes almost all of the problems with the 8086/80286. The problems left are:

    variable-length instructions - this problem has basically disappeared; all modern high-power CPUs have bigger problems than this.

    insufficient registers - Hammer doubles the number of registers (note that Itanic's 200+ registers become a hindrance if not fully used).

    addressing size (32 bits) - obviously Hammer fixes this.

    In summary, Intel had a chance to create a usable architecture (anything noticeably better than x86 probably would have worked), and would simply have owned the market. It is possible that some PHBs thought that anti-trust laws might actually be enforced and so created an architecture too complex to clone. If so, they certainly succeeded - admittedly making one too complex to build themselves.

    Scott
  • by jrst ( 467762 ) on Sunday February 10, 2002 @02:15PM (#2982811)
    Many of the comments about compiler technology in this thread could be taken verbatim from discussions about RISC architectures 20 years ago. Or from the HLL (high level language) architecture discussions 30 years ago. (Anyone remember the cries for "closing the semantic gap" between processors and languages? No? Point made.)

    Hardware is getting more complex; it takes more sophistication to deal with it. Binding a (general purpose) processor to a language in order to make language implementation easier is exactly the wrong way to support a wider variety of languages. Making the most of a processor's capabilities is what compilers are for. That's what compiler writers get paid for.

    That's not to say I'm in love with the Itanium. At first glance I found it a baroque rehash of old ideas. But time--and compiler writers--will tell.
  • Competing with RISC processors? Want to fight with the big boys? Wouldn't it then make sense to actually produce a RISC processor? Just a question.

    If they want to call it RISC, then don't make the instructions so large, don't include hundreds of possible instructions, and make the clock cycle time shorter.

    Where have the basics of RISC gone?
  • by pinkpineapple ( 173261 ) on Sunday February 10, 2002 @02:37PM (#2982889) Homepage
    When Steve Jobs was asked what he envisioned regarding 64-bit processor adoption (this was around the time IBM came out with its kick-ass Power4 CPU), his reply was that it would take about 10 years for us common mortals (you and me mostly, but not him :-) to see 64-bit systems on the shelves at Fry's or CompUSA.

    Given that this was coming from the mouth of the CEO of a company that:
    - can afford to make the move quickly and cleanly (the PowerPC architecture is clean compared to IA-64 + x86, and is 32-bit backward compatible).
    - had successfully shifted its kernel to a clean replacement (fewer kludges), allowing the transition in a blink of an eye (ok, maybe 6 months)
    - has an installed base of machines in places like labs (see gentech) and design studios.
    - the applications that would benefit the most (A/V and number-crunching apps like Photoshop, Maya and Final Cut) are all in the Apple camp
    - develops a big chunk of the major apps for its platform, leading the way in terms of design and adoption of new tech.

    it would seem that we have about 8 more years of 32-bit glory in front of us before the current CPU architectures get displaced and eventually die.

    Which 64-bit architecture will succeed is not clear today. Given that MS isn't rushing its OS out the door to support IA-64, it seems a little premature to tell.

    PPA, the girl next door.

  • Gad, what a turkey (Score:3, Interesting)

    by Animats ( 122034 ) on Sunday February 10, 2002 @03:01PM (#2982978) Homepage
    I can't see why Intel bothered with this thing. Intel pioneered mainstream superscalar out-of-order machines with the Pentium Pro/II/III architecture, which made it possible to make CISC architectures go fast. That was a major achievement. It made classic RISC obsolete - why put up with the code bloat?

    Then came the Inanium. VLIW, code bloat, ugly architecture, requires near-omniscience from the compiler, very tough to program in assembler, a power hog, and with mediocre performance. If anybody else had launched this, it would have died before first shipment. As it is, it's dying anyway. Dell dropped their Itanium workstation recently. The Itanium may end up as a niche product, like the forgotten i860, i960, and iAPX432 processors.

    I'm hearing rumors of a new 64-bit machine from Intel that's basically an improved x86, like the AMD Sledgehammer. That may be what actually gets used.

    • by Anonymous Coward

      Then came the Inanium. VLIW, code bloat, ugly architecture, requires near-omniscience from the compiler, very tough to program in assembler, a power hog, and with mediocre performance.

      eh?

      even though the IA-64 arch does seem to have some weird stuff in it, i wouldn't call it UGLY, especially when comparing it to the IA-32 "architecture" (or rather the lack of one.) and who programs in assembler nowadays? (excluding the MMX/SSE stuff, which however is a direct consequence of the crap fpu on IA-32)

      mediocre performance the first implementation might have, i can agree with that. but is IA-64 a power hog? itanium might be, but if you look at the article really carefully, you'll learn that the itanium CPU core only contains approx 25 million transistors. this is MUCH less than a P4! being an x86 chip always required carrying that extra baggage to decode those mysteriously coded x86 instructions. i'm not saying that the IA-64 instruction decoding is simple, but at least it's worth the effort (whereas the legacy x86 baggage is not).

      and no, i don't like itanium or intel very much. i have an alpha 21066 at home :-)

      • by timmyd ( 108567 )

        even though the IA-64 arch does seem to have some weird stuff into it, i wouldn't call it UGLY especially when comparing it to the IA-32 "architecture" (or rather lack of it.) and who programs in assembler nowadays? (excluding the MMX/SSE stuff which however is a direct consequence of the crap fpu on IA-32)


        IA-64 still has backwards compatibility with IA-32, which has real mode, v86, and protected mode. that makes IA-64 a mess to start with. compilers still generate assembler in some cases, and some people have to use asm for low-level things in the kernel and for doing things that you can't do in C, like calling software interrupts - which, by the way, requires that you enter either v86 or real mode, which isn't as simple as changing the PE bit. you have to set up the stack and memory segments again, and real mode can only physically access 2^20/1024/1024 = 1 megabyte of memory at once. maybe if intel would stop building on their old crap the whole thing would get a little simpler.
        but i guess it would be boring if everything was as simple and stable as a calculator.
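        The 1-megabyte limit falls straight out of the segment:offset arithmetic. A small sketch in C of how a real-mode physical address is formed (the function name is made up for illustration):

```c
#include <stdint.h>

/* Real mode forms a physical address from a 16-bit segment and a
   16-bit offset as segment * 16 + offset. The maximum reachable
   address is 0xFFFF0 + 0xFFFF = 0x10FFEF, i.e. essentially the
   2^20 bytes = 1 MB computed above (plus the small "HMA" overshoot
   on CPUs with the A20 line enabled). */
uint32_t real_mode_addr(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```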
    • Granted, this needs improvement, but x86 is hardly a prize either. The architecture of the Pentium Pro and up screams kludge. Huge amounts of circuitry that do nothing but break up CISC instructions into something resembling RISC so that they can be executed at reasonable speed. x86 didn't beat RISC because it avoided code bloat (does anyone even notice code bloat anymore? I mean, most of the world is running a word processor that consumes 32+ megs of RAM; who are we kidding?); it won for the same reason Windows wins, which is installed user base. It was compatible with what had come before, and no one wanted to buy new programs, so we've stuck with it. It's like an AMC Gremlin with a V12 engine welded onto the roof because you didn't want to have to move your stuff out of the trunk. And to steal somebody else's analogy, that V12 uses side-injection to remain compatible with your old Model-T. I mean, segmented memory architecture? Please.

      So what if you can't write assembly to make your code faster. Aside from the "Real Programmers don't eat quiche" mystique, why is this a problem? You probably can't beat a good C compiler on a P3 or P4 either. With very few, very specialized exceptions, the compiler is smarter than you are. Granted, it's slow now, but this is the first generation of the chip. With some architectural improvements (please, for the love of god, do something different with the cache!) this could be a pretty decent chip, and one that will still run your old apps.

  • If you want to try out Linux on the Itanium architecture using an IA-32 system go here [hp.com]. You can download a simulator and a development environment at no charge. HP released this SDK in 2000 to help developers before systems were available. David Mosberger (maintainer of IA-64 Linux kernel) developed the SDK along with Stephane Eranian. It's still a good option if you can't get access to a system.

    If you want to know more technical details about Linux on the Itanium Architecture, David and Stephane just released a book "IA-64 Linux Kernel: Design and Implementation". David was signing copies at HP's booth at LinuxWorld NY.

  • Come on, admit it guys, no one cares.
  • Part of the idea behind the Inanium was to make it uncloneable. The business concept was to use lots of new ideas, useful or not, thereby providing intellectual property protection. [google.com] This would prevent low-cost clones and allow Intel to get their profit margins back up.

    This is good for Intel, but not for anybody else. Go back and look at Intel CPU prices from just before AMD processors caught up.

    And that's the real reason for the Inanium.

  • This register business is totally out of control. Let's just get rid of them and have something like a 'level 0' cache. TI used to have a processor with only three registers that used memory for everything else. I think it is ultimately easier to just have 256 high-speed registers that simulate memory locations. (If you take out the registers needed to manage all these registers, and the transistors that handle renaming etc., you could get 1K of registers.) This also makes assembly language much easier. Want to multiply two numbers? Just multiply the memory locations directly. Got some tight counting loops? These will be done directly in registers. How would this be handled on MP machines? NUMA seems to be an obvious answer. The caching of memory to registers would have to be very fine-grained, of course. In the long run, this appears to be the only logical way out.
  • From the article:
    "Itanium has a 10-stage pipeline, which is respectable but not impressive by today's standards. "
        Does this guy have any idea what he's talking about? Since when did having a 20-stage pipeline become more impressive than something with 10? Hell, I can design a processor with a 100-stage pipeline that does nothing. Yes, that's right, nothing -- just wire delays (the Pentium 4 has 2 pipe stages devoted to wire delays just to get it clocked at 2+ GHz). Ultimately what matters is the performance of the CPU (and memory subsystem); clock speed, pipe depth, etc. don't matter very much on their own. In fact, if you can deliver higher performance with a smaller pipeline depth, it's a much better design.
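        The point reduces to the classic iron law of performance: execution time = instructions × CPI ÷ clock rate. A deeper pipeline only helps if its clock-speed gain outruns the CPI it loses to mispredict penalties and wire-delay stages. A toy illustration in C (all figures invented):

```c
/* Iron law of processor performance: time = N * CPI / f.
   Pipeline depth matters only through its effect on CPI and f,
   so "deeper" is not the same thing as "faster". */
double exec_time(double instructions, double cpi, double clock_hz)
{
    return instructions * cpi / clock_hz;
}
```

        For example, a hypothetical deep-pipeline design at 2 GHz with an effective CPI of 1.5 loses to a shallower design at 1.5 GHz with a CPI of 1.0: 0.75 s versus about 0.67 s for a billion instructions.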
