Intel

Intel Unveils Next Gen Itanium Processor

MojoKid writes "This week at ISSCC, Intel unveiled its next-generation Itanium processor, codenamed Poulson. This new design is easily the most significant update to Itanium Intel has ever built and could upset the current balance of power at the highest end of the server / mainframe market. It may also be the Itanium that fully redeems the brand name and sheds the last vestiges of negativity that have dogged the chip since it launched ten years ago. Poulson incorporates a number of advances in its record-breaking 3.1 Billion transistors. It's socket-compatible with the older Tukwila processors and offers up to eight cores and 54MB of on-die memory."

  • by GoNINzo ( 32266 ) <<moc.oohay> <ta> <ozNINoG>> on Thursday February 24, 2011 @10:55AM (#35300608) Journal
    Guess the guys at Intel have been watching Fight Club a little too much.
  • Itanium flashbacks (Score:5, Insightful)

    by ArhcAngel ( 247594 ) on Thursday February 24, 2011 @10:57AM (#35300648)

    Does anyone else cringe when they hear Itanium? The early chips still give me nightmares.

    • Obviously, lots of people cringe, and even TFS refers to it as a possible redemption of the Itanium line's name (i.e. its reputation was pretty lousy to begin with).
    • by Anonymous Coward on Thursday February 24, 2011 @11:08AM (#35300808)

      I work with the world's foremost experts on optimizing for Itanium 2. All available compilers suck. If you are willing to invest the effort to hand-tweak, you can squeeze amazing performance out of the processors. They are extremely memory bound (hence the 54MB of cache now on chip). It is usually faster to recalculate numerical values than to fetch stored results.

      We work with large high performance computing systems/clusters. IBM Power 7 is fastest hands down for numerical work if you plan to use the crap output from the compiler directly. Recent Intel Xeon is as fast as Power 7 if you adjust all the fiddly settings and use some trial and error, but Xeon doesn't scale well for Symmetric Multi-Processing (SMP). Itanium 2 wins by a bit if you invest huge effort. Power 7 would probably be fastest overall for numerical work if we invested the same effort into optimizing that we do for Itanium. However, we don't have to invest the effort for Power 7 to be "fast enough".
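To put rough numbers on the "recalculate rather than fetch" remark above, here is a minimal back-of-the-envelope sketch. All cycle counts and hit rates are hypothetical placeholders chosen for illustration, not measurements of any Itanium part, and the function names are invented for this example.

```python
def expected_fetch_cycles(hit_rate, hit_cycles, miss_cycles):
    """Average cost, in cycles, of loading a stored result.

    hit_rate is the fraction of loads served from cache; the rest
    pay the full trip to main memory.
    """
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


def cheaper_to_recompute(recompute_cycles, hit_rate, hit_cycles, miss_cycles):
    """True when redoing the arithmetic beats the average load cost."""
    return recompute_cycles < expected_fetch_cycles(hit_rate, hit_cycles, miss_cycles)


# Hypothetical figures, for illustration only.
HIT_CYCLES = 10        # assumed cache hit latency
MISS_CYCLES = 300      # assumed main-memory latency
RECOMPUTE_CYCLES = 40  # assumed cost of redoing the calculation

if __name__ == "__main__":
    for hit_rate in (0.99, 0.90, 0.50):
        fetch = expected_fetch_cycles(hit_rate, HIT_CYCLES, MISS_CYCLES)
        verdict = cheaper_to_recompute(RECOMPUTE_CYCLES, hit_rate, HIT_CYCLES, MISS_CYCLES)
        print(f"hit rate {hit_rate:.2f}: fetch ~{fetch:.1f} cycles, recompute wins: {verdict}")
```

Under these assumed numbers, recomputation only pays off once the stored results stop fitting in cache, which is consistent with the poster's description of a memory-bound workload.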

      • by TheLink ( 130905 ) on Thursday February 24, 2011 @11:33AM (#35301140) Journal
        Would the same optimizations for the Itanium work OK for the Itanium 2 and for the upcoming Itanium? Or would the optimizations be too generation specific?

        AFAIK the problem with the Itanic was that it was better at "embarrassingly parallel" problems. But that meant you could usually get the same (or better) performance with two or more x86 servers at a lower cost... And the x86 processors would do better than the Itanic on code that's not been optimized by super experts.
      • by mevets ( 322601 ) on Thursday February 24, 2011 @11:43AM (#35301284)

        | I work with the world's foremost experts on optimizing for Itanium 2....

        So when your whole team orders lunch, do you get a medium pizza or a large?

      • by LWATCDR ( 28044 )

        Memory bandwidth seems to be the next big bottleneck. I wonder what is the "ideal" memory to CPU ratio.
        I wonder what it would be like to have a system with no real RAM, just cache. Imagine CPUs with 4 GB of cache in a system where all the memory above 32 bits was the cache of another CPU. You could access the memory of the other CPU at the speed of RAM today. It would be a really massive MP system to be sure. Of course, then you would still want some RAM, even if it is just for DMA I/O and video.
        Yea I am sure I a

        • by Mr Z ( 6791 )

          The cache has to be backed by *something*. Either that, or you have to have some protocol wherein when you kick the last copy of something out of one cache, you arrange for it to get stored in another, in which case it isn't really a cache at that level so much as a set of dynamically assigned addresses system-wide.

          Consider a smaller system first as an example. Suppose you had two CPUs, each only one level of cache, and each with 1MB. That's 2MB total system memory. Now suppose the first CPU reads throu

          • by LWATCDR ( 28044 )

            I was thinking of Cache as being more like on board fast ram that ran at CPU speed than as cache as we see it today.
            To take your database example, the way I imagine it working is that a CPU would send commands to all the CPUs to find the records that contain x. Each CPU would search its own memory of records and then just transmit the records to the requesting CPU. It would take a different programming model than what we use today. In a way I was thinking of it as smart RAM. It seems dumb that a CPU has to read a

            • by Simon80 ( 874052 )
              I'm not a hardware guy, but what you're saying is out of touch with the design tradeoffs CPU designers already make for the following reason. Memory that runs at CPU speed already exists, that's what registers are. Then you have L1 cache, which takes a few cycles to access, L2, which takes I don't remember, a few dozen cycles, and memory, which takes hundreds of cycles, even longer if the TLB cache is being missed as well. Obviously things go faster if you have more registers, more L1, etc., but it's a trad
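The latency ladder described in this comment (L1 at a few cycles, L2 at a few dozen, main memory at hundreds) can be turned into an average access cost. The sketch below uses the textbook average-memory-access-time formula; the hit rates and cycle counts are assumed values for illustration, not figures for any specific CPU.

```python
def average_access_cycles(levels, memory_cycles):
    """Average cycles per memory access for a multi-level cache.

    levels: list of (hit_rate, latency_cycles) pairs, ordered from L1
            outward. Each hit_rate is local: the fraction of accesses
            reaching that level which hit there.
    memory_cycles: latency of main memory, paid by whatever misses
            every cache level.
    """
    total = 0.0
    reach = 1.0  # fraction of all accesses that get as far as this level
    for hit_rate, latency in levels:
        total += reach * latency    # everything reaching this level pays its latency
        reach *= (1.0 - hit_rate)   # only the misses continue outward
    return total + reach * memory_cycles


if __name__ == "__main__":
    # Assumed numbers: L1 = 3 cycles with 95% hits, L2 = 20 cycles with 80% hits.
    cached = average_access_cycles([(0.95, 3), (0.80, 20)], memory_cycles=200)
    uncached = average_access_cycles([], memory_cycles=200)
    print(round(cached, 2), round(uncached, 2))
```

With these assumed values the cached case comes out at roughly 6 cycles per access against 200 with no cache at all, which is the tradeoff being argued about: a bigger, slower L1 raises the first term for every single access.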
              • by LWATCDR ( 28044 )

                I do understand that is why I said I wonder what the ideal trade off between memory and CPU will be. Right now such a system would be useless because caches are only a few MBs. What happens when we can put 4 GB on the die? There are very few problems that can not fit in 4GB. I do understand the register, L1, L2, L3, main memory, mass storage structure. But we are already having memory access issues and moving to more and more cores. There is just a really nice symmetry to each CPU having 32 bits of CPU spee

                • by Simon80 ( 874052 )
                  ..and what happens to your performance if you don't distinguish remote memory from local memory? Also, there are tons of problems that don't fit into 4GB, such as, for example, pretty much every problem that involves using a database. Even so, at some point cache limitations cease to be the bottleneck. Keep in mind that all of that die space can be used to implement more cores, etc., so it's not as simple as "how much cache can we fit on the die", there's always a tradeoff between competing uses for the die
              • ...If you want to be able to write fast software, I suggest you read Ulrich Drepper's What Every Programmer Should Know About Memory [akkadia.org]. It's not that long, and very informative.

                It's 114 pages of not that long, but who's counting?

                • by Simon80 ( 874052 )
                  Well, if you're not serious, then yes, it's long, but if you have a serious specialization in software development then it's a drop in the bucket. Anyway, it's short compared to reading a book, and he's nice enough to not even make you pay for it.
          • by sjames ( 1099 )

            I'm pretty sure he meant that what is now cache on the CPU would be configured as RAM but still as fast (or faster) than the current cache.

            • by Mr Z ( 6791 )

              He used the phrase "No real RAM, just cache." I work with embedded processors that split their memories between cache and RAM (many of the C6000-family DSP cores), and also work with processors that are all cache (most general purpose processors). The phrase "no real RAM, just cache" implies that there wouldn't be any directly mapped memory, only indirectly mapped memory (ie. "cache" by his definition).

              How else would you interpret "no real RAM, just cache"?
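Mr Z's earlier point that a cache "has to be backed by *something*" can be shown with a toy model. The following is a minimal sketch of a direct-mapped, write-back cache with one value per line; the class name and structure are invented for this illustration and do not model any real processor.

```python
class TinyCache:
    """Toy direct-mapped, write-back cache holding one value per line."""

    def __init__(self, num_lines, backing=None):
        self.num_lines = num_lines
        self.lines = {}         # line index -> (address, value)
        self.backing = backing  # dict standing in for RAM, or None for "just cache"

    def _evict_if_needed(self, index, new_address):
        # Write the current occupant back before another address takes the line.
        entry = self.lines.get(index)
        if entry is not None and entry[0] != new_address and self.backing is not None:
            old_address, old_value = entry
            self.backing[old_address] = old_value

    def write(self, address, value):
        index = address % self.num_lines
        self._evict_if_needed(index, address)
        self.lines[index] = (address, value)

    def read(self, address):
        index = address % self.num_lines
        entry = self.lines.get(index)
        if entry is not None and entry[0] == address:
            return entry[1]  # cache hit
        if self.backing is not None and address in self.backing:
            value = self.backing[address]  # miss: refill from backing store
            self._evict_if_needed(index, address)
            self.lines[index] = (address, value)
            return value
        return None  # evicted with nowhere to go: the data is lost


if __name__ == "__main__":
    backed = TinyCache(4, backing={})
    backed.write(0, "a")
    backed.write(4, "b")   # address 4 maps to the same line as address 0
    print(backed.read(0))  # the evicted value survives in the backing store

    unbacked = TinyCache(4)
    unbacked.write(0, "a")
    unbacked.write(4, "b")
    print(unbacked.read(0))  # nothing backs the cache, so the value is gone
```

In the unbacked case the first value is simply dropped on eviction, which is why an all-cache system would need either a backing store or a protocol that hands the last copy to another cache.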

        • The size of on-chip cache determines its latency. Larger caches are slower because they HAVE to be, not because they are made of different stuff.
          The first-order reason is distance. The L1 is closest to where the data is needed, the L2 farther away, and the L3 still farther. It's not possible to simply make the L1 larger without also increasing the largest distance and thus its latency.
          So the L1 is kept small on purpose, to reduce its latency (to 2 or 3 clock cycles these days).
          Trust me. The CPU manufactur
      • I work for the world's secondmost experts on optimizing for Itanium 2. We order a small pizza. Usually we're not that hungry after a half-day's work on this. We've lost our appetites completely by the end of each day.
      • So it runs like a bat out of hell when you massage it correctly. Good for you. Know what we call a processor that nobody can write a decent software stack for? A shitty processor.

      • I work with the world's foremost experts on optimizing for Itanium 2. All available compilers suck.

        I sometimes work on compilers for HPC, and this is caused by two, related, things. The first one is that no one cares. Itanium is such a small market that, even if you can get both Itanium users to buy your compiler, it's not worth the investment.

        IBM Power 7 is fastest hands down for numerical work if you plan to use the crap output from the compiler directly.

        POWER 7 is a pretty generic RISC design with a few CISCy tweaks. We've got 40 years and millions of dollars of research to look at when designing compilers for it. For Itanium? Not so much. It doesn't help that Itanium is so unlike everything else that it's h

        • I hate to break up a good pizza party, but I've been wondering if LLVM and Clang might help rescue Itanium as hinted by this 2005 paper which suggests a few compiler enhancements to help Itanium [gelato.org].
          • Nope, I also hack on LLVM. The LLVM Itanium back end is more or less unmaintained at the moment (unless someone's picked it up recently). It has very poor performance. It might even be worse than GCC - LLVM's IR is not an especially good match for EPIC (or even classic VLIW), so the instruction selection phase has to do a massive amount of work. This is usually a simple pattern matching exercise, but for Itanium it's incredibly complex to do well.
        • I sometimes work on compilers for HPC, and this is caused by two, related, things. The first one is that no one cares. Itanium is such a small market that, even if you can get both Itanium users to buy your compiler, it's not worth the investment.

          I heard a story from a guy at Red Hat that the team that maintains the Itanium port was putting together a pool to buy the last remaining Itaniums from the customers (for more than it would cost to replace them) and then throw the Itaniums off the roof.

        • APL rocks when your floating point addition latency exceeds your main memory fetch latency and your programmers don't mind that the trig operators are selected by manifest constants on the left side of the circle operator.

          Itanium is trying to fit the niche where SIMD is not applicable, yet arithmetic instruction density is high relative to memory transactions.

          I spent too much time last night reading about big constants. y-cruncher is sick. It's also not open source, and the core algorithm (Hybrid NTT mult

      • Have you played around with Opterons? I'd like to hear if they stack up any differently to the Xeons.
    • by pezpunk ( 205653 ) on Thursday February 24, 2011 @11:38AM (#35301218) Homepage

      ITANIC processor
      RAMBUS memory
      Voodoo5 video card
      i can't think of a hard drive crappy enough ... maybe you could have the OS installed on an external drive connected via USB1.0.

      obviously the OS would be WindowsME.

    • it's ititanic 2

    • I was a contractor when they were working on this "next gen" 64-bit CPU. Everybody was excited; then later, when I read about it, I couldn't understand where this new architecture would fit. Then x86-64 came out and there really seemed to be no place for it.

      IMHO, don't throw good money after bad.

      • AIUI, while Itanium was a failure on the desktop and on Windows/Linux servers, Intel was successful in pulling a number of vendors (HP being the best known in the West) into using it for their Unix and/or mainframe systems, and sales for that purpose provide Intel with enough revenue to justify keeping it alive.

    • by turgid ( 580780 )

      I knew a guy who had one once. He used to use it for drying his clothes.

      The marketing hype made me cringe. It was pretty obvious that the whole thing would be a disaster from the start.

      itanic relied on "good compilers" to get any performance. However, compilers that "good" will never be made, since you can't write a compiler that can predict the future.

      However, you can put hardware in your CPU to reorder instructions at run time based on observations of the behaviour of the running code, speculatively ex

  • by fuzzyfuzzyfungus ( 1223518 ) on Thursday February 24, 2011 @10:59AM (#35300676) Journal
    Is it more resistant to icebergs than the previous itanics?
  • Instead of a new motherboard being required every time a new one comes along. That rather kicks any CPU upgrading possibilities into the long grass.

  • That's a funny line you wrote there: "sheds the last vestiges of negativity". Microsoft has dumped Itanium since 2008 R2, Red Hat has dumped Itanium for RHEL 6, and the only things left are niche markets for HP-UX (market share plummeting as you read this, being eaten alive by IBM PPC / Z series), OpenVMS (well hello mid 1980s and early 90s), and NonStop (neat in its day too, but again IBM is eating its lunch). The ship Itanic continues to auger into the ocean floor.
    • Oh, please. Unlike Power, which if you look at the sales numbers has had an awful few months, Itanium is doing pretty damn well. $4bn to $5bn with reliable growth (feeding off Oracle's neglect of M-series SPARC) is a good place to be. Additionally, if Poulson came out today, it would probably be the fastest processor in the world (4-6x the raw performance of the Itanium 9300 should put it slightly ahead of Power7).
      • Additionally, if Poulson came out today, it would probably be the fastest processor in the world (4-6x the raw performance of the Itanium 9300 should put it slightly ahead of Power7).

        Ok. So you are telling me that today, Power7 is almost 4-6x the raw performance of Itanium 9300? But if we wait until the end of 2012 or early 2013 when Poulson ships, it will be slightly ahead of today's Power7?

        Are you trying to help or hurt Itanium with this info?

        • Poulson is likely to ship Q1 of 2012, shortly before IBM's Power7+ refresh is likely. It'll be competitive enough, especially if P7+ is a shitty refresh like P6+ was. (It had slightly improved power characteristics and no performance enhancements.)
          • Why do you think Poulson will ship in Q1 of 2012? I don't remember if any Itaniums have shipped on time, but I definitely remember multiple not shipping on time.

            Here's a recent refresher: Tukwila specs were released in early 2008 with an initial ship estimate of Q4 2008; it actually shipped in Q1 2010.
      • You're ignoring the other Power-derived thing, the mainframes (which HP Integrity tries to compete with), so we're talking IBM's $5 + $5 billion vs. Itanium's $5 billion. Oh please indeed, Itanic is going down the crapper.
  • It may also be the Itanium that fully redeems the brand name and sheds the last vestiges of negativity that have dogged the chip since it launched ten years ago. Poulson incorporates a number of advances in its record-breaking 3.1 Billion transistors. It's socket-compatible with the older Tukwila processors and offers up to eight cores and 54MB of on-die memory.

    That is so ridiculous that it is not funny.

    Biggest complaint about Itanic was always the absence of cheap versions, something companies can put on engineers' desks.

    Seeing what people do around AMD64 architecture, I doubt Itanic would ever become mainstream - it would remain forever a pet platform of HP's service unit. Similar to IBM's POWER: something sufficiently incompatible so that customers can't migrate overnight to competitor's platform.

    • It may also be the Itanium that fully redeems the brand name and sheds the last vestiges of negativity that have dogged the chip since it launched ten years ago. Poulson incorporates a number of advances in its record-breaking 3.1 Billion transistors. It's socket-compatible with the older Tukwila processors and offers up to eight cores and 54MB of on-die memory.

      That is so ridiculous that it is not funny.

      Biggest complaint about Itanic was always the absence of cheap versions, something companies can put on engineers' desks.

      Seeing what people do around AMD64 architecture, I doubt Itanic would ever become mainstream - it would remain forever a pet platform of HP's service unit. Similar to IBM's POWER: something sufficiently incompatible so that customers can't migrate overnight to competitor's platform.

      Yes, HPUX is the only major OS for the platform in the West - but (so I've heard from Intel sales and engineering folks) Japan (specifically Fujitsu) buys a lot of these things. So do some major companies in the US - but they also write their own applications/OSes for the platform (ie, they're NOT running HPUX on it).

      • Yeah, outside the US, Itanium is big on mainframes. Fujitsu, NEC, Bull, and (I think) Hitachi all run proprietary mainframe OS's on IA64, and at least in their home countries they do a pretty good business.
  • Diving beneath the settling chip fab, the Itanium ran quivering along its keel; but turning under water, swiftly shot to the surface again, far off the other bow, but within a few yards of HP's boat, where, for a time, it lay quiescent.

    "I turn my body from the sun. What ho, Tashtego! let me hear thy hammer. Oh! ye three unsurrendered spires of mine; thou uncracked keel; and only god-bullied hull; thou firm deck, and haughty helm, and Pole-pointed prow,- death- glorious chip fab! must ye then perish, and
