Alpha 21364 EV7 Specs Released 174
Jon Carroll writes " HP has revealed its Alpha roadmap today at RDF, and the schedule goes as previously planned. The Alpha 21364 (EV7), based on a 0.18 micron process, is due to ship by the end of this year, with the EV79, based on a 0.13 micron SOI process, up next. EV7 will run at 1.2GHz while EV79 will run at 1.6GHz. The Alpha 21364 EV7 chip will have 152M transistors, 1.75MB of integrated on-die L2 cache, 32GB/s of network bandwidth, and an integrated RDRAM memory controller with 8 channels providing up to 12.8GB/s of memory bandwidth. "
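The summary's 12.8GB/s figure can be sanity-checked with a little arithmetic. The sketch below assumes the 8 channels are PC800 RDRAM, 16 bits wide at 800 mega-transfers per second each; those per-channel figures are my assumption, not something stated in the story.

```python
# Sanity check on the story's memory-bandwidth figure, assuming (not stated
# in the story) that the 8 channels are PC800 RDRAM: 16 bits wide, 800 MT/s.
channels = 8
bits_per_channel = 16
transfers_per_sec = 800e6

bytes_per_sec = channels * (bits_per_channel / 8) * transfers_per_sec
print(bytes_per_sec / 1e9)   # -> 12.8 (GB/s), matching the story's figure
```

Under those assumptions the 12.8GB/s number falls out exactly: eight 1.6GB/s channels.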
Yay (Score:2, Funny)
Re:Q.E.D. (Score:1, Offtopic)
alpha still lives? (Score:3, Interesting)
Re:alpha still lives? (Score:4, Informative)
Re:alpha still lives? (Score:1)
Re:alpha still lives? (Score:1)
Will they continue the Alpha line or will this be where it ends?
You forget Samsung (Score:2)
Re:alpha still lives? (Score:2)
THE ALPHA IS DEAD! LONG LIVE THE ALPHA!
barf, RDRAM (Score:1)
Re:barf, RDRAM (Score:3, Informative)
Not really Re:barf, RDRAM (Score:3, Informative)
Peter
Re:barf, RDRAM (Score:1)
I'm no fan of RDRAM though. Not that I necessarily dislike the technology, but the tactics.
Re:barf, RDRAM (Score:5, Informative)
It does in a PC, where they only put in two 16-bit channels, so you need two accesses to each bank to fetch the full 64-bit bus width (it's serialization).
In Alpha, there's no serialization. You've got an eight-channel configuration (16 bits each, unless they use the newer 32-bit-wide parts?), which means the channels are 128 bits wide in aggregate. To get the same performance from DDR, you'd need a bus that's 1024 bits wide or something like that, which is not practical...
I don't like RAMBUS at all, but the industry has to come up with something faster because it's clearly the fastest on platforms where it's used correctly (I don't include the current PC in that category).
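The parent's "1024-bit wide or something like that" can be checked with a back-of-envelope calculation. The figures below are assumptions for illustration: PC800 RDRAM at 800 MT/s per 16-bit channel versus a single DDR-266 (PC2100) bus.

```python
# Back-of-envelope bandwidth comparison. All figures are illustrative
# assumptions: PC800 RDRAM (16-bit channels, 800 MT/s) vs. DDR-266.
RDRAM_CHANNEL_BITS = 16
RDRAM_MTS = 800           # mega-transfers per second (PC800)
CHANNELS = 8
DDR_MTS = 266             # DDR-266 / PC2100

per_channel_mb = RDRAM_CHANNEL_BITS / 8 * RDRAM_MTS   # 1600 MB/s per channel
aggregate_mb = per_channel_mb * CHANNELS              # 12800 MB/s = 12.8 GB/s

# Bus width (in bits) a single DDR-266 bus would need to match that aggregate
ddr_bits_needed = aggregate_mb / DDR_MTS * 8
print(round(ddr_bits_needed))                         # -> 385
```

Under these assumptions a matching DDR bus comes out closer to 400 bits than 1024, but the underlying point stands either way: matching eight narrow, fast RDRAM channels with one parallel DDR bus means an impractically wide bus.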
Re:barf, RDRAM (Score:3, Informative)
And back to your point about the economics of RDRAM: there is money out there that will pay a premium for performance scalability (at least when combined with reliability). About 11 percent of all servers command as much as 60 percent of all server revenue. [eetimes.com]
I just wonder how it'll stack up performance-wise on this chart [theinquirer.net] versus Power4 and Itanium2.
But the main reason I suspect one would buy one of these is because you want binary compatibility with all your old high-performance Alpha code that you invested so many man-years in.
--LP
Re:barf, RDRAM (Score:2)
How sad... (Score:5, Insightful)
Alpha is brilliant; too bad it didn't receive the development and marketing dollars it deserved. Compaq should be ashamed.
Thank goodness AMD is here to take up the slack with Hammer! =)
Re:How sad... (Score:1)
Re:How sad... (Score:1)
Re:How sad... (Score:1)
Itanium is going to take years to reach critical mass. x86-64 is going to be eating its lunch for quite a while, especially if Intel doesn't hurry up and make a version suitable for desktop/workstation use.
Re:How sad... (Score:2, Insightful)
To take up the slack? How could a glorified x86 chip with a broken/inefficient instruction set possibly be better than a chip with a new, from-scratch architecture?
Re:How sad... (Score:5, Insightful)
Well, you have the x86 with basically all the market forces behind it driving huge R&D budgets... that's how the x86 managed to slam MIPS, SPARC, POWER, and pretty much all the other RISC chips. It doesn't matter that you are basically sticking solid rockets onto a large, not-so-aerodynamic brick. It flies.
That's the past. Now, in the present, we have the same market forces behind the x86, and a stunningly bizarre new creature called IA64, which may not be the poster child for "broken/inefficient", but is clearly a great one for "will drive compiler writers over the brink into the spinning abyss of madness". It is definitely stunningly hard to write things for, that's for sure. More so, in most cases, than figuring out what "RISCops" your x86 instructions are broken into, where they are shoved, how long that takes, and what a better set would be.
Intel will send you the IA64 instruction set manuals for free. Go take a peek... if your mind is strong. Or if you don't mind a bit of gibbering.
The Hammer is NOT a good thing... (Score:2, Interesting)
So the x86 architecture/instruction set still has a great deal of commonality with the Altairs running CP/M.
The 'x86' architecture was only intended to be used for a few years. IBM first extended it from the Altair (8085, 64k) to the PC (8086/8, 1M). The popularity of the PC led to the decision to extend the PC to the AT (80286, 16M). After that, IBM decided that the architecture needed replacement and tried to kill it. IBM created an entirely new, superior architecture, complete with a new, superior OS (the PS/2 and OS/2).
This failed miserably. (In no small part due to the fact that it was a 'closed' architecture -- just like the Macintosh.)
Instead, most of the world chose to stay with the 'x86' architecture (and the more economical clones), maintain backwards compatibility, and deal with its limitations. (I won't say flaws, because the original architecture was never meant to be extended this far to begin with. Of course, that was back with the 8080 and 8085, 64k (max) memory, the Altair, and CP/M.)
And now, the x86 architecture is one extension upon another, finally arriving at the monstrosity we know today.
The Hammer (and Intel's 64-bit extension to the Pentium... NOT the Itanium) will be yet another generation of an architecture originally intended to handle no more than 64k of memory.
It's sick; the best comparison I can think of: if the 'x86' architecture is bare hands, the only tools we have are gloved hands with speed/power assist. No wheel, no lever -- just hands.
The sooner we kill the x86 architecture, the better. It was ancient 15 years ago. Humanity gave up horses and slaves in favor of automobiles and machinery. We can give up the old x86 architecture for something better. Maintaining it is inhumane.
But getting Intel, AMD, and others to cooperate (and share valuable, patented technologies with each other) is like asking Microsoft to GPL the source for Windows.
Re:The Hammer is NOT a good thing... (Score:3, Insightful)
Now, the problems with the x86 instruction set that have been embellished out of proportion have basically all been taken care of over the years, but the bashing still continues. It continues mostly from people who aren't aware the problems have been fixed, because they are simply bashers for the sake of bashing.
One deficiency of the x86 was its integer register set: it was too small, only 8 general-purpose integer registers, and in some of the more complex instructions, only specific GPRs could be used. This has been taken care of by the x86-64 instruction set: they doubled the register set to 16, and those registers are truly general-purpose. Then there is the weirdness of the stack-based floating-point unit: again, this has been taken care of, because floating point now goes through SSE, which uses random-access floating-point registers rather than a stack. Still, there were some advantages to the stack-based FPU, such as some of the complex floating-point instructions you got with it: tangents, sines, cosines, logarithms, etc.
Now, your knowledge of PC history is woefully inaccurate. When IBM tried to make its power grab with the PS/2 hardware and OS/2 software architectures, it wasn't trying to get rid of the x86 instruction set. On the contrary, it was getting much deeper into x86 than at any point in its past.
With the PS/2, it tried to move away from the ISA bus towards a new-generation bus called MCA, without accommodating the existing ISA bus; the shift away from ISA wouldn't successfully take place until many years later with the introduction of the PCI bus, which maintained backwards compatibility with ISA. PCI was successful because it allowed a gradual transition away from ISA; MCA, on the other hand, tried to force everyone to switch away completely, all at once.
The OS/2 operating system was a similar story: it actually tried to use the x86 architecture in greater depth than any OS before it, by using the 286's new "Protected" operating mode, which gave access to much larger amounts of memory. The only problem was that the 286's Protected mode was not yet full-featured; it was more of a running experiment, and it wouldn't become truly useful until the 386 came along and added all kinds of features to Protected mode that allowed for greater flexibility and backwards compatibility at the same time.
Re:The Hammer is NOT a good thing... (Score:1)
PCI is not compatible in any sense with ISA. I think you're thinking of EISA, which was the industry response to IBM's attempt to corner the market by patenting various aspects of MCA so that no one else could make compatible devices or systems.
Another thing he got wrong was that the 8086 chip was not backwards compatible with the 8080 line. It was similar in architecture (limited non-orthogonal register set, awkward instruction set), and there were 8080 -> 8086 cross-assemblers (sometimes producing more than one 8086 instruction for each 8080 instruction), but it wasn't backwards compatible in the same way as 8086 -> 80186/286/386/486/Pentium were.
Wow, a whole 16 general-purpose registers, my heart flutters. Bah, might as well use a 64-bit extension to the 6502.
Re:The Hammer is NOT a good thing... (Score:1)
Re:The Hammer is NOT a good thing... (Score:2)
Funny... I seem to remember most RISC processors I've known (or designed) to have at least 32 GPR's.
Besides... The point of moving away from CISC is so a processor doesn't use over 1/2 its transistors just to decode the instruction. The instruction decode section of the pipeline shouldn't be the single most complex part; unfortunately on a CISC processor, that's where ~50% of the transistors are.
I'm also fully aware of the 'evolution' of PC architecture. I've been programming x86 asm for quite a while as well. Many of the x86's ways of doing things (even the modern ones) are just... inelegant (or ugly).
Re:The Hammer is NOT a good thing... (Score:4, Insightful)
The sooner we kill the x86 architecture, the better. It was ancient 15 years ago. Humanity gave up horses and slaves in favor of automobiles and machinery. We can give up the old x86 architecture for something better. Maintaining it is inhumane.
This is a silly argument, for two reasons.
First, almost all programmers can (thankfully) ignore the underlying instruction set and program in a higher level language - therefore it is irrelevant. x86-64 is actually quite an improvement over IA32 regardless.
Second, if an instruction set is sufficiently efficient to allow the processor to be the fastest microprocessor in the world, it can't be so bad - can it? If my information is correct, Hammer and Opteron will debut with absolutely world-class performance. This isn't so surprising, given that many ex-Alpha engineers are working on it.
Backwards compatibility is simply a nice bonus, which will be crucial in Hammer attaining critical mass quickly.
Time to pick up some AMD stock!!! =)
Re:The Hammer is NOT a good thing... (Score:1)
The main thing wrong with x86 backwards compatibility isn't that the machine code is awkward; you're absolutely right that if you can make it run fast, who cares? One problem is that it makes it difficult to run fast, so it would run faster without the cruft. However, the biggest problem is that it encourages manufacturers to continue producing machines that are basically the same crap as we've always had. IDE, lousy serial ports, the parallel port for gosh sake, the same lousy BIOS architecture, ISA ports and IRQs. The PC world really needs to take the plunge the way Apple did - use a decent boot architecture (hey, maybe they could use Open Firmware!), drop serial ports, go to FireWire/USB. Apple's only mistake was justifiable, going to IDE (due to the ridiculous price differential between SCSI/IDE drives, which was due to a self-perpetuating cycle of being more expensive because it wasn't as widely used).
Re:The Hammer is NOT a good thing... (Score:1)
Hey before you get too high on your anti-establishment high-horse, check some of those facts first.
IDE is now in use by high-end server/workstation makers like Sun Microsystems and HP (PA-RISC), who use it in their personal workstation lines, because it makes absolute sense, both in terms of economics and performance, to do so. A workstation just requires a boot disk and maybe some personal storage space, and IDE does this extremely well. Most large-scale data storage can and should be handled by network-attached or SAN-attached storage devices.
I don't know what you're complaining about with those IRQs; all systems in the world have something similar in concept to IRQs. And in fact, most systems throughout the world are now standardized on PCI, so they use the same IRQ mechanism as PCs.
And what about them "lousy serial ports"? That's absolutely essential in maintaining control over large groups of Unix servers. Their consoles are invariably serial-port based. They do have nice modern GUI consoles, but when it comes to stacking them into a server room and controlling them all from a single input/output source, nothing beats the simplicity of a serial console tty device. And since they're X Window or Java based, you can simply do all of your graphical stuff from the comfort of your own PC logging in remotely, but the local administration can be done over non-graphical serial ports.
Re:The Hammer is NOT a good thing... (Score:2)
Exactly true. Although the number and arrangement of the interrupts may be different. I would prefer not to think of how dog slow computers would be if they had to actively poll system devices (from video cards to keyboards). It's sooo much nicer to use an interrupt system.
And what about them "lousy serial ports"? That's absolutely essential in maintaining control over large groups of Unix servers. Their consoles are invariably serial-port based. They do have nice modern GUI consoles, but when it comes to stacking them into a server room and controlling them all from a single input/output source, nothing beats the simplicity of a serial console tty device. And since they're X Window or Java based, you can simply do all of your graphical stuff from the comfort of your own PC logging in remotely, but the local administration can be done over non-graphical serial ports.
While not arguing this point in the least, I will say one thing: The way the serial ports are set up on the x86 is a bit messy. The Unix boxen I've worked with had a more elegant system for serial ports. (Although most of them also didn't have the same backwards-compatibility problems x86 has).
Re:The Hammer is NOT a good thing... (Score:2)
Oooh! A higher level language!!!
So is BASIC! And you can get it for any platform and your code will run.
Whoopee! It's still dog slow and takes up more resources than is necessary to get the job done. Even compiled (C) code usually runs several times slower and requires more memory than assembler.
Second, if an instruction set is sufficiently efficient to allow the processor to be the fastest microprocessor in the world,
First, an instruction set has little to do with the speed of the processor. The whole CISC vs. RISC thing has more than shown that. An instruction set has more to do with the difficulty and/or complexity of the processor's design. The CISC instruction set requires more (electrical) power, and more transistors to do the same job.
Second, it's to be the fastest in the world? By what method is this measured? Clock speed? Size of the pipeline? Number of pipelines? Clocks per (integer, float, or instruction)?
The hammer isn't even meant to compete with workstation processors in terms of speed. I'll take a SPARC or Itanium any day. (It's a sad thing that so many seem to forget that the Itanium is an HP design, the successor to its PA-RISC, and that newer versions of the Itanium will include many of the Alpha's technologies).
Re:The Hammer is NOT a good thing... (Score:2)
Oooh! A higher level language!!!
So is BASIC! And you can get it for any platform and your code will run.
Whoopee! It's still dog slow and takes up more resources than is necessary to get the job done. Even compiled (C) code usually runs several times slower and requires more memory than assembler.
You've just proven you have no practical knowledge of software development. Far less than 1% of desktop/workstation/server software is programmed in assembler. Perhaps the inner loop of some game engines might be, but I doubt even that in most cases.
One of the main points of developing faster processors with large amounts of memory was to enable the use of more programmer-friendly languages. It is simply not worth the cost to develop systems of any size in assembly.
Finally, if you think C code "usually" runs several times slower than assembler, you're just plain out to lunch.
Second, if an instruction set is sufficiently efficient to allow the processor to be the fastest microprocessor in the world,
First, an instruction set has little to do with the speed of the processor. The whole CISC vs. RISC thing has more than shown that. An instruction set has more to do with the difficulty and/or complexity of the processor's design. The CISC instruction set requires more (electrical) power, and more transistors to do the same job.
The instruction set (and associated issues like register count) certainly does have an effect on speed. Next!
Second, it's to be the fastest in the world? By what method is this measured? Clock speed? Size of the pipeline? Number of pipelines? Clocks per (integer, float, or instruction)?
I'll settle for SPEC2000 benchmarks. You know, real world codes optimized to the hilt for the target processor.
The first Hammer is supposed to debut at a PR 3400, and the first Opteron with a PR 4000. Multiply the current Athlon SPEC scores by the ratios of the PR numbers...that should give you a good idea of what's to come.
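The extrapolation the parent suggests is simple to sketch. The baseline Athlon score and PR rating below are hypothetical placeholders for illustration, not measured SPEC results.

```python
# Rough SPEC extrapolation from AMD "PR" model-number ratios, as the parent
# suggests. The baseline numbers are hypothetical, not real SPEC submissions.
current_pr = 2200        # assumed current Athlon XP rating
current_specint = 700    # assumed SPECint2000 score, for illustration only

targets = (3400, 4000)   # first Hammer / first Opteron PR targets per the parent
estimates = {pr: round(current_specint * pr / current_pr) for pr in targets}
print(estimates)         # -> {3400: 1082, 4000: 1273}
```

The estimate scales the known score linearly by the PR ratio; whether PR ratings actually track SPEC linearly is, of course, part of what's being argued.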
The hammer isn't even meant to compete with workstation processors in terms of speed. I'll take a SPARC or Itanium any day.
You are absolutely incorrect. First off, Athlon MP and Xeon are already workstation solutions, albeit 32-bit.
Secondly, the Opteron versions of Hammer (with dual memory controllers, more than 2-way capability and large cache) are squarely aimed at high-end workstation and server applications, up to at least 8-way. Do some homework and you'll see this to be the case. Dell recently announced that it's skipping Itanium 2, and evaluating Hammer/Opteron.
(It's a sad thing that so many seem to forget that the Itanium is an HP design, the successor to its PA-RISC, and that newer versions of the Itanium will include many of the Alpha's technologies).
It is a dual HP + Intel design, and so far it has been a colossal dud by anyone's measure. With poor backwards compatibility and anemic performance, it is very vulnerable to Hammer, if AMD can pull it off. So far Hammer is looking great! Working silicon has been demoed, and things look on track for a 4Q release of the first Athlon-64 (the desktop Hammer). Opteron will follow in 1Q 2003.
(BTW, it'll be interesting to see how Intel spins the low clock speeds of Itanium. THAT will require some chutzpah! I hope AMD nails Intel on that score.)
Re:The Hammer is NOT a good thing... (Score:2)
If desktop computers accounted for more than a tiny fraction of the whole computer market, I might actually care about that statement. Fortunately, the vast majority of computers are embedded systems, and a substantial portion of embedded code is pure asm.
One of the main points of developing faster processors with large amounts of memory was to enable the use of more programmer-friendly languages. It is simply not worth the cost to develop systems of any size in assembly.
No, that's the software designer's point of view. The hardware designer's point of view is to maintain the performance of software written by overworked programmers who don't have the time to do it right.
Finally, if you think C code "usually" runs several times slower than assembler, you're just plain out to lunch.
First off, C code does execute several times slower than assembler. On the order of 5-10x is typical. Compilers really aren't that wonderful.
Re:The Hammer is NOT a good thing... (Score:2)
Er, wait a sec. This discussion was about Hammer vis-à-vis Itanium, SPARC, etc. Remember?
None of them are aimed at the embedded market.
No, that's the software designer's point of view. The hardware designer's point of view is to maintain the performance of software written by overworked programmers who don't have the time to do it right.
Don't worry. Even with the widespread use of high level languages, computers can do far more now than a few years ago - or are you claiming you could run Quake 3 on a 386, if it were written in assembly? ;-)
First off, C code does execute several times slower than assembler. On the order of 5-10x is typical. Compilers really aren't that wonderful.
Except for pathological cases, C code will run a few percent slower than hand-tuned assembler - if that. As I said, out to lunch...
Re:The Hammer is NOT a good thing... (Score:2)
But all are (or will be used) in embedded system design anyway, so that's where my train of thought was leading. The Hammer mainly has 'momentum' going for it. Just about everything else is against it.
The Hammer is a design a few orders of magnitude more complex than anything else ever attempted. The engineers at DEC dropped the VAX processors and designed the Alpha to avoid the same complexity issues the Hammer is trying to tackle.
First, it uses the x86 set, which has both more instructions, and more complexity (some would say features) per instruction than a pure RISC processor. About half the Athlon's design is just to decode the instructions it's given. After decode from x86 into its internal RISC structure, it then schedules the pipeline, and finally actually sends the data into the appropriate pipeline for execution. There is a huge amount of overhead just to decode what needs to be done.
Pure RISC designs use about 15% of the chip's transistors for decode, and that's if you include pipeline scheduling.
This is the crux of the problem for AMD's hammer. The hammer will be forced to use a much larger transistor count than its RISC competitors. The higher transistor count results in several problems: It's far more complex and expensive to design. It takes a more complicated and expensive process to fab. The die is larger, which results in a slower processor. And it uses more power.
Which means that while AMD may have some momentum going for it, the Hammer is far more costly to design and produce than its competition. This will make things very hard for AMD, especially if Intel is able to use its considerably greater resources to get computer makers to move from x86 to IA-64 at the same time they move from 32-bit to 64-bit.
And since HP/Compaq, Dell, Gateway, Micron, and IBM have all thrown in with IA-64... Things look grim for the Hammer.
The good thing is that I would bet that the RISC back end to the hammer is designed so it can be mated with an IA-64 interface should the x86-interface core not take off.
or are you claiming you could run Quake 3 on a 386, if it were written in assembly?
Not exactly a fair comparison, given that 3D acceleration was a rather expensive solution back in the days of the '386, mostly used for military flight sims. The accelerator was about the size of a refrigerator and connected to the 'host' computer, which was usually a SPARCstation @ 33 MHz. (At least in the case of sims made by Evans & Sutherland, which had the market for them pretty much cornered.)
However, I wouldn't be too surprised if the non-graphics portions of Q3A ran fairly well (but not great) on a '386 (if it had a '387 FPU as well).
I'll say this: without a dedicated 3D card, it would take a Power4 module to tackle Q3A at max settings. (Of course, a Power4 module isn't a single processor; it's 8 processor cores, each roughly analogous to a PowerPC, and just one of the current Power4 cores outruns AMD's 'best-case' specs for the Hammer, which is still in development.)
Except for pathological cases, C code will run a few percent slower than hand-tuned assembler
More the opposite; except for pathological cases, C code runs a few hundred percent slower than assembler. (Although on the IA-64 architecture this is not necessarily true, as it relies entirely on the compiler to explicitly state operation order. The IA-64 does not re-order operations, or do any pipeline scheduling, at all, which is one of the primary reasons the Itanium runs x86 code so slowly.)
And the IA-64 arch. is about the only one out there where a C-compiled program stands a fair chance against pure asm, since it requires the pipeline scheduling to be explicitly stated by the programmer, which is an extremely difficult task for mere mortals.
Frankly, I'm not about to argue any further. The asm vs. C/compiled fight is older than vi vs. emacs, except that 'vi vs. emacs' doesn't have much impact on the speed of the programs written. I know from my own experience how much faster assembler is than compiled languages. I stand by my numbers. So do all the hardware engineers I know, including a couple who have Ph.D.s in compiler design.
C is great because it compiles well and is cross-platform. Asm doesn't require THAT much more development time than C does... but asm is so device-specific that unless you're writing a driver or embedded software, the advantage of C's more portable nature outweighs asm's speed.
I mean... think of what Carmack would give up if he wrote his graphics engines in asm: NO cross-platform capability, a nightmare interfacing with graphics card drivers, and almost no flexibility in the graphics engine.
And since the advantages of id's graphics engines have always been broad platform and hardware support and extreme engine flexibility, he'd lose a significant part of his market if he wrote them in asm. Plus, other companies would have graphics engines that, while somewhat slower, would be available far sooner than an asm implementation.
Re:The Hammer is NOT a good thing... (Score:2)
They are years (possibly decades) away from widespread use as embedded processors. Given their capability and memory sizes, again, far less than 1% will be programmed in assembler. Most likely, Java will be the dominant embedded language by then.
I don't have time to rehash this entire argument, but I will touch on one point:
This is the crux of the problem for AMD's hammer. The hammer will be forced to use a much larger transistor count than its RISC competitors. The higher transistor count results in several problems: It's far more complex and expensive to design. It takes a more complicated and expensive process to fab. The die is larger, which results in a slower processor. And it uses more power.
Hammer has a substantially smaller die than the P4, its main competitor. Itanic, er, Itanium, not only has a large die, but is priced with extreme margins. It's an easy target for Hammer. There is the issue of OEM support, but if Hammer meets spec it will be in high demand.
More the opposite; except for pathological cases, C code runs a few hundred percent slower than assembler.
Why don't you go to the Usenet group comp.compilers and state that "Except for pathological cases, C code runs a few hundred percent slower than assembler".
The resulting blood bath should be amusing. ;-)
Let me know if you do it, I want to watch...
Re:The Hammer is NOT a good thing... (Score:2)
Again, that's comparing a fab tech that is in the near future to one that's been in use for over a year. Not a fair comparison by any means. It's like saying that the Athlon has a smaller die than the K6. Completely different chip generations.
And since both Intel and AMD are working together (with about every other semiconductor maker) on researching new fab techs, you can bet Intel will have the same fab tech as the Hammer. (I do know the Itanium II uses a 0.09 micron fab tech, which is unprecedented for the scale.)
Supposedly the Itanium was (more or less) a rushed release (similar to the PowerPC G4). The Itanium II seems to have improved by a few orders of magnitude in efficiency, as well as speed. For that matter, the PowerPC G5 (which is not being rushed out) specs about 2x faster than IBM's Power4 core.
And, remember, as I said before, the Itanium is currently targeted at the workstation/high-end server market, NOT the PC market. When I say workstation, I mean the "ultra-high performance, ultra-high stability (and typically ultra-high cost)" market. The Itanium is priced similarly to its primary competitors in that arena: UltraSPARC, Power, Alpha, and PA-RISC. The first-gen Itanium is not (and was never intended to be) anywhere near your local consumer electronics store (or your local system builder, for that matter).
The Athlon MP is not real workstation-class by any stretch of the imagination. No competent engineer even trusts the architecture with critical tasks. I have yet to see anybody design computer hardware (or vehicles, or perform complex simulations, scientific calculations, or true enterprise-level work) on x86 hardware. The hardware, while cheap, still crashes far, far too often... it doesn't have anywhere near as good a memory (and system) architecture... the list goes on and on.
The reason PCs are used for 'render farms' is that they're so cheap. If a computer crashes, they just have to reboot it and re-render the current frame (losing only a few hours' work at most, and even then in a relatively non-critical task).
To be short: Sun dominates the workstation market, followed by HP, IBM, and SGI. None of their workstations (with the exception of SGI's lowest-cost graphics workstations) run x86. That's over 95% of the workstation market.
There is the issue of OEM support, but if Hammer meets spec it will be in high demand.
Unquestionably. However, that doesn't mean it will be successful. AMD once made the world's most popular RISC processor (hands down). It literally blew everything else away in terms of sales. AMD discontinued production because, in spite of very high demand for the hardware, they couldn't come close to competing with the other architectures (or, to be more specific, although hardware makers loved it, nobody wrote software for it).
If the Hammer isn't compatible with IA-64 compiled binaries, then AMD will have to fund the development of Hammer-compiled versions, as software developers, following the money, will support IA-64 first. AMD has done this in the past already, but had to give up because it wasn't profitable. (Not coincidentally, it's the same RISC processor that was in such high demand that was the source of this headache).
Why don't you go to the Usenet group comp.compilers and state that "Except for pathological cases, C code runs a few hundred percent slower than assembler".
The resulting blood bath should be amusing.
As I said, the assembler vs. compiler fight is quite long-running. Stating the asm side of the argument in a compiler newsgroup would get a similar response to a Windows user extolling the virtues of WinXP in a Mac (or Linux) group.
And, unsurprisingly, stating that C is anywhere near as efficient as pure asm in an assembly newsgroup would be a bloodbath as well.
The main argument for using C is that it is generally faster software development, and generates code that is 'acceptable'.
Pure asm takes more time to develop, but results in significantly tighter/faster code. The L4 microkernel is a great example of this: the C implementation is much slower than the asm implementation.
But, unsurprisingly, the C implementation is a bit easier to work with.
HP did some research a while back (2-3 years) with software optimisation. They discovered a few interesting things: They could 'emulate' (using full architecture emulation) compiled programs with ~5-15% greater performance than running the same binary natively. (The emulator was emulating the PA-RISC architecture, and ran on top of PA-RISC hardware, so the test was conducted on the same machine) The emulator was capable of making up for inefficiencies the compiler added into the code. In spite of the (large) overhead of the emulation, the program still ran faster while emulated.
While it has yet to really see more than tech demo releases, the Amiga OS4 technologies are quite similar: They are able to run the exact same binary on multiple platforms (PowerPC, x86, IA-64, SPARC, and MIPS) with no drop in performance (compared to natively-compiled versions of the same code). Again, this is due to (current) compiler problems. (PS- I'm not an Amiga fan per se, but I do admire how well-engineered they were for their day)
Another good example is the speed difference between different compilers on the same platform. If they compiled to anything remotely close to the speed of asm, then there wouldn't be a 15-20% speed difference between a highly specialized compiler (such as Intel's) versus a more generic (cross-platform) compiler (such as gcc).
Don't get me wrong: There's nothing really wrong with using a compiled (or interpreted) language. There are very definite benefits to their use (development and maintenance time being primary considerations). Compiled languages are acceptably fast, and compilers are getting steadily better.
But I doubt we'll ever see more than a fraction of embedded devices use a more high-level language. A price difference of $0.01 adds up to real money (and reduced cost) in commercial production runs. In addition, even where compiled languages are used, the resulting code is still de-compiled and the results scrutinized closely. (Which isn't much different than just writing the whole thing in asm anyway.)
Re:The Hammer is NOT a good thing... (Score:2)
That's not too surprising... I'd say the figure is about right. With as large an instruction decode stage as an x86 (or any CISC) has, changing from 32 to 64 bits isn't going to change the size of the chip much. (The 64-bit extensions, from what I understand, do not add more than a couple of instructions; it simply reuses the ones it already has. Hence the decode stage won't grow much.)
The thing is, the decode stage takes up so much of the overall die (and number of transistors, etc.) in any CISC processor that even sweeping changes in the remainder of the chip will result in a nearly identical die size.
That being said, the actual RISC processing core (of the Hammer) is significantly larger than the K7's RISC core. (On the order of 20-30%). It's just that the decode stage is so huge that it hardly makes any difference.
Why do you think people are so excited about Hammer?
A couple of things: First, there is a significantly large anti-Intel crowd. (Not surprisingly, they're also anti-Microsoft). So any upcoming non-Intel chip is exciting to them.
My feelings as to 'why AMD?' comes down to a simple factor: Price. AMD chips are loved by so many because they're cheap x86-compatibles (games being a key factor). If Apple hardware were similarly priced, and had the game market that x86 offers, Apple (and PowerPC) would be a favorite.
Processors can be related to cars fairly well, as long as you forget about being compatible with Windows for a moment; and frankly, as far as I'm concerned, the programs that run on it don't make a difference to the actual hardware.
The Hammer is akin to a pickup truck: A fairly inexpensive, medium-quality vehicle. It's loved because it does its job at a bargain price. It's utilitarian. It's the 'people's truck', and is affordable to most of the population.
Workstation processors (such as Power, SPARC, Alpha, PA-RISC, Itanium) are compared to a semi-truck (Kenworth, International, Caterpillar): they don't necessarily go any faster, but they can tow huge cargos; the corresponding rise in cost, however, is far from linear.
And Apple (PowerPC) processors are BMWs or Audis: they don't really run any better (or worse) than a pickup truck-- but they're higher-quality 'luxury' cars, and give a better ride. You pay for the quality and experience, though.
And, basically, there are a lot of people who are perfectly happy with their pickup truck. They're not about to pay more (at a very uneven scale) for more performance of a semi-truck, nor do they care for the luxury of a BMW.
(And, the Itanium isn't as great as the other workstation processors, but it's also the only 1st gen chip in the bunch; The 1st gen SPARC, Power, and PA-RISC processors weren't wonderful either.)
The Itanium also has one major problem with regard to die size: it's binary compatible with both x86 and PA-RISC processors; meaning that while the pure IA-64 architecture part of the chip is smaller than the Hammer, it then has the circuitry to decode x86 (which is a huge number of transistors, and hence, huge die area), PA-RISC (a much simpler/smaller addition to the x86 decode), and the IA-64's own VLIW decode.
If the Hammer had three separate instruction decoders (one CISC, one RISC, one VLIW), then it would have a huge die area too. But the Hammer has one (CISC). And even the Athlons would be half their current size if they were pure RISC rather than CISC. (Of course, they wouldn't be x86 compatible then, but that's markets for ya.)
The 64-bit extensions don't comprise an entirely new instruction set, primarily because they're just that: extensions. The Hammer's mechanism to extend from 32 to 64 bits is identical to the way the '386 extended from 16 to 32 bits. (This is from AMD's data). The '386 also added a couple more instructions (and registers) to the '286 design. That doesn't make an entirely different instruction set and/or decode.
Re:The Hammer is NOT a good thing... (Score:2)
Actually, it's primarily because Intel pushed better fab processes into production earlier than the RISC crowd, of whom only Motorola & IBM fab their own.
The Alpha was making a run for this crown, and for the longest time it was the only horse in the race; then, seemingly out of nowhere, Intel and AMD both overhauled the Alpha as if it weren't there.
Never underestimate the damaging effects of a corporate sale. When DEC was split between Intel and Compaq (well before the 1 GHz barrier), it was the death knell for the Alpha-- there was simply too much disruption in the shift of companies. (Not to mention the fact that many of Alpha's engineers wanted nothing to do with Intel or Compaq, so they left.) Neither AMD nor Intel was bought out, as DEC was. And AMD even ended up with some of Alpha's engineers!
That leaves the whole category of heavy-haul trucks unanswered by x86 at the moment. But what distinguishes a heavy-haul truck from a pickup? The ability to pull large loads. Is that all achieved by the truck's engine? No! Large trucks have incredible 18-speed transmissions, and stiff chassis, etc. In other words it's the overall package that distinguishes a heavy-hauler from a pickup... [it] describes a similar approach to how you distinguish a RISC processor-based (heavy haul) server from a PC (pickup) processor-based one.
So how's this got anything to do about Hammer?
Easy... Architecture. As you say, the engine is only a small (but significant) part of the entire package that makes the distinction. The rest is the architecture around which the engine is built. Frankly, even though there have been many improvements to the x86 design (primarily by eliminating ISA and replacing it with PCI/AGP), it still has its problems, which is why it will never be a true replacement for high-end workstations and servers.
Well, what it leads to is that Hammer has been designed right from the start to be everything from a car engine, to a pickup engine, to a heavy haul engine. That's because of its various features, such as Hypertransport, and onboard DRAM controller.
If it were designed from the ground up, it wouldn't be x86 compatible; not, at least, if the designers wanted a truly great processor. Rather, AMD hopes to ride the x86-compatibility market and is therefore adapting a phenomenal RISC core to the pre-existing x86 set. It's like bolting a jet engine on a farm tractor.
Hypertransport (as well as a built-in DRAM controller) is only useful on multiprocessor systems. (I'm not downplaying their usefulness at all.) The onboard DRAM controller allows each processor to have its own separate memory (whereas many, including the IA-64, share the same memory through the system bus). Combined with the increased multiprocessing efficiency Hypertransport offers, the Hammer processor line seems to be clearly designed for multiprocessor systems. (Hypertransport and onboard DRAM don't provide any real benefit to a single-processor system.)
It will be great for companies that want to upgrade their x86 server hardware, but want to keep their old software. It'll do great in the 3D animation and rendering studios, many of whom use a Unix-like OS anyway. But for the general desktop machine, there will be only one CPU, robbing the user of the benefits Hypertransport and the onboard DRAM module give.
One key here is that Hypertransport is not unique to the Hammer; SUN, HP, Motorola, SGI and Apple are all members of Hypertransport consortium, and intend to incorporate it into their processor designs.
The primary benefit of an onboard DRAM controller per chip (no longer sharing the same memory pool via a bus) is already implemented on other architectures by using multiple DRAM controllers.
My argument all along was that the Hammer isn't a good thing because it:
Keeps the paleolithic x86 architecture.
Could operate far faster if its RISC core didn't adapt itself to x86
We would be better off junking the x86 architecture sooner than later.
The Hammer, while an excellent x86 design, seeks to make the transition 'later', if at all.
Most of the responses I've seen are remarkably similar to a PC fan's reasons why they don't want to switch to a better machine than x86 can provide: they're cheap (the machines, although the same could be said of a few users). The actual reasons given for the Hammer's 'superiority' are in no way particular to the Hammer, and are found on many of its competitors' drawing boards as well.
And outside the Free software world, where the software typically only requires a recompile, the Hammer faces some serious, possibly fatal obstacles once 64-bit compiled commercial packages begin to replace the older 32-bit code. The commercial reality is that to be successful, the Hammer has to have natively-compiled 64-bit code (in Windows). To do this, it has to have developers who will support Hammer/64 in addition to IA-64. They'll have to either sell two different versions (somewhat similar to the sales of Mac vs. PC, or Win32 vs. x86 Linux games), or have both binaries in one package. Both are expensive propositions, and with Intel's virtually guaranteed market share, it may not be worth the effort to support Hammer.
For a brief history on AMD and binary incompatibility-- Jim Turley, a CPU/Architecture analyst, said the following: "Backing Intel's newest and heavily promoted next-generation architecture is a foregone conclusion for vendors that want to stay in business. Supporting AMD becomes more problematic. Will the added market share be worth the effort? Suddenly AMD finds itself in the same boat as Apple with a different, yet competitive, product that requires dedicated software support to survive.
Grimly, AMD itself lived through this tragedy not so many years ago, and the wound was self-inflicted. AMD unceremoniously axed its entire 29000 family, one of the most popular RISC processors of the early 1990s, due to the cost of software support. The company decommissioned the second-best-selling RISC in the world because subsidizing the independent software developers was sapping all the profits from 29K chip sales. As "successful" as it was, AMD had to abandon the 29K, the only original CPU architecture it ever created. " (emphasis added)
I'm not saying that the Hammer isn't a good processor.
I'm saying that it's putting a jet engine in a 1940's John Deere tractor. I'm saying the mechanic should dump the tractor, and put a jet engine in an aircraft-- not an ancient, over-extended farm tool. The tractor could still do its job, but it's just such a waste of the engine's potential.
I'm sorry, but the x86 instruction set is old and inefficient; it doesn't allow compilers or programmers to access a modern CPU's (including the Hammer) features-- So the Hammer has to deal with the limits inherited from the x86 set.
IA-64 allows explicit branch/pipeline ordering and load optimization; this allows the compiler's larger view to create code that keeps all the pipelines busy.
As all branch/pipeline and load optimization is done in the compiler, there is much more time to find the most optimal instruction order and path. (Fractions of nanoseconds vs. seconds/minutes/hours)
An instruction set (such as IA-64) capable of direct access to branch ordering, or a greater number of registers is more powerful, in that it allows for developers (directly, or via a compiler) to 'take the time' and resources to find the most optimal/efficient way to use the processor's full capabilities.
x86/Hammer does not allow explicit branch/pipeline ordering or load optimization, as x86 was purely single-pipeline until the first Pentium. (Although technically x87 is another pipeline, it served an entirely different purpose... the branching I speak of is of two or more identical pipelines)
As a result, the (Pentium, Athlon, K6, Hammer) must look at its instruction cache, and from that (very limited) amount of information, attempt to optimize the branch/pipelines and provide load-balancing. Time is extremely limited (to fractions of nanoseconds), as are resources to perform any re-ordering. But as time is limited, it frequently executes a suboptimal route and/or order.
Even though the Hammer has all kinds of ultra-modern features and resources, nearly all of them are inaccessible to the programmer/compiler; while the built-in management of these features/resources is quite good, it is also far from perfect (having a far more limited scope than a compiler does, after all) Cycles that could have been put to good use end up being wasted.
Lastly, I'll say that I'm not so much a fan of the IA-64 as I am of the VLIW concept; Non-VLIW processors (Sparc, Power, Alpha) have the same pipeline scheduling concerns as the Hammer. But at least they offer greater access to the processor's resources (such as double or more the accessible GP registers of 64-bit Hammer).
Re:The Hammer is NOT a good thing... (Score:2)
Interesting side note: One reason the Alpha does so well is that the physical design is very closely tuned to its fab process.
And a question: Do you mean a greater number of pipelines, or more pipeline stages?
I ask because more pipeline stages doesn't really increase speed very much (ie. there can be one instruction in each pipeline stage, but as each instruction takes one clock to move to the next stage, there isn't any improvement in speed.) In fact, shorter pipelines are often faster, as they don't have as much potential for stage bubbles.
A stage conflict is when, for example, you have a 5-stage pipeline. Instruction A comes immediately before B. However, instruction B requires that A finish the entire pipeline before it can begin executing. So instruction B has to wait 4 more cycles before it can execute (instruction A must finish, which essentially clears out the pipeline). A 10-stage pipeline would leave B waiting 9 extra cycles.
Out-of-order execution can help keep the pipeline busy with other tasks while B is waiting to be executed; but it doesn't always work out.
Additional pipelines (which is what I think you meant) is adding a second (or third, fourth...) identical pipeline, so that tasks unrelated to the A,B instructions (above) can be executed as well. Again, out-of-order execution helps keep things busy, but not always.
Which comes to the nice thing about VLIW design: the compiler (or, in the case of VLIW, the masochistic asm coder) is able to take a larger look at the program than is possible in a non-VLIW design (which, AFAIK for the mass-produced chips, is everything except the Crusoe and Itanium). And that results in a more efficient run than having the hardware attempt to do it.
Of course, as far as design complexity goes, I'm not entirely sure which is easier to design: the out-of-order prediction chip, or a VLIW chip. I tend to believe the VLIW chip is more complex in design.
Re:The Hammer is NOT a good thing... (Score:2)
This argument seems to be more a Rambus vs. DDR thing; and even then on commodity boxen. But I digress. In both cases there is currently an off-chip memory controller. The big reason for the difference in latency is not the controller itself, but the (completely different) methods of transferring data. Rambus uses a serial data transfer, which is easy to scale up (in terms of speed and bandwidth), but has higher latency. DDR is an older, parallel technology. DDR has lower latency, but has lower bandwidth and is much harder to scale up. This is primarily because of electromagnetic crosstalk (and other E&M interference problems) within DDR's (parallel) data paths.
There is a point of diminishing returns with the low latencies DDR offers; that point is frequently reached on high-performance computers (workstations, scientific processing, and high-end servers) where bandwidth is the key factor. When you're transferring a few GB of memory, who cares that it takes a few microseconds longer to start receiving data-- overall, the entire transfer (from request to completion) takes much less time. Even Wintel boxen are beginning to reach this point.
Personally, I wonder how RAMBUS even got a patent. I don't see how a serial memory bus is 'non-obvious to the trade's practitioners'. But, that's the USPTO for you.
Another major problem is the physical distance to (as well as speed of) DRAM. Silicon technology has already reached the point where a signal often travels faster through logic gates (such as an off-CPU controller) than it does through wire. So long as the memory controller is physically located between the DRAM and the CPU, there is little chance there will be any performance drop. At current CPU speeds, it takes 2-3 clock cycles for any signal to even reach the DRAM (even light-speed is slow at 1 GHz). Then it takes several more before the DRAM addresses and returns data. Then another 2-3 clock cycles before it gets back to the CPU. An off-CPU DRAM controller may or may not take an additional cycle. For large (sequentially addressed) memory transfers, this one cycle is a one-shot deal. Even with millions of tiny, single-byte (randomly selected) transfers, there are only a million or so extra clock cycles 'burned up'. This would result in a performance drop of 0.05% on a 2GHz CPU. (And less as speeds increase.)
As for Hypertransport, the idea behind that is not just absolute performance increases, but also design flexibility. So the same chipset that serves as a PC chipset, may also be able to serve as an 8-way server chipset, with few design changes (perhaps by adding or subtracting a few more HTT channels).
This is true; but as I said, it only really makes things better for the multiprocessing crowd. Chip makers don't usually pass the costs of a higher-complexity/performance chip to the buyers of a lower-complexity chip. The SP chipset would be the hands-down highest-volume seller. An MP chipset based on the SP design would cost less than a wholly-redesigned MP chipset. This suits the MP buyers fine... but it doesn't give any benefit to the SP buyers. The benefit is to MP alone.
Even within a desktop environment, you can easily separate out shared PCI/AGP buses, into multiple switched PCI/AGP buses with Hypertransport underlying them.
You can, but why? For all intents and purposes, the PCI/AGP bus is essentially idle 100% of the time. (The times when it is used are more of a statistical anomaly than fact; a figment of the deranged observer's imagination.) Even in applications where there actually is heavy bus activity, the PCI/AGP bus is far from being saturated. There are cases (such as multiport gigabit ethernet cards) where any single PCI slot is unable to handle the load -- but the PCI bus itself still has massive amounts of idle bandwidth; it's just that it's not possible to transfer the data between the network card and the PCI bus fast enough. (Which is a limitation of PCI's component interface, but not of its bus.)
I've seen many servers that have multiple network interfaces, where each NIC saturates the PCI card slot. The actual PCI bus, however, is not saturated, and handles the full load of multiple saturated interfaces quite well.
In other words, it doesn't matter how wide the freeway is; the tollbooth (AKA the PCI Slot interface) is the bottleneck, and is the real limiter of performance. A HyperTransport-switched PCI bus would be like adding more lanes to a highway that has nearly no traffic on it. It doesn't change how fast you can drive. It's the long wait at the toll-booth at the on and off-ramps that is the speed problem.
Especially as on many motherboards, AGP and PCI are on entirely different buses, so heavy AGP usage (such as DoomIII, or 3D animation) doesn't even affect the PCI bus. For the desktop user, there is no benefit to such a scheme. Even a power-hungry gamer, using his AGP 8X card to its fullest potential, compiling XFree86, and hosting multiple P2P file transfers couldn't do much to dent the PCI bus's capabilities. It's other x86 problems that are most likely to cause speed drops; not PCI or AGP.
Only in ultra-high-end applications would there be a benefit.
But it's not all of the other players it has to worry about, just one player: Intel. Intel may be allowed to use HTT, but it's absolutely certain they would rather die than use their great competitor's designs.
That's completely untrue. In several aspects. First, the NIH (Not Invented Here) syndrome has burned just about everybody. No company that is too proud to use a technology that was NIH lasts long. The managers at Intel are not that stupid. But they aren't going to jump on the bandwagon and spend any money just yet; they'll wait until they see how the results fare on the market before they invest anything in HyperTransport. If it's in Intel's best interest, they'll use it. If not, they'll design an alternative. To call AMD their 'great competitor' is rather short-sighted as well. They're only the most major competitor in the x86 arena, and one with a minority of the market. That's the reality, whether you like it or not. And I like (and have recently bought) AMD processors.
All of the other players are small-fry in terms of volume compared to the x86 camp.
That is an entirely baseless statement. The x86 camp is extremely small in terms of the 'other players'. Or weren't you aware that approximately 0% of all computers use an x86 chip? AMD has a very small production volume; so small they don't even fab their own chips. The only major competitor that is fab'd in such small volumes is SPARC. But Power & PowerPC, Itanium, and even ARM processors are all fab'd in greater volumes than AMD's. Intel plans on abandoning x86 entirely; their Yamhill (Hammer-like) processor is a contingency plan, to 'steal the Hammer's thunder.'
HP has no need to use HTT in its processors, simply because it has no processors anymore
Patently false. HP's processor is the Itanium. (more below)
all of them (PA-RISC and Alpha) have been EOL'ed according their own roadmaps, so what are they going to use them for, Itanium?
Their roadmap EOLs the PA-RISC, but points straight to Itanium. The Itanium is 100% PA-RISC compatible (in addition to supporting x86 and its own architecture). It is the next-gen PA-RISC. They are only supporting the next couple of releases of PA-RISC to appease people who already have PA-RISC hardware and wish to upgrade the processors in their pre-existing machines. Alpha was acquired well after the Itanium was complete; a white elephant of sorts. It was never part of the plan. It's entirely likely that HP will include Alpha technologies in next-gen IA-64 chips. If there is customer demand (especially if it's from Itanium's co-designers at HP), HyperTransport will be included as well.
Anyways, the only RISC player that is likely to use HTT is Sun, and they will likely use it in their upcoming Opteron servers. It's likely that IBM and HP, in addition to Sun, all have Opteron plans secretly devised already.
Opteron is the Hammer's new brand name, and Sun will definitely not be using it.
Sun is 100% SPARC, has been for more than a decade, and they have no plans to abandon it. There is no such thing as an 'Opteron server' from Sun. Sun only sells SPARC boxen.
I already covered HP -- they're Itanium. Their roadmaps still point to it.
SGI's roadmap leads to Itanium for their workstations and servers. They will use Intel's answer to HyperTransport (whether it is HyperTransport or not)
IBM is all about their own Power and PowerPC processors, which have better SPECint and SPECfp scores than anything else to begin with.
It's likely that IBM has an Opteron-based PC and Windows.net server, but the Opteron won't be used in their high-end servers or workstations. IBM already scales well past the point where HyperTransport would be beneficial; and IBM is in the same boat as Intel: If it's worth their while, they'll either use or design an alternative for HyperTransport. But for IBM, it may be completely unnecessary to begin with.
Apple is likely to use HyperTransport, as they have a great deal of flexibility in what technologies are used in their machines. Apple is also a member of the HyperTransport consortium. Apple's market is definitely not a trivial one.
Which goes to show my point: Just because AMD's Opteron has great features, they are in no way unique to the Opteron. And its competitors have a better system architecture than x86 to boot.
Re:The Hammer is NOT a good thing... (Score:1)
From reading the AMD manuals, it looks like somebody wrote up a list of what sucks about x86 from an OS perspective and the design engineers did a damn good job at getting rid of just about everything on the list. There is still some nastiness, but it is a damn sight better than plain x86.
The x86-64 application view is dramatically cleaned up too.
And about damn time!
Re:The Hammer is NOT a good thing... (Score:1)
Those segments could have been put to some extremely good use in protected mode. They basically allowed you to have completely separate code and data segments which never overwrote each other, allowing some unprecedented levels of memory protection for applications-- not only from other apps, but from themselves. It also would have made the task of writing OSes easier, because the hardware itself could be employed to enforce protection.
In fact, the original Linux was written this way. Linus's original intention for Linux was to design an operating system to see how much of the Intel 386 architecture's features could be used. Obviously considering how quickly he got the kernel designed and running, the Intel architecture made his life very easy. This was in the pre-1.0 days, as of 1.0 and later they shifted to a more generic kernel that could be ported across platforms. But those early pre-1.0 kernels were extremely small and fast.
But I think any 64-bit OS can switch easily between "long" and "legacy" mode, right? So if there is a requirement to use VM86 mode, they can still do so by putting it into a legacy segment?
Too little to late (Score:3, Informative)
Re:Too little to late (Score:1, Interesting)
Re:Too little to late (Score:1)
Viva la PA-RISC!
Re:Too little to late (Score:1)
Re:Too little to late (Score:1)
Re:Too little to late (Score:3, Informative)
You're forgeting Samsung (Score:2)
I remeber when (Score:1)
Re:I remeber when (Score:2, Interesting)
I remember because I almost bought one with the special version of NT for the Alpha. They only cost a small amount more and ran like scalded dogs.
The only problem was that there was very little peripheral support and huge driver issues. But most NT stuff ran on them and ran real fast.
AMD is the bastard child of the Alpha.
Puto
Re:I remeber when (Score:1)
The 21064A was released in October 1993 at 275MHz, according to Bhandarkar, and went into the 3000/900 and 7000/700 systems in mid 1994. DEC didn't reach 300MHz until the release of the 21164 in Sept 1994, which reached systems in 1995. 500+MHz Alphas, such as the one I'm sitting at currently (which has not been booted into NT since about a week after I bought it, thanks to RedHat and more recently Debian), only came later.
The original Pentiums (60/66MHz) were released in 1993. They were at P6 by 1995.
So, timewise,
Intel 60 MHz vs. DEC 200 MHz
Intel 166 MHz vs. DEC 300 MHz
Phil
No relevance since HP admitted it will kill it (Score:5, Informative)
http://www.hp.com/hpinfo/newsroom/press/07may02
They are dropping Alpha and PA-RISC for Itanium... baaadddd move!!
Re:No relevance since HP admitted it will kill it (Score:3, Insightful)
AlphaServer systems will be focused on the Alpha installed base. - from the press release cited above.
But this also means that of the existing customers, probably only those who can't find another alternative soon will buy the new Alpha. Seems like kind of a harsh thing to do to the Alpha. If they (Compaq) had released this chip and then said that they were stopping the line, that would be one thing; but in this case, they're stopping the line before releasing the chip! This is certainly a bizarre move.
Re:No relevance since HP admitted it will kill it (Score:4, Informative)
Furthermore, they've got customers on Tru64 and VMS who have nowhere to move at the moment, but may need more grunt; they'll buy upgrades until they've ported VMS to Itanic and the Tru64 customers have migrated to HP-UX (or give up on the Digital->Compaq->HP fiasco in disgust and move to AIX or Solaris).
Bear in mind that until fairly recently Digital/Compaq were selling new VAX systems to customers who had VAX/VMS setups that worked just fine and no particular desire to upgrade.
Re:No relevance since HP admitted it will kill it (Score:2)
One thing you seem to have omitted. The Itanium is a joint HP-Intel processor. HP was intimately involved with the design of the Itanium, and intended it as a replacement for the PA-RISC from the beginning of the design. HP had better know-how in 64-bit RISC, and Intel had the fab facilities to produce the Itanium on a large (and more inexpensive) scale. The Itanium was never intended to be an x86 competitor. It was designed to replace the PA-RISC and to compete with MIPS & SPARC, among others. In fact, originally, the Itanium was supposed to be backwards/binary-compatible with the PA-RISC. (I'm not sure whether the final product actually IS, but I lost interest in the Itanium several years ago...)
It was merely hoped that one day the architecture the Itanium uses would finally replace the x86 architecture.
The Alpha, I suspect, is somewhat of a white elephant in HP's acquisition of Compaq, and I suspect we can expect to see many of the Alpha's technologies rolled into next-generation Itaniums.
The last Alpha? (Score:4, Funny)
*mniam* (Score:1)
Up to 256 GB of ECC memory
Over 51 GB/s aggregate internal bandwidth
4 MB or 8 MB of onboard ECC cache per CPU
Up to 224 PCI slots on 64 PCI buses
(the image in the linked news announcement links to this page [compaq.com] (www.compaq.com/alphaserver/index.html)).
Re:*mniam* (Score:1)
I put in those 224 PCI slots?
alphas and optimisation (Score:5, Interesting)
I was writing code for a simple matrix transform using the algorithm as follows:
for (a=0; a<100; a++) { for (b=0; b<100; b++) {
txarray[a][b] =
Using the Alpha libraries to do the transform instead got me a 10x boost in speed.
This was weird, as I didn't see how the above algorithm could be optimized... tearing apart the assembly, I saw:
for (a1=0; a1<100; a1=a1+10) { for b1.. { for (a=0; a<10; a++) { for (b...
Evidently they had optimised it so that reads and writes would occur from closely spaced regions of memory and less time would be spent writing.
The result? A 10x boost on a simple algorithm and a neat hack at the same time.
Just an example of how awesome the engineering of the Alpha was.
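The transformation described above is loop tiling. A minimal sketch in C, using a 100x100 transpose as the stand-in transform (the function names, the `double` element type, and the tile size of 10 are my assumptions for illustration, not the poster's actual code):

```c
#include <stddef.h>

#define N 100
#define T 10  /* tile size, assumed; the disassembly above steps by 10 */

/* Naive transpose: successive writes to dst[b][a] land N elements
 * apart, so nearly every write touches a different cache line. */
void transpose_naive(const double src[N][N], double dst[N][N]) {
    for (int a = 0; a < N; a++)
        for (int b = 0; b < N; b++)
            dst[b][a] = src[a][b];
}

/* Tiled transpose: process T x T blocks so reads and writes stay in
 * closely spaced regions of memory, as the Alpha library version did. */
void transpose_tiled(const double src[N][N], double dst[N][N]) {
    for (int a1 = 0; a1 < N; a1 += T)
        for (int b1 = 0; b1 < N; b1 += T)
            for (int a = a1; a < a1 + T; a++)
                for (int b = b1; b < b1 + T; b++)
                    dst[b][a] = src[a][b];
}
```

Both versions compute exactly the same result; the win is purely in cache behaviour, so the actual speedup depends on the array size and the cache line and cache sizes of the machine.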
Re:alphas and optimisation (Score:2, Insightful)
Re:alphas and optimisation (Score:5, Informative)
This principle can be seen in how the GIMP stores image data in tiles for rapid processing, in matrix math libraries, in the design of FFTW (The Fastest Fourier Transform in the West, www.fftw.org), and in many other systems.
Re:alphas and optimisation (Score:2)
tiling, not loop unrolling (Score:1, Informative)
Re:tiling, not loop unrolling (Score:2)
nice 64bit (Score:2)
I bet it could cane IA64 in SPECint, but the real test would be floating point, and to do IEEE 754 properly you need 64 bits, otherwise you end up emulating it.
Now we have the true 64-bit microprocessors:
Sun Microsystems - Processors [sun.com] which are a Sparc [sparc.org]
PA-RISC, which is MIPS-like,
and MIPS64 [mips.com], which I like a lot.
Of the ports of Linux to 64-bit there's Linux HPPA [debian.org], the oldie-but-goodie Linux Alpha [debian.org], and Linux sparc64 [debian.org], of course not forgetting Linux for IA64 [debian.org]. Unfortunately the Linux for MIPS [debian.org] port is not 64-bit, so if ever there was a challenge, as Linux is mostly 64-bit clean, it's to do a MIPS64 port.
Oh, and Intel won't like me mentioning Linux for Hammer [x86-64.org], which is not real 64-bit, it just has some 64-bit registers tacked on (but hey, you can do FP right).
Re:nice 64bit (Score:1, Interesting)
The really interesting thing is the parallels between IA64 and Alpha: they both sucked hard in their early variants. It is a little-known fact that the Alpha 21064 cost more than all other 64-bit CPUs at the time and performed far worse.
Oh, I should finish off by mentioning that the current IA64 tools for Linux suck donkey balls in the performance stakes; hopefully this is one area where serious optimisation will be made.
Re:nice 64bit (Score:1)
Show us the figures, or shut up.
The 21064, shipping in 1992, was level (Int & FP) with the PA-7150 from 1994, and level in SPECint92 with the PPC 604 from 1995, but thrashed the 604 in FP.
i.e. the 21064 was _2 years_ ahead of the field.
The 21064A, shipping in 1994, was superior in Int and FP to every processor by _every_ other RISC manufacturer before 1996, apart from MIPS's R8000 from the same year, which was better at FP but lousy at Int.
i.e. the 21064A was _nearly 2 years_ ahead of nearly the whole field.
My figures from MPR, from vendor SPEC releases.
Sure, they weren't cheap, but neither were Sparcs, or PAs.
The ding-dong battle between HP and DEC started after that, and basically DEC would spend 75% of the time top of the SPEC FP tables, but HP would always manage to throw a system into the #1 slot for about 25% of the time. Intel/AMD coming anywhere near either of those, and Power now actually beating it, are relatively new concepts considering the 10-year length of Alpha history.
FatPhil
Re:nice 64bit (Score:5, Interesting)
x86 processors have had 64-bit floating point registers (actually 80-bit) for as long as they have done native floating point. x86 does not have 64-bit integer registers; this has nothing to do with floating point.
The reason x86 has traditionally sucked at floating point is because the x87 floating point ISA only allows for a stack of 8 fp registers, instead of a flat set of 32 registers like most RISC architectures. This has been worked around to some degree in current x86 processors through the use of a flat virtual register set and good compilers, although there is only so much a compiler can do when it is limited to 8 target registers. Nowadays the continued leadership in SPECfp by 64-bit RISC chips is mostly due to higher memory bandwidth and particularly large L2/L3 caches which help a great deal with certain SPECfp subtests.
While not quite as high as its world-beating SPECint scores, the P4's SPECfp scores are still damn good, and would be even better if Intel would officially support PC1066 RDRAM (the current scores on spec.org are PC800 only). Put another way, they will be even better when Intel releases their dual-channel DDR chipset in a few months.
That said, EV7 will clearly have the SPECfp score to beat for quite some time. (Probably SPECint as well.) And Itanium2's SPECfp scores are reported to vault it well ahead of the also impressive Power4. But, again, this is all to do with higher DRAM bandwidth and larger caches, not with any inherent limitations of x86 for performing double-precision fp.
Re:nice 64bit (Score:2)
* 8088?
Re:nice 64bit (Score:3, Insightful)
No doubt. If it wasn't clear from my post: the fact that AMD and Intel can get almost equivalent single-CPU SPEC performance (and SPEC is oriented toward workstation/server/HPC workloads!) to the top 64-bit CPUs, despite maintaining backwards compatibility with a much uglier ISA and costing ~50x less, is a huge credit to their engineering teams. It's also pretty strong proof that the fitness of your ISA is much less important than the manufacturing process you use and the engineering resources you have.
And second, while it of course no longer matters that the P4/Athlon are backwards compatible with the 8086, it mattered hugely that the 286 was, and that the 386 was compatible with the 286, and so on. Tremendously. The immense size of the x86 backwards-compatible market has meant that Intel and AMD sell their CPUs in volumes large enough to make owning their own fabs (and keeping them on the cutting edge of process tech) worthwhile... which in turn is what has kept x86 performance so competitive (along with other effects from selling into such a huge market).
Re:nice 64bit (Score:1)
It's not the hardware that is holding back proper IEEE 754, but rather the compilers.
According to W. Kahan (one of the fathers of IEEE754) (see this link to PDF article [berkeley.edu])
"The widest precision that's not too slow on today's most nearly ubiquitous "Wintel" computers is not double (8 bytes wide, 53 sig. bits) but IEEE 754 double extended or long double (≥10 bytes wide, 64 sig. bits). This is the format in which all local scalar variables should be declared, in which all anonymous variables should be evaluated by default. C99 would permit this (not require it, alas), but ... Microsoft's compilers for Windows NT, 2000, ... disable that format.
Java disallows it.
Most ANSI C, C++ and Fortran compilers spurn it.
(Apple's SANE got it right for 680x0-based Macs, but lost it upon switching to Power Macs.)"
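Kahan's point can be seen from C, where `long double` maps to the x87 double-extended format (64 significand bits) on most x86 Unix compilers. A sketch, with the caveat that on MSVC and on many non-x86 targets `long double` is just a plain 53-bit double, so the extra precision is not guaranteed; the helper names are mine:

```c
#include <float.h>

/* DBL_MANT_DIG is 53 on any IEEE 754 machine; LDBL_MANT_DIG is 64
 * where long double is the x87 double-extended format. */

/* Returns nonzero if adding eps to x is visible at double precision.
 * volatile forces the sum to be stored at declared width, defeating
 * any wider intermediate the compiler might otherwise keep. */
static int visible_in_double(double x, double eps) {
    volatile double sum = x + eps;
    return sum != x;
}

static int visible_in_long_double(long double x, long double eps) {
    volatile long double sum = x + eps;
    return sum != x;
}
```

For example, 1.0 + 1e-18 rounds back to 1.0 in double (the half-ulp of 1.0 is about 1.1e-16), while x87 double extended (half-ulp about 5.4e-20) can still distinguish it; compilers that "spurn" the wide format throw away exactly this margin.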
Re:nice 64bit (Score:1)
Erm:
PA has a segmented memory architecture, MIPS doesn't. (OK they call it an 'address space identifier', but really it's a segment, as the virtual addressing modes are all 32-bit, unlike MIPS' 64bit.)
PA has packed decimal types, MIPS doesn't.
PA has variable bit-field data types, MIPS doesn't.
PA has 58 SP registers which can be paired for DP, MIPS has a flat DP FP register set.
Both have branch delay slot but PA has optional nullification, MIPS doesn't.
PA doesn't have a branch-likely extension, MIPS does.
PA has conditional moves, MIPS doesn't.
PA doesn't have a single-instruction divide, MIPS does.
MIPS-III may have filled in some of the above gaps, but I stopped looking at MIPS at the MIPS-II stage.
PA was a 'brainiac' design (lots per tick), MIPS was a 'speed demon' traditional RISC design (lots of ticks).
FatPhil
Such a shame.. (Score:1)
Works as a nice door stop too!
Re:Such a shame.. (Score:2)
Anyway, I have always loved the Alpha and wanted one since I was a boy. But after having one, and finding how poorly they are supported these days, I can't wait to get my dual Opteron system.
TestDrive (Score:2, Informative)
cpu Alpha
cpu model EV7
system variation Marvel/EV7
cycle frequency 800000000
BogoMIPS 2140.20
platform string Compaq AlphaServer ES80 7/800
cpus detected 2
cpus active 2
This has been restructured a bit to pass through the junk filter as well as condense it to the most important info.
0.18? (Score:1, Insightful)
0.13 should be capable of such a chip.
IBM already uses 0.13 for their POWER4.
One of the foundries like UMC or TSMC would be proud to produce the Alpha.
RDRAM? Bye-bye! (Score:2, Funny)
Missing feature (Score:5, Funny)
They should go all the way and integrate either one of these into the packaging:
Suddenly, Athlons seem mighty cool (literally).
Re:Missing feature (Score:2)
Re:Missing feature (Score:1)
GOD BLESS THE ALPHA! (Score:1, Interesting)
Once again, capitalism destroys superior technology, as the DECHPaq behemoth kills off all its own engineering masterpieces to appease Intel.
Long live Alpha, long live Alpha/x86 binary translation technology, long live PA-RISC, long live HP instrumentation and calculators. Long live control by the competent, rather than the short term profit-minded!
May Fiorina and Cappellas be given the softest of pillows to relieve their nightmares of guilt.
4th July? Why are we celebrating independence from a nation now no less free than our own?
About time. (Score:1)
News for Nerds. Stuff that matters.
Thank you chrisd, now please cover the stories on the Warcraft 3 Linux porting effort.
Re:Will it run... (Score:1)
Rumour has it that M$ ported NT to alpha to head off DEC from complaining about it ripping off VMS for NT.
Only if they support AlphaBIOS Re:Will it run... (Score:1)
Peter
Re:Will it run... (Score:3, Informative)
In reality NT does have some VMS-like features in the kernel, but it is *not* VMS. If it were, it would be a little slower and a BSOD would be strictly mythological.
Re:Will it run... (Score:2)
Digital did start a project to get VMS onto other architectures, namely MIPS and Intel, but they gave up even before the feasibility study was fully completed.
Re:Can you imagine a beowulf cluster of those? (Score:3, Funny)
Re:Can you imagine a beowulf cluster of those? (Score:2)
Hey, no need for distributed file systems, expensive high-speed ethernet, etc. It's just too bad Alpha never caught on.
And for all those that think it's dead, there are still other companies with vested interest in the Alpha.
Re:Can you imagine a beowulf cluster of those? (Score:2)
Re:Can you imagine a beowulf cluster of those? (Score:1)
Can you imagine a beowulf cluster of those? No (Score:1)
Re:Can you imagine a beowulf cluster of those? (Score:2)
Re:Can you imagine a beowulf cluster of those? (Score:2)
Our Alphas don't calculate much, they just run the biggest electronic futures and options market in the world (at least the production cluster does). Most of the backend code is even written in COBOL.
Re:Can you imagine a beowulf cluster of those? (Score:1)
Wanna donate some time to a distributed computing project?
(I'm one of the few prime-number nuts (Ernst Mayer being the other) who codes stuff for Alpha.)
FatPhil
Re:Can you imagine a beowulf cluster of those? (Score:2)
They are also a *long* way in connectivity terms away from the Internet as all trading by the members goes via a private WAN (better control of transaction times).
Re:Thoughts on the world (Score:2)
Re:Too bad: littlw Linux support. (Score:2)
Re:That's what happened (Score:1)