AMD Releases X86-64 Architecture Programmers Overview 199

AMD has released a manual in PDF format to help software developers migrate their code to its 64-bit Hammer microprocessor platform. The PDF document is here, and the Sandpile Web site gives excellent explanatory material on the content of the PDF file. Surprisingly, Sun supports the Sledge Hammer. The article can be found here.
This discussion has been archived. No new comments can be posted.

AMD releases Sledge Hammer Info

  • by Anonymous Coward
    It's inferior to Intel's VLIW approach.

    We don't need another patch on x86, we need a new arch. Intel's approach is superior, but it will take longer to perfect. AMD is counting on that delay to gain support for their inferior product.

    My.. How the tables have turned.
  • by Anonymous Coward on Thursday August 10, 2000 @12:36AM (#865865)
    Face it, with VLIW the onus for performance is on the compiler. And C is not the easiest language for VLIW compilers. VLIW compilers work best when global analysis can be performed. But global analysis is very difficult amid the untamed world of C pointers (aliasing problems).

    AMD's 64 bit processor is a better match for the current state of compiler technology, particularly Open Source compiler technology.
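
The aliasing problem mentioned above can be illustrated with a small model. This is only a sketch, written in Python rather than C: lists stand in for pointers, and both function names are made up for illustration. It shows why a compiler cannot freely hoist or reorder a load unless it can prove that two references never point at the same storage.

```python
def add_first_element(dst, src):
    """Reference version: re-reads src[0] on every iteration."""
    for i in range(len(dst)):
        dst[i] = dst[i] + src[0]
    return dst


def add_first_element_hoisted(dst, src):
    """'Optimized' version: loads src[0] once, before the loop.

    The reordering is only valid if dst and src never refer to the
    same storage, which is what the compiler would have to prove.
    """
    first = src[0]
    for i in range(len(dst)):
        dst[i] = dst[i] + first
    return dst


# Distinct buffers: both versions agree.
print(add_first_element([1, 2, 3], [1]))
print(add_first_element_hoisted([1, 2, 3], [1]))

# Aliased buffers: the hoisted version produces a different result.
a = [1, 2, 3]
b = [1, 2, 3]
print(add_first_element(a, a))
print(add_first_element_hoisted(b, b))
```

With distinct buffers the two functions are interchangeable; once the same list is passed twice, the first store changes the value the reference version reads next, so the "optimization" silently changes behavior.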

  • This is obviously a power move on the side of AMD. However, people that know about Sledgehammer also know about Intel's new 64-bit architecture. Though this might be of interest to some, the only people who really care are developers, and I'm sure they're going to wait for Intel's superior offering. Now that AMD's made the first move, it puts pressure on Intel to quicken the pace a bit. I think this is the first step to putting more powerful computers out onto the market. The next six months or so are going to see computing have a brand new facelift, what with the court cases with the RIAA, MPAA, this race for the 64-bit chip, and the massive popularity that Linux is gaining. These are exciting times, indeed.
  • by Anonymous Coward
    AMD's 64-bit chip wasn't really alien to big system integrators (IBM, Compaq), but with Sun's glowing support, it'll have enormous mindshare. Who knows, maybe Sun could be for Hammer what IBM was for Java: The Confirmation.
  • by ameoba ( 173803 ) on Thursday August 10, 2000 @12:42AM (#865868)
    Both AMD and Intel have impressive 64-bit architectures coming in the near future. I don't really know enough about either to say if one is technically superior to the other, but as we've seen time and time again, that's not going to be the determining factor. Since the two aren't compatible, they'll be forced to compete, and I don't think the market has room for both of these platforms.

    Technical merit won't really make a difference here; both Intel and AMD can make a good chip. It's going to come down to who gets to market first, who can buy the press coverage, and who's going to get the required software support.

    I have some vague memory of an Intel sponsored 64bit Linux port; If AMD expects to succeed, they should be doing the same.

    I can sum the whole post up in two words:

  • by Lion-O ( 81320 ) on Thursday August 10, 2000 @12:43AM (#865869)
    Intel then accused Sun of not doing enough work to port Solaris to its own 64-bit technology.

    Hmz, I call that very narrow-minded, or perhaps 'biased' would be the better word. As a matter of fact, Microsoft's software (Windows and other programs) sometimes doesn't support the processor as efficiently as it could either. The Pentium processor has been around for quite some time now (I'm referring to the plain Pentium (I?) btw), yet it took a lot of developers ages before any decent Pentium-based software came out. I debugged quite a bit of software back then just to see how Pentium-based software really differed from the rest, and I was quite surprised to make this discovery.

    Of course I'm not judging all the software out there, don't get me wrong, but if Windows in those days didn't use the processor to its full extent either, then I have a difficult time believing that it does so now (speculation, mind you). But why didn't Intel complain about that, while it still seems to complain about Sun?

  • Why, oh why, keep on with the assumption that the Intel product will be 'superior' when the companies are pretty much neck-and-neck these days?
  • by Anonymous Coward
    Face it, X86 is dead... Sun had an excuse to add a 64bit extension to their 32bit CPUs... The sparc arch wasn't hopelessly outdated. There is no such excuse for x86.

    AMD's design enhancements are very very very weak. They don't even add a non-stack based FPU (x86's WEAKEST POINT!).

    The only good things that AMD's new arch will do is: Make handling big address spaces less of a kludge (still a kludge), and allow x86es to perform DES cracking as fast as Alpha or Usparc (sideways).

    Other than that, it's mostly marketing fluff. While 64 bits might sound faster than 32 bits, it really isn't ('cept for a few apps like DES cracking). It's all dependent on the arch and design, and AMD is keeping the same crap arch.

    The reason compilers can't handle VLIW very well is because it hasn't been out there enough. The approach is superior, and it will perform like it sooner or later.

    By buying into AMD on this we are selling out our future.
  • by Idaho ( 12907 ) on Thursday August 10, 2000 @12:48AM (#865872)
    You obviously never read Tom's Hardware. Look at this article [tomshardware.com] describing what Tom thinks about Intel's IA64 architecture (Itanium, formerly Merced) in relation to AMD's SledgeHammer technology.

    Why do you think it's called SledgeHammer in the first place???

  • A huge amount of computer-based work is now done on Wintel instead of mainframes. Remember how many stability problems arose in Windows due to the 16-bit compatibility? OK, so it was not because of the bitness but the insecure memory model. Still, providing backwards compatibility to 32-bit software is going to happen with separate interfaces (what's the length of an 'int'?). Have some more DLL Hell? I wouldn't mind Microsoft dropping the backward compatibility altogether - heck, the world would probably choose to turn compatible with my OS - but somehow I imagine that won't happen :)

    Most of us would probably finally like to trash 16-bit compatibility, wouldn't we? Keep the old processors for old games and recompile what is worth it. Or just make it software-emulated and stop wasting die space. But there it remains, according to the PDF. It's just a tradition, not useful anymore.

    I was considering whether I should upgrade from Celerons to PIII's at home, when it hit me: wait another year or so, and invest into a 64-bit system. Worth it? In schedule? Time will tell. If only Abit released plans of Hammered mobos :) And when will the Transmeta 64-bit-Intel-compatible firmware upgrade come out?-)

  • by Killer Napkin ( 221026 ) on Thursday August 10, 2000 @12:51AM (#865874)
    Intel started that whole game with adding MMX Technology (R) to their chips. It made everyone think that, somehow, they were getting something entirely different with the Pentium chips because MMX was this magical add-on that no one else had.

    Suddenly Cyrix (you remember them, don't you?) chips started coming out with MX technology. AMD, if I remember correctly, started licensing MMX for a while and finally came out with 3DNow.

    Now every piece of hardware you buy comes with magical tweaks. Hard drives, video cards, and even motherboards -- they all come in brightly colored boxes with completely unrelated pictures on them, flaunting "Proprietary Brand Useless Features (R)". If you're going to blame someone for being lame, blame Intel. (Of course, I'd also blame Microsoft, who touts useless or standard features as being "innovative".)
  • by Idaho ( 12907 ) on Thursday August 10, 2000 @12:59AM (#865875)
    According to Tom's Hardware [tomshardware.com], the problem with using VLIW in complicated microprocessors is writing a fast compiler, which is really hard since instructions have to be executed concurrently (or, if this is impossible, you'll lose much performance). Read the article for further information.

    This is also why the Crusoe [transmeta.com] is not as fast as a Pentium III or Athlon (the Crusoe uses VLIW too). However, in the Crusoe's case it might not be a problem since Transmeta is targeting another market: PDAs, laptops and such, requiring low power usage and long battery life.

    Obviously, the Itanium (stupid name!) is targeted at the server market, where power consumption does not really matter (except when running into cooling problems, of course!), so it had better be fast.

    And until now it looks like it isn't! Little problem for Intel that has yet to be solved!
  • I don't mean to start a war over people's favorite processors, but Intel chips have consistently shown themselves to be faster than both PPC chips (Yes, I've run Photoshop. Altivec only works for some of those filters. Blatantly false advertising, in my opinion.) and AMD chips. I've run the benchmarks myself. And if you throw into the equation that Intel chips don't have nearly as many bugs (Pentium 60 chips aside) as AMD chips, you can understand why I expect a whole lot from the upcoming 64-bit architecture (as opposed to a revamped Athlon chip). It may be a longer wait, but I'm sure it'll be worth it.
  • Intel did the exact same thing.

    Anyways, this is a common practice in marketing now. Look at Apple. Everything begins with "Power" or "I." Companies like Compaq (iPaq) steal from this popularity. When nVidia started calling their 3D chips GPUs, so did everyone else.

    The rule is: If you can't steal the name, come up with something that's almost the same.

    You can't really just point your finger at AMD.
  • I don't understand why Intel and AMD don't start to work on non x86 based processors. The pentium IIIs and athlons already have far too much code to keep them backwards compatible. As far as software goes, Microsoft has already abandoned DOS totally, so why keep the processor architecture to run DOS.

    I think they should come up with a new design for a good 64 bit processor. Something that is RISC, but not worrying about x86.

    Make Microsoft port all their code for a change to the new processor. They've had it easy. Linux can run on countless platforms and would be easily ported.
  • I wouldn't mind Microsoft dropping the backward compatibility altogether...

    That's not going to happen, and there are a few reasons why. The biggest reason is the existing investment in software... a lot of companies waited (or are still waiting) to deploy Win2K, despite the fact that it will run almost 100% of the software that ran under 95/98 and contained security and other enhancements. If you completely axe backwards compatibility, hardware or software, people are not going to flock to replace existing installations.

    Or just make it software-emulated and stop wasting die space.

    Absolutely. Emulation would probably be able to handle 90% or more of current applications. The choice would then be to either keep around 32-bit machines for the software that needs it, and/or wait for new version built for the new platform.

  • by Idaho ( 12907 ) on Thursday August 10, 2000 @01:09AM (#865880)
    For the uninformed,

    VLIW means Very Long Instruction Word, which means that the processor can execute
    several instructions at one time (one Instruction Word can contain multiple instructions).

    This is a good idea because this way the processor does not have to worry about
    running instructions concurrently. Instead, the compiler has to worry about that,
    which makes it really hard to write a compiler that handles this right, while still
    optimizing as much as possible.

    Since the VLIW technology has not been tested very much in microprocessor designs,
    Intel is doing some really hard (maybe even innovative!) work, but that also means
    many problems/bugs/performance issues, as with almost every 'new' technology.

    VLIW is successfully used in (e.g.) DSP processors, but those have very simple
    instruction sets, where the order in which things are processed often does not
    really matter, so this is rather simple to implement.

    However, in a microprocessor it's much harder, which may be part of the reason why Intel's Itanium is delayed so much (performance problems, and problems running high clock frequencies).
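
The compiler's job described above can be sketched as a toy scheduler. This is a minimal illustration under stated assumptions, not how any real VLIW compiler works: instructions are modeled as hypothetical `(dest, sources)` register tuples, the bundle width of 3 is arbitrary, and the packer is greedy and keeps program order. An instruction may join the current bundle only if it does not read or overwrite a register written earlier in the same bundle.

```python
def pack_bundles(instructions, width=3):
    """Greedily pack (dest, sources) instructions into bundles.

    Instructions in one bundle are assumed to issue at the same time,
    so none of them may depend on a result produced in that bundle.
    """
    bundles = []
    current = []
    written = set()  # registers written by the bundle being built
    for dest, sources in instructions:
        reads_pending_result = any(src in written for src in sources)
        rewrites_register = dest in written
        bundle_full = len(current) == width
        if current and (bundle_full or reads_pending_result or rewrites_register):
            bundles.append(current)
            current, written = [], set()
        current.append((dest, sources))
        written.add(dest)
    if current:
        bundles.append(current)
    return bundles


program = [
    ("r1", ("r0",)),        # r1 = f(r0)
    ("r2", ("r0",)),        # r2 = g(r0), independent of r1
    ("r3", ("r1", "r2")),   # needs both earlier results
    ("r4", ("r3",)),        # needs r3
]
for bundle in pack_bundles(program):
    print(bundle)
```

Only the first two instructions can share a bundle here; the dependent chain forces mostly empty bundles, which is the lost performance the post refers to when concurrency cannot be found.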
  • by Psiren ( 6145 ) on Thursday August 10, 2000 @01:12AM (#865881)
    Okay, so assuming I buy a 64-bit x86 CPU and slap Linux on it, what difference will I see as an end user compared to my current 32-bit system? Ignoring clock speed differences, will it be massively faster? Will it be more stable? What *real* benefit do I get from choosing a 64-bit processor instead of a 32-bit one? The fact that it can address shitloads more memory doesn't help much. Who can afford over a gig on their home system anyway?
  • I wonder what happens when AMD and Intel each put out a 64-bit chip. Will Windows be out for each processor?
    If not, one of them is surely gonna beat the crap out of the other and we suddenly have a monopoly on CPUs again.
    I know at least that if they're not compatible I'll go for Intel as a standard.
    Does anyone have any info regarding this issue?


  • If Microsoft needs to use emulation if the new processor is so different, I think our friends at WINE [winehq.com] may get a visit...

    Think about how much windows code they can run with incomplete documentation. It's not that slow either.
  • by arivanov ( 12034 ) on Thursday August 10, 2000 @01:24AM (#865884) Homepage
    By the time I read the AMD PDF, Slashdot was already full of various "meaningful opinions". Whatever, as usual, people never change.

    I have to admit I am impressed:

    • AMD managed to get rid of quite a bit of legacy for the 64-bit mode. Most of the utterly idiotic segmented switching is gone. There is a well-defined supervisor mode and user mode. There is a SYSCALL instruction, so the mess of "jump to that magic offset to get promoted to OS level" is gone and OSes will have a nice clean API.
    • At the same time there is a reasonable amount of backwards compatibility. It is not without a cost, but the cost is way less than Itanic. Almost all idiotic x86isms (tm), like the x86 task switching mechanism, generate a GPF on the spot. So the OS will have to handle them in software. This means two things: the compatibility is relatively simple, and the compatibility will be easier to achieve for emulating 32-bit OSes where ring 0 is called within a well-defined API. Porting "the world of bypass backdoors" - NT on this will suck (even after AMD has broken their own rules and made exemptions for it). Porting all Unix OSes available for x86 will be a piece of cake.
    • Also, unless told to, it will do all calculations in 32 bit, so most of the favourite x86 legacy issues will have less impact. Also it has a reasonable number of registers: 8 more general-purpose 64-bit ones and 8 more 128-bit SIMD ones.
    Quite an interesting beast to play with; the question is whether it will have proper Linux and BSD support. Also, after reading the PDF it becomes clear why Sun supports this. This approach is very close to their uSparc/Sparc legacy scenario. And they have swum in that swamp for years (and are still swimming there). It is their CPU. It was born for them ;-)
  • by Tough Love ( 215404 ) on Thursday August 10, 2000 @01:25AM (#865885)
    Though this might be of interest to some, the only people who really care are developers, and I'm sure they're going to wait for Intel's superior offering.

    I'm not sure you're really in touch with what developers think. I'm a developer, and I'm intensely interested. AMD's K7 was a killer punch, 3D Now is pretty successful in spite of the odds, and now I'm in the mood to take a close look at *anything* AMD puts forward.
  • This reminds me of the time in the early 80's when you had a Commodore, an Apple, or MAYBE an IBM machine.

    With the two companies moving away from compatibility, it seems like we're moving back toward that era where there's no such thing as a PC as we know it.

    Aw hell, if only I could afford to put a Sun enterprise machine in at home. At least the competitors in that arena are incompatible all around - you can tell by looking at the case. In this case, there could be 2 Compaq machines (or Gateway etc.) that you can't install the same OS on... can't tell by looking though.

  • Why PDF? I'll tell you! PDF is the STANDARD format for online technical datasheets. Go to the website of ANY company that manufactures semiconductor products and try to get datasheets. They're all in PDF.
  • by dpilot ( 134227 ) on Thursday August 10, 2000 @01:49AM (#865888) Homepage Journal
    They see a steamroller called Itanium on the horizon. Sure, they have an X86 port. But that's more of an entry tool to make sure that customers grow up into their bread and butter, the Sparcs and the rest of the higher end.

    By endorsing Sledgehammer, Sun hopes (IMHO) to take some of the wind out of Itanium's sails, and make it less of an 'obvious' choice. Intel is becoming formidable as a systems house, and can challenge Sun in that role. AMD at the moment is merely a chip (including CPUs) vendor.

    On the side, I guess it's not so amazing that when Intel announces an upcoming 64-bit CPU, everyone starts planning on it being a success, even before details were known. Where it gets just slightly amazing is when bad news starts to leak out about Merced, how it's a dog, and "Just wait for McKinley!" Yep, we messed up this time, but just wait until next time.

    In spite of the flop of Merced a year before its introduction, and the uncertainty of developing decent compilers for VLIW, and the general dislike of Intel's quirky architectures, like X86...

    IA-64 is still branded one of the Winners in the 64 bit sweepstakes, and there's STILL no generally available hardware.
  • It should be quite a bit quicker (bigger busses, more bandwidth, less legacy idiocy), and you'll be able to handle bigger objects easily.

    It's not just how much RAM you have, it's the size of the files on your disc too. You have a 40GB hard disc, but the largest file you can mmap() is 2GB? That sux0rs. What if you're trying to edit a movie? Or open lots of large images in Gimp? The 2GB limit is becoming painful already.
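
The address-space arithmetic behind that complaint can be worked through in a few lines. This is a rough sketch with assumed numbers: the 2 GiB reserved for the kernel is an illustrative split, not a figure from the thread, and the function names are made up. The result is only an upper bound, since real processes also spend address space on code, stack, heap, and libraries, and real 64-bit chips may implement fewer than 64 virtual address bits.

```python
GIB = 2 ** 30


def usable_address_space(pointer_bits, reserved_bytes=0):
    """Upper bound on bytes a process can address with the given pointer size."""
    return 2 ** pointer_bits - reserved_bytes


def can_map_whole_file(file_size, pointer_bits, reserved_bytes=0):
    """True if the file could, at best, be mapped in one contiguous piece."""
    return file_size <= usable_address_space(pointer_bits, reserved_bytes)


# A file filling the 40 GB disc from the comment, with an assumed
# 2 GiB of the address space reserved for the kernel.
for bits in (32, 64):
    fits = can_map_whole_file(40 * GIB, bits, reserved_bytes=2 * GIB)
    print(f"{bits}-bit pointers: whole-file mapping possible = {fits}")
```

The point is that the limit comes from pointer width rather than from installed RAM: a 32-bit pointer can name at most 4 GiB of distinct bytes no matter how large the disc or the file is.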
  • People won't care about performance nearly as much as they do about compatibility. If all we wanted was a 64 bit architecture that was fast -it's already out there. It's called Alphas. So, this race between AMD and Intel is about which will Windows run on? It's too bad, though, since Linux has already proven that a good OS is capable of working anywhere. I would prefer to have more choices in the hardware world....
  • You've already got one.. It's called Alpha.
  • Speed will not necessarily be the most important factor, neither will be the tons of RAM that can be addressed.
    *BUT* you'll get rid of the 2GB file size limitation (ignoring that large-file-patch) of the 32bit-platforms. With an eye on video and some server applications that's a big plus.
  • by MobyDisk ( 75490 ) on Thursday August 10, 2000 @01:56AM (#865893) Homepage
    Intel is behind behind behind in IA-64. MS is releasing their latest compiler, MSVC 7, soon. No IA-64 support. GCC has no IA-64 support. What about drivers that need to be written in an assembly that no one knows? Normally, those things take time. With IA-64, it will take 5 times longer than usual due to the insane architecture difference. Intel will need to do much hand-holding to prevent alienating x86 developers.

    AMD has managed to address some of the major issues without causing major issues. The architecture is a logical next step, not a crazy throw-it-out-the-window change. Many people would love a re-engineering, since it really is time. But 8 more registers, 64-bit support, and a few tidy-ups and we are in business. With technology, simplicity wins most of the time since it is cheaper and easier.

    The big question is "Can AMD release this before Intel steals the idea and make their own version?" :)

    VHS won out because "The cartridge was smaller" thus easier. IA-64 is a behemoth, and the timetable doesn't look good.
  • Not that again...

    There are enough OSes on 32-bit CPUs that can handle large files. NT and FreeBSD, to name two.

    It is unbelievable that Linux still hasn't fixed this.
  • > Where it gets just slightly amazing is when bad news starts to leak out about Merced, how it's a dog, and "Just wait for McKinley!" Yep, we messed up this time, but just wait until next time.

    Yeah, that's a novel line to take in the mass market.

    Whoever (ahem!) innovated it first should have patented it as a business method.

  • If neither GCC nor VC++ has IA64 support, er, how exactly do you think Linux has been ported to IA64, and how is Microsoft porting Windows 2000? I think GCC and Linux for IA64 are available to the public already, though I could be wrong.
  • by Anonymous Coward
    GCC has no IA-64 support.

    Darn, then what are those files in my egcs-20000717/gcc/config/ia64 directory?

    Don't get me wrong, I like AMD, but Intel's working seriously hard to make sure they have compiler support.
  • 64-bit processors are overly expensive and will stay that way until market share provides enough economy of scale for the productions to be ramped up to make them cheap enough for the masses. It's all about acceptance. The Alpha isn't making it because it just lacks acceptance. While I'd agree it is better than IA-32, there are 2 reasons I'm not using Alpha. The first is Alpha is too expensive (but if it became accepted that might no longer be the case). The second is because it's not really all that great a CPU (as an example, some instructions, like divide, take way too long). If I had to do 64-bit today, it would more likely be UltraSparc. I look forward to Itanium, though I suspect Intel will try to rape the market with high prices on it for as long as they can.

  • There are enough OSes on 32-bit CPUs that can handle large files. NT and FreeBSD, to name two. It is unbelievable that Linux still hasn't fixed this.

    It would be unbelievable if it hadn't been fixed, as almost any reasonable size database would run into the end of the line fairly quickly. If you pay attention to the available patches for the kernels you'd know that quite a while ago there were patches to the kernel to support larger files - I think the name was the Large File Summit or LFS (not to be confused with Log-structured File Systems). Ah - the patches are available here [tmt.tele.fi].


    Toby Haynes

  • If 64 bit processing were actually important for most computing tasks, we'd see a much higher market penetration for current 64 bit platforms. UltraSPARC, Alpha, MIPS and POWER (not PowerPC) are all 64 bit processors backed by 64 bit operating systems, compilers and applications.

    For most tasks, 64 bit computing is not an obvious win.

    Sure databases and heavy duty scientific computing benefit from the vastly larger address space, but day to day gaming, office tasks, web surfing, mp3 listening and Natalie Portman porn viewing (I had to say it) won't benefit much at all.

    Now architectural improvements and (mainly) higher clock rates will mean serious improvements in peak performance, but I think Apple is on the right track with pervasive SMP, which should deliver higher SUSTAINED performance, so that I can do all of the tasks above at the same time without worrying that spikes in demand for processing power will freeze my game at some crucial point. Of course, MacOS isn't the ideal SMP platform, but the hardware is on the right track.

  • by Greyfox ( 87712 ) on Thursday August 10, 2000 @02:44AM (#865901) Homepage Journal
    I agree. Intel's putting a hell of a lot of effort into making sure that you have a lot of OS choices for the Itanium. They and SGI and IBM are collaborating on a heavily (and I mean HEAVILY) optimizing C compiler. Linux will already boot and run on the Itanium, and I'm sure Tarantella or whatever the commercial UNIX for the processor is is at least as advanced. Wouldn't surprise me if even Windows runs on it now.

    AMD's been putting out superior processors for about a year now, but if I can't run my software on them, those processors are going to go the way of all the other superior products that came and went before. If AMD's chips run 20% faster and pgcc makes my programs run 20% faster on Intel's hardware, the superior design of AMD's stuff is pretty much negated (Except that I'm not forced to get RAMBUS RAM with the Athlon.)

    In response to your betamax I have a few words of my own. Words like Amiga, NeXT, and Apple.

  • To quote the first paragraph of the introduction of the pdf: "The x86-64 architecture supports legacy 16-bit and 32-bit applications and operating systems without modification."

    There are 3 operating modes in the chip - a pure legacy mode (supporting realmode, v86, and protected mode) which will run if you're on a 16 or 32-bit OS; and two modes - true 64 and 16-32 compatibility (both supporting only protected mode, woohoo!) - that can be accessed if running an OS with their 64-bit extensions. It oughta be trivial for the OS's program loader to know if the app it's trying to run is 32-bit or 64-bit, so it will just have to know how to launch a program in either the 64-bit mode or the 32-bit compatibility mode.

    So with the software incompatibility nightmare out of the way, people can upgrade their hardware and sit around running their 'legacy OS' while waiting for a new OS that supports the wonderful 64-bit extensions. And who will be there first? I think we all know that answer and it's probably not even going to be Sun.

    The only thing that really sucks about this is that the byte order is still backwards Intel legacy crap. If only they could get around it through emulation.

    The big question is whether Microsoft will write two 64-bit OS's. If Win2k doesn't run in 64-bit mode on this thing it's going to be dead in the water --- I was going to say all the Linux boxes in the world changing to this chip wouldn't save it... but it probably would ;-). Of course, all it should take is a new OS loader which throws it into 64-bit mode and a few changes in the kernel so it can start up all the apps in 32-bit compatibility mode. So that'll take MS, what, maybe 2 years :-)
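
The loader decision described in this comment (launch an app in 64-bit mode or in 32-bit compatibility mode) could look roughly like the sketch below. It assumes ELF executables, which the comment does not specify: in ELF, the byte at index 4 of the header (`EI_CLASS`) is 1 for 32-bit and 2 for 64-bit objects. The function name and the returned mode strings are illustrative only, and a real loader would check far more than this one byte.

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4       # index of the class byte within the ELF identification
ELFCLASS32 = 1
ELFCLASS64 = 2


def pick_cpu_mode(header):
    """Choose the operating mode for an executable from its first bytes."""
    if len(header) <= EI_CLASS or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF image")
    elf_class = header[EI_CLASS]
    if elf_class == ELFCLASS64:
        return "64-bit mode"
    if elf_class == ELFCLASS32:
        return "compatibility mode"
    raise ValueError("unknown ELF class")


# Hypothetical headers: magic, class byte, then padding.
print(pick_cpu_mode(ELF_MAGIC + bytes([ELFCLASS64]) + bytes(11)))
print(pick_cpu_mode(ELF_MAGIC + bytes([ELFCLASS32]) + bytes(11)))
```

Because the width is recorded in the file itself, the loader needs no guesswork, which fits the comment's claim that the choice should be trivial.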

  • AMD managed to get rid of quite a bit of legacy for the 64-bit mode. Most of the utterly idiotic segmented switching is gone. There is a well-defined supervisor mode and user mode. There is a SYSCALL instruction, so the mess of "jump to that magic offset to get promoted to OS level" is gone and OSes will have a nice clean API.

    AMD has had SYSCALL since K6 days, and Intel has its own version (the two are incompatible). And believe it or not, Intel implemented it at the REQUEST of Microsoft!! (Don't know if Win* actually uses it or not...)

    Unfortunately the AMD implementation was seriously buggy. Reports say Intel's version is OK tho.

    Write your Own Operating System [FAQ]!

  • Wine doesn't do CPU emulation (OK, for the nitpickers, in some cases there's a tiny bit of mucking around).
    WINE is an implementation of the WIN32 API and others for Unix. It also has a loader for loading Windows executables into memory and doing a bit of translation on some of the addresses that the app might try to relocate itself to, to fit in with the way we have things set up on *nix.

    From there on in, it's basically the x86 processor executing x86 code. Any library calls to win32 are either handled by WINE's inbuilt WIN APIs or, at your choice, native Windows DLLs, which undergo the same treatment as the programs loaded into memory as described above.

    Basically, Wine on a PPC won't run x86 Windows apps. You'd need to run Wine compiled on an x86 under BOCHS or similar for this to work.

    If an app breaks under Windows with a move to a new 64-bit processor because of problems with some of its assembly code, there's little or nothing WINE can do about it, and it will fail under WINE too.
  • I know this is way-OT and I probably should post as AC but ...
    Why the hell is the sandpile site locked to a width of 1024 pixels?
    Even if my monitor was at 1024x768, it's awfully arrogant of them to think that I would maximise my browser just to read it.
    I'm getting close to patching junkbuster (http://junkbusters.com) to strip width= from the html!
    At least I have the zoom out option in Opera.
  • This same argument was made in favor of 16bit systems. See any modern 16bit systems?
  • there's one problem with that:

    wine == wine is not an emulator

    btw, merced/itanium/pentium4/whatever-this-week's-name-is can natively execute x86 code.
  • You get more and larger registers --- it will be faster because the less time spent moving data between memory and registers, the faster the app will be. When you want to multiply 2 numbers, they have to be pulled from memory and thrown into 2 registers and then operated on; then you get a result in one of those registers and you can do something else to it. Now scale that up to, say, an mp3 encoder which is doing millions of repetitive math operations and reusing results of previous calculations... and you hopefully get the picture.
    Granted, this isn't as huge of an issue now that we have chips with a meg of cache on board and busses that can do 4 gigabits/sec.... but every little bit helps! even a 1st-level cache hit takes longer than just using what's already in a register.
  • There are many PDF-viewing (and creation!) programs available --- you can use xdvi or gv to view PDFs as well. No need to use Adobe's crufty Motif-based quivering mount of hacks.

    (Last four words stolen from elsewhere :)
  • alpha is definitely NOT a VLIW cpu. also, VLIW is nothing new, intel had a VLIW chip out about ten years ago, which never caught on.

    btw, there's a reason no one else is making VLIW cpus. it's because it's an inherently flawed idea.
  • When you say that VLIW has not been tried out yet, you are speaking about the situation last year, when VLIW only existed in textbooks (where, incidentally, it has been for more than a decade). But with the successful launch of Transmeta's Crusoe, which does software emulation of the x86 on a VLIW processor, VLIW is here to stay. Incidentally, in my opinion, if we HAVE to have backward compatibility, why not have it in software and not on the die?
  • My guess would be that Sun supports it because it differentiates them from HP. Remember, the IA-64 architecture was jointly developed by HP and Intel. When 64 bit processors first come out, only the big boys (read high end server vendors) are going to care right away. Sun and HP are directly competing in that space, so I bet Sun is hoping to gain an advantage over HP.
  • Read the question - this has nothing to do with large filesystems and everything to do with address space. He's saying that you can't mmap() the large file that exists on the filesystem. And you CAN'T unless you have an address space that will hold the whole thing. NO 32-bit OS can do that if the file is over 2 Gigs, though there may be some trickery you can do to get that up to 3 Gigs. I haven't tried it on an Alpha... anyone out there mmap()ing 20 gig files on an Alpha?
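
One way to live within a small address space, and possibly part of the "trickery" hinted at above, is to map the file one window at a time instead of all at once. The sketch below is an assumption-laden illustration, not something proposed in the thread: `iter_windows` and `demo` are hypothetical names, and the demo uses a small temporary file rather than a multi-gigabyte one. It relies on Python's standard `mmap` module, whose `offset` argument must be a multiple of `mmap.ALLOCATIONGRANULARITY`.

```python
import mmap
import os
import tempfile


def iter_windows(path, window_size):
    """Yield the file's contents by mapping one window at a time."""
    gran = mmap.ALLOCATIONGRANULARITY
    # Round the window down to a multiple of the granularity (minimum one unit)
    # so that every offset passed to mmap is correctly aligned.
    window_size = max(gran, window_size - window_size % gran)
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            length = min(window_size, size - offset)
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as window:
                yield window[:]  # copy out before the mapping is closed
            offset += length


def demo():
    """Write a small temporary file and read it back through windows."""
    payload = os.urandom(2 * mmap.ALLOCATIONGRANULARITY + 100)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    try:
        chunks = list(iter_windows(tmp.name, mmap.ALLOCATIONGRANULARITY))
    finally:
        os.remove(tmp.name)
    return payload, chunks
```

Only one window occupies address space at any moment, so the file can be larger than what the process could map in a single piece. The cost is that the program, rather than the hardware, has to manage which part of the file is currently visible, which is exactly the inconvenience a 64-bit address space removes.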
  • I don't understand why Intel and AMD don't start to work on non x86 based processors. The pentium IIIs and athlons already have far too much code to keep them backwards compatible. As far as software goes, Microsoft has already abandoned DOS totally, so why keep the processor architecture to run DOS.

    Intel has done just this. It's called Itanium. Haven't you read a thing about it yet?

    Itanium is NOT an X86 processor. That's also a big part of why Intel keeps telling us that it's for the server market, not the desktop.

  • the only advantage other than more address space, is that 64bit integer ops will be atomic. that is, single instruction, instead of having to concatenate two 32bit values.
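
    To make the "two 32bit values" point concrete, here is a minimal sketch of what a 64-bit add looks like when only 32-bit operations are available: add the low halves, detect the carry, then add the high halves. The type and function names are made up for illustration.

    ```c
    #include <stdint.h>

    /* A 64-bit value held as two 32-bit halves, as a 32-bit CPU would see it. */
    typedef struct { uint32_t lo, hi; } u64_pair;

    u64_pair add64_via_32(u64_pair a, u64_pair b)
    {
        u64_pair r;
        r.lo = a.lo + b.lo;               /* wraps modulo 2^32 */
        uint32_t carry = (r.lo < a.lo);   /* 1 if the low add overflowed */
        r.hi = a.hi + b.hi + carry;       /* second add, plus the carry */
        return r;
    }
    ```

    On a 64-bit machine the same work is one add on a single register, which is the saving the comment describes.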
  • Umm... there is a 64-bit PowerPC backed by a 64-bit AIX that IBM sells... you need to do more research!
  • I wonder what happens when amd and intel give out a 64 bit chip each. Will windows be out for each processor? If not, one of them is surely gonna beat the crap out of the other and we suddenly have a monopoly on CPUs again. I know at least that if they're not compatible I'll go for Intel as a standard. Does anyone have any info regarding this issue?

    Picking "Intel as a standard" is a rather strange thing to do in this situation, as the AMD chip is expected to run all your current software natively, while the Intel will have to do tricks to make it run.

    This is a big reason why the Intel chip is targeted at server markets; to give the much broader desktop software markets more time to mature and adapt their programs to Intel's radical change. The AMD, on the other hand, should run x86 code natively faster than any other x86 chip, and could be scaled down from servers to desktop systems any time AMD wants to pound on Intel.

    Gives a little more meaning to the code name "SledgeHammer", doesn't it?

  • Well, anytime you are dealing with 64bit numbers you'll see a speed increase. So, if you have a file system larger than can fit in a 32bit address space, you'll get some performance improvement. This is good news for file servers and database servers.

    Also, the AMD 64bit CPU has twice as many registers as before, so that will help performance some. It would be a tremendous performance gain, but cache helps take some of the pain out of having to swap registers to memory.
  • Microsoft has already abandoned DOS totally

    But the embedded market hasn't. Neither have these guys [drdos.org]. And the FreeDOS Project [freedos.org] is even creating a GPL'd DOS clone.

    ( \
    XGNOME vs. KDE: the game! [8m.com]
  • The only thing that really sucks about this is that the byte order is still backwards Intel legacy crap.

    But is little-endian (LSB first) really backwards? That's how bytes are sent over a serial port (low bit first). Little-endian is also the standard on the popular 6502 and Z80 CPUs in embedded systems and in the popular Game Boy handheld console (think Handspring Visor without the pen input). x86 asm coders seem to think the way 68000 and friends do it is bass-ackwards.
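
    For anyone who hasn't looked at the two byte orders side by side, a small sketch (function names are hypothetical) that writes the same 32-bit value both ways:

    ```c
    #include <stdint.h>

    /* Little-endian: least significant byte at the lowest address. */
    void store_le32(uint32_t value, unsigned char out[4])
    {
        out[0] = (unsigned char)(value & 0xFF);
        out[1] = (unsigned char)((value >> 8) & 0xFF);
        out[2] = (unsigned char)((value >> 16) & 0xFF);
        out[3] = (unsigned char)((value >> 24) & 0xFF);
    }

    /* Big-endian: most significant byte at the lowest address. */
    void store_be32(uint32_t value, unsigned char out[4])
    {
        out[0] = (unsigned char)((value >> 24) & 0xFF);
        out[1] = (unsigned char)((value >> 16) & 0xFF);
        out[2] = (unsigned char)((value >> 8) & 0xFF);
        out[3] = (unsigned char)(value & 0xFF);
    }
    ```

    Storing 0x11223344 gives the bytes 44 33 22 11 in the first case and 11 22 33 44 in the second.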

  • Sure databases and heavy duty scientific computing benefit from the vastly larger address space, but day to day gaming, office tasks, web surfing, mp3 listening and Natalie Portman porn viewing (I had to say it) won't benefit much at all.

    That's only because you're letting yourself be shortsighted. When faster processors emerge, you'll start seeing a real market for interactive 3d virtual-reality Natalie Portman.
  • by styopa ( 58097 ) <{ude.odaroloc} {ta} {rsllih}> on Thursday August 10, 2000 @03:41AM (#865932) Homepage
    Sun is tired of all of Intel's whining and wants to stab a sharp stick in their eye. For the past year Intel has been whining to the press that Sun isn't backing the IA-64 processor as their primary processor, and that because of that they aren't living up to their agreements. Sun had one of the first working ports on the IA-64, and sees IA-64 as secondary to their UltraSPARC line for Solaris.

    Some of the high ranking people in Sun have gotten so fed up with Intel's whining that they have gone so far as to say that they may not even release Solaris for IA-64. They are supporting AMD just to piss off Intel.
  • by SQL Error ( 16383 ) on Thursday August 10, 2000 @03:47AM (#865934)
    The Palm Pilot is 32-bit (Motorola 68328).

    People working with databases (my area), complex simulations and digital video already have problems with 32 bit limits. All can be worked around, but it's annoying to the developer, and often inefficient. My desktop machines both at work and home have 1GB of memory; 4GB of real memory is probably only two or three years away for me. 95% of my development is targeted at 64-bit systems (Sun, SGI and Alpha).

    Also, the SledgeHammer is expected to have dual cores, which addresses at least one of your other points. If I can get a dual slot/socket motherboard, and thus four processors, I'll be happy (for a while, at least).

    I do agree with your points on I/O - PCI has done well, but today a single GBit Ethernet or Ultra160 SCSI controller can swamp standard PCI, and 10GBit Ethernet and Ultra320 are waiting in the wings. PCI-64/66 will help, but it's only a short-term solution.
  • by / ( 33804 ) on Thursday August 10, 2000 @03:53AM (#865935)
    "Just wait for McKinley!"

    Right. The last time there was a McKinley in the public spotlight, he got assassinated. By an Anarchist. Parallels, anyone?
  • I know I'm supposed to be impressed by the bigness of the term "64 bit" and all, but this is striking me as another odd niche piece of hardware that most developers don't have time to exploit. Think MMX, Katmai, 3DNow.

    Processor speeds have gotten high enough and software bloated enough that much bigger wins can be obtained from cleaning up existing software than from CPU pissing contests. Corporate fanboys don't want to hear that, though.
  • by SQL Error ( 16383 ) on Thursday August 10, 2000 @04:02AM (#865937)
    Agreed 100%. IA-64 is a very interesting architecture (all those little crinkly bits - lovely baroque feel) but the first chips (i.e. Merced) look like they'll be hot (150W), slow, and expensive. Intel are already spinning Merced as a development platform only. McKinley (chiefly designed by HP, who tend to know what they're doing) looks like the first "real" IA-64 chip, but it's still some distance away.

    Meanwhile, AMD has designed the SledgeHammer as a minimal 64-bit extension to the Athlon, so it's not much bigger than existing chips (on the order of 10%), and so hopefully won't be too expensive. Though if it comes with dual cores and a large L2 cache, you can expect to pay rather more than for a Duron.

    Assuming that it does come with dual cores and a healthy L2 cache (and no critical bugs), SledgeHammer will make a good choice for small servers and high-end desktops even without true 64-bit OS support. And Intel's troubles with Merced effectively hand AMD a 12-to-18-month window of opportunity in the low-end 64-bit market. Which may be all they need.

    The other issue is that Merced is likely to be relatively slow on IA-32 apps, while SledgeHammer could well be the performance leader. If you don't expect all your code to go 64-bit overnight, it's another plus for the AMD solution.
  • Since the size of integers will double, you will need twice more RAM to run your software.
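
    How much memory actually grows depends on which types widen. As a sketch under a stated assumption (a data model where `int` stays 32 bits and only pointers go to 64, which this thread does not confirm for x86-64), here is the size of a simple linked-list node with natural alignment. The function names are hypothetical.

    ```c
    #include <stddef.h>

    /* Round n up to a multiple of align (align must be a power of two). */
    static size_t round_up(size_t n, size_t align)
    {
        return (n + align - 1) & ~(align - 1);
    }

    /* Size of a node holding one 32-bit integer and one pointer, assuming
     * each field is aligned to its own size and the struct to its pointer. */
    size_t list_node_bytes(size_t pointer_bytes)
    {
        size_t offset = 4;                          /* 32-bit payload */
        offset = round_up(offset, pointer_bytes);   /* pad before the pointer */
        offset += pointer_bytes;                    /* the pointer itself */
        return round_up(offset, pointer_bytes);     /* tail padding, if any */
    }
    ```

    Under that assumption a node goes from 8 to 16 bytes, so pointer-heavy structures do double, while a plain array of 32-bit integers stays the same size.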
  • I've programmed in 6502 assembly on both Apple II and NES, and I've looked at a Game Boy ROM (Game Boy uses a Z80-compatible CPU), and only once have I ever seen a string written backwards. It was "DISK VOLUME BARSBAIT" in the old Apple II DOS 3.3. Try editing some legal [parodius.com] NES ROMs [8m.com] with a hex editor.
  • I'm 100% positive the 6502 is an 8-bit processor, and I'm 99% positive that the Z-80 is too. Since endian refers to the order of bytes in a word, it doesn't make sense to refer to them as being big or little endian.
  • by Anonymous Coward
    What are you smoking?

    gcc already has some work on it to support predicated execution and a few more things (have you even checked gcc.gnu.org in the last 3 months?). And there is a port to IA-64 already underway.

    Compilers HAVE TO analyze for independence between instructions, just to reorder them. Once you have that, it is pretty simple to schedule them in parallel.

    And about C pointers, gcc also has a fortran compiler that doesn't have to care about aliasing, and is in the process of implementing C99, with restrict in it. Besides, it already has to deal with aliasing issues, just to do the aforementioned reordering, and to decide whether to use a host of other optimizations.
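
    For reference, this is roughly what the C99 `restrict` qualifier looks like in use. It is a small sketch with a made-up function name: the programmer promises the three arrays do not overlap, so the compiler is free to reorder or overlap the loads and stores.

    ```c
    #include <stddef.h>

    /* out, a and b are promised not to alias each other (C99 restrict). */
    void add_arrays(float *restrict out,
                    const float *restrict a,
                    const float *restrict b,
                    size_t n)
    {
        for (size_t i = 0; i < n; i++)
            out[i] = a[i] + b[i];
    }
    ```

    Without `restrict`, a store to `out[i]` might change a later `a[j]` or `b[j]`, so the compiler has to assume a dependency between iterations. Calling this with overlapping arrays is undefined behavior, so the burden shifts to the caller.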

    Granted, IA-64 is a lot more complicated than usual processors, but GCC has been ported to more processors than you can shake a stick at.

    "Open Source" compiler technology is very damn good. Don't you dare underestimate the GCC group.
  • ... called Sledge Hammer [imdb.com]. I thought it was hilarious. I wish they had it on video.
  • Actually elements of VLIW have trickled into many CPUs. Many DSPs feature this, as you have mentioned, as well as Transmeta's chip. Also the AltiVec in the G4 is similar to a VLIW design (makes sense, no? Motorola also makes tons of DSPs). It's not the bulk of the chip but it handles the specialized instructions and that's where all of this 'G4 is twice as fast as a Pentium III' stuff comes from.
  • It's fashionable to bash Intel on /., and claim that AMD will eat its lunch everywhere, but for the moment my impression is that X86-64 is at a much earlier stage of development than Merced (I can't bring myself to call it Itanium). Let's see, long before Merced's specs were public, Intel worked with MS to ensure there'd be a 64-bit Windows port for it, and formed the Trillian project with Linux vendors to make sure there'd be a 64-bit GCC and Linux port. Now that the specs have been out for a while, GCC-ia64 exists (does anyone know how good it is at optimizing? My guess is "not great", but I really don't know), linux-ia64 is merged into the official kernel, several Linux distros have alphas or betas for ia64, MS has shown a prototype of win64, and things appear to be moving, slowly but surely. Now, AMD just unveiled the X86-64 specs, but where is the software support? All we get is vague claims from Sun that they will port Solaris to the chip. Sure, AMD's processor will probably be compatible enough to run 32-bit Linux or Windows with no problem, but who wants a 64-bit processor to run a 32-bit OS with 32-bit apps, when the 32-bit processor lines are cheaper and not anywhere near obsolescence? And where is the GCC port? It may be conceptually easy to move GCC over to this new beast, but it's still many hours of work; have they been done?

    If AMD wants to be shipping X86-64 boxes soon (like early next year), they really need a solid, working, 64-bit OS for it. Linux being what it is, it will get ported over no matter what, but *very early* availability of Linux may well be what makes or breaks the X86-64. I sure hope that AMD, or some partner, will soon make public an internal port of GCC and Linux. If they haven't done it, they'd better start one, like now.

  • In my local computer store (PC World for those in the UK) I read the following ad for some laptop.

    Sony Vaio XXXXXX laptop

    Key Features:
    * 10.4" TFT Screen
    * XXXMhz P3 CPU
    * No Modem

    Solved all my problems since it's so hard to get a laptop without a built in modem these days.

    Oddly the feature card failed to mention 1394 Firewire, Memory Stick Slot and the low weight!??
  • Both architectures have their own strong points and weak points...

    What could play in favor of AMD now is that software for Itanium at its launch might be far from mature and it might take a while before it gets any better. There will be not many other options considering that backwards compatibility with x86 for Itanium is, as it should be considering the new architecture, just a big joke. Not quite ideal for servers if you ask me...

    This could mean a wide headstart for AMD:

    - Backward compatible with a performance boost instead of a performance hit while 64bit software gets polished
    - Porting existing software will be easier and easier means potentially more software available at launch
    - Probably much better availability of the chips; many websites have claimed that Itanium will only prepare the "road" for its eventual successor..
    - Rumors have also claimed that Intel had no intention to drop the x86 line of chips for a while. That could also play in favor of AMD; they would have potentially the first widely available 64bit x86 chip in the market. Unless Intel adds a 64bit layer of its own to its x86 chips.. but then again.. AMD seems to be much more ready than they are on this front and it would be stupid even for Intel to release their potentially incompatible "extensions" after them.. Intel would be the one playing catch-up.. unless they can bribe as many software houses as they can..

    Now if AMD really manages to do it correctly, and considering their success with the Athlon it might happen.. they've seen bad days but it is getting much better IMHO, they really might become the new leader in, at least, desktop chips. Servers are another story but still it is possible considering again that software for Itanium won't mature so easily. I don't think Microsoft will follow AMD in 64bit though unless their "marriage" with Intel comes to a divorce...

    And i'm not upset at all for linux... i'm sure it will be available for it...

  • Why do you think it's called SledgeHammer in the first place???

    Because the architect said "Trust me, I know what I'm doing."

  • You know every so often, a technology comes along that just blows me away. I apply it everywhere, it becomes my favorite hammer (in that everything starts looking like a nail).

    Dynamic software translation is such a technology. Yes I know that it completely failed to do anything for the Alpha's NT penetration (remember FX!32, DEC's x86-to-Alpha translator?) but we'll put that down to poor marketing (did you ever see one system with it?).

    This is how the revolution will be telecast. Apple did it. win/tel will do it too. While I love AMD for giving intel credible competition, I do have to admit that I think sacrifices to binary compatibility are looking like a worse and worse strategy.

    This is the optimal time to introduce a new architecture. When apple changed to PPC, they were switching to a much faster chip, so were able to claim significant speedups vs the old architecture. Win/tel doesn't really have that option (requiring longer lead times to see significant speed improvements over the current arch), but we finally have a plateau in speed requirements.
    1. Games are bottlenecking on 3d hardware,
    2. office suites are fast enough,
    3. our UI is fast enough and there are no computationally expensive metaphors on the horizon

    This plateau means that a new architecture can be introduced on obsolescence alone -- the first couple of generations don't need to compete on speed. As more and more applications get ported to native code, consumers will again see speedups.

    Of course, the new architecture would offer speedups to their big-money early adopters -- java backends are already architecture independent. Just introduce a native java VM along with the chip.

    So if there ever was a time to throw out cruft it is now, before the next wave of innovation starts. Go to a faster architecture now, get back-compat by emulation/translation.
    Everyone wins; server users dump money into native JVMs and get speedups soon. Consumers wait and by the time the next wave of power hungry apps comes along, everyone has "gone native".

    Notice that this entire argument is based on the assumption that the plateau will last long enough to give most of the industry a chance to go native. So I ask you; what technologies are out there waiting to use the power? Natural language speech recognition? Good AI in games? Immersive 3d environments?
  • Sorry to be picky, but all modern CPUs evaluate many instructions simultaneously, using multiple-issue systems for many similar functional units of each kind. The difference with VLIW is that several instructions are held in the same instruction word.

    In conventional VLIW the instructions combined in the same word must not have any dependencies on one another. The Crusoe native instruction set is VLIW in this sense. The Itanium system - which they call EPIC - is more complex, allowing instructions that interfere with one another to be combined in the same word. This ought to be somewhat easier for the compiler, but it seems in fact that elements of it are causing problems.
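
    The "no dependencies within a word" rule can be sketched as a check a compiler might run before packing instructions together. This is a simplified illustration only: the `Instr` layout and function names are invented, and it looks at register dependencies and nothing else (no memory, no flags).

    ```c
    #include <stddef.h>

    /* One instruction: a destination and two source registers (-1 = unused). */
    typedef struct { int dst; int src1; int src2; } Instr;

    /* Returns 1 if 'later' cannot safely run alongside 'earlier'. */
    static int depends(const Instr *earlier, const Instr *later)
    {
        /* read-after-write: later reads what earlier writes */
        if (earlier->dst >= 0 &&
            (later->src1 == earlier->dst || later->src2 == earlier->dst))
            return 1;
        /* write-after-write: both write the same register */
        if (earlier->dst >= 0 && later->dst == earlier->dst)
            return 1;
        /* write-after-read: later overwrites what earlier reads */
        if (later->dst >= 0 &&
            (earlier->src1 == later->dst || earlier->src2 == later->dst))
            return 1;
        return 0;
    }

    /* Returns 1 if no instruction in the bundle depends on another. */
    int bundle_is_independent(const Instr *bundle, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            for (size_t j = i + 1; j < n; j++)
                if (depends(&bundle[i], &bundle[j]))
                    return 0;
        return 1;
    }
    ```

    In conventional VLIW only bundles that pass a check like this at compile time may be emitted; a superscalar chip makes the same kind of decision in hardware at run time.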
  • As a matter of interest: why ?

    PS. Crusoe's native IS is VLIW.
  • But is little-endian (LSB first) really backwards? That's how bytes are sent over a serial port (low bit first).

    Asynchronous serial interfaces send the data LSB first. All of the synchronous serial interfaces, that I have written software for, send the data MSB first.

  • by GauteL ( 29207 ) on Thursday August 10, 2000 @07:42AM (#865991)
    Part of the advantage with AMD's offering is that porting from the 32bit x86 to their 64bit x86 is MUCH, MUCH easier than porting from x86 to IA64. This means that though Intel had to start more than a year in advance, AMD needs much less time to get support than Intel.
    Besides, the OSes are the most important, but not everything. The applications also have to be present. And Intel does not have that much of a head start here.
    Also.. AMD's offering can run 32bit x86 applications MUCH faster and better than IA64.
    That said, I think Intel's architecture looks more interesting. I've just got the feeling that AMD is right about the industry wanting another x86.

    Everyone says they hate x86, and most of us do hate all the legacy, but starting all over with a HUGE presence of applications for x86 doesn't seem like that good an idea.

    A point against being better at compatibility is that Intel is large enough to warrant porting, and with Sledgehammer being good at running 32bit, people may not bother to port applications to 64bit x86.
    Time will tell..
  • My contribution:
    How to tell if your system is little or big endian

    int x = 1;
    /* look at the lowest-addressed byte of x */
    if( *(char*)&x )
        printf( "little endian\n" );
    else
        printf( "big endian\n" );

    Or something

  • by maraist ( 68387 ) <michael@maraistNO.SPAMgmail@n0spam@com> on Thursday August 10, 2000 @09:58AM (#866007) Homepage
    I didn't see any mention of 3DNow. It's been a while since I've seen talk of it.

    I _think_ that 3DNow was completely compatible with MMX, and so the reuse of the floating-point registers is all that was needed (If I could remember the name of a 3DNow op, I'd look in their table).

    What I thought was interesting, however, was that Intel's SSE instructions were incorporated. I wonder if this was an admission of failure on the part of 3DNow (since to my knowledge SSE was superior), or if they're just trying to capture that compatibility (while at the same time, upping its potential performance by doubling the number of registers).

    My favorite addition however was the new addressing scheme. The reduction of segment selector-dependency, plus the addition of IP-relative addressing just rocks; finally, dynamically relocatable code.
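
    A tiny sketch of why IP-relative addressing makes code relocatable: the target is computed from the address of the next instruction plus a signed displacement stored in the instruction, so if code and data move together the displacement does not change. The function name is made up, and the 32-bit displacement width is my assumption about the encoding.

    ```c
    #include <stdint.h>

    /* Effective address = address of the next instruction + signed displacement. */
    uint64_t ip_relative_target(uint64_t next_ip, int32_t disp)
    {
        return next_ip + (uint64_t)(int64_t)disp;
    }
    ```

    If the code is loaded at `0x1000` and refers to data `0x200` bytes ahead, the target is `0x1200`. Load the same bytes at `0x8000` and the target becomes `0x8200`, with no fix-ups needed in the instruction itself.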

    As a hobby, I've followed the growth of x86 hardware since the 808[68], so it's really cool to see it finally catch up to the RISC big-boys in most arenas.

    Now if they could only have provided 32 GPR and 32 FP-esque registers instead of the paltry 16 (including SSE).

    Another interesting point is that the new "xHammer" design is truly an improvement over legacy design instead of scrapping then emulating (a la Itanium). Basically, by using prefix op-codes, they maintain a tremendous amount of compatibility and CPU-reuse. In fact, the CPU should hardly even notice which mode it's in (minus how it deals with the upper 32-bits). Any future advancements in scheduling design, n-way execution, etc, should benefit both "long" and "legacy" modes.

    Though I find much distaste for variable-length instructions, this is a beautiful case of human ingenuity solving a problem in an intelligent way (various NASA solutions come to mind).

    AMD is sure to best Intel at overall value at this stage (Intel may still have a few 32bit tricks up its sleeve). My real question, however, is how well AMD will fare against finely tuned 64bit solutions. Now that x86 is infiltrating more and more of the server market, does AMD really stand a chance of competing against raw FPU or multi-processing on say Sparc / Alpha / Power / HP / etc? They seem to be faring well in the fabrication process micron and MHz wise. Post-RISC chips still have an edge over x86 in that little translation of op-codes is required, and the over-all die-size can be smaller to perform the same functions. This facilitates less power-consumption and theoretically higher clock-speeds. Also, to my knowledge, many op-codes in these post-RISC chips are well suited for multi-CPU design (Off hand, I remember the Alpha's version of test-and-set).

    How much life does x86 really have left in it? Can we just keep adding layers and layers of compatibility modes? And if Intel maintains a predominant market share, are they going to be able to purposefully produce incompatibilities to stifle innovation? We've already seen how they can gum up the works with RAMBUS / DDR-SDRAM and their biased chipsets.

  • So what you're saying is we shouldn't upgrade because some programmers (to put it mildly) can't code their way out of a paper bag?

    "Remember how much stability problems arised in windows due to the 16-bit compatibility? Ok, so it was not because of the bitness but the insecure memory model."

    NT runs 16-bit code in its own separate address space. That's what should've happened in Win9x, and might have happened had MS not been helmed by marketers trying to sell the more expensive NT to businesses (who themselves bought the 9x because it was the "legacy" update to their legacy Win31 machines).

    "Still, providing backwards compatibility to 32-bit software is going to happen with separate interfaces (what's the length of an 'int'?^). Have some more DLL Hell?"

    It's not AMD's fault that programmers can't be bothered to use proper types and use sanity checking in their programs. AFAIK, NetBSD is the only project written with enough portability discipline to run on a 26-bit processor (one of the ARM family). Using things like "int" and assuming they are 16-bit, 32-bit, or even 64-bit is wrong. There are specific types given with every mature libc that allow you to define exactly how many bits it should be via such defined types as int8_t, int16_t, int32_t, etc.
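
    A short sketch of what "use proper types" can look like with `<stdint.h>`. The record layout and function names here are invented for illustration; the point is only that these field widths are spelled out rather than left to whatever `int` or `long` happens to be on a given machine.

    ```c
    #include <limits.h>
    #include <stddef.h>
    #include <stdint.h>

    /* A record header whose field widths do not depend on the platform's int. */
    typedef struct {
        int32_t  id;       /* exactly 32 bits */
        uint16_t flags;    /* exactly 16 bits */
        uint16_t length;   /* exactly 16 bits */
    } RecordHeader;

    /* Width of int32_t in bits; 32 wherever the type exists. */
    unsigned int32_width_bits(void)
    {
        return (unsigned)(sizeof(int32_t) * CHAR_BIT);
    }

    /* Size of the header in bytes on this platform. */
    size_t record_header_bytes(void)
    {
        return sizeof(RecordHeader);
    }
    ```

    On common 32-bit and 64-bit platforms with 8-bit bytes the header comes out at 8 bytes either way, which is not something you can say about a struct built from `int` and `long`.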

    If you look at this Kernel Traffic [linuxcare.com], you'll see what assumptions Linux makes about the various bit types.

    So before you complain about migration hell, remember that you probably made the choice to run your company on a Windows powered server. If you then can't get support for it 10 years later on newer hardware because the company behind it has decided to sell other things, you have to accept that. Don't try to call foul when the hardware standards advance, but your chosen software does not.

    I agree that earlier stuff should be emulated in software, rather than hardware. That's where the Crusoe has an edge -- it can, at the lowest level, emulate several processor families and allow them to communicate using the system memory. A benefit that is only gained by emulating another processor on theirs (which is what those "D00d, c0mp1l3 R4D H4t f0r 17 n471v3" kiddies fail to understand).
  • Let's see:
    Pros of 64-bit processor:
    • 64-bit default pointer means you get > 2 gb file size limit without any performance penalty or backflips on the part of your OS.
    • Your memory bandwidth will increase (think AGP 4x being faster still!) because you move larger chunks of data per clock cycle (not more).
    • You, too, can run nuclear weapon detonation simulations on your home machine without waiting an eon for a result.
    • Whereas most small operations (such as bit reads/writes) are about the same in terms of speed, all "large" operations (such as flushing memory pages to disk) will go faster.
    • AMD specific: you lose some of the Intel compatibility-baggage, and should get better performance because of the much-saner memory segment handling.

    Cons of a 64-bit processor:
    • Most software not written with portability in mind will break in interesting ways (not fixable if it's closed source)
    • Some 32-bit code will not run on it because of some of the legacy compatibility removed.

    Over all, a good upgrade (IMO).
  • This ARM connection may be backed up by the fact that Intel now owns the StrongARM line (acquired from DEC), though not ARM itself.
  • As far as I recall, the Saturn chips don't hold a candle to the Dragonball procs in the TI8x (and the Gameboy. Interesting tidbit, the TI8x has a processor that is 6 MHz while the one in the GB is only 4 MHz, both are speced to run higher)
  • Actually, the cool thing in Itanium is not the 64bit-ness (that is just kind of requisite for a next-gen proc) but the fact that the architecture scales to 8 pipelines. EPIC allows all 8 pipelines to stay filled (which BTW is probably the reason why the Itanium demos have been less than impressive, it is simply hard as hell to optimize for an 8-way proc.) AMD kind of misses that point. Of course, Sledgehammer also has a new-improved FPU, so that's the other big thing. But 64 bit is, as you said, a checkbox item. And it also isn't the main purpose of the new architectures. The media is taking it as such (because it is easier to understand 64 bit than parallel pipes) but the main purpose of 64 bit is so the developers don't feel stupid for introducing an 8-way 32bit proc.
  • There is also a PCI-X planned that will push over 1GB/sec. (64bit, 133MHz)
  • There is a GCC-IA64. There is an IA64 Linux kernel. Where have you been? Though I agree that Intel may have a problem. However, it isn't as hard as you make it out to be. The main problems with the difference in architecture are supposed to be addressed by the compiler. Intel's idea is that the compiler should do a huge amount of optimization, and free up those resources on the chip. I doubt GCC is doing a very good job of optimizing IA64 code, so that's out as a viable IA64 compiler. Chances are that Intel will release its own compiler (like they have with x86) which will cover most of the hardware differences. Also, this compiler will optimize binaries correctly to work well with EPIC and the whole VLIW thing. By using this compiler, most of the architectural differences should be hidden. Since the big OS on IA64 (Monterey) is being custom written, that shouldn't have such a bad time getting out, and other software should be easily ported if they are in fact portable applications.
  • Recent versions of the call are fine. The problem had to do with SYSCALL not allowing IRETs until a SYSRET, which meant that it couldn't be used for system calls that could sleep (and switch to another process).
  • I suggest you read the /usr/include/stdint.h on your system. It will show you how the libc people use C preprocessor magic to make types sane by default.
  • by ToLu the Happy Furby ( 63586 ) on Thursday August 10, 2000 @03:02PM (#866038)
    Sorry to be picky, but all modern CPUs evaluate many instructions simultaneously, using multiple-issue systems for many similar functional units of each kind. The difference with VLIW is that several instructions are held in the same instruction word.

    More to the point, in most modern CPUs the chip itself determines which instructions should be run simultaneously. With VLIW, in theory, the compiler does this. Thus (in theory) the VLIW chip can be a lot simpler, since it doesn't need to spend a whole bunch of chip area on complicated logic to manage the process of choosing which instructions to process simultaneously and on large buffers to hold many instructions while the fastest combinations are chosen.

    In conventional VLIW the instructions combined in the same word must not have any dependencies on one another. The Crusoe native instruction set is VLIW in this sense. The Itanium system - which they call EPIC - is more complex, allowing instructions that interfere with one another to be combined in the same word. This ought to be somewhat easier for the compiler, but it seems in fact that elements of it are causing problems.

    Ah, what happens when theory becomes practice. See, the problem with all of the above is that traditional VLIW (and it is a very traditional idea, despite Intel pretending they just invented it (and even then, it was HP who invented EPIC)) can only produce code which is known to be dependency-safe at compile time. In the fields where VLIW has been used for decades--i.e. specialized DSP stuff--this means most code. In the fields in which Intel is trying to sell its IA-64 chips--i.e. general purpose, multitasking machines (first servers and eventually workstations and even, they claim, desktops)--this turns out to be almost no code at all. It turns out that virtually nothing that a general-purpose computer tends to do has very many instructions which can be proven at compile-time never to have any dependencies on many other instructions. This compares to a normal superscalar x86/RISC processor, which has the much more fruitful task of determining which instructions are likely to be dependency free at run-time.

    This presented a problem. So, with EPIC, HP came up with the idea of having long-words (or "bundles") composed of instructions which were merely likely not to have dependencies on each other. So far so good...except that then you need a bunch of hardware on the chip to determine when you have dependencies and when you don't, and to figure out what to do when you do (after all, the chip still needs to execute all those instructions). That is, all the hardware that any non-VLIW superscalar CPU needs. But wasn't the whole point of VLIW to make the chip simpler by moving all that stuff into the compiler and keeping it there?

    Another feature that most modern CPU's (so called "out-of-order designs") have is the ability to pick one side of a conditional branch--the side which is more likely to be taken--and pretend it has already been taken, executing the results before it knows they will be needed. The problem with this is that now the chip needs to keep track of which instructions are definitely needed and which would have to be "thrown out" if it turns out the wrong branch was taken. This would seem to be counter to VLIW philosophy, but because of the speed gains speculative execution offers, EPIC includes it as well. While branching "hints" are indeed introduced by the IA-64 compiler, this is still just more complexity which ends up not being removed from silicon.

    Further holding back Itanium is the fact that without dynamic handling of instructions in the CPU, the chip can't take advantage of something called register renaming, which essentially allows the compiler to pretend it's using certain registers while on the chip the programs just use whatever registers happen to be available. The upshot of this is that Itanium needs a whopping 128 14-ported (a "port" is like a line into/out of the register) 64-bit general purpose registers. More registers--sounds like a good thing, right? Problem is that 128 14-ported 64-bit registers take up such a phenomenal amount of die space that it's difficult to get them to do all the things they need to do in one clock cycle and still stay in sync. The result is that Itanium won't clock very fast without failing, and Intel had to screw with the pipeline timings late in the design process to get it to "acceptable" clock speeds at all.
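
    For readers who haven't met register renaming, a heavily simplified sketch of the idea: the names and sizes are invented, and a real chip also recycles physical registers, which this does not. Each write to an architectural register is given a fresh physical register, so two writes to the "same" register no longer collide.

    ```c
    enum { ARCH_REGS = 8, PHYS_REGS = 32 };

    typedef struct {
        int map[ARCH_REGS];   /* architectural register -> physical register */
        int next_free;        /* next physical register never used so far */
    } RenameTable;

    /* Start with each architectural register mapped to itself. */
    void rename_init(RenameTable *t)
    {
        for (int i = 0; i < ARCH_REGS; i++)
            t->map[i] = i;
        t->next_free = ARCH_REGS;
    }

    /* A read uses whatever physical register currently holds the value. */
    int rename_source(const RenameTable *t, int arch)
    {
        return t->map[arch];
    }

    /* A write gets a fresh physical register; returns -1 if none are left. */
    int rename_dest(RenameTable *t, int arch)
    {
        if (t->next_free >= PHYS_REGS)
            return -1;
        t->map[arch] = t->next_free++;
        return t->map[arch];
    }
    ```

    Two back-to-back writes to architectural register 1 land in different physical registers, which is how an out-of-order chip gets away with a small visible register set.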

    The end result of this? Amazingly, in a processor whose entire design philosophy is to move as much complicated control logic as possible from the chip to the compiler, the portion of Itanic's die dedicated to control logic will be bigger than the entire die of an Alpha 21264 on an identical process [realworldtech.com]!! Intended to be a high-volume chip released at 800 MHz in late 1997 or 1998, Itanium is going to end up a low-volume chip released at 733 MHz in 2001, if indeed it is released at all in anything more than engineering-sample quantities.

    Thus everyone's hopes have turned to the next IA-64 chip, McKinley. Intended to be released in late 2001 (put me on the record as doubting it) as a high-volume, moderate clock speed chip on Intel's upcoming .13um process, it may indeed be an impressive performer. (Interestingly enough, McKinley is being designed entirely at HP...) Whether that's because of or despite the EPIC design will be interesting to see. It is, at the least, pretty doubtful that the competing Alpha 21364 and Power4 chips around by then won't be even faster than McKinley, and they'll be manufactured on less advanced processes than Intel's.

    Bottom line: at this point in the game, VLIW looks like a damn neat idea that never should have made the jump to general-purpose computing. Whether IA-64 will end up dominating the world anyways will be an interesting story to see. On the one hand, the server market isn't known for rewarding well-designed processors: the lion's share of the market is owned by Sun, whose UltraSPARC-II's are horribly outdated and outclassed on a per-processor basis by everything this side of a Celeron. Meanwhile, the Alpha, universally agreed to be the fastest and most elegant processor around, is languishing in the market. On the other hand, Sun's big strength (other than marketing) is in scalability, and given that the two big OS's for IA-64 appear to be Linux and Windows-64 (laff), it seems Intel won't be able to compete on this measure either.
  • I wonder if we will ever need that 128bit processor in our lifetimes?

    Well, let's do the math. Moore's Law or some derivative suggests that the size of {disks,ram,whatever} doubles every N months. Each doubling of "stuff" will require one more address bit. So the number of bits in an address should increase linearly with respect to time. That is, one bit every 18 months or so. Since address bit counts seem to grow by powers of 2, the time between address bit count doublings should double.

    Year  Chip     Bits  Physical/log
    1971  4004     4     640/9
    1972  8008     8     16k/14
    1974  8080     8     64k/16
    1978  8086     16    1M/20
    1980  I was born
    1982  80286    16    16M/24
    1985  80386DX  32    4G/32
    1994  Alpha    64    lots/64 (guessing)

    The data doesn't fit perfectly. Through the 70s we seemed to gain one bit a year or thereabouts. Through late 80s and early 90s we got two or three bits per year. I suppose everyone will have 128 bit computers in 30 years or so. 1e17 bytes on a disk (mmmm). And I'll be 50 (doh!).
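
    The "one bit every N months" arithmetic can be sketched in a few lines of Python. The function name and its defaults (starting from the 32-bit 80386DX in 1985, one bit per 18 months) are assumptions lifted from the table and the paragraph above, not a forecast.

```python
def year_reaching(target_bits, start_year=1985, start_bits=32, months_per_bit=18):
    """Year in which address width reaches target_bits at a constant rate."""
    extra_bits = target_bits - start_bits
    return start_year + extra_bits * months_per_bit / 12

print(year_reaching(64))    # 2033.0
print(year_reaching(128))   # 2129.0

# The "two or three bits per year" pace seen in the table, starting from
# the 64-bit Alpha in 1994, gives a much earlier date for 128 bits.
print(year_reaching(128, start_year=1994, start_bits=64, months_per_bit=12 / 2.5))
```

    At a strict 18 months per bit, 128 bits is well over a century out; the shorter estimate only works if the faster pace of the late 80s and early 90s holds.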

    News flash: #9 came out with a 128 bit graphics processor in the early 90s. So this might all be bullshit.

    Ryan "1e17 bytes should be enough for anyone!"
  • The terms "big-endian" and "little-endian" come from Gulliver's Travels, where there were people who opened their eggs at the big end and people who opened their eggs at the little end. The two groups went to war to impose their opening method on each other.

    The entire reason that the terms were derived from that part of Gulliver's Travels was to point out that the bit-order debate was as pointless as the egg debate.
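
    For anyone who hasn't seen the two orderings side by side, here is a small Python sketch using the built-in `int.to_bytes`. The value `0x0A0B0C0D` is just an arbitrary example.

```python
import sys

value = 0x0A0B0C0D

# Same number, two byte layouts.
big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

print(big.hex())      # 0a0b0c0d
print(little.hex())   # 0d0c0b0a

# Either layout decodes to the same value as long as both sides agree.
assert int.from_bytes(big, "big") == value
assert int.from_bytes(little, "little") == value

# Byte order of the machine running this script.
print(sys.byteorder)
```

    Neither layout is better; trouble only starts when the writer and the reader disagree.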

    Steven E. Ehrbar
  • Because Itanium is far more heavily dependent on optimization than x86.

    You know how with traditional RISC chips you needed to recompile the software when the instruction set's newest-core chip came out, or else you'd experience little to no performance improvement (or maybe even performance losses)?

    VLIW/EPIC is even more dependent on the compiler. If the code is difficult to optimize in the first place, EPIC/VLIW will have a severe performance hit.

    Steven E. Ehrbar
  • They made it hard to get to DOS, but they didn't remove DOS. To quote http://www.seagate.com/support/kb/disc/windowsmefaq.html [seagate.com]:

    What are the differences between Windows Me and Windows 2000?

    Windows Me is structurally based on the 16-bit DOS (Disk Operating System) code base, although it is a native 32-bit operating system. The underlying technology of Windows Me is very similar to the software platform on which Windows 95/98 was built. Windows 2000, in contrast, was designed from the Windows NT software platform, and on a completely different code structure. Windows NT and Windows 2000 are native 32-bit operating systems built upon a 32-bit code base.

    Steven E. Ehrbar
  • by cburley ( 105664 ) on Thursday August 10, 2000 @07:06PM (#866048) Homepage Journal
    Excellent post. I have only a few off-hand observations to add.

    My impression of IA-64/EPIC, based on some reading of the docs, is that it's VLIW redesigned so it can scale up or down quite a large range of implementations.

    (I say something about this on my linuxexpo web page (see my site), and that's over a year old.)

    Basically, VLIW design can be (roughly) thought of as focusing on the question "given chip process design circa <insert two-or-so-year period>, and problem space <type of software being targeted>, and assuming very clever compilers, what's the optimal ISA we can implement?"

    (Note that it doesn't take into account long-term viability of the ISA that gets produced, because the idea is that, with very clever compilers, when you get to the next generation of chip/process design, you instantiate a new ISA based on asking that question again, recompile, and, voila, you've got "optimal results" absent previous-ISA baggage. Ah, dreams.)

    EPIC seems to augment the question with "and that scales across a fairly wide range of potential chip sizes and processes".

    So, e.g. on the 128-bit VLIW machines I used to deal with, there were reasonably decent low-end chip designs that couldn't viably handle the ISA, because too much stuff would have to go off-chip (like, say, the register file maybe ;-).

    Whereas simpler designs, 16-bit or even 32-bit CISC or RISC, could fit on such chips, blowing the doors off a similarly-processed instantiation of that 128-bit VLIW.

    A good amount of the logic you talk about in your article strikes me as being unnecessary for low-end applications.

    E.g. the register renaming, the huge multi-port register file (a huge file is needed, but not so hugely multiported), the logic that attempts to figure out what the dependencies are, etc. -- these all strike me as being necessary only for medium-to-large-scale (vis-a-vis the high-level ISA) implementations.

    My impression is that early Alphas, e.g. 21064 or 21066 or whatever, were intentionally done as low-end implementations. But they didn't attract a lot of attention. Even though the (well-done, IMO) ISA allowed for a great deal of scaling up, as the 21164 and 21264 are apparently proving (my 21264 is "mothballed", sadly), there isn't enough "groundswell" to make it popular. Perhaps not coming out with more of a barn-burner chip -- shooting higher than the basic ISA implementation -- is partly responsible for this.

    I'm thinking that the only way EPIC, IA64 in particular, ends up looking good is if it's popular and implemented across a wide spectrum of chips, from low to high end.

    That way, code that's already compiled to the ISA can, in theory (and perhaps in practice too), be optimized for mid-level to high-end chips, and still run pretty well on the low end.

    Similarly, code that's "mundanely generated" might run about as well on the low end as the highly-optimized stuff anyway, but could still get worthwhile boosts being moved, without recompiling, up the chain.

    That's a nice dream, if it's what they have in mind, but there are a couple of problems.

    One, any information that is able to be encoded into ISA-level instructions that helps a variety of implementations of an architecture make their own decisions how to implement them represents ISA space (and usually other resources) that can't be used in a way that's optimal for a given implementation -- one that didn't have to accommodate the "wide-range" ISA (like EPIC).

    E.g. if you reserve in the ISA a bit that says "please serialize between previous and next op" so a faster chip can know when to do dependency analysis before assuming parallel execution will work, that's a bit that a low-end chip, which is going to serialize it anyway, won't need.

    Also e.g. if you reserve bits and fields for all sorts of optimization hints, so mid-level chips can do a better job of guessing, not only do you hurt low-end chips that won't use them, but you hurt high-end chips that have either enough logic to do at least that good a job of guessing at runtime anyway or have otherwise rendered most of the compile-time guessing redundant (e.g. by reducing latencies on branches, memory loads, whatever).

    In both cases, these bits in the ISA could, on chips that don't really benefit from them, have served to boost performance in some other fashion. E.g. a low-end chip might want hints about L1/L2 cache issues, or maybe just a more compact ISA so it holds more instructions in Icache. Whereas a high-end chip might want hints about completely different things, or maybe just a way to specify what some new ALU can do in parallel with everything else!

    Two, notice the Catch-22 in my statement about how EPIC/IA64 might look good. It has to be popular and run across a wide variety of architectures to show how good it really is (to the market, anyway), but how does it get popular until the market knows how good it is?

    HP/Intel seems to be taking the approach of doing their best to "mandate" its popularity -- by making a real commitment to the architecture, including ensuring Linux runs on it from the outset, making good (great?) compilers (including OSS ones -- not sure how SGI Pro64 fits into this) available, etc.

    But while they struggle to popularize this "epic" ISA by planting many seeds far and wide, and, at the same time, proving its value by rolling out an increasingly-wide array of implementations (in terms of performance), they do risk giving competitors opportunities to make inroads targeting shorter-term strategies.

    And, in the longer run that IA64 seems targeted towards, will ISA compatibility across implementations be nearly as important as it was when this strategy was formulated and adopted, what, five or so years ago?

    If so, then assuming IA64 is not much more or less "correct" an ISA for the longer haul than the competition's designs for their shorter outlooks, Intel perhaps can afford (fiscally speaking) to remain committed to an ISA that might seem like a dinosaur roaming around slowly while clever early mammals expand into a bunch of niches for a while, until that day comes when the dinosaur's strengths are visible -- applications compiled to IA64 will have longer useful lifetimes compared to those compiled to upstart ISAs, they'll scale better across a wide range of machines, etc.

    But what works against this is the increasing viability of the ISA-ignorant code base out there, by which I mean code distributed as source, bytecodes, whatever, and which cares little for the underlying ISA to meet its performance goals.

    That viability is increased by things like:

    • Increased deployment of Open Source software
    • Better compiler technology, encouraging less use of ISA-specific tactics (assembly coding being the extreme) for performance
    • Increased awareness of the pitfalls of investing in, or developing, ISA-specific applications (this applies especially to the comparatively huge "market" of in-house software)
    • The mere existence of IA64, which tells everyone that even the "King" realizes that the one-time "only" ISA (the 1980s Unix equivalent of the VAX ISA), the IA32, will someday pass away as a viable platform for many
    That last item is kinda like what Y2K did for two-digit year encoding. After all, similar mind-sets, priorities, etc. lead to that as to doing ISA-specific, e.g. IA32-specific, platform development.

    Yet it seems Y2K didn't cause a lot of people to resort to "1900/2000" solutions. I.e. we don't see a lot of cases (that I know of anyway) where the developers said "okay, the old code was for 1900, the new code supports 2000, and includes some 1900->2000 conversion utilities, but it's still all two-digit stuff".

    No, it seems most Y2K issues were handled by finally going the full four (or more!) digits, so the problem wouldn't come up again in 2100.

    If so much effort (US$Billions) was put into keeping the problem from resurfacing in another hundred years, that suggests the industry (those who decide what platforms to target for their software, whether for distribution or use in-house) will respond, to a significant extent, to the IA32->IA64 transition with less of an "okay, let's retarget everything to IA64" attitude and more of a "hmm, IA64 might get us 10-20 years of viability, that's too short for the investment we have to make, let's go the extra distance, get away from ISA-specific tactics, and pick a strategy that gives us flexibility in choosing IA64 vs. IA32 vs. Alpha vs. whatever at a suitably fine-grained level".

    To the extent the industry adopts that model, the advantages of EPIC decrease, while the disadvantages remain the same.

    It's also interesting to note the strong feedback among the other items.

    In particular, consider how the success of Open Source (mainly GNU/Linux -- GCC specifically) came about soon enough to cause HP/Intel to openly "target" OSS so it could join the IA64 revolution.

    Now, that helps promote IA64 acceptance.

    But it also allows competitors, who want to produce better, "one-off" ISAs using whatever IA64-like techniques are appropriate, to do so without losing all that Open Source software, and even without necessarily having to do much high-end compiler development!

    I.e. to the extent OSS compiles code well for IA64, it can be fairly easily modified to compile code well for a one-off ISA that doesn't have all the baggage of IA64 but does do some of the sophisticated stuff.

    So OSS popularity led to IA64 "openness", which could well lead to better compiler technology being available/affordable for arbitrary ISAs that are VLIW or RISC subsets of IA64, and that could encourage the third item above, in that more "users" of code bases will realize they'd spend less $$ to get performance if they could just recompile for the ISA du jour (from AMD, Compaq, whoever).

    I'm not saying IA64 represents taking a risky path, since I don't know the percentages -- could be anywhere from 10% to 90% chance of success for all I know.

    But, of course, a huge amount of $$ and energy is being put into IA64, so it is, indeed, a "big risk" to say the least.

  • So you're basically saying that modifying a hut
    on an oil platform is more prone to errors than
    creating the platform from scratch? :-)

    I know... I get your point, I just couldn't help
    it, and it is a very bad analogy...

    I still disagree, however, and I'm perfectly aware
    of the fact that you have to change your code
    to do this.
    I do, however, feel that IA-64, untested in real applications, is much more prone to teething problems than an extension to the very
    well tested x86.
    Everyone who ports to 64-bit x86 will probably
    KNOW what it takes.. but people porting to IA-64
    are more or less exploring the unknown and facing
    totally new advantages and limitations.
  • Nice post. As a matter of interest, what's your take on Crusoe's VLIW instructions? To what extent can the problems with VLIW be circumvented by running the compiler at run-time?

    My guess would be that you can do it to some extent, provided your compiler is sophisticated enough, and you're prepared to compile several versions of the same source for different input data.
  • As a matter of interest, what's your take on Crusoe's VLIW instructions? To what extent can the problems with VLIW be circumvented by running the compiler at run-time?

    My initial guess would be that you can do it to some extent, provided your compiler is sophisticated enough, and you're prepared to compile several versions of the same source for different input data.

    My guess would be similar--that, by looking for dependencies at run-time instead of compile-time, Crusoe has a much better shot at generating fast code and keeping the CPU small, cool, and simple. After all, with their approach, they are successfully able to do what IA-64 can only promise: move all the complexities of instruction scheduling from hardware to software.

    Or are they? Because Crusoe (the hardware) is a straight-up in-order VLIW chip, its runtime compiler is actually doing two things: recompiling compiled x86 (or whatever) instructions, and scheduling them. Most of the criticism levied at Crusoe's approach focuses on the first half of this equation, and proceeds along the lines that "JIT is a bad idea, because it's why Java is so slow." As it turns out, I couldn't disagree more. For one thing, much of the reason Java is slower than C/C++ is because it is safer and more OO--it runs its own garbage collector and forces everything to be an object, amongst other things. For another, it's not actually slower! [aceshardware.com] The newest Java JIT's manage to generate faster code than static compilers in many cases--and well they should, because they know more about the machine they are compiling to and the most common critical paths through the software than a static compiler ever could. Indeed, HP is working on a runtime interpreter [arstechnica.com] which will speed up almost any precompiled code.

    The reason JIT's can work so well is because they only need to compile the code once, then sit back and profile it, recompiling only when necessary. In other words, they incur a lot of overhead at first, but pretty much stay out of the way afterwards unless they'll really help out.
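
    The "compile once, profile, recompile only when it pays" idea can be sketched as a call counter that swaps in a faster version of a function once it turns hot. Everything here (`HotSpotRunner`, the threshold of 3, the `fake_compile` stand-in) is hypothetical and only models the control flow; it is not how any real Java JIT or Crusoe's code-morphing software is implemented.

```python
class HotSpotRunner:
    """Run a slow version of a function until it is called often enough
    to justify paying for a one-time "compile" step."""

    def __init__(self, slow_fn, compile_fn, threshold=3):
        self.slow_fn = slow_fn
        self.compile_fn = compile_fn   # stand-in for an expensive optimizer
        self.threshold = threshold
        self.calls = 0
        self.fast_fn = None

    def __call__(self, *args):
        if self.fast_fn is None:
            self.calls += 1            # cheap profiling: just count calls
            if self.calls >= self.threshold:
                self.fast_fn = self.compile_fn(self.slow_fn)  # pay once
        fn = self.fast_fn if self.fast_fn is not None else self.slow_fn
        return fn(*args)

compile_log = []

def fake_compile(fn):
    # Pretend to optimize; a real JIT would emit machine code here.
    compile_log.append(fn.__name__)
    return lambda x: fn(x)

def square(x):
    return x * x

runner = HotSpotRunner(square, fake_compile, threshold=3)
results = [runner(i) for i in range(5)]
print(results)           # [0, 1, 4, 9, 16]
print(len(compile_log))  # 1
```

    The overhead is front-loaded: the counter and the single compile cost something early on, and after that the runner stays out of the way.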

    Now on to the second half of Crusoe's compiler--the scheduler. As I mentioned before, this sounds like a good idea--taking some functionality off of hardware and moving it into software. But when you think about it, you realize there's no such thing as "taking functionality off of hardware and moving it into software"--after all, the "software" still needs to execute on the same hardware!

    What you're actually doing, then, is moving a function from having dedicated on-chip hardware performing it to having to be run with normal general-purpose hardware. This still has the very real benefit of making the chip a lot simpler, but now you've added a scheduler that needs to take clock cycles from the code it's trying to schedule.

    The big question, then, is how much can the scheduler be like the compiler--that is, just doing its work once and then only stepping in when necessary. If, by moving the scheduler from dedicated on-chip logic to software using general-purpose logic, you make it able to do that much better, then it may be a significant design win. If, however, the scheduler needs to do anywhere near as much work as it would have as dedicated on-chip hardware, you're going to end up losing speed.

    Which of these is the case? I have no idea. Obviously, a lot of very smart people (Dave Ditzel, etc.) thought the former. On the other hand, Dave Ditzel is reportedly the one responsible for keeping Sun's chips in-order while the rest of the world moved to out-of-order; a quick comparison between an Alpha 21264 and an UltraSPARC-II (or even the upcoming UltraSPARC-III) shows you who was right on that one. (Hint: not Dave Ditzel.)

    What we do know is that Crusoe is a lot slower than Transmeta originally thought a runtime interpreted VLIW processor would be. There have been strong reports that they originally envisioned their processors would be able to beat leading-edge x86 chips handily, and only scaled back to the low-power market once their original benchmarks came back disappointing. Even at the low end of the scale, they're attracting a lot of ridicule amongst chip designers for trying to "reinvent the benchmark" because their chips can't compete. I happen to believe that (work/time)/power is a useful benchmark for the mobile arena; still, there's no denying that Transmeta would rely on traditional benchmarks if they could. Furthermore, it looks as if several chips may end up being able to compete with Crusoe even on (work/time)/power--StrongARM's, the much-maligned Cyrix III, and various other low profile simple RISC chips coming out of the woodwork to compete for the "mobile embedded" market.

    So, while I would very much like Transmeta to succeed, so far--just as with IA-64--there's little indication that it's more than a bunch of hype. Perhaps after a disappointing first iteration, VLIW will get its kinks worked out and become the standard general-purpose CPU design philosophy of the next couple decades, just as RISC has been for the last two. (And yes, modern x86 chips are designed according to the RISC philosophy, even though the x86 ISA is CISC.) However, I have to say I doubt it. Looking ahead, all the badassest designs of the future (MAJC, Power4, 21464, SledgeHammer) seem to be moving towards keeping the dynamically scheduling RISC architecture and adapting it for CMP--chip level multiprocessing.

    But, as always, only time will tell.
  • Ahh, I see. My mistake
