Intel: No Rush to 64-bit Desktop
An anonymous reader writes "Advanced Micro Devices and Apple Computer will likely tout that they can deliver 64-bit computing to desktops this year, but Intel is in no hurry. Two of the company's top researchers said that a lack of applications, existing circumstances in the memory market, and the inherent challenges in getting the industry and consumers to migrate to new chips will likely keep Intel from coming out with a 64-bit chip--similar to those found in high-end servers and workstations--for PCs for years."
4 GB is not a lot of memory (Score:5, Insightful)
If Intel keeps on braking a lot of people will get really disappointed when they realize they need more memory than their platform supports.
Re:4 GB is not a lot of memory (Score:5, Funny)
Oh, come on! Don't you want the fun of playing with the 64-bit equivalent of extended and expanded memory? Endless tinkering with autoexec.bat and config.sys! Endless reboots! Doom 3 runs in its own operating system (the way God intended)!
Bring on the half-ass memory solutions! We should be deep in flavor-country by 2005.
Re:4 GB is not a lot of memory (Score:5, Funny)
Re:4 GB is not a lot of memory (Score:3, Informative)
Do people multitask while playing games like Doom III? HELL YEAH! I can't remember how many times I've 'windowed' UT [unrealtournament.com] or TO:AoT [tactical-ops.to] to tweak my TeamSpeak [teamspeak.org] settings. Or how often I take a break while working (I work at home) to let off some steam lobbing grenades or rushing SF with my trusty AK.
Besides, Doom III is as much a proof of concept as it is a game. By developing the engine in a console-like environment you're limiting its 'real world' parameters; you're not letting it get tested. Let's not kid ourselves, in 2-3 years' time there are going to be a *lot* of games toting the Doom III Engine badge.
Anyway, we've been in this situation before - praying your game can detect your video and sound card. This is why DirectX and OpenGL are popular - they provide a much-needed interface and abstraction layer for your sound and graphics. This is one of the promises of a modern OS - set up an interface to different devices, configure it once, and you're set! The lack of this was one of the worst things about DOS, and I don't really want to go back there.
Re:4 GB is not a lot of memory (Score:3, Interesting)
And this is great... if you're doing mainframe-style computing and price is no object. Back in the day, given infinite funds, you could have purchased an Apple II or a VAX 11/780. The former, even with its 64K of memory, let you do about 80% of what you'd want to use the VAX for, and it was a lot easier to maintain, lower power, and fit on your desk.
Now we have a similar situation. 64-bit is "better," but in a loose "for maybe 5% of all computing tasks" kind of way. That's not a compelling reason to switch all desktop PCs over to 64-bit processors. If Intel - or any other company - tries to do that, then I'll just wait until the lower-end mobile processor makers improve enough that I can avoid the bloated desktop market altogether.
Article Back Story (Score:5, Funny)
Um, Hi... this is Intel. We know you *WANT* 64-bit but, um, you don't NEED it. Really, you don't. You believe that? Great! Basically guys, this is the problem: we *screwed the pooch* on this processor. We've spent tens of billions of dollars on development, it's years behind schedule, it ain't that fast, and the whole thing just sucks right now. So here's what we're gonna do: we're gonna hold back this technology for like, ehh, 6, 7, maybe 8 years SO WE HAVE TIME TO RECOUP THE MONEY WE WASTED by selling the chip as an expensive "workstation" CPU. So, expensive high-profit workstations for now, then you can have it later once it sucks (well, it already does, but once it sucks more). Other platforms have had 64-bit processors for a decade now, you say? You want mid-'90s processor technology in 2003? FUCK YOU, you can't have it, end of discussion!
OH, and expect some dirty tricks, we know AMD is gonna be ready to sell you 64 bit way before us, so, well ... you'll just see ;)
Thanks, Intel
Re:4 GB is not a lot of memory (Score:4, Insightful)
As it is right now, there isn't really a desktop application that could use 4 GB if you asked it to. Sure, some developers could use it, some CG people, and DV people, but those people can justify buying more expensive (64-bit) workstations. Joe Twelvepack's $600 Dell will run any consumer application faster than it needs to.
Once developers start making good use of the power they have, then it's time to make the big financial investments required to go 64-bit for consumers. I personally have a hard time even thinking up a consumer application (besides games) that could really stretch existing computing resources.
Re:4 GB is not a lot of memory (Score:3, Informative)
Re:4 GB is not a lot of memory (Score:3, Informative)
Re:4 GB is not a lot of memory (Score:3, Insightful)
Excuse me? Intel is saying that our cheap desktops are already fast enough, so they're putting off 64-bit CPUs?
Why should I even buy a new 32-bit CPU from Intel, then?
(You are of course right. I'm just wondering aloud why Intel is admitting it, and how they plan to dig themselves out once they convince the public of it.)
Re:4 GB is not a lot of memory (Score:3, Insightful)
The biggest advantage of a 64 bit processor is the increased memory space. Intel makes processors, not memory. The last thing that they want is a computer where Dell spends more on memory than processor.
Re:4 GB is not a lot of memory (Score:5, Interesting)
That is true, but the memory bus can be made wider, and that won't affect the addressing scheme. Take nVidia's nForce: it uses 2 DIMM slots in parallel to double the memory bandwidth (although the processor bus must be fast enough to use that bandwidth).
The bandwidth issue scales much more easily than the fact that 32 bits is 4 GB of addressable memory, no matter what. (OK, you can do an extended-memory kludge, but that's beside the point.)
Re:4 GB is not a lot of memory (Score:5, Funny)
That ought to be enough for anyone.
<ducks>
Re:4 GB is not a lot of memory (Score:4, Informative)
The ordering is: byte, kilobyte, megabyte, gigabyte, terabyte, petabyte, exabyte, zettabyte, yottabyte. After yottabyte comes 'ohmygodijustcameabyte'.
Re:4 GB is not a lot of memory (Score:4, Funny)
Re:Why I need 500 ZettaBytes (Score:3, Funny)
Re:64GB (Score:5, Informative)
Re:64GB (Score:4, Insightful)
Of course... (Score:5, Insightful)
What's interesting is the "nobody really needs 4GB this decade" line. Just about every Mac in this room has 1GB in it, and even the crappy test PC has 768MB. 4GB will be here sooner rather than later...
Re:Of course... (Score:3, Interesting)
As a semi-future-proofing power user, I built a PC in 1998. I put in 256MB of RAM to try to keep it running as long as possible. That's price-equivalent to 2GB at today's prices.
It's really not going to be long before the geeks feel they need to do so.
Re:Of course... (Score:3, Interesting)
Re:Of course... (Score:2, Informative)
Re:Of course... (Score:4, Funny)
Err... 1500 years, give or take. Never mind.
Re:Of course... (Score:2, Insightful)
I'm seeing 256MB standard now, so I think we're still 3-5 years away...
Re:Of course... (Score:3, Funny)
Just use vi, instead of emacs.
New operating sytems will change Intel's tune? (Score:5, Interesting)
But I think that will change almost overnight once operating systems that support the Athlon 64/Opteron become widely available. We know that Linux is being ported to run in native Athlon 64/Opteron mode as I type this; I also believe that Microsoft is working on an Athlon 64/Opteron-compatible version of Windows XP that will be available by the time the Athlon 64 is released, circa September 2003. (We won't see the production version of Windows Longhorn until at least the late spring of 2004, IMHO, well after the new AMD CPUs become widely available.)
Intel's problem... (Score:5, Insightful)
Of *course* Intel is going to argue that 64bit isn't required for desktop computers. If users make the leap to AMD's x86-64, Intel will have to scramble to build a chip of their own to support it. Also, if you start getting $100, $200, $300 64-bit chips out there, I'm sure the server market's gonna stop and ask "why the hell are we spending $10k per Itanium?"
Intel stands to lose if we move to 64-bit on desktops.
Re:Of course... (Score:5, Informative)
Scientific applications have been using 64-bit computing for quite a while. What they usually use is floating point for calculations, and double-precision floating point (64-bit) has been around for a long time. Loading/storing the 64-bit (sometimes 80-bit) FPU (stack) registers with single instructions - even though that may require multiple bus transactions - and manipulating them with single instructions has been possible for ages. Scientific applications also frequently have very large datasets, several GB not being uncommon. For performance reasons, you frequently want to load all this data into memory and not have to worry about processing it in chunks that fit (that is an option, but it's bad for some types of data access and reuse patterns). The data types of scientific applications can typically be handled by 32-bit CPUs today (IEEE double-precision floating point, for example) with no problems, and those FP registers can be loaded from L1 or L2 64 bits wide 'in one go'; they can even be loaded from and stored to memory quickly (memory typically operates a cache line at a time for reads and can be tuned more precisely for writes). The issue is the amount of data being handled.
Video - I admit I'm not an expert in this area, but I would imagine that AltiVec/SSE/whatever are being used heavily here - although those aren't *really* that much different from what 32-bit CPUs can already do, they just do several operations at the same time (SIMD). What matters here are very large data streams (multiple GB) that have to be manipulated. I'm not exactly sure what would need to be done other than having a 64-bit file system, and that can be (and is) done using 32-bit CPUs today. Maybe it's simply the ability to pull the entire image into one chunk of memory that is desired - similar to the scientific computing issues, where block-read schemes are inefficient because of data access and reuse problems. If the video files are over ~3GB, then you have a problem on 32-bit systems.
Databases - this is getting the most attention. Here, 64-bit integer manipulation becomes important (and not the SIMD types, either): index/address calculations, large trees of data, etc. The other important thing is caching data so you don't have to hit the disks; for that you want all the memory you can get.
Also... remember that just because a 64-bit CPU will typically have the ability to manipulate and use 64-bit addresses, that does not mean that all 64 address lines will be brought out of the package. For example, I would imagine that more like 40 address lines will be brought out - limiting the amount of physical memory the CPU can actually use (to 1TB, in this case) for cost reasons. However, the virtual address space isn't affected by that and will be 64-bit regardless. Of course, over time, more and more address lines may be brought out.
Well... (Score:5, Funny)
lack of applications (Score:5, Insightful)
But the gaming market is going to drive this and the hardcore gamers already build their systems (with AMD?). Intel will lose nothing at first.
pc overhaul (Score:5, Insightful)
Emulation (Score:2)
All you need to solve is the quite abysmal video performance of things like VirtualPC.
Basically you need a WinUAE for PCs.
And the reason Intel are holding back is contained in the first line here. Their 64-bit chip is crap.
Re:pc overhaul (Score:5, Interesting)
Of course, if you want real hardware agnosticism, there is always Linux isn't there? That runs on 64 bit CPUs, in 64 bit mode right now, and should be ready to work on AMD's Hammer right from launch. The big gamble for Intel is, can it afford to be late to the party? Intel certainly seems to think so, but I think that the Hammer is going to end up on more desktops than they expect, unless AMD sets the price of entry too high.
Re:pc overhaul (Score:5, Insightful)
1) The CPU: x86? Who cares? Even the Power4 does instruction-level translation, and advances like the trace cache take decode out of the hot path. In the end, x86 is just a nice, compact, widely supported bytecode. Outside of the instruction set, PC processors are very modern: highly superscalar, highly pipelined, *very* high performance.
2) The chipset: This isn't your ISA system anymore. CPU -> chipset and chipset -> memory interconnects will be hitting 6.4 GB/sec by the end of the year. The Athlon 64 will have an integrated memory controller, just like the UltraSPARC. I/O hangs off the PCI bus, which is not a bottleneck given current systems, and when it does become a bottleneck, solutions like HyperTransport are already ready and working. Peripherals now hang off advanced buses like USB and FireWire, while traditional I/O methods are relegated to a tiny (cheap!) Super I/O chip. ISA is finally dead (the new Dells don't ship with ISA slots). The only thing we can't seem to get rid of is the infernal 8259 interrupt controller. The I/O APIC has been around for ages now; VIA has integrated them for years. Intel is finally getting around to putting them in, but is doing a half-assed job of it. My Inspiron has an 845 chipset, which theoretically has an I/O APIC, but it seems to be disabled for some reason.
3) The firmware: OSs today ignore the BIOS anyway. They're only in place for booting and SMM mode. ACPI has replaced most of what the BIOS used to be used for. Just this month, Intel said that EFI (used in the Itanium) will finally replace the PC BIOS, and bring with it a host of new features like support for high-resolution booting modes, network drivers, advanced debugging, etc.
Re:pc overhaul (Score:3, Insightful)
What are you smoking? It's widely supported, yes, and it might or might not be compact (myself, I would guess not, even RISC chips like the ARM/XScale have more compact code), but 'nice'?
Re:pc overhaul (Score:3, Informative)
Show that [indiana.edu] to people here in this thread, that should be enough namedropping for slashdot.
Btw, this lkml thread is really informative.
Re:pc overhaul (Score:3, Informative)
linux overhaul (Score:4, Funny)
The whole Linux architecture should ideally be replaced. We're still using something designed in the 70s, with lil hacks here and there to make it halfway usable in the current day. Unfortunately, it would be incredibly difficult to do, as the macrokernel system and crusty old ASCII-pipe-based GNU tools would have to be remade. Unix compatibility slows us down from moving forward. Even if everything was replaced, how long till RMS decided it was the work of Satan and began on a further replacement?
No hurry? (Score:5, Insightful)
Re:No hurry? (Score:3, Interesting)
Just in: Intel drives *INNOVATION* (Score:2, Funny)
So after this AMD is contemplating the release of Hammer and Moto/IBM/Apple are teaming on the next gen macintosh. Both teams are celebrating and letting schedules slip to ensure a good product.
15 minutes later, Intel pulls the rug out and releases a consumer-level 64-bit CPU, calling the earlier press release a premarketing bellwether.
Reasons for 64 bit desktops (Score:5, Interesting)
For corporate desktops... (Score:5, Interesting)
Re:For corporate desktops... (Score:4, Insightful)
So now that you have a cheap smart terminal with the capability of running its own applications, why spend large amounts of money on a huge network and backend servers?
From a management standpoint, X-terminal-type machines would be great: everything stored on the servers for backup, easy management (just replace a broken one with a working one and the user is back up), and users could move around and keep all their settings. It keeps being tried every few years and keeps being rejected by corporations.
Re:For corporate desktops... (Score:2, Informative)
These guys [k12ltsp.org] seem to be having no problem with being rejected. I put together my school's lab for about the cost of two serious desktops, networking included. In fact, Jim McQuillan [ltsp.org] seems to be making a reasonable living out of selling such systems. It all depends on where you sit, and what you need, I guess.
Bandwidth (Score:2, Insightful)
Wouldn't it make more sense to put that 64-bit chip on the server, with XX GB of RAM, and push the display to the clients?
Not if there's a dial-up link between the server and client.
Not if the application is movie editing. 640x480 pixels x 24fps x 24-bit color = too big for even 100Mbps Ethernet.
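For what it's worth, the arithmetic behind that claim checks out. A quick sketch in C, using only the numbers from the post above (nothing else assumed):

    /* Back-of-the-envelope check of the uncompressed-video bandwidth claim. */
    #include <stdio.h>

    int main(void)
    {
        const double width = 640, height = 480;
        const double bytes_per_pixel = 3;   /* 24-bit color */
        const double fps = 24;

        double bytes_per_sec = width * height * bytes_per_pixel * fps;
        double mbps = bytes_per_sec * 8 / 1e6;

        printf("uncompressed stream: %.1f MB/s = %.0f Mbps\n",
               bytes_per_sec / 1e6, mbps);
        printf("100 Mbps Ethernet: %s\n", mbps > 100 ? "doesn't fit" : "fits");
        return 0;
    }

That works out to roughly 22 MB/s, or about 177 Mbps of uncompressed video - well past what Fast Ethernet can carry.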
Intel speak (Score:5, Funny)
AMD investor. (Score:3, Interesting)
Intel is committing hara-kiri here, in my opinion (that's ritual suicide for the sake of honor, in Japanese). Similar events come to mind, and history has proved all of them utterly wrong... (It's sad to acknowledge that I REMEMBER when some of these things happened!)
- Intel 286 vs 386 (IBM: A 286 is enough for most people...)
- IBM Microchannel vs ISA (The same thing)
- 'A good programmer should be able to do anything with 1K of memory'. I don't remember the author, but probably someone from IBM in the 60s or 70s.
Time flies...
Re:AMD investor. (Score:3, Informative)
Hara = stomach
Kiri = gerund of kiru (=to cut)
Literally, 'stomach-cutting'.
It's the vernacular for seppuku (which, by the way, is written using the same characters - setsu is kiru, fuku is hara).
It's been done before (Score:5, Interesting)
Didn't Apple manage to get their (admittedly smaller) user base to switch to a better processor?
Intel's argument against 64-bit computing seems to be an advertisement for the x86-64 concept. The article didn't mention gaming, but surely the gamer market will be a major early-adopter base. It sounds like preemptive marketing to me.
As for memory, the article (and presumably Intel) doesn't seem to account for the ever-increasing memory footprint of Microsoft's operating system (or of the GNOME stuff on our favorite OS), and so is perhaps too dismissive of the need for a >4GB desktop. As we all know all too well, one can never have too much memory or disk space, and applications and data will always grow to the limits of both.
Personally, I'm holding off on any new hardware for my endeavors until I see what AMD releases, though I would settle for a Power5-based desktop...
Re:It's been done before (Score:5, Insightful)
Yes. They did it gradually. The first PPC Macs ran a 68k emulator which provided backwards compatibility for old Mac software. Intel are trying to do the same thing; you can run IA-32 software on IA-64.
The problem that Intel has, and that Apple didn't, is that the IA-32 mode on an Itanium is generally slower than a real IA-32 chip. Many Mac users found that their old 68k code ran just the same, or in some cases faster, on the new PPCs. Intel, then, is at a disadvantage with the IA-64, speed-wise. Why invest all that money in a new platform just to run your code slower?
Now, this might not be such a problem if people were busy porting their stuff and tuning it for the IA-64, but Intel has two problems there. The first is the chicken-and-egg problem: no one is buying IA-64, so no one is porting their applications, so no one is buying IA-64. The other problem is technical: the EPIC (VLIW) instruction set is a nightmare to understand and code for. Only a handful of people truly understand the full IA-64 ISA, so compilers and operating systems are slow to support it. If you don't have adequate tools, how can you do the job?
At the moment, it looks like Intel could be onto a loser with IA-64. Only time will tell.
Re:It's been done before (Score:4, Interesting)
Yes. They did it gradually. The first PPC Macs ran a 68k emulator which provided backwards compatibility for old Mac software. Intel are trying to do the same thing; you can run IA-32 software on IA-64.
The problem that Intel has, and that Apple didn't, is that the IA-32 mode on an Itanium is generally slower than a real IA-32 chip. Many Mac users found that their old 68k code ran just the same, or in some cases faster, on the new PPCs. Intel, then, is at a disadvantage with the IA-64, speed-wise. Why invest all that money in a new platform just to run your code slower?
Sorry, you're wrong on two points there.
- The PPC Macs did not run a m68k 'emulator' - an opcode translator converted m68k code to PPC code. There wasn't a clearly-defined emulator (which implies an application) - certain parts of the MacOS itself at the time consisted of m68k code, which was run through the translator.
- The first PPCs ran m68k code *slower* than the fastest m68k Macs. In particular, the 6100/60 was badly crippled by its lack of cache, and could be quite handily beaten by the faster 68040 Macs when running m68k apps.
Re:"The first" PPCs? (Score:3, Interesting)
Those Mac emulators still work, and still run the ancient software, on a modern OS X Mac. My father has a word processor from maybe 1987 (WriteNow) that's just fine, and continues to use it for day-to-day writing. Hey, whatever makes you comfy.
Maybe it isn't supported in some subtle ways, and I'm sure there's stuff that's broken -- even recent OS 9 games sometimes won't run in "Classic Mode" and require booting in OS 9 instead. But Apple's taken this seriously during every OS or chip migration they've ever had, and they're still keeping their eye on pre-PPC chip software.
Margins (Score:4, Interesting)
Separation of consumer and "server" processors is just marketing, which is Intel's strongest talent (like Microsoft).
What are other advantages of 64 bit? (Score:2)
I agree with Intel (Score:2, Interesting)
Before you reply with a bunch of other reasons why my PCs are becoming more obsolete with each passing day anyway, think back to the transition between the 286 and 386. The 386 could run everything a 286 could run, and it performed much better. Due to the performance benefit, most applications that couldn't be run on a 286 wouldn't have run well on one anyway.
The transition to 64-bit on the desktop isn't going to be the same. While 640k may not be enough for everybody, 4GB is certainly enough for web browsing, wordprocessing and basic photo manipulation. I'd hate to see the horribly inefficient code that requires more than 4GB of RAM for such simple tasks.
Realistically, the force that will cause 64-bit to be a requirement on the desktop will be the version of Windows that no longer runs on 32-bit hardware. Windows XP's minimum requirements are:
If you look at the current system requirements compared to the current top end PC hardware, it's easy to see why Intel wants to hold off on production of 64-bit processors targeted for the desktop market.
Object spaces (Score:5, Insightful)
Re:Object spaces (Score:4, Interesting)
Intel is wrong, just like they were last time (Score:5, Interesting)
Intel didn't want to make the jump to 32-bit, so they introduced "segment registers". They tried to convince people that this was actually a good thing, that it would make software better. Of course, we know better: segment registers were a mess. Software is complex enough without having to deal with that. That's why we ended up with 32-bit flat address spaces.
64 bit address spaces are as radical a change from 32 bit as 32 bit was from 16 bit. Right now, we can't reliably memory map files anymore because many files are bigger than 2 or 4 Gbytes. Kernel developers are furiously moving around chunks of address space in order to squeeze out another hundred megabytes here or there.
With flat 64-bit address spaces, we can finally address all the disk space on a machine uniformly. We can memory map files. We don't have to worry about our stack running into our heap anymore. Yes, many of those 64-bit words will only be filled "up to" 32 bits. But that's a small price to pay for a greatly simplified software architecture; it simply isn't worth repeating the same mistake Intel made with the x86 series by trying to actually use segment registers. And code that actually works with a lot of data can do what we already do with 16-bit data on 32-bit processors: pack it.
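To make the "pack it" point concrete, here is a minimal sketch (names invented for illustration) of keeping two 32-bit values in one 64-bit word - the same trick we already use for 16-bit data on 32-bit CPUs:

    /* Hypothetical sketch: packing two 32-bit values into one 64-bit word. */
    #include <stdint.h>
    #include <stdio.h>

    static uint64_t pack(uint32_t hi, uint32_t lo)
    {
        return ((uint64_t)hi << 32) | lo;
    }

    static uint32_t unpack_hi(uint64_t w) { return (uint32_t)(w >> 32); }
    static uint32_t unpack_lo(uint64_t w) { return (uint32_t)(w & 0xFFFFFFFFu); }

    int main(void)
    {
        uint64_t w = pack(0xDEADBEEFu, 0x12345678u);
        printf("%08x %08x\n", unpack_hi(w), unpack_lo(w));  /* deadbeef 12345678 */
        return 0;
    }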
Even if having 4G of memory standard is a few years off yet, we need 64 bit address spaces. If AMD manages to release the Athlon 64 at prices comparable to 32 bit chips, they will sell like hotcakes because they are fast; but even more worrisome for Intel, an entirely new generation of software may be built on the Athlon 64, and Intel will have no chips to run it on. If AMD wins this gamble, the payoff is potentially huge.
Re:Intel is wrong, just like they were last time (Score:4, Informative)
Um.... no. Segment registers have been in Intel's products from the beginning (at least since the 8088). It wasn't a band-aid to stall adoption of 32-bit processors as you imply with the above comment.
The current 32-bit processors also have segment registers and you can use them with the "flat" address space. Some OSes (like Linux) just set all the registers to the same segment and never change them. But you could have separate segments for the stack, data, code, etc.
Re:Intel is wrong, just like they were last time (Score:3, Interesting)
Let's not forget the excellent Motorola 68K chips either. The 32bit addressing 68020 was introduced in 1984. It was used in many *nix workstations.
In 1985 Intel said the same thing they are saying now: This new CPU is for servers, you don't need it in workstations. They were wrong then. They are wrong now.
Re:Intel is wrong, just like they were last time (Score:3, Informative)
The 68020 was truly a computing milestone (Motorola's first fully 32-bit CPU) and it had excellent features, such as support for a full MMU (the external 68851) and an available FPU (the 68881), not to mention that it originally came in speeds up to 16 MHz and eventually up to 33 MHz. I used to have a Sun 3/260, which I later upgraded to a 4/260.
big mistake IMHO (Score:5, Interesting)
Intel currently owns the market for low end workstations and servers. If you need a web server or a cad station you get a nice P4 with some memory. This is also the market where the need for 64 bit will first come. At some point in time some people will want to put 8 GB of memory in their machine. AMD will be able to deliver that in a few months, Intel won't.
My guess is that Intel is really not that stupid (if they are, sell your Intel shares) and has a product anyway, but wants to recover the investment in its 32-bit architecture before introducing the 64-bit-enhanced version of the P4. The current P4 compares quite favorably to AMD's products, and AMD has had quite a bit of trouble keeping pace with Intel. AMD needs to expand its market, whereas Intel needs to focus on making as much money as it can while AMD is struggling. This allows Intel to do R&D, optimize its products, and ensure that it has good enough yields by the time the market for 64-bit processors has some volume. Then suddenly you'll need 64-bit to read your email and surf the web, and Intel will just happen to have this P5 with some 64-bit support. In the end, Intel will, as usual, be considered the safe choice.
The entire industry (Score:3, Interesting)
Who cares about 4GB? (Score:3, Interesting)
I wrote a little library that strings together a bunch of unsigned longs. It in effect creates an X-bit system in software for doing precise addition, subtraction, etc. This library would be considerably faster if I could string 64 bit chunks together instead of 32 bit chunks. Does no one on
What about bitwise operations like XOR, NOR, and NOT? You can now perform these on twice as many bits in one clock cycle. I'm not really into encryption, but I think this can speed things up there.
Many OSs (file systems) limit the size of a file to 4GB. That is WAY too small! This again stems from the use of 32-bit numbers. Once the adoption of 64-bit machines is complete, this limit will be removed as well. Again, 32 bits isn't just about RAM.
I could really go on all day. The point is this: twice the bits means twice the math getting done in the same amount of time (in some situations). So if people write their code to take advantage of it, you get all-around faster code and a larger memory space. Sounds like a nice package to me.
Really, give the 4GB limit a rest. Let's talk about some of the exciting optimizations we can do to our code to get a speed boost!
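Since the post above describes the multi-word technique only in words, here is a rough sketch of what such an add looks like; the types and names are made up for illustration, and on a 64-bit CPU the limb type could simply become a 64-bit integer so the loop runs half as many times:

    /* Minimal sketch of a fixed-size big integer built from 32-bit limbs. */
    #include <stdint.h>
    #include <stdio.h>

    #define NLIMBS 8                      /* 8 x 32 bits = a 256-bit integer */

    typedef struct { uint32_t limb[NLIMBS]; } bigint;  /* least-significant limb first */

    /* r = a + b; returns the carry out of the top limb */
    static uint32_t big_add(bigint *r, const bigint *a, const bigint *b)
    {
        uint64_t carry = 0;
        for (int i = 0; i < NLIMBS; i++) {
            uint64_t sum = (uint64_t)a->limb[i] + b->limb[i] + carry;
            r->limb[i] = (uint32_t)sum;   /* keep the low 32 bits */
            carry = sum >> 32;            /* anything above becomes the next carry */
        }
        return (uint32_t)carry;
    }

    int main(void)
    {
        bigint a = {{0xFFFFFFFFu, 1}}, b = {{1}}, r;
        big_add(&r, &a, &b);
        printf("low limbs: %08x %08x\n", r.limb[0], r.limb[1]);  /* 00000000 00000002 */
        return 0;
    }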
Re:Who cares about 4GB? (Score:3, Interesting)
And last I checked, most major x86 operating systems supported 64bit addressing for files.
And if you are thinking about RAM, x86 isn't limited to 4GB. It can support up to 64GB of physical RAM; Windows and Linux have both supported this for a while now... except for a few AMD chips (a number of recent AMD chips have microcode bugs which prevent you from addressing more than 4GB of RAM).
There actually are some cool things you can do in 64bit which you can't in 32bit. You listed none of them. However, they tend to be closely tied to OS architecture, and even then few OSes take advantage of them (they aren't the kind of things you can retrofit on).
Ha ha ha! (Score:3, Informative)
If you need big processing, you still buy the big iron. Next time you're at the airport and the ticket agent is checking you in, sneak a peek at the logos on the terminals they're using. Oh sure they'd love to upgrade to a spiffy new-fangled GUI based dingus, just no one's figured out quite how to do that.
When I signed on with IBM back in 1994 they were trying to replace their big iron with PCs. "By end of year 1995," they promised us, "all the mainframes will be gone and all our applications will run on Lotus Notes." Well here it is nearly a decade later and they still haven't replaced that big iron, and they'll never get rid of their RETAIN technical support database. No one can figure out how to deliver RETAIN's performance on any other platform.
Sure, today a mainframe might consist of over a thousand high-end desktop processors working in unison, but look how many processors they had to slap in there to deliver the performance the customers expect from that big iron. And those are all wired together and working closely, unlike that (much smaller) network cluster your latest clueless technical manager just suggested.
So what Intel is really saying here is their marketing department just realized that they will never deliver that kind of performance in a desktop or even in a 4 to 8 way "server" machine. The customers they're targeting will continue to purchase the big iron when they need that kind of processing power, and the "toy" shops are happy with the 32 bit processing power. By the way, Google essentially just built themselves a mainframe. I wonder how the cost of their solution would stack up against the biggest iron IBM currently provides...
You seem to be confused (Score:4, Insightful)
Supercomputers are almost purely CPU number-crunching beasts. This is what you seem to think of as mainframes with "over a thousand high-end desktop processors."
Most mainframes, like IBM's zSeries, have 24 to 36 CPUs. A mainframe is not about CPU performance; a mainframe is about data. A mainframe has system data throughput that puts almost any other system to shame. Historically, mainframes are good at supporting many simultaneously connected users doing data queries and updates. (Yes, they run huge databases very well.)
And then you get Beowulf clusters (your Google remark, effectively), which are really chasing the supercomputer market, and not the mainframe market. Beowulf clusters care about a limited class of supercomputer applications, they are good where you need a lot of parallel number crunching, and have very little data dependency between parallel calculations, so you don't need a lot of inter-cpu communications.
Pick the type that's right for your job, and you'll be happy. Pick the wrong one, and you'll have nothing but problems.
And it helps if you're stuck-up intelligently, that way people will still hate you, but won't think you're stupid any more.
When 64bit Desktop PCs Hit the Market... (Score:3, Interesting)
As fast as the hardware engineers struggle to keep up with Moore's law, shoddy programmers backed by cheapskate management labor to set the performance gains back.
Kids these days...
Re:When 64bit Desktop PCs Hit the Market... (Score:4, Insightful)
Whither VMware? (Score:3, Interesting)
With investment from Intel and Microsoft, they could release a cheap VM workstation optimized to run Windows only. They could even detect a 32-bit app starting up and shove it off to the VM, where it sounds like it might run faster. Well, easy for me to say, I guess. Make it so!
Also, MS is buying Connectix, but their VMs are below VMware's quality, and it seems they bought it mainly for the server product. But this strategy could still work for them; build the 64-bit Windows workstation with a built in 32-bit VM.
Different perspective (Score:3, Insightful)
Kind of like how a speed bump on the road can sometimes have a positive effect for traffic on the whole. Consider the current state of (desktop) software: it's rarely written with efficiency as an important consideration. Often there is not much incentive to do so: as long as it runs comfortably on decently new hardware, it's fine. As a result, people who are forced to use bottom-of-the-line hardware are screwed. (Like me - I'm running my webserver [cjb.net] on stone-age hardware, simply because I can't afford anything more.) In fact, Microsoft even goes to the extent of deliberately making its new releases require the latest hardware, to force users into an upgrade cycle. This is a Bad Thing.
Now consider the effect that the 32-bit speed bump will have. Applications like games will be affected first. Since they will have to add more features without getting more memory to work with, there will be an incentive for more efficient coding. In turn there will be pressure on the underlying libraries to be more efficient, and other apps using those libs will start benefiting. There will also be more programmers catching those memory leaks that eat tons of memory, rather than postponing them to a future release. More emphasis on software engineering in general.
The bottom line: more headaches for programmers for a couple of years, but smaller, faster, better software for a long time.
ZDNet (Score:5, Insightful)
Fairly typical for ZDNet: Linux is either downplayed or, as is the case in this article, ignored totally:
Currently, Itanium chips do not run regular Windows code well.
Windows software is designed to run on 32-bit systems.
'There hasn't been much OS support'.
Forget the number of years Linux has been running on a variety of 64-bit chips [google.com].
Articles like these are way too biased towards the Intel/Microsoft duopoly. I say go for it, Intel: AMD can produce stable, quality CPUs, and you and Microsoft can say to each other: "No one will ever need more than 4GB of memory."
Everybody else must be seriously jumping for joy. (Score:5, Interesting)
Apple:
- Well, now that they're most recently Going out of business [slashdot.org], in steps IBM to save the day for them... a new line of iMacs is going to do insanely well, considering it will be the only fully functional line of 64-bit personal computers, because I can pretty much guarantee Apple will have full-fledged 64-bit standardization before anybody else. Apple is going to see an insane surge in users, a lot of the multimedia software that's been migrating to PCs is going to be happy with the better, faster, and more powerful 64-bit hardware support and go back to developing for Macs... basically, Macs regain a lot of the status they've been quickly losing.
AMD:
- Hammer sales go up! If they're really lucky, Intel will either do a harsh (and hopefully inferior) yet still more expensive knock-off of Hammer, or they're going to release Itanium in a hurry because they realize businesses like the idea of progress and are starting to hop over to 64-bit architectures. So AMD will reclaim the status it lost about a year and a bit ago, when the P4 took the title of "best x86 on the market". Good on them.
Linux:
- Business as usual. Increased PPC support. Cool new Hammer patches, as well as the usual suspects (i386 still harshly dominating)
Microsoft:
- Well, maybe not everybody's jumping for joy... A lot of migration to PPC. But otherwise, they're still busy saying that "The Next New Windows Will Be Secure, And This Time We Mean It!" (tm).
That about it?
Re:Everybody else must be seriously jumping for jo (Score:3, Insightful)
I wouldn't bet the farm on this. The iMac was and is marketed at the average non-geek who couldn't care less about CPU bit width, memory addressing, or upgradability. And it will probably still be marketed at the non-geek when they go 64-bit.
Now, the full-on tower machines - those will be the machines to get for hot 64-bit CPU sex. Not as cheap as the iMacs, but a whole lot cheaper than, say, a Sun SPARC machine or other 64-bit box.
x86-64 (Score:3, Informative)
Here are some general specs on x86-64:
64-bit addressing
8 Additional GPRs (for a total of 16)
GPR width increased to 64-bits
8 128-bit SSE registers (for a total of 16)
64-bit instruction pointer and relative addressing
Flat address space (code, data, stack)
--Ace's hardware (http://www.aceshardware.com/read_news.jsp?id=100
The fact that x86 has had only 8 general-purpose registers has been the bane of its existence for quite a while... I think this will be the main source of speed improvement over existing 32-bit apps when they're compiled for the x86-64 architecture, not the fact that the system can handle more precise numbers.
As far as selling these things goes, having worked in video game retail, the consumer is already very conscious of the idea of an n-bit processor from all the old console hype, where the precision of the CPU was marketed as the primary "performance number" the way MHz are on desktop PCs.
--Shon
spin (Score:3, Interesting)
I'd like to see one of two systems. Either provide backward compatibility - like AMD with its 64-bit extensions - or start with a clean slate and produce a performer - like Digital's Alpha.
The advantage of a 64 bit AMD is that the most used architecture can migrate without dropping everything. My PII can still run DOS binaries that ran on my 8088. This is a GOOD thing. Even running Linux, I don't want to recompile all my apps, if I don't have to. If this were the case, I might have gotten a Power PC already.
The advantage that the Alpha has is speed, and there is only one kernel system-call interface - 64 bits. For example, there's no lseek() and lseek64() on the Alpha. (For the history buff: first there was seek() for 16 bits, then lseek() for 32 bits. We've been here before. Now we have the off_t typedef, so it should be easier to simply change it to be 64 bits... yet some have added off64_t in the name of backwards compatibility.)
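As a concrete illustration of the off_t point (a sketch, assuming Linux/glibc and a made-up file path): defining _FILE_OFFSET_BITS=64 on a 32-bit build widens off_t so the plain lseek() name works past 4GB, while on a 64-bit ABI off_t is already 64 bits wide and the lseek64() wart never appears:

    /* Must come before any system include so off_t is 64-bit on 32-bit glibc. */
    #define _FILE_OFFSET_BITS 64

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main(void)
    {
        printf("sizeof(off_t) = %zu bytes\n", sizeof(off_t));

        int fd = open("/tmp/bigfile", O_RDONLY);   /* hypothetical file */
        if (fd >= 0) {
            off_t pos = lseek(fd, (off_t)5 << 30, SEEK_SET);  /* seek to 5 GB */
            printf("now at offset %lld\n", (long long)pos);
            close(fd);
        }
        return 0;
    }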
Itanium may have the clean break (or it may not), but where's the speed? I'm not switching without something.
Digital's Alpha is at least the third attempt that Digital made before getting a RISC system to perform. The Power architecture is IBM's 2nd attempt. Sometimes you design it, and it just doesn't deliver. Move on!
When one looks at Digital's switch from 16 bits (PDP-11) to 32 bits (Vax 11/780), one notes that the new machines were more expensive, and about the same performance. I'd still rather have a Vax, because there are things that you can do in 32 bits that are painful in 16 (but not many).
It should be noted that throwing the address space at problems often slows them down. For example, Gosling's Emacs was ported from the VAX to the PDP-11. On the VAX, the file being edited was pulled into RAM completely. On the PDP, just a few blocks of your file were in RAM, in a paged manner. On the PDP, an insert (or delete) caused only the current page to be modified; if the current page filled up, it was split and a new page was created. On the VAX, inserts tended to touch every page of the file - which could make the whole machine page. It was quite obviously faster on the PDP-11. No one cares about this example anymore, since machines have so much more RAM and speed. But throwing the address space at video editing will show how bad this idea really is. Programmed I/O is smarter than having the OS do it: the program knows what it's doing, and the OS doesn't. Eventually, machines may have enough RAM and speed that no one will care, but it won't happen here at the beginning of the curve.
One problem that has not been solved is the memory management unit's TLB. This is the table on the chip that translates between virtual and physical memory. With 16 bits of address and 256-byte pages, only 256 entries are required to cover the whole address space. For 32-bit processors, the full page table just doesn't fit on the chip. So the TLB is a translation cache, and on a miss the page tables have to be walked to fill it - by hardware, or on some architectures by the OS.
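To put rough numbers on the TLB-reach problem (the 64-entry and 4KB figures below are just typical ballpark values, not tied to any specific chip):

    /* Quick arithmetic on TLB reach versus address-space size. */
    #include <stdio.h>

    int main(void)
    {
        unsigned long entries = 64;          /* assumed TLB size */
        unsigned long page    = 4096;        /* 4KB pages */

        printf("TLB reach: %lu KB out of a 4GB (32-bit) address space\n",
               entries * page / 1024);       /* 256KB -> misses are inevitable */

        /* The 16-bit case from the post: 2^16 bytes / 256-byte pages. */
        printf("16-bit space: %lu pages, so the whole table fits on chip\n",
               (1UL << 16) / 256);           /* 256 entries */
        return 0;
    }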
An alternative is to use extent lists. On my Linux system, the OS manages to keep my disk files completely contiguous 99.8% of the time. If this were done for RAM, then the number of segments that would be needed for a typical process would be small - possibly as few as four. One for text (instructions), one for initialized read only data, one for read/write data, BSS and the heap, and one for the stack. You'd need one for each DLL (shared library), but IMO, shared libraries are more trouble than they're worth, and ought to be abandoned. Removing any possibility of TLB misses would improve performance, and take much of the current mystery out of designing high performance software.
For this to work, you need the hardware vendor to produce appropriate hardware, and have at least one OS support it. The risk factor seems to have prevented this from happening so far...
We need 64-bit TODAY (Score:5, Insightful)
On a daily basis we're running into the Windows 2GB barrier with our next-generation content development and preprocessing tools.
If cost-effective, backwards-compatible 64-bit CPU's were available today, we'd buy them today. We need them today. It looks like we'll get them in April.
Any claim that "4GB is enough" or that address windowing extensions are a viable solution are just plain nuts. Do people really think programmers will re-adopt early 1990's bank-swapping technology?
Many of these upcoming Opteron motherboards have 16 DIMM slots; you can fill them with 8GB of RAM for $800 at today's pricewatch.com prices. This platform is going to be a godsend for anybody running serious workstation apps. It will beat other 64-bit workstation platforms (SPARC/PA-RISC/Itanium) in price/performance by a factor of 4X or more. The days of $4000 workstation and server CPUs are over, and those of $1000 CPUs are numbered.
Regarding this "far off" application compatibility, we've been running the 64-bit SuSE Linux distribution on Hammer for over 3 months. We're going to ship the 64-bit version of UT2003 at or before the consumer Athlon64 launch. And our next-generation engine won't just support 64-bit, but will basically REQUIRE it on the content-authoring side.
We tell Intel this all the time, begging and pleading for a cost-effective 64-bit desktop solution. Intel should be listening to customers and taking the leadership role on the 64-bit desktop transition, not making these ridiculous "end of the decade" statements to the press.
If the aim of this PR strategy is to protect the non-existent market for $4000 Itaniums from the soon-to-be-massive market for cost-effective desktop 64-bit, it will fail very quickly.
-Tim Sweeney, Epic Games
64-bit address space useful even without 4GB mem (Score:3, Informative)
AGP and PCI cards, especially newer video cards, are also getting big. These need to have address space allocated to them. Even with a 64-bit PCI card, Linux still surprisingly allocates address space in 32-bit memory (the lower 4GB). If 4GB of RAM is installed, Linux must create a "hole" for PCI cards and such, as there isn't enough address space for all the RAM plus the PCI cards. This reminds me of the bad old days of ISA, where the expansion cards had to sit between 640K and 1M, creating a hole between the first 1M and all later memory. This hole still exists!
And finally, there's lots of good reasons to have a huge address space that provides room enough for everything on the system at once. No need to decode multiple memory maps and translate between them. It would be a boon to things involving virtual memory, multiple programs, data transfer between programs, and so on.
BTW, I use a machine at work with 4GB of memory installed. It's running Linux 2.4. Even with HIGHMEM enabled, it is still a mess, because we need that memory to be available to the kernel and PCI devices, and not just in user space. Linux is very good at doing page table tricks with PAE (Physical Address Extensions) for user programs, but this isn't true in kernel space. I'm looking forward to real 64-bit machines!
Re:amd get leap on intel? (Score:2, Interesting)
Another technique for expanding the memory capacity of current 32-bit chips is through physical memory addressing, said Dean McCarron, principal analyst of Mercury Research. This involves altering the chipset so that 32-bit chips could handle longer memory addresses. Intel has in fact already done preliminary work that would let its PC chips handle 40-bit addressing, which would let PCs hold more than 512GB of memory, according to papers published by the company.
Re:amd get leap on intel? (Score:3, Interesting)
The problem with PAE (Score:5, Informative)
True, you could have a PIII with 10G of memory on it (in theory, anyway), but this would not help you for the common applications for which you need these quantities of memory - databases, video editing and so on.
In those tasks, you have ONE program that needs lots of memory. You ideally want to be able to take a multi-gigabyte file, and mmap() it so that it appears to your program to be just a stretch of memory. Then you can access the file with a simple pointer, and moving within the file is nothing more than pointer manipulation. You don't have to worry about paging the file in and out - that is the OS's virtual memory manager's problem.
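A minimal sketch of what that looks like in practice, assuming a 64-bit build and an invented filename: the whole file becomes a flat array, and "seeking" is nothing more than pointer arithmetic.

    #include <stdio.h>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("huge-dataset.bin", O_RDONLY);   /* hypothetical multi-GB file */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        fstat(fd, &st);                       /* st.st_size may be many GB */

        /* On a 64-bit machine this works even for a 20GB file; on 32-bit the
         * mapping simply doesn't fit in the ~3GB of user address space. */
        const unsigned char *data = mmap(NULL, st.st_size, PROT_READ,
                                         MAP_PRIVATE, fd, 0);
        if (data == MAP_FAILED) { perror("mmap"); return 1; }

        unsigned char byte = data[st.st_size / 2];   /* "seek" = pointer math */
        printf("middle byte: %u\n", byte);

        munmap((void *)data, st.st_size);
        close(fd);
        return 0;
    }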
PAE won't help you in those cases. At best, you can back some of the buffer cache with the PAE memory, creating in effect a glorified RAM disk.
PAE is great if you have a machine running hundreds of processes, each of which takes 100M of space. But this usually is NOT the case.
Just as machines with more than 1MB of memory started out as the province of the high-end user and slowly moved down, 64-bit address space on the desktop will start out as the province of the high-end folks first, then move down as it becomes more common.
I would guess the likely sequence will be something like:
1) We *nix folks had it first - I was running 64 bits on my Alpha years ago. But we are not "the masses", and so will be ignored by the mainstream.
2) The Macs will be next - Apple will port Mac OS X to the newer 64-bit Power chips. This will greatly simplify video editing - one of Apple's favorite areas to compete in. A 64-bit Apple will make the Mac the chosen platform for video editing of large files. (NOTE: a 40-minute capture from my FireWire camcorder is a couple of gigs - so already the home consumer is getting close to needing this.)
3) Windows will finally release a 64-bit OS (also note: they could have done this YEARS ago on Alpha, but didn't - Windows NT on Alpha could only access a 32-bit address space). Microsoft will hail this as a revolutionary breakthrough - "Windows AYCABTU is the first 64-bit OS for the home user!" - and *nix and Apple users will scratch their heads in puzzlement.
Re:The problem with PAE (Score:5, Insightful)
We know that Microsoft actually bothered to write an Itanium-native 64-bit version of Windows XP Professional; it doesn't take much to figure out that Microsoft is right now coding an Athlon 64/Opteron 64-bit native version of Windows XP. My guess is that Windows XP for the Athlon 64 will be released commercially about the same time as the Athlon 64 is released (circa September 2003).
Re:Apple is already RISC... (Score:5, Informative)
RISC vs. CISC (Score:5, Informative)
RISC and CISC are the two main families of processors out there these days. RISC roughly means that each instruction is a single fixed-size word containing both the opcode and its operands. A CISC chip is one in which the opcode tends to come first and the operands follow in the bytes or words fetched after it.
My CMPT 150 course (introduction to Computer Design) was done entirely with a Motorola HC11 Processor emulator, which is a CISC processor.
The advantage of RISC processing is that you can put in "pipelining", which basically means buffering data at several stages throughout the CPU. This means a single opcode/operand chunk takes x clock cycles to get through (x being the number of pipeline stages), but it also lets the processor do multiple things at once: after the first instruction reaches the last stage, there's another one right behind it for the next clock cycle, so a RISC processor can complete a new instruction on every single clock cycle.
Confused yet? Let me put it this way...
Pretend that your CPU is a plumbing system, with water streaming through hot and cold pipes to deliver a preferred temperature for the water. Now, the water temperatures are your CPU data (signals, bits, whatever...) and your pipes are your CPU circuitry.
Now, you want to send a big chunk of hot water down to the bottom of your pipe system using a bunch of intermediary valves (or/and/not/xor gates) and a specific pathway (Let's not ask why, let's just assume you want to do that). Now, say right after that you want to send a bunch of cold water down a similar path, but not necessarily the same path, however you will want to use some of the same pipes.
Now, with a CISC processor, what you would do is send down the hot water, occasionally storing it in some pipes whilst you send down the cold water, and the sheer design of the system would keep the hot and cold water separate, so you would be able to output your hot water, and then output your cold water, once they have gone through their systematic storage and movements around.
The annoying thing about this is you need a sophisticated CPU to do it. And you need a bunch of clock cycles to open and close the valves and whatnot and finally get your desired output.
Now, a RISC processor does something a bit smarter.... It throws your hot water in (First clock cycle) and just lets the valves automatically trickle to the bottom, and then, on the second clock cycle, send the cold water down. The downside of this is the fact that your single clock cycle is going really slow, which means you have a big lineup of people requesting hot and cold water and they have to wait for it to come out (Lag, for those taking notes in computer-world).
So, we instate pipelining.
Pipelining is a bunch of basins (let's say 4) that appear at different levels of the pipe system.
So, you dump your hot water in the top basin. (First clock cycle)
Then, you unlock the basin and let it dump into the second basin. Once it's done that, once again, seal the basin and dump your cold water in. Now, (second clock cycle) open the plugs for both basins, and your hot water goes down the tubes (magically) before the cold water shows up and you can re-plug your basin. Now you have room for more water in the top basin.
Every move into a new basin is a clock cycle, so it takes 4 clock cycles for the water to finally reach the bottom so you can do whatever the hell it is you want to do with hot or cold water. However, these are relatively quick clock cycles compared to the clock cycle you had in your non-pipelined RISC architecture. And ultimately, once the first output reaches the bottom, you only have to wait a single clock cycle for the one right after it, rather than the oh-so-many clock cycles you would have waited in your CISC architecture.
Did that make sense to anybody? I hope it did.
Re:RISC vs. CISC (Score:5, Insightful)
The real difference between x86s and RISCs is that the x86 ISA was designed without consideration for contemporary CPU design technology (that is/was available at the time), while RISCs supposedly are. But anyone who has looked under the hood of these CPUs will see that this has not impeded the modern x86s. x86s are more complicated (and therefore in theory should probably be either a bit larger or slower), but as time has shown, instruction set complications are not the only consideration in CPU design.
All x86s are pipelined, and in fact use the absolute latest CPU design techniques. The Pentium 4, in fact, has pseudo-double-clocked integer ALUs and hyper-threading. Neither of these is available in any RISC CPU.
Re:Does 64 bits slow memory down? (Score:3, Informative)
Re:Does 64 bits slow memory down? (Score:5, Interesting)
In case you're wondering about constants: the PPC only supports loads of 16-bit immediate values (into either the lower or upper 16 bits of the lower 32 bits of a register), so to load a 64-bit value you may have to perform up to 5 operations (two loads, a shift, and two more loads). So a PPC needs up to 64 bits of instruction stream for a 32-bit immediate load and up to 160 bits to load a 64-bit value (unless you store such a value in a memory location that can be addressed in a faster way). These are worst cases, however, and in a lot of cases one or two instructions are enough.
The main downside of 64bit code is that all pointers become 64bit, so all pointer loads and stores indeed require twice as much storage and bandwidth.
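The pointer-size point is easy to see for yourself; this tiny program prints 4/4/8 on a typical 32-bit build and 8/8/8 on an LP64 system such as Alpha or x86-64 Linux:

    /* Pointer and integer widths on the current build target. */
    #include <stdio.h>

    int main(void)
    {
        printf("sizeof(void *)    = %zu\n", sizeof(void *));
        printf("sizeof(long)      = %zu\n", sizeof(long));
        printf("sizeof(long long) = %zu\n", sizeof(long long));
        return 0;
    }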
Re:Does 64 bits slow memory down? (Score:3, Informative)
Just because a 64 bit processor can handle 64 bit integers doesn't mean that it can *only* deal with 64 bit quantities or that its instructions are necessarily 64 bits long.
As an example, take PPC-64. Its instructions are still 32 bits long and are basically identical to PPC-32 except for those instructions dealing with 64 bit quantities, which PPC-32 doesn't have. All pointers (memory addresses) are 64 bit but you may use any size integer you wish, from 8 bit to 64 bit, depending on what you need.
You forget... (Score:5, Insightful)
In the beginning, no one really needed a PC either. It is not need that drives the tech market, it's want.
Bill Gates claims he did not say 640K is enough (Score:5, Interesting)
One quote from Gates became infamous as a symbol of the company's arrogant attitude about such limits. It concerned how much memory, measured in kilobytes or "K," should be built into a personal computer. Gates is supposed to have said, "640K should be enough for anyone." The remark became the industry's equivalent of "Let them eat cake" because it seemed to combine lordly condescension with a lack of interest in operational details. After all, today's ordinary home computers have one hundred times as much memory as the industry's leader was calling "enough."
It appears that it was Marie Thérèse, not Marie Antoinette, who greeted news that the people lacked bread with qu'ils mangent de la brioche. (The phrase was cited in Rousseau's Confessions, published when Marie Antoinette was thirteen years old and still living in Austria.) And it now appears that Bill Gates never said anything about getting along with 640K. One Sunday afternoon I asked a friend in Seattle who knows Gates whether the quote was accurate or apocryphal. Late that night, to my amazement, I found a long e-mail from Gates in my inbox, laying out painstakingly the reasons why he had always believed the opposite of what the notorious quote implied. His main point was that the 640K limit in early PCs was imposed by the design of processing chips, not Gates's software, and he'd been pushing to raise the limit as hard and as often as he could. Yet despite Gates's convincing denial, the quote is unlikely to die. It's too convenient an expression of the computer industry's sense that no one can be sure what will happen next.
Click here [nybooks.com] to read the full article.
bah! (Score:4, Insightful)
No friggin way! They're going to go with AMD Opteron.
Cheap 64-bit computing is right around the corner, and Intel is going to be playing catch-up real soon now.
And with more and more people getting into editing their own videos, people are going to want 64-bit computing sooner than Intel is letting on.
Then again, I could be wrong. I'm wrong "alot" :)
Re:bah! (Score:3, Interesting)
not anymore (Score:3, Informative)
Not anymore. With iMacs coming with decent video-editing tools, and consumer versions (only $300) of Final Cut, and other tools, Joe User is getting interested in this stuff.
Not to mention students in film school, etc. 64-bit procs sure could be useful to them in the near future.
I dunno though, I guess 4 GB is still enough for most Joe Users for now... But just wait for Windows XP 2004 3.1!
Re:bah! (Score:3, Interesting)
Was a specialized enterprise. Not anymore; witness iMovie or Final Cut Express.
I am still stunned by this. I remember building and demo'ing Media 100 systems in 1997; you needed at least $20k for something reasonable (i.e. Big Mac w/gobs of RAM, SCSI arrays, specialized PCI board and breakout box, industrial VTR, preview monitor, time-base corrector...) and that didn't get you fancy realtime effects.
A $1500 iMac just spanks the crap out of this system I used to sell, requires no extra hardware (firewire is beautiful), and the quality is superior.
So, past tense.
Now, back on topic: accessing 4GB of memory is very desirable in this situation; 4GB of DV footage is measured in minutes. It would be nice to manipulate more than minutes in RAM, no? (Also, RAM Preview in After Effects would be really sweet.)
definition of 64-bit (Score:5, Informative)
What's the big difference between 32-bit processors and 64-bit processors?
A 64-bit machine can address more than 4 GB of memory without funky segmented addressing kludges. This has applications in scientific simulation and database managers.
A 64-bit machine can also handle 64-bit integers as a native data type. This is important for encryption, number theory, financial applications dealing with money over $40 million, etc.
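As a rough illustration of the money point (amounts stored as an integer number of cents; the exact dollar cutoff depends on whether you treat the 32-bit value as signed or unsigned):

    /* Where 32-bit cent counters top out, and why 64-bit ones don't. */
    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    int main(void)
    {
        printf("signed 32-bit cents max:   $%.2f\n", INT32_MAX / 100.0);        /* ~ $21.4 million */
        printf("unsigned 32-bit cents max: $%.2f\n", UINT32_MAX / 100.0);       /* ~ $42.9 million */
        printf("signed 64-bit cents max:   $%.2f\n", (double)INT64_MAX / 100.0);/* ~ $9.2e16 */

        int64_t balance = 50000000LL * 100;   /* $50 million: overflows 32 bits, fine in 64 */
        printf("balance: %" PRId64 " cents\n", balance);
        return 0;
    }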
Re:Sorry my ignorance but... (Score:3, Funny)
Re:Sorry my ignorance but... (Score:4, Informative)
Re:I wonder what would happen if...... (Score:5, Interesting)
I've heard that Microsoft is developing an Athlon 64/Opteron native version of Windows XP; if that is true then gaming companies involved with PC-based games may be already creating games that run in native Athlon 64/Opteron 64-bit mode under Windows XP as I type this.
Re:Microsoft's top five arguments for 64-bit WinXP (Score:3, Informative)
Thanks for the M$ marketing hype. Sure, that's what all the boxes and packaging say, but there's more to it than that.
Ever used a pointer? Ever taken the size of a struct? Ever assumed a certain page size? Ever written a mask for MMIO? Check your sign extension so your masks don't barf? These are some issues you encounter when your machine word size or address size changes.
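Two of those gotchas, sketched out in C (a toy example, not from any real codebase): a width-blind mask that silently clears the upper half of a 64-bit address, and a struct that quietly grows when pointers and longs double in size.

    #include <stdio.h>
    #include <stdint.h>

    struct record {
        char  tag;     /* 1 byte plus padding                 */
        void *ptr;     /* 4 bytes on 32-bit, 8 bytes on 64-bit */
        long  count;   /* ditto on LP64 systems               */
    };

    int main(void)
    {
        /* Classic MMIO-style mask bug: 0xFFFFF000 is a 32-bit unsigned constant,
         * so it widens with zeros and the AND clears address bits 32..63. */
        uintptr_t addr = (uintptr_t)0x123456789ABCull;
        uintptr_t page = addr & 0xFFFFF000;            /* wrong on 64-bit       */
        uintptr_t ok   = addr & ~(uintptr_t)0xFFF;     /* width-correct version */

        printf("buggy mask: %#lx  correct mask: %#lx\n",
               (unsigned long)page, (unsigned long)ok);
        printf("sizeof(struct record): %zu bytes\n", sizeof(struct record));
        return 0;
    }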
Emulation has always been a joke not to be taken seriously.
Applications don't just magically work in a 64-bit OS, except maybe hello world or stuff that sticks entirely to libc.