Porting to 64-bit Linux 120
An anonymous reader writes "As 64-bit architectures continue to gain popularity, it is becoming more and more important to make sure that your software is ready for the shift. IBM developerWorks takes a look at a few of the most common pitfalls to watch for when making sure your applications are 64-bit ready. From the article: 'Major hardware vendors have recently expanded their 64-bit offerings because of the performance, value, and scalability that 64-bit platforms can provide. The constraints of 32-bit systems, particularly the 4GB virtual memory ceiling, have spurred companies to consider migrating to 64-bit platforms. Knowing how to port applications to comply with a 64-bit architecture can help you write portable and efficient code.'"
Just a recompile? (Score:1, Informative)
Re:Just a recompile? (Score:1)
Not quite (Score:1, Interesting)
I'm sure it won't affect your VB app, but it could affect something written in C/C++.
I'm just wondering if this is what is holding up an AMD64 version of Flash [macromedia.com].
Re:Not quite (Score:2)
Or your Python, Perl, Pascal, Ruby, Tcl, Java, COBOL, FORTRAN, PL/1, Prolog or Forth programs.
I'm just wondering if this is what is holding up an AMD64 version of Flash.
Sloppy code?
Well written portable code is fine (Score:2)
In theory, yes. However, programmers often make careless mistakes, like assuming that sizeof (int) == sizeof (void *) and concluding that they can cast pointers to int and back again, while on a typical LP64 platform (like AMD64) ints are 32 bits wide and pointers 64 bits, so the cast will not work as expected.
All in all, nothing new here: well written portable code just needs a recompile, and everything else will need to be debugged.
Re:Well written portable code is fine (Score:2)
Re:Well written portable code is fine (Score:1)
Even GNU systems will differ because of differences among CPU types--for example, difference in byte ordering and alignment requirements. It is absolutely essential to handle these differences. However, don't make any effort to cater to the possibility that an int or a pointer will be other than 32 bits. We don't support 16-bit machines in GNU.
A lot of gnu (and unix) code assumed 32 bit ints, 32 bit pointers. Could have been worse... a lot of ear
Re:Well written portable code is fine (Score:1)
Which reminds me of something... Isn't it time for C and its likes to let us specify explicitly how many bits we want for a variable? I would like to tell the compiler that a variable should be _exactly_ 32 bits and another one _at least_ 64 bits. It appears strange to me that an int seems to be allowed to be, well, anything it wants to be, and I will never know. Throw in arbitrary precision floats too while we're at it. If the hardware cannot handle 523-bit
Re:Well written portable code is fine (Score:2)
However, you could implement this in a library in C++ using compile-time templates, implying very little run-time overhead (if any) for stuff that can be co
Re:Well written portable code is fine (Score:1)
But it's already done! Even if you lower your expectations, not requiring the processor to do 153-bit floating point maths, but settling for, say, 32 bits, not all processors will be able to give that to you. Thus, if you're feeling modern and brave, using floats anyway, the compiler will have to emulate it for you. Anything less, I think, would require knowledge of the instructions present on the target C
Re:Well written portable code is fine (Score:2)
Proving that those who don't use Common Lisp are doomed to reimplement it...
Re:Well written portable code is fine (Score:2)
Re:Well written portable code is fine (Score:1, Informative)
However, using something like "int_least32_t" directly in your
Re:Well written portable code is fine (Score:2)
Most C programmers don't even think about issues like arithmetic overflow in their code
Well, that's negligence if I've ever heard of it. I thought only Java programmers were allowed to do that.
Specifying bit lengths (Score:2)
typedef unsigned char uint8;
typedef unsigned short uint16;
And so forth. Then I exclusively use the new types. If I need to compile to another platform, I just need to change the portable.h file.
You can even go one further with:
#if sizeof(unsigned char) == 1
typedef unsigned char uint8;
#else
#error "No uint8 type available"
#endif
Re:Specifying bit lengths (Score:2)
#if sizeof(uint8)!=1
is not valid syntax. Well, I'm sure there's a way to do it properly. Anyone?
Re:Specifying bit lengths (Score:2)
#else
/* sizeof(uint8) != 1 */
#endif
maybe... i'm not sure if the problem is that the preprocessor only understands == and not !=, or if it's something different
Re:Specifying bit lengths (Score:2, Informative)
Re:Specifying bit lengths (Score:2)
Re:Specifying bit lengths (Score:1)
Re:Specifying bit lengths (Score:2)
Re:Specifying bit lengths (Score:1)
Cool! Thanks!
Re:Specifying bit lengths (Score:2)
typedef unsigned char uint8;
#else
#error "No uint8 type available"
#endif
That way the compiler will warn you if there's a problem when you switch platforms.
Um, no, actually it won't, because the sizeof operator returns the size of the type in chars. That is to say, sizeof(char) is 1 by definition, regardless of the number of bits in a char.
Re:Specifying bit lengths (Score:2)
typedef unsigned char uint8; typedef unsigned short uint16;
And so forth. Then I exclusively use the new types. If I need to compile to another platform, I just need to change the portable.h file.
Two comments:
Get yourself a compiler/stdlib which implements the official C99 known-size types: uint8_t, uint16_t and so on. Or have fun when
Re:Specifying bit lengths (Score:2)
Re:Well written portable code is fine (Score:2)
People who have forgotten COBOL and Binary-Coded Decimal are doomed to repeat it, poorly.
Re:Well written portable code is fine (Score:1)
Re:Well written portable code is fine (Score:2)
COBOL uses IBM's version of BCD. Decimal numbers are a fundamental data type in COBOL, and IIRC you can make them as big as you want with PICTURE clauses.
Re:Well written portable code is fine (Score:2)
Ada allows you do what you want. You want an integer that is always 32 bits:
type Int32 is new Integer range -2**31 .. 2**31 - 1;
for Int32'size use 32;
Fortran 90/95 (Score:2)
Sounds like you want the "kind" functionality of Fortran 90.
Back in the bad old days, a REAL in Fortran could be anywhere from 32 to 64 bits - a program that ran fine using REAL on a CDC-6600 (60 bits) might die horribly using REAL on an IBM 360 (32 bits but usi
Even well written code can have problems (Score:3, Insightful)
Specifically, say I have a 64 bit platform capable of running both LP64 code and ILP32 (legacy) code.
I use a shared memory segment to communicate between my legacy 32 bit applications, and it has internal use of pointers to perform self-reference on data.
[Rather than complicating things, let's just assume that the pointers are internally based off the base address of the shared memory segment, rather than being based off of 0, so there is no requirement of mapping th
Re:Just a recompile? (Score:5, Informative)
In addition, and this is hellish, a 32-bit MOV is (generally) atomic on x86. You can rely on the high-order word and the low-order word staying together, without race conditions. The memory access semantics are different on x64 and many other platforms. This is not related to 64-bitness per se; you could see it if you ported to multi-threaded 32-bit PPC as well, but it will still surface when you do the transition to AMD64/EM64T/x64. Or rather, it will result in an additional one-in-a-million crash in your code that you'll blame on bad memory chips in the user's machine.
very wrong (Score:2)
sizeof(int)==4
the "long" and "void*" data types may be atomically written, 64-bit or not
Also:
sizeof(long)==sizeof(void*)
sizeof(long long)==8
This is quite standard for 32-bit and 64-bit systems. The only major OS to
violate this is Win64, which kept a 32-bit long and thus can't safely cast
a void* to long and back again. Linux, BSD, Solaris, MacOS 9, Win32, OS/2,
VMS, VxWorks... they all work as Linux does. (screw Win64)
Re:Just a recompile? (Score:2)
Re:Just a recompile? (Score:1)
The AMD guys certainly don't seem to be sure that AMD64 satisfies this. See this thread [amd.com].
Further, I suspect that none of the processors support atomic writes of 64-bit values that are not aligned on an 8-byte boundary. If your code has not been written to ensure that values are always on appropriate boundaries (and it's very easy to get this wrong, even if you're aware of the issue), this will probably bite you. At work, we run a lot of software on both Intel and Sparc processors. It is far from unusual
Re:Just a recompile? (Score:2)
Re:Just a recompile? (Score:1)
So what you're saying is that if you have a variable, that is shared between two (or more) threads and you haven't protected it with e.g. a mutex, a condition variable or a semaphore, you could have a problem?
Well, yeah, isn't that parallel computing 101?
Re:Just a recompile? (Score:3, Insightful)
no (Score:5, Insightful)
As a general rule, "just a recompile" *never happens* for any architecture and compiler change on a project above a certain size. Compiler writers break compatibility with some little ol' thing they don't think anyone is using (but which everyone is actually using) in *every* version, fail to implement uncommon or difficult language features, and add nonstandard features that other compilers don't support. Then application developers do things like not swapping to network byte order and using architecture-dependent data types (size_t as in the example). Between different unices, header file contents will change.
The fixes are often not that hard (usually trivial) between, say, versions of the same compiler, or endian switches... but they are still there and annoy the hell out of people trying to compile old open source software on a new platform, like, say, Mac OS X was a few years ago and x86-64 is now. There's always growing pains.
Re:no (Score:2)
In general, I agree. But the example is not a good one. Dumping da
Re:no (Score:2)
Re:Just a recompile? (Score:2, Insightful)
In my experience, most of the problems will center around mixing non-pointer types with pointer types. Mostly around bounds checking, offsets into arrays, pointer arithmetic, etc.
"didn't change the size of a word" (Score:2)
To AMD and Intel, a word is 16 bits. This is seen in the Intel-style assembly that masm and nasm use.
By the ELF binary specification, a word is 32-bit or 64-bit according to the platform. So the word size did change.
The traditional idea, with a word being the size of a register, is like the ELF spec.
The C programming language has no such thing. On both i386 and x86-64, sizeof(int)==4 and sizeof(long long)==8. On x86-64, sizeof(void*)==8. On i386, sizeof(void*)==4.
Re:"didn't change the size of a word" (Score:2)
Re:Just a recompile? (Score:3, Insightful)
Do you realise how difficult it is to find a healthy goat and sacrificial knife these days?
Re:Just a recompile? (Score:1)
Re:Just a recompile? (Score:1)
Ideally, that would be the case, but in the real world, not really. A couple weeks ago I got an AMD64 box, and since then I've been working on porting my Linux distribution over. Not exactly the hardest thing I've done, but nowhere near the easiest, either.
Re:Just a recompile? (Score:1)
/lib was botched, so yes you must port libraries (Score:2)
Then AMD told SuSE to make x86-64 run all i386 binaries perfectly, including installers that would expect to use the
So now we're supposed to use
Re:/lib was botched, so yes you must port librarie (Score:3, Informative)
Nah. That's a bit of a pessimistic outlook. Already today, /lib64 is a mere symlink to /lib on current distributions. The symlink may have to be kept around for a while, though, until the early nomenclature oopses have been effectively phased out.
Re:/lib was botched, so yes you must port librarie (Score:2)
64bit ain't all it's cracked up to be.. (Score:3, Funny)
Re:64bit ain't all it's cracked up to be.. (Score:2)
Then the problem is fixed.
Re:64bit ain't all it's cracked up to be.. (Score:2)
Re:64bit ain't all it's cracked up to be.. (Score:2)
In theory 64bit should be good for 17 179 869 184 GB. Granted, on AMD64 the process can only address 1 TB of RAM, and I think 256TB of virtual memory. However, that is a result of chip design and not arch, per se - so that can be raised without breaking compatibility at the software level.
I believe Linux supports the full range, or close to it.
Re:64bit ain't all it's cracked up to be.. (Score:2, Informative)
There's space in the page table entries to handle 64 bits, but adding extra levels to the translation probably has a performance impact. There's still debate [stanford.edu] about the best way to do 64 bit address translation, and 52 bits is plenty for now. And when they change, it will only affec
Re:Sheesh... (Score:3, Insightful)
I suspect what you're saying is that there is no particular need for 64-bit in most apps, which I agree with. But the point here is that the program should work correctly, which means code that makes assumptions like pointers and ints being the same size needs to be fixed. The point is that amd64 is making 64-bit platforms relevant to more users, not that everyone thinks most apps will be gee-
Most of the time it's easy. (Score:5, Informative)
The only area where I've run into things being significantly harder is writing clean lock-free algorithms, due to the lack of a CMPXCHG16B instruction in the original spec; only EM64T and very recent AMD64 models have it. There are a couple of ways to hack around this limitation, but they aren't very pretty.
Re:Most of the time it's easy. (Score:1)
Re:Most of the time it's easy. (Score:2)
Chicken-egg problem with libraries (Score:2)
Finally, when I did get it working, the maintainer didn't have a 64-bit OS so they weren't interested in hosting the RPM I built. It seems like until enough people have 64-bit systems, nobody really cares about it.
Re:Chicken-egg problem with libraries (Score:2)
I had a very similar experience when working on 64-bit Linux six years ago, in the days of Debian on Alpha. I ended up lifting many libraries out of the NetBSD tree and rebuilding them for Debian, because it was the only project at the time that was meticulously cleaned up to be both endian-clean and int-size-clean.
After getting some things working I could not get the packages and patches back because people could not verify them.
Been there, done that (Score:3, Informative)
I've been running a 100% 64-bit dual Opteron rig for almost two years, under Gentoo. No emulation libraries, no multilib, just 64-bit code. Other than OpenOffice, I've had almost no trouble at all.
BTW, "64 bits" doesn't make programs run faster (in general): code compiled for AMD64/EM64T runs faster than its 32-bit counterpart (for the most part) because of the extra general-purpose registers in the AMD 64-bit design.
Re:Been there, done that (Score:1)
Use stdint.h! (Score:5, Informative)
Re:Use stdint.h! (Score:2)
Umm, try NetBSD maybe? It's the most portable sy
Re:Use stdint.h! (Score:2)
32-bit big-endian: SPARC
64-bit little-endian: Alpha
32-bit little-endian: IA-32
64-bit little-endian: EM64T
More subtleties can arise ... (Score:5, Informative)
One example:
on a 32-bit Intel machine, a double is precise enough to distinguish LONG_MAX (the highest representable long) from LONG_MAX+1 (a number that doesn't fit in a long anymore). So for instance, to determine whether a long multiplication has overflowed, you could repeat the same multiplication using doubles and compare the result to (double)LONG_MAX.
In contrast, on a 64-bit platform LONG_MAX and LONG_MAX+1 are mapped to the same double representation, so there's no way to do the comparison anymore.
As this example involves static casts, it is something the compiler will usually not warn you about.
Another thing to be careful about is passing pointers to variadic functions (eg. sscanf), because usually the compiler doesn't know the expected types, as they are buried in the format string, not in the function prototype.
Re:More subtleties can arise ... (Score:1, Informative)
on a 32-bit Intel machine, a double is precise enough to distinguish LONG_MAX (the highest representable long) from LONG_MAX+1 (a number that doesn't fit in a long anymore). So for instance, to determine whether a long multiplication has overflowed, you could repeat the same multiplication using doubles and compare the result to (double)LONG_MAX.
That seems like a terrible way to do it... couldn't you just find the highest set bit position in the multiplicands and add?
Or better yet, IIRC when yo
Re:More subtleties can arise ... (Score:1)
Re:More subtleties can arise ... (Score:2, Insightful)
However, while it is indeed a hack, I would like to challenge you to suggest a better version that:
a) is as portable (so no assembly for checking overflow flags that for instance Alpha doesn't have)
b) uses only 1 multiplication and a comparison in case of overflow, and 2 multiplications otherwise.
The point is, from a pra
Re:More subtleties can arise ... (Score:2)
const R a = 5, b = 7, c = 1;
R r = ((x + a) * (y + b)) / (z - c);
Practice What You Preach (Score:2)
Going off on a tangent:
I have no idea what a 36 bit signed-magnitude integer mainframe (( Yeah, they really existed -- CDC made them )) would return for *(unsigned char *) (int)-2. It would probably be 0x80 or 0x40 -- but it might be 0x800 (CDC used 6 bit characters, an
9-bit (Score:3, Informative)
C requires at least 8 bits for char, so 6 isn't good enough.
All types must be a multiple of the size of char, because
sizeof(char) is 1 by definition and fractions are not OK.
Valid sizes are thus: 9, 12, 18, 36
The char-short-int-long progression may be one of:
9,18,18,36 a likely choice
9,18,27,36 this is the cool way: sizeof(int)==3
9,18,36,36 a likely choice
9,27,27,36
9,27,36,36
9,36,36,36 a likely choice
12,24,24,36
12,24,36,36
12,36,36,36
18,18,18,36
18,18,36,36
18,36,36,36
36,36,36,36
Re:9-bit (Score:2)
Re:9-bit (Score:2)
24-bit shorts make perfect sense. I suspect they got smart about powers of two after making the mistake of using a 36-bit word, and decided not to have sizeof(long)==3.
So char was 9 and long must have been 36. (long could have been bigger, but I doubt it was) The remaining two were most likely 18 or 36. These are most likely:
9/18/36/36
9/18/18/36
Also, 9/36/36/36 was somewhat likely.
DEC or Univac - not CDC (Score:2)
CDC made 48 bit machines (1604 and 3000 series) and 60 bit machines (6000 series, 7600 and some Cybers), but not a 36 bit machine AFAIK. The 6600 had 60 bit reals and long ints, 18 bit short ints, and 12 bit words for the peripheral processors - a real PITA for C.
Written For Macromedia (Score:1)
64 bit porting is more of a compiler problem (Score:2)
In particular, the GNU toolchain has a very poor ability to complain about long/int coercion. It also doesn't have a 64 bit pointer type for use in 32 bit code - so any 32 bit code you need to talk to from 64 bit ends up handing around a long long, and since this is just an integer type, there's no problem with assigning it to another integer type, and potentially losing resolution (and bits off the pointer, should it be converted/passed back).
Minimally, the too
Re:64 bit porting is more of a compiler problem (Score:1, Informative)
The `quad' type was a BSD anachronism, and in any event "Unix" specific (quad what? C doesn't guarantee that char is an oct
GNU toolchain and not giving warnings (Score:2)
You are incorrect.
The following is some code that does not warn that the resolution of "long l" is potentially insufficient to store the value contained in "long long ll":
Rely on compiler warnings (Score:2)
Then -- patiently fix them all. You know, you planned to do that for years. Do it before trying to build a 64-bit version.
Then -- try the 64-bit version and fix all the warnings you missed before. void * to int conversions are my personal favorites...
Resist the temptation to invent your own types, though (Mozilla's source tree is awful in this regard). Use the standard int32_t or uint64_t, where the number of bits mat
My experience (Score:1)
Re:My experience (Score:2)
Now... get it to actually work, be backwards compatible with your 32 bit compiler, and do something even slightly nontrivial. Say, pack a struct into a datagram on your 64 bit host, send it to a 32 bit host, and unpack it into a struct with the same byte alignment (using the same code on both hosts).
Re:Just use a modern language (Score:1)
http://www.informit.com/guides/content.asp?g=cplu
Re:Just use a modern language (Score:2)
Indeed. Having been through the horrors of 16-bit to 32-bit transition on Windows in the early 90s, it is great to be developing in Java, knowing that I don't have to care about such matters again. I let the JVM translate my bytecodes to high-performance machine code on whatever platform I am on, no matter what the word length.
Re:Just use a modern language (Score:2)
You are missing the point here. I don't want to have to recompile all my stuff to suit these different machines. With Java, if a machine is running a J2SE 1.5 VM, it will run my single binary, no matter what the bit size of the target machine.
Maintaining a source repository and having to rebuild it for a range of different target architectures (and support it on those architectures) is
Re:Just use a modern language (Score:2)
All the Java BitString libraries I have used have been platform independent; which makes sense, as a Java program has no indication of the word size of the underlying platform.
So what you said isn't really correct in all situations. Look at your code.
Eh? I look at my code all the time!
Re:Just use a modern language (Score:1)
http://www-128.ibm.com/developerworks/java/jdk/64
Many Java applications are not written 100% in the Java language. Those apps will need some porting effort. The document also mentions considerations such as the usage of JNI by the native libraries on a 64 bit system.