Python-to-C++ Compiler
Mark Dufour writes "Shed Skin is an experimental Python-to-C++ compiler. It accepts pure, but implicitly statically typed, Python programs, and generates optimized C++ code. This means that, in combination with a C++ compiler, it allows for translation of pure Python programs into highly efficient machine language. For a set of 16 non-trivial test programs, measurements show a typical speedup of 2-40 over Psyco, about 12 on average, and 2-220 over CPython, about 45 on average. Shed Skin also outputs annotated source code."
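To make "pure, but implicitly statically typed" concrete, here is a small sketch. It is an illustration, not something taken from the Shed Skin documentation: the function names `dot` and `describe` are made up, and the reading of "implicitly static" as "each name keeps a single type throughout the program" is an assumption.

```python
def dot(xs, ys):
    # Every name here keeps one type for the whole program:
    # xs and ys are lists of floats, total is a float.
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def describe(flag):
    # 'value' is an int on one branch and a str on the other, so no
    # single static type fits it. CPython runs this without complaint;
    # a static translator would presumably have to reject or box it.
    if flag:
        value = 1
    else:
        value = "one"
    return value


print(dot([1.0, 2.0], [3.0, 4.0]))
```

The first function is the kind of code a type-inferring translator could plausibly map onto a C++ function over `double`; the second is the kind of dynamic use that falls outside that subset.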
not terribly useful quite yet (Score:5, Insightful)
Re:Why not just use pure C++? (Score:2, Insightful)
Yeah, but that's not what we need. (Score:3, Insightful)
This won't be meaningful until a converted python script is compared to efficient code written natively in C++ in the first place.
Native code (Score:3, Insightful)
The best way to get some speed and still keep the nice Python functions and layout is just to export the most heavily used functions to native code (C/C++).
I don't know if it's possible to take the C++ output and optimize it separately; if it is, you would have a good starting point for writing native code.
In short: Better, fast and easy, but not the best (if you can write native code)
Re:Ewwwww (Score:5, Insightful)
I think you're not supposed to read it. You're only supposed to feed it to your C++ compiler. f2c produced unreadable output too, but nobody read the output; at one time it was the only free Fortran option on Linux.
Re:Sounds good... (Score:4, Insightful)
Uh, why would they have to? This goes from Python to C++, not vice versa. If there are no pointers or structs in the Python code, why would they have to handle them? Certainly, it's quite possible that some Python variable types will be converted to pointers or structs in the output code, but that's orthogonal to the issue of Python not having them natively.
If you were trying to go from C++ to Python, then you'd have to convert C++ pointers and structs to some sort of Python data type, and your comment would make sense. As it is, I'm not sure what you were trying to say.
Re:Yeah, but that's not what we need. (Score:4, Insightful)
Re:Sounds good... (Score:3, Insightful)
Why would one ever need to do that? The goal is not to write C++ in Python, it's to compile Python to machine code via an intermediate Python -> C++ compilation.
Re:Yeah, but that's not what we need. (Score:5, Insightful)
no, I'd be far more interested in a good compiler to compile that python straight to machine code...
Re:not terribly useful quite yet (Score:3, Insightful)
I could envision it working like this. Instead of statically declaring all your variable types in every function, you instead simply declare that whatever types are being used, they are always the same every time this function gets called. All the compiler then has to do is to find one instance of calling that function or one instance of the use of the arguments within the subroutine in which the type is unambiguous to reverse engineer the types without having to be told. It could then flag the few cases where it can't resolve the type and either handle those in a slower dynamically typed fashion, or allow you to hint the types it was confused by.
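A rough sketch of that idea, using only Python's standard `ast` module: scan for call sites whose arguments are literals, read the parameter types off them, and leave `None` for anything that can't be resolved so it can be flagged. This is a toy under stated assumptions, not how Shed Skin actually infers types; `SOURCE`, `literal_type` and `infer_param_types` are hypothetical names invented for the example.

```python
import ast

# Hypothetical program to analyse: one function, one unambiguous call.
SOURCE = '''
def scale(values, factor):
    return [v * factor for v in values]

result = scale([1.0, 2.0], 3)
'''


def literal_type(node):
    """Return a type name for a literal argument, or None if ambiguous."""
    try:
        value = ast.literal_eval(node)
    except (ValueError, TypeError):
        return None  # not a literal, e.g. a variable or a nested call
    if isinstance(value, list) and value:
        # Assume homogeneous lists; look only at the first element.
        return "list[" + type(value[0]).__name__ + "]"
    return type(value).__name__


def infer_param_types(source):
    tree = ast.parse(source)
    # Map each function name to its positional parameter names.
    params = {
        node.name: [a.arg for a in node.args.args]
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }
    inferred = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            names = params.get(node.func.id)
            if names is None:
                continue  # call to something not defined in this source
            for name, arg in zip(names, node.args):
                key = (node.func.id, name)
                # Keep the first resolved type; None marks "needs a hint".
                if inferred.get(key) is None:
                    inferred[key] = literal_type(arg)
    return inferred


print(infer_param_types(SOURCE))
```

Any entry still mapped to `None` after the scan is one of "the few cases" the comment mentions: the tool could fall back to dynamic handling there or ask the programmer for a hint.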
File as NBNC (Nice But No Cigar) (Score:5, Insightful)
What the Python C/C++ interested people REALLY need is a book written by a group of Python AND C/C++ masters which teaches the two simultaneously showing complementary methods of doing any given thing working from beginner to advanced and I DON'T mean "How to turn your n00b Python code into C/C++ hotness" sort of viewpoint. I mean both taught simultaneously in synch showing how they can interchange and complement.
Software tricks for converting? Ultimately worse than not having them because it leads to horrible obfuscation because we don't know exactly what is going on when 13,412 lines of Python is turned into C++ because WE DIDN'T WRITE IT AND WE NEVER LEARNED C/C++. "Say Mike, that's great but you're the company code cowboy and you don't do C++ natively and I sure as hell don't read it being management so exactly what happens if this needs to be fixed? We've gone from importing open source code you couldn't read to writing our own open source code you can't read."
Re:Yeah, but that's not what we need. (Score:3, Insightful)
Re:Ewwwww (Score:3, Insightful)
Re:Very nice, but... (Score:3, Insightful)
Re:If they can do this... (Score:5, Insightful)
If this converter proves to be successful, I believe that a GCC frontend will be written eventually. There are probably potential optimizations that would be difficult or impossible to implement any other way.
Some may think that the dynamic nature of Python may preclude its inclusion in GCC. Technically, all that would need to be done is to have a runtime to handle dynamic things, similar to how Objective-C (for which there is GCC support) has a runtime to handle message passing and late binding. However, a large portion of the potential efficiency of a compiled version of the language would be lost to these dynamic capabilities; luckily, a compiler can detect when things are implicitly static (in fact, this converter is limited to implicitly static constructs), and optimise them to be truly static at compile-time.
Stupid comparison (Score:4, Insightful)
Re:Yeah, but that's not what we need. (Score:4, Insightful)
...and that's why it shouldn't be a Python to C++ translator; it should be a GCC frontend instead (i.e., translating to GCC's internal representation).
Re:Yeah, but that's not what we need. (Score:3, Insightful)
Is that the same way the method of using layers of multiple simple tools that all do one thing really well is more buggy than just using one larger general purpose monolithic app?
A cross platform Python to machine code compiler would presumably need to reinvent a whole lot of difficult platform specific stuff that has already been solved by C++ compilers.
Re:2-40 what? (Score:3, Insightful)
Re:Ewwwww (Score:4, Insightful)
Yeah, whenever I look at the output of my optimising compiler, it's really hard to understand too. It's all in assembler, for a start.
Plus, the quality of C code generated by CFront was rubbish - unreadable.
Same with the Modula-3 compiler I tried. You couldn't work out what was going on in the resulting C code without a load of work.
Can you see where I'm going with this?
Re:File as NBNC (Nice But No Cigar) (Score:4, Insightful)
That isn't how a compiler is used. When you compile a C++ program, you don't throw away your C++ source and check the executable into source control. "Oh, no! We used gcc and now we have a bunch of gobbledygook we don't understand!"
The C++ is an intermediate stage in the make process, akin to the output of various phases of gcc.
Re:Yeah, but that's not what we need. (Score:3, Insightful)
Would you also like to translate a text from Arabic to English by passing through 3 or 4 languages in between?
In this analogy the problem would probably be accuracy, in the case you presented it would be performance being lost due to layers of conversion. Some high level optimizations are inevitably lost (unless the C++ compiler has some sort of strong AI).
Re:Yeah, but that's not what we need. (Score:1, Insightful)
Assuming the Shed Skin project overcomes the dynamic typing (and other) limitations so that any Python code is translatable to semantically equivalent C++, then it should be possible to skip the translation to C++ and go straight to GCC IL. Of course, that may prove to be too much of an assumption...
- T
Re:Why not just use pure C++? (Score:3, Insightful)
As have I, but I'd certainly rather manage in languages that support first order data structures, "for each" loops for iterations, proper disjunctive types, pattern matching, and so on. C++ is better than it used to be, but all the data structures and algorithms in the standard library barely hold a candle to the expressive power of many functional programming and "scripting" languages.
Re:Yeah, but that's not what we need. (Score:3, Insightful)
You are making a gigantic assumption that because this converter's better than the last one, that it's usable in efficiency arenas. By comparison, you might be looking at the difference between a shoe and a shoe with a spring (that's what air pumps do, don't laugh) when dealing with runners - you can grab another 10-20% speed out of the runner with the new shoe, and it's easier on the knees to boot.
They're still not racing a car.
I'm not saying this thing does a bad job; I honestly have no idea. The point, though, is that neither do you. If you're working a project which has a good reason to be Python, such as in an arena where programmer time is more important than execution time (a lot of programmers unfortunately believe that this is all programming, because they've never done performance-conscious things like operating systems, databases, video games, embedded or realtime software, and so on,) then this is great. Why? Because a new tool gives you an essentially free linear speed multiplier, which means you can crank X% more work out of the same machine farm, or whatever.
But, you've read this differently. You're starting to think of Python as an appropriate tool to write efficiency-conscious software, and there are not yet appropriate tests to display that this is such an accommodating tool. To be frank, I highly doubt that this is actually going to be a real performance-appropriate concern, and whereas a bunch of flametards will jump all over me demanding that I explain a lifetime's worth of instincts and experience to them in two paragraphs or I'm so obviously wrong, I still want to point out that technologies like this have come and gone for scripting languages for decades, that none of them have turned the python of their day to the C++ of their day, and that I don't see any compelling reason to believe that this will be any different.
What you're reading isn't what those tests say. Python is not a performance-appropriate language, and I believe that this tool will not make it so. No tests which determine whether it actually is a performance-appropriate environment have yet been run.
Besides, it's worth pointing out to all the flametards that C++ isn't actually the performance language of choice either. Depending on the nature of your problem, that crown usually goes to forth, erlang, k, mozart-oz or formulaONE. (And, for math-heavy stuff, it still goes to Fortran surprisingly often.)
Re:Very nice, but... (Score:3, Insightful)
Indeed, VB.net and C# have very similar features and capabilities, and if there are big performance differences between them, it's because the authors of one of the compilers screwed up.
But the other posters were arguing that their performance and capabilities should be identical because they both compile to MSIL, and in fact that any language that does so would have equal performance and capabilities. Which is just silly; hence my silly IRock.net example. For a less silly example, Managed C++ certainly has different capabilities than VB.net or C#.
VB.net and C# produce very similar performance, because they are very similar to begin with. Not because their existing compilers target the same virtual machine.