Python-to-C++ Compiler - Slashdot


Python-to-C++ Compiler 181

Posted by timothy on Thursday June 15, 2006 @01:12PM from the calibrate-your-scales dept.

Mark Dufour writes "Shed Skin is an experimental Python-to-C++ compiler. It accepts pure, but implicitly statically typed, Python programs, and generates optimized C++ code. This means that, in combination with a C++ compiler, it allows for translation of pure Python programs into highly efficient machine language. For a set of 16 non-trivial test programs, measurements show a typical speedup of 2-40 over Psyco, about 12 on average, and 2-220 over CPython, about 45 on average. Shed Skin also outputs annotated source code."
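To make "pure, but implicitly statically typed" concrete, here is a minimal sketch. It assumes the phrase means that every variable and return value keeps a single type for its whole lifetime even though no types are declared; the two function names are made up for illustration and do not come from Shed Skin.

```python
def dot(xs, ys):
    # Every name here keeps one type: xs and ys are lists of float,
    # total is always a float. No declarations are needed to see that.
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def first_or_default(items):
    # The return type depends on the data: an element of items on one
    # path, a str on the other. This is the dynamic style that a
    # static translation cannot map to one C++ type.
    if items:
        return items[0]
    return "empty"


print(dot([1.0, 2.0], [3.0, 4.0]))  # 11.0
```

Under that assumption, the first function is the kind of code the summary describes, and the second is the kind that would have to be rewritten first.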

This discussion has been archived. No new comments can be posted.


not terribly useful quite yet (Score:5, Insightful)

by Surt ( 22457 ) writes: on Thursday June 15, 2006 @01:21PM (#15541340) Homepage Journal

Until he addresses mixed types in n-tuples, this won't be useful for very many people.
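For readers unsure what "mixed types in n-tuples" refers to, a small sketch follows. The helper `is_homogeneous` is hypothetical and written only for this illustration; exactly which tuple shapes the 2006 release rejected is not stated in this thread, so treat the comments as an assumption about the general problem, not a description of Shed Skin's behavior.

```python
def is_homogeneous(t):
    """Return True if every element of the tuple has the same type."""
    return len({type(x) for x in t}) <= 1


# One element type throughout: easy to map to a single typed container.
point = (1.0, 2.0, 3.0)

# A typical "record" tuple: str, int and float side by side. A static
# translation needs a distinct type for each position.
record = ("alice", 30, 1.75)

print(is_homogeneous(point))   # True
print(is_homogeneous(record))  # False
```

Record-style tuples like the second one are very common in everyday Python, which is presumably why the limitation matters to so many people.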

Re:Why not just use pure C++? (Score:2, Insightful)

by bzerodi ( 731405 ) writes: on Thursday June 15, 2006 @01:25PM (#15541377)

Why not pure assembler ?

Yeah, but that's not what we need. (Score:3, Insightful)

by stonecypher ( 118140 ) writes: <stonecypher@noSpam.gmail.com> on Thursday June 15, 2006 @01:26PM (#15541390) Homepage Journal

See, it's all well and good to compile python to speed it up. The problem is, people are now saying that they can write efficient code in python just because it magically translates to C++, and because this translator is faster than other python compilers.

This won't be meaningful until a converted python script is compared to efficient code written natively in C++ in the first place.

Native code (Score:3, Insightful)

by Roy van Rijn ( 919696 ) writes: on Thursday June 15, 2006 @01:29PM (#15541419) Homepage

This is a good step to make Python run a bit faster, but I don't think it'll really make a huge difference.

The best way to get some speed and still keep the nice Python functions and layout is just to export the most heavily used functions to native code (C/C++).
I don't know if it's possible to take the C++ output and optimize it separately, but if so, you would have a good head start on writing native code.

In short: Better, fast and easy, but not the best (if you can write native code)

Re:Ewwwww (Score:5, Insightful)

by Anonymovs Coward ( 724746 ) writes: on Thursday June 15, 2006 @01:30PM (#15541434)

Completely unreadable.
I think you're not supposed to read it. You're only supposed to feed it to your C++ compiler. f2c produced unreadable output too, but nobody read the output; at one time it was the only free fortran option on linux.

Re:Sounds good... (Score:4, Insightful)

by B'Trey ( 111263 ) writes: on Thursday June 15, 2006 @01:33PM (#15541464)

I will have to explore it more, but it will be intriguing to see how they handle things like pointers and structs that are not in python.

Uh, why would they have to? This goes from Python to C++, not vice versa. If there are no pointers or structs in the Python code, why would they have to handle them? Certainly, it's quite possible that some Python variable types will be converted to pointers or structs in the output code, but that's orthogonal to the issue of Python not having them natively.

If you were trying to go from C++ to Python, then you'd have to convert C++ pointers and structs to some sort of Python data type, and your comment would make sense. As it is, I'm not sure what you were trying to say.

Re:Yeah, but that's not what we need. (Score:4, Insightful)

by Anonymovs Coward ( 724746 ) writes: on Thursday June 15, 2006 @01:42PM (#15541568)

I don't see your point. Some of us use python. It takes me a fraction of the time to do something in python that it takes in any other language. I'm not interested in writing native C++ code because it's hypothetically faster (it's not faster if I count coding time). But I am interested in a good python-to-C++ translator. Why wouldn't any python user be?

Re:Sounds good... (Score:3, Insightful)

by masklinn ( 823351 ) writes: <.slashdot.org. .at. .masklinn.net.> on Thursday June 15, 2006 @01:43PM (#15541576)

it will be intriguing to see how they handle things like pointers and structs that are not in python.

Why would one ever need to do that? The goal is not to write C++ in Python, it's to compile Python to machine code via an intermediate Python -> C++ compilation.

Re:Yeah, but that's not what we need. (Score:5, Insightful)

by advocate_one ( 662832 ) writes: on Thursday June 15, 2006 @02:04PM (#15541857)

But I am interested in a good python-to-C++ translator. Why wouldn't any python user be?

no, I'd be far more interested in a good compiler to compile that python straight to machine code...

Re:not terribly useful quite yet (Score:3, Insightful)

by goombah99 ( 560566 ) writes: on Thursday June 15, 2006 @02:05PM (#15541871)

But he's on the right track. Python allows dynamic typing but nearly all of one's programs do not take advantage of it. Recognizing that is key to making it go fast, I think. It would be nice to have a filter you could run over python that would find all the type-ambiguous points and let you insert some sort of compiler hinting.

I could envision it working like this. Instead of statically declaring all your variable types in every function, you instead simply declare that whatever types are being used, they are always the same every time this function gets called. All the compiler then has to do is to find one instance of calling that function, or one instance of the use of the arguments within the subroutine, in which the type is unambiguous, to reverse engineer the types without having to be told. It could then flag the few cases where it can't resolve the type and either handle those in a slower dynamically typed fashion, or allow you to hint the types it was confused by.
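A rough sketch of that idea, using only the standard `ast` module. It is a toy under stated assumptions, not how Shed Skin or any real compiler does type inference: it looks only at module-level function names, positional arguments, and literal arguments at call sites. The names `infer_param_types`, `report` and `EXAMPLE` are invented for this illustration.

```python
import ast

EXAMPLE = '''
def scale(x, factor):
    return x * factor

def label(value):
    return str(value)

scale(2.0, 3)
scale(1.5, 4)
label(1)
label("one")
'''


def infer_param_types(source):
    """Collect, per function parameter, the literal types seen at call sites."""
    tree = ast.parse(source)
    params = {
        node.name: [a.arg for a in node.args.args]
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }
    seen = {name: {p: set() for p in names} for name, names in params.items()}
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in params):
            for pname, arg in zip(params[node.func.id], node.args):
                # Only literal arguments are "unambiguous" in this toy.
                if isinstance(arg, ast.Constant):
                    seen[node.func.id][pname].add(type(arg.value).__name__)
    return seen


def report(seen):
    """Split parameters into resolved (exactly one type seen) and ambiguous."""
    resolved, ambiguous = {}, []
    for func, ps in seen.items():
        for pname, types in ps.items():
            if len(types) == 1:
                resolved[(func, pname)] = next(iter(types))
            else:
                # Zero or several types seen: needs a hint or a dynamic path.
                ambiguous.append((func, pname))
    return resolved, sorted(ambiguous)


resolved, ambiguous = report(infer_param_types(EXAMPLE))
print(resolved)
print(ambiguous)
```

Here `scale` resolves to `(float, int)` from its call sites, while `label` is flagged because it is called with both an `int` and a `str`; that flagged list is where the hinting described above would come in.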

File as NBNC (Nice But No Cigar) (Score:5, Insightful)

by suitepotato ( 863945 ) writes: on Thursday June 15, 2006 @02:13PM (#15541953)

Why? Read the linked page? Says it all. Violates most any Python code of any complexity out there. So if it doesn't convert Python code from the real world, what is it for? Making Python coders learn enough about C++ to remember the limitations and write/rewrite Python code to use it?

What the Python C/C++ interested people REALLY need is a book written by a group of Python AND C/C++ masters which teaches the two simultaneously, showing complementary methods of doing any given thing, working from beginner to advanced, and I DON'T mean a "How to turn your n00b Python code into C/C++ hotness" sort of viewpoint. I mean both taught simultaneously in sync, showing how they can interchange and complement each other.

Software tricks for converting? Ultimately worse than not having them because it leads to horrible obfuscation because we don't know exactly what is going on when 13,412 lines of Python is turned into C++ because WE DIDN'T WRITE IT AND WE NEVER LEARNED C/C++. "Say Mike, that's great but you're the company code cowboy and you don't do C++ natively and I sure as hell don't read it being management so exactly what happens if this needs to be fixed? We've gone from importing open source code you couldn't read to writing our own open source code you can't read."

Re:Yeah, but that's not what we need. (Score:3, Insightful)

by Abcd1234 ( 188840 ) writes: on Thursday June 15, 2006 @02:33PM (#15542152) Homepage

Why? If you can convert Python to reasonably optimized C++, then you can leverage the C++ compiler to do all the machine-level optimizations, rather than reinventing yet another wheel.

Re:Ewwwww (Score:3, Insightful)

by IamTheRealMike ( 537420 ) writes: on Thursday June 15, 2006 @02:58PM (#15542406)

If you actually tried ShedSkin you'd find the C++ it produces is very similar to what a human might produce, and is actually quite easily readable. But then - why would you want to anyway? It's an intermediate form useful to pass to an optimising C++ compiler, not as something to read.

Re:Very nice, but... (Score:3, Insightful)

by rjshields ( 719665 ) writes: on Thursday June 15, 2006 @03:20PM (#15542621)

MSIL is machine code for a virtual machine rather than a physical one. This distinction makes no difference to the point the GP was making.

Re:If they can do this... (Score:5, Insightful)

by rpwoodbu ( 82958 ) writes: on Thursday June 15, 2006 @03:25PM (#15542682)

It is worth mentioning that one of the original implementations of C++ (if not the very first) was "cfront", a C++-to-C converter. I see this as a much easier way to get a new language implemented quickly, as you can take advantage of the common functionalities already implemented in the target language of the converter. Although Python is not a new language, using it as a compiled language is new, and thus I believe it is comparable to being a new language for this argument. C++ and Python have a lot in common, which makes C++ a very suitable target language for a Python-to-[compiled_language] converter.

If this converter proves to be successful, I believe that a GCC frontend will be written eventually. There are probably potential optimizations that would be difficult or impossible to implement any other way.

Some may think that the dynamic nature of Python may preclude its inclusion in GCC. Technically, all that would need to be done is to have a runtime to handle dynamic things, similar to how Objective-C (for which there is GCC support) has a runtime to handle message passing and late binding. However, a large portion of the potential efficiency of a compiled version of the language would be lost to these dynamic capabilities; luckily, a compiler can detect when things are implicitly static (in fact, this converter is limited to implicitly static constructs), and optimise them to be truly static at compile-time.

Stupid comparison (Score:4, Insightful)

by ardor ( 673957 ) writes: on Thursday June 15, 2006 @03:33PM (#15542750)

As another poster already said, file I/O is a bottleneck regardless of ANY language. So, try something different. Real-time h264 decoding for example.

Re:Yeah, but that's not what we need. (Score:4, Insightful)

by mrchaotica ( 681592 ) * writes: on Thursday June 15, 2006 @04:09PM (#15543090)

...and that's why it shouldn't be a Python to C++ translator; it should be a GCC frontend instead (i.e., translating to GCC's internal representation).

Re:Yeah, but that's not what we need. (Score:3, Insightful)

by styrotech ( 136124 ) writes: on Thursday June 15, 2006 @04:30PM (#15543273)

it gives you an extra area for weird bugs to creep in... get the Python right and go straight to machine code with a trusted compiler.

Is that in the same way that layering multiple simple tools, each doing one thing really well, is buggier than just using one larger general-purpose monolithic app?

A cross platform Python to machine code compiler would presumably need to reinvent a whole lot of difficult platform specific stuff that has already been solved by C++ compilers.

Re:2-40 what? (Score:3, Insightful)

by Schraegstrichpunkt ( 931443 ) writes: on Thursday June 15, 2006 @04:42PM (#15543406) Homepage

"Times faster" is a unitless quantity.

Re:Ewwwww (Score:4, Insightful)

by Tim Browse ( 9263 ) writes: on Thursday June 15, 2006 @05:29PM (#15543944)

Yeah, whenever I look at the output of my optimising compiler, it's really hard to understand too. It's all in assembler, for a start.
Plus, the quality of C code generated by CFront was rubbish - unreadable.
Same with the Modula-3 compiler I tried. You couldn't work out what was going on in the resulting C code without a load of work.
Can you see where I'm going with this?

Re:File as NBNC (Nice But No Cigar) (Score:4, Insightful)

by try_anything ( 880404 ) writes: on Thursday June 15, 2006 @06:45PM (#15544620)

Software tricks for converting? Ultimately worse than not having them because it leads to horrible obfuscation because we don't know exactly what is going on when 13,412 lines of Python is turned into C++ because WE DIDN'T WRITE IT AND WE NEVER LEARNED C/C++. "Say Mike, that's great but you're the company code cowboy and you don't do C++ natively and I sure as hell don't read it being management so exactly what happens if this needs to be fixed?"
That isn't how a compiler is used. When you compile a C++ program, you don't throw away your C++ source and check the executable into source control. "Oh, no! We used gcc and now we have a bunch of gobbledygook we don't understand!"
The C++ is an intermediate stage in the make process, akin to the output of various phases of gcc.

Re:Yeah, but that's not what we need. (Score:3, Insightful)

by rbarreira ( 836272 ) writes: on Thursday June 15, 2006 @07:13PM (#15544871) Homepage

Not quite true. Analogy:

Would you also like to translate a text from Arabic to English by passing through 3 or 4 languages in between?

In this analogy the problem would probably be accuracy, in the case you presented it would be performance being lost due to layers of conversion. Some high level optimizations are inevitably lost (unless the C++ compiler has some sort of strong AI).

Re:Yeah, but that's not what we need. (Score:1, Insightful)

by Anonymous Coward writes: on Thursday June 15, 2006 @07:45PM (#15545073)

The GCC IL isn't suitable for a dynamic language like Python.

Assuming the Shed Skin project overcomes the dynamic typing (and other) limitations so that any Python code is translatable to semantically equivalent C++, then it should be possible to skip the translation to C++ and go straight to GCC IL. Of course, that may prove to be too much of an assumption...

- T

Re:Why not just use pure C++? (Score:3, Insightful)

by Anonymous Brave Guy ( 457657 ) writes: on Thursday June 15, 2006 @09:15PM (#15545610)

C++ makes it difficult to use complex data structures...
It does? I've always managed, somehow.

As have I, but I'd certainly rather manage in languages that support first order data structures, "for each" loops for iterations, proper disjunctive types, pattern matching, and so on. C++ is better than it used to be, but all the data structures and algorithms in the standard library barely hold a candle to the expressive power of many functional programming and "scripting" languages.

Re:Yeah, but that's not what we need. (Score:3, Insightful)

by stonecypher ( 118140 ) writes: <stonecypher@noSpam.gmail.com> on Sunday June 18, 2006 @02:49PM (#15558742) Homepage Journal

This was my point exactly. The article says "this thing does a better job of converting Python to C++ in terms of efficiency than did the older one." People are hearing "This thing generates efficient C++." Nobody's tested that yet, though.

You are making a gigantic assumption that because this converter's better than the last one, that it's usable in efficiency arenas. By comparison, you might be looking at the difference between a shoe and a shoe with a spring (that's what air pumps do, don't laugh) when dealing with runners - you can grab another 10-20% speed out of the runner with the new shoe, and it's easier on the knees to boot.

They're still not racing a car.

I'm not saying this thing does a bad job; I honestly have no idea. The point, though, is that neither do you. If you're working a project which has a good reason to be Python, such as in an arena where programmer time is more important than execution time (a lot of programmers unfortunately believe that this is all programming, because they've never done performance-conscious things like operating systems, databases, video games, embedded or realtime software, and so on,) then this is great. Why? Because a new tool gives you an essentially free linear speed multiplier, which means you can crank X% more work out of the same machine farm, or whatever.

But, you've read this differently. You're starting to think of Python as an appropriate tool to write efficiency-conscious software, and there are not yet appropriate tests to show that this is such an accommodating tool. To be frank, I highly doubt that this is actually going to be a real performance-appropriate concern, and whereas a bunch of flametards will jump all over me demanding that I explain a lifetime's worth of instincts and experience to them in two paragraphs or I'm so obviously wrong, I still want to point out that technologies like this have come and gone for scripting languages for decades, that none of them have turned the python of their day into the C++ of their day, and that I don't see any compelling reason to believe that this will be any different.

What you're reading isn't what those tests say. Python is not a performance-appropriate language, and I believe that this tool will not make it so. No tests which determine whether it actually is a performance-appropriate environment have yet been run.

Besides, it's worth pointing out to all the flametards that C++ isn't actually the performance language of choice either. Depending on the nature of your problem, that crown usually goes to forth, erlang, k, mozart-oz or formulaONE. (And, for math-heavy stuff, it still goes to Fortran surprisingly often.)

Re:Very nice, but... (Score:3, Insightful)

by 2short ( 466733 ) writes: on Monday June 19, 2006 @11:43AM (#15561869)

Indeed, VB.net and C# have very similar features and capabilities, and if there are big performance differences between them, it's because the authors of one of the compilers screwed up.

But the other posters were arguing that their performance and capabilities should be identical because they both compile to MSIL, and in fact that any language that does so would have equal performance and capabilities. Which is just silly; hence my silly IRock.net example. For a less silly example, Managed C++ certainly has different capabilities than VB.net or C#.

VB.net and C# produce very similar performance, because they are very similar to begin with. Not because their existing compilers target the same virtual machine.
