
Comment: To 3D print our woes away (Score 1) 888

by maraist (#46250209) Attached to: Star Trek Economics
I'm not seeing the connection between 'Americans no longer fret over iPhones (because we can print one with a 3D printer)' and 'We can build a star-ship because we've decoupled interest-in-work from the-need-to-work-to-earn-money-to-survive / acquire the things we wish to have'.

I don't fundamentally understand how a Star Trek society can exist, even if we can all convert energy into material things. Consider the saying, "these are rich people's problems".. Meaning the stresses that make us work harder and ultimately enslave us to our commitments _change_ as we get wealthier (individually and socially), but they do not disappear.

You might consider the man that has earned enough money that he can go back-packing in Asia for 10 years.. Could the world function if everybody did so? Assume even that we had robots to build houses / plant our food. SOMETHING is always going to be present that prevents utopia, even 1,000 years after such a world.

It's too narrow-minded to look at today's problems, remove a single variable, and say: now sci-fi happens.

Comment: Re:Strange title.... (Score 1) 286

by maraist (#40722833) Attached to: Why You Shouldn't Write Off Google+ Just Yet
Not sure why we'd have another dot-crash. Let's compare:

    The world was in un-fettered prosperity
    The government was the good-guys with surpluses and expanding state/local infrastructure.. Fiber was being laid. Communism was failing
    People would quit the corporate world and be driven by silly business plans to build entire small-business capital ventures
    The market saw growth-growth-growth
    The pricing grew to reflect the short-term trend - the lead-in to a generic business plan - it self-fed (unsustainable exponential growth)
    Then when the generic business plan got into the ROI phase.. there was NO ROI.... All the business plans failed at the same time
    The market tanks
    The world economy restabilizes (note, doesn't crash..yet)
    All those small firms put a lot of people in under-employment (less shipping, less flying, less office-supplies, less construction, less luxury purchases, etc).
    Local municipalities/labor-intensive-corporation had contractually obligated themselves to 7% annual growth for pension plans
      Said market collapse and re-stabilization with more modest 4% growth brings projected short-falls EVERYWHERE
    World governments over-react (including to 9/11) - drop interest rate to near zero
    under-employed masses react (as intended) by borrowing
    The ONLY viable investment at this point is the still-growing land/gold-inflation. (e.g. finite-resource ownership).
    Both hyper-inflate.
    Producing another lead-in to a business plan that will have exponential growth and ultimately super-saturate ROI and thus pop - nothing would prevent this BUT
    Newly deregulated banks now cross-buy their depleted LOSS-MAKING pension-funds (due to 2000 collapse) into the ONLY profit making venture, the obvious-bubble-making finite-resource market (gems, land, etc). Gems run the risk of a precious metal rush (e.g. uncovering a massive gold main). Housing is highly contingent upon the pyramid scheme.. Need more buyers than sellers - can't perpetuate unless you have an abundant birth rate (WHICH IS DROPPING).
    World banks determine inflation is too high.. They jack up interest rates.
    This chokes but does not end the bank-borrowing growth rate
    Deregulated banks get more clever and aggressive with their loan practices - new forms of insurance (CDS) allow them to flat out gamble against their own customers - hedging their gambling bets. This is a short-term win.. And so long as you're the first one that quits the game, you can win. Now, there is no longer a free-market incentive for banks to find credible loan customers, and likewise they have incentives to bribe ratings agencies to lie, and both then have incentives to lie to share-holders. So the market capitalizes this ultimately flawed strategy. Countrywide (to which I personally, and successfully, contributed) had the country's leading CD ROI (at 6%), reflecting secured investments due to guaranteed fraud-based profits.
    Then bad-debt begins to default.
    The insurance begins to pay-out
    Projects are re-normalized
    Heavy gamblers that didn't immediately exit are punished.
    The world governments over-react
    The re-normalized land-value chokes potential sellers (being under-water they couldn't sell if they wanted to)
    This prevents geographic job migration (you're stuck in Detroit Michigan)
    The people employed in real-estate, investment-banking, corporate sales are now under-employed again - cascading more large corporate [semi-]failures. Air-lines, automotive, etc. All cascading an unemployment crisis in some countries.
    Reduced commerce, unfulfilled gambling bets, investment losses, and projections thereof bring about impractical pension plans in countries like Greece, Spain and Ireland. Their elected policies could work in a 1999 world economy, but not in a 2008 recession. Debts pile up, productivity halts in Greece, countries and currencies are on the verge of collapse.

    Unemployment likely continues as inflation and government debt escalate and start to choke public investment spending (thereby reducing world-wide government employment - e.g. austerity measures).. Somehow conservatives in Germany and the US prevent big projects/investments from cash-injection, and thus the world economy stifles.. Meanwhile...
    Communist controlled countries with managed currency, factories, natural resources, continue to buy entire countries with scarce resources.. (Using cheap labor as their initial source of income).. Oil, and materials used in high tech equipment like batteries and electronics are slowly shifting profits from middle-east to China.

    Trade wars are in full effect, with corporate espionage, contractual violations
    Natural disasters and continued US droughts further escalate world-wide shortages (shifting cheap food production sales from China)

    Shortages of basic scarce resources lead to a new world-war between the east and west..

Not seeing tech as a major factor. :)

Comment: Re:Discourage (Score 1) 107

by maraist (#37914520) Attached to: Ask Slashdot: Learning Dart Development?
I've also worn the hat of hiring highly skilled technical programmers. What I've found is that most of what 'good' programmers exhibit is self-motivated-determination to read on their own. People that read, not because they HAVE to get something done for work (and thus the bare minimum will suffice), but because they like to read technical manuals as if they were novels. They'll read it through, not because they're looking for a short-cut, or to get some nagging bug fixed, but because they want to dive-deep into some paradigm or language.

Well-read programmers sometimes come from CIS degrees, usually NOT. Ironically most people I see coming out of universities are CRAP programmers. They go in thinking they're going to do this thing, but get overwhelmed quickly, become bare-minimalists in terms of understanding (and typically implementation), and resort to side-effect-laden, poorly-documented, maximal-surprise code. Why? Because just like the all-nighter they pulled getting their project to work; it was passable.

English majors make great programmers, in my experience.. Presumably because they are people that can absorb a technical manual in a single night.

I've also found some electrical engineers to be good programmers (I happen to be among them). Mostly because they tend to attack problems from the bits up. They often have a very deep understanding of what a function is doing. It also means they have, by default, a richer math background - doing lots of math/equation proofs is useful when writing logic-functions.

So if you're past the college years and are trying to prove yourself, do a lot of deep-dives of open-source projects.. Convince yourself that they work (e.g. critically analyze the code to understand the decisions made, as if you were the one making those decisions). Make sure you become familiar with a tool-chain (cpp -> gcc -> as -> objcopy -> ld -> kernel-loader). Convince yourself that Lisp is a great language (this will require every ounce of logical-strength that you can muster). Learn Smalltalk (the parent of most paradigms these days). Learn C++ (so you can see what everybody is trying to implement without actually implementing). Develop a VERY good understanding of C - (learning how everything is a symbol) - try and correlate 'objdump -x' and 'nm' against C functions.. Learn how to make a shared library (either Windows DLL or Linux .so or Mac OS X dylib). Delve into the format/layout of ELF. Learn the significance of the various segment-types (this generally applies to all OSs). Learn an ASM if you can.. Start by running
gcc -S helloworld.c
gcc -S -m64 helloworld.c
for the 64bit equivalents.. Make sure to put lots of function-calls, floating-point and OS calls.. Learn what the assembly is doing.. wikipedia ANYTHING you don't understand.

Learn a good editor.. Visual Studio, Eclipse/IntelliJ, X-code, kdevelop, codeblocks. Learn at least 64 short-cuts in two of them. Get familiar with thin editors (notepad++, vim, kate).

LEARN TCP. Google it.. Use perl, python, ruby or Java to write your own client / server in both TCP and UDP if you can. If you're up to it, try writing it in C or C++ + boost or Visual Studio.
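A minimal sketch of that exercise in Python, assuming loopback addresses and OS-assigned ports. The function names (`tcp_roundtrip`, `udp_roundtrip`, `tcp_echo_server`) are made up for illustration; only the standard `socket` and `threading` modules are used.

```python
import socket
import threading

def tcp_echo_server(sock):
    """Accept one TCP connection and echo back whatever arrives."""
    conn, _addr = sock.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:              # client closed its sending side
                break
            conn.sendall(data)

def tcp_roundtrip(message):
    # Port 0 asks the OS for any free port on loopback.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        port = server.getsockname()[1]
        t = threading.Thread(target=tcp_echo_server, args=(server,))
        t.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)   # signal end of request
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                chunks.append(chunk)
        t.join()
    return b"".join(chunks)

def udp_roundtrip(message):
    # UDP has no connection: each sendto/recvfrom is one datagram.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("127.0.0.1", 0))
        server.settimeout(2.0)
        client.settimeout(2.0)
        client.sendto(message, server.getsockname())
        data, client_addr = server.recvfrom(1024)
        server.sendto(data, client_addr)      # echo back to the sender
        reply, _ = client.recvfrom(1024)
    return reply

if __name__ == "__main__":
    print(tcp_roundtrip(b"hello over tcp"))
    print(udp_roundtrip(b"hello over udp"))
```

The thing to notice is the difference in shape: TCP needs listen/accept and an explicit end-of-stream, while UDP just throws datagrams at an address and hopes.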

LEARN the HTTP protocol (almost impossible to be useful these days without it). Use 'nc' 'curl' 'wget' 'telnet' interchangeably to interact with an HTTP service.
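For a feel of what those tools put on the wire, here is a rough Python sketch that hand-writes a GET request over a plain socket. It assumes a throwaway local server (the hypothetical `HelloHandler`) so nothing outside the machine is contacted; `raw_http_get` and `demo` are illustrative names, not part of any library.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Tiny local service so the example needs no outside network."""
    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):     # keep the demo quiet
        pass

def raw_http_get(host, port, path="/"):
    """Send a hand-written GET, roughly what you'd type into nc or telnet."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Connection: close\r\n"
        "\r\n"                        # blank line ends the headers
    ).encode("ascii")
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:             # server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)

def demo():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        response = raw_http_get("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
    head, _, body = response.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n")[0].decode("ascii")
    return status_line, body

if __name__ == "__main__":
    print(demo())
```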

Learn XML.. At least the DTD, but to really do well, learn XSDs.. Use javascript's DOM to muck with it to start.. But you'll probably need to learn the preferred APIs of C/C++/Java/.NET. It's hard to NOT have to parse XML in most paid-applications.
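As a starting point, a small sketch using Python's standard `xml.etree.ElementTree` (not the DOM API, and it does no DTD/XSD validation). The order document and the helper names are invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document.
DOC = """<?xml version="1.0"?>
<orders>
  <order id="1001" status="shipped">
    <item sku="A-1" qty="2">Widget</item>
    <item sku="B-7" qty="1">Gadget</item>
  </order>
  <order id="1002" status="pending">
    <item sku="A-1" qty="5">Widget</item>
  </order>
</orders>
"""

def total_quantity_by_sku(xml_text):
    """Walk every <item> element and sum its qty attribute per sku."""
    root = ET.fromstring(xml_text)
    totals = {}
    for item in root.iter("item"):
        sku = item.get("sku")
        totals[sku] = totals.get(sku, 0) + int(item.get("qty"))
    return totals

def pending_order_ids(xml_text):
    """Use ElementTree's limited XPath support to filter by attribute."""
    root = ET.fromstring(xml_text)
    return [o.get("id") for o in root.findall("order[@status='pending']")]

if __name__ == "__main__":
    print(total_quantity_by_sku(DOC))
    print(pending_order_ids(DOC))
```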

Learn UML. Read a good book on design patterns; eventually you'll think in UML for classes and DB entities; but you'll also need to think in terms of it for collaboration diagrams, sequence-diagrams etc.. (lots of free [online] tools.. creately, lucid-charts, argouml). Learn to white-board as if you were Italian. This goes over GREAT in interviews.

Learn SQL.. It's not going anywhere, I promise. Use postgres + pgadmin or mysql + phpmyadmin. Learn what RDBMS is.. What ACID is.. What the CAP theorem is. Try the free Oracle if you have an afternoon to totally lose; but it's probably useful in job-hunting.
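To see what the "A" in ACID buys you, here is a rough sketch using Python's bundled `sqlite3` (an assumption made for convenience; it is not postgres or mysql, but the SQL is ordinary). The `accounts` table and the `transfer` helper are hypothetical.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint failed; the credit was rolled back too

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 100), ("bob", 20)])
    conn.commit()
    ok = transfer(conn, "alice", "bob", 30)          # fine
    rejected = transfer(conn, "bob", "alice", 500)   # would overdraw bob
    balances = dict(conn.execute("SELECT name, balance FROM accounts"))
    conn.close()
    return ok, rejected, balances

if __name__ == "__main__":
    print(demo())
```

In the rejected transfer the credit to alice has already run when the debit trips the CHECK constraint, and the rollback undoes it. That all-or-nothing behaviour is the atomicity part.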

Read up on NoSQL solutions.

When creating your resume (after 2 hours of doing all this), make sure to be HONEST in your skill levels in all of the above.. Don't JUST list that you know mysql,postgres,oracle,MS SQL. List mysql [expert], postgres [seasoned], oracle [exposed to], MS SQL [novice]. This avoids wasting people's time, and prevents disappointments in the interview that would otherwise have been managed expectations leading up to a (yeah, he'll need some ramp-up-time, but I think we have work for him).

Comment: Re:no way - wrong search terms leave things behind (Score 1) 434

by maraist (#37660494) Attached to: Putting Emails In Folders Is a Waste of Time, Says IBM Study
Depends what's meant by 'putting in separate folder'. Can an email have exactly one parent? Then what happens when you have 50 folders, each with 300 unread items. Is this more or less organized?

My preference is to have 100% of emails show up in inbox - but be auto-tagged. This is better than traditional folders because there is more than one parent.

todo, Reference, todelete, asap, projX, companyY, contactCategoryZ, personal, mailinglist, mailinglistX, etc

For each new email, I set up a rule to tag all similar emails (90% are todel). BUT, because they always show up to INBOX, I have a half second glance to decide if that email needed TODO/Reference, or if I don't want it todelete for some reason.

Searching on tags is superior to remembering keywords, because you can navigate the tags (just like folders). And depending on your email tool, you can mix and match "(tags:foo or tags:bar) subject:sales".
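A toy model of that query in Python, just to show why multiple tags per email beat one folder per email. The `Email` class, the `search` helper and the sample inbox are all invented; no real mail client's API is implied.

```python
from dataclasses import dataclass, field

@dataclass
class Email:
    subject: str
    tags: set = field(default_factory=set)   # an email may carry many tags

def search(inbox, any_tags=(), subject_contains=""):
    """Emails carrying at least one of any_tags whose subject matches."""
    wanted = set(any_tags)
    needle = subject_contains.lower()
    return [
        mail for mail in inbox
        if (not wanted or mail.tags & wanted)
        and needle in mail.subject.lower()
    ]

INBOX = [
    Email("Q3 sales forecast", {"foo", "Reference"}),
    Email("Sales call notes", {"bar", "todo"}),
    Email("Lunch on Friday?", {"personal"}),
    Email("Sales newsletter", {"mailinglist", "todelete"}),
]

if __name__ == "__main__":
    # Equivalent in spirit to "(tags:foo or tags:bar) subject:sales".
    for mail in search(INBOX, ["foo", "bar"], "sales"):
        print(mail.subject, sorted(mail.tags))
```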

Comment: Re:Improve Slashdot By Rewinding To What It Grew O (Score 1) 763

by maraist (#37636858) Attached to: Help Shape the Future of Slashdot
Right, but what is the basis for the paranoia? I am highly skeptical (even of a geek community) of properly directing that paranoia to non tin-foil-hat conspiracies.
1) Theft follows the money and the naive (e.g. major banks, major places with credit cards, and people/groups susceptible to social engineering attacks)
2) net-Stalking generally is done by major govs/institutions that make wide-area attacks with non-targeted victims, or petty people with no servers from which to reliably cause a reasonably cautious netizen to worry.
3) Ad-tracking / Ad-metric-gathering allows vendors to.. Well, produce more targeted ads. I never understood the visceral hatred of double-click. Though I share the frustration with ad providers that steal my cursor with CSS popup DIVs or flash.
4) Porn sites presumably can detect repeat non-paying visitors and restrict content (big shocker there).

I understand the notion of a condom-mode web browsing (no cookies, no cache, no passwords), and I can see the frustration with the web essentially being broken in that mode; but honestly. Session cookies are much more elegant than embedded tokens in paths, since the URLs stay perma-linkable. And being a personal hater of 'apps' when a stateful website is just as functional (and almost by definition, more portable), I find it difficult to swallow a demand that HTTP remain stateless.

Comment: Re:Moderation system (Score 1) 763

by maraist (#37634494) Attached to: Help Shape the Future of Slashdot
"6) Delete all accounts numbered 2,000,000+. Remove signup. Invite only"

Heck, why not start at, I don't know, say 1,000,000 Mr 1.05 ;)

Though I do get a sense of "get off my lawn, when I was young, commenters respected their elders" :) I remember from when slashdot started, people were ALWAYS complaining about poor comment quality.. But sorry guys.. I don't see it.. If there is ONE good comment in a comment stream (above level 3, let's say), and I can quick-read through 30 comments, I call it a win.. I've learned something new. If I didn't, I wouldn't waste the, oh, I don't know, 8 minutes a day it took to read those 30 comments. A little more productive than day-time-TV, I'd say.

Comment: Re:This just makes sense (Score 1) 1345

by maraist (#37551924) Attached to: Science and Religion Can and Do Mix, Mostly
haha! So Muslims and Christians should probably re-read it then, because NOBODY that I know actually espouses that.. Here's what a classic Muslim would say:
1) Accept Allah as the one true god
2) Accept the prophet Mohammed (may peace be upon him)

Here is what a modern Christian would say
1) Jesus is God
2) Jesus is God's son [and please ignore my apparent stupidity]
3) Adam and Eve ate the forbidden fruit so we're condemned to eternal damnation - as unfixable sinners.
4) Accept Jesus as your personal savior (your sacrificial lamb to quell your original sin), or perish in a lake of eternal hell fire.

There's no mention of love, or neighbors.. Just God and your acceptance of Salvation from an apparently 6,000 year old forbidden fruit..

All the parables are for naught, because... We should NOT help the poor, because they'll just spend it on sinful things.. We should TRY and enrich ourselves (because a passage in the Old Testament, referred to as the Jabez passage, says if we ask God to enrich our life (so that we may exalt his), then God will do so). We SHOULD conquer other infidel nations and spread the 'good word' (because some prick non-Jew named Paul started the trend in the New Testament). We SHOULD ready the end-times by facilitating ancient prophecy in Israel (even though for 2,000 years nobody has been even close to a fulfillment, and mathematically everybody has their own calculation based on BS numerology). We should NOT give up our wealth, because it's better spent looking fabulous in a papal gem-studded-robe or televangelist palace - praise be to JeSUS. We should NOT love thy neighbor if they are: Jewish, Muslim, homosexual, left-handed, speak in a different language, seem animal-like and thus not derived from Adam (e.g. black), reject Jesus openly, have some contractable disease (they're being punished by God of course), or have a tiny nuance difference in the view of how to worship rules 1 .. 4, etc. We should NOT pay taxes because Caesar kills babies (and not enough grown men).

But, you are technically correct. If modern Christians did actually have a prioritized critical analysis skill of any sort (as opposed to meme-repetition), then it would be pretty clear that the 'greatest commandments' involved love. And the parables did center around forgiveness, kindness, charity, non-violence, social acclimation (e.g. paying taxes, and getting along with your arch-enemies across the river). You know, the crap godless Hippies and Atheists espouse.. Go figure.

Though there was some crazy period mysticism crap in there.. The whole fake-rise-from-the-dead ritual (Lazereanism), baptism ritual (Zoroastrianism), placebo healing rituals (gee, that would be most religions, except maybe Judaism), Jewish rituals of course. Still, most people think Jesus invented / originated those rituals (and some crazily think they were 'magic'/divine because somebody wrote about it in a biography). Why would a biography lie? If they were true, and if God wanted the 'good news' to be heard, then he wouldn't let people lie about it, right? Which is why thousands of books had to be burned prior to the canonization of the modern New Testament in 300AD I guess. :) But the logic works if you squint REALLY hard.

Comment: Re:Imaginary Mass! (Score 1) 412

by maraist (#37548008) Attached to: Faster-Than-Light Particle Results To Be Re-Tested
My understanding of quarks is that they were essentially of complex mass - which is why they can't exist individually, but must do so in some complementary pair or triplet. It's the same as the cube root of a real number.. There are three solutions: one is real and the other two NEED to be a complex-conjugate pair (e.g. with imaginary parts).
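The cube-root half of that analogy is easy to check numerically. A small Python sketch follows (it only illustrates the math and says nothing about quarks; `cube_roots` is a made-up helper name):

```python
import cmath

def cube_roots(z):
    """Return the three complex cube roots of z."""
    r, theta = cmath.polar(z)            # z = r * e^(i*theta)
    magnitude = r ** (1.0 / 3.0)
    return [
        cmath.rect(magnitude, (theta + 2 * cmath.pi * k) / 3)
        for k in range(3)
    ]

if __name__ == "__main__":
    for root in cube_roots(8):
        print(root)
```

For 8 the roots are 2 and the pair -1 ± i·√3 (up to floating-point rounding): one real root and two that are conjugates of each other.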

Comment: Re:Does that make any sense? (Score 1) 127

by maraist (#37547692) Attached to: Oracle Demos New SPARC T4 Processor
I am suspicious of such numbers. You need to pick an application and run it on two machines for comparison. Number of active threads is independent of the number of CPUs - and while this means overhead in context switching into and out-of CPUs, if the reason they're stalled is memory throughput or disk IO, then CPU context switching is irrelevant.. And guess what, the memory throughput has little to do with the CPU.. Namely you can construct a NUMA motherboard where each CPU has localized RAM (and thus unaffected by threads in other zones).

The main reason I call BS on 2, 4 or 16 threads per core is that this explicitly only allows a single actively running thread; it merely means that IF you can do NON RAM work in the other 15 threads, then when thread-1 stalls you can get useful work done by quickly switching without OS interaction. But that's a big IF.

Comment: Re:Not the point of SPARC (Score 1) 127

by maraist (#37537886) Attached to: Oracle Demos New SPARC T4 Processor
"Essentially, this is about as efficient as you can get"
I don't know about that. Alpha/AMD added the 'mesh' CPU network, where you talked to routed neighbors for memory/BUS access, and thus not every CPU needs to spin its transistors on cache operations. I don't know how advanced that got, but there's no reason that N,S,E,W cache controller nodes can't determine the scope of dissemination of cache-reads, and thereby localize snoop and lock-cache-line calls. Compare-and-set can be implemented IN the cache-line. You've got an address-line and a data-line. Why not add a CAS bit on the routed bus and say 'Here's the old data and its address, with the CAS bit set; then on the next clock, the data line will contain the new data. The mem-controller responds with a read-complete operation or invalid'? You already have comparison logic for the address lines in the fully-associative cache logic; no reason you can't implement CAS there too.

So now you completely avoid a BUS-lock.. The cache atomically (within a cache-clock-tick) owns the data.

I'm not a fan of mutex-spin-locks, because most algorithms that are fast enough to warrant a context-switch-free operation generally can use an optimistic-locking algorithm. Thus, I believe OS ctx switches or CPU-oriented thread-swap operations are fine (e.g. some type of yield operation). But there are integer or pair-of-integer algorithms that can make heavy use of such CAS operations. And generally, the first step of a mutex is such a CAS.

If you were crazy enough to want to use semaphores instead of mutexes (hopefully because you actually want a constrained resource count, and not because you just prefer semaphores), then this CPU operation wouldn't make sense.. I doubt you want a full-on adder in the cache-controller (though I suppose that's possible too). Namely instead of 'lock; xadd [m1], 1'. You'd have "st_add [m1], 1" where the cache-controller is required to implement read, add, re-write (atomically) instead of having the CPU do it.

Next is the furthered notion of topological CPU configurations.. Namely, pin thread-5 to CPU-2 then pin thread-6 to CPU-3 (which is adjacent), and the OS tells both CPUs that a shared memory segment (including mutex regions) is ONLY available to those two CPUs. And thus you can pass synchronous control signals between them.. Again, special instructions are required.. But now you can use 'barrier' instructions.. Similar to an idle spin loop (wait until a barrier signal arrives from the co-processor indicating the pipeline of work has new data). But this now starts competing with GPUs and CELL processors.

I'm sure there are other possible operations, such as direct register to remote-CPU register transfer operations.. Similar in principle to the SUN SPARC register window, you can have an in/out suite of registers where instead of function-call communication, we're facilitating asynchronous co-processors pipeline communication operations.

There are code paths that make sense to implement these sort of pipeline co-process architectures. Any place a 'spawn task' operation happens, that can be optimized in assembly + OS as a co-process pair. You would need some specialized C function APIs:

Future f = DispatchTask(data_structure); // co-process/ co-thread dispatch ....
mytype result = f.get(); // co-process barrier (CPU might context switch to peer thread)
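
The same dispatch/barrier shape exists today in software. A minimal Python sketch with the standard `concurrent.futures` module follows; it runs on ordinary OS threads, not on the co-process hardware imagined above, and `summarize` is just a stand-in task.

```python
from concurrent.futures import ThreadPoolExecutor

def summarize(data):
    """Stand-in for the dispatched task: any self-contained unit of work."""
    return sum(data) / len(data)

def dispatch_and_wait(data):
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(summarize, data)   # the "spawn task" step
        # ... the caller is free to do unrelated work here ...
        return future.result()                  # the barrier: block until done

if __name__ == "__main__":
    print(dispatch_and_wait([1, 2, 3, 4]))
```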

Comment: Re:Not the point of SPARC (Score 1) 127

by maraist (#37537788) Attached to: Oracle Demos New SPARC T4 Processor
"DB are generally IO bound: they must provide guarantee that committed data were safely written to disk. This is where the most of DB's performance is wasted - waiting for the disk to do its job.".
Sometimes. When they are IO bound, then yes, locks are of trivial importance. But DBs generally also serve as massive coherent caches and lock-managers. And those caching operations are VERY susceptible to critical-region code. Namely MySQL INNODB had an inverse performance characteristic with the number of CPUs for a long while. Further, recently, it was shown that by skipping the SQL layer of MySQL-INNODB and using simple direct HTTP calls into the storage layer, performance was improved measurably.

There's no accounting for bad code (which is most likely the case with MySQL optimizations), but I definitely feel that the programming world is sorely under-prepared for multi-threaded programming.

Basically we need to focus on lock-free algorithms (or wait-free, or read-lock-free as appropriate). We should favor optimistic locks over pessimistic locks. Use compare-and-set operations/spin-loops. Use stack / context-variables instead of globals, etc. But most people just think in their 1960s sequential style and go; oh, two guys touch this so let me throw in a mutex. Then you find when you double the CPU count (or even machine count) that the system degrades miserably.
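The shape of an optimistic, compare-and-set retry loop looks roughly like this in Python. Python has no user-level CAS instruction, so the hypothetical `AtomicCell` below emulates one with an internal lock; that lock is an assumption of the sketch and stands in for what the hardware would do atomically.

```python
import threading

class AtomicCell:
    """Stand-in for a hardware CAS word. The internal lock only emulates
    the atomicity a real compare-and-set instruction provides."""
    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        with self._guard:
            return self._value

    def compare_and_set(self, expected, new):
        with self._guard:
            if self._value != expected:
                return False          # somebody else got there first
            self._value = new
            return True

def optimistic_add(cell, delta):
    """Read, compute, then publish only if nobody changed the value meanwhile."""
    retries = 0
    while True:
        seen = cell.load()
        if cell.compare_and_set(seen, seen + delta):
            return retries
        retries += 1                  # lost the race: re-read and try again

def run(threads=8, increments=1000):
    cell = AtomicCell()

    def worker():
        for _ in range(increments):
            optimistic_add(cell, 1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return cell.load()

if __name__ == "__main__":
    print(run())
```

Nobody ever holds the data hostage while working on it; a thread that loses the race just retries with the fresh value.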

Comment: Re:Not the point of SPARC (Score 2) 127

by maraist (#37534516) Attached to: Oracle Demos New SPARC T4 Processor
"Is there a way for a CPU to make mutex handling easier and more efficient?"
Mutexes are VERY efficient with cache-oriented MESI and MOESI protocols; the problem is what you do while another thread owns the mutex. You can either spin-loop or context-switch to another thread. Specifically if thread A locks then has several cache-misses, then thread B would have to spin for thousands of clock-ticks. When you have 80 CPUs that might not be too bad (though it burns power), but if you have 2 CPUs, then that's likely highly wasteful.

"trigger on event or register/memory=certain value"
I believe Intel provides a block-until-cache-line-updated instruction. I believe that's how the Linux futex and OS-level schedulers work, IIRC.

"I bet there's lots of code which regularly checks "is it time to do X yet?" or "wait till X happens" (e.g. wait for connection or data)."

Well, wait-for-connection is an OS thing. If you use epoll, IO completion ports, kqueue, or even ancient unix 'select' you transfer the overhead of IO to the OS which is very event-driven (and thus doesn't necessarily have a lot of blocking structures). Namely an ethernet frame-driver running on CPU-3 can in theory directly transfer to thread-16 after completion, which is blocking on a TCP packet, when the OS determines it's received enough data to be awoken.
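
From user space, handing the wait to the OS looks roughly like this. A minimal Python sketch with the standard `selectors` module (which picks epoll, kqueue or select depending on platform); the connected socket pair is an assumption that stands in for a real network peer, and `wait_for_data` is an invented name.

```python
import selectors
import socket

def wait_for_data(timeout=2.0):
    """Block in the OS until a socket is readable, then read it."""
    sel = selectors.DefaultSelector()   # epoll, kqueue, or select, per platform
    reader, writer = socket.socketpair()
    try:
        reader.setblocking(False)
        sel.register(reader, selectors.EVENT_READ, data="reader")

        # Nothing has been sent yet, so a zero-timeout poll reports no events.
        idle = sel.select(timeout=0)

        writer.sendall(b"wake up")
        events = sel.select(timeout=timeout)   # returns once data is ready
        received = [key.fileobj.recv(1024) for key, _mask in events]
        return idle, received
    finally:
        sel.close()
        reader.close()
        writer.close()

if __name__ == "__main__":
    print(wait_for_data())
```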

As for 'is it time to do X yet', it isn't as bad as you might think (well, I don't have that much empirical evidence, but I've worked in this space).
Instead of 'polling', you can leverage a priority-queue, such that if there is literally nothing critical-to-run, you quickly test the head of the queue for its execution time-stamp. Then do a time-of-day operation (all while in the OS, so no additional context switching). If the head's time-stamp is earlier than the current time, you flag the blocking thread for execution (possibly transferring directly to it). Here the slowdown comes when you add/remove an item from the temporal priority-queue (namely O(log(n))). So this is a function of how many temporal waits are scheduled/completed, but is independent of how long it's supposed to wait.. When there is literally no work to perform at all (all CPUs are about to go idle), then you look at your priority-queue and ask how long before the next scheduled event.. Then you can make a CPU go to sleep for that long (using interrupt controllers if need be).
Generally you're going to wake up 32 times a second anyway, and the marginal overhead of re-testing a time against the priority-heap-head 32 times a second is nominal (I can run 800,000 java-based time-of-day calls per second.. plenty of room for those 32).
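
A user-space sketch of that priority-queue timer idea in Python, assuming deadlines are plain floats and the current time is passed in explicitly (to keep it deterministic). `TimerQueue` and its method names are hypothetical, not any OS or library API.

```python
import heapq
import itertools

class TimerQueue:
    """Min-heap of (deadline, seq, name); the head is always next to fire."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # tie-breaker for equal deadlines

    def schedule(self, deadline, name):
        # O(log n) insert: the only place the queue size matters.
        heapq.heappush(self._heap, (deadline, next(self._seq), name))

    def time_until_next(self, now):
        """How long an idle CPU could sleep; None if nothing is scheduled."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - now)

    def pop_due(self, now):
        """Cheap check of the head only; pops every task whose time has come."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            _deadline, _seq, name = heapq.heappop(self._heap)
            due.append(name)
        return due

if __name__ == "__main__":
    q = TimerQueue()
    q.schedule(5.0, "flush-cache")
    q.schedule(1.0, "socket-timeout")
    print(q.pop_due(now=0.5), q.time_until_next(now=0.5))
    print(q.pop_due(now=5.0))
```

The periodic check only ever looks at the head of the heap, so its cost does not depend on how far in the future the other entries are.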

To boot, the OS doesn't need to know about all the scheduled tasks. It only needs to know about one per thread at most (generally only about 0 .. 3 at any given time, with a few socket-timeout type apps bringing it into the hundreds). Apps can, for example implement their own timer logic that mimics this priority-queue model (java does, for example).. Thus one thread can be OS-bound on a timer that is the nearest temporal event in a pool of potentially thousands of scheduled events (e.g. a Java Timer or memory-based Quartz).

What I see the greatest problem with are peer-CPU's modifying common cache lines..

Namely if you have a job-queue data-structure where N threads are pulling/pushing out-of/on-to, then you have a single spot in memory that EVERY thread must modify in order to transfer work.. This is a massive bottleneck.. One that ironically you don't have in single CPU configurations. This is something that I think CPUs can work to address. Especially as co-process and message-passing systems become more prevalent (erlang type languages or message-queue NoSQL models).

One reason Intel CPUs suffer from this is that when two CPUs concurrently modify a cache-line, everybody is forced to 'flush' their cache line and re-read from RAM.. This not only makes those accesses no faster than main RAM, it's competing with everything else you want to do with RAM - increasing latency massively. I believe AMD's MOESI allows peer CPUs to read from CPU-A's cache line instead of main RAM. But what I think would be better is to have CPUs be able to coordinate using a monitor thread for such data-structure modifications.

Namely instead of having 80 cores on a motherboard fighting to keep their cache-lines coherent, they can delegate a micro-instruction to a single CPU with a hyper-hot suite of cache (e.g. no other thread would ever read/write to that cache-line). The BUS would transfer directly from CPU-N to CPU-x (the hot CPU) the message that needs to be enqueue'd / dequeue'd. This would run at BUS speeds (e.g. pretty damn slow), but if the enqueue/dequeue was sufficiently complex (say if it was a log(n) priority enqueue/dequeue operation) that the BUS speed would be masked by the computational overhead.

So you'd need help from the motherboard, CPU, OS and programming structures (namely it can't be so exotic that nobody'd bother to risk implementing useful software with this programming model).

Well, we kind of already have that with NoSQL solutions.. Currently a lot of them use TCP - some of which can support UNIX-sockets (which then can be hidden as OS FIFO buffers, which could then be optimized into special kernel structures, like the tux webserver once did, then finally into hardware level messages). Incidentally, Windows and Mac OS X also support a lot of 'event' structures in their basic programming models - so they are also potential candidates.

Comment: Re:Single thread performance (Score 2) 127

by maraist (#37534330) Attached to: Oracle Demos New SPARC T4 Processor
The problem is that hyperthreading CPUs and IA-64 EPIC are predicated around floating point performance. The idea is that if you're FPU bound, then you want to minimize RAM latency by flipping between threads while you have FPU-load stalls.. You add speculative execution, predicate registers, pipeline execution stacks to minimize branch-misses, etc. But it's all about FPU with 200 clock execution times (e.g. divides and transcendental ops - as with FFT).

But I'm sorry, no matter how fast you make their FPUs, they're not going to beat FPGA or ASIC or raw-silicon GPUs. These bastards optimize memory paths and reduce critical path latencies.. The only advantage CPUs have over GPUs is that you can context switch unrelated tasks better than with GPUs.

A vast majority of apps in the world are NOT FPU based. They are pure integer. And moreover, these days, they are RAM constrained.. If you're writing a NoSQL DB procedure to perform zlib or merge-sorting or state-machine syntax parsing, FPU oriented architectures are of ZERO benefit. This is all RAM -> branch-prediction related. That is, read-data, make a decision, jump to new code (which triggers new RAM loads) run two or three instructions, then repeat. While SOME of the app state-tables and code-paths can get cached efficiently, the input stream is generally far larger than your L3 cache (on the order of gigabytes).

So, SOME of the memory pre-loading, branch-prediction, and on-stalled-thread-ctx-switch could be leveraged.. But MT apps suffer from barriers in critical regions.. Namely if you memory stall while holding a lock, you cripple the parallel performance.

Co-processes are very efficient (e.g. apache pre-fork, postgres co-threads with specific shared-mem-segments, erlang, ruby-unicorn, etc) in that they organize very small messages to pass between processes and keep all remaining cache-lines isolated to their single thread and thus semi-dedicated CPUs. This can very nicely leverage co-processors without necessarily saturating RAM -though if the apps themselves are RAM-bound you still have problems; BUT if you have NUMA, the CPU can segment memory spaces better with co-processes than with MT. That being said, the SUN light-weight-threads are (I believe) designed around shared memory-spaces having minimal context-switching time v.s. posix-threads or normal co-processes, so they can't really take advantage of co-processes as well as MT.. So SUN light-threads are forced to endure potentially bad programming by DB, file-IO, OS, signal-processing applications.. Namely if you can't create isolated memory regions (malloc/free-locks, IO/pipe-locks, concurrent-data-structure 'critical-region' locks, etc), you'll find yourself dirtying shared cache-lines so often, you actually find yourself running slower than if you were just single-threaded.

I know, for example, a simple merge-sort can run significantly slower (3x) when run in parallel v.s. single-threaded predominantly because of intel's MESI implementation. Well, not necessarily 3x slower human-time wise, but in consumed CPU time with little or no visible decrease in human-response-time.

As another example, mysql INNODB had an inverse performance curve for the longest time.. Meaning, the more physical CPUs you added, the SLOWER its total throughput would be.. Predominantly due to excessive critical-region locks. Many of those locks have been replaced with less-accurate atomic spin-locks (as with sequence-counters). Namely you can now 'lose' a primary key's sequential value under the right circumstances - but at the benefit of removing a major classic stall-point. But INNODB is still full of complex algorithms that require critical-regions. Lock-free-code is really hard and is very limiting. But that isn't to say people haven't figured out how to architect good designs. 'redis' NoSQL and erlang based apps (like rabbit-MQ) are good examples.. Namely copy-on-write small data-structures.

But there are two types of apps that have lots of parallel threads. Those with MASSIVE memory requirements and those that
