
Comment Oh my... (Score 5, Informative) 98

"a high-performance task scheduling engine written (perplexingly) in Python"

guys, there is this thing, it's called "algorithm"....

yeah.... except that algorithm took a staggering 3 months to develop. and it wasn't one algorithm, it was several, along with creating a networking IPC stack and having to create several unusual client-server design decisions. i can't go into the details because i was working in a secure environment, but basically even though i was the one that wrote the code i was taken aback that *python* - a scripted programming language - was capable of such extreme processing rates.

normally those kinds of speed rates would be associated with c for example.

but the key point of the article - leaving that speed aside - is that if something like PostgreSQL had been used as the back-end store, that rate would be somewhere around 30,000 tasks per second or possibly even less than that, over the long term, because of the overwhelming overhead associated with SQL (and NoSQL) databases maintaining transaction logs and making other guarantees in ways that are clearly *significantly* less efficient than the way that LMDB does it, by way of those guarantees being integrated at a fundamental design level into LMDB.

Comment I can't wait for it (Score 1) 98

At some point there will be an article on Wikipedia, that only meets Wikipedia's notability requirements due to media spillover complaining about the notability requirements.

yaaay! :) works for me. wasn't there a journalist who published a blog and used that as the only notable reference to create a fake article? :)

Comment Would it hurt ... (Score 5, Informative) 98

OpenLDAP was originally using Berkeley DB, until recently. they'd worked with it for years, and got fed up with it. in order to minimise the amount of disruption to the code-base, LMDB was written as a near-drop-in replacement.

LMDB is - according to the web site and also the deleted wikipedia page - a key-value store. however its performance absolutely pisses over everything else around it, on pretty much every metric that can be measured, with very few exceptions.

basically howard's extensive experience combined with the intelligence to do thorough research (even to computing papers dating back to the 1960s) led him to make some absolutely critical but perfectly rational design choices, the ultimate combination of which is that LMDB outshines pretty much every key-value store ever written.

i mean, if you are running benchmark programs in *python* and getting sequential read access to records at a rate of 2,500,000 (2.5 MILLION) records per second... in a *scripted* programming language for goodness sake... then they have to be doing something right.

the random write speed of the python-based benchmarks showed 250,000 records written per second. the _sequential_ ones managed just over 900,000 per second!

there are several key differences between Berkeley DB's API and LMDB's API. the first is that LMDB can be put into "append" mode (as mentioned above). basically what you do is you *guarantee* that the key of each new record is lexicographically greater than the keys of all other records. with this guarantee LMDB basically lets you put the new record _right_ at the end of its B+ Tree. this results in something like an astonishing 5x performance increase in writes.
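to make the append idea concrete, here is a minimal sketch in python. it is a toy sorted store built on a plain list, NOT LMDB's B+ Tree and not the Python-LMDB API; the class and method names (SortedStore, put, append) are made up for illustration. the only point is that when the caller guarantees the new key is greater than everything already stored, the search-and-shift step can be skipped.

```python
import bisect


class SortedStore:
    """Toy sorted key-value store (an illustration, not LMDB)."""

    def __init__(self):
        self.keys = []    # kept in lexicographic order at all times
        self.values = []  # values[i] belongs to keys[i]

    def put(self, key, value):
        # General case: binary-search for the slot, then shift later entries.
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value  # overwrite an existing key
        else:
            self.keys.insert(i, key)
            self.values.insert(i, value)

    def append(self, key, value):
        # Append mode: the caller promises key > every existing key, so the
        # only check needed is a comparison against the current last key.
        if self.keys and key <= self.keys[-1]:
            raise KeyError("append requires a key greater than all existing keys")
        self.keys.append(key)
        self.values.append(value)


store = SortedStore()
for n in range(5):
    # Zero-padded keys keep lexicographic order equal to numeric order.
    store.append(b"task-%08d" % n, b"payload")
store.put(b"task-00000002", b"updated")  # ordinary insert/overwrite still works
```

note the zero-padding of the keys: without it, b"task-10" would sort before b"task-9" and the append guarantee would be broken.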

the second key difference is that LMDB allows you to add duplicate values per key. in fact i think there's also a special mode (never used it) where if you do guaranteed fixed (identical) record sizes LMDB will let you store the values in a more space-efficient manner.

so it's pretty sophisticated.

from a technical perspective, there are two key differences between LMDB and *all* other key-value stores.

the first is: it uses "append-only" when adding new records. basically this has some guarantees that there can never be any corruption of existing data just because a new record is added.

the second is: it uses shared memory "copy-on-write" semantics. what that means is that the (one allowed) writer NEVER - and i mean never - blocks readers, whilst importantly being able to guarantee data integrity and transaction atomicity as well.

the way this is achieved is that because Copy-on-write is enabled, the "writer" may make as many writes as it wants, knowing full well that all the readers will NOT be interfered with (because any write creates a COPY of the memory page being written to). then, finally, once everything is done, and the new top level parent B+ Tree is finished, the VERY last thing is a single simple LOCK, update-pointer-to-top-level, UNLOCK.

so as long as Reads do the exact same LOCK, get-pointer-to-top-level-of-B+-Tree, UNLOCK, there is NO FURTHER NEED for any kind of locking AT ALL.
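the scheme described above can be sketched in a few lines of python. this is a toy model under loud assumptions: it copies a whole dict on every commit, where a real copy-on-write B+ Tree would copy only the pages touched, and the names (CowStore, snapshot, write) are hypothetical, not anything from LMDB or Python-LMDB. what it does show is the shape of the locking: one lock held only long enough to read or swap the root pointer.

```python
import threading


class CowStore:
    """Toy copy-on-write store: every commit publishes a brand-new snapshot."""

    def __init__(self):
        self._root = {}                       # current snapshot, never mutated in place
        self._root_lock = threading.Lock()    # guards only the root pointer
        self._writer_lock = threading.Lock()  # enforces the single writer

    def snapshot(self):
        # Reader: LOCK, get pointer to the current root, UNLOCK.
        with self._root_lock:
            return self._root

    def write(self, updates):
        with self._writer_lock:
            # Work on a private copy; readers holding the old root are untouched.
            new_root = dict(self.snapshot())
            new_root.update(updates)
            # Commit: LOCK, update pointer to the new root, UNLOCK.
            with self._root_lock:
                self._root = new_root


store = CowStore()
store.write({"a": 1})
before = store.snapshot()      # a reader grabs the root...
store.write({"a": 2, "b": 3})  # ...then the writer commits a new version
after = store.snapshot()
```

the reader that took `before` still sees the old, fully consistent version after the second commit, and never waited on the writer while using it.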

i am just simply amazed at the simplicity, and how this technique has just... never been deployed in any database engine before, until now. the reasons as howard makes clear are that the original research back in the 1960s was restricted to 32-bit memory spaces. now we have 64-bit so shared memory may refer to absolutely enormous files, so there is no problem deploying this technique, now.

all incredibly cool.

Submission + - Python-LMDB in a high-performance environment

lkcl writes: In an open letter to the core developers behind OpenLDAP (Howard Chu) and Python-LMDB (David Wilson) is a story of a successful creation of a high-performance task scheduling engine written (perplexingly) in python. With only partial optimisation allowing tasks to be executed in parallel at a phenomenal rate of 240,000 per second, the choice to use Python-LMDB for the per-task database store based on its benchmarks as well as its well-researched design criteria turned out to be the right decision. Part of the success was also due to earlier architectural advice gratefully received here on slashdot. What is puzzling though is that LMDB on wikipedia is being constantly deleted, despite its "notability" by way of being used in a seriously-long list of prominent software libre projects, which has been, in part, motivated by the Oracle-driven BerkeleyDB license change. It would appear that the original complaint about notability came from an Oracle employee as well...

Comment Haha... Yeh that's the problem :/ (Score 1) 150

Storage is hardly the issue. Most companies won't have anywhere near a petabyte to move.

The real problem is whether PaaS or SaaS will screw you. If all your data is written to run on a platform which is closed (AWS, Google...) you're utterly screwed. Cloud software is also never updated like proper applications. Improvements are made incrementally and if AWS went tits up, even if you manage to get a copy of the hosting platform, you'll be stuck with whatever bugs were in the last build.

IaaS isn't too bad, but otherwise Cloud is just a BAD idea.

Comment pay them!! (Score 3, Interesting) 265

the key point that people keep missing is that corporations - which are legally obligated to maximise profits - take whatever they can get "for free". software libre developers *do not have* the opportunity that is normally present in business transactions to present the person receiving their work with the VERY IMPORTANT opportunity to transfer to that developer a reward (payment) which represents the value of the software that the person is receiving.

so it should come as absolutely no surprise that those software libre developers are not equipped with the financial means to support themselves (the Gentoo leader ending up with a $50,000 credit-card debt and having to quit and go work for Microsoft is an example that springs to mind) and they *CERTAINLY* don't have the financial means to pay for e.g. security reviews or security tools.

the solution is incredibly simple: if you are using software libre for your business, PAY THE DEVELOPERS. find a way. pick a project that's important or fundamental to your business, and PAY THEM.

Comment Re: Pay me once, shame on me. (Score 1) 106

I have released documents and designs for quite a few technologies in the past. This is a topic which has always interested me, though I simply am not interested in building a business making these robots. I have drawings for multiple designs that, when used in conjunction, can handle most picking-related issues. I will not likely enter this competition. The cost of entering is too high and carries too big a risk of walking away without my expenses covered.

I think $100,000 first prize, $50,000 second and $20,000 third would have piqued my interest. But $20,000 for a first prize just isn't enough to bother with.

Comment Re:Well DUH! (Score 1) 403

It tells you exactly why in the article. It's the way people drive them.

If you try to push a small engine to drive like a larger one, you'll be accelerating harder, therefore using more fuel than under normal acceleration.

there's a little more to it than that. i worked for a company back in 1993 where i was asked to write a vehicle simulator for Detroit Diesel. smaller engines *consistently* under-performed against larger engines, and the reason is simple: when expected to keep up with the demands placed on it, a smaller engine, in order to deliver the demanded power, has to operate *well* outside of its most efficient power-band.

a more powerful engine can deliver the power expected of it whilst operating within its most efficient torque/RPM range, and have a much wider selection of gears that will achieve the power that the driver expects and demands.

in other words you need to rev the nuts off of a smaller engine and use low gears to get the same acceleration.

which is why, with that grand cherokee, your wife could not get more than 11mpg, because she was basically driving it very very hard, and you were not. i presume it was an automatic. if so, your driving style allowed the on-board computer to choose the most fuel-efficient gear, whereas your wife's foot-to-the-floor approach made the on-board computer eliminate all gears but the one that delivered the maximum power demanded. that meant that the engine pretty much operated all the time at the redline.... right where it is at its most inefficient.

simple really...

Comment Is this a double bluff ? (Score 1) 575

The spooks get various government loudmouths to complain about how the current adoption of better device security is only helping terrorists/drug-dealers/paedophiles/... and so please, pretty please, do not do it. The result is that those concerned about privacy & the tech-savvy crowd think ''f**k you - we now have our privacy back''.

The reality is that the NSA/GCHQ/... have the current technologies sussed/back-doored but are scared shitless that something better will be adopted. So: they convince us all that we have them on the back foot and so do not implement anything better.

Whatever the truth of the matter: we MUST continue to implement ever better security on all our devices - complacency is our enemy!

Comment Re:FP? (Score 1) 942

I regularly need to convert from ancient Egyptian cubits which I was lucky enough to learn about in grade school. We should always learn the different unit measures in primary school. They're simple enough. It's not like it takes even a tiny bit of intelligence to understand how to convert.

Of course... in engineering and sciences, we already use metric across the board. It's in daily life that the simpler imperial measure system makes sense. I live in Europe and grew up in the States. I've never been confused by measurements in either, but when I cook, instead of measuring 450 grams (my scale isn't that good) I simply grab a chunk of meat which is a pound. It's a proper size for cooking. I also use a cup of water or milk.

Honestly, I know A LOT of people who moved to America and had no problem with American standard measure and I know many who moved from America or England who had no problem with metric. I just don't see how knowing both is a problem.

Comment Re:FP? (Score 5, Insightful) 942

Bullshit!

Vasa was built asymmetrically because it was a Swedish engineering project. All Swedish engineering projects by definition must start big, go way over-budget, become completely unusable and reach market so late that they're no longer interesting. The project then burns to ashes, rises from the ashes reborn as something amazing and gets sold to someone else. As an example look at "ericsson pipe rider cable modem" on Google and you'll see a proper Swedish engineering project that went so completely shitty that it would have killed the company, and ended up rising from the ashes as a patent pool on the 10,000 things they created while failing at this.

This is why I refer to all products resulting from failed Swedish projects as Vasa Projects.

Comment A matter of perspective (Score 2) 57

In my case it was because I'm a lazy bastard. I needed a bug tracking module (exception details are turned into a unified format to report to a bug tracking server) and I came across one that was already 95% of what I wanted, so I simply contributed enhancements until the final 5% were covered.
The same thing got a former colleague of mine involved in the Firebug project until he became a regular contributor.

Comment trust vs respect (Score 2) 460

'Scientists have earned the respect of Americans but not necessarily their trust,' said lead author Susan Fiske, the Eugene Higgins Professor of Psychology and professor of public affairs

it was only fairly recently that someone explained the absolutely crucial difference between trust and respect, and it knocked me sideways. i used to always accept the "wisdom" that trust is EARNED.

trust - literally by definition- CANNOT be EARNED.

*respect* can be earned, because to respect someone (or something) you learn from PAST experience and PAST actions, you make a judgement call "that thing (or person) did something cool [in the PAST], and i liked it."

trust - by definition - refers to the FUTURE. i am - in the FUTURE - going to give someone the power and authority to do something. i (the person doing the trusting) actually have absolutely NO CLUE as to whether in the FUTURE, regardless of PAST performance, the person will do what they say that they can do.

how on earth can _anyone_ say, "you earned (past tense) my trust (future decision-making)"????

this is how wars are started (and sustained), by people confusing past and future in relation to trust and respect.

so this is where it gets interesting, because the original article is actually making TWO completely SEPARATE and distinct statements:

1) the american public has analysed the PAST actions of scientists, and finds that those actions are [in some way] cool enough to be respected (past tense)

2) the american public has, within themselves, insufficient knowledge about what it is that scientists do - and this has absolutely nothing to do with the scientists but EVERYTHING to do with "the american public" - in order to take the [frightening!] step of placing their trust in the FUTURE decision-making of some individuals-that-happen-to-be-scientists.

i cannot emphasise enough that a decision *to* trust has absolutely nothing to do with the person or thing that you are trusting. the *decision* to place trust in someone else really *really* is something that has absolutely nothing to do with the *analysis* of whether *to* trust.

this is where people get terribly confused. they do some analysis (based usually on past performance), and then they have to make a decision. they *believe* that the [past] analysis *IS* trust. it's not!! even once the [past] analysis has been done, you *still* need to take that step - to trust.

the link between respect and trust is that it is *usually* the respect that we have for people which tips our analysis in favour of certain individuals. but the analysis is NOT respect itself, just as trust (the decision to trust) is not the same thing as respect _either_.

now what i find ironic is that it is someone with a degree in psychology that is talking about trust being "earned". if someone whom the american public implicitly "trusts" (because they have a PhD) is saying "trust is earned" then how is anyone else supposed to know the difference between trust and respect??

Comment custom coding time (Score 1) 97

i wrote a video upload and playback system for a christian-based financial advice organisation that was uncomfortable with the idea of having youtube advertising messages in direct contravention of the advice that they were giving their clients.

the "normal" way to do what you are asking would be to simply have a plugin that allows you to specify the youtube URL, and it would be embedded... this is not very hard to do, and, if there is not something out there already, consider paying a programmer to do it. they should not take very long [of the order of days].

however... if, like the christian-based financial advice organisation that i had to create an entire video upload, storage and playback system for, the use of youtube is completely inappropriate for your organisation (because the videos are to be kept confidential for example) then there really isn't anything out there (i looked) and you will need to write your own.

for this task you should allocate at least two to three months, if you have access to good programmers, bearing in mind that you will need both front-end developers as well as back-end server capable engineers. one of the problems to solve (in basically reinventing youtube) is that the videos need to be converted to several different formats in order to make it possible to play them back on multiple browser engines.
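to give a feel for the conversion step, here is a rough sketch in python that only *builds* the command lines, it does not run them. everything specific in it is an assumption for illustration: the three target formats, the codec choices, and the use of ffmpeg at all (this is not a description of the system i actually wrote). it also assumes an ffmpeg build that includes the libx264, libvpx, libtheora and libvorbis encoders.

```python
import os

# Hypothetical target formats and codec options (assumptions, see above).
TARGETS = {
    "mp4": ["-c:v", "libx264", "-c:a", "aac"],
    "webm": ["-c:v", "libvpx", "-c:a", "libvorbis"],
    "ogv": ["-c:v", "libtheora", "-c:a", "libvorbis"],
}


def transcode_commands(source):
    """Return one ffmpeg argument list per target format for an uploaded file."""
    stem, _ext = os.path.splitext(source)
    return [
        ["ffmpeg", "-i", source, *options, f"{stem}.{fmt}"]
        for fmt, options in TARGETS.items()
    ]


commands = transcode_commands("talk.mov")
```

each list could then be handed to something like subprocess.run() from a background worker, so that a slow conversion never blocks the upload request itself.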

if this is the path you've chosen then i can help save you some time. but please think carefully about what it is that you need. as a number of other people have pointed out you've said "i need a wiki to store videos" when actually what you _should_ have said is "what's the best way to offer people in-house training videos" and qualified that potentially with a list of options such as "my budget is $X" and "my time is Y" and "my in-house skill-set is A B and C".

Comment Rushing to mars is crap science (Score 5, Insightful) 267

We still don't have a station orbiting the moon. We don't have a station on the moon. We don't have a sustainable system within our own lunar orbit.

The only reason a Mars mission is one way is because we insist on building the vehicles and launching from Earth.

The cost of launching from earth is much higher than from space because we have to break Earth's gravity and pass through the atmosphere.

We picked on India for making it to Mars by basically cutting corners and just slingshotting a chunk of cheap crap at Mars and then said "ours costs more because we're more conservative". What's our response? Throw a huge expensive chunk of metal at Mars to prove we do it better.

Build the next space station already. Build it big and ship it people and supplies and do it there. If we can't accomplish that, we don't belong in space.
