
Comment Oh my... (Score 5, Informative) 98

"a high-performance task scheduling engine written (perplexingly) in Python"

guys, there is this thing, it's called "algorithm"....

yeah.... except that algorithm took a staggering 3 months to develop. and it wasn't one algorithm, it was several, along with a networking IPC stack and several unusual client-server design decisions. i can't go into the details because i was working in a secure environment, but basically even though i was the one that wrote the code i was taken aback that *python* - a scripted programming language - was capable of such extreme processing rates.

normally those kinds of rates would be associated with c, for example.

but the key point of the article - leaving that speed aside - is that if something like PostgreSQL had been used as the back-end store, the rate would have been somewhere around 30,000 tasks per second over the long term, possibly even less. that's because of the overwhelming overhead that SQL (and NoSQL) databases incur in maintaining transaction logs and making other guarantees - in ways that are clearly *significantly* less efficient than LMDB's, whose guarantees are integrated into the design at a fundamental level.
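to make the transaction-overhead point concrete, here is a toy sketch (nothing to do with the original benchmark) using python's built-in sqlite3: committing one transaction per task versus batching all tasks into a single transaction. the absolute numbers are machine-dependent; it's the ratio that illustrates the overhead.

```python
import sqlite3
import time

# toy illustration (not the original benchmark): per-task COMMITs are
# where a SQL back-end spends its time. batching rows into a single
# transaction amortises that overhead. absolute timings are
# machine-dependent; only the ratio matters.
N = 5_000

def insert_tasks(rows_per_txn: int) -> float:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, payload TEXT)")
    start = time.perf_counter()
    inserted = 0
    while inserted < N:
        with conn:  # opens and commits exactly one transaction
            for _ in range(min(rows_per_txn, N - inserted)):
                conn.execute("INSERT INTO tasks (payload) VALUES ('t')")
                inserted += 1
    elapsed = time.perf_counter() - start
    assert conn.execute("SELECT COUNT(*) FROM tasks").fetchone()[0] == N
    conn.close()
    return elapsed

one_per_txn = insert_tasks(1)   # one COMMIT per task
batched = insert_tasks(N)       # one COMMIT for everything
print(f"1 row/txn: {one_per_txn:.3f}s  batched: {batched:.3f}s")
```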

Comment I can't wait for it (Score 1) 98

At some point there will be an article on Wikipedia, that only meets Wikipedia's notability requirements due to media spillover complaining about the notability requirements.

yaaay! :) works for me. wasn't there a journalist who published a blog and used that as the only notable reference to create a fake article? :)

Comment Would it hurt ... (Score 5, Informative) 98

OpenLDAP originally used Berkeley DB; they'd worked with it for years, and got fed up with it. in order to minimise the amount of disruption to the code-base, LMDB was written as a near-drop-in replacement.

LMDB is - according to the web site and also the deleted wikipedia page - a key-value store. however its performance absolutely pisses over everything else around it, on pretty much every metric that can be measured, with very few exceptions.

basically howard's extensive experience combined with the intelligence to do thorough research (even to computing papers dating back to the 1960s) led him to make some absolutely critical but perfectly rational design choices, the ultimate combination of which is that LMDB outshines pretty much every key-value store ever written.

i mean, if you are running benchmark programs in *python* and getting sequential read access to records at a rate of 2,500,000 (2.5 MILLION) records per second... in a *scripted* programming language for goodness sake... then they have to be doing something right.

the random write speed of the python-based benchmarks showed 250,000 records written per second. the _sequential_ ones managed just over 900,000 per second!

there are several key differences between Berkeley DB's API and LMDB's API. the first is that LMDB can be put into "append" mode. basically what you do is you *guarantee* that the key of each new record is lexicographically greater than the keys of all existing records. with this guarantee LMDB basically lets you put the new record _right_ at the end of its B+ tree. this results in something like an astonishing 5x performance increase in writes.
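a toy sketch of what the append-mode contract buys you (this is *not* the real LMDB implementation; in the python-lmdb binding the real thing is, if i recall correctly, an `append=True` flag on put): if keys are guaranteed to arrive in increasing order, a single comparison replaces the whole tree descent.

```python
import bisect

# toy sketch of the append-mode contract (NOT the real LMDB
# implementation): if the caller *guarantees* each new key sorts after
# every existing key, the store can skip the B+ tree descent entirely
# and bolt the record onto the end.
class AppendOnlyStore:
    def __init__(self):
        self._keys = []    # stand-in for the rightmost leaf of a B+ tree
        self._values = []

    def put_append(self, key: bytes, value: bytes) -> None:
        # the whole point: one O(1) comparison instead of an O(log n) search
        if self._keys and key <= self._keys[-1]:
            raise ValueError("append mode needs strictly increasing keys")
        self._keys.append(key)
        self._values.append(value)

    def get(self, key: bytes) -> bytes:
        # reads still do an ordinary binary search
        i = bisect.bisect_left(self._keys, key)
        if i == len(self._keys) or self._keys[i] != key:
            raise KeyError(key)
        return self._values[i]

store = AppendOnlyStore()
for n in range(5):
    store.put_append(b"%08d" % n, b"record")
assert store.get(b"%08d" % 3) == b"record"
```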

the second key difference is that LMDB allows you to store duplicate values per key. in fact i think there's also a special mode (never used it) where, if you guarantee fixed (identical) record sizes, LMDB will store the values in a more space-efficient manner.
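a toy sketch of the duplicate-values idea (LMDB's flags for this are, i believe, MDB_DUPSORT and - for the fixed-size variant - MDB_DUPFIXED; this sketch is purely illustrative, not LMDB's actual storage layout):

```python
import bisect
from collections import defaultdict

# toy sketch of duplicate-values-per-key: each key maps to a sorted
# run of values rather than to a single value, so inserting the same
# key twice keeps both entries.
class DupStore:
    def __init__(self):
        self._data = defaultdict(list)

    def put(self, key: bytes, value: bytes) -> None:
        bisect.insort(self._data[key], value)  # duplicates kept, sorted

    def get_all(self, key: bytes) -> list[bytes]:
        return list(self._data[key])

db = DupStore()
db.put(b"user:1", b"session-b")
db.put(b"user:1", b"session-a")
assert db.get_all(b"user:1") == [b"session-a", b"session-b"]
```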

so it's pretty sophisticated.

from a technical perspective, there are two key differences between LMDB and *all* other key-value stores.

the first is: it uses "append-only" when adding new records. basically this guarantees that there can never be any corruption of existing data just because a new record is added.

the second is: it uses shared memory "copy-on-write" semantics. what that means is that the (one allowed) writer NEVER - and i mean never - blocks readers, whilst importantly being able to guarantee data integrity and transaction atomicity as well.

the way this is achieved is that because copy-on-write is enabled, the "writer" may make as many writes as it wants, knowing full well that the readers will NOT be interfered with (because any write creates a COPY of the memory page being written to). then, finally, once everything is done and the new top-level parent of the B+ tree is finished, the VERY last thing is a single simple LOCK, update-pointer-to-top-level, UNLOCK.

so as long as Reads do the exact same LOCK, get-pointer-to-top-level-of-B-Tree, UNLOCK, there is NO FURTHER NEED for any kind of locking AT ALL.
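the whole scheme above can be sketched in a few lines of python (purely illustrative - not LMDB's actual implementation): the writer mutates only private copies, and the single lock anywhere in the system guards the root-pointer handoff.

```python
import threading

# toy sketch of the single-writer copy-on-write scheme described above
# (NOT LMDB's actual implementation). the writer rebuilds what it
# touches as copies, then publishes the new tree with one brief lock
# around a single pointer update. readers take the same lock only long
# enough to grab the current root; everything they then read is an
# immutable snapshot, so no further locking is needed.
class CowStore:
    def __init__(self):
        self._root = {}                  # immutable-by-convention snapshot
        self._root_lock = threading.Lock()

    def write_txn(self, updates: dict) -> None:
        # copy-on-write: mutate a private copy, never the live root
        new_root = dict(self._root)
        new_root.update(updates)
        with self._root_lock:            # LOCK, update pointer, UNLOCK
            self._root = new_root

    def read_snapshot(self) -> dict:
        with self._root_lock:            # LOCK, fetch pointer, UNLOCK
            return self._root            # never mutated after publication

store = CowStore()
store.write_txn({b"a": 1})
snap = store.read_snapshot()
store.write_txn({b"a": 2})
assert snap[b"a"] == 1                   # old readers keep the old tree
assert store.read_snapshot()[b"a"] == 2  # new readers see the new one
```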

i am just simply amazed at the simplicity, and that this technique had just... never been deployed in any database engine before, until now. the reason, as howard makes clear, is that the original research back in the 1960s was restricted to 32-bit memory spaces. now we have 64-bit address spaces, so a shared memory map may refer to absolutely enormous files, and there is no problem deploying this technique today.

all incredibly cool.

Submission + - Python-LMDB in a high-performance environment

lkcl writes: In an open letter to the core developers behind OpenLDAP (Howard Chu) and Python-LMDB (David Wilson), here is the story of the successful creation of a high-performance task scheduling engine written (perplexingly) in python. With only partial optimisation, allowing tasks to be executed in parallel at a phenomenal rate of 240,000 per second, the choice to use Python-LMDB for the per-task database store - based on its benchmarks as well as its well-researched design criteria - turned out to be the right decision. Part of the success was also due to earlier architectural advice gratefully received here on slashdot. What is puzzling, though, is that LMDB's wikipedia page is being constantly deleted, despite its "notability" by way of being used in a seriously-long list of prominent software libre projects - adoption which has been, in part, motivated by the Oracle-driven BerkeleyDB license change. It would appear that the original complaint about notability came from an Oracle employee as well...

Comment pay them!! (Score 3, Interesting) 265

the key point that people keep missing is that corporations - which are legally obligated to maximise profits - take whatever they can get "for free". software libre developers *do not have* what is normally present in business transactions: the VERY IMPORTANT opportunity for the person receiving their work to transfer to the developer a reward (payment) which represents the value of the software being received.

so it should come as absolutely no surprise that those software libre developers are not equipped with the financial means to support themselves (the Gentoo leader ending up with a $50,000 credit-card debt and having to quit and go work for Microsoft is an example that springs to mind) and they *CERTAINLY* don't have the financial means to pay for e.g. security reviews or security tools.

the solution is incredibly simple: if you are using software libre for your business, PAY THE DEVELOPERS. find a way. pick a project that's important or fundamental to your business, and PAY THEM.

Comment Re:Well DUH! (Score 1) 403

It tells you exactly why in the article. It's the way people drive them.

If you try to push a small engine to drive like a larger one, you'll be accelerating harder, therefore using more fuel than under normal acceleration.

there's a little more to it than that. i worked for a company back in 1993 where i was asked to write a vehicle simulator for Detroit Diesel. smaller engines *consistently* under-performed against larger engines, and the reason is simple: when expected to keep up with the demands placed on it, a smaller engine has to operate *well* outside of its most efficient power-band in order to deliver the demanded power.

a more powerful engine can deliver the power expected of it whilst operating within its most efficient torque/RPM range, and has a much wider selection of gears that will achieve the power the driver expects and demands.

in other words you need to rev the nuts off of a smaller engine and use low gears to get the same acceleration.

which is why, with that grand cherokee, your wife could not get more than 11mpg, because she was basically driving it very very hard, and you were not. i presume it was an automatic. if so, your driving style allowed the on-board computer to choose the most fuel-efficient gear, whereas your wife's foot-to-the-floor approach made the on-board computer eliminate all gears but the one that delivered the maximum power demanded. that meant that the engine pretty much operated all the time at the redline.... right where it is at its most inefficient.
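a toy calculation (illustrative numbers only, nothing from the actual simulator) of why the small engine loses: fuel burn is demanded power divided by efficiency, and efficiency collapses outside the engine's best rpm band.

```python
# illustrative toy numbers only (nothing from the Detroit Diesel
# simulator): fuel burn is demanded power divided by efficiency, and
# efficiency falls off sharply away from the engine's best rpm band.
def efficiency(rpm: float, best_rpm: float) -> float:
    # crude bell-shaped efficiency curve peaking at 0.35 (a plausible
    # peak brake thermal efficiency), with a floor of 0.2 of that peak
    return 0.35 * max(0.2, 1.0 - ((rpm - best_rpm) / best_rpm) ** 2)

demand_kw = 80.0  # power the driver demands for hard acceleration

# the big engine makes 80 kW near its sweet spot; the small engine
# has to rev far past its own sweet spot to make the same power
big_engine = demand_kw / efficiency(2600, best_rpm=2500)
small_engine = demand_kw / efficiency(5800, best_rpm=3000)

print(f"big: {big_engine:.0f} fuel-units  small: {small_engine:.0f}")
assert small_engine > big_engine  # same power demanded, far more fuel
```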

simple really...

Comment trust vs respect (Score 2) 460

'Scientists have earned the respect of Americans but not necessarily their trust,' said lead author Susan Fiske, the Eugene Higgins Professor of Psychology and professor of public affairs

it was only fairly recently that someone explained the absolutely crucial difference between trust and respect, and it knocked me sideways. i used to always accept the "wisdom" that trust is EARNED.

trust - literally, by definition - CANNOT be EARNED.

*respect* can be earned, because to respect someone (or something) you learn from PAST experience and PAST actions, you make a judgement call "that thing (or person) did something cool [in the PAST], and i liked it."

trust - by definition - refers to the FUTURE. i am - in the FUTURE - going to give someone the power and authority to do something. i (the person doing the trusting) actually have absolutely NO CLUE as to whether in the FUTURE, regardless of PAST performance, the person will do what they say that they can do.

how on earth can _anyone_ say, "you earned (past tense) my trust (future decision-making)"????

this is how wars are started (and sustained): by people confusing past and future in relation to trust and respect.

so this is where it gets interesting, because the original article is actually making TWO completely SEPARATE and distinct statements:

1) the american public has analysed the PAST actions of scientists, and finds that those actions are [in some way] cool enough to be respected (past tense)

2) the american public has, within themselves, insufficient knowledge about what it is that scientists do - and this has absolutely nothing to do with the scientists but EVERYTHING to do with "the american public" - in order to take the [frightening!] step of placing their trust in the FUTURE decision-making of some individuals-that-happen-to-be-scientists.

i cannot emphasise enough that a decision *to* trust has absolutely nothing to do with the person or thing that you are trusting. the *decision* to place trust in someone else really *really* is something that has absolutely nothing to do with the *analysis* of whether *to* trust.

this is where people get terribly confused. they do some analysis (based usually on past performance), and then they have to make a decision. they *believe* that the [past] analysis *IS* trust. it's not!! even once the [past] analysis has been done, you *still* need to take that step - to trust.

the link between respect and trust is that it is *usually* the respect that we have for people which tips our analysis in favour of certain individuals. but the analysis is NOT respect itself, just as trust (the decision to trust) is not the same thing as respect _either_.

now what i find ironic is that it is someone with a degree in psychology that is talking about trust being "earned". if someone whom the american public implicitly "trusts" (because they have a PhD) is saying "trust is earned" then how is anyone else supposed to know the difference between trust and respect??

Comment custom coding time (Score 1) 97

i wrote a video upload and playback system for a christian-based financial advice organisation that was uncomfortable with the idea of having youtube advertising messages in direct contravention of the advice that they were giving their clients.

the "normal" way to do what you are asking would be to simply have a plugin that allows you to specify the youtube URL, and it would be embedded... this is not very hard to do, and, if there is not something out there already, consider paying a programmer to do it. they should not take very long [of the order of days].

however... if, like the christian-based financial advice organisation that i had to create an entire video upload, storage and playback system for, the use of youtube is completely inappropriate for your organisation (because the videos are to be kept confidential, for example), then there really isn't anything out there (i looked) and you will need to write your own.

for this task you should allocate at least two to three months, if you have access to good programmers, bearing in mind that you will need both front-end developers and back-end server engineers. one of the problems to solve (in basically reinventing youtube) is that the videos need to be converted to several different formats in order to make playback possible on multiple browser engines.
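as a sketch of that conversion step (assuming ffmpeg is available on the server; the profile table here is illustrative, not from the system i wrote), one invocation per output container:

```python
# sketch of the multi-format conversion step (assumes ffmpeg is on the
# PATH; the codec profiles are illustrative, not from the original
# system). one ffmpeg invocation per output container, to be run
# server-side after upload, e.g. from a job queue.
PROFILES = {
    "mp4":  ["-c:v", "libx264", "-c:a", "aac"],       # h.264 for most browsers
    "webm": ["-c:v", "libvpx-vp9", "-c:a", "libopus"],  # vp9 for the rest
}

def transcode_commands(source: str) -> list[list[str]]:
    # build (but do not run) one command line per target format
    cmds = []
    for ext, codec_args in PROFILES.items():
        out = source.rsplit(".", 1)[0] + "." + ext
        cmds.append(["ffmpeg", "-y", "-i", source, *codec_args, out])
    return cmds

for cmd in transcode_commands("training/upload.mov"):
    print(" ".join(cmd))
```

the commands would then be handed to subprocess.run (or a work queue) so uploads don't block on transcoding.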

if this is the path you've chosen then i can help save you some time. but please think carefully about what it is that you need. as a number of other people have pointed out you've said "i need a wiki to store videos" when actually what you _should_ have said is "what's the best way to offer people in-house training videos" and qualified that potentially with a list of options such as "my budget is $X" and "my time is Y" and "my in-house skill-set is A B and C".

Comment Re:I, Robot from a programmers perspective (Score 1) 165

Don't get me started on Asimov's work. He tried to write a lot about how robots would function with these laws that he invented, but really just ended up writing about a bunch of horrendously programmed robots who underwent zero testing and predictably and catastrophically failed at every single edge case. I do not think there is a single robot in any of his stories that would not self-destruct within 5 minutes of entering the real world.

hooray. someone who actually finally understands the point of the asimov stories. many people reading asimov's work do not understand that it was only in the later works commissioned by the asimov estate (when Caliban - a Zero-Law Robot - is introduced; or when it is finally revealed that Daneel - the robot onto whom Giskard psychically impressed the Zeroth Law, to protect *humanity* - is over 30,000 years old and is the silent architect of the Foundation) that the failure of the Three Laws of Robotics is finally spelled out explicitly, in actual words, instead of being illustrated indirectly through many different stories, just as you describe, wisnoskij.

in the asimov series there _are_ actually robots that are successful. the New Law Robots (those that are permitted to *cooperate* with humans; these actually have some spark of creativity). Caliban - who had a Gravitonic brain - was a Zero Law Robot: an experiment to see if a robot would derive its own laws under free will (it did). and Daneel, whose telepathic ability and the Zeroth Law were given to him by Giskard. these robots are the exception. the three law robots are basically intelligent but entirely devoid of creativity.

you have to think: how can a world containing hundreds of millions of copies of the three laws be anything *but* a danger to human development, when those laws prevent and prohibit any kind of risk-taking?? we already have enough stupid laws on the planet (mostly thanks to america's sue-happy culture and the abusive patent system). we DON'T need idiots trying to implement the failed three laws of robotics.

Comment COM (MSRPC), Objective-C/J and Software Libre (Score 2) 54

in looking at why both apple and microsoft have been overwhelmingly successful i came to the conclusion that it is because both companies are using dynamic object-orientated paradigms that can allow components from disparate programming languages to be accessible at runtime. COM is the reason why, after 20 years, you can find a random Active-X component written two decades ago, plug it into a modern windows computer and it will *work*.

Objective-C is the OO concept taken to the extreme: it's actually built-in to the programming language. COM is a bit more sensible: it's a series of rules (based ultimately on the flattening of data structures into a stream that can be sent over a socket, or via shared memory) which may be implemented in userspace: the c++ implementation has some classes whilst the c implementation has macros, but ultimately you could implement COM in any programming language you cared to.

the first amazing thing about COM (which is based on MSRPC which in turn was originally the OpenGroup's BSD-licensed DCE/RPC source code) is that because it is on top of DCE/RPC (ok MSRPC) you have version-control at the interface layer. the second amazing thing is that they have "co-classes" meaning that an "object" may be "merged" with another (multiple inheritance). when you combine this with the version-control capabilities of DCERPC/MSRPC you get not only binary-interoperability between client and server regardless of how many revisions there are to an API but also you can use co-classes to create "optional parameters" (by combining a function with 3 parameters in one IDL file with another same-named function with 4 parameters in another IDL file, 5 in another and so on).
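the co-class/optional-parameter trick can be sketched in python (a toy dispatch on parameter count - nothing to do with the real MSRPC wire protocol or IDL compiler):

```python
# toy sketch (NOT the real MSRPC/COM machinery) of the co-class trick
# described above: several same-named functions with different
# parameter counts are registered under one logical interface, and
# dispatch picks the "revision" matching what the caller supplied -
# giving the effect of optional parameters with binary compatibility.
class CoClass:
    def __init__(self):
        self._revisions = {}   # arity -> implementation

    def register(self, func):
        self._revisions[func.__code__.co_argcount] = func
        return func

    def call(self, *args):
        try:
            impl = self._revisions[len(args)]
        except KeyError:
            raise TypeError(f"no interface revision takes {len(args)} args")
        return impl(*args)

iface = CoClass()

@iface.register
def connect(host, port, timeout):            # "revision 1" of the IDL
    return (host, port, timeout)

@iface.register
def connect(host, port, timeout, retries):   # later revision adds a param
    return (host, port, timeout, retries)

assert iface.call("db", 5432, 30) == ("db", 5432, 30)
assert iface.call("db", 5432, 30, 5) == ("db", 5432, 30, 5)
```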

the thing is that:

a) to create such infrastructure in the first place takes a hell of a lot of vision, commitment and guts.

b) to mandate the use of such infrastructure, for the good of the company, the users, and the developers, also takes a lot of commitment and guts. when people actually knew what COM was it was *very* unpopular - unfortunately, at the time, there was nothing like python-comtypes (which makes COM so transparent it has the *opposite* problem: it's so easy that programmers go "what's all the fuss about???" and don't realise quite how powerful what they are doing really is).

both microsoft and apple were - are - companies where it was possible to make such top-down decisions and say "This Is The Way It's Gonna Go Down".

now let's take a look at the GNU/Linux community.

the GNU/Linux community does have XPIDL and XPCOM, written by the Mozilla Foundation. XPCOM is "based on" COM. XPCOM has a registry. it has the same API, the same macros, and it even has an IDL compiler (XPIDL). however what it *does not* have is co-classes. co-classes are the absolute, absolute bed-rock of COM and because XPCOM does not have co-classes there have been TEN YEARS of complaints from developers - mostly java developers but also c++ developers - attempting to use Mozilla technology (embedding Gecko is the usual one) and being driven UP THE F******G WALL by binary ABI incompatibility on pretty much every single damn release of the mozilla binaries. one single change to an IDL file results, sadly, in a broken system for these third party developers.

the GNU/Linux community does have CORBA, thanks to Olivetti Labs who released their implementation of CORBA some time back in 1997. CORBA was the competitor to COM, and it was nowhere near as good. Gnome adopted it... but nobody else did.

the GNU/Linux community does have an RPC mechanism in KDE. its first implementation is known famously for having been written in 20 minutes. not much more needs to be said.

the GNU/Linux community does have gobject. gobject is, after nearly fifteen years, beginning to get introspection, and this is beginning to bubble up to the dynamic programming languages such as python. gobject does not have interface revision control.

the GNU/Linux community does actually have a (near full) implementation of MSRPC and COM: it's part of the Wine Project. the project named TangramCOM did make an attempt to separate COM from Wine: had it succeeded, it would have been maintained as a cut-down fork of the Wine Project. the Wine Project developers' answer - if you ask - to making a GNU/Linux application use COM is that you should convert it into a Wine (i.e. a Win32) application. this is not very satisfactory.

in other words, the GNU/Linux community has a set of individuals who are completely discoordinated, getting on with the very important task - and i mean that absolutely genuinely - the very important task of maintaining the code for which they are responsible.

the problems that they deal with are *not* those of coordinating - at a top level - with *other projects*.

now, whilst this "Alliance" may wish to "guide" the development of the GNU/Linux community, ultimately it comes down to money. do these companies have the guts to say - in a nice way of course - "here's a wad of cash, this is a list of tasks, any takers?"

but, also, does this "Alliance" have the guts to ask "what is actually needed"? rather than saying "this is what you need to do, now get on with it" - which would pretty much guarantee no takers at all - wouldn't it be better for them to actually get onto various mailing lists (hundreds if necessary) and canvas the developers in the software libre world: "hey, we have $NNN million available, we'd like to coordinate something cross-project that would make a difference, and we'd like *you* to tell *us* what you think is the best way to spend that money".

where the kinds of ideas floated around could be something as big and ambitious as "converting both KDE and Gnome to use the same runtime-capable object-orientated RPC mechanism so that both desktops work nicely together and one set of configuration tools from one desktop environment could actually be used to manage the other... even over a network with severely limited bandwidth [1]".

or, another idea would be: ensure that things like heartbleed never happen again, because the people responsible for the code - on which these and many companies are making MILLIONS - are actually being PAID.

but the primary question that immediately needs answering is: is this group of companies acting genuinely altruistically, or are they self-serving? from an immediate read of the web site, at face value, it does actually look like they are genuine.

however, time will tell. we'll see when they actually start interacting with software libre developers rather than just being a web site that doesn't even have a public mailing list.

[1] i mention that because the last time i suggested this idea, people said "what's wrong with using X11?? problem solved... so what are you talking about??" i'm talking about binary-compatible APIs that stem ultimately from IDL files. *sigh*...

Comment define "customer" (Score 4, Informative) 290

from what i understand of the definition of "customer", a "customer" is someone who is paying for a service. here, there's no payment involved, therefore there is no contract of sale. i would imagine that it's fairly safe to say that we're most definitely *not* "customers" of google.

if on the other hand these individuals are actually _paying_ google for service and are not receiving a response, _then_ i could understand.

Comment Re:Where to draw the line (Score 1) 326

there is a beautiful tale which i will share with you, which helps to explain why what Dr Stallman is doing is so important:

"the reasonable man adapts himself to the world. the unreasonable man adapts the world to himself. therefore, all progress depends on the unreasonable man".

now, if it wasn't for Dr Stallman, the average pathological corporation (see the first few minutes of the documentary "The Corporation") would take whatever it could get (and you only have to look at the 98% endemic GPL violations on android smartphones and tablets to see the consequences of a largely non-GPL ecosystem such as android's).

so if it wasn't for Dr Stallman sticking to his principles, you would probably be using a computer that crashes 10 to 15 times a day for anything but the most mundane of tasks, and was entirely outside of your control.

Comment Re:so why is intel's 14nm haswell still at 3.5 wat (Score 1) 161

You seem to be conveniently ignoring Intel's Atom and Quark lines. They're all x86 and none of them has a TDP larger than 3w.

i'm not. intel's quark line - the one i saw announced on here last year - tops out at 400mhz. it has... nothing in the way of interfaces that can be taken seriously. it doesn't even have RGB/TTL video out. however if you are right about the latest intel atom being 3w, then now i am interested! so i am very grateful for you pointing this out, i will go check.

Comment Re:so why is intel's 14nm haswell still at 3.5 wat (Score 1) 161

Here is your answer, the A20 is freakishly slow compared to anything Intel would put their name on.

Granted, you can build a tablet to do specific tasks (like decoding video codecs) around a really slow processor and some special-purpose DSPs. But perhaps the companies in that business aren't making enough profit to interest Intel.

interestingly that assumption - that allwinner is not making enough profit - is completely wrong. allwinner is now one of _the_ dominant tablet SoC manufacturers in the world. their first revision (the A10, which was a Cortex A8) actually caused a major recession in the electronics industry when it first came out, as it was only $7.50 compared to the nearest competitor at around $11 to $12. everyone *not* using the A10 at the time was left holding worthless components; contracts for supply were reneged on; the change was so quick that many factories and design houses simply went out of business.

the volumes that allwinner are shipping are simply enormous, and, along with rockchip, their nearest competitor, the tablet market is completely and utterly overwhelmingly dominated by processors of the type that you describe as "built to do specific tasks".

those "specific tasks" include "running the android OS at a pace that's good enough for the overwhelming majority of end-users".

in short, intel has a long *long* way to go before they can even remotely consider that they have a processor that can be taken seriously in this very large market, both in terms of price and also in terms of performance.

what is particularly interesting about the comment that you make is that it would seem that intel really does, just as you do, believe that "a really slow processor and some special-purpose DSPs" simply is... not enough. and, contrary to that belief, it can be quite clearly seen by the total dominance of allwinner and rockchip that "a really slow processor and some special-purpose DSPs" really *is* enough.

one of the reasons for that is because if you look at the market you find that you need:

* audio and video CODEC processing. this can be handled by a special-purpose DSP. some of these are now handling 3D and 4096-pixel-wide (4K) screens.

* 3D graphics. these are handled by licensing a whole range of hard macros (special-purpose DSPs) that come with proprietary libraries implementing OpenGL ES 2.0. they're good enough, and some of them are getting _really_ good.

* a (as you put it) "really slow processor" - although if you look at allwinner's latest processor, the A80, it can hardly be called "slow", it's an 8-core monster - which covers the running of the general OS.

overall these processors are graded according to price: $5 will get you something dreadful but "good enough", $20 will get you something that's complete overkill for a tablet.

and you know what? the $7 1.2ghz dual-core ARM Cortex A7 Allwinner A20 is, when it's paired with 2gb of RAM, actually extremely quick. i tested one with 1gb of RAM running debian GNU/Linux: i fired up xrdp and had *five* rdesktop sessions running OpenOffice and Firefox, displayed on my laptop. it didn't fall over, and it wasn't dreadfully slow.

so i think you, just like intel, are completely and entirely missing the point. and in intel's case, that means entirely missing out on a *huge* market segment.
