
Comment Oh my... (Score 5, Informative) 98

"a high-performance task scheduling engine written (perplexingly) in Python"

guys, there is this thing, it's called "algorithm"....

yeah.... except that algorithm took a staggering 3 months to develop. and it wasn't one algorithm, it was several, along with a networking IPC stack and several unusual client-server design decisions. i can't go into the details because i was working in a secure environment, but basically even though i was the one that wrote the code i was taken aback that *python* - a scripted programming language - was capable of such extreme processing rates.

normally those kinds of rates would be associated with c, for example.

but the key point of the article - leaving that speed aside - is that if something like PostgreSQL had been used as the back-end store, the rate would have been somewhere around 30,000 tasks per second over the long term, possibly even less. that's because of the overwhelming overhead that SQL (and NoSQL) databases incur in maintaining transaction logs and making other guarantees - in ways that are clearly *significantly* less efficient than LMDB's, whose guarantees are integrated into the design at a fundamental level.
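to make the transaction-overhead point concrete, here is a toy sketch (nothing to do with the original benchmark) using python's built-in sqlite3: committing one transaction per task versus batching all tasks into a single transaction. the absolute numbers are machine-dependent; it's the ratio that illustrates the overhead.

```python
import sqlite3
import time

# toy illustration (not the original benchmark): per-task COMMITs are
# where a SQL back-end spends its time. batching rows into a single
# transaction amortises that overhead. absolute timings are
# machine-dependent; only the ratio matters.
N = 5_000

def insert_tasks(rows_per_txn: int) -> float:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, payload TEXT)")
    start = time.perf_counter()
    inserted = 0
    while inserted < N:
        with conn:  # opens and commits exactly one transaction
            for _ in range(min(rows_per_txn, N - inserted)):
                conn.execute("INSERT INTO tasks (payload) VALUES ('t')")
                inserted += 1
    elapsed = time.perf_counter() - start
    assert conn.execute("SELECT COUNT(*) FROM tasks").fetchone()[0] == N
    conn.close()
    return elapsed

one_per_txn = insert_tasks(1)   # one COMMIT per task
batched = insert_tasks(N)       # one COMMIT for everything
print(f"1 row/txn: {one_per_txn:.3f}s  batched: {batched:.3f}s")
```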

Comment I can't wait for it (Score 1) 98

At some point there will be an article on Wikipedia, that only meets Wikipedia's notability requirements due to media spillover complaining about the notability requirements.

yaaay! :) works for me. wasn't there a journalist who published a blog and used that as the only notable reference to create a fake article? :)

Comment Would it hurt ... (Score 5, Informative) 98

OpenLDAP originally used Berkeley DB; they'd worked with it for years, and got fed up with it. in order to minimise the amount of disruption to the code-base, LMDB was written as a near-drop-in replacement.

LMDB is - according to the web site and also the deleted wikipedia page - a key-value store. however its performance absolutely pisses over everything else around it, on pretty much every metric that can be measured, with very few exceptions.

basically howard's extensive experience combined with the intelligence to do thorough research (even to computing papers dating back to the 1960s) led him to make some absolutely critical but perfectly rational design choices, the ultimate combination of which is that LMDB outshines pretty much every key-value store ever written.

i mean, if you are running benchmark programs in *python* and getting sequential read access to records at a rate of 2,500,000 (2.5 MILLION) records per second... in a *scripted* programming language for goodness sake... then they have to be doing something right.

the random write speed of the python-based benchmarks showed 250,000 records written per second. the _sequential_ ones managed just over 900,000 per second!

there are several key differences between Berkeley DB's API and LMDB's API. the first is that LMDB can be put into "append" mode. basically what you do is you *guarantee* that the key of each new record is lexicographically greater than the keys of all existing records. with this guarantee LMDB basically lets you put the new record _right_ at the end of its B+ tree. this results in something like an astonishing 5x performance increase in writes.
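a toy sketch of what the append-mode contract buys you (this is *not* the real LMDB implementation; in the python-lmdb binding the real thing is, if i recall correctly, an `append=True` flag on put): if keys are guaranteed to arrive in increasing order, a single comparison replaces the whole tree descent.

```python
import bisect

# toy sketch of the append-mode contract (NOT the real LMDB
# implementation): if the caller *guarantees* each new key sorts after
# every existing key, the store can skip the B+ tree descent entirely
# and bolt the record onto the end.
class AppendOnlyStore:
    def __init__(self):
        self._keys = []    # stand-in for the rightmost leaf of a B+ tree
        self._values = []

    def put_append(self, key: bytes, value: bytes) -> None:
        # the whole point: one O(1) comparison instead of an O(log n) search
        if self._keys and key <= self._keys[-1]:
            raise ValueError("append mode needs strictly increasing keys")
        self._keys.append(key)
        self._values.append(value)

    def get(self, key: bytes) -> bytes:
        # reads still do an ordinary binary search
        i = bisect.bisect_left(self._keys, key)
        if i == len(self._keys) or self._keys[i] != key:
            raise KeyError(key)
        return self._values[i]

store = AppendOnlyStore()
for n in range(5):
    store.put_append(b"%08d" % n, b"record")
assert store.get(b"%08d" % 3) == b"record"
```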

the second key difference is that LMDB allows you to store duplicate values per key. in fact i think there's also a special mode (never used it) where, if you guarantee fixed (identical) record sizes, LMDB will store the values in a more space-efficient manner.
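a toy sketch of the duplicate-values idea (LMDB's flags for this are, i believe, MDB_DUPSORT and - for the fixed-size variant - MDB_DUPFIXED; this sketch is purely illustrative, not LMDB's actual storage layout):

```python
import bisect
from collections import defaultdict

# toy sketch of duplicate-values-per-key: each key maps to a sorted
# run of values rather than to a single value, so inserting the same
# key twice keeps both entries.
class DupStore:
    def __init__(self):
        self._data = defaultdict(list)

    def put(self, key: bytes, value: bytes) -> None:
        bisect.insort(self._data[key], value)  # duplicates kept, sorted

    def get_all(self, key: bytes) -> list[bytes]:
        return list(self._data[key])

db = DupStore()
db.put(b"user:1", b"session-b")
db.put(b"user:1", b"session-a")
assert db.get_all(b"user:1") == [b"session-a", b"session-b"]
```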

so it's pretty sophisticated.

from a technical perspective, there are two key differences between LMDB and *all* other key-value stores.

the first is: it uses "append-only" when adding new records. basically this guarantees that there can never be any corruption of existing data just because a new record is added.

the second is: it uses shared memory "copy-on-write" semantics. what that means is that the (one allowed) writer NEVER - and i mean never - blocks readers, whilst importantly being able to guarantee data integrity and transaction atomicity as well.

the way this is achieved is that because copy-on-write is enabled, the "writer" may make as many writes as it wants, knowing full well that the readers will NOT be interfered with (because any write creates a COPY of the memory page being written to). then, finally, once everything is done and the new top-level parent of the B+ tree is finished, the VERY last thing is a single simple LOCK, update-pointer-to-top-level, UNLOCK.

so as long as Reads do the exact same LOCK, get-pointer-to-top-level-of-B-Tree, UNLOCK, there is NO FURTHER NEED for any kind of locking AT ALL.
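the whole scheme above can be sketched in a few lines of python (purely illustrative - not LMDB's actual implementation): the writer mutates only private copies, and the single lock anywhere in the system guards the root-pointer handoff.

```python
import threading

# toy sketch of the single-writer copy-on-write scheme described above
# (NOT LMDB's actual implementation). the writer rebuilds what it
# touches as copies, then publishes the new tree with one brief lock
# around a single pointer update. readers take the same lock only long
# enough to grab the current root; everything they then read is an
# immutable snapshot, so no further locking is needed.
class CowStore:
    def __init__(self):
        self._root = {}                  # immutable-by-convention snapshot
        self._root_lock = threading.Lock()

    def write_txn(self, updates: dict) -> None:
        # copy-on-write: mutate a private copy, never the live root
        new_root = dict(self._root)
        new_root.update(updates)
        with self._root_lock:            # LOCK, update pointer, UNLOCK
            self._root = new_root

    def read_snapshot(self) -> dict:
        with self._root_lock:            # LOCK, fetch pointer, UNLOCK
            return self._root            # never mutated after publication

store = CowStore()
store.write_txn({b"a": 1})
snap = store.read_snapshot()
store.write_txn({b"a": 2})
assert snap[b"a"] == 1                   # old readers keep the old tree
assert store.read_snapshot()[b"a"] == 2  # new readers see the new one
```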

i am just simply amazed at the simplicity, and that this technique had just... never been deployed in any database engine before, until now. the reason, as howard makes clear, is that the original research back in the 1960s was restricted to 32-bit memory spaces. now we have 64-bit address spaces, so a shared memory map may refer to absolutely enormous files, and there is no problem deploying this technique today.

all incredibly cool.

Submission + - Python-LMDB in a high-performance environment

lkcl writes: In an open letter to the core developers behind OpenLDAP (Howard Chu) and Python-LMDB (David Wilson), here is the story of the successful creation of a high-performance task scheduling engine written (perplexingly) in python. With only partial optimisation, allowing tasks to be executed in parallel at a phenomenal rate of 240,000 per second, the choice to use Python-LMDB for the per-task database store - based on its benchmarks as well as its well-researched design criteria - turned out to be the right decision. Part of the success was also due to earlier architectural advice gratefully received here on slashdot. What is puzzling, though, is that LMDB's wikipedia page is being constantly deleted, despite its "notability" by way of being used in a seriously-long list of prominent software libre projects - adoption which has been, in part, motivated by the Oracle-driven BerkeleyDB license change. It would appear that the original complaint about notability came from an Oracle employee as well...

Comment pay them!! (Score 3, Interesting) 265

the key point that people keep missing is that corporations - which are legally obligated to maximise profits - take whatever they can get "for free". software libre developers *do not have* what is normally present in business transactions: the VERY IMPORTANT opportunity for the person receiving their work to transfer to the developer a reward (payment) which represents the value of the software being received.

so it should come as absolutely no surprise that those software libre developers are not equipped with the financial means to support themselves (the Gentoo leader ending up with a $50,000 credit-card debt and having to quit and go work for Microsoft is an example that springs to mind) and they *CERTAINLY* don't have the financial means to pay for e.g. security reviews or security tools.

the solution is incredibly simple: if you are using software libre for your business, PAY THE DEVELOPERS. find a way. pick a project that's important or fundamental to your business, and PAY THEM.

Comment Re:Well DUH! (Score 1) 403

It tells you exactly why in the article. It's the way people drive them.

If you try to push a small engine to drive like a larger one, you'll be accelerating harder, therefore using more fuel than under normal acceleration.

there's a little more to it than that. i worked for a company back in 1993 where i was asked to write a vehicle simulator for Detroit Diesel. smaller engines *consistently* under-performed against larger engines, and the reason is simple: when expected to keep up with the demands placed on it, a smaller engine has to operate *well* outside of its most efficient power-band in order to deliver the demanded power.

a more powerful engine can deliver the power expected of it whilst operating within its most efficient torque/RPM range, and has a much wider selection of gears that will achieve the power the driver expects and demands.

in other words you need to rev the nuts off of a smaller engine and use low gears to get the same acceleration.

which is why, with that grand cherokee, your wife could not get more than 11mpg, because she was basically driving it very very hard, and you were not. i presume it was an automatic. if so, your driving style allowed the on-board computer to choose the most fuel-efficient gear, whereas your wife's foot-to-the-floor approach made the on-board computer eliminate all gears but the one that delivered the maximum power demanded. that meant that the engine pretty much operated all the time at the redline.... right where it is at its most inefficient.
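a toy calculation (illustrative numbers only, nothing from the actual simulator) of why the small engine loses: fuel burn is demanded power divided by efficiency, and efficiency collapses outside the engine's best rpm band.

```python
# illustrative toy numbers only (nothing from the Detroit Diesel
# simulator): fuel burn is demanded power divided by efficiency, and
# efficiency falls off sharply away from the engine's best rpm band.
def efficiency(rpm: float, best_rpm: float) -> float:
    # crude bell-shaped efficiency curve peaking at 0.35 (a plausible
    # peak brake thermal efficiency), with a floor of 0.2 of that peak
    return 0.35 * max(0.2, 1.0 - ((rpm - best_rpm) / best_rpm) ** 2)

demand_kw = 80.0  # power the driver demands for hard acceleration

# the big engine makes 80 kW near its sweet spot; the small engine
# has to rev far past its own sweet spot to make the same power
big_engine = demand_kw / efficiency(2600, best_rpm=2500)
small_engine = demand_kw / efficiency(5800, best_rpm=3000)

print(f"big: {big_engine:.0f} fuel-units  small: {small_engine:.0f}")
assert small_engine > big_engine  # same power demanded, far more fuel
```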

simple really...

Comment trust vs respect (Score 2) 460

'Scientists have earned the respect of Americans but not necessarily their trust,' said lead author Susan Fiske, the Eugene Higgins Professor of Psychology and professor of public affairs

it was only fairly recently that someone explained the absolutely crucial difference between trust and respect, and it knocked me sideways. i used to always accept the "wisdom" that trust is EARNED.

trust - literally, by definition - CANNOT be EARNED.

*respect* can be earned, because to respect someone (or something) you learn from PAST experience and PAST actions, you make a judgement call "that thing (or person) did something cool [in the PAST], and i liked it."

trust - by definition - refers to the FUTURE. i am - in the FUTURE - going to give someone the power and authority to do something. i (the person doing the trusting) actually have absolutely NO CLUE as to whether in the FUTURE, regardless of PAST performance, the person will do what they say that they can do.

how on earth can _anyone_ say, "you earned (past tense) my trust (future decision-making)"????

this is how wars are started (and sustained): by people confusing past and future in relation to trust and respect.

so this is where it gets interesting, because the original article is actually making TWO completely SEPARATE and distinct statements:

1) the american public has analysed the PAST actions of scientists, and finds that those actions are [in some way] cool enough to be respected (past tense)

2) the american public has, within themselves, insufficient knowledge about what it is that scientists do - and this has absolutely nothing to do with the scientists but EVERYTHING to do with "the american public" - in order to take the [frightening!] step of placing their trust in the FUTURE decision-making of some individuals-that-happen-to-be-scientists.

i cannot emphasise enough that a decision *to* trust has absolutely nothing to do with the person or thing that you are trusting. the *decision* to place trust in someone else really *really* is something that has absolutely nothing to do with the *analysis* of whether *to* trust.

this is where people get terribly confused. they do some analysis (based usually on past performance), and then they have to make a decision. they *believe* that the [past] analysis *IS* trust. it's not!! even once the [past] analysis has been done, you *still* need to take that step - to trust.

the link between respect and trust is that it is *usually* the respect that we have for people which tips our analysis in favour of certain individuals. but the analysis is NOT respect itself, just as trust (the decision to trust) is not the same thing as respect _either_.

now what i find ironic is that it is someone with a degree in psychology that is talking about trust being "earned". if someone whom the american public implicitly "trusts" (because they have a PhD) is saying "trust is earned" then how is anyone else supposed to know the difference between trust and respect??

Comment custom coding time (Score 1) 97

i wrote a video upload and playback system for a christian-based financial advice organisation that was uncomfortable with the idea of having youtube advertising messages in direct contravention of the advice that they were giving their clients.

the "normal" way to do what you are asking would be to simply have a plugin that allows you to specify the youtube URL, and it would be embedded... this is not very hard to do, and, if there is not something out there already, consider paying a programmer to do it. they should not take very long [of the order of days].

however... if, like the christian-based financial advice organisation that i had to create an entire video upload, storage and playback system for, the use of youtube is completely inappropriate for your organisation (because the videos are to be kept confidential, for example), then there really isn't anything out there (i looked) and you will need to write your own.

for this task you should allocate at least two to three months, if you have access to good programmers, bearing in mind that you will need both front-end developers and back-end server engineers. one of the problems to solve (in basically reinventing youtube) is that the videos need to be converted to several different formats in order to make playback possible on multiple browser engines.
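as a sketch of that conversion step (assuming ffmpeg is available on the server; the profile table here is illustrative, not from the system i wrote), one invocation per output container:

```python
# sketch of the multi-format conversion step (assumes ffmpeg is on the
# PATH; the codec profiles are illustrative, not from the original
# system). one ffmpeg invocation per output container, to be run
# server-side after upload, e.g. from a job queue.
PROFILES = {
    "mp4":  ["-c:v", "libx264", "-c:a", "aac"],       # h.264 for most browsers
    "webm": ["-c:v", "libvpx-vp9", "-c:a", "libopus"],  # vp9 for the rest
}

def transcode_commands(source: str) -> list[list[str]]:
    # build (but do not run) one command line per target format
    cmds = []
    for ext, codec_args in PROFILES.items():
        out = source.rsplit(".", 1)[0] + "." + ext
        cmds.append(["ffmpeg", "-y", "-i", source, *codec_args, out])
    return cmds

for cmd in transcode_commands("training/upload.mov"):
    print(" ".join(cmd))
```

the commands would then be handed to subprocess.run (or a work queue) so uploads don't block on transcoding.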

if this is the path you've chosen then i can help save you some time. but please think carefully about what it is that you need. as a number of other people have pointed out you've said "i need a wiki to store videos" when actually what you _should_ have said is "what's the best way to offer people in-house training videos" and qualified that potentially with a list of options such as "my budget is $X" and "my time is Y" and "my in-house skill-set is A B and C".

Comment Re:I, Robot from a programmers perspective (Score 1) 165

Don't get me started on Asimov's work. He tried to write a lot about how robots would function with these laws that he invented, but really just ended up writing about a bunch of horrendously programmed robots who underwent zero testing and predictably and catastrophically failed at every single edge case. I do not think there is a single robot in any of his stories that would not self-destruct within 5 minutes of entering the real world.

hooray. someone who actually finally understands the point of the asimov stories. many people reading asimov's work do not understand that it was only in the later works commissioned by the asimov estate (when Caliban - a Zero-Law Robot - is introduced; or when it is finally revealed that Daneel - the robot onto whom Giskard psychically impressed the Zeroth Law, to protect *humanity* - is over 30,000 years old and is the silent architect of the Foundation) that the failure of the Three Laws of Robotics is finally spelled out explicitly, in actual words, instead of being illustrated indirectly through many different stories, just as you describe, wisnoskij.

in the asimov series there _are_ actually robots that are successful. the New Law Robots (those that are permitted to *cooperate* with humans; these actually have some spark of creativity). Caliban - who had a Gravitonic brain - was a Zero Law Robot: an experiment to see if a robot would derive its own laws under free will (it did). and Daneel, whose telepathic ability and the Zeroth Law were given to him by Giskard. these robots are the exception. the three law robots are basically intelligent but entirely devoid of creativity.

you have to think: how can a world containing hundreds of millions of copies of the three laws be anything *but* a danger to human development, when those laws prevent and prohibit any kind of risk-taking?? we already have enough stupid laws on the planet (mostly thanks to america's sue-happy culture and the abusive patent system). we DON'T need idiots trying to implement the failed three laws of robotics.

Comment COM (MSRPC), Objective-C/J and Software Libre (Score 2) 54

in looking at why both apple and microsoft have been overwhelmingly successful i came to the conclusion that it is because both companies are using dynamic object-orientated paradigms that can allow components from disparate programming languages to be accessible at runtime. COM is the reason why, after 20 years, you can find a random Active-X component written two decades ago, plug it into a modern windows computer and it will *work*.

Objective-C is the OO concept taken to the extreme: it's actually built-in to the programming language. COM is a bit more sensible: it's a series of rules (based ultimately on the flattening of data structures into a stream that can be sent over a socket, or via shared memory) which may be implemented in userspace: the c++ implementation has some classes whilst the c implementation has macros, but ultimately you could implement COM in any programming language you cared to.

the first amazing thing about COM (which is based on MSRPC which in turn was originally the OpenGroup's BSD-licensed DCE/RPC source code) is that because it is on top of DCE/RPC (ok MSRPC) you have version-control at the interface layer. the second amazing thing is that they have "co-classes" meaning that an "object" may be "merged" with another (multiple inheritance). when you combine this with the version-control capabilities of DCERPC/MSRPC you get not only binary-interoperability between client and server regardless of how many revisions there are to an API but also you can use co-classes to create "optional parameters" (by combining a function with 3 parameters in one IDL file with another same-named function with 4 parameters in another IDL file, 5 in another and so on).
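the co-class/optional-parameter trick can be sketched in python (a toy dispatch on parameter count - nothing to do with the real MSRPC wire protocol or IDL compiler):

```python
# toy sketch (NOT the real MSRPC/COM machinery) of the co-class trick
# described above: several same-named functions with different
# parameter counts are registered under one logical interface, and
# dispatch picks the "revision" matching what the caller supplied -
# giving the effect of optional parameters with binary compatibility.
class CoClass:
    def __init__(self):
        self._revisions = {}   # arity -> implementation

    def register(self, func):
        self._revisions[func.__code__.co_argcount] = func
        return func

    def call(self, *args):
        try:
            impl = self._revisions[len(args)]
        except KeyError:
            raise TypeError(f"no interface revision takes {len(args)} args")
        return impl(*args)

iface = CoClass()

@iface.register
def connect(host, port, timeout):            # "revision 1" of the IDL
    return (host, port, timeout)

@iface.register
def connect(host, port, timeout, retries):   # later revision adds a param
    return (host, port, timeout, retries)

assert iface.call("db", 5432, 30) == ("db", 5432, 30)
assert iface.call("db", 5432, 30, 5) == ("db", 5432, 30, 5)
```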

the thing is that:

a) to create such infrastructure in the first place takes a hell of a lot of vision, commitment and guts.

b) to mandate the use of such infrastructure, for the good of the company, the users, and the developers, also takes a lot of commitment and guts. when people actually knew what COM was it was *very* unpopular - unfortunately, at the time, there was nothing like python-comtypes (which makes COM so transparent it has the *opposite* problem: it's so easy that programmers go "what's all the fuss about???" and don't realise quite how powerful what they are doing really is).

both microsoft and apple were - are - companies where it was possible to make such top-down decisions and say "This Is The Way It's Gonna Go Down".

now let's take a look at the GNU/Linux community.

the GNU/Linux community does have XPIDL and XPCOM, written by the Mozilla Foundation. XPCOM is "based on" COM. XPCOM has a registry. it has the same API, the same macros, and it even has an IDL compiler (XPIDL). however what it *does not* have is co-classes. co-classes are the absolute, absolute bed-rock of COM and because XPCOM does not have co-classes there have been TEN YEARS of complaints from developers - mostly java developers but also c++ developers - attempting to use Mozilla technology (embedding Gecko is the usual one) and being driven UP THE F******G WALL by binary ABI incompatibility on pretty much every single damn release of the mozilla binaries. one single change to an IDL file results, sadly, in a broken system for these third party developers.

the GNU/Linux community does have CORBA, thanks to Olivetti Labs who released their implementation of CORBA some time back in 1997. CORBA was the competitor to COM, and it was nowhere near as good. Gnome adopted it... but nobody else did.

the GNU/Linux community does have an RPC mechanism in KDE. its first implementation is known famously for having been written in 20 minutes. not much more needs to be said.

the GNU/Linux community does have gobject. gobject is, after nearly fifteen years, beginning to get introspection, and this is beginning to bubble up to the dynamic programming languages such as python. gobject does not have interface revision control.

the GNU/Linux community does actually have a (near full) implementation of MSRPC and COM: it's part of the Wine Project. the project named TangramCOM did make an attempt to separate COM from Wine: had it succeeded, it would have been maintained as a cut-down fork of the Wine Project. the Wine Project developers' answer - if you ask - to making a GNU/Linux application use COM is that you should convert it into a Wine (i.e. a Win32) application. this is not very satisfactory.

in other words, the GNU/Linux community has a set of individuals who are completely discoordinated, getting on with the very important task - and i mean that absolutely genuinely - the very important task of maintaining the code for which they are responsible.

the problems that they deal with are *not* those of coordinating - at a top level - with *other projects*.

now, whilst this "Alliance" may wish to "guide" the development of the GNU/Linux community, ultimately it comes down to money. do these companies have the guts to say - in a nice way of course - "here's a wad of cash, this is a list of tasks, any takers?"

but, also, does this "Alliance" have the guts to ask "what is actually needed"? rather than saying "this is what you need to do, now get on with it" - which would pretty much guarantee no takers at all - wouldn't it be better for them to actually get onto various mailing lists (hundreds if necessary) and canvas the developers in the software libre world: "hey, we have $NNN million available, we'd like to coordinate something cross-project that would make a difference, and we'd like *you* to tell *us* what you think is the best way to spend that money".

where the kinds of ideas floated around could be something as big and ambitious as "converting both KDE and Gnome to use the same runtime-capable object-orientated RPC mechanism so that both desktops work nicely together and one set of configuration tools from one desktop environment could actually be used to manage the other... even over a network with severely limited bandwidth [1]".

or, another idea would be: ensure that things like heartbleed never happen again, because the people responsible for the code - on which these and many companies are making MILLIONS - are actually being PAID.

but the primary question that immediately needs answering is: is this group of companies acting genuinely altruistically, or are they self-serving? from an immediate read of the web site, at face value, it does actually look like they are genuine.

however, time will tell. we'll see when they actually start interacting with software libre developers rather than just being a web site that doesn't even have a public mailing list.

[1] i mention that because the last time i suggested this idea, people said "what's wrong with using X11?? problem solved... so what are you talking about??" i'm talking about binary-compatible APIs that stem ultimately from IDL files. *sigh*...

Comment define "customer" (Score 4, Informative) 290

from what i understand of the definition of "customer", a "customer" is someone who is paying for a service. here, there's no payment involved, therefore there is no contract of sale. i would imagine that it's fairly safe to say that we're most definitely *not* "customers" of google.

if on the other hand these individuals are actually _paying_ google for service and are not receiving a response, _then_ i could understand.

Comment Re:Where to draw the line (Score 1) 326

there is a beautiful tale which i will share with you, which helps to explain why what Dr Stallman is doing is so important:

"the reasonable man adapts himself to the world. the unreasonable man adapts the world to himself. therefore, all progress depends on the unreasonable man".

now, if it wasn't for Dr Stallman, the average pathological corporation (see the first few minutes of the documentary "The Corporation") would take whatever it could get (and you only have to look at the 98% endemic GPL violations on android smartphones and tablets to see the consequences of a largely non-GPL ecosystem such as android's).

so if it wasn't for Dr Stallman sticking to his principles, you would probably be using a computer that crashes 10 to 15 times a day for anything but the most mundane of tasks, and was entirely outside of your control.

Comment Re:so why is intel's 14nm haswell still at 3.5 wat (Score 1) 161

You seem to be conveniently ignoring Intel's Atom and Quark lines. They're all x86 and none of them has a TDP larger than 3w.

i'm not. intel's quark line - the one i saw announced on here last year - tops out at 400mhz. it has... nothing in the way of interfaces that can be taken seriously. it doesn't even have RGB/TTL video out. however if you are right about the latest intel atom being 3w, then now i am interested! so i am very grateful for you pointing this out, i will go check.

Comment Re:so why is intel's 14nm haswell still at 3.5 wat (Score 1) 161

Here is your answer, the A20 is freakishly slow compared to anything Intel would put their name on.

Granted, you can build a tablet to do specific tasks (like decoding video codecs) around a really slow processor and some special-purpose DSPs. But perhaps the companies in that business aren't making enough profit to interest Intel.

interestingly that assumption - that allwinner is not making enough profit - is completely wrong. allwinner is now one of _the_ dominant tablet SoC manufacturers in the world. their first revision (the A10, which was a Cortex A8) actually caused a major recession in the electronics industry when it first came out, as it was only $7.50 compared to the nearest competitor at around $11 to $12. everyone *not* using the A10 at the time was left holding worthless components; contracts for supply were reneged on; the change was so quick that many factories and design houses simply went out of business.

the volumes that allwinner are shipping are simply enormous, and, along with rockchip, their nearest competitor, the tablet market is completely and utterly overwhelmingly dominated by processors of the type that you describe as "built to do specific tasks".

those "specific tasks" include "running the android OS at a pace that's good enough for the overwhelming majority of end-users".

in short, intel has a long *long* way to go before they can even remotely consider that they have a processor that can be taken seriously in this very large market, both in terms of price and also in terms of performance.

what is particularly interesting about the comment that you make is that it would seem that intel really does, just as you do, believe that "a really slow processor and some special-purpose DSPs" simply is... not enough. and, contrary to that belief, it can be quite clearly seen by the total dominance of allwinner and rockchip that "a really slow processor and some special-purpose DSPs" really *is* enough.

one of the reasons for that is because if you look at the market you find that you need:

* audio and video CODEC processing. this can be handled by a special-purpose DSP. some of these are now handling 3D and 4096-pixel-wide (4K) screens.

* 3D graphics. these are handled by licensing a whole range of hard macros (special-purpose DSPs) that come with proprietary libraries implementing OpenGL ES 2.0. they're good enough, and some of them are getting _really_ good.

* a (as you put it) "really slow processor" - although if you look at allwinner's latest processor, the A80, it can hardly be called "slow", it's an 8-core monster - which covers the running of the general OS.

overall these processors are graded according to price: $5 will get you something dreadful but "good enough", $20 will get you something that's complete overkill for a tablet.

and you know what? the $7 1.2ghz dual-core ARM Cortex A7 Allwinner A20 is, when it's paired with 2gb of RAM, actually extremely quick. i tested one with 1gb of RAM running debian GNU/Linux: i fired up xrdp and had *five* rdesktop sessions running OpenOffice and Firefox, displayed on my laptop. it didn't fall over, and it wasn't dreadfully slow.

so i think you, just like intel, are completely and entirely missing the point. and in intel's case, that means entirely missing out on a *huge* market segment.
