Comment Re:complex application example (Score 4, Informative) 161

> the first ones used threads, semaphores through python's multiprocessing.Pipe implementation.

> I stopped reading when I came across this.

> Honestly - why are people trying to do things that need guarantees with python?

because we have an extremely limited amount of time as an additional requirement. we can always rewrite critical portions - or later the entire application - in c once we have delivered a working system, which means that the client can get some money in and can therefore stay in business.

also i worked with david and we benchmarked python-lmdb after adding in support for looped sequential "append" mode, and got a staggering write performance of 900,000 100-byte key/value pairs per second, and a sequential read performance of 2.5 MILLION records per second. the equivalent c benchmark is only around double those numbers. we don't *need* the dramatic performance increase that c would bring if, right now, at this exact phase of the project, we are targeting something that is 1/10th to 1/5th the performance of c.

so if we want to provide the client with a product *at all*, we go with python.
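for anyone curious, the looped sequential "append" pattern we benchmarked looks roughly like the sketch below. this is a minimal illustration rather than the actual benchmark code - the database path, map_size and zero-padded key format are placeholder choices, not what we actually used:

    import time
    import lmdb

    N = 1000000
    VALUE = b"x" * 100                       # 100-byte values, as in the benchmark

    env = lmdb.open("/tmp/append-bench.lmdb", map_size=2 << 30)

    start = time.time()
    with env.begin(write=True) as txn:
        cur = txn.cursor()
        for i in range(N):
            # append=True tells lmdb the keys arrive in sorted order, so it
            # skips the b-tree search and writes straight to the last page
            cur.put(b"%016d" % i, VALUE, append=True)
    print("writes/sec: %.0f" % (N / (time.time() - start)))

    start = time.time()
    with env.begin() as txn:
        count = sum(1 for _ in txn.cursor())
    print("reads/sec: %.0f" % (count / (time.time() - start)))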

but one thing that i haven't pointed out is that i am an experienced linux, python and c programmer, having been the lead developer of samba tng back from 1997 to 2000. i simply transferred all of the tricks that i know involving while-loops around non-blocking sockets and so on over to python... and none of them helped. if you get 0.5% of the required performance in python, it's so far off the mark that you know something is drastically wrong. converting the exact same program to c is not going to help.

> The fact you have strict timing guarantees means you should be using a realtime kernel and realtime threads with a dedicated network card and dedicated processes on IRQs for that card.

we don't have anything like that [strict timing guarantees] - not for the data itself. the data comes in on a 15 second delay (from the external source that we do not have control over) so a few extra seconds delay is not going to hurt.

so although we need the real-time response to handle the incoming data, we _don't_ need the real-time capability beyond that point.

> Taking the incoming messages from UDP and posting them on a message bus should be step one, so that you don't lose them.

.... you know, i think this is extremely sensible advice (which i have heard from other sources), so it is good to have that confirmed. my concerns / questions are as follows:

* how do you then ensure that the process receiving the incoming UDP messages is high enough priority to make sure that the packets are definitely, definitely received? (see the sketch after these questions)

* what support from the linux kernel is there to ensure that this happens?

* is there a system call which *guarantees* that, when data is received on a UDP socket, the process receiving it is woken up as an absolute priority over and above all else?

* the message queue destination has to have locking, otherwise it will be corrupted. what happens if the message queue that you wish to send the UDP packet to is locked by a *lower*-priority process?

* what support in the linux kernel is there to get the lower priority process to have its priority temporarily increased until it lets go of the message queue on which the higher-priority task is critically dependent?
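on the first two questions, the only mechanisms i'm aware of are SCHED_FIFO realtime scheduling for the receiving process plus a larger socket receive buffer. a rough sketch is below - the priority, port and buffer size are all made-up values, and note that it does nothing at all for the priority-inversion questions about the message queue:

    import os
    import socket

    # raise this process to a realtime FIFO priority (needs root or CAP_SYS_NICE);
    # a priority of 50 is an arbitrary illustrative value
    os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(50))

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # ask for a large kernel receive buffer so brief scheduling gaps don't drop
    # packets (the kernel clamps this to net.core.rmem_max unless that is raised)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 8 * 1024 * 1024)
    sock.bind(("0.0.0.0", 9999))             # port number is made up

    while True:
        data, addr = sock.recvfrom(65535)
        # hand the datagram straight off to the message bus / queue here,
        # doing as little work as possible inside this loop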

this is exactly the kind of thing that is entirely missing from the linux kernel. temporary automatic re-prioritisation was something that was added to solaris by sun microsystems quite some time ago.

to the best of my knowledge the linux kernel has absolutely no support for these kinds of very important re-prioritisation requirements.

Comment complex application example (Score 4, Insightful) 161

i am running into exactly this problem on my current contract. here is the scenario:

* UDP traffic (an external requirement that cannot be influenced) comes in
* the UDP traffic contains multiple data packets (call them "jobs") each of which requires minimal decoding and processing
* each "job" must be farmed out to *multiple* scripts (for example, 15 is not unreasonable)
* the responses from each job running on each script must be collated then post-processed.

so there is a huge fan-out where jobs (approximately 60 bytes) are coming in at a rate of 1,000 to 2,000 per second; those are being multiplied up by a factor of 15 (to 15,000 to 30,000 per second, each taking very little time in and of themselves), and the responses - all 15 to 30 thousand - must be in-order before being post-processed.

so, the first implementation is in a single process, and we just about achieve the target of 1,000 jobs per second, but only with about 10 scripts per job.

anything _above_ that rate and the UDP buffers overflow and there is no way to know if the data has been dropped. the data is *not* repeated, and there is no back-communication channel.

the second implementation uses a parallel dispatcher. i went through half a dozen different implementations.

the first ones used threads, semaphores through python's multiprocessing.Pipe implementation. the performance was beyond dreadful - it was deeply alarming. after a few seconds, performance would drop to zero. strace investigations showed that at heavy load the futex OS call was maxed out near 100%.
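to give an idea of the shape of that first attempt - this is *not* the real code; the worker count, semaphore size and the trivial "work" are all made up - it looked roughly like this:

    import threading
    from multiprocessing import Pipe, Process, Semaphore

    def worker(conn):
        # stand-in for one of the 15 "scripts": echo the job back, uppercased
        while True:
            job = conn.recv()
            if job is None:
                break
            conn.send(job.upper())

    def feeder(conn, jobs, slots):
        # one dispatcher thread per worker; every acquire/send/recv below ends
        # up in futex() and pipe read()/write() calls in the kernel
        for job in jobs:
            slots.acquire()
            conn.send(job)
            conn.recv()
            slots.release()
        conn.send(None)

    if __name__ == "__main__":
        slots = Semaphore(64)                # crude cap on in-flight jobs
        jobs = ["job-%d" % i for i in range(10000)]
        workers, feeders = [], []
        for _ in range(15):
            parent_end, child_end = Pipe()
            p = Process(target=worker, args=(child_end,))
            p.start()
            t = threading.Thread(target=feeder, args=(parent_end, jobs, slots))
            t.start()
            workers.append(p)
            feeders.append(t)
        for t in feeders:
            t.join()
        for p in workers:
            p.join()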

next came replacement of multiprocessing.Pipe with unix socket pairs, and of threads with processes, so as to regain proper control over signals, sending of data and so on. early variants of that would run absolutely fine up to some arbitrary limit, then performance would plummet to around 1% or less, sometimes remaining there and sometimes recovering.

next came replacement of select with epoll, and the addition of edge-triggered events. after considerable bug-fixing a reliable implementation was created. testing began, and the CPU load slowly cranked up towards the maximum possible across all 4 cores.
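the core of that epoll loop was roughly the sketch below (illustrative only: the fds come from the unix socket pairs mentioned above, and the buffer size and helper names are invented). the fiddly part of edge-triggering is that each non-blocking socket has to be drained until it would block:

    import select

    ep = select.epoll()
    conns = {}                               # fd -> socket object

    def register(sock):
        sock.setblocking(False)
        conns[sock.fileno()] = sock
        ep.register(sock.fileno(), select.EPOLLIN | select.EPOLLET)

    def handle(data):
        pass                                 # placeholder for the real dispatch / collation

    def run_once(timeout=1.0):
        for fd, events in ep.poll(timeout):
            sock = conns[fd]
            # edge-triggered: the event fires once per state change, so the
            # socket must be drained until it would block, or data is stranded
            while True:
                try:
                    data = sock.recv(65536)
                except BlockingIOError:
                    break
                if not data:
                    ep.unregister(fd)
                    sock.close()
                    del conns[fd]
                    break
                handle(data)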

the performance metrics came out *WORSE* than the single-process variant. investigations began and showed a number of things:

1) even though it is only 60 bytes per job, the pre-processing required to decide which process each job should be sent to was so great that the dispatcher process was becoming severely overloaded

2) each process was spending approximately 5 to 10% of its time doing actual work and NINETY PERCENT of its time waiting in epoll for incoming work.

this is unlike any other "normal" client-server architecture i've ever seen before. it is much more like the mainframe "job processing" that the article describes, and the linux OS simply cannot cope.

i would have used POSIX shared memory queues, but the implementation sucks: it is not possible to identify the shared memory blocks after they have been created so that they may be deleted. i checked the linux kernel source: there is no "directory listing" function supplied, and i have no idea how you would even mount the IPC subsystem in order to list what's been created, anyway.
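(it turns out there is an indirect way on linux: shm_open() names show up as plain files under the /dev/shm tmpfs, and posix message queues become listable once the mqueue filesystem is mounted - mount -t mqueue none /dev/mqueue. a rough cleanup sketch, with the name prefix entirely made up:)

    import os

    # posix shm_open() names appear as ordinary files in the /dev/shm tmpfs,
    # so leftovers can be found and removed by name
    for name in os.listdir("/dev/shm"):
        if name.startswith("myapp-"):
            os.unlink(os.path.join("/dev/shm", name))   # same effect as shm_unlink()

    # posix message queues are only visible once the mqueue filesystem is mounted
    if os.path.isdir("/dev/mqueue"):
        print(os.listdir("/dev/mqueue"))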

i gave serious consideration to using the python LMDB bindings, because they provide an easy API on top of memory-mapped shared memory with copy-on-write semantics. early attempts at that gave dreadful performance; i have not fully investigated why, as it _should_ work extremely well because of the copy-on-write semantics.

we also gave serious consideration to just taking a file, memory-mapping it and then appending job data to it, then using the mmap'd file for spin-locking to indicate when the job is being processed.
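the shape of that idea is sketched very roughly below. the slot layout, flag values and helper names are all invented, and the busy-wait is only a stand-in - python has no atomic test-and-set on an mmap, so a real version would need a futex or similar:

    import mmap
    import struct

    REC = struct.Struct("<B60s")             # 1 status byte + a 60-byte job
    EMPTY, READY, CLAIMED = 0, 1, 2

    def create_ring(path, slots):
        # producer creates / truncates the file; a fresh file is all zeros,
        # i.e. every slot starts EMPTY
        with open(path, "w+b") as f:
            f.truncate(REC.size * slots)
            return mmap.mmap(f.fileno(), REC.size * slots)

    def put_job(mm, slot, payload):
        # one slice assignment writes the status byte and the payload together
        mm[slot * REC.size:(slot + 1) * REC.size] = REC.pack(READY, payload)

    def take_job(mm, slot):
        off = slot * REC.size
        # naive "spin-lock": poll the status byte until the producer marks the
        # slot READY. two consumers could race here - this is the shape of the
        # idea, not a safe design
        while mm[off] != READY:
            pass
        mm[off:off + 1] = bytes([CLAIMED])
        return REC.unpack(mm[off:off + REC.size])[1]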

after all of these crazy implementations, i basically have absolutely no confidence in the linux kernel, nor in the GNU/Linux POSIX-compliant implementation of the OS on top of it - no confidence that it can handle this kind of load.

so i would be very interested to hear from anyone who has had to design similar architectures, and how they dealt with it.

Comment Re:Evolution (Score 1) 253

I think it's more likely that more people are becoming obese because of exactly one factor: age. They are living artificially prolonged lifetimes due to access to adequate food and to medicine. It's easier to get fat when you are 50 than when you are 30 because of the natural changes in your metabolism.

Comment Re:Evolution (Score 1) 253

:-)

You make it sound like starving people are getting fat too.

If a particular individual is becoming obese, then that individual has a surplus of caloric intake, if only for this year or month. This is not to say that they have proper nutrition. So I am not at all clear that the fact that there is obesity in the third world is confounding evidence.

Comment Evolution (Score 1) 253

For most of the existence of mankind and indeed all of mankind's progenitors, having too much food was a rare problem and being hungry all of the time was a fact of life. We are not necessarily well-evolved to handle it. So, no surprise that we eat to repletion and are still hungry. You don't really have any reason to look at it as an illness caused by anything other than too much food.

Comment Re: If you pay... (Score 2) 15

Martin,

The last time I had a professional video produced, I paid $5000 for a one-minute commercial, and those were rock-bottom prices from hungry people who wanted it for their own portfolio. I doubt I could get that today. $8000 for the entire conference is really volunteer work on Gary's part.

Someone's got to pay for it. One alternative would be to get a corporate sponsor and give them a keynote, which is what so many conferences do, but that would be abandoning our editorial independence. Having Gary fund his own operation through Kickstarter without burdening the conference is what we're doing. We're really lucky we could get that.

Comment Re:One hell of a slashvertisement! (Score 2) 15

I think TAPR's policy is that the presentations be freely redistributable, but I don't know what they and Gary have discussed. I am one of the speakers and have always made sure that my own talk would be freely redistributable. I wouldn't really want it to be modifiable except for translation and quotes, since it's a work of opinion. Nobody should get the right to modify the video in such a way as to make my opinion seem like it's anything other than what it is.

Comment Re:If you pay... (Score 2) 15

Yes. I put in $100, and I am asking other people to put in money to sponsor these programs so that everyone, including people who did not put in any money at all, can see them for free. If you look at the 150+ videos, you can see that Gary's pretty good at this (and he brought a really professional-seeming cameraman to Hamvention, too) and the programs are interesting, even if at least four of them feature yours truly :-) He filmed every one of the talks at the TAPR DCC last year (and has filmed for the past 5 years), and it costs him about $8000 to drive from North Carolina to Austin, Texas; to bring his equipment and keep it maintained; to stay in a motel; to run a multi-camera shoot for every talk in the conference; and to get some fair compensation for his time in editing (and he does a really good job of that).

Submission + - Open Hardware and Digital Communications conference on free video, if you help (kickstarter.com)

Bruce Perens writes: The TAPR Digital Communications Conference has been covered twice here and is a great meeting on leading-edge wireless technology, mostly done as Open Hardware and Open Source software. Free videos of the September 2014 presentations will be made available if you help via Kickstarter. For an idea of what's in them, see the Dayton Hamvention interviews covering Whitebox, our Open Hardware handheld software-defined radio transceiver, and Michael Ossmann's HackRF, a programmable Open Hardware transceiver for wireless security exploration and other wireless research. Last year's TAPR DCC presentations are at the Ham Radio Now channel on YouTube.

Comment legal ramifications of identity verification (Score 1) 238

i think one of two things happened here. the first is that it might have finally sunk in at google that even just *claiming* to have properly verified user identities leaves them open to lawsuits should they fail to have properly carried out the verification checks that other users *believe* they have carried out. with every other service, people *know* that you don't trust the username: for a service to claim that it has truly verified the identity of the individual behind the username is reprehensibly irresponsible.

the second is that they simply weren't getting enough people, so they have "opened up the doors".

Security

German NSA Committee May Turn To Typewriters To Stop Leaks 244

mpicpp (3454017) writes with news that Germany may be joining Russia in a paranoid switch from computers to typewriters for sensitive documents. From the article: Patrick Sensburg, chairman of the German parliament's National Security Agency investigative committee, now says he's considering expanding the use of manual typewriters to carry out his group's work. ... Sensburg said that the committee is taking its operational security very seriously. "In fact, we already have [a typewriter], and it's even a non-electronic typewriter," he said. If Sensburg's suggestion takes flight, the country would be taking a page out of the Russian playbook. Last year, the agency in charge of securing communications from the Kremlin announced that it wanted to spend 486,000 rubles (about $14,800) to buy 20 electric typewriters as a way to avoid digital leaks.

Comment Should have been Ascension Island (Score 2) 151

While I was VP for Public Affairs at E'Prime Aerospace, we evaluated various sites for establishing a spaceport to launch our MX-derived rockets. It turned out that the presence of a military airstrip on Ascension Island meant that a military jet transport large enough to deliver entire launch vehicles could land there. Of course, the MX system was solid-fueled, so we didn't have to transport cryogenics long distances, but it would be feasible to set up a LOX facility on the island. There is a particular coastal cliff that is ideal for a launch pad.
