jafo - Slashdot User

Comment Re:Not rocket science (Score 1) 244

by jafo on Friday December 04, 2015 @01:15AM (#51055083) Attached to: Why To Choose PostgreSQL Over MySQL, MariaDB

I *LOVE* Postgres, and have since the '90s. But I've never set up replication on it except using DRBD. The replication story seems ... experimental? Every time I've looked at it, I couldn't even find which replication project to use (Slony seems to be the longest lasting, in retrospect, but I want to say there were always a half dozen of them around). I've been hoping that the newer Postgres versions, with some replication built in, would get better faster, but it seems like the replication is basically what we had in MySQL a decade ago, which disappoints me.

I've used MySQL, as I mentioned, and the replication was not bad. There was always one clear choice to do it, and it worked. Sure, master/master felt like a horrible hack (using even/odd autoincrement, for example), and sure sometimes I had to re-replicate the slave because somehow the slave would get wedged up on a record... But setting it up was fairly straightforward, and it has been around *FOREVER* and there is a clear choice. And it was really useful for a lot of cases, we set up a lot of clusters just for backups, so we could do the "mysqldump" on a slave so the locks wouldn't stall the master.
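For anyone who hasn't seen the even/odd trick, here is a rough Python sketch of the idea. As far as I recall, the MySQL settings involved are auto_increment_increment and auto_increment_offset; the function name and the two-node layout below are just made up for illustration, and this only models the ID arithmetic, not replication itself.

```python
def auto_increment_ids(offset, increment, count):
    """Return the first `count` IDs a node would hand out.

    offset    -- where this node starts counting (1 on one master, 2 on the other)
    increment -- step between IDs (equal to the number of masters)
    """
    return [offset + i * increment for i in range(count)]


# Two masters: one only ever generates odd IDs, the other only even IDs,
# so rows inserted on either side can never collide on the primary key.
node_a = auto_increment_ids(offset=1, increment=2, count=5)
node_b = auto_increment_ids(offset=2, increment=2, count=5)

print("master A:", node_a)
print("master B:", node_b)
```

It avoids key collisions, but it does nothing about two masters updating the same existing row, which is part of why it always felt like a hack.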

Recently, I wanted to convert a mail server that was using Postgres for a sender address rewriting database, but that we never managed to get clustered, into something that was active/active replicated. This was our only system where we had no way to continue service if the master went down. I ended up trying everything, starting with Postgres clustering, looking at I think Mongo, Cassandra, and eventually stumbled across something new: Galera under either Percona or Maria (I forget, probably Percona). I had the cluster set up quickly, it is 3 way master, and seems to work great. Except if all 3 nodes go down, you need to take the stick to get it started, as mentioned above.

Usually, I will default to Postgres. But, I respect MySQL.

Comment Re: Satellite Internet (Score 4, Informative) 64

by jafo on Saturday October 31, 2015 @12:08PM (#50838085) Attached to: Cuba's Internet Routing Is Messed Up

Probably not a "problem", more likely it is a "decision". BGP routing isn't really about finding the fastest or best route, though InterNAP has some special sauce they can add via an appliance to help with that. It is about finding the "shortest" in terms of number of ASNs traversed, often weighted by company policy about what is cheapest. A satellite link directly to Cuba is probably fewer ASN hops than a cable to Venezuela and another cable to Cuba, so BGP picks that as best. The company pushing the traffic out the satellite either doesn't know to prefer the other path, has congestion on that link (just because traffic comes in that way doesn't mean the return traffic goes out the same link, and there could be asymmetric loading), or finds it more expensive to pass traffic to that provider. And it could be that using the exact same path to return traffic reduces latency, but increases loss due to overloaded links, so the satellite may provide a way better experience even if it is slower.
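To make the "shortest, weighted by policy" point concrete, here is a toy Python sketch. It models only two steps of route selection: a policy preference (local preference in real BGP, where higher wins) and then AS path length. The Route class, the route names, and the ASNs (taken from the range reserved for documentation) are all invented for the example, and real BGP has several more tie-breakers that are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    name: str
    as_path: list = field(default_factory=list)  # ASNs traversed, in order
    local_pref: int = 100                        # policy knob; higher is preferred


def best_route(routes):
    """Pick the route with the highest local_pref, then the fewest ASNs."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


# Hypothetical paths: the satellite link crosses fewer ASNs than the cable.
satellite = Route("satellite", as_path=[64496, 64499])
cable = Route("cable via Venezuela", as_path=[64496, 64497, 64498, 64499])

print(best_route([satellite, cable]).name)   # path length decides

# If the operator sets policy to prefer the cable, policy overrides length.
cable_preferred = Route("cable via Venezuela", cable.as_path, local_pref=200)
print(best_route([satellite, cable_preferred]).name)
```

Note that nothing in there measures latency or loss, which is the whole point.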

BGP routing can be tricky.

Comment ZFS is one option, Glacier is worth looking at. (Score 1) 321

by jafo on Tuesday December 10, 2013 @03:23PM (#45653443) Attached to: Ask Slashdot: Practical Bitrot Detection For Backups?

I've used ZFS under Linux for 5 years now for exactly this sort of thing. I picked ZFS because I was putting photos and other things on it for storage that I wasn't likely to be looking at actively and wouldn't be able to detect bit-rot until it was far too late. ZFS has detected and corrected numerous device corruption or unreadable issues over the years, via monthly "zpool scrub" operations.

I have been backing these files up to another ZFS system off-site. But now I'm starting to look at other options because it's looking like I can begin doing it more cheaply than even my free hosting of a box I bought can provide.

Amazon Glacier reduces the cost of S3 storage by an order of magnitude, making 2TB of storage cost around $20/month. For a backup copy, it's hard to compete with this, even just buying a USB drive to stick somewhere... You do have to be careful about recovery though, they charge based on peak download speed (a very weird pricing).
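Back-of-the-envelope version of those numbers, as a small Python sketch. The only inputs are the figures above (roughly 2TB for roughly $20/month, and "an order of magnitude" cheaper than S3); treating 2TB as 2000GB and "order of magnitude" as exactly 10x are my assumptions, and retrieval charges are ignored entirely.

```python
# Figures from the paragraph above; the rest is derived arithmetic.
stored_gb = 2000           # 2TB, counted in decimal gigabytes (assumption)
monthly_cost_usd = 20.0    # approximate Glacier bill for that much data

glacier_per_gb = monthly_cost_usd / stored_gb   # implied per-GB monthly rate
s3_per_gb = glacier_per_gb * 10                 # "order of magnitude" taken as 10x
s3_monthly_usd = s3_per_gb * stored_gb

print(f"Glacier (implied): ${glacier_per_gb:.3f} per GB-month")
print(f"S3 (implied):      ${s3_per_gb:.2f} per GB-month, about ${s3_monthly_usd:.0f}/month")
```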

Comment Re:Procmail is a fine tool -- but the wrong tool (Score 1) 190

by jafo on Saturday August 31, 2013 @04:41PM (#44726165) Attached to: Ask Slashdot: Speeding Up Personal Anti-Spam Filters?

In an ideal world, many of the tips you mention would be fine and not produce any false positives. Unfortunately, we don't live in that world and users *WILL* receive e-mail from servers without proper PTR records, that don't know how to properly deal with greylisting (sending from multiple IPs or sender addresses, immediately bouncing on a 4xx response), from an IP that is on a blacklist... And god forbid you have any users, because they often will squeeze you from both ends: "I've *GOT* to receive this e-mail RIGHT NOW", but also: "Why am I getting so much spam?"

Comment Re:Problem spotted. (Score 1) 190

by jafo on Saturday August 31, 2013 @04:37PM (#44726133) Attached to: Ask Slashdot: Speeding Up Personal Anti-Spam Filters?

grep *CAN* take a bunch of patterns, we simply don't know if the user in question is using it in that way. Agreed though, if you are running egrep once for every pattern you are looking for, that is probably your problem and simply putting the patterns in a file and having egrep load the patterns from it via the "-f" flag will likely reduce this dramatically. However, doing many matches is still relatively expensive.
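The same idea in Python terms, as a minimal sketch: one scan per pattern versus all patterns folded into a single compiled expression, which is roughly what handing egrep a pattern file does for you. The sample patterns and function names are invented, and Python's regex engine is not grep's, so this shows the shape of the fix rather than the real speedup.

```python
import re

# Hypothetical spam patterns; in the egrep case these would live in the -f file.
PATTERNS = [r"cheap pills", r"wire transfer", r"v[i1]agra"]


def matches_any_one_at_a_time(patterns, text):
    """One scan of the text per pattern (the slow shape)."""
    return any(re.search(p, text) for p in patterns)


def compile_combined(patterns):
    """Fold all patterns into one alternation so the text is scanned once."""
    return re.compile("|".join(f"(?:{p})" for p in patterns))


combined = compile_combined(PATTERNS)

for line in ["buy v1agra now", "lunch at noon?"]:
    print(line, "->", bool(combined.search(line)))
```

The bigger win with egrep itself is probably not launching a new process for every pattern.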

Comment Re:You could speed up your current solution (Score 1) 190

by jafo on Saturday August 31, 2013 @04:34PM (#44726107) Attached to: Ask Slashdot: Speeding Up Personal Anti-Spam Filters?

Doubtful that the time is largely spent compiling the regexes... But without knowing more about the OP's exact setup, it's hard to say. In particular, we don't know how many rules the OP has in their corpus. It could easily be tens of thousands or hundreds of thousands, if they just throw a bunch of strings they've seen in spam into a list of "don't let me see this message again" expressions. egrep is probably already compiling any expressions, it's just doing a *LOT* of matching.

You could try doing statistical matching on the corpus and moving more frequent matches earlier, so that matches cause the rules to terminate more quickly. "-q" might help speed it up by short-circuiting on failure (not sure if it does this or not, but I see no reason why "-q" wouldn't).
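Here is a small Python sketch of the reordering idea, assuming the rules are plain substrings and you have a pile of known spam to count hits against. The rule strings, the sample corpus, and the function names are all made up for the example.

```python
from collections import Counter


def order_by_hits(rules, spam_corpus):
    """Sort rules so the ones that matched the most known spam come first."""
    hits = Counter()
    for message in spam_corpus:
        for rule in rules:
            if rule in message:
                hits[rule] += 1
    # sorted() is stable, so rules with equal hit counts keep their old order.
    return sorted(rules, key=lambda rule: hits[rule], reverse=True)


def is_spam(message, ordered_rules):
    """any() stops at the first matching rule, so common rules exit early."""
    return any(rule in message for rule in ordered_rules)


rules = ["rare phrase", "free money", "click here"]
corpus = [
    "free money click here",
    "free money now",
    "click here for a rare phrase",
]
ordered = order_by_hits(rules, corpus)
print(ordered)
print(is_spam("get free money today", ordered))
```

This only helps on messages that actually match; a clean message still has to be checked against every rule.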

But to really improve the performance, you're probably going to have to simply be more clever than looking for a bunch of strings. For example, using something like razor fingerprinting or bayesian matching.

You can't just drop your corpus into a database and solve it, you'd need to come up with a way of indexing the data such as fingerprinting to get something that you can index.

You might also want to do different checks depending on whether the message is directly addressed to you or not. For example, any e-mail that doesn't mention one of my addresses in the To or Cc, or that comes from specific mailing lists, gets stored into a separate folder that I look at very rarely. The vast majority of spam that I get goes into that box.
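A rough Python sketch of that kind of split, using the standard library email package. The address set and the folder names are placeholders, and the mailing-list case is left out to keep it short.

```python
from email import message_from_string
from email.utils import getaddresses

# Placeholder: the addresses that count as "directly addressed to me".
MY_ADDRESSES = {"me@example.com"}


def folder_for(raw_message):
    """Return 'inbox' if one of my addresses is in To or Cc, else 'bulk'."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = getaddresses(headers)   # list of (display name, address) pairs
    addressed = any(addr.lower() in MY_ADDRESSES for _, addr in recipients)
    return "inbox" if addressed else "bulk"


direct = "From: a@example.net\nTo: Me <me@example.com>\nSubject: hi\n\nbody\n"
blind = "From: x@example.net\nTo: undisclosed@example.org\nSubject: deal\n\nbody\n"

print(folder_for(direct))
print(folder_for(blind))
```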

Sender IP is VERY easy to use for a database lookup. When I get spam from an IP, I will often set up a blacklist for IPs around that address, unless it is something like gmail or another big mail service that I recognize. It's surprising how often I get spam from a bunch of very similar IPs (in the same /24 or same /22).
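A minimal Python sketch of the "block the neighborhood" lookup, using the standard ipaddress module. It assumes a fixed /24 so each check is a single key lookup; the function names are invented and the addresses come from the ranges reserved for documentation.

```python
import ipaddress

PREFIX = 24  # assumed neighborhood size; a /22 would work the same way


def neighborhood(ip, prefix=PREFIX):
    """Return the network containing `ip`, e.g. 192.0.2.77 -> 192.0.2.0/24."""
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


def block(blocked, ip):
    """Blacklist the whole neighborhood around a spam-sending IP."""
    blocked.add(neighborhood(ip))


def is_blocked(blocked, ip):
    """Single set lookup keyed on the sender's network, no scanning."""
    return neighborhood(ip) in blocked


blocked = set()
block(blocked, "192.0.2.77")

print(is_blocked(blocked, "192.0.2.200"))    # same /24 as the spammer
print(is_blocked(blocked, "198.51.100.5"))   # unrelated network
```

In a real database the network address would just be the indexed key column.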

Comment Re:Call me old fashion (Score 1) 156

by jafo on Tuesday August 20, 2013 @06:48PM (#44624701) Attached to: Samsung SSD 840 EVO 250GB & 1TB TLC NAND Drives Tested

Worse, a lower rate kind of is *MORE* indicative of a load that needs an SSD rather than *LESS*. SSDs are *VERY* good at random seeks and you can easily saturate a spinning disc at 400KB/sec or less worth of random I/Os. (assuming 10ms average access time, or 100 accesses/sec).

If you are streaming a lot of data, an SSD is "only" around 4x faster than a spinning disc. If you are doing random I/Os, an SSD is more than 100x faster.
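The arithmetic behind the 400KB/sec figure, spelled out in Python. The 10ms access time is from above; the 4KB size per random I/O is my assumption, chosen because it is what makes 100 accesses/sec come out to 400KB/sec.

```python
avg_access_time_s = 0.010   # 10ms average access time for a spinning disc
io_size_kb = 4              # assumed size of each random I/O

accesses_per_sec = 1 / avg_access_time_s          # seeks the disc can do per second
random_kb_per_sec = accesses_per_sec * io_size_kb

print(f"{accesses_per_sec:.0f} accesses/sec")
print(f"{random_kb_per_sec:.0f} KB/sec of random I/O saturates the disc")
```

So a tiny throughput number can still mean the disc is completely busy seeking.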

Comment Another thing to keep charged... (Score 1) 69

by jafo on Tuesday July 09, 2013 @12:55PM (#44227129) Attached to: Linux-Based Smartpen Heads For Kickstarter

I'd love to write you a letter but my pen needs charging.

Comment Re:Facilities: learn from the telcos (Score 1) 75

by jafo on Saturday June 08, 2013 @10:56PM (#43949807) Attached to: Ask Slashdot: Best Software For Tracking Fiber Optic Networks?

I used to work doing IT for the ILEC, and the more I worked with their systems, the more surprised I was that I was able to pick up the phone and get a dial-tone. A friend of mine worked on the systems that managed the in-the-ground cables, he's the one that said the previous sentence. I worked mostly on the billing and ordering systems. They were not the most robust systems.

Comment Re:I loathe the medical "profession" (Score 1) 273

by jafo on Thursday May 30, 2013 @02:10PM (#43863425) Attached to: Hospital Resorts To Cameras To Ensure Employees Wash Hands

You mention the "time out" before sedation is administered, which is great. But, the last time I had a procedure, we went through a bunch of stuff before I went into the operating theater, with my wife present and verifying everything that was going on. Then I was moved into an operating theater and asked by someone I had never met before to sign a paper about some sort of sedation, which I could opt out of. Without my wife present to "double check my math".

At this point, I hadn't had anything to eat in 24 hours, nothing to drink in 12, and had gotten little if any sleep in the previous 24+ hours because of the operation preparation... To recap: I was sleep deprived, dehydrated, and my blood sugar was all messed up, and had to make a decision that was so important that it required signing off on a page of dense text.

In retrospect, I should have said that I wasn't able to make that decision and lobbed the ball back into their court. What I did was the doctor said he recommended it, and I signed it.

In second retrospect, if at all possible, I'm never going to "meet" a doctor in the operating room. Apparently I had the opportunity to have a sit-down in their office, but this was presented to me as a waste of time. Never again...

Sean

Comment Re:Why? (Score 2) 268

by jafo on Friday April 26, 2013 @03:10PM (#43560151) Attached to: Btrfs Is Getting There, But Not Quite Ready For Production

zfsonlinux has less testing than Btrfs? Really?

I think you mean *THE LINUX SHIM* has less testing. However, there's this *HUGE* portion of the code, as a wild ass guess I'd say 80%, which is the internal algorithms, data structures, and other internal parts of the file-system that are shared by the Linux and Solaris versions and those have been quite seriously tested for ZFS.

My experience with ZFS under Linux via FUSE was that there were some bugs in the integration layer, but they tended to be fairly shallow and never led to data loss. This is over around 3 years of serious ZFS+FUSE use on Linux (~30TB of backup storage, home storage server). I tested the heck out of ZFS+FUSE before we deployed it, found some issues, worked with the developers (who were amazing!), and eventually got to a point where the stress test I was running on it was more stable than it was under our OpenSolaris systems a few years prior (and the reason I built the stress test).

Based on my experience with ZFS, ZFS+FUSE, and btrfs, I'd personally trust ZFSonLinux over btrfs. My experimentation with btrfs the last few years has been that it still needs a lot of work.

Comment Re:ZFS (Score 1) 268

by jafo on Friday April 26, 2013 @02:59PM (#43559981) Attached to: Btrfs Is Getting There, But Not Quite Ready For Production

Please explain it to me, because I really don't see any reason not to rely on an "out of tree FS". My system won't boot without tons of stuff that is outside of the kernel tree, including things like init but also things like graphics drivers on my desktop.

It seems to me that the ZFS license issue is only with the kernel, and can be solved by distributors. Distributions deal with wrapping up things under multiple licenses *ALL THE TIME*. And Ubuntu seems to be pretty close to having this integration done, based on what a friend reported with his experiments with zfsonlinux as a root device.

With all due respect to those involved, I think the pronouncements that it must be in the kernel, and that it is a "rampant layering violation", have set Linux back a long ways. FreeBSD, DragonFly BSD, OpenSolaris, have all had "advanced filesystems" for years now. Linux is basically stuck with a feature-set from Berkeley FFS and isn't really showing that that is going to change for several years... It's kind of a shame, especially since at the time of the "layering violation" comment it was clear to me that the violation came with significant compelling reasons for it, and now btrfs seems to be realizing that and implementing the same features...

Hindsight and all that, but it's a damn shame. ZFS is insanely awesome, I have a number of systems running it under FUSE and it has proven very reliable over the years.

Comment Worth trusting your data to btrfs?!? (Score 1) 268

by jafo on Friday April 26, 2013 @02:43PM (#43559793) Attached to: Btrfs Is Getting There, But Not Quite Ready For Production

If you are "trusting your data" to *ANY* file-system, you are likely to be disappointed.

I have run btrfs off and on for maybe 3 or 4 years because I don't *HAVE* to trust my data to it. I have good backups that run daily. If btrfs screws the pooch, I'm not really out that much.

Note though, my backup servers run ZFS. :-)

Honestly, it seems to me that btrfs has gotten worse over the last few years rather than better. 4 years or so ago when I first started using it, it actually worked pretty well and I was fairly happy with it, including taking automatic snapshots, and I never had a data loss. ISTR that I switched away from it because I upgraded to a new distro and had to reformat, for various reasons. Newer versions I've tried have been barely usable and I've had btrfs wedge itself a few times. Some of the issues were distro integration issues I think, like 12.04 seemed to *ALWAYS* run a full fsck on boot, and I think it took a snapshot when I tried to do an upgrade to 12.10, which somehow caused it to think that it had space available when it didn't and it ran out of disc space during the upgrade...

I really want btrfs to get production ready, but I'm half thinking that by the time it is, HAMMER2 will be out and I'll be infatuated with it. Note that btrfs and HAMMER started around the same time, maybe HAMMER had a 6 month lead. HAMMER has been "production stable" and has been the default DragonFly BSD filesystem for several years. Dillon seems to know how to build a file-system...

Comment Re:PyCon is a wonderful thing. (Score 1) 759

by jafo on Monday March 25, 2013 @04:25PM (#43275363) Attached to: Will Donglegate Affect Your Decision To Attend PyCon?

PyCon, as an organization, takes it very seriously if someone expresses what they feel is a code of conduct violation. They didn't "side" with either party, they arbitrated a discussion between the 3 of them. They would have done that no matter the gender of the reporting and reportee sides, I am quite confident. I say this because I know the organizers fairly well.

That said, one evening I said something fairly similar to one of the organizers and another community member: "I just wish we could all act like adults". The one guy said "I hear you, but I can cite several papers on why we can't just do that." And with this guy, I have no doubt that he literally could. His theory (which I buy because he's much smarter than me :-) is that PyCon will have a couple more years of an "awkward phase" where we don't quite have enough diversity that various groups can stand more on equal footing. Once we reach that point, he speculates, things like this will be less of an issue.

Comment PyCon is a wonderful thing. (Score 1) 759

by jafo on Saturday March 23, 2013 @05:21PM (#43259347) Attached to: Will Donglegate Affect Your Decision To Attend PyCon?

I'll admit, I have done some soul searching since I heard about donglegate about whether I would attend PyCon 2014. I hear other responses saying things like "If I were a python dev, I would ...", so let me be clear here: I have commit privs to Python core (though I don't exercise them as much as I'd like), I'm involved in the conference (again, not as much as I'd like). I say this to make it clear that when I say I was seriously considering not going to 2014, it's somewhat of a big deal. I'm involved, however this is in no way an official statement from PyCon, these are my thoughts and my thoughts alone.

But here's the thing... Not going doesn't really send a message to the conference organizers, or at least it doesn't send the one you think it does. More on that in a moment. What it *DOES* send is a message to people who will take any opportunity to grandstand on their agenda, that they can find an audience at these conferences, to the extent that it goes on for multiple years. It doesn't matter whether the actions taken here were grandstanding or not. Irrespective of her intentions, many people are seeing it as such, so I think it's fair to say it can send a message to others who would, without speculating on the intentions that started this.

If conference attendance were way down next year, the story would be about how donglegate caused it, and it would be feeding all the horrific sentiments behind this. If, however, attendance is up next year, the story will be how despite this the Python community remained strong, shutting down the bad sentiments and making it into a positive story.

Unfortunately, I and a number of other folks were expecting attendance to be down next year even before any of this donglegate stuff came out. PyCon tends to lose attendees every time it moves cities -- though the move to Santa Clara didn't suffer from that. Moving it such that a significant number of Americans who haven't needed passports in the past now need them may reduce attendance. On the other hand, there may be people who come from around the world who didn't want to deal with the TSA... It's all speculation, but an informal poll I took showed about half the people were expecting it to be smaller.

So why doesn't it send a useful message to the organizers? Because the conference organizers did all they could about this incident. When the incident was reported, they acted swiftly (by all accounts), spoke to the 3 involved, apologies were given and apparently accepted, and everyone went away happy. No complaints were filed about the posting of the photograph.

Everything that happened that is making this show up on slashdot happened *OUTSIDE THE CONFERENCE*. The incident itself happened, I believe, in the last hour of the conference (her blog post sounds like it happened during the closing Lightning Talks, the last session of the conference). But in any case, the firing and rage happened largely on the Internet, in response to her post of that picture.

What can the conference do? Ban any of them from the show in future years? The only official complaint to the conference was handled to the satisfaction of all involved, at the time. Excluding someone from the conference without any complaint would lead to another storm...

As I said in the subject, PyCon is a wonderful thing. I've been to 10 of them, I've only missed one. PyCon has been working hard to include more diversity, and this year we had around 20% women. I remember when we literally had a handful of women at PyCon, and I was married to one of them. In order to get here PyCon has had to do a lot of outreach and take reports of harassment and the like very seriously. The community is stronger for it. And we now have experience dealing with someone tweeting "shame photos"...

Retaliating against the conference for this is going to do more harm than good. Plain and simple.

Am I going to PyCon 2014? Absolutely!
