
Comment Re:Apache what? (Score 1) 42

My advice: Don't use Solr. Don't use OCRd PDFs. Don't support full-text searching, because no one fucking uses it. We get thousands of searches against title, keywords, dates, and other meta shit every day in our internal application. The only full-text searches performed are by me when I'm testing shit.

Lawyers use it. Magazines use it. Lots of people use it.

Lawyers use it because they have to - there is no alternative to search shit short of hiring monkeys to manually type up mountains of old documents. Often, those monkeys would have to be legally privileged to look at the documents, so it's not something you can shunt off to cheap labor / Mechanical Turk. OCR sucks. Solr sucks. Mixing the two is a big ol' suck fest.

Magazines use it because... they're stupid? There's no need to OCR a massive backlog of shit. For old shit that may not be digital, you can go ahead and hire a monkey to type it in. You're still left with Solr sucking, but on top of that much of a magazine's content is so heavily formatted/styled/image-based that a Solr index would not suit it well.

If you NEED a full-text index, there are plenty of alternatives, some mentioned by others in the comments on this article. I can only speak to OCR sucking, Solr's indexer sucking, and Solr's search giving me way too many things for it to be useful.

Comment Re:Not contradictory (Score 2) 549

1) The frequency of choosing a password is not within the end-user's control, and hence has no impact on whether or not the end-user chooses to include special characters vs several simple words.

The vast majority of passwords and resets are controlled by the user. Websites do not often force people to reset passwords. In a corporate environment people will be forced to change passwords more frequently, sure. But email, 20 social networking sites, shopping sites, and even banks will typically not force a reset unless they've been compromised.

2) Protecting against a brute force attack does not, in any way, break protection against "informed statistical" attacks.

XKCD's shitty advice is protecting against brute force attacks by using length (even though in many cases the effective length is still limited to something stupid like 16 characters). By following XKCD's shitty advice, you open yourself up to statistical attacks - your search space is just a combination of a few words. People generally only use a few thousand words, and when you want them to be random about it they'll likely pick common ones, fairly short ones, mostly nouns, etc.

3) End-users do not typically know how many other people have chosen that same password, but can protect themselves against accidentally choosing a common password by doing exactly what the XKCD comic recommends (picking four random words and juxtaposing them). Just don't use the specific password chosen in the comic.

Humans are terrible at being random. Any magician, con-artist, or statistician will tell you that. The most commonly-picked "random" cards are the ace of spades and the queen of hearts, for example. The 4 "random" words scenario will give you a search space many orders of magnitude smaller than a good, traditional password.
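
To put rough numbers on that search-space comparison, here is a minimal sketch. Everything in it is an assumption for illustration: the 2048-word vocabulary, the 94-character printable set, the 14-character length, and above all the idea that each pick is uniform and independent (which, as said above, humans are bad at).

```python
import math

def search_space_bits(choices_per_slot: int, slots: int) -> float:
    """Bits of search space for `slots` independent, uniform picks
    from `choices_per_slot` options each."""
    return slots * math.log2(choices_per_slot)

# Assumed numbers, for illustration only.
COMMON_WORDS = 2048  # hypothetical vocabulary size people actually draw from
PRINTABLE = 94       # printable ASCII characters, excluding space

passphrase_bits = search_space_bits(COMMON_WORDS, 4)  # four "random" words
password_bits = search_space_bits(PRINTABLE, 14)      # 14 random characters

print(f"4 words from {COMMON_WORDS}: {passphrase_bits:.1f} bits")
print(f"14 chars from {PRINTABLE}: {password_bits:.1f} bits")
```

Under these assumptions the gap is dozens of bits. A bigger word list or more words narrows it, and a human-chosen "random" character password will fall well short of its ideal figure too.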

4) Disallowing common passwords is not within the end-user's control. It is a good practice, but does not in any way change the password-selection logic that end users should use as per the XKCD comic.

The only contradictory point mentioned is the "change password strength meters", which might mean "require special characters and numbers," which is exactly what the comic demonstrates to offer no value. The intent here seems to be the avoidance of common passwords, and that can be done without forcing special characters, which makes passwords hard to memorize.

Disallowing common passwords is within the user's control. Don't use a fucking password you've heard of before. If your password manager, or a site, tells you that the password is shitty, maybe don't use it.
The XKCD comic is fucking wrong. Symbols, numbers, and capitalization all increase the search space exponentially. Special characters do not make passwords harder to memorize. I find they make it easier. They provide a cadence in many of the passwords I use. Instead of just a slurry of letters, a password with digits or symbols is less likely to get twisted about in someone's mind. alhysuidopmnah will be subject to transposition on shit like the ui, mn. alhys5idop#nah doesn't have that problem, and is much easier to compartmentalize (alhys5 idop# nah). This may or may not be true for all users for fixed length (and it certainly depends on the specific password itself). Beyond that, for passwords of a given strength those with symbols and shit will be easier to memorize than those without, if only because they'll be much shorter.
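
The "shorter for a given strength" point can be checked with a quick sketch. The 64-bit target and the alphabet sizes below are assumptions picked for illustration, and the math only holds if every character is chosen uniformly at random.

```python
import math

# Assumed alphabet sizes, for illustration.
ALPHABETS = {
    "lowercase only": 26,
    "lower + upper": 52,
    "lower + upper + digits": 62,
    "all printable ASCII (no space)": 94,
}

def length_needed(alphabet_size: int, target_bits: float) -> int:
    """Smallest length whose search space reaches `target_bits`,
    assuming uniform, independent characters."""
    return math.ceil(target_bits / math.log2(alphabet_size))

TARGET_BITS = 64  # arbitrary strength target, not a recommendation

lengths = {name: length_needed(size, TARGET_BITS)
           for name, size in ALPHABETS.items()}

for name, length in lengths.items():
    print(f"{name}: {length} characters")
```

How much shorter the mixed-alphabet password ends up depends on the target you pick. The sketch only shows the direction of the effect.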

Comment Re:Apache what? (Score 1) 42

Yes, if you have to support it, you have to support it. I would steer clear of Solr based on my experience with it, however.
We only added it on because the shit we use integrates well with it. It was a "why not?" that works well enough to not be ripped out, but I wouldn't do it again unless I had to.

Comment Re:Apache what? (Score 2) 42

I had never heard of it either until I needed to create an internal search engine where I work. After a few days of research, I found that Apache Solr/Lucene is often used for intranet search engines and for e-commerce sites.

We use it to parse and index OCRd PDFs for full-text searching.
My advice: Don't use Solr. Don't use OCRd PDFs. Don't support full-text searching, because no one fucking uses it. We get thousands of searches against title, keywords, dates, and other meta shit every day in our internal application. The only full-text searches performed are by me when I'm testing shit.

Comment Re:What right do they have anyway? (Score 1) 144

This. If they're bound by law to remove results upon request, then they should remove them (assuming the request itself is valid).
They shouldn't be deciding which requests to approve or not beyond a technical / common sense capacity. John Doe obviously couldn't request all results for the generic use of his name to be removed, nor could he request that a specific page for someone else's name be removed.
Anything else should be honored, in accordance with the law.

Comment Re:But that was not the same! (Score 3, Interesting) 622

Her celebrity image (or personality) is a nice girl (she is one of those that really listens to her agents). No idea if she is really a nice girl or a slut in real life.

No it isn't. Her image is sex. She was only a "nice girl" when promoting the first Hunger Games movie to ensure the teens and tweens would see it.
She's been in other movies, you know. She's done plenty of magazine photo shoots. Her image is as much "nice girl" as Britney Spears - it's a manufactured angle designed to hook a demo which is then leveraged for mass appeal.

Comment Re:Is this counting Apple's new encryption scheme? (Score 1) 210

I think his point is that while the NSA has been able to sniff around the internet with impunity, to actually take your phone and examine it, they would need a warrant.

Step 1: You are pulled over while driving for .
Step 2: Cop determines that you are acting suspicious and refusing to comply with his orders.
Step 3: Cop tells you to step out of the car, puts you in handcuffs, empties your pockets, and searches your vehicle.
Step 4: Cop takes your phone and plugs in AutoFascist 3.0 device while you watch, pressed up against the hood of your own car.
Step 5: "Thank you, Officer."

Comment Re:Shellshock is way worse (Score 1) 94

If you read the article (even the summary) you can see this isn't about tricking an admin into running a script. This is about a script already set to, say, list a set of directories and do something per directory, but if a user names a directory FOO&BAR the script will interpret BAR as a command instead of input.

If a normal user has read access to a maintenance script to know that this is possible, you've already failed.
If your maintenance script doesn't enclose paths and variables containing them with quotation marks, you've already failed.
There's a reason MS isn't going to patch it - the problem is in your scripts.
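
The quoting point can be shown with a small sketch. It is not the Windows batch case from the article: it assumes a POSIX `sh` and an `echo` binary are available, and the directory name `FOO&echo BAR` is a made-up stand-in for a hostile name. The idea carries over, the exact quoting rules do not.

```python
import shlex
import subprocess

def run_unquoted(dirname: str) -> str:
    # Vulnerable: the name is pasted into the shell command line as-is,
    # so the shell sees "&" as a command separator.
    return subprocess.run("echo " + dirname, shell=True,
                          capture_output=True, text=True).stdout

def run_quoted(dirname: str) -> str:
    # Quoted: shlex.quote (POSIX shells only) wraps the name so the
    # shell treats it as one literal word.
    return subprocess.run("echo " + shlex.quote(dirname), shell=True,
                          capture_output=True, text=True).stdout

def run_no_shell(dirname: str) -> str:
    # No shell at all: the name is handed over as a single argument.
    return subprocess.run(["echo", dirname],
                          capture_output=True, text=True).stdout

hostile = "FOO&echo BAR"  # hypothetical directory name

print(repr(run_unquoted(hostile)))  # "echo BAR" ran as its own command
print(repr(run_quoted(hostile)))    # the whole name is printed literally
print(repr(run_no_shell(hostile)))  # same, without involving a shell
```

Skipping the shell entirely is usually the simpler fix when the script is written in something other than shell. Quoting is what you are left with when it is not.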

Comment Re: Intel Common Core i7 (Score 1) 239

Except you can't equate those measurements.
You can calculate the length of the two sticks stacked end-to-end as 1+1=2. You can measure it as 3.
You can't equate those separate measurements because of the precision (and accuracy) issues inherent with measuring shit. You don't equate data to data for this very reason.

If you know your upper and lower bounds then you absolutely should use them throughout the use of the data, as you showed with 1±½ + 1±½ = 2±1.
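
Carrying the bounds along can be sketched like this. The `Measurement` class is hypothetical, and it uses worst-case bounds where the errors simply add, which matches the 1±½ + 1±½ = 2±1 example. It is not a statistical treatment of uncertainty.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Measurement:
    value: float
    error: float  # half-width of the bounds: true value is in value ± error

    def __add__(self, other: "Measurement") -> "Measurement":
        # Worst-case bounds: values add and errors add.
        return Measurement(self.value + other.value,
                           self.error + other.error)

    def contains(self, x: float) -> bool:
        # True if x is consistent with this measurement's bounds.
        return self.value - self.error <= x <= self.value + self.error

stick = Measurement(1.0, 0.5)  # one stick, measured as 1±½
total = stick + stick          # two sticks end-to-end

print(total)
print(total.contains(3.0))  # is a measured 3 consistent with the bounds?
```

With the bounds kept, a measured 3 sits at the edge of 2±1 instead of looking like a contradiction of 1+1=2.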

Comment Re:What this mean... (Score 1) 239

To be fair, almost no consumers have any use for double. And commercial entities who do usually don't mind the extra zero at the end of GPU's cost, because to them, that's just expenses to be written off on their taxes.

Protip:
Make $X in profit, owe some fraction $x in taxes.
Write off some amount $Y, owe (x/X) * (X-Y) in taxes.
x - (x/X) * (X-Y) = x - xX/X + xY/X = xY/X savings from writeoffs (assuming X isn't 0 - you had a profit, and assuming the writeoffs don't cross brackets).

You only get a fraction of a writeoff as a reduction in your taxes. You can't make huge purchases and write them off as if they were free.
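
The same arithmetic as a sketch, with made-up numbers. The function name and the figures (100,000 profit, 21,000 tax, 10,000 writeoff) are assumptions, and like the algebra above it assumes a single flat effective rate with no bracket changes.

```python
from fractions import Fraction

def writeoff_savings(profit: int, tax_owed: int, writeoff: int) -> Fraction:
    """Tax saved by a writeoff, assuming one flat effective rate x/X."""
    if profit <= 0:
        raise ValueError("assumes there was a profit (X > 0)")
    rate = Fraction(tax_owed, profit)        # x / X
    tax_after = rate * (profit - writeoff)   # (x/X) * (X - Y)
    return tax_owed - tax_after              # x*Y / X

# Hypothetical figures: X = 100000, x = 21000, Y = 10000.
savings = writeoff_savings(100_000, 21_000, 10_000)
print(savings)  # the writeoff cost 10000 but only this much came back
```

`Fraction` is used only to keep the arithmetic exact. Plain floats would show the same thing with some rounding noise.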

Comment Re: Intel Common Core i7 (Score 2) 239

That's true though. Using nearest integer rounding, 1.4 can be accurately represented as one and produce a sum of 2.8, which can be represented by three.

In other words, 1+1=3 for sufficiently large values of 1.

That would be 1.4 + 1.4 = 2.8. 2.8 could then be rounded to 3, at which point you could say 1.4 + 1.4 ~= 3.
Anything beyond that, though, is horseshit.
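
A tiny sketch of the difference, using Python's built-in `round` as the nearest-integer rounding (an assumption, any nearest-integer rule gives the same result for these values):

```python
a = 1.4

sum_then_round = round(a + a)         # round the real sum, 2.8
round_then_sum = round(a) + round(a)  # round each input first, then add

print(sum_then_round, round_then_sum)
```

Rounding the sum gives 3. Rounding the inputs first gives 2, and nothing about the rounded inputs lets you get back to 3.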
