
Comment: Pragmatic: continual, active refresh (Score 1) 111

by michaelmalak (#47953255) Attached to: Data Archiving Standards Need To Be Future-Proofed

One can whine and wax poetic all one wants, but since we don't have a good archival format, the practical solution today is continual refresh of data: periodically copying data to fresh, technologically up-to-date media. It's not sexy, but it does address three of the four points at the end of the linked piece (end-to-end data integrity, format migration and secondary media formats). The unaddressed point, access audit trails, makes no sense given the premise stated at the beginning of the piece that "No matter what anyone tells you, there is data that does not need to be on primary storage".

Yes, this is expensive. Yes, it would be nicer (cheaper) if a one-time single format could address the archive problem.
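
For what it's worth, a single refresh pass can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names sha256_of and refresh_copy are made up, SHA-256 is just one reasonable checksum choice, and a real archive would also record the checksums somewhere and re-verify them on a schedule.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_copy(src: Path, dst: Path) -> str:
    """Copy src to dst (hypothetical 'fresh media') and verify the copy.

    Raises IOError if the checksum of the copy differs from the source.
    Returns the verified checksum so the caller can log or store it.
    """
    before = sha256_of(src)
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    after = sha256_of(dst)
    if before != after:
        raise IOError(f"checksum mismatch copying {src} to {dst}")
    return after
```

Calling refresh_copy(Path("old/a.bin"), Path("new/a.bin")) covers only the integrity part of a refresh; format migration would still need a conversion step of its own.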

P.S. There is also this gem from the piece:

creation of a collision-proof hash

Of course the whole point of a hash is a mapping from a high-cardinality space to a low-cardinality space, and thus collisions are always a possibility. Collisions are minimized when a good hashing function uniformly distributes the resulting hashes, but given a large enough collection of distinct source documents (one more than the cardinality of the hash space is enough to guarantee it, by the pigeonhole principle), collisions will occur.
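
A toy demonstration of the pigeonhole argument, as a Python sketch. The 8-bit tiny_hash below is my own invention for illustration (the first byte of SHA-256), chosen so the hash space has only 256 values; real hashes have far larger spaces, but the same argument applies.

```python
import hashlib


def tiny_hash(data: bytes) -> int:
    """Hypothetical 8-bit hash: the first byte of SHA-256 (256 possible values)."""
    return hashlib.sha256(data).digest()[0]


def first_collision(max_docs: int):
    """Hash distinct documents until two share a hash value.

    Returns (earlier_doc, later_doc, shared_hash), or None if no collision
    was found within max_docs documents.
    """
    seen = {}
    for i in range(max_docs):
        doc = f"document-{i}".encode()
        h = tiny_hash(doc)
        if h in seen:
            return seen[h], doc, h
        seen[h] = doc
    return None


# 257 distinct documents into 256 possible hash values: a collision is
# guaranteed, however uniformly the hash function distributes its outputs.
result = first_collision(257)
```

In practice the first collision tends to show up well before the 257th document (the birthday effect), which is why a good distribution reduces collisions but cannot eliminate them.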

Comment: Re:Fallacy (Score 2) 914

by michaelmalak (#47899369) Attached to: Why Atheists Need Captain Kirk

It's a strawman argument, lacking an understanding of what actual science and the scientific process is.

And yet it is a common misunderstanding about the scientific method, namely:

"If it can't be proven by the scientific method, it must not be true."

This belief is false because there are things that are true that we know from outside the scientific method, namely by reason (e.g. Calculus and other philosophy of math) and by faith (religion).

The grandparent comment asks "show me the Spockists". To which I answer, show me where in public school curriculum the scientific method is explained and its relationship to philosophy, religion and truth (or even just philosophy and math, to keep things secular).

Comment: Seven letters (Score 1) 819

I travel a huge amount for work, and I am required to select the cheapest available option (within a window)

Three letters: ADA

Four more letters: OSHA

The $20 for Economy Plus is a "reasonable accommodation." However, if you're able to use frequent flier miles earned on the job to obtain Economy Plus, your case is much weaker.

IANAL, nor have I tried this yet (because I've never had an employer decline my initial polite request).

Comment: Marketing vs technology (Score 1) 87

by michaelmalak (#47731097) Attached to: What's After Big Data?

From the linked piece:

In hindsight, his remark was a clear sign that the marketing hype around "big data" had peaked.

This is true, and it provides the context missing from TFS: "Big Data" is over as a marketing term. But as a technological term, and in actual implementation, it is the status quo and forevermore will be.

From a technological perspective, "Big Data" has a simple definition: more data than can be stored on a single machine. And this need will only grow as hard drives and maybe even SSDs plateau while of course enterprise data only grows.

Indeed, TFA itself states (in a passage that TFS omitted):

A particularly hot sector has matured around Hadoop, an open-source analytics software platform. Many tech companies are writing software to make Hadoop industrial strength and integrate it with new and existing types of databases.

So, from TFA itself: Hadoop is hot, but the term "Big Data" is not.

Comment: Re:Why wouldn't you think they are scanning? (Score 1) 353

by michaelmalak (#47617427) Attached to: Microsoft Tip Leads To Child Porn Arrest In Pennsylvania

You are correct that automated scanning combined with reporting to the government is to be expected in today's political climate. However, you would be incorrect if you asserted that the founding fathers expected the asymmetry where the populace could not similarly examine Lois Lerner's e-mails.

Comment: Buggy whip techniques (Score 1) 637

by michaelmalak (#47616913) Attached to: Ask Slashdot: "Real" Computer Scientists vs. Modern Curriculum?

Much that is taught in CS today I had to learn on my own because it hadn't matured enough yet to be incorporated into CS programs: multi-threading, unit testing, OOP, SQL, data mining, all of the web technologies, etc.

But perhaps today's graduates will be complaining ten years hence how new graduates just rely on quantum computing searches and don't know anything about pruning search trees.

Seriously, though, to the point, I'd be more leery of those who graduated ten years ago and have not kept up their skills than of those who graduated recently and did not learn the skills of ten years ago.

Comment: Yes, no coding. No, problem is not tools (Score 4, Insightful) 372

by michaelmalak (#47518565) Attached to: 'Just Let Me Code!'

Yes, it is true coders have little time to code. But the author misses the primary cause: the ratio of library/framework code to self-written code.

In the old days (say, 25+ years ago), you would pick up a book -- a single book -- of the OS API calls, memorize it, and start coding. Today, with github, it's as if everyone in the world were working on the same single project. Today, a developer needs to learn all these libraries that are coming out daily and how to work with them. In the old days, there was a lot of reinvention and co-invention of the wheel. Today, that is not allowed: one has an obligation to "buy" (for free) instead of build, because a) of course, it saves development time, b) more importantly, one gets updates/upgrades "for free" without having to invest (much) additional development time, and c) one's organization can advertise in the future for developers who already have experience with that particular library/framework.

To address specifically the reasons identified by the author:

  • Deployment. This is big, perhaps even as big as the above. In the old days, deployment was copying a single executable file. Today, not only is deployment to various and numerous servers more complicated, but for the past 20 years we've had people dedicated to managing those servers, called sys admins, to handle all those non-coding tasks. Of course, coders end up doing some admin and admins end up doing some coding, so now for the past couple of years we have a new half-breed, the Dev Ops. The very existence of both sys admins and dev ops is itself an acknowledgement that coding is a smaller percentage of the total work involved.
  • Tools. The author spends most of the piece harping on this, and it's just totally bogus. We've always had source code control, editors, compilers, and linkers, and they've always been a pain at times to work with. But in fact, it's better now because you can find or ask about work-arounds and solutions on StackOverflow instead of calling up tech support at a closed-source vendor.

But this new development paradigm of the global github hive -- where we're all essentially working on and contributing to this one massive codebase that we all have to understand -- is what the author missed. The amount of custom code left to actually write is small now, and the majority of time is spent figuring out how to get the various libraries and frameworks to work.
