Big data and big data technologies may be a buzzword today, and you are probably right that most people don't need them. However, Big Data is a very, very real problem. I design and run systems which crunch 60-plus gigabits of data per second. So no, a few "well crafted python scripts" will accomplish exactly nothing.
Agreed. The OP doesn't realize just how big Big Data can be, how diverse it can be (binary vs text, structured vs unstructured, real-time or historical, etc.), and how much can be generated each day if he/she thinks that some scripts will fix the problem. When companies like EMC, Splunk, LogRhythm, Tibco, Q1 Labs, etc. exist to analyze and collect data for their customers and they have to throw millions into R&D then you know it's not just a fad.
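To put the parent's 60 gigabits per second in perspective, here is a back-of-the-envelope sketch of what that rate adds up to over a day. It assumes decimal units (1 gigabit = 10^9 bits, 1 TB = 10^12 bytes) and a sustained rate with no compression or filtering, which the parent did not specify; the function name is just for illustration.

```python
# Rough arithmetic only: sustained line rate -> stored volume per day.
# Assumes decimal units and no compression, dedup, or filtering.
GIGABIT = 10**9        # bits
TERABYTE = 10**12      # bytes
SECONDS_PER_DAY = 86_400

def terabytes_per_day(gigabits_per_second):
    """Convert a sustained rate in Gbit/s into TB accumulated per day."""
    bytes_per_second = gigabits_per_second * GIGABIT / 8
    return bytes_per_second * SECONDS_PER_DAY / TERABYTE

print(terabytes_per_day(60))  # 648.0
```

So a feed like that would produce several hundred terabytes a day if everything were kept, which is why a handful of scripts on one box isn't the answer.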
And how are we measuring the size? What sizes are measured for typical 'big data'?
You measure the size based on how much storage capacity the data takes up on disk. Usually it's on SAN storage. Big data can be any size, but the term is typically used for customer data in the terabyte range, which can obviously extend from 1 TB to 1024 TB. For one company 1 TB of data may be created in one day and for another it might take a year. But creation isn't the issue...it's the storage, analysis and being able to act on the data that can be difficult at those capacities. Why, you ask? Look at my answer to your next question.
Are we talking about detailed information, or inefficient data formats?
Anything. When you begin talking about *everything* an enterprise logs, generates, captures, acquires, etc. and subsequently stores then the data formats can seem infinite, which is why it is so difficult to be able to analyze the data because there are file formats to consider, normalization, unstructured data, etc. to contend with. The level of detail depends on what a company desires. Big Data can represent all the financial information they track for bank transactions, the audit data that tracks user login/logout of company workstations, email logs, DNS logs, firewall logs, inventory data (which for a large company of 100k employees can change by the minute), etc.
Are we talking about high-resolution long-term time series, or are we talking about data that is big because it has a complex structure?
A company's data, depending on the app that generates it, may become lower resolution as time goes on, but not always. It's big simply because there is a lot of it and it is ever-growing. The best way to handle even searching data sets at the terabyte and exabyte level is to index them and to use massive computing clusters; otherwise you'll spend forever and a day waiting for the machine to find what you need. That also assumes the data has already been stored in an efficient manner, normalized, and made accessible to an application intended to process that much data, built by companies who are in the Big Data business (such as my employer).
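As a toy illustration of why indexing matters, here is a minimal sketch of an inverted index over log lines. It is not how any particular vendor does it; the log lines, the whitespace tokenizer, and the function names are all made up for the example, and a real system would shard the index across a cluster.

```python
from collections import defaultdict

def build_index(lines):
    """Map each whitespace-separated token to the set of line numbers containing it."""
    index = defaultdict(set)
    for line_no, line in enumerate(lines):
        for token in line.lower().split():
            index[token].add(line_no)
    return index

def search(index, *terms):
    """Return line numbers containing every term, using only index lookups."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)

# Hypothetical log lines, purely for illustration.
LOGS = [
    "login failed user=alice host=ws01",
    "login ok user=bob host=ws02",
    "dns query host=ws01 name=example.com",
]

index = build_index(LOGS)
print(search(index, "login", "host=ws01"))  # {0}
```

The index is built once, and after that each query touches only the entries for its terms instead of rescanning every line, which is the difference between an answer now and an answer next week once the data is in the terabytes.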
Is the data big because it has been engineered so, or is it begging for a more refined system to simplify?
It's big simply because companies generate so much data during the course of a day, month, year, 10 years, etc. On top of what they generate, many of them, such as medical and financial institutions, are held to retention regulations such as HIPAA and SOX. So when they have to store not only what their Security team, HR team, IT dept, etc. require but also what the gov't requires them to collect (which is usually in the form of logs), it just becomes the nature of the beast of doing business. In some cases, like data generated by the LHC in Europe, it has been engineered to be big just because the experiments generate so much data, but a small mom-and-pop business doesn't generate that much, mostly because they don't need it; they don't care about it.
It definitely is begging for a more refined system to simplify it in the form of analytics tools that are built to do just that. Of course, you need a way to collect the data first, store it, process it, and then you can analyze it. After you analyze it you can then act on the data, whether it is showing that your sales are down in your point-of-sale stores that are only in the southeastern US, or your front door seems to get hits on it from Chinese IPs every Monday morning, etc. Each of the collection, storage, processing and analysis steps I mentioned above requires new ways of doing things when we're talking about terabytes and exabytes of data, especially when a single TB of data may be generated every day by some corporations and their analytical teams need to be able to process it the next day, or sometimes on the fly in near real-time. This means software engineers need to find new algorithms to make it all run faster so that companies competing in the Big Data world can sell their products and services to other companies who have Big Data.
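To make the "analyze, then act" step concrete, here is a tiny sketch of the firewall example: counting hits by source country and day of the week. It assumes the records have already been collected and normalized; the field names (`ts`, `country`) and the sample records are hypothetical.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical, already-normalized firewall records (made up for illustration).
RECORDS = [
    {"ts": "2012-03-05T08:15:00", "country": "CN"},  # a Monday
    {"ts": "2012-03-05T09:02:00", "country": "CN"},  # a Monday
    {"ts": "2012-03-06T14:30:00", "country": "US"},  # a Tuesday
    {"ts": "2012-03-12T08:45:00", "country": "CN"},  # a Monday
]

def hits_by_country_and_weekday(records):
    """Count records per (source country, weekday name)."""
    counts = Counter()
    for rec in records:
        day = WEEKDAYS[datetime.fromisoformat(rec["ts"]).weekday()]
        counts[(rec["country"], day)] += 1
    return counts

counts = hits_by_country_and_weekday(RECORDS)
print(counts.most_common(1))  # [(('CN', 'Monday'), 3)]
```

On four records this is trivial; the hard part at a terabyte a day is running the same kind of aggregation across a cluster fast enough for the answer to still matter.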
Just another example of our rights to search and seizure eroding away.
Based on what? Sentinel is an investigation tool, not a surveillance system. You may not like the name but this will help them prosecute and convict interstate criminals faster and cheaper because they will have electronic tools at their disposal to assist in finding patterns and links when a human can't.
Rated insightful for someone who can't spell dinosaurs? What is slashdot coming to I ask? Assuming evolution is true, what would have been the preempting factor that turned some dinosaurs into creatures with wings? Seems a bit far-fetched to me.
By the way, you didn't actually answer whether the *chicken* egg came before the chicken. Let's assume again evolution is true; you only "proved" that some type of egg came before one instance of a chicken. You didn't prove that an actual chicken egg came first, though. What would have happened is that a dinosaur laid an egg that hatched a chicken, which then laid an actual chicken egg that hatched another chicken. Of course, with evolution not actually being a good premise to rely on, it means that the chicken came first.
By the way, what are noncompos and for argument's sake, what dinosaur turned into a chicken?
How can you expect Americans to have aristocracies if you stand in the way of holding back or penalizing the poor!?
Everyone except the poorest people in the U.S. gets penalized more than the poorest people by paying more taxes. Obviously the more you earn the more you are taxed, and therefore the more you are penalized. So in fact the poor are the least penalized when it comes to taxes, and since taxes are usually a large topic area when discussing aristocracies, it seems your statement is simply false.
Abortion = murder, murder = crime, hence abortion = crime. Seems to make sense to me. If we, as a species, protect those outside of the womb we should be giving equal protection to those still inside.
Where there's a will, there's a relative.