Comment: Re:Big Data != toolset (Score 1) 100

by Rob Fielding (#49691395) Attached to: Is Big Data Leaving Hadoop Behind?
Actually, the biggest problem with an RDBMS and similar tools is that you are expected to mutate data in place, after mashing it into a structure optimized for in-place updates. Most of the zoo of new tools are about supporting a world in which incoming writes are "facts" (ie: append-only, uncleaned, unprocessed, and never deleted), while all reads are transient "views" (from combinations of batch jobs and real-time event processing) that can be automatically recomputed (like database indexes).
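
To make the facts-vs-views split concrete, here is a minimal Rust sketch; the Event type and balance view are hypothetical stand-ins for real batch/stream machinery:

```rust
// Facts are appended once and never updated or deleted.
struct Event {
    user_id: u64,
    delta: i64,
}

// The view is a pure function of the log, so it can be thrown away
// and rebuilt at any time, exactly like a database index.
fn recompute_balance(log: &[Event], user_id: u64) -> i64 {
    log.iter()
        .filter(|e| e.user_id == user_id)
        .map(|e| e.delta)
        .sum()
}

fn main() {
    let mut log = Vec::new(); // writes are append-only
    log.push(Event { user_id: 1, delta: 100 });
    log.push(Event { user_id: 1, delta: -30 });
    println!("balance: {}", recompute_balance(&log, 1)); // balance: 70
}
```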

Comment: Re:Big Data != toolset (Score 1) 100

by Rob Fielding (#49690663) Attached to: Is Big Data Leaving Hadoop Behind?
Except, if you are talking about a centralized database tool, you already know that the default design of "everybody writes into the centralized SQL database" is a problem. Therefore, people talk about alternative tools, which are generally designed around a particular set of data structures and algorithms as the default cases. A lot of streaming-based applications (ie: log aggregation) are a reasonable fit for relational databases except for the one gigantic table that is effectively a huge (replicated, distributed) circular queue: it eventually gets full and must insert and delete data at the same rate. Or the initial design already rules out anything resembling a relational schema, etc.
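
A toy Rust sketch of that table-as-queue shape; LogTable and its capacity are illustrative, a single-node stand-in for what would really be replicated and distributed:

```rust
use std::collections::VecDeque;

// Once the table is full, every insert evicts the oldest row,
// so deletes happen at exactly the same rate as inserts.
struct LogTable {
    rows: VecDeque<String>,
    capacity: usize,
}

impl LogTable {
    fn new(capacity: usize) -> Self {
        LogTable { rows: VecDeque::with_capacity(capacity), capacity }
    }

    fn insert(&mut self, row: String) {
        if self.rows.len() == self.capacity {
            self.rows.pop_front(); // delete at the same rate we insert
        }
        self.rows.push_back(row);
    }
}
```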

Comment: Re:let's be real for a second (Score 5, Informative) 429

by Rob Fielding (#49639331) Attached to: Why Companies Should Hire Older Developers
That's a pretty ridiculous statement. My actual experience says just the opposite. I work at a security company that is largely made up of guys who just got out of Israeli SIGINT (their mandatory service). The older guys write kernel code, know what C compiles to, and see the vulnerabilities intuitively. The new ones have quite a bit more experience in high-level languages, while being almost oblivious to the abstraction breakage that leads to security holes. At best, I'd say that the older developers get stuck dealing with older code bases (that are making the money) and tools (because the newer ones can't deal with it anyway). But on security: prior to the mid 1990s, everybody in the world seemed to be working on a compiler of some kind. That deep compiler knowledge is the most important part of designing and implementing security against hostile input; ie: LANGSEC.

Comment: Re:Well done! (Score 1) 540

Perhaps not directly. But the difference between public schools and private schools is impossible to overstate, and it is strongly correlated with households that have one full-time working parent and one part-time or flex-schedule parent. The tuition (almost regardless of how much it is) immediately filters out financially overwhelmed and uninvolved parents. Then, even among the parents who can afford it, some schools also have involvement quotas that will cause a pair of full-time working parents to drop out. Morale and motivation in private schools are extremely high, akin to that of people working in good jobs, which counts for about two or three grade levels. The end result is that you have a kid who is surrounded by children who know nothing other than a 7-day week of school: getting up at 5am to wrap up missed studies, music lessons, sports. Even if they do spend a bit of time goofing on iPads and watching TV, it is nothing like what happens with parents who can only show up long enough to sleep and go back to work. Even people who are poor will try to move their kids into the better school districts. A few will even break the law to do it, with few regrets when they get caught. (You can get sued by the county for doing this.)

Comment: Re:It depends (Score 1) 486

by Rob Fielding (#49337585) Attached to: No, It's Not Always Quicker To Do Things In Memory
The silliness of the paper is that there is no reason at all to keep previously submitted chunks in memory; it's as if somebody discovered that naive string appends are quadratic in memory allocation. On day 2 of their first job, everybody learns to just append strings to a list and either flatten them into the one big string you need at the end, or evict the head of the list out somewhere (disk?) once a reasonable chunk size (optimize for block size) or amount of time (optimize for latency) has passed. I would imagine that in this case, you should simply queue up writes into a constant-sized, pre-allocated buffer in memory, and flush to disk as soon as it reaches the size of a disk block.
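
A minimal Rust sketch of that policy, assuming a 4 KiB block size; Rust's std::io::BufWriter implements essentially the same idea:

```rust
use std::io::{self, Write};

const BLOCK: usize = 4096; // assumed disk block size

struct BlockWriter<W: Write> {
    out: W,
    buf: Vec<u8>, // pre-allocated, stays around one block in size
}

impl<W: Write> BlockWriter<W> {
    fn new(out: W) -> Self {
        BlockWriter { out, buf: Vec::with_capacity(BLOCK) }
    }

    fn append(&mut self, chunk: &[u8]) -> io::Result<()> {
        self.buf.extend_from_slice(chunk);
        if self.buf.len() >= BLOCK {
            self.out.write_all(&self.buf)?; // evict to disk
            self.buf.clear(); // memory use stays constant
        }
        Ok(())
    }
}
```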

Comment: What an Idiotic Paper (Score 1) 486

by Rob Fielding (#49337491) Attached to: No, It's Not Always Quicker To Do Things In Memory
Holding the string in memory serves no purpose at all if you are just appending to it. Frankly, this += of strings issue is the most common "Smart but Green Self-Taught" versus "Computer Sci grad" problem you will see with new hires. Appending strings can be O(n^2) when the strings are immutable, and that applies to most high-level environments. Even Metasploit had this issue at one point, and it was written by some very smart people. So everybody learns to keep appending to a list, and then flatten it to a string at the end. But the tie-in with disk makes the paper totally dumb: if you won't be reading the queue of string chunks, just flush them to disk immediately so that the code runs in constant space, relieving the memory allocator.
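
A quick Rust sketch of the two patterns, using format! to mimic immutable-string += semantics:

```rust
// With immutable-string semantics, the += loop copies everything
// built so far on every pass: O(n^2) work overall.
fn quadratic(chunks: &[&str]) -> String {
    let mut s = String::new();
    for c in chunks {
        s = format!("{}{}", s, c); // reallocates and recopies the whole prefix
    }
    s
}

// Keep the chunks in a list and flatten once at the end: O(n) overall.
fn linear(chunks: &[&str]) -> String {
    chunks.concat()
}
```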

Comment: Re:Number 5 (Score 1) 51

by Rob Fielding (#47806675) Attached to: IEEE Guides Software Architects Toward Secure Design
By what technical means do you prevent data leakage, though? You need to specify what the system (and its users) will NOT do. Defending against bad input (and bad state transitions) is the foundation for everything else, because otherwise there is no technical means of enforcing any other security property. The game of the attacker is to reach and exploit states that are undefined or explicitly forbidden. Think of the Heartbleed bug as an example of a failure in 5 mooting 6: bad input causes a web server to cough up arbitrary secrets, which can then be used to violate every other security constraint. As for 5 mooting everything, including data leakage protections: SQL injections can be used to extract sensitive data out of web sites (ie: SilkRoad user lists presented back to the administrator with ransom demands). I work on a data leakage protection system, and it's based on earlier intrusion detection and prevention systems for a reason. I regard intrusion detection and intrusion prevention systems as essentially trying to force a fix of number 5 over a zoo of applications that didn't get it right; they amount to taking action on a connection that looks like it's not safely following protocol.

Comment: Number 5 (Score 1) 51

by Rob Fielding (#47787509) Attached to: IEEE Guides Software Architects Toward Secure Design
Number 5 is the most important. It is about defending against bad input. When an object (some collection of functions and mutable state) has a method invoked, the preconditions must be met, including message validation and the current state. A lot of code has no well-defined interfaces (just global state). Some code has state isolated behind functions, but no documented (let alone enforced) preconditions. The recommendation implies a common practice in strongly typed languages: stop using raw ints and strings. Consume input to construct types whose very existence proves that they passed validation (ex: a type "@NotNull PositiveEvenInteger" as an argument to a function, etc). DependentTypes (types that depend on values) and DesignByContract are related concepts. With strong enough preconditions, illegal calling sequences can be rejected by the compiler and runtime as well. If secure code is ever going to be produced on a large scale, people have to get serious about using languages that can express and enforce logical consistency.
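
A minimal Rust sketch of that idea, with a hypothetical PositiveEvenInteger newtype whose constructor is the only path past validation:

```rust
// A value of this type cannot exist unless it passed validation,
// so downstream functions inherit the precondition for free.
struct PositiveEvenInteger(u64);

impl PositiveEvenInteger {
    fn parse(raw: i64) -> Result<Self, String> {
        if raw > 0 && raw % 2 == 0 {
            Ok(PositiveEvenInteger(raw as u64))
        } else {
            Err(format!("{} is not a positive even integer", raw))
        }
    }
}

// No re-validation needed here: the argument's existence is the proof.
fn halve(n: &PositiveEvenInteger) -> u64 {
    n.0 / 2
}
```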

Comment: Re:Whoa 1.3x (Score 1) 636

by Rob Fielding (#47160009) Attached to: Apple Announces New Programming Language Called Swift
Bad algorithms are the major difference between a totally self-taught programmer and a programmer who has learned some actual computer science. Yes, "1000x speedup" is not ridiculous at all. Use the wrong algorithm, and you can make this speedup number as large as you want by feeding it a larger data set.
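
A toy Rust illustration with made-up helper names: the same membership question answered with a linear scan versus a hash set. Scale up the inputs and the gap between the two grows without bound:

```rust
use std::collections::HashSet;

fn count_hits_slow(needles: &[u64], haystack: &[u64]) -> usize {
    // linear scan per needle: O(n * m) comparisons
    needles.iter().filter(|&n| haystack.contains(n)).count()
}

fn count_hits_fast(needles: &[u64], haystack: &[u64]) -> usize {
    // build a hash set once, then each lookup is O(1): O(n + m) total
    let set: HashSet<&u64> = haystack.iter().collect();
    needles.iter().filter(|&n| set.contains(n)).count()
}
```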

Comment: Or deal with pointer arithmetic properly (Score 1) 125

by Rob Fielding (#47124367) Attached to: Imparting Malware Resistance With a Randomizing Compiler
This is only an issue because of unchecked pointer arithmetic. With garbage-collected and range-checked references, you can't take advantage of the co-location of data. In a JVM, if you try to cast an address to a reference to a Foo, it will throw an exception at the VM level. Indexing arrays? Push the index and the array on the stack, and it throws an exception if the index isn't in range when it gets an instruction to index it. In these cases, pointer arithmetic isn't used.

In some contexts, you MUST use pointer arithmetic. But if the pointer type system is rich enough (see Rust), then the compiler will have no trouble rejecting wrong references, and even avoiding races involving them. In C, an "int*" is not a pointer to an int. It is really a UNION of three things until the compiler proves otherwise: "ptr|null|junk". If the compiler is satisfied that it can't be "junk", its type is then a union of "ptr|null". You can't dereference this type; you must have a switch that matches it to one or the other. The benefit is that you can never actually deref a null pointer, and you get exceptions at the point where the non-null assumption began, rather than deep inside some code at some random usage of that nullable variable.

As for arrays, if an array "int[] y" is claimed, then that means that y[n] points to an int in the same array as y[0] does. Attempts to dereference should be proven by the compiler or rejected, even if that means that, like the nullable deref, you end up having to explicitly handle the branch where the assumption doesn't hold.

You can't prove the correctness of anything in the presence of unlimited pointer arithmetic. You can set a local variable to true, and it can be false on the next line before you ever mention it, because some thread scribbled over it. Pointers are ok. Pointer arithmetic is not ok, except in the limited cases where the compiler can prove that it's correct. If the compiler can't prove it, then you should rewrite your code; or, in the worst case, annotate that area of code with an assumption that doubles as an assert, so that you can go find the place right away when your program crashes.
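
Rust's Option is exactly that "ptr|null" union made explicit; a minimal sketch (first_even is a made-up example):

```rust
// The return type admits the "null" case as a variant that the
// caller is forced to match before any dereference can happen.
fn first_even(xs: &[i32]) -> Option<&i32> {
    xs.iter().find(|&&x| x % 2 == 0)
}

fn main() {
    let xs = [1, 3, 4];
    // The compiler rejects a bare dereference of the Option; both
    // branches must be handled where the non-null assumption begins.
    match first_even(&xs) {
        Some(x) => println!("found {}", x),
        None => println!("no even element"),
    }
}
```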

Comment: Re:Average (Score 1) 466

Yes, that is what I meant. You will be parsing data almost every day. If you are not good at this, then you will be responsible for the creation of a huge number of security holes, and never-ending quality issues. Complexity theory (ie: algorithms), probability (ie: measurement), and a basic understanding of language theory (the core of compiler writing) are the key CS-specific skills that excellent developers share. You do need to be good at math, but not necessarily the kind of math that is emphasized in the academic world.

If you already have a couple of good developers on a team and add a guy who can't be trusted to edit the code without breaking its performance, correctness, or usability, he will both cause the good developers to leave and start creating maintenance burdens. That's why I say that the minimum competence bar to get onto a programming team is actually pretty high.

Now if you have a job which isn't primarily programming, but are writing code to *automate* a tedious process that you would otherwise have to do manually; then go ahead and write horrible code. Maybe you have nothing to lose in that situation, or the consequences of maintenance fall on somebody else.

Comment: Re:Average (Score 2) 466

Secure, correct input reading is a compiler problem. Almost all security holes result from haphazard file and socket reading. In any case, people who can only do simple things are not your 10x developers. Due to the maintenance costs of having bad code in the first place, a lot of places would rather do without a developer than have one who writes code that is just ok.

Comment: Re:Average (Score 4, Insightful) 466

Self-taught developers do need to pick up enough math to stop writing code that dies under its first realistic load (complexity theory). You can't write a good compiler without being able to learn the math. Computer graphics and statistical programming require absurd amounts of math to do well (to be the 10x developer). I agree that the academic background isn't necessarily useful, and we see good people who went from high school to military to industry. But Computer Science as it now stands has serious rigor problems, which is why it is undergoing a serious security crisis, and the only light at the end of the tunnel is languages that will reject logically inconsistent input that won't follow a specification. That means that at some point, programming is going to look a bit more like Haskell, and require some ability to write code that meets its spec the first time, rather than quickly building something that undergoes haphazard maintenance for years on end.

Comment: Re:Average (Score 5, Insightful) 466

Even when you get good programmers, projects are often managed to push as many amps through a developer as possible. When that happens to a team, more difficult things do get accomplished, but the code often still looks like it was written by an amateur. Bad code ends up being like credit card debt that never gets paid off, while the amount owed continues to climb until bankruptcy occurs: the bad code wastes a percentage of everybody's time every day, and the mess compounds as everybody works around it. So it is often better to just not hire a developer who isn't "the one" (who is often worth about 10 normal people). We used to do interviews involving the entire office, and generally required unanimous approval; maybe 50 to 100 candidates went through phone screens and actual interviews to hire one person. I think there is an oversupply of people trying to specialize in programming, and most people should be learning programming as a supplemental skill for a specific business.

Comment: Re:Lol wut (Score 3, Interesting) 128

by Rob Fielding (#46830535) Attached to: Band Releases Album As Linux Kernel Module
This! I had to reload my psmouse module halfway through the album. Apparently that is not so uncommon on Dell laptops. I read the source code, and it was trivial. But the very large byte array makes this generally inadvisable, precisely because a stray jump into it could land where the audio bytes happen to be valid machine code. I also discovered it from tweets coming out of thegrugq, etc.
