
Comment Re:Project Lambda (Score 1) 204

Because editors that perform code refactoring are hard-pressed to keep up with the syntax of the week. When you have 1M people coding against a common base platform, you can quickly build off each other's successes (both open-source and commercial). If EVERYBODY goes their own direction, you get what we've had for the past 50 years in CIS: stagnation, redoing the same old algorithms and libraries over and over again. How many times must I re-learn a CRUD dialog? stdin, curses, X, Windows 3.0, Mac, HTML + tables, HTML + CSS, HTML + Javascript + Ajax, jQuery.xxx, Flash, AWT, Swing, Tk, applets, blackberry-dialogs, android-dialogs, Silverlight, GWT, JSP/JSF/taglets/webworks/Spring-MVC/stripes/seam, Objective-C dialogs? I'm sick of it, I tell you. And being an old fogey, I've missed out on like 10 rails-style builders. I feel like I'm only ever able to write hello-world levels of complexity. Until we can actually start building instead of rebuilding (with the vast majority of human resources), AI is perpetually going to be 100 years away.

I know some people will prefer PHP, Ruby, or C++ (I rule out OS-targeted languages, namely those whose vast wealth of libraries and development support are specific to a given platform, like Objective-C and .NET), but I've felt the most productive with large / advanced algorithms in Java. 100,000 lines of Java is FAR more error-proof than C++, and I would venture to say Ruby/PHP as well (given that they are so loose in syntax you can't validate / refactor them safely). And yes, I know that 40,000 lines of Ruby is equivalent to 100,000 lines of Java, but the lack of safe refactoring is still an issue. I also rule out the lack of modularity of common PHP packages. Perl/Python/Java/C++ allow safe compartmentalization. Perl/Java (and I assume Python) very nicely allow safe module loading (e.g. if I don't have a module, I can gracefully take alternate action), and in Java I can do so safely in multi-threaded environments (less so with Perl/Python). I can package and deploy the code centrally or in a private network at zero extra cost / technical risk (sshd + /etc/passwd + apache == secure Java Maven-module repository).

If I need to create a project for someone, and I don't want to re-invent the wheel on almost any algorithm... If I want to farm it out to 50 developers, but I don't want to invent a project-management bureaucracy... If I need the project to scale to 500 machines, but I don't want to invent a network fabric... If I want to maximize high CPU counts (e.g. 24 per machine) on complex image/video-processing/batch-processing (and I may or may not want to risk C++ linkage)... If I want intricate timing, timeouts, locking... And more importantly, if I want to delegate out to other developers of questionable skill level, and have THEM properly handle threading, locking, timeouts, etc....

Then today, it's a no-brainer for me to recommend Java. Yes, individual languages excel at some aspects of the above (PHP, Ruby, C++), but none do all of them sufficiently, at least not without very careful coordination among all the developers.

But once it gets to the UI, all bets are off, sadly.

Flame away..

Comment Re:Hierarchical File System? (Score 1) 198

mdadm + drbd + lvm + ext4 (throw crypto somewhere in there too). I've found the layers to be frustrating when deviating from the initial install. And I've also found that installation of new machines rarely supports the particular layering model I'm interested in. So I typically have to do a complex post-install reconfiguration. If, instead, the layers were flattened (and supported ZFS-style RAID-5 write-hole elimination), my life would be made significantly easier. I'm not advocating btrfs's particular architecture, but the ability to have the 'layers' work in tandem instead of as invocation-level layers. Define pluggable aspects of block mapping that are part of a common OS operation.

I doubt this would be as efficient as a monolithic flattened file-system, nor as robust as a monolithic or even layered one (as the number of combinations of event-conditions grows factorially). But when I design software, there is a careful balance of layering vs. aspect-orientation. Service X can do 10 things, each of which is configured at load-time as a delegate, of which 4 might be no-ops for any given configuration.

cache-mapping, block filtration (e.g. encryption / checksumming / block striping (RAID-0/5)), write-block consolidation (e.g. disk-geometry elevators / RAID-Z), physical disk/offset block mapping (including remote-node replication), etc.

Comment Re:Job-killing automation (Score 1) 90

If Gov has to raise taxes or print money to pay a worker to dig a ditch, or do a job that could have saved money over a 5-year period through automation (produced domestically), then I'm not sure that job was worth saving, as it constitutes a burden on remaining workers. Now, MOST automation tools today have a large foreign-manufactured component, and thus contribute to trade deficits and don't trade domestic unskilled labor for comparable skilled labor. And many automation systems are far more expensive than a 5-year recoup cost. But you have to take into account the FULL cost of employment, including government sweetheart retirement benefits. Sadly, most government agencies assume they can get an AVERAGE 7% yield on pension funds, and thus find themselves in hot water with monumental debt 30 years later, so I suspect that, job for job, automation is probably cheaper than direct employment. With outsourced employment, it's generally more expensive up-front, but at least you avoid future liabilities.

Obviously my intended response was that automation merely shifts the demand for direct menial labor to skilled labor in constructing/distributing/marketing/installing/managing automation systems, along with educational requirements for the next generation of automation. In general, the added societal productivity, and the propensity to require higher and higher levels of education within an otherwise wealthy society, are IMHO a good thing. I'd rather solve a problem once so that one man can do 1,000 people's jobs, where possible, as it increases the complexity of the type of job that can be accomplished. We've proven we can get to the moon with brute force.. But we're a long way away from producing a life-sustaining biosphere on a satellite.

Government spending is complex. Do you focus exclusively on defense, or infrastructure? Do you subsidize? Do you 'invest' in education? Or, like in the past 10 years, do you exclusively react, because there is no political will to be strategic? Republicans 'punt' by just saying someone else will fix the problem: let's give them more money. Democrats haven't provided cohesive ideas beyond 'spend us out of the recession'.

Comment Re:No Thanks (Score 1) 201

Depends on the working set size. A 4-way RAID-10 controller is like $200. So with 4x2TB at roughly $150/disk, that's:
4TB of storage at $800

For RAID-10 on 500GB hybrid SSD platters with 64GB of SSD cache each, you'd need a 16-way RAID-10 controller: $1k

16 x $300 + $1k = $5,800
That 16-drive array only gives you a randomly allocated 512GB of SSD (maybe a little higher if you interleave which half of the RAID-1 you read from and thus separately allocate the SSD mappings, though this can't be controlled by the OS, so it's largely random).

If, instead, you explicitly allocated indexes / hot tables onto a quartet of 128GB SSDs (say $300 each), then we could have:
4-way RAID controller with 4x2TB drives: $800
+ 4-way RAID controller with 4x128GB SSDs: $1,400
Total: $2,200
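The back-of-envelope arithmetic, using the rough street prices above ($200 for a 4-way controller, $1k for a 16-way, $150 per 2TB HD, $300 per hybrid or SSD drive):

```shell
# Option A: 16 hybrid drives in RAID-10 behind a 16-way controller
hybrid=$(( 16 * 300 + 1000 ))
echo "hybrid RAID-10: \$${hybrid}"        # $5800

# Option B: split arrays, a 4-way controller ($200) in front of each
hd_array=$(( 4 * 150 + 200 ))             # 4x2TB HDs
ssd_array=$(( 4 * 300 + 200 ))            # 4x128GB SSDs
echo "split total: \$$(( hd_array + ssd_array ))"   # $2200
```

Roughly a 2.6x price difference for explicitly placed SSD instead of firmware-guessed caching.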

Less cost per HD failure, less power consumption, and more targeted index disks (SSD) and journaled disks (HD).
So indexes are guaranteed to be more efficient than on the mixed-mode storage, journals are equal in performance, and data disks are slightly slower on pure HDs. We have slightly less SSD capacity, but there's no guarantee the firmware will properly guesstimate what data goes into the SSD (same issue as cache pollution).

You're still going to want 8GB to 128GB of RAM on any DB server. And if you're up to 128GB of RAM, then the SSD is slightly less useful (especially on read-mostly nodes).

For non-DB loads, RAID-5/6 might be a more cost-effective solution, and thus those 64GB sections can start to add up to a usable working set. Hot regions of video metadata files can speed up drastically, while the HDs can potentially be leveraged for large spanning reads/writes (though I don't know if you'd have cache pollution).

Lastly, something like WAFL on a NetApp isn't going to leverage the hybrid at all, because it's specifically journaling all data (e.g. translating random writes into remapped linear writes, with explicit fixed-size SSD mapping tables / journals).

Comment Re:No Thanks (Score 1) 201

Letting the OS do it is great in theory. I think ZFS supports this concept. Btrfs, maybe in 10 years. Maybe MS will support it soon (as a reason to upgrade).
The problem for me as a user of the HW: what if the SSD went bad? What data did I lose? What if I want to upgrade the disk? Previously, the RAID HW or bare OS was very explicit about what went where. Do I trust that I can just clone a disk with a newer partition, or relocate the disk to a new machine? If you think of it like RAID-0, then obviously it's no different. But to the casual user, it seems more complicated, since the HD 'seems' to store all your data, and thus the SSD is just a 'cache', but in reality it's not necessarily fully committed after a hard shutdown. I can just see the user complaints down the road.

For the initiated, I'm sure it's fine though.

As for the laptop 'low power mode': my HDs were usually powered down most of the time. If the HD detects wake-up triggers and moves those data blocks over to SSD, then over time the need to wake up the HD diminishes. So while statistical in nature, I do see value-add.

Comment Re:No Thanks (Score 1) 201

You're implying write-intensity. I don't think SSDs have any faster write speeds than HDs (at least linearly). Further, most write operations can be journaled / buffered. It's the random-read or read-modify-write speed that's the killer.
The question then is: if 4GB of dedicated RAM cache (e.g. on a 6/8GB RAM installation) is insufficient for your RMW-critical workload, then perhaps 64GB is enough (e.g. if no more than 1/10th of your data is critical-path). This seems like a reasonable assumption for today's common problems.

However, personally it's far from enough for any workload I've encountered in 6 years. My critical volume is on the order of 100GB to 500GB (random disk-seeks on multiple TB of indexed data, e.g. hashed indexes). Though if I KNEW a specific table only had 64GB of fast-path retrieval, I could possibly specially allocate its index. It just seems like a lot of work to fine-tune. But it would certainly be nice to know that, unlike with a pure SSD solution, I COULD grow beyond the size constraints of the SSD. I think THAT is the target market (along with fast-boot laptops, etc).

As for write wear on the SSD: Intel at least had some intelligent wear-leveling; I didn't see whether this solution does as well. Given the need for block remapping, I'm assuming the answer is yes.

Comment Re:Parallel programming *NOT* widely needed (Score 1) 196

The problem is that people tend to focus on single-threaded designs for 3rd-party libraries. Then when those libraries get linked into larger libraries (or main apps) which are MT, the whole world comes crashing down. Now you have to treat every function call of the ST library as a critical region.

Thus while YOU may not care about MT, you should strive to make all your code reentrant at the very least, and to whatever degree possible allow contention-free memory access (e.g. ZERO global variables). This future-proofs your code.

Use modern inversion-of-control programming models: don't own your dependencies, but instead require factories for the objects you actually need. Use read-only value-objects where possible, factory-IoC singletons where possible, stack-based allocation, and minimal co-mingled data structures. If you have the ability to leverage garbage collection, do so for co-mingled structures (or maybe explicit reference counting).

This may actually slow your library down and grow its size slightly. But you'll have less coupled / more modular code, which can't be a bad thing. And you'll be more likely to be MT-safe out of the box.

Then if tomorrow you find you can parallelize some aspect (possibly by using message-passing worker threads), or that some 3rd party can do so TO your work, you'll get performance for free.

Comment Re:multiple processors (Score 4, Interesting) 196

While you're correct as a practical matter today, I disagree in theory. OS theory 20 or more years ago was about one very simple concept: keeping all resources utilized. Instead of buying 20 cheap, slow full systems (at a meager $6k each), you can buy one $50k machine and time-share it. All your disk I/O will be maximized, all your CPUs will be maximized, network, etc. Any given person is running slower, but you're saving money overall.

If I have a single 8-core machine but it's attached to a NetApp disk array of 100 platters over a network, then the latency means that the round trip of a single-threaded program is almost guaranteed to leave platters idle. If, instead, I split a problem up into multiple threads / processes (or use async-IO concepts), then each thread can schedule IO and immediately react to IO completion, thereby turning around and requesting the next random disk block. And while async-IO removes the advantage of multiple CPUs, it's MASSIVELY error-prone programming compared to blocking parallel threads/processes.

A given configuration will have its own practical maximum and over-saturation point. And for most disk/network subsystems, 8 cores TODAY is sufficient. But with appropriate NUMA-supporting motherboards and cache-coherence isolation, it's possible that a thousand-thread application suite could leverage more than 8 cores efficiently. I've regularly over-committed 8-core machine farms with 3 to 5 thousand threads and never had responsiveness issues (each thread group (client application) was predominantly IO-bound). Here, a higher number of CPUs allows fewer CPU transfers during rare periods of competing hot CPU sections. If I have 6 hot threads on 4 cores, the CPU context switches leech a measurable amount of user time. But by going hyper-threaded (e.g. doubling the number of context registers), we can reduce the overhead slightly.

Now for HPC, where you have a single problem you're trying to solve quickly/cheaply, I'll admit it's hard to scale up. Cache contention KILLS performance, bringing critical-region execution to near-DRAM speeds. And unless you have MOESI, even non-contentious shared memory regions run at bus speeds. You really need copy-on-write and message passing. Of course, not every problem is efficient with copy-on-write algorithms (e.g. sorting), so YMMV. But this, too, was an argument for over-committing: while YOUR problem doesn't divide, you can take the hardware farm and run two separate problems on it. Each will run somewhat slower, but you get nearly double your money's worth out of the hardware, lowering costs and thus reducing the barrier to entry to TRY to solve hard problems with compute farms.
Amazon EC2, anyone?

Comment Re:You are a renegade. (Score 1) 132

Do you debug and maintain 'cd'-ing between directories? How about opening a log file with a text editor? How about debugging your Excel file? Why not? Because the input and output are reproducible and they are one-time events. If I'm trying to answer the question "how many log files did this application just produce?", I can:
A) open up a stupid gui and manually count them (if I'm lucky, I can sort by extension - but good luck if they're spanning multiple directories)
B) I can do a file-system search for *.log files, and manually count them.
C) Write a one-liner such as:

find -name '*.log' | wc -l

Now if I wanted to add up their file sizes:

find -name '*.log' | xargs ls -l | awk '{ sum += $5 } END { print sum }'

This is the basis of shell programming. BUT when you start adding conditionals, it becomes very unnatural:

for x in $(find -name '*.log'); do [[ -f $x ]] && echo "Found $x"; done

Nothing there denotes the 'if statement'. Enter Perl: the ability to run all these shell-like commands, AND leverage all of awk's and sed's text-processing powers, AND have a full-blown Turing-complete language.

Go through 4 more revisions and you create the one-time glue of the internet.

Perl is a very natural language if you're used to sh, bash, sed, awk, grep and friends. It looks foreign to Windows people because batch files have equally bad syntax. And of course straight programmers who have never had to script in their lives look down on such languages as beneath them, I'm sure, much like menial labor.

Comment Re:It's about the question not Penrose. (Score 1) 729

But you're missing something..
We simultaneously discovered algorithms and electronic switching circuits (in different centuries, even). But the practical application of complex repetitive algorithms ultimately was contingent on the advancement of the physics model: from gears/belts to electro-magnetic solenoids to diode-resistor circuits to field-effect transistors to MASSIVELY parallel transistor farms, and possibly on to quantum state management.

As each technical hurdle of physics was cleared, we were able to explore ever-greater algorithms, until we could write algorithms OF algorithms (Ruby-on-Rails molds most all aspects of CIS from English constructs into organized hyper-complex code, something Lisp/Prolog hasn't seemed to practically achieve). Map-reduce structures can pass tertiary abstract algorithms to both data and algorithm streams.

The order of complexity (allowed by the increasing algorithmic processing capacity adapted to particular classes of algorithms) is slowly allowing us to approach practical computational autonomy. Meaning a proto-brain.

Fundamentally I agree with you that you don't model the software of MS word with differential equations on the circuit board - though I believe you could. BUT both processes are / were ultimately necessary.

Comment Re:It's about the question not Penrose. (Score 1) 729

I think you're overstating things here. By comparison, look at how our evil overlords have enslaved us by learning about gravity. We really can't risk letting them learn about the next level. Sabots into the factory machines, I tell you!!

Plus, I roll my eyes whenever I hear people talk about free will. It is a completely irrelevant question; it can have zero possible impact on your day-to-day activities (or at most an impact equivalent to learning that it's raining outside). Even a resultant decision to commit suicide is more attributable to the emotional stability of the person than anything else. Consider for a moment why you don't go around raping every orifice that moves. I'm pretty sure God has nothing to do with that decision. It's the same reason dogs don't do it: because they learn (often the hard way) that it's not always prudent.

Comment Re:Consciousness is mysterious not weird. (Score 1) 729

Can't tell if you were joking or not. So I'll play along for fun.
"This means either you believe you exist, and if that is the case then you have to solve the mystery of your own existence."

There is no intrinsic requirement to find one's origins.. We're not all salmon. :)

"Or you don't believe you exist, and consciousness and free will are fake illusions."

Some random definition which seems to agree with your argument:
consciousness: aware of one's own existence, sensations, thoughts, surroundings, etc.

Thus consciousness implies existence by definition. But the reverse is not intrinsically true. You are either existent and conscious, non-existent and non-conscious, OR non-existent and falsely conscious.

I can describe to you a character that is fully aware of itself and its surroundings, but then I can tell you afterwards that it doesn't exist. The virtual construct is by all measurable means existent to you, except that it was a lie / fabrication (e.g. a play / cartoon / software program). But most importantly, what DOES exist, the 'story' of the non-existent character, is defined as fully conscious.

Thus the illusion of consciousness is not an illusion to the character, but to the receiver of the fabricated story. HIS consciousness is real; only the projection of his existence has any 'fakeness' (exclusively in the mind of the story-teller, unless and until they share the fakeness of the character with the audience).

Free will is also completely abstract and meaningless here as well. The fake character has no knowledge of this philosophical strife. He acts consistently with his presumably consistent personality and life-struggle, making emotionally impactful, gratifying, or regrettable decisions (or willfully avoiding such decisions). Whether his story embraces the duality of his universe and the story-teller is fully within the control of the story-teller (so long as the story-teller controls the knowledge of the character's existence). Meaning, you as an audience member wouldn't debate with a friend whether their brother ACTUALLY graduated college or not; what this supposed brother did is completely outside of your knowledge.

Why is this relevant?... Two words.. J. C.

Comment Re:You are a renegade. (Score 1) 132

If you learn the syntax inside and out, Perl has some of the most concise verbs I've ever used. 90% of the Perl scripts I've written were one-liners (where a line can exceed 200 characters).

perl -ne 'chomp; (/start/ ... /end/) && $words{$_}++; END { print join "\n", map { "$_ : $words{$_}" } keys %words }' file.txt
