
Comment Re:What is this ? Keep asking the same question (Score 1) 291

This. Most of the workforce would benefit from basic education in all aspects of business. Sales, marketing, finance, project management, business development, etc.

In our neck of the corporate world (software), too few employees understand how business actually functions and what it really takes to make a business work. The current culture of "just build an app and you're set for life" leaves out many of the key steps needed to build a business. As a result, most promising applications go nowhere and most "successful" exits are really just acqui-hires (making the entrepreneur just a well-paid headhunter, which has nothing to do with coding ability).

Simple things like knowing how to develop top-down and bottom-up models of a market would help app developers understand who their users are and how they might generate revenue to continue to fund their app. Even something as simple as understanding that revenue is actually necessary for success is lost on most developers I know.
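To make the top-down vs. bottom-up distinction concrete, here is a minimal sketch. Every function name, parameter, and number in it is made up for illustration; nothing here comes from real market data.

```python
def top_down(total_market_users, addressable_share, expected_penetration,
             revenue_per_user):
    """Start from the whole market and narrow down to expected revenue."""
    addressable = total_market_users * addressable_share
    customers = addressable * expected_penetration
    return customers * revenue_per_user


def bottom_up(reachable_per_channel, conversion_rate, revenue_per_user):
    """Start from the people you can actually reach and build up."""
    reachable = sum(reachable_per_channel.values())
    customers = reachable * conversion_rate
    return customers * revenue_per_user


# Hypothetical inputs, chosen only to show the shape of the calculation.
top = top_down(10_000_000, 0.20, 0.05, 12.0)
bottom = bottom_up({"app_store_search": 40_000, "blog": 10_000}, 0.02, 12.0)
print(f"top-down estimate:  {top:,.0f} per year")
print(f"bottom-up estimate: {bottom:,.0f} per year")
```

When the two estimates land far apart, as they do with these invented inputs, that gap is usually the interesting part: it says the channels you actually have don't reach the market you think you have.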

-Chris

Comment This is a Computer Science Test... (Score 1) 252

... not a programming test. Recursion is a key concept in CS and is the foundation of many techniques and principles. Sure, no one uses it in practice that often, but that doesn't mean you shouldn't know it if you're learning CS.

If you want to train people for job interviews, send them to a trade program. If you want them to understand the field, you have to teach them the fundamentals.

I never use the central limit theorem directly when doing stats, but knowing how it underlies the methods helps provide a better understanding of results.
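For anyone who hasn't seen it in action, a quick simulation shows what the theorem buys you. This is a sketch using only the standard library; the exponential distribution, the sample size of 50, and the seed are arbitrary choices of mine.

```python
import math
import random
import statistics


def sample_means(draw, sample_size, n_samples, rng):
    """Return the mean of each of n_samples samples drawn with draw(rng)."""
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


def draw_exponential(rng):
    # Exponential with rate 1: heavily skewed, population mean 1, sd 1.
    return rng.expovariate(1.0)


rng = random.Random(42)  # fixed seed so the run is repeatable
means = sample_means(draw_exponential, sample_size=50, n_samples=2000, rng=rng)

# The CLT says the sample means cluster around the population mean with
# spread close to sd / sqrt(sample_size), even though the draws are skewed.
print("mean of sample means:", statistics.fmean(means))
print("sd of sample means:  ", statistics.stdev(means))
print("sd / sqrt(n):        ", 1.0 / math.sqrt(50))
```

That sd / sqrt(n) relationship is what sits underneath standard errors and confidence intervals, which is why it helps to know it even if you never invoke it by name.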

-Chris

Comment Re:No experience teaching no particular gift for i (Score 4, Interesting) 94

I have a Ph.D. and am now fully qualified to teach university courses. The funny thing about that is that in the course of getting my Ph.D., I never once had to take a course on how to teach or even teach/TA a course (I was a research assistant the whole time I was in grad school).

I'm an outlier in not having to teach/TA a course in grad school (I did TA as an undergrad, though), but I don't know of any graduate programs that require actual training for teaching.

The person cited in the summary is just as qualified as most Ph.D.s. :)

As for the big bucks, two of my good friends from grad school (both computer scientists) spent their first two years working for free waiting for tenure track positions to open up. They get decent salaries now, but over the course of their careers, it's not what I'd call big bucks.

-Chris

Comment Re:Yep it is a scam (Score 2) 667

For the sake of this discussion, mosquito-borne malaria is a warm weather problem. Increased deaths from cold weather, which was the parent's straw man, occur when it's really cold (sub-freezing). Mosquitoes die when it freezes. Sure, they can be a problem even when it's cold, but not when it's deadly cold.

-Chris

Comment No Camera? (Score 1) 324

How about just remove the camera? That's the creepiest part of Google Glass.

I'm all for exploring the potential of having a display in my line of sight for getting information on demand or for AR applications. You don't need a camera for either of those. For AR, the GPS in the phone gives you position, accelerometers in the headset give you orientation, and a public database of roads and buildings gives the apps spatial awareness. If you want to be able to highlight people or cars, they could 'opt in' to a location sharing feature that publishes their coordinates.
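The geometry for the camera-free approach is not complicated. Below is a rough sketch, assuming the phone supplies latitude/longitude and the headset supplies a compass-style heading in degrees; the function names and the 30 degree field of view are my own placeholders, not anything from Glass.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees
    clockwise from north (0 <= bearing < 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def in_view(user_lat, user_lon, heading_deg, target_lat, target_lon,
            fov_deg=30.0):
    """True if the target lies within the display's horizontal field of view.

    heading_deg is assumed to come from the headset's orientation sensors;
    fov_deg is a hypothetical display field of view.
    """
    bearing = bearing_deg(user_lat, user_lon, target_lat, target_lon)
    # Signed angular difference, wrapped into [-180, 180).
    offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= fov_deg / 2.0


# A building due north of the user is in view when facing north,
# and out of view when facing south.
print(in_view(0.0, 0.0, 0.0, 1.0, 0.0))
print(in_view(0.0, 0.0, 180.0, 1.0, 0.0))
```

The target coordinates would come from the public map database or from someone's opted-in location feed; no image processing is involved anywhere.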

Battery life would probably be much better w/o the camera as well.

-Chris

Comment Re:Yep it is a scam (Score 2, Informative) 667

31,000 extra deaths due to cold weather and the flu in 2013:

http://www.dailymail.co.uk/new...

584,000 deaths due to malaria in the same year:

http://www.who.int/features/fa...

Malaria is transmitted by mosquitoes, which rely on warm weather to live. And that's just one warm-weather-related cause of death that will go up as the planet warms. Therefore, a warming planet will be a deadlier planet than a cooling planet.

Comment Re:Proprietary (Score 3, Interesting) 648

Look, I'm a huge Python and open source advocate and use it for almost everything I do, but the "proprietary" argument doesn't hold any water. VB, and Microsoft's languages in general, have seen more long term support than any open source language. They have consistently had a level of commitment to backwards compatibility and long term support that no open source language implementation can match. Sure, with an open source language you can fix problems yourself*, but if there's good support from the vendor, as is the case with MS, you never need to.

You're going to need to give a much better reason than "proprietary" to discount the VB argument. There are lots of good ones, but this isn't one.

-Chris.

*though I'd argue that there are only a few of us out there with the chops to actually do that

Comment HTML5 Client (Score 1) 264

Have you considered a Web client? HTML5 + JavaScript + [your favorite server language and ORM]* is a good development stack. It also has the benefit of zero-install for your clients.

We develop complex scientific software and made the decision to go HTML/JS for all our client code a few years ago and haven't regretted it. It takes a little bit of learning the libraries, but there are some good mature ones available to streamline development.

-Chris

*I've used Django and Tornado+SQLAlchemy extensively for this.

Comment Re:Hadoop needs a fairly specialized problem (Score 4, Interesting) 34

MPI is definitely for very specific problems and really isn't what I'd consider "conventional" cluster programming. Most people associate MPI with clusters and parallel computing, but if you look at what's actually running on most big clusters, it's almost always just batch jobs (or batch jobs implemented using MPI :) ).

Interestingly, all my examples were on genomics problems (processing SOLiD and Ion Torrent runs). We started going down the Hadoop path because we thought it'd be more accessible to the bioinformaticians. But, once we saw the performance differences (and, importantly, understood the source of them) we abandoned it pretty quickly for more appropriate hardware designs (fast disks, fat pipes, lots of RAM, and a few Linux tuning tricks -- swappiness=0 is your friend). Incidentally, GATK suffered from these same core performance problems. The original claims that the map-reduce framework would make GATK fast were never actually tested, just simply claimed in the paper. GATK's performance has always been orders of magnitude worse than the same algorithms implemented without map-reduce. But, it's from the Broad, so it must be perfect. ;)

I like Sector and Sphere. We also did a POC with them and they performed much better than the alternatives. Unfortunately, they also required very good programmers to use effectively.

Good stuff!

-Chris

Comment Re:Ok, I give up (Score 4, Interesting) 34

More importantly, why did we need Hadoop when we already had [your_favorite_language] + [your_favorite_job_scheduler] + [your_favorite_parallel_file_system]?

Seriously, standard HPC batch processing methods are always faster and easier to develop for than latest_trendy_distributed_framework.

The challenges of data at scale* are almost all related to IO performance and the overhead of accessing individual records.

IO performance is solved by understanding your memory hierarchy and designing your hardware and tuning your file system around your common access patterns. A good multi-processor box with a fast hardware RAID, decent disks, and sufficient RAM will outperform a cheap cluster any day of the week and likely cost less (it's 2015, things have improved since the days of Beowulf). If you need to scale, a small cluster with InfiniBand (or 10 GigE) interconnects and Lustre (or GPFS if you have deep pockets) will scale to support a few petabytes of data at 3-4 GB/s throughput (yes, bytes, not bits). You'd be surprised what the right 4 node cluster can accomplish.

On the data access side, once the hardware is in place, record access times are improved by minimizing the abstraction penalty for accessing individual records. As an example, accessing a single record in Hadoop generates a call stack of over 20 methods from the framework alone. That's a constant multiplier of 20x on _every_ data access**. A simple Python/Perl/JS/Ruby script reading records from the disk has a much smaller call stack and no framework overhead. I've done experiments on many MapReduce "algorithms" and always find that removing the overhead of Hadoop (using the same hardware/file system) improves performance by 15-20x (yes, that's 'x', not '%'). Not surprisingly, the non-Hadoop code is also easier to understand and maintain.
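To show what I mean by a simple script, here is a sketch of the classic count-by-key job written as a plain loop over records. The tab-separated layout and the sample records are invented for the example; this is not the code from my experiments.

```python
import io
from collections import Counter


def count_keys(lines, field=0, sep="\t"):
    """Count records per key by streaming over lines one at a time.

    `lines` can be any iterable of strings, such as an open file handle,
    so records are read and processed without intermediate layers.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        counts[line.split(sep)[field]] += 1
    return counts


# In-memory stand-in for a file of tab-separated records (made-up data).
records = io.StringIO("chr1\t100\nchr1\t200\nchr2\t50\n")
for key, n in count_keys(records).most_common():
    print(f"{key}\t{n}")
```

Passing an open file handle instead of the StringIO works the same way. To spread the work out, split the input into chunks, run one copy per chunk under your job scheduler, and sum the resulting counters at the end.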

tl;dr: Pick the right hardware and understand your data access patterns and you don't need complex frameworks.

Next week: why databases, when used correctly, are also much better solutions for big data than latest_trendy_framework. ;)

-Chris

*also: very few people really have data that's big enough to warrant a distributed solution, but let's pretend everyone's data is huge and requires a cluster.

** it also assumes the data was on the local disk and not delivered over the network, at which point, all performance bets are off.

Comment Re:cis and mi regulation is not "bad" code (Score 2) 14

For small genomes, yes, but for large genomes, there is a lot of "unused" material.

Only about 6-10% of the human genome is transcribed into RNA, either the protein-coding kind or the non-coding types used in regulation. (Small genomes are almost always entirely coding and even include overlapping coding regions; large genomes are the ones that have "junk" DNA in them.)

Transcription is most closely related to a processor reading machine code and doing something with it. In a computer program, we know that we can safely remove dead code paths and the code will still function. This is not true for DNA. Remove a portion of someone's genome and they usually die.

It's much more likely that the "junk"/"noise" regions of the genome are structural and help the DNA conform so the chromosomes can specialize for different functions. DNA folds differently depending on the cell type in multicellular organisms. Because the nucleus of a cell is a fairly crowded place, the way the DNA folds determines which sites on it are even accessible for transcription. Muscle cells expose one set of gene coding regions, fat cells expose another.

Taken from this perspective, large genomes are more akin to an origami fortune teller than machine code. Depending on the series of folding/unfolding events, a specific fortune is revealed. The fortunes are encoded directly onto the paper, but the paper also forms the structure used to access the fortunes. Another actor reads the instructions and acts on them (a person in the origami case or polymerase for DNA).

Comment Re:Not in the hospital I work at. (Score 1) 73

I do similar systems for genomics. Despite all the hype around cloud services in our space, we're finding more interest in local copies of the standard databases with links out to the canonical sources as needed. The local copies keep hospital IT happy and ensure access if the network is wonky.

And, it turns out that most clinicians are comfortable sorting through database records on their own and don't like magic algorithms attempting to do it for them. Access to the basic data is what they want.

-Chris
