Distributed Computing and the Human Genome Project

Posted by Cliff on Mon Nov 29, 1999 04:49 AM
from the breaking-the-genetic-code-instead-of-rc5 dept.
I'm sure most of you have heard about the Human Genome Project by now and how it is working to map our DNA. Apparently there is now a race going on, with corporations performing similar experiments, except with the intent of patenting the results. Now troc is wondering if another distributed computing effort might be in order. What do you all think? Click below for troc's actual question.

troc asks: "I was watching a TV programme on UK TV last night about the Human Genome Project and how there was a race to sequence and publish the whole thing before the private companies do it and patent the sequences. Basically lasers are used to break up the strands, these are then read and fed into a computer that tries to match the bits up with other bits like a giant jigsaw puzzle. This requires a lot of computing time.

Is this an opportunity for the open source movement to help decode the sequences and publish the whole thing before it's patented?

<soapbox>

I, for one, don't like the idea of a private company owning my gene sequences. They will be able to limit the use of these so only really rich pharmaceutical companies will be able to develop drugs etc. and then sell them at huge profits, which isn't really for the benefit of mankind blah blah blah.

</soapbox>"

I agree. I don't see how information like this can be patented. There is nothing truly proprietary about it, and it would do more good in the public where the benefit can truly be felt.

  • Distibuted computer projects... OT by famfurnell (Score:2) Sunday November 28 1999, @11:59PM
  • by SEAL (88488) on Monday November 29 1999, @12:00AM (#1498031)
    Patents, in general, have really taken a nose dive since the personal computer achieved widespread use. The original intent of a patent was to allow an inventor to come up with an idea and protect it for a period of time. Whether he profits from it or sits on it is then up to that inventor.

    However, with the computer age, the speed of (dare I say) innovation has been astounding. This has produced two detrimental effects. First, the patent examiners simply don't have the niche expertise to scrutinize patents. I'm sure most of us have seen some of the idiotic patents out there. Second, the time span of a patent has become too cumbersome. By the time the patent expires, the invention is often useless.

    I sincerely hope that this particular project will be placed under a HUGE spotlight when the patent requests inevitably filter in. I have a feeling it won't hold up, and at the very least, not in some countries.

    However, keep in mind that this is scientific information about a human being, not software / computer advances. In that regard, a patent will be cumbersome, but not quashing. The patent (if granted) WILL expire someday. And I'm fairly certain that the information will still be very important and valuable when that day arrives.

    Of course I'm all for beating the would-be patenters to the punch, if possible.

    Best regards,

    SEAL

  • Money by talldark (Score:1) Monday November 29 1999, @12:01AM
  • Sure this is ridiculous by Bouglou (Score:1) Monday November 29 1999, @12:01AM
  • Another Distrubuted Project by Keefesis (Score:1) Monday November 29 1999, @12:03AM
  • by reve (59221) on Monday November 29 1999, @12:05AM (#1498036)
    Okay, before everyone hops on this really popular anti-patent train, let's make sure we note that the sequences can't be patented. Yes, independent companies are gonna beat out the Human Genome Project and have been filing patents. But the patents aren't on the sequences themselves, they're on applications. Whether these applications have to do with more efficient methods of genome-unraveling or whether they have to do with specific uses of the patterns they've found, it's NOT the actual sequences.

    In a number of countries it's already quite specifically illegal to attempt to put intellectual property restraints on anything involving human genes. The US is considering some laws as well, but let's just get all the facts straight before panicking, okay?

  • by ghoti (60903) on Monday November 29 1999, @12:08AM (#1498037) Homepage
    Well I don't think anybody will say "No, let's not do it, let the big bad corps patent our genes!!".

    The only problem I see here is that developing a distributed client for this takes a lot of time and effort --- and it is one which definitely cannot be open-source!

    Two reasons:

    • False results. If the data format etc. are known, it's possible to feed the servers bogus results, which could lead to inconsistencies in the data base. This might even destroy results that are already there (okay, this problem also exists with closed source stuff like SETI@Home, I know).
    • Data Theft. An open source program could be modified by Big Bad Corporation Inc. to simply harvest raw data and feed it into their own computers, thereby gaining information they would otherwise have to find themselves. Granted, they won't have as much computing power, but when they have their own and the stolen data, they're still saving time. And I am not sure if enough data is produced to keep hundreds of thousands of computers occupied (see the problems SETI@Home had in the beginning).

    So, sorry, folks, but I believe this is one of the few things that open source clearly is not suited for. But it would be kinda cool to have a proggy running on my machine that messed with genes ... ;-)

  • Prior Art? by JohnG (Score:2) Monday November 29 1999, @12:11AM
  • Patenting genes? by Anonymous Coward (Score:1) Monday November 29 1999, @12:11AM
  • Use of patent by Anonymous Coward (Score:1) Monday November 29 1999, @12:12AM
  • Patents - just a few ideas by CormacJ (Score:2) Monday November 29 1999, @12:15AM
  • DeCode Genetics by lawn_ornament (Score:2) Monday November 29 1999, @12:20AM
  • HGP almost completed; also, NIH computers? by The_Messenger (Score:2) Monday November 29 1999, @12:20AM
  • Re:This can't be open source! by flux (Score:2) Monday November 29 1999, @12:25AM
  • by _Marvin_ (114749) on Monday November 29 1999, @12:27AM (#1498046)
    Of course the seq's themselves can't be patented.
    Otherwise anyone holding such a patent would be
    (AFAIK) entitled to control the reproduction of
    the sequences, that is, since we are constantly
    reproducing them in our bodies he could charge
    us for letting us live...
    Now, this would make patent law a satire just too obviously.
    Still, (again, AFAIK, correct me, if I'm wrong)
    patents on gene sequences (that is, their
    applications) have a new quality: They do not
    cover applications that the patent holder has
    thought of, they cover all applications that
    become possible only if you know that gene
    sequence.
    If I remember it correctly, there are already
    cases where companies hold patents on certain
    proteins in our bodies (again, not the proteins
    themselves but any of their applications) and
    you are not allowed to TEST for these substances
    without paying them license fees, even if you're
    using a completely new testing method you developed on your own.
  • Sex = piracy? by vaxer (Score:2) Monday November 29 1999, @12:27AM
  • Software Patents in EU by Anonymous Coward (Score:2) Monday November 29 1999, @12:28AM
  • by Kingpin (40003) on Monday November 29 1999, @12:30AM (#1498049) Homepage

    All this could be done so much more easily. Use applets - people do not have to understand anything at all in order to help out on a project like this. No need to install obscure clients and what have you. I think the only good use of applets is for easy distributed computing.

  • by Lars Arvestad (5049) on Monday November 29 1999, @12:31AM (#1498050) Homepage Journal

    Data Theft. An open source program could be modified by Big Bad Corporation Inc. to simply harvest raw data
    and feed it into their own computers, thereby gaining information they would otherwise have to find themselves. Granted, they won't have as much computing power, but when they have their own and the stolen data, they're still saving time. And I am not sure if enough data is produced to keep hundreds of thousands of computers occupied (see the problems SETI@Home had in the beginning).

    The Human Genome Project is extremely open. They try to make all data public as soon as possible, making patents impossible. So data theft is not an issue here.

    False results might be a problem, but I would expect it to be relatively cheap (computationally seen) to check a solution to see if it is valid.
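A minimal sketch of why checking can be cheap, assuming (purely for illustration) that a client reports a claimed suffix/prefix overlap between two reads. The function name, the report format and the mismatch tolerance are hypothetical and are not taken from any real Human Genome Project software. Verifying one claim is a single linear pass over the overlapping region, whereas finding the overlap in the first place means comparing many pairs of reads.

```python
def overlap_is_valid(left: str, right: str, overlap_len: int,
                     max_mismatches: int = 0) -> bool:
    """Check a claimed overlap: the last overlap_len bases of `left`
    should match the first overlap_len bases of `right`."""
    # Reject claims that cannot possibly fit inside both reads.
    if overlap_len <= 0 or overlap_len > min(len(left), len(right)):
        return False
    suffix = left[-overlap_len:]
    prefix = right[:overlap_len]
    # Count positions where the two reads disagree.
    mismatches = sum(1 for a, b in zip(suffix, prefix) if a != b)
    return mismatches <= max_mismatches
```

A server could run this on every submitted result, or on a random sample, before merging anything into the shared data.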

    A distributed (open source) effort will probably not happen because a computation like this is more difficult to distribute than trying crypto-keys etc.

    Lars

  • by ewanb (18483) on Monday November 29 1999, @12:34AM (#1498052) Homepage
    There are some good open source genome projects for doing this efficiently - and we do welcome help of any kind. Here are some open source projects which I know about or work on:

    • ensembl [ebi.ac.uk] is an open source genome project designed to get as much data and software into the public domain as possible
    • EMBOSS [sanger.ac.uk]
    • bioperl [perl.org]
    All these are well-backed, strong open source projects with different strengths. Every time genome stuff comes up on Slashdot I try to point these things out to people, but everything gets lost in the noise about people $%!"'ing on about patents (generally without a lot of knowledge!).

    Anyway - check out these projects for more information about real open source efforts in biology.

  • Re:This can't be open source! by ianezz (Score:2) Monday November 29 1999, @12:36AM
  • Re:Sick by wangi (Score:1) Monday November 29 1999, @12:41AM
  • Re:perhaps this will be a wake-up call by Courier (Score:1) Monday November 29 1999, @12:43AM
  • Utter BS. by jimmyCarter (Score:1) Monday November 29 1999, @12:50AM
  • Re:This was my idea. by jimmyCarter (Score:1) Monday November 29 1999, @12:51AM
  • by jw3 (99683) on Monday November 29 1999, @12:53AM (#1498059)
    Hello, my name is January and the group in which I am doing my Ph.D. thesis sequenced a bacterial genome (Mycoplasma pneumoniae [uni-heidelberg.de]) in 1996. Since we are into genomics, transcriptomics and all the other -omics, I know at least a little about the way it works - although on a much smaller scale.

    First issue: could distributed computing help? My answer is a brief "no". First, the bottleneck is on the experimental side - getting the sequences, and not putting them all together. Second, although you need quite a lot of computing power to do so, much of the job must be revised and checked by humans, i.e. there is a lot of skilled manual work to do - you have to have "an eye" for the sequences. But the first point is more important.

    Now, TIGR [tigr.org], the commercial alternative to the Human Genome Project, has sequenced more organisms than any other scientific group in the world. J. Craig Venter seems to be a very efficient and hard-working guy. Even if you don't like the idea of making money with patents in this area, the scientific community owes him a lot - he was the one to sequence the first organism, to sequence Helicobacter pylori and many, many others. On the other side... you know, when the M. pneumoniae sequence was about to be published, it was supposed to be the first Mycoplasma sequence. But Venter was faster with Mycoplasma genitalium - and he kept it quiet, so no one involved in sequencing those organisms actually knew there was a race. Now Venter has claimed to be able to complete the human genome with much less effort and much less $$, and considerably faster than the HuGeP. I'm not sure whether he is able to do so or not, because it depends chiefly on the "hardware" side - the new Perkin Elmer automated sequencers they are supposed to use.

    Anyway, the question is whether it is good or bad if Venter sequences the human genome. In my opinion - it's OK. The HuGeP is somewhat different in its purely scientific interest, and I'm convinced that they will produce data of much higher quality. On the other hand, the human genome has considerable variation, so two genomes are better than one. I would not be very concerned about the patent issue, because it will come anyway (because of **!'*%$! American and international patent law) - even if TIGR did not sequence the genome, someone would take the output of the HuGeP project and patent the same sequences Venter would. Venter just wants to gain a little time for evaluating the sequence before releasing it to the public.

    And of course, it is not the _sequences_ that are patented - what is patented is the usage or modification of a certain sequence for medical purposes, or a certain enzyme as a target in medical treatment.

    Regards,

    January

  • Re:This was my idea. by troc (Score:1) Monday November 29 1999, @12:54AM
  • ... by Skinka (Score:1) Monday November 29 1999, @12:57AM
  • Re:Prior Art? by dylan_- (Score:2) Monday November 29 1999, @01:01AM
  • warm and fuzzy (Score:5)

    by counsell (4057) on Monday November 29 1999, @01:05AM (#1498064) Homepage

    It's good that hackers are well-informed and principled enough to think it matters. This happens to be my area of interest; I'm responsible for Bioinformatics at the Institute of Cancer Research in the UK. A couple of weeks back I went to an excellent talk by a clever guy called Ewan Birney from the Sanger Centre [sanger.ac.uk] near Cambridge, UK. He is writing code to catalogue and annotate the assembled sequences in real time as they come off the mammoth robot sequencing "production line". On one of those rare occasions where the British are leading a "big science" project, the Centre has been responsible for the largest fraction of the Human Genome sequenced at any single institute. The code does stuff like figure out which bits of the sequence are real genes and which bits are that 90%+ of so-called "junk DNA" you might have heard of, and also attempts to assign provisional functions to the genes by various computational means. Eventually people in white coats will have to confirm such assignments properly, but it's important to beat the drug companies to making good guesses.

    Ewan's code and all the data are entirely Open Source. If you've got a good reason and a reasonable Pentium with lots of memory and a 30Gb hard disk you could mirror the human genome and get it updated every night. (I feel strange just typing that sentence and I've been following this story for years). The Wellcome Trust and others (including US and European government agencies) funding the project are keeping everything Open because that's the way science is done and because this will subvert commercial attempts to stake a claim on our species' genetic heritage. (Er, go Wellcome!)

    Biochemists often talk about the "rate limiting step" in a reaction---the single point which sets the speed of the whole process---like a bottleneck. As far as I understood Ewan's talk (if you're reading this Ewan, please put me right), the rate-limiting step with the Genome Project isn't the assembly of the sequenced stretches of DNA (or "contigs") as the original poster suggests, but the collection of the data in the first place. At the Sanger they have clusters of PCs and Alphas crunching the contigs---distributing the effort would give us all a warm fuzzy feeling, but wouldn't be essential. Again, I may be wrong about this.

    One thing that definitely is a priority is making some sense out of all of this information. What would be great would be if members of the global community of hackers started taking molecular biology and biochemistry classes so they could write code to help people like me make sense of the embarrassment of riches that the project is creating. I'm off to Cambridge in two weeks to the Bioinformatics Open Software Development [mrc.ac.uk] meeting to listen to some project leaders talk and discuss the existing efforts. Personally, I would love to give crash courses in biology to programmers with time on their hands in an effort to harness their collective genius rather than sponsor an effort to write a contig-crunching client to harness their collective spare cycles, but I have no idea how such a thing could be organised. Any ideas?

  • by Lars Arvestad (5049) on Monday November 29 1999, @01:06AM (#1498065) Homepage Journal
    Common successful distributed projects in cryptography rely on the fact that all you need on a client is the algorithm and a few keys to try. Therefore, clients are really cheap (resourcewise) to distribute and use.
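To make the "algorithm plus a few keys" point concrete, here is a toy sketch of how a key search can be cut into work units. The function names, the unit size and the `is_correct_key` callback are all hypothetical and do not reflect how any real distributed client is written. The point is only that each work unit is a pair of integers, so the data a client has to download is tiny compared with the computation it performs.

```python
def work_units(key_bits, unit_size):
    """Split a key space of 2**key_bits keys into (start, stop) ranges."""
    total = 1 << key_bits
    for start in range(0, total, unit_size):
        yield (start, min(start + unit_size, total))


def search_unit(unit, is_correct_key):
    """Client side: try every key in one range, return the hit or None."""
    start, stop = unit
    for key in range(start, stop):
        if is_correct_key(key):
            return key
    return None


# Toy run: an 8-bit key space, with the secret key assumed to be 173.
units = list(work_units(8, 100))
results = [search_unit(u, lambda k: k == 173) for u in units]
```

In this toy run `units` holds three ranges and only the second one reports a hit; no client ever needs to know about any range but its own.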

    In the case of the Human Genome Project, the situation is somewhat different. A well known analogy is the following: Take a few copies of a newspaper. Feed it through a shredder. Remove a handful or two of paper. Insert errors. Now, piece together one copy of the original newspaper.
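The newspaper analogy can be sketched as a toy greedy assembler. This is an illustration only, not how the genome centres do assembly: it assumes error-free fragments, so the "insert errors" step of the analogy is ignored, and the function names and the `min_len` threshold are made up for the example. Note that every round compares every fragment against every other one, which is why a client cannot do useful work with only a small slice of the data.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`,
    or 0 if that length would be below min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    # Drop fragments that are wholly contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more; leave the pieces separate
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags)
                 if k not in (best_i, best_j)] + [merged]
    return frags


pieces = ["the quick br", "ick brown f", "own fox jumps"]
assembled = greedy_assemble(pieces)
```

With these three shredded pieces the assembler recovers a single string; with real reads, errors and repeated regions make the greedy choice unreliable, which is part of why the problem is hard.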

    In order to make a useful contribution, a client is going to need a lot of data. This means that it will be difficult to distribute (long downloading times for instance) and that few people will appreciate having the client on the machine because the client will be using a lot of memory and the machine might be a bit unresponsive (your HGP screensaver might flush all your apps to disk for instance).


    Lars

  • Re:This can't be open source! by ghazban (Score:2) Monday November 29 1999, @01:07AM
  • Re: cycles/data by ewanb (Score:1) Monday November 29 1999, @01:07AM
  • Re:perhaps this will be a wake-up call by PG13 (Score:2) Monday November 29 1999, @01:10AM
  • Re:warm and fuzzy (Score:4)

    by ewanb (18483) on Monday November 29 1999, @01:13AM (#1498069) Homepage
    Counsell -

    Great that you were following the talk. I thought I put everyone to sleep.

    The rate-limiting step at the moment is effectively the mapping, in fact, then sequencing. The interesting thing about the analysis is that the amount of CPU is unbounded. If we have more CPU we just use more accurate algorithms. We can do something within the CPU bounds on the Hinxton campus, but if anyone wants to give me a supercomputer, then we could get more accurate analysis.

    I can always use more juice!

  • by ewanb (18483) on Monday November 29 1999, @01:15AM (#1498070) Homepage
    Lars

    This is only for the assembly and not for the analysis. With analysis you have a better data/cycles ratio. Assembly is done at the genome centres anyway...

  • Re:DeCode Genetics by PG13 (Score:1) Monday November 29 1999, @01:19AM
  • Bottleneck is somewhere else... by Silicon_Knight (Score:1) Monday November 29 1999, @01:30AM
  • Re:Prior Art? by wocky (Score:2) Monday November 29 1999, @01:34AM
  • Re:Difficult to distribute by Lars Arvestad (Score:2) Monday November 29 1999, @01:35AM
  • by lovebyte (81275) <lovebyte2000NO@SPAMgmail.com> on Monday November 29 1999, @01:38AM (#1498075) Homepage
    I, for one, don't like the idea of a private company owning my gene sequences. They will be able to limit the use of these so only really rich pharmaceutical companies will be able to develop drugs etc. and then sell them at huge profits, which isn't really for the benefit of mankind blah blah blah.

    This is an interesting statement. How do you think drugs are made now? Well, they are made by big pharma companies which make (often) a good profit. Drugs are not made for the benefit of mankind. They are made to make money.

    When it comes to patenting the use of some genes, we should consider that:

    1. Patents are short-lived.
    2. A company has no interest in not using its patent, so for some money other companies will be able to buy patents.
    3. Patents don't stop anyone from working on whatever is patented. Lawyers always find ways to circumvent patents.

    On the subject of open source distributed computing for genome data, I am afraid I agree with other people here. There is simply too much data to download. It's a pity, but it won't work. Maybe in a few years' time, when the problems in genomics have changed, other problems might be more suitable for this type of computation.

  • Prior Art.... by FooGoo (Score:1) Monday November 29 1999, @01:44AM
  • Re:DeCode Genetics by Lars Arvestad (Score:1) Monday November 29 1999, @01:44AM
  • Re:lawyers in heaven by radja (Score:1) Monday November 29 1999, @01:45AM
  • I think it's technically unfeasible by kinkie (Score:1) Monday November 29 1999, @01:48AM
  • Re:Prior Art? by redhog (Score:1) Monday November 29 1999, @01:49AM
  • Re:Difficult to distribute by ewanb (Score:2) Monday November 29 1999, @01:50AM
  • DNA itself is prior art by Anonymous Coward (Score:1) Monday November 29 1999, @01:51AM
  • Re:warm and fuzzy by The_Messenger (Score:1) Monday November 29 1999, @01:52AM
  • Re:This can't be open source! by John Allsup (Score:2) Monday November 29 1999, @01:56AM
  • Re:TIGR, HUGEP and genomics by kovi (Score:1) Monday November 29 1999, @01:56AM
  • Patents are anti-competitive by Morgaine (Score:2) Monday November 29 1999, @01:59AM
  • Re:warm and fuzzy by ewanb (Score:2) Monday November 29 1999, @02:01AM
  • Re:warm and fuzzy by ewanb (Score:1) Monday November 29 1999, @02:01AM
  • Why exactly would it help to patent this info? by jemfinch (Score:1) Monday November 29 1999, @02:06AM
  • I do not think it is neccessary... by greystone (Score:1) Monday November 29 1999, @02:08AM
  • Re:I think it's technically unfeasible by ewanb (Score:1) Monday November 29 1999, @02:15AM
  • Re:DeCode Genetics and insurance by AntiNeutrino (Score:1) Monday November 29 1999, @02:16AM
  • by ewanb (18483) on Monday November 29 1999, @02:17AM (#1498097) Homepage

    It is clear from these postings that people would
    like the client to run. If there are people with
    experience in writing these sorts of d.net systems
    then please drop me a note. We have the problem
    for you to work on - it is just a question of
    figuring out how to do it.


    Drop me a mail (birney@sanger.ac.uk).
  • Re:This was my idea. by ewanb (Score:1) Monday November 29 1999, @02:26AM
  • Re:Bottleneck is somewhere else... by greystone (Score:1) Monday November 29 1999, @02:26AM
  • Molecular Biology and BioChem for hackers by Morgaine (Score:2) Monday November 29 1999, @02:30AM
  • Re:Molecular Biology and BioChem for hackers by ewanb (Score:1) Monday November 29 1999, @02:36AM
  • Re:Difficult to distribute by troc (Score:1) Monday November 29 1999, @02:41AM
  • Re:Who makes drugs now? Point 1. by dodobh (Score:1) Monday November 29 1999, @02:47AM
  • Decode the sequences? by heroine (Score:2) Monday November 29 1999, @02:52AM
  • Re:TIGR, HUGEP and genomics by Phil-14 (Score:1) Monday November 29 1999, @02:54AM
  • Re:Who makes drugs now? Point 1. by lovebyte (Score:1) Monday November 29 1999, @02:55AM
  • Distrib client worries: you're looking at it wrong by Morgaine (Score:2) Monday November 29 1999, @03:05AM
  • somebody patented Brit's natl dish by Anonymous Coward (Score:2) Monday November 29 1999, @03:09AM
  • Re:Why exactly would it help to patent this info? by Lars Arvestad (Score:2) Monday November 29 1999, @03:15AM
  • Patenting their version of the data by archmedes5 (Score:1) Monday November 29 1999, @03:16AM
  • Re:This can't be open source! by fpepin (Score:2) Monday November 29 1999, @03:19AM
  • Re:Patents - just a few ideas by Null_Operator (Score:1) Monday November 29 1999, @03:20AM
  • Publish it. by Ozzy (Score:1) Monday November 29 1999, @03:40AM
  • hmm by SKicker (Score:2) Monday November 29 1999, @03:42AM
  • Re:HGP almost completed; also, NIH computers? by imac.usr (Score:2) Monday November 29 1999, @04:17AM
  • Has anyone in aerospace tried to patent breathing? by Judah Diament (Score:1) Monday November 29 1999, @04:29AM
  • Re:Distibuted computer projects... OT by Drog (Score:1) Monday November 29 1999, @04:39AM
  • Re:warm and fuzzy by Rumor (Score:1) Monday November 29 1999, @05:24AM
  • Re:This can't be open source! by Nyarly (Score:1) Monday November 29 1999, @05:32AM
  • Re:TIGR, HUGEP and genomics by jw3 (Score:1) Monday November 29 1999, @05:39AM
  • False results (irrelevant) and feasibility (???) by jabbo (Score:2) Monday November 29 1999, @05:49AM
  • Do it!!! by DaPhreaker (Score:1) Monday November 29 1999, @06:01AM
  • Software patents by rob_from_ca (Score:1) Monday November 29 1999, @06:12AM
  • What may end up happening... by otis wildflower (Score:2) Monday November 29 1999, @06:22AM
  • Re:Prior Art? by molog (Score:1) Monday November 29 1999, @06:28AM
  • Secure Communications by Hallow (Score:1) Monday November 29 1999, @07:00AM
  • DC project not needed by bonabo (Score:1) Monday November 29 1999, @07:07AM
  • Re:HGP almost completed; also, NIH computers? by Bryan_K (Score:1) Monday November 29 1999, @07:24AM
  • Re:TIGR, HUGEP and genomics by donhav (Score:1) Monday November 29 1999, @07:38AM
  • That's a misleading stmt. by Tim (Score:1) Monday November 29 1999, @08:24AM
  • Re:Prior Art? by JohnG (Score:1) Monday November 29 1999, @09:13AM
  • Re:Distibuted computer projects... OT by Steeltoe (Score:1) Monday November 29 1999, @09:34AM
  • Re:Distibuted computer projects... OT by larkost (Score:1) Monday November 29 1999, @09:39AM
  • Patents by cabr1to (Score:1) Monday November 29 1999, @10:06AM
  • Re:This can't be open source! by thogard (Score:1) Monday November 29 1999, @11:40AM
  • Re:warm and fuzzy, the crash course. Idea: by Olof the Hopeful (Score:1) Monday November 29 1999, @11:51AM
  • Re:Contigs Schmontigs by elizabeth (Score:1) Monday November 29 1999, @03:23PM
  • FUD _is_ bad. Good thing this isn't it. by Tim (Score:1) Monday November 29 1999, @04:13PM
  • Re:Contigs Schmontigs by UWCM (Score:1) Tuesday November 30 1999, @02:03AM
  • DCing the Human Genome by ITShaman (Score:1) Tuesday November 30 1999, @03:03AM
  • Re:HGP almost completed; also, NIH computers? by bluets (Score:1) Tuesday November 30 1999, @05:30AM
  • Re:Open Source Genome Projects by bluets (Score:2) Tuesday November 30 1999, @05:39AM
  • Re:TIGR, HUGEP and genomics by kovi (Score:1) Tuesday November 30 1999, @12:04PM
  • Re:warm and fuzzy by foop (Score:1) Tuesday November 30 1999, @02:44PM
  • a comment about "patents are short lived" by klode (Score:1) Wednesday December 01 1999, @04:54AM
  • Worked at HGP by thej0ker (Score:1) Wednesday December 01 1999, @11:04AM
  • patenting of sequences/secrecy/ by genethics (Score:1) Thursday December 02 1999, @12:06PM
  • Then.. by BedPanDan (Score:1) Saturday December 04 1999, @03:25PM