
Google Unveils Code Search (212 comments)

Posted by kdawson
from the let-the-regexp-begin dept.
derek_farn writes, "Google now has a page that supports source code searching. I hope they extend it to be more programming-language aware (e.g., search for identifiers and functions) like the specialist code search sites (Krugle, Koders, and Codease), who probably now have very worried investors. I don't see any option to search for Cobol. I guess there is not a lot of Cobol source available on the Internet, even though there is supposed to be more Cobol source in existence than any other language (perhaps that statement is not true in the noughties)." From the Cnet.com article: "Google engineers, many of whom participate in open-source projects, already use these code searching capabilities internally. Since it is a Google Labs project, the company is not yet seeking to monetize searches through ads."
This discussion has been archived. No new comments can be posted.

  • VB has been the language with the most LOC since the early-mid 90's.

    As scary as that sounds.
    • by doti (966971) on Thursday October 05, 2006 @08:03AM (#16319787) Homepage
      It's true that not a lot of people write COBOL today, but the submitter was talking about legacy code. No wonder it's not on the Internet: not only is it from a pre-Internet era, but the vast majority of it is from corporations that keep their code very closed.
      • by locoluis (69948) on Thursday October 05, 2006 @10:40AM (#16322213) Homepage Journal
        True. And if there's any COBOL code on the Internet, it can be found using the following search terms:

        "IDENTIFICATION DIVISION" "DATA DIVISION" DISPLAY PROGRAM-ID SECTION

        No need for Google to develop a special search for what looks less like a computer program and more like a plain text file.
        • Just in case anyone else was interested in what you get by searching that, results 1-2 are part of the COBOL language reference from IBM, but after that you get into sample code pretty quickly. (And if you wanted to add one more language to the list that you can write "Hello, world!" in, here you go [sxu.edu].)

          37,500 results total. Not too bad for a "dead" language.
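The search terms above amount to a keyword fingerprint for COBOL. A minimal Python sketch of the same idea, assuming a file counts as COBOL when most of those markers show up; the name `looks_like_cobol` and the threshold of 4 are made up for illustration:

```python
# Hypothetical heuristic: the markers come from the search terms quoted
# above; the default threshold of 4 is an arbitrary assumption.
COBOL_MARKERS = (
    "IDENTIFICATION DIVISION",
    "DATA DIVISION",
    "DISPLAY",
    "PROGRAM-ID",
    "SECTION",
)


def looks_like_cobol(source: str, threshold: int = 4) -> bool:
    """Return True when at least `threshold` markers occur in the text."""
    upper = source.upper()  # COBOL keywords are case-insensitive
    hits = sum(1 for marker in COBOL_MARKERS if marker in upper)
    return hits >= threshold


SAMPLE = """\
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HELLO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       PROCEDURE DIVISION.
           DISPLAY 'Hello, world!'.
           STOP RUN.
"""

if __name__ == "__main__":
    print(looks_like_cobol(SAMPLE))
    print(looks_like_cobol("int main(void) { return 0; }"))
```

Plain substring matching is probably enough here precisely because, as noted above, COBOL reads more like text than like punctuation-heavy code.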
    • Re: (Score:3, Interesting)

      by plopez (54068)
      Well... if we want to quibble... :)

      Basic was intended as a teaching language and so the language incorporated lots of syntax and ideas from the 2 major languages of the time: COBOL and FORTRAN.

      BASIC eventually begat that idiot bastard child 'Visual Basic' and syntactically hasn't changed much since. So you could say that there is a lot of COBOL in 'Visual Basic'. Sure, it became object focused and now OO, but it still resembles COBOL. So COBOL lives on, as Visual Basic. It will not die :)

      And while I am on the
  • by eldavojohn (898314) * <eldavojohn AT gmail DOT com> on Thursday October 05, 2006 @07:40AM (#16319507) Journal
    I made a simple search for "fade file:.js" in order to find a javascript function that would fade a div or table or anything really (I know scriptaculous [aculo.us] offers this already, just curious as to what's out there). I found something but the header of the file read:

    All Code herein is Copyright 2005 Match.com
    Do not copy, reproduce, reuse or sell any code herein
    without the express, written consent of Match.com.
    For information contact webmaster@match.com.
    All Rights Reserved.

    Which is expected. However, that means this tool isn't useful for finding a method or function or class I can use and then using it ... it seems to be restricted to one of two uses. If I'm looking for code that does natural language parsing, I could hope a comment somewhere contains NLP as a description of what's going on. Or, I could look for libraries out there with methods and then search for those methods to see how other people used them to get an idea of how they work. The vast majority of this code seems to be just web development front-end code at least from the few searches I've done. Too bad, that's a very small part of programming.
    • by Aladrin (926209)
      You could always just search for code under the license you want. Instead of all code.
      • by doti (966971)
        But wait, aren't all those licenses listed in the combobox free (as in open source) licenses?
        • by Sparr0 (451780)
          Yes. Why would you want to search for non-Free code? You couldn't do anything with it.
          • by doti (966971)
            But eldavojohn found non-free code with it.

            Looks like the "Any license" option disables license checking.
            It should be replaced by these two options:
            "Any free license below"
            "Other licenses"
    • by doti (966971)
      But you can study the code, and try to apply the general idea to your own code.
    • by CastrTroy (595695)
      I'm not sure how well licenses/copyright work on Javascript. Anything that you release in Javascript on the internet is basically open source. Sure you can tell people not to use it, but it's there for them to read, and use your ideas to make their own software. If they want to make something similar, they can change it just enough to look different, or they can take single functions, which aren't really that complicated, and change them just a little to look like their own code that was developed withou
      • by Sparr0 (451780)
        First, "it's easy to break this law and not get caught" is not a good reason to break it. If you start the next MySpace and someone ends up able to prove that you ripped off their code (and yes, it's possible to do so), you're going to be hurting to the tune of a large fraction of your net worth.

        Second, maybe some companies don't care if the logic is out there. I know it's optimistic, but this is the essence of open source. "Here is our code. It does something cool."
      • by swillden (191260) *

        Sure you can tell people not to use it, but it's there for them to read, and use your ideas to make their own software.

        Which is legal, BTW. Copyright doesn't cover your ideas, just your code.

        If they want to make something similar, they can change it just enough to look different, or they can take single functions, which aren't really that complicated, and change them just a little to look like their own code that was developed without even looking at the other code.

        Perhaps. Making it look differen

  • ...I can forget RegEx's now.

    But honestly, this might have some bells and whistles but I don't see myself getting rid of my regular expression searches any time soon.
  • by bluelip (123578) on Thursday October 05, 2006 @07:44AM (#16319561) Homepage Journal
    It's the sound of millions of CS majors cheering!!!!

    Dang, this is a neat tool.
    • Re: (Score:2, Insightful)

      by bluelip (123578)
      _OR_, the sound of thousands of Profs moaning.....
      • Re: (Score:3, Insightful)

        by Gospodin (547743)

        Why? This makes it easier to check for plagiarism.

  • by maccallr (240314) on Thursday October 05, 2006 @07:45AM (#16319563) Homepage Journal
    At last we can use regexps and search on all the important characters between the alphanumerics! For example the prefixed '@' in PHP - very hard to figure out what this is, without reading the reference cover to cover. Now at least we can search the codebase and hope to see some useful comments preceding it, or figure out from context what's going on.

    e.g. "@fopen file:.php"

    • Re: (Score:2, Informative)

      by serialdogma (883470)
      The @ prefix in PHP just stops it from printing an error message if something goes tits-up.
      • Re: (Score:3, Informative)

        by fruey (563914)
        Not only stops it from printing an error, but ignores the error and carries on parsing the rest of the code.

        Useful for including a file that might not be there, for example...
        • by CastrTroy (595695)
          But like the original poster stated, it's not very obvious what it does from just looking at it, without comments stating that we are ignoring any errors produced. I mean, it's basically a hack so you don't have to write a try-catch with an empty catch statement. It's the PHP equivalent of VB's On Error Resume Next. Why would you write code that's that unclear, when you have another way to write the code that makes it much clearer what's actually going on? Why would you write code that just ignores
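For comparison, here is the same trade-off sketched in Python rather than PHP (a swap made for illustration; none of this is PHP's API). `contextlib.suppress` hides the failure much as `@` hides the message, while the explicit version says what is being ignored and why. The helper names and the file name are hypothetical.

```python
import contextlib
import logging
from typing import Optional


def read_optional_silent(path: str) -> Optional[str]:
    """Silently ignore a missing file, roughly the spirit of PHP's @ prefix."""
    text = None
    with contextlib.suppress(FileNotFoundError):
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
    return text


def read_optional_explicit(path: str) -> Optional[str]:
    """Same result, but the ignored error is visible in the code and the log."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        logging.info("optional file %s not found, continuing without it", path)
        return None


if __name__ == "__main__":
    missing = "no-such-file-for-this-demo.txt"  # hypothetical path
    print(read_optional_silent(missing))
    print(read_optional_explicit(missing))
```

Both helpers return `None` for a missing file; the difference is only in how obvious the ignored error is to the next reader, which is the complaint in this sub-thread.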
  • Open Source projects that you know are written in COBOL? I don't know of any. It is a lot of legacy code. There are very few new projects being started in COBOL.
  • This is pretty cool.. i hate trying to search code on normal google, it usually filters out most of the search characters and you end up with nothing useful.

    Now if only they'd add regex searching to normal google (unless it already has it and i'm missing it?)

    mmmm regex
  • Useful to whom? (Score:5, Interesting)

    by kjart (941720) on Thursday October 05, 2006 @07:47AM (#16319603)

    Whenever I search for something code related on the web it's usually because I want to know how to do something. In such cases I don't really know what the code itself would be (i.e. the reason why I'm searching) so this wouldn't help at all. I suppose if you were looking for specific code it could be useful, but why would you be doing that? That would likely be your own code, so wouldn't a simple grep be easier?

    I'm sure I'm missing something here - Google doesn't (usually) release useless new products :)

    • Re:Useful to whom? (Score:4, Insightful)

      by admdrew (782761) on Thursday October 05, 2006 @09:27AM (#16320987) Homepage

      If you're unaware of how to do something from a design standpoint, you're right that viewing code is not necessarily going to help. This tool, however, works great for more specific issues related to syntax, etc. I've already used this to see examples of ItemTemplate [google.com] in C#. A simple search on regular google yields examples, but it also returns a lot of crap.

      When considering TMTOWTDI, looking at other code similar to yours can be very helpful, and (for me, at least) can help break out of a code writer's block when I've been working with a particular chunk of code for too long.

      • by kjart (941720)

        Yeah, that's a good point I suppose. I guess if you're looking for implementations of specific classes/functions/whatever, it could be handy. It could be a double-edged sword though: I've picked up several bad habits in the past looking at coworkers' code (not to mention what I've likely passed on!) - looking over the shoulders of random people on the web may not be a good thing ;)

  • by bloblu (891170) on Thursday October 05, 2006 @07:47AM (#16319621)
    all bugs are shallow."

    Well, it looks like that's not really the case: http://www.google.com/codesearch?hl=en&lr=&q=++%5Csif%5C(%5B%5E)%5D*%5C)%3B+license%3Agpl+lang%3Ac%2B%2B&btnG=Search [google.com]

    I hope this service will help improve code quality...
    • by doti (966971)
      That's surprising to me.
      I never made that typo, and never saw it on other's code.

      Impressive.
    • Re: (Score:2, Interesting)

      by ggvaidya (747058)
      Some of those are a hack around the VC 6 "for loop doesn't scope as per ANSI" bug. This forum post [velocityreviews.com] explains when it's used.
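URL-decoded, the query in the link above is `\sif\([^)]*\);` plus `license:gpl lang:c++`: an `if` whose condition is followed directly by a semicolon. A minimal Python sketch of the same pattern run over a local snippet. The C++ sample text and the function name are made up, and Python's `re` is only assumed to treat this simple pattern the way Code Search does:

```python
import re

# Pattern from the decoded search URL: whitespace, "if(", anything up to
# the first ")", then ");" which turns the if-body into an empty statement.
EMPTY_IF = re.compile(r"\sif\([^)]*\);")

CPP_SNIPPET = """\
void check(int x) {
    if(x > 0);
        report(x);
    if(x < 0) {
        report(-x);
    }
    if (x == 0);
}
"""


def find_empty_ifs(source: str) -> list:
    """Return (line_number, stripped_line) for every line the pattern hits."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if EMPTY_IF.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    for number, text in find_empty_ifs(CPP_SNIPPET):
        print(number, text)
```

Only line 2 of the sample is reported. The pattern insists on `if(` with no space, so the `if (x == 0);` on line 7 slips through, and it cannot tell a typo from the deliberate VC 6 workaround mentioned above.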
  • by krell (896769) on Thursday October 05, 2006 @07:49AM (#16319643) Journal
    "I don't see any option to search for Cobol."

    Well, that's one entire season of "Battlestar Galactica" rendered entirely pointless. Thanks a lot!
    • by Blakey Rat (99501)
      Why do submissions start so promisingly, with people who seem intelligent and informed, and then turn into some kind of weird rant about how much COBOL code is on the Internet-- as if that mattered in some way?

      Story submitters: It's a given now that Slashdot basically has no editing what-so-ever, so please self-edit a bit before hitting go. Thank you.
  • A good start.. (Score:5, Insightful)

    by sfraggle (212671) on Thursday October 05, 2006 @07:50AM (#16319665)
    It's a good start. They really need to start searching Subversion/CVS repositories as well. One of the most obvious things that they seem to have missed is to index all the Sourceforge downloads.
  • Not that useful (Score:2, Interesting)

    by fellm (1009547)
    As a programmer who needs to solve a problem I need a place to find answers to the problem I am solving. Searching for code won't do it because I am looking for an answer and not how to code it. To find answers I use Omgili [omgili.com] - it is a vertical search engine that searches tens of thousands of forums and millions of discussions. Usually someone already asked my question and hopefully it has an answer. It is highly recommended for troubleshooting and specific problems/questions.
    • Thanks! I've already found that searching newsgroups usually finds better results than a web search, and have always wished forums were searched better. I'll check this out, let's hope it does better than Google.

  • by Sub Zero 992 (947972) on Thursday October 05, 2006 @07:54AM (#16319697) Homepage
    How to find security holes in PHP web applications:

    http://www.google.com/codesearch?hl=en&lr=&q=Where+%5C%24_POST+-addslashes+lang%3Aphp [google.com]
    • Re: (Score:2, Informative)

      by RalphSleigh (899929)
      Mmmmm, SQL injection for the world to see. Good Call.
    • This entire project is either a very good or very bad idea.

      (1) Automated searching for security vulnerabilities.
      (2) A lot of that code is copyrighted. Which yes, it's transmitted over the Intarwebs regularly, but now it's just a little easier.

      I'm not saying it's not a *cool* idea, but from the looks of the Slashdotters trying out this new power, I'm not sure Google thought this all the way through. (1) is great when your code runs a web service and nobody sees it but your team or organization. (2) I can for
    • by soliptic (665417)
      F**k me, that's massive.

      You (google) have just given me (everyone) a whole list of vulnerable projects - follow that up with a google search for some identifying feature of the project in the final output ("Powered by BadlyCodedProject v1.01" or whatever) and then a simple bit of "?id=1;%20DROP%20TABLE" url munging and the consequences.... phew...

      I suppose on the bright side it also provides a quick way for you to audit OSS tools you were considering using, and if exploitation of these poorly coded syst
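Decoded, that search is `Where \$_POST -addslashes lang:php`: PHP that mentions `$_POST` next to a WHERE clause with no escaping in sight. A minimal sketch of the underlying problem and the usual fix, written in Python with the standard `sqlite3` module instead of PHP; the table, rows, and function names are invented for the demo:

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory table; schema and rows are invented."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    """Vulnerable: user input is pasted straight into the SQL text."""
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    """Safer: the driver binds the value, so it is never parsed as SQL."""
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


if __name__ == "__main__":
    db = make_db()
    attack = "nobody' OR '1'='1"
    print(find_user_unsafe(db, attack))  # the injected OR clause matches every row
    print(find_user_safe(db, attack))    # no user has that literal name
```

Bound parameters are generally a sturdier fix than escaping functions, since there is no quoting step left to forget.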
  • by scottennis (225462) on Thursday October 05, 2006 @08:00AM (#16319755) Homepage
    Good programmers write good code. Great programmers find it on Google!
    • by CastrTroy (595695)
      This really is kind of true. Most of the people I know use Google Groups as a way to find a lot of code. There's no point spending an hour trying to write something, or figure out some obscure feature of your programming language when you can just search Google Groups and find the answer in 5 minutes.
  • Your Search (Score:4, Funny)

    by mazarin5 (309432) on Thursday October 05, 2006 @08:03AM (#16319781) Journal
    Your search 10 print "boobs" 20 goto 10 returned no results. Try searching again using fewer terms.
  • For online services : Don't put up code that states explicitly, not for production [google.com] .

    For users : Stay away from online services that put up code that states "not for production". :-)

    cpan.org
    twiki.org
    osuosl.org
    ...

    /K

    • by syrion (744778)
      Well, CPAN is very useful. You just check the code you're intending to include in your program to see if it says something like "this code is beta!" before you actually, you know, include it. There are a great number of "experimental" and "beta" perl modules on there--some, like B::CC, are mentioned in Programming Perl--but there are also some very useful, mature modules. Don't warn people off it; it's one of perl's best features.
      • by kafka47 (801886)

        I think I was sort of going for humour, but I can see how it might have been misconstrued.

        Perl on the other hand. Now that is funny! *duck* :-)

        /K

  • I searched for some of my own code from sourceforge CVS, and it couldn't locate it.
  • Proof (Score:5, Funny)

    by jmv (93421) on Thursday October 05, 2006 @08:23AM (#16320037) Homepage
    Crap! There goes SCO's case [google.com].
  • I know I'm not alone in my most hated language [google.com]
    • Re: (Score:2, Interesting)

      by Roger_Wilco (138600)

      I did a brief survey on "I hate [X]", and got the following:

      perl 9
      java 20
      c 8000
      c++ 11
      c# 1
      lisp 0
      scheme 0
      elisp 0
      fortran 3

      Looks like John McCarthy [paulgraham.com] wins.

    • by bogado (25959)
      I hate perl : 9 results
      I hate C++ : 11 results
      I hate C : about 8000 results (?)
      I hate Python : 7 results

      It seems that the most hated language is 'C'. Well, at least by the people who force themselves to use it, which may be a good measure...
  • If sco had this to search through linux.. Just imagine..

    SCO: You used our patented while loop over 4000 times in your code...

  • Oh crap! (Score:4, Funny)

    by sparkyng (635050) on Thursday October 05, 2006 @08:44AM (#16320313)
    I hope my CS professor doesn't find this until the semester is over.
  • The search apparently lumps everything with the word "basic" in the title together. QBasic, Visual Basic, VB.Net, etc...

    While the non-visual and Visual Basic merges aren't that bad, putting VB.Net into that category is a major headache. VB.Net is syntactically similar to VB6, but is fully object oriented and is coded in just like C#. So looking for VB.Net samples in the Basic category returns a lot of VB6 code solutions that may look syntactically correct, but are far from the best practices.

    -Rick
  • Google Unveils Search Code? I was thinking they opened up.
  • ...and I thought I was the only one being this clever and humorous [google.com] in my source code...

  • Oh, how about this? [google.com]
  • by Jugalator (259273) on Thursday October 05, 2006 @09:03AM (#16320595) Journal
    This is just too funny :-)

    void Mammal::mate( Mammal& partner ) {
            /* potential mating partner */
            M_partner = partner.getId();
            /*printf( "." ); fflush( stdout );/**/
            /* mating must be mutual */
            if( partner.getPartnerId() != M_id ) {
                    /*M_wait += 15;/**/
                    return;
            }
            /*printf( "+" ); fflush( stdout );/**/
            /* this is male object */
            if( M_gender == 0 ) {
                    /* perform breeding in female object */
                    partner.mate( *this );
                    return;
            }
            /* this is female object */
            assert( M_gender == 1 );
            /* current position */
            int x = M_x, y = M_y;
            /* behind position */
            switch( M_direction ) {
                    case EAST: x--; break;
                    case NORTH: y++; break;
                    case WEST: x++; break;
                    case SOUTH: y--; break;
            }
            /* back to field wall */
            if( !M_field->in( x, y ) ) return;
            /* newborn's position */
            int cx = M_x, cy = M_y;
            /* move mother backward */
            M_x = x;
            M_y = y;
            /* conception */
            orgasm();
            partner.orgasm();
            Mammal* child;
            child = new Mammal( *M_field, cx, cy, NEWBORNENERGY, *this, partner );
            /*printf( "CHILD%d ", child );/**/
            /* birth */
            M_energy -= CHILDBIRTHENERGY;
            M_population->add( *child );

            printf( "MATE(%d,%d)->%d(%d) ", M_id, partner.getId(), child->getId(), child->getGeneration() );
            /* partner.printGenotype();
            partner.printState();
            printGenotype();
            printState();
            child->printGenotype();
            child->printState();/**/
    }

    void Mammal::orgasm() {
            M_energy -= MATINGENERGY;
            M_result = 1;
    }
  • Moo (Score:3, Funny)

    by Chacham (981) on Thursday October 05, 2006 @09:22AM (#16320899) Homepage Journal
    I just searched for "20 GOTO 10". Oh my. I don't know if that is funny or sad.
  • Google found a word in a source code archive on my site. The others did not.

    I think google is the only one using regular expression pattern matching from the user's end.
    Hint: completely clear the field or at least delete any end character when doing a different search as some non-printable characters might remain and give you bad results.
  • Just imagine if SCO had access to this a few years ago!
  • Because of that, it's probably considered proprietary information, so you aren't going to see it released to the public. I suspect a lot of it is fairly company-specific, anyway, and may be of little use outside its original context.
  • by chroma (33185) <chroma&mindspring,com> on Thursday October 05, 2006 @10:12AM (#16321767) Homepage
    I just had a need for this very thing. I've been looking for an implementation of the Minkowski sum [wikipedia.org] in Java. And Google had it [google.com]. So if you need to implement a particular algorithm that someone else might have already implemented, this is the way to find it.

    I can't find any of the software with my name on it that's on SourceForge, though.
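For anyone who has not met it, the Minkowski sum of two point sets A and B is every a + b with a in A and b in B. A minimal sketch for finite 2D point sets, in Python rather than the Java mentioned above; the function name and sample shapes are made up, and this is the brute-force definition, not the implementation the search turned up:

```python
from itertools import product


def minkowski_sum(a, b):
    """Return the set {p + q : p in a, q in b} for 2D points given as tuples."""
    return {(px + qx, py + qy) for (px, py), (qx, qy) in product(a, b)}


if __name__ == "__main__":
    square = {(0, 0), (1, 0), (0, 1), (1, 1)}
    segment = {(0, 0), (2, 0)}
    print(sorted(minkowski_sum(square, segment)))
```

This costs one addition per pair of points. For convex polygons the sum is the convex hull of these pairwise vertex sums, and faster edge-merging approaches exist, but neither is shown here.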

  • Wow, this is great! One of the things I use Google for most during the day is hunting for example code whenever I'm trying to use something new to me. This code search should make that a lot easier now. With that in mind, I'll have to be more mindful of posting my own code examples in a searchable format!
  • Code I have on my site in some ZIP file has obviously been downloaded, unpacked and indexed. Not that this process would be so hard, especially for Google with all of its technology in place, but my site isn't anywhere near important (and code is not in CVS and it's not on Sourceforge or one of the other repositories) and they have all of my little tools crawled and the snippets they show are very insightful for the searcher. A job well done.
  • That's the number of times it found the word "hack".
  • K&R vs. Alii (Score:3, Interesting)

    by zobier (585066) <zobier@zobier. n e t> on Thursday October 05, 2006 @10:46PM (#16332315)
    Indent style searches:
    K&R [google.com]
    about 5,900,000
    Alli [google.com] (Actually, this is a bit broken! Anyone worked out how to enable multiline mode?)
    about 11,100,000
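On the multiline question: whether Code Search accepts an equivalent switch is not something this sketch can answer, but in Python's `re` the `re.MULTILINE` flag makes `^` and `$` match at every line boundary, which is what a brace-placement search needs. The two patterns and the samples below are assumptions for illustration, not the queries behind the links above:

```python
import re

# K&R style: the opening brace ends a line that also holds other code.
KR_BRACE = re.compile(r"^.*\S[ \t]*\{[ \t]*$", re.MULTILINE)
# Brace-on-its-own-line style: nothing but whitespace around the brace.
OWN_LINE_BRACE = re.compile(r"^[ \t]*\{[ \t]*$", re.MULTILINE)

KR_SAMPLE = """\
int main(void) {
    if (x) {
        y();
    }
}
"""

OWN_LINE_SAMPLE = """\
int main(void)
{
    if (x)
    {
        y();
    }
}
"""


def count_brace_styles(source: str) -> tuple:
    """Return (trailing_brace_count, own_line_brace_count) for the text."""
    return (len(KR_BRACE.findall(source)), len(OWN_LINE_BRACE.findall(source)))


if __name__ == "__main__":
    print(count_brace_styles(KR_SAMPLE))
    print(count_brace_styles(OWN_LINE_SAMPLE))
```

Using `[ \t]*` instead of `\s*` keeps each match on a single line, since `\s` would also swallow newlines.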
