P2P Bibliographies with Bibster

Noksagt writes "P2P isn't just for government documents anymore! Bibster assists researchers in managing, searching, and sharing bibliographic data in a peer-to-peer network. This project shows great promise to researchers who currently search for citations through centralized servers (Google, Scirus, CiteSeer, ISI, and many others). By making it decentralized, researchers can share bibliographic data with no subscription costs and avoid typing this data in by hand. It can import and export citations using BibTeX. The project is GPLed, and free clients for Windows and Linux are available. There's also a SourceForge page for Bibster, so you can check out the code from CVS if the Bibster site is slow."
This discussion has been archived. No new comments can be posted.

  • by Scythr0x0rs ( 801943 ) * on Monday August 02, 2004 @06:11PM (#9865214)
    this is news for nerds guys...
    the CVS server will slow down before the website.
    • Re:wait a minute... (Score:3, Interesting)

      by Noksagt ( 69097 )
      the CVS server will slow down before the website.
      The CVS is hosted by SourceForge, which can handle significant load. The website is hosted on some university computer, and I had trouble reaching it when I was emailed the link. So it might not be able to handle the load as well.
  • I am going to download it, and create a bunch of papers written by myself. Soon, I will be published in Science, Nature, and many other of the top periodicals of chemistry, physics and biology. Perhaps I will co-author a paper with Stephen Hawking.
    • by Anonymous Coward
      People who cite will also read the paper before doing so. This system will be useful when one has a paper in hand, but does not have the bibtex entry. No one uses just a citation without the content of the paper.

      So you have to prepare the content, and you might as well submit it to those journals, conferences :)

      • . . . will this actually be useful?

        >This system will be useful when one has a paper
        >in hand, but does not have the bibtex entry.

        Perhaps I'm spoiled by working in a field with very good online databases and journals that require only brief bibliographic entries, but it's hard to imagine where this would actually be useful. 95% of the papers one has in hand were located via an online database and came with bibtex entries. On the rare occasion one finds a paper copy of an article and no bibtex e

  • by lofi-rev ( 797197 ) on Monday August 02, 2004 @06:17PM (#9865259) Journal
    oh wait....

    Seriously, having a collaborative system for journalism with moderation and web-of-trust-like elements could be wonderful - anyone got any bright ideas on how to do it?
    • by Anonymous Coward on Monday August 02, 2004 @06:25PM (#9865318)
      I'm seeing a, a number. Yes, it starts with a 5. I believe it's past 500. It's becoming clearer...I see the number 503.

      Did you just ask a question? If you did, it appears the answer is "No"
    • Well, I have been talking with a few friends about something along these lines - first a website, but then a p2p client that decentralizes information. We're right at the birthing stage, but that's probably why no one has done it. The only way to pull it off is through fair use, and the only way to pull that off is to be completely non-profit.

      The ideal would be a system that posts articles from current news sources (full text, not just links), individuals on the site, with a commenting feature. The art
      • I have a very similar idea... now don't laugh at me... then I realized what Usenet is (sue me, I never used it before).

        I think a news syndication/decentralized publication system would be the greatest application ever made... It would be a killer app for p2p... The uses are endless, but done right, it could be run as a server backend for dynamic websites (like Google News, almost) or, by using routing and encryption algos, could be the answer to anti-censorship...

        OK, those were two huge ideas that wouldn't be done rig
        • Heck, you could go so far as to give the option to use public key encryption, so that only readers who have the author's public key can read the article, and thus verify that the latest story from John, is actually John, and is not just spoofing the identity.

          You've got that mixed up. By definition the public key is public, so ...only readers who have the author's public key... is effectively anybody who is interested.

          • You are right, but let me clarify what I meant.

            Using public key encryption means that John uses his private key to encrypt his article. Now, users who track John's articles can decrypt using his freely available public key. That means anyone who wants to can read John's articles, true, but it also means that when an article is said to be written by John, a user can prove it, because only John can encrypt using his private key... It authenticates the author, so you can't spoof someone's identity.
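What the comment above describes is usually called a digital signature: the author applies the private key to a digest of the article, and any reader checks the result with the public key. Here is a minimal sketch, assuming a textbook-sized RSA key (the numbers are illustrative and far too small to be secure) and hypothetical helper names `digest`, `sign`, and `verify`. Nothing here is taken from Bibster or any real news system.

```python
import hashlib

# Toy RSA key for illustration only (hypothetical values, NOT secure).
P, Q = 61, 53
N = P * Q    # public modulus (3233)
E = 17       # public exponent: anyone may know (N, E)
D = 2753     # private exponent: only John knows it; E * D == 1 mod (P-1)*(Q-1)


def digest(article: str) -> int:
    """Hash the article and reduce the hash into the range of the toy modulus."""
    h = hashlib.sha256(article.encode("utf-8")).digest()
    return int.from_bytes(h, "big") % N


def sign(article: str) -> int:
    """John applies his private exponent to the article's digest."""
    return pow(digest(article), D, N)


def verify(article: str, signature: int) -> bool:
    """Any reader checks the signature using only the public values (N, E)."""
    return pow(signature, E, N) == digest(article)


article = "Latest story from John"
signature = sign(article)
print(verify(article, signature))            # the genuine signature checks out
print(verify(article, (signature + 1) % N))  # an altered signature is rejected
```

The article itself stays readable by everyone. Only the short signature depends on the private key, which is why spoofing John's identity would require his private key.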
  • by macshune ( 628296 ) on Monday August 02, 2004 @06:22PM (#9865300) Journal
    Future conversation between two illustrious academics:

    "Could you send over that citation for that lagomorph genome paper?"

    "Sure thing. I'll send some Steely Dan too, it helps me when I read papers about the lagomorph genome."

    "31337, thx."

    • Which somehow also shows that an illustrious academic as a concept is a (near?) constant.

    • by Anonymous Coward
      Doom 3 has been massively pirated this weekend, at record highs. Apparently, it's shaping up to be one of the most pirated games ever. Estimates are that id Software has lost up to 2 million dollars. Activision isn't saying anything at this point. Gamespot and the BBC both have articles on the news. The PC Gamer editor has some words for the pirates in the BBC article. This setback is set to cost Activision and id Software millions. John Carmack is reportedly very unhappy.
  • by Anonymous Coward
    P2P isn't just for pirating music anymore?
  • Citation Index (Score:4, Interesting)

    by wayward ( 770747 ) on Monday August 02, 2004 @06:25PM (#9865322)
    That looks promising. Will there be an easy way to see a citation index - for example, listing all the publications that cite a given article? (Citeseer does this, and this can be important to academic types.)
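A citation index of the kind asked about here is just the inverse of each paper's reference list. A minimal sketch, assuming hypothetical paper keys and a plain dictionary as the data source (this is not how Bibster or CiteSeer store their data):

```python
from collections import defaultdict

# Hypothetical data: each key cites the papers in its list.
cites = {
    "smith2001": ["jones1998", "lee1995"],
    "brown2003": ["jones1998"],
    "jones1998": ["lee1995"],
}


def build_citation_index(cites):
    """Invert 'paper -> papers it cites' into 'paper -> papers citing it'."""
    cited_by = defaultdict(list)
    for paper, references in cites.items():
        for ref in references:
            cited_by[ref].append(paper)
    # Sort each list so the output order does not depend on input order.
    return {ref: sorted(papers) for ref, papers in cited_by.items()}


index = build_citation_index(cites)
print(index["jones1998"])
```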
  • So... (Score:2, Interesting)

    by theM_xl ( 760570 )
    Is it just me or is a scientific database every idiot can add to a bad idea?
    • Re:So... (Score:5, Interesting)

      by Rosco P. Coltrane ( 209368 ) on Monday August 02, 2004 @06:48PM (#9865455)
      Is it just me or is a scientific database every idiot can add to a bad idea?

      I suppose it's the same as a wiki: I too first thought it was the dumbest idea to allow everybody and their dogs to edit webpages, but in any wiki I used, the content always turned out to have a pretty good S/N ratio. I still don't understand why, but wikis work. Just look at wikipedia... So perhaps this will work too...
      • Re:So... (Score:5, Informative)

        by burns210 ( 572621 ) on Monday August 02, 2004 @10:24PM (#9866356) Homepage Journal
        "I still don't understand why, but wikis work. Just look at wikipedia.."

        Wow... Well, let's put aside the subtle notion that people are benevolent and never do wrong to a wiki, and realize that Wikipedia uses strict moderation and privileges, with a huge moderation team tracking various pages along with the ability to ban users or lock pages from being edited (George W. Bush's page cannot be edited, for example).

        Wikis work because they have a chain of command.
    • Re:So... (Score:3, Insightful)

      by j1m+5n0w ( 749199 )
      Is it just me or is a scientific database every idiot can add to a bad idea?

      Maybe. On the other hand, an encyclopedia every idiot can add to turned out alright. But they have a certain amount of centralized control to keep things from getting out of hand.

      Fortunately, few idiots (or anyone else) have much of an incentive to falsify bibliographic data.


    • I tend to agree, unless someone else knows better. One can imagine a large amount of misinformation floating around. We could end up with a bunch of people believing that aliens crash landed in the desert. Or that processed food is healthy.
    • If an author doesn't know enough about an article to tell whether basic bibliographic information is right or wrong, he shouldn't cite the article.

      Authors, titles, approximate year of publication, etc... Anyone citing an article should see at a glance that these are correct. So what damage can a forger do? Start inserting false publisher fields? A bit embarrassing for the author, perhaps, but nothing too serious.
  • by UniAce ( 713592 ) on Monday August 02, 2004 @06:48PM (#9865458) Homepage
    What would be really nice is to have the full texts of articles available P2P. That's the advantage of using centralized databases from subscribing locations (like universities): you can sometimes access full text for newer articles with just one click. Swapping full texts would be tremendously useful (and would keep us lazy scientists from having to actually get up and go to the library).

    Yeah yeah, I'm sure there are copyright issues... but doesn't fair use apply somehow? I'm a psychology research assistant at a major university, and at weekly lab meetings we often send around articles by email for everyone to read and then discuss, and I've never even really thought about copyright of them until now. Isn't open sharing of knowledge at the heart of the scientific endeavor?

    Oh, and also: it would be awesome if user comments could be added to each citation. Like: "this was an influential paper that opened new directions for research on human memory," etc. Of course, you can also get a ROUGH idea of that kind of thing by how many times a paper's been cited by other papers, as someone else already said.
    • by j1m+5n0w ( 749199 ) on Monday August 02, 2004 @08:35PM (#9865950) Homepage Journal

      CiteSeer has full text available for most of its articles, and it's a free service, so maybe copyright isn't such a big deal for some reason. Maybe it's because most papers in computer science are available from the author's website.


      • by Anonymous Coward
        OK, so if citeseer has text for most articles and abstracts + citations for all, then explain why we need a P2P service to do less?
        • Citeseer only sees about 70% (iirc) of the papers in Computer Science, and basically none outside... and its attempts at BibTeX are usually rubbish... and it's up and down like a tart's... (well, until the mirrors are properly sorted and stable).
    • I've been working on a similar idea for news, and as far as I can tell fair use completely applies to this specific idea of yours - education and the arts, unbiased, not for profit.

      There are already some sites out there doing something similar, like the Media Awareness Project, which collects and archives research on drug policy. From what I can tell, they only get sued when they get too big, present content with a bias, or try to profit.

      I find it hard to believe my little project is t
    • Sounds almost like Usenet for the comments over p2p, and SubEthaEdit for the group editing, with the added ability to include hidden comments, of course.
    • What would be really nice is to have the full texts of articles available P2P.

      That's quite easy to do: if I have the article in ps or pdf, then the name of the file is the name of the bibtex-key. And every article is in the 'articles' directory next to the beloved .bib file.
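The naming convention described above (one file per entry, named after the BibTeX key, in an `articles` directory next to the .bib file) can be sketched as follows. The regular expression is a simplification that assumes each entry starts with `@type{key,`, and the two sample entries are hypothetical.

```python
import re
from pathlib import Path

# Matches the start of an entry such as "@article{doe2004," and captures the key.
ENTRY_KEY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def article_paths(bib_text, articles_dir="articles", suffixes=(".pdf", ".ps")):
    """Map each BibTeX key to the candidate file names in the articles directory."""
    base = Path(articles_dir)
    return {
        key: [base / f"{key}{suffix}" for suffix in suffixes]
        for key in ENTRY_KEY.findall(bib_text)
    }


# Hypothetical bibliography with two made-up entries.
sample = """
@article{doe2004,
  title = {A Hypothetical Article},
  year = {2004}
}
@inproceedings{roe2003,
  title = {Another Made-Up Entry},
  year = {2003}
}
"""

paths = article_paths(sample)
for key, candidates in paths.items():
    print(key, [p.as_posix() for p in candidates])
```

A real script would then check which candidate file actually exists, for example with `Path.exists()`.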

      it would be awesome if user comments could be added to each citation.

      I use the annote field. However, how can you be sure that the review is accurate?
    • What would be really nice is to have the full texts of articles available P2P.

      S2S is such a network, with academic users as the target group. It is currently in a test phase. Sponsored by the German government. Also includes an expert client, where you can sign yourself up as an expert for a specific area and get to answer questions. According to the current statistics, the network provides over 1 million documents.

      The homepage is in German.
  • Standards based? (Score:5, Interesting)

    by azaroth42 ( 458293 ) on Monday August 02, 2004 @07:06PM (#9865564) Homepage
    The next big question is whether or not it's standards based. While it would be surprising if it used Z39.50, it would be a shame if it didn't use SRW and/or CQL.

    Especially as NISO is recommending them in their current 'Metasearch Initiative' -- an industry/academic/government cross sector committee with the major players and interested parties for allowing cross searching of bibliographic databases with other sorts of things.

    (ObDisc, member of both SRW Editorial Board and Taskgroup 3 of NMSI)

    • Re:Standards based? (Score:4, Informative)

      by Noksagt ( 69097 ) on Monday August 02, 2004 @07:45PM (#9865721) Homepage
      Unfortunately not. Nor does it seem to use MODS XML for record storage (which, incidentally, will be used by […]'s bibliographic and the bibliophile project, which hopes to do cross searching across the open source literature databases).

      SRW/U hopes to supplant Z39.50. Not only does it use MODS, but it still uses ZeeRex and CQL.

      For more nerdy e-reference stuff, check out darcusblog.
    • FYI, Bibster uses Semantic Web technology and standards, actually. Data is stored in RDF, and peers retrieve data using the SeRQL RDF query language.
      • I think that in referring to "standards," the poster above was referring to the communications protocol. SRW is a quite interesting one, taking all the experience of the z39.50 community and applying it to the world of XML and web services.

        The other big issue I have with Bibster is that it is based on bibtex, which may be widely used in the hard sciences, but which is not international-friendly, has a bad data model insufficient to the task of representing the sorts of data that scholars in the humanities
  • But.. (Score:5, Interesting)

    by iantri ( 687643 ) on Monday August 02, 2004 @07:12PM (#9865590) Homepage
    What guarantees accuracy? What guarantees high-quality results?

    If we were to look at another project, say, CDDB, which stores meta-data for CDs (Title, Artist, Track Listing), something not at all unlike storing meta-data for books (bibliographies), you'll note that CDDB's entries are frequently inaccurate, misspelled and just plain wrong.

    When it comes down to it, I don't really trust Random Joe to provide accurate, trustworthy info. It's not like it's Wikipedia or anything, which has constant peer review and a clear history.

    • But with the use of users and profiles, a person could become credible/reliable, or not, and the information would weed itself out that way. P2P makes it harder to moderate, but not impossible. And the benefits most surely outweigh the added work.
  • I'm a geek... (Score:3, Insightful)

    by lukewarmfusion ( 726141 ) on Monday August 02, 2004 @07:42PM (#9865711) Homepage Journal
    ...married to a non-geek (getting her PhD in Psych). When I told her about this system, she said:

    "My system's better anyway. I have a file, with the exact bibliography printed on the folder, for every article I've read or written. If I need one, it's right there. If I need to use the citation, I can just copy it from my Excel spreadsheet. Now why would this thing be better?"

    Some people are born geeks, I guess.
    • Re:I'm a geek... (Score:3, Interesting)

      by RealAlaskan ( 576404 )
      If I need to use the citation, I can just copy it from my Excel spreadsheet. Now why would this thing be better?

      This would be better because when she reads a new article, she could get the bibliography from someone else, rather than having to type it in herself.

      Of course, if she has read so few papers, and does so little writing, that Excel (and Word? Ick!) work for her purposes, then this might be an exercise in gilding lilies.

      I use Emacs, with reftex and bibtex, and find that it works far better fo

      • She has less than I realized. In her two years as a grad student, she's collected about 550 articles total. An Excel spreadsheet will have no problem with that. Eventually, she might move to an Access database (or, if she'll let me, a SQL Server DB with a nice web interface).

        By that time, she'll probably abandon her filing cabinets. It's one thing to keep a few hundred files, but we won't have room for ten years of her readings.
        • The big problem with using a spreadsheet for this is that in order to write a paper, you have to enter the bibliographic information in a different format for each journal you might submit to. Bibtex automates that, so you enter the information, and the formatting is done for you.

          As I said, I use Emacs with Reftex for writing. Reftex will let me search for citations in my bibliography file. I use a key made up of the first author's family name and the year. I have innumerable papers in my bibliography

          • Psychology has a standard format - APA. You're right about the electronic copies, though. Many articles are paper only, or are tough to get online.

            Her advisor is anti-computers. She's afraid that the computer will change the data when you're not looking. They employ about 15 people to run their lab (includes students, who receive credits and not money). I'd guess that most of those students could be removed from the process and replaced with a computerized data input process. You know, instead of 80 page
            • APA is one of the formats that Bibtex supports. I used it for a term paper once, just to be different. Even if there is never a need to reformat the entries, it's a real bother to have to make the citation, and remember to copy the entry into the appropriate place, and so on. Is the APA style one of those which requires that the entries be alphabetized and numbered? Those are nightmares to handle by hand.

              We used to perform some economics experiments by hand. It was a lot of bother, but for some thing

    • Re:I'm a geek... (Score:4, Interesting)

      by imkonen ( 580619 ) on Monday August 02, 2004 @11:10PM (#9866540)
      "I have a file, with the exact bibliography printed on the folder, for every article I've read or written."

      I tried to keep a system like that going for a while. It's one thing to be good about saying "Wow, that was a good article, I should fill out the bibliography right now in case I should like to cite it someday." It doesn't take much discipline since it happens roughly once a year. It takes a whole other level of discipline I just don't have to keep filling in those entries for articles I get bored with halfway through, stacks of articles my boss dumps on my desk, articles I read and decide are completely irrelevant to anything I'll ever be interested in, etc.

      Nowadays I just use SciFinder or one of the other databases which can export in citation-manager-friendly format instead of typing in by hand. I'm not sure I see how P2P would make my life any easier. However, these are all (SciFinder, SciSearch, ISI to be sure, not so sure about others) for-fee databases that require my University to pay a subscription. I'm all for the free exchange of information, especially in the scientific community, so if this facilitates it, I'm on board.

  • by Anonymous Coward
    I wonder why there is no Mac OS X version. There are many scientists on OS X. It can't be a very hard port since they have a Linux version, can it?
  • What an interface! (Score:4, Interesting)

    by Sajma ( 78337 ) on Monday August 02, 2004 @09:20PM (#9866166) Homepage
    A possible inquiry could be: I am searching for topics about peer-to-peer technologies.
    As a result Bibster returns bibliographic entries concerning peer-to-peer technologies.

    Next, they'll perfect image search:
    A possible inquiry could be: I want to see defiance in the face of insurmountable odds.
    As a result Imagester returns images depicting defiance in the face of insurmountable odds.

    Seriously, are they offering anything better than standard keyword and author search? What I'd really like to see is such a bibliography database that ranks search results using a PageRank-like algorithm (as I recall, the idea for PageRank derived from research on citation graphs, so this would bring things full circle).
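A PageRank-like ranking over a citation graph, as suggested above, could look roughly like this. It is a minimal sketch using plain power iteration, a hypothetical four-paper graph, and a damping factor of 0.85 chosen as a conventional default. It does not describe how Bibster or Google actually rank anything.

```python
def citation_rank(cites, damping=0.85, iterations=50):
    """Rank papers by a PageRank-style power iteration over 'paper -> cited papers'."""
    papers = set(cites)
    for refs in cites.values():
        papers.update(refs)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Papers that cite nothing spread their rank evenly over all papers.
        dangling = sum(rank[p] for p in papers if not cites.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in papers}
        # Every paper passes an equal share of its rank to each paper it cites.
        for paper, refs in cites.items():
            for ref in refs:
                new_rank[ref] += damping * rank[paper] / len(refs)
        rank = new_rank
    return rank


# Hypothetical graph: A, B and D all cite C; D also cites A; C cites nothing.
cites = {"A": ["C"], "B": ["C"], "D": ["C", "A"]}
rank = citation_rank(cites)
for paper in sorted(rank, key=rank.get, reverse=True):
    print(paper, round(rank[paper], 3))
```

Search results matching a query could then be sorted by this score instead of by raw citation count, so that a citation from a highly ranked paper counts for more.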

    I'd also like to see Google start parsing publications and indexing them by author, year, and citations. The bibliography databases that I'm familiar with require manual input of new entries; it would be cool if this could be done automatically instead. Of course, there will need to be some interface to correct erroneous entries, and this opens up a large can of worms.

    • I'd also like to see Google start parsing publications and indexing them by author, year, and citations.

      Google could start by making use of "author" and "date" meta elements of all web pages and providing a search field for them on the "Advanced search" page.
    • Seriously, are they offering anything better than standard keyword and author search?

      Yes, though it may be hard to see this at first. The system makes it possible to query for specific properties of citation entries, which is more precise than simple keyword search. Also, in the current release of the software the interface is limited to a few 'fixed' properties, but there is no underlying technical reason for this, it could be easily extended to allow the user to search for arbitrary properties of any ci
  • I don't know if this is a direction I like seeing P2P networks go, in the sense that full articles would be available for download. With some tweaking of the idea, I think there could be an advantage.

    Many universities are paying tons of money to privatized databases to store either full text articles (for some) or simply the abstracts so students can search and read articles to their heart's delight. They are, in my experience, unreliable as well. The systems crash, you get database errors or lose the connect
  • by burns210 ( 572621 ) on Monday August 02, 2004 @10:42PM (#9866405) Homepage Journal
    OK, these p2p apps are awesome, but I see a problem: each needs to maintain its own p2p system (protocol), either by forking another project or by writing one from scratch, or else it needs to piggyback on another network...

    When will someone sit down, using an open source model of course, and write the 'granddad' p2p protocol? It doesn't have to require everything, it just has to be able to support everything... Encryption, hidden routing (not being able to tell who is requesting data vs. who is just passing data along), multiple-source download, huge scaling, efficient and distributed search, etc.

    This public network could become the de facto base that open source apps work off of. As long as the protocol is the focus (a nice GUI as well, but separate the frontend from the backend), you could use it to link to files on your website, or you could have multiple apps (a music/Napster-like app, a scientific research paper app, a bibliographies app, a Usenet discussion thread app), each of them using a common protocol and routing between them, with each app filtering out the noise it doesn't want.

    It could be the killer app; every major p2p app could migrate to it. Project Gutenberg, Bibster, […], all using a common protocol and network.... *drools*
    • Check out U-P2P (yes, I'm involved in this). It's a fairly easy to use P2P framework where the piping is already done, and all you need to do is specify the schema of the type of document you want to share. You can then create a "bibster", a "stampster", ..., and each community is itself part of a "communityster", so you can publish or discover communities using the same mechanism. Would that fit your description?

      It's still beta stuff, but there are also some publications along with the code on the site.

    • by Anonymous Coward
      What, you mean like […]...
  • NLP would be nice (Score:2, Insightful)

    by tgibson ( 131396 )
    eTBlast is a bibliographic search engine to which you submit an entire abstract. With a little natural language processing, the results returned are articles which have similar abstracts. Though the tool operates on the Medline database, there is no reason the algorithm couldn't be used with Bibster.
  • There needs to be some way to double check the citations, or rate the sources of the cites, or those who like to pad their papers and make up scientific-sounding stuff for websites will have a good time with this. Too good, and it will be full of bogus references to Timmy's article on Cold Fusion.
  • Here in academia, a big problem that students and faculty face is managing their personal publications.

    For example, a faculty member may be sponsored by several different projects, each of which wants that faculty member to update their web page with each new publication.
    Odds are, most faculty will update their own personal page and possibly one project page. This leaves the other projects needing to harangue the faculty member into updating their pages.

    For example, a postdoc comes and visits, write

  • A decentralized, indexed and well documented database that everyone can access...

    What's the difference between a database and a hard drive again?
  • In what character set is the bibliographic information stored? If one has a library whose titles are in a zillion different scripts, then it is really essential that one keeps one's bibliography entries in UTF-8. Does this system use such an international encoding, or does it expect us to submit to romanizing all our foreign-script titles just to avoid the issue?
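To make the encoding concern concrete: titles in several scripts survive a round trip through a file only if it is written and read with an encoding such as UTF-8 that covers them all. A minimal sketch, with hypothetical entry keys and a temporary file standing in for a bibliography. Whether Bibster itself stores data this way is not stated in the thread.

```python
import tempfile
from pathlib import Path

# Example titles in Cyrillic, Japanese, Greek and Latin scripts.
titles = ["Война и мир", "源氏物語", "Ἰλιάς", "Les Misérables"]


def round_trip(titles):
    """Write one minimal BibTeX-style entry per title as UTF-8 and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "titles.bib"
        entries = "\n".join(
            f"@book{{key{i}, title = {{{t}}}}}" for i, t in enumerate(titles)
        )
        path.write_text(entries, encoding="utf-8")
        return path.read_text(encoding="utf-8")


text = round_trip(titles)
print(text)
print(all(t in text for t in titles))
```

Passing `encoding="utf-8"` explicitly matters because the platform default encoding may not be able to represent every script.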

  • Well, don't answer that. This isn't really about me. I hope.

    I've installed the thing. It seems to see peers. So I thought I'd search for a very, very common author. I entered Dana Scott. Nothing. I entered Tanenbaum. Nothing. I entered local boy Vaandrager. Nothing. I entered Barendregt. Nothing. I entered "concurrent". Nothing.

    I entered my name. I got everything I've ever published. But then I had imported my own Bibtex files, so I'm not surprised (I've never cited any matches for the abo
    • The user base is VERY small. Before I submitted it to Slashdot, there were six peers. And I assume it is skewed too: the ACM box at the bottom suggests that it is only for Comp. Sci. texts, and experience suggests that it is probably a select subset of those.

      The real trick is getting your peers to buy in to the program--they will likely have many references you'd be interested in anyway.
  • Am I the only one that tried this thing out and thought, "Damn. Look at that real estate."

    I'd like it more if I was uploading in the background and my queries had a lighter, smaller interface, say a shell interface. Better yet (much better) an xemacs interface that works well with reftex.

    I know that the latter is much too much to ask for a young product, but I hope that the authors give developers (not me) some APIs to get some lighter weight clients out there.

    Assuming I ever get a non-local match for