Open Source Library Card-Catalog Apps?
dmd writes: "Does there exist Open Source software for maintaining a small to medium sized library card-catalog? It seems all the tools are available: a Perl module for working with MARC records, several for working with Z39.50 and XML, and even a web site apparently devoted to nearly this exact topic. An actual, working, catalog, however, seems to be missing. Is this something that would be valuable? I, for one, have nearly 5k volumes in my collection, and they're begging for some discipline." I'm sure cash-strapped public libraries and schools would like to be able to use free / Free tools for this, since paper books aren't going away anytime soon. Not to mention for CDs, videos, charts, museum holdings ... any ideas out there? Turnkey solutions?
Universities == The funding for this (Score:1)
...And it gets the biggest libraries in our country (the ones at research Universities) the best possible system, or at least the roots for a community-developed system.
Sean
Bibtex+ pybliographer? (Score:1)
http://sourceforge.net/projects/pybliographer
ROADS (Score:1)
"The ROADS software is a collection of tools which can be used in building on-line catalogues of Internet resources. Key features are simple text based resource description format, World-Wide Web forms based resource description editor, WWW and WHOIS++ based search capability, automatic generation of customised views of the catalogue, automatic generation of listings of recently added resources, dynamic browsing of resources in particular subject categories, highly customizable HTML output, distributed indexing and searching across multiple WHOIS++ servers using the Common Indexing Protocol and more." (freshmeat)
Available here [lut.ac.uk]
Great opportunity for a project (Score:3)
Those schools that do have money move to software like Eloquent [eloquent-systems.com] -- systems that are way more complex than a school library typically needs. Most schools don't need that much power/customisation, and can't afford it anyway. What seems to be needed is a basic system that offers searching on author/title/subject/keyword, and possibly uses MARC records (though for a school library this is not essential).
It would have to be easy to set up, and low maintenance (i.e. a basic Linux box shoved under a desk somewhere with a UPS and a tape backup). You need to keep in mind that libraries -- and school libraries in particular -- are likely to have a multitude of machines running different OSes, so something like a web interface would be perfect.
Considering the fact that most schools are getting networked these days, it's feasible to have a Linux box sitting under a desk somewhere running a database, some library software, and Apache, and a bunch of Mac/PC clients running MacOS and Windows and interfacing to this thing via a web server. The checkout could be the same idea. This could be extended to have non-web clients running on various platforms and talking to the server via CORBA.
In talking with librarians, I've found that you can't just say "dump MacOS/Windows and put Linux on all your machines" because they don't just use them for searching. They use them to run all sorts of stuff -- CD-ROM based educational software, etc. In other words, it's important to remember that for software like this, you can't just get a bunch of developers together and make decisions and write code. There are a ton of assumptions you just can't make when you're dealing with libraries and schools. There's a bunch of research into what people really want that's required. That makes it a little trickier a project than, say, a mahjongg game -- no offense to mahjongg hackers...
Anyway, this is a fantastic opportunity for development, and one that I have been very interested in for a while now. It's also been on the GNU project's list of stuff to do for years now. Contributing a GPLed library system would be great not only for Free Software, but also for schools everywhere who can't afford decent software in their libraries.
MySQL couldn't do it right (Score:3)
And I don't mean this to sound like a slam against MySQL. No SQL database could do it in a way that a librarian would be completely happy with, primarily because of the wonderful MARC format.
The MARC format is the standard format used to store bibliographic information. It was originally created in the early '60s, with the idea that the primary means of transmission would be on tape. It supports well over 300 different major fields, ranging from simple ones that anyone would understand (author, title, publisher) to arcana that only a trained librarian could love (is the item a festschrift, unusual pagination comments, magazine run dates, and on and on.) Most of the major fields have "sub-fields", where the data is broken into different elements (i.e. an author field will have a name sub-field, a dates sub-field, a title sub-field, and possibly others.)
Fields in the MARC format have a theoretical maximum length of 10,000 characters. Many of the fields can be repeated any number of times (co-authors, variant titles, subject headings). I've seen several attempts to model the MARC format in a relational model, and while it can be done, it's a royal pain in the ass and it inevitably winds up with trade-offs.
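To make the trade-off concrete, here is a minimal sketch of one way such a relational model could look, using Python's sqlite3. The three table names (record, field, subfield) and the sample data are my own assumptions, not any real system's schema. Repeated fields and sub-fields turn into extra rows with a sequence number, which is exactly why every lookup needs a join.

```python
import sqlite3

# Hypothetical schema: one row per record, per field occurrence, per sub-field.
SCHEMA = """
CREATE TABLE record   (id INTEGER PRIMARY KEY);
CREATE TABLE field    (id INTEGER PRIMARY KEY,
                       record_id INTEGER REFERENCES record(id),
                       tag TEXT NOT NULL,       -- e.g. '100' author, '245' title
                       seq INTEGER NOT NULL);   -- keeps repeated fields in order
CREATE TABLE subfield (field_id INTEGER REFERENCES field(id),
                       code TEXT NOT NULL,      -- e.g. 'a' name, 'd' dates
                       seq INTEGER NOT NULL,
                       value TEXT NOT NULL);
"""

def add_record(db, fields):
    """fields: list of (tag, [(code, value), ...]); tags may repeat."""
    rec_id = db.execute("INSERT INTO record DEFAULT VALUES").lastrowid
    for fseq, (tag, subs) in enumerate(fields):
        fid = db.execute(
            "INSERT INTO field (record_id, tag, seq) VALUES (?, ?, ?)",
            (rec_id, tag, fseq)).lastrowid
        db.executemany(
            "INSERT INTO subfield VALUES (?, ?, ?, ?)",
            [(fid, code, sseq, val) for sseq, (code, val) in enumerate(subs)])
    return rec_id

def values(db, rec_id, tag, code):
    """All values of one sub-field code under one tag, in stored order."""
    rows = db.execute(
        """SELECT s.value FROM field f JOIN subfield s ON s.field_id = f.id
           WHERE f.record_id = ? AND f.tag = ? AND s.code = ?
           ORDER BY f.seq, s.seq""", (rec_id, tag, code))
    return [r[0] for r in rows]

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
rid = add_record(db, [                       # made-up sample data
    ("100", [("a", "Example, Author"), ("d", "1900-1980")]),
    ("245", [("a", "An example title")]),
    ("650", [("a", "Exploration")]),
    ("650", [("a", "Trails")]),              # repeated subject heading
])
print(values(db, rid, "650", "a"))
```

It works, but notice that even "give me the subject headings" is a two-table join, and rebuilding a whole record for export means walking all three tables in order.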
For a simple catalog, where you aren't worried about working with the MARC format, a relational database (including MySQL) will be perfectly adequate. But librarians love the MARC format, and it is such a basic element of modern librarianship that any system that couldn't import and export it would be considered unacceptable - like a car with a crank starter.
And I should know. I worked as a librarian for several years; I even have the MLIS to prove it.
for what its worth (Score:1)
---
MARC And libraries and Open source (Score:1)
I, too, have been considering creating an open source and extensible library management system. Three other people in the world probably know of 'FreePac' (Free Public Access Catalog). But unfortunately it's still vapor. (I have a C++ class for handling MARC data, if anyone's interested...)
The problem with using a SQL database straight up is the complexity of MARC. Many moons ago, MARC records used to be delivered on _tape_. They're formatted thus: [directory][data][record terminator]... MARC records are sub-organized, and there are nearly no length restrictions on the data you can put in each tag (except for fields with codes).
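For the curious, here is a rough sketch in Python of reading that tape-style layout. It assumes the usual MARC 21 / ISO 2709 conventions: a 24-byte leader in front of the directory, 12-byte directory entries (3-digit tag, 4-digit length, 5-digit offset), and the 0x1E / 0x1D / 0x1F bytes as field, record and sub-field delimiters. The function names build_record and parse_record are made up for illustration, and the builder only handles plain data fields so the parser has something to chew on.

```python
FT, RT, SF = b"\x1e", b"\x1d", b"\x1f"   # field, record, sub-field delimiters

def build_record(fields):
    """fields: list of (tag, indicators, [(code, value), ...]) data fields."""
    directory, data = b"", b""
    for tag, ind, subs in fields:
        body = ind.encode() + b"".join(
            SF + c.encode() + v.encode() for c, v in subs) + FT
        # Directory entry: tag, field length, offset from the start of the data.
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode()
        data += body
    base = 24 + len(directory) + 1        # leader + directory + its terminator
    total = base + len(data) + 1          # plus the record terminator
    leader = f"{total:05d}nam  22{base:05d}   4500".encode()
    return leader + directory + FT + data + RT

def parse_record(raw):
    base = int(raw[12:17].decode())       # leader bytes 12-16: where data starts
    directory = raw[24:base - 1]          # 12-byte entries, minus the terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode()
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        body = raw[base + start: base + start + length - 1]   # drop the FT byte
        if tag < "010":                   # control fields have no sub-fields
            fields.append((tag, body.decode()))
            continue
        ind, *subs = body.split(SF)
        fields.append((tag, ind.decode(),
                       [(s[:1].decode(), s[1:].decode()) for s in subs]))
    return fields

sample = [                                # made-up sample data
    ("100", "1 ", [("a", "Example, Author"), ("d", "1900-1980")]),
    ("245", "10", [("a", "An example title")]),
]
raw = build_record(sample)
print(parse_record(raw))
```

Parsing really is the easy part. The four-digit length in each directory entry is also where the per-field size ceiling comes from.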
A product like this would be great for libraries and schools, both of which operate on a low budget. Things that would be great to support: web-based access to take advantage of high-end machines and new web browsers, and definitely telnet/text-only/terminal access, because libraries and schools usually have old terminals and old PCs.
If anyone would like some help setting this up... I'd be happy to offer.... C++, Perl, HTML, Linux, VB, (and a bit of Java)
The librarian's holy grail? (Score:1)
A request, if someone is thinking of writing their own. The old catalog system at the University of Washington Libraries [washington.edu] allowed searching on multiple fields. For example, you could combine author "clark" with keyword "trail" to find Lewis & Clark related materials without scrolling past A. Clark's Acrylic Plastic Spherical Pressure Hull For Continental Shelf Depths.
Unfortunately, this functionality is disabled in their new catalog (html or telnet). This is a simple task for any database, but I can't remember a library I've visited in years that allows it.
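Just to show how little it takes, here is a minimal sketch in Python with sqlite3. The book table, its two columns, and the second and third sample rows are assumptions for illustration, and "keyword" is simplified to a substring match on the title.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE book (author TEXT, title TEXT)")
db.executemany("INSERT INTO book VALUES (?, ?)", [
    ("Clark, A.", "Acrylic Plastic Spherical Pressure Hull For Continental Shelf Depths"),
    ("Clark, William", "Journals of the trail west"),   # made-up title
    ("Smith, J.", "Trail guide"),                        # made-up row
])

def search(db, **terms):
    """AND together one case-insensitive substring match per named column."""
    allowed = {"author", "title"}
    cols = [c for c in terms if c in allowed]    # only known column names reach the SQL
    where = " AND ".join(f"{c} LIKE ?" for c in cols) or "1"
    params = [f"%{terms[c]}%" for c in cols]
    sql = f"SELECT author, title FROM book WHERE {where} ORDER BY title"
    return db.execute(sql, params).fetchall()

print(search(db, author="clark", title="trail"))
```

Searching on author alone still drags in the pressure hull; adding the second term is what filters it out.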
Input from a library geek. (Score:4)
Commercially available library software that is actually used by libraries is much more than just a cataloging/look-up system to replace those old 3x5 cards.
You need an acquisitions module that has the ability to do electronic ordering and approval plan processing.
The search and report capabilities on the staff interface for these things are amazing. I can collect a list of all item records belonging to location X and created within [ range of dates ] that are attached to bibliographic records for [ material type ] within a [ call number range ], sort the records according to my criteria, then output selected fields from either the bib. or the item, or both, in the order I choose to the device of my choice (including print to e-mail or fax) and I haven't even begun to make the system sweat. Yes, this is a fairly straightforward thing to do (selecting records based on data spread across multiple related/linked records) in SQL, but you also need a front end that the end user can comprehend.
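For the SQL half of that, a sketch of the kind of query I mean, in Python with sqlite3. The bib and item tables, their columns, and the sample rows are all assumptions, and comparing call numbers as plain strings is a simplification (real call numbers need a normalized sort key).

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE bib  (id INTEGER PRIMARY KEY, title TEXT,
                   material_type TEXT, call_number TEXT);
CREATE TABLE item (id INTEGER PRIMARY KEY, bib_id INTEGER REFERENCES bib(id),
                   location TEXT, created TEXT);   -- ISO dates sort as text
INSERT INTO bib VALUES (1, 'Title A', 'book',  'QA076'),
                       (2, 'Title B', 'video', 'QA100'),
                       (3, 'Title C', 'book',  'ZA300');
INSERT INTO item VALUES (10, 1, 'MAIN',  '2000-03-01'),
                        (11, 1, 'ANNEX', '2000-03-02'),
                        (12, 2, 'MAIN',  '2000-03-05'),
                        (13, 3, 'MAIN',  '2000-03-07'),
                        (14, 1, 'MAIN',  '1999-01-01');
""")

def item_report(db, location, date_from, date_to, material_type, call_lo, call_hi):
    """Items at one location, created in a date range, whose bib record
    matches a material type and falls inside a call number range."""
    return db.execute("""
        SELECT i.id, b.call_number, b.title
        FROM item i JOIN bib b ON b.id = i.bib_id
        WHERE i.location = ?
          AND i.created BETWEEN ? AND ?
          AND b.material_type = ?
          AND b.call_number BETWEEN ? AND ?
        ORDER BY b.call_number, i.id""",
        (location, date_from, date_to, material_type, call_lo, call_hi)).fetchall()

print(item_report(db, "MAIN", "2000-01-01", "2000-12-31", "book", "QA000", "QA999"))
```

The query is the ten-minute part. The screen that lets a non-programmer pick those six parameters, the sort order and the output device is where the real work is.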
If you're going to code it, it will need to be able to interact with all of the prevailing vendors... Ebsco, Baker & Taylor, Basil Blackwell, Swets & Zeitlinger, Matthews, etc... You will want tech contacts from each of these vendors to fine tune the ordering/receiving/approval interfaces.
Finally, the amount of fiscal reporting done in libraries can boggle your mind. You would never suspect that something so seemingly simple could be so complicated. If you don't have the ability to generate financial reports you might as well go back to index cards and hand written ledgers.
Re:MARC tape issues - giving away your tax dollars (Score:2)
FLS a.k.a. free library system (Score:2)
Re:worthwhile? yes (Score:1)
While this seems to make sense, it's not really true.
As part of our client base, we have about 14 school districts and around 40 libraries. Most of these are using pretty advanced library management software (such as Follett or Athena) and not one of them has had to pay for it.
Grants for this kind of thing abound. We have people cold calling them wanting to give them money to put in NT servers and software.
The problem comes in when the grant money is spent and they can't afford to keep the equipment up, or to pay the line and ISP charges for their Internet connection.
I'm not saying it's not a worthwhile project, I just want people to be aware that there probably aren't a whole lot of libraries who would jump on something just because it was free. It's got to be better.
Re:MARC tape issues - giving away your tax dollars (Score:2)
It's also another example of the LoC getting drunk on technology. There was a great article in the New Yorker about the LoC getting a bee in its bonnet about microfilming. In the process of microfilming newspapers, warehouses full of bound copies of DECADES of original newspapers were tossed/sold/given away only to be replaced by incomplete, blurry and often illegible microfilmed copies. Original sources? Gone *forever*.
The article said that the internet and the web are the next technologies that libraries will sink millions into, all while pitching their original materials. We're literally throwing away our history as we generate it.
Re:Free databases (Score:1)
-dB
You must be a DB Admin ... (Score:1)
If you were an IT guy, you'd suggest we search for the product at microsoft.com
Re:Yes, there is: Koha. (Score:2)
> posters are missing an important point:
> libraries use the MARC data standard for catalog
> records, and
> SQL doesn't cope well with the kind of
> tricks MARC can do.
Several people have said that, but I was hoping for an example or two to support the statement.
Not that I'm interested in MARC, but that I'm curious about where people perceive that SQL has such limitations.
LDAP (Score:1)
Re:This is inane... (Score:2)
This is a fascinating programming task and I was glad to discover the references Slashdot linked to on this subject.
I've never been happy with card catalogs and their electronic descendants or with web based search engines. Knowledge exploration needs a lot of work, and these kinds of tools will lead to interesting results.
If I was young and had lots of time, this would be a really challenging project to work on. Designing a schema to organize knowledge is a really cool task.
Re:simpler and more complex than you'd think (Score:1)
There are already a number of useful Free-as-in-Speech add-ons to proprietary library tools, such as Prospero [ohio-state.edu] and DBA [ucsd.edu]. These are largely possible because they use the standardized bits in the tools to which they add value: Z39.50, TIFF, etc. As you suggest, it's definitely a niche which is proven to work, with these small solutions paying off big-time to many institutions. This might seem off-topic, but the more of these small tools we have the easier it gets to start hooking them together into environments like the one the original poster is asking about.
I think the answer to the "why's Z so slow?" dilemma is like what Larry Wall (I think) said about Perl and Python: that Perl's worse than Python because people wanted it worse. Work on Z39.50 began _long_ before SQL92 hit the markets in working products.
A key area where many vendors and publishers are starting to work together around new open standards and even code is content linking, mostly because they have to. They don't necessarily want to, but _not_ allowing linking to external sources diminishes value, and they're all catching on finally. So there's some hope, but we've got to keep hacking to keep them honest. Trust me, it works -- when a long-proprietary-code/data-vending .com sees 600 lines of GPL'd Perl which can kill off their product line, they're more than willing to start offering up more interoperability if not freeing up their code. It's better for everyone.
Things a library system needs (Score:1)
Be able to read any of three or four dozen MARC "standards".
Search on any word (except stop-list words such as "the", "and", "to", "of", etc.) in any one of over 1K places in the MARC record.
Be able to handle more than one library.
Be able to store books in different locations within the library.
Do user tracking: overdues, notes on the user, automatic import from other databases or text files, per-user options such as loan period (checked against the holding to see whether its standard period is longer or shorter, or whether its overdue fine is less or more), blocking a user, blocking a user based on which library they belong to, and much, much more.
Be able to route books between libraries.
Be able to route books to outside vendors such as binders, restorers, storage, etc.
Be able to find out who checked out a book last.
Maintain statistics on the book such as size, no. of pages, language, print size, ISBN, LCCN, default loan period (which must be changeable), how many times it was used in the library, checked out, or inventoried, price, "extras" such as a clear cover, a place for notes for the whole library system and another place for each library, the ability to search on the note, title, author, publisher, copyright date, date purchased, Dewey, LCN cataloging or some other system, and age limits for checking out.
Keep track of budget and invoices, orders, barcodes, and much, much more.
Be able to keep track of when the next magazine is due, when the subscription runs out, and where the old copies vs. new copies are, and index the mags.
Print reports on just about anything, so throw a report generator at it. And if you want more, there are about five hundred pages of basic requirements in any book on library science. Add to that that you have to know who a user is and whether they are authorized to do any particular function (some can check out but not in, others can add a note but not edit the book, some can edit books but not at any library except where they are assigned). You get the picture.
Offhand, I'd say that a good library system is about four hundred times more difficult than writing Linux. I know. I do Unix admin and library systems for a living.
Trust me, this is not simple. I do think it very worthwhile though.
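To give a feel for just one item on that list, the "search on any word except the stop words, in any field" requirement, here is a minimal sketch in Python. The stop list, the field names and the sample records are assumptions for illustration; a real system would index the actual MARC tags and sub-fields.

```python
import re
from collections import defaultdict

STOP_WORDS = {"the", "and", "to", "of", "a", "an", "in"}   # illustrative stop list

def build_index(records):
    """records: {record_id: {field_name: text}}
    Returns {word: {(record_id, field_name), ...}}, skipping stop words."""
    index = defaultdict(set)
    for rec_id, fields in records.items():
        for name, text in fields.items():
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                if word not in STOP_WORDS:
                    index[word].add((rec_id, name))
    return index

def search(index, word, field=None):
    """Record ids containing the word, optionally limited to one field."""
    hits = index.get(word.lower(), set())
    return sorted({rec for rec, name in hits if field is None or name == field})

records = {                                   # made-up sample data
    1: {"title": "The history of the trail", "note": "Gift of the author"},
    2: {"title": "Trail maps", "subject": "History"},
}
index = build_index(records)
print(search(index, "history"), search(index, "history", field="title"))
```

That is one bullet out of the dozen above, and it still ignores phrase searching, truncation and non-English stop lists.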
There is oclc.org (Score:1)
Check out OpenOPAC (Score:2)
Micro$oft(R) Windoze NT(TM)
(C) Copyright 1985-1996 Micro$oft Corp.
C:\>uptime
Re:MARC tape issues - giving away your tax dollars (Score:1)
The Library does maintain the copyright on its records outside the United States (you didn't pay US taxes did you
own copy of all the complete post-1968 catalog, you can pay LC a tape duplication fee which is supposedly only enough to cover their costs. Of course, it's the government, the home of $700 toilet seats, so their costs are very high.
Library of Congress MARC records prior to 1968 are a bit more complicated. In exchange for the taxpayers receiving a discount for the work, the Library agreed that a private company could convert LC's paper catalog cards into electronic records and sell the records to other people. Of course, you are still free to go to the Library and copy all the cards you want. Nor is there anything stopping another company, or just a group of concerned citizens, from making their own copies of the paper library cards and distributing them for free. You just won't get the electronic files from the private company; you'll have to type them or scan them yourself.
Re:A program is useless without the data. -- OPAC? (Score:1)
It can save you time (Score:2)
LinuxFund (Score:1)
http://209.24.233.82/development/project/?id=22 [209.24.233.82]
Re:I was looking into this once... (Score:1)
A further problem is that these codes have changed over time; MARC records created long ago (how old is MARC?) have incorrect codes that I find I have to deduce and add to my lookup tables.
But yes, MARC is the essential data format. I have code in Perl and Java that can parse MARC fairly well. That's easy -- dealing with the data in some way that makes sense for your collection is harder.
That said, proprietary solutions to this problem (usually referred to as "Web OPACs") run you big bucks. If you think about it, you need a data structure, an interface to populate and edit the data, an interface to query it... for 5000 records, this might not be hard, but it's not a weekend's job.
Re:This is inane... (Score:1)
Re:worthwhile? yes (Score:1)
Computer Science 101 (Score:1)
A program is useless without the data. (Score:2)
Want it? Buy it from them. Complete with their software and a limited license.
I'll try to get the person I heard flaming about this ("Flagrant misappropriation of public records/product of tax dollars", etc.) to post with the details and whether it's still current.
The patent office also cut a similar deal at one point. It was still in effect as of maybe four years ago (when I interviewed at what turned out to be the database company that had the USPTO's sweetheart deal).
You're right (Score:3)
You can't just code a database -- that's almost entirely useless; there's also the matter of controlling circulation, tracking books out/returned/requested/held/sent to bindery, etc.
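As a taste of what "controlling circulation" means beyond a table of books, here is a minimal sketch in Python of an item that can only move between statuses in permitted ways. The status names and the transition table are my own assumptions, not any vendor's rules.

```python
from enum import Enum

class Status(Enum):
    ON_SHELF = "on shelf"
    CHECKED_OUT = "checked out"
    ON_HOLD = "on hold"
    AT_BINDERY = "at bindery"

# Hypothetical rules: which status changes the system will accept.
ALLOWED = {
    Status.ON_SHELF: {Status.CHECKED_OUT, Status.ON_HOLD, Status.AT_BINDERY},
    Status.CHECKED_OUT: {Status.ON_SHELF, Status.ON_HOLD},
    Status.ON_HOLD: {Status.CHECKED_OUT, Status.ON_SHELF},
    Status.AT_BINDERY: {Status.ON_SHELF},
}

class Item:
    def __init__(self, barcode):
        self.barcode = barcode
        self.status = Status.ON_SHELF
        self.history = [Status.ON_SHELF]      # audit trail of every status

    def move_to(self, new_status):
        """Change status, refusing transitions the table does not list."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"{self.status.value} -> {new_status.value} not allowed")
        self.status = new_status
        self.history.append(new_status)

item = Item("0001")
item.move_to(Status.CHECKED_OUT)
item.move_to(Status.ON_SHELF)
item.move_to(Status.AT_BINDERY)
print([s.value for s in item.history])
```

A book at the bindery cannot be checked out until it comes back to the shelf, and that kind of rule lives in the application, not in the database.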
Plus import & export from vendors, billing, accepting bill payments, cross-referencing, all kinds of freaky subject indexing, mondo-bizarro file formats from a zillion years ago (MARC), etc. etc. etc.
There's a reason library systems tend to be proprietary -- it's because nobody else in their right mind wants to get involved with things like MARC and Z39.50.
. . . but then again I could be wrong.
Re:Easy to make (Score:1)
Except that somebody installed Access 2000 on their machine and converted your database's front end.
Did you make this thing to be multi-user? If so, I pity your former employer. If not, not a bad choice.
The overboard solution (Score:1)
And if you're not up for doing much work and aren't especially up for the open source aspect (dumb statement follows:), a Mac or Windows box running FileMaker Pro would have your web-enabled database up and running in half an hour, if you have any form of clue.
Re:Free databases (Score:2)
As stated in other posts, SQL can have serious problems with some of the tricks that go on with MARC format.
simpler and more complex than you'd think (Score:4)
Second is that half of the pieces that go into a big library management system (including the catalog part) are really generic business systems: EDI, invoicing, accounting, etc., but they haven't been abstracted out of the realm of our systems vendors. So few standards are followed there, and those modules generally don't interoperate with our trading partners (i.e. internal payment systems and external suppliers). The result is lots of redundant keying and more crappy systems to maintain, all of which is typically deeply and proprietarily tied into the catalog data.
All that said -- and to our vendors' credit they are tending to get better these days -- we've been sharing catalog data like hackers are sharing code for over 100 years. We've been doing it online for about 35 years, but the way we do it now is pretty much the same way we've been doing it for those 35 years. i.e. largely dependent on one of two .orgs/vendors to be a clearinghouse for sharing catalog data. But those folks disappear if they can't sell the data back to us after we create it for them. So nobody running a library wants them to disappear. Especially because we've got to handle one-of-a-kind rare items in big research libraries as well as unusual local items in public libraries and so on.
Imho the solution is to first outsource all the standard business stuff to vendors+free software that can do the same job with existing standards-based tools. Then abstract away as much as possible of the catalog data into free references sources shared and maintained by the library community (think: you could run your own amazon.com recommendations site, etc.). This is what we're trying to do (shameless plug alert) with the jake project [yale.edu] for journals. Same thing applies for books, although there are probably >=100M records to normalize.
If we can get that done, then anybody could hack up a gtk+ front end to the free, shared catalog, and pick and choose the items you have yourselves. It would work sorta like dict.org or jake. Just imagine how much easier it will be to search for ebooks in gnutella once this is done... :)
Um, no (Score:2)
. . . nevertheless, a database is (by itself) almost entirely useless. It's like saying "hey, you asked for an airplane, so here's some bulk aircraft-grade aluminum. That's good enough, right?"
Unless, of course, you're talking about a "library" system that can't talk to any other library system, doesn't understand any standard library data format, cannot respond to queries in a standardized way, cannot do billing, or payment, or MeSH indexing, or anything else that library systems do.
The database is the raw material, not the finished system. There's a reason there aren't very many library software packages -- nobody in their right mind wants to write one. It's a nightmare.
Go back to writing online shopping carts.
Re:MARC tape issues - giving away your tax dollars (Score:2)
An idea for the coder. (Score:1)
Why not use a floppy for the access card? Besides being slightly bigger than the credit-card sized library cards (my library actually uses a keychain-sized thingy), why not? I'm sure that there is a good way to make it secure, even with the easy accessibility of a floppy editing system.
On the other hand, small libraries will have an easy method of doing everything without paying thousands to get library cards printed. Why not use floppies for access control in general? I can't think of a better way to do it (today), besides buying either an expensive card scanner or barcode reader and ordering the cards for it.
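One possible way to handle the "make it secure" part, sketched in Python: write the patron ID to the disk together with a keyed signature (HMAC) that only the library's server can produce. The secret value, the "id:signature" file layout and the function names are assumptions for illustration. Note what this does and does not buy you: it stops someone from editing the ID on the floppy, but it does nothing against someone copying the whole disk.

```python
import hashlib
import hmac

SECRET = b"library-server-secret"   # hypothetical; would be kept only on the server

def make_card(patron_id):
    """Contents to write to the floppy: the patron ID plus its signature."""
    sig = hmac.new(SECRET, patron_id.encode(), hashlib.sha256).hexdigest()
    return f"{patron_id}:{sig}"

def check_card(card):
    """Return the patron ID if the signature matches, otherwise None."""
    patron_id, _, sig = card.rpartition(":")
    expected = hmac.new(SECRET, patron_id.encode(), hashlib.sha256).hexdigest()
    return patron_id if hmac.compare_digest(sig, expected) else None

card = make_card("P0001")
forged = "P0002:" + card.split(":")[1]   # someone edits the ID, keeps the old signature
print(check_card(card), check_card(forged))
```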
--
Another crucial thing: Digital asset Management (Score:1)
A crucial piece of the puzzle isn't just another relational database; the catch is full-text and field indexing of everything, preferably in a full-text XML format. There aren't any open source full-text search engines that work outside the scope of a standard web site. What is needed is more of a repository, where XML-based content is stored and automatically indexed, so that records of multiple types and/or XML DTDs can be stored in one searchable repository.
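To make the idea concrete, here is a toy sketch in Python of such a repository: documents of different XML types go into one store, and every word is indexed both globally (full-text) and per element name (fielded). The Repository class, the sample element names and the documents are made up for illustration, and it only indexes the text directly inside each element.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

class Repository:
    """One store for XML documents of any type, indexed two ways."""

    def __init__(self):
        self.docs = {}
        self.fulltext = defaultdict(set)   # word -> doc ids
        self.fielded = defaultdict(set)    # (element name, word) -> doc ids

    def add(self, doc_id, xml_text):
        root = ET.fromstring(xml_text)
        self.docs[doc_id] = xml_text
        for elem in root.iter():
            # Index only the text directly inside each element (not tail text).
            for word in re.findall(r"\w+", (elem.text or "").lower()):
                self.fulltext[word].add(doc_id)
                self.fielded[(elem.tag, word)].add(doc_id)

    def search(self, word, field=None):
        key = word.lower()
        if field is None:
            return sorted(self.fulltext.get(key, set()))
        return sorted(self.fielded.get((field, key), set()))

repo = Repository()
repo.add("b1", "<book><title>Zope handbook</title><author>Doe</author></book>")
repo.add("m1", "<memo><subject>Zope upgrade</subject><body>See Doe</body></memo>")
print(repo.search("zope"), repo.search("doe", field="author"))
```

The two documents share no schema, yet one query finds both, and a fielded query can still tell an author named Doe from a memo that merely mentions Doe.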
In short, libraries need a card-catalog system built on a foundation of a digital asset management system, something for which I haven't found many foundations in current OSS projects. Commercial examples of such a beast might include Folio Livepublish [nextpage.com], which is an "infobase" indexing and search engine that creates "repositories" or document collections that are field and full-text searchable; a lot of people use this type of stuff for electronic publishing and for knowledge-management apps. I've used LivePublish as a developer/integrator/OEM at my previous employer. I don't particularly like LivePublish, because it is (in my mind) unstable (it's an NT app), and the company that makes it has a rather ridiculous royalty-based revenue model, but the product has some good ideas (like collecting content of multiple types into one repository, filtering binary types like Word docs and PDFs so that they too can get indexed, fielded XML searches within elements). I think an OSS equivalent to something like this would be a great option for those of us who like some ideas that come out of commercial development firms, but don't like the rancid business models, crazy prices, and poor support of such companies.
The only architecture that even comes close to providing the means to do this sort of thing is Zope [zope.org], but Zope doesn't currently have field-searchable XML indexing. When Zope gets it, given its strong persistent object system (not to mention its ease of use with RDBMS systems and clean multi-tier development), it would be an ideal architecture for building a library card catalog system on top of.
Please mod this up. (nt) (Score:1)
An open source audio database system (Score:2)
Am I the only one... (Score:1)
Anonymous Me (too lazy to log in)
Some links that might be useful (Score:1)
I did a little search on Google to look for the system my university [www.usp.br] uses in its library [www.usp.br], and found an interesting listing [ua.ac.be].
However, it seems to contain only commercial software (the one the guys here use, Aleph [aleph.co.il], is the first in the list), but you may find some interesting things if you browse the links (I didn't take the time for that).
--
Marcelo Vanzin
I happen to work in a library automation dept... (Score:2)
While most people would tend to think this is a fairly simple endeavor, it's not. Our consortium runs a library automation system on top of VMS on an Alpha cluster. It happens to be one of the best systems out there as far as high availability is concerned. However, it still doesn't take users' needs (ref. librarians and patrons) into account.
If something like this is going to work, you have to anticipate what your users are going to be asking for. A simple cataloging system could probably be done with MySQL or Postgres, but it will lack any of the functionality that is in existing library catalog products.
In addition to the MARC records, you would eventually need to maintain a patron database, an acquisitions database, a record maintenance interface, circulation records and policies, etc... AND it would all have to be easy to use for a non-technical person. Even the system we use is not "easy" for the average user, it's just reliable.
Personally, I would love to see an open source and/or free software catalog system that outshines systems like Dynix and DRA. Especially if it brings user interfaces into the year 2000. (There is plenty of talk about new client server interfaces, but nothing has come to fruition as of yet.)
I think the biggest challenge a project like this would face is that programmers are not librarians and vice-versa. They come from very separate worlds and have very little understanding about what the other discipline finds important in an automation system.
Peace,
D.B.
Re:Yes, there is: Koha. (Score:2)
GNU Task List (Score:1)
Actually, this project is in the GNU Task List [gnu.org]. A small group of volunteers began to tackle it about 2 years ago, but interest dwindled when the complexity of the task became apparent.
It appears the desired features (at least to the initial core developers) far overstep replacing "cards", as it must contemplate any format (paper, digital, film, whatever), allow for the tracking and acquisition of titles, etc.
The initial idea was a web application, using the browser as a "thin client" for the entire shebang - allowing for the oft-underfunded libraries to use their current workstations for the purpose.
As far as I know, the project went as far as a MARC schema, a Z39.50 client that connected and disconnected from the server, and little else (please prove me wrong).
Perhaps this is a chance to use the much vaunted "Slashdot effect" for a positive purpose? "Hackers wanted"
Re:Yes, there is: Koha. (Score:2)
As with RDBMSs, there are some arguments for going with commercial ODBMSs if you have very demanding requirements. The commercial ones also have lots of extra tools, many of them dealing with XML, which involves complex nested data structures and is also suited to ODBMSs.
This is inane... (Score:2)
when in doubt.. (Score:2)
You are a unique individual, just like everyone else.
-*-*-*-*-*-*-*-*-*-*-*-*-*-*
worthwhile? yes (Score:1)
I definitely see this as a worthwhile project. Many libraries of inner city schools would benefit from a free cataloguing system. I see XML being mentioned in the article post, but what exactly is being done with XML in this project arena? Personally, I can see XML as the perfect tool to fit into a card cataloguing system. An XML card catalog language specification should be constructed, and used for inter-library communication between school libraries, college libraries, and public libraries.
So yes, there seems to be some value in the project.
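A rough idea of what exchanging such a record could look like, sketched in Python with the standard xml.etree module. The element names (card, title, author, publisher, year, subject) are invented for illustration; no such specification exists as far as I know, which is rather the point.

```python
import xml.etree.ElementTree as ET

SINGLE_FIELDS = ("title", "author", "publisher", "year")   # hypothetical element names

def to_xml(card):
    """Serialize a catalog card (a dict) into a small XML document."""
    root = ET.Element("card")
    for name in SINGLE_FIELDS:
        if name in card:
            ET.SubElement(root, name).text = str(card[name])
    for subject in card.get("subjects", []):      # repeatable element
        ET.SubElement(root, "subject").text = subject
    return ET.tostring(root, encoding="unicode")

def from_xml(text):
    """Read the same format back into a dict, as another library might."""
    root = ET.fromstring(text)
    card = {child.tag: child.text for child in root if child.tag != "subject"}
    card["subjects"] = [s.text for s in root.findall("subject")]
    return card

card = {                                           # made-up sample data
    "title": "An example title",
    "author": "Example, Author",
    "year": "2000",
    "subjects": ["Libraries", "Cataloging"],
}
wire = to_xml(card)
print(wire)
```

One library calls to_xml, ships the text, and the other calls from_xml. The hard part is getting everyone to agree on the element names.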
-=MeMpHiStO=-
Re:This is inane... (Score:1)
It is really hard to improve on the card catalog, manually or electronically. So what you would end up doing is just translating the card catalog to an electronic format. Noble work. But boring.
Oh, by the by, all the schema has already been designed......
Just be careful... (Score:1)
Re:Gates Library grants (Score:1)
I work for a library that got a Gates foundation grant. They went to great pains to make sure we understood that they aren't M$.
In fact you get a choice: a computer package with lots of software and content for kids and training classes for the staff, or you can choose the cash option where they write you a check and you create your own solution.
There are certain guidelines on what you can do with the cash, of course. That keeps library IS types from just buying 3 or 4 Monster machines to play Quake on. bummer...
As for the automation system, good luck. They are a nightmare and it takes a brave soul to admin one...let alone write one.
-------------------------------
Modify existing software. (Score:2)
--
Re:This is inane... (Score:2)
I was going to make a case for writing an App in Java (OS of course) or maybe WATCOM Fortran
But NOOOOO you go in and give the Logical Answer.
Where is the fun in that.
Re:I was looking into this once... (Score:2)
library (Score:1)
Re:MS Access??? (Score:1)
Free databases (Score:2)
Check with your local university (Score:2)
Re:A program is useless without the data. (Score:2)
But maybe the subscription service is the deal you're talking about....
libraries lack technology (Score:1)
Due to this lack of a true existing system, it would seem fairly easy to design one from the ground up. Make it web based, use SQL and XML.
This would take no time at all to have a good multiuser system for unlimited use. Instead of buying some huge ass system, install the thing on a dual 750 with 512MB running Apache on Linux and let it go. That should handle almost any one library.
Would cost you nothing to host and nothing to maintain. (I'm over-simplifying.)
Fook
Re:Dynix AARRGGHH!!!!! (Score:1)
I feel your pain, AC.
I recently alerted my fellow epixtech [epixtech.com]/Dynix customers to the HP-UX ftpd vulnerability. Not only did epixtech not tell anyone about it until I sounded the alarm, they'd rather extort $200 per server to install a new *binary* than patch it proactively and get a reputation as a security-conscious vendor.
Index Data (Score:3)
- Zebra information server. Eats Marc (UsMarc, other local variants) as well as XML, mails, newsgroups, etc. You can add more input filters. Talks Z39.50
- Yaz Z39.50 toolkit for client and server side
- Zap web gateway and a PHP module for building easy search gateways to anything that understands Z39.50, for example our own Zebra
- and more. Even more to come later...
I am of course biased, but these tools are designed for library applications. All open source, at Index Data. [indexdata.dk]
Huh? (Score:1)
The wheel is turning but the hamster is dead.
This is where open source should shine! (Score:2)
Instead of state/local/federal governments spending money so that specific schools or libraries (etc.) can purchase proprietary software, why not just roll it all together, total up all the amount, and then spend maybe 50% of that on funding some open source development?
From there on, each school can spend a small amount of money paying someone to customize it to their needs, but we could have interoperable, open, inexpensive software where it's really needed.
---
Here's the OSDLS project (Score:2)
Open Source Digital Library System [arizona.edu]
Note: Those of you simply suggesting ordinary databases don't have a clue as to what is actually involved. Yes, you need a database. But that is only one of MANY pieces that make up an automated library system. Commercial software for this stuff can cost tens and tens of thousands of dollars.
An open source system would be welcome, indeed.
I know of one... (Score:1)
I know a computer consultant who created and GPLed one for a school who needed new software due to Y2K issues. It originally ran with XML, but due to the size of their database, and the need for quick development (no time to see if it could be made better), he was forced to convert it to an SQL database. It is a complete library circulation system, done in 631 lines of PHP3. Contact me if you would like more info -- I don't think he ever got around to releasing it/even giving it a name, but I am sure if I asked him to he would.
Is LOC data free? (Score:2)
I sent them an email about this a couple of weeks ago but did not receive an answer. Any insight would be appreciated.
Re:simpler and more complex than you'd think (Score:1)
I'd love it. (Score:1)
I'd also find an open source library cataloging system quite handy. I too have a huge collection of books at home.
As a general layout I can see each library having a central server that provides the DB storage and web server interface. Larger libraries would have multiple web interface servers around a central DB server. All the client computers then need is a web browser. This would greatly simplify installation at a library. In really small, poor libraries the computer at the checkin/checkout desk could serve dual purposes as the server and the librarian's computer. If there isn't enough memory on the beast one could forgo the GUI on it and use Lynx as the interface.
If one designed the web pages right, one could easily view them from either text or GUI based browsers. Personally I would specifically design the web interface to work smoothly on text mode browsers, as then cheap terminals (already in place in many libraries) could be used for access.
For security one can setup the library on a private 10 net or similar. An outside internet interface could be provided through an internet link using a NATed firewall of some sort.
For getting data into the system, beyond the standard entry screens, I was thinking of a clever retrieval system based on the Library of Congress number, ISBN or barcode number. When a library gets a book in, it enters the number or scans the barcode off the book. The system then queries some central server or other libraries to see if the information is already on file. If it is, the fields are populated from it. If not, the librarian can enter the data then and there. After entry it is stored locally and pushed up to the central server to speed others' data entry. The central repository could be primed with data from the Library of Congress and publishers.
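A rough sketch of that lookup flow in Python, for anyone who wants something to poke at. Everything here is made up for illustration: the BookRecord and Catalog names, and the plain dict that stands in for the central server (a real one would sit behind a network call).

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class BookRecord:
    isbn: str
    title: str
    author: str


class Catalog:
    """Local catalog that falls back to a shared central repository."""

    def __init__(self, central: Dict[str, BookRecord]):
        self.local: Dict[str, BookRecord] = {}
        self.central = central  # assumption: a dict standing in for a remote server

    def add_by_number(self, isbn: str,
                      manual_entry: Callable[[str], BookRecord]) -> BookRecord:
        # 1. Already catalogued here: nothing to do.
        if isbn in self.local:
            return self.local[isbn]
        # 2. Ask the central repository.
        record = self.central.get(isbn)
        if record is None:
            # 3. Nobody has it yet: the librarian types it in, and the
            #    result is pushed up so the next library finds it.
            record = manual_entry(isbn)
            self.central[isbn] = record
        # 4. Either way, keep a local copy.
        self.local[isbn] = record
        return record


if __name__ == "__main__":
    shared = {"0000000001": BookRecord("0000000001", "Example Title", "A. Author")}
    catalog = Catalog(shared)
    print(catalog.add_by_number("0000000001", manual_entry=lambda n: BookRecord(n, "?", "?")))
```

The manual_entry callback is only asked for data when neither the local catalog nor the central one knows the number, which is the whole point of the scheme.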
Searching of collections at other libraries could be added in the second phase of implementation. It could be done via direct connection from library to library (over the internet). A library would set up a list of preferred sources for interlibrary loan. These would be queried first, then less preferred sources would be queried next, and so forth until the book is found. This way, once a library is set up and has its books entered, it is also fully capable of being on the interlibrary loan system too. No need for central servers here. As a matter of fact, the initial information gathering system for a new book could also use this same querying system to gather its data. All that is needed is an internet link.
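The "preferred sources first" search could look something like the sketch below. It is only an illustration: find_lender is a made-up name, and the holdings dict stands in for actually querying each library over the network.

```python
from typing import Dict, List, Optional, Set


def find_lender(isbn: str,
                tiers: List[List[str]],
                holdings: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first library holding the book, trying preferred tiers first.

    tiers    -- library names grouped from most to least preferred
    holdings -- assumption: library name -> set of ISBNs, in place of a live query
    """
    for tier in tiers:
        for library in tier:
            if isbn in holdings.get(library, set()):
                return library
    return None  # nobody on the list has it


if __name__ == "__main__":
    holdings = {"Eastside": {"123"}, "County": {"123", "456"}, "State": {"789"}}
    tiers = [["Eastside"], ["County"], ["State"]]
    print(find_lender("456", tiers, holdings))
```

A library that does not answer at all is treated the same as one that does not hold the book, so a dead link just means the search moves on to the next source.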
Code for controlling printers to produce spine labels and possibly card catalog cards for new books.
Inventory bar codes. If each book is given a unique ID number then it is possible to handle the checkin/checkout via barcodes and scanners, greatly speeding up the process. The barcode label would be printed at the same time as the spine label. Customer ID numbers could be just another one of the unique ID numbers used for books. This way the same entry software would work for both.
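A minimal sketch of the shared ID idea, assuming a single counter hands out numbers to books and customers alike. The CirculationDesk class and its method names are invented for the example.

```python
import itertools
from typing import Dict


class CirculationDesk:
    """Books and patrons draw their barcode numbers from one shared sequence."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self.kinds: Dict[int, str] = {}    # barcode number -> "book" or "patron"
        self.on_loan: Dict[int, int] = {}  # book number -> patron number

    def issue_id(self, kind: str) -> int:
        new_id = next(self._next_id)
        self.kinds[new_id] = kind
        return new_id

    def scan_pair(self, first: int, second: int) -> None:
        """Check out from two scans, in either order: one patron, one book."""
        by_kind = {self.kinds[first]: first, self.kinds[second]: second}
        if set(by_kind) != {"book", "patron"}:
            raise ValueError("need exactly one book and one patron barcode")
        book, patron = by_kind["book"], by_kind["patron"]
        if book in self.on_loan:
            raise ValueError("book is already checked out")
        self.on_loan[book] = patron

    def check_in(self, book: int) -> None:
        # Checking in a book that is not on loan is harmless.
        self.on_loan.pop(book, None)
```

Because the number itself says whether it belongs to a book or a customer, the desk software never has to ask which barcode was scanned first.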
Overdue book lists by customer and book.
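The overdue lists fall out of the loan records almost for free. A small sketch, assuming each loan is stored as a (book ID, customer ID, due date) tuple; overdue_report is a hypothetical name.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Tuple

Loan = Tuple[int, int, date]  # (book_id, patron_id, due_date)


def overdue_report(loans: Iterable[Loan], today: date):
    """Return overdue items grouped by customer and indexed by book."""
    by_patron: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    by_book: Dict[int, Tuple[int, int]] = {}
    for book_id, patron_id, due in loans:
        if due < today:  # a book due today is not overdue yet
            days_late = (today - due).days
            by_patron[patron_id].append((book_id, days_late))
            by_book[book_id] = (patron_id, days_late)
    return dict(by_patron), by_book
```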
I think I've rattled on long enough.
the one uncatalogued item in the library... (Score:2)
for more on how even librarians (who might be expected to have more archival appreciation) are "throwing away our history as we generate it", check this excerpt from Nicholson Baker's [barnesandnoble.com] well-researched and insightful "Discards":
- And abruptly you realize, looking at these expressive dirt bands [caused by patron handling of cards], that even the libraries, like Harvard and the New York Public Library and Cornell, who microfilmed or digitized some of their cards prior to destroying them, have - by failing to capture any information at all about the relative reflectivity of the edge of each card - lost something of real interest, something eminently studiable. Who knows what a diligent researcher who photographed (from above, on a tripod) each close-packed drawer of Harvard's Widener catalog with a high-contrast camera might find out, were he to correlate his spectrographic dirt-band records with the authors that, as distinct clumps, exhibited some darkening? Of course the "Kinsey" cards would be thoroughly dirt-banded - but which others? This is, or was, a cumulative set of scholarly Nielsen ratings for topics at twentieth-century Harvard that is perhaps more representative than any other means of surveying we have. Instead of tossing its catalog out, Harvard ought to have persuaded a rich alumnus to endow a chair for dirt-banded studies.
it's an excellent read for anyone interested in the "books meet bytes" situation.
---
the problem with teens is they're looking for certainties.
Re:MARC tape issues - giving away your tax dollars (Score:1)
Also, BookCAT is a nice piece of shareware for personal libraries.
-Josh
OSDLS and Avanti (Score:2)
Re:simpler and more complex than you'd think (Score:1)
I'm thinking that what may be needed most in freeing library software are some new protocols we can demand vendors adhere to in RFPs. That would open up the wide world of library automation to alternate modules. If we can get free, or even 3rd party, software to interact reliably with our proprietary behemoths it'd be a huge win.
At that point, a 'market' might open up for development of free software to add functionality to proprietary systems, which in turn creates the possibility of (simple) free core systems which can take advantage of the new modules.
The obvious area for this kind of thing is in web (or other remote) access to library automation systems. Vendors are providing their own 'solutions' -- but with access to protocols, much more could and would be done. z39.50 will allow some of the things I'm thinking of, but only to a point. (And why's it always so slow?!)
You've worked on this a good deal -- do you see what I'm getting at? Any insights?
Re:Yes, there is: Koha. (Score:1)
And this is just one subfield. There are many subfields per field and hundreds of fields. The complexity is too great for most generic/off-the-shelf systems (Access, MySQL, etc) to handle.
Hope this helps,
John
Re:MySql couldn't do it right (Score:2)
Am I missing something?
Re:Another crucial thing: Digital asset Management (Score:1)
I'm actually building exactly such a library system using Zope right now, and it has indeed proven to be an excellent platform for tackling this problem. I'm not, however, interfacing with MARC or any other standard format; I'm just using a MySQL backend and a traditional 'data manager' and 'data object' object oriented interface to the DB. I considered using XML at first, but, like you, I was not convinced of the readiness of the XML tools available to handle the task.
I haven't yet broached the topic with my employer, for whom I am developing this system, but since I don't believe we have any plans to market this thing, I'm hoping I can release it under an open source license when I'm done.
Re:Yes, there is: Koha. (Score:1)
MARC Record
Marc Tag
Marc Subfield
Each table is linked to the table above via foreign keys.
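For the curious, here is one way those three tables might look in SQLite via Python's sqlite3 module. This is a guess at the layout, not the poster's actual schema: the table and column names are invented, and MARC indicators and the leader are left out to keep it short.

```python
import sqlite3

SCHEMA = """
CREATE TABLE marc_record (
    record_id INTEGER PRIMARY KEY
);
CREATE TABLE marc_tag (
    tag_id    INTEGER PRIMARY KEY,
    record_id INTEGER NOT NULL REFERENCES marc_record(record_id),
    tag       TEXT    NOT NULL,   -- e.g. '245' for the title statement
    seq       INTEGER NOT NULL    -- tags can repeat, so remember their order
);
CREATE TABLE marc_subfield (
    subfield_id INTEGER PRIMARY KEY,
    tag_id      INTEGER NOT NULL REFERENCES marc_tag(tag_id),
    code        TEXT    NOT NULL, -- e.g. 'a'
    value       TEXT    NOT NULL
);
"""


def store_record(conn, fields):
    """fields: list of (tag, [(subfield_code, value), ...]) in record order."""
    record_id = conn.execute("INSERT INTO marc_record DEFAULT VALUES").lastrowid
    for seq, (tag, subfields) in enumerate(fields):
        tag_id = conn.execute(
            "INSERT INTO marc_tag (record_id, tag, seq) VALUES (?, ?, ?)",
            (record_id, tag, seq),
        ).lastrowid
        conn.executemany(
            "INSERT INTO marc_subfield (tag_id, code, value) VALUES (?, ?, ?)",
            [(tag_id, code, value) for code, value in subfields],
        )
    return record_id


def subfield_values(conn, record_id, tag, code):
    """All values of one subfield across every repeat of a tag, in order."""
    rows = conn.execute(
        """SELECT s.value
             FROM marc_tag t JOIN marc_subfield s ON s.tag_id = t.tag_id
            WHERE t.record_id = ? AND t.tag = ? AND s.code = ?
            ORDER BY t.seq, s.subfield_id""",
        (record_id, tag, code),
    )
    return [value for (value,) in rows]
```

Storing is the easy half. Getting a title back already takes a two-table join, which gives a taste of why people say SQL and MARC are an awkward fit.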
It's the other problem you point out -- using that data with some *context* to get something meaningful. I don't know how Web OPACs do it, but for the digital library work I do, we end up producing very specific tables for books, serial articles, etc., with specific fields populated from parsing the MARC record. In our case, this is static, but a dynamic approach would be better; this would require the MARC data and its representation to be linked -- edit the MARC record, the representation changes too (with a rebuild of the derived tables).
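A toy version of the dynamic approach, where saving a MARC record always rebuilds its derived row. The class and function names are hypothetical, and only three well-known fields are pulled out (100 $a for the author, 245 $a for the title, 020 $a for the ISBN); a real system would need many more.

```python
from typing import Dict, List, Optional, Tuple

MarcField = Tuple[str, List[Tuple[str, str]]]  # (tag, [(subfield_code, value), ...])


def first_subfield(record: List[MarcField], tag: str, code: str) -> Optional[str]:
    """First value of the given subfield in the first matching tag that has it."""
    for field_tag, subfields in record:
        if field_tag == tag:
            for sub_code, value in subfields:
                if sub_code == code:
                    return value
    return None


def derive_book_row(record: List[MarcField]) -> Dict[str, Optional[str]]:
    return {
        "author": first_subfield(record, "100", "a"),
        "title": first_subfield(record, "245", "a"),
        "isbn": first_subfield(record, "020", "a"),
    }


class LinkedCatalog:
    """Keeps the raw MARC data and the derived 'books' table in step."""

    def __init__(self) -> None:
        self.marc: Dict[int, List[MarcField]] = {}
        self.books: Dict[int, Dict[str, Optional[str]]] = {}

    def save(self, record_id: int, record: List[MarcField]) -> None:
        self.marc[record_id] = record
        # Every edit goes through here, so the derived row cannot go stale.
        self.books[record_id] = derive_book_row(record)
```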
Simple? or not (Score:1)
For a home system, or a one- or two-branch library system where the card catalog doesn't need to deal with interlibrary loans, this isn't hard. The system gets complex when you get into things like "Branch A can borrow Books from branch B, but NOT if they are reference, and can't borrow CDs, "New" books, or videos, but branch B can borrow Books, New Books, and Videos, but not CDs (unless the CD is "Private")". (The rights management section of the database is fun: so that you don't have to recode your app, you just put entries into a table.)
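To make the "just put entries into a table" point concrete, here is a small sketch with the rules held as data. The tuple layout and the may_borrow name are assumptions, the "Private" CD exception is left out, and anything not listed is refused.

```python
from typing import Dict, Tuple

# (lending branch, borrowing branch, item type) -> allowed?
Rule = Tuple[str, str, str]

RULES: Dict[Rule, bool] = {
    # Branch A borrowing from branch B
    ("B", "A", "book"): True,
    ("B", "A", "reference"): False,
    ("B", "A", "cd"): False,
    ("B", "A", "new book"): False,
    ("B", "A", "video"): False,
    # Branch B borrowing from branch A
    ("A", "B", "book"): True,
    ("A", "B", "new book"): True,
    ("A", "B", "video"): True,
    ("A", "B", "cd"): False,
}


def may_borrow(rules: Dict[Rule, bool], lender: str, borrower: str, item_type: str) -> bool:
    """Look the request up in the rule table; unknown combinations are denied."""
    return rules.get((lender, borrower, item_type), False)
```

Changing policy then means editing rows, not code: letting branch A borrow videos is a one-entry change to the table.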
So, like I said, freeware? Maybe for a simple system. For something complex? I doubt it.
Re:Modify existing software. (Score:1)
Card catalogs? Yes, but with circulation? Nope. Card catalogs keep one card per TITLE, per branch/location, but the circulation system has to keep track of EACH COPY of the book. One tracks the "Intellectual Asset", the other the "Physical Asset". The rule base for dealing with "Physical Assets" can get fun, particularly when you get into multiple "Physical Asset" types, each with their own rules. It gets particularly fun when you get into non-printed assets, some of which may not have a physical asset, and are at the other end of a very thin pipe. I've spent the last 2.5 years working on a system like this (the whole project is about 11 man years in).
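The title/copy split can be sketched with two small record types. This is only an illustration of the distinction, not the system described above: the class names, fields and loan periods are all invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Title:
    """The intellectual asset: what a catalog card describes."""
    title_id: int
    name: str


@dataclass
class Copy:
    """The physical asset: one barcoded item on one branch's shelf."""
    barcode: str
    title_id: int
    branch: str
    asset_type: str
    on_loan: bool = False


# Hypothetical per-type rule: how many days each kind of item may go out.
LOAN_DAYS = {"book": 21, "video": 7, "cd": 7}


def available_copies(copies: List[Copy], title_id: int,
                     branch: Optional[str] = None) -> List[Copy]:
    """Copies of a title that are on the shelf, optionally at one branch."""
    return [
        c for c in copies
        if c.title_id == title_id
        and not c.on_loan
        and (branch is None or c.branch == branch)
    ]


def check_out(copy: Copy) -> int:
    """Mark one physical copy as lent and return its loan period in days."""
    if copy.on_loan:
        raise ValueError("copy is already on loan")
    copy.on_loan = True
    return LOAN_DAYS[copy.asset_type]
```

The catalog searches Title records, while circulation only ever touches Copy records, so a title with three copies can be both "in" and "out" at the same time.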
Ask Slashdot: (Score:2)
Re:Check with your local university (Score:1)
Not a bad idea... (Score:2)
I support the EFF [eff.org] - do you?
I was looking into this once... (Score:3)
The major problem I ran into with writing something of the sort is that there's lots of information that you really want to have that isn't on the web. Cataloging rules, the full description of the MARC fields, some of the lists (organization, I think, is one example). I could get some of those from a library, but strangely enough although I'm sure most libraries have them, they aren't necessarily on the stacks, but in people's offices. Even then, I'd have to keep them checked out for long enough that I'd rather buy a copy.
But, if anyone wants to work on it I'd be glad to help. My ideal app would have to
Hmm... (Score:2)
I would love to see such a system - I have a large collection of books myself that I would love to catalog (I also have a ton of other media that I would love to catalog as well). Such a system would be worth looking into, if it existed. I actually started to design one for my home use, but never got further than data layouts and screen layouts (mainly due to the fact that such an application is, to me at least, while practical, not very sexy or exciting to keep me motivated)...
I support the EFF [eff.org] - do you?
If the catalog is paper cards anyway... (Score:1)
Visit DC2600 [dc2600.com]
An anecdote for those suggesting local development (Score:2)
The Kansas City Public Libraries bought a commercial product from DRA years ago, but skimped on the professional support. They worked with it for years using only their own programmers.
The result was so unusable that the technique used by the Librarians to find a book was:
Now the catalog is so full of bad entries that even after years of "clean up" to work with the new, fully-supported DRA web-capable interface (see http://www.kclibrary.org) it's still a painful experience to try to find something. Multiple spellings for the same author are unlinked, some books under one, some under another. Some books in a series might be under one call number, while another book is somewhere else. (For instance, the second book of the Gormenghast trilogy is catalogued as the trilogy itself, and it's therefore impossible to retrieve volume 1 or 3.)
This is probably more a result of stingy city budgets than any inherent problem with the job, but it shows how badly it can be done.
Sorely needed (Score:2)
Meanwhile, a reliable Open Source system that's not too cryptic to implement could save libraries a lot of (your) money and might even help keep the most cash-strapped public libs from being shut down due to "budget constraints." Something to think about.
Re:Universities == The funding for this (Score:1)
The danger may be in mixing up the COBOL/Assembler/whatever spaghetti code of 1975 with today's development environments. The software world is striving mightily to give us binary re-use of code, so that putting together an application is more like configuring a new remote to work with a TV rather than getting a technician to pore over a set of schematics. Perl, Python, and other scripting environments have put very powerful programming libraries into the hands of more developers than ever before. The Enterprise Java Bean notion of Container Managed entities and tools like Castor are another step in this direction because they open the door to using XML to declare the needs of the application rather than coding each program statement step by step.
In other words, Open Source has advantages today that didn't exist when universities were at the height of building communal code. Universities should be at the head of this movement but there may be too many bad memories out there for this to happen.
Lynx (Score:2)
CmdrChalupa
(who cannot, for the life of him remember how to change his sig)
Easy to make (Score:1)
Re:This is inane... (Score:1)
MySQL+PHP should work like a charm for this.
MARC tape issues - giving away your tax dollars (Score:4)
Unfortunately, some years back the firm that records these records in the MARC formats legally got control not only of their formatted tapes, but of any use of the information used after extraction from these tapes. In other words, they not only own the format, but the government funded information contained in the format.
This is critical because these MARC tapes are the primary source of library cataloging information for most libraries. There are some other independent networks, primarily of educational institutions in the western US, but most libraries depend on the Library of Congress OCLC tapes.
The whole thing stinks, and is ridiculous. As a former librarian, who also holds a BSCS, I was outraged at this theft of public assets. The worst part was dealing with my moronic former colleagues who screamed that of course this company should own this information - it was "intellectual property." Thousands of librarians wrote letters supporting this company's "intellectual property rights" to work created at taxpayer expense.
This happened because most librarians think that putting information into a data format is some mystical arcana mastered only by brilliant wizards. They do not realize that the far more difficult part of the operation was the original cataloging done by the awesome catalogers at the Library of Congress.
So, libraries pay through the nose for software. First for the fee that the vendor has to pay for using the MARC tapes, then for the royalties for the actual use of the data contained on the tapes, and then for the library software itself. BTW, most library software is so atrocious, buggy, and difficult to use that its writers would receive a failing grade if it had been turned in as a senior project at any halfway reputable college.
Re:A program is useless without the data. (Score:2)
But maybe the subscription service is the deal you're talking about....
Or maybe my info is out of date.
Re:Modify existing software. (Score:2)
If you build an app that uses that, and can use Z39.50, it can automatically seed your entries from detailed catalogs already available from your local library.
Yes, there is: Koha. (Score:5)
Public libraries, unfortunately, are too often dependent on fiercely proprietary-minded vendors for their daily operations.
Incidentally, the "go get MySQL, you dumbass" posters are missing an important point: libraries use the MARC [loc.gov] data standard for catalog records, and SQL doesn't cope well with the kind of tricks MARC can do.
Re:worthwhile? yes (Score:2)
Lots of possibilities (Score:2)
From what I have seen there aren't a lot of amazing library systems anywhere, OS or proprietary.
Since I don't know where you looked for a system I might be completely off base but this seems like a project a group of CS grads would have picked up somewhere along the line. Check with some universities, somebody has to have replaced those ancient mainframe systems somewhere. If not there are a few very good OS databases (seems like more pop up every day). If you are interested in building a library system they would be an excellent place to begin.