Advice for Building a Multi-Platform Lyrics Database?
AntonOnymous,Cowherd asks: "I am in the process of designing an application for general public use. The application will allow end users to search and display a large collection of songs (both lyrics and tunes) with annotations, all in text format. The intent is for this application to run cross-platform (Linux, Windows, Mac, and whatever else), so I want to avoid platform-specific binaries as much as possible. I also believe that the program should be Open Source. The end users will not necessarily be computer experts, so I want to avoid as much additional setup on their computers as possible. The application (data and program) will all be stored on a CD or DVD, and it should be able to be run locally. The most important part of this application is the data, not the program, so the guts of it should be fairly simple with a decent user interface. Does anyone have any suggestions as to general approach to setting this up, or have any pointers to existing open source programs which already perform a similar function?"
"One way to implement this would be to set up each song (with lyrics, tune, and annotations) as a single record in a database. I would like to avoid the inherent security issues and overhead of setting up and running a database on a user's computer.
Another possibility, which is fairly appealing, is to use a Web Browser to provide the user interface, and to use Open Source text indexing/searching programs (such as Lucene or Egothor) as the engine. It is probably safe to assume that most users have a Browser. However, most users probably would not have a web-server (even a local one) on their computer, and going by the principle of as little messing around with the user's computer as possible, I would like to avoid having to set one up, even a local one."
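For what it's worth, the indexing half of this idea can be sketched without any server. The following is a minimal inverted index in plain Python, not Lucene or Egothor; the names `tokenize`, `build_index`, and `search`, and the sample song texts, are all made up for illustration. The assumption is that an index like this would be built once, ahead of time, and shipped on the disc next to the song files.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(songs):
    """Map each word to the set of song ids whose text contains it."""
    index = defaultdict(set)
    for song_id, text in songs.items():
        for word in tokenize(text):
            index[word].add(song_id)
    return index


def search(index, query):
    """Return the ids of songs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical placeholder texts standing in for real song files.
songs = {
    "song-001": "Down by the river the willow grows",
    "song-002": "The river runs to the sea",
    "song-003": "A sailor went to sea",
}

index = build_index(songs)
river_hits = search(index, "river")
river_sea_hits = search(index, "River SEA")
```

Because the data on a CD never changes, the index could be serialized at build time and only loaded at run time, so the end user would not have to wait for any indexing.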
Re:Internationalization / UTF-8 (Score:3, Informative)
Well, "script" doesn't really make sense in the context of your original post, but I'll take you at your word that you don't see the appeal of mixing scripts on one page.
To start, I'll direct you to the Japanese codepage 932, which includes at least four scripts: the basic Latin alphabet, katakana, hiragana, and kanji. People seem to have thought it was necessary to be able to use all of those on one page, perhaps because Japanese tends to mix three of them together on a regular basis and likes to throw in E
Re:Internationalization / UTF-8 (Score:2)
Besides, there are about 6 Russian codepages: Win1251, KOI8-R, CP866, ISO, MacCyr, GOST-Cyr. What codepage are you going to use?
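To make the codepage problem concrete, here is a small sketch using five Cyrillic codecs that ship with Python's standard library. The sample word is arbitrary, and mapping "ISO" to `iso8859_5` and "MacCyr" to `mac_cyrillic` is my assumption about which encodings are meant. GOST-Cyr is left out because I don't know of a standard Python codec for it.

```python
# One Russian word ("hello"), encoded under several legacy code pages.
word = "Привет"

codecs_to_try = ["cp1251", "koi8_r", "cp866", "iso8859_5", "mac_cyrillic"]

# Same text, one byte string per code page.
encoded = {name: word.encode(name) for name in codecs_to_try}

# Every code page produces a different byte sequence for the same word.
distinct_byte_strings = set(encoded.values())

# Decoding with the wrong code page raises no error; it just yields other letters.
misread = encoded["koi8_r"].decode("cp1251")

# UTF-8 needs no per-language choice and round-trips the text unchanged.
utf8_round_trip = word.encode("utf-8").decode("utf-8")

for name, raw in encoded.items():
    print(f"{name:13} {raw.hex(' ')}")
print("KOI8-R bytes read as Win1251:", misread)
```

The point is that the bytes alone don't say which code page they came from, so a song file stored in a legacy encoding needs that label carried alongside it somewhere.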
Re:Internationalization (Score:2)
The inventors of Java had the right idea: store all your characters using Unicode, and translate them to the local
Re:Internationalization (Score:2)
In fact, Java --- and Windows --- got it so catastrophically wrong (using 16-bit values for characters, instead of 32-bit values) that it was found easier to change the Unicode specification to prohibit most characters that wouldn't fit in a 16-bit value!
There is a standard in place for encoding such things using 16-bit values; it's UTF-16, and given that it's a variable-length-character encoding like UTF-8, it rather defeats t
Re:Internationalization (Score:2)
Please. Both Java and Windows simply implemented Unicode. The decision to try to do every character set on the planet in 16 bits was the Unicode committee's decision, not Sun's or Microsoft's. And it's a mistake they've been able to work around.
Re:Internationalization (Score:2)
Because when you're parsing text, you're usually only interested in a few special characters --- control codes, spaces, etc. These are all in the ASCII range. This means that all the UTF-8 extended characters will just pass straight through, unchanged, correctly. You don't need to worry about them. Because UTF-8 co
Re:Internationalization (Score:2)
It's absurd to call that kind of garbling "degrading gracefully" just because it's sort of readable. And the UTF-16 version will be perfectly readable, if the sender remembers to add the correct chara
The difference between i18n and L10n (Score:1)
Nothing is internationalized by default. There is no magic that converts a program's English strings into their Traditional Chinese translations.
"Internationalized" means capable of working with user data in multiple languages and does not imply ability to translate user data from one language to another. "Localized" means that the interface is available in more than one language.
web service (Score:2, Interesting)
Re:web service (Score:1)
Wiki on a stick (Score:4, Interesting)
I have a little more write-up in my Journal, along with links.
Re:Get a good lawyer. (Score:1)
They have domain over the actual song, but not the written lyrics of the songs.
Those are dealt with by ASCAP/BMI.
Just as the famed PearLyric app for Mac OS X has found out....
If you don't have the funds to pay to ASCAP/BMI (the good guys), then you're pretty much screwed.
Copyrights (Score:5, Insightful)
Re:Copyrights (Score:1)
The application (data and program) will all be stored on a CD or DVD, and it should be able to be run locally.
Re:Copyrights (Score:2)
Re:Copyrights (Score:1)
Re:Copyrights (Score:1)
Put your secret plan on
Re:Copyrights (Score:3, Insightful)
The story doesn't tell us anything about which songs he's going to be including. For all we know, it could be a collection of folk songs or hymns that are already in the public domain.
Advic
Good story about a Lyrics Server and the Lawyers (Score:2, Interesting)
So, a friend of mine wrote one of the first online lyrics servers.
Here's his story. [stereopsis.com]
Re:Good story about a Lyrics Server and the Lawyer (Score:2)
Back in 1995, I put together a website that cross-referenced the lyrics to Les Misérables in English, French and German (all typed in by hand from the CD liner notes). At first it was hosted on webspace at AOL, but I later moved it to some space I had at college. From 1996-2000 I added songs in more and more languages, each time carefully cross-referencing and linking so that you could jump from each song
The NMPA, NOT the RIAA (Score:1)
If you want to start such a database, my advice would be to lay the groundwork for it, software-wise. Then, contact the publishers to get lyric reprint licenses. It would be nice to say that the publishers would be happy to provide you with such licenses, but chances are they would be difficult to obtain, since you are not act
Re:Copyrights (Score:1)
Heh... (Score:4, Interesting)
Lots of websites already do this, so why bog yourself down with something that has already been done? Unless it's for some kind of research project for university/college, of course.
Open source solutions which do the same? Amarok has a "lyrics" tab which brings up the lyrics to the playing song. I think they are pulled from Wikipedia, but I'm not sure.
Also, MusicBrainz has a huge database of music too, which is why they are seemingly linked in Amarok.
So basically you're not onto a winner with this unless you're going to offer something all the hundreds of others fail to offer.
Amarok, Wikipedia and MusicBrainz are all open source.
I'm not sure, however, how all of these cope with non-English alphabets, which is something lots of people tend to bring up.
Re:Heh... (Score:1)
hypocrisy in action (Score:2)
Re:hypocrisy in action (Score:1)
Not gonna happen (Score:2)
He didn't say anything about a service (Score:1)
However, http://www.animelyrics.com/ [animelyrics.com] is one database of lyrics that isn't getting sued, like most things anime.
Re:He didn't say anything about a service (Score:3, Informative)
Even if the guy would have won in court, there's likely no way he could have afforded the legal costs, unfortunately, and his programming time was wasted
Re:He didn't say anything about a service (Score:1)
Re:He didn't say anything about a service (Score:1)
It's the data format and APIs (Score:2)
You think setting up a local database is a security risk, but setting up a local web server isn't? Why? You are aware that databases don't have to be servers listening on public ports, don't you? You could use something like SQLite [sqlite.org].
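A rough sketch of what that could look like, using the `sqlite3` module from Python's standard library. The table layout, the function names `build_song_db` and `search_songs`, and the sample rows are all invented for illustration; nothing here listens on a port, because the database is just a file (or, in this sketch, memory) opened by the program itself.

```python
import sqlite3

# Hypothetical placeholder records: (title, lyrics, annotations).
SAMPLE_SONGS = [
    ("Example Ballad", "placeholder words about a river", "collected 1900"),
    ("Example Hymn", "placeholder words about a morning", None),
    ("Example Shanty", "placeholder words about the sea", "work song"),
]


def build_song_db(path=":memory:"):
    """Create the songs table and load the sample rows."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE songs ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " lyrics TEXT NOT NULL,"
        " annotations TEXT)"
    )
    conn.executemany(
        "INSERT INTO songs (title, lyrics, annotations) VALUES (?, ?, ?)",
        SAMPLE_SONGS,
    )
    conn.commit()
    return conn


def search_songs(conn, term):
    """Return titles of songs whose title or lyrics contain the term."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT title FROM songs"
        " WHERE title LIKE ? OR lyrics LIKE ?"
        " ORDER BY title",
        (pattern, pattern),
    )
    return [row[0] for row in rows]


conn = build_song_db()
river_titles = search_songs(conn, "river")
example_titles = search_songs(conn, "example")
```

For a database shipped on a CD, the file could presumably be opened read-only with something like `sqlite3.connect("file:songs.db?mode=ro", uri=True)`, where `songs.db` is a made-up file name.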
The important thing is not the implementation itself. It's the data format and/or API. Make the data available, and plenty of people will be willing to write web interfaces, Qt interfaces, GTK interfaces, etc. Expose the API as plain C, and make the data easily importabl
Re:It's the data format and APIs (Score:1)
Start with the MusicBrainz code (Score:3, Informative)
The Mozilla Platform (Score:3, Interesting)
Copyright issues aside (I'm assuming that you're talking about lyrics that you have the legal right to use) I'd say that there's a pretty simple answer to your problem. You're thinking through the pros and cons of using a back-end database versus a browser front-end, and you're not keen on running any flavor of server.
You can get both the database and browser advantages without having to set up a separate server by building your app on the Mozilla platform. You can utilize its built-in RDF capabilities to store your data in a clean, extensible way, and fairly quickly put together a user interface using XUL and CSS that can work with Firefox, Seamonkey, Flock, etc., or even just the XUL app runner for a more stand-alone user experience.
Because all of your data (and even interfaces) will be XML-compliant, you'll even be making it easier for third party apps to work with your stuff.
Re:The Mozilla Platform (Score:1)
Re:The Mozilla Platform (Score:1)
Re:The Mozilla Platform (Score:2)
You're welcome.
If you know your users are going to be using a bunch of different browsers, it'd probably make sense to build your system around XULRunner. That way it'd be pretty much like a stand-alone app, but you (as the developer) would still get the advantages of having a built-in system to handle HTML, XML, CSS, RDF, etc. and the user would be none the wiser (although it'd be pretty almost
Congratulations on having an idea... (Score:2, Insightful)
Now you've reached the point of actually needing a clue to accomplish it.
Just pay someone, you obviously don't have a clue.
Re:Congratulations on having an idea... (Score:2)
What's with the "you obviously don't have a clue" part? He obviously DOES. He knows what he's doing. He has some ideas of how to do it. He is just asking for guidance from people who may know more than him. It's called learning. You think people are just born knowing how to write a complex application like this? They have to learn about it.
Those people he should pay, how did they get their clue?
If there is some obvious reason why he shouldn't continue, why
Troll my ass... (Score:1)
I disagree. Ask Slashdot used to be specific questions about a technology or how to go about something. Lately, however, it's been one question after another that goes:
"I'm working on this project that will be able to do X? How do I do it?"
There's a big difference between learning how to do something and asking somebody else to figure it out for you.
Re:Troll my ass... (Score:1)
Maybe I'm missing the point... (Score:2)
Java (Score:2)
As for the database handling, since this will be static (if you want it to run off a CD, it's static), here is what I can think of. You can embed an SQL server (I know there is one, but I can't remember the name) and do it that way. I don't know i
Re:Java (Score:2)
Yes, saving a few milliseconds of CPU time, once, is important.
Seriously, if Java has any major benefits, that you don't have to compile it isn't one of them. Or at least it wouldn't be one if all computers came with a C compiler.
Also, Java isn't the only programming language with this property. Python, Ruby, Perl, Tcl ... the only popular languages which normally need a compiler are C and C++.
nice idea... (Score:1)
Re:nice idea... (Score:1)
It seems to be a commonly expected thing that you know the words to various songs. It's extraordinarily hard to do that by ear with a hearing loss.
get yourself a lawyer... (Score:1)
Legal Fees. (Score:1)
Don't get sued. (Score:2)
Try not to get sued.
Data, not program logic (Score:2)
You mention open source. What about the lyrics themselves? If you are the single provider of that CD or DVD, I don't care if the programs are open source or not. All I care about is that the data is in an open format so I can code against it myself. Closed-format content is useless to me.
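As a sketch of what "an open format I can code against" might mean in practice, here is one song stored as plain JSON and read back with nothing but Python's standard library. The field names, the sample values, and the use of ABC notation for the tune are all my own assumptions, not anything the original poster specified.

```python
import json

# Hypothetical record layout: every field name here is invented for illustration.
song_record = {
    "id": "song-001",
    "title": "Example Ballad",
    "language": "en",
    "lyrics": ["placeholder first line", "placeholder second line"],
    "tune_abc": "X:1\nT:Example Ballad\nK:C\nCDEF|GABc|",
    "annotations": [
        {"line": 1, "note": "placeholder annotation"},
    ],
}

# Write the record as UTF-8 text that any language or text editor can read.
serialized = json.dumps(song_record, ensure_ascii=False, indent=2)

# A third-party program needs only a JSON parser to get the data back.
restored = json.loads(serialized)

print(restored["title"], "has", len(restored["lyrics"]), "lines")
```

One file per song in a format like this would let other people write their own front ends without touching, or even running, the program that ships on the disc.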