This seems like a good time to plug my own open source project:
pHash.
pHash is a perceptual hashing library that computes hashes for audio, video and image files, with text and PDF hashing coming soon.
We use an algorithm similar to YouTube's audio fingerprinting method, but ours does not look only at the first 30 seconds. That said, it's impossible to tell from this basic test whether their algorithm really examines only the first 30 seconds, or whether it simply considers the two to be different audio files. If a song is only 1 minute long and 30 seconds of it is blank, is that really the same audio file as the full 1-minute version? At some point the audio files are not really the same anymore, although their perceptual hashes should still be somewhat close to each other.
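To make "somewhat close" a bit more concrete, here is a rough Python sketch of one common way to compare perceptual hashes: count the differing bits (Hamming distance) and normalize to a bit error rate. This is not pHash's actual API. The function names, the 32-bit frame width, and the hash values are all made up for illustration.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two integer hashes differ."""
    return bin(a ^ b).count("1")


def bit_error_rate(frames_a, frames_b, bits_per_frame=32):
    """Fraction of differing bits across two equal-length lists of frame hashes.

    0.0 means identical; unrelated audio would be expected to land near 0.5.
    """
    if not frames_a or len(frames_a) != len(frames_b):
        raise ValueError("need two non-empty hash lists of equal length")
    differing = sum(hamming_distance(x, y) for x, y in zip(frames_a, frames_b))
    return differing / (bits_per_frame * len(frames_a))


# Hypothetical per-frame hashes: the clip matches the full track for the
# first half, then goes blank (all-zero hashes stand in for silence here).
full_track = [0xDEADBEEF, 0x12345678, 0x0F0F0F0F, 0xCAFEBABE]
half_blank = [0xDEADBEEF, 0x12345678, 0x00000000, 0x00000000]

print(bit_error_rate(full_track, full_track))
print(bit_error_rate(full_track, half_blank))
```

With a measure like this, whether the two files count as "the same" comes down to where you put the threshold, which is exactly the judgment call described above.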
Please give pHash a try. We could use some feedback from the OSS community and would appreciate it greatly.