Comment Don't bother with hierarchies (Score 5, Interesting) 235

Instead of trying to organize your data into a directory structure, use tagging. There's a lot of theory on this -- originally from library science, and more recently from user interface studies. The basic idea is that you often want your data to be in more than one category. In the old days, you couldn't do this, because in a library a book had to be on one and only one shelf. In the digital world you can put a book on more than one "shelf" by assigning multiple tags to it.

Then, to find what you want, get a search engine that supports faceted navigation.

Four "facets" of ten nodes each have the same discriminatory power as a single hierarchy of 10,000 nodes, because picking one value from each facet gives 10 x 10 x 10 x 10 = 10,000 combinations. It's simpler, cleaner, faster, and you don't have to reorganize anything. Just be careful about how you select the facets/tags. Use some kind of controlled vocabulary, which you may already have.
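To make the arithmetic and the multi-shelf idea concrete, here is a minimal sketch in Python. It is not how any particular search engine implements faceted navigation; the facet names, the sample documents, and the helper functions (count_selections, narrow, facet_counts) are all made up for illustration.

```python
from collections import Counter

# Hypothetical controlled vocabulary: four facets with ten values each.
FACETS = {
    name: [f"{name}-{i}" for i in range(10)]
    for name in ("subject", "format", "audience", "decade")
}

def count_selections(facets):
    """Number of distinct ways to pick one value from every facet."""
    total = 1
    for values in facets.values():
        total *= len(values)
    return total

# Each document holds a set of tags per facet, so it can sit on
# more than one "shelf" at the same time. Sample data is invented.
DOCS = [
    {"id": 1, "tags": {"subject": {"subject-1", "subject-2"}, "format": {"format-0"}}},
    {"id": 2, "tags": {"subject": {"subject-2"}, "format": {"format-3"}}},
    {"id": 3, "tags": {"subject": {"subject-5"}, "format": {"format-0"}}},
]

def narrow(docs, **selected):
    """Keep only documents tagged with every selected facet value."""
    return [
        d for d in docs
        if all(value in d["tags"].get(facet, set())
               for facet, value in selected.items())
    ]

def facet_counts(docs, facet):
    """Count how many documents carry each value of one facet."""
    counts = Counter()
    for d in docs:
        counts.update(d["tags"].get(facet, set()))
    return counts
```

Calling count_selections(FACETS) multiplies the four facet sizes together, and narrow(DOCS, subject="subject-2", format="format-0") drills down the way a faceted UI would, one selection at a time. A real engine would do this against an index rather than a Python list.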

There are a bunch of companies that sell such search engines, including Dieselpoint, Endeca, and Fast.

Comment Impossible? Not true (Score 1) 266

It's very possible to do this.

The trick is that search engines deal with symbols, not necessarily words or characters. If you change the words and characters to different symbols then you're set. Imagine a dictionary that associates each word with a number. You keep the dictionary and don't give it to the vendor. You just give the vendor the numbers, and send your queries as numbers. It works.

This particular scheme wouldn't be very secure, but it is easy to imagine better ones.
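The word-to-number idea above can be sketched in a few lines of Python. This is a toy, assuming a simple lowercase word tokenizer and a plain inverted index standing in for the vendor's engine; TokenCodebook, build_index, and search are hypothetical names, not any product's API. As noted, a straight substitution like this is not secure (word frequencies leak), so treat it as an illustration of the mechanism only.

```python
import re

class TokenCodebook:
    """Private word -> number dictionary that stays with the data owner."""

    def __init__(self):
        self._ids = {}

    def _words(self, text):
        # Assumed tokenizer: lowercase runs of word characters.
        return re.findall(r"\w+", text.lower())

    def encode(self, text):
        """Encode a document, assigning new numbers to unseen words."""
        return [self._ids.setdefault(w, len(self._ids)) for w in self._words(text)]

    def encode_query(self, text):
        """Encode a query without growing the dictionary; unknown words map to -1."""
        return [self._ids.get(w, -1) for w in self._words(text)]

# --- what the vendor side sees: only numbers ---

def build_index(encoded_docs):
    """Build an inverted index from token number to the set of document ids."""
    index = {}
    for doc_id, tokens in encoded_docs.items():
        for token in tokens:
            index.setdefault(token, set()).add(doc_id)
    return index

def search(index, query_tokens):
    """Return ids of documents containing every query token (AND search)."""
    postings = [index.get(token, set()) for token in query_tokens]
    return set.intersection(*postings) if postings else set()
```

Usage would look like: encode each document with one TokenCodebook, hand the resulting number lists to build_index, then run search(index, book.encode_query("revenue report")). The engine matches numbers to numbers and never sees a word.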

Here's what you need: a search engine that allows you to modify documents as they go into the index, and also allows you to specify custom tokenizers, morphological analyzers, and whatnot.

The search engine I developed does this. http://dieselpoint.com/

Comment Chaos in the Enterprise Search market (Score 3, Interesting) 256

This acquisition is going to mean some chaos in my industry. Full disclosure: My company, Dieselpoint, is a Fast competitor.

The enterprise search market is an industry unto itself, entirely different from web search. In this industry we sell search software for data inside a company, as opposed to general web search. In some ways, it's a much harder technical problem to solve than web search, because we deal with a much wider variety of data, security schemes, navigation rules, platforms, programming environments, etc. Total industry size is between $1 and $2 billion, depending on how you count.

Enterprise search is interesting to larger firms like Microsoft because it touches everything in the enterprise. Everybody wants easy-to-use search for everything -- the intranet, the email archive, the content management system, the ERP system, the HR system, the CRM system, the works. It's a hard thing to do well, and the company that does it is difficult to dislodge. Being the company's internal search engine is a good strategic position to be in.

The industry is currently very fragmented, and no one has the upper hand. Fast was probably the most dominant competitor, though not the largest one. The largest one is Autonomy, but that has morphed more into a portfolio company with a lot of legacy products than a company focused on search. Fast was really the up-and-comer, and despite the financial difficulties, the one we had the hardest time selling against. Everyone else is secondary.

The acquisition means some chaos in this industry, for one major reason: Fast is no longer a viable cross-platform solution, and won't be considered for many corporate deals. There's going to be a scramble to take over the mantle.

Cross-platform capability is critical for corporate deals because, again, everybody wants to search everything. It's tough to do that if you only run on a Microsoft operating system. And while I'm sure Fast will continue to claim they'll support all platforms, who will believe them? This is Microsoft, after all. Non-Microsoft operating systems, Java, and the rest of the non-Microsoft-controlled technology will receive only short shrift in the future.

So this is really big news for our little industry.

Chris
