Comment Not a computer problem (Score 3, Interesting) 42
Searching and indexing information isn't a computer problem. We can already find information in massive databases--MongoDB and PostgreSQL handle that well.
It's tagging information that's difficult. Contextual full-text searches often fail to find relevant context. Google does an okay job until you're looking for something specific. Searches for general topics like melting arctic ice sheets or the spread of Ebola find something relevant; but try finding the particular documents behind the timeline Wikipedia gave for Thomas Duncan's infection, and behind each of the things the nurse said. You'll find all kinds of shit repeated in the media, but not where it originated. Some of the things in that timeline are notoriously hard to find at all.
I've thought about how to structure a Project Management Information System for searching and retrieving important data: work performance information, lessons learned, and the projects related to a topic themselves. This steps beyond multi-criteria search to multi-dimensional search: I want to find all Lessons Learned about building bridges; I want to find all Programming projects which implemented MongoDB and pull all Work Performance Information and Lessons Learned about Schema Development; etc. I need to know about specific things, but only in specific contexts.
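To make the "specific things, but only in specific contexts" idea concrete, here's a rough Python sketch. Everything in it (the Project and Record classes, the search function, the sample data) is made up for illustration; it isn't any real PMIS schema, just one way the dimensions could be kept separate.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    kind: str          # hypothetical record type, e.g. "lessons_learned"
    topics: frozenset  # what the record is about
    text: str


@dataclass
class Project:
    name: str
    tags: frozenset    # explicit keywords on the project itself
    records: list = field(default_factory=list)


def search(projects, project_tags, kinds, topics):
    """Return (project name, record) pairs that match on every dimension.

    project_tags: tags the project must carry (the context)
    kinds:        acceptable record types
    topics:       topics the record must cover (the specific thing)
    """
    project_tags, topics = set(project_tags), set(topics)
    hits = []
    for project in projects:
        if not project_tags <= project.tags:
            continue  # wrong context, skip the whole project
        for record in project.records:
            if record.kind in kinds and topics <= record.topics:
                hits.append((project.name, record))
    return hits


# Illustrative data only.
projects = [
    Project("Order Tracker", frozenset({"programming", "mongodb"}), [
        Record("lessons_learned", frozenset({"schema development"}),
               "Model documents around the queries you run."),
        Record("work_performance", frozenset({"testing"}),
               "Test pass rate by sprint."),
    ]),
    Project("River Crossing", frozenset({"construction", "bridges"}), [
        Record("lessons_learned", frozenset({"permitting"}),
               "Start permits early."),
    ]),
]

hits = search(
    projects,
    project_tags={"programming", "mongodb"},
    kinds={"lessons_learned", "work_performance"},
    topics={"schema development"},
)
for name, record in hits:
    print(name, record.kind, record.text)
```

The point of the sketch is that the project-level filter and the record-level filter are separate axes; a flat keyword search would mix them together.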
For this to work well, people need to tag and describe the project properly. The Project Overview must carry ample wording for full-text search; but should also be tagged for explicit keywords, such that I can eschew full-text search and say "find these keywords". It would help if project managers marked projects as similar to other projects, and tagged those similarities (why is it similar?). A human can highlight what particular attributes are strongly relevant, rather than allowing the computer to notice what's related.
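The "why is it similar?" part could be stored as reasons attached to each similarity link. A minimal sketch, assuming a hypothetical SimilarityIndex class and invented project names:

```python
from collections import defaultdict


class SimilarityIndex:
    """Human-declared similarity links between projects, with reasons."""

    def __init__(self):
        # project -> other project -> set of reasons a human gave
        self._links = defaultdict(dict)

    def mark_similar(self, a, b, reasons):
        # Similarity is stored in both directions.
        for x, y in ((a, b), (b, a)):
            self._links[x].setdefault(y, set()).update(reasons)

    def similar_to(self, project, reason=None):
        """Projects linked to `project`, optionally only for one reason."""
        links = self._links.get(project, {})
        return sorted(
            other for other, reasons in links.items()
            if reason is None or reason in reasons
        )


# Illustrative data only.
index = SimilarityIndex()
index.mark_similar("Harbor Bridge", "River Footbridge", {"steel truss"})
index.mark_similar("Harbor Bridge", "Port Terminal", {"marine permitting"})

print(index.similar_to("Harbor Bridge"))
print(index.similar_to("Harbor Bridge", reason="steel truss"))
```

Keeping the reason on the link lets a search follow only the similarities that matter for the question at hand.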
With so much information, searching requires this human action to improve the results. It may also be enhanced by individualized human action: which humans produce which tags and relationships? Which humans do you feel provide useful tagging and relationships? What particular relationships do *you* find important? What relationships do you want to add yourself? This will allow an individual human to tailor the search to his own experiences and needs.
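One way to picture the individualized part: each tag remembers who applied it, and each searcher keeps his own trust weights for taggers. This is only a sketch; the rank function, the default weight, and the names are assumptions, not a claim about how any real system scores results.

```python
from collections import defaultdict


def rank(taggings, query_tags, trust, default_trust=0.1):
    """Order projects by how much the searcher trusts whoever tagged them.

    taggings:   iterable of (project, tag, tagger) triples
    query_tags: tags the searcher is asking for
    trust:      the searcher's personal weight for each tagger
    """
    scores = defaultdict(float)
    for project, tag, tagger in taggings:
        if tag in query_tags:
            scores[project] += trust.get(tagger, default_trust)
    # Highest score first; ties broken by name so the order is stable.
    return sorted(scores, key=lambda p: (-scores[p], p))


# Illustrative data only.
taggings = [
    ("Bridge A", "bridges", "alice"),
    ("Bridge B", "bridges", "bob"),
    ("Website", "mongodb", "alice"),
]

print(rank(taggings, {"bridges"}, trust={"alice": 1.0, "bob": 0.2}))
print(rank(taggings, {"bridges"}, trust={"alice": 0.2, "bob": 1.0}))
```

Same tags, same query, different ordering per searcher, because the trust table belongs to the person doing the search.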
On top of that, such things require memory: a human must remember certain things to know what to search for. I remember working on a project where...
Computer searching is a crude form of human memory: human memory is associative, and computer searching is keyword-driven. Humans need to use their own memories, to tell the computer how they see things, and then to tell the computer how they think about what they want to know--what it's related to, what it's similar to, who they think knows best about it--and have the computer use all that information to retrieve a data set. To do that, humans must manually remember, both in the computer and in their brains.
The holy grail of searching is a strong AI that takes an abstract question, considers what you mean by its experience with you and its database of every other experience, pulls up everything relevant, decides what you would want to see, and discards the rest. Such a machine is largely doing your job: it's thinking for you, deciding what you'll remember, and making your decisions by occluding information which would affect your decisions. Anything less is a tool, and faulty, and requires your expertise to leverage properly.