Future Information System

Journal by drowstar on Saturday February 28, 2004 @01:05PM

I know, nobody cares :-)
And I know it's probably either bullshit or somebody else has already thought of it, but I figured I would still put it on here just in case somebody is like "hey, I want ..." (no, I don't know what you would want to do with it).
Here we go:
Right now, information is stored in files, which are "groups" of information that only sometimes make sense. XML is a first way of making information accessible without the limitations that file types impose. Still, even XML has downsides. There is no way of relating information, except by (unanticipated) grouping by names or in hierarchical structures.
I would like computers to learn what information is related and present it accordingly. As an example: I write this document and find some related web pages. I associate these web pages with this document. Additionally, the computer finds other documents relating to this content on my computer and the internet. While I edit the document, it changes significantly: I can now click on several phrases to be presented with more information on the topic. There are sketches in the document, which display how this information is related. I can rate / moderate / meta-moderate links and the document is changed accordingly. From the moment I give a single keyword, editing the document is a matter of relating information and judging the computer's decisions.
Of course, this requires a far more liberal idea of information. Today, every file is a single entity with a certain amount of information stored in it. There are borders which cannot be crossed. The human operator needs to determine where and what to find, he needs to select and copy the information, he needs to put it into the document and so forth.
This way of working would be ideal for people working exclusively with information. New ideas as well as evidence and proof of existing information can easily be acquired and used in documents. I hesitate to use the word "file", because that idea would no longer be practical once this abstract concept were implemented. The closest thing to files would be meta-data, from which everything else is interpreted, and "pure information", which is linked, grouped and cited in meta-documents. Today, everything is mixed up: the most important things are documents, which already contain both information and "meta-data" (natural language), and connections between pieces of information are not practical. They are either established through "physical" association (if they are placed in the same folder) or through hyperlinks, which can only point at another file (or a position in a file), not at related information.
20040208:
I thought about this idea a bit more, because I tried to explain it to my dad (well, the rough edges). I have not reread what I have already written on the topic. I will give another point of view here and will later compare how well it matches what I have outlined above.
The system will deal with information, rather than with files. I don't think I can mess with anything that's incorporated in computer architecture (which might ultimately be a way of improving things a lot more), but I find it possible that there might be a way of incorporating a better underlying system for computers. It will allow applications to do many things that are considered impossible or hard to do, but would be a real improvement in computer interfaces and data organization.
I think of computers as tools to manage information (and they are a really universal tool! Much more so than any other tool I can think of). They are meant to acquire, provide, modify and distribute information. Therefore we need a direct approach, which deals with information rather than wrappers for information (files, file formats, folders, ...), i.e. information, not data. The difference, as I see it, is that information relates: it forms a network, and one piece of information can be extended with other pieces of information, while data is separated from other data; it just sits there and waits to be accessed, modified and saved. In a way, information is more "intelligent", because it requires selection, extension and highlighting to become more usable, while data needs understanding (to become "information", which can then be dealt with the way I described) before I can work with it; it requires me to manipulate it instead of allowing me to work with it.
This just didn't get my point across. Information is data that knows about its context. Information can extend itself, it forms a network, enabling everyone to understand where it came from, how it was concluded, ...
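To make "data that knows about its context" a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class name `Piece`, the fields `source` and `derived_from`, and the example values are hypothetical and not part of any existing system.

```python
from dataclasses import dataclass, field


@dataclass
class Piece:
    """A hypothetical piece of information: a value plus its context."""
    value: str
    source: str = "unknown"                            # where it came from
    derived_from: list = field(default_factory=list)   # what it was concluded from

    def provenance(self):
        """Collect the values of every piece this one was concluded from."""
        seen, stack, trail = set(), list(self.derived_from), []
        while stack:
            piece = stack.pop()
            if id(piece) in seen:       # guard against cycles in the network
                continue
            seen.add(id(piece))
            trail.append(piece.value)
            stack.extend(piece.derived_from)
        return trail


# Made-up example values, only to show the idea.
departure = Piece("train leaves at 09:15", source="timetable")
closure = Piece("track 3 is closed", source="news")
conclusion = Piece("take the 09:45 instead", derived_from=[departure, closure])
```

Calling `conclusion.provenance()` walks back through the pieces the conclusion was built on, which is roughly what I mean by being able to see where information came from and how it was concluded.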
Maybe it helps if I relate it to the way it works today: I open up a document. In there, there is information, but it is connected by weird, non-universally understandable structures. To edit this information, I need to interpret it, relate it (in my mind) to things I know about, acquire more information from other sources, combine these pieces of information, connect them with weird, non-universally understandable structures and put it all down. Then, I need to save it in a wrapping structure and make it available to others, who can use it as another source of information for their work, but, in order to be able to use it, need to do the same thing I just did all over again.
If I want to do the same thing the way I was trying to outline above, I would start searching the information I have for certain pieces. Once I find them (same thing as opening up a document to start working from), I choose which related information is relevant to what I am looking for. By selecting, I increase the "value" of certain connections to form a single piece of information with distinguishable parts. I may add other pieces of information not yet incorporated in the "database" and relate them to my document, which in turn relates them to other pieces of information it has been connected to previously. By doing all of the above, I enable other people to see the relationship between different kinds of data, thus making it possible for them to draw their own conclusions and work on the network. Next time I look at the same information again, I will find that it has been altered significantly and find conclusions I would not have thought of.
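The step where selecting increases the "value" of a connection can be sketched like this. It is only one possible reading of the idea; the class `Network`, its methods and the simple additive weighting are assumptions I am making for the example.

```python
from collections import defaultdict


class Network:
    """Hypothetical store of weighted connections between pieces of information."""

    def __init__(self):
        # Connection weight, keyed by the unordered pair of pieces it joins.
        self.weights = defaultdict(float)

    def relate(self, a, b, weight=1.0):
        """Create a connection, or strengthen an existing one."""
        self.weights[frozenset((a, b))] += weight

    def select(self, a, b, boost=1.0):
        """Choosing a related piece as relevant reinforces that connection."""
        self.relate(a, b, boost)

    def related(self, a):
        """Pieces connected to `a`, strongest connection first."""
        found = []
        for pair, weight in self.weights.items():
            if a in pair and len(pair) == 2:
                (other,) = pair - {a}
                found.append((other, weight))
        return sorted(found, key=lambda item: (-item[1], item[0]))


net = Network()
net.relate("my document", "web page A")
net.relate("my document", "web page B")
net.select("my document", "web page B")   # I found B relevant
ranking = net.related("my document")
```

After the selection, "web page B" is listed before "web page A" for anybody who looks at "my document", which is the kind of feedback I have in mind.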
This is what I mean by "working with information" rather than data. While the above example is the most obvious way of using the new way of storing data, it can also be used in many other ways. It can be used to distribute problems and have them solved in a combined effort. It can be used to relate things that were completely separate before (e.g. a game can "learn" to use information it was not originally programmed to use).
The technical implications of such systems would be huge: The existing system (files store data, computer interprets data to display, human interprets display to information, human edits information, ...) would need to be redesigned. Files are not a usable concept for dealing with information. The internet is a somewhat more "primitive" way of storing information. There are still documents (i.e. files), but they have (as far as HTML and the like are concerned) a way of relating to other documents (hyperlinks, i.e. connections to other pieces of information). Many times, web sites are powered by databases, which are interpreted by computers to make documents, which have a structure similar to language (weird, non-universally understandable structures).
The new approach would be to connect the underlying data structures and combine them to form an information network. You can think of it like an HTML document with lots of links. Every time I stumble upon a piece of information that I find interesting and important for the purpose I am collecting information for, I can click on it and learn more about it. While today I need to read another document, find the interesting pieces of information, interpret and incorporate them into a new document and make that accessible, I would then, by simply accessing a very precisely outlined piece of information, inform the network that I find this information relevant. By doing that, I would form a new sub-network on a certain topic. My central (starting point) document would serve other people as a short-cut to relevant information. By accessing the conclusions of a lot of research (it grows incrementally fast) I would be able to collect information very efficiently and come to new conclusions (which I would enter as new data into the network). How I came to these conclusions can easily be accessed and once more be reinforced as valuable data.
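A starting document that serves as a short-cut to a sub-network could be sketched as a plain breadth-first walk over the links. The function name `sub_network`, the `max_hops` limit and the example link table are all made up for this illustration.

```python
from collections import deque


def sub_network(links, start, max_hops=2):
    """Return every piece reachable from `start` within `max_hops` links.

    `links` maps a piece to the pieces it points at; the result maps each
    reachable piece to its distance (in links) from the starting point.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        piece = queue.popleft()
        if distance[piece] == max_hops:
            continue                      # do not follow links any further
        for neighbour in links.get(piece, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[piece] + 1
                queue.append(neighbour)
    return distance


# Hypothetical link table: my starting document points at two pieces, etc.
links = {
    "starting document": ["piece a", "piece b"],
    "piece a": ["piece c"],
    "piece c": ["piece d"],
}
topic = sub_network(links, "starting document", max_hops=2)
```

With `max_hops=2` the walk stops at "piece c" and leaves "piece d" out; somebody more concerned with that aspect could start their own sub-network from there.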
It is important that data is split into the smallest imaginable parts. Larger data-networks become pieces of information. (For example: a timetable of a train station only makes sense if it is combined information (time, track, destination, ...). However, time can be a separate piece of information. It must be connected to a track, a destination, ... in order to make sense. But there can be a lot more information associated with this time (accidents, news, ...), which might be relevant. While at most times the traveller will not be concerned with what happened at the same time he got on a train, it might matter at times (if, for example, at the same time there was an avalanche at his destination, he might choose not to go at all [bad example]). If the information is relevant (train broke), it can be related to time, date and train; if it is not (thunderstorm in South America), it won't. The consequence is a network of information, which might be relevant at times. A sub-network (what the user works on) can choose to incorporate this information or not.) These networks (or parts of them) might become part of a larger network, if the information they contain is relevant to more than the special situation the sub-network dealt with. (For example: It is not important to most people when a train stopped at a small station. It is important, however, that it stopped there. So, the station can become part of a larger network showing what trains go where in a certain region, while the timetable remains part of the sub-network containing all timetables of this station. [bad example]). A new sub-network can deal with a completely different topic, only containing very little information from another sub-network. However, there will be links to this network enabling people to research other topics, which might be more concerned with this particular aspect.
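The timetable example can be sketched with small "atoms" of data that only become information once they are connected. The names `Atom` and `Graph`, the `kinds` filter and all the values (including the destination) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """The smallest piece of data: a kind and a value, nothing else."""
    kind: str
    value: str


class Graph:
    """Hypothetical network of atoms; connections are unordered pairs."""

    def __init__(self):
        self.edges = set()

    def link(self, a, b):
        self.edges.add(frozenset((a, b)))

    def neighbours(self, atom, kinds=None):
        """Atoms connected to `atom`, optionally restricted to some kinds."""
        found = set()
        for edge in self.edges:
            if atom in edge:
                for other in edge - {atom}:
                    if kinds is None or other.kind in kinds:
                        found.add(other)
        return found


time = Atom("time", "09:15")
track = Atom("track", "3")
destination = Atom("destination", "some mountain town")
news = Atom("news", "avalanche at the destination")

graph = Graph()
graph.link(time, track)
graph.link(time, destination)
graph.link(time, news)          # relevant, so it gets related to the time

# The timetable sub-network only incorporates what it cares about.
timetable_view = graph.neighbours(time, kinds={"track", "destination"})
everything = graph.neighbours(time)
```

The timetable view ignores the news atom, while another sub-network (or a worried traveller) can choose to incorporate it.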
We might be working with information now. However, our approach is not meant for working with information, but with data. We need to put together a network of data, forming information, link these networks and make information networks to hold all the information we need (or want or have).
20040208(2):
I read an article on MS's WinFS and search engine plans in Longhorn. In there, there was some information on privacy concerns. I have not touched on that at all in this "file" ;-). I may have created the impression that there is no difference between local and public data. Ideally there wouldn't be. Unfortunately there are bad guys out there (and I am naturally a person who doesn't like to trust people more than necessary) and I would not want them to know what I have been searching for and (above all) I don't want them to be able to access my personal data, which would not be any different from public data in terms of how it is stored. One way to solve this (that I have just come up with, I don't guarantee anything) would be to keep public everything that has ever been public. So, when I enter new data, it would not be visible to anybody. However, any other links would of course remain public (and the quality would still be improved). However, I would encourage people to share data (it is not likely that any private person would provide a lot of new data anyway). Another problem might be how to make sure that intellectual property is honored. It might be possible to hope for somebody to recognize the information and link it to the source it came from (if the original "poster" failed to do so).
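One way to read the idea above (new data starts out private, everything already public stays public) is sketched below. This is just my interpretation put into code; the class `Store`, its methods and the owner names are assumptions, and nothing here is a real security mechanism.

```python
class Store:
    """Hypothetical store where new pieces can start out private."""

    def __init__(self):
        self.owner = {}      # piece -> owner, recorded for private pieces only
        self.links = set()   # unordered pairs of pieces

    def add(self, piece, owner=None):
        """Add a piece; with an owner it stays private until published."""
        if owner is not None:
            self.owner[piece] = owner

    def link(self, a, b):
        self.links.add(frozenset((a, b)))

    def publish(self, piece):
        """Make a private piece public. There is deliberately no way back."""
        self.owner.pop(piece, None)

    def visible_links(self, viewer):
        """Links whose pieces are all public or owned by the viewer."""
        def can_see(piece):
            return self.owner.get(piece) in (None, viewer)
        return {link for link in self.links if all(can_see(p) for p in link)}


store = Store()
store.add("encyclopedia entry")
store.add("news article")
store.add("my notes", owner="me")
store.link("encyclopedia entry", "news article")
store.link("my notes", "encyclopedia entry")

seen_by_others = store.visible_links("somebody else")
seen_by_me = store.visible_links("me")
```

Other people only see the link between the two public pieces, while I also see the link to my notes, at least until I decide to publish them.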
It might be fun to imagine what a user interface for this kind of computer would be like. There would be a lot of information being displayed and combined and created and sent. Conversations, information on select topics, news feeds, email, currently open "documents", ... Basically, it would be possible to create any kind of user interface on top of this system, which is a good thing (TM), because it allows for more individual computing and more productivity for the individual (no, this is not "it's good for the economy"-bullshit, but rather an "it's good for me"-approach, because I like to be able to trace my ideas).
I am thinking about how to make this system compatible with existing systems. I don't think it's good to do this, but it may be necessary to push wider adoption. It would be insane to try to make existing systems run on it.

*Update 20040405*:
I have learned of Dashboard, which takes a less radical approach, but may be a nice way of doing something vaguely similar with existing technology.
