Collaborative filtering could be elegant...? (Score 3)
I'm also surprised that this has only been brought up twice. Because I don't think it's been getting enough press, and because I like explaining things, I'll go through it in more detail.
Any registered user would be able to assign a rating, on some arbitrary scale, to any comment, on the criterion "I would like to see more like this." The system would track correlations between users across comments and use these to generate a prediction of the rating of each comment by each user. The ratings are, of course, used to sort and filter messages for display.
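To make the mechanism concrete, here is a minimal sketch of the correlate-then-predict step in Python. The data, names, and the choice of Pearson correlation with a weighted-deviation prediction are illustrative assumptions, not a claim about how Slashdot would actually implement it:

```python
import math

# Toy ratings store: user -> {comment_id: rating}. A missing key means
# "unrated", which is distinct from a rating of zero.
ratings = {
    "alice": {1: 5.0, 2: 1.0, 3: 4.0},
    "bob":   {1: 4.0, 2: 2.0, 4: 5.0},
    "carol": {1: 1.0, 2: 5.0, 3: 2.0},
}

def pearson(a, b):
    """Correlation between two users over the comments both have rated."""
    common = sorted(set(ratings[a]) & set(ratings[b]))
    if len(common) < 2:
        return 0.0
    xs = [ratings[a][c] for c in common]
    ys = [ratings[b][c] for c in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0

def predict(user, comment):
    """Predicted rating: user's own mean, adjusted by the correlation-
    weighted deviations of other users who rated this comment."""
    mu = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other in ratings:
        if other == user or comment not in ratings[other]:
            continue
        w = pearson(user, other)
        mo = sum(ratings[other].values()) / len(ratings[other])
        num += w * (ratings[other][comment] - mo)
        den += abs(w)
    if den == 0.0:
        return mu
    return mu + num / den
```

Here alice has never seen comment 4, but because her past ratings track bob's, the engine predicts she'll rate it highly too.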
This addresses several design concerns:
- Varying interests
Several posters have voiced the concern that (say) some people will like "Funny" posts while others will dislike them. They have suggested several solutions: for example, an option in Preferences to treat any given qualifier as a bonus or a penalty, or explicitly multidimensional ratings. The proposed system would handle such things implicitly. It would also ground specialized rating in empirical evidence (if in a way that's a bit inflexible), rather than relying on a fixed set of somewhat arbitrarily-chosen and somewhat nonorthogonal qualifiers ("Informative", "Interesting", "Insightful").
- Abuse of power
In this design, abuse is fairly difficult. It is only possible to gain "power" by granting ratings highly correlated with those of a given target audience. You can use this power to knock only so many comments out of that audience's sight before the engine decides your ratings no longer correlate with that audience's, and you lose your power. The system also has a lot of inertia; it should take a large concerted effort to knock out any given comment.
- Insufficient moderation
Moderation points are restricted in the existing design because of the potential for abuse; as we see above, that's much less of a concern here. The proposed design also provides (to a certain extent) natural incentives to rate under-rated comments. A user who sees an incorrect rating for a given comment may stand directly to benefit by rating it himself, since it's possible that the engine doesn't yet know about him that he doesn't like that kind of comment. Of course, after the system has had some time to learn about the user, this explanation becomes implausible, and the alternative explanation -- that not enough people have bothered to assign a rating to this comment -- comes to the fore. In this case the user's only incentive to assign a corrected rating is his abstracted "public" self-interest.
- Overmoderated comments
In the existing design, certain controversial comments get large amounts of moderation points burned on them in either direction. This represents a waste of moderation effort. In the proposed design, the engine should be able to make a guess as to which user will be on which side of a controversy and show a different rating to each. This should limit the amount of wasted moderation effort.
- Granularity and roundoff error
In the current design, ratings are discrete and very granular. In the proposed design, ratings are continuous. (They also can exist over an arbitrary user-chosen range, since they're predicted to match the user's also-arbitrary ratings.)
- Self-reinforcing selection
A concern described, for example, elsewhere in this thread. With respect to the collaborative filter itself, every user is his own most powerful moderator; a concentration of "moderating power" within some viewpoint opposing the user's is, to that user, merely irrelevant. (An exception would be the concerted knocking-out described above.) Of course, selection would still be possible in forces acting outside the filter, such as administration or top-level article-posting, but that's not necessarily a problem.
Some of the concerns that arise in this design:
- Implementation cost
While collaborative filtering has been an interest of mine for a while, I haven't actually looked at the literature on the subject (d'oh!), so I don't know how hard it would be to put together. It certainly wouldn't be trivial.
- Computation cost
I'm similarly clueless here. I know there's linear algebra and sparse matrices involved, but that's about it. Once the prerequisite number-crunching is done, though, I'd guess the incremental cost of predicting a single rating for display shouldn't be more than about ten to a hundred numeric operations (depending on how simple a model the number-crunching generates).
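A back-of-the-envelope sketch of why the incremental cost could be that small, under the (assumed, made-up) premise that the offline crunching compresses each user and each comment down to a short vector of latent factors:

```python
# Hypothetical output of the offline number-crunching: a small vector of
# latent factors per user and per comment. The numbers here are invented
# purely for illustration.
user_factors    = [0.9, -0.2, 0.1, 0.0, 0.3, -0.1, 0.2, 0.0, 0.1, 0.4]
comment_factors = [1.1, 0.0, -0.3, 0.2, 0.5, 0.1, -0.2, 0.3, 0.0, 0.2]

def predict_one(u, c):
    """One displayed rating = a k-term dot product: roughly k multiplies
    plus k adds, i.e. ~20 operations for k = 10."""
    return sum(uf * cf for uf, cf in zip(u, c))
```

The model's complexity knob is just the vector length k: ten factors gives ~20 operations per displayed comment, fifty factors ~100.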
- Conceptual complexity
All the other proposed designs are fairly straightforward, and it's relatively easy to understand their workings and what might go wrong with them. This design is not simple. Understanding of its inner workings requires some technical knowledge, and it may have some hidden pitfalls that aren't obvious without study.
- Privacy
The proposed design requires gathering a significant amount of information from each user.
- Balkanization
As described in the parent message. A related problem is that this design would expand the discussion load on Slashdot, because off-topic discussions would no longer be discouraged for the people interested in them.
A few concerns can be addressed by extending the proposal somewhat:
- Anonymous people's ratings
Clearly, the anonymous reader will need some kind of rating system. One option would be to perform principal-components analysis (that's what it's called, right?) on the entire body of ratings, and then use the strongest correlation as the one presented to the anonymous user as a representation of the interests of the Slashdot majority. Another option would be to take the top (say) five dimensions that come out of the analysis, investigate and hand-label them (#1: popular vs. unpopular; #2: funny vs. serious; etc.), and have the anonymous user assign a numeric weight to each, retained as a cookie.
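For the curious, the "strongest correlation" step can be sketched with a toy power iteration on a small dense matrix; this is an illustrative stand-in, since a real deployment would use proper sparse linear algebra:

```python
# Pull the strongest component ("dimension #1") out of a tiny
# user-by-comment ratings matrix via power iteration on X^T X.
def first_component(matrix, iters=100):
    cols = len(matrix[0])
    # Center each comment's column on its mean rating.
    means = [sum(row[j] for row in matrix) / len(matrix) for j in range(cols)]
    X = [[row[j] - means[j] for j in range(cols)] for row in matrix]
    # Uneven starting vector, so we don't begin orthogonal to the answer.
    v = [1.0 / (j + 1) for j in range(cols)]
    for _ in range(iters):
        # v <- X^T (X v), renormalized to unit length.
        Xv = [sum(x * w for x, w in zip(row, v)) for row in X]
        v = [sum(X[i][j] * Xv[i] for i in range(len(X))) for j in range(cols)]
        norm = sum(w * w for w in v) ** 0.5
        v = [w / norm for w in v]
    return v
```

On a matrix where half the users love what the other half hates, the recovered direction is the love-it-vs-hate-it axis, which is exactly the kind of dimension one would then hand-label for anonymous readers.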
- Variable criteria within a given user
Perhaps some days (or for some discussions) users will be interested in funny comments, and in other cases in serious comments. It might be a good idea to grant users some plural number of ratings categories, such that they can choose one or another (or mix among them, or even among the predefined ratings discussed above) for the purpose at hand. (Perhaps if a user hit the limit of a fixed number of categories, they could "retire" old ones they don't want anymore, to be removed from the engine's model (or whatever it is that happens to old ratings).)
- Initial knowledge
It would probably be a good idea to give the engine some automatic ratings categories -- for example, some for authorship, to specify a rating of (say) 1 if a given person wrote the message in question and a no-rating if they didn't. This would give the engine more information to draw from, and in the example would permit it to associate authors with the ratings their comments get.
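A sketch of what such an automatic category might look like as data, assuming the engine's sparse representation distinguishes "unrated" from "rated zero" (names and structure invented for illustration):

```python
# Hypothetical automatic authorship category: a pseudo-rater per author
# whose rating is 1.0 on comments that author wrote and simply absent
# otherwise, so authorship feeds into the same correlation machinery.
comments = [
    {"id": 1, "author": "alice"},
    {"id": 2, "author": "bob"},
    {"id": 3, "author": "alice"},
]

def author_category(author):
    """Sparse pseudo-ratings: present only where `author` wrote the comment."""
    return {c["id"]: 1.0 for c in comments if c["author"] == author}
```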
- Self-rating
Some posters have suggested that self-rating ("Off-topic", demoting oneself to 1, etc.) might be a helpful option. The proposed design makes self-rating mostly ineffective -- one can give only a single rating to any given comment, including one's own, and that rating may get drowned out by everything else the system knows about the comment (its size, its writer's prior record, etc.). This could perhaps be addressed by granting the writer the option of telling the system how strongly he wants the comment associated with him, so that his record doesn't reflect on the rating of his less-valued comments and vice versa. While this helps, it's only one-dimensional, so it doesn't help if the author wants to flag the message 'funny' or something. Another option would be simply for the author to flag his message manually in the subject line, although then the flags might get propagated to replies' subjects and look weird. Of course, the whole thing isn't *that* much of a problem, since misrated messages are supposed to be self-correcting in the first place.
- Continuity
The drop-down list that currently selects the rating threshold has to give discrete options. The proposed design gives continuous-valued predictions, so it would probably need an input box or something, with maybe a list box to show what percentiles correspond to what ratings.
And finally, here are some strictly optional... um, options.
- Uncertainty
It's my sporadically-educated guess that the math that you have to use for this job inherently involves tracking of uncertainties -- if only because the sparse matrices need some way to distinguish between "rated as zero" and "unrated". It might be nice to include options in the user interface to make use of this uncertainty directly. For example, one could tell the engine to sort comments at the 95th percentile of their possible range, so as to highlight comments with a high uncertainty (i.e. not-yet-rated ones) for reading and rating; or at the 1st percentile, so as to only show the comments that the engine *knows* are good. At the very least, the engine should display its uncertainty level along with its predicted rating.
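The two sorting modes above can be sketched in a few lines, assuming (purely for illustration) that the engine reports each prediction as a (mean, stddev) pair and the error is roughly Gaussian, so a z-score of about 1.645 marks the 95th percentile and about -2.326 the 1st:

```python
# Predicted rating as (mean, stddev): a well-rated comment has a tight
# estimate, a fresh comment a wide one. Numbers are invented.
predictions = {
    "well-rated comment": (4.0, 0.2),  # lots of correlated ratings seen
    "fresh comment":      (3.0, 2.0),  # almost no data yet
}

def ranked(z):
    """Comments sorted best-first by the score mean + z * stddev."""
    return sorted(predictions,
                  key=lambda c: predictions[c][0] + z * predictions[c][1],
                  reverse=True)

explore = ranked(1.645)   # surfaces high-uncertainty comments for rating
safe = ranked(-2.326)     # shows only what the engine *knows* is good
```

At z = 1.645 the uncertain fresh comment jumps to the top for reading and rating; at z = -2.326 only the comment the engine is confident about leads.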
- Advertising...
...perhaps a given user's determined preferences could be used to influence the choice of ads?