Using AI To Filter RSS Feeds
holden writes "According to a blog post, AideRSS has moved from closed to open beta. I've been using AideRSS over the past few weeks to filter my RSS feeds (including Slashdot and Reddit) and I've been quite impressed. They talk a bit about how the filtering system works, which apparently tracks a mixture of things, from pick-up in other blogs, to some clustering technology."
Some filtered RSS feeds (Score:3, Informative)
Re: (Score:1)
"lindsay lohan arrested" scored 5.0 on CNN. (Score:2, Funny)
Re: (Score:2)
Title really sucks (Score:1, Redundant)
As for the article, what kind of person or group has too many RSS feeds to look through?
I'm asking because I really have no idea. I have linked the RSS bar in my Gmail to Tomshardware and Slashdot, but that's about all that I need....
Re: (Score:1, Funny)
Need RSS Feeds filter (Score:1)
I am probably an exception to the rule, but I just counted up the different individual news feeds in my NewsReader's (Default- RSS OWL [rssowl.org] Java OS) start-up OPML file, and there are 1074 unique feeds in it. Granted, there are a few major mainstream news sites whose feed lists I just recently updated from their RSS pages, and I haven't yet filtered out the many feeds I consider irrelevant. Two I recently rescraped but haven't filtered are McClatchy News and the NYTimes, but even after filter
Secret Sauce and GeoRSS (Score:3, Informative)
Secret sauce? Why do I prefer open sauce?
One other way to filter RSS is by geographic location through using GeoRSS [slashdot.org]. However, the source RSS must be offered in GeoRSS for this geolocalization filtering to work... but it's only a matter of time, we'll get there. (hey, even slash has a plugin that works for publishing GeoRSS)
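A rough sketch of what geographic filtering could look like, assuming the feed uses the GeoRSS-Simple `georss:point` element ("lat lon" as text) inside ordinary RSS 2.0 items. The sample feed, the function names and the 50 km radius are made up for illustration; this is not how any particular reader does it.

```python
import math
import xml.etree.ElementTree as ET

# Namespace used by GeoRSS-Simple elements such as <georss:point>.
GEORSS_NS = "{http://www.georss.org/georss}"

# Made-up sample feed, for illustration only.
SAMPLE_FEED = """<rss version="2.0" xmlns:georss="http://www.georss.org/georss">
  <channel>
    <title>Example feed</title>
    <item><title>Montreal item</title><georss:point>45.50 -73.57</georss:point></item>
    <item><title>Paris item</title><georss:point>48.85 2.35</georss:point></item>
    <item><title>Item without a location</title></item>
  </channel>
</rss>"""


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def items_near(rss_xml, lat, lon, radius_km):
    """Return titles of items whose georss:point lies within radius_km."""
    root = ET.fromstring(rss_xml)
    titles = []
    for item in root.iter("item"):
        point = item.find(GEORSS_NS + "point")
        if point is None or not point.text:
            continue  # items without a location are skipped, not guessed
        item_lat, item_lon = map(float, point.text.split())
        if haversine_km(lat, lon, item_lat, item_lon) <= radius_km:
            titles.append(item.findtext("title", default=""))
    return titles


if __name__ == "__main__":
    print(items_near(SAMPLE_FEED, 45.5, -73.6, 50))
```

Items with no location are simply dropped here; a real reader would probably want to keep them in a separate "unlocated" bucket instead.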
Re: (Score:3, Informative)
Geolocation is a possible additional filter (think "local news" section of a newspaper), but I guess most people are interested in items from their field of interest regardless of the physical location where the post was made.
I made some experiments [bergie.iki.fi] on a more open source version of the "secret sauce". It seems quite easy to determine relevance of posts using the various social news services out there.
Re: (Score:2)
event calendar, with the advantages of letting the reader select its own idea of "local"?
(Obviously you'd need to apply some minimal pre-filtering for "local" on the server)
If Only ... (Score:5, Funny)
Re: (Score:2)
Re: (Score:1)
Download URL (Score:1)
Re: (Score:3, Interesting)
Seeing as half my feeds are internal work related and the fact I don't want someone profiling all feeds I am reading I won't be using the service.
Re: (Score:2)
The privacy concerns for this seem to mirror those of any of the public feed browsers like Google Reader; it's probably a bad idea to use them for private feeds.
Potentially scary side-effects already. (Score:5, Interesting)
Let's say you're a drug company that is releasing a potentially controversial drug. You can mine the data of the blogosphere and issue press releases as a pre-emptive strike against larger media stories. This marks the real beginning of being able to effectively monitor and even potentially control some of the social aspects of the internet. I think it's a great innovation indeed, with potentially scary side-effects.
Personally it is nice to be able to filter through a billion RSS feeds to find information that I'm interested in though.
Re: (Score:3, Insightful)
I fail to see your reasoning. Companies have always been able to "monitor" blogs and subscribe to RSS feeds. And they aren't controlling the social aspects of the internet at all. A press release has always been a standard means of communication for corporations; as long as they aren't creating fake blogs, I don't think they are trying to control any aspect of the social i
Re: (Score:1)
Re: (Score:2)
In reality, however, I think the common practice of spider-intelligence-gathering is simply another tool for marke
Re: (Score:3, Interesting)
Sounds like a fantastic market, actually. I recently picked up a client in the casino management market because I had made some comments on a blog regarding their lack of insight towards proper marketing and keeping a decent percentage of return customers. They actually contacted me, and I've spent a lar
Re: (Score:1)
A bit tangential, but lately in server logs I have access to I've noticed a proliferation of 'vertical search engine' bots, which do not claim to ever be planning to provide the data they acquire at the websites' bandwidth cost in any manner which could possibly be deemed reciprocal. Even more troubling are a few sites that throw mad wget-type bots at sites, with user-agent strings claiming to be a common browser, with no decent time intervals between GET requests to avoid bandwidth spikes, and wil
Another site using AI (Score:4, Interesting)
Unlike AideRSS, Thoof isn't an RSS aggregator, rather users submit stories, in a manner similar to Slashdot, Digg, and Reddit.
Sux0r : Bayesian RSS filter and you can run it too (Score:2, Informative)
http://sourceforge.net/projects/sux0r/ [sourceforge.net]
What I find interesting is, it is one of the verrry rare examples of 'internet 2' service that you can own yourself (instead of registering here or there for more ads or worse).
A downside of Sux0r is that it seems not to have evolved for a couple of years (but still works, possibly that's why
I for one am desperately waiting for a *local* RSS aggregator which would allow *me* (and not some site's AI) to Bayes-filter my se
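For anyone wondering what Bayes-filtering feed items locally might involve, here is a minimal naive Bayes sketch with add-one smoothing. It is not Sux0r's code; the class name, the "keep"/"skip" labels and the training titles are all invented for the example.

```python
import math
import re
from collections import Counter


class NaiveBayesFilter:
    """Tiny two-class naive Bayes text classifier (illustrative only)."""

    LABELS = ("keep", "skip")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    @staticmethod
    def tokenize(text):
        # Lowercase word tokens; good enough for headlines.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def log_score(self, text, label):
        counts = self.word_counts[label]
        vocab = set()
        for c in self.word_counts.values():
            vocab.update(c)
        total_words = sum(counts.values())
        total_docs = sum(self.doc_counts.values())
        # Smoothed log prior for the label.
        score = math.log((self.doc_counts[label] + 1) / (total_docs + len(self.LABELS)))
        # Smoothed log likelihood per word; the extra +1 in the denominator
        # reserves probability mass for words never seen in training.
        for word in self.tokenize(text):
            score += math.log((counts[word] + 1) / (total_words + len(vocab) + 1))
        return score

    def classify(self, text):
        return max(self.LABELS, key=lambda label: self.log_score(text, label))


def build_example_filter():
    """Train on a handful of made-up headlines."""
    f = NaiveBayesFilter()
    f.train("Bayesian filtering for RSS readers", "keep")
    f.train("Open source feed aggregator released", "keep")
    f.train("Celebrity arrested again", "skip")
    f.train("Celebrity gossip roundup", "skip")
    return f


if __name__ == "__main__":
    example = build_example_filter()
    print(example.classify("New open source RSS filter"))
    print(example.classify("Celebrity gossip"))
```

The point of running something like this locally is that the training data (what you keep and what you skip) never leaves your machine.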
Re: (Score:1)
Fancy stats != intelligence
Re: (Score:2)
I have a bachelor's degree in Artificial Intelligence, and I certainly wouldn't claim that intelligence doesn't simply boil down to mathematical computation at some point; indeed, I suspect that it probably does. It's just that we don't understand it yet.
recursion (Score:5, Insightful)
Re: (Score:3, Informative)
Re:recursion excursion (Score:2)
No, then a few people would be generating useful content.
Dare to dream!
Re: (Score:2)
Artificial Intelligence on Slashdot (Score:2, Funny)
however
It is pleasing to see that scientists around the world have started to produce artificial intelligence to make up for the loss of natural intelligence, but I think that like everything else, perhaps it is also equally important that we conse
Dupes! (Score:1)
Another half-baked open sourced approach (Score:2)
IA works, as noted in the readme [srijith.net], by computing a relevance factor, which in turn is based on four other 'relv' - category relevance, feed relevance, keyword relevance and item relevance. I used it as my reader for quite some time before moving over to 'better' readers.
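To make the idea concrete, here is one way four such components could be folded into a single relevance factor. The weighted average and the weights below are purely my assumption for illustration; the actual combination is whatever the readme describes.

```python
from dataclasses import dataclass


@dataclass
class Relevance:
    """The four component scores, each assumed to lie in [0, 1]."""
    category: float
    feed: float
    keyword: float
    item: float


# Hypothetical weights, chosen only for the example.
WEIGHTS = {"category": 0.2, "feed": 0.2, "keyword": 0.3, "item": 0.3}


def relevance_factor(relv, weights=WEIGHTS):
    """Weighted average of the four components (an assumed formula)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(getattr(relv, name) * w for name, w in weights.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    print(relevance_factor(Relevance(category=0.0, feed=0.0, keyword=1.0, item=1.0)))
```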
Filtering RSS (Score:2)
Re: (Score:2)
I guess that's your problem. Sorry, you don't get to define the terms.
Personalize instead (Score:3, Insightful)
Using an AI to re-sort those feeds is definitely interesting from a coder's point of view, but trying to give some kind of objective view to a feed is probably not what the average user wants.
Why not do it the other way around and personalize them instead? Maybe it has been done before, but it would be nice if there was a reader to rerank (or even filter out) certain domains, keywords, tags and categories. It could take the given rank as the base score and then re-sort it according to the user's personal preference, e.g. if someone doesn't like politics he could give the keywords "Bush, Cheney, election, etc." a negative multiplier and maybe the keyword "funny" gets a positive one. It could even consider the time of day - politics in the morning and funny pictures during the lunch break or something.
Just a quick thought though, someone can perhaps come up with something better. Anyway, I am pretty sure that personalization is the better approach here.
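The reranking idea above could be sketched roughly like this. Everything here (the `Item` class, the substring keyword matching, the example multipliers and scores) is invented for illustration; a negative multiplier pushes an item below zero so that it falls under the cut-off.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    base_score: float  # the rank given by the upstream service


def personal_score(item, multipliers):
    """Multiply the base score by the factor of every keyword found in the title."""
    score = item.base_score
    title = item.title.lower()
    for keyword, factor in multipliers.items():
        if keyword.lower() in title:  # naive substring match
            score *= factor
    return score


def rerank(items, multipliers, min_score=0.0):
    """Sort items by personalized score, dropping those below min_score."""
    scored = [(personal_score(item, multipliers), item) for item in items]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in kept]


# Made-up preferences and items.
EXAMPLE_MULTIPLIERS = {"election": -1.0, "funny": 2.0}
EXAMPLE_ITEMS = [
    Item("Election results roll in", 4.0),
    Item("Funny cat pictures", 2.0),
    Item("New kernel released", 3.0),
]

if __name__ == "__main__":
    for item in rerank(EXAMPLE_ITEMS, EXAMPLE_MULTIPLIERS):
        print(item.title)
```

Time-of-day preferences could be handled the same way by picking a different multiplier table depending on the hour.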
Re: (Score:2, Insightful)
I also think there should just be a thumbs up / thumbs down option, which would save you typing things in.
Risks of non-algorithmic filtering (Score:2)
The issue is lack of responsibility or accountability, because at a certain level of complexity, it is no longer practical to understand or explain the basis of individual decisions. The company can just say "the computer did it."
A few years back there was serious consideration being given to using neural
Re: (Score:2)
Thus the company could always say "this application was rejected because the applicant's income was too low, and would have been accepted if the applicant had earned X thousand more a year." Raising the question, of course, of whether this was the real reason. Or what it means to talk about "the real reason" in the case of a decision made by a neural net.
How is it any different from dealing with a person? You have no idea if what somebody tells you is "real." And people can get hunches, where they feel
Just what I need (Score:1)
Two things I'd like to see:
An offline version; I know it's unlikely to appear (Web 2.0 business model and all that) but I'll never use the online one in the long term.
The ability to upload a bookmarks file filled with RSS links. I don'
My first thought? (Score:1)
I would have replied to this earlier (Score:2)