
Submission + - Adobe Sponsors Competition on Vandalism Detection in Wikidata

An anonymous reader writes: Wikidata is the knowledge base of the Wikimedia Foundation that anyone can edit. Like Wikipedia, Wikidata frequently gets hit by vandalism. Since knowledge bases like Wikidata are poised to be integrated into all kinds of information systems, however, wrong facts are not just displayed on Wikidata's pages but may propagate directly to all systems using the knowledge base. Hence, detecting and reverting vandalism and other kinds of damaging edits is an even more important task than on Wikipedia. Recently, German scientists published the first machine learning-based approach to vandalism detection in Wikidata, and now Adobe is sponsoring a competition on vandalism detection, the WSDM Cup Challenge, awarding $2500 for the best-performing solutions, which will also be published as open source.

Submission + - Tweet forensics and why your impersonation of a 16-year-old girl won't last long

An anonymous reader writes: Can computers pick up your age and gender from your tweets? If you want to give it a try, here's your chance: "To develop your software for age and gender identification, we provide you with a training data set that consists of blog posts, Twitter tweets, social media texts, as well as hotel reviews." Well, at least my paid Amazon reviews are safe for the time being...
Science

Submission + - Competition on Identifying Sexual Predators in Chats (uni-weimar.de)

An anonymous reader writes: Researchers from the University of Lugano, Switzerland, and other universities from the US and Europe organize a competition to automatically identify sexual predators in chat logs: [...] given chat logs involving two (or more) people [...], determine who is the one trying to convince the other to provide some sexual favour. Their data set covers hundreds of chat logs with dozens of true positives (i.e., chats where one is trying to hit on another).
Wikipedia

Submission + - Competition on Detecting Vandalism in Wikis (webis.de)

marpot writes: Recently, the 1st International Competition on Wikipedia Vandalism Detection finished: 9 groups (5 from the USA, 1 affiliated with Google) tried their best to detect all vandalism cases in a large-scale evaluation corpus. The winning approach detects 20% of all vandalism cases without misclassifying regular edits; moreover, it can be adjusted to detect 95% of the vandalism edits while misclassifying only 30% of all regular edits. Thus, by applying both settings, manual double-checking would only be required on 34% of all edits. It is not yet known whether the rule-based bots on Wikipedia can compete with this machine learning-based strategy. Anyway, there is still a lot of potential for improvement, since the top 2 detectors use entirely different detection paradigms: the first analyzes an edit's content, whereas the second analyzes an edit's context using WikiTrust.
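The arithmetic behind combining the two settings can be sketched as follows. This is only an illustration, not the organizers' calculation: the share of vandalism among all edits is not stated in the summary, so `prevalence` is an assumed parameter, and the sketch further assumes that every edit caught by the strict setting is also caught by the loose one.

```python
def manual_review_fraction(prevalence, strict_recall, loose_recall, loose_fpr):
    """Fraction of all edits left for manual review when two settings are combined.

    Assumptions (not from the summary): `prevalence` is the share of vandalism
    among all edits; the strict setting flags no regular edits, and everything
    it catches is also caught by the loose setting.
    """
    # Vandalism flagged by the loose setting but not already handled by the strict one.
    vandalism_to_check = prevalence * (loose_recall - strict_recall)
    # Regular edits wrongly flagged by the loose setting.
    regular_to_check = (1.0 - prevalence) * loose_fpr
    return vandalism_to_check + regular_to_check


# Try a few assumed vandalism shares with the rates quoted in the summary.
for prevalence in (0.05, 0.07, 0.10):
    share = manual_review_fraction(prevalence, 0.20, 0.95, 0.30)
    print(f"vandalism share {prevalence:.0%}: review {share:.1%} of edits")
```

Under these assumptions the review share is dominated by the 30% of regular edits that the loose setting flags, so it stays slightly above 30% for any small vandalism share.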

Comment Re:Existing (Score 1) 116

Me too, experience that is. We took the features from our research that offer high throughput and implemented a live edit analysis for the English portion of Wikipedia. It listens on the IRC channel, downloads the wikitext of each edit's old and new revisions, and then does its magic. And it once did so on an old laptop. The computer was connected at max 1 GBit/s.

Comment Re:Existing (Score 1) 116

I cannot agree more with what you say, but I'd like to give it a twist: I want computers to assist me, and I want them to do it well, reliably, and robustly. If I happen to be a Wikipedia editor, that doesn't change a thing: I still want the computer to assist me with what I'm doing. Now, currently there is no such thing, and all I'd like to do is foster research toward it.

Now, some always go ten steps further when someone talks about a new "solution" based on computers. They directly envision a world where computers take over. And that, apart from being unrealistic today, must be considered ideological rather than logical.

After all, all you see here and all you see on Wikipedia is made possible only by machines working with intelligent algorithms.

Comment Re:quite a bit of work on this (Score 1) 116

You're right, it's machine learning, data mining, NLP, and information retrieval. But the fun thing is turning a research prototype into a tool that can be left alone most of the time. That hasn't happened yet. Also, research on this problem only started in 2008, while rule-based tools developed by Wikipedians have been around since 2006. The works you listed are actually all there is! That's not much to work with, is it?

Comment Re:Been done? (Score 2, Informative) 116

We are very aware of the existing tools (Huggle, Twinkle, and so on). See the links in the above post, and see the links in the resources section of the competition Web page. An accurate vandalism detector will take a lot of research and development, just like spam detectors did... Why did you stop developing your tool, anyway?

Comment Re:Existing (Score 5, Informative) 116

We have studied the accuracy of ClueBot and found that (on a small corpus) it has very good precision (low false positive rate) but very low recall (low true positive rate). (See: http://www.uni-weimar.de/medien/webis/publications/downloads/papers/stein_2008c.pdf) But the picture might look quite different on a large scale.
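To make the precision/recall distinction concrete, here is a minimal sketch that computes both from confusion counts. The counts are made up purely for illustration and are not ClueBot's measured results.

```python
def detector_rates(tp, fp, fn, tn):
    """Return (precision, recall, false_positive_rate) from confusion counts.

    tp: vandalism edits flagged      fp: regular edits flagged
    fn: vandalism edits missed       tn: regular edits left alone
    """
    precision = tp / (tp + fp)            # how many flagged edits really are vandalism
    recall = tp / (tp + fn)               # how much of the vandalism gets flagged
    false_positive_rate = fp / (fp + tn)  # how many regular edits get flagged
    return precision, recall, false_positive_rate


# Hypothetical counts: a cautious bot that rarely errs but misses most vandalism.
precision, recall, fpr = detector_rates(tp=10, fp=1, fn=90, tn=899)
print(f"precision={precision:.2f} recall={recall:.2f} fpr={fpr:.4f}")
```

A detector like the hypothetical one above almost never reverts a good edit, yet leaves most vandalism for humans to find.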
Announcements

Submission + - Developing a Vandalism Detector for Wikipedia (webis.de)

marpot writes: The title really says it all. In an effort to assist Wikipedia's editors in their struggle to keep articles clean, we are conducting a public lab on vandalism detection. The goal is to develop a practical vandalism detector that is capable of telling ill-intentioned edits apart from well-intentioned ones. Such a tool, which will work not unlike a spam detector, will free up the crowd's workforce currently occupied with manual and semi-automatic edit filtering. The performance of submitted detectors is evaluated on a large collection of human-annotated edits, which has been crowdsourced using Amazon's Mechanical Turk. Everyone is welcome to participate.
