Wikimania 2007 talk notes: “A content driven reputation system for Wikipedia” by Luca de Alfaro.
Trusting content. How can we determine automatically whether an edit was probably good or probably bad?
Authors of long-lived content gain reputation.
Authors who get reverted lose reputation.
Longevity of edits.
Label each piece of text with the revision in which it was added.
Text counts as short-lived when at most 20% of it survives to the next revision.
Have to keep track of both the live (current) text and the deleted text.
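The labelling and the 20% rule above can be sketched roughly as follows. This is only an illustration, not the speaker's implementation: it assumes word-level tokens, uses Python's difflib to decide which words carried over, and the function names (label_origins, survival_fraction, is_short_lived) are made up for this example. It tracks live text only, so deleted text that is later restored would be treated as new.

```python
from difflib import SequenceMatcher

SHORT_LIVED_THRESHOLD = 0.2  # from the notes: at most 20% survives


def label_origins(prev_words, prev_labels, new_words, rev_id):
    """Label each word of the new revision with the revision that introduced it.

    Words matched against the previous revision keep their old label;
    everything else is attributed to rev_id.
    """
    labels = [rev_id] * len(new_words)
    matcher = SequenceMatcher(a=prev_words, b=new_words, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            labels[block.b + k] = prev_labels[block.a + k]
    return labels


def survival_fraction(labels_before, labels_after, rev_id):
    """Fraction of the words added in rev_id that are still present afterwards."""
    added = labels_before.count(rev_id)
    if added == 0:
        return None  # the revision added no text
    return labels_after.count(rev_id) / added


def is_short_lived(labels_before, labels_after, rev_id):
    """Apply the 20% rule to the text contributed by rev_id."""
    fraction = survival_fraction(labels_before, labels_after, rev_id)
    return fraction is not None and fraction <= SHORT_LIVED_THRESHOLD
```

Calling label_origins once per revision, feeding in the labels from the previous call, keeps a running record of where every live word came from.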
Did you bring the article closer to the future? Did you make the page better? If so, you are rewarded. If you went against the future, you are punished.
Score is based on the ratio of what was kept versus what was reverted.
+1 = everything you changed was kept.
-1 = everything you changed was reverted.
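These notes do not record the exact formula from the talk. One way to get a score with these endpoints, assuming "closer to the future" is measured with a word-level edit distance, is to compare how far the page was from a later revision before and after the edit. The names word_distance and edit_quality are hypothetical, and the choice of Levenshtein distance is an assumption made for this sketch.

```python
def word_distance(a, b):
    """Word-level Levenshtein distance between two lists of words."""
    prev_row = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        row = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            row.append(min(prev_row[j] + 1,          # delete word_a
                           row[j - 1] + 1,           # insert word_b
                           prev_row[j - 1] + cost))  # keep or substitute
        prev_row = row
    return prev_row[-1]


def edit_quality(before, after, future):
    """Score an edit (before -> after) against a later revision.

    Returns +1 if the whole change was kept, -1 if it was fully
    reverted, and a value in between otherwise.
    """
    changed = word_distance(before, after)
    if changed == 0:
        return 0.0  # nothing was changed, so nothing to judge
    gained = word_distance(before, future) - word_distance(after, future)
    return gained / changed
```

Because Levenshtein distance obeys the triangle inequality, the result always stays between -1 and +1.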
Runs on a single CPU, no cluster :-)
Reputation: data shown for registered users only.
89% of the edits by low-reputation users were against the future direction of the page.
Only 5% of the edits by high-reputation users were against the future direction of the page.
I.e. there is a reasonable correlation between a user's reputation and whether their edit was in line with the future direction of the page.
Can catch around 80% of the things that get reverted.
Author lends 50% of their reputation to the text they create.
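As a rough illustration of the 50% rule, new text could start with a trust value equal to half of its author's reputation. Only the 50% figure comes from the talk; the reputation scale, the threshold, and the names add_text and untrusted_words are assumptions made for this sketch.

```python
LEND_FRACTION = 0.5  # from the notes: an author lends 50% of their reputation


def add_text(words, author, reputations):
    """Create word records whose initial trust is a share of the author's reputation.

    Authors missing from the (hypothetical) reputations table count as 0.
    """
    trust = LEND_FRACTION * reputations.get(author, 0.0)
    return [{"text": w, "author": author, "trust": trust} for w in words]


def untrusted_words(page, threshold):
    """Return the text of every word whose trust falls below the threshold."""
    return [w["text"] for w in page if w["trust"] < threshold]


# Example with made-up reputation values.
reputations = {"alice": 8.0, "newcomer": 0.0}
page = add_text(["Paris", "is", "in", "France"], "alice", reputations)
page += add_text(["spam"], "newcomer", reputations)
flagged = untrusted_words(page, threshold=1.0)
```

Here the words from "alice" start with trust 4.0 and the word from "newcomer" starts at 0.0, so only the latter ends up in flagged.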
Want an option or special page to show the trustworthiness of a page. Time to process an edit is less than 1 second. Storage required is proportional to the size of the last edit.
Instead of trusting or distrusting a whole article, readers can have partial trust in parts of it. This could help address concerns about whether Wikipedia can be trusted: vandalism would likely be flagged as untrusted content.
Speaker would like a live feed of Wikipedia Recent Changes to continue and improve this work. Or perhaps it could run on the toolserver? Erik later emphasised that a good, friendly UI would make it easier to get weight behind this tool.