Citation Needed

A while ago I posted a blog post on using [improvement in Wikipedia paragraphs over time to teach literacy skills]1. Building on some ideas around that, and an [‘a google a day’ style quiz idea]2, I’ve been thinking about ways we could use [inline templates]3 to a) create a fun digital-literacy game, and b) potentially feed answers to the game back into the encyclopaedia. A small [gamified (sort of) interface]4 already exists, which is great!

The idea would be to use a combination of templates indicating the resolutions needed (e.g. ‘citation needed’) and training examples where we know the nature of an adequate response (these would be both handwritten and seeded from answered, or resolved, templates in earlier rounds). We’d expect particular types of answer for the various templates, but below the idea is slightly fleshed out for the template ‘citation needed’ (‘fact’ and its other namesakes). I think something interesting could be done with claims which are lacking evidence in Wikipedia (‘citation needed’ and the related templates).

So the model would be:

  1. Extract claims with the ‘citation needed’ template []5 used against them. I think this would need some Python extraction using punctuation: the templates are used inline after the claims they refer to, so extraction would involve taking a span from the template backwards to the previous full stop/period or the start of the paragraph (with the key element highlighted). Python tools and dumps at . There is a subset of articles with this template (e.g. starting point ). These could go into topic categories.
  2. Quizzers would be presented with a sentence and asked something like “Is this claim true or false?” or “Can you find evidence for this claim?”
  3. They might have three options (with sub-messages in the sub-lists below):
    1. “What evidence can you find for this claim?” (enter as many sources below as you like)
      1. How good a source is this?
      2. Can you corroborate the source?
    2. “I can’t get this one”
      1. I can’t find any information about this claim
      2. Open some of the options under ‘c’, and perhaps ask how people have searched (and/or guide them)
      3. There is no claim made here!
      4. There are multiple claims made in this statement
        1. If multiple claims are made, can you rewrite the statement into separate claims, and find citations for each?
    3. “This might not be true”
      1. I found contradictory evidence for this claim
        1. Something about contradicting evidence, and weighing it up
      2. This claim is outdated
        1. How should the information be presented?
        2. What evidence can you find for this claim?
          1. How good a source is this?
          2. Can you corroborate the source?
      3. It might be more complicated than that (if the claim is more nuanced than made in the text)
        1. The information should be expanded and broken into separate claims
          1. (See the other sub-replies re: splitting suggestions and evidence for them)
        2. The information isn’t as general as presented, and further constraints should be added (e.g. it is only true for particular geographic areas, groups of people/things, etc.)
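The extraction in step 1 could be sketched roughly as follows. This is a minimal sketch assuming we already have an article's wikitext as a string; the template names and the sentence-boundary heuristic are illustrative, and a real pipeline over the dumps would want a proper wikitext parser:

```python
import re

# A few of the inline templates that flag an unsourced claim (illustrative subset).
CITATION_TEMPLATES = re.compile(
    r"\{\{\s*(citation needed|cn|fact)\s*(\|[^}]*)?\}\}", re.IGNORECASE
)

def extract_flagged_claims(wikitext):
    """Return the text spans immediately preceding each 'citation needed' template.

    The template sits inline after the claim it refers to, so we take the span
    from the template backwards to the previous full stop (or the start of the
    paragraph, whichever is nearer).
    """
    claims = []
    for match in CITATION_TEMPLATES.finditer(wikitext):
        before = wikitext[:match.start()]
        # Nearest sentence or paragraph boundary before the template.
        boundary = max(before.rfind(". "), before.rfind("\n"))
        claims.append(before[boundary + 1:].strip())
    return claims

text = ("Mozart was born in Salzburg. He wrote over 600 works.{{citation needed}}\n"
        "His early death has been much mythologised.{{cn|date=May 2015}}")
print(extract_flagged_claims(text))
```

Working backwards from each template to the nearest sentence or paragraph boundary keeps just the claim the template is attached to, without dragging in the rest of the paragraph.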

  1. Ideally there would be some pre-written training examples which would be diagnostic for each of these issues. They might be extracted from Wikidata/Reasonator, or just manually written. Plus:
  2. In addition, as each claim is ‘answered’, the answers further ‘seed’ the ‘training’ examples. Contradictions between multiple responses to a question could be settled somehow, or the question removed from the pool (contested claims could be harvested for Wikipedia community input).
  3. Ideally, there would be some mechanism for the information to go back into Wikipedia: the quizzer editing the claims themselves (a good onboarding technique!), a generated list which editors could work through, or automatic edits. This is an area other organisations could get involved in, including the Wikimedia community. It is not fundamental to the basic idea here!
  4. Ideally users would get points for:
    1. Answers given
    2. Answers actually used in Wikipedia (for which 5 would be needed)
  5. There would be scope to extract already-referenced claims, and use the structures in Wikidata (e.g. when did the Austrian composer, born 1756, die?) to set other very answerable tasks of the kind found in A Google A Day.
  6. There are also lots of dead links, raw URLs used as references, and cases where a reference is given but doesn’t actually support the claim – exploring that area would also be interesting longer term.
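The bare-URL case in point 6 is the easiest to detect mechanically. A rough sketch, assuming references appear as `<ref>…</ref>` tags in the wikitext (actually checking whether a link is dead would need HTTP requests on top of this):

```python
import re

REF = re.compile(r"<ref[^>/]*>(.*?)</ref>", re.DOTALL | re.IGNORECASE)
# A reference whose body is nothing but a URL (no title, publisher, or template).
BARE_URL = re.compile(r"^\[?https?://\S+\]?$")

def bare_url_refs(wikitext):
    """Return the bodies of <ref> tags that consist only of a raw URL."""
    return [body.strip() for body in REF.findall(wikitext)
            if BARE_URL.match(body.strip())]

sample = ("Claim one.<ref>http://example.org/page</ref> "
          "Claim two.<ref>{{cite web|url=http://example.org|title=Example}}</ref>")
print(bare_url_refs(sample))
```

References flagged this way could feed into the same quiz pool: the task becomes “find the title, author, and date for this bare link”, which is a well-bounded digital-literacy exercise.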


  1. “Using Wikipedia paragraph improvements to teach literacy”