This evening I met up with Gene Golovchinsky, of FXPAL (Fuji Xerox research institute in Palo Alto). Gene’s work is pretty varied (there was some cool stuff on collaborative whiteboards & storing/retrieving info a while ago) but a big area has been recall-focussed information seeking in exploratory search tasks (as opposed to closed, precision-style tasks where we’re looking for one or a few ‘answers’). A part of this research has also been on collaborative information seeking – and this combination is one of my own major focuses.
Genealogy as Multiple Document Processing
So we talked about the same sorts of ideas I talked about at my meetings yesterday around epistemic commitments (credibility judgements, evaluative standards, knowledge structures), and multiple document processing with a controlled corpus as a way to validate assessments of students’ commitments.
One suggestion Gene made was that observation of genealogical style search would be an interesting lens onto multiple document processing style tasks – because often there is only partial, conflicting, inferred, or hearsay evidence to go on, often from multiple sources, and these are often hard to find (we need to work out who will be a good informant for testimonial knowledge…). I think this is an intriguing idea, although it introduces a set of complexities around historiography and contextual factors that would make such tasks hard to design and interpret (for example, if we’re looking at birth records in certain contexts it might be useful to know if the area is Catholic or not). It occurred to me afterwards though that the same sort of task might also be interesting for non-familial genealogies such as ‘idea tracking’, where we’re interested in the history of some concept or idea (I’m interested in history of maths, for example).
Another idea we talked about was of ‘fake collaboration’, for example by setting up a (real) pair with a third “collaborator”. This third collaborator could be purported to have returned a set of results (which the experimenters have seeded), and the real pair be asked to sensemake on those results for their third member to return to later. This is a nice option because it removes the search/document variable (because we seed the results) and focuses on sensemaking (which is our primary interest). We can then explore differences between multiple groups with the same set of information.
Some ways this might work:
- Have the 3rd party labelled as a librarian mediator who returns results to you when you query (allowing a gap too). Make it clear the librarian is an IR expert, not a domain expert (so we avoid the issue of users assuming materials delivered by a librarian are all credible). We could also include some noise in the returned documents to highlight this (and in fact, it might be interesting to see if students open patently irrelevant documents in this context).
- Another option would be to have a search system seeded with content related to some fairly predictable concept keywords students might search for after the initial set. Students could then actively search (for concepts, not query strings) and receive a plausible set of returned documents that again we’ve pre-selected. Of course this method is a different sort of search to their usual experience (of strings in Google, etc.), and if students search for something odd (a search for ‘Madonna’ in a task on climate change, for example) we’d need some sort of response to them. Moreover, if the search ISN’T odd but is for a high level concept, we might also struggle to return results.
- Another angle on the 2nd option would be to present users with a set of related concepts based on the initial search results (in some ways this is the instagrok method), and see which ones they traverse (ensuring we have a document set for each concept of course). This method would allow us to track conceptual links the students are making. This is less search based again than the 2nd option, but it does control for some of the additional issues raised in that model.
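
To make the second and third options a little more concrete, here is a minimal sketch of what a seeded concept search might look like. Everything in it is an assumption for illustration: the concept keywords, the document IDs, the related-concept links and the `SeededSearch` class are all hypothetical, and a real study system would need a far more forgiving way of matching what students type to the seeded concepts.

```python
from dataclasses import dataclass, field

# Hypothetical seeded corpus: concept keyword -> pre-selected document IDs.
SEEDED = {
    "sea level": ["doc_03", "doc_07"],
    "carbon dioxide": ["doc_01", "doc_04"],
    "ice cores": ["doc_02"],
}

# Hypothetical related-concept links, as in the third option.
RELATED = {
    "sea level": ["ice cores"],
    "carbon dioxide": ["ice cores", "sea level"],
    "ice cores": ["carbon dioxide"],
}


@dataclass
class SeededSearch:
    # Every query is logged so the concepts students traverse can be tracked.
    log: list = field(default_factory=list)

    def search(self, query):
        # Normalise case and whitespace; matching is exact on the concept name.
        concept = " ".join(query.lower().split())
        docs = SEEDED.get(concept, [])
        related = RELATED.get(concept, [])
        self.log.append((concept, bool(docs)))
        # An unseeded query (e.g. 'Madonna') returns no documents, so the
        # interface still has to decide what response to show the student.
        return {
            "concept": concept,
            "documents": docs,
            "related": related,
            "seeded": bool(docs),
        }
```

A search for "Sea Level" would return the two seeded documents plus the related concept "ice cores", while a search for "Madonna" would come back empty but still be logged. The log is the interesting part for our purposes, since it records which concepts each group actually followed.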
All three options provide a nicer way of thinking about processing multiple documents than I’d been considering, and they’re all more search based, while providing for a high level of control, and potential for creating believable search contexts – so my thanks to Gene for a really productive conversation!
We also talked about the lack of information issue (when no answer is answer enough), some work Gene and colleagues have been doing on search result preview to encourage people to spend longer in exploratory search while finding more novel documents (recall oriented search), and various other things – including some good PhD/academic socialising advice :-)! So again, watch this space for developing ideas!