Data Carpentry

Over the last few years there have been a number of online courses on Learning Analytics and Educational Data Mining, notably (thanks to [Bodong]1):

* [Data, Analytics, and Learning]2, offered by George Siemens, et al. on edX
* [Big Data in Education]3, offered by Ryan Baker on Coursera
* [LAK Course 2013]4
* [LAK Course 2012]5 ([its bibliography]6)
* [LAK Course 2011]7

At the [University of Minnesota Department of Curriculum and Instruction]8, Bodong Chen also designed and ran a special topic course on [Learning Analytics in the Knowledge Age]1, which he brilliantly also put up on [github]9 for anyone to access, use, and adapt (very cool).

Alongside that, I’ve recently become more aware of the [software carpentry]10 and [data carpentry]11 workshops, which are designed to teach the fundamental tools to _get stuff done_ for research purposes (I recently took part in one on [Python’s Natural Language Toolkit/NLTK]12). These courses also (always?) release their materials on github, and are collaboratively created for interactive workshop sessions where participants dive right into getting things done (and learn how to fix the things they’re trying to get done, etc.!). The courses vary across specific languages and packages (e.g. the NLTK one) and target groups (e.g. [Library Carpentry]13, to go along with other library data science resources, e.g. [1]14 [2]15). All of the courses make use of sample data (e.g. learning R on [inflammation data]16 or [gapminder data]17), taking a kind of [‘data expedition’]18 approach. Taking an open approach, the software carpentry folks also publish their (openly adaptable) [instructor guide]19.

So, recently I’ve been thinking about whether we could develop a Learning Analytics Carpentry (LAC) course, designed to offer practical professional development sessions to educators and learning technologists interested in delving into the area, giving them both the hands-on tools and the theory. Ideally this would focus on data that is already accessible to practitioners (e.g. getting data out of existing systems, thinking about LAK-informed learning design, etc.). You can imagine a core set of activities/foci, forked across different platforms, institution types, code languages, etc. While I think there’s brilliant work in LAK and in the existing LAK online courses, I think the advantage of making a ‘carpentry’ style course is (a) this forking (and community updating), and (b) its very hands-on and practical nature: the need to focus on resources at hand, rather than implementing new technologies to get new data.

Feedback please: Does this sound like something useful? Has it already been done? What would be involved? Are there other examples of a collaboratively created resource of this kind?

