Putting learners in control of their data: ETAG proposals
On 5 Jun, 2014 By Simon Knight

Tags: consultation, ETAG, etag2a

This Post Has 5 Comments

Crispin Weston says: June 7, 2014 at 11:57 am

Hi,

Good to see a serious contribution on this very important topic. While I quite agree that there is an important balance to be struck between enabling innovation and preventing abuse, I am not very enthusiastic about your proposals.

This notion of enabling learners to take control of their data and become co-designers of new learning analytics systems strikes me as hopelessly unrealistic. Just because we all drive cars does not mean that we all have to become co-designers of the latest fuel injection system. Which is not to say that we should not have access to our data.

Nor do I like the proposal for consensual, government-funded projects to set up single data stores or government endorsed institutions to decree on the ethics of learning analytics. Such unwieldy talking shops will spell death to agile innovation.

What we need is clear legal, regulatory frameworks, which allow counter-consensual innovation on the periphery, while insisting that it conforms to strict rules.

My proposal for the framework would comprise several layers:

1. A technical, machine readable “data handling procedure description language”, describing the different types/sensitivity of data, the types of institution, the types of relationship, the procedural rules.

2. A set of normative standards and regulations, built on top of (i.e. expressed in) the technical specification above. Learning institutions will be required to observe regulations and may also be constrained by pressure of opinion to conform to other standards. The technical spec will allow new codifications to bubble up through different communities of practice (trade associations, user communities etc). Allowing some diversity in this layer enables us to test the balance between innovation and safety in our attempt to find the middle way.

3. A national catalogue of ed-tech products (advocated in my policy proposals to ETAG, available at http://bit.ly/1lYA6EP) on which the certification of products and services against different privacy standards can be authoritatively recorded, for reference by third-party e-commerce sites, media reviewers, or directly by teachers.

A key benefit of the technical bottom layer is that software systems can support the implementation of specified practice – e.g. by saying “hang on, you can’t send that bit of data to that person” or “Data must be deleted because it is past its sell-by date – delete all? Y/N” and if no, “Do you acknowledge that preserving this data contravenes standard/regulation xyz and that your decision will be recorded in the system’s audit trail? Please complete the following comment section”…

Clear but flexible regulatory structures will enable innovation in this space. Consensual ethics committees and concepts of learner ownership of data will kill it dead, in my view.

Crispin.

Simon Knight says: June 8, 2014 at 3:12 pm

Thanks for these comments Crispin. You seem to have two concerns: first the co-design issue; second something around who should lead on analytics development/how that should be funded, etc. You then have your own proposal for a “counter consensual” model. I googled that expression, which I’d never heard before, and the biggest link seems to be to climate change denialism.
I don’t know if you intended that meaning (or association!) though? If so, it seems to be a bit of a worrying suggestion, given the legal and ethical context.

In any case the details involve a fairly sensible set of suggestions which sound a bit like the TinCanAPI http://tincanapi.co.uk/ or LearningLocker http://learninglocker.net/ type structures. You then suggest a national catalogue of ed-tech products… an app store if you will. I’m not sure why this sort of centralisation = good, but never mind; I’d note plenty of these sorts of things exist, and setting them up well (and in a unified way) is incredibly challenging. I don’t think either of us would have any issue with any of that; it’s just not really addressing most of our points, nor is it particularly novel – indeed, as the links above indicate, these things already exist, if I’ve understood you right?

So I’d note that the real thrust of our proposals is around the opportunity to explore interesting new assessment potential (see ‘1’ and ‘2’), and the need to understand your data in order to take control of it (3.1). There is no suggestion of a centralised data store; you may have misunderstood here – what we are proposing is (4) the need for an evidence-base to share, collate, and create collective evidence and ideas around use of data in education, for all stakeholders. There are organisations well equipped to support this.

With regard to your two specific criticisms:

1) Co-design. I’m afraid your analogy just doesn’t work; the fact it isn’t sustained (“Which is not to say that we should not have access to our data”) is a pretty good indicator. Fuel injection systems are very different to education. They’re also very different to car design – and if car designers don’t engage in co-design, and user-testing, I’d be staggered.
2) I think I’ve answered this above: we’re not suggesting a central system for data, and nor do we think we only need an ethics talking shop – but these are very important things if we want to get this right.

Cheers
Simon

Simon BuckinghamShum (@sbskmi) says: June 8, 2014 at 3:51 pm

Hi Crispin

>>>Good to see a serious contribution on this very important topic. While I quite agree that there is an important balance to be struck between enabling innovation and preventing abuse, I am not very enthusiastic about your proposals.

Thanks – I hope that by clarifying a few things we may find that you’ve demolished some straw men, and we are in more agreement than you think 🙂

>>>This notion of enabling learners to take control of their data and become co-designers of new learning analytics systems strikes me as hopelessly unrealistic.

Many argued the same in the early days of interactive computer systems: what can users possibly tell us, they’re not programmers? We know how to optimise a software system, not them. Wrong kind of requirement, wrong definition of what counts as ‘the system’.

>>>Just because we all drive cars does not mean that we all have to become co-designers of the latest fuel injection system.

Clearly. So if we follow the car analogy, the argument is not that citizens should design and manufacture crankshafts and injection systems (perhaps the equivalent being designing and coding low-level features of an LMS), but rather, ergonomic design principles require us to test the dashboard with real drivers in authentic driving conditions. The driving analogy breaks at some point, but if we stick with it, we are in the situation where the criteria for ‘good driving’ may be about to change, due to a rapidly changing world and new monitoring technologies.
Suddenly, we are in the position where we can measure a lot more things about the driver and their environment (stress levels; reaction time; quality of decision making given unexpected obstacles; changes in destination en route in response to news). Nobody has invented a car dashboard for this: the drivers must be involved in designing and evaluating them. >>>Which is not to say that we should not have access to our data. But since giving a non-technical user a raw database dump is not really empowering, one also needs to give them higher level environments which make it meaningful. How will we know if they are indeed usable without testing them with users? >>>Nor do I like the proposal for consensual, government-funded projects to set up single data stores or government endorsed institutions to decree on the ethics of learning analytics. Such unwieldy talking shops will spell death to agile innovation. I’m not clear where we proposed this? The What Works Clearinghouse in the US, and the equivalents already in the UK, are not government controlled, and do not claim to have the final word on anything. They simply strive to translate peer reviewed research into terms more accessible to a busy educator. But as we point out, they have limitations as well. >>>What we need is clear legal, regulatory frameworks, which allow counter-consensual innovation on the periphery, while insisting that it conforms to strict rules. Agreed: nobody wants to silence debate. In fact the Evidence Hub and Impact Maps we linked to explicitly recognise supporting and challenging evidence as ‘first class entities’ in their data models: deliberation and argumentation are central. >>>My proposal for the framework would comprise several layers: […] http://bit.ly/1lYA6EP I think we are agreed on the value of interoperability to enable ethical data sharing between systems, always with the question, “Who benefits, and will this improve learner success?”. 
The different social and technical strategies to put in place a national infrastructure are not something we are expert in or have strong views on, so we defer to you to argue the merits of your approach with those who would champion others. Our contribution has been to delineate some of the different user-centred obstacles that must be addressed before learners can genuinely be ‘in control’ of their data.

Simon

Crispin Weston says: June 8, 2014 at 7:23 pm

Dear Simon & Simon,

Thank you both for your responses. I agree that one benefit of this sort of discussion is to understand each other’s position better, which might mean finding out that we never disagreed in the first place.

I suggest that we have three related processes: research (e.g. into what works), assessment and analytics. By learning analytics, I think most people are referring to some sort of automated function, encapsulated in algorithms, which may serve different purposes (better assessment of the student, assessment of the effectiveness of the course, generalised research into what works etc). As all analytics require data, they presuppose the existence of some sort of prior assessment, which need not be formal of course.

My car analogy related to the fact that at the moment, we do not have very much real-time data in education; and that the only reasonable chance of acquiring that data is if we develop instructional software that generates this data automatically. The only alternatives to this are (a) manual input of data by teachers, which I think is unrealistic; (b) capture of low-level data from generic software, such as which pages a student views (Andrew Ng’s well-known example of analytics in Coursera is based on this sort of data).
Having developed instructional software that produces learning outcome data, we also need to develop learning analytics software that consumes this data and produces meaningful statements about student capability that can be used to make reliable predictions about student performance. Neither of these tasks is within the expertise of teachers or lecturers.

In other words, I agree with Simon B that we need the higher level environments (or perhaps just “tools”) that enable users to make sense of complex data. That is my point about co-design – I don’t think teachers and lecturers have the skills required to make these tools. All analogies break down in the end but in my view the car analogy works in respect of the point that I was making.

You (Simon B) wonder what I am referring to with the fuel injection analogy and guess that it must be something like “designing and coding low-level features of an LMS”. My more general position is that we have very little of the sort of education-specific software that is needed to make ed-tech work. The software we have at the moment (e.g. LMSs) generally has very little pedagogically specific functionality in it – these systems are just collaboration and file sharing platforms, perhaps with a multiple choice quiz engine thrown in.
What I am talking about are things we do not currently have, such as:

* for schools-level History (my subject), timeline editors that enable learners to sequence events, draw causal links, periods and trend lines and link these to explanatory text;
* software that supports Eric Mazur’s peer instruction technique, automatically monitoring the effect of certain sorts of question on different types of teaching group and helping the teacher to select useful questions to teach particular points to particular groups (the difficulty in this case being to select a question that will divide a group in half);
* more examples of Seymour Papert’s microworlds (an idea which according to Diana Laurillard has been “tragically underexploited”), allowing students to explore a particular conceptual environment and to codify and share their findings;
* structured discussion software that tracks the interactions between members of a group and monitors the contributions of different members in machine-readable data, highlighting the need for teacher intervention as it arises;
* automatic essay-marking software which no-one would expect to be able to assess a thesis, but might be able to deal with a short preparatory piece, subject to teacher review, etc.

Unlike the LMS example, all these contain considerable pedagogical expertise as well as technical expertise. If a commercial company were to provide these tools, it would have to hire or otherwise access the necessary pedagogical expertise – but that doesn’t mean that the front line teachers and lecturers who are going to end up using the tools need to be involved at the design stage. Of course, I do not dispute the fact that such systems will need to be trialled: but the fact of being a guinea pig does not make you a co-designer.

I do not know anything about The What Works Clearinghouse in the US – but I see from the website that it was at least set up by the US government.
The current UK government has a similar aspiration behind funding the EEF. The danger, as far as I can see, is that when you set these organisations up and ask them the question, they tend to feel that they have to give an answer. And once a single authority gives an answer to a question like “what works”, then that tends to carry a great deal of weight and may suppress debate. And when the orthodox opinion is not subject to challenge, then there is very great danger that the orthodoxy turns out to be wrong – my text on this point is JS Mill’s On Liberty. I would also point to the 1998 Tooley Report, which to my mind made a powerful case that the majority of academic research into education has been of poor quality. I think it is very dangerous to provide a means by which poor quality, unchallenged research is elevated to gospel truth by some central authority. We already have too little contested debate as it is. So I think we perceive the same problem but have different answers. You place your money on an authoritative institution; I would place my money on a lively professional press which would encourage robust debate. But in any case, I think this whole question of what works has slightly wandered from analytics, which may be a component of research, but may contribute to other processes as well. With regard to my proposals, I do think that TinCan represents a useful development – and I was involved in its prototype, which was a project called Runtime Web Services, managed by the LETSI Foundation. But in my view, TinCan is still at a very early stage of development and while it solves the problems that the old SCORM API had at the transport level, it does not do much to address the question of what data we need to transport. I don’t think there will be a single answer to this – we will need all sorts of different sorts of data to reflect different sorts of learning activity. 
Which is why the answer to the interoperability problem is not a single specification, but a process by which different data formats can be proposed, shared and eventually join a stable of related data standards.

With respect to the catalogue, this may not be an original idea. Technically, I do not think it is difficult. Gaining critical mass may be next to impossible without the active backing of the government and its agencies; and it might be fairly easy with that backing. In answer to your question, “why is such centralisation required”, I would answer, “in order to decentralise”. Opposites often go together, as when Archimedes said “give me a fixed point (and a long lever) and I can move the world”.

We suffer at the moment from poor market information and (at least at school level) a lack of procurement expertise. Many buyers in the education market feel vulnerable to being ripped off. Our reaction is to centralise procurement through various aggregated procurement schemes associated with bureaucratic frameworks which do due diligence, all of which tend to reduce the opportunities for new entrants to the market with innovative products. Provide a central registry which ensures transparency and compliance with simple standards (e.g. for data protection) and you provide the freedom to experiment which is required if the market is to grow.

Final point on the catalogue: it is not an app store – but it might be a back end to many different people’s app stores, ensuring that none of these achieves an uncompetitive position thanks to its control of the shop-front. Another endemic problem in the education market is uncompetitive tie-ups between content providers and exam boards and e-commerce sites etc.

I agree with you that we need ethical use of data but my point is that to be useful, ethics need to be codified in ways that can be consistently applied. I am just saying that ethics made concrete = standards, and that is what is urgently required in this space.
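As an illustration only, here is a minimal Python sketch of what “ethics made concrete” might look like once a rule is machine readable: a rule says who may receive a category of data and how long it may be kept, and software checks the rule and records any override in an audit trail. Every name here (`HandlingRule`, `may_send`, `is_expired`, `keep_expired`, the data categories, roles and retention periods) is hypothetical and invented for the example; none of it comes from TinCan, SCORM or any existing standard.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass(frozen=True)
class HandlingRule:
    """Hypothetical rule: who may receive a data category and how long it may be kept."""
    category: str
    allowed_recipients: frozenset
    retention_days: int


@dataclass
class AuditTrail:
    """Hypothetical in-memory audit trail of rule overrides."""
    entries: list = field(default_factory=list)

    def record(self, user: str, action: str, comment: str) -> None:
        self.entries.append({"user": user, "action": action, "comment": comment})


# Illustrative rules only; categories, roles and periods are assumptions.
RULES = {
    "assessment_result": HandlingRule(
        "assessment_result", frozenset({"teacher", "student"}), 365
    ),
    "behaviour_note": HandlingRule("behaviour_note", frozenset({"teacher"}), 90),
}


def may_send(category: str, recipient_role: str) -> bool:
    """Return True only if a rule exists and explicitly allows this recipient role."""
    rule = RULES.get(category)
    return rule is not None and recipient_role in rule.allowed_recipients


def is_expired(category: str, collected_on: date, today: date) -> bool:
    """Return True if the data is older than the retention period in its rule."""
    rule = RULES[category]
    return today > collected_on + timedelta(days=rule.retention_days)


def keep_expired(trail: AuditTrail, user: str, category: str, comment: str) -> None:
    """Record a decision to keep expired data; a non-empty comment is mandatory."""
    if not comment.strip():
        raise ValueError("a comment is required to override a retention rule")
    trail.record(user, f"retained expired {category}", comment)
```

The point of the sketch is only that, once rules are data rather than prose, the same checks can be applied consistently by any product, and different communities could publish different rule sets in the same format.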
With respect to surveillance, it seems to me that in the context of the fiduciary relationship that the student has with the teacher, this is intrinsic to the business of education and is part of the deal that the student accepts when they walk through the door of the school/college/university. I don’t see that the student has to be involved in specifying what data may or may not be collected by the college. It is in the nature of our legal relationships that if we don’t like the contract offered to us by one person or organisation, we can go and make a contract with someone else – subject, of course, to minimum standards applied by the state.

Sorry to answer at such length but thanks for the conversation. I think I have dealt with the point about “counter consensual” (on which I refer you to Mill, who certainly discusses the concept, whether or not he uses that phrase) and will not use up any more of your time by talking about climate change. But I would be happy to discuss that another time.

Not you again, Weston | Ed Tech Now says: June 22, 2014 at 10:59 am

[…] I responded at some length to a piece by Simon Knight and Simon BuckinghamShum, who argue for student control of their own data. As well as the key point about student control of data, the two Simons also argue for a “what works clearing house”, where people could look up technology-enhanced pedagogies that had received some sort of seal of approval. My objections to this idea are: […]