[]^{1}As much for my own reference as anything else 🙂

One of the things I’ve spent a lot of time looking at over the last month is a set of inter-rater reliability (IRR) measures for some peer- and self-assessment data I have. This data is in a database, with each line representing a single rating of a text along 6 rubric elements; so each line has the raterID, authorgroupID, and 6 scores on a 1-9 scale. In addition, we have some data on how the students performed on a practice rating exercise, and the dataset includes 2 groups of students performing 2 different writing tasks (fortunately students in group 1 marked group 1’s work, ditto group 2). All students marked 2 texts, and most texts were co-authored (i.e., 2 authors), so in theory each text should have 4 peer and 2 self-assessments; in fact, various gaps mean most have 3 peer assessments with 1-2 self-assessments.

I won’t go into the process of choosing the IRR method here, but I decided Krippendorff’s alpha was the most appropriate choice. There are some resources below to explore that, particularly in relation to R, and I’m also reproducing my code below. Note this is very much tied to my own data, but it might be useful for thinking about looping through subsets of data and/or shaping data, etc. … or just as a comfort that someone can write such horrible code and it still does a job 🙂 (I’m open to suggestions, also on the stats side, particularly given the results aren’t great, although I should caveat that my primary interest here was in getting the code working.)

Resources

* [http://www.cookbookr.com/Statistical_analysis/Interrater_reliability/]^{2} Description of the [IRR]^{3} package, etc. in R
* Somewhat surprisingly good description of various IRR methods
* [A demo of Krippendorff’s Alpha in R]^{4} from Fridolin Wild
* [Inside R description of the IRR package]^{5}
* [http://cswww.essex.ac.uk/Research/nle/arrau/alpha.html]^{6} Resource list including code examples for various methods and languages
* [Krippendorff (2011) Agreement and Information in the Reliability of Coding]^{7}
* Hayes, A. F., & Krippendorff, K. (2007). [Answering the call for a standard reliability measure for coding data.]^{8} Communication Methods and Measures, 1, 77-89. And macro description:
* Various bits from Gwet: [rfunctions]^{9} (for Krippendorff & Gwet at least) and [notes on them]^{10}, and [notes on IRR more generally]^{11}.

# The evolution of code

I thought someone (?!) might find it interesting to see the evolution of my code over time. I’ve got a few versions below (there was an earlier one, and there were many ‘faulty’ versions in between, but both these ‘work’), starting from one which does everything manually, to another which loops through lots of iterations (very slowly), to a final one which loops through much faster (I’ve no doubt there are faster ways to do this, probably using ‘apply’ etc.). The data context here is a large set of objects (etherpads) being rated on a number of dimensions (rubric facets) by 3-5 peer/self raters. Because it’s a peer assessment, all etherpad contributors are potential raters, so you have a ~300*600 matrix of pads by raters, of which most values are NA. The final version below reduces the matrix to 300*~5 to just represent actual ratings. It’s worth noting I’m pretty sure I have a faster way to create the data.frame Krippendorff’s alpha actually uses (which represents possible ratings on one axis against objects on the other), but I couldn’t work out how to modify the function to just use the existing formatted data…sigh.

Iteration one

Manually creating each dataset and set of options to run through the functions.
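Before the real code, a toy version of the shape of the problem may help. This is a minimal sketch on made-up data, not the dataset described above: the column names `raterID`, `textID` and `score` are invented for the example, and it uses base R's `tapply` rather than `dcast` so that it runs without extra packages. The `kripp.alpha` call is only attempted if the irr package happens to be installed.

```r
# Made-up long-format ratings: one row per (rater, text) score on a 1-9 scale
long <- data.frame(
  raterID = c("r1", "r2", "r3", "r1", "r2", "r4", "r3", "r4"),
  textID  = c("t1", "t1", "t1", "t2", "t2", "t2", "t3", "t3"),
  score   = c(7, 8, 7, 3, 4, 3, 5, 9)
)

# Reshape to a raters-by-texts matrix; cells with no rating become NA
wide <- with(long, tapply(score, list(raterID, textID), function(v) v[1]))
print(wide)

# kripp.alpha() from irr takes raters in rows and rated objects in columns
if (requireNamespace("irr", quietly = TRUE)) {
  print(irr::kripp.alpha(wide, method = "ordinal")$value)
}
```

With real peer-assessment data most cells of `wide` are NA, which is what makes the later iterations slow.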
library(irr)
library(reshape2)
library(plyr) #for mutate() below
#USE THIS FIRST: Takes all scaled data (change 'assdatacomboscaled' to 'assdatapeer' or 'assessmentdatacombo' for other versions)
a <- unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS" | AssIndMerge$areaName=="MDP",select=c("areaName","projectID")))
#a <- rename(a,c("pad_projectID"="projectID"))
asswproj <- merge(assdatascaled,a,by.x="pad_projectID",by.y="projectID",all.x=T,all.y=F)
asswproj <- asswproj[,c(2,3,4,5,6,7,1,8,9,10,11,12,13,14,15,16,17,18,19,20)]
rm(a)
#THIS VERSION uses the RaterQual element from the diagnostic; uncomment the subset to use JUST the good raters, use the rbind version to double-count good raters
a <- unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS" | AssIndMerge$areaName=="MDP",select=c("areaName","projectID")))
#a <- rename(a,c("pad_projectID"="projectID"))
asswproj <- merge(assessmentdatacombo,a,by.x="pad_projectID",by.y="projectID",all.x=T,all.y=F)
asswproj <- asswproj[,c(2,3,4,5,6,7,1,8,9,10,11,12,13,14,15,16,17,18,19,20)]
asswproj <- merge(asswproj,unique(subset(AssIndMerge,select=c("userID","RaterQualOverall"))),by.x="userID",by.y="userID",all.x=T,all.y=F)
asswproj <- asswproj[,c(2,3,1,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21)]
asswproj <- subset(asswproj,RaterQualOverall==1) #select good raters
qual <- subset(asswproj,RaterQualOverall==1) #select good raters
qual <- mutate(qual,userID=paste("q",userID),id=id*id)
asswproj <- rbind(asswproj,qual)
rm(a,qual)
#Do this for either of the above (removes single- or double-rated items)
#incomplete <- rowSums(AssessmentForKrippendTopic,na.rm=T)
AssessmentForKrippendSourDiv <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(sourceDiversity),select=c(3,7,9)), pad_projectID ~ userID)),select=-c(1))
test <- data.frame(AssessmentForKrippendSourDiv)
test$incomplete <- apply(test, 1, function(x) sum(!is.na(x))) #count the ratings on each pad
test <- subset(test,incomplete>2,select=-c(incomplete))
test <- t(test) #transpose the data (raters in rows, pads in columns)
kripp.alpha(as.matrix(test),"ordinal")
krippen.alpha.raw(t(test)) #Gwet's functions take pads in rows, so transpose back
gwet.ac1.raw(t(test))
# which(x)
# rowSums(x,na.rm=T)
#}) # rows with missing values
#This set is MDP
AssessmentForKrippendTopic <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(topicCoverage),select=c(3,7,8)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSourDiv <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(sourceDiversity),select=c(3,7,9)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSourQual <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(SourceQuality),select=c(3,7,10)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendClarity <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(claimClarity),select=c(3,7,11)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendEval <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(evaluationQuality),select=c(3,7,12)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSynth <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(informationSynthesis),select=c(3,7,13)), userID ~ pad_projectID)),select=-c(1))
kripp.alpha(AssessmentForKrippendSynth,"interval") #or "ordinal" (slightly higher alpha w/interval)
kripp.alpha(AssessmentForKrippendEval,"interval")
kripp.alpha(AssessmentForKrippendClarity,"interval")
kripp.alpha(AssessmentForKrippendSourQual,"interval")
kripp.alpha(AssessmentForKrippendSourDiv,"interval")
kripp.alpha(AssessmentForKrippendTopic,"ordinal")
rm(AssessmentForKrippendClarity,AssessmentForKrippendEval,AssessmentForKrippendSourDiv,AssessmentForKrippendSourQual,AssessmentForKrippendSynth,AssessmentForKrippendTopic)
#This set is CIS
AssessmentForKrippendTopic <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(topicCoverage),select=c(3,7,8)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSourDiv <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(sourceDiversity),select=c(3,7,9)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSourQual <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(SourceQuality),select=c(3,7,10)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendClarity <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(claimClarity),select=c(3,7,11)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendEval <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(evaluationQuality),select=c(3,7,12)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSynth <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(informationSynthesis),select=c(3,7,13)), userID ~ pad_projectID)),select=-c(1))
kripp.alpha(AssessmentForKrippendSynth,"ordinal") #or "interval"
kripp.alpha(AssessmentForKrippendEval,"interval")
kripp.alpha(AssessmentForKrippendClarity,"interval")
kripp.alpha(AssessmentForKrippendSourQual,"interval")
kripp.alpha(AssessmentForKrippendSourDiv,"interval")
kripp.alpha(AssessmentForKrippendTopic,"interval")
rm(AssessmentForKrippendClarity,AssessmentForKrippendEval,AssessmentForKrippendSourDiv,AssessmentForKrippendSourQual,AssessmentForKrippendSynth,AssessmentForKrippendTopic)
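The "removes single or double rated items" step above counts the non-missing ratings in each row and keeps only the rows with more than two. A small base-R sketch of just that step, on an invented pads-by-raters matrix (the names `ratings`, `n_ratings` and `kept` are made up here); `rowSums(!is.na(x))` gives the same counts as the `apply` call without adding a temporary column:

```r
# Invented pads-by-raters matrix; NA means that rater did not rate that pad
ratings <- matrix(c(5, 6, NA, NA,
                    4, NA, NA, NA,
                    7, 7, 8, NA,
                    3, 2, 4, 3),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("pad", 1:4), paste0("rater", 1:4)))

n_ratings <- rowSums(!is.na(ratings))            # ratings per pad
kept <- ratings[n_ratings > 2, , drop = FALSE]   # keep pads with 3 or more ratings
print(n_ratings)
print(kept)
```

Here only `pad3` and `pad4` survive the filter.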
Iteration two

Creating a loop to go through a variety of subsets of data and function options, but note this is rather slow: it runs on the full 300*600 matrix.
library(irr)
library(reshape2)
#explore library(Matrix) and http://www.johnmyleswhite.com/notebook/2011/10/31/usingsparsematricesinr/
#This gets the areaName for each project ID, to merge with the pad_projectID within each assessment dataset
a <- unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS" | AssIndMerge$areaName=="MDP",select=c("areaName","projectID")))
#Create the lists for the loops below
AreaNameList <- list("CIS","MDP")
AssDatasetList <- list(assdatacomboscaled=assdatacomboscaled,assessmentdatacombo=assessmentdatacombo,assessmentdatapeer=assessmentdatapeer,assdatascaled=assdatascaled)
facetList <- list("topicCoverage","sourceDiversity","sourceQuality","claimClarity","evaluationQuality","informationSynthesis")
SubsetRaterQualList <- list("OrdinalDiff","HMLDiff","LowestDiff")
listn <- 0
#In addition, we subset by rater n
# and then subset by error rate
#create a list to store results
gdIRRoutput <- list()
#Then we cycle through both areaNames
for(AreaName in AreaNameList){
  #and all the datalists (peer-only, peer-self combo, and each of those scaled v raw)
  for(AssDataset in AssDatasetList){
    listn <- if (listn==4) 1 else listn+1 #this says which element of the list we're on, used later
    #then for each facet in the rubric
    for(facet in facetList){
      #which rater-quality measure are we using? Ordinal, HML, or whichever was lowest
      for(SubsetRaterQual in SubsetRaterQualList){
        FacetRater <- paste0(facet,SubsetRaterQual) #this is used later
        #for varying degrees of rater error on the rater-quality measure: 0=no error, 5=5 off
        for(SubsetRaterErrorN in 0:5){
          #this says only look at pads with x number of raters (1-4)
          for(RaterN in 1:4){
            #Add the areaName to the assessment data
            asswproj <- merge(AssDataset,a,by.x="pad_projectID",by.y="projectID",all.x=T,all.y=F)
            #select the relevant columns
            asswproj <- asswproj[,c(2,3,4,5,6,7,1,8,9,10,11,12,13,14,15,16,17,18,19,20)]
            #get the assessor-quality data for this facet
            asswproj <- merge(asswproj,unique(subset(AssIndMerge,select=c("userID",FacetRater))),by.x="userID",by.y="userID",all.x=T,all.y=F)
            #Select project ID, userID, areaName, whatever the facet is, and the rater quality column
            asswproj <- asswproj[,c("userID","pad_projectID",facet,"areaName",FacetRater)]
            #select users reaching the quality threshold, remove raters with no quality score on this facet (NAs)
            asswproj <- subset(asswproj,asswproj[,5] < SubsetRaterErrorN & !is.na(asswproj[,5])) #select raters below a certain error n on the particular facet
            try(
              {
                #check that there's at least 1 rater left, if not skip block
                if (nrow(asswproj)>0) {
                  #subset to the AreaName, CHECK ME this removes the NAs for this facet again?, dcast so we get pads by users value=facet (the padID column is removed below)
                  AssessmentForKrippend <- as.matrix(dcast(subset(asswproj,asswproj[,4]==AreaName & !is.na(asswproj[,3]),select=c("userID","pad_projectID",facet)), pad_projectID ~ userID))
                  #The next 3 lines remove rows not meeting a certain threshold of rater ns, then removes that column (incomplete)
                  AssessmentForKrippend <- data.frame(AssessmentForKrippend)
                  AssessmentForKrippend <- subset(AssessmentForKrippend,select=-c(pad_projectID)) #remove pad_projectID
                  AssessmentForKrippend$incomplete <- apply(AssessmentForKrippend,1,function(x) sum(!is.na(x))) #Check how many ratings there are
                  AssessmentForKrippend <- subset(AssessmentForKrippend,AssessmentForKrippend$incomplete>RaterN,select=-c(incomplete)) #Subset to those pads with the right n of ratings
                  #again, check that there's at least 1 rater left, if not skip block
                  if (nrow(AssessmentForKrippend)>0){
                    #run kripp.alpha and save it to a sensibly named place
                    AssessmentForKrippend <- kripp.alpha(as.matrix(t(AssessmentForKrippend)),"ordinal")
                    gdIRRoutput[[paste0(AreaName,":",names(AssDatasetList[listn]),":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN)]] <- AssessmentForKrippend
                  }
                }
              }
            )
          }
        }
      }
    }
  }
}
gdIRRoutput <- data.frame(t(sapply(gdIRRoutput,c)))
gdIRRoutput <- subset(gdIRRoutput,select=-c(8:9)) #drop columns 8 and 9 of the kripp.alpha output
gdIRRoutput <- cbind(Row.Names = rownames(gdIRRoutput), gdIRRoutput)
rownames(gdIRRoutput) <- NULL
gdIRRoutput <- data.frame(lapply(gdIRRoutput, as.character), stringsAsFactors=FALSE)
#df[] <- lapply(df, function(x) if(is.list(x)) unlist(x))
write.csv(gdIRRoutput,file="gdIRRoutput.csv")
rm(SubsetRaterQualList,AreaNameList,AreaName,SubsetRaterQual,facet,facetList,FacetRater,SubsetRaterErrorN,RaterN,AssDataset,AssDatasetList)
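One way to flatten six nested loops like these is to build every combination up front with `expand.grid` and then walk through the rows. This is only a sketch of the idea, not what the code above does: the option values are copied from the lists above, the datasets are referred to by name only, and `grid` and `label` are names invented here.

```r
# Every combination of the loop variables, one row per run
grid <- expand.grid(
  AreaName          = c("CIS", "MDP"),
  AssDataset        = c("assdatacomboscaled", "assessmentdatacombo",
                        "assessmentdatapeer", "assdatascaled"),
  facet             = c("topicCoverage", "sourceDiversity", "sourceQuality",
                        "claimClarity", "evaluationQuality", "informationSynthesis"),
  SubsetRaterQual   = c("OrdinalDiff", "HMLDiff", "LowestDiff"),
  SubsetRaterErrorN = 0:5,
  RaterN            = 1:4,
  stringsAsFactors  = FALSE
)

# Build the same style of label the loop uses to name each result
grid$label <- with(grid, paste0(AreaName, ":", AssDataset, ":", facet, ":",
                                "nRat=", RaterN, ":", SubsetRaterQual, ":",
                                "nError=", SubsetRaterErrorN))

print(nrow(grid))     # number of runs the nested loops perform
print(grid$label[1])
```

That is 2*4*6*3*6*4 = 3456 runs, which goes some way to explaining the speed. Each row could then be handed to a single function with `lapply(seq_len(nrow(grid)), ...)`, which also makes it easier to time individual runs.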
Iteration three

This version is substantially faster, thanks to [Thomas Ullman]^{12} for the code! It reduces the huge matrix (~700 raters by 300 objects) to just the ~5 ratings for each object and runs Krippendorff’s alpha on that instead.
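Why this is safe: Krippendorff's alpha only looks at which values were given to each object, not at who gave them, so sliding each row's ratings into the first few columns changes nothing. The sketch below checks that on an invented pads-by-raters matrix. `compact_rows` and `kripp_interval` are names made up here; the latter is a small base-R version of the interval-metric alpha written for this illustration, not the irr implementation.

```r
# Invented sparse pads-by-raters matrix (rows = pads, columns = all potential raters)
sparse <- matrix(NA_real_, nrow = 4, ncol = 8)
sparse[1, c(2, 5, 7)] <- c(6, 7, 6)
sparse[2, c(1, 3)]    <- c(3, 4)
sparse[3, c(4, 6, 8)] <- c(8, 8, 9)
sparse[4, c(2, 3, 6)] <- c(2, 4, 3)

# Slide each row's ratings to the left: pads by "obs_01", "obs_02", ...
compact_rows <- function(m) {
  k <- max(rowSums(!is.na(m)))
  out <- t(apply(m, 1, function(r) {
    v <- r[!is.na(r)]
    c(v, rep(NA, k - length(v)))
  }))
  colnames(out) <- sprintf("obs_%02d", seq_len(k))
  out
}

# Interval-metric Krippendorff's alpha: 1 - observed / expected disagreement
kripp_interval <- function(m) {
  rows <- lapply(seq_len(nrow(m)), function(i) m[i, !is.na(m[i, ])])
  rows <- Filter(function(v) length(v) >= 2, rows)   # pads with one rating drop out
  vals <- unlist(rows)
  n <- length(vals)
  sq <- function(v) sum(outer(v, v, "-")^2)          # squared differences over ordered pairs
  Do <- sum(sapply(rows, function(v) sq(v) / (length(v) - 1))) / n
  De <- sq(vals) / (n * (n - 1))
  1 - Do / De
}

compact <- compact_rows(sparse)
print(dim(sparse))    # 4 pads by 8 potential raters
print(dim(compact))   # 4 pads by 3 observation slots
print(all.equal(kripp_interval(sparse), kripp_interval(compact)))
```

The two matrices give the same alpha, but the compact one has far fewer cells to reshape and scan on every pass of the loop.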
#USE THOMAS' CODE AS BELOW IN THIS LOOP - it's a MASSIVE efficiency
#Include:
#text length as a feature in the loop - how do we subset this programmatically?
#maybe look at self versus peer - how do we subset this programmatically?
#group size - how do we subset this programmatically?
library(irr)
library(reshape2)
library(plyr) #for ddply() below
#explore library(Matrix) and http://www.johnmyleswhite.com/notebook/2011/10/31/usingsparsematricesinr/
#This gets the areaName for each project ID, to merge with the pad_projectID within each assessment dataset
a <- unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS" | AssIndMerge$areaName=="MDP",select=c("areaName","projectID")))
#Create the lists for the loops below
AreaNameList <- list("CIS","MDP")
# ...
if (nrow(asswproj)>0) {
#subset to the AreaName, CHECK ME this removes the NAs for this facet again?, dcast so we get pads by users value=facet, then remove col 1 (padID)
AssessmentForKrippend <- subset(asswproj,asswproj[,4]==AreaName & !is.na(asswproj[,3]),select=c("userID","pad_projectID",facet))
AssessmentForKrippend = ddply(AssessmentForKrippend, .(pad_projectID), transform, idx = paste("obs", sprintf("_%02d", 1:length(pad_projectID)), sep = ""))
AssessmentForKrippend = dcast(AssessmentForKrippend, pad_projectID ~ idx, value.var = facet)
#The next 3 lines remove rows not meeting a certain threshold of rater ns, then removes that column (incomplete)
AssessmentForKrippend <- subset(AssessmentForKrippend,select=-c(pad_projectID)) #remove pad_projectID
AssessmentForKrippend$incomplete <- apply(AssessmentForKrippend,1,function(x) sum(!is.na(x))) #Check how many ratings there are
AssessmentForKrippend <- subset(AssessmentForKrippend,AssessmentForKrippend$incomplete>RaterN,select=-c(incomplete)) #Subset to those pads with the right n of ratings
#again, check that there's at least 1 rater left, if not skip block
if (nrow(AssessmentForKrippend)>0){
#run kripp.alpha and save it to a sensibly named place
#AssessmentForKrippend <- kripp.alpha(as.matrix(t(AssessmentForKrippend)),"ordinal")
AssessmentForKrippend
Iteration four
#Include:
#text length as a feature in the loop - how do we subset this programmatically?
#maybe look at self versus peer - how do we subset this programmatically?
#group size - how do we subset this programmatically?
library(irr)
library(reshape2)
library(plyr)
#explore library(Matrix) and http://www.johnmyleswhite.com/notebook/2011/10/31/usingsparsematricesinr/
#This gets the areaName for each project ID, to merge with the pad_projectID within each assessment dataset
a <- unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS" | AssIndMerge$areaName=="MDP",select=c("areaName","projectID")))
#Create the lists for the loops below
AreaNameList <- list("CIS","MDP")
# ...
if (nrow(asswproj)>0) {
#subset to the AreaName, CHECK ME this removes the NAs for this facet again?, dcast so we get pads by users value=facet, then remove col 1 (padID)
AssessmentForKrippend <- subset(asswproj,asswproj[,4]==AreaName & !is.na(asswproj[,3]),select=c("userID","pad_projectID",facet))
AssessmentForKrippend$idx <- NA
# ...
AssessmentForKrippend <- subset(AssessmentForKrippend,AssessmentForKrippend$incomplete>RaterN,select=-c(incomplete)) #Subset to those pads with the right n of ratings
#again, check that there's at least 1 rater left, if not skip block
if (nrow(AssessmentForKrippend)>0){
#run kripp.alpha and save it to a sensibly named place
#AssessmentForKrippend <- kripp.alpha(as.matrix(t(AssessmentForKrippend)),"ordinal")
#krippall = kripp.alpha(t(as.matrix(dr[, 2:ncol(dr) ])), method="nominal")
AssessmentForKrippend # ...
if (nrow(asswproj)>0) {
AssessmentForGwet <- as.matrix(dcast(subset(asswproj,asswproj[,4]==AreaName & !is.na(asswproj[,3]),select=c("userID","pad_projectID",facet)), pad_projectID ~ userID))
# ...
AssessmentForGwet <- subset(AssessmentForGwet,AssessmentForGwet$incomplete>RaterN,select=-c(incomplete))
if (nrow(AssessmentForGwet)>0){
AssessmentForGwet <- gwet.ac1.raw(as.matrix(AssessmentForGwet))
gdIRRoutputGwet[[paste0(AreaName,":",names(AssDatasetList[listn]),":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN)]] <- AssessmentForGwet
# ...
if (nrow(asswproj)>0) {
AssessmentForKripGwet <- as.matrix(dcast(subset(asswproj,asswproj[,4]==AreaName & !is.na(asswproj[,3]),select=c("userID","pad_projectID",facet)), pad_projectID ~ userID))
# ...
AssessmentForKripGwet <- subset(AssessmentForKripGwet,AssessmentForKripGwet$incomplete>RaterN,select=-c(incomplete))
if (nrow(AssessmentForKripGwet)>0){
AssessmentForKripGwet <- krippen.alpha.raw(as.matrix(AssessmentForKripGwet))
gdIRRoutputKripGwet[[paste0(AreaName,":",names(AssDatasetList[listn]),":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN)]] <- AssessmentForKripGwet
# ...
Iteration five

This is the fastest/most complete: it runs a set of functions alongside each other and saves them separately (rather than running a whole new loop set for each one); it also adds a function to look at a subset of text lengths.
source("Gwets.R") #this holds the functions written by Gwet for Gwet's AC1/2 & Kripp.Alpha versions
library(irr)
library(reshape2)
library(plyr)
#explore library(Matrix) and http://www.johnmyleswhite.com/notebook/2011/10/31/usingsparsematricesinr/
#This gets the areaName for each project ID, to merge with the pad_projectID within each assessment dataset
AssIndMerge <- mutate(AssIndMerge,textlen=nchar(text))
a <- unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS" & textlen>3 | AssIndMerge$areaName=="MDP" & textlen>3,select=c("areaName","projectID","textlen","numUsers")))
##################################################################################
#a$textlen <- set to something sensible for both MDP & CIS
perc.rank <- function(x) trunc(rank(x))/length(x) #this is used to put the quantile on each text length
##################################################################################
#b <- unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS" | AssIndMerge$areaName=="MDP",select=c("projectID","textlen","numUsers")))
#We need to modify 'a' to bring in:
#'textlen' (then run some operations on this)
#do the peer/self thing later, you'll need to reimport it from the db to do 'self'
#'numUsers'
#text length as a feature in the loop - how do we subset this programmatically?
#maybe look at self versus peer - how do we subset this programmatically?
#group size - how do we subset this programmatically?
#Create the lists for the loops below
AreaNameList <- list("CIS","MDP")
AssDatasetList <- list(assdatacomboscaled=assdatacomboscaled,assessmentdatacombo=assessmentdatacombo,assessmentdatapeer=assessmentdatapeer,assdatascaled=assdatascaled)
facetList <- list("topicCoverage","sourceDiversity","sourceQuality","claimClarity","evaluationQuality","informationSynthesis")
SubsetRaterQualList <- list("OrdinalDiff","HMLDiff","LowestDiff")
listn <- 0
groupsizes <- list("1","2","3","4") #1=groupsize1, 2=groupsize2, 3=groupsize2+3, 4=groupsize=1+2+3
methlist <- list("interval","ordinal")
lencuts <- list(.05,.32,.003,.0)
#In addition, we subset by rater n
# and then subset by error rate
#create a list to store results
gdIRRoutput <- list()
gdIRRoutputKripGwet <- list()
gdIRRoutputGwet <- list()
#Then we cycle through both areaNames
for(AreaName in AreaNameList){
  #and all the datalists (peer-only, peer-self combo, and each of those scaled v raw)
  for(AssDataset in AssDatasetList){
    listn <- if (listn==4) 1 else listn+1 #this says which element of the list we're on, used later
    #then for each facet in the rubric
    for(facet in facetList){
      #which rater-quality measure are we using? Ordinal, HML, or whichever was lowest
      for(SubsetRaterQual in SubsetRaterQualList){
        FacetRater <- paste0(facet,SubsetRaterQual) #this is used later
        #for varying degrees of rater error on the rater-quality measure: 0=no error, 5=5 off
        for(SubsetRaterErrorN in 0:5){
          #this says only look at pads with x number of raters (1-4)
          for(RaterN in 1:4){
            #loop through group sizes: take 1 and 2 separately, if 3 then take 2+3, if 4 take all (1+2+3)
            for(groupsize in groupsizes){
              #loop through text lengths to remove the long/short
              for(lowerlen in lencuts){
                upperlen <- 1-lowerlen
                #Add the areaName to the assessment data
                #bear in mind this will create NAs (with some projects already removed from the 'a' data from invalid sessions)
                asswproj <- merge(AssDataset,a,by.x="pad_projectID",by.y="projectID",all.x=T,all.y=F)
                #subset to areaname
                asswproj <- subset(asswproj,asswproj[,20]==AreaName)
                #subset to text length
                asswproj$textlen <- perc.rank(asswproj$textlen)
                asswproj <- subset(asswproj,asswproj[,21]>lowerlen & asswproj[,21]<upperlen )
                #subset to group size (only sizes 1 and 2 are filtered here; 3 and 4 keep every group size)
                if (groupsize==1 | groupsize==2) asswproj <- subset(asswproj,asswproj[,22]==groupsize)
                #select the relevant columns
                asswproj <- asswproj[,c(2,3,4,5,6,7,1,8,9,10,11,12,13,14,15,16,17,18,19,20)]
                #get the assessor-quality data for this facet
                asswproj <- merge(asswproj,unique(subset(AssIndMerge,select=c("userID",FacetRater))),by.x="userID",by.y="userID",all.x=T,all.y=F)
                #Select project ID, userID, areaName, whatever the facet is, and the rater quality column
                asswproj <- asswproj[,c("userID","pad_projectID",facet,"areaName",FacetRater)]
                #select users reaching the quality threshold, remove raters with no quality score on this facet (NAs)
                if (SubsetRaterErrorN<6) {
                  asswproj <- subset(asswproj,asswproj[,5] < SubsetRaterErrorN & !is.na(asswproj[,5]))
                } #select raters below a certain error n on the particular facet (skip if 6 so it just includes all data)
                try(
                  {
                    #check that there's at least 1 rater left, if not skip block
                    if (nrow(asswproj)>0) {
                      #CHECK ME this removes the NAs for this facet again?, dcast so we get pads by observation value=facet, then remove col 1 (padID)
                      AssessmentForKrippend <- subset(asswproj,!is.na(asswproj[,3]),select=c("userID","pad_projectID",facet))
                      AssessmentForKrippend$idx <- NA
                      AssessmentForKrippend <- data.frame(AssessmentForKrippend)
                      AssessmentForKrippend = ddply(AssessmentForKrippend, .(pad_projectID), transform, idx = paste("obs", sprintf("_%02d", 1:length(pad_projectID)), sep = ""))
                      AssessmentForKrippend = dcast(AssessmentForKrippend, pad_projectID ~ idx, value.var = facet)
                      #The next lines remove rows not meeting a certain threshold of rater ns, then removes that column (incomplete)
                      AssessmentForKrippend <- subset(AssessmentForKrippend,select=-c(pad_projectID)) #remove pad_projectID
                      AssessmentForKrippend$incomplete <- apply(AssessmentForKrippend,1,function(x) sum(!is.na(x))) #Check how many ratings there are
                      inc <- ncol(AssessmentForKrippend)
                      AssessmentForKrippend <- subset(AssessmentForKrippend,AssessmentForKrippend[,inc]>RaterN,select=-c(incomplete)) #Subset to those pads with the right n of ratings
                      #again, check that there's at least 1 rater left, if not skip block
                      if (nrow(AssessmentForKrippend)>0){
                        #run kripp.alpha and save it to a sensibly named place
                        #AssessmentForKrippend <- kripp.alpha(as.matrix(t(AssessmentForKrippend)),"ordinal")
                        #krippall = kripp.alpha(t(as.matrix(dr[, 2:ncol(dr) ])), method="nominal")
                        #run gwets
                        AssessmentForGwet <- gwet.ac1.raw(as.matrix(AssessmentForKrippend))
                        gdIRRoutputGwet[[paste0(AreaName,":",names(AssDatasetList[listn]),":","groupsize",groupsize,":","percentile",lowerlen,":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN)]] <- AssessmentForGwet
                        #run gwets v of Kripp
                        AssessmentForKripGwet <- krippen.alpha.raw(as.matrix(AssessmentForKrippend))
                        gdIRRoutputKripGwet[[paste0(AreaName,":",names(AssDatasetList[listn]),":","groupsize",groupsize,":","percentile",lowerlen,":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN)]] <- AssessmentForKripGwet
                        #transpose the matrix for Krippend
                        AssessmentForKrippend <- t(as.matrix(AssessmentForKrippend))
                        for(krippmeth in methlist){
                          AssessmentForKrippend1 <- kripp.alpha(AssessmentForKrippend, method=krippmeth)
                          gdIRRoutput[[paste0(AreaName,":",names(AssDatasetList[listn]),":","groupsize",groupsize,":","percentile",lowerlen,":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN,":",krippmeth,":")]] <- AssessmentForKrippend1
                        }
                      }
                    }
                  }
                )
              }
            }
          }
        }
      }
    }
  }
}
gdIRRoutput <- data.frame(t(sapply(gdIRRoutput,c)))
gdIRRoutput <- subset(gdIRRoutput,select=-c(8:9)) #drop columns 8 and 9 of the kripp.alpha output
gdIRRoutput <- cbind(Row.Names = rownames(gdIRRoutput), gdIRRoutput)
rownames(gdIRRoutput) <- NULL
gdIRRoutput <- data.frame(lapply(gdIRRoutput, as.character), stringsAsFactors=FALSE)
#df[] <- lapply(df, function(x) if(is.list(x)) unlist(x))
write.csv(gdIRRoutput,file="gdIRRoutputBig.csv")
gdIRRoutputKripGwet <- data.frame(t(sapply(gdIRRoutputKripGwet,c)))
write.csv(gdIRRoutputKripGwet,file="gdIRRoutputKripGwetBig.csv")
gdIRRoutputGwet <- data.frame(t(sapply(gdIRRoutputGwet,c)))
write.csv(gdIRRoutputGwet,file="gdIRRoutputGwetBig.csv")
rm(a,gdIRRoutput,gdIRRoutputKripGwet,gdIRRoutputGwet,AssessmentForKrippend,AssDataset,asswproj,asswproj1,a,AreaName,AreaNameList,AssessmentForGwet,AssessmentForKripGwet,AssessmentForKrippend1,FacetRater,RaterN,SusetRaterErrorN,SubsetRaterQual,SubsetRaterQualList,b,facet,facetList,facetlist,groupsize,groupsizes,inc,krippmeth,lencuts,listn,lowerlen,methlist,total,upperlen,bipolar.weights.bp.coeff.raw,circular.weights,conger.kappa.raw,fleiss.kappa.raw,identity.weights,krippen.alpha.raw,linear.weights,ordinal.weights,quadratic.weights,radical.weights,ratio.weights,trim,bipolar.weights,bp.coeff.raw,gwet.ac1.raw,SubsetRaterErrorN,AssDatasetList)
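The text-length filter in the loop can be tried on its own. This sketch uses ten invented text lengths; `perc.rank` is repeated from the code above so the block runs by itself, and `kept` is a name made up here. With a cut of .32 it keeps only the texts whose percentile rank falls strictly between .32 and .68.

```r
perc.rank <- function(x) trunc(rank(x)) / length(x)   # as defined in the code above

# Invented text lengths (characters), deliberately unsorted
textlen <- c(120, 5400, 900, 2500, 1800, 300, 3100, 700, 1500, 4200)

lowerlen <- .32
upperlen <- 1 - lowerlen
pr <- perc.rank(textlen)                           # percentile rank of each text
kept <- textlen[pr > lowerlen & pr < upperlen]     # trim the shortest and longest texts
print(sort(kept))
```

Only the three middle-ranked lengths (900, 1500 and 1800) are kept, so a cut of .32 throws away most of the data; the smaller cuts in `lencuts` trim far less.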
Other pages

* [https://stat.ethz.ch/pipermail/rhelp/2014March/372538.html]^{13}
* McCray, G. (2013). [Assessing inter-rater agreement for nominal judgement variables.]^{14} Paper presented at the Language Testing Forum, Nottingham, November 15-17.



Footnotes

https://commons.wikimedia.org/wiki/File%3AR_logo.svg “By R Foundation, from http://www.rproject.org [GPL (http://www.gnu.org/licenses/gpl.html)], via Wikimedia Commons” ↩

http://www.cookbookr.com/Statistical_analysis/Interrater_reliability/ ↩

http://crunch.kmi.open.ac.uk/viewsource.php?src=~fwild/krippalphademo.R ↩

http://www.insider.org/packages/cran/irr/docs/kripp.alpha ↩

http://repository.upenn.edu/cgi/viewcontent.cgi?article=1286&context=asc_papers ↩

http://www.unc.edu/courses/2007fall/jomc/801/001/HayesAndKrippendorff.pdf ↩

http://www.agreestat.com/inter_rater_reliability_notes.html ↩

https://stat.ethz.ch/pipermail/rhelp/2014March/372538.html ↩

http://www.norbertschmitt.co.uk/uploads/27_528d02015a6da191320524.pdf ↩ ↩^{2}