[R logo]1

As much for my own reference as anything else 🙂 One of the things I’ve spent a lot of time looking at over the last month is a set of inter-rater reliability measures for some peer and self assessment data I have. This data is in a database, with each line representing a single rating of a text along 6 rubric elements; so each line has the raterID, authorgroupID, and 6 scores on a 1-9 scale. In addition, we have some data on how the students performed on a practice rating exercise, and the dataset includes 2 groups of students performing 2 different writing tasks (fortunately students in group1 marked group1’s work, ditto group2). All students marked 2 texts, and most texts were co-authored (i.e., 2 authors), so in theory each text should have 4 peer and 2 self-assessments; in fact, various gaps mean most have 3 peer ratings and 1-2 self-assessments.

I won’t go into the process of choosing the IRR method here, but I decided Krippendorff’s alpha was the most appropriate choice. There are some resources below to explore that, particularly in relation to R, and I’m also reproducing my code below – note this is very much tied into my own data, but it might be useful for thinking about looping through subsets of data and/or shaping data, etc. … or just as a comfort that someone can write such horrible code and it still does a job 🙂 (I’m open to suggestions, also on the stats side, particularly given the results aren’t great – although I should caveat that my primary interest here was in getting the code working.)

# Resources

* [http://www.cookbook-r.com/Statistical_analysis/Inter-rater_reliability/]2 Description of the [IRR]3 package, etc. in R
* Somewhat surprisingly good description of various IRR methods
* [A demo of Krippendorff’s Alpha in R]4 from Fridolin Wild
* [Inside R description of the IRR package]5
* [http://cswww.essex.ac.uk/Research/nle/arrau/alpha.html]6 Resource list including code examples for various methods and languages
* [Krippendorff (2011) Agreement and Information in the Reliability of Coding]7
* Hayes, A. F., & Krippendorff, K. (2007). [Answering the call for a standard reliability measure for coding data.]8 Communication Methods and Measures, 1, 77-89. And macro description:
* Various bits from Gwet – [r-functions]9 (for Krippendorff & Gwet at least), [notes on them]10, and [notes on IRR more generally]11.

# The evolution of code

I thought someone (?!) might find it interesting to see the evolution of my code over time. I’ve got a few versions below (there was an earlier one, and there were many ‘faulty’ versions in between, but all of these ‘work’), starting from one which does everything manually, to another which loops through lots of iterations (very slowly), to a final one which loops through them much faster (I’ve no doubt there are faster ways to do this, probably using ‘apply’ etc.). The data context here is a large set of objects (etherpads) being rated on a number of dimensions (rubric facets) by 3-5 peer/self raters. Because it’s a peer assessment, all etherpad contributors are potential raters, so you have a ~300×600 matrix of pads by raters, of which most values are NA. The final version below reduces the matrix to ~300×5 so that it just represents the actual ratings.
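To make those shapes concrete, here’s a minimal toy sketch (completely made-up data; raterID, padID and score are placeholder column names, not the ones in my dataset) going from a long table of ratings to a wide objects-by-raters matrix and then into kripp.alpha, which expects raters in rows and rated objects in columns:

#Toy sketch only - placeholder names and values, not my real data
library(irr)
library(reshape2)
ratings <- data.frame(raterID=c("r1","r1","r2","r2","r3","r3"),
                      padID=c("p1","p2","p1","p2","p1","p2"),
                      score=c(5,7,6,7,4,8))
wide <- dcast(ratings, padID ~ raterID, value.var="score") #pads in rows, raters in columns, NA where no rating
kripp.alpha(t(as.matrix(wide[,-1])), method="ordinal") #kripp.alpha wants raters in rows, objects in columns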
It’s worth noting I’m pretty sure I have a faster way to create the data.frame Krippendorff’s alpha actually uses (which represents possible ratings on one axis against objects on the other), but I couldn’t work out how to modify the function to just use the existing formatted data…sigh.

# Iteration one

Manually creating each dataset and set of options to run through the functions.

library(irr)
library(reshape2)
library(plyr) #needed for mutate() below
#USE THIS FIRST: Takes all scaled data (change 'assdatacomboscaled' to 'assdatapeer' or 'assessmentdatacombo' for other versions)
a<-unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS"|AssIndMerge$areaName=="MDP",select=c("areaName","projectID")))
#a<-rename(a,c("pad_projectID"="projectID"))
asswproj <- merge(assdatascaled,a,by.x="pad_projectID",by.y="projectID",all.x=T,all.y=F)
asswproj<-asswproj[,c(2,3,4,5,6,7,1,8,9,10,11,12,13,14,15,16,17,18,19,20)]
rm(a)

#THIS VERSION uses the RaterQual element from the diagnostic; un-comment the subset to JUST use the good raters, use the rbind version to double-count good raters
a<-unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS"|AssIndMerge$areaName=="MDP",select=c("areaName","projectID")))
#a<-rename(a,c("pad_projectID"="projectID"))
asswproj <- merge(assessmentdatacombo,a,by.x="pad_projectID",by.y="projectID",all.x=T,all.y=F)
asswproj<-asswproj[,c(2,3,4,5,6,7,1,8,9,10,11,12,13,14,15,16,17,18,19,20)]
asswproj<-merge(asswproj,unique(subset(AssIndMerge,select=c("userID","RaterQualOverall"))),by.x="userID",by.y="userID",all.x=T,all.y=F)
asswproj<-asswproj[,c(2,3,1,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21)]
asswproj <- subset(asswproj,RaterQualOverall==1) #select good raters
qual <- subset(asswproj,RaterQualOverall==1) #select good raters
qual <- mutate(qual,userID=paste("q",userID),id=id*id)
asswproj <- rbind(asswproj,qual)
rm(a,qual)

#Do this for either of the above (removes single or double rated items)
#incomplete <- rowSums(AssessmentForKrippendTopic,na.rm=T)
AssessmentForKrippendSourDiv <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(sourceDiversity),select=c(3,7,9)), pad_projectID ~ userID)),select=-c(1))
test<-data.frame(AssessmentForKrippendSourDiv)
test$incomplete <- apply(test, 1, function(x) sum(!is.na(x)))
test<-subset(test,incomplete>2,select=-c(incomplete))
test<-t(test) #transpose the data
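#NOTE: after the transpose, 'test' has raters in rows and pads in columns, which is the orientation irr's kripp.alpha expects;
#Gwet's *.raw functions expect the opposite (subjects/pads in rows, raters in columns), hence the t() passed to krippen.alpha.raw
#- gwet.ac1.raw may want t(test) for the same reason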
kripp.alpha(as.matrix(test),"ordinal")
krippen.alpha.raw(t(test))
gwet.ac1.raw(test)
#  which(x)
#   rowSums(x,na.rm=T)
#})  # rows with missing values

#This set is MDP
AssessmentForKrippendTopic <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(topicCoverage),select=c(3,7,8)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSourDiv <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(sourceDiversity),select=c(3,7,9)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSourQual <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(SourceQuality),select=c(3,7,10)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendClarity <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(claimClarity),select=c(3,7,11)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendEval <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(evaluationQuality),select=c(3,7,12)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSynth <- subset(as.matrix(dcast(subset(asswproj,areaName=="MDP" & !is.na(informationSynthesis),select=c(3,7,13)), userID ~ pad_projectID)),select=-c(1))

kripp.alpha(AssessmentForKrippendSynth,"interval") #or "ordinal" (slightly higher alpha w/interval)
kripp.alpha(AssessmentForKrippendEval,"interval")
kripp.alpha(AssessmentForKrippendClarity,"interval")
kripp.alpha(AssessmentForKrippendSourQual,"interval")
kripp.alpha(AssessmentForKrippendSourDiv,"interval")
kripp.alpha(AssessmentForKrippendTopic,"ordinal")
rm(AssessmentForKrippendClarity,AssessmentForKrippendEval,AssessmentForKrippendSourDiv,AssessmentForKrippendSourQual,AssessmentForKrippendSynth,AssessmentForKrippendTopic)

#This set is CIS
AssessmentForKrippendTopic <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(topicCoverage),select=c(3,7,8)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSourDiv <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(sourceDiversity),select=c(3,7,9)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSourQual <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(SourceQuality),select=c(3,7,10)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendClarity <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(claimClarity),select=c(3,7,11)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendEval <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(evaluationQuality),select=c(3,7,12)), userID ~ pad_projectID)),select=-c(1))
AssessmentForKrippendSynth <- subset(as.matrix(dcast(subset(asswproj,areaName=="CIS" & !is.na(informationSynthesis),select=c(3,7,13)), userID ~ pad_projectID)),select=-c(1))

kripp.alpha(AssessmentForKrippendSynth,"ordinal") #or "ordinal" (slightly higher alpha w/interval)
kripp.alpha(AssessmentForKrippendEval,"interval")
kripp.alpha(AssessmentForKrippendClarity,"interval")
kripp.alpha(AssessmentForKrippendSourQual,"interval")
kripp.alpha(AssessmentForKrippendSourDiv,"interval")
kripp.alpha(AssessmentForKrippendTopic,"interval")
rm(AssessmentForKrippendClarity,AssessmentForKrippendEval,AssessmentForKrippendSourDiv,AssessmentForKrippendSourQual,AssessmentForKrippendSynth,AssessmentForKrippendTopic)

# Iteration two

Creating a loop to go through a variety of subsets of data and function options – but note this is rather slow; it runs on the full ~300×600 matrix.

library(irr)
library(reshape2)
#explore library(Matrix) and http://www.johnmyleswhite.com/notebook/2011/10/31/using-sparse-matrices-in-r/

#This gets the areaName for each project ID, to merge with the pad_projectID within each assessment-dataset
a<-unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS"|AssIndMerge$areaName=="MDP",select=c("areaName","projectID")))

#Create the lists for the loops below
AreaNameList <- list("CIS","MDP")
AssDatasetList <- list(assdatacomboscaled=assdatacomboscaled,assessmentdatacombo=assessmentdatacombo,assessmentdatapeer=assessmentdatapeer,assdatascaled=assdatascaled)
facetList <- list("topicCoverage","sourceDiversity","sourceQuality","claimClarity","evaluationQuality","informationSynthesis")
SubsetRaterQualList <- list("OrdinalDiff","HMLDiff","LowestDiff")
listn <- 0
#In addition, we subset by rater n
# and then subset by error rate

#create a list to store results
gdIRRoutput <- list()
#Then we cycle through both areaNames
  for(AreaName in AreaNameList){
    #and all the datalists (peeronly, peer-self combo, and each of those scaled v raw)
    for(AssDataset in AssDatasetList){
      ifelse(listn==4,listn <- 1,listn <- listn+1) #this says which element of the list we're on, used later
      #then for each facet in the rubric
      for(facet in facetList){
        #which rater-quality measure are we using?  Ordinal, HML, or whichever was lowest
        for(SubsetRaterQual in SubsetRaterQualList){
          FacetRater <- paste0(facet,SubsetRaterQual) #this is used later
          #for varying degrees of rater-error on the rater-quality-measure 0=no error 5=5 off
          for(SubsetRaterErrorN in 0:5){
            #this says only look at pads with x number of raters (1-4)
            for(RaterN in 1:4){
              #Add the areaName to the assessment data
              asswproj <- merge(AssDataset,a,by.x="pad_projectID",by.y="projectID",all.x=T,all.y=F)
              #select the relevant columns
              asswproj <- asswproj[,c(2,3,4,5,6,7,1,8,9,10,11,12,13,14,15,16,17,18,19,20)]
              #get the assessor-quality data for this facet
              asswproj <- merge(asswproj,unique(subset(AssIndMerge,select=c("userID",FacetRater))),by.x="userID",by.y="userID",all.x=T,all.y=F)
              #Select project ID, userID, areaName, whatever the facet is, and the rater quality column
              asswproj <- asswproj[,c("userID","pad_projectID",facet,"areaName",FacetRater)]
              #select users reaching the quality threshold, remove raters who didn't rate this facet (NAs)
              asswproj <- subset(asswproj,asswproj[,5] < SubsetRaterErrorN & !is.na(deparse(FacetRater))) #select raters below a certain error n on the particular facet
              try(
                  {
                    #check that there's at least 1 rater left, if not skip block
                    if (nrow(asswproj)>0) {
                      #subset to the AreaName, CHECK ME this removes the NAs for this facet again?, dcast so we get pads by users value=facet, then remove col 1 (padID)
                      AssessmentForKrippend <- (as.matrix(subset(dcast(subset(asswproj,asswproj[,4]==AreaName & !is.na(asswproj[,3]),select=c("userID","pad_projectID",facet)), pad_projectID ~ userID)),select=-c(1)))
                      #The next 3 lines remove rows not meeting a certain threshold of rater ns, then removes that column (incomplete)
                      AssessmentForKrippend <- data.frame(AssessmentForKrippend)
                      AssessmentForKrippend <- subset(AssessmentForKrippend,select=-c(pad_projectID)) #remove pad_projectID 
              AssessmentForKrippend$incomplete <- apply(AssessmentForKrippend,1,function(x) sum(!is.na(x))) #Check how many ratings there are
                      AssessmentForKrippend <- subset(AssessmentForKrippend,AssessmentForKrippend\$incomplete>RaterN,select=-c(incomplete)) #Subset to those pads with the right n of ratings
                      #again, check that there's at least 1 rater left, if not skip block
                      if (nrow(AssessmentForKrippend)>0){
                        #run kripp.alpha and save it to a sensibly named place
                        AssessmentForKrippend <- kripp.alpha(as.matrix(t(AssessmentForKrippend)),"ordinal")
                        gdIRRoutput[[paste0(AreaName,":",names(AssDatasetList[listn]),":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN)]] <- AssessmentForKrippend
                    }
                    }
                  }
                )
            }
          }
        }
      }
    }
  }

#Each element of gdIRRoutput is a kripp.alpha result object; sapply(...,c) lines its components up as columns
#(method, numbers of subjects and raters, the alpha value, etc.) and t() then gives one row per loop run
gdIRRoutput <- data.frame(t(sapply(gdIRRoutput,c)))
gdIRRoutput <- subset(gdIRRoutput,select=-c(8:9))
gdIRRoutput <- cbind(Row.Names = rownames(gdIRRoutput), gdIRRoutput)
rownames(gdIRRoutput) <- NULL
gdIRRoutput <- data.frame(lapply(gdIRRoutput, as.character), stringsAsFactors=FALSE)
#df[] <- lapply(df, function(x) if(is.list(x)) unlist(x))
write.csv(gdIRRoutput,file="gdIRRoutput.csv")

rm(SubsetRaterQualList,AreaNameList,AreaName,SubsetRaterQual,facet,facetList,FacetRater,SubsetRaterErrorN,RaterN,AssDataset,AssDatasetList)

# Iteration three

This version is substantially faster, thanks to [Thomas Ullmann]12 for the code! It reduces the huge matrix (~700 raters by 300 objects) to just the ~5 ratings for each object and runs Krippendorff’s alpha on that instead.
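The reshaping trick survives in full in iteration five below; pulled out on its own it looks roughly like this (a sketch only, assuming a long-format table, here called ratings_long, with one rating per row and a placeholder score column):

library(plyr)
library(reshape2)
#number each pad's ratings obs_01, obs_02, ... regardless of which rater gave them...
compact <- ddply(ratings_long, .(pad_projectID), transform,
                 idx = sprintf("obs_%02d", 1:length(pad_projectID)))
#...then cast to one row per pad, so the ~600 mostly-NA rater columns collapse to the handful of actual ratings
compact <- dcast(compact, pad_projectID ~ idx, value.var = "score")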

#USE THOMAS' CODE AS BELOW IN THIS LOOP it's a MASSIVE efficiency
#Include:
  #text length as a feature in the loop - how do we subset this programmatically?
  #maybe look at self versus peer - how do we subset this programmatically?
  #group size - how do we subset this programmatically?

library(irr)
library(reshape2)
#explore library(Matrix) and http://www.johnmyleswhite.com/notebook/2011/10/31/using-sparse-matrices-in-r/

#This gets the areaName for each project ID, to merge with the pad_projectID within each assessment-dataset
a<-unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS"|AssIndMerge$areaName=="MDP",select=c("areaName","projectID")))

#Create the lists for the loops below
AreaNameList 0) {
    #subset to the AreaName, CHECK ME this removes the NAs for this facet again?, dcast so we get pads by users value=facet, then remove col 1 (padID)
    AssessmentForKrippend <- subset(asswproj,asswproj[,4]==AreaName & !is.na(asswproj[,3]),select=c("userID","pad_projectID",facet))
    AssessmentForKrippend = ddply(asswproj, .(pad_projectID), transform, idx = paste("obs", idx = sprintf("_%02d", 1:length(pad_projectID)), sep = "")) 
    AssessmentForKrippend = dcast(AssessmentForKrippend, pad_projectID ~ idx, value.var = facet)     
    
    #The next 3 lines remove rows not meeting a certain threshold of rater ns, then removes that column (incomplete)
    AssessmentForKrippend RaterN,select=-c(incomplete)) #Subset to those pads with the right n of ratings
    
    #again, check that there's at least 1 rater left, if not skip block
    if (nrow(AssessmentForKrippend)>0){
      #run kripp.alpha and save it to a sensibly named place
      #AssessmentForKrippend <- kripp.alpha(as.matrix(t(AssessmentForKrippend)),"ordinal")
      AssessmentForKrippend 

# Iteration four

#Include:
#text length as a feature in the loop - how do we subset this programmatically?
#maybe look at self versus peer - how do we subset this programmatically?
#group size - how do we subset this programmatically?

library(irr)
library(reshape2)
library(plyr)
#explore library(Matrix) and http://www.johnmyleswhite.com/notebook/2011/10/31/using-sparse-matrices-in-r/

#This gets the areaName for each project ID, to merge with the pad_projectID within each assessment-dataset
a<-unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS"|AssIndMerge$areaName=="MDP",select=c("areaName","projectID")))

#Create the lists for the loops below
AreaNameList0) {
    #subset to the AreaName, CHECK ME this removes the NAs for this facet again?, dcast so we get pads by users value=facet, then remove col 1 (padID)
    AssessmentForKrippend <- subset(asswproj,asswproj[,4]==AreaName & !is.na(asswproj[,3]),select=c("userID","pad_projectID",facet))
    AssessmentForKrippend$idxRaterN,select=-c(incomplete)) #Subset to those pads with the right n of ratings
    
    #again, check that there's at least 1 rater left, if not skip block
    if (nrow(AssessmentForKrippend)>0){
      #run kripp.alpha and save it to a sensibly named place
      #AssessmentForKrippend <- kripp.alpha(as.matrix(t(AssessmentForKrippend)),"ordinal")
      #krippall = kripp.alpha(t(as.matrix(dr[, 2:ncol(dr) ])), method="nominal")
      AssessmentForKrippend0) {
    AssessmentForGwet <- (as.matrix(subset(dcast(subset(asswproj,asswproj[,4]==AreaName & !is.na(asswproj[,3]),select=c("userID","pad_projectID",facet)), pad_projectID ~ userID)),select=-c(1)))
    AssessmentForGwetRaterN,select=-c(incomplete))
    if (nrow(AssessmentForGwet)>0){
      AssessmentForGwet <- gwet.ac1.raw(as.matrix(AssessmentForGwet))
      gdIRRoutputGwet[[paste0(AreaName,":",names(AssDatasetList[listn]),":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN)]]0) {
    AssessmentForKripGwet <- (as.matrix(subset(dcast(subset(asswproj,asswproj[,4]==AreaName & !is.na(asswproj[,3]),select=c("userID","pad_projectID",facet)), pad_projectID ~ userID)),select=-c(1)))
    AssessmentForKripGwetRaterN,select=-c(incomplete))
    if (nrow(AssessmentForKripGwet)>0){
      AssessmentForKripGwet <- krippen.alpha.raw(as.matrix(AssessmentForKripGwet))
      gdIRRoutputKripGwet[[paste0(AreaName,":",names(AssDatasetList[listn]),":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN)]]

# Iteration five

This is the fastest/most complete version: it runs a set of functions alongside each other and saves them separately (rather than running a whole new loop-set for each one), and it also adds a function to look at a subset of text-lengths.

source("Gwets.R") #this holds the functions written by Gwet for Gwet's AC1/2 & Kripp.Alpha versions
library(irr)
library(reshape2)
library(plyr)
#explore library(Matrix) and http://www.johnmyleswhite.com/notebook/2011/10/31/using-sparse-matrices-in-r/

#This gets the areaName for each project ID, to merge with the pad_projectID within each assessment-dataset
AssIndMerge <- mutate(AssIndMerge,textlen=nchar(text))
a<-unique(subset(AssIndMerge,AssIndMerge$areaName=="CIS" & textlen>3|AssIndMerge$areaName=="MDP" & textlen>3,select=c("areaName","projectID","textlen","numUsers")))

##################################################################################
#a$textlen <- set to something sensible for both MDP & CIS
perc.rank <- function(x) trunc(rank(x))/length(x) #this is used to put the quantile on each text-length
##################################################################################
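#e.g. (illustrative numbers only) perc.rank(c(120, 45, 300, 80)) gives 0.75 0.25 1.00 0.50,
#i.e. each text's percentile rank by length, which the lowerlen/upperlen cuts below are applied to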

#b<-unique(subset(AssIndMerge,AssIndMerge\$areaName=="CIS"|AssIndMerge\$areaName=="MDP",select=c("projectID","textlen","numUsers")))
#We need to modify 'a' to bring in:
#'textlen' (then run some operations on this)
#do the peer/self thing later,you'll need to reimport it from the db to do 'self'
#'numUsers'
#text length as a feature in the loop - how do we subset this programmatically?
#maybe look at self versus peer - how do we subset this programmatically?
#group size - how do we subset this programmatically?

#Create the lists for the loops below
AreaNameList <- list("CIS","MDP")
AssDatasetList <- list(assdatacomboscaled=assdatacomboscaled,assessmentdatacombo=assessmentdatacombo,assessmentdatapeer=assessmentdatapeer,assdatascaled=assdatascaled)
facetList <- list("topicCoverage","sourceDiversity","sourceQuality","claimClarity","evaluationQuality","informationSynthesis")
SubsetRaterQualList <- list("OrdinalDiff","HMLDiff","LowestDiff")
listn <- 0
groupsizes<- list("1","2","3","4") #1=groupsize1, 2=groupsize2, 3=groupsize2+3, 4=groupsize=1+2+3
methlist <- list("interval","ordinal")
lencuts <- list(.05,.32,.003,.0)
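#each entry of lencuts is a lower percentile cut on text length; the loop below keeps texts whose length percentile lies between lowerlen and 1-lowerlen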
#In addition, we subset by rater n
# and then subset by error rate

#create a list to store results
gdIRRoutput <- list()
gdIRRoutputKripGwet <- list()
gdIRRoutputGwet <- list()

#Then we cycle through both areaNames
for(AreaName in AreaNameList){
  #and all the datalists (peeronly, peer-self combo, and each of those scaled v raw)
  for(AssDataset in AssDatasetList){
    ifelse(listn==4,listn <- 1,listn <- listn+1) #this says which element of the list we're on, used later
    #then for each facet in the rubric
    for(facet in facetList){
      #which rater-quality measure are we using?  Ordinal, HML, or whichever was lowest
      for(SubsetRaterQual in SubsetRaterQualList){
        FacetRater <- paste0(facet,SubsetRaterQual) #this is used later
        #for varying degrees of rater-error on the rater-quality-measure 0=no error 5=5 off
        for(SubsetRaterErrorN in 0:5){
          #this says only look at pads with x number of raters (1-4)
          for(RaterN in 1:4){
            #loop through group sizes, if 3 then take 2+3, take 1 and 2 separately,if 4 take 
            for(groupsize in groupsizes){
              #loop through text lengths to remove the long/short 
              for(lowerlen in lencuts){
                upperlen <- 1-lowerlen
                #Add the areaName to the assessment data
                #bear in mind this will create NAs (with some projects already removed from the 'a' data from invalid sessions)
                asswproj <- merge(AssDataset,a,by.x="pad_projectID",by.y="projectID",all.x=T,all.y=F)
                #subset to areaname
                asswproj <- subset(asswproj,asswproj[,20]==AreaName)
                #subset to textlength
                asswproj$textlen <- perc.rank(asswproj$textlen)
                asswproj <- subset(asswproj,asswproj[,21]>lowerlen & asswproj[,21]<upperlen ) 
                #subset to group size
                if (groupsize==1|groupsize==2) asswproj <- subset(asswproj,asswproj[,22]==groupsize)
                #select the relevant columns
                asswproj <- asswproj[,c(2,3,4,5,6,7,1,8,9,10,11,12,13,14,15,16,17,18,19,20)]
                #get the assessor-quality data for this facet
                asswproj <- merge(asswproj,unique(subset(AssIndMerge,select=c("userID",FacetRater))),by.x="userID",by.y="userID",all.x=T,all.y=F)
                #Select project ID, userID, areaName, whatever the facet is, and the rater quality column
                asswproj <- asswproj[,c("userID","pad_projectID",facet,"areaName",FacetRater)]
                #select users reaching the quality threshold, remove raters who didn't rate this facet (NAs)
                if (SubsetRaterErrorN<6) {
                  asswproj <- subset(asswproj,asswproj[,5] < SubsetRaterErrorN & !is.na(deparse(FacetRater)))
                } #select raters below a certain error n on the particular facet (skip if 6 so it just includes all data)
                try(
{
  #check that there's at least 1 rater left, if not skip block
  if (nrow(asswproj)>0) {
    #CHECK ME this removes the NAs for this facet again?, dcast so we get pads by users value=facet, then remove col 1 (padID)
    AssessmentForKrippend <- subset(asswproj,!is.na(asswproj[,3]),select=c("userID","pad_projectID",facet))
    AssessmentForKrippend$idx <- NA
    AssessmentForKrippend <- data.frame(AssessmentForKrippend)
    AssessmentForKrippend = ddply(AssessmentForKrippend, .(pad_projectID), transform, idx = paste("obs", idx = sprintf("_%02d", 1:length(pad_projectID)), sep = "")) 
    AssessmentForKrippend = dcast(AssessmentForKrippend, pad_projectID ~ idx, value.var = facet)     
    
    #The next 3 lines remove rows not meeting a certain threshold of rater ns, then removes that column (incomplete)
    AssessmentForKrippend <- subset(AssessmentForKrippend,select=-c(pad_projectID)) #remove pad_projectID 
    AssessmentForKrippend$incomplete <- apply(AssessmentForKrippend,1,function(x) sum(!is.na(x))) #Check how many ratings there are
    inc<-ncol(AssessmentForKrippend)
    AssessmentForKrippend <- subset(AssessmentForKrippend,AssessmentForKrippend[inc]>RaterN,select=-c(incomplete)) #Subset to those pads with the right n of ratings
    
    #again, check that there's at least 1 rater left, if not skip block
    if (nrow(AssessmentForKrippend)>0){
      #run kripp.alpha and save it to a sensibly named place
      #AssessmentForKrippend <- kripp.alpha(as.matrix(t(AssessmentForKrippend)),"ordinal")
      #krippall = kripp.alpha(t(as.matrix(dr[, 2:ncol(dr) ])), method="nominal")
      
      #run gwets
      AssessmentForGwet <- gwet.ac1.raw(as.matrix(AssessmentForKrippend))
      gdIRRoutputGwet[[paste0(AreaName,":",names(AssDatasetList[listn]),":","groupsize",groupsize,":","percentile",lowerlen,":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN)]] <- AssessmentForGwet
      
      #run gwets v of Kripp
      AssessmentForKripGwet <- krippen.alpha.raw(as.matrix(AssessmentForKrippend))
      gdIRRoutputKripGwet[[paste0(AreaName,":",names(AssDatasetList[listn]),":","groupsize",groupsize,":","percentile",lowerlen,":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN)]] <- AssessmentForKripGwet
      #transpose the matrix for Krippend
      AssessmentForKrippend <- t(as.matrix(AssessmentForKrippend))      
      for(krippmeth in methlist){
        AssessmentForKrippend1 <- kripp.alpha(AssessmentForKrippend, method=krippmeth)
        gdIRRoutput[[paste0(AreaName,":",names(AssDatasetList[listn]),":","groupsize",groupsize,":","percentile",lowerlen,":",facet,":","nRat=",RaterN,":",SubsetRaterQual,":","nError=",SubsetRaterErrorN,":",krippmeth,":")]] <- AssessmentForKrippend1
      }
    }
  }
}
                )
              }
            }
          }
        }
      }
    }
  }
}

gdIRRoutput <- data.frame(t(sapply(gdIRRoutput,c)))
gdIRRoutput <- subset(gdIRRoutput,select=-c(8:9))
gdIRRoutput <- cbind(Row.Names = rownames(gdIRRoutput), gdIRRoutput)
rownames(gdIRRoutput) <- NULL
gdIRRoutput <- data.frame(lapply(gdIRRoutput, as.character), stringsAsFactors=FALSE)
#df[] <- lapply(df, function(x) if(is.list(x)) unlist(x))
write.csv(gdIRRoutput,file="gdIRRoutputBig.csv")

gdIRRoutputKripGwet <- data.frame(t(sapply(gdIRRoutputKripGwet,c)))
write.csv(gdIRRoutputKripGwet,file="gdIRRoutputKripGwetBig.csv")

gdIRRoutputGwet <- data.frame(t(sapply(gdIRRoutputGwet,c)))
write.csv(gdIRRoutputGwet,file="gdIRRoutputGwetBig.csv")

rm(a,gdIRRoutput,gdIRRoutputKripGwet,gdIRRoutputGwet,AssessmentForKrippend,AssDataset,asswproj,asswproj1,a,AreaName,AreaNameList,AssessmentForGwet,AssessmentForKripGwet,AssessmentForKrippend1,FacetRater,RaterN,SusetRaterErrorN,SubsetRaterQual,SubsetRaterQualList,b,facet,facetList,facetlist,groupsize,groupsizes,inc,krippmeth,lencuts,listn,lowerlen,methlist,total,upperlen,bipolar.weights.bp.coeff.raw,circular.weights,conger.kappa.raw,fleiss.kappa.raw,identity.weights,krippen.alpha.raw,linear.weights,ordinal.weights,quadratic.weights,radical.weights,ratio.weights,trim,bipolar.weights,bp.coeff.raw,gwet.ac1.raw,SubsetRaterErrorN,AssDatasetList)

# Other pages

* [https://stat.ethz.ch/pipermail/r-help/2014-March/372538.html]13
* McCray, G. (2013). [Assessing inter-rater agreement for nominal judgement variables.]14 Paper presented at the Language Testing Forum, Nottingham, November 15-17.

Footnotes

  1. https://commons.wikimedia.org/wiki/File%3AR_logo.svg “By R Foundation, from http://www.r-project.org [GPL (http://www.gnu.org/licenses/gpl.html)], via Wikimedia Commons”

  2. http://www.cookbook-r.com/Statistical_analysis/Inter-rater_reliability/

  3. http://cran.r-project.org/web/packages/irr/irr.pdf

  4. http://crunch.kmi.open.ac.uk/viewsource.php?src=~fwild/kripp-alpha-demo.R

  5. http://www.inside-r.org/packages/cran/irr/docs/kripp.alpha

  6. http://cswww.essex.ac.uk/Research/nle/arrau/alpha.html

  7. http://repository.upenn.edu/cgi/viewcontent.cgi?article=1286&context=asc_papers

  8. http://www.unc.edu/courses/2007fall/jomc/801/001/HayesAndKrippendorff.pdf

  9. http://www.agreestat.com/r_functions.html

  10. http://www.agreestat.com/software/readme_r_functions.pdf

  11. http://www.agreestat.com/inter_rater_reliability_notes.html

  12. http://kmi.open.ac.uk/people/member/thomas-ullmann

  13. https://stat.ethz.ch/pipermail/r-help/2014-March/372538.html

  14. http://www.norbertschmitt.co.uk/uploads/27_528d02015a6da191320524.pdf