This repository has been archived by the owner on May 24, 2019. It is now read-only.

Circumventing Batch Pricing

Thomas J. Leeper edited this page Aug 5, 2015 · 10 revisions

In July 2015, Amazon increased the price of MTurk HITs that include 10 or more assignments. In short, MTurk charges an additional 20% commission on top of the base commission when a HIT contains 10 or more assignments. This disproportionately affects academic requesters doing large-n survey-experimental research. This tutorial explains how to post a sequence of HITs, each with 9 (or fewer) assignments, in order to collect 10 or more completed assignments in total without incurring this extra charge.
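To make the saving concrete, here is a minimal sketch of the fee arithmetic. The helper name `mturk_cost` and the 20% base commission are assumptions for illustration (this page only states the additional 20%), so check the current MTurk pricing before relying on the numbers.

```r
# Hypothetical helper: total cost of one HIT (rewards plus commission).
# base_rate = 0.20 is an ASSUMED base commission; surcharge = 0.20 is the
# additional commission described above for HITs with 10 or more assignments.
mturk_cost <- function(reward, assignments, base_rate = 0.20, surcharge = 0.20) {
  rate <- if (assignments >= 10) base_rate + surcharge else base_rate
  reward * assignments * (1 + rate)
}

# one HIT with 1000 assignments at $0.20 each
single_hit <- mturk_cost(0.20, 1000)

# the same 1000 assignments split into 111 HITs of 9 plus one HIT of 1
split_hits <- 111 * mturk_cost(0.20, 9) + mturk_cost(0.20, 1)

# difference attributable to the batch surcharge
savings <- single_hit - split_hits
print(c(single = single_hit, split = split_hits, savings = savings))
```

Under these assumed rates the surcharge applies to the whole reward budget of the single large HIT, so splitting saves 20% of the total rewards paid.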

The basic strategy will be to create a series of HITs, each of which (except for the first) has a QualificationRequirement that prevents workers who have done one of the previous HITs from completing an assignment. The example below implements this as a repeat loop in R that is fully self-contained. You may need to modify it for your purposes.

# load MTurkR (requester credentials must already be configured)
library("MTurkR")

# set total number of desired assignments
total <- 1000

# create QualificationType
qual <- CreateQualificationType(name="Already completed HIT",
          description="Already completed identical HIT before.",
          status = "Active")
# generate "DoesNotExist" QualificationRequirement structure
# (uses the `qual` object created above)
qreq <- GenerateQualificationRequirement(qual$QualificationTypeId, "DoesNotExist", "")

# create HITType w/o qualification requirement
hittype1 <- RegisterHITType(title = "10 Question Survey",
                description = "Something something something",
                reward = ".20", 
                duration = seconds(hours = 1), 
                auto.approval.delay = seconds(days = 1),
                keywords = "survey, questionnaire")

# create HITType w/ qualification requirement
hittype2 <- RegisterHITType(title = "10 Question Survey",
                description = "Something something something",
                reward = ".20", 
                duration = seconds(hours = 1), 
                auto.approval.delay = seconds(days = 1),
                keywords = "survey, questionnaire",
                qual.req = qreq) # this blocks past workers

# create first HIT
eq <- GenerateExternalQuestion("https://www.example.com/","400")
hit <- CreateHIT(hit.type = hittype1$HITTypeId,
                 assignments = 9, # IMPORTANT THAT THIS IS <= 9
                 expiration = seconds(days = 4),
                 question = eq$string)

# variable to index number of completed assignments
completed <- 0

# list to store assignments into
allassigns <- list()

# start the loop
repeat {
  g <- GetHIT(hit$HITId, response.group = "HITAssignmentSummary", 
              verbose = FALSE)$HITs$NumberOfAssignmentsCompleted

  # check if all 9 assignments have been completed
  if (as.numeric(g) == 9) {
    # if yes, retrieve submitted assignments into the next free list slot
    w <- length(allassigns) + 1
    allassigns[[w]] <- GetAssignments(hit = hit$HITId)
    
    # assign blocking qualification to workers who completed previous HIT
    AssignQualification(qual$QualificationTypeId, allassigns[[w]]$WorkerId, verbose = FALSE)

    # increment number of completed assignments
    completed <- completed + 9

    # check if enough assignments have been completed
    if(completed < total) {    
      # if not, create another HIT
      hit <- CreateHIT(hit.type = hittype2$HITTypeId,
                       assignments = 9,
                       expiration = seconds(days = 4),
                       question = eq$string)

      # wait some time and check again
      Sys.sleep(180)
    } else {
      # if total met, exit loop:
      break
    }
  } else {
    # wait some time and check again
    Sys.sleep(30) # TIME (IN SECONDS) TO WAIT BETWEEN CHECKING FOR ASSIGNMENTS
  }
}

# get all assignments for all HITs as a data.frame
m <- do.call("rbind", allassigns)

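This will run by itself until the total number of assignments is reached. The R session must stay open while it runs. Because the loop only advances once all 9 assignments of the current HIT are completed, it will keep waiting indefinitely if a HIT expires before then.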
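Because every HIT in the loop requests 9 assignments, the number collected is always a multiple of 9 and can exceed `total` (1008 rather than 1000 in the example). One way to hit the target exactly would be to plan the batch sizes up front and pass each one as the `assignments` value when creating the corresponding HIT. A minimal sketch of that planning step, where `batch_sizes` is a hypothetical helper and not part of MTurkR:

```r
# Hypothetical helper: split `total` assignments into batches of at most
# `max_per_hit`, so that no single HIT reaches the 10-assignment threshold.
batch_sizes <- function(total, max_per_hit = 9) {
  stopifnot(total >= 1, max_per_hit >= 1, max_per_hit <= 9)
  full <- total %/% max_per_hit   # number of full batches
  rest <- total %% max_per_hit    # size of the final, smaller batch (if any)
  c(rep(max_per_hit, full), if (rest > 0) rest)
}

sizes <- batch_sizes(1000)
length(sizes)  # number of HITs to post
sum(sizes)     # total assignments requested
tail(sizes, 1) # size of the final HIT
```

In the loop, the completion check and the `completed` counter would then need to use the size of the current batch instead of the fixed value 9.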
