prisoners_in_R.Rmd

---
title: "Agent-Based Tournament Coding Challenge"
output: html_document
---

# Agent-Based Tournament Coding Challenge (TU Delft - EPA1315 Data Analysis and Visualization)

## Introduction

In the agent-based tournament coding challenged each agent has to face an interated prisoner's dilemma. The prisoners dilemma is extensively studied in multiple scientific areas, such as economics, machine learning and evolutionary biology [1]. The core concept of the game is that each of the two competitors can choose 1 out of 2 options for each round. These options are to either defect or cooperate. Both player simultaneously and independently select an option and depending on both decisions a payoff for each player is determined.

Payoff | Cooperate | Defect
--- | --- | ---
Cooperate| 4/4 | 0/5
Defect | 5/0 | 2/2

The full version of this work is [hosted on GitHub](https://github.com/jasonrwang/prisoners_in_R).

### Authors

* Mees Hoff – 4963059
* Siemon Keij - 4285654
* Philip Seijger - 4272803
* Jason R Wang - 4788281


## Common Strategies

#### 1. Always Cooperate
Cooperates on every move, as a strategie this can be considered very social but not most benefial for own interest

#### 2. Always Defect
Defects on every move, this is considered a 'hard' strategy'.

#### 3. Tit for Tat
Cooperates on the first move, then simply copies the opponent’s last move. In experimental tournaments this strategy turned out surprisingly successful.

#### 4. Suspicious Tit for Tat
Same as Tit for Tat, except that it defects on the first move.

#### 5. Pavlov
Cooperates on the first move, and defects only if both the players did not agree on the previous move.

#### 6. Spiteful
Cooperates, until the opponent defects, and thereafter always defects. Can work out positively in terms of own interest, but may also be risky if first opponent defect occurs early.

#### 7. Random Player
Makes a random move each turn again.

#### 8. Periodic player CD:
Plays C, D periodically.

#### 9. Periodic player DDC
Plays D, D, C periodically.

#### 10. Periodic player CCD
Plays C, C, D periodically.

#### 11. Tit for Two Tats
Cooperates on the first move, and defects only when the opponent defects two times. Compared to the regular tit for tat, this strategy is more socially oriented.

#### 12. Soft Majority
Begins by cooperating, and cooperates as long as the number of times the opponent has cooperated is greater than or equal to the number of times it has defected. When this is no longer the case it defects.

#### 13, Hard Majority
Defects on the first move, and defects if the number of defections of the opponent is greater than or equal to the number of times it has cooperated, else cooperates.

#### 14. Hard Tit for Tat
Cooperates on the first move, and defects if the opponent has defects on any of the previous three moves, else cooperates.

#### 15. NaiveProber
Like TitforTat,but occasionally, randomly defects.

#### 16. Remorseful Prober :
Like Naive Prober, but it tries to break the series of mutual defections after defecting.

[1]

## Selected Strategy

### Axelrod Single-Objective Evolutionary Algorithm

The concept of our agent is based on the fact that for each move in the game there are four possibilities. Both players can cooperate (CC or R for reward), the first player cooperats and the other player defects (CD or S for sucker), the first player can defect and the second cooperates (DC or T for temptation), or both players defect (DD or P for penalty). For our strategie we consider the last three moves as a three letter code. For instance, the situation is which the last three rounds both agents cooperated, has a three letter sequence RRR.

This combination of letters is used to generate a number 0:63. This is done by setting R=0, T=1, S=2 and P=3 and interpret it as a number on base 4. This will result in the next formula: `(points round 1) * 4^2 + (points round 2) * 4^1 + (points round 3) * 4^0`. For example, 'SSP' yields 43. Each of the numbersfrom  0:63 either stands for defect or cooperate. Therefore, based on the last 3 moves, an option is selected. Since this decision is based upon the last the 3 moves, the first three moves cannot be defined by this strategy and are run by a modified tit for tat approach.

An Exception is made for the so called "lemon agents", these agents are introducing themselves by saying "Lemon!." and will always cheat. For this reason, in a situation of an opponent that introduces itself by "Lemon!." the defect option is always selected.


The 'main.R' code is used to call two different scripts; 'raw_parser.R' and 'extract_statistics.R'. These scripts rework the data from last years tournament into a 64-bit string which can be used in the agent.

Main.R looks as follows
```{R}
### Main File to run things in!
#setwd(Set working directory if not yet specified)

source('scripts/raw_parser.R', chdir = TRUE)
source('scripts/extract_statistics.R', chdir = TRUE)

## Run parser to create string
data_frame = axelrod_encode(raw_parser('Example_Tournament/bid_register.csv'))

## Run parsed array through statistics
axelrod_string = extract_statistics(data_frame)

# The string above can be put into the agent manually now

```

The script in 'raw_parser.R' will look at the information from a file in which the bids were registered. For every possible pair of agents, the code will save the different meetings. For every pair the script will look at three meetings and save which combination this is. For every gained combination, the script will look into if the agent won or lost in the fourth meeting. It also knows what both agents bid, this can be used to state which agent won.

'raw_parser.R' looks as follows:
```{R}
raw_parser = function(file_name) {

    tourn_raw = read.csv(file_name,sep=",",stringsAsFactors = FALSE)
    num_agents = max(unique(tourn_raw$id1)) # Only works if IDs are sequential!
    outcome_sequences = array(data=NA,dim=c(num_agents,num_agents,40)) # 40 is wasteful. See improvement suggestion below.

    # typeof(outcome_sequences) ## just for testing

    ## To get data about the number of games played between each pair
    # seq_length_list = c()

    for (i_id1 in seq(num_agents-1)) { # Iterate through all the agents
        id1_index = which(tourn_raw$id1 %in% i_id1 | tourn_raw$id2 %in% i_id1) # Find where agent plays

        for (i_id2 in seq(i_id1+1,num_agents)) { # Iterate through remaining combinations
            id2_index = which(tourn_raw$id1 %in% i_id2 | tourn_raw$id2 %in% i_id2) # Find where agent #i_id2 plays

            ## The below can be an improvement, but doesn't quite work yet.
            ## The "list" cannot replace the single element.
            # The number of times i_id1 attacks i_id2 to help pre-allocate list
            match_id = which(id1_index %in% id2_index)
            seq_list_len = length(match_id)
            #outcome_sequences[i_id1,i_id2] = vector("list",seq_list_len) # pre-allocate

            # To get data about the number of games played between each pair
            #seq_length_list = c(seq_length_list,seq_list_len)

            for (i in seq(seq_list_len)){
                outcome_sequences[i_id1,i_id2,i] = axelrod_code(
                    tourn_raw$bid1[ id1_index [ match_id[i] ] ],
                    tourn_raw$bid2[ id1_index [ match_id[i] ] ])
            }
        }
    }
    return(outcome_sequences)
}

axelrod_code = function(bid1, bid2) {
    # Determines the letter code associated with an interaction
    if (is.na(bid1)) {
        next
    }
    else if (bid1 == 'cooperate') {
       if (bid1 == bid2) { # CC -> R -> 3
            return(3)
       } else { # CD -> S -> 2
            return(2)
       }
    }  else if (bid1 == 'defect') {
        if (bid1 == bid2) { # DD -> P -> 0
            return(0)
        } else { # DC -> T -> 1
            return(1)
        }
    } else {
       print("Error in cooperate-defect matching!")
       break
    }
}

axelrod_encode = function(a_code) {

    # Takes a vector of behaviour and turns it into a useful data frame for data analysis

    num_agents = nrow(a_code) # Determine number of agents, n, in 'n x n' matrix input
    code_id = decision = win = c()

    for (i in seq(num_agents)) { # The "attacking" agent
        for (j in seq(num_agents)) { # The "defending" agent
            # Agents cannot play themselves, so skip in this case.
            if (i == j) { # This could be improved by stripping i_id1 from i_id2's for loop declaration
                next
            }

            for (k in seq(length(a_code[i,j,])-3)) {

                if (is.na(a_code[i,j,k+3])) { ## Due to dumb coding above, NA exists in list
                    next
                }

                encoded_seq = (a_code[i,j,k] * 4*4 + a_code[i,j,k+1] * 4 + a_code[i,j,k+2] * 1)

                code_id = c(code_id,encoded_seq)

                if (a_code[i,j,k+3] > 1) { # 0 and 1 are defect; 2 and 3
                    decision = c(decision,'C')
                } else {
                    decision = c(decision,'D')
                }

                if (a_code[i,j,k+3] %% 2) { # Consider R and T (1 and 3) as win
                    win = c(win,TRUE)
                } else {
                    win = c(win,FALSE)
                }
            }
        }
    }
    return(data.frame(code_id,decision,win,stringsAsFactors = FALSE))
}

```


The next script, 'extract_statistics.R' works further were 'raw_parser.R' stopped. It will look into the percentage of winnings after every combination for all the different pairs. This can be used to make a decision on what to do after every combination when encoutering an agent. To finalize a string is produced which states what to do after every combination.

```{R}
get_per <- function(choice_evaluate,data_frame,number) {

    d = data_frame
    # Count the numbers of times the pair made the same choice we are evaluating
    num_chosen = 0
    for (i in d$decision[d$code_id == number]) {
        if (i == choice_evaluate) {
            num_chosen = num_chosen + 1
        }
    }

    # Count the number of times the agent wins by making that decision
    num_wins = 0
    for (j in d[ (d$code_id==number & d$decision==choice_evaluate) , c("win")]) {
        if (j == TRUE) {
            num_wins = num_wins + 1
        }
    }
    if (num_chosen == 0) {
        p <- 0
    } else {
        p <- num_wins/num_chosen
    }
    return(p)
}

extract_statistics <- function(data_frame) {
    d = data_frame
    str <- NULL

    # Run statistics-gatherer for all 64 possible combinations.
    for (i in 0:63) {

        # Gather statistics about choices
        per_c <- get_per("C",data_frame,i)
        per_d <- get_per("D",data_frame,i)

        # Make decision for what agent should do based on choices
        if (per_c == per_d) {
            str <- paste(str, "C",sep="") # Defaults to Defect
        } else if (per_c > per_d) {
            str <- paste(str, "C",sep="")
        } else if (per_c < per_d) {
            str <- paste(str, "D",sep="")
        } else {
            str <- paste(str, " ",sep="") # For debugging

    return(data.frame(code_id,decision,win, stringsAsFactors = FALSE))
}

## For testing
# raw_parser("validate.csv")
```

The output of the scripts was the following string: CCCDCCCDCCCDCCCDCDCDCDDDCDDDDDDDCCCDCDDDCDDDDDDDDDDDCDCDCDDDDDDD. This script was used during testing, but failed to beat tit for tat and other agents consistantly. Therefore, it was decided to use a string that we got from the paper by Mittal and Deb, namely: DDDDDDCCCDDCDDDDDDCDDDDCDCCDDCDDDDCDCCCDDCDCDCDCDDCCDDDDCDDDDDCCCDCCDD. After the code, the creating and the workings of the agent will be discussed.

```{R}
library(R6)
Agent <- R6Class("Agent",
  #this part was already given
  public = list(
    bid = NULL,
    book = NULL,
    greeting = NULL,
    id = NULL,
    opponent_id = NULL,
    opponent_greeting = NULL,
    round = NULL,
    response = NULL,

    set_book = function(book=NA) {
      self$book <- book
    },

    set_id = function(id=NA) {
      self$id = id
    },

    set_opponent_id = function(opponent_id=NA) {
      self$opponent_id = opponent_id
    },

    set_response = function(response=NA) {
      self$response <- response
    },

    set_round = function(round=NA) {
      self$round <- round
    },

    set_greeting = function() {
      greeting <- "Back to PRESTO..."
    },

    receive_greeting = function(greeting = NA) {
      self$opponent_greeting = greeting
    },

    #this code was made to get a certain bid for the agent
    get_bid = function() {
      bid_vector <- c("cooperate","defect")
      #creating a few variables which will be used later on
      last_move_self <- NA
      last_move_opponent <- NA
      #create a database with the last rounds
      last_round <- tail(subset(self$book, (id1 == self$id | id2 == self$id) & (id1 == self$opponent_id | id2 == self$opponent_id)), 3)
      last_round_single <- tail(subset(self$book, (id1 == self$id | id2 == self$id) & (id1 == self$opponent_id | id2 == self$opponent_id)), 1)
      #the string which will decide what the agent should do according to the paper by Mittal and Deb
      axelrod_seq <- c("D","D","D","D","D","D","C","C","C","D","D","C","D","D","D","D","D","D","C","D","D","D","D","C","D","C","C","D","D","C","D","D","D","D","C","D","C","C","C","D","D","C","D","C","D","C","D","C","D","D","C","C","D","D","D","D","C","D","D","D","D","D","C","C","C","D","C","C","D","D")
      #this is the index for the bid, it starts at 1 because R start at 1 with counting
      number <- 1

      #first check what the bid was of the last round we played this agent
      if (nrow(last_round_single) == 0)  {
      } else if (last_round_single[1,1] == self$id) {
        last_move_self <- last_round_single[1,4]
        last_move_opponent <- last_round_single[1,5]
      } else if (last_round_single[1,1] == self$opponent_id) {
        last_move_self <- last_round_single[1,5]
        last_move_opponent <- last_round_single[1,4]
      }

      #this is a simple tit for that function for the first couple of rounds
      if (nrow(last_round) == 3) {
      } else if (nrow(last_round_single)==0) {
        self$bid <- bid_vector[1]
      } else if ((last_move_self == bid_vector[1]) & (last_move_opponent == bid_vector[1])) {
        self$bid <- sample(bid_vector, 1, prob=c(4/5,1/5))
      } else if ((last_move_self == bid_vector[1]) & (last_move_opponent == bid_vector[2])) {
        self$bid <- sample(bid_vector, 1, prob=c(1/2,1/2))
      } else if ((last_move_self == bid_vector[2]) & (last_move_opponent == bid_vector[1])) {
        self$bid <- sample(bid_vector, 1, prob=c(2/5,3/5))
      } else if ((last_move_self == bid_vector[2]) & (last_move_opponent == bid_vector[2])) {
        self$bid <- bid_vector[2]
      }

      #this is the function that looks at the last three rounds and adds it up to a number which is used to find the bid in the string
      if (nrow(last_round) < 3) {
      } else if ((last_round[3,1] == self$id) & (last_round[3,2] == self$opponent_id) & (last_round[3,4] == bid_vector[1]) & (last_round[3,5] == bid_vector[1])) {
      number <- number + (3 * 4^2)
      } else if ((last_round[3,1] == self$id) & (last_round[3,2] == self$opponent_id) & (last_round[3,4] == bid_vector[2]) & (last_round[3,5] == bid_vector[1])) {
      number <- number + (1 * 4^2)
      } else if ((last_round[3,1] == self$id) & (last_round[3,2] == self$opponent_id) & (last_round[3,4] == bid_vector[1]) & (last_round[3,5] == bid_vector[2])) {
      number <- number + (2 * 4^2)
      } else if ((last_round[3,1] == self$id) & (last_round[3,2] == self$opponent_id) & (last_round[3,4] == bid_vector[2]) & (last_round[3,5] == bid_vector[2])) {
      number <- number + (0 * 4^2)
      } else if ((last_round[3,1] == self$opponent_id) & (last_round[3,2] == self$id) & (last_round[3,4] == bid_vector[1]) & (last_round[3,5] == bid_vector[1])) {
      number <- number + (3 * 4^2)
      } else if ((last_round[3,1] == self$opponent_id) & (last_round[3,2] == self$id) & (last_round[3,4] == bid_vector[2]) & (last_round[3,5] == bid_vector[1])) {
      number <- number + (2 * 4^2)
      } else if ((last_round[3,1] == self$opponent_id) & (last_round[3,2] == self$id) & (last_round[3,4] == bid_vector[1]) & (last_round[3,5] == bid_vector[2])) {
      number <- number + (1 * 4^2)
      } else if ((last_round[3,1] == self$opponent_id) & (last_round[3,2] == self$id) & (last_round[3,4] == bid_vector[2]) & (last_round[3,5] == bid_vector[2])) {
      number <- number + (0 * 4^2)
      }

      #this is the function that looks at the last three rounds and adds it up to a number which is used to find the bid in the string
      if (nrow(last_round) < 3) {
      } else if ((last_round[2,1] == self$id) & (last_round[2,2] == self$opponent_id) & (last_round[2,4] == bid_vector[1]) & (last_round[2,5] == bid_vector[1])) {
        number <- number + (3 * 4^1)
      } else if ((last_round[2,1] == self$id) & (last_round[2,2] == self$opponent_id) & (last_round[2,4] == bid_vector[2]) & (last_round[2,5] == bid_vector[1])) {
        number <- number + (1 * 4^1)
      } else if ((last_round[2,1] == self$id) & (last_round[2,2] == self$opponent_id) & (last_round[2,4] == bid_vector[1]) & (last_round[2,5] == bid_vector[2])) {
        number <- number + (2 * 4^1)
      } else if ((last_round[2,1] == self$id) & (last_round[2,2] == self$opponent_id) & (last_round[2,4] == bid_vector[2]) & (last_round[2,5] == bid_vector[2])) {
        number <- number + (0 * 4^1)
      } else if ((last_round[2,1] == self$opponent_id) & (last_round[2,2] == self$id) & (last_round[2,4] == bid_vector[1]) & (last_round[2,5] == bid_vector[1])) {
        number <- number + (3 * 4^1)
      } else if ((last_round[2,1] == self$opponent_id) & (last_round[2,2] == self$id) & (last_round[2,4] == bid_vector[2]) & (last_round[2,5] == bid_vector[1])) {
        number <- number + (2 * 4^1)
      } else if ((last_round[2,1] == self$opponent_id) & (last_round[2,2] == self$id) & (last_round[2,4] == bid_vector[1]) & (last_round[2,5] == bid_vector[2])) {
        number <- number + (1 * 4^1)
      } else if ((last_round[2,1] == self$opponent_id) & (last_round[2,2] == self$id) & (last_round[2,4] == bid_vector[2]) & (last_round[2,5] == bid_vector[2])) {
        number <- number + (0 * 4^1)
      }

      #this is the function that looks at the last three rounds and adds it up to a number which is used to find the bid in the string
      if (nrow(last_round) < 3) {
      } else if ((last_round[1,1] == self$id) & (last_round[1,2] == self$opponent_id) & (last_round[2,4] == bid_vector[1]) & (last_round[2,5] == bid_vector[1])) {
        number <- number + (3 * 4^0)
      } else if ((last_round[1,1] == self$id) & (last_round[1,2] == self$opponent_id) & (last_round[1,4] == bid_vector[2]) & (last_round[1,5] == bid_vector[1])) {
        number <- number + (1 * 4^0)
      } else if ((last_round[1,1] == self$id) & (last_round[1,2] == self$opponent_id) & (last_round[1,4] == bid_vector[1]) & (last_round[1,5] == bid_vector[2])) {
        number <- number + (2 * 4^0)
      } else if ((last_round[1,1] == self$id) & (last_round[1,2] == self$opponent_id) & (last_round[1,4] == bid_vector[2]) & (last_round[1,5] == bid_vector[2])) {
        number <- number + (0 * 4^0)
      } else if ((last_round[1,1] == self$opponent_id) & (last_round[1,2] == self$id) & (last_round[1,4] == bid_vector[1]) & (last_round[1,5] == bid_vector[1])) {
        number <- number + (3 * 4^0)
      } else if ((last_round[1,1] == self$opponent_id) & (last_round[1,2] == self$id) & (last_round[1,4] == bid_vector[2]) & (last_round[1,5] == bid_vector[1])) {
        number <- number + (2 * 4^0)
      } else if ((last_round[1,1] == self$opponent_id) & (last_round[1,2] == self$id) & (last_round[1,4] == bid_vector[1]) & (last_round[1,5] == bid_vector[2])) {
        number <- number + (1 * 4^0)
      } else if ((last_round[1,1] == self$opponent_id) & (last_round[1,2] == self$id) & (last_round[1,4] == bid_vector[2]) & (last_round[1,5] == bid_vector[2])) {
        number <- number + (0 * 4^0)
      }

      #this function uses the number to see what letter is at that index in the provided string
      if (nrow(last_round) < 3) {
      } else if (axelrod_seq[number] == "D") {
        self$bid <- bid_vector[2]
      } else if (axelrod_seq[number] == "C") {
        self$bid <- bid_vector[1]
      }

    },

    formulate_bid = function() {
      self$get_bid()
    }
  )
)

```

This agent should beat tit for tat according to the paper by Mittal and Ded, however this hasn't been proven yet. It's optimised for some agents of last year, so it might be inefficient for the agent of this year.

### Comments on process and code
We started off on our process by reading the paper by Mittal and Deb. From this paper we got the idea to create our own 64 bit string. To make this we ran into some problems. We used the data from last year to begin with, but this didn't seem to work that well with some agents we ran together with our own agent. After tweaking our string we eventually went with the string from the paper. Our agent also has a part which is tit for tat. This is created for in the beginning, but we eventually went for defecting the first three times we meet a new agent. After these three meetings we can use the string.

Furthermore, we also have a part of the code which checks what the greeting is of the agent. This is because of the greeting for the all defect agents, they say "Lemon!.". Unfortunately this part of the code wasn't working to during our testing fase. therefore we were forced to exclude this from our delivered code.

Our first agent had different functions which were called by the get_bid function, but the provided tournament code did not work with this agent. Therefore we were forced to put all the functions in one large function. With further testing we discovered that other parts of the code that worked separately, didn't work with the provided tournament code. This let us to change our code even more.

## References
[1] S. Mittal and K. Deb, “Optimal Strategies of the Iterated Prisoner’s Dilemma Problem for Multiple Conflicting Objectives,” IEEE Trans. Evol. Comput., vol. 13, no. 3, pp. 554–565, Jun. 2009.

[2] R. Axelrod, “The evolution of strategies in the iterated prisoner’s dilemma,” in Genetic Algorithms and Simulated Annealing, L. Davis, Ed. Los Altos, CA: Morgan Kaufmann, 1987.