action-value-methods

Implementation of greedy, ε-greedy and softmax methods for the n-armed bandit problem.

Requirements

numpy
matplotlib
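
As a rough illustration of the three action-selection rules named in the description, here is a minimal sketch using numpy. It is not the repository's actual code: the function names (`greedy`, `epsilon_greedy`, `softmax_probs`, `softmax`, `run_bandit`), the parameters `epsilon` and `tau`, the Gaussian reward model, and the sample-average value update are all assumptions made for the example.

```python
import numpy as np


def greedy(q):
    """Pick the arm with the highest estimated value (first index on ties)."""
    return int(np.argmax(q))


def epsilon_greedy(q, epsilon, rng):
    """With probability epsilon pick a uniformly random arm, else act greedily."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q)))
    return greedy(q)


def softmax_probs(q, tau):
    """Boltzmann (softmax) distribution over arms with temperature tau > 0."""
    z = (q - np.max(q)) / tau  # subtracting the max avoids overflow in exp
    e = np.exp(z)
    return e / e.sum()


def softmax(q, tau, rng):
    """Sample an arm from the softmax distribution over estimated values."""
    return int(rng.choice(len(q), p=softmax_probs(q, tau)))


def run_bandit(select, n_arms=10, steps=1000, seed=0):
    """Run one bandit task (assumed setup: unit-variance Gaussian rewards)."""
    rng = np.random.default_rng(seed)
    true_values = rng.normal(0.0, 1.0, n_arms)  # hidden mean reward per arm
    q = np.zeros(n_arms)                        # action-value estimates
    counts = np.zeros(n_arms, dtype=int)        # pulls per arm
    rewards = np.empty(steps)
    for t in range(steps):
        a = select(q, rng)
        r = rng.normal(true_values[a], 1.0)
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]          # incremental sample average
        rewards[t] = r
    return q, counts, rewards
```

A selection rule is passed in as a function of the current estimates and the random generator, for example `run_bandit(lambda q, rng: epsilon_greedy(q, 0.1, rng))` or `run_bandit(lambda q, rng: softmax(q, 0.5, rng))`. The returned `rewards` array can then be averaged over many runs and plotted with matplotlib to compare the methods.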