Q-learning now has built-in generalisation, using gradient descent over a linear combination of policy variables. Also add another sample, falling stones,
to demonstrate how generalisation is used.
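
For readers unfamiliar with the technique, here is a minimal sketch of a gradient-descent Q-learning update over a linear combination of variables. It is not the code from this change: the names `q_value` and `q_update`, the feature-list representation, and the default `alpha` and `gamma` values are all assumptions made for illustration.

```python
def q_value(weights, features):
    """Approximate Q(s, a) as a weighted sum of the feature values."""
    return sum(w * x for w, x in zip(weights, features))


def q_update(weights, features, reward, next_features, alpha=0.1, gamma=0.9):
    """Return new weights after one gradient-descent Q-learning step.

    features      -- feature values of the (state, action) pair just taken
    next_features -- one feature list per action available in the next
                     state; empty if the next state is terminal
    alpha, gamma  -- learning rate and discount factor (assumed defaults)
    """
    # Greedy estimate of the next state's value (0.0 for a terminal state).
    best_next = max((q_value(weights, f) for f in next_features), default=0.0)
    # Temporal-difference error between the target and the current estimate.
    td_error = reward + gamma * best_next - q_value(weights, features)
    # For a linear approximator the gradient w.r.t. each weight is the feature.
    return [w + alpha * td_error * x for w, x in zip(weights, features)]


weights = q_update([0.0, 0.0], [1.0, 0.0], reward=1.0, next_features=[])
```

Because the weights are shared across all states, an update made in one state also shifts the estimates for every other state with overlapping features, which is where the generalisation comes from.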