Objective: use the model subclassing API together with custom layers to build a residual network architecture. Train the custom model on the MNIST dataset with a custom training loop, using the automatic differentiation tools in TensorFlow to compute the gradients for backpropagation.
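
As a rough orientation, the objective could be approached along the lines of the sketch below. It is one possible layout under stated assumptions, not the reference solution: the names `ResidualBlock`, `ResNetModel`, `train_step`, `train`, and `load_mnist` are hypothetical, and the filter count, number of blocks, optimizer, and learning rate are illustrative choices that the objective does not specify. The gradients are computed with `tf.GradientTape`, TensorFlow's automatic differentiation API.

```python
import tensorflow as tf


class ResidualBlock(tf.keras.layers.Layer):
    """Custom layer (hypothetical name): two 3x3 convolutions plus an identity skip."""

    def __init__(self, filters, **kwargs):
        super().__init__(**kwargs)
        self.conv1 = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")
        self.conv2 = tf.keras.layers.Conv2D(filters, 3, padding="same")

    def call(self, inputs):
        x = self.conv1(inputs)
        x = self.conv2(x)
        # Identity shortcut: the input must already have `filters` channels.
        return tf.nn.relu(x + inputs)


class ResNetModel(tf.keras.Model):
    """Subclassed model (hypothetical name): conv stem, two residual blocks, linear head."""

    def __init__(self, filters=16, num_classes=10, **kwargs):
        super().__init__(**kwargs)
        self.stem = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")
        self.block1 = ResidualBlock(filters)
        self.block2 = ResidualBlock(filters)
        self.pool = tf.keras.layers.GlobalAveragePooling2D()
        self.classifier = tf.keras.layers.Dense(num_classes)  # outputs logits

    def call(self, inputs):
        x = self.stem(inputs)
        x = self.block1(x)
        x = self.block2(x)
        return self.classifier(self.pool(x))


loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)


def train_step(model, optimizer, images, labels):
    """Run one optimization step and return the batch loss."""
    with tf.GradientTape() as tape:
        logits = model(images, training=True)
        loss = loss_fn(labels, logits)
    # Automatic differentiation: d(loss)/d(variable) for every trainable variable.
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


def train(model, dataset, epochs=1, learning_rate=1e-3):
    """Custom training loop; returns the mean loss of each epoch."""
    optimizer = tf.keras.optimizers.Adam(learning_rate)  # assumed optimizer choice
    history = []
    for _ in range(epochs):
        mean_loss = tf.keras.metrics.Mean()
        for images, labels in dataset:
            mean_loss.update_state(train_step(model, optimizer, images, labels))
        history.append(float(mean_loss.result()))
    return history


def load_mnist(batch_size=64):
    """Load the MNIST training split as a shuffled, batched tf.data.Dataset."""
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train[..., None].astype("float32") / 255.0  # shape (N, 28, 28, 1), range [0, 1]
    y_train = y_train.astype("int64")
    dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    return dataset.shuffle(10_000).batch(batch_size)
```

With these definitions, a call such as `history = train(ResNetModel(), load_mnist(), epochs=2)` would download MNIST on first use and return one mean loss value per epoch. The loop runs eagerly to keep the sketch short; wrapping `train_step` in `tf.function` is a common optimization but is not required by the objective.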