- Paper in NIPS 2016
- Authors' code
- Some TensorFlow utilities from OpenAI Baselines.
- Define a network, and get its gradients and variables, e.g.,

```python
def network():
    '''
    Define the target density and return gradients and variables.
    '''
    return gradients, variables
```
- Define a gradient descent optimizer, e.g.,

```python
def make_gradient_optimizer():
    return tf.train.GradientDescentOptimizer(learning_rate=0.01)
```
- Build multiple networks (particles) by calling `network()` repeatedly, and collect all of their gradients and variables in `grads_list` and `vars_list` (see the end-to-end sketch after this list).
- Make the SVGD optimizer, e.g.,

```python
optimizer = SVGD(grads_list, vars_list, make_gradient_optimizer)
```
- In the training phase, `optimizer.update_op` performs a single SVGD update, e.g.,

```python
sess = tf.Session()
sess.run(optimizer.update_op, feed_dict={X: x, Y: y})
```
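Putting the steps above together, here is a minimal end-to-end sketch. It assumes `network()` and `make_gradient_optimizer()` are defined as above and that the `SVGD` class from this repository is in scope; the particle count, the per-particle variable scopes, and the number of training steps are illustrative choices, not part of the repository's API.

```python
import tensorflow as tf

n_particles = 50  # illustrative choice

grads_list, vars_list = [], []
for i in range(n_particles):
    # Give each particle its own variable scope so that repeated calls to
    # network() create independent copies of the variables.
    with tf.variable_scope('particle_{}'.format(i)):
        gradients, variables = network()
    grads_list.append(gradients)
    vars_list.append(variables)

# SVGD is the optimizer class provided by this repository.
optimizer = SVGD(grads_list, vars_list, make_gradient_optimizer)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        # One run of update_op moves every particle by a single SVGD step.
        # If network() uses placeholders, feed them here,
        # e.g., feed_dict={X: x, Y: y}.
        sess.run(optimizer.update_op)
```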
- The goal of this problem is to match the target density p(x) (a mixture of two Gaussians) by moving particles that are initially sampled from another distribution q(x). For details, see the experiment section of the authors' paper. A concrete example of `network()` for this target is sketched below.
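As a concrete, illustrative instance of `network()` for this toy problem, one can write the unnormalized log density of a two-component Gaussian mixture directly and differentiate it with `tf.gradients`. The particular weights, means, and the initializer standing in for sampling from q(x) below are my own assumptions, chosen to resemble the paper's toy example.

```python
import tensorflow as tf

def network():
    '''
    Illustrative target for the 1D toy problem: the (unnormalized) log density
    of a two-component Gaussian mixture. Returns the gradient of log p(x) with
    respect to the particle variable, and the variable itself.
    '''
    # The initializer plays the role of sampling the particle from q(x).
    x = tf.get_variable('x', initializer=tf.random_normal([], mean=-10.0))
    # p(x) is proportional to 1/3 * N(x; -2, 1) + 2/3 * N(x; 2, 1); the common
    # normalizing constant is dropped because it does not affect grad log p.
    p = (1.0 / 3.0) * tf.exp(-0.5 * tf.square(x + 2.0)) \
        + (2.0 / 3.0) * tf.exp(-0.5 * tf.square(x - 2.0))
    gradients = tf.gradients(tf.log(p), [x])
    return gradients, [x]
```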
- I got the following result:
- Note that I compared my implementation with the authors' implementation and checked that the results are the same.
- In this example, we want to classify binary data using multiple neural network classifiers. I checked how SVGD differs from the ensemble method in this example, and I made a PDF file with the detailed mathematical derivations.
- I got the following results:
- Thus, ensemble methods push the particles to classify samples strongly, whereas SVGD draws particles that characterize the posterior distribution.
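This contrast can be made explicit by comparing the two update rules (a sketch in my own notation: w_i are the parameters of the i-th classifier, D is the training data, k is a kernel such as the RBF kernel, and epsilon is the step size):

```latex
% Independent ensemble: each particle ascends its own log posterior,
% so every particle drifts toward a mode and classifies confidently.
w_i \leftarrow w_i + \epsilon \, \nabla_{w_i} \log p(w_i \mid D)

% SVGD: particles share gradient information through the kernel and are
% pushed apart by the repulsive term \nabla_{w_j} k(w_j, w_i), so they
% spread out and characterize the posterior p(w \mid D).
w_i \leftarrow w_i + \frac{\epsilon}{n} \sum_{j=1}^{n}
    \left[ k(w_j, w_i) \, \nabla_{w_j} \log p(w_j \mid D)
           + \nabla_{w_j} k(w_j, w_i) \right]
```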