Use TensorFlow to Compute Gradients

Posted by : (Sep 24, 2017)

In most TensorFlow tutorials, we call minimize(loss) to update the model's parameters automatically.

In fact, minimize() combines two steps: computing the gradients, and applying those gradients to update the parameters.

Let’s take a look at an example:

\[Y = (100 - 3W - B)^2\]

What are the gradients of Y with respect to W and B when W=1.0, B=1.0?

We can calculate them by hand:

Let \(N = 100 - 3W - B\), so that \(Y = N^2\). At W=1.0, B=1.0 this gives \(N = 96\).

\[\frac{\partial{Y}}{\partial{W}} = \frac{\partial{Y}}{\partial{N}} \cdot \frac{\partial{N}}{\partial{W}} = 2N \cdot (-3) = -600 + 18W + 6B = -576\]
\[\frac{\partial{Y}}{\partial{B}} = \frac{\partial{Y}}{\partial{N}} \cdot \frac{\partial{N}}{\partial{B}} = 2N \cdot (-1) = -200 + 6W + 2B = -192\]
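A hand derivation like this is easy to get wrong by a sign, so a quick numerical check can help. The sketch below uses plain Python and central finite differences, with no TensorFlow. The helper names `y` and `central_difference` and the step size `h` are illustrative choices, not part of the original post. Because Y is quadratic in W and B, the central difference should land very close to -576 and -192; the sign is negative because N decreases as W or B grows.

```python
def y(w, b):
    """The example function from the text: Y = (100 - 3W - B)^2."""
    return (100.0 - 3.0 * w - b) ** 2


def central_difference(f, w, b, h=1e-4):
    """Approximate (dY/dW, dY/dB) at (w, b) using central differences."""
    dw = (f(w + h, b) - f(w - h, b)) / (2.0 * h)
    db = (f(w, b + h) - f(w, b - h)) / (2.0 * h)
    return dw, db


dw, db = central_difference(y, 1.0, 1.0)
print(round(dw, 3), round(db, 3))
```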

OK, now let's use TensorFlow to compute the same thing (the code below uses the TensorFlow 1.x graph API):

import tensorflow as tf

# make an example:
# Y = (100 - W X - B)^2
X = tf.constant(3.)
W = tf.Variable(1.)
B = tf.Variable(1.)
Y = tf.square(100 - W*X - B)

# The learning rate has no effect on computing the gradients; it only matters when applying them.
Ops = tf.train.GradientDescentOptimizer(learning_rate=0.001)
grads_and_vars = Ops.compute_gradients(Y)
# We could modify the gradients here, and then apply them:
# Op_update = Ops.apply_gradients(grads_and_vars)

with tf.Session() as sess:

    sess.run(tf.global_variables_initializer())
    print(sess.run(grads_and_vars))

Run it, and we get a list of (gradient, variable value) pairs, first for W and then for B:

[(-576.0, 1.0), (-192.0, 1.0)]
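To see what the second step would do with these values, here is a minimal plain-Python sketch of a gradient descent update, assuming the usual rule new_value = value - learning_rate * gradient. The helper `sgd_step` is a hypothetical name used only for illustration; it stands in for the idea behind apply_gradients and is not TensorFlow's implementation.

```python
def sgd_step(grads_and_vars, learning_rate):
    """Hypothetical helper: one plain gradient descent update per (gradient, value) pair."""
    return [value - learning_rate * grad for grad, value in grads_and_vars]


# The (gradient, variable value) pairs printed above, for W and B.
grads_and_vars = [(-576.0, 1.0), (-192.0, 1.0)]

new_w, new_b = sgd_step(grads_and_vars, learning_rate=0.001)
print(new_w, new_b)
```

With a learning rate of 0.001, W moves from 1.0 to about 1.576 and B to about 1.192. Both increase because their gradients are negative.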

So the next time your professor asks you to implement back-propagation for some complex network by yourself, this trick can help you double-check your implementation. Hooray!