
Chapter 2 · Lesson 2.3

Chain rule on a graph

An input can affect an output through more than one path. The chain rule tells us how those paths combine into one total effect.

Lesson 2.1 gave us a graph of dependencies. Lesson 2.2 gave each operation a local rule. This lesson puts those two pieces together.

The chain rule on a graph says:

  • along one path, multiply the local sensitivities
  • across multiple paths, add the path contributions
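
The two rules can be sketched in a few lines of plain Python. The path lists and their numbers below are made up purely for illustration and do not come from the lesson's graph.

```python
import math

# Hypothetical local sensitivities, listed edge by edge along each path.
# These numbers are illustrative only, not taken from the lesson's graph.
path_one = [3.0, 0.5]
path_two = [2.0, -1.0]

# Rule 1: along one path, multiply the local sensitivities.
contributions = [math.prod(path) for path in (path_one, path_two)]

# Rule 2: across paths, add the path contributions.
total = sum(contributions)

print(contributions)  # [1.5, -2.0]
print(total)          # -0.5
```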

One input, one output, more than one path

Start with the small graph we already know:

a = Value(2.0)
b = Value(-3.0)
 
c = a * b
d = a + b
e = c + d

The full expression is:

$$e = (a \cdot b) + (a + b)$$

Now ask a bigger question than Lesson 2.2 asked: how does a affect e? Lesson 2.2 only asked about direct neighbors, such as how a affects c.

Here, a and e are not directly connected. There are two paths from a to e, so one local derivative is no longer enough. We need the total effect of a on e:

$$\begin{aligned} a &\to c \to e \\ a &\to d \to e \end{aligned}$$

First, do one path

Take the path a → c → e. We already know the local pieces:

$$\frac{\partial c}{\partial a} = b, \qquad \frac{\partial e}{\partial c} = 1$$

Multiply them to get the contribution from this path:

$$\frac{\partial e}{\partial c} \cdot \frac{\partial c}{\partial a} = 1 \cdot b = b$$

At our current values, b is -3, so this path contributes -3. This is the chain rule in its simplest graph form: if influence travels through a path, multiply the local sensitivities along that path.
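
One way to sanity-check a single path's contribution is to nudge a and let the change reach e only through c, holding d fixed. The sketch below uses plain floats instead of the course's Value class so that it runs on its own. The step size h and the helper name e_through_c_only are assumptions made for this example.

```python
a, b = 2.0, -3.0
h = 1e-6  # small nudge; an arbitrary choice for this sketch

# Hold d at its current value so the path through d is switched off.
d_frozen = a + b

def e_through_c_only(a_val):
    """Recompute e while letting a_val reach e only via c = a * b."""
    c = a_val * b
    return c + d_frozen

# Finite-difference estimate of the contribution along a -> c -> e.
path_c = (e_through_c_only(a + h) - e_through_c_only(a)) / h
print(path_c)  # close to b = -3.0, up to floating-point error
```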

Then, do the other path

Now take the second path a → d → e. Its local pieces are:

$$\frac{\partial d}{\partial a} = 1, \qquad \frac{\partial e}{\partial d} = 1$$

Its contribution is:

$$\frac{\partial e}{\partial d} \cdot \frac{\partial d}{\partial a} = 1 \cdot 1 = 1$$

We now have both path contributions from a to e:

Path contributions

$$\text{through } c: \quad \frac{\partial e}{\partial c} \cdot \frac{\partial c}{\partial a} = 1 \cdot b = -3$$
$$\text{through } d: \quad \frac{\partial e}{\partial d} \cdot \frac{\partial d}{\partial a} = 1 \cdot 1 = +1$$

Total effect means add the path contributions

Both paths are real. Both are ways a changes e. So the total derivative is the sum of both contributions.

$$\frac{\partial e}{\partial a} = \left(\frac{\partial e}{\partial c} \cdot \frac{\partial c}{\partial a}\right) + \left(\frac{\partial e}{\partial d} \cdot \frac{\partial d}{\partial a}\right) = (1 \cdot b) + (1 \cdot 1) = b + 1 = -2$$
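
The summed result can be compared against a direct numerical estimate: nudge a slightly with both paths live and watch how much e moves. This is a minimal sketch with plain floats rather than the course's Value class. The function name e_of and the step size h are assumptions for illustration.

```python
a, b = 2.0, -3.0
h = 1e-6  # small nudge; an arbitrary choice for this sketch

def e_of(a_val, b_val):
    """Evaluate the whole graph: c = a * b, d = a + b, e = c + d."""
    c = a_val * b_val
    d = a_val + b_val
    return c + d

# Finite-difference estimate: both paths respond to the nudge at once.
numeric = (e_of(a + h, b) - e_of(a, b)) / h

# Chain-rule result: contribution through c plus contribution through d.
analytic = (1 * b) + (1 * 1)

print(analytic)  # -2.0
print(numeric)   # close to -2.0, up to floating-point error
```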

Check the other input too

Now do the same for b. Again, there are two paths from b to e:

$$\begin{aligned} b &\to c \to e \\ b &\to d \to e \end{aligned}$$

For the path b → c → e, the local pieces and their product are:

$$\frac{\partial c}{\partial b} = a, \qquad \frac{\partial e}{\partial c} = 1$$
$$\frac{\partial e}{\partial c} \cdot \frac{\partial c}{\partial b} = 1 \cdot a = a$$

For the path b → d → e, the local pieces and their product are:

$$\frac{\partial d}{\partial b} = 1, \qquad \frac{\partial e}{\partial d} = 1$$
$$\frac{\partial e}{\partial d} \cdot \frac{\partial d}{\partial b} = 1 \cdot 1 = 1$$

These are the two contributions from b to e:

Path contributions

$$\text{through } c: \quad \frac{\partial e}{\partial c} \cdot \frac{\partial c}{\partial b} = 1 \cdot a = +2$$
$$\text{through } d: \quad \frac{\partial e}{\partial d} \cdot \frac{\partial d}{\partial b} = 1 \cdot 1 = +1$$

Add them:

$$\frac{\partial e}{\partial b} = a + 1 = 2 + 1 = 3$$

With both inputs worked out, the graph gives two total derivatives:

$$\frac{\partial e}{\partial a} = -2, \qquad \frac{\partial e}{\partial b} = 3$$
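
The same bookkeeping can be written as a small program: store each edge's local derivative, list every path from an input to e, multiply along each path, and add the results. This is only a sketch under stated assumptions. The edge table local and the helpers paths and total_derivative are hypothetical names, and plain floats stand in for the course's Value class.

```python
import math

a, b = 2.0, -3.0

# Hypothetical edge table: local[(src, dst)] holds d(dst)/d(src) at the current values.
local = {
    ("a", "c"): b,    # c = a * b, so dc/da = b
    ("b", "c"): a,    # c = a * b, so dc/db = a
    ("a", "d"): 1.0,  # d = a + b
    ("b", "d"): 1.0,
    ("c", "e"): 1.0,  # e = c + d
    ("d", "e"): 1.0,
}

def paths(src, dst):
    """Yield each path from src to dst as a list of edges."""
    if src == dst:
        yield []
        return
    for (u, v) in local:
        if u == src:
            for rest in paths(v, dst):
                yield [(u, v)] + rest

def total_derivative(src, dst):
    """Multiply local derivatives along each path, then add across paths."""
    return sum(math.prod(local[edge] for edge in path) for path in paths(src, dst))

print(total_derivative("a", "e"))  # -2.0
print(total_derivative("b", "e"))  # 3.0
```

Listing every path works for a graph this small, but the number of paths can grow quickly as graphs get larger, so treat this as an illustration of the rule rather than an efficient method.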

Tiny checkpoint

Consider the same graph, but with different input values.

x = Value(4.0)
y = Value(-1.0)
 
p = x * y
q = x + y
r = p + q

Answer before revealing:

  1. How many paths are there from x to r?
  2. What is the contribution through p?
  3. What is the contribution through q?
  4. What is the total derivative dr/dx?
  5. What is the total derivative dr/dy?

Answers:

  1. Two paths.
  2. Through p: y = -1.
  3. Through q: 1.
  4. dr/dx = y + 1 = -1 + 1 = 0.
  5. dr/dy = x + 1 = 4 + 1 = 5.

The interesting result is dr/dx = 0. The path through p contributes -1, the path through q contributes +1, and the two cancel. At these values, a small nudge to x leaves r unchanged.
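
The checkpoint answers can be checked numerically by nudging each input and measuring how r responds. As before, this sketch uses plain floats rather than the course's Value class, and the function name r_of and the step size h are assumptions.

```python
x, y = 4.0, -1.0
h = 1e-6  # small nudge; an arbitrary choice for this sketch

def r_of(x_val, y_val):
    """Evaluate the checkpoint graph: p = x * y, q = x + y, r = p + q."""
    p = x_val * y_val
    q = x_val + y_val
    return p + q

# Finite-difference estimates of the two total derivatives.
dr_dx = (r_of(x + h, y) - r_of(x, y)) / h
dr_dy = (r_of(x, y + h) - r_of(x, y)) / h

print(dr_dx)  # close to 0: the two paths from x cancel
print(dr_dy)  # close to 5
```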