Covers: theory of the chain rule of calculus

- Examples for chain rule application

Jim Lambers

- Objectives
  - This list helps you understand how the backpropagation algorithm is used to calculate the gradient of the loss function with respect to the parameters of the model.
- Potential Use Cases
  - Mathematical foundations behind deep learning
- Who Is This For?
  - INTERMEDIATE: People interested in knowing how deep learning model training works.

Resources

VIDEO 1. Intuitive understanding of Backward Propagation

- What is Forward Propagation?
- What is Backward Propagation?

10 minutes

ARTICLE 2. Backpropagation as Reverse Mode Differentiation

- What is Forward Mode Differentiation?
- What is Reverse Mode Differentiation, and how is backpropagation a special case of it?
- An easy-to-understand example that distinguishes the two modes.

20 minutes
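
To make the forward-mode versus reverse-mode distinction concrete, here is a minimal sketch in plain Python. The function f(x, y) = x*y + sin(x) and the helper names `forward_mode` and `reverse_mode` are illustrative assumptions, not taken from the listed article. Forward mode needs one pass per input to get each partial derivative, while reverse mode recovers all partials from a single backward pass, which is why it suits a scalar loss with many parameters.

```python
import math

# Hypothetical example function (an assumption for illustration):
#   f(x, y) = x * y + sin(x)

def forward_mode(x, y, dx, dy):
    """Carry a tangent (dx, dy) forward alongside the values.

    One call yields the derivative along a single input direction.
    """
    a, da = x * y, dx * y + x * dy          # product rule
    b, db = math.sin(x), math.cos(x) * dx   # chain rule through sin
    return a + b, da + db

def reverse_mode(x, y):
    """Run the function forward, then push df/df = 1 backward.

    One call yields the partial derivatives for every input.
    """
    # Forward pass: compute the intermediate values and the output.
    a = x * y
    b = math.sin(x)
    f = a + b
    # Backward pass: f = a + b, so df/da = df/db = 1.
    df_da = 1.0
    df_db = 1.0
    df_dx = df_da * y + df_db * math.cos(x)  # x feeds both a and b
    df_dy = df_da * x                        # y feeds only a
    return f, (df_dx, df_dy)

x, y = 2.0, 3.0
_, dfdx_fwd = forward_mode(x, y, 1.0, 0.0)  # seed the x direction
_, dfdy_fwd = forward_mode(x, y, 0.0, 1.0)  # seed the y direction
f_val, (dfdx_rev, dfdy_rev) = reverse_mode(x, y)
```

Both modes should agree on the partial derivatives; the difference is only in how many passes are needed when the number of inputs grows.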

ARTICLE 3. Algorithm for backpropagation

- Pseudocode for backpropagation

20 minutes

VIDEO 4. Intuitive understanding of Total differential

- Total differential as linear approximation around a point

10 minutes

ARTICLE 5. Mathematical definition of Total derivative

- Definition of Total derivative
- Examples of calculating Total derivative

10 minutes

ARTICLE 6. Multi Variable Chain Rule

- What is the multivariable chain rule?

10 minutes

ARTICLE 7. Total differential and Chain rule

- How is total differential related to total derivative?

20 minutes
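
As a rough illustration of how the total differential leads to the total derivative, the sketch below uses the standard two-variable formulas. The concrete function $f(x, y) = x^2 y$ with $x = t$ and $y = t^2$ is an assumed example, not one taken from the listed article.

```latex
% Total differential of z = f(x, y):
dz = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy

% If x = x(t) and y = y(t), dividing by dt gives the total derivative
% (the multivariable chain rule):
\frac{dz}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt}
              + \frac{\partial f}{\partial y}\frac{dy}{dt}

% Assumed example: f(x, y) = x^2 y, x = t, y = t^2
\frac{dz}{dt} = (2xy)(1) + (x^2)(2t) = 2t^3 + 2t^3 = 4t^3
```

Substituting directly gives $z = t^4$, whose derivative is also $4t^3$, so the two routes agree.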

VIDEO 8. Forward Propagation in a Deep Network

- What does it mean to perform forward propagation in a neural network?

10 minutes

ARTICLE 9. [Long Read] Detailed description of BackPropagation

- Detailed description of backpropagation
- Code for backpropagation

60 minutes

ARTICLE 10. [Long Read] Chain rule

- Examples for chain rule application

30 minutes

ARTICLE 11. [Long Read] The Matrix Calculus You Need For Deep Learning

- Detailed understanding of how to apply matrix calculus to calculate the gradient of the loss function using the chain rule

60 minutes
