Performs a backpropagation step through the module, with respect to the given input. In general this method makes the assumption forward(input) has been called before, with the same input. This is necessary for optimization reasons. If you do not respect this rule, backward() will compute incorrect gradients.

In general input and gradOutput and gradInput are Tensors. However, some special sub-classes like table layers might expect something else. Please, refer to each module specification for further information.

A backpropagation step consist in computing two kind of gradients at input given gradOutput (gradients with respect to the output of the module).

  1. Computing the gradient of the module with respect to its own input. This is returned in gradInput. Also, the gradInput state variable is updated accordingly.
  2. Computing the gradient of the module with respect to its ownparameters. Many modules do not perform this step as they do not have any parameters. The state variable name for the parameters is module dependent. The module is expected to accumulate the gradients with respect to the parameters in some variable. Zeroing this accumulation is achieved with zeroGradParameters() and updating the parameters according to this accumulation is done with updateParameters().