Module: Chainer

Defined in:
lib/chainer/function_node.rb,
lib/chainer.rb,
lib/chainer/cuda.rb,
lib/chainer/link.rb,
lib/chainer/device.rb,
lib/chainer/backend.rb,
lib/chainer/version.rb,
lib/chainer/function.rb,
lib/chainer/reporter.rb,
lib/chainer/variable.rb,
lib/chainer/optimizer.rb,
lib/chainer/parameter.rb,
lib/chainer/serializer.rb,
lib/chainer/utils/conv.rb,
lib/chainer/utils/math.rb,
lib/chainer/initializer.rb,
lib/chainer/utils/array.rb,
lib/chainer/configuration.rb,
lib/chainer/testing/array.rb,
lib/chainer/training/util.rb,
lib/chainer/variable_node.rb,
lib/chainer/datasets/cifar.rb,
lib/chainer/datasets/mnist.rb,
lib/chainer/gradient_check.rb,
lib/chainer/hyperparameter.rb,
lib/chainer/utils/variable.rb,
lib/chainer/dataset/convert.rb,
lib/chainer/gradient_method.rb,
lib/chainer/optimizers/adam.rb,
lib/chainer/dataset/iterator.rb,
lib/chainer/training/trainer.rb,
lib/chainer/training/updater.rb,
lib/chainer/initializers/init.rb,
lib/chainer/utils/initializer.rb,
lib/chainer/functions/math/exp.rb,
lib/chainer/functions/math/sum.rb,
lib/chainer/training/extension.rb,
lib/chainer/initializers/normal.rb,
lib/chainer/serializers/marshal.rb,
lib/chainer/functions/array/cast.rb,
lib/chainer/initializers/uniform.rb,
lib/chainer/initializers/constant.rb,
lib/chainer/datasets/tuple_dataset.rb,
lib/chainer/links/model/classifier.rb,
lib/chainer/functions/array/reshape.rb,
lib/chainer/functions/array/squeeze.rb,
lib/chainer/functions/math/identity.rb,
lib/chainer/functions/noise/dropout.rb,
lib/chainer/links/connection/linear.rb,
lib/chainer/optimizers/momentum_sgd.rb,
lib/chainer/functions/array/rollaxis.rb,
lib/chainer/functions/activation/relu.rb,
lib/chainer/functions/activation/tanh.rb,
lib/chainer/functions/array/transpose.rb,
lib/chainer/functions/math/basic_math.rb,
lib/chainer/iterators/serial_iterator.rb,
lib/chainer/links/connection/embed_id.rb,
lib/chainer/training/standard_updater.rb,
lib/chainer/training/triggers/interval.rb,
lib/chainer/functions/array/select_item.rb,
lib/chainer/functions/connection/linear.rb,
lib/chainer/functions/activation/sigmoid.rb,
lib/chainer/functions/array/broadcast_to.rb,
lib/chainer/functions/pooling/pooling_2d.rb,
lib/chainer/training/extensions/snapshot.rb,
lib/chainer/functions/connection/embed_id.rb,
lib/chainer/functions/evaluation/accuracy.rb,
lib/chainer/training/extensions/evaluator.rb,
lib/chainer/training/extensions/log_report.rb,
lib/chainer/functions/activation/leaky_relu.rb,
lib/chainer/functions/activation/relu_grad2.rb,
lib/chainer/links/connection/convolution_2d.rb,
lib/chainer/functions/activation/log_softmax.rb,
lib/chainer/functions/pooling/max_pooling_2d.rb,
lib/chainer/training/extensions/print_report.rb,
lib/chainer/training/extensions/progress_bar.rb,
lib/chainer/functions/activation/sigmoid_grad.rb,
lib/chainer/functions/loss/mean_squared_error.rb,
lib/chainer/functions/connection/convolution_2d.rb,
lib/chainer/functions/loss/softmax_cross_entropy.rb,
lib/chainer/functions/pooling/average_pooling_2d.rb,
lib/chainer/functions/connection/deconvolution_2d.rb,
lib/chainer/training/extensions/exponential_shift.rb,
lib/chainer/links/normalization/batch_normalization.rb,
lib/chainer/functions/connection/convolution_2d_grad_w.rb,
lib/chainer/functions/normalization/batch_normalization.rb

Overview

Function node of the computational graph. FunctionNode is a class representing a node in a computational graph. The node corresponds to an application of a differentiable function to input variables. When a differentiable function is applied to `Chainer::Variable` objects, it creates an instance of a FunctionNode implementation and calls its `apply` method. The `apply` method basically does the following three things.

1. Adding an edge from the function node to the variable node corresponding to each input.
   The node of each input is extracted by `Chainer::Variable.node`.
2. Computing the output arrays of the function.
3. Creating a `Chainer::Variable` object for each output array and
   adding an edge from the node of the variable to the function node.

The output variables are then returned.
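
For illustration, a minimal FunctionNode subclass might look like the sketch below. The method names used here (forward, backward, retain_inputs, get_retained_inputs) are assumed to mirror the built-in functions under lib/chainer/functions; treat this as a sketch rather than a canonical implementation.

class Square < Chainer::FunctionNode
  # forward receives an Array of raw arrays (Numo/Cumo) and returns an Array of arrays.
  def forward(inputs)
    x, = inputs
    retain_inputs([0])   # keep x for the backward pass
    [x * x]
  end

  # backward receives the output gradients as Chainer::Variable s and
  # returns the input gradients, also as Chainer::Variable s.
  def backward(indexes, grad_outputs)
    x = get_retained_inputs.first
    gy, = grad_outputs
    [x * gy * 2.0]
  end
end

y, = Square.new.apply([Chainer::Variable.new(Numo::DFloat[1.0, 2.0, 3.0])])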

Defined Under Namespace

Modules: CUDA, Dataset, Datasets, Device, Functions, Initializers, Iterators, Links, Optimizers, ReportService, Serializers, Testing, Training, Utils Classes: AbstractDevice, AbstractSerializer, Chain, ChainList, Configuration, CpuDevice, Deserializer, DictSummary, Function, FunctionAdapter, FunctionNode, GpuDevice, GradientMethod, Hyperparameter, HyperparameterProxy, Initializer, Link, Optimizer, Parameter, Reporter, Serializer, Summary, UpdateRule, Variable, VariableNode, WeightDecay

Constant Summary

VERSION =
"0.4.1"

Class Method Summary

Class Method Details

._as_tuple(x) ⇒ Object



# File 'lib/chainer/gradient_check.rb', line 53

def _as_tuple(x)
  if x.is_a? Array
    return x
  else
    return [x]
  end
end

._copy_arrays(xs) ⇒ Object



# File 'lib/chainer/gradient_check.rb', line 2

def _copy_arrays(xs)
  xs.map{|x| Chainer.array?(x) ? x.dup : x}
end

.array?(obj) ⇒ Boolean

Returns true if the argument is either a Numo::NArray or a Cumo::NArray.

Parameters:

  • obj (Object)

Returns:

  • (Boolean)


# File 'lib/chainer/backend.rb', line 19

def array?(obj)
  if CUDA.available?
    return true if obj.kind_of?(Cumo::NArray)
  end
  return true if obj.kind_of?(Numo::NArray)
  false
end
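
For example (return values illustrated as comments):

Chainer.array?(Numo::DFloat[1, 2, 3])  # => true
Chainer.array?([1, 2, 3])              # => false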

.check_backward(func, x_data, y_grad, params = [], eps: 0.001, atol: 1e-5, rtol: 1e-4, no_grads: nil, dtype: nil) ⇒ Object

Note:

func is called many times to get numerical gradients for all inputs. This function does not work correctly when func behaves randomly, since it would produce different gradients on each call.

Tests the backward procedure of a given function.

This function automatically checks the backward process of the given function. For example, when you have a Chainer::Function class MyFunc that takes two arguments and returns one value, you can write its test like this:

def test_my_func
  func = MyFunc.new
  x1_data = Numo::NArray[...]
  x2_data = Numo::NArray[...]
  gy_data = Numo::NArray[...]
  check_backward(func, [x1_data, x2_data], gy_data)
end

This method creates Chainer::Variable objects from x_data and calls func with those variables to get its result as a Chainer::Variable. Then, it sets the y_grad array to the grad attribute of the result and calls the backward method to get the gradients of the inputs. To check the correctness of these gradients, the function calls numerical_grad to compute the gradients numerically and compares the two results with Chainer::Testing.assert_allclose. If input objects (x1_data and/or x2_data in this example) represent integer variables, their gradients are ignored.

You can simplify a test when MyFunc gets only one argument:

check_backward(func, x1_data, gy_data)

If MyFunc is a loss function that returns a zero-dimensional array, pass nil as gy_data. In this case, the grad attribute of the result is set to 1:

check_backward(my_loss_func, [x1_data, x2_data], nil)

If MyFunc returns multiple outputs, pass all gradients of the outputs as an Array:

gy1_data = Numo::NArray[...]
gy2_data = Numo::NArray[...]
check_backward(func, x1_data, [gy1_data, gy2_data])

You can also test a Chainer::Link. To check the gradients of the link's parameters, pass an Array of the parameters as the params argument:

check_backward(my_link, [x1_data, x2_data], gy_data, [my_link.W, my_link.b])

Note that the elements of params are not Numo::NArray s but Chainer::Variable s.

Proc objects are also acceptable as the func argument:

check_backward(lambda{|x1, x2| f(x1, x2)}, [x1_data, x2_data], gy_data)

Parameters:

  • func (Method, Proc)

    A function that takes Chainer::Variable s and returns Chainer::Variable s. func must return an Array of Chainer::Variable s or a single Chainer::Variable. A Chainer::Function object, a Chainer::Link object, or any callable satisfying this condition can be used.

  • x_data (Numo::NArray or Array<Numo::NArray>)

    A set of Numo::NArray s to be passed to func. If x_data is a single Numo::NArray object, it is treated as [x_data].

  • y_grad (Numo::NArray or Array<Numo::NArray> or nil)

    A set of Numo::NArray s representing the gradients of the return values of func. If y_grad is a single Numo::NArray object, it is treated as [y_grad]. If func is a loss function, y_grad should be set to nil.

  • params (Chainer::Variable or Array<Chainer::Variable>) (defaults to: [])

    A set of Chainer::Variable s whose gradients are checked. When func is a Chainer::Link object, set its parameters as params. If params is a single Chainer::Variable object, it is treated as [params].

  • eps (Float) (defaults to: 0.001)

    Epsilon value to be passed to numerical_grad.

  • atol (Float) (defaults to: 1e-5)

    Absolute tolerance to be passed to Chainer::Testing.assert_allclose.

  • rtol (Float) (defaults to: 1e-4)

    Relative tolerance to be passed to Chainer::Testing.assert_allclose.

  • no_grads (Array<Boolean>) (defaults to: nil)

    Flags to skip variables in the gradient assertion. It should have the same length as x_data.

  • dtype (Numo::NArray.class) (defaults to: nil)

    x_data and y_grad are cast to this dtype when calculating numerical gradients. Only float types and nil are allowed.

See Also:

  • numerical_grad


# File 'lib/chainer/gradient_check.rb', line 147

def check_backward(func, x_data, y_grad, params=[], eps: 0.001, atol: 1e-5, rtol: 1e-4, no_grads: nil, dtype: nil)
  x_data = _as_tuple(x_data)
  xm = Chainer.get_array_module(*x_data)
  if !y_grad.nil?
    y_grad = _as_tuple(y_grad)
  end

  params = _as_tuple(params)
  xs = x_data.map{|x| Chainer::Variable.new(x)}
  y = func.(*xs)
  y = _as_tuple(y)
  y = Chainer::Functions::Math::Identity.new.apply(y)

  y_grad = set_y_grad(y, y_grad)

  # Clear gradients which may exist if func calls backward inside of itself.
  clear_grads(xs)
  clear_grads(params)

  # We only need to call `backward` for one result `Chainer::Variable`.
  # `Chainer::Variable.backward` method calls `Chainer::Function.backward` of its creator.
  y[0].backward()

  param_data = params.map { |p| p.data }
  if dtype.nil?
    casted_xs = x_data.map { |x| Chainer::Variable.new(x) }
  else
    raise '`dtype` is allowed only float type' if dtype != xm::DFloat && dtype != xm::SFloat
    casted_xs = x_data.map { |x| x.is_a?(Numo::NArray) ? Chainer::Variable.new(x.cast_to(dtype)) : x  }
  end

  if no_grads.nil?
    no_grads = xs.map { |x| x.dtype != Numo::SFloat && x.dtype != Numo::DFloat }
  else
    raise "Length of no_grads param and xs should be same." if no_grads.size != xs.size
  end

  casted_data = casted_xs.map { |x| x.data.dup }

  no_grads.zip(xs).each do |skip, x|
    if skip
      raise "x.grad is not nil" unless x.grad.nil?
    else
      raise 'gradients of some arguments are not calculated' if x.grad.nil?
    end
  end

  # Keep the gradient arrays of params which may be overwritten by func
  params_grad = params.map(&:grad)

  if dtype.nil?
    one = Numo::DFloat.new().fill(1.0)
  else
    one = dtype.new().fill(1.0)
  end

  g = lambda do
    # This function is called twice in `numerical_grad`.
    # `one` is `1 + epsilon` or `1 - epsilon` in these calls.
    # See the document of `numerical_grad`.
    no_grads.zip(casted_xs, casted_data).each do |skip, cx, data|
      next if skip || cx.data.empty?
      # cast_to is required to store data with the given type
      data = (one * data).cast_to(data.class)
      cx.data = data
    end

    params.zip(param_data).each do |param, data|
      if !dtype.nil?
        param_dtype = dtype
      else
        param_dtype = param.dtype
      end
      # The inner cast_to is required to compute the multiplication in
      # `param_dtype` when data is a low-precision float.
      # The outer one is required to store data with the given type.
      param.data = (one * data.cast_to(param_dtype)).cast_to(param_dtype)
    end

    # Clear gradients to support func that calls backward inside of itself.
    clear_grads(casted_xs)
    clear_grads(params)

    ys = func.(*casted_xs)
    ys = _as_tuple(ys)
    ys_data = ys.map { |y| y.data }
    no_grads.zip(casted_xs, casted_data).each do |skip, cx, data|
      next if skip
      cx.data = data
    end
    params.zip(param_data).each do |param, data|
      param.data = data
    end
    ys_data
  end

  gx, = numerical_grad(g, [one], y_grad, eps)
  gx_accum = 0

  no_grads.zip(xs, casted_xs).each do |skip, x, cx|
    next if skip
    gxi = x.grad.flatten.dup
    cxi = cx.data.flatten.dup
    unless dtype.nil?
      gxi = gxi.cast_to(dtype)
      cxi = cxi.cast_to(dtype)
    end
    gx_accum += gxi.empty? ? 0 : gxi.dot(cxi)
  end

  params.zip(params_grad).each do |p, gpi|
    gpi = gpi.flatten.dup
    pi = p.data.flatten.dup
    unless dtype.nil?
      gpi = gpi.cast_to(dtype)
      pi = pi.cast_to(dtype)
    end
    gx_accum += gpi.dot(pi)
  end

  Chainer::Testing.assert_allclose(gx, gx_accum, atol: atol, rtol: rtol)
end
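
As a concrete illustration, the following sketch checks the gradients of an element-wise multiplication built from Chainer::Variable arithmetic (shapes and random values are arbitrary and only for illustration):

x1 = Numo::DFloat.new(2, 3).rand
x2 = Numo::DFloat.new(2, 3).rand
gy = Numo::DFloat.new(2, 3).rand

# func receives Chainer::Variable s and returns a Chainer::Variable
func = -> (a, b) { a * b }

Chainer.check_backward(func, [x1, x2], gy, eps: 1e-3, atol: 1e-4, rtol: 1e-4)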

.check_double_backward(func, x_data, y_grad, x_grad_grad, params = [], params_grad_grad = [], eps: 1e-3, atol: 1e-4, rtol: 1e-3, no_grads: nil, dtype: nil) ⇒ Object



# File 'lib/chainer/gradient_check.rb', line 270

def check_double_backward(func, x_data, y_grad, x_grad_grad, params=[], params_grad_grad=[], eps: 1e-3, atol: 1e-4, rtol: 1e-3, no_grads: nil, dtype: nil)
  x_data = _as_tuple(x_data)
  params = _as_tuple(params)
  n_x = x_data.size

  first_order_grad = -> *inputs do
    xs = inputs[0...n_x]
    gys = inputs[n_x..-1]

    y = _as_tuple(func.(*xs))
    # Let all elements of y share the same creator.
    # See the comment in check_backward.
    y = Chainer::Functions::Math::Identity.new.apply(y)
    set_y_grad(y, gys)
    y[0].backward(enable_double_backprop: true)

    xs.map(&:grad_var) + params.map(&:grad_var)
  end

  inputs = x_data + _as_tuple(y_grad)
  grad_grad = _as_tuple(x_grad_grad) + _as_tuple(params_grad_grad)
  check_backward(first_order_grad, inputs, grad_grad, params=params, eps: eps, atol: atol, rtol: rtol, no_grads: no_grads, dtype: dtype)
end
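
As the implementation above shows, check_double_backward wraps the first-order gradient computation in a function and runs check_backward on it, which effectively tests second-order (double) backpropagation. A minimal usage sketch, assuming the function under test supports double backpropagation (values are illustrative):

x   = Numo::DFloat.new(2, 3).rand
gy  = Numo::DFloat.new(2, 3).rand
ggx = Numo::DFloat.new(2, 3).rand

func = -> (a) { a * a }

Chainer.check_double_backward(func, x, gy, ggx)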

.configurationObject



# File 'lib/chainer.rb', line 97

def self.configuration
  @configuration ||= Configuration.new
end

.configure {|configuration| ... } ⇒ Object

Yields:



# File 'lib/chainer.rb', line 93

def self.configure
  yield(configuration)
end
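
For example, configuration attributes can be set inside the block (enable_backprop is an attribute used elsewhere in this module; other attributes depend on the Configuration class):

Chainer.configure do |config|
  config.enable_backprop = false
end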

.get_array_module(*args) ⇒ Class

Gets the appropriate array module, Numo or Cumo, for the given arrays.

Parameters:

  • args (Array<Chainer::Variable> or Array<Numo::NArray> or Array<Cumo::NArray>)

    Values to determine whether Numo or Cumo should be used.

Returns:

  • (Class)

    Cumo or Numo is returned based on the types of the arguments.



# File 'lib/chainer/backend.rb', line 6

def get_array_module(*args)
  arrays = args.map {|v| v.kind_of?(Chainer::Variable) ? v.data : v }
  if CUDA.available?
    return Cumo if arrays.any? {|a| a.kind_of?(Cumo::NArray) }
  end
  return Numo
end
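
For example, the returned module can be used to allocate arrays for the matching backend:

x  = Chainer::Variable.new(Numo::SFloat.new(2, 2).rand)
xm = Chainer.get_array_module(x)  # => Numo (Cumo when a Cumo::NArray is given and CUDA is available)
xm::SFloat.zeros(2, 2)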

.grad(outputs, inputs, grad_outputs: nil, grad_inputs: nil, set_grad: false, retain_grad: false, enable_double_backprop: false) ⇒ Object



# File 'lib/chainer/function_node.rb', line 248

def self.grad(outputs, inputs, grad_outputs: nil, grad_inputs: nil, set_grad: false, retain_grad: false, enable_double_backprop: false)
  # The implementation consists of three steps.

  if !outputs.is_a?(Array)
    raise TypeError, "outputs must be Array, not #{outputs.class}"
  end
  if !inputs.is_a?(Array)
    raise TypeError, "inputs must be Array, not #{inputs.class}"
  end
  if !grad_outputs.nil? && !grad_outputs.is_a?(Array)
    raise TypeError, "grad_outputs must be Array, not #{grad_outputs.class}"
  end
  if !grad_inputs.nil? && !grad_inputs.is_a?(Array)
    raise TypeError, "grad_inputs must be Array, not #{grad_inputs.class}"
  end

  # 1. Backward enumeration: all the nodes reachable backward from the output
  #    nodes are enumerated. The forward direction links are collected in
  #    this step. Note that the variable nodes whose requires_grad is false
  #    are ignored and their creators are not searched.
  candidate_funcs = outputs.map(&:creator_node).compact
  visited_funcs = Set.new
  forward_graph = {}

  while func = candidate_funcs.pop
    next if visited_funcs.include?(func)
    visited_funcs.add(func)

    func.inputs.each do |x|
      next unless x.requires_grad
      forward_graph[x] = [] if forward_graph[x].nil?
      forward_graph[x] << func
      creator = x.creator_node
      if creator && !visited_funcs.include?(creator)
        candidate_funcs << creator
      end
    end
  end

  # 2. Forward enumeration: all the nodes in the subgraph reachable from the
  #    input nodes are enumerated. The extracted (sub-)subgraph is the union
  #    of all paths that backpropagation will visit.
  candidate_vars = inputs.map(&:node)
  visited_funcs = Set.new
  grad_required = Set.new
  while x = candidate_vars.pop
    grad_required.add(x)
    forward_graph[x].each do |func|
      next if visited_funcs.include?(func)
      visited_funcs.add(func)
      func.outputs.each do |y_ref|
        y = y_ref.__getobj__
        if y && forward_graph[y]
          candidate_vars << y
        end
      end
    end
  end

  # 3. Backpropagation: the backpropagation is executed along the
  #    (sub-)subgraph. It uses the topological order of the subgraph which is
  #    induced by the reversed order of function applications ("rank").
  grads = {}  # mapping from variable nodes to their gradients

  # Initialize the gradient mapping.
  grad_outputs = [nil] * outputs.size if grad_outputs.nil?
  outputs.zip(grad_outputs).each do |y, gy|
    if gy.nil?
      gy_data = y.data.new_ones
      gy = Chainer::Variable.new(gy_data, requires_grad: false)
    end

    grads[y.node] = gy
  end

  unless grad_inputs.nil?
    inputs.zip(grad_inputs).each do |x, gx|
      grads[x.node] = gx unless gx.nil?
    end
  end

  # Backprop implementation. It edits grads which will only contain the
  # gradients w.r.t. the inputs.
  old_enable_backprop = Chainer.configuration.enable_backprop
  Chainer.configuration.enable_backprop = enable_double_backprop
  backprop(outputs, inputs, grad_required, retain_grad, grads)
  Chainer.configuration.enable_backprop = old_enable_backprop

  # Extract the gradients w.r.t. the inputs and return them.
  ret = inputs.map { |x| grads[x.node] }
  if set_grad
    inputs.zip(ret).each do |x, gx|
      x.grad_var = gx
    end
  end

  ret
end
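
A minimal usage sketch (values are illustrative): Chainer.grad returns the gradients of outputs with respect to inputs as Chainer::Variable s.

x = Chainer::Variable.new(Numo::DFloat[1.0, 2.0, 3.0])
y = x * x
gx, = Chainer.grad([y], [x])
# gx.data holds dy/dx = 2 * x, i.e. Numo::DFloat[2.0, 4.0, 6.0]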

.numerical_grad(f, inputs, grad_outputs, eps = 1e-3) ⇒ Array

Computes the numerical gradient by finite differences.

This function is used to implement gradient checks. For usage examples, see the unit tests of Chainer::Functions.

Parameters:

  • f (function)

    A Ruby function (e.g. a Proc) with no arguments that runs the forward computation and returns the result.

  • inputs (Array<Arrays>)

    Array of arrays that should be treated as inputs. Each element is slightly modified to realize the numerical gradient by finite differences.

  • grad_outputs (Array<Arrays>)

    Array of arrays that are treated as output gradients.

  • eps (Float) (defaults to: 1e-3)

    Epsilon value of finite differences.

Returns:

  • (Array)

    Numerical gradient arrays corresponding to inputs.



# File 'lib/chainer/gradient_check.rb', line 21

def numerical_grad(f, inputs, grad_outputs, eps=1e-3)
  raise unless eps > 0
  inputs = inputs.to_a
  grad_outputs = grad_outputs.to_a
  grads = inputs.map{|x| x.new_zeros()}

  inputs.zip(grads).each do |x, gx|
    orig_x = x.dup # hold original value
    x.each_with_index{|_, *i|
      orig = orig_x[*i]
      x[*i] = orig + eps
      ys1 = _copy_arrays(f.())
      x[*i] = orig - eps
      ys2 = _copy_arrays(f.())
      x[*i] = orig

      ys1.zip(ys2, grad_outputs).each do |y1, y2, gy|
        next if gy.nil?
        diff = y1 - y2
        if Chainer.array?(diff) && diff.empty?
          dot = 0
        else
          dot = (diff * gy).sum
        end
        gx[*i] += dot / (2 * eps)
      end
    }
  end

  return grads
end
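
A minimal sketch of calling this function directly (the forward computation and values are illustrative):

x  = Numo::DFloat[1.0, 2.0, 3.0]
gy = Numo::DFloat[1.0, 1.0, 1.0]
f  = -> { [x * x] }  # forward computation that reads x

gx, = Chainer.numerical_grad(f, [x], [gy])
# gx approximates d(x**2)/dx = 2 * x by central differences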