Machine Learning for Tomographic Imaging - Professor Ge Wang
3.1.3 Activation function
The activation function is an essential part of an artificial neuron: it determines the neuron's output behavior and gives the network its nonlinearity. This nonlinearity enables an artificial neural network to learn a complex nonlinear mapping from input to output signals. Without a nonlinear activation function, the network reduces to a linear system, however many layers it has, and its information processing capability is very limited. Mathematically, even a neural network with a single hidden layer can approximate any continuous function on a compact domain to arbitrary accuracy, provided that it has enough hidden neurons and a suitable nonlinear activation function.
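The claim that a network without a nonlinear activation is just a linear system can be checked on a toy case. The following is a minimal sketch, not taken from the book: the weight matrices `W1` and `W2` and all helper names are hypothetical, biases are omitted for brevity, and `tanh` is used only as one example of a nonlinear activation.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrices A and B (lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

# Hypothetical weights of a tiny 2-2-1 network (illustrative values only).
W1 = [[1.0, -2.0], [0.5, 3.0]]
W2 = [[2.0, -1.0]]

def linear_net(x):
    """Two layers with the identity as activation function."""
    return matvec(W2, matvec(W1, x))

def collapsed_net(x):
    """A single layer whose weight matrix is the product W2 W1."""
    return matvec(matmul(W2, W1), x)

def tanh_net(x):
    """Same weights, but with a tanh activation in the hidden layer."""
    hidden = [math.tanh(h) for h in matvec(W1, x)]
    return matvec(W2, hidden)

def is_additive(net, x, y, tol=1e-9):
    """Check f(x + y) == f(x) + f(y), which every linear map must satisfy."""
    summed = [a + b for a, b in zip(x, y)]
    lhs = net(summed)
    rhs = [a + b for a, b in zip(net(x), net(y))]
    return all(abs(l - r) < tol for l, r in zip(lhs, rhs))
```

With the identity activation, `linear_net` and `collapsed_net` agree on every input, so the hidden layer adds no expressive power. With `tanh` in the hidden layer, the additivity check fails, which shows that the network is no longer a linear map.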
For an activation function to perform satisfactorily, it should satisfy the following conditions: (i) differentiability, which is necessary for the gradient descent method to work in optimizing a network, and (ii) monotonicity, which is biologically motivated, so that the neuron is in either an inhibitory or an excitatory state. Only when the activation function is monotonic can a single-hidden-layer network be optimized as a convex problem.
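Both conditions can be checked numerically for a concrete function. The sketch below assumes the logistic sigmoid as the example, which this passage does not itself name, and all function and variable names are illustrative. It compares the analytic derivative with a central finite difference and tests that the function values increase along a grid.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, assumed here as an example activation function."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Analytic derivative of the sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def numeric_grad(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Evaluation grid on [-5, 5] with step 0.1 (an arbitrary choice).
grid = [i / 10.0 for i in range(-50, 51)]

# (i) Differentiability: the analytic and numeric derivatives should agree.
max_grad_error = max(
    abs(numeric_grad(sigmoid, x) - sigmoid_grad(x)) for x in grid
)

# (ii) Monotonicity: the values should be strictly increasing along the grid.
values = [sigmoid(x) for x in grid]
is_monotonic = all(a < b for a, b in zip(values, values[1:]))
```

A check of this kind only samples a finite grid, so it can expose a violation but cannot prove either property.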
Generally speaking, the activation function is of great importance because it delivers a single number, via a ‘soft’ thresholding operation, as the final result of the information processing performed by the neuron. Several commonly used activation functions are described in the following subsections.
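To make the idea of ‘soft’ thresholding concrete, the following sketch contrasts a hard step with a smooth alternative applied to the weighted sum of a single neuron. The weights, the bias, and the choice of the logistic sigmoid as the soft threshold are assumptions made for illustration only.

```python
import math

def hard_threshold(z):
    """Step function: the output jumps from 0 to 1 at z = 0."""
    return 1.0 if z >= 0.0 else 0.0

def soft_threshold(z):
    """Logistic sigmoid: a smooth transition from 0 to 1 around z = 0."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation):
    """Reduce the inputs to a single number: activation(w . x + b)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical parameters of one neuron with two inputs.
weights = [0.8, -0.4]
bias = 0.1

for inputs in ([0.0, 0.5], [0.1, 0.5], [2.0, 0.5]):
    hard = neuron(inputs, weights, bias, hard_threshold)
    soft = neuron(inputs, weights, bias, soft_threshold)
    print(inputs, hard, round(soft, 3))
```

The hard step discards how far the weighted sum lies from the threshold, whereas the soft version keeps that information in a graded output and remains differentiable, which matters for gradient-based training.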
