Tanh
Tanh is another commonly used nonlinear activation function, shown in figure 3.6 (Fan 2000). Although the sigmoid function has a direct biological interpretation, it suffers from vanishing gradients, which is undesirable for training a neural network. Like sigmoid, the tanh function is 's'-shaped, but its output range is (−1, 1). Thus, negative inputs to tanh are mapped to negative outputs, positive inputs to positive outputs, and only a zero input is mapped to zero. These properties often make tanh a better choice than sigmoid. The tanh function is defined as follows:
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}.   (3.4)
Figure 3.6. The tanh function is similar to sigmoid in shape but has the symmetric range (−1, 1).
Tanh is nonlinear and squashes a real-valued number into the range (−1, 1). Unlike sigmoid, the output of tanh is zero-centered. Therefore, tanh is often preferred over sigmoid in practice.
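As a quick check of these properties, the following minimal NumPy sketch (our own illustration, not code from the book; the helper names tanh and sigmoid are hypothetical) evaluates equation (3.4) directly, verifies it against NumPy's built-in np.tanh, and contrasts the zero-centered tanh output with the strictly positive output of the logistic sigmoid.

import numpy as np

def tanh(x):
    # Direct evaluation of equation (3.4); numerically fine for moderate |x|.
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

def sigmoid(x):
    # Logistic sigmoid, included only for comparison with tanh.
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-4.0, 4.0, 9)
print(np.allclose(tanh(x), np.tanh(x)))               # True: matches NumPy's built-in tanh
print(np.allclose(tanh(x), 2 * sigmoid(2 * x) - 1))   # True: tanh is a shifted, rescaled sigmoid
print(tanh(x))      # values in (-1, 1), symmetric about zero
print(sigmoid(x))   # values in (0, 1), never zero-centered

The second check uses the identity tanh(x) = 2σ(2x) − 1, which follows directly from equation (3.4) and makes explicit that tanh is a sigmoid rescaled to a zero-centered output range.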