1.4.1.1.3 ReLU
ReLU, defined as f(z) = max(0, z), outputs values in the range [0, ∞).
Using ReLU helps mitigate the vanishing gradient problem. It also requires far less computational power than the sigmoid and tanh functions. The main drawback of ReLU is that when Z < 0, the gradient becomes 0, so the corresponding weights receive no updates. To avoid this issue at the network boundaries, ReLU is used only in the hidden layers, not in the input or output layers.
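As an illustrative sketch (not taken from the text), the following NumPy snippet shows ReLU and its gradient; the function and variable names here are chosen only for clarity.

import numpy as np

def relu(z):
    # ReLU passes positive inputs through unchanged and clamps negatives to 0.
    return np.maximum(0, z)

def relu_gradient(z):
    # The gradient is 1 for z > 0 and 0 for z <= 0, which is why negative
    # pre-activations stop contributing to weight updates.
    return (z > 0).astype(float)

z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(z))           # [0.  0.  0.  1.5 3. ]
print(relu_gradient(z))  # [0. 0. 0. 1. 1.]

The zero gradient for all negative inputs is the behavior described above: once a unit's pre-activation stays negative, its incoming weights no longer change.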
These activation functions, together with forward and backward propagation, are the key features that distinguish artificial neural networks from other machine learning models. Please see Figure 1.5.
Figure 1.5 ReLU function.
Figure 1.6 Basic Bernoulli’s restricted Boltzmann machine.