Machine Learning for Tomographic Imaging - Professor Ge Wang

A special convolution: 1 × 1 convolution


Now, let us introduce a special convolution kernel whose size is 1 × 1. As mentioned above, convolution is a local weighted summation. In a 1 × 1 convolution, the local receptive field is 1 × 1, so a 1 × 1 convolution amounts to a linear combination of the input feature maps. In the case of multiple channels and multiple convolution kernels, a 1 × 1 convolution has two main effects:

1 A 1 × 1 convolution can perform dimension reduction. If a 1 × 1 convolution is applied after a pooling layer, its effect is likewise dimension reduction. Moreover, it can reduce redundancy among the feature maps produced by each layer of the network. With reference to Olshausen and Field’s work (Olshausen and Field 1996), the learnt sparse features can be viewed as linear combinations of ZCA-whitened features, which is an example of feature sparsity.

2 While keeping the feature-map size unchanged (i.e. without loss of resolution), the activation function applied after a 1 × 1 convolution greatly increases the nonlinearity of the network, which helps deepen the network.

Figure 3.11 helps illustrate the effects of the 1 × 1 convolution.


Figure 3.11. Example of a 1 × 1 convolution operation.

In figure 3.11, the number of input feature maps is 2, each of size 5 × 5. After three 1 × 1 convolution operations, the number of output feature maps is 3, still of size 5 × 5. It is seen that the 1 × 1 convolutions realize linear combinations of multiple feature maps while keeping the feature-map size intact, enabling cross-channel interaction and information integration.
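The operation in figure 3.11 can be sketched in a few lines of numpy. This is a minimal illustration, not the book's code: the weights are random placeholders, and the channels-first layout is an assumption. It shows that a 1 × 1 convolution is nothing more than a per-pixel linear combination across channels, so the 5 × 5 spatial size is untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two input feature maps, each 5 x 5 (channels-first layout: C x H x W).
x = rng.standard_normal((2, 5, 5))

# Three 1 x 1 kernels; each kernel holds one weight per input channel,
# so the weight tensor has shape (out_channels, in_channels) = (3, 2).
w = rng.standard_normal((3, 2))

# A 1 x 1 convolution is a per-pixel linear combination of channels:
# y[k, i, j] = sum_c w[k, c] * x[c, i, j]
y = np.einsum('kc,chw->khw', w, x)

print(y.shape)  # (3, 5, 5): three output maps, spatial size preserved

# Cross-check the first output map against an explicit weighted sum.
manual = w[0, 0] * x[0] + w[0, 1] * x[1]
print(np.allclose(y[0], manual))  # True
```

Because no spatial neighborhood is involved, the same einsum expresses the operation for any number of channels and any feature-map size.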

Furthermore, in figure 3.12, we combine the 3 × 3 convolution with the 1 × 1 convolution. Assuming that the number of input feature maps of size w × h is 128, the computational complexity on the left is w × h × 128 × 3 × 3 × 128 = 147 456 × w × h, and that on the right is w × h × 128 × 1 × 1 × 32 + w × h × 32 × 3 × 3 × 32 + w × h × 32 × 1 × 1 × 128 = 17 408 × w × h. The cost (and, by the same counting without the w × h factor, the number of parameters) on the left is approximately 8.5 times that on the right. Therefore, the use of 1 × 1 convolutions achieves dimension reduction and reduces the number of parameters.
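The arithmetic above is easy to verify. The short script below recomputes the per-pixel multiplication counts for both designs in figure 3.12 (the 32-channel bottleneck width is taken from the text); the overall cost is each count times the spatial size w × h.

```python
c = 128  # input and output channels
m = 32   # bottleneck channels introduced by the 1 x 1 convolutions

# Per-pixel multiplications for the plain 3 x 3 convolution (left of fig. 3.12).
direct = c * 3 * 3 * c

# Per-pixel multiplications for 1x1 reduce -> 3x3 -> 1x1 expand (right side).
bottleneck = c * 1 * 1 * m + m * 3 * 3 * m + m * 1 * 1 * c

print(direct)               # 147456
print(bottleneck)           # 17408
print(direct / bottleneck)  # about 8.47, i.e. roughly 8.5x cheaper
```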


Figure 3.12. Original 3 × 3 convolution, and improved 3 × 3 convolution combined with two 1 × 1 convolutions.

In addition, an activation function applied after each 1 × 1 convolution introduces a further nonlinear transformation. Therefore, 1 × 1 convolutions make the neural network deeper and strengthen its nonlinear fitting ability, so that the network can extract and express more complex, higher-dimensional features.
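The extra nonlinearity is what keeps stacked 1 × 1 convolutions from collapsing into a single linear map. A small sketch (random placeholder weights, channel counts chosen for illustration): with ReLU between two 1 × 1 convolutions, the block no longer satisfies additivity, whereas without ReLU it would.

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda t: np.maximum(t, 0.0)

x1 = rng.standard_normal((4, 5, 5))
x2 = rng.standard_normal((4, 5, 5))
w1 = rng.standard_normal((8, 4))  # first 1 x 1 conv: 4 -> 8 channels
w2 = rng.standard_normal((3, 8))  # second 1 x 1 conv: 8 -> 3 channels

def block(x):
    # 1x1 conv -> ReLU -> 1x1 conv, all per-pixel channel mixing
    return np.einsum('kc,chw->khw', w2, relu(np.einsum('kc,chw->khw', w1, x)))

# Without ReLU the two 1 x 1 convs would merge into one linear map (w2 @ w1),
# so block(x1 + x2) would equal block(x1) + block(x2). With ReLU it does not:
print(np.allclose(block(x1 + x2), block(x1) + block(x2)))  # False
```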

