Читать книгу Computational Statistics in Data Science - Группа авторов - Страница 79
4.2 Convolutional Layer
ОглавлениеThe convolution operation is illustrated in Figure 2. The weight matrix of the convolutional layer is usually called the kernel matrix. The kernel matrix () shifts over the input matrix and performs elementwise multiplication between the kernel matrix () and the covered portion of the input matrix (), resulting in a feature matrix (). The stride of the kernel matrix determines the amount of movement in each step. In the example in Figure 2, the stride size is 1, so the kernel matrix moves one unit in each step. In total, the kernel matrix shifts 9 times, resulting in a feature matrix. The stride size does not have to be 1, and a larger stride size means fewer shifts.
Another commonly used structure in a CNN is the pooling layer, which is good at extracting dominant features from the input. Two main types of pooling operation are illustrated in Figure 3. Similar to a convolution operation, the kernel shifts over the input matrix with a specified stride size. If Max Pooling is applied to the input, the maximum of the covered portion will be taken as the result. If Average Pooling is applied, the mean of the covered portion will be calculated and taken as the result. The example in Figure 3 shows the result of pooling with kernel size that equals and stride that equals 1 on a input matrix.