Computational Analysis and Deep Learning for Medical Care - Group of authors
1.2 Various CNN Models
1.2.1 LeNet-5
The LeNet architecture was proposed by LeCun et al. [1], and it successfully classified the images in the MNIST dataset. LeNet takes a 32×32-pixel grayscale image as input. As a pre-processing step, the input pixel values are normalized so that a white (background) pixel corresponds to a value of -0.1 and a black (foreground) pixel corresponds to a value of 1.175, which, in turn, speeds up learning. The LeNet-5 architecture consists of an input layer, two sets of convolutional and average pooling layers, a flattening convolutional layer, two fully connected layers, and finally a softmax classifier.
The first convolutional layer (C1) filters the 32×32 input image with six filters. All filter kernels are of size 5×5 (the receptive field) and are applied with a stride of 1 pixel (the distance between the receptive field centers of neighboring neurons in a kernel map) and no padding. Applying six 5×5 kernels with stride 1 to the 32×32 input image therefore yields six feature maps of size 28×28 in C1. Figure 1.1 shows the architecture of LeNet-5, and Table 1.1 lists the parameters of each layer. Let Wc be the number of weights in the layer, Bc the number of biases in the layer, Pc the number of parameters in the layer, K the size (width) of the kernels in the layer, N the number of kernels, and C the number of channels in the input image.
$W_c = K^2 \times C \times N, \qquad B_c = N$ (1.1)
$P_c = W_c + B_c$ (1.2)
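The parameter count defined by Wc, Bc, and Pc can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual convention of K × K × C weights per kernel plus one bias per kernel; the function name `conv_layer_params` is hypothetical and not part of the original text.

```python
from typing import Tuple


def conv_layer_params(kernel_size: int, in_channels: int, num_kernels: int) -> Tuple[int, int, int]:
    """Return (Wc, Bc, Pc) for a convolutional layer.

    Assumes every kernel sees all input channels and has exactly one bias.
    """
    weights = kernel_size * kernel_size * in_channels * num_kernels  # Wc
    biases = num_kernels                                             # Bc
    return weights, biases, weights + biases                         # Pc = Wc + Bc


# First convolutional layer of LeNet-5: six 5x5 kernels on a 1-channel image.
w_c, b_c, p_c = conv_layer_params(kernel_size=5, in_channels=1, num_kernels=6)
print(w_c, b_c, p_c)  # 150 6 156
```

The same helper gives 48,120 parameters for the flattening convolutional layer (K = 5, C = 16, N = 120), which agrees with Table 1.1.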
In the first convolutional layer, the number of trainable parameters is (5×5 + 1) × 6 = 156, where 6 is the number of filters, 5 × 5 is the filter size, and 1 is the bias per filter. Each of the 28×28 output positions uses all 156 parameters, so there are 28×28×156 = 122,304 connections. The feature map size is calculated as follows:
$W_{out} = \dfrac{W - F_w + 2P}{S} + 1$ (1.3)
$H_{out} = \dfrac{H - F_h + 2P}{S} + 1$ (1.4)
Here, W = 32; H = 32; Fw = Fh = 5; P = 0; S = 1, and the feature map size is 28 × 28.
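The feature map size calculation can be checked numerically. The sketch below assumes the standard output-size relation, floor((W - F + 2P) / S) + 1, applied independently to width and height; the helper name `output_size` is an illustrative choice.

```python
def output_size(in_size: int, filter_size: int, padding: int = 0, stride: int = 1) -> int:
    """Spatial size of a convolution or pooling output along one dimension."""
    return (in_size - filter_size + 2 * padding) // stride + 1


# Walk through the spatial sizes of LeNet-5, starting from the 32x32 input.
c1 = output_size(32, 5, padding=0, stride=1)  # first convolution
s1 = output_size(c1, 2, padding=0, stride=2)  # first pooling
c2 = output_size(s1, 5, padding=0, stride=1)  # second convolution
s2 = output_size(c2, 2, padding=0, stride=2)  # second pooling
print(c1, s1, c2, s2)  # 28 14 10 5
```

These values match the feature map sizes listed in Table 1.1.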
First pooling layer: W = 28; H = 28; Fw = Fh = 2; P = 0; S = 2
$W_{out} = \dfrac{W - F_w + 2P}{S} + 1 = \dfrac{28 - 2 + 0}{2} + 1 = 14$ (1.5)
Figure 1.1 Architecture of LeNet-5.
Table 1.1 Various parameters of the layers of LeNet.
Sl no. | Layer | Feature map | Feature map size | Kernel size | Stride | Activation | Trainable parameters | # Connections |
1 | Image | 1 | 32 × 32 | - | - | - | - | - |
2 | C1 | 6 | 28 × 28 | 5 × 5 | 1 | tanh | 156 | 122,304 |
3 | S1 | 6 | 14 × 14 | 2 × 2 | 2 | tanh | 12 | 5,880 |
4 | C2 | 16 | 10 × 10 | 5 × 5 | 1 | tanh | 1516 | 151,600 |
5 | S2 | 16 | 5 × 5 | 2 × 2 | 2 | tanh | 32 | 2,000 |
6 | Dense | 120 | 1 × 1 | 5 × 5 | 1 | tanh | 48,120 | 48,120 |
7 | Dense | - | 84 | - | - | tanh | 10,164 | 10,164 |
8 | Dense | - | 10 | - | - | softmax | - | - |
Total | - | - | - | - | - | - | 60,000 | - |
$H_{out} = \dfrac{H - F_h + 2P}{S} + 1 = \dfrac{28 - 2 + 0}{2} + 1 = 14$ (1.6)
The feature map size is 14×14. The number of trainable parameters is (coefficient + bias) × number of feature maps = (1 + 1) × 6 = 12, and the number of connections is (2×2 + 1) × 6 × 14×14 = 30×14×14 = 5,880.
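The pooling-layer arithmetic can be reproduced with a short sketch. It assumes, as in the counts above, that each pooled feature map has one trainable coefficient and one bias, and that every output unit is connected to a 2 × 2 window plus a bias; the function name `subsampling_layer_stats` is hypothetical.

```python
def subsampling_layer_stats(num_maps: int, out_size: int, window: int = 2):
    """Return (trainable parameters, connections) for an average pooling layer."""
    params = (1 + 1) * num_maps                       # coefficient + bias per map
    per_unit = window * window + 1                    # window inputs + bias
    connections = per_unit * num_maps * out_size * out_size
    return params, connections


print(subsampling_layer_stats(num_maps=6, out_size=14))  # (12, 5880)
print(subsampling_layer_stats(num_maps=16, out_size=5))  # (32, 2000)
```

Both results agree with the pooling rows of Table 1.1.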
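Layer 3: In this layer (C2 in Table 1.1; C3 in the numbering of LeCun et al. [1]), each of the 16 feature maps is connected to only a subset of the six feature maps of the previous layer, as shown in Table 1.2: six feature maps take input from three of them, nine from four, and one from all six. Each unit in this layer is connected to several 5 × 5 receptive fields at identical locations in the feature maps of the preceding pooling layer. Total number of trainable parameters = (3×5×5+1)×6 + (4×5×5+1)×9 + (6×5×5+1) = 1516. Total number of connections = (3×5×5+1)×6×10×10 + (4×5×5+1)×9×10×10 + (6×5×5+1)×10×10 = 151,600. The total number of parameters in the network is about 60K.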
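The counts for this partially connected layer, and the 60K total, can be verified with a small script. The grouping (six maps with three inputs, nine with four, one with six) is read off the formula above; the names `GROUPS`, `sparse_conv_params`, and `table_params` are illustrative assumptions, and the per-layer values in `table_params` are copied from Table 1.1.

```python
KERNEL = 5     # 5x5 kernels
OUT_SIZE = 10  # each output feature map is 10x10

# (number of connected input maps, number of output maps with that fan-in)
GROUPS = [(3, 6), (4, 9), (6, 1)]


def sparse_conv_params(groups, kernel_size):
    """Trainable parameters when each output map sees only some input maps."""
    return sum((n_in * kernel_size * kernel_size + 1) * n_out for n_in, n_out in groups)


params = sparse_conv_params(GROUPS, KERNEL)
connections = params * OUT_SIZE * OUT_SIZE
print(params, connections)  # 1516 151600

# Trainable parameters per layer as listed in Table 1.1 (output layer not listed there).
table_params = [156, 12, 1516, 32, 48120, 10164]
print(sum(table_params))  # 60000
```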