Computational Analysis and Deep Learning for Medical Care - Group of authors - Page 18
1.2.4 VGGNet
Simonyan and Zisserman [4] introduced VGGNet for the ImageNet Challenge in 2014. VGGNet-16 consists of 16 weight layers and accepts a 224 × 224 × 3 RGB image as input, after subtracting the global mean RGB value from each pixel. The image is then fed to a stack of 13 convolutional layers, each using a small 3 × 3 receptive field with same padding and a stride of 1. AlexNet and ZFNet, by contrast, follow their first convolutional layers with max-pooling layers. VGGNet places no max-pooling layer between consecutive 3 × 3 convolutional layers: a stack of two such layers covers an effective receptive field of 5 × 5 and a stack of three covers 7 × 7, while using fewer parameters and more nonlinearities than a single layer with the larger filter. As the spatial size decreases, the depth increases. Each max-pooling layer uses a 2 × 2 window with a stride of 2. The convolutional stack is followed by three fully connected layers: the first two have 4,096 neurons each, and the third is the output layer with 1,000 neurons, since the ILSVRC classification task has 1,000 classes. The final layer is a softmax layer. All hidden layers use the ReLU nonlinearity, and training was carried out on 4 Nvidia Titan Black GPUs for 2–3 weeks. Small filters reduce the number of parameters per layer; the whole network has 138 million parameters (522 MB). The test set top-5 error rate during the competition was 7.1%. Figure 1.4 shows the architecture of VGG-16, and Table 1.5 shows its parameters.
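The arithmetic behind this description can be checked with a short sketch. It is not taken from the book: it assumes a 224 × 224 × 3 input and the 16-layer configuration of the original VGG paper, counts biases along with weights, and uses illustrative names (`VGG16_CONFIG`, `stacked_receptive_field`, `vgg16_summary`) that are hypothetical.

```python
# Sketch of VGG-16 shape and parameter arithmetic (illustrative, not the authors' code).
# Assumptions: 224 x 224 x 3 input, 3 x 3 convolutions with stride 1 and same
# padding, 2 x 2 max-pooling with stride 2, biases counted as parameters.

# Output channels of the 13 convolutional layers; "M" marks a max-pooling layer.
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]
FC_SIZES = [4096, 4096, 1000]


def stacked_receptive_field(num_layers, kernel=3):
    """Effective receptive field of stride-1 convolutions applied in sequence."""
    return 1 + num_layers * (kernel - 1)


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of one convolutional layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def vgg16_summary(input_size=224, in_channels=3):
    """Return (final spatial size, final depth, total parameter count)."""
    size, channels, total = input_size, in_channels, 0
    for item in VGG16_CONFIG:
        if item == "M":
            size //= 2  # 2 x 2 pooling with stride 2 halves the spatial size
        else:
            total += conv_params(channels, item)
            channels = item  # same padding keeps the spatial size unchanged
    features = size * size * channels  # flattened input to the first FC layer
    for width in FC_SIZES:
        total += features * width + width
        features = width
    return size, channels, total


print(stacked_receptive_field(2), stacked_receptive_field(3))  # 5 7
print(vgg16_summary())  # (7, 512, 138357544)
```

Under these assumptions the spatial size shrinks from 224 to 7 while the depth grows from 3 to 512, and the total of 138,357,544 agrees with the roughly 138 million parameters quoted above.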
Table 1.4 Various parameters of ZFNet.
Layer name | Input size | Filter size | Window size | # Filters | Stride | Padding | Output size | # Feature maps | # Connections |
Conv 1 | 224 × 224 | 7 × 7 | - | 96 | 2 | 0 | 110 × 110 | 96 | 14,208 |
Max-pooling 1 | 110 × 110 | - | 3 × 3 | - | 2 | 0 | 55 × 55 | 96 | 0 |
Conv 2 | 55 × 55 | 5 × 5 | - | 256 | 2 | 0 | 26 × 26 | 256 | 614,656 |
Max-pooling 2 | 26 × 26 | - | 3 × 3 | - | 2 | 0 | 13 × 13 | 256 | 0 |
Conv 3 | 13 × 13 | 3 × 3 | - | 384 | 1 | 1 | 13 × 13 | 384 | 885,120 |
Conv 4 | 13 × 13 | 3 × 3 | - | 384 | 1 | 1 | 13 × 13 | 384 | 1,327,488 |
Conv 5 | 13 × 13 | 3 × 3 | - | 256 | 1 | 1 | 13 × 13 | 256 | 884,992 |
Max-pooling 3 | 13 × 13 | - | 3 × 3 | - | 2 | 0 | 6 × 6 | 256 | 0 |
Fully connected 1 | - | - | - | - | - | - | 4,096 neurons | - | 37,752,832 |
Fully connected 2 | - | - | - | - | - | - | 4,096 neurons | - | 16,781,312 |
Fully connected 3 | - | - | - | - | - | - | 1,000 neurons | - | 4,097,000 |
Softmax | - | - | - | - | - | - | 1,000 classes | - | 62,357,608 (Total) |
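The last column of Table 1.4 can be reproduced by counting weights plus biases for each layer. The sketch below is an illustration rather than part of the book; it assumes that Conv 1 sees a 3-channel RGB input (the table does not list input depth) and that Fully connected 1 reads the flattened 6 × 6 × 256 output of Max-pooling 3. The names `ZFNET_CONV`, `ZFNET_FC`, and `zfnet_parameter_counts` are hypothetical.

```python
# Recompute the "# Connections" column of Table 1.4 as weights plus biases.
# Assumption: Conv 1 receives a 3-channel RGB image; other input depths are the
# "# Feature maps" value of the preceding layer in the table.

# (layer name, filter size, input feature maps, output feature maps)
ZFNET_CONV = [
    ("Conv 1", 7, 3, 96),
    ("Conv 2", 5, 96, 256),
    ("Conv 3", 3, 256, 384),
    ("Conv 4", 3, 384, 384),
    ("Conv 5", 3, 384, 256),
]

# (layer name, number of inputs, number of neurons)
ZFNET_FC = [
    ("Fully connected 1", 6 * 6 * 256, 4096),
    ("Fully connected 2", 4096, 4096),
    ("Fully connected 3", 4096, 1000),
]


def zfnet_parameter_counts():
    """Return a dict mapping each layer name to its weights-plus-biases count."""
    counts = {}
    for name, kernel, in_maps, out_maps in ZFNET_CONV:
        counts[name] = kernel * kernel * in_maps * out_maps + out_maps
    for name, n_inputs, n_neurons in ZFNET_FC:
        counts[name] = n_inputs * n_neurons + n_neurons
    return counts


counts = zfnet_parameter_counts()
for name, value in counts.items():
    print(f"{name}: {value:,}")
print(f"Total: {sum(counts.values()):,}")
```

With these assumptions every computed value matches the table, including the total of 62,357,608, which suggests that the column reports learnable parameters rather than connections in the strict sense.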
Figure 1.4 Architecture of VGG-16.