Computational Analysis and Deep Learning for Medical Care

1.2.3 ZFNet

The architecture of ZFNet, introduced by Zeiler and Fergus [3], is the same as that of AlexNet, except that the first convolutional layer uses a smaller 7 × 7 kernel with stride 2 (AlexNet uses 11 × 11 with stride 4; see Table 1.3). This reduction in kernel size lets the network obtain better hyper-parameters with less computation and helps it retain more features. The number of filters in the third, fourth, and fifth convolutional layers is increased to 512, 1024, and 512, respectively. A new visualization technique, deconvolution (which maps features back to pixels), is used to analyze the feature maps of the first and second layers.
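The effect of kernel size and stride on the spatial size of a feature map can be checked with the usual output-size formula. The sketch below is illustrative only: the helper name `conv_output_size` is hypothetical, and the padding of 1 used for the ZFNet layer is an assumption made for this example, not a value stated in the text above.

```python
def conv_output_size(input_size, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer (floor convention)."""
    return (input_size + 2 * padding - kernel) // stride + 1


# AlexNet CONV1 as listed in Table 1.3: 227-pixel input, 11 x 11 kernel, stride 4.
alexnet_conv1 = conv_output_size(227, kernel=11, stride=4)

# AlexNet POOL1 as listed in Table 1.3: 3 x 3 window, stride 2 on the CONV1 output.
alexnet_pool1 = conv_output_size(alexnet_conv1, kernel=3, stride=2)

# ZFNet first layer: 7 x 7 kernel, stride 2 on a 224-pixel input.
# padding=1 is an assumption made for this sketch.
zfnet_conv1 = conv_output_size(224, kernel=7, stride=2, padding=1)

print(alexnet_conv1, alexnet_pool1, zfnet_conv1)
```

Under these assumptions, the smaller kernel and stride leave the first ZFNet feature map at roughly twice the spatial resolution of AlexNet's, which is consistent with the claim that more features are retained.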

Table 1.3 AlexNet layer details.

Sl. no.	Layer	Kernel size	Stride	Activation shape	Weights	Bias	# Parameters	Activation	# Connections
1	Input Layer	-	-	(227,227,3)	0	0	-	relu	-
2	CONV1	11 × 11	4	(55,55,96)	34,848	96	34,944	relu	105,415,200
3	POOL1	3 × 3	2	(27,27,96)	0	0	0	relu	-
4	CONV2	5 × 5	1	(27,27,256)	614,400	256	614,656	relu	111,974,400
5	POOL2	3 × 3	2	(13,13,256)	0	0	0	relu	-
6	CONV3	3 × 3	1	(13,13,384)	884,736	384	885,120	relu	149,520,384
7	CONV4	3 × 3	1	(13,13,384)	1,327,104	384	1,327,488	relu	112,140,288
8	CONV5	3 × 3	1	(13,13,256)	884,736	256	884,992	relu	74,760,192
9	POOL3	3 × 3	2	(6,6,256)	0	0	0	relu	-
10	FC	-	-	9,216	37,748,736	4,096	37,752,832	relu	37,748,736
11	FC	-	-	4,096	16,777,216	4,096	16,781,312	relu	16,777,216
12	FC	-	-	4,096	4,096,000	1,000	4,097,000	relu	4,096,000
OUTPUT	FC	-	-	1,000	-	-	0	softmax	-
-	-	-	-	-	-	-	62,378,344 (Total)	-	-
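
The weight, bias, and parameter columns of Table 1.3 follow from simple arithmetic: a convolutional layer has kernel × kernel × input channels × output channels weights and one bias per output channel, and a fully connected layer has inputs × outputs weights and one bias per output. The following sketch recomputes these columns; the helper names `conv_params` and `fc_params` and the labels `FC1` to `FC3` are hypothetical, and the layer dimensions are read from the table itself.

```python
def conv_params(kernel, in_channels, out_channels):
    """Return (weights, biases) for a square-kernel convolutional layer."""
    return kernel * kernel * in_channels * out_channels, out_channels


def fc_params(n_inputs, n_outputs):
    """Return (weights, biases) for a fully connected layer."""
    return n_inputs * n_outputs, n_outputs


# Layer dimensions taken from Table 1.3 (pooling layers have no parameters).
layers = {
    "CONV1": conv_params(11, 3, 96),
    "CONV2": conv_params(5, 96, 256),
    "CONV3": conv_params(3, 256, 384),
    "CONV4": conv_params(3, 384, 384),
    "CONV5": conv_params(3, 384, 256),
    "FC1": fc_params(6 * 6 * 256, 4096),
    "FC2": fc_params(4096, 4096),
    "FC3": fc_params(4096, 1000),
}

for name, (weights, biases) in layers.items():
    print(f"{name}: weights={weights:,} bias={biases:,} params={weights + biases:,}")

total = sum(weights + biases for weights, biases in layers.values())
print(f"Total: {total:,}")
```

The total agrees with the last row of the table. The fully connected layers account for most of the parameters, even though the convolutional layers account for most of the connections.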

Figure 1.3 Architecture of ZFNet.

ZFNet uses the cross-entropy loss function, the ReLU activation function, and mini-batch stochastic gradient descent. Training on 1.3 million images takes 12 days on a GTX 580 GPU. The architecture consists of five convolutional layers interleaved with three max-pooling layers, followed by three fully connected layers and a softmax layer, as shown in Figure 1.3. Table 1.4 traces a 224 × 224 × 3 input image through each layer and lists the filter size, window size, stride, and padding values used at each layer. ZFNet improved the ImageNet top-5 error from 16.4% to 11.7%.
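
As a reminder of what the ReLU activation and the softmax cross-entropy loss compute, here is a minimal pure-Python sketch for a single example. It is not the training code of ZFNet; the function names and the three-class logits are made up for illustration.

```python
import math


def relu(x):
    """ReLU activation: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def softmax(logits):
    """Convert raw class scores into probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability that the softmax assigns to the true class index."""
    return -math.log(softmax(logits)[target])


# Hypothetical scores for a three-class problem; the true class is index 0.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
loss = cross_entropy(logits, target=0)

print([round(p, 3) for p in probs], round(loss, 3))
```

The loss is small when the network assigns high probability to the correct class and grows without bound as that probability approaches zero, which is what gradient descent exploits during training.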
