5.3.1.2 CNN Architectures for Modulation Classification
In this case study, an eight-layer CNN for AMC is proposed. The design draws inspiration from the visual geometry group (VGG)–CNN architecture proposed by O’Shea et al. [19, 48] for AMC. The proposed network comprises five convolutional (Conv) layers and three fully connected (FC) dense layers, including the output layer. The input is a 128 × 128 time-frequency (TF) spectral image, which is fed to the first Conv layer (Conv1) of the model. Conv1 has 128 filters, each of size 5 × 5, with the rectified linear unit (ReLU) as the activation function. Appropriate zero padding is employed so that the output of the first layer has the same size as the input image. Conv2 through Conv4 are designed identically to the first layer. The fifth Conv layer differs only in filter size, which is 7 × 7. The sixth and seventh layers are FC dense layers, each having 256 neurons and ReLU as the activation function. The output layer is also an FC dense layer, with the number of neurons equal to the number of output classes and SoftMax as the activation function. Average pooling with a 4 × 4 window is applied after the Conv1 and Conv2 layers, and a 4 × 2 window is used after the Conv3 and Conv4 layers; as Table 5.1 shows, each pooling stage halves the spatial dimensions. No pooling is performed after Conv5.
The main objective of using 2D filters is to let the kernels adapt to the I and Q data separately. Most of the layer settings here follow [19, 48]. The filter sizes and pooling layers were optimized for our preprocessed input by trial and error. As in [48], the Adam optimizer is used instead of plain stochastic gradient descent for updating the weights; Adam adaptively tunes hyperparameters such as the learning rate. Table 5.1 details the proposed CNN layout for AMC.
Table 5.1 CNN architecture layout for TF images using the RML synthetic data set.
Layer | Output | Parameters |
Input | 128 × 128 × 1 | – |
Conv 1 (128 × 5 × 5), ReLU | 128 × 128 × 128 | 3,328 |
Average pooling (4 × 4) | 64 × 64 × 128 | – |
Conv 2 (128 × 5 × 5), ReLU | 64 × 64 × 128 | 409,728 |
Average pooling (4 × 4) | 32 × 32 × 128 | – |
Conv 3 (128 × 5 × 5), ReLU | 32 × 32 × 128 | 409,728 |
Average pooling (4 × 2) | 16 × 16 × 128 | – |
Conv 4 (128 × 5 × 5), ReLU | 16 × 16 × 128 | 409,728 |
Average pooling (4 × 2) | 8 × 8 × 128 | – |
Conv 5 (128 × 7 × 7), ReLU | 8 × 8 × 128 | 802,944 |
FC Dense 6 (256), ReLU | 256 | 2,097,408 |
FC Dense 7 (256), ReLU | 256 | 65,792 |
FC Dense 8 (90), SoftMax | 90 | 23,130 |
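For concreteness, the following is a minimal Keras sketch of the layout in Table 5.1. It is an illustrative reconstruction rather than the authors’ released code: the pooling stride of 2 with “same” padding is our assumption, inferred from the halving of the spatial dimensions in the table, and build_amc_cnn is a name introduced here for illustration. With these choices, model.summary() should reproduce the per-layer parameter counts listed in Table 5.1.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_amc_cnn(num_classes=90):
    # Eight-layer CNN following Table 5.1: five Conv layers + three FC layers.
    model = models.Sequential([
        layers.Input(shape=(128, 128, 1)),                               # TF spectral image
        layers.Conv2D(128, (5, 5), padding='same', activation='relu'),   # Conv 1
        layers.AveragePooling2D(pool_size=(4, 4), strides=2, padding='same'),
        layers.Conv2D(128, (5, 5), padding='same', activation='relu'),   # Conv 2
        layers.AveragePooling2D(pool_size=(4, 4), strides=2, padding='same'),
        layers.Conv2D(128, (5, 5), padding='same', activation='relu'),   # Conv 3
        layers.AveragePooling2D(pool_size=(4, 2), strides=2, padding='same'),
        layers.Conv2D(128, (5, 5), padding='same', activation='relu'),   # Conv 4
        layers.AveragePooling2D(pool_size=(4, 2), strides=2, padding='same'),
        layers.Conv2D(128, (7, 7), padding='same', activation='relu'),   # Conv 5
        layers.Flatten(),                                                # 8 x 8 x 128 -> 8192
        layers.Dense(256, activation='relu'),                            # FC Dense 6
        layers.Dense(256, activation='relu'),                            # FC Dense 7
        layers.Dense(num_classes, activation='softmax'),                 # FC Dense 8
    ])
    # Adam optimizer, as described above.
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```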
The notion of extended output classes (EOCs) is motivated by transfer learning, in which a novel labeling technique is adopted for the estimated output type. Each sample in the training data set carries two labels: the modulation type and the received SNR. The common approach in CNN-based AMC is to use only the modulation class label. With the EOC method, the CNN is trained to estimate both the modulation label and the SNR label of the input sample. This is done by describing the output classes with [Modulation, SNR] labels rather than just [Modulation] labels. Our data contains ten modulation types, each labeled at nine SNR levels, giving 90 output classes; accordingly, the last FC dense layer in Table 5.1 has 90 neurons.
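The labeling step can be sketched as follows. This is a hypothetical illustration of the EOC scheme, assuming ten modulations and nine SNR levels; eoc_label and one_hot are helper names introduced here, and the modulation-major index ordering is our choice, not something specified in the text.

```python
import numpy as np

NUM_MODS, NUM_SNRS = 10, 9            # 10 modulations x 9 SNR levels = 90 EOCs
NUM_EOC = NUM_MODS * NUM_SNRS

def eoc_label(mod_idx, snr_idx):
    # Map a (modulation, SNR) index pair to a single extended class index,
    # using modulation-major ordering (an assumption made for this sketch).
    return mod_idx * NUM_SNRS + snr_idx

def one_hot(class_idx, num_classes=NUM_EOC):
    # One-hot target vector for the 90-way SoftMax output layer.
    v = np.zeros(num_classes, dtype=np.float32)
    v[class_idx] = 1.0
    return v

# Example: modulation index 3 received at SNR index 5 maps to EOC 3*9 + 5 = 32.
y_train_sample = one_hot(eoc_label(3, 5))
```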
The motive behind the extended-class approach is to make the network more adaptable to signal features at different SNRs, and to prepare the CNN for the unpredictable SNR conditions that may be encountered when testing an unknown sample. The network should therefore learn to identify the approximate SNR scenario from the input sample and adapt itself to achieve superior classification accuracy. Finally, a many-to-one mapping function block is implemented to extract only the modulation type, as sketched below.
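A minimal sketch of that mapping block, under the same modulation-major index convention assumed above: it marginalizes the 90-way SoftMax output over the SNR labels of each modulation and returns the most probable modulation. Taking the argmax over all 90 classes and integer-dividing by the number of SNR levels would be an equally valid variant of the many-to-one mapping.

```python
import numpy as np

def predict_modulation(softmax_90, num_mods=10, num_snrs=9):
    # Collapse the 90 [Modulation, SNR] probabilities to 10 modulation
    # probabilities by summing over the SNR axis (many-to-one mapping),
    # then return the index of the most probable modulation.
    per_mod = np.asarray(softmax_90).reshape(num_mods, num_snrs).sum(axis=1)
    return int(np.argmax(per_mod))
```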