
Distinguishing features


A CNN has three distinguishing features: local connectivity, shared weights, and multiple feature maps. Following the concept of receptive fields, a CNN exploits spatial locality by enforcing local connectivity between neurons of adjacent layers. This architecture ensures that the learned ‘filters’ produce the strongest responses to spatially local input patterns of relevance. Stacking many such layers forms nonlinear filters that become increasingly global as the network grows deeper (i.e. responsive to an increasingly large region), so that the network first builds representations of local, primitive features of the input and then assembles them into semantic, global features.
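As a minimal illustrative sketch (the layer counts and kernel size below are our own, not taken from the text), the growth of the effective receptive field can be tallied for a stack of 3 × 3 convolutional layers with stride 1 and no dilation:

```python
# Sketch: how the effective receptive field grows with depth for stacked
# 3x3 convolutions (stride 1, no dilation). Illustrative numbers only.
def receptive_field(num_layers, kernel_size=3):
    """Effective receptive field of a stack of identical conv layers."""
    rf = 1
    for _ in range(num_layers):
        rf += kernel_size - 1   # each layer extends the field by (k - 1)
    return rf

for n in range(1, 5):
    rf = receptive_field(n)
    print(f"{n} stacked 3x3 layers -> {rf}x{rf} receptive field")
# 1 -> 3x3, 2 -> 5x5, 3 -> 7x7, 4 -> 9x9: deeper layers respond to larger regions.
```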

In a CNN, each filter is replicated across the entire visual field. These replicated units share the same parameters; that is, the same weight vector and bias are used repeatedly to produce a feature map. In a given convolutional layer, the features of interest for all neurons can therefore be computed by a shift-invariant correlation. Replicating units in this way allows the same feature to be detected regardless of its position in the visual field.
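The following sketch (kernel values and image size are illustrative, not from the text) shows weight sharing explicitly: a single 3 × 3 kernel and bias are slid over the whole input to form one feature map, and shifting the input simply shifts the response.

```python
# Sketch of weight sharing: one shared 3x3 kernel and bias produce a feature
# map by cross-correlation. The kernel is a simple vertical-edge detector.
import numpy as np

def correlate2d(image, kernel, bias=0.0):
    """Valid cross-correlation with a single shared kernel and bias."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel) + bias
    return out

kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])

image = np.zeros((8, 8))
image[:, 4] = 1.0                    # a vertical edge at column 4
shifted = np.roll(image, 1, axis=1)  # the same edge shifted right by 1

# Because the same weights are applied at every position, the peak response
# to the shifted edge is just a shifted copy of the original peak response.
print(correlate2d(image, kernel).argmax(axis=1))    # peak at column 4 in every row
print(correlate2d(shifted, kernel).argmax(axis=1))  # peak at column 5 in every row
```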

Each CNN layer has neurons arranged in three dimensions: width, height, and depth. The width and height give the size of a feature map, while the depth is the number of feature maps computed over the same receptive field, each responding to a different type of structural feature within the same visual scope. Finally, layers of different types, both locally and fully connected, are stacked to form the CNN architecture.
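A brief PyTorch sketch (layer sizes are our own, chosen only for illustration) makes the width × height × depth arrangement concrete: the channel dimension of each activation tensor counts the feature maps computed over the same spatial locations.

```python
# Sketch: activations of each conv layer form a width x height x depth volume,
# where depth is the number of feature maps. Sizes are illustrative only.
import torch
import torch.nn as nn

x = torch.randn(1, 1, 64, 64)    # one single-channel 64x64 input image
conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)
conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)

h1 = conv1(x)
h2 = conv2(h1)
print(h1.shape)   # torch.Size([1, 16, 64, 64]) -> 16 feature maps of size 64x64
print(h2.shape)   # torch.Size([1, 32, 64, 64]) -> 32 feature maps of size 64x64
```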

Together, these properties allow CNNs to achieve impressive trainability and generalizability on visual information processing problems. Local connectivity reflects the prior knowledge that correlations in images are typically local. Weight sharing encodes the prior knowledge of shift invariance and dramatically reduces the number of free parameters, lowering the memory requirement for training the network and enabling larger and deeper networks.
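A back-of-the-envelope comparison (the image and filter sizes are our own illustrative assumptions) shows the scale of the parameter reduction that weight sharing buys:

```python
# Sketch: free parameters needed to map a 256x256 single-channel image to
# 16 feature maps, fully connected vs. convolutional. Illustrative sizes only.
h, w, c_in, c_out, k = 256, 256, 1, 16, 3

# Fully connected: every output unit is wired to every input pixel.
fc_params = (h * w * c_in) * (h * w * c_out) + (h * w * c_out)

# Convolutional: one shared kxk kernel plus one bias per feature map.
conv_params = (k * k * c_in) * c_out + c_out

print(f"fully connected: {fc_params:,} parameters")   # about 68.7 billion
print(f"convolutional:   {conv_params:,} parameters") # 160
```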
