Computational Statistics in Data Science (multiple authors)
5 Autoencoders
5.1 Introduction
An autoencoder is a special type of DNN in which the target for each input is the input itself [13]. The architecture of an autoencoder is shown in Figure 5, where the encoder and decoder together form the autoencoder. In the example, the autoencoder takes a horse image as input and produces a similar image as output. When the embedding dimension is greater than or equal to the input dimension, there is a risk of overfitting, and the model may simply learn the identity function. One common solution is to make the embedding dimension smaller than the input dimension. Many studies have shown that high-dimensional data, such as image data, often have a low intrinsic dimension and can therefore be summarized by low-dimensional representations. An autoencoder summarizes high-dimensional data in a low-dimensional embedding by being trained to produce output that is similar to its input. The learned representation can then be used in various downstream tasks, such as regression, clustering, and classification. Even if the embedding dimension is as small as 1, overfitting is still possible if the model has enough parameters to encode each sample as an index. Therefore, regularization [15] is required to train an autoencoder that both reconstructs the input well and learns a meaningful embedding.
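The ideas in this introduction (the input serving as its own target, an embedding dimension smaller than the input dimension, and a regularized reconstruction loss) can be illustrated with a minimal NumPy sketch. It rests on several assumptions that are not taken from the text: the encoder and decoder are single linear maps rather than a deep network, the data are synthetic points lying near a 2-dimensional subspace of a 10-dimensional space, and the regularizer is plain L2 weight decay, chosen only because the text does not say which regularization it has in mind. The names `make_data` and `train_autoencoder` are hypothetical.

```python
import numpy as np


def make_data(n=200, input_dim=10, intrinsic_dim=2, seed=0):
    """Synthetic data (an assumption): points near a low-dimensional subspace."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, intrinsic_dim))
    mixing = rng.normal(size=(intrinsic_dim, input_dim))
    x = latent @ mixing + 0.01 * rng.normal(size=(n, input_dim))
    x = x - x.mean(axis=0)   # center each coordinate
    return x / x.std()       # rescale so gradient descent is well behaved


def train_autoencoder(x, embed_dim=2, weight_decay=1e-4, lr=0.01,
                      steps=2000, seed=0):
    """Linear autoencoder trained by full-batch gradient descent.

    Loss = mean squared reconstruction error + L2 penalty on the weights.
    """
    rng = np.random.default_rng(seed)
    n, input_dim = x.shape
    w_enc = 0.1 * rng.normal(size=(input_dim, embed_dim))  # encoder weights
    w_dec = 0.1 * rng.normal(size=(embed_dim, input_dim))  # decoder weights
    losses = []
    for _ in range(steps):
        z = x @ w_enc        # low-dimensional embedding
        x_hat = z @ w_dec    # reconstruction of the input
        err = x_hat - x      # the target is the input itself
        penalty = weight_decay * (np.sum(w_enc ** 2) + np.sum(w_dec ** 2))
        losses.append(np.mean(np.sum(err ** 2, axis=1)) + penalty)

        # Gradients of the loss with respect to both weight matrices.
        g_out = 2.0 * err / n
        g_dec = z.T @ g_out + 2.0 * weight_decay * w_dec
        g_enc = x.T @ (g_out @ w_dec.T) + 2.0 * weight_decay * w_enc

        w_enc -= lr * g_enc
        w_dec -= lr * g_dec
    return w_enc, w_dec, losses


x = make_data()
w_enc, w_dec, losses = train_autoencoder(x, embed_dim=2)
embedding = x @ w_enc  # learned representation for downstream tasks
print("initial loss:", losses[0])
print("final loss:", losses[-1])
print("embedding shape:", embedding.shape)
```

Because the embedding has only 2 dimensions while the input has 10, the model cannot represent the identity map on the full input space. It can reconstruct the data well only by capturing the low-dimensional structure. In this sketch, the rows of `embedding` play the role of the learned representation that the text describes as input to regression, clustering, or classification.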