Multimedia Security, Volume 1 - William Puech
1.1.5. Outline of this chapter
Our objective is to recognize each step of the production chain of an image. This information can sometimes appear in the metadata accompanying the image, stored in EXIF (Exchangeable Image File Format) fields, which also include information such as the brand and model of the camera and lens, the time and location of the photograph, and its shooting settings. However, this information can be easily modified, and is often automatically deleted by social media for privacy reasons. Therefore, we are interested in the information left by the operations on the image itself rather than in the metadata. Some methods, like the one presented in Huh et al. (2018), propose checking the consistency of the data present in the image with its EXIF metadata.
Knowledge of the image production chain allows for the detection of changes.
A first application is the authentication of the camera model. The processing chain is specific to each device model; so it is possible to determine the device model by identifying the processing chain, as implemented in Gloe (2012) where features are used to classify photographs according to their source device. More recently, Agarwal and Farid (2017) showed that even steps common to many devices, such as JPEG compression, sometimes have implementation differences that allow us to differentiate models from multiple manufacturers, or even models from the same manufacturer.
Another application is the detection of suspicious regions in an image, based on the study of the residue – sometimes called noise – left by the processing chain. This residue is made up of all the traces left by each operation. While it is often difficult, or even impossible, to distinguish each step in the processing chain individually, it is easier to distinguish two different processing chains as a whole. Using this idea, Cozzolino and Verdoliva proposed using steganography tools (see Chapter 5 entitled “Steganography: Embedding data into Multimedia Content”) to extract the image residue (Cozzolino et al. 2015b). Treating this residue as a piece of hidden information in the image, an algorithm such as Expectation–Maximization (EM) is then used to classify the different regions of the image. Subsequently, neural networks have shown good performance in extracting the residue automatically (Cozzolino and Verdoliva 2020; Ghosh et al. 2019), or even in carrying out the classification themselves (Zhou et al. 2018).
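To make the idea of residue extraction followed by EM classification more concrete, the following is a minimal sketch in plain Python, under strong simplifying assumptions. It is not the feature set or the implementation of the cited works: the residue is a crude high-pass filter (each pixel minus the mean of its four neighbours), each 8×8 block is summarized by a single scalar (the log of its residue energy), and a two-component one-dimensional Gaussian mixture is fitted with EM. All function names, the block size and the synthetic test image are hypothetical choices made for illustration.

```python
import math
import random


def highpass_residue(img):
    """Crude residue: each pixel minus the mean of its 4 neighbours (edges clamped)."""
    h, w = len(img), len(img[0])
    res = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            up = img[max(y - 1, 0)][x]
            down = img[min(y + 1, h - 1)][x]
            left = img[y][max(x - 1, 0)]
            right = img[y][min(x + 1, w - 1)]
            res[y][x] = img[y][x] - (up + down + left + right) / 4.0
    return res


def block_log_energy(res, block=8):
    """One scalar feature per block: log of the mean squared residue."""
    h, w = len(res), len(res[0])
    feats = {}
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            energy = sum(res[y][x] ** 2
                         for y in range(by, by + block)
                         for x in range(bx, bx + block)) / block ** 2
            feats[(by // block, bx // block)] = math.log(energy + 1e-9)
    return feats


def em_two_gaussians(values, iters=50, var_floor=1e-3):
    """Fit a 2-component 1-D Gaussian mixture with EM; return P(component 1) per value."""
    mean = sum(values) / len(values)
    spread = max(sum((v - mean) ** 2 for v in values) / len(values), var_floor)
    mu = [min(values), max(values)]   # component 1 starts at the high-energy end
    var = [spread, spread]
    weight = [0.5, 0.5]
    resp = [0.5] * len(values)
    for _ in range(iters):
        # E-step: posterior probability that each value belongs to component 1.
        for i, v in enumerate(values):
            p = [weight[k] * math.exp(-(v - mu[k]) ** 2 / (2 * var[k]))
                 / math.sqrt(2 * math.pi * var[k]) for k in range(2)]
            total = p[0] + p[1]
            resp[i] = p[1] / total if total > 0 else 0.5
        # M-step: re-estimate weights, means and variances from the posteriors.
        for k in range(2):
            r = [ri if k == 1 else 1.0 - ri for ri in resp]
            n_k = max(sum(r), 1e-9)
            weight[k] = n_k / len(values)
            mu[k] = sum(ri * v for ri, v in zip(r, values)) / n_k
            var[k] = max(sum(ri * (v - mu[k]) ** 2
                             for ri, v in zip(r, values)) / n_k, var_floor)
    return resp


def make_synthetic_image(size=64, seed=0):
    """Assumed test case: flat grey image with weak noise and a noisier square patch."""
    rng = random.Random(seed)
    img = [[128.0 + rng.gauss(0, 2.0) for _ in range(size)] for _ in range(size)]
    for y in range(16, 40):
        for x in range(16, 40):
            img[y][x] = 128.0 + rng.gauss(0, 6.0)
    return img


def suspicious_blocks(img, block=8):
    """Block coordinates (row, column) assigned to the high-energy component."""
    feats = block_log_energy(highpass_residue(img), block)
    keys = sorted(feats)
    resp = em_two_gaussians([feats[k] for k in keys])
    return {k for k, r in zip(keys, resp) if r > 0.5}
```

Calling `suspicious_blocks(make_synthetic_image())` returns the blocks that EM assigns to the component with the larger residue energy. On a real photograph the image content itself leaks into such a simple residue, which is one reason why practical methods rely on more elaborate residue descriptors than the single scalar used here.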
The outline of this chapter follows from the preceding considerations. Section 1.2 describes the main operations of the image processing chain.
Section 1.3 is dedicated to the effect each step of the image processing pipeline has on the image’s noise. This section illustrates how and why the fine analysis of noise enables the reverse engineering of the image and leads to the detection of falsified areas because of the discrepancies in the noise model.
We then detail the two main operations that lead to the final coding of the image. Section 1.4 explains how demosaicing traces can be detected and analyzed to detect suspicious areas of an image. Section 1.5 describes JPEG encoding, which is usually the last step in image formation, and the one that leaves the most traces. Similarly to demosaicing, we show how the JPEG encoding of an image can be reverse-engineered to understand its parameters and detect anomalies. The most typical cases are cropping and local manipulations, such as internal or external copy and paste.
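As a rough illustration of what reverse-engineering a JPEG parameter can mean, the sketch below estimates a single quantization step from a set of coefficients. It is a toy model and not the method of Section 1.5: the coefficients are simulated (Laplacian values quantized with an assumed step `q`, then slightly perturbed to mimic decompression rounding), and the estimator simply keeps the largest candidate step whose multiples lie close to the observed values. The function names, the tolerance `tol` and the simulation parameters are hypothetical. In an actual JPEG file, a separate step is used for each of the 64 DCT frequencies of a block.

```python
import random


def simulate_coefficients(q=6, n=2000, scale=20.0, jitter=0.3, seed=0):
    """Hypothetical DCT coefficients: Laplacian values quantized with step q,
    then perturbed slightly, as an assumed model of decompression rounding."""
    rng = random.Random(seed)
    coeffs = []
    for _ in range(n):
        c = rng.expovariate(1.0 / scale) * rng.choice((-1, 1))
        coeffs.append(q * round(c / q) + rng.gauss(0, jitter))
    return coeffs


def mean_distance_to_grid(coeffs, step):
    """Average distance between each coefficient and the nearest multiple of step."""
    return sum(abs(c - step * round(c / step)) for c in coeffs) / len(coeffs)


def estimate_step(coeffs, max_step=32, tol=0.5):
    """Largest candidate step whose grid explains the coefficients within tol.

    Divisors of the true step also fit the data, so the largest fitting
    candidate is kept; step 1 is the fallback when nothing larger fits.
    """
    best = 1
    for step in range(2, max_step + 1):
        if mean_distance_to_grid(coeffs, step) < tol:
            best = step
    return best
```

For example, `estimate_step(simulate_coefficients(q=6))` recovers the step used in the simulation. A region whose coefficients do not fit the grid estimated for the rest of the image would then be a candidate anomaly, in the spirit of the detection described above.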
Section 1.6 specifically addresses the detection of internal copy-move, a common type of manipulation. Finally, section 1.7 discusses neural-network-based methods, often efficient but at the cost of interpretability.
Figure 1.2. Simplified processing pipeline of an image, from its acquisition by the camera sensor to its storage as a JPEG-compressed image. The left column represents the image as it goes through each step. The right column plots the noise of the image as a function of intensity in all three channels (red, green, blue). Because each step leaves a specific footprint on the noise pattern of the image, analyzing this noise enables us to reverse engineer the pipeline of an image. This in turn enables us to detect regions of an image which were processed differently, and are thus likely to be falsified
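The noise curves of the kind plotted in the right column of Figure 1.2 can be approximated as follows. This is a minimal sketch under stated assumptions: the patches are synthetic and perfectly flat, and the noise variance is assumed to grow linearly with intensity (variance = `a` × intensity + `b`), a common model for raw sensor data. The function names and the values of `a` and `b` are hypothetical. The sketch measures the mean and variance of each patch, averages the variances per intensity bin to obtain a curve, and fits a line to recover the assumed slope.

```python
import math
import random


def make_flat_patches(n_patches=256, size=8, a=0.5, b=4.0, seed=0):
    """Synthetic flat patches whose noise variance is a * intensity + b (assumed model)."""
    rng = random.Random(seed)
    patches = []
    for i in range(n_patches):
        u = 20.0 + 200.0 * i / (n_patches - 1)   # true intensity of this patch
        sigma = math.sqrt(a * u + b)
        patches.append([u + rng.gauss(0, sigma) for _ in range(size * size)])
    return patches


def patch_statistics(patches):
    """Return (mean, unbiased variance) for every patch."""
    stats = []
    for p in patches:
        m = sum(p) / len(p)
        v = sum((x - m) ** 2 for x in p) / (len(p) - 1)
        stats.append((m, v))
    return stats


def noise_curve(stats, n_bins=8, lo=0.0, hi=255.0):
    """Average the patch variances inside each intensity bin (empty bins are skipped)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for m, v in stats:
        k = min(max(int((m - lo) / (hi - lo) * n_bins), 0), n_bins - 1)
        sums[k] += v
        counts[k] += 1
    width = (hi - lo) / n_bins
    return [(lo + (k + 0.5) * width, sums[k] / counts[k])
            for k in range(n_bins) if counts[k] > 0]


def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx
```

With `stats = patch_statistics(make_flat_patches())`, the call `fit_line(stats)` returns a slope close to the assumed `a`, and `noise_curve(stats)` gives the binned variance as a function of intensity. On a real image, flat patches would first have to be selected, and each later processing step would be expected to reshape this curve, which is the footprint the caption refers to.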