Multimedia Security, Volume 1 - William Puech

1.2.2. Demosaicing


Most cameras cannot see color directly, because each pixel is recorded by a single sensor that can only count the photons reaching it within a certain wavelength range. To obtain a color image, a color filter array (CFA) is placed in front of the sensors, so that each sensor counts only the photons in the wavelength band passed by its filter. As a result, each pixel holds a value for a single color. Because neighboring pixels use filters of different colors, the missing colors at each pixel can then be interpolated from its neighbors.

Although others exist, almost all cameras use the same CFA: the Bayer array, which is illustrated in Figure 1.3. This matrix samples half the pixels in green, a quarter in red and the last quarter in blue. Sampling more pixels in green is justified by the human visual system, which is more sensitive to the color green.
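The sampling performed by a Bayer array can be sketched in a few lines of Python. This is only an illustration: the RGGB layout (red in the top-left corner of each 2×2 block) is an assumption, since the exact arrangement depends on the camera, and the names `bayer_masks` and `mosaic` are hypothetical.

```python
import numpy as np

def bayer_masks(height, width):
    """Boolean masks of the red, green and blue sample positions.

    Assumes an RGGB layout: red at (even row, even column),
    blue at (odd row, odd column), green everywhere else.
    """
    rows, cols = np.indices((height, width))
    red = (rows % 2 == 0) & (cols % 2 == 0)
    blue = (rows % 2 == 1) & (cols % 2 == 1)
    green = ~(red | blue)
    return red, green, blue

def mosaic(rgb):
    """Keep a single color sample per pixel, as a CFA-covered sensor would."""
    red, green, blue = bayer_masks(*rgb.shape[:2])
    raw = np.zeros(rgb.shape[:2], dtype=rgb.dtype)
    raw[red] = rgb[..., 0][red]
    raw[green] = rgb[..., 1][green]
    raw[blue] = rgb[..., 2][blue]
    return raw

# Fraction of pixels sampled in each color on a 4 x 6 sensor.
red, green, blue = bayer_masks(4, 6)
print(red.mean(), green.mean(), blue.mean())  # 0.25 0.5 0.25
```

The printed fractions match the proportions given above: half of the pixels are sampled in green, a quarter in red and a quarter in blue.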

Unlike other steps in the formation of an image, demosaicing can be performed by a wide variety of algorithms. The simplest demosaicing algorithm is bilinear interpolation: each missing value is interpolated by averaging the nearest neighbors sampled in that channel. As the averaging is done regardless of the image gradient, it can cause visible artifacts when interpolating along a strong gradient, such as across image edges.
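Bilinear interpolation can be sketched as a normalized convolution: for each channel, the known samples are summed over a 3×3 window and divided by the total weight of the samples actually present. The sketch below assumes an RGGB Bayer layout and zero padding at the borders; the function names are hypothetical, and this is one possible way of writing the method rather than a reference implementation.

```python
import numpy as np

def conv3x3(image, kernel):
    """Correlate a 2-D image with a 3x3 kernel, zero-padding the borders."""
    h, w = image.shape
    padded = np.pad(image, 1)
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def bilinear_demosaic(raw):
    """Bilinear demosaicing of a Bayer raw image (RGGB layout assumed)."""
    raw = np.asarray(raw, dtype=float)
    rows, cols = np.indices(raw.shape)
    red = (rows % 2 == 0) & (cols % 2 == 0)
    blue = (rows % 2 == 1) & (cols % 2 == 1)
    green = ~(red | blue)

    # Red and blue samples sit on a grid of step 2: the nearest samples are
    # the horizontal, vertical or diagonal neighbors, depending on the pixel.
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float)
    # Green samples form a quincunx: the nearest ones are the 4 direct neighbors.
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]], dtype=float)

    channels = []
    for mask, kernel in ((red, k_rb), (green, k_g), (blue, k_rb)):
        m = mask.astype(float)
        # Weighted sum of the known samples, divided by the weight of the
        # samples present in the window: known pixels are left unchanged.
        channels.append(conv3x3(raw * m, kernel) / conv3x3(m, kernel))
    return np.stack(channels, axis=-1)
```

Because the same averaging is applied everywhere, a pixel lying next to an edge receives a mixture of the values on both sides of it, which is the source of the artifacts mentioned above.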

To avoid these artifacts, more recent methods attempt to simultaneously take into account information from the three color channels and avoid interpolation along a steep gradient. For instance, the Hamilton–Adams method is carried out in three stages (Hamilton and Adams 1997). First, it interpolates the missing green values horizontally or vertically, in the direction where the gradient is weakest; the green gradients are corrected by the discrete Laplacian of the color already known at each pixel. It then interpolates the red and blue channels on the pixels sampled in green, taking the average of the two neighboring pixels of the same color, corrected by the discrete Laplacian of the green channel in the same direction. Finally, it interpolates the red channel of blue-sampled pixels and the blue channel of red-sampled pixels using the average of the neighbors along the smoothest diagonal, corrected by the Laplacian of the green channel in that direction.
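The first stage can be illustrated for a single pixel. The sketch below follows the formulation of the Hamilton–Adams green interpolation as it is commonly presented, including the averaging of both directions when the two gradients are equal; the function name is hypothetical, and the pixel is assumed to be red- or blue-sampled and at least two pixels away from the image border.

```python
import numpy as np

def hamilton_adams_green(raw, i, j):
    """Estimate the green value at a red- or blue-sampled pixel (i, j).

    `raw` is a Bayer raw image; (i, j) must lie at least two pixels
    away from the border so that all the neighbors used here exist.
    """
    raw = np.asarray(raw, dtype=float)
    center = raw[i, j]                       # known red or blue sample
    left, right = raw[i, j - 1], raw[i, j + 1]   # green neighbors
    up, down = raw[i - 1, j], raw[i + 1, j]      # green neighbors

    # Discrete Laplacians of the color already known at this pixel.
    lap_h = 2 * center - raw[i, j - 2] - raw[i, j + 2]
    lap_v = 2 * center - raw[i - 2, j] - raw[i + 2, j]

    # Green gradients corrected by these Laplacians.
    grad_h = abs(left - right) + abs(lap_h)
    grad_v = abs(up - down) + abs(lap_v)

    if grad_h < grad_v:      # smoother horizontally
        return (left + right) / 2 + lap_h / 4
    if grad_v < grad_h:      # smoother vertically
        return (up + down) / 2 + lap_v / 4
    # No preferred direction: combine both estimates.
    return (left + right + up + down) / 4 + (lap_h + lap_v) / 8

# Gray image with a vertical edge between columns 2 and 3.
raw = np.tile(np.array([0.0, 0.0, 0.0, 100.0, 100.0]), (5, 1))
print(hamilton_adams_green(raw, 2, 2))  # 0.0
```

On this example the method interpolates vertically, along the edge, and recovers the exact value 0, whereas averaging the four green neighbors would give 25.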


Figure 1.3. The Bayer matrix is by far the most widely used CFA for sampling colors in cameras

Linear minimum mean-square error demosaicing (Getreuer 2011) suggests working not directly on the three color channels (red, green and blue), but on the pixelwise differences between the green channel and each of the other two channels separately. It interpolates this difference separately in the horizontal and vertical directions, in order to estimate first the green channel, followed by the differences between red and green, and then between blue and green. The red and blue channels can then be recovered by a simple subtraction. This method, as well as many others, makes the underlying assumption that the difference of color channels is smoother than the color channels themselves, and therefore easier to interpolate.
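The smoothness assumption on color differences can be illustrated with a deliberately simplified sketch. It is not the LMMSE method itself: the green channel is assumed to be already fully estimated, the difference between green and red is interpolated with a plain bilinear scheme instead of directional LMMSE estimates, and the RGGB layout and function names are hypothetical.

```python
import numpy as np

def conv3x3(image, kernel):
    """Correlate a 2-D image with a 3x3 kernel, zero-padding the borders."""
    h, w = image.shape
    padded = np.pad(image, 1)
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def red_from_difference(raw, green_full):
    """Recover the red channel by interpolating the difference green - red.

    `raw` is a Bayer raw image (RGGB layout assumed) and `green_full`
    an estimate of the green channel at every pixel.
    """
    raw = np.asarray(raw, dtype=float)
    rows, cols = np.indices(raw.shape)
    red_mask = ((rows % 2 == 0) & (cols % 2 == 0)).astype(float)
    kernel = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float)

    # The difference is only known where red was sampled.
    diff_samples = (green_full - raw) * red_mask
    # Bilinear interpolation of the difference (normalized convolution).
    diff_full = conv3x3(diff_samples, kernel) / conv3x3(red_mask, kernel)
    # Red is recovered by subtracting the difference from green.
    return green_full - diff_full
```

If red differs from green by a constant offset, the difference is perfectly smooth and this procedure recovers the red channel exactly, however textured the green channel is. Interpolating the red samples directly would not have this property.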

More recently, convolutional neural networks have been proposed to demosaic an image. For instance, demosaicnet uses a convolutional neural network to jointly interpolate and denoise an image (Gharbi et al. 2016; Ehret and Facciolo 2019). Although these methods outperform algorithms that require no training, they also demand more computing resources and are therefore not yet widely used in digital cameras.

The methods described here give only a brief overview of the large array of methods that exist for image demosaicing. This variety is increased by the fact that most camera manufacturers do not disclose their demosaicing algorithms, which are often proprietary.

No demosaicing method is perfect, since it has to reconstruct missing information: every method produces some level of artifacts, although some produce far fewer than others. These artifacts can therefore be detected to obtain information on the demosaicing method applied to the image, as explained in section 1.4.
