Sunday, 5 July 2015

A critical look

Most of the posts on this blog are tutorial / educational in style. I have come across a paper published by an Imatest employee that requires some commentary of a more critical nature. With some experience in the academic peer review process, I hope I can maintain the appropriate degree of objectivity in my commentary.

At any rate, if you have no interest in this kind of commentary / post, please feel free to skip it.

The paper

The paper in question is : Jackson K. M. Roland, " A study of slanted-edge MTF stability and repeatability ", Proc. SPIE 9396, Image Quality and System Performance XII, 93960L (January 8, 2015);  doi:10.1117/12.2077755;

A copy can be obtained directly from Imatest here.

Interesting point of view

One of the contributions of the paper is a discussion of the impact of edge orientation on MTF measurements. The paper appears to approach the problem from a direction that is more closely aligned with the ISO12233:2000 standard, rather than Kohm's method ("Modulation transfer function measurement method and results for the Orbview-3 high resolution imaging satellite", Proceedings of ISPRS, 2004).

By that I mean that Kohm's approach (and MTF Mapper's approach) is to compute an estimate of the edge normal, followed by projection of the pixel centre coordinates (paired with their intensity values) onto this normal. This produces a dense set of samples across the edge in a very intuitive way; the main drawback of this approach being the potential increase in the processing cost because it lends itself better to a floating point implementation.

The ISO12233:2000 approach rather attempts to project the edge "down" (assuming a vertical edge) onto the bottom-most row of pixels in the region of interest (ROI). Using the slope of the edge (estimated earlier), each pixel's intensity (sample) can be shifted left or right by the appropriate phase offset before being projected onto the bottom row. If the bottom row is modelled as bins with 0.25-pixel spacing, this process allows us to construct our 4x-oversampled, binned ESF estimate with the minimum amount of computational effort (although that might depend on whether a particular platform has strong floating-point capabilities).

The method proposed in the Imatest paper is definitely of the ISO12233:2000 variety. How can we tell? Well, the Imatest paper proposes that the ESF must be corrected by appropriate scaling of the x values using a scaling factor of cos(theta), where theta is the edge orientation angle. What this accomplishes is to "squash" the range of x values (i.e. pixel column) to be spaced at an interval that is consistent with the pixel's distance as measured along the normal to the edge. For a 5 degree angle, this correction factor is only 0.9962, meaning that distances will be squashed by a very small amount indeed. So little, in fact, that the ISO12233:2000 standard ignores this correction factor, because a pixel at a horizontal distance of 16 pixels will be mapped to a normal distance of 15.94. Keeping in mind that the ESF bins are 0.25 pixels wide, this error must have seemed small.

I recognize that the Imatest paper proposes a valid solution to this "stretching" of the ESF that would occur in its absence, and that this stretching would become quite large at larger angles (about a 1.5 pixel shift at 25 degrees for our pixel at a horizontal distance of 16 pixels).

My critique of this approach is that it would typically involve the use of floating point calculations, the potential avoidance of which appears to have been one of the main advantages of the ISO12233:2000 method. If you are going to use floating point values, then Kohm's method is more intuitive.

Major technical issues

  1. The Point Spread Functions (PSFs) used to perform the "real world" and simulated experiments were rather different, particularly in one very important aspect. The Canon 6D camera has a PSF that is anisotropic, which follows directly from its square (or even L-shaped) photosites. The composite PSF for the 6D would be an Airy pattern (diffraction) convolved with a square photosite aperture (physical sensor) convolved with a 4-dot beam splitter (the OLPF). Of course I do not have inside information on the exact photosite aperture (maybe chipworks has an image) nor the OLPF (although a 4-dot Lithium Niobate splitter seems reasonable). The point remains that this type of PSF will yield noticeably higher MTF50 values when the slanted edge approaches 45 degrees. Between the 5 and 15 degree orientations employed in the Imatest paper, we would expect a difference of about 1%. This is below the error margin of Imatest, but with a large enough set of observations this systematic effect should be visible.

    In contrast, the Gaussian PSF employed to produce the simulated images is  (or at least is supposed to be) isotropic, and should show no edge-orientation dependent bias. Bottom line: the "real world" images had an  anisotropic PSF, and the simulated images had an isotropic PSF. This means that the one cannot be used in the place of the other to evaluate the effects of edge orientation on measured MTF. Well, at least not without separating the PSF anisotropy from the residual orientation-depended artifacts of the slanted edge method.
  2.  On page 7 the Imatest paper states that "The sampling of the small Gaussian is such that the normally rotationally-invariant Gaussian function has directional factors as you approach 45 degree increments." This is further "illustrated" in Figure 13.

    At this point I take issue with the reviewers who allowed the Imatest paper to be published in this state. If you suddenly find that your Gaussian PSF becomes anisotropic, you have to take a hard look at your implementation. The only reason that the Gaussian (with a small standard deviation) is starting to develop "directional factors" is because you are undersampling the Gaussian beyond repair.

    The usual solution to this problem is to increase the resolution of your synthetic image. By generating your synthetic image at, say, 10x the scale, all your Gaussian PSFs will be reasonably wide in terms of samples in the oversampled image. For MTF measurement using the slanted edge method, you do not even have to downsize your oversampled image before applying the slanted edge method. All you have to do is to change the scale of your resolution axis in your MTF plot. That way you do not even have to worry about the MTF of the downsampling kernel.

    There are several methods that produce even higher quality simulated images. At this point I will plug my own work: see this post or this paper. These approaches rely on importance sampling (for diffraction PSFs) or direct numerical integration of the Gaussian in two dimensions; both these approaches avoid any issues with downsampling and do not sample on a regular grid. These methods are implemented in mtf_generate_rectangle.exe, which is part of the MTF Mapper package.

Minor technical issues

  1. On page 1 the Imatest paper states that the ISO 12233:2014 standard lowered the edge contrast "because with high contrast the measurement becomes unstable". This statement is quite vague, and appears to contradict the results presented in Figure 8, which shows no degradation of performance at high contrast, even in the presence of noise.

    I would offer some alternative explanations: the ISO12233 standard is often applied to images compressed with DCT-based quantization methods, such as JPEG. A high-contrast edge typically shows up with a large-magnitude DCT coefficient at higher frequencies; exactly the frequencies that are more strongly quantized, hence the well-kown appearance of "mosquito noise" in JPEG images. A lower contrast edge will reduce the relative energy at higher frequencies, thus the stronger quantization of high frequencies will have a proportionately smaller effect. I am quite temtpted to go and test this theory right away.

    Another explanation, one that is covered in some depth on Imatest's own website, is of course the potential intensity clipping that may result from incorrect exposure. Keeping the edge contrast in a more manageable range reduces the chance of clipping. Another more subtle reason is that a lower contrast chart allows more headroom for sharpening without clipping. By this I mean that sharpening (of the unsharp masking type) usually results in some "ringing" which manifests as overshoot (on the bright side of the edge) and undershoot (on the dark side of the edge). If chart contrast was so high that the overshoot of overzealous sharpening would be clipped, then it would be harder to measure (and observe) the extent of oversharpening.
  2. The noise model is employed a little basic. Strictly speaking the standard deviation of the additive Gaussian white noise should be signal dependent; this is a more accurate model of photon shot noise, and is trivial to implement. I have not done a systematic study of the effects of noise simulation models on the slanted edge method, but in 2015 one really should simulate photon shot noise as the dominant component of additive noise.
  3. Page 6 of the Imatest paper states that "There is a problem with this 5 degree angle that has not yet been addressed in any standard or paper." All I can say to this is that Kohm's paper has presented an alternative solution to this problem that really should be recognized in the Imatest paper.


Other than the unforgivable error in the generation of the simulated images, a fair effort, but more time spent on the literature, especially papers like Kohm's, would have changed the tone of the paper considerably, which in turn would have made it more credible.

1 comment:

  1. You bring up valid points, Frans. Constructive critique is important in the furthering of any field.