Omnifocus image synthesis using lens swivel

Summary

An omnifocus (or all-in-focus) image has everything from the near foreground to the far background in sharp focus. A conventional lens can focus on only a single surface (usually the plane of sharp focus), as dictated by the laws of physics. Consequently, objects in front of and behind the plane of sharp focus gradually go out of focus and appear blurry in the image. This interplay of light and lenses leads to a fundamental problem in imaging: the Depth of Field (DOF) problem. Several methods, ranging from optomechanical to computational imaging to pure image processing, address the issue of limited DOF to various degrees; examples include wavefront coding, plenoptic imaging, Scheimpflug imaging, and focus stacking. However, in our opinion, no method is optimal for omnifocus imaging (discussed in greater detail shortly). Therefore, we present yet another method, which creates an omnifocus image from a set of images obtained under multiple lens rotations. We have borrowed the essential elements of Scheimpflug imaging and focus stacking that help to overcome the limited DOF, and built our own theory on them. In the end, we arrived at a simple mechanism for creating an all-in-focus image using lens tilts; along the way, we also discovered a few interesting concepts of geometric optics which, we expect, will have much broader application beyond omnifocus image synthesis. In the following discourse, we present our model, simulation results, and a brief discussion of our findings.

A quick 3-minute description of the project

The above video won the Best Multimedia Award at COSI 2016.

External links

1.	Poster for Imaging and Applied Optics (Imaging) Congress, 2016
2.	Paper for Imaging and Applied Optics (Imaging) Congress, 2016
3.	GitHub code repo. (Python, Zemax, PyZDDE, Jupyter notebook)

Pupils are the sine qua non of optical systems. Everything in imaging can be explained in terms of the pupils. What lies at the heart of omnifocus image synthesis using lens swivel? It’s all about the pupils.

Introduction

The limited Depth of Field (DOF) is a fundamental problem (along with spatial resolution and loss of three-dimensional information) that afflicts several imaging applications such as optical microscopy, machine vision, and biometric imaging. I have previously written about the DOF problem in the context of iris acquisition systems here.

There are several techniques in imaging that work around the DOF problem. We will briefly discuss the two current techniques that are most relevant to our proposed method—Scheimpflug imaging and (frontoparallel) focus stacking. We have used the term frontoparallel to mean the typical imaging configuration in which the sensor plane, the lens plane, and the plane of sharp focus are mutually parallel and are perpendicular to the optical axis.

In Scheimpflug imaging (Figure 1), the lens, the sensor, or both are tilted, inducing a rotation of the plane of sharp focus. Using appropriate tilts of the lens and/or the sensor plane, we can orient the plane of sharp focus to maximize focus on a tilted object surface, or when photographing a scene with significant depth. Note that the DOF region surrounding the plane of sharp focus is still finite.

Figure 1. Scheimpflug imaging. The plane of sharp focus (PoSF) can be oriented in the way that is most suitable for focusing on a subject with significant depth or on a tilted object plane.

In focus stacking, images are captured at multiple focus depths either by changing the distance between the image plane and the lens plane (Figure 2) or by changing the focal length. Consequently, in any single image, only those regions of the scene that are at the appropriate depth form sharp images. Since the focal depth is varied across the stack, the stack collectively contains the whole scene in focus, distributed among the images.

Figure 2. Illustration of frontoparallel focus stacking by varying the image plane distance. Note that the depth of field regions are not uniform.
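To get a feel for why the depth of field regions in Figure 2 are not uniform, the standard thin-lens depth of field formulas can be evaluated at a few focus distances. This is only a rough sketch: the function name dof_limits is hypothetical, the circle of confusion (0.005 mm) is an assumed illustrative value, and the focal length, f-number and focus distances are simply borrowed from the simulation described later.

```python
def dof_limits(f, N, c, s):
    """Near and far limits of the depth of field for a thin lens.

    f: focal length, N: f-number, c: circle of confusion diameter,
    s: focus distance measured from the lens (all lengths in mm).
    """
    H = f * f / (N * c) + f  # hyperfocal distance
    near = s * (H - f) / (H + s - 2.0 * f)
    # At or beyond the hyperfocal distance the far limit is at infinity.
    far = s * (H - f) / (H - s) if s < H else float("inf")
    return near, far


f, N, c = 24.0, 2.5, 0.005  # c is an assumed value, not from the simulation
for s in (800.0, 1000.0, 1200.0):
    near, far = dof_limits(f, N, c, s)
    print(f"focus at {s:6.0f} mm: sharp from {near:7.1f} mm to {far:7.1f} mm "
          f"(depth {far - near:5.1f} mm)")
```

Under these assumptions the in-focus slab gets thicker as the focus distance increases, which is the non-uniformity noted in the caption of Figure 2.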

The images in the stack could have significantly different magnifications. Therefore, they are first registered with respect to a reference image and then, the in-focus regions are identified and blended to create an all-in-focus image. Although there are several focus measure algorithms to determine the in-focus regions within an image, a simple Laplacian of Gaussian (LoG) filter performs rather well in most situations. A simulation of the three essential steps in frontoparallel focus stacking is shown in Figure 3.

Figure 3. Simulation of the three basic steps in frontoparallel focus stacking. The focused regions in each sensor image, detected using a LoG filter, are shown for the three images in the stack and for the composite image.
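The LoG-based focus measure mentioned above can be sketched in a few lines with SciPy. This is an illustrative version, not necessarily the one used for Figure 3: the function name log_focus_measure, the filter width sigma, the averaging window, and the synthetic test image are all assumptions.

```python
import numpy as np
from scipy import ndimage


def log_focus_measure(image, sigma=1.0, window=9):
    """Per-pixel focus measure: locally averaged squared LoG response."""
    response = ndimage.gaussian_laplace(image.astype(float), sigma=sigma)
    # Average the energy of the response so isolated pixels do not dominate.
    return ndimage.uniform_filter(response ** 2, size=window)


# Synthetic test image: a random texture that is sharp on the left half only.
rng = np.random.default_rng(0)
texture = rng.random((64, 128))
image = texture.copy()
image[:, 64:] = ndimage.gaussian_filter(texture, sigma=3.0)[:, 64:]

fm = log_focus_measure(image)
print("mean focus measure, sharp half  :", fm[:, :48].mean())
print("mean focus measure, blurred half:", fm[:, 80:].mean())
```

The measure responds strongly where fine detail survives and weakly where defocus has smoothed it away, so thresholding it (or comparing it across the stack) marks the in-focus regions.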

In focus stacking, the plane of sharp focus and the DOF region surrounding it are always parallel to the lens and sensor planes in every image. Also, the DOF region extends infinitely in the direction perpendicular to the optical axis (under a first-order optics model). As a result, significant portions of the DOF region inevitably lie outside the field of view of the camera and remain untapped.

Omnifocus image synthesis using lens swivel

Our method builds on the principles underlying the above two methods for circumventing the DOF problem. The basic idea is shown in Figure 4. We capture multiple images while rotating the lens about the entrance pupil. The plane of sharp focus and the wedge-shaped depth of field region sweep through the entire scene, forming regions in sharp focus across the images in the stack. After registration, these in-focus regions can be identified and selectively blended.

Unlike in the case of frontoparallel focus stacking, the extent of the DOF region in each image generally lies within the field of view of the camera (especially for larger tilts of the plane of sharp focus).

Figure 4. Illustration of the nature of depth of field as the lens is rotated.

We will now quickly introduce a few technical terms (common in the field of first-order optics) in order to prepare the framework for further discussion.

The entrance pupil is the image of the stop (limiting aperture) seen through the lens elements preceding it. The exit pupil is the image of the stop seen through the elements following it. Furthermore, the entrance pupil is the center of projection on the object side, since all chief rays originating from the object converge (virtually) at the center of the (unaberrated) entrance pupil (shown in Figure 5). On the image side, the chief rays emerge (virtually) from the center of the exit pupil. Therefore, the entrance and exit pupils are the centers of projection on the object and image sides, respectively.

Figure 5. Entrance and exit pupils of a lens. (Please note that there is a slight mistake in the above figure: the labels for the object-side principal plane (H) and the image-side principal plane (H’) have been mistakenly interchanged.)

The ratio of the (paraxial) exit pupil size to the entrance pupil size is defined as the pupil magnification, $m_p$ (we will revisit this in the model).

Further, when we rotate a lens (say about the entrance pupil), the position of the exit pupil translates, causing the bundle of chief rays emerging from the exit pupil towards the image plane to shift. Consequently, the image field (of a scene) on the sensor plane translates (in the x-y plane of the sensor) in response to the rotation of the lens. (The exact nature of translation (and associated geometric distortion) of the image field depends on the value of the pupil magnification and the point of rotation of the lens along the optical axis.)

Ignoring the portions of the image field that are added or lost near the edges, the shift of corresponding points between two images obtained under two different rotations of the lens can (usually) be described by a transformation called the inter-image homography (see Figure 7 and Figure 8 for examples of the type of shift of the image field expected under lens tilts). If the inter-image homography is known, we can undo the shifts (and distortion) of the images obtained under lens rotations during registration. If this transformation is not known, we would have to estimate it from the observed images (for example, using the OpenCV function findHomography()). Such methods work well if the images are not blurry. However, since tilting the lens causes a significant portion of the object space to go out of focus in each individual image, a homography estimated from the images is expected to have large errors. Therefore, for our particular problem, prior knowledge of the inter-image homography is essential.

As it turns out (see the “math” for details), if the lens is rotated about the entrance pupil, an inter-image homography that is independent of object depth exists. If, instead, the lens is rotated about a point other than the center of the entrance pupil, the amount of shift experienced by a point in the image also depends on the object distance of the corresponding scene point. This is a consequence of the fact that the entrance pupil translates if the lens is rotated about any other point.
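The depth dependence can be illustrated with a deliberately simplified two-dimensional pinhole model. The sketch below is not the geometric model of this project: it rotates an entire pinhole camera (in the real setup only the lens rotates and the sensor stays fixed), and the function project, the pivot offset of 30 mm, and the scene points are hypothetical. It only shows the underlying reason for the parallax: as soon as the center of projection translates, points at different depths shift by different amounts.

```python
import numpy as np


def project(points, theta_deg, pivot_z, f=24.0):
    """Pinhole projection after rotating the camera about a pivot on the axis.

    points: (n, 2) array of (x, z) scene coordinates in mm. The center of
    projection starts at the origin, looking along +z.
    pivot_z: axial position of the pivot (0 means the pivot coincides with
    the center of projection).
    """
    t = np.deg2rad(theta_deg)
    R = np.array([[np.cos(t), np.sin(t)],
                  [-np.sin(t), np.cos(t)]])
    pivot = np.array([0.0, pivot_z])
    center = pivot + R @ (np.zeros(2) - pivot)  # center of projection after rotation
    local = (points - center) @ R               # scene points in the rotated camera frame
    return f * local[:, 0] / local[:, 1]


# Two points on the same ray through the original center of projection,
# so their images coincide before the rotation.
near_far = np.array([[50.0, 1000.0], [100.0, 2000.0]])
for pivot_z in (0.0, 30.0):
    x = project(near_far, theta_deg=5.0, pivot_z=pivot_z)
    print(f"pivot {pivot_z:4.1f} mm from the center of projection: "
          f"difference between the two image positions = {x[0] - x[1]:.5f} mm")
```

When the pivot coincides with the center of projection the two image points stay coincident after the rotation; when the pivot is displaced they separate, which is the parallax described above.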

Furthermore, if the lens has pupil magnification ( $m_p$ ) equal to one, and it is rotated about the center of the entrance pupil, the inter-image homography is a very simple matrix: a special case of a similarity transformation consisting of only translation and uniform scaling components.
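As a minimal sketch of what such a matrix looks like and how it would be undone during registration, consider the following. The helper names and the numerical values of the scale and translation are placeholders chosen for illustration; the actual values follow from the tilt angle and the lens geometry in the closed-form model, which is not reproduced here.

```python
import numpy as np


def shift_scale_homography(s, tx, ty):
    """3x3 homography made of uniform scaling s and translation (tx, ty)."""
    return np.array([[s, 0.0, tx],
                     [0.0, s, ty],
                     [0.0, 0.0, 1.0]])


def apply_homography(H, pts):
    """Map (n, 2) points through H using homogeneous coordinates."""
    homog = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]


# Placeholder values for illustration only.
H = shift_scale_homography(s=1.01, tx=0.0, ty=-0.35)
pts = np.array([[0.0, 0.0], [2.25, 2.25], [-2.25, 1.0]])  # sensor coordinates in mm

moved = apply_homography(H, pts)                       # effect of the lens tilt
restored = apply_homography(np.linalg.inv(H), moved)   # registration step
print(moved)
print(np.allclose(restored, pts))
```

Because the matrix has only three free parameters and is trivially invertible, registration reduces to a resampling with a known scale and offset.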

Simulation

We now describe the setup of the Zemax-based simulation platform we used to verify the above theory. As shown in Figure 6, a 24 mm, f/2.5 paraxial thick lens with pupil magnification equal to one (a symmetric lens) images three playing cards placed at 800 mm, 1000 mm and 1200 mm from the lens’ vertex. We introduced a slight amount of spherical aberration to ensure that the focused PSF size was comparable to the pixel size of the (simulated) digital sensor. We used the Python Zemax Dynamic Data Exchange (PyZDDE) library to automate and control the simulation. The main tasks of PyZDDE were to tilt the lens (mainly the two paraxial surfaces, the pupil surfaces, and the aperture) by the specified tilt angle, to start the image simulation tool in Zemax with the specified parameters (please see the Jupyter notebook for details), and to return the simulated image once Zemax finished the image simulation. We used Python to store the images, tagged with the simulation parameters, in an HDF5 file (using the h5py library).

Figure 6. Illustration of the setup of simulation in Zemax

For the simulation, we capture 13 (simulated) images while rotating the lens about the entrance pupil between -8° and +8°. Since the pupil magnification equals one, we expect to observe a simple shift (along with uniform scaling) of the image field. Figure 7 shows the orientation of the lens, the sensor image, and the (detected) regions of focus in the sensor image for three rotations of the lens.
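The tilt schedule can be written down in a couple of lines. Note the assumption: the text only gives the number of images and the angular range, so the even spacing used here is a guess made for illustration.

```python
import numpy as np

# Assumption: the 13 tilt angles are evenly spaced over [-8, +8] degrees.
tilt_angles_deg = np.linspace(-8.0, 8.0, 13)
step_deg = tilt_angles_deg[1] - tilt_angles_deg[0]

print(tilt_angles_deg)
print("step between consecutive tilts (deg):", step_deg)
```

With an odd number of evenly spaced tilts, the middle image is the untilted (frontoparallel) one, which makes a natural reference image for registration.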

Figure 7. Schematic of the orientation of the lens and the corresponding image formation on the sensor for three rotations. Notice the shift of the image field in the inset titled “Sensor image” and the corresponding in-focus regions detected using a LoG filter in the inset titled “Focus measure (LoG)”.

The three sensor images from Figure 7 are shown in Figure 8 for side-by-side comparison. Notice the shift of the image field between the images and the different regions of the three cards that are in focus. Also notice that when the lens is tilted, none of the cards (all of which are frontoparallel to the image sensor) is completely in focus; however, small portions of all cards (especially the middle and rear ones in this simulation) can be seen in focus, as is typical in Scheimpflug imaging.

Figure 8. The three sensor images shown side-by-side for comparison.

Following the creation of the stack, we register all the images by undoing the shift and scaling using the closed-form inter-image homography (Figure 9). Then, the in-focus regions in each image are detected using a Laplacian of Gaussian (LoG) filter, and these regions are blended to form a composite image. Observing the composite image and its focus measure in Figure 9, we see that all three cards are in focus.

Figure 9. Synthesis of all-in-focus image following registration and blending.
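The register, detect and blend steps can be sketched end to end on synthetic data. This is a simplified stand-in for the actual pipeline, under several assumptions: the scale and shift are taken as known (here an integer pixel shift and unit scale), blending is a hard per-pixel selection of the sharpest frame rather than a smooth blend, and all function names and parameter values are hypothetical.

```python
import numpy as np
from scipy import ndimage


def focus_measure(image, sigma=1.0, window=9):
    """Locally averaged squared LoG response (larger means sharper)."""
    response = ndimage.gaussian_laplace(image.astype(float), sigma=sigma)
    return ndimage.uniform_filter(response ** 2, size=window)


def register(image, scale, shift):
    """Resample a tilted-lens image onto the reference pixel grid.

    Assumes p_tilted = scale * p_ref + shift in (row, col) pixel coordinates.
    """
    # affine_transform looks up the input at matrix @ output_coords + offset.
    return ndimage.affine_transform(image, np.array([scale, scale]),
                                    offset=shift, order=1, mode="nearest")


def blend(images):
    """Take each pixel from the image with the largest focus measure."""
    stack = np.asarray(images, dtype=float)
    measures = np.stack([focus_measure(img) for img in stack])
    best = np.argmax(measures, axis=0)
    return np.take_along_axis(stack, best[None], axis=0)[0]


# Synthetic two-image stack: each frame is sharp in one half of the scene.
rng = np.random.default_rng(1)
scene = rng.random((64, 128))
blurred = ndimage.gaussian_filter(scene, sigma=3.0)
left_half = np.arange(128) < 64
frame_a = np.where(left_half, scene, blurred)
frame_b = np.where(left_half, blurred, scene)
# Give the second frame a known image-field shift of (3, -2) pixels.
frame_b = ndimage.shift(frame_b, (3, -2), order=1, mode="nearest")

composite = blend([frame_a, register(frame_b, 1.0, (3.0, -2.0))])
inner = (slice(8, -8), slice(8, -8))  # ignore border effects
print("mean abs error, blurred  :", np.abs(blurred - scene)[inner].mean())
print("mean abs error, composite:", np.abs(composite - scene)[inner].mean())
```

A hard selection like this leaves visible seams in real images; a multiresolution (pyramid) blend is a common way to soften them.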

Discussion and findings

Advantages and comparison with frontoparallel focus stacking

Registration (image alignment) using a closed-form equation is simple, especially if the pupil magnification of the lens is one.

Our method can be used to improve the depth of field around a tilted object plane in Scheimpflug imaging (more on this later).

What if the pupil magnification ( $m_p$ ) of the lens is not equal to one? And what if we rotate the lens about a point away from the entrance pupil?

The rotation of the lens induces a shift and scaling of the image field. If the pupil magnification ( $m_p$ ) equals one, the scaling is isotropic, and the shift is simple (the entire image field translates uniformly along a particular direction in the sensor plane). If the pupil magnification is not equal to one, the shift varies across the image field and manifests as image distortion. However, irrespective of the value of the pupil magnification, if the lens is rotated about the entrance pupil, then the inter-image homography (the transformation that relates corresponding points between two images obtained under two tilts of the lens) is independent of object distance.

The following figures illustrate the above concept. Figure 10 shows the setup for the ensuing qualitative analysis. The two overlapping grids on the left in Figure 10 are the coincident images of two planes in the object space: a near plane, a square of 88.15 mm on each side, and a far plane, a square of 178.3 mm on each side, placed at twice the distance of the near plane from the entrance pupil. The exact distances vary depending upon the pupil magnification, such that the images of the two planes are 4.5 mm on each side on the sensor. The z-axis of the camera frame passes through the center of both object planes. Therefore, the images of the two square grids are coincident in the frontoparallel configuration. The “image points” are the points of intersection with the image plane of the chief rays emanating from a 7×7 square object grid. The lighter shaded orange “Y” markers represent the group of image points from the near object plane in the frontoparallel configuration. The lighter shaded blue “inverted Y” markers represent the image points from the far object plane in the frontoparallel configuration. Note that in the frontoparallel configuration the images of the two object planes coincide; however, for the sake of visual clarity, we displaced the two sets of image points horizontally by 5 mm on either side of the center (shown on the right side of Figure 10).

The darker shaded markers of either color (in Figures 11-14) represent the image points following the rotation of the lens. The translations of the image points are shown by the gray arrows between the original and shifted positions. The gray level of the arrows specifies the normalized magnitude of translation; brighter indicates relatively larger translation. The figures also display the standard deviation (SD) of the arrow lengths. This statistic gives a sense of how non-uniform the translation of the image points is across the image field. If all image points shift by the same amount, the standard deviation is zero. A larger standard deviation indicates greater diversity in the shifts, and hence greater distortion. In addition to the standard deviation, we also measure the amount by which the centroid of the set of points from the two images shifts. The translation of the centroid gives a measure of the global shift of the image field.
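The two statistics are straightforward to compute from the point sets. The sketch below uses a 7×7 grid spanning a 4.5 mm square, as in the setup, but the two sets of shifted points are made up for illustration (a pure translation, and a translation with an added field-dependent term); they are not outputs of the model.

```python
import numpy as np


def shift_statistics(before, after):
    """SD of the per-point shift lengths and the shift of the centroid."""
    arrows = after - before
    lengths = np.linalg.norm(arrows, axis=1)
    centroid_shift = after.mean(axis=0) - before.mean(axis=0)
    return lengths.std(), centroid_shift


# 7x7 grid of image points spanning a 4.5 mm square (coordinates in mm).
ticks = np.linspace(-2.25, 2.25, 7)
grid = np.array([(x, y) for y in ticks for x in ticks])

# Hypothetical shifted point sets, for illustration only.
uniform = grid + np.array([0.0, -0.4])            # pure translation
distorted = uniform + 0.02 * grid * grid[:, 1:2]  # shift varies across the field

for name, pts in (("uniform", uniform), ("distorted", distorted)):
    sd, centroid_shift = shift_statistics(grid, pts)
    print(f"{name:9s} SD of shift lengths = {sd:.4f} mm, "
          f"centroid shift = {centroid_shift}")
```

A pure translation gives a standard deviation of zero, while any field-dependent shift raises it, which is why the statistic serves as a simple proxy for distortion.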

Figure 10.

Figures 11-12 show the movement of the grid of image points when the lens is rotated about a point away from the entrance pupil. Notice the parallax effect that results from rotating the lens about a point away from the entrance pupil.

Figure 11. Shift of the image fields when a lens with pupil magnification equal to 0.55 is rotated about a point away from the entrance pupil. Such rotations result in parallax between corresponding points.

Figure 12. Shift of the image fields when a lens with pupil magnification equal to 1.0 is rotated about a point away from the entrance pupil. Such rotations result in parallax between corresponding points.

When a lens is rotated about the entrance pupil, there is no parallax, as shown in Figures 13-14. Equivalently, the inter-image homography is independent of object distance.

Figure 13. Translation of the points of the image field when the lens is rotated about the entrance pupil.

Figure 14. Translation of the points of the image field when the lens is rotated about the entrance pupil. Further, since the pupil magnification is equal to one, the inter-image homography contains only uniform scaling and translation terms.

The Math – Geometric model

Please see the poster section “Geometric Model”.

References

E. H. Adelson, C. H. Anderson, J. R. Bergen, P. J. Burt, and J. M. Ogden, “Pyramid Methods in Image Processing,” RCA Engineer, vol. 29, pp. 33-41 (1984).
R. Jacobson, S. Ray, G. G. Attridge, and N. Axford, Manual of Photography (Taylor & Francis, 2000), Chap. 10.
I. Sinharoy, P. Rangarajan, and M. P. Christensen, “Geometric model of image formation in Scheimpflug cameras,” PeerJ Preprints 4:e1887v1, https://doi.org/10.7287/peerj.preprints.1887v1 (2016).
I. Sinharoy et al., PyZDDE: Release version 2.0.2, Zenodo, 10.5281/zenodo.44295 (2016).

Open-source software and tools used in the project

PyZDDE – Python Zemax Dynamic Data Exchange Toolbox
NumPy and SciPy – scientific computing libraries in Python
IPython/Jupyter – scientific computing environment and notebooks
SymPy – Symbolic computing in Python
Matplotlib – plotting library in Python
Mayavi – 3D plotting library in Python
h5py – HDF5 library for Python
