# The DOF problem in iris acquisition systems

This is the third post in the series on iris acquisition for biometrics. In the first and second posts we saw that, at least in theory, iris recognition is an ideal biometric, and we went through some of the desirable properties of an iris acquisition system. However, most current iris recognition systems require a single subject to stand (or move slowly) at a certain standoff distance from the camera in order to capture and process iris images. Wouldn’t it be nice if iris recognition could be performed simultaneously for a group of people standing or moving within a large volume? Such systems could potentially be used in crowded places such as airports, stadiums, railway stations, etc.

In this post, we will look at one of the limitations of current iris recognition systems – the limited depth of field, the fundamental cause of this limitation, and how some of the current systems are addressing this problem.

## The problem of DOF

The inability of any conventional imaging system to capture sharp images within a large volume is illustrated in Figure 1.

Figure 1 The depth of field (DOF) problem. Image of three human-figure cut-outs with sinusoidal patterns (2 lp/mm) and artificial irises, placed 11 cm apart from each other. The camera, with a lens of 80 mm focal length and an f/5 aperture, was focused on the middle cut-out (3.6 meters from the camera). It is evident that the spatial resolution in the image falls off rapidly with increasing distance from the plane of sharp focus (the middle cut-out), preventing the camera from resolving fine details uniformly across the imaging volume.

Perfect imaging corresponds to the ability of an imager to produce a scaled replica of an object in the image space [1]. When only a small portion of the light wave emerging from an infinitesimally small point source is collected through the finite opening of a camera’s aperture (Figure 2(a)), the replica in the image space is not exact even in the absence of aberrations; instead, the image of the point spreads out in space due to diffraction at the aperture. This dispersed response in the three-dimensional image space is called the Point Spread Function (PSF).

The spreading of the PSF along the transverse (xy) direction (the 2D PSF) restricts an imager’s ability to resolve fine details (spatial frequencies) in the image. For an extended object, which is made up of many points, the 2D PSF smears the responses from neighboring points into each other, causing blur. Similarly, the spread along the longitudinal (z) direction limits the ability to discriminate points staggered closely along the optical axis, creating a region of uncertainty. However, this extension of the 3D PSF along the optical axis also enables multiple spatially separated objects (or points) within a volume in the object space to form acceptably sharp images at once. Conversely, a point object may be placed anywhere within this zone and still form a satisfactory image.

This zone of tolerance in the object space is called the depth of field. The corresponding zone in the image space is called the depth of focus [2]. In this post, the acronym “DOF” is used for both depth of field and depth of focus wherever its meaning is apparent from the context. In the image space, the DOF is defined as the region of the 3D PSF where the intensity is above 80% of the central maximum [3,4]; this zone has the shape of a prolate spheroid.
In the absence of aberrations, the maximum intensity occurs at the geometric focal point, $z_g$, where the contributions from all parts of the pupil are in phase. Figure 2(b) shows the aberration-free intensity distribution, $I_n(r, \delta z)$, as a function of defocus $\delta z = z_i - z_g$ about the geometric focal point for a light source placed 100 mm from a lens of 25 mm focal length and 5 mm aperture diameter. The expression for the distribution, normalized so that $I_n(0,0)$ equals unity, is obtained using scalar diffraction theory and paraxial assumptions.
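The on-axis behavior of this distribution can be sketched numerically. The snippet below is my own illustration (the function name, sampling, and the assumed 550 nm wavelength are not from the post); it integrates the defocus phase over a circular pupil under the same scalar, paraxial assumptions, for the 25 mm, f/5 lens imaging a point 100 mm away:

```python
import numpy as np

def on_axis_intensity(dz, wavelength=550e-9, f=25e-3, D=5e-3, z_obj=100e-3):
    """Normalized on-axis intensity I_n(0, dz) of the 3D PSF at a defocus
    dz (meters) from the geometric focus, for a circular pupil under the
    scalar diffraction theory and paraxial assumptions."""
    z_g = 1.0 / (1.0 / f - 1.0 / z_obj)     # geometric image distance (thin lens)
    a = D / 2.0                             # pupil radius
    # Standard paraxial defocus parameter for a circular pupil
    u = (2.0 * np.pi / wavelength) * a**2 * dz / z_g**2
    rho = np.linspace(0.0, 1.0, 4001)       # normalized pupil radius
    integrand = np.exp(1j * u * rho**2 / 2.0) * rho
    # Trapezoidal integration over the pupil; amp(u=0) = 1/2
    amp = np.sum(0.5 * (integrand[1:] + integrand[:-1])) * (rho[1] - rho[0])
    return abs(amp)**2 / 0.25               # normalize so I_n(0, 0) = 1
```

Evaluating this for increasing `dz` reproduces the falloff in Figure 2(b): the intensity stays above the 80% threshold only within a few tens of microns of the geometric focus for this lens, and the distribution is symmetric about it.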

Figure 2 Incoherent impulse response and DOF. (a) The image A’ of a point source A spreads out in space, forming a zone of tolerance called the depth of focus (DOF) in the image space; (b) The normalized focal intensity distribution of the 3D PSF of a 25 mm, f/5 lens imaging an axial point source at a distance of 100 mm. The expression for the 3D PSF was obtained for a circular aperture using scalar diffraction theory and paraxial assumptions. The DOF, which has a prolate spheroidal shape, is defined as the region within which the intensity is above 80% of the intensity at the geometric focus. The figure shows iso-surfaces at the 0.8, 0.2, 0.05 and 0.01 intensity levels. The ticks on the left vertical side indicate the locations of the first zeros of the Airy pattern in the focal plane. The vertical axis has been exaggerated 10 times to improve the display of the distribution.

The shape—length and breadth—of the 80% intensity region (Figure 2(b)) dictates the quality of the image acquired by an imager in terms of lateral spatial resolution and DOF.

# Ambiguity function (AF) and its use in OTF analysis

## The 2D Ambiguity Function (AF) and its relation to 1D Optical Transfer Function (OTF)

The Ambiguity Function (AF) is a useful tool for optical system analysis. This post is a basic introduction to the AF and how it can be used for analyzing incoherent optical systems. We will see that the AF simultaneously contains all the OTFs associated with a rectangularly separable incoherent optical system under varying degrees of defocus [2-4]. Thus, by inspecting the AF of an optical system, one can easily predict the performance of the system in the presence of defocus. The AF has been used, for example, in the design of extended-depth-of-field cubic phase mask systems.

NOTE:

This post was created using an IPython notebook. The most recent version of the IPython notebook can be found here.

To understand the basic theory, we shall consider a one-dimensional pupil function, which is defined as:

$(1) \hspace{40pt} P(x) = \begin{cases} 1 & \text{if } |x| \leq 1, \\ 0 & \text{if } |x| > 1, \end{cases}$

The *generalized pupil function* associated with $P(x)$ is the complex function $\mathcal{P}(x)$ given by the expression [1]:

$(2) \hspace{40pt} \mathcal{P}(x) = P(x)e^{jkW(x)}$

where $W(x)$ is the aberration function. Then, the amplitude PSF of an aberrated optical system is the Fraunhofer diffraction pattern (Fourier transform with the frequency variable $f_x$ equal to $x/\lambda z_i$) of the generalized pupil function, and the intensity PSF is the squared magnitude of the amplitude PSF [1]. Note that $z_i$ is the distance between the diffraction pattern/screen and the aperture/pupil.
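Before moving to the AF itself, the defocused OTF of the pupil in equation (1) can be computed directly: for an incoherent system, the 1D OTF is the normalized autocorrelation of the generalized pupil function in equation (2). The sketch below is my own illustration (the function name and sampling are assumptions), using a quadratic defocus aberration $W(x) = W_{20}\,x^2$ with $W_{20}$ expressed in waves:

```python
import numpy as np

def otf_1d(w20, n=4096):
    """1D OTF of the unit rectangular pupil P(x) = 1 for |x| <= 1 with
    defocus aberration W(x) = w20 * x**2 (w20 in waves), computed as the
    normalized autocorrelation of the generalized pupil function."""
    x = np.linspace(-1.0, 1.0, n)
    # Generalized pupil: P(x) * exp(j*k*W(x)); with W in waves, k*W = 2*pi*w20*x^2
    gp = np.exp(1j * 2.0 * np.pi * w20 * x**2)
    G = np.fft.fft(gp, 4 * n)            # zero-pad to avoid circular wrap-around
    ac = np.fft.ifft(np.abs(G)**2)       # autocorrelation of the generalized pupil
    return ac[:n] / ac[0]                # normalize so OTF(0) = 1; index m ~ shift 2m/n
```

For `w20 = 0` the magnitude reduces to the familiar triangle function of the diffraction-limited 1D system (e.g. 0.5 at half the cutoff frequency), while increasing `w20` suppresses the mid-frequency response; this defocus behavior is exactly what the AF lets us read off at a glance.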

# Desirable properties of iris acquisition systems

In my last post, I described briefly how iris recognition works. I also described the four main modules that make up a general iris recognition system. In this post I am going to discuss some of the desirable properties of the iris acquisition module, or simply the iris camera. Although the acquisition module is the first block in the iris authentication/verification pipeline and plays a very important role, it has received much less attention from researchers than the other modules.

Iris recognition algorithms have become quite mature and robust in the past decade due to the rapid expansion of research in both industry and academia [1–3]. Figure 1 shows a plot of the scientific publications (in English) on iris recognition between 1990 and 2013. The plot also shows the relative number of papers exclusively addressing the problems of the acquisition module, which is minuscule compared to the total number of papers on iris recognition.

Figure 1. Number of publications in (English) journals on iris recognition between 1990 and 2013. The data was collected using Google scholar by searching the keywords IRIS + RECOGNITION + ACQUISITION + SEGMENTATION + NORMALIZATION + MATCHING. The plot shows that although the total number of research papers on iris recognition has grown tremendously during the last decade, the problems associated with iris acquisition have been overlooked.

The accuracy of iris recognition is highly dependent on the quality of iris images captured by the acquisition module. The key design constraints of the acquisition system are spatial resolution, standoff distance (the distance between the front of the lens and the subject), capture volume, subject motion, subject gaze direction, and the ambient environment [4]. Perhaps the most important of these are spatial resolution, standoff distance, and capture volume. They are described in detail in the following paragraphs.

# Primer on iris recognition

As part of my PhD research, I am working on extending the depth of field of iris recognition cameras. In a series of blog posts (in the near future) I would like to share some of the things that I have learnt during the project. This post, the first in the series, is an introduction to iris recognition biometric technology. I believe the material presented here could help someone new to iris recognition get a quick yet comprehensive overview of the field. In the following paragraphs, I describe the iris anatomy and what makes it so special for biometrics, followed by the general basis of iris-based verification and the four major constituents of a general iris recognition system.

The human iris is the colored portion of the eye, with a diameter that ranges between 10 mm and 13 mm [1,2]. The iris is perhaps the most complex tissue structure in the human body that is visible externally. The iris pattern has most of the desirable properties of an ideal biomarker, such as uniqueness, stability over time, and relatively easy accessibility. Being an internal organ, it is also protected from damage due to injuries and/or intentional tampering [3]. The presence or absence of specific features in the iris is largely determined by heredity; however, the spatial distribution of the cells that form a particular iris pattern during embryonic development is highly chaotic. This pseudo-random morphogenesis, determined by epigenetic factors, results in unique iris patterns in all individuals, including identical twins [2,4,5]. Even the iris patterns of the two eyes of the same individual are largely different. The diverse microstructures in the iris that manifest at multiple spatial scales [6] are shown in Figure 1. These textures, unique to each eye, provide distinctive biometric traits that are encoded by an iris recognition system into distinctive templates for the purpose of identity authentication. It is important to note that the color of the iris is not used as a biomarker: since it is genetically determined, it is not sufficiently discriminative.

Figure 1. Complexity and uniqueness of the human iris. Fine textures on the iris form unique biometric patterns, which are encoded by iris recognition systems. (Original image processed to emphasize features).

# Plotting algebraic surfaces using Mayavi

Who says math is not beautiful? Anyone who doubts the beauty in math must check out algebraic surfaces.

An implicit function has the form:

$F(x,y,z, ...)=c$

where, $c$ is an arbitrary constant.

For example, the implicit equation of the unit circle is $x^2 + y^2 = 1$ and the equation $x^2 + y^2 + z^2 = 1$ describes a sphere of radius 1.

An algebraic surface is described by an implicit function $F(x,y,z) = c$. It may be rendered using Mayavi by evaluating the expression $F(x,y,z)-c$ on a regular grid and then generating an isosurface at the contour value 0.

Here are some examples of algebraic surfaces plotted using Mayavi:

I wrote a quick function called implicit_plot for plotting algebraic surfaces using Mayavi. The most important argument to the function is, of course, the expression $F(x,y,z)-c$ as a string. It is probably not the best implementation, but the idea is to show how to plot implicit surfaces. The code snippet is included at the end of this post. Suggestions for improvement are always welcome.
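The core of such a function can be sketched as follows. This is my own rough re-creation, not the original snippet: the helper `evaluate_on_grid` and all parameter names are assumptions, and the grid evaluation is split out so it can be used (and tested) without a Mayavi installation:

```python
import numpy as np

def evaluate_on_grid(expr, bounds=(-3.0, 3.0), samples=64):
    """Evaluate the implicit expression F(x,y,z)-c (given as a string in
    x, y, z) on a regular 3D grid; return the grid and sampled values."""
    lo, hi = bounds
    step = samples * 1j   # mgrid convention: imaginary step = number of samples
    x, y, z = np.mgrid[lo:hi:step, lo:hi:step, lo:hi:step]
    # Evaluate the string with only the grid arrays (and numpy) in scope
    values = eval(expr, {"__builtins__": {}}, {"x": x, "y": y, "z": z, "np": np})
    return x, y, z, values

def implicit_plot(expr, bounds=(-3.0, 3.0), samples=64):
    """Render the algebraic surface expr == 0 as a zero isosurface."""
    from mayavi import mlab   # lazy import: grid code runs without Mayavi
    x, y, z, values = evaluate_on_grid(expr, bounds, samples)
    mlab.contour3d(x, y, z, values, contours=[0.0])
    mlab.show()
```

For example, `implicit_plot("x**2 + y**2 + z**2 - 1", bounds=(-2.0, 2.0))` renders the unit sphere. Increasing `samples` smooths the isosurface at the cost of a cubically growing grid.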

# Progression of pixel resolution in digital cameras

Thanks to the megapixel war, pixels in digital sensors have shrunk considerably over the years. Consequently, the pixel resolution (the number of pixels in a digital image) has improved. Currently, the pixel resolution (assuming a gray-scale sensor) can compete with the aerial resolution of lenses, at least on paper. This post is about a plot I created some time back (for a different presentation) which shows the growth of pixel count and the shrinking of pixel size over the years for three popular segments of digital cameras. The green line plots the diffraction-limited (aberration-free) optical resolution 3 stops below the maximum aperture available for off-the-shelf lenses during the same period. The optical resolution line doesn’t mean much by itself; it is plotted to compare the sensor resolution with the optical resolution over time. The graph shows that while the sensor resolution has improved by leaps and bounds, the optical resolution hasn’t. This is no surprise, because the optical resolution is limited by the fundamental nature of light: diffraction. Improving the optical resolution by traditional means is very expensive and results in bulky lenses. The time is just right for exploring computational methods for improving the system resolution of imaging systems.

Other interesting data-points in the graphs are:

1. The Kodak DCS 460, based on a Nikon SLR body, was one of the first professional digital cameras.

2. The Sharp J-SH04 was the first cellphone with a built-in camera.

The number of megapixels and the pixel resolution (through the decrease in pixel size) have increased rapidly for cellphone and point-and-shoot cameras, probably driven by marketing rather than by picture quality. In the more professional segment, the strategy has clearly been different. This may be for two main reasons: one, image quality, dictated by noise, color reproduction, low-light performance, etc., is more important to these photographers; and two, building high-quality large lenses is relatively more expensive.

# Does the F/2.0 phone camera match the F/2.0 DSLR’s optical resolution?

Summary: Most cellphone cameras today come with “large aperture” lenses such as F/2.0 or F/2.2. Since the optical resolution of a “perfect” lens (one devoid of any aberration) is inversely proportional to the F/# (the relation is shown in equation 1), one may assume that an F/2.0 cellphone camera lens should be able to resolve fine details on an object as well as an F/2.0 DSLR lens.

$\Delta_r = \frac{1}{1.22 \lambda F/\#}$         (1)

where, $F/\# = f/D$, $f$ is the focal length, and $D$ is the aperture diameter of the lens.

The above image shows two lenses with the same F/# and the same 35 mm-equivalent focal length. The one on the left is a complete cellphone camera module (lens plus image sensor) whose 35 mm-equivalent focal length is about 28 mm; the one on the right is a 28 mm DSLR lens.

The optical resolution of a lens determines how closely two line objects or point objects can be placed before they can no longer be distinguished from each other when viewed through the lens. The minimum resolvable separation, $\Delta_s$, is the inverse of $\Delta_r$. The larger the value of $\Delta_r$ (usually measured in line-pairs-per-mm), the better the resolving ability of the lens. So if the f-number, $F/\#$, of a cellphone lens matches the $F/\#$ of a DSLR lens, equation (1) seems to suggest that they have the same optical resolution! However, as shown in the following paragraphs, the ability of an F/2.0 DSLR lens to resolve fine details on an object is in fact much better than that of an F/2.0 cellphone camera lens. Concretely, if the focal length and the aperture diameter of the cellphone lens are $1/k$ ($k>1$) times the respective parameters of the DSLR lens, then the $F/\#$s of the two systems are equal, but the resolving ability of the cellphone lens in object space is $1/k$ times that of the DSLR lens. For example, a 50 mm, F/2.0 lens (D = 25 mm), which is a first-order approximation of a 50 mm DSLR lens, can resolve details as fine as 54 microns when focused at a distance of 2 meters, whereas a 5 mm, F/2.0 lens (D = 2.5 mm), a close approximation of a typical cellphone camera lens, can resolve details only down to about 540 microns at the same distance. This is essentially a manifestation of the difference in the magnifications (or angular resolution) of the two lenses.
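This comparison is a few lines of thin-lens arithmetic. The sketch below is my own (the function name and the assumed 550 nm wavelength are not from the post, which is why the numbers land near 52 µm rather than exactly 54 µm): it takes the Rayleigh-criterion separation at the sensor, $1.22\,\lambda\,F/\#$, and projects it back into object space through the magnification.

```python
def object_space_resolution(f, fnum, z_obj, wavelength=550e-9):
    """Smallest resolvable object detail (Rayleigh criterion) for a thin
    lens of focal length f (m) and f-number fnum, focused at z_obj (m)."""
    delta_sensor = 1.22 * wavelength * fnum   # resolvable separation at the sensor
    z_img = 1.0 / (1.0 / f - 1.0 / z_obj)     # thin-lens image distance
    magnification = z_img / z_obj
    return delta_sensor / magnification       # project back into object space

dslr = object_space_resolution(f=50e-3, fnum=2.0, z_obj=2.0)   # ~52 microns
phone = object_space_resolution(f=5e-3, fnum=2.0, z_obj=2.0)   # ~535 microns
```

The ratio `phone / dslr` comes out close to k = 10, confirming that equal F/#s hide a tenfold difference in object-space resolving ability.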