In this post, we will look at one of the limitations of current iris recognition systems – the limited depth of field – the fundamental cause of this limitation, and how some current systems are addressing the problem.
The inability of any conventional imaging system to capture sharp images within a large volume is illustrated in Figure 1.
Perfect imaging corresponds to the ability of an imager to produce a scaled replica of an object in the image space [1]. When only a small portion of the light wave emerging from an infinitesimally small point source of light is collected through the finite opening of a camera’s aperture (Figure 2 (a)), the replica in the image space is not exact even in the absence of aberrations; instead, the image of the point spreads out in space due to diffraction at the aperture. This dispersed response in the three-dimensional image space is called the point spread function (PSF). The spreading of the PSF along the transverse (xy-axis) direction (a 2D PSF) restricts an imager’s ability to resolve fine details (spatial frequencies) in the image. For an extended object, which is made of several points, the 2D PSF smears the responses from neighboring points into each other, causing blur. Similarly, the spread along the longitudinal direction (z-axis) limits the ability to discriminate points staggered closely along the optical axis, causing a region of uncertainty; however, the extension of the 3D PSF along the optical axis enables multiple spatially separated objects (or points) within a volume in the object space to form acceptably sharp images at once. Conversely, a (point) object in the object space may be placed anywhere within this zone and still form a satisfactory image. This zone of tolerance in the object space is called the depth of field. The corresponding zone in the image space is called the depth of focus [2]. In this post, the acronym “DOF” is used for both depth of field and depth of focus wherever its meaning is apparent from the context. In the image space, the DOF is defined as the region of the 3D PSF where the intensity is above 80% of the central maximum [3,4]. This zone has the shape of a prolate spheroid.
In the absence of aberrations, the maximum intensity occurs at the geometric focal point, where contributions from all parts of the pupil are in phase. Figure 2 (b) shows the aberration-free intensity distribution as a function of defocus about the geometric focal point, for a light source placed 100 millimeters from a lens of focal length 25 mm and aperture diameter 5 mm. The expression for the distribution—normalized so that the peak value equals unity—is obtained using scalar diffraction theory and paraxial assumptions.
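To make the 80% criterion concrete, here is a minimal numerical sketch of the on-axis intensity for the Figure 2(b) setup, under the same scalar-diffraction and paraxial assumptions. The 850 nm wavelength is an assumption for illustration (the post does not state one):

```python
import numpy as np

# On-axis intensity near focus for the Figure 2(b) setup: source at 100 mm,
# focal length 25 mm, aperture diameter 5 mm.  Paraxial scalar diffraction
# gives I(dz) proportional to sinc^2(a^2*dz/(2*lam*zi^2)), where np.sinc(x)
# is the normalized sinc, sin(pi*x)/(pi*x).  The wavelength is an assumption.
lam = 850e-6                  # wavelength in mm (assumed: 850 nm)
f, s, a = 25.0, 100.0, 2.5    # focal length, source distance, aperture radius (mm)
zi = 1.0/(1.0/f - 1.0/s)      # paraxial image distance ~ 33.33 mm

dz = np.linspace(-0.3, 0.3, 6001)            # defocus about best focus (mm)
I = np.sinc(a**2 * dz / (2*lam*zi**2))**2    # normalized axial intensity

zone = dz[I >= 0.8]           # the "80% of central maximum" zone
print("depth of focus ~ {:.3f} mm".format(zone.max() - zone.min()))
```

The printed width is the axial extent of the 80% zone for this particular (assumed) wavelength; it shrinks as the aperture grows, as discussed below.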
The shape—length and breadth—of the 80% intensity region (Figure 2(b)) dictates the quality of the image acquired by an imager in terms of lateral spatial resolution and DOF.
A first-order optical simulation demonstrating the effect of the DOF on image acquisition at varying depths is shown in Figure 3. For this simulation, a 100 mm focal length, f/5 lens focused at 1300 mm is used. In this setup the imager has a DOF of 9.5 mm in the object space (calculated by applying the lens equation to the extremes of the DOF in the image space). As can be seen, irises located outside the DOF region are severely blurred. It has been shown in [5,6] that iris recognition performance deteriorates quickly with increasing amounts of defocus in the captured iris images.
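The object-space DOF quoted above can be approximated by mapping an image-space focus tolerance through the thin-lens equation. The sketch below assumes an image-space tolerance of ±2λN² and an 850 nm wavelength, so its number differs somewhat from the 9.5 mm figure, which uses the exact 80%-intensity criterion:

```python
import numpy as np

# Map an image-space depth of focus to an object-space DOF via the thin-lens
# equation, for the Figure 3 setup: 100 mm focal length, f/5, focused at 1300 mm.
# The +/- 2*lam*N**2 image-space tolerance and 850 nm wavelength are assumptions.
f, N, s = 100.0, 5.0, 1300.0
lam = 850e-6                        # mm (assumed)
v = 1.0/(1.0/f - 1.0/s)             # in-focus image distance ~ 108.33 mm
delta = 2*lam*N**2                  # assumed one-sided image-space tolerance, mm

def obj(vi):
    """Conjugate object distance for image distance vi (thin lens)."""
    return 1.0/(1.0/f - 1.0/vi)

far, near = obj(v - delta), obj(v + delta)
print("object-space DOF: {:.1f} mm to {:.1f} mm ({:.1f} mm deep)"
      .format(near, far, far - near))
```

Note how a fraction of a millimeter of tolerance at the image plane maps to roughly a centimeter in object space at this magnification.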
In conventional imaging systems, an increase in DOF is realized by making the system aperture smaller. However, stopping down the aperture to increase DOF is not a good solution for iris recognition, as the increase in DOF is accompanied by a loss of optical resolution and a loss of light. As shown in Figure 4, decreasing the size of the aperture elongates the PSF along the optical axis, which yields a larger DOF in the object space; however, the spread of the PSF along the transverse direction also increases, resulting in a loss of optical spatial resolution. The relation between the DOF and the lateral optical resolution is as follows:
The above equation also suggests that an n-fold increase in DOF results in exactly an n-fold loss of light [7]. The loss of light results in a decrease in system SNR.
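These scalings can be tabulated directly. The sketch below assumes a diffraction-limited imager, for which the DOF grows as N², the optical cutoff frequency falls as 1/N, and the collected light falls as 1/N²:

```python
import numpy as np

# Relative scaling with F-number N for a diffraction-limited imager (assumed):
#   diffraction DOF       ~ N**2    (grows)
#   optical cutoff freq.  ~ 1/N     (shrinks)
#   collected light       ~ 1/N**2  (shrinks)
# so an n-fold DOF gain costs exactly an n-fold loss of light.
N = np.array([2.0, 4.0, 8.0])
dof = N**2 / N[0]**2        # normalized to the widest stop
cutoff = N[0] / N
light = N[0]**2 / N**2
for row in zip(N, dof, cutoff, light):
    print("f/{:<4} DOF x{:<5} cutoff x{:<6} light x{}".format(*row))
```

The product of the DOF gain and the light loss is identically 1: whatever factor of DOF is bought by stopping down is paid for, one-for-one, in light.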
There are various techniques for increasing the capture volume for iris acquisition, although none is yet ideal. Examples include the use of multiple cameras (with both time and spatial multiplexing), large telescopic lenses, pan-tilt-zoom camera systems, and computational imaging techniques such as wavefront coding. Figure 5 is a schematic showing some state-of-the-art iris acquisition systems and their capture volumes.
(The figures in this post were generated using Matplotlib (a Python plotting library), Mayavi (a Python 3D plotting library), and Blender.)
Links to posts in this series
References
The ambiguity function (AF) is a useful tool for optical system analysis. This post is a basic introduction to the AF and how it can be used to analyze incoherent optical systems. We will see that the AF simultaneously contains all the OTFs associated with a rectangularly separable incoherent optical system for varying degrees of defocus [2-4]. Thus, by inspecting the AF of an optical system, one can easily predict the performance of the system in the presence of defocus. It has been used in the design of the extended-depth-of-field cubic phase mask system.
NOTE:
This post was created using an IPython notebook. The most recent version of the IPython notebook can be found here.
To understand the basic theory, we shall consider a one-dimensional pupil function $P(x)$, which is defined as:
The *generalized pupil function* associated with $P(x)$ is the complex function given by the expression [1]:
where $W(x)$ is the aberration function. The amplitude PSF of an aberrated optical system is then the Fraunhofer diffraction pattern (the Fourier transform, with the frequency variable equal to $x/\lambda z$) of the generalized pupil function, and the intensity PSF is the squared magnitude of the amplitude PSF [1]. Note that $z$ is the distance between the diffraction pattern/screen and the aperture/pupil.
For the purpose of analyzing optical systems using the ambiguity function, it is convenient to separate the defocus term ($W_{20}$ is the wavefront focus error coefficient [3]) from the rest of the aberration terms in the aberration function $W(x)$:
Since the amplitude PSF is the Fourier transform of the generalized pupil function (see above), and the amplitude transfer function (ATF) is the Fourier transform of the amplitude PSF, the ATF is proportional to a scaled pupil function. Specifically, the ATF is:
The (one dimensional) optical transfer function (OTF) is defined as the normalized autocorrelation of the ATF:
Writing $u$ for the normalized spatial frequency, we have
The simplified equation is written as follows:
The ambiguity function, originally used for radar analysis, is defined as the Fourier transform of the product $P(x + u/2)\,P^{*}(x - u/2)$:
Comparing the simplified expression of the aberrated OTF with the ambiguity function $A(u, y)$, we see that:
This implies that the ambiguity function associated with a base function contains the OTF along the line $y = (2W_{20}/\lambda)\,u$, which has a slope of $2W_{20}/\lambda$ and passes through the origin of the $(u, y)$ plane, as shown in the following figure.
Note:
1. The “base function” itself does not contain the defocus term, although it may contain other aberrations (see equation (3)).
2. Equation (6) is the “normalized autocorrelation” of the generalized pupil function, which implies that the maximum value of the function is 1. In equation (7) there is no explicit normalization, so the maximum value can be greater or less than 1. We must be mindful of this when comparing the AF to the OTF, and we prefer to choose an appropriate amplitude for the base function (as shown below).
In fact, the whole 2-D AF contains the OTFs for various values of defocus arranged in a polar fashion. Of special interest is the in-focus OTF, for which $W_{20} = 0$ and which hence corresponds to the line along the $u$ axis, i.e.
If the only aberration is that of defocus, the residual aberration term vanishes in the expression for the generalized pupil function. Therefore, the base function we consider is the bare rectangular pupil:
First, we will consider the evaluation of the following integral of the general form:
where the function is defined as:
The following figure, which shows the regions of overlap between the two shifted pupil functions $P(t + u/2)$ and $P(t - u/2)$ (for $u > 0$), is useful for understanding the finite limits of the integral.
When $-2 \le u < 0$:
where the function used above is the normalized sinc, defined as $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$.
When $0 \le u \le 2$:
Note that we really didn’t have to break the integral into two parts. Instead, we could just evaluate the integral between the limits $|u|/2 - 1$ and $1 - |u|/2$.
We can combine the two cases and write the equation more compactly as:
In our example, for which the base function is the rectangular pupil, the amplitude is chosen so that the maximum value of the AF is 1. We can now write:
Now, from equation (8), the OTF is the AF evaluated along the line $y = (2W_{20}/\lambda)\,u$. Therefore we can write the expression for the OTF directly from the AF.
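As a sanity check, the closed-form OTF obtained this way can be compared against a direct numerical autocorrelation of the defocused generalized pupil. The sketch below assumes a unit-height rect pupil on $[-1, 1]$, with `w` denoting $W_{20}$ in units of $\lambda/2$:

```python
import numpy as np

# Closed-form defocused OTF of a 1-D rect pupil on [-1, 1] versus a direct
# numerical autocorrelation of the generalized pupil.  w is the defocus W_20
# in units of lambda/2; np.sinc(x) = sin(pi*x)/(pi*x).
def otf_analytic(u, w):
    return (1 - np.abs(u)/2.0) * np.sinc(w*u*(2 - np.abs(u)))

def integrate(f, x):
    """Simple trapezoidal rule (keeps the sketch self-contained)."""
    return np.sum((f[1:] + f[:-1]) * np.diff(x)) / 2.0

def otf_numeric(u, w, n=4001):
    x = np.linspace(-2.0, 2.0, n)
    def gp(x):  # generalized pupil: rect with quadratic (defocus) phase
        return np.where(np.abs(x) <= 1, np.exp(1j*np.pi*w*x**2), 0)
    num = integrate(gp(x + u/2.0)*np.conj(gp(x - u/2.0)), x)
    den = integrate(np.abs(gp(x))**2, x)   # normalization -> OTF(0) = 1
    return (num/den).real

for u in (0.25, 0.5, 1.0, 1.5):
    for w in (0.0, 1.0, 2.0):
        assert abs(otf_analytic(u, w) - otf_numeric(u, w)) < 5e-3
print("closed form and numerical autocorrelation agree")
```

The agreement (to within the crude quadrature error at the pupil edges) confirms the sinc-form OTF used in the plots below.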
Note:
In [2], the authors use the *un-normalized* sinc function, defined as $\mathrm{sinc}(x) = \sin(x)/x$. Hence, there is an “extra” $\pi$ within the sinc function in their paper. We use the normalized definition, as it is used in [1] and in NumPy.
The figure below (after the code block) shows the sinc function for $-8 \le x \le 8$. It can be seen in the figure that the roots of the (normalized) sinc function occur at non-zero integer values of $x$. Therefore, the zero-value loci of the AF associated with the one-dimensional rectangular pupil are found by setting $y\,(2 - |u|)$ equal to a non-zero integer $n$.
from __future__ import print_function, division
import numpy as np
import scipy as sp
import matplotlib.cm as cm
import matplotlib.pyplot as plt
from IPython.display import Image
%matplotlib inline
x = np.linspace(-8, 8, 150)
y = np.sinc(x)
fig, ax = plt.subplots(1, 1)
ax.plot(x, y, label='$sinc(x)$')
ax.set_ylim(-0.3, 1.02)
ax.set_xlim(-8, 8)
ax.set_xlabel('x')
ax.set_title('sinc(x)', y=1.02)
rootsx = list(range(-8, 0)) + list(range(1, 9))  # non-zero integer roots
ax.scatter(rootsx, np.zeros(len(rootsx)), c='r', zorder=20)
ax.grid()
plt.show()
The following figures plot the AF of the one-dimensional rectangular pupil and the variation of the OTF with focus error $W_{20}$. In each figure, the plot on the left shows the ambiguity function and several lines corresponding to different defocus errors $W_{20}$. The zero-value loci are denoted by the gray dashed lines. The intersections of the zero-value loci with the OTF line(s), which in the OTF plot represent the frequencies where the OTF is zero, are shown by the cross (‘x’) markers. The plots on the right show the corresponding variation of the OTF for the different amounts of defocus.
def plot_1dRectAF(w20LambdaBy2, N=15, umin=-2, umax=2, ymin=-8, ymax=8):
    """Rudimentary function to show the AF of a 1-D rect pupil.

    Parameters
    ----------
    w20LambdaBy2 : list of real values
        specifies the amount of defocus error W_{20} in terms of lambda/2.
        The slope of the OTF line associated with W_{20} in the AF plane
        is 2*W20/lambda.
    N : integer
        number of zero-value loci to plot
    """
    u = np.linspace(umin, umax, 200)
    y = np.linspace(ymin, ymax, 400)
    uu, yy = np.meshgrid(u, y)  # grid
    # Numpy's normalized sinc function = sin(pi*x)/(pi*x)
    af = (1 - np.abs(uu)/2)*np.sinc(yy*(2 - np.abs(uu)))
    # plot
    fig = plt.figure(figsize=(12, 7))
    ax1 = fig.add_axes([0.12, 0, 0.42, 1.0])   # [left, bottom, width, height]
    ax2 = fig.add_axes([0.6, 0.12, 0.38, 0.76])
    im = ax1.imshow(af, cmap=cm.bwr, origin='lower',
                    extent=[umin, umax, ymin, ymax],
                    vmin=-1.0, vmax=1.0, aspect=2./6)
    plt.colorbar(im, ax=ax1, shrink=0.77, aspect=35)
    # zero-value loci
    for n in range(1, N+1):
        zvl = n/(2.0 - np.abs(u[1:-1]))
        ax1.plot(u[1:-1], zvl, color='#888888', linestyle='dashed')
        ax1.plot(u[1:-1], -zvl, color='#888888', linestyle='dashed')
    # OTF lines on the AF plane
    for elem in w20LambdaBy2:
        otfY = elem*u  # OTF line in AF with slope 2w_{20}/lambda
        ax1.plot(u, otfY)
    # intersections
    def get_intersections(b):
        # b is tan(phi), i.e. 2w_{20}/lambda
        n = np.linspace(1, np.floor(b), int(np.floor(b)))
        u1 = 1 + np.sqrt(1 - n/b)
        u2 = 1 - np.sqrt(1 - n/b)
        u = np.hstack((u1, u2))
        y = np.hstack((u1*b, u2*b))
        return u, y
    for elem in w20LambdaBy2:
        intersectionsU, intersectionsY = get_intersections(elem)
        ax1.scatter(intersectionsU, intersectionsY, marker='x', c='k', zorder=20)
    # OTF plots
    for elem in w20LambdaBy2:
        otf = (1 - np.abs(u)/2)*np.sinc(elem*u*(2 - np.abs(u)))
        ax2.plot(u, otf, label='$W_{20}' + '= {}lambda/2$'.format(elem))
    # axis settings
    ax1.set_xlim(umin, umax)
    ax1.set_ylim(ymin, ymax)
    ax1.set_title('2-D AF of 1-D rect pupil P(x)', y=1.01)
    ax1.set_xlabel('u', fontsize=14)
    ax1.set_ylabel('y', fontsize=14)
    ax2.axhline(y=0, xmin=-2, xmax=2, color='#888888', zorder=0, linestyle='dashed')
    ax2.grid(axis='x')
    ax2.legend(fontsize=12)
    ax2.set_xlim(-2, 2)
    ax2.set_ylim(-0.2, 1.005)
    ax2.set_title("Optical Transfer Function", y=1.02)
    ax2.set_xlabel("u (scaled spatial frequency)", fontsize=14)
    plt.show()
Plots of the AF and OTF for $W_{20} = m\lambda/2$ with $m = 0, 1, 2, 3$. The equation for the OTF was obtained directly from the ambiguity function by replacing $y$ with $(2W_{20}/\lambda)\,u$.
The points of intersection between the OTF lines $y = (2W_{20}/\lambda)\,u$ and each zero-value locus $y = n/(2 - |u|)$ (for $u > 0$) can be found by equating the two, which gives $u = 1 \pm \sqrt{1 - n\lambda/(2W_{20})}$.
For example, the line with slope 1 (i.e., $W_{20} = \lambda/2$) intersects the $n = 1$ zero-value locus at $u = 1$; the line with slope 2 ($W_{20} = \lambda$) intersects the $n = 1$ locus at $u = 1 \pm 1/\sqrt{2}$ and the $n = 2$ locus at $u = 1$; and so on. As we can see, there is an odd number of intersections of a line with the zero-value loci when the defocus error is an integer multiple of $\lambda/2$. This corresponds to an odd number of zero-value points in the OTF plots when the defocus error is an integer multiple of $\lambda/2$.
Note that we only found the abscissae (the $u$ coordinates) of the intersection points above.
plot_1dRectAF(w20LambdaBy2=[0, 1, 2, 3])
Plots of the AF and OTF for $W_{20}/(\lambda/2) = 0.5, 0.99, 1.5$, and $3.6$ (the particular values were chosen arbitrarily, and the $W_{20} = 0$ line is plotted for comparison). In this case we find that there is an even number of intersections of each line with the zero-value loci in the AF plot. Correspondingly, there is an even number of zero-value points in each OTF whose focus error is not an integer multiple of $\lambda/2$.
plot_1dRectAF(w20LambdaBy2=[0, 0.5, 0.99, 1.5, 3.6])
Also, note from the above plots that the first zero crossing of the OTF occurs as $W_{20}$ approaches $\lambda/2$. This point is also known as the traditional, or Hopkins, criterion for misfocus [4].
Cubic phase masks (CPMs) have been used to make hybrid optical systems largely invariant to defocus, thus extending the depth of field of such systems. For details on CPM systems, refer to [4].
The expression for the ambiguity function of a cubic phase mask is given below [4]:
We will numerically evaluate the above expression for the AF of the cubic phase mask (CPM), recognizing that the expression is of the form of an inverse Fourier transform of the product $g(u, t)$, where
Here are the steps for rendering the 2-D AF for CPM:
Create a vector $t$ (i.e., between $-2$ and $2$, which is the maximum region of integration)
For every “u” in the vector of spatial frequencies, create the sequence $g(u, t)$ using the following rule:
a. if $-1 \le u/2 < 0$, then evaluate $g(u, t)$ for $-u/2 - 1 < t < u/2 + 1$
b. if $0 \le u/2 \le 1$, then evaluate $g(u, t)$ for $u/2 - 1 < t < -u/2 + 1$
Take the inverse Fourier transform of each sequence (with $t$ as the transform variable), i.e. $\mathrm{IFFT}\{g(u, t)\}$
The expression for the OTF for the CPM is then given by:
We will use numerical integration to generate the plots of OTFs for the CPM with various amounts of defocus.
from scipy.integrate import quad
import warnings
# Turn on the warnings to ensure that the numerical integration is "reliable"
warnings.simplefilter(action='error', category=np.ComplexWarning)
warnings.simplefilter(action='always', category=sp.integrate.IntegrationWarning)

ifft = np.fft.fft   # note: forward FFT; only the magnitude of the AF is used
fftshift = np.fft.fftshift
fftfreq = np.fft.fftfreq

# cubic phase mask parameters
alpha = 90
gamma = 3
umin, umax = -2, 2
w20LambdaBy2 = [0, 5, 15]  # amounts of defocus in units of lambda/2

uVec = np.linspace(umin, umax, 300)
N = 512  # number of samples along "t" ... and for FFT
L = 1

def gut(t, alpha, gamma, u):
    return 0.5*np.exp(1j*alpha*((t + u/2)**gamma - (t - u/2)**gamma))

guy = np.empty((N, len(uVec)))
t = np.linspace(-2*L, 2*L, N)
dt = (4*L)/(N - 1)
for i, u in enumerate(uVec):
    g = np.zeros_like(t, dtype='complex64')
    if -1 <= u/2.0 < 0:
        mask = (t > (-u/2 - 1))*(t < (u/2 + 1))
        g[mask] = gut(t[mask], alpha, gamma, u)
        guy[:, i] = np.abs(fftshift(ifft(g)))
    elif 0 <= u/2.0 <= 1:
        mask = (t > (u/2 - 1))*(t < (-u/2 + 1))
        g[mask] = gut(t[mask], alpha, gamma, u)
        guy[:, i] = np.abs(fftshift(ifft(g)))

# Normalize to make maximum value = 1
guy = guy/np.max(np.abs(guy.flat))

yindex = fftshift(fftfreq(N, dt))
ymin, ymax = yindex[0], yindex[-1]

fig = plt.figure(figsize=(12, 7))
ax1 = fig.add_axes([0.12, 0, 0.5, 1.0])   # [left, bottom, width, height]
ax2 = fig.add_axes([0.66, 0.23, 0.32, 0.54])
im = ax1.imshow(guy**0.8, cmap=cm.YlGnBu_r, origin='lower',
                extent=[umin, umax, ymin, ymax],
                vmin=0.0, vmax=1.0, aspect=1./40)
plt.colorbar(im, ax=ax1, shrink=0.55, aspect=35)

# OTF lines in the AF plane
for elem in w20LambdaBy2:
    otfY = elem*uVec  # OTF line in AF with slope 2w_{20}/lambda
    ax1.plot(uVec, otfY, alpha=0.6, linestyle='solid')
ax1.set_xlim(umin, umax)
ax1.set_ylim(ymin, ymax)
ax1.set_xlabel('u', fontsize=14)
ax1.set_ylabel('y', fontsize=14)
ax1.set_title('2-D AF of 1-D cpm', y=1.01)

# Magnitude plots of the OTF of the cpm
def otf_cpm(t, alpha, gamma, u, w20LamBy2):
    return (0.5*np.exp(1j*alpha*((t + u/2)**gamma - (t - u/2)**gamma))
            *np.exp(1j*2*np.pi*u*w20LamBy2*t))

def complex_quad(func, a, b, **kwargs):
    """Numerically integrate a complex-valued function between limits a and b.

    Adapted from stackoverflow.com/questions/5965583 (use scipy integrate.quad
    to integrate complex numbers).
    """
    def real_func(x, *args):
        return np.real(func(x, *args))
    def imag_func(x, *args):
        return np.imag(func(x, *args))
    real_integral = quad(real_func, a, b, **kwargs)
    imag_integral = quad(imag_func, a, b, **kwargs)
    return (real_integral[0] + 1j*imag_integral[0],
            real_integral[1:], imag_integral[1:])

for elem in w20LambdaBy2:
    Huw = np.empty_like(uVec, dtype='complex64')
    for i, u in enumerate(uVec):
        if -1 <= u/2.0 < 0:
            Huw[i] = complex_quad(func=otf_cpm, a=-u/2 - 1, b=u/2 + 1,
                                  args=(alpha, gamma, u, elem))[0]
        elif 0 <= u/2.0 <= 1:
            Huw[i] = complex_quad(func=otf_cpm, a=u/2 - 1, b=-u/2 + 1,
                                  args=(alpha, gamma, u, elem))[0]
    HuwMax = np.max(np.abs(Huw))
    ax2.plot(uVec, np.abs(Huw)/HuwMax,
             label='$W_{20}' + '= {}lambda/2$'.format(elem))
ax2.legend(fontsize=12)
ax2.set_ylabel("Magnitude of OTF", fontsize=14)
ax2.set_xlabel("Spatial frequency, u", fontsize=14)
ax2.set_title('Optical Transfer Function of cpm', y=1.01)
plt.show()
We can see from the above plots that the OTFs are insensitive to large amounts of defocus. More importantly, there are no regions of zero values within the passband (though the bandwidth is not really constant for large amounts of defocus). This property is extremely important from the point of view of the reconstruction filter (such as an inverse filter or a Wiener filter).
Iris recognition algorithms have become quite mature and robust in the past decade due to the rapid expansion of research in both industry and academia [1–3]. Figure 1 shows a plot of the scientific publications (in English) on iris recognition between 1990 and 2013. The plot also shows the relative number of papers exclusively addressing the problems of the acquisition module, which is minuscule compared to the total number of papers on iris recognition.
The accuracy of iris recognition is highly dependent on the quality of the iris images captured by the acquisition module. The key design constraints of the acquisition system are spatial resolution, standoff distance (the distance between the front of the lens and the subject), capture volume, subject motion, subject gaze direction, and the ambient environment [4]. Perhaps the most important of these are spatial resolution, standoff distance, and capture volume; they are described in detail in the following paragraphs.
Two types of spatial resolution are associated with digital imaging systems: the optical resolution of the lens, and the pixel (or sensor) resolution of the digital sensor.
For the purpose of this work, the optical resolution is defined as the maximum spatial frequency present in an object being imaged that can be resolved by the optics at a predetermined contrast [5]. In other words, it is a measure of the ability of an imager to resolve fine details present on the object surface. The optical resolution is governed by diffraction and by the deviation of a lens from ideal behavior, called aberrations. The resolution at the image plane of an aberration-free (also known as diffraction-limited) lens with entrance pupil diameter (EPD) $D$ and focal length $f$ is given by [6]:
where $N = f/D$ is the F-number and $\lambda$ is the illumination wavelength. The entrance pupil is the image of the limiting aperture of the optical system as seen from the object side through the lens elements in front of the limiting aperture. The optical resolution is measured in cycles per unit length, typically cycles/mm or line pairs per mm (lp/mm). The ISO/IEC 19794-6 [7] standards proposal for MTF recommends that iris acquisition devices maintain a minimum resolution of 2 lp/mm at the object at 60% contrast [8]. Figure 2 plots the resolution (maximum spatial frequency) against the F-number for several contrast values, calculated at the image plane for an illumination wavelength of 850 nm. The resolution in object space is obtained as the product of the image-plane resolution and the system magnification. For example, the magnification of a 100 mm focal length, F/4 camera at a standoff distance of 5 meters is about 0.02. If there are no aberrations, the resolution in the image plane at 60% MTF is 100 lp/mm (the value of the red curve at F-number = 4 in Figure 2). The corresponding resolution in the object plane is 100 lp/mm × 0.02 = 2 lp/mm.
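The value read from the curve can be approximated analytically. The sketch below uses the standard diffraction-limited incoherent MTF of a circular pupil and solves for the 60%-contrast frequency; it lands near 94 lp/mm, close to the value read off Figure 2:

```python
import numpy as np
from scipy.optimize import brentq

# Diffraction-limited resolution example from the text: 100 mm focal length,
# F/4, standoff 5 m, illumination 850 nm.  The standard incoherent
# diffraction-limited MTF of a circular pupil is assumed.
lam, N = 850e-6, 4.0               # wavelength (mm), F-number
fc = 1.0/(lam*N)                   # cutoff frequency ~ 294 cycles/mm

def mtf(f):
    """Diffraction-limited incoherent MTF of a circular pupil."""
    r = f/fc
    return (2/np.pi)*(np.arccos(r) - r*np.sqrt(1 - r**2))

f60 = brentq(lambda f: mtf(f) - 0.6, 0.0, fc)   # frequency at 60% contrast
m = 100.0/(5000.0 - 100.0)                      # magnification ~ 0.02
print("image plane: {:.0f} lp/mm at 60% MTF -> object: {:.2f} lp/mm"
      .format(f60, f60*m))
```

The object-space figure comes out close to the 2 lp/mm floor recommended by the standard, which is exactly the point of the worked example in the text.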
The sensor resolution is determined by the pixel density of the sensor. The ISO/IEC 19794-6 [7] standard requires at least 100 pixels across the iris diameter. Additionally, it recommends a pixel resolution of 200 pixels across the iris diameter [4,9]. For a digital sensor with pixel width $p$ mm, the sensor Nyquist frequency is [10]:
The specified number of pixels across the iris diameter also determines the minimum required lateral magnification of the acquisition module [8].
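As a small worked example (the 5 µm pixel pitch and 12 mm iris diameter below are illustrative assumptions, not values from the standard):

```python
# Sensor-side numbers: Nyquist frequency for an assumed 5-micron pixel, and the
# minimum lateral magnification needed to put 200 pixels across a 12 mm iris.
p = 0.005                   # pixel width, mm (assumed: 5 um)
f_nyquist = 1.0/(2*p)       # sensor Nyquist frequency = 100 cycles/mm
iris_mm, pixels = 12.0, 200
m_min = pixels*p/iris_mm    # minimum lateral magnification ~ 0.083
print("Nyquist: {:.0f} cycles/mm, minimum magnification: {:.3f}"
      .format(f_nyquist, m_min))
```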
As suggested earlier, large-standoff iris recognition systems are highly desirable. Capturing high-quality iris images at large distances is a difficult task [4,11]. Increasing the standoff distance while maintaining the pixel count (pixel resolution) on the iris requires the use of higher-magnification (longer focal-length) optics, as shown in Figure 3. However, arbitrarily increasing the focal length to form an iris image of 200 pixels does not guarantee adequate optical resolution – an issue that has seldom been discussed in the iris recognition literature. Once sufficient sampling has been achieved—either by using a high-pixel-density sensor or through high magnification—the optical resolution ultimately dictates the image quality and consequently has a direct impact on the performance of iris recognition algorithms. As was shown by Ernst Abbe in his treatise on optical imaging, the diffraction-limited optical resolution is independent of magnification and is solely determined by the F-number [12]. Increasing the focal length of the system leads to a loss of optical resolution, as indicated by equation (1), unless the lens diameter is increased proportionally to maintain a constant F-number. However, small-F-number lenses with long focal lengths tend to be bulky and costly, due to the larger number of optical elements required to correct for aberrations, which scales with lens size [13]. Clearly, increasing the standoff distance from a few centimeters to a few meters without significant loss of spatial resolution is a challenge [14] for iris recognition systems.
For the purpose of the iris recognition task, the depth of field (DOF) of the iris acquisition system may be defined as the range of object distances within which the spatial resolution required for successful iris recognition is maintained above a predetermined threshold SNR [15]. Incorporating the requirement specified by the ISO/IEC 19794-6 standard, this means that the DOF is the range of object distances where a spatial resolution of at least 2 lp/mm at 60% contrast is maintained. The most frequently employed definition of DOF in the iris acquisition literature, derived from geometrical optics, is:
where $f$ is the focal length, $N$ is the F-number, $s$ is the standoff distance, and $c$, the circle of confusion, is a parameter that determines the smallest resolvable feature in the image of an object within the DOF. It is specified as the diameter of the blur spot in the image plane beyond which a point image is ruled out of focus [8]. A plot of the variation of the geometrically defined DOF with respect to the system F-number is shown in Figure 4. The geometrical DOF increases linearly with the F-number. However, the DOF defined within the framework of scalar diffraction theory (discussed in more detail in the next post) is the region near the geometrical focus where the intensity remains above 80% of the maximum intensity [16,17]. The diffraction-based DOF is a function of the illumination wavelength and varies with the square of the F-number. In the image space it is given as:
Figure 5 shows the variation of the diffraction based DOF with respect to the F-number for different object distances.
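The two trends can be compared side by side. In the sketch below, the diffraction-based image-space depth of focus is taken as approximately $4\lambda N^2$ (an assumption standing in for the formula above), and the values of $f$, $s$, $c$, and $\lambda$ are illustrative:

```python
# Geometric DOF (~ 2*N*c*s**2/f**2, linear in N) versus an assumed
# diffraction-based image-space depth of focus (~ 4*lam*N**2, quadratic in N).
# All parameter values here are illustrative assumptions.
f, s, c, lam = 100.0, 1300.0, 0.015, 850e-6   # mm
Ns = (2.0, 4.0, 8.0)
geo = [2*N*c*s**2/f**2 for N in Ns]   # object space, mm
diff = [4*lam*N**2 for N in Ns]       # image space, mm
for N, g, d in zip(Ns, geo, diff):
    print("f/{:.0f}: geometric DOF ~ {:.1f} mm, "
          "diffraction depth of focus ~ {:.3f} mm".format(N, g, d))
```

Doubling the F-number doubles the geometric DOF but quadruples the diffraction-based depth of focus, which is the linear-versus-quadratic behavior plotted in Figures 4 and 5.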
The capture volume generally refers to the three-dimensional spatial volume in which the user’s eye must be placed in order to acquire an iris image of predetermined quality [4,8,18]. The lateral extent (measured perpendicular to the optical axis) of the capture volume is determined by the FOV of the camera (provided it has sufficient resolution over the entire FOV), and the axial extent (measured along the optical axis) is determined by the DOF [8] of the lens. Time may also be included as a fourth dimension of the capture volume [9], specifying the length of time the iris must remain within the spatial capture volume to reliably capture an iris image of sufficient quality and avoid motion blur. For multi-camera systems and systems mounted on pan-tilt units, the net FOV is the total angular extent observable by the acquisition system.
Currently, most commercially available iris recognition systems possess a shallow DOF, resulting in limited usability and increased system complexity [19]. Large capture volumes are not only desirable but critical for making iris recognition systems less constraining. Extending the zone of image capture will allow subjects to move freely, albeit while facing the camera, within this zone during the capture process. It will also allow multiple subjects to be identified/verified simultaneously. Increasing the capture volume of current iris acquisition devices is expected to make biometric recognition easier to use and also commercially more feasible [20].
(The figures in this post were generated using Matplotlib, a Python-based plotting library.)
Links to posts in this series
References
The human iris is the colored portion of the eye, with a diameter that ranges between 10 mm and 13 mm [1,2]. The iris is perhaps the most complex externally visible tissue structure in the human body. The iris pattern has most of the desirable properties of an ideal biomarker, such as uniqueness, stability over time, and relatively easy accessibility. Being an internal organ, it is also protected from damage due to injuries and/or intentional tampering [3]. The presence or absence of specific features in the iris is largely determined by heredity; however, the spatial distribution of the cells that form a particular iris pattern during embryonic development is highly chaotic. This pseudo-random morphogenesis, driven by epigenetic factors, results in unique iris patterns in all individuals, including identical twins [2,4,5]. Even the iris patterns of the two eyes of the same individual are largely different. The diverse microstructures in the iris, which manifest at multiple spatial scales [6], are shown in Figure 1. These textures, unique to each eye, provide distinctive biometric traits that an iris recognition system encodes into distinctive templates for the purpose of identity authentication. It is important to note that the color of the iris is not used as a biomarker, since it is determined by genetics and is not sufficiently discriminative.
The problem of iris recognition is analogous to binary classification. It involves grouping the members of a set of objects into two classes based on a suitable measure of similarity (Figure 2 (a)). Similar objects cluster together as they exhibit less variability within the same class. In the figure below, the intra-class variability (randomness within a class) and the inter-class variability (randomness between classes) are plotted as functions of an appropriate similarity measure, D. The degree of variability is proportional to the uncertainty of D. For example, in face recognition, the intra-class variability would be the uncertainty of identifying a person’s face imaged under varying lighting conditions, poses, times of acquisition, etc.; on the other hand, the inter-class variability is the variability between the faces of different subjects. Naturally, the degree of similarity is higher (corresponding to lower uncertainty) in the former case. As shown in the figure, there are four possible outcomes for the binary classification problem based on the two possible choices in the decision. The region of overlap produces the two types of error rates: the false accept (or false positive) rate is the area of overlap to the left of the decision criterion, and the false reject (false negative) rate is the area of overlap to the right of the decision criterion. Objects can be reliably classified only if the intra-class variability is less than the inter-class variability [2], in which case the two distributions are sufficiently separated and the degree of overlap is minimal. Figure 2 (b) (adapted from [5]) is a plot of the distributions of Hamming distances (defined later) for 1208 pairwise comparisons of irises from the same eye and 2064 pairwise comparisons of irises from different eyes. The figure shows that the two distributions are well separated, indicating the robustness of using Hamming distances for the iris recognition problem.
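A minimal sketch of the masked fractional Hamming distance used in Daugman-style matching is shown below. The codes and masks are random stand-ins, not real iris templates, so the distance comes out near 0.5, as expected for unrelated codes:

```python
import numpy as np

# Masked (fractional) Hamming distance between two binary iris codes: compare
# only the bits that are marked usable in both masks.  Random stand-in data.
rng = np.random.default_rng(0)
n = 2048
code_a = rng.integers(0, 2, n).astype(bool)
code_b = rng.integers(0, 2, n).astype(bool)
mask_a = rng.random(n) > 0.1        # True = usable bit (eyelids, reflections
mask_b = rng.random(n) > 0.1        # etc. would be False in a real template)

valid = mask_a & mask_b
hd = np.count_nonzero(code_a[valid] ^ code_b[valid]) / np.count_nonzero(valid)
print("fractional Hamming distance: {:.3f}".format(hd))
```

Matching a code against itself gives a distance of 0; unrelated random codes cluster tightly around 0.5, which is why the two distributions in Figure 2 (b) separate so cleanly.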
Generation of the iris code broadly consists of four basic steps. An additional matching step is required for the task of identification of a subject based on the iris code [7,8]. A schematic of this pipeline is shown in Figure 3.
The steps are briefly described below:
1. Iris acquisition
Iris encoding/recognition starts with the acquisition of a high-quality image of a subject’s eye. Almost all iris acquisition systems use near-infrared (NIR) illumination in the 720–900 nm wavelength range for iris capture. NIR illumination provides greater visibility of the intricate structure of the iris and is largely unaffected by pigmentation variations in the iris [2,7,9]. There has also been some evidence of the benefits of using visible illumination for iris acquisition, especially at large standoff distances and in unconstrained environments [10]. Some research is also underway on multi-spectral iris acquisition. Two main challenges for iris acquisition systems are large standoff distances and the ability to acquire iris images within a large volume.
2. Segmentation and localization
The module following the capture of an acceptable-quality iris image is known as segmentation and localization. The goal of this step is to accurately determine the spatial extent of the iris, locate the pupillary and limbic boundaries, and identify and mask out regions within the iris that are affected by noise, such as specular reflections, superimposed eyelashes, and other occlusions that may affect the quality of the template [7]. A wide gamut of algorithms has been proposed for the segmentation and localization of iris regions, such as Daugman’s integro-differential operator [2,5], circular Hough transforms along with Canny edge detection [3,6], binary thresholding and morphological transforms [11,12], bit-plane extraction coupled with binary thresholding [13,14], and active contours [15–17].
Segmentation and localization is perhaps the most important step in the process of biometric template generation once a high-quality iris image has been acquired, because the performance and accuracy of the subsequent stages are critically dependent on the precision and accuracy of the segmentation stage [15].
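To make one of these approaches concrete, here is a rough NumPy sketch of the core of Daugman’s integro-differential operator for a single, fixed candidate centre: it looks for the radius at which the circular contour integral of the image changes most rapidly. A real implementation also searches over candidate centres and uses proper Gaussian blurring; the function name and the simple box smoothing here are illustrative.

```python
import numpy as np

def integro_differential(img, cx, cy, radii, n_samples=360):
    """For a fixed candidate centre (cx, cy), return the radius at which
    the smoothed radial derivative of the circular contour integral of
    the image intensity is maximal -- a simplified one-centre version of
    Daugman's integro-differential operator."""
    theta = np.linspace(0, 2*np.pi, n_samples, endpoint=False)
    means = []
    for r in radii:
        # sample the image along a circle of radius r (nearest-neighbour)
        x = np.clip(np.rint(cx + r*np.cos(theta)).astype(int), 0, img.shape[1] - 1)
        y = np.clip(np.rint(cy + r*np.sin(theta)).astype(int), 0, img.shape[0] - 1)
        means.append(img[y, x].mean())   # normalized contour integral
    d = np.abs(np.diff(means))           # radial derivative magnitude
    d = np.convolve(d, np.ones(3)/3, mode='same')  # crude smoothing
    return radii[1:][np.argmax(d)]
```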
3. | Unwrapping/ Normalization |
The spatial extent of the iris region varies greatly across image captures due to magnification, eye pose, and pupil dilation/constriction [5,7]. Furthermore, the inner and outer boundaries of the iris are not concentric, and they deviate considerably from perfect circles. Before the generation of the biometric code, the segmented iris is geometrically transformed into a scale- and translation-invariant space in which the radial distance between the iris boundaries along each direction is normalized between the values of 0 (pupillary boundary) and 1 (limbic boundary) [2]. This unwrapping process, shown schematically in Figure 4, consists of two steps: First, a number of data points are selected along multiple radial curves interspaced between the two iris boundaries. Next, these points, which are in Cartesian coordinates, are transformed into the doubly dimensionless polar coordinate system of the two variables r and θ. As a result, an iris imaged under varying conditions, when compared in the normalized space, will exhibit characteristic features at the same spatial locations.
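A minimal sketch of this rubber-sheet unwrapping, assuming the two boundaries have already been fit as circles (the function name and nearest-neighbour sampling are simplifications; production systems interpolate and handle non-circular boundaries):

```python
import numpy as np

def unwrap_iris(img, pupil, limbus, n_radial=64, n_angular=256):
    """Simplified rubber-sheet normalization of an iris image.

    pupil, limbus : (cx, cy, r) circle parameters of the two boundaries.
    Returns an (n_radial, n_angular) array sampled on the doubly
    dimensionless grid r in [0, 1], theta in [0, 2*pi)."""
    theta = np.linspace(0, 2*np.pi, n_angular, endpoint=False)
    r = np.linspace(0, 1, n_radial)
    # boundary points along each direction theta
    px = pupil[0] + pupil[2]*np.cos(theta)
    py = pupil[1] + pupil[2]*np.sin(theta)
    lx = limbus[0] + limbus[2]*np.cos(theta)
    ly = limbus[1] + limbus[2]*np.sin(theta)
    # linear interpolation between pupillary and limbic boundaries
    x = (1 - r)[:, None]*px[None, :] + r[:, None]*lx[None, :]
    y = (1 - r)[:, None]*py[None, :] + r[:, None]*ly[None, :]
    # nearest-neighbour sampling (bilinear would be used in practice)
    xi = np.clip(np.rint(x).astype(int), 0, img.shape[1] - 1)
    yi = np.clip(np.rint(y).astype(int), 0, img.shape[0] - 1)
    return img[yi, xi]
```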
4. | Encoding – generation of iris code |
The encoding process produces a binary feature vector from an ordered sequence of features extracted from the normalized iris image. A large number of iris recognition systems encode the local phase information following multi-resolution filtering by quantizing the phase at each location using two bits. Commonly used multi-resolution filters include 2-D Gabor filters [2,5,18], log-Gabor filters [3], multi-level Laplacian pyramids [6], PCA and ICA [16], etc. As with the segmentation module, there are numerous algorithms for iris pattern encoding.
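The two-bit phase quantization can be sketched as follows: given the complex response of a bandpass filter at each location, the two bits simply record the signs of the real and imaginary parts, i.e., the quadrant of the phase angle (a hypothetical helper, not the code of any particular system):

```python
import numpy as np

def quantize_phase(response):
    """Encode a 2-D array of complex filter responses as 2 bits per
    location: bit 0 = sign of the real part, bit 1 = sign of the
    imaginary part, which together identify the phase quadrant."""
    bits = np.empty(response.shape + (2,), dtype=np.uint8)
    bits[..., 0] = (response.real >= 0)
    bits[..., 1] = (response.imag >= 0)
    # flatten the bit pairs row-wise into a binary code array
    return bits.reshape(response.shape[0], -1)
```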
5. | Recognition – verification/ authentication of identity |
The recognition of an iris for subject verification/authentication requires an additional matching step that involves measuring the similarity of a template generated from a newly acquired iris image to one or many templates stored in a database. The most common technique for comparing two iris codes is the normalized Hamming distance (HD), which is a measure of the fraction of locations in which two binary vectors differ [16]. For example, the HD between two complementary vectors is 1, whereas the HD between two identical vectors is 0. While an HD of 0 between two iris images from the same eye is highly improbable in practical scenarios owing to noise, an HD of 0.33 or less indicates a match [2].
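A mask-aware normalized Hamming distance can be written in a few lines of NumPy (the function name and the mask convention, where True marks valid bits, are illustrative):

```python
import numpy as np

def hamming_distance(code_a, code_b, mask_a=None, mask_b=None):
    """Normalized Hamming distance between two binary iris codes,
    counting only bits that are valid (unmasked) in both codes."""
    a = np.asarray(code_a, dtype=bool)
    b = np.asarray(code_b, dtype=bool)
    valid = np.ones_like(a, dtype=bool)
    if mask_a is not None:
        valid &= np.asarray(mask_a, dtype=bool)
    if mask_b is not None:
        valid &= np.asarray(mask_b, dtype=bool)
    # fraction of valid positions at which the two codes differ
    return np.count_nonzero((a ^ b) & valid) / np.count_nonzero(valid)
```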
A detailed explanation of the image processing and pattern recognition techniques employed in the field of iris recognition is beyond the scope of this post. Interested readers are referred to the work of Daugman [2,15] and the extensive survey of iris recognition by Bowyer et al. [10].
In the next post I will cover some of the desirable properties of iris recognition acquisition systems.
(The figures in the post were generated using a combination of Matplotlib, OpenCV (for segmentation, normalization and encoding of iris in Fig 3) and Powerpoint.)
Links to posts in this series
References
An implicit function has the form:

F(x, y, z) = c

where c is an arbitrary constant.
For example, the implicit equation of the unit circle is x² + y² − 1 = 0, and the equation x² + y² + z² − 1 = 0 describes a sphere of radius 1.
An algebraic surface is described by an implicit function F(x, y, z) = 0. It may be rendered using Mayavi by evaluating the expression on a regular (specified) grid, and then generating an isosurface of contour value equal to 0.
Here are some examples of algebraic surfaces plotted using Mayavi:
I wrote a quick function called implicit_plot for plotting algebraic surfaces using Mayavi. The most important argument to the function is of course the expression as a string. It is probably not the best possible function, but the idea is to show how to plot implicit surfaces. The code snippet is included at the end of this post. Suggestions for improvement are always welcome.
Let’s start with a simple sphere in three-dimensional space, whose equation is x² + y² + z² = R². We can call the function implicit_plot like so (please note that it is assumed that Mayavi has been imported in the script as mlab):
import mayavi.mlab as mlab

figw = mlab.figure(1, bgcolor=(0.1, 0.1, 0.1), size=(400, 400))
implicit_plot('x**2 + y**2 + z**2 - {R:f}**2'.format(R=1),
              (-3, 3, -3, 3, -3, 3), fig_handle=figw,
              Nx=64, Ny=64, Nz=64, col_isurf=(0.0, 0.2, 0.8),
              opaque=True, ori_axis=False)
mlab.show()
Running the above code should render a sphere of unit radius in the given Mayavi figure window:
The equation xⁿ + yⁿ + zⁿ = b represents a sphere when n = 2, and b is then the square of the radius.
Here we study the various forms of surfaces represented by the above equation for n = 2, 4, 6, 8, and 10 (i.e., for even n):
figw = mlab.figure(1, bgcolor=(0.97, 0.97, 0.97), size=(400, 400))
exponent = [2, 4, 6, 8, 10]
num_exponent = len(exponent)
col_isurf_arr = [(0.0, 0.2, 0.8), (0.2, 0.5, 0.4), (0.4, 0.8, 0.2),
                 (0.8, 0.7, 0.1), (1.0, 0.2, 0.0)]  # colors
b = 100.0
for i, ex in enumerate(exponent):
    implicit_plot('x**{n} + y**{n} + z**{n} - {b}'.format(n=ex, b=b),
                  (-15, 15, -15, 15, -15, 15), fig_handle=figw,
                  Nx=64, Ny=64, Nz=64, col_isurf=col_isurf_arr[i],
                  opa_val=0.35 + 0.25*i/num_exponent, opaque=False,
                  ori_axis=True)
mlab.show()
which produces the following output:
We can see that when n is even, the sphere transforms into a cube-like surface as the value of n increases. The following surface is generated for n = 20:
figw = mlab.figure(1, bgcolor=(0.1, 0.1, 0.1), size=(400, 400))
implicit_plot('x**20 + y**20 + z**20 - 100', (-2, 2, -2, 2, -2, 2),
              col_osurf=(0.87, 0.086, 0.086), fig_handle=figw,
              opa_val=1.0, opaque=True, ori_axis=False)
mlab.show()
Next, we can study the surface transformations for odd values of n, i.e., for n = 3, 5, 7, and 9. The figure also renders a sphere (exponent = 2) for reference.
figw = mlab.figure(1, bgcolor=(0.1, 0.1, 0.1), size=(400, 400))
exponent = [2, 3, 5, 7, 9]
num_exponent = len(exponent)
col_isurf_arr = [(0.0, 0.2, 0.8), (0.2, 0.5, 0.4), (0.4, 0.8, 0.2),
                 (0.8, 0.7, 0.1), (1.0, 0.2, 0.0)]  # colors
b = 100.0
for i, ex in enumerate(exponent):
    implicit_plot('x**{n} + y**{n} + z**{n} - {b}'.format(n=ex, b=b),
                  (-15, 15, -15, 15, -15, 15), fig_handle=figw,
                  Nx=64, Ny=64, Nz=64, col_isurf=col_isurf_arr[i],
                  opa_val=0.35 + 0.25*i/num_exponent, opaque=False,
                  ori_axis=True)
mlab.show()
Multiple algebraic surfaces can be rendered by evaluating a product of their expressions. For example, the expression for n spheres is the product F₁(x, y, z) · F₂(x, y, z) ⋯ Fₙ(x, y, z) of the individual sphere expressions.
In the following example we render just two spheres:
figw = mlab.figure(1, bgcolor=(0.1, 0.1, 0.1), size=(400, 400))
# draw a base-plane
implicit_plot('(z + 0)', (-5, 5, -5, 5, -5, 5), fig_handle=figw,
              col_isurf=(0.1, 0.1, 0.1), opa_val=1.0, opaque=False,
              ori_axis=True)
# the two spheres
implicit_plot('(x**2 + y**2 + (z-1.414)**2 - 2)*'
              '((x-1.5)**2 + (y-1.5)**2 + (z-0.707)**2 - 0.5)',
              (-5, 5, -5, 5, -5, 5), fig_handle=figw,
              Nx=100, Ny=100, Nz=100, col_isurf=(0.87, 0.086, 0.086),
              opa_val=1.0, opaque=False, ori_axis=False)
mlab.show()
A 4-way tubing can be constructed by “adding” two cylinder surfaces together. As we saw above, two surfaces may be “added” together by multiplying their equations. The equation for the 4-way tubing is thus (x² + y² − 1)(x² + z² − 1) = a, where the parameter a controls the smoothness of the joint: increasing a increases the smoothness.
figw = mlab.figure(1, bgcolor=(0.1, 0.1, 0.1), size=(400, 400))
implicit_plot('(x**2 + y**2 - 1)*(x**2 + z**2 - 1) - {a}'.format(a=0.005),
              (-5, 5, -5, 5, -5, 5), fig_handle=figw,
              Nx=201, Ny=201, Nz=201,
              col_isurf=(1.0, 204./255, 51./255),
              col_osurf=(1.0, 102./255, 0.0),
              opaque=True, ori_axis=False)
mlab.show()
Here are a few more examples of some interesting surfaces. The first one is known as the Zitrus surface, whose equation is x² + z² − y³(1 − y)³ = 0.
(Since the basic pattern of the code is the same, we will just render the figures here.)
The next surface is the Diabolo.
And finally, we render the well-known Sweet surface.
Algebraic surfaces are beautiful, and there are endless varieties of them for fun and study. The following extremely resourceful links can help interested explorers of algebraic surfaces:
Here is the code snippet for the implicit_plot function:
import numpy as np
import mayavi.mlab as mlab

def implicit_plot(expr, ext_grid, fig_handle=None, Nx=101, Ny=101, Nz=101,
                  col_isurf=(50/255., 199/255., 152/255.),
                  col_osurf=(240/255., 36/255., 87/255.),
                  opa_val=0.8, opaque=True, ori_axis=True, **kwargs):
    """Function to plot algebraic surfaces described by implicit equations
    in Mayavi.

    Implicit functions are functions of the form `F(x,y,z) = c`, where
    `c` is an arbitrary constant.

    Parameters
    ----------
    expr : string
        The expression `F(x,y,z) - c`; e.g. to plot a unit sphere, the
        `expr` will be `x**2 + y**2 + z**2 - 1`
    ext_grid : 6-tuple
        Tuple denoting the range of `x`, `y` and `z` for the grid; it has
        the form (xmin, xmax, ymin, ymax, zmin, zmax)
    fig_handle : figure handle (optional)
        If a Mayavi figure object is passed, the surface is added to the
        scene in the given figure; it is then the responsibility of the
        calling code to call mlab.show().
    Nx, Ny, Nz : integers (optional, preferably odd)
        Number of points along each axis. Odd numbers are recommended to
        ensure the calculation of the function at the origin.
    col_isurf : 3-tuple (optional)
        Color of the inner surface when a double-layered surface is used;
        also the color of a single-layered surface.
    col_osurf : 3-tuple (optional)
        Color of the outer surface.
    opa_val : float (optional)
        Opacity value (alpha) to use for the surface.
    opaque : boolean (optional)
        Flag to specify whether the surface should be opaque or not.
    ori_axis : boolean
        Flag to specify whether to draw a central axis through the origin.
    """
    if fig_handle is None:  # create a new figure
        fig = mlab.figure(1, bgcolor=(0.97, 0.97, 0.97),
                          fgcolor=(0, 0, 0), size=(800, 800))
    else:
        fig = fig_handle
    xl, xr, yl, yr, zl, zr = ext_grid
    x, y, z = np.mgrid[xl:xr:complex(0, Nx),
                       yl:yr:complex(0, Ny),
                       zl:zr:complex(0, Nz)]
    scalars = eval(expr)
    src = mlab.pipeline.scalar_field(x, y, z, scalars)
    if opaque:
        delta = 1.e-5
        opa_val = 1.0
    else:
        delta = 0.0
    # In order to render different colors on the two sides of the algebraic
    # surface, the function plots two iso-surfaces at a "distance" of delta
    # from the value of the solution. The second surface is only drawn if
    # the algebraic surface is specified to be opaque.
    cont1 = mlab.pipeline.iso_surface(src, color=col_isurf,
                                      contours=[0 - delta],
                                      transparent=False, opacity=opa_val)
    # for some reason, setting compute_normals to True actually causes
    # more unevenness on the surface, instead of making it smoother
    cont1.compute_normals = False
    if opaque:
        # the outer surface is specular, the inner surface is not
        cont2 = mlab.pipeline.iso_surface(src, color=col_osurf,
                                          contours=[0 + delta],
                                          transparent=False, opacity=opa_val)
        cont2.compute_normals = False
        cont1.actor.property.backface_culling = True
        cont2.actor.property.frontface_culling = True
        cont2.actor.property.specular = 0.2
        cont2.actor.property.specular_power = 55.0
    else:
        # make the surface (the only surface) specular
        cont1.actor.property.specular = 0.2
        cont1.actor.property.specular_power = 55.0
    # Scene lights (4 lights are used)
    engine = mlab.get_engine()
    scene = engine.current_scene
    cam_light_azimuth = [78, -57, 0, 0]
    cam_light_elevation = [8, 8, 40, -60]
    cam_light_intensity = [0.72, 0.48, 0.60, 0.20]
    for i in range(4):
        camlight = scene.scene.light_manager.lights[i]
        camlight.activate = True
        camlight.azimuth = cam_light_azimuth[i]
        camlight.elevation = cam_light_elevation[i]
        camlight.intensity = cam_light_intensity[i]
    # axis through the origin
    if ori_axis:
        len_caxis = int(1.05*np.max(np.abs(np.array(ext_grid))))
        caxis = mlab.points3d(0.0, 0.0, 0.0, len_caxis, mode='axes',
                              color=(0.15, 0.15, 0.15), line_width=1.0,
                              scale_factor=1., opacity=1.0)
        caxis.actor.property.lighting = False
    # if no figure was passed, the function created one and shows it here
    if fig_handle is None:
        # setting the camera
        cam = fig.scene.camera
        cam.elevation(-20)
        cam.zoom(1.0)  # zoom should always be done at the end
        mlab.show()
Please note that the main block in the above function is really the following three lines, which generate a scalar field on a regular grid using the Mayavi pipeline, and then extract the zero-valued isosurface:
scalars = eval(expr)
src = mlab.pipeline.scalar_field(x, y, z, scalars)
cont1 = mlab.pipeline.iso_surface(src, color=col_isurf, contours=[0],
                                  transparent=False, opacity=opa_val)
Hope you enjoyed the post.