Spatio-Angular Resolution Tradeoff in Integral Photography

Todor Georgiev1, Ke Colin Zheng2, Brian Curless2, David Salesin1,2, Shree Nayar3, and Chintan Intwala1

1 Adobe Systems   2 University of Washington   3 Columbia University
Abstract
An integral camera samples the 4D light field of a scene within a single photograph. This paper explores the fundamental tradeoff between spatial resolution and angular resolution that is inherent to integral photography. Based on our analysis we divide previous integral camera designs into two classes depending on how the 4D light field is distributed (multiplexed) over the 2D sensor. Our optical treatment is mathematically rigorous and extensible to the broader area of light field research.

We argue that for many real-world scenes it is beneficial to sacrifice angular resolution for higher spatial resolution. The missing angular resolution is then interpolated using techniques from computer vision. We have developed a prototype integral camera that uses a system of lenses and prisms as an external attachment to a conventional camera. We have used this prototype to capture the light fields of a variety of scenes. We show examples of novel view synthesis and refocusing where the spatial resolution is significantly higher than is possible with previous designs.
Categories and Subject Descriptors (according to ACM CCS): I.3.3 [Computer Graphics]: Digital Photography
then propose new camera designs that produce higher spatial resolution than the camera of Ng et al., while trading off the light field's angular sampling density. However, this lower angular resolution in the input is compensated for by inserting data synthesized by view interpolation of the measured light field.

We use three-view morphing to interpolate the missing angular samples of radiance. We demonstrate that such interpolated light fields generated from sparsely sampled radiance are generally good enough to produce synthetic aperture effects, new view synthesis, and refocusing with minimal loss in quality.

We have built an integral camera that uses a system of lenses and prisms as an external optical attachment to a conventional camera. Using a computer-vision based view interpolation algorithm, we demonstrate how our camera can be used to adjust the depth of field and synthesize novel views for scenes with high-speed action, which are impossible to do with conventional cameras. Moreover, with the same 16-megapixel sensor used by Ng et al. we are able to achieve a much higher spatial resolution of 700 × 700 pixels in the computed images.

2. Trading angular for spatial resolution

Work in integral / light field photography falls into two major classes:

1. The earliest works of Lippmann [Lip08] and Ives [Ive28], among others, known as integral photography, used arrays of lenslets or pinholes placed directly in front of film, creating multiple images on it like an array of cameras. Optically similar to that is a physical array of digital cameras, which is the main approach used in current light field research (e.g., [WJV∗05]). A related type of integral photography design places an array of positive lenses in front of a conventional camera to create an array of real images between the lenses and the camera. Then the camera takes a picture focused on those images (e.g., [OHAY97]). This approach is closest to ours. Also consider the related works [OAHY99, NYH01, SH02].

2. The more recent approaches of Adelson et al. [AW92] and Ng et al. [NLB∗05], known as plenoptic cameras, effectively place a big lens in front of the array of lenslets (or cameras) considered in the first approach, forming an image on the array of lenslets. Each lenslet itself creates an image sampling the angular distribution of radiance at that point, which corresponds to one single direction observed from multiple points of view on the main lens aperture. This approach swaps the placement of spatial and angular samples on the image plane: instead of producing an array of ordinary images, as in integral photography, it creates what appears as a single, recognizable "image" consisting of small 2D arrays of angular samples of a single point in the scene. A related technique is that of the Hartmann-Shack sensor [Tys91], which was also proposed a century ago to study wavefront shape in optics.

Both types of light field cameras share a single goal: increasing the angular resolution of the measured light field, which often comes at the cost of spatial resolution of the final 2D images that are generated. In the rest of this section, we explore this trade-off between angular and spatial resolution and show that for typical scenes it can be advantageous to use higher spatial resolution at the cost of angular resolution.

2.1. Drawbacks of the plenoptic camera design

In the plenoptic camera recently built and studied in detail by Ng et al. [NLB∗05], the light field is captured by an array of 296² lenslets inside a conventional camera. Each lenslet in this setting corresponds to a little camera producing an approximately 14 × 14 pixel image of the main lens aperture. Each pixel within that small image corresponds to one viewpoint on the aperture, while different lenslets correspond to different pixels in the final image. The result is an approximately 100-view light field with 90,000 pixels per view. (The number of effective views is 100 instead of 14² due to losses, which will be discussed later.)

Unfortunately, from the standpoint of professional photographers, this system produces images with very low spatial resolution. An obvious way to remedy this problem would be to use more lenslets (for example, 1,000,000), with fewer views/pixels under each lenslet (for example, 16). The difficulty with such a remedy is that each small image of the main lens aperture created by a lenslet includes pixels at the aperture boundary that are either lost entirely or noisy. Such boundary pixels are only partially covered by the image of the aperture. In order to reconstruct the true irradiance corresponding to the illuminated part of each pixel, we would need to know exactly what percentage of it has been illuminated, and correct for that in software. In other words, we would need very precise calibration of all pixels in the camera. However, captured pixel values are affected by tiny misalignments: a misalignment of a micrometer can change a boundary pixel value by more than 10%. This problem becomes very visible as the lenslets get smaller. In the limiting case of a 2 × 2 or 4 × 4 pixel image under each lenslet (depending on the Bayer array), all the pixels become boundary pixels, providing no reliable 3D information at all.
2.2. How do we capture 4D radiance with a 2D sensor?

It turns out that the original integral camera designs have certain advantages when it comes to acquiring images with sufficient spatial resolution on moderate resolution sensors.

For visualization purposes, suppose that optical phase space (a.k.a. "light field space") were 2-dimensional instead of 4-dimensional.
In the standard matrix-optics treatment [GB94], translation of a ray over a distance T along the optical axis acts on its position and angle as

$$\begin{pmatrix} x' \\ \theta' \end{pmatrix} = \begin{pmatrix} 1 & T \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ \theta \end{pmatrix}. \qquad (2)$$

A prism with deflection angle α adds a constant to the ray direction:

$$\begin{pmatrix} x' \\ \theta' \end{pmatrix} = \begin{pmatrix} x \\ \theta \end{pmatrix} + \begin{pmatrix} 0 \\ \alpha \end{pmatrix}. \qquad (3)$$

For a lens whose center is shifted a distance s from the optical axis, we proceed in three steps.

1. Convert to the coordinate system centered on the lens by subtracting s.

$$\begin{pmatrix} x' \\ \theta' \end{pmatrix} = \begin{pmatrix} x \\ \theta \end{pmatrix} - \begin{pmatrix} s \\ 0 \end{pmatrix} \qquad (4)$$

2. Apply the usual linear lens transform.

$$\begin{pmatrix} x' \\ \theta' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -\frac{1}{f} & 1 \end{pmatrix} \begin{pmatrix} x - s \\ \theta \end{pmatrix} \qquad (5)$$

3. Convert back to the original optical-axis coordinates by adding back s.

$$\begin{pmatrix} x' \\ \theta' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -\frac{1}{f} & 1 \end{pmatrix} \begin{pmatrix} x - s \\ \theta \end{pmatrix} + \begin{pmatrix} s \\ 0 \end{pmatrix} \qquad (6)$$

Figure 3: Six designs of light field cameras. (a) Classical integral photography. (b) One lens and multiple prisms. (c) Main lens and a lens array. (d) Main lens and an array of negative lenses. (e) Same as (d), only implemented externally to the camera. (f) Example of an external design of negative lenses and prisms that has no analog as an internal design.
We can re-write this equation as

$$\begin{pmatrix} x' \\ \theta' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -\frac{1}{f} & 1 \end{pmatrix} \begin{pmatrix} x \\ \theta \end{pmatrix} + \begin{pmatrix} 0 \\ \frac{s}{f} \end{pmatrix}, \qquad (7)$$

which is exactly the transform of a lens centered on the optical axis, Equation (5) with s = 0, followed by a prism of angle α = s/f, Equation (3). In other words, a lens shifted by s is equivalent to an on-axis lens combined with a prism.
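This equivalence is easy to check numerically. The sketch below is ours, not the paper's: it represents a ray as a homogeneous column vector (x, θ, 1) so that the affine prism and coordinate-shift terms become 3 × 3 matrices, then verifies that Equations (4)-(6) compose to Equation (7).

```python
import numpy as np

def prism(alpha):   # deflect every ray by the angle alpha, Eq. (3)
    return np.array([[1.0, 0.0, 0.0], [0.0, 1.0, alpha], [0.0, 0.0, 1.0]])

def lens(f):        # thin lens of focal length f on the optical axis
    return np.array([[1.0, 0.0, 0.0], [-1.0 / f, 1.0, 0.0], [0.0, 0.0, 1.0]])

def shift_x(s):     # change of coordinates x -> x + s
    return np.array([[1.0, 0.0, s], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

f, s = -105.0, 12.0                                 # example values in mm
off_axis_lens = shift_x(s) @ lens(f) @ shift_x(-s)  # steps 1-3, Eqs. (4)-(6)
lens_plus_prism = prism(s / f) @ lens(f)            # Eq. (7)
assert np.allclose(off_axis_lens, lens_plus_prism)
```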
In the design of Figure 3b, the prisms deflect the rays, translating them onto different locations in the image plane and forming different sub-images. Those different sub-images are of the type of Figure 1c, which is the more efficient design. (Note that intuition is not sufficient to convince us that this approach is exact. Intuition only tells us that "this should work at least approximately.")

Figure 3c is also self-explanatory from the point of view of intuition. The additional small lenses focus light rays closer than the original focal plane of the main lens. Thus they form individual images instead of being integrated into one image as in traditional one-optical-axis cameras. Again, this is "at least approximately correct" as a design, and we need formula (7) to prove that it is exactly correct and to find the exact values of the parameters (in terms of equivalence with Figure 3b).

In more detail, each of the shifted lenses in Figure 3c is equivalent to a big lens on the optical axis and a prism. The big lens can be combined into one with the main lens, and we get equivalence with Figure 3b.

Figure 3d is similar, only with negative lenses. The designs of Figures 3c and 3d can be used practically if we integrate an array of 10-20 lenslets into the barrel of a conventional camera lens and use it with a high-resolution camera as a compact light field camera.

Figure 3e describes a design external to the camera; it is used in this paper for the examples, with 20 negative lenses. The whole optical device looks like a telephoto lens, which can be added as an attachment to the main camera lens. See Figure 6.

Figure 3f is our best design. We have implemented a version made up of 19 lenses and 18 prisms. See Figure 4. It is lightweight compared to a similar design with a big lens. Also, an array of prisms is cheaper than a big lens.

Figure 4: Our optical device consisting of lens-prism pairs.

As in the design of Figure 3e, the camera sees an array of virtual images created by the negative lenses in front of the optical device, and focuses upon them. The prisms shift these images appropriately, so the result is as if the scene were viewed by an array of parallel cameras. Again the idea is that a camera with a lens shifted from the optical axis is equivalent to a camera on the axis together with a lens and a prism. We should also note that, practically, the role of the negative lenses is to expand the field of view in each image, and that the prisms can be viewed as making up a Fresnel lens focused at the camera's center of projection. Other external designs are possible with an array of positive lenses creating real images between the array of lenses and the main camera lens.
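The Fresnel-lens view of the prism array follows directly from Equation (7): a lens at offset s behaves like an on-axis lens plus a prism of angle s/f, so the required deflection grows linearly with distance from the axis. A small sketch of this relation (ours; the 4 × 5 grid and the 15 mm pitch are hypothetical values, not measurements of our prototype):

```python
import numpy as np

# Prism angle implied by Eq. (7) for each lens of a hypothetical 4 x 5
# array; pitch (lens spacing) and f are illustrative values only.
pitch, f = 15.0, -105.0                       # mm
rows = np.arange(4) - 1.5                     # lens-center offsets from the
cols = np.arange(5) - 2.0                     # optical axis, in lens units
ys, xs = np.meshgrid(rows, cols, indexing="ij")
s = np.sqrt((xs * pitch) ** 2 + (ys * pitch) ** 2)  # radial offset per lens
prism_angles = s / f                          # alpha = s / f, in radians
# The deflection grows linearly with the offset, like the facets of a
# Fresnel lens focused at the camera's center of projection.
```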
We have built prototypes for two of the designs: Figure 3e, with 20 lenses cut into squares, and Figure 3f, with 19 lenses and 18 prisms. Because of chromatic problems with our prisms, we currently produce better images with the design in Figure 3e, which is used to obtain the results in this paper. Also, our lenses and prisms for the design of Figure 3f are not cut into squares, which leads to loss of pixels even with hexagonal packing (Figure 4). We are planning to build a version based on quality optical elements.

3. Synthetic aperture photography

Light fields can be used to simulate the defocus blur of a conventional lens, by re-projecting some or all of the images onto a (real or virtual) focal plane in the scene, and computing their average. Objects on this plane will appear sharp (in focus), while those not on this plane will appear blurred (out of focus) in the resulting image. This synthetic focus can be thought of as resulting from a large-aperture lens, the viewpoints of the light field images being point samples on the lens surface. This method was proposed by Levoy and Hanrahan [LH96], first demonstrated by Isaksen et al. [IMG00], and goes under the name of synthetic aperture photography in current work [VWJL04, WJV∗05]. It creates a strong sense of 3D; further, summing and averaging all the rays serves as a noise reduction filter, hence the resulting image has superior signal-to-noise ratio (SNR) compared to the original inputs.
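In its simplest form, this projection-and-average operation amounts to shifting each view in proportion to its viewpoint offset and taking the mean. The sketch below is our minimal, uniform-weight illustration with integer-pixel shifts, not the implementation used in the paper; the `views` and `offsets` inputs are assumed to be given.

```python
import numpy as np

def synthetic_aperture(views, offsets, disparity):
    """Average a set of views re-projected onto one focal plane.

    views:     list of (H, W, 3) float arrays, one per viewpoint.
    offsets:   (N, 2) viewpoint positions on the synthetic lens plane.
    disparity: pixel shift per unit viewpoint offset; its value selects
               which scene depth comes into focus.
    """
    acc = np.zeros_like(views[0])
    for img, (dx, dy) in zip(views, offsets):
        # Shift so that rays from the chosen focal plane align across
        # views. Integer shifts keep the sketch short; real code would
        # resample with subpixel interpolation.
        acc += np.roll(img, (round(dy * disparity), round(dx * disparity)),
                       axis=(0, 1))
    return acc / len(views)   # objects off the focal plane blur out
```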
The projection and averaging approach to synthetic aperture requires a dense light field. However, we are working with relatively sparse samplings comprised of 20 images. Simply projecting and averaging such an image set results in pronounced ghosting artifacts, essentially the result of aliasing in the sampled light field. Stewart et al. [SYGM03] explore reconstruction filters to reduce the aliasing in undersampled light fields; however, even with 256 images some artifacts remain.

Instead, we address the aliasing problem by generating more camera views than those provided directly by the camera array, through view morphing [SD96]. This is equivalent to generating a synthetic light field by carefully interpolating between the samples in our sparse camera data. Fundamentally, this is possible because of the well known "redundancy" of the light field [LH96], which in the Lambertian
case is constant along angular dimensions at each point on the surface that is being observed. In the following subsections, we describe our method for filling out the light field and for using it to generate synthetic aperture images.

3.1. Synthetic light field by tri-view morphing

Our sampling consists of viewpoints that lie on a grid. We tessellate this grid into a triangular mesh, as illustrated in Figure 5. Our goal is to be able to fill in arbitrary viewpoints within the grid. As described below, we do this by computing warps that allow view morphing between each pair of views connected by an edge. These warps are then combined to allow barycentric interpolation of views within each triangle of viewpoints.

3.1.1. View morphing with segmentation-based stereo

View morphing [SD96] is a method for interpolating two reference images to generate geometrically correct in-between views from any point on the line connecting the two initial centers of projection. To achieve this effect, a correspondence is needed between the pair of images.

Recently, color segmentation approaches have gained in popularity for dense correspondence computation. They use color discontinuities to delineate object boundaries and thus depth discontinuities. Also, they model mixed color pixels at boundaries with fractional contributions (a.k.a. matting) to reduce artifacts at depth discontinuities.

We build on the segment-based optical flow work of Zitnick et al. [ZJK05]. The idea behind their method is to model each pixel's color as the blend of two irregularly-shaped segments with fractional contributions α, and then solve for a mutual segmentation between a pair of images that gives rise to segments with similar shapes and colors. We modify their flow algorithm in two ways. First, between each pair of images, we require the matched segments to lie along epipolar lines. Second, we simultaneously compute epipolar flow between an image and two neighbors defining a triangle, so that the segments in each image are consistent between neighbors, as needed for tri-view morphing, described in the next subsection.

3.1.2. Tri-view blending

Seitz et al. [SD96] demonstrated that any linear combination of two parallel views gives a valid interpolated projection of the scene. Multiple image morphing [GW97, LWS98] has been used to extend two-view morphing to morphing among three or more views and into a complete image-based 3D system [Geo98]. Tri-view morphing [XS04] is a more recent system for creating the appearance of 3D via multi-image morphing, making use of the trifocal tensor to generate the warping transforms among three views.

Here we summarize our method for tri-view morphing within triangles on our camera grid. Given three images $I_1$, $I_2$, and $I_3$, we morph to the target image $I_s$ using barycentric coefficients $\lambda_1$, $\lambda_2$, and $\lambda_3$. Let $W_{ij}$ be the warping vector field (or "flow") from image $I_i$ to image $I_j$, according to the disparity map from $I_i$ to $I_j$ obtained using the segmentation-based stereo algorithm from Section 3.1.1. Ideally, this warping function will convert image $I_i$ into an image identical to $I_j$. In general, warping any image $I$ by a vector field $W$ will produce a new image denoted as $I(W)$. We warp each of the input images to $I_s$ using an affine (barycentric) combination of the three vector fields, and then we blend them together based on the same barycentric coefficients:

$$I_{\text{out}} = \sum_{i=1}^{3} \lambda_i \, I_i\!\left(\sum_{j=1}^{3} \lambda_j W_{ij}\right)$$

Note that we generally sample within the camera grid, so that the desired image is inside a triangle defined by the three input images $I_i$; then $\lambda_i \geq 0$ and $\sum_{i=1}^{3} \lambda_i = 1$. Extrapolation outside the grid is also feasible to some extent, in which case one or more barycentric coordinates will be negative.
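A direct transcription of this blend into code might look as follows. This is our sketch, not the paper's implementation: `warp` uses nearest-neighbor sampling to stay short, and the flow fields $W_{ij}$ (with $W_{ii}$ the zero field) are assumed to be precomputed by the stereo step above.

```python
import numpy as np

def warp(img, flow):
    """Warp img by a flow field: out(p) = img(p + flow(p)).
    Nearest-neighbor sampling keeps the sketch short."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    sx = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    sy = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return img[sy, sx]

def tri_view_morph(images, flows, lam):
    """I_out = sum_i lam_i * I_i( sum_j lam_j * W_ij ).

    images: the three input views [I1, I2, I3].
    flows:  flows[i][j] is the field W_ij from view i to view j
            (flows[i][i] is the zero field).
    lam:    barycentric coordinates of the target viewpoint.
    """
    out = np.zeros_like(images[0], dtype=float)
    for i in range(3):
        # The barycentric combination of the three flows out of view i
        # points view i at the target viewpoint ...
        w = lam[0] * flows[i][0] + lam[1] * flows[i][1] + lam[2] * flows[i][2]
        # ... and the warped views are blended with the same weights.
        out += lam[i] * warp(images[i], w)
    return out
```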
3.2. Synthetic aperture rendering

To simulate the defocus of an ordinary camera lens, we first define an aperture location and size on the camera grid (see Figure 5). Then, we densely sample within this aperture using tri-view morphing. Finally, we determine an in-focus plane, project all images within the aperture onto this plane, and average.
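Combined with the tri-view morphing and averaging sketches above, one possible driver for a single rendering could look like this (ours, not the paper's; `locate_triangle` and `flows_for` are hypothetical helpers that return the enclosing triangle with its barycentric coordinates and its precomputed pairwise flows):

```python
import numpy as np

def render_aperture(views, center, radius, disparity, n_samples=250):
    """Sample viewpoints inside a virtual circular aperture by tri-view
    morphing, then project onto one focal plane and average."""
    rng = np.random.default_rng(0)
    images, offsets = [], []
    while len(images) < n_samples:
        p = center + radius * (2.0 * rng.random(2) - 1.0)
        if np.linalg.norm(p - center) > radius:
            continue                       # rejection-sample the disk
        tri, lam = locate_triangle(p)      # hypothetical: triangle + barycentrics
        images.append(tri_view_morph([views[k] for k in tri],
                                     flows_for(tri), lam))
        offsets.append(p)
    return synthetic_aperture(images, np.array(offsets), disparity)
```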
4. Results

We have built working prototypes of two camera designs, shown in Figures 3e and 3f. All of our results are based on images taken with the former of these two designs, so that is the design that we describe in detail here.

4.1. Camera

Our implementation of the camera design from Figure 3e was built with an array of 4 × 5 negative lenses cut into squares and attached to each other with minimal loss of space. Before being glued together, the lenses were placed with their flat sides facing downward on a piece of glass, so we believe they are very well aligned on a plane and parallel to each other. Since all lenses have the same focal length, −105 mm, their focal points lie on one plane. This plane is perpendicular to the direction of view, to the precision of lens manufacturing.

We calibrate the camera centers using an off-the-shelf structure-from-motion (SFM) system [BL05], which recovers both the intrinsic and the extrinsic parameters of the camera.
Figure 5: The set of 20 images (middle) is a sparse light field captured with our camera. A close-up of one of the images is shown on the left. The hazy edges are defocused images of the boundaries of the lenses; for the results in this paper, we discard these contaminated pixels. Each vertex on the right represents one camera view. We decompose the camera plane into the triangles illustrated on the right. Any novel camera view inside these triangles can be synthesized using tri-view morphing. The circular region represents a possible virtual aperture we want to simulate.
For the purposes of synthetic aperture, one could also pursue the calibration method discussed by Vaish et al. [VWJL04], in which relative camera positions are recovered.

4.2. Renderings

With our camera prototype, twenty views are captured in a single exposure, with each view containing roughly 700 × 700 pixels. Twenty-four triangles are formed to cover the entire viewing space. The relative locations of all the cameras are recovered by running SFM on the 20 images. Once the size, location, and shape of a virtual lens are specified, we densely sample viewpoints using our tri-view morphing algorithm at one reference depth. All examples shown here were sampled with about 250 views. Sweeping through planes of different depths corresponds to shifting all views accordingly. By shifting and summing all the sampled views, we compute synthetic aperture images at different depths.
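Refocusing thus reduces to re-running the final shift-and-average step with a different disparity, and sweeping the disparity yields a focal stack. A possible driver loop, again reusing the hypothetical helpers sketched above, with `d_near` and `d_far` chosen to span the depths of the scene:

```python
import numpy as np

# One image per disparity value; each d brings a different scene depth
# into focus (d_near, d_far, views, etc. as defined in earlier sketches).
focal_stack = [render_aperture(views, center=np.array([0.5, 0.5]),
                               radius=0.25, disparity=d)
               for d in np.linspace(d_near, d_far, 30)]
```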
The reader is encouraged to see the electronic version of this paper for high resolution color images. The supplementary videos show sequences of synthetic aperture images as the focal plane sweeps through a family of planes that spans the depths of the scenes. The sharpness of objects on the focal plane, together with the smooth blur, indicates the accuracy of our technique. The size of the virtual aperture used in the seagulls example (Figure 7) and in most results of the juggling scene (Figure 8) is about one quarter of the entire viewing region.

5. Conclusion and future work

In this paper, we describe several practical light field camera designs with specific application to synthetic aperture photography. We compare the two fundamental ways of approaching light field capture, and argue that an important point in the camera space for integral photography is a sparse sampling of the angular dimensions of the light field in order to achieve better spatial resolution. As such, we explore how integral cameras can be used to produce results with higher spatial resolution than plenoptic cameras, using the same image sensor.

We further draw upon state-of-the-art computer vision techniques as a post-processing tool to interpolate or "fill in" the sparse light field. We demonstrate the effectiveness of this framework with realistic refocusing and depth-of-field results. Averaging many intermediate views not only reduces sampling errors, but also makes errors caused by stereo matching much more tolerable, which is one of the insights of our approach.

Most of the computing cycles are spent on generating in-between views. An analysis of the sampling bounds would be helpful for better efficiency: how densely does one have to sample the viewing space in order to create non-aliased results? Furthermore, it would be interesting to explore the possibility of skipping the entire process of view interpolation and realizing refocusing directly from the disparity map.

We use twenty views in our camera implementation. For typical scenes we get good results, but for scenes with more complex 3D structure we begin to observe artifacts. A detailed study of the relationship between optimal sampling rate and 3D scene complexity would be useful. It might be possible to dynamically adjust the number of captured views based on scene geometry so that results with optimal resolution are achieved.

In the last few years, we have experienced the fast bloom of digital photography. Sensors are gaining in resolution, and capturing a light field with a single exposure is becoming achievable in a realistic hand-held camera. This adds a whole new dimension to digital photography, with the possibility of capturing a sparse light field with a compact camera design and later post-processing based on computer vision. We hope that our work will inspire others to explore the possibilities in this rich domain.

References

[AW92] Adelson T., Wang J.: Single lens stereo with a plenoptic camera. IEEE Transactions on Pattern Analysis and Machine Intelligence (1992), 99–106.

[BL05] Brown M., Lowe D. G.: Unsupervised 3D object recognition and reconstruction in unordered datasets. In Proceedings of the 5th International Conference on 3D Imaging and Modelling (3DIM) (2005), pp. 21–30.

[GB94] Gerrard A., Burch J. M.: Introduction to Matrix Methods in Optics.

[Geo98] Georgiev T.: 3D graphics based on images and morphing. US Patent 6,268,846 (1998).

[GGSC96] Gortler S. J., Grzeszczuk R., Szeliski R., Cohen M. F.: The Lumigraph. ACM Trans. Graph. (1996), 43–54.

[GS85] Guillemin V., Sternberg S.: Symplectic Techniques in Physics.

[GW97] Georgiev T., Wainer M.: Morphing between multiple images. Tech. Rep. (1997).

[IMG00] Isaksen A., McMillan L., Gortler S. J.: Dynamically reparameterized light fields. ACM Trans. Graph. (2000), 297–306.

[Ive28] Ives H.: Camera for making parallax panoramagrams. J. Opt. Soc. Amer. 17 (1928), 435–439.

[LH96] Levoy M., Hanrahan P.: Light field rendering. ACM Trans. Graph. (1996), 31–42.

[Lip08] Lippmann G.: Épreuves réversibles donnant la sensation du relief. J. Phys. 7 (1908), 821–825.

[LWS98] Lee S., Wolberg G., Shin S.: Polymorph: Morphing among multiple images. IEEE Computer Graphics and Applications (1998).

[NLB∗05] Ng R., Levoy M., Brédif M., Duval G., Horowitz M., Hanrahan P.: Light field photography with a hand-held plenoptic camera. Tech. Rep. (2005).

[NYH01] Naemura T., Yoshida T., Harashima H.: 3D computer graphics based on integral photography. Optics Express 8, 2 (2001).

[OAHY99] Okano F., Arai J., Hoshino H., Yuyama I.: Three-dimensional video system based on integral photography. Optical Engineering 38, 6 (1999).

[OHAY97] Okano F., Hoshino H., Arai J., Yuyama I.: Real-time pickup method for a three-dimensional image based on integral photography. Applied Optics 36, 7 (1997), 1598–1603.

[SD96] Seitz S. M., Dyer C. R.: View morphing. ACM Trans. Graph. (1996), 21–30.
(b) Synthetic aperture results with the focal plane moving from near to far.
(c) Synthetic aperture results with varying depth of field. (Left image demonstrates sparse sampling.)