Lytro vs Mask Based Light Field Camera

Video Demonstration

Light Field Datasets

Now you can convert your medium format digital/film camera into a 4D light field camera, in about 5 minutes and for less than 5 dollars. Sounds fun? Let's see how we did it.

What is a Light Field Camera?

There has recently been great enthusiasm about Lytro, the first company to offer a commercial light field camera. Congrats to the Stanford/Lytro team for bringing this technology to market.

A light field camera captures the variations in the rays falling on the sensor. A traditional 2D camera (your favorite point & shoot or SLR camera) outputs a 2D image, a grid of integer values specifying the intensity at each pixel. Imagine a point in the world which is in sharp focus at the sensor. The cone of rays emerging from this point and refracted by the lens falls on a single pixel on the sensor. The intensity value of the pixel is equal to the sum of the intensities of all the rays falling onto the pixel. However, we would like to capture the variation among the rays falling onto a sensor pixel. By capturing the ray-space, we can obtain all the light information available in geometric ray optics inside the camera.

The ray-space is four dimensional. Imagine you are standing on the sensor. You have two degrees of freedom in where to stand on the sensor plane (x-y). At each such position, the light rays can hit you from any direction, specified by the azimuth and elevation. Thus, ray-space has four dimensions, ignoring attenuation (and wavelength and time dependency) along the rays. In fancier terms, 2D cameras are integration devices. The image formed is an integral of the 4D ray-space along the angular dimensions.
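The integration can be illustrated with a toy example. This is only a sketch: the nested-list layout `light_field[y][x][v][u]`, the function name, and the sample values are assumptions made for illustration, not data from a real camera.

```python
def integrate_angular(light_field):
    """Collapse a 4D light field L[y][x][v][u] into a 2D image.

    Each output pixel is the sum of all rays (over the angular
    indices v, u) arriving at sensor position (y, x).
    """
    return [
        [sum(sum(row_u) for row_u in angular_block) for angular_block in row_x]
        for row_x in light_field
    ]


# Toy light field: 2 x 2 spatial samples, each holding 2 x 2 angular samples.
toy_light_field = [
    [[[1, 2], [3, 4]], [[0, 0], [0, 8]]],
    [[[5, 5], [5, 5]], [[1, 0], [0, 1]]],
]

image = integrate_angular(toy_light_field)
print(image)  # [[10, 8], [20, 2]]
```

The 2D image keeps only one number per pixel, so the way the rays were distributed over direction at that pixel is no longer recoverable from the image alone.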

How can we capture the Light Field?

There are several ways to capture the light field. One way is to put a lenslet array in front of the sensor such that the main lens is focused on the lenslet array and the lenslet array is focused on the sensor [1]. This is the approach used by Lytro, which is offering the first commercial light field camera. The cone of rays from a focused scene point now falls on a lenslet, which diverts the rays to different pixels on the sensor. One can thus capture the angular variation among rays. This is a form of integral imaging, which has roots as far back as 1908, when Lippmann proposed such a design for 3D photography [2]. However, spatial resolution is lost since the sensor pixels are now used to sample the angular variations. A hand-held light field camera using lenslets was demonstrated by Ren Ng at Stanford in 2005 [1], with beautiful results on light field applications such as digital refocusing. Their proposed design uses a Contax medium format camera, resulting in approximately 300 by 300 spatial resolution and 14 by 14 angular resolution.
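The trade-off between spatial and angular samples can be checked with simple arithmetic. The sketch below assumes each lenslet covers a square block of pixels with none wasted, which is an idealization; the function names are hypothetical and the 300 and 14 figures are the approximate values quoted above.

```python
def sensor_pixels_needed(spatial, angular):
    """Pixels per side needed if each lenslet covers angular x angular pixels."""
    return spatial * angular


def spatial_resolution(sensor_pixels, angular):
    """Spatial samples per side left over for a given angular resolution."""
    return sensor_pixels // angular


side = sensor_pixels_needed(300, 14)
print(side)                          # 4200 pixels per side
print(side * side)                   # 17640000 pixels in total
print(spatial_resolution(side, 14))  # 300 spatial samples per side
```

Under these assumptions, a sensor with several thousand pixels per side yields only a few hundred spatial samples per side once each lenslet spends 14 by 14 pixels on direction.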

The above approaches are based on refractive elements. Our design [3] is based on non-refractive elements such as masks [4]. In our design, a pinhole array mask (a transparency) is placed in front of the sensor. Each pinhole samples the angular variation by forming an image of the aperture on the sensor.


As shown in the video and figure above, we used a Mamiya 645ZD medium format camera with a 22-megapixel digital back that has a 36mm by 48mm Dalsa CCD imaging sensor. The sensor resolution is 5344 by 4008 pixels. A 1.2mm thick glass protects the sensor. We printed a pinhole array mask of the same size and simply dropped it on top of the sensor's protective glass. We used an additional glass piece to push and flatten the mask and hold it in place. The entire procedure to put the mask in the camera takes less than one minute. A single A4 sized transparency holding 20 masks can be printed for less than $100, making the additional cost of our setup just $5 per mask.

Mask Printing

We printed the mask at a local printing company (Pageworks). The masks could be printed at 5080 DPI. In our design, we chose a square pinhole opening of 25 microns width.
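The numbers above can be related to each other with a quick calculation. This is a sketch under one assumption: that the 48mm side of the sensor corresponds to the 5344-pixel side and the 36mm side to the 4008-pixel side. The function names are hypothetical.

```python
MICRONS_PER_INCH = 25400.0


def dot_size_microns(dpi):
    """Size of one printed dot, in microns, at the given print resolution."""
    return MICRONS_PER_INCH / dpi


def pixel_pitch_microns(sensor_mm, pixels):
    """Center-to-center pixel spacing along one sensor side, in microns."""
    return sensor_mm * 1000.0 / pixels


dot = dot_size_microns(5080)             # printed dot size
pinhole_dots = 25 / dot                  # pinhole width in printed dots
pitch_x = pixel_pitch_microns(48, 5344)  # assumed long side of the sensor
pitch_y = pixel_pitch_microns(36, 4008)  # assumed short side of the sensor

print(dot, pinhole_dots)                     # 5.0 5.0
print(round(pitch_x, 2), round(pitch_y, 2))  # 8.98 8.98
print(round(25 / pitch_x, 2))                # 2.78 pixels per pinhole width
```

So a 25 micron opening is five printed dots wide at 5080 DPI, and slightly under three sensor pixels wide on this sensor.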


What are the advantages and disadvantages of using masks for light field capture?


Light field capture using masks has several advantages.


1. Low cost and ease of use: Masks can be printed at very low cost and can be easily placed inside the camera. As reported in the Stanford tech report by Ren Ng et al. [1], the lenslet based design requires high precision, since the main lens should be focused on the lenslet array and the lenslet array in turn has to focus on the sensor. Masks offer flexibility since the rays are attenuated by the mask rather than refracted. As the above video shows, it is fairly easy to insert a mask inside the camera. Moreover, replacing a mask is easy compared to replacing a lenslet array. A photographer can thus swap masks on the fly to suit his/her needs.


2. Obtaining Full Resolution Image: A single-shot light field capture always loses spatial resolution, as the sensor pixels are now used to sample the angular variation in rays. The number of sensor pixels = spatial resolution × angular resolution. For a traditional camera, the angular resolution is 1, so the spatial resolution is the same as the sensor resolution. For example, if we want to capture a light field with 5 by 5 angular resolution, then the spatial resolution will decrease by a factor of 5 in both the x and y dimensions. This loss of resolution is inherent in capturing a light field, whether we use masks or lenslets. But by using masks, we can avoid the loss of spatial resolution in certain cases.

For example, suppose we are looking at a painting in sharp focus (a planar Lambertian scene). Such a scene does not have extra information in the angular dimensions, so capturing a light field is redundant in this case. If we use masks, we can obtain a full resolution 2D image of the painting by simply dividing the captured photo by a calibration photo. The calibration photo is a photo of a uniform intensity light box (typically used for vignetting correction).


The details are in the paper, but an intuitive explanation is as follows. If a scene point is in focus, then all rays emerging from this scene point fall on the same sensor pixel. Inserting a mask simply blocks some of these rays. So the resulting image is dimmer, but otherwise no spatial information is lost. By dividing by the calibration image, the intensity variation in each pixel due to the mask can be compensated.
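The division step can be sketched on a tiny synthetic example. Everything here is an assumption for illustration: the function name, the sample values, a noise-free sensor, and a light box of unit intensity so that the calibration photo equals the per-pixel attenuation of the mask. It is not the processing code from the paper.

```python
def divide_by_calibration(captured, calibration, eps=1e-6):
    """Pixel-wise captured / calibration.

    eps only guards against division by zero; a pixel that the mask
    blocks completely carries no signal and cannot be recovered.
    """
    return [
        [c / max(k, eps) for c, k in zip(cap_row, cal_row)]
        for cap_row, cal_row in zip(captured, calibration)
    ]


# Hypothetical in-focus scene intensities (what an unmasked camera would see).
scene = [[0.8, 0.2], [0.5, 1.0]]
# Hypothetical per-pixel mask attenuation, i.e. the photo of a uniform light box.
mask_gain = [[0.5, 0.25], [0.1, 0.5]]

# Simulated masked capture: each pixel is dimmed by the mask.
captured = [[s * g for s, g in zip(s_row, g_row)]
            for s_row, g_row in zip(scene, mask_gain)]

recovered = divide_by_calibration(captured, mask_gain)
print(recovered)  # matches `scene` up to floating-point rounding
```

In practice the strongly attenuated pixels would also be the noisiest after division, which this noise-free sketch does not show.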


The biggest disadvantage of using masks is the loss of light, since masks are attenuators. If we use a pinhole array mask, then only 5 percent of the light goes through; the rest is blocked. For outdoor sunlit scenes, we can use shutter speeds of 0.5 seconds, which could lead to motion blur. However, for glare reduction, we showed several outdoor examples on static scenes. The light throughput can be increased by using a sum-of-cosine mask [4]. The theory behind it, which we first described in our SIGGRAPH 2007 paper [3], can be used to explain most of the light field capture designs going back to the start of the 20th century.
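To put the 5 percent figure in photographic terms, the short calculation below converts a throughput fraction into an exposure factor and a number of stops. It assumes aperture and sensitivity are held fixed, so all of the lost light has to be made up with shutter time; the function name is hypothetical.

```python
import math


def exposure_penalty(throughput):
    """Return (exposure factor, stops of light lost) for a throughput fraction."""
    factor = 1.0 / throughput
    stops = math.log2(factor)
    return factor, stops


factor, stops = exposure_penalty(0.05)
print(round(factor), round(stops, 2))  # 20 4.32

# Shutter time that would give the same exposure without the mask.
unmasked_shutter = 0.5 * 0.05
print(unmasked_shutter)  # 0.025
```

Under these assumptions, a 0.5 second masked exposure collects about as much light as a 1/40 second exposure without the mask.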

What mask patterns can be used?

In our SIGGRAPH 2008 paper, we experimented with both uniform and randomized pinhole arrays. For the randomized pinhole array, the location of each pinhole was randomly perturbed within some distance. Although capturing a light field using a uniform pinhole array is straightforward, the spatial structure is lost when using a randomized mask. However, we show that for glare reduction, randomized masks are useful without the need for reconstructing the light field inside the camera. By using randomized masks, we can avoid the loss of spatial resolution inherent in light field reconstruction and can obtain visually pleasing results. See Figure 8 of our SIGGRAPH 2008 paper.
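The difference between the two patterns can be sketched as follows. This is an illustration only: the function name, the 50 micron pitch, the 10 micron jitter limit, and the choice of a uniform distribution for the perturbation are all assumptions, since the text above only says that each pinhole was perturbed within some distance.

```python
import random


def pinhole_centers(rows, cols, pitch, max_jitter=0.0, seed=0):
    """Return (x, y) pinhole centers on a grid with spacing `pitch`.

    Each center is shifted by up to `max_jitter` along x and y;
    max_jitter=0 gives a uniform array.
    """
    rng = random.Random(seed)  # seeded so the pattern is reproducible
    centers = []
    for r in range(rows):
        for c in range(cols):
            dx = rng.uniform(-max_jitter, max_jitter)
            dy = rng.uniform(-max_jitter, max_jitter)
            centers.append((c * pitch + dx, r * pitch + dy))
    return centers


# Hypothetical values, in microns.
uniform = pinhole_centers(3, 4, pitch=50.0)
randomized = pinhole_centers(3, 4, pitch=50.0, max_jitter=10.0, seed=1)

print(uniform[5])     # (50.0, 50.0), the pinhole at row 1, column 1
print(randomized[5])  # the same pinhole, shifted by at most 10 microns per axis
```

Keeping the seed fixed means the same randomized pattern can be regenerated later if the pinhole positions are needed again.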

History of Integral Imaging


(All papers & patents referenced below can be downloaded from


Integral imaging has a long history. It was first proposed in 1908 by Lippmann and demonstrated in 1911. Sokolov (1911) used a pinhole aperture sheet to demonstrate the idea. These ideas didn’t use a main lens to focus the scene on the lenticular arrays/lenslets. Ives in 1930 incorporated a larger aperture camera lens in front. Kanolt (1933) also experimented with pinhole arrays and a large objective lens. Coffey (1933) figured out the relationship between the main lens and the lenslet design: f-number matching, which was also shown by Ren Ng [2005]. The first experiments using a proper lens array were performed in 1948 by S.P. Ivanov and L.V. Akimakina. Several designs for making lens arrays were subsequently proposed. In recent years, Ren Ng [2005] showed a handheld camera directly suitable for consumer photography. Our group showed a mask based approach in 2007 [3]. Our paper on glare reduction [4] goes a step beyond light field capture and its usual applications such as digital refocusing. We show that one can reduce glare by uniform and non-uniform ray-sampling without reconstructing a light field.





[1] Ng, R., Levoy, M., Brédif, M., Duval, G., Horowitz, M., AND Hanrahan, P. 2005. Light field photography with a hand-held plenoptic camera. Tech. rep., Stanford Univ.


[2] Lippmann, G. 1908. Épreuves réversibles donnant la sensation du relief. J. Phys. 7, 821–825.


[3] Veeraraghavan, A., Raskar, R., Agrawal, A., Mohan, A., AND Tumblin, J. 2007. Dappled photography: Mask enhanced cameras for heterodyned light fields and coded aperture refocusing. ACM Trans. Graph. 26, 3 (July), 69:1–69:12.

[4] Raskar, R., Agrawal, A., Wilson, C., AND Veeraraghavan, A. 2008. Glare aware photography: 4D ray sampling for reducing glare effects of camera lenses. SIGGRAPH 2008.