Surfaces with Occlusions from Layered Stereo

Michael H. Lin
Carlo Tomasi

Abstract

Stereo, or the determination of 3D structure from multiple 2D images of a scene, is one of the fundamental problems of computer vision. Although steady progress has been made in recent algorithms, producing accurate results in the neighborhood of depth discontinuities remains a challenge. Moreover, among the techniques that best localize depth discontinuities, it is common to work only with a discrete set of disparity values, hindering the modeling of smooth, non-fronto-parallel surfaces.
This dissertation proposes a three-axis categorization of binocular stereo algorithms according to their modeling of smooth surfaces, depth discontinuities, and occlusion regions, and describes a new algorithm that simultaneously lies in the most accurate category along each axis. To the author's knowledge, it is the first such algorithm for binocular stereo.
The proposed method estimates scene structure as a collection of smooth surface patches. The disparities within each patch are modeled by a continuous-valued spline, while the extent of each patch is represented via a labeled, pixelwise segmentation of the source images. Disparities and extents are alternately estimated by surface fitting and graph cuts, respectively, in an iterative, energy minimization framework. Input images are treated symmetrically, and occlusions are addressed explicitly. Boundary localization is aided by image gradients.
Qualitative and quantitative experimental results are presented, which demonstrate that, for scenes consisting of smooth surfaces, the proposed algorithm significantly improves upon the state of the art, more accurately localizing both the depth of surface interiors and the position of surface boundaries. Finally, limitations of the proposed method are discussed, and directions for future research are suggested.
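
As an illustration only, the alternation between disparity fitting and extent labeling described above can be sketched on a single 1D scanline. This is a toy under stated assumptions, not the dissertation's implementation: a least-squares line per patch stands in for the continuous-valued spline, and a per-pixel nearest-model reassignment stands in for graph cuts (so there is no smoothness, occlusion, or image-gradient term). The names fit_line, residual, energy, and alternate are hypothetical.

```python
def fit_line(xs, ds):
    """Least-squares fit d = a*x + b (stand-in for the spline surface fit)."""
    n = len(xs)
    mx = sum(xs) / n
    md = sum(ds) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        return (0.0, md)  # a single pixel: fall back to a constant disparity
    a = sum((x - mx) * (d - md) for x, d in zip(xs, ds)) / sxx
    return (a, md - a * mx)


def residual(model, x, d):
    """Squared error between observed disparity d and the model at pixel x."""
    a, b = model
    return (d - (a * x + b)) ** 2


def energy(obs, labels, models):
    """Data-only energy: sum of squared residuals under the current labeling."""
    return sum(residual(models[l], x, d)
               for x, (d, l) in enumerate(zip(obs, labels)))


def alternate(obs, labels, num_labels, max_iters=20):
    """Alternate between fitting one model per label and relabeling pixels."""
    labels = list(labels)
    models = [(0.0, 0.0)] * num_labels
    for _ in range(max_iters):
        # Disparity step: refit each patch's model from its current pixels.
        for k in range(num_labels):
            xs = [x for x, l in enumerate(labels) if l == k]
            if xs:
                models[k] = fit_line(xs, [obs[x] for x in xs])
        # Extent step: assign each pixel to the best-fitting patch.
        # (Stand-in for graph cuts: no pairwise smoothness term is used.)
        new_labels = [
            min(range(num_labels), key=lambda k: residual(models[k], x, d))
            for x, d in enumerate(obs)
        ]
        if new_labels == labels:
            break  # the labeling is stable, so the models are too
        labels = new_labels
    return labels, models


# Two slanted (non-fronto-parallel) segments meeting at x = 10.
obs = [0.5 * x + 1.0 for x in range(10)] + [-0.2 * x + 12.0 for x in range(10, 20)]
# Initial segmentation with a misplaced boundary at x = 8.
init = [0] * 8 + [1] * 12
labels, models = alternate(obs, init, num_labels=2)
```

Each step can only lower or preserve the data energy, which is why the loop terminates once the labeling stops changing. A faithful version would replace the extent step with a graph-cut labeling that also penalizes label changes between neighboring pixels.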

Ph.D. dissertation

formatted as submitted (97 pages, 4050350 bytes): PDF file
formatted for compactness (65 pages, 3910586 bytes): PDF file
(updated 7 Jan 2003 to use Type 1 fonts throughout)

Other writeups

CVPR 2003 paper (8 pages, 2433636 bytes): gzipped PS file

Figures of results, as HTML + PNG

Selected references

Middlebury Stereo Vision Page (Daniel Scharstein and Richard Szeliski)
Multiway Cut for Stereo and Motion with Slanted Surfaces (Stan Birchfield and Carlo Tomasi)
