Our system automatically constructs simple "pop-up" 3D models, like those one would find in a children's book, from a single outdoor image. The system labels each region of the image as ground, vertical, or sky. Line segments fitted to the ground-vertical boundary, together with an estimate of the horizon's position, provide the information needed to determine where to "cut" and "fold" the image. The model is then popped up, and the image is texture-mapped onto it.
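To make the "cut and fold" geometry concrete, here is a minimal sketch of how a horizon estimate and a ground-vertical boundary segment could yield a 3D wall. It is an illustration under stated assumptions, not the system's actual implementation: it assumes a pinhole camera with no roll whose optical axis is parallel to the ground, a known focal length in pixels, and a nominal camera height. The function names, parameters, and the fixed wall height are all hypothetical.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def backproject_ground_pixel(u: float, v: float, horizon_v: float,
                             focal_px: float, cam_height: float = 1.0,
                             center_u: float = 0.0) -> Point3D:
    """Return the 3D ground point (X right, Y up, Z forward) seen at pixel (u, v).

    Assumes a level pinhole camera at height cam_height above the ground
    plane Y = 0, with image rows increasing downward.
    """
    dv = v - horizon_v
    if dv <= 0:
        raise ValueError("a ground pixel must lie below the horizon")
    # Similar triangles: a ground point dv pixels below the horizon
    # lies at depth focal_px * cam_height / dv.
    depth = focal_px * cam_height / dv
    x = (u - center_u) * depth / focal_px
    return (x, 0.0, depth)


def pop_up_wall(seg_start: Tuple[float, float], seg_end: Tuple[float, float],
                horizon_v: float, focal_px: float, cam_height: float = 1.0,
                center_u: float = 0.0, wall_height: float = 3.0) -> List[Point3D]:
    """Fold a vertical quad up from one ground-vertical boundary segment.

    Returns four corners: the two ground points, then the same points
    raised by the (assumed) wall_height.
    """
    base = [backproject_ground_pixel(u, v, horizon_v, focal_px,
                                     cam_height, center_u)
            for (u, v) in (seg_start, seg_end)]
    top = [(x, y + wall_height, z) for (x, y, z) in reversed(base)]
    return base + top


# Example: horizon at row 200, focal length 500 px, camera 1.5 units high.
corners = pop_up_wall((100, 300), (200, 300), horizon_v=200,
                      focal_px=500, cam_height=1.5, center_u=100)
```

Because the camera height is only assumed, the resulting model is correct up to an overall scale. In this sketch the image region above each boundary segment would then be texture-mapped onto the corresponding quad.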

This work is part of an ongoing effort in Geometrically Coherent Image Interpretation. In our ICCV'05 paper, "Geometric Context from a Single Image", we provide a quantitative analysis of our system and extend our work by subclassifying vertical regions and using the geometric labels as context for object detection. In our newest CVPR'06 paper, "Putting Objects in Perspective", we show how 3D reasoning can be used to aid object detection.


Some of this material is based upon work supported by the National Science Foundation under CAREER Grant No. IIS-0546547. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. This work is also partially supported by a Microsoft Research Fellowship awarded in 2006.