Our system automatically constructs simple "pop-up" 3D models, like those one would find in a children's book, out of a single outdoor image. The system labels each region of an outdoor image as ground, vertical, or sky. Line segments fitted to the ground-vertical boundary in the image and an estimate of the horizon's position provide the necessary information to determine where to "cut" and "fold" in the image. The model is then popped up, and the image is texture mapped onto the model.
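To illustrate why the ground-vertical boundary and the horizon are enough to place a fold in 3D, here is a minimal sketch. It is not the authors' code. It assumes a level pinhole camera over a flat ground plane, and the function names (`ground_point`, `fold_segment`) and the default focal length, camera height, and image center are hypothetical values chosen only for illustration.

```python
def ground_point(u, v, horizon_v, focal_px=800.0, cam_height=1.6, center_u=400.0):
    """Back-project image pixel (u, v) onto a flat ground plane.

    Assumptions (illustrative, not from the paper): level pinhole camera,
    image rows grow downward, all camera parameters are made-up defaults.
    Returns (x, y, z): x to the right, y up (ground is y = 0), z forward.
    """
    dv = v - horizon_v  # rows below the horizon
    if dv <= 0:
        raise ValueError("pixel must lie below the horizon to hit the ground")
    z = focal_px * cam_height / dv        # depth from similar triangles
    x = (u - center_u) * z / focal_px     # lateral offset at that depth
    return (x, 0.0, z)


def fold_segment(p0, p1, horizon_v, **cam):
    """Map the two endpoints of a ground-vertical boundary segment to 3D.

    The returned pair is the base edge along which an upright plane
    would be "folded" up from the ground.
    """
    return (ground_point(*p0, horizon_v, **cam),
            ground_point(*p1, horizon_v, **cam))


if __name__ == "__main__":
    # Hypothetical boundary segment, with the horizon at image row 300.
    base = fold_segment((200, 400), (600, 450), horizon_v=300)
    print(base)
```

With these made-up defaults, a boundary pixel 100 rows below the horizon lands roughly 12.8 units in front of the camera. Pixels closer to the horizon map to larger depths, which is why the horizon estimate matters for where each fold is placed.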
This work is part of an ongoing effort in Geometrically Coherent Image Interpretation. In our ICCV'05 paper, "Geometric Context from a Single Image", we provide a quantitative analysis of our system and extend our work by subclassifying vertical regions and using the geometric labels as context for object detection. In our newest CVPR'06 paper, "Putting Objects in Perspective", we show how 3D reasoning can be used to aid in object detection.
Software
Software is available here.
Dataset
Data is available here.
Results
Low Resolution Video (33 MB)
High Resolution Video (73 MB)
Amtrak VRML
Jesus College, Oxford VRML
Publications
D. Hoiem, A.A. Efros, and M. Hebert, "Automatic Photo Pop-up", ACM SIGGRAPH 2005.
pdf; presentation