I will present my work on recognizing materials, scenes, and objects in photographs, key computer vision problems that are made challenging by the seemingly limitless variability of natural imagery. Currently, even the most advanced recognition systems lack the geometric invariance, robustness, and flexibility to cope with the full range of this variability. To overcome these limitations, I have developed several approaches combining salient local image features with spatial relations and discriminative learning techniques. First, I will discuss a simple yet effective orderless image representation that was originally designed for the problem of recognizing images of textured surfaces subjected to viewpoint changes and non-rigid deformations. In a large-scale comparative evaluation, this method has also performed well for object categorization despite substantial clutter and occlusion. Next, I will discuss an extension of this method that incorporates global spatial information for classification of natural scene categories. Finally, I will describe a part-based object recognition approach that supports the learning of robust and geometrically invariant object models from small sets of unsegmented, cluttered training images. Baseline comparisons show that each of the proposed approaches can outperform the state of the art on challenging datasets. Apart from my work on recognition, I am also interested in acquiring high-fidelity 3D models of objects from photographs and video. In this area, I have worked on image-based techniques for reconstructing 3D shapes from silhouettes and local texture, with applications to 3D photography and video shot matching.