AI-segmented intersections from satellite imagery for 100 largest US cities
Description
This dataset comprises approximately 3 million segmented images of urban intersections, produced by a computer-vision model. The source satellite images were retrieved from the Google Maps API and are centered on street intersections in the 100 largest U.S. cities by population. Each image was then segmented so that drivable space (roadways) is differentiated from non-drivable space (sidewalks, buildings, parks, open space, etc.). This supports a range of analyses of the physical dimensions of the built environment, such as the position, alignment, and surface area of streets, sidewalks, and surrounding features.
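As an illustration of the surface-area analyses mentioned above, the following minimal sketch estimates the drivable share and drivable area of one segmented image. It is not part of the dataset tooling. It assumes the mask is available as rows of 0/1 values in which 1 marks non-drivable pixels, and that the imagery follows the standard Web Mercator tiling used by Google Maps. The zoom level, the resize factor, and the function names are hypothetical, since the record does not state them.

```python
import math

# Equatorial circumference of the WGS84 ellipsoid, in metres.
EARTH_CIRCUMFERENCE_M = 2 * math.pi * 6378137.0


def metres_per_pixel(lat_deg, zoom, tile_size=256):
    """Ground resolution of a Web Mercator map pixel at a given latitude."""
    world_px = tile_size * 2 ** zoom
    return EARTH_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg)) / world_px


def drivable_stats(mask, lat_deg, zoom, resize_factor=1.0):
    """Summarise one mask.

    mask          -- rows of 0/1, where 1 marks a non-drivable pixel (assumed)
    resize_factor -- native map pixels per mask pixel (assumed; depends on
                     how the image was resized to 1024x1024)
    """
    total = sum(len(row) for row in mask)
    non_drivable = sum(sum(row) for row in mask)
    drivable = total - non_drivable
    pixel_area_m2 = (metres_per_pixel(lat_deg, zoom) * resize_factor) ** 2
    return {
        "drivable_share": drivable / total,
        "drivable_m2": drivable * pixel_area_m2,
    }


# Toy 4x4 mask: one drivable row crossing an otherwise non-drivable image.
toy_mask = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
stats = drivable_stats(toy_mask, lat_deg=39.29, zoom=19)
```

Because Web Mercator stretches distances away from the equator, the latitude term matters when comparing cities: the same pixel count covers less ground in Boston than in Jacksonville.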
The segmentation was performed with the Segment Anything Model (SAM), developed and released by Meta AI. Before segmentation was run on all intersection images, SAM was fine-tuned on manually annotated intersection images from 14 cities covering a wide range of densities and urban forms (Arlington, TX; Baltimore, MD; Boston, MA; Denver, CO; Durham, NC; Glendale, AZ; Huntsville, AL; Irvine, CA; Jacksonville, FL; Oklahoma City, OK; Omaha, NE; Reno, NV; San Francisco, CA; and Stockton, CA). In total, 193 high-entropy intersection images were selected for manual annotation. To enlarge this training set, all manually annotated images were horizontally and vertically flipped, rotated (clockwise, counter-clockwise, and upside down), Gaussian blurred (up to 1.5px), and perturbed with random noise (up to 1.05% of pixels). These augmentations expanded the training set to a final collection of 5,790 images.
Following this fine-tuning, SAM produced pixel-level masks of non-drivable surfaces for each of the three million intersection images drawn from all 100 cities. Each intersection image was first resized to 1024x1024 pixels to meet the model’s input requirements. As part of the segmentation process, the masks of non-drivable space were post-processed to extract contours, which were then converted to polygonal geometries. To maintain spatial accuracy, each segmented image’s file name encodes its center-point coordinates, which were later used to compute the corresponding bounding box. This allows the resulting polygons to be mapped back to their real-world geographic locations and keeps them aligned with the original satellite imagery. The segmented images total 50 gigabytes.
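The georeferencing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code. It assumes the imagery uses the Web Mercator projection of Google Maps. The file-name pattern (`<lat>_<lon>.png`), the zoom level (19), and the native image size (640 pixels) are placeholders, because the record does not specify them.

```python
import math

TILE_SIZE = 256  # width in pixels of the whole Web Mercator world at zoom 0


def parse_center(file_name):
    """Read (lat, lon) from a hypothetical '<lat>_<lon>.png' file name."""
    stem = file_name.rsplit(".", 1)[0]
    lat_text, lon_text = stem.split("_")
    return float(lat_text), float(lon_text)


def to_world_px(lat, lon, zoom):
    """Project WGS84 degrees to Web Mercator pixel coordinates."""
    scale = TILE_SIZE * 2 ** zoom
    x = (lon + 180.0) / 360.0 * scale
    sin_lat = math.sin(math.radians(lat))
    y = (0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * scale
    return x, y


def to_lat_lon(x, y, zoom):
    """Inverse of to_world_px."""
    scale = TILE_SIZE * 2 ** zoom
    lon = x / scale * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y / scale))))
    return lat, lon


def mask_px_to_lat_lon(px, py, center, zoom, native_size, mask_size=1024):
    """Map a pixel of the resized mask to (lat, lon).

    native_size is the assumed image width before resizing to mask_size.
    """
    cx, cy = to_world_px(center[0], center[1], zoom)
    step = native_size / mask_size  # native map pixels per mask pixel
    x = cx + (px - mask_size / 2) * step
    y = cy + (py - mask_size / 2) * step
    return to_lat_lon(x, y, zoom)


def bounding_box(center, zoom, native_size, mask_size=1024):
    """Return (west, south, east, north) in degrees for one image."""
    north, west = mask_px_to_lat_lon(0, 0, center, zoom, native_size, mask_size)
    south, east = mask_px_to_lat_lon(
        mask_size, mask_size, center, zoom, native_size, mask_size
    )
    return west, south, east, north


def polygon_to_lon_lat(polygon_px, center, zoom, native_size, mask_size=1024):
    """Convert a polygon given in mask pixels to (lon, lat) vertices."""
    ring = []
    for px, py in polygon_px:
        lat, lon = mask_px_to_lat_lon(px, py, center, zoom, native_size, mask_size)
        ring.append((lon, lat))
    return ring


center = parse_center("40.712800_-74.006000.png")  # hypothetical file name
bbox = bounding_box(center, zoom=19, native_size=640)
```

Working in projected pixel space and converting to degrees only at the end keeps the mapping exact at the image corners. A linear interpolation in latitude would drift slightly, because Web Mercator is not linear in latitude.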
Additional details
Related works
- Is documented by
- Publication: 10.5198/jtlu.2026.2771 (DOI)
- References
- Software: https://github.com/facebookresearch/segment-anything (URL)
Dates
- Collected
- 2024-12