Streetview Image Inpainting with Image-Edit Diffusion Models
2026-09-02, Conference Management Room5

This talk shows how to use ComfyUI and the Qwen Image model to build a "magic eraser" for high-res street imagery. We’ll dive into how to use AI-driven masks to scrub away the clutter while keeping the city’s geometry intact, plus some tips on scaling this up.


Street-level imagery captured from vehicle-mounted cameras often contains dynamic and undesirable elements such as vehicles, pedestrians, signboards, glare artifacts, or privacy-sensitive regions. These elements limit downstream applications including mapping, urban analysis, visualization, and dataset preparation for computer vision tasks. In this talk, I present a practical, end-to-end inpainting pipeline built with ComfyUI and the Qwen Image-Edit diffusion model, and show how to make it work for high-resolution street-view imagery.

The workflow combines prompt-guided inpainting with mask-based control to selectively remove or modify objects while preserving structural and semantic consistency. I will walk through how segmentation masks are integrated into ComfyUI graphs and how Qwen Image handles context-aware reconstruction.

I'll also talk about why other inpainting methods failed at this task and why only certain models succeed. Join to learn how I combined an image-edit model with a segmentation model and applied them to high-res panoramic imagery.
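
As a rough illustration of mask-based control on a large image (a sketch under stated assumptions, not the ComfyUI graph presented in the talk), the Python snippet below crops a padded window around a segmentation mask, hands only that crop to an editing function, and pastes back only the masked pixels so the rest of the scene stays untouched. The names `mask_bbox`, `inpaint_masked_region`, `edit_fn`, and `fill_with_mean` are hypothetical; `edit_fn` stands in for the diffusion model call, which is not shown, and the default padding of 64 pixels is an arbitrary choice.

```python
import numpy as np


def mask_bbox(mask, pad):
    """Bounding box (y0, y1, x0, x1) of the True pixels, grown by `pad` and clipped."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # nothing to remove
    h, w = mask.shape
    y0 = max(int(ys.min()) - pad, 0)
    y1 = min(int(ys.max()) + 1 + pad, h)
    x0 = max(int(xs.min()) - pad, 0)
    x1 = min(int(xs.max()) + 1 + pad, w)
    return y0, y1, x0, x1


def inpaint_masked_region(image, mask, edit_fn, pad=64):
    """Run `edit_fn` on a padded crop around the mask; paste back masked pixels only.

    image   : (H, W, 3) uint8 array
    mask    : (H, W) bool array, True where content should be removed
    edit_fn : hypothetical callable (crop, crop_mask) -> edited crop of the same
              shape; a placeholder for the image-edit model, not a real API
    """
    box = mask_bbox(mask, pad)
    if box is None:
        return image.copy()
    y0, y1, x0, x1 = box
    crop = image[y0:y1, x0:x1]
    crop_mask = mask[y0:y1, x0:x1]
    edited = edit_fn(crop, crop_mask)
    if edited.shape != crop.shape:
        raise ValueError("edit_fn must return an array with the crop's shape")
    out = image.copy()
    # Copy only the masked pixels, so everything outside the mask is unchanged.
    out[y0:y1, x0:x1][crop_mask] = edited[crop_mask]
    return out


def fill_with_mean(crop, crop_mask):
    """Toy stand-in for the model: fill with the mean colour of the unmasked context."""
    edited = crop.copy()
    edited[:] = crop[~crop_mask].mean(axis=0).astype(crop.dtype)
    return edited


# Synthetic example: a grey "street" with a bright rectangular "vehicle".
image = np.full((100, 200, 3), 50, dtype=np.uint8)
image[40:60, 80:120] = 255
mask = np.zeros((100, 200), dtype=bool)
mask[40:60, 80:120] = True

result = inpaint_masked_region(image, mask, fill_with_mean, pad=8)
```

Working on a padded crop keeps the input to the model small even when the source panorama is very large, while the padding gives it surrounding context. Pasting back through the mask is one simple way to guarantee that pixels outside the removed object are preserved exactly.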


Level of technical complexity: 2 - intermediate

Give indication of resources (video, web pages, papers, etc.) to read in advance, that will help get up to speed on advanced topics:

https://huggingface.co/Qwen/Qwen-Image-Edit-2509, https://huggingface.co/spaces/prithivMLmods/SAM3-Demo

Indicate what is (are) the open source project(s) essential in your talk:

All of the components in this project are open source: ComfyUI, SAM3, and a blurring model.

I make my conference contribution available under the CC BY 4.0 license. The conference contribution comprises the abstract, the text contribution for the conference proceedings, the presentation materials as well as the video recording and live transmission of the presentation:

Aman Bagrecha is a geospatial scientist and applied AI practitioner working at the intersection of remote sensing and computer vision. He is an active contributor to community learning through talks, workshops, and blogs. He started Let's Talk Spatial, a community in Bangalore, India for folks interested in geospatial technology, which has held 25+ events in the last 2 years alone: https://letstalkspatial.in/talks/

Find him at amanbagrecha.com