Talk

Your Photo Gallery, but Smarter: A Local-First Semantic Image Search System That Runs on Your Laptop

Thursday, May 28

12:25 - 12:55
Room: Spaghetti
Language: English
Audience level: Intermediate
Elevator pitch

This talk explores how to build a local-first semantic image search system that runs entirely on your laptop. Using modern Python tools and local models such as CLIP, FAISS, BLIP, BERTopic, FastAPI, and Streamlit, we turn a photo collection into a library that can be searched by text or by example image.

Abstract

Modern image collections are growing faster than our ability to organize them. Filenames, folders, and manual tags quickly become limiting when searching large photo libraries. This talk presents a practical, local-first semantic image search system that runs entirely on your laptop, allowing users to search images by their semantic meaning using natural language or example images, without relying on cloud services.

Starting from a personal real-world Python project, the talk walks through the architecture and design choices behind an end-to-end multimodal system. We explore how models like CLIP and BLIP enable cross-modal understanding between text and images, how FAISS makes large-scale similarity search fast and efficient, and how BERTopic can be used to cluster images into meaningful topics. Local LLMs are integrated into the pipeline to generate human-readable topic labels and summaries for image clusters, turning raw embeddings and clusters into explanations that users can actually understand—all while keeping data fully local.
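
To make the retrieval step concrete, here is a minimal sketch of the idea behind embedding-based search: every image and every query is mapped to a vector, and results are ranked by cosine similarity. It is not the project's code. The three-dimensional vectors and file names are made up for illustration, standing in for CLIP embeddings, and the brute-force loop stands in for a FAISS index, which performs the same ranking far more efficiently on large collections. The helper names `normalize`, `build_index`, and `search` are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def build_index(embeddings):
    """Store one unit-length vector per image name (brute-force stand-in for FAISS)."""
    return {name: normalize(vec) for name, vec in embeddings.items()}


def search(index, query_vec, k=3):
    """Return the k image names most similar to the query, best match first."""
    q = normalize(query_vec)
    scored = [
        (sum(a * b for a, b in zip(q, vec)), name)
        for name, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, score) for score, name in scored[:k]]


# Made-up 3-dimensional "embeddings"; a real system would use model outputs
# with hundreds of dimensions.
toy_embeddings = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "mountain.jpg": [0.1, 0.9, 0.1],
    "pizza.jpg": [0.0, 0.1, 0.9],
}

index = build_index(toy_embeddings)

# The query vector plays the role of an embedded text prompt or example image.
results = search(index, [1.0, 0.2, 0.0], k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Because text and images share one embedding space in a model like CLIP, the same `search` call would serve both text queries and query-by-example; only the way the query vector is produced changes.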

On the engineering side, we look at how FastAPI and Streamlit come together to form a simple yet effective full-stack application for managing, searching, and exploring image collections.

While the talk touches on advanced concepts in multimodal AI and system design, everything is explained in terms that are as simple and intuitive as possible.

Attendees will leave with a clear mental model, architectural patterns, and concrete ideas they can reuse for their own projects involving multimodal search, local LLMs, and applied machine learning in Python.

Tags: Privacy, ML and AI, Data Science & Data Visualisation
Participant

Daniele Giunta

I am an AI Engineer specialized in generative artificial intelligence, passionate about football and curious about anything that can enrich my cultural background!

