This repository contains the materials I used to teach Generative Models in Computer Vision from August 2022 to December 2024, both at Pontificia Universidad Católica de Chile and in diploma programs and workshops at other institutions.
I typically taught one or two sessions per semester and, over that time, developed two distinct classes, each built around a different learning objective:
- Historical Overview of Generative Computer Vision Models: Covers GANs, VAEs, Diffusion, Latent Diffusion, text-to-image generation, etc. This class is aimed at professionals seeking to enter the AI industry who need to understand generative modeling as one tool within a technology stack. Rather than focusing on mathematical details like the Kullback-Leibler divergence, I emphasize the historical progression of the field and how these models can be used in ML system development. This content is available at Modelos Generativos en CV.pdf.
- Deep Dive into Diffusion Models: We begin by examining the fundamental building blocks of this area (DDPMs, U-Nets, DDIM, Classifier-Free Guidance) before introducing modern formulations and techniques like DreamBooth and ControlNet. This class targets those who want a thorough understanding of the mathematics behind diffusion models and of how modern models evolved, giving them an informed foundation and up-to-date vocabulary for going deeper into this area. This content is available at Difusion I and Difusion II.
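As a pointer to the kind of mathematics the second class covers, here are two standard results behind DDPMs and Classifier-Free Guidance. The notation follows common usage and is an assumption; the slides may use different symbols or a different guidance-scale convention.

```latex
% DDPM forward process in closed form: x_t can be sampled directly from x_0.
% \beta_s is the noise schedule; \bar{\alpha}_t is the cumulative signal retention.
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t)\, I\right),
\qquad \bar{\alpha}_t = \prod_{s=1}^{t} (1 - \beta_s)

% Classifier-Free Guidance: combine conditional and unconditional noise predictions.
% c is the condition, \varnothing the null condition, w the guidance scale.
\tilde{\epsilon}_\theta(x_t, c) = \epsilon_\theta(x_t, \varnothing)
  + w \left( \epsilon_\theta(x_t, c) - \epsilon_\theta(x_t, \varnothing) \right)
```

Under this convention, w = 0 gives the unconditional prediction, w = 1 gives the plain conditional prediction, and w > 1 pushes samples further toward the condition.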
Additionally, I created a notebook to familiarize attendees with the Diffusers library, demonstrating how straightforward it is to interact with state-of-the-art diffusion models.
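The snippet below is a minimal sketch of the kind of interaction the notebook is meant to show; it is not the notebook's actual code. The function name `generate_image`, the checkpoint `google/ddpm-cat-256`, the step count, and the seed are all assumptions chosen for illustration. It also assumes that `torch` and `diffusers` are installed and that the checkpoint can be downloaded from the Hugging Face Hub.

```python
def generate_image(model_id="google/ddpm-cat-256", num_inference_steps=25, seed=0):
    """Sample one image from a pretrained unconditional DDPM checkpoint.

    All default values are illustrative assumptions, not the course's settings.
    """
    # Imported lazily so the heavy dependencies load only when sampling.
    import torch
    from diffusers import DDPMPipeline

    # Download (or load from the local cache) the model and its scheduler.
    pipe = DDPMPipeline.from_pretrained(model_id)

    # Use a GPU when one is available; sampling on CPU works but is slow.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    pipe = pipe.to(device)

    # A seeded generator makes the initial noise reproducible.
    generator = torch.Generator(device="cpu").manual_seed(seed)

    # Run the reverse diffusion loop; the output holds a list of PIL images.
    result = pipe(num_inference_steps=num_inference_steps, generator=generator)
    return result.images[0]
```

Calling `generate_image().save("sample.png")` would write the sampled image to disk. Fewer inference steps trade image quality for speed, which is a convenient knob for a live demo.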