Diffusion & Large Vision Models Workshop
This workshop explores the evolution of computer vision from early classification models to modern generative systems powered by diffusion and large vision models. Through a mix of theory and practical insights, learners will understand how these models work and how they’re applied in real-world scenarios.
- Overview of key milestones in computer vision, from CNNs to Vision Transformers.
- Introduction to multimodal learning with models like CLIP that connect vision and language.
- Deep dive into generative models: autoencoders, GANs, and diffusion models.
- Controllability and practical applications: inpainting, segmentation, text-to-image, and video generation.
By the end, participants will gain a strong conceptual understanding of how large vision models are designed, how they generate and edit images, and how they are shaping the future of generative models.
COURSE PREREQUISITES
- College-level Calculus and Linear Algebra.
- A basic understanding of Machine Learning concepts.
- Prior computer vision and deep learning knowledge is beneficial but not mandatory.
WORKSHOP INSTRUCTORS

Afshine Amidi
Adjunct Lecturer, ICME
Senior Machine Learning Scientist at Netflix

Shervine Amidi, ICME Alum
Adjunct Lecturer, ICME
Senior Software Engineer at Google
WORKSHOP OUTLINE
Part 1. Foundations of Vision Models and Representation Learning
Part 2. Multimodal Embeddings and Generative Models
Part 3. Applications, Challenges, and Future Directions
FREQUENTLY ASKED QUESTIONS
- Why is registration limited? To enhance individual engagement and hands-on learning, our workshop is held in person with a limited number of participants, with priority given to ICME affiliate members.
- How do I know if my company is an ICME affiliate? See the ICME Affiliate Program page: https://icme.stanford.edu/Affiliate-Program
- How do I pay for the workshop? After registering, you will receive an email asking for your payment.
- Do you offer group discounts? We offer a 10% group discount if an organization registers at least five people. For more information, please contact Tanya Schornack at tschorna@stanford.edu.
- When will I receive my certificate? Expect to receive it within one week after the workshop concludes.
- What is the refund policy? Refunds are processed on a case-by-case basis. For credit card payments, a 2.9% fee will be deducted from the total amount. To request a refund, please contact ICME at icme-contact@stanford.edu. ICME reserves the right to decline refund requests.
- Do I need to bring my own computer to the workshop? Yes, please bring your own computer, fully charged, as outlets are limited. Closer to the event, we will send instructions for accessing the class materials and the workshop website.
- Will lunch be provided during the workshop? Lunch will be provided for all participants during the workshop. If you have any dietary requirements, please let us know.
- Where will the workshop be held? The workshop is scheduled to take place on the Stanford main campus. Detailed information will be sent to you after registration.
- Where should I park? Visitor parking is available in designated Stanford campus areas. All visitor parking payments are contactless and managed through ParkMobile. Detailed information on the nearest parking lots will be sent to you after registration.
COURSE CANCELLATIONS
ICME reserves the right to cancel a course for any reason. If ICME cancels a course, you will automatically be granted a full refund, including the registration fee and any course fees.