ICME Summer Workshops 2023 Details

Workshops will be held online via Zoom from July 24 to August 11

ICME’s annual Summer Workshop Series will offer a variety of virtual data science and AI courses, taught live via Zoom by faculty, researchers, and Stanford alumni working in the industry. The series is open to the general public worldwide. Discounts are offered to students, staff, and faculty from all schools as well as to ICME industry partners.

The series offers:

  • Fourteen workshops that cover a spectrum of data science topics (see offerings below)
  • Each workshop is split into two half-day modules

Every participant who attends a workshop and completes the evaluation at the end will receive a Certificate of Completion for that workshop. In addition, participants looking to gain a broad knowledge base may work towards the ICME Fundamentals of Data Science Certificate. The Fundamentals certificate requires the completion of at least four workshops, one of which needs to be SWS 05 (Privacy & Responsible AI) or SWS 06 (Applications of Data Science). We strongly recommend that aspirants also complete SWS 07 (Python for Data Science). Note that anyone arriving with no previous Python experience should complete SWS 02 (Introduction to Python) before SWS 07. For more details please see our FAQ.

To join the workshop, participants will need a device with a recent web browser and two-way audio and video access to Zoom. This can be a laptop or desktop computer running any operating system (Windows, Mac, or Linux). Participative activities may benefit from a larger screen, so joining via a smartphone or tablet is not recommended.

If you have any questions, please contact: sws-contact@stanford.edu

ICME Summer Workshops Class Information

01. Introduction to Statistics 

Monday, July 24 & Tuesday, July 25, 2023  |  8:00 AM - 11:00 AM PDT

The Introduction to Statistics workshop covers the fundamentals of statistics, which powers modern-day machine learning, deep learning, and data science. This workshop will provide an overview of the key methodologies of statistics, which is also known as the science of learning from data. We will cover basic techniques for visualizing data, sampling, and conducting experiments. We will then detail statistical approximations, including mean and standard deviation estimates from data, the normal approximation, and the central limit theorem, as well as common probability rules and distributions for different types of data. We will demonstrate the important concepts and pitfalls of regression as well as regression error analysis, including the bias-variance tradeoff, and how to do inference with confidence intervals and tests of hypotheses.

Programming examples are provided in Python to give participants hands-on experience with applying concepts for data analysis. By the end of this workshop, participants will have developed a foundational understanding and hands-on experience with the statistical fundamentals behind big data and data science, which will be important for the subsequent machine learning-related workshops.
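To give a flavor of the kind of Python example described above, here is a minimal sketch (not taken from the workshop materials) that estimates a mean and standard deviation from data and builds a 95% confidence interval using the normal approximation. The function name and the data values are hypothetical, chosen only for illustration. With a sample this small, a t-based interval would usually be preferred; the normal multiplier 1.96 is used here only to keep the sketch short.

```python
import math
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Return (mean, sd, (low, high)) using the normal approximation.

    z=1.96 corresponds to an approximate 95% interval.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    half_width = z * sd / math.sqrt(n)  # z times the standard error of the mean
    return mean, sd, (mean - half_width, mean + half_width)


# Hypothetical measurements, made up for illustration only.
data = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]
mean, sd, (low, high) = mean_confidence_interval(data)
print(f"mean={mean:.3f}, sd={sd:.3f}, 95% CI=({low:.3f}, {high:.3f})")
```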

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below. 

While the workshop does not assume previous coursework in statistics, to benefit most, you should have:

  • Mathematics Fundamentals: Participants should have a solid foundation in mathematics, including a good understanding of arithmetic, algebra, and basic geometry. 
  • Basic Probability Knowledge: Probability theory is an essential component of statistics. Participants should have a basic understanding of probability concepts, such as events, outcomes, and probabilities. Knowledge of calculating probabilities, understanding probability distributions, and interpreting basic probability statements will be beneficial.
  • Familiarity with Descriptive Statistics: Participants should have some exposure to descriptive statistics, which involves summarizing and interpreting data. Knowledge of measures such as mean, median, mode, standard deviation, and variance will provide a solid foundation for understanding statistical concepts during the workshop.

About the Instructor: Danielle Maddix

Danielle Maddix Robinson is a Senior Applied Scientist in the Machine Learning Forecasting Group within Amazon Web Services (AWS) AI Labs. She graduated with her PhD in Computational and Mathematical Engineering from the Institute for Computational and Mathematical Engineering (ICME) at Stanford University. She was advised by Professor Margot Gerritsen and developed robust numerical methods to remove spurious temporal oscillations in the degenerate Generalized Porous Medium Equation. She is passionate about the underlying numerical analysis, linear algebra and optimization methods behind numerical PDEs and applying these techniques to deep learning. She joined AWS in 2018 shortly after graduating, and has been working on developing statistical and deep learning models for time series forecasting. In this past year, she has been leading the research initiative on developing models for physics-constrained machine learning for scientific computing. In particular, she has researched how to apply ideas from numerical methods, e.g., finite volume schemes, to improve the accuracy of black-box ML models for ODEs and PDEs with applications to epidemiology, aerodynamics, ocean and climate models.


02. Introduction to Python 

Monday, July 24 & Tuesday, July 25, 2023  |  1:00 PM - 4:00 PM PDT

Welcome to the world of Python programming! Whether you're a beginner or an experienced coder, Python can be the perfect programming platform for you. With its ease of use, versatility, and power, Python can be applied in a wide range of situations, from simple scripting to advanced tasks like training large language models. Join us in this interactive workshop to discover the joy of programming in Python! Throughout the workshop, you will engage in a variety of activities, including captivating slides and lectures, stimulating group exercises, enjoyable quizzes, and even a "homework" assignment to tackle between sessions. 

By the end of the workshop, you will be able to solve simple problems by applying your programming skills in Python. You will know how to utilize Python documentation and other resources to seek help and enhance your learning. You will gain a clear understanding of the specific tasks where Python excels. You will familiarize yourself with the capabilities of fundamental scientific and data science Python modules. You will recognize the place of Python within the broader programming language ecosystem.

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below. 

Please note that this workshop does not require a background in Python.

  • Basic Computer Literacy: Participants should have a basic understanding of computer operations and be comfortable using a computer. They should know how to navigate the file system, create, open, and save files, and have a basic understanding of software installation procedures.
  • Programming Concepts: Familiarity with fundamental programming concepts is required before attending a Python workshop. Participants should have a grasp of concepts such as variables, data types, loops, conditional statements, and functions. Prior exposure to any programming language will help participants quickly adapt to Python syntax and programming principles.

About the Instructor: Leopold Cambier

Leopold Cambier obtained his PhD from ICME at Stanford in 2021 under the guidance of Professor Eric Darve. His research concerned approximate direct solvers for very large systems of equations. At NVIDIA, Leopold develops fast distributed FFT algorithms for efficiently using thousands of GPUs simultaneously. He has also worked on Deep Learning frameworks such as JAX. 


03. Linear Algebra 

Wednesday, July 26 & Thursday, July 27, 2023  |  8:00 AM - 11:00 AM PDT

Have you ever wondered how Google delivers search results or how Netflix curates personalized movie recommendations? The answer lies in linear algebra. Join us to uncover the underlying principles and applications of linear algebra in data science. Through engaging examples from real-world data science scenarios such as searching, ranking, and regression, you will develop a deeper understanding of vectors and their operations, including the influential dot product. Matrices will also take center stage as we analyze their significance in various contexts, such as representing social networks, solving linear systems, and acting as operators.

Delving further, we will explore different types of matrices, such as orthogonal and triangular matrices, and their frequent application in data science through decompositions like Singular Value Decomposition (SVD) and QR decomposition. Additionally, we will unravel the inner workings of the PageRank algorithm used by Google in its search engine.
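As a rough illustration of the idea behind PageRank, the sketch below runs power iteration on a tiny, made-up link graph. It is a simplified textbook version, not Google's production algorithm and not the workshop's own code. The function name, the toy graph, and the damping value of 0.85 (a commonly quoted choice) are assumptions for illustration, and the sketch assumes every linked page also appears as a key in the dictionary.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration on a dict mapping each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web, made up for illustration only.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In matrix terms, each pass of the loop multiplies the current rank vector by a column-stochastic link matrix, which is why the ranking problem reduces to finding a dominant eigenvector.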

This workshop serves as a tantalizing introduction to linear algebra, aiming to ignite your curiosity and pave the way for further exploration. For those seeking to deepen their knowledge, we will provide accessible resources that expand on the material covered in the workshop. Join us on this exciting journey and uncover the fundamental role of linear algebra in shaping the algorithms driving modern data science.


Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below. 

This workshop does not require a background in linear algebra. In this workshop we will build on an understanding of basic algebraic concepts, such as solving equations and manipulating expressions, and coordinate systems in 2D to aid in visualizing geometric interpretations of vectors.

About the Instructor: Nadim Saad

Nadim Saad is a 5th year PhD candidate in ICME advised by Professor Margot Gerritsen. He is interested in computational mathematics generally and is working on PDE-based traffic flow modeling. He is from Lebanon and enjoys running, climbing and singing. He says: “Linear Algebra has always been one of my favorite topics ever since I was an undergrad and I’m very excited to share that with everyone”. 


04. Big Data 

Wednesday, July 26 & Thursday, July 27, 2023  | 1:00 PM - 4:00 PM PDT

Welcome to the world of Big Data! Every day, more than 370 million TB of data are generated in the world. Extracting value from this data requires us to understand all the different ways it can be stored and processed. This workshop is designed to help everyone, from curious learners to seasoned professionals, understand Big Data and the profound implications it has across industries.

Together, we will discover the complexities of Big Data, dive into its ecosystem, and get hands-on experience with tools like Apache Spark and its Python library. Our objective is to give you a solid overview of the Big Data landscape and delve into the use of one of its most important tools - Spark. We will also teach you how the latest AI tools can be used to improve your efficiency in processing data.

This workshop isn't just about learning—it's about truly experiencing the world of Big Data in a modern way. Join us for an enlightening journey!

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below.

This workshop will assume basic knowledge of Python for the few technical use-cases we will cover. Ability to use notebooks / Google Colab would help you get to work faster, but we will cover that briefly if you have never used it. You do not need to install Python or any other software before the workshop. We will provide more detailed instructions prior to the start to ensure that you are set up and ready to learn.

Our goal is to create an inclusive and supportive learning environment, and we want all students to succeed. However, to set you up for success, we also want to clearly communicate the necessary level of prior programming familiarity. If you are unsure whether you have the required background, please feel free to reach out for guidance.

To join the workshop, you'll need a device with a recent web browser and two-way audio and video access to Zoom. This could be a laptop or desktop computer running any operating system, such as Windows, Mac, or Linux. Participative activities benefit from a larger screen, so joining via a smartphone or tablet may not provide the best learning experience. 

About the Instructors: Axel Peytavin & Anna-Julia Storch

Axel Peytavin earned his M.S. from ICME in June 2023, and is currently pursuing research on Twitter data with Professor Johan Ugander and Professor Martin Saveski in Stanford’s department of Management Science & Engineering. He is also a volunteer for The Ocean Cleanup, for which he worked as a computational modeler prior to coming to Stanford, and a co-founder of the GetAlong project to bring better discussions on information on the internet.

Anna-Julia Storch recently earned her M.S. in Education Data Science from Stanford, exploring how data science can help us solve educational challenges. She has also been a teaching assistant for Stanford's premier entrepreneurship courses including "The Lean Launchpad" (taught by Steve Blank). Prior to Stanford, she worked in a variety of roles including as a digital consultant for a big data project at McKinsey, in VC-backed technology startups and as a business line head for a multinational HR company.


05. Privacy and Responsible AI 

Friday, July 28 & Friday, August 4, 2023  |  8:00 AM - 11:00 AM PDT

Welcome to Responsible AI in Practice! Explore the development of machine learning models and systems that prioritize fairness, accuracy, explainability, robustness, and privacy. With the growing impact of AI applications in our daily lives and their integration into critical domains like hiring, lending, and healthcare, it is crucial to adopt a responsible approach to AI development and deployment.

In this course, we'll introduce the concept of "responsible AI by design" for various consumer and enterprise applications. Discover techniques and tools for responsible AI, focusing on model explainability, fairness, and privacy. Gain practical insights into applying these techniques in industry, including challenges, guidelines, and lessons learned from real-world web-scale machine learning and data mining applications.

We emphasize that responsible AI is a socio-technical endeavor, requiring collaboration among technologists, stakeholders, and experts in ethics and related disciplines. Through industry case studies, you'll witness responsible AI in action and delve into the underlying challenges. Join us for an interactive learning experience featuring engaging presentations, collaborative exercises, polls, quizzes, and the opportunity to explore further through assigned homework.

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below. 

Please note that this workshop does not require a background in privacy and responsible AI; rather, we encourage participants to come eager to learn about the ethical considerations of technology.

  • Basic Knowledge of Artificial Intelligence and Machine Learning: Participants should have a basic understanding of artificial intelligence (AI) and machine learning (ML) concepts. Familiarity with the terminology, key components, and general workflow of AI and ML models will help participants grasp the privacy and ethical considerations specific to these technologies.
  • Basic Knowledge of Ethical Considerations in Technology: It is important for participants to have a basic understanding of ethical considerations in technology and the implications of AI and ML on society. Familiarity with topics such as fairness, bias, transparency, accountability, and the potential social impact of AI will enable participants to engage in meaningful discussions on responsible AI practices and privacy protection.
     

About the Instructor: Krishnaram Kenthapadi

Krishnaram Kenthapadi is the Chief AI Officer & Chief Scientist of Fiddler AI, an enterprise startup building a responsible AI and ML monitoring platform. Previously, he was a Principal Scientist at Amazon AWS AI, where he led the fairness, explainability, privacy, and model understanding initiatives in the Amazon AI platform. Prior to joining Amazon, he led similar efforts at the LinkedIn AI team, and served as LinkedIn’s representative in Microsoft’s AI and Ethics in Engineering and Research (AETHER) Advisory Board. Previously, he was a Researcher at Microsoft Research Silicon Valley Lab. Krishnaram received his Ph.D. in Computer Science from Stanford University in 2006. He serves regularly on the senior program committees of FAccT, KDD, WWW, WSDM, and related conferences, and co-chaired the 2014 ACM Symposium on Computing for Development. His work has been recognized through awards at NAACL, WWW, SODA, CIKM, ICML AutoML workshop, and Microsoft’s AI/ML conference (MLADS). He has published 50+ papers, with 4500+ citations and filed 150+ patents (70 granted). He has presented tutorials on privacy, fairness, explainable AI, responsible AI, and model monitoring at forums such as KDD ’18 ’19 ’22, WSDM ’19, WWW ’19 ’20 ’21, FAccT ’20 ’21 ’22, AAAI ’20 ’21, and ICML ’21, and instructed a course on AI at Stanford.


06. Applications of Data Science 

Friday, July 28 & Friday, August 4, 2023  |  1:00 PM - 4:00 PM PDT

The field of data science has experienced tremendous growth in recent years, revolutionizing various domains by leveraging the abundance of digital information. To continue harnessing the power of data science for societal betterment, it is crucial to understand its current applications.

Join this workshop to explore the history and motivations behind data science. Discover the key factors that determine the suitability of a problem for data science solutions. Delve into a diverse range of data science applications, including outcome prediction, optimization, and output generation. Through interactive activities, you will gain hands-on experience and build confidence in the subject matter. By the end of the workshop, you will have a comprehensive understanding of contemporary data science applications and the necessary background to generate your own ideas for potential applications.

About the Instructor: Julia Olivieri

Julia Olivieri is an Assistant Professor of Computer Science at the University of the Pacific. Her research lies at the intersection of computer science, biology, and statistics. Specifically, she develops algorithms to perform large-scale, rigorous analysis of RNA sequencing data. In 2022, Julia graduated with her PhD from Stanford University’s Institute for Computational and Mathematical Engineering with a thesis on splicing analysis in single-cell RNA sequencing data.


07. Python for Data Science 

Monday, July 31 & Tuesday, August 1, 2023  |  8:00 AM - 11:00 AM PDT

In this 6-hour workshop, you will learn how to leverage modern Python features and techniques to enhance your data science code. Python has emerged as the go-to language for data science and analysis, and its evolution has introduced powerful capabilities such as abstract classes, generic programming, and generators. While data scientists are proficient in using Python, they can significantly boost their productivity and programming skills by embracing these modern language features.

The workshop will provide hands-on instruction, utilizing Google Colab workbooks that come preloaded with the necessary libraries, eliminating the need for software installations or configurations. You will explore real-world examples of applying modern Python features to data science tasks and engage in problem-solving exercises during the session. Some of the key Python features covered include classes and inheritance, dataclasses and immutability, abstract classes and interfaces, type hints and static typing, generic programming, first-class functions and lambdas, as well as iterators and generators. Join us to unlock the full potential of Python for data science and elevate your coding proficiency. 
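To make a few of the listed features concrete, here is a small sketch (not from the workshop's Colab notebooks) that combines a frozen dataclass, type hints, a generator, and a lambda passed as a first-class function. The `Reading` class, the helper names, and the data are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Reading:
    """Immutable record: frozen=True makes assigning to a field raise an error."""
    sensor: str
    value: float


def rolling_mean(values: Iterable[float], window: int) -> Iterator[float]:
    """Generator that lazily yields the mean of each full sliding window."""
    buffer: List[float] = []
    for v in values:
        buffer.append(v)
        if len(buffer) > window:
            buffer.pop(0)  # drop the oldest value once the window is exceeded
        if len(buffer) == window:
            yield sum(buffer) / window


def select(readings: Iterable[Reading],
           keep: Callable[[Reading], bool]) -> Iterator[Reading]:
    """Filter readings with a caller-supplied predicate (a first-class function)."""
    return (r for r in readings if keep(r))


# Hypothetical data, made up for illustration only.
readings = [Reading("a", 1.0), Reading("b", 10.0), Reading("a", 3.0), Reading("a", 5.0)]
sensor_a = select(readings, lambda r: r.sensor == "a")
smoothed = list(rolling_mean((r.value for r in sensor_a), window=2))
print(smoothed)
```

Nothing is computed until `list(...)` consumes the generator pipeline, which is the property that lets the same pattern scale to data too large to hold in memory at once.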

No software installations or configurations will be required as the entire workshop will be done using Google Colab workbooks that will come with all required libraries loaded.

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below.

  • Proficiency in Python: Participants should be comfortable with core Python syntax and have experience in developing simple data science applications using Python.
  • Knowledge of Key Data Science Libraries: Participants should have experience using essential data science libraries such as numpy, pandas, and matplotlib. Familiarity with machine learning libraries is not required for this workshop.
  • Familiarity with Jupyter Notebooks: Participants should be familiar with Jupyter notebooks, which are commonly used for interactive data analysis and visualization in data science.
  • Basic Data Science and Computer Science Knowledge: Participants should have a foundational understanding of key concepts in data science and computer science, including probability, statistics, data structures, and algorithms.

About the Instructors: Ashwin Rao & Tikhon Jelvis

Ashwin Rao is the Co-Founder of CX Score, a seed-stage AI startup that builds products enabling enterprises to provide great customer experience on their web and mobile apps. Ashwin is also an Adjunct Professor at Stanford University, focusing his research and teaching in the area of Stochastic Control, particularly Reinforcement Learning (RL) algorithms with applications in Finance and Retail. He teaches Stanford CME 241, which is based on the RL for Finance book he wrote with Tikhon Jelvis. Previously, Ashwin led Data Science and Machine Learning at Target, where he and his team developed mathematical models and algorithms for supply-chain and logistics, merchandising, marketing, search, personalization, pricing and customer service. Before that, Ashwin was a Managing Director at Morgan Stanley and a Trading Strategist at Goldman Sachs. Ashwin holds a Bachelor's degree in Computer Science and Engineering from IIT-Bombay and a Ph.D. in Computer Science from University of Southern California, where he specialized in Algorithms Theory and Abstract Algebra.

Tikhon Jelvis is a founding engineer at CX Score, a seed-stage startup that builds products enabling enterprises to provide great customer experience on their web and mobile apps. Tikhon specializes in bringing ideas from programming languages and functional programming to machine learning and data science. He has developed inventory optimization, simulation and demand forecasting systems as a Principal Scientist at Target. He is a speaker and open-source contributor in the Haskell community, where he serves on the board of directors for Haskell.org. Tikhon has co-authored a book on the Foundations of Reinforcement Learning with Stanford faculty Ashwin Rao, where he developed a modern Python framework for data science and machine learning. 


08. Introduction to Machine Learning 

Monday, July 31 & Tuesday, August 1, 2023  |  1:00 PM - 4:00 PM PDT

This workshop aims to provide participants with a solid foundation in understanding and utilizing modern machine learning algorithms. The focus will be on developing a framework for decision-making regarding the application of different machine learning techniques. Key considerations such as over-fitting/under-fitting, interpretability, supervised/unsupervised methods, and handling missing data will be emphasized. The workshop will delve into the principles underlying various machine learning algorithms, providing insights into why and how they are used. However, mathematical proofs and intricate details will not be extensively covered.

In the domain of unsupervised machine learning, participants will be introduced to k-means clustering, principal component analysis (PCA), multidimensional scaling (MDS), t-SNE, and independent component analysis (ICA). The supervised machine learning algorithms covered will include support vector machines (SVM), lasso, elastic net, classification and regression trees (CART), boosting, bagging, and random forests. Additionally, concepts related to imputation, regularization, and cross-validation will be discussed. Although the R programming language will be used occasionally for examples, prior exposure to R is not necessary for participants. Overall, this workshop offers a comprehensive introduction to modern machine learning algorithms, providing participants with practical knowledge and a solid foundation for further exploration in the field.

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below.

  • Basic Programming Experience in R, MATLAB, or Python: Participants should have basic programming skills in at least one of the following languages: R, MATLAB, or Python. Familiarity with programming concepts such as variables, loops, conditional statements, and functions will allow participants to implement and experiment with machine learning algorithms during the workshop. Knowledge of data manipulation libraries (e.g., pandas, NumPy) and visualization tools (e.g., matplotlib, ggplot) in the chosen programming language would be beneficial.
  • Familiarity with Data Analysis Concepts: Participants should have a basic understanding of data analysis concepts, including data preprocessing, feature selection, and model evaluation. Knowledge of common data preprocessing techniques, such as handling missing data and scaling features, will provide a foundation for understanding machine learning workflows. Awareness of different evaluation metrics and cross-validation methods will enable participants to assess the performance of machine learning models.

About the Instructor: Alex Ioannidis

Alexander Ioannidis earned his PhD in Computational and Mathematical Engineering and Master's in Management Science and Engineering, both at Stanford University. He is a research fellow working on developing novel machine learning techniques for medical and genomic applications in the Department of Biomedical Data Science. Prior to Stanford he earned his bachelor's in Chemistry and Physics from Harvard and an M.Phil. from the University of Cambridge. He conducted research for several years on novel superconducting and quantum computing architectures at Northrop Grumman's Advanced Technologies research center. In his free time he enjoys sailing.


09. Introduction to Mathematical Optimization 

Wednesday, August 2 & Thursday, August 3, 2023  |  8:00 AM - 11:00 AM PDT

Mathematical optimization serves as the foundation for various applications in science and engineering by providing formal tools to determine the optimal action, design, control, or model from a range of possibilities. In the fields of data science, machine learning, and artificial intelligence, mathematical optimization plays a crucial role in model training and learning. This workshop will offer an overview of the fundamental aspects of mathematical optimization, including unconstrained and constrained optimization, convex optimization, and optimization for model training. With a practical focus, participants will actively engage in formulating and solving optimization problems using standard modeling languages and solvers throughout the workshop. By incorporating common models from machine learning and other disciplines, the aim is to familiarize participants with optimization tools, enabling them to apply them in their own work for rapid prototyping and experimentation.

The workshop will cover various topics, including formulating optimization problems, the fundamentals of constrained and unconstrained optimization, convex optimization, optimization methods for model fitting in machine learning, and optimization in Python using libraries such as SciPy and CVXPY. Throughout, in-depth Jupyter Notebook examples from machine learning, statistics, and other fields will be presented to illustrate the practical applications of optimization techniques.
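As a small taste of optimization for model fitting, the sketch below fits a line by minimizing mean squared error with plain gradient descent. It deliberately uses neither SciPy nor CVXPY, so it is a stand-in for the idea rather than an example of the workshop's tooling. The function name, step size, iteration count, and data are assumptions chosen for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=5000):
    """Minimize the mean squared error of y ~ a*x + b by gradient descent."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [a * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of (1/n) * sum(residual^2) with respect to a and b.
        grad_a = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = 2.0 / n * sum(residuals)
        # Step against the gradient to decrease the objective.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Hypothetical data lying exactly on y = 2x + 1, made up for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
print(f"slope={a:.4f}, intercept={b:.4f}")
```

This objective is convex, so gradient descent with a small enough step converges to the global minimum; a dedicated solver would typically reach the same answer in far fewer iterations.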
 

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below.

To benefit from this workshop, participants should have a comfortable understanding of linear algebra, differential multivariable calculus, and basic probability and statistics. Although experience with Python will be helpful, it is not mandatory.

  • Proficiency in Linear Algebra, Differential Multivariable Calculus, and Basic Probability and Statistics: Participants should have a solid understanding of linear algebra, including matrix operations, vector spaces, and eigenvectors. Knowledge of differential multivariable calculus, such as partial derivatives and gradients, is important for understanding optimization algorithms. Additionally, a basic understanding of probability and statistics will be beneficial for interpreting optimization results and evaluating models.
  • Familiarity with Modeling and Problem Formulation: Participants should have some familiarity with the process of formulating problems as mathematical models. This includes understanding how to translate real-world problems into mathematical expressions, formulate objective functions, and define constraints. Prior exposure to mathematical modeling in fields such as engineering, physics, or economics will be advantageous.
  • Optional: Experience with Python (helpful but not required): While experience with Python is not mandatory, it will be helpful for participants to have some familiarity with the language. Python is commonly used for optimization and machine learning tasks, and participants with prior experience in Python will find it easier to implement and experiment with optimization techniques using libraries such as SciPy and CVXPY. However, the workshop will provide guidance and examples in Python for those who are new to the language.

About the Instructor: Kevin Carlberg

Kevin Carlberg is a Director of AI Research Science at Meta and an Affiliate Associate Professor of Applied Mathematics and Mechanical Engineering at the University of Washington. He leads a research team focused on enabling the future of augmented and virtual reality through AI-driven innovations. His individual research combines concepts from machine learning, computational physics, and high-performance computing to drastically reduce the cost of simulating nonlinear dynamical systems at extreme scale. Previously, Kevin was a Distinguished Member of Technical Staff at Sandia National Laboratories in Livermore, California, where he led a research group of PhD students, postdocs, and technical staff in applying these techniques to a range of national-security applications in mechanical and aerospace engineering. 



10. Data Visualization 

Wednesday, August 2 & Thursday, August 3, 2023  |  1:00 PM - 4:00 PM PDT

Join us for our Data Visualization workshop and discover the power of data visualization using Python, one of the most popular and versatile languages in the field. In this workshop, we will guide you through the essentials of creating impactful and aesthetically pleasing visual representations of data, from exploratory data analysis to polished final products. You'll have the opportunity to work with simple datasets, receive feedback on your designs from peers, and gain confidence in your data visualization skills. No prior experience in data visualization is required, but a basic understanding of Python will be helpful. By the end of this workshop, you will have a solid understanding of data visualization principles, the process of creating professional-looking visualizations in Python, and the fundamentals of exploratory data analysis. You will also know which Python packages to use to communicate data effectively through visualizations.

You do not need to install Python or any other software before the workshop. We will provide more detailed instructions prior to the start to ensure that you are set up and ready to learn.

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below.

  • Basic Programming Experience in Python: Participants should have some basic experience with Python programming. They should be familiar with fundamental concepts such as variables, data types, loops, conditional statements, and functions. The workshop assumes a similar level of programming proficiency as the "Introduction to Python" workshop. This foundation will allow participants to understand and implement data visualization techniques using Python libraries effectively.
  • Familiarity with Python Data Manipulation Libraries: It would be beneficial for participants to have some familiarity with Python libraries commonly used for data manipulation, such as pandas and NumPy. Understanding how to load and manipulate data, perform basic data cleaning tasks, and create data structures will enable participants to preprocess data effectively before visualization.
  • Optional: Basic Understanding of Data Science Concepts: Although not mandatory, a basic understanding of data science concepts will enhance the learning experience. Familiarity with data-driven question answering will provide participants with a broader context for applying data visualization techniques effectively.

About the Instructor: Kaleigh Mentzer

Kaleigh Mentzer earned her PhD in Computational and Mathematical Engineering from Stanford, advised by Irene Lo and Itai Ashlagi. Her research focuses on using algorithmic and optimization-based tools to improve equitable access to education and has informed educational policy decisions in San Francisco. Kaleigh now works on large-scale data problems as a research engineer at a stealth-mode startup.



11. Introduction to Deep Learning 

Monday, August 7 & Tuesday, August 8, 2023  |  8:00 AM - 11:00 AM PDT

Welcome to the Deep Learning workshop, where you'll dive into the technology that powers cutting-edge applications like self-driving vehicles, virtual assistants, and intelligent systems. This interactive workshop offers a comprehensive learning experience through lectures, coding walkthroughs, and programming exercises. By the end of the workshop, you'll have a solid understanding of both the theory and practice of deep learning, along with practical experience in developing and analyzing your own deep learning models. You'll gain comfort with deep learning terminology, be able to comprehend research articles and reports in the field, and develop proficiency in PyTorch, a popular deep learning framework. Additionally, you'll build a repertoire of useful deep learning code that can be shared online and utilized for your own projects. While no prior knowledge of deep learning is required, familiarity with programming in Python, basic linear algebra concepts, NumPy, elementary probability, and some exposure to machine learning concepts like gradient descent and regression is recommended. Join us in this workshop to unlock the exciting world of deep learning.

It is important to note that participants do not need to install Python or any other libraries before the workshop. The workshop will utilize Jupyter notebooks, and all the necessary code, exercises, and solutions will be shared online. A Google account will be required as the workshop will use Google Colab for sharing the notebooks.

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below.

  • Prior Experience in Python Programming: Participants should have prior experience in programming, specifically using Python. Familiarity with Python syntax, data structures, and control flow is essential for understanding and implementing deep learning algorithms. Knowledge of common Python libraries used in deep learning, such as NumPy, will also be helpful for data manipulation and numerical computations.
  • Understanding of Basic Linear Algebra Concepts and NumPy: A basic understanding of linear algebra concepts is recommended. Familiarity with concepts like vectors, matrices, matrix operations (e.g., addition, multiplication), and vector spaces will be beneficial for comprehending the underlying mathematics of deep learning algorithms. Additionally, knowledge of NumPy, a widely used Python library for numerical computations, will enable participants to perform efficient computations and manipulate multidimensional arrays.
  • Elementary Probability and Prior Exposure to Basic Machine Learning Concepts: Participants should have a basic understanding of elementary probability, including basic probability distributions, conditional probability, and concepts like expectation and variance. Additionally, prior exposure to machine learning concepts, such as gradient descent optimization, regression algorithms, and the general idea of training and evaluating models, will help participants grasp the fundamental principles of deep learning.

About the Instructor: Aashwin Mishra 

Aashwin Mishra is a Project Scientist at the Machine Learning Initiative at SLAC National Accelerator Laboratory, where they lead key STEM DEI initiatives. Their research focuses on uncertainty quantification, probabilistic modeling, interpretability/explainability, and optimization across physics applications. Aashwin is a highly successful continuing instructor. In 2022, they taught Introduction to Deep Learning and Intermediate Topics in Machine Learning and Deep Learning.



12. Introduction to Natural Language Processing

Monday, August 7 & Tuesday, August 8, 2023  |  1:00 PM - 4:00 PM PDT

Welcome to the Natural Language Processing (NLP) workshop, where you'll explore the fascinating world of computers understanding and generating human language. NLP is a rapidly growing field that underlies advancements in virtual assistants, language translation, and more. In this workshop, you'll gain a solid understanding of NLP's fundamental concepts and apply them to text data using Python programming. Join us for an immersive learning experience filled with engaging lectures, collaborative coding exercises, and practical assignments to enhance your NLP skills. Our main objective is for attendees to develop a critical understanding of NLP's potential and limitations, enabling them to identify opportunities to apply NLP techniques in their own work. By the end of the workshop, you'll grasp key NLP tools and concepts such as tokenization, attention mechanisms, and transformers, and have hands-on experience using popular NLP libraries like Keras and Hugging Face. You'll also apply NLP techniques, at a beginner level, to real-world tasks such as machine translation and sentiment extraction. Embark on this journey to unlock the power of NLP in transforming the way computers interact with human language.

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below. 

  • Python Programming Experience: Participants should have prior experience in Python programming. They should be familiar with Python syntax, data structures, control flow, and the ability to write and execute Python code. Proficiency in Python will be essential for implementing NLP algorithms and working with relevant libraries.
  • Basic Understanding of Machine Learning Concepts: It is strongly recommended that participants have a basic understanding of machine learning concepts at the level covered in the companion "Introduction to Machine Learning" workshop. Familiarity with concepts such as supervised and unsupervised learning, classification, regression, and evaluation metrics will help participants grasp the application of NLP techniques in machine learning tasks.
  • Optional: Familiarity with Deep Learning Concepts: Participants interested in applying NLP in their work will find it useful to understand the concepts covered in the "Introduction to Deep Learning" workshop offered in the same series. Although not mandatory, prior exposure to deep learning concepts, including neural networks, training methods, and deep learning frameworks like TensorFlow or PyTorch, can enhance the understanding of advanced NLP techniques.

Participants do not need to install Python or any other software before the workshop. Detailed instructions will be provided prior to the start of the workshop to ensure that participants are set up and ready to learn.

To join the workshop, participants will need a device with a recent web browser and two-way audio and video access to Zoom. This can be a laptop or desktop computer running any operating system (Windows, Mac, or Linux). Participative activities may benefit from a larger screen, so joining via a smartphone or tablet is not recommended.

About the Instructors: Afshine & Shervine Amidi

Afshine Amidi is currently working on solving NLP problems at Google. He also taught the Data Science Tools class to graduate students at MIT and has published papers at the intersection of deep learning and computational biology. He holds a Bachelor’s and a Master’s Degree from École Centrale Paris and a Master’s Degree from MIT.

Shervine Amidi is currently working on problems at the intersection of ranking and natural language processing at Google. He has also published papers at the intersection of deep learning and computational biology. He holds a Bachelor’s and a Master’s Degree from École Centrale Paris and a Master’s Degree from Stanford University.



13. Search and Recommendation 

Wednesday, August 9 & Thursday, August 10, 2023  |  8:00 AM - 11:00 AM PDT

We increasingly rely on personal and professional advice generated and curated by algorithms rather than by friends and family. We begin this workshop by placing search and recommendation systems in their broad historical context. Next, we examine their mechanisms, which are largely cutting-edge techniques based on machine learning. The broad approach involves transforming implicit and explicit user preferences into data structures that can be computationally processed to yield meaningful recommendations. (Here we contextualize search as a special type of recommendation.) Emphasizing practical engineering, the later modules cover the efficient implementation of such computations using contemporary distributed computing.

The workshop aims to empower attendees with expertise in search and recommendation systems, enabling them to contribute to industry practices and make informed decisions regarding these increasingly prevalent technologies. Participants will develop a comprehensive understanding of the evolution of recommendation systems and their relationship to search systems, including the deep learning mechanisms and techniques utilized. They will also acquire practical engineering skills necessary for implementing recommendation systems efficiently at scale. By learning how to build recommendation systems for various applications, attendees will be able to actively contribute to the advancement of the field. Furthermore, the workshop will provide insights into the state-of-the-art practices of machine learning within leading Silicon Valley companies that rely on search and recommendation systems to create value.

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below. 

This advanced workshop is designed for experienced Python programmers, preferably with recent experience applying neural networks. Familiarity with deep learning at the level of SWS 11 (Introduction to Deep Learning) is expected. A list of references at varying levels of sophistication is provided later in this description. Students who want to get the most out of this workshop will do well to familiarize themselves with as many of the references as possible.

Our goal is to create an inclusive and supportive learning environment, and we want all students to succeed. However, to set you up for success, we also want to clearly communicate the necessary level of prior knowledge and programming skill. If you are unsure whether you have the required background, please feel free to reach out for guidance.

You do not need to install Python or any other software before the workshop. We will provide more detailed instructions prior to the start to ensure that you are set up and ready to learn.

To join the workshop, you'll need a device with a recent web browser and two-way audio and video access to Zoom. This could be a laptop or desktop computer running any operating system, such as Windows, Mac, or Linux. Participative activities benefit from a larger screen, so joining via a smartphone or tablet may not provide the best learning experience. 

About the Instructor: Hao Sheng

Hao Sheng is a Staff Machine Learning Engineer at Apple SPG (Special Projects Group). He received his PhD in Computational and Mathematical Engineering from ICME, Stanford University, as part of the Stanford Machine Learning Group and the Stanford Computational Policy Lab. His research interests are machine learning algorithms for environmental and social sustainability. He worked on large-scale recommendation systems at TikTok before joining Apple. Prior to Stanford, he earned a Bachelor's in PPE (Philosophy, Political Science, and Economics) and a B.S. in Applied Mathematics from Yuanpei College, Peking University.



14. Generative Models 

Wednesday, August 9 & Thursday, August 10, 2023  |  1:00 PM - 4:00 PM PDT

Join us for the Generative Models workshop and unlock the fascinating world of Deep Generative Modeling. In this rapidly growing field of Machine Learning, you'll discover how computers can create intricate landscapes, generate images of non-existent humans, and produce stunning art. Deep Generative Models represent a paradigm shift from traditional discriminative models: rather than classifying existing data, they create new content based on user input.

During this workshop, you'll delve into the foundational concepts of Deep Generative Modeling, explore various types of models, and learn about their diverse applications in different fields. With a focus on Python, you'll gain hands-on experience in creating your own Deep Generative Models for various tasks. The workshop features engaging lectures, practical coding exercises, and insightful reviews to enhance your understanding. By the end of the workshop, you'll develop a clear comprehension of the underlying theory and concepts of Deep Generative Modeling. You'll explore different models such as Generative Adversarial Networks (GANs), Variational Auto-Encoders (VAEs), and Diffusion Models. Discover how these models are applied in academia, industry, and your own work. With hands-on coding experience in PyTorch, you'll be equipped to implement these models from scratch and apply them to your own applications. Join us and unleash the power of Deep Generative Modeling in the world of Machine Learning.

It is important to note that participants do not need to install Python or any other libraries before the workshop. The workshop will utilize Jupyter notebooks, and all the necessary code, exercises, and solutions will be shared online. A Google account will be required as the workshop will use Google Colab for sharing the notebooks.

Want to make sure this is the right workshop for you? Check out the workshop’s prerequisites below.

  • Prior Experience in Python Programming: Participants should have prior experience in programming, particularly using Python. Familiarity with Python syntax, data structures, control flow, and the ability to write and execute Python code is essential for understanding and implementing generative models. Knowledge of common Python libraries used in deep learning, such as PyTorch, will also be beneficial.
  • Understanding of Probability Theory and Linear Algebra: A solid understanding of probability theory is recommended. Participants should be familiar with basic probability concepts such as probability distributions, conditional probability, and expectations. Additionally, a foundational understanding of linear algebra, including concepts such as vectors, matrices, matrix operations, and vector spaces, is important for comprehending the underlying mathematics of generative models.
  • Basic Experience with Deep Learning: Participants should have basic experience and knowledge of deep learning concepts. This includes understanding the fundamentals of training neural networks, common activation functions, loss functions, optimization algorithms (e.g., gradient descent), and the general workflow of developing deep learning models. The "Introduction to Deep Learning" workshop offered in the same series covers most of these prerequisites, so attending that workshop beforehand is recommended.

About the Instructor: Aashwin Mishra 

Aashwin Mishra is a Project Scientist at the Machine Learning Initiative at SLAC National Accelerator Laboratory, where they lead key STEM DEI initiatives. Their research focuses on uncertainty quantification, probabilistic modeling, interpretability/explainability, and optimization across physics applications. Aashwin is a highly successful continuing instructor. In 2022, they taught Introduction to Deep Learning and Intermediate Topics in Machine Learning and Deep Learning.



If you would like to sign up to receive email notifications regarding the summer workshops, you can subscribe here.