
ICME Online Summer Workshops 2020

Event Details:

Monday, August 17, 2020 - Saturday, August 22, 2020


Registration is now closed!

ICME offers a variety of summer workshops to students, ICME partners, and the wider community. This year's series of day-long workshops is happening online from August 17-22, 2020, as detailed below. All workshops are from 9:00 am to 4:45 pm (four 75-minute sessions separated by time for breaks).

These are full-day workshops. You can register for one workshop per day.

If you would like to sign up to receive email notifications regarding the summer workshops, you can subscribe here.

Summer Workshops 2020 FAQ

Schedule

Monday, August 17, 2020

Introduction to Statistics

Statistics is the science of learning from data. This workshop will help you to develop the skills you need to analyze data and to communicate your findings. There won't be many formulas in the workshop; rather, we will develop the key ideas of statistical thinking that are essential for learning from data.

We will discuss the main tools of descriptive statistics, which are essential for exploring data, with an emphasis on visualizing information. We will explain the important ideas about sampling and conducting experiments. Then we will review some important rules of probability and discuss normal approximation and the central limit theorem. We will show you the important concepts and pitfalls of regression and how to do inference with confidence intervals and tests of hypotheses. You will learn how to analyze categorical data, and we will discuss one-way analysis of variance. Finally, we will look at reproducibility, data snooping and the multiple testing fallacy, and how to account for multiple comparisons. These issues have become particularly important in the era of big data.
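As an illustrative taste of the inference topics above (not part of the workshop materials), here is a minimal NumPy sketch of a 95% confidence interval for a mean via the normal approximation; the data are simulated and purely for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=500)  # a skewed sample

# Normal-approximation 95% confidence interval for the mean:
# by the central limit theorem, the sample mean is approximately
# normal even though the underlying data are skewed.
mean = data.mean()
se = data.std(ddof=1) / np.sqrt(len(data))   # standard error
ci = (mean - 1.96 * se, mean + 1.96 * se)
print(f"mean = {mean:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```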

Broadly, there are three main reasons why statistical literacy is essential in data science: First, it provides the skills to assess whether the data are sufficient to answer the questions at hand. Second, it establishes a rigorous framework for quantifying uncertainty. And finally, it provides techniques for effectively communicating the findings of your analyses. This workshop equips you with the important tools in all of these areas. It is the statistical foundation on which the recent exciting advances in machine learning are built.


Professor Guenther Walther

Professor of Statistics

Linear Algebra

Matrix computations/linear algebra are the backbone of data science algorithms. If your linear algebra is rusty, or if you are not familiar with critical concepts (including orthogonal decompositions and least squares), then this workshop is for you. In the morning, we will cover basic ideas in linear algebra (matrix-vector manipulations, norms, subspaces and solving matrix-vector systems). In the afternoon, we will discuss more advanced concepts including QR decompositions and the SVD.
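As a preview of how the morning and afternoon topics connect, here is a short NumPy sketch (illustrative only, with made-up data) that solves an overdetermined least-squares problem via the QR decomposition.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 3))                  # tall matrix: overdetermined system
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true + 0.01 * rng.normal(size=100)   # noisy observations

# Least squares via QR: factor A = QR, then solve the triangular
# system R x = Q^T b.
Q, R = np.linalg.qr(A)
x = np.linalg.solve(R, Q.T @ b)
print(x)  # close to x_true
```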


Professor Margot Gerritsen

Senior Associate Dean for Educational Affairs in the School of Earth, Energy and Environmental Sciences and Co-Director of Women in Data Science (WiDS)

Tuesday, August 18, 2020

Introduction to Mathematical Optimization

Mathematical optimization underpins many applications in science and engineering, as it provides a set of formal tools to compute the ‘best’ action, design, control, or model from a set of possibilities. In data science and machine learning, mathematical optimization is the engine of model fitting. This workshop will provide an overview of the key elements of this topic (unconstrained, constrained, convex optimization, optimization for model fitting), and will have a practical focus, with participants formulating and solving optimization problems early and often using standard modeling languages and solvers. By introducing common models from machine learning and other fields, this workshop aims to make participants comfortable with optimization modeling so that they may use it for rapid prototyping and experimentation in their own work. Students should be comfortable with linear algebra, multivariable differential calculus, and basic probability and statistics. Experience with Python will be helpful, but not required.
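The "optimization as model fitting" idea can be sketched in a few lines. This illustrative example (not workshop material) fits a line by minimizing the sum of squared residuals with SciPy's general-purpose solver; the toy data are invented.

```python
import numpy as np
from scipy.optimize import minimize

# Fit y = a*x + b by minimizing the sum of squared residuals:
# the canonical example of model fitting as optimization.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])   # exactly y = 2x + 1

def loss(params):
    a, b = params
    return np.sum((a * x + b - y) ** 2)

result = minimize(loss, x0=[0.0, 0.0])  # BFGS by default
a, b = result.x
print(a, b)  # approximately 2.0 and 1.0
```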


Kevin Carlberg

AI Research Science Manager at Facebook Reality Labs and an Affiliate Associate Professor of Applied Mathematics and Mechanical Engineering at the University of Washington

Introduction to Programming in R

This workshop is recommended for those who want to learn the basics of R programming in statistics, science, or engineering. The goal of this workshop is to familiarize participants with R's tools for scientific computing and data analysis. Lectures will be interactive with a focus on learning by example, and assignments will be application-driven.

Example topics: Basic data types in R, variables, and apply functions; Data input/output; Plotting in base R; Statistical applications: such as how to get a summary of the data, and how to do linear regressions


Andreas Santucci

Computational Statistician, Data Science at Google and Lecturer, ICME

Introduction to Python

Introduction to Python will focus on scientific computing, data science and machine learning.

More precisely, the class will cover: Python basics (variables, if/else, loops, functions); Numpy and Pandas; Scipy and Scikit-learn

The class is designed for people with some experience programming, but no experience in Python. We will introduce each topic enough so that you can quickly start using Python for your own problems knowing what tools are most appropriate. The workshop will be interactive with many examples (that the participants can play with during the session).
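As a small taste of the NumPy/Pandas portion of the stack (an illustrative sketch with invented data, not workshop material): Pandas handles labeled tabular data while NumPy supplies the fast array math underneath.

```python
import numpy as np
import pandas as pd

# A small table of measurements, grouped and summarized with pandas.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [1.0, 2.0, 10.0, 20.0],
})

means = df.groupby("group")["value"].mean()
print(means)          # per-group averages
print(np.log(df["value"]).round(2))  # NumPy functions apply elementwise
```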


Leopold Cambier

5th Year PhD Student, ICME, Stanford University

Wednesday, August 19, 2020

Intro to Machine Learning

This workshop presents the basics behind understanding and using modern machine learning algorithms. We will discuss a framework for reasoning about when to apply various machine learning techniques, emphasizing questions of over-fitting/under-fitting, interpretability, supervised/unsupervised methods, and handling of missing data. The principles behind various algorithms—the why and how of using them—will be discussed, while some mathematical detail underlying the algorithms—including proofs—will not be discussed. Unsupervised machine learning algorithms presented will include k-means clustering, principal component analysis (PCA), multidimensional scaling (MDS), tSNE, and independent component analysis (ICA). Supervised machine learning algorithms presented will include support vector machines (SVM), lasso, elastic net, classification and regression trees (CART), boosting, bagging, and random forests. Imputation, regularization, and cross-validation concepts will also be covered. The R programming language will be used for occasional examples, though participants need not have prior exposure to R.
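The workshop's examples use R, but the unsupervised pipeline described above can be sketched just as briefly in Python with scikit-learn (an illustrative sketch, not workshop material): project a dataset onto its first two principal components, then cluster the projection with k-means.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Dimensionality reduction followed by clustering on the iris data.
X = load_iris().data                              # 150 samples, 4 features
X2 = PCA(n_components=2).fit_transform(X)         # project to 2 components
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X2)
print(sorted(set(labels)))  # three cluster labels
```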


Instructor: Alexander Ioannidis

Postdoctoral Fellow, Department of Biomedical Data Science at Stanford Medical School.

Thursday, August 20, 2020

Introduction to Deep Learning

Deep Learning is a rapidly expanding field with new applications found every day. In this workshop we will cover the fundamentals of deep learning for the beginner. We will introduce the math behind training deep learning models: the backpropagation algorithm. Building conceptual understanding of the fundamentals of deep learning will be the focus of the first part of the workshop. We will then cover some of the popular architectures used in deep learning, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), LSTMs, autoencoders and GANs. There will be a hands-on computing tutorial using Jupyter notebooks to build a basic image classification model via transfer learning. By the end of the workshop, participants will have a firm understanding of the basic terminology and jargon of deep learning and will be prepared to dive into the plethora of online resources and literature available for each specific application area.
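Backpropagation itself fits in a few lines of NumPy. This is a minimal illustrative sketch (not the workshop's transfer-learning tutorial): a tiny 2-4-1 network trained on XOR by gradient descent, with the chain rule applied layer by layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny fully connected network (2 -> 4 -> 1) trained on XOR.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(scale=0.5, size=(4, 1)); b2 = np.zeros(1)
lr = 0.5

def forward():
    h = np.tanh(X @ W1 + b1)                 # hidden layer
    out = 1 / (1 + np.exp(-(h @ W2 + b2)))   # sigmoid output
    return h, out

_, out = forward()
initial_loss = float(np.mean((out - y) ** 2))

for _ in range(5000):
    h, out = forward()
    # Backward pass: with a sigmoid output and cross-entropy loss,
    # the gradient at the output pre-activation is simply (out - y).
    d_logits = (out - y) / len(X)
    dW2 = h.T @ d_logits;  db2 = d_logits.sum(axis=0)
    d_h = (d_logits @ W2.T) * (1 - h ** 2)   # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_h;       db1 = d_h.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

_, out = forward()
final_loss = float(np.mean((out - y) ** 2))
print(np.round(out.ravel(), 2))  # predictions for the four XOR inputs
```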


Instructor: Sherrie Wang

PhD student at ICME, Stanford University

Friday, August 21, 2020

Introduction to High Performance Computing

In the past 50 years, supercomputers have achieved what was once considered only possible in Sci-Fi movies. The key to the tremendous success of supercomputers has been a combination of outstanding architectures plus software that uses all the available resources and makes parallelization possible. This secret sauce has led to different implementations across fields. A mechanical engineer would use MPI and OpenMP to balance computation and memory load when dealing with millions of nodes in physical simulations, whereas a data scientist would use MapReduce and Spark to have an adaptable and resilient algorithm for the challenges of big data. This workshop explores the key features of these two approaches, explaining their underlying philosophy and how they use the architecture. The final goal is to give the student a taste of the different programming paradigms and the tools to decide which approach best fits a given problem.
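The MapReduce paradigm mentioned above can be sketched in plain Python (illustrative only; real frameworks such as Hadoop or Spark distribute the same stages across many machines): map each document to (word, 1) pairs, then reduce by key.

```python
from collections import defaultdict
from itertools import chain

docs = ["the quick brown fox", "the lazy dog", "the fox"]

# Map phase: each document independently emits (word, 1) pairs,
# so documents can be processed in parallel.
mapped = chain.from_iterable(((w, 1) for w in d.split()) for d in docs)

# Shuffle + reduce phase: group the pairs by key and sum the counts.
counts = defaultdict(int)
for word, n in mapped:
    counts[word] += n

print(dict(counts))  # word frequencies across all documents
```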


Instructor: Cindy Orozco

5th Year PhD student at ICME, Stanford University

Deep Learning for Natural Language Processing

This workshop will focus on practical applications and considerations of applying deep learning to natural language processing (NLP). We will start by drawing inspiration from more traditional NLP approaches and show how many modern deep learning-based algorithms have deep roots in traditional techniques, while showing how deep learning has enabled new improvements. This workshop will heavily focus on students' understanding of problem templates in applied natural language processing and on identifying application patterns.

We will have a practical focus, targeting algorithms and problem templates that can be deployed and used today. We will cover the different components that go into deep learning systems, including word vector representations (word2vec, GloVe), contextual representations (ELMo, BERT), and general model components such as convolutional layers, Transformers, and others. We will also cover introductory material in applications such as classification, intent understanding, and others.
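Word-vector methods such as word2vec and GloVe represent words as dense vectors whose geometry encodes meaning. A minimal sketch of the standard similarity computation (the toy 3-dimensional vectors here are invented for illustration; real embeddings have hundreds of dimensions):

```python
import numpy as np

# Hypothetical toy "embeddings" for illustration only.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.8, 0.9, 0.1]),
    "apple": np.array([0.1, 0.2, 0.9]),
}

def cosine(u, v):
    # Cosine similarity compares direction and ignores magnitude.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(vectors["king"], vectors["queen"]))  # semantically close
print(cosine(vectors["king"], vectors["apple"]))  # semantically distant
```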


Instructor: Luke de Oliveira

Lead Engineer for the AI platform team at Twilio as part of Twilio AI

Saturday, August 22, 2020

Data Visualization in Tableau

This workshop introduces Tableau, a powerful tool for creating data visualizations. It is geared toward people in industry and academia who want to better communicate their projects and research. Attendees will learn how to load data and use it to create charts from Tableau’s library, going from time-series visualizations, scatterplots, and maps all the way to interactive dashboards that use calculated fields, groups, sets, and other advanced features.


Instructor: Sergio Camelo Gomez

5th Year PhD Student at ICME, Stanford University
