ICME Online Summer Workshops 2020
August 17, 2020 - 9:00am to August 22, 2020 - 4:45pm
Registration is now closed!
ICME offers a variety of summer workshops to students, ICME partners, and the wider community. This year's series of day-long workshops is happening online from August 17-22, 2020, as detailed below. All workshops are from 9:00 am to 4:45 pm (four 75-minute sessions separated by time for breaks).
These are full-day workshops. You can register for one workshop per day.
If you would like to sign up to receive email notifications regarding the summer workshops, you can subscribe here.
Monday, August 17, 2020
Statistics is the science of learning from data. This workshop will help you to develop the skills you need to analyze data and to communicate your findings. There won't be many formulas in the workshop; rather, we will develop the key ideas of statistical thinking that are essential for learning from data.
We will discuss the main tools of descriptive statistics, which are essential for exploring data, with an emphasis on visualizing information. We will explain the important ideas behind sampling and conducting experiments. Then we will review some important rules of probability and discuss the normal approximation and the central limit theorem. We will show you the important concepts and pitfalls of regression and how to do inference with confidence intervals and tests of hypotheses. You will learn how to analyze categorical data, and we will discuss one-way analysis of variance. Finally, we will look at reproducibility, data snooping, the multiple testing fallacy, and how to account for multiple comparisons. These issues have become particularly important in the era of big data.
Broadly, there are three main reasons why statistical literacy is essential in data science: First, it provides the skills to assess whether the data are sufficient to answer the questions at hand. Second, it establishes a rigorous framework for quantifying uncertainty. And finally, it provides techniques for effectively communicating the findings of your analyses. This workshop equips you with the important tools in all of these areas. It is the statistical foundation on which the recent exciting advances in machine learning are built.
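The central limit theorem mentioned above lends itself to a quick simulation. This is a minimal sketch in Python (an illustration, not workshop material): averages of many independent draws from a non-normal distribution cluster tightly around the population mean.

```python
# Central limit theorem in miniature: means of uniform samples
# concentrate around the population mean, with spread ~ 1/sqrt(n).
import random
import statistics

random.seed(0)

# Population: Uniform(0, 1), mean 0.5, variance 1/12.
n, trials = 100, 2000
sample_means = [
    statistics.fmean(random.random() for _ in range(n))
    for _ in range(trials)
]

print(statistics.fmean(sample_means))   # close to 0.5
print(statistics.stdev(sample_means))   # close to sqrt((1/12)/100) ≈ 0.029
```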
Matrix computations/linear algebra are the backbone of data science algorithms. If your linear algebra is rusty, or if you are not familiar with critical concepts (including orthogonal decompositions and least squares), then this workshop is for you. In the morning, we will cover basic ideas in linear algebra (matrix-vector manipulations, norms, subspaces and solving matrix-vector systems). In the afternoon, we will discuss more advanced concepts including QR decompositions and the SVD.
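Two of the concepts named above, least squares and the QR decomposition, fit in a few lines of numpy. The sketch below (illustrative only) solves a least-squares problem via QR and checks the answer against numpy's built-in solver:

```python
# Least squares via the QR decomposition, checked against np.linalg.lstsq.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))   # tall matrix: more equations than unknowns
b = rng.standard_normal(6)

# Factor A = QR, then solve the triangular system R x = Q^T b.
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ b)

# Reference solution (computed via the SVD internally).
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_qr, x_ref))   # True
```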
Tuesday, August 18, 2020
Mathematical optimization underpins many applications in science and engineering, as it provides a set of formal tools to compute the ‘best’ action, design, control, or model from a set of possibilities. In data science and machine learning, mathematical optimization is the engine of model fitting. This workshop will provide an overview of the key elements of this topic (unconstrained, constrained, convex optimization, optimization for model fitting), and will have a practical focus, with participants formulating and solving optimization problems early and often using standard modeling languages and solvers. By introducing common models from machine learning and other fields, this workshop aims to make participants comfortable with optimization modeling so that they may use it for rapid prototyping and experimentation in their own work. Students should be comfortable with linear algebra, differential multivariable calculus, and basic probability and statistics. Experience with Python will be helpful, but not required.
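In the spirit of optimization as the engine of model fitting described above, here is a small sketch (not workshop code) that fits a line to synthetic data by minimizing squared error with `scipy.optimize.minimize`:

```python
# Model fitting as optimization: minimize squared error over (a, b)
# for the model y ≈ a*x + b, using a general-purpose solver.
import numpy as np
from scipy.optimize import minimize

# Synthetic data from y = 2x + 1 plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2 * x + 1 + 0.05 * rng.standard_normal(50)

def loss(params):
    a, b = params
    return np.sum((a * x + b - y) ** 2)

result = minimize(loss, x0=[0.0, 0.0])
a_hat, b_hat = result.x
print(a_hat, b_hat)   # close to 2 and 1
```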
AI Research Science Manager at Facebook Reality Labs and an Affiliate Associate Professor of Applied Mathematics and Mechanical Engineering at the University of Washington
This workshop is recommended for those who want to learn the basics of R programming in statistics, science, or engineering. The goal of this workshop is to familiarize participants with R's tools for scientific computing and data analysis. Lectures will be interactive with a focus on learning by example, and assignments will be application-driven.
Example topics: Basic data types in R, variables, and apply functions; Data input/output; Plotting in base R; Statistical applications, such as obtaining a summary of the data and running linear regressions
Computational Statistician, Data Science at Google and Lecturer, ICME
This Introduction to Python workshop will focus on scientific computing, data science, and machine learning.
More precisely, the class will cover: Python basics (variables, if/else, loops, functions); Numpy and Pandas; Scipy and Scikit-learn
The class is designed for people with some experience programming, but no experience in Python. We will introduce each topic enough so that you can quickly start using Python for your own problems knowing what tools are most appropriate. The workshop will be interactive with many examples (that the participants can play with during the session).
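As a taste of the tools listed above, here is a tiny example combining Python basics with numpy and pandas (names and data are made up for illustration):

```python
# Python basics, numpy arrays, and a pandas table, side by side.
import numpy as np
import pandas as pd

# Python basics: a function and a list comprehension.
def fahrenheit(celsius):
    return celsius * 9 / 5 + 32

temps_c = [0, 20, 37]
temps_f = [fahrenheit(t) for t in temps_c]

# Numpy: the same computation, vectorized over an array.
arr = np.array(temps_c)
assert np.allclose(arr * 9 / 5 + 32, temps_f)

# Pandas: the same data as a labeled table.
df = pd.DataFrame({"celsius": temps_c, "fahrenheit": temps_f})
print(df["fahrenheit"].max())   # 98.6
```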
5th Year PhD Student at ICME, Stanford University
Wednesday, August 19, 2020
This workshop presents the basics behind understanding and using modern machine learning algorithms. We will discuss a framework for reasoning about when to apply various machine learning techniques, emphasizing questions of over-fitting/under-fitting, interpretability, supervised/unsupervised methods, and handling of missing data. The principles behind various algorithms—the why and how of using them—will be discussed, while some mathematical detail underlying the algorithms—including proofs—will not be discussed. Unsupervised machine learning algorithms presented will include k-means clustering, principal component analysis (PCA), multidimensional scaling (MDS), t-SNE, and independent component analysis (ICA). Supervised machine learning algorithms presented will include support vector machines (SVM), lasso, elastic net, classification and regression trees (CART), boosting, bagging, and random forests. Imputation, regularization, and cross-validation concepts will also be covered. The R programming language will be used for occasional examples, though participants need not have prior exposure to R.
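To make one of the unsupervised methods above concrete, here is a minimal k-means clustering example in Python with scikit-learn, on made-up toy data (an illustrative sketch, not workshop material):

```python
# k-means clustering on two well-separated blobs of 2-d points.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
blob_a = rng.standard_normal((20, 2))              # points near (0, 0)
blob_b = rng.standard_normal((20, 2)) + [10, 10]   # points near (10, 10)
X = np.vstack([blob_a, blob_b])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Each blob should land entirely in one cluster (label values may swap).
print(len(set(labels[:20])), len(set(labels[20:])))   # 1 1
```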
Instructor: Alexander Ioannidis
Postdoctoral Fellow, Department of Biomedical Data Science at Stanford Medical School
Thursday, August 20, 2020
Deep Learning is a rapidly expanding field with new applications found every day. In this workshop we will cover the fundamentals of deep learning for the beginner. We will introduce the math behind training deep learning models: the backpropagation algorithm. Building conceptual understanding of the fundamentals of deep learning will be the focus of the first part of the workshop. We will then cover some of the popular architectures used in deep learning, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), LSTMs, autoencoders and GANs. There will be a hands-on computing tutorial using Jupyter notebooks to build a basic image classification model via transfer learning. By the end of the workshop, participants will have a firm understanding of the basic terminology and jargon of deep learning and will be prepared to dive into the plethora of online resources and literature available for each specific application area.
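The backpropagation idea mentioned above can be seen in miniature with a single linear neuron: compute a prediction (forward pass), then push the error gradient back through the chain rule to update the weights. This pure-Python sketch is for intuition only, not a practical deep learning setup:

```python
# Gradient descent on one linear neuron with squared-error loss,
# learning the target function y = 3x + 2 from noiseless data.
data = [(x / 10, 3 * x / 10 + 2) for x in range(-10, 11)]

w, b = 0.0, 0.0   # weights to learn
lr = 0.1          # learning rate

for _ in range(500):
    for x, y in data:
        y_hat = w * x + b        # forward pass
        grad = 2 * (y_hat - y)   # dLoss/dy_hat for squared error
        w -= lr * grad * x       # backward pass: chain rule through w*x
        b -= lr * grad

print(round(w, 2), round(b, 2))   # close to 3 and 2
```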
Instructor: Sherrie Wang
PhD student at ICME, Stanford University
Friday, August 21, 2020
In the past 50 years, supercomputers have achieved what was once considered only possible in Sci-Fi movies. The key to the tremendous success of supercomputers has been a combination of outstanding architectures plus software that uses all the available resources and makes parallelization possible. This secret sauce has led to different implementations across fields. A mechanical engineer would use MPI and OpenMP to balance computation and memory load when dealing with millions of nodes in physical simulations, whereas a data scientist would use MapReduce and Spark to have an adaptable and resilient algorithm for the challenges of big data. This workshop explores the key features of these two approaches, explaining their underlying philosophy and how they use the architecture. The final goal is to give students a taste of the different programming paradigms and the tools to decide which approach is best.
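The MapReduce pattern mentioned above can be sketched in a few lines of pure Python: map each record to (key, value) pairs, group by key, then reduce each group. Frameworks like Spark distribute these phases across machines; here everything runs locally, for intuition only (the word-count data is made up):

```python
# MapReduce word count, entirely in-process.
from collections import defaultdict
from itertools import chain

records = ["to be or not to be", "to code or not to code"]

# Map phase: each record emits (word, 1) pairs.
mapped = chain.from_iterable(
    ((word, 1) for word in record.split()) for record in records
)

# Shuffle phase: group the pairs by key.
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: sum the counts within each group.
word_counts = {word: sum(counts) for word, counts in groups.items()}
print(word_counts["to"])   # 4
```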
Instructor: Cindy Orozco
5th Year PhD student at ICME, Stanford University
This workshop will focus on practical applications and considerations of applying deep learning to natural language processing (NLP). We will start by drawing inspiration from more traditional NLP approaches and show how many modern deep learning-based algorithms have deep roots in traditional techniques, while also showing how deep learning has enabled new improvements. The workshop will focus heavily on students' understanding of problem templates in applied natural language processing and on identifying application patterns.
We will have a practical focus, targeting algorithms and problem templates that can be deployed and used today. We will cover the different components that go into deep learning systems, including word vector representations (word2vec, GloVe), contextual representations (ELMo, BERT), and general model components such as convolutional layers, Transformers, and others. We will also cover introductory material in applications such as classification, intent understanding, and others.
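Word vector representations like those named above (word2vec, GloVe) support similarity queries via cosine similarity. The tiny 3-dimensional vectors below are made up for illustration; real embeddings have hundreds of dimensions and are learned from large corpora:

```python
# Cosine similarity between toy word vectors.
import math

vectors = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Related words score higher than unrelated ones.
print(cosine(vectors["king"], vectors["queen"]) >
      cosine(vectors["king"], vectors["apple"]))   # True
```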
Instructor: Luke de Oliveira
Lead Engineer for the AI platform team at Twilio as part of Twilio AI
Saturday, August 22, 2020
This workshop introduces Tableau, a powerful tool for creating data visualizations. It is geared toward people in industry and academia who want to better communicate their projects and research. Attendees will learn how to load data and use it to create charts from Tableau’s library, going from time-series visualizations, scatterplots, and maps all the way to interactive dashboards that use calculated fields, groups, sets, and other advanced features.
Instructor: Sergio Camelo Gomez
5th Year PhD Student at ICME, Stanford University