Friday, March 31, 2017
This workshop presents the basics behind the application of modern machine learning algorithms. We will discuss a framework for reasoning about when to apply various machine learning techniques, emphasizing questions of over-fitting/under-fitting, regularization, interpretability, supervised/unsupervised methods, and handling of missing data. We will discuss the principles behind various algorithms (the why and how of using them), but not some of the underlying mathematical detail, including proofs. Unsupervised machine learning algorithms presented will include k-means clustering, principal component analysis (PCA), and independent component analysis (ICA). Supervised machine learning algorithms presented will include support vector machines (SVM), classification and regression trees (CART), boosting, bagging, and random forests. Imputation, the lasso, and cross-validation concepts will also be covered. The R programming language will be used for examples, though participants need not have prior exposure to R. The workshop runs from 9:00 a.m. to 4:45 p.m. in four 75-minute sessions separated by breaks. Please note that this is not a Stanford for-credit course.
Prerequisite: undergraduate-level linear algebra and statistics; basic programming experience (R/Matlab/Python).
- Basic Concepts and Intro to Supervised Learning: linear and logistic regression
- Penalties, regularization, sparsity (lasso, ridge, and elastic net)
- Unsupervised Learning: clustering (k-means and hierarchical) and dimensionality reduction (Principal Component Analysis, Independent Component Analysis, Self-Organizing Maps, Multi-Dimensional Scaling)
- Unsupervised Learning: NMF and text classification (bag-of-words model)
- Supervised Learning: loss functions, cross-validation (bias-variance trade-off and learning curves), imputation (K-nearest neighbors and SVD), imbalanced data
- Classification and Regression Trees (CART)
- Ensemble methods (Boosting, Bagging, and Random Forests)
- Support Vector Machines (SVM)
- Deep Learning: Neural Networks (Feed-Forward, Convolutional, Recurrent) and training algorithms
This workshop is open to participants 18 years and older. If you are under the age of 18 and would like to participate, please email firstname.lastname@example.org
To register for this workshop, please visit: https://app.certain.com/profile/form/index.cfm?PKformID=0x2531409ffbb
Alexander is a PhD candidate in the Institute for Computational and Mathematical Engineering at Stanford. His research, under Prof. Carlos Bustamante, chair of the Department of Biomedical Data Science at Stanford Medical School, focuses on applying machine learning techniques to medicine and human genetics. Prior to Stanford, he earned his bachelor's degree in Chemistry and Physics from Harvard and an MPhil from the University of Cambridge. He worked for several years on superconducting and quantum computing architectures at Northrop Grumman's Advanced Technologies research center in Linthicum, MD. In his free time he enjoys sailing.