4:15pm to 5:30pm

4:15pm to 5:30pm

This presentation will discuss a class of preconditioning methods for solving linear systems of equations that are based on exploiting low-rank approximations to certain matrices. These methods have a number of appealing features. Because they are essentially approximate inverse techniques, they handle indefiniteness quite well. Furthermore, they are amenable to SIMD computations such as those inherent to GPUs. The talk will first describe a recursive divide-and-conquer approach geared toward symmetric positive definite model problems arising from finite difference discretizations of PDEs. Then two extensions of this general approach will be described. The first exploits Schur complements in a parallel computing domain decomposition (DD) framework. The second extends this DD approach further by considering so-called 'hierarchical interface decomposition orderings', which are essentially algebraic generalizations of the 'wirebasket' techniques used in domain decomposition methods.

Bio: Yousef Saad is an I.T. Distinguished Professor of Computer Science in the Department of Computer Science and Engineering at the University of Minnesota. He has held the William Norris Chair for Large-Scale Computing since January 2006. He is known for his contributions to matrix computations, including iterative methods for solving large sparse linear systems, eigenvalue problems, and parallel computing. Saad is listed as an ISI highly cited researcher in mathematics and is the author of the highly cited book 'Iterative Methods for Sparse Linear Systems'.

NOTE: This talk will replace the CME 500 seminar on 4/20, but will be held in the same location as CME 500.

4:15pm to 5:30pm

Random matrix theory has long been used to study the spectral properties of physical systems, and has led to a rich interplay between probability theory and physics [1]. Historically, random matrices have been used to model physical systems with random fluctuations, or systems whose eigenproblems were too difficult to solve numerically. This talk explores applications of RMT to the physics of disorder in organic semiconductors [2,3]. Revisiting the old problem of Anderson localization [4] has shed new light on the emerging field of free probability theory [5]. I will discuss the implications of free probabilistic ideas for finite-dimensional random matrices [6], as well as some hypotheses about eigenvector locality. Algorithms are available in the RandomMatrices.jl package [7] written for the Julia programming language.
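The Anderson model mentioned above is easy to experiment with numerically. The following sketch (in Python with NumPy, not taken from the talk or from the RandomMatrices.jl package) builds a 1D tight-binding Hamiltonian with random on-site disorder and uses the inverse participation ratio of each eigenvector as a crude probe of eigenvector locality; the disorder strength `w` and system size are illustrative choices only.

```python
import numpy as np

def anderson_hamiltonian(n, w, seed=0):
    """1D Anderson tight-binding Hamiltonian: unit hopping on the
    off-diagonals plus i.i.d. uniform on-site disorder of strength w."""
    rng = np.random.default_rng(seed)
    diag = rng.uniform(-w / 2, w / 2, size=n)   # random on-site energies
    hop = np.ones(n - 1)                        # nearest-neighbor hopping
    return np.diag(diag) + np.diag(hop, 1) + np.diag(hop, -1)

H = anderson_hamiltonian(200, w=2.0)
evals, evecs = np.linalg.eigh(H)

# Inverse participation ratio per eigenvector: ~1/n for extended
# states, O(1) for strongly localized ones.
ipr = np.sum(evecs**4, axis=0)
```

Stronger disorder `w` pushes the IPR values toward O(1), the localized regime.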

[1] M. L. Mehta. Random matrices, 3/e, Academic Press, 2000.

[2] J. Chen, E. Hontz, J. Moix, M. Welborn, T. Van Voorhis, A. Suarez, R. Movassagh, and A. Edelman. Error analysis of free probability approximations to the density of states of disordered systems. Phys. Rev. Lett. (2012) 109:36403.

[3] M. Welborn, J. Chen, and T. Van Voorhis. Densities of states for disordered systems from free probability. Phys. Rev. B (2013) 88:205113.

[4] P. W. Anderson. Absence of diffusion in certain random lattices. Phys. Rev. (1958) 109:1492--1505.

[5] D. Voiculescu. Addition of certain non-commuting random variables. J. Functional Anal. (1986) 66:323--346.

[6] J. Chen, T. Van Voorhis, and A. Edelman. Partial freeness of random matrices. arXiv:1204.2257.

**This seminar will be recorded and posted online one hour after the seminar. Please visit https://www.youtube.com/user/ICMEStudio to watch this talk.**

4:15pm to 5:30pm

We use mathematical modeling to explore the ramifications of targeting preventive disease measures to undernourished children. We consider a malaria model with superinfection and heterogeneous susceptibility, where a portion of this susceptibility is due to undernutrition (as measured by weight-for-age z scores). The portion of the total susceptibility that is due to undernutrition is estimated from a large randomized trial of supplementary feeding. We compute the malaria morbidity and mortality for a variety of policies involving supplementary food and insecticide treated nets.

4:00pm to 5:00pm

4:15pm to 5:30pm

Title: Fast and flexible linear algebra in Julia

Abstract: Applied scientists often develop computer programs exploratively, where data examination, manipulation, visualization and code development are tightly coupled. Traditionally, the programming languages used are slow, with performance-critical computations relegated to library code written in languages on the other side of Ousterhout's dichotomy, e.g. LAPACK. I will introduce the Julia programming language and argue that it is well suited for computational linear algebra. Julia provides features for exploratory program development, but the language itself can be almost as fast as C and Fortran. Furthermore, Julia's rich type system makes it possible to extend linear algebra functions with user-defined element types, such as finite fields or strings with algebraic structure attached. I will show examples of Julia programs that are relatively simple, yet fast and flexible at the same time. Finally, the potential and challenges for parallel linear algebra in Julia will be discussed.
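To illustrate what "linear algebra over user-defined element types" means (the talk's examples are in Julia; this is a rough Python stand-in), here is a generic Gaussian elimination routine that works for any field-like element type supporting +, -, *, /, run with exact rationals so the answer carries no rounding error at all:

```python
from fractions import Fraction

def solve(A, b):
    """Gaussian elimination with partial pivoting, generic over the
    element type (here exact rationals instead of floats)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [Fraction(0)] * n
    for i in reversed(range(n)):                      # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# Exact solve of a small system: no floating-point rounding anywhere.
A = [[Fraction(1), Fraction(1, 2)], [Fraction(1, 3), Fraction(1, 4)]]
b = [Fraction(1), Fraction(0)]
x = solve(A, b)  # [Fraction(3, 1), Fraction(-4, 1)]
```

In Julia the same effect falls out of multiple dispatch: the one generic routine compiles to fast specialized code for each element type.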

4:15pm to 5:30pm

Dealing with Thread Divergence in a GPU Monte Carlo Radiation Therapy Simulator

and

Deep learning and 2D probabilistic context free grammars for parsing images of math formulas

4:15pm to 5:30pm

http://stanford.edu/~yuekai/

yuekai@gmail.com

We study how to fit a sparse linear model when the data is stored in a distributed fashion. The main computational challenge in a distributed setting is harnessing the computational capabilities of all the machines while keeping communication costs low. We devise an approach that requires only a single round of communication among the machines. We show the approach recovers the convergence rate of the (centralized) lasso as long as each machine has access to an adequate number of samples.

Joint work with Jason Lee, Qiang Liu, and Jonathan Taylor.
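The communication pattern can be sketched in a few lines. This is a simplified stand-in, not the authors' algorithm: each machine fits a local lasso (via ISTA), forms a debiased estimate, and a single round of communication averages them. The debiasing step here uses the identity as a proxy for the precision matrix, which is only reasonable for the well-conditioned Gaussian design simulated below; the dimensions, regularization level, and the `ista_lasso` helper are all illustrative choices.

```python
import numpy as np

def ista_lasso(X, y, lam, iters=500):
    """Lasso via proximal gradient (ISTA): min 0.5/n ||y-Xb||^2 + lam ||b||_1."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n          # Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(iters):
        z = b - X.T @ (X @ b - y) / (n * L)
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return b

rng = np.random.default_rng(0)
p, n_per, machines = 50, 200, 4
beta = np.zeros(p); beta[:5] = 3.0             # sparse ground truth
shards = []                                    # data stored in a distributed fashion
for _ in range(machines):
    X = rng.standard_normal((n_per, p))
    y = X @ beta + 0.5 * rng.standard_normal(n_per)
    shards.append((X, y))

# Single round of communication: each machine sends one debiased vector.
lam = 0.5 * np.sqrt(np.log(p) / n_per)
local = []
for X, y in shards:
    b = ista_lasso(X, y, lam)
    b_debiased = b + X.T @ (y - X @ b) / n_per  # debias (identity precision proxy)
    local.append(b_debiased)
avg = np.mean(local, axis=0)
err = np.linalg.norm(avg - beta) / np.linalg.norm(beta)
```

Averaging the debiased (rather than raw) local estimates is what lets the combined estimator match centralized accuracy from one communication round.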

12:15pm to 1:15pm

4:15pm to 5:30pm

4:15pm to 5:30pm

Nonlinear systems of partial differential equations (PDEs) may permit several distinct solutions. The typical current approach to finding distinct solutions is to start Newton's method with many different initial guesses, hoping to find starting points that lie in different basins of attraction. In this talk, we present a deflation algorithm in Banach spaces for systematically modifying the residual of a nonlinear PDE problem to eliminate known solutions from consideration. This enables the Newton-Kantorovich iteration to converge to several different solutions, even starting from the same initial guess. The deflated Jacobian is dense, but an efficient preconditioning strategy is devised, and the number of Krylov iterations is observed not to grow as solutions are deflated. The technique is then applied to computing distinct solutions of nonlinear PDEs, tracing bifurcation diagrams, and computing multiple local minima of PDE-constrained optimization problems.
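The core idea is visible already in one dimension. The sketch below uses classical (unshifted) deflation, dividing out known roots; the talk's method uses a shifted deflation operator suitable for Banach spaces, so treat this only as an illustration of how the same initial guess can yield a second solution.

```python
def newton(f, x0, tol=1e-12, maxit=100, h=1e-7):
    """Scalar Newton iteration with a forward-difference derivative."""
    x = x0
    for _ in range(maxit):
        dfx = (f(x + h) - f(x)) / h
        step = f(x) / dfx
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton did not converge")

def deflated(f, roots):
    """Divide out known roots so Newton cannot converge to them again."""
    def g(x):
        out = f(x)
        for r in roots:
            out /= (x - r)
        return out
    return g

f = lambda x: x**2 - 1.0                 # two solutions: +1 and -1
r1 = newton(f, x0=2.0)                   # first run finds +1
r2 = newton(deflated(f, [r1]), x0=2.0)   # same guess now finds -1
```

After deflating r1 = 1, the residual is effectively x + 1, so the very same starting point x0 = 2 is driven to the other solution.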

Bio: Patrick Farrell completed his PhD in Computational Physics in 2009 at Imperial College London. He is an EPSRC Early Career Research Fellow in Applied Mathematics in the Mathematical Institute, University of Oxford.

2:15pm to 3:00pm

4:15pm to 5:30pm

4:15pm to 5:30pm

In various tasks of seismic petroleum exploration it is necessary to solve the wave equation numerically many thousands of times. Even with the largest clusters this is still a fairly time demanding task. Given some practical constraints on the problem we explore means to reduce the total computing time by significant factors. We use a combination of known tools, such as Model Order Reduction, Petrov-Galerkin oblique projections, super-shots, etc. The novelty resides in applying them to the wave equation and actually obtaining the desired speedup in a practical manner.

2:15pm to 3:00pm

4:15pm to 5:30pm

4:00pm to 5:00pm

Speaker: Margot Gerritsen

4:15pm to 5:30pm

One of the most popular techniques for the efficient solution of large sparse linear systems is to permute with only sparsity in mind and to sidestep dynamic pivoting by perturbing diagonal pivot candidates that are deemed too small. This talk demonstrates simple Hermitian positive-definite (HPD) matrices where such a scheme fails catastrophically, explains the failure in terms of both pseudospectra and element growth within the resulting triangular solves, and shows how the same phenomenon can occur within well-conditioned quasi-semidefinite systems.

An argument is made for reviving the work of Saunders and defaulting to preconditioning via a priori regularizations of quasi-semidefinite systems when solving sparse non-HPD linear and least squares problems on large distributed-memory machines. Furthermore, these results are discussed in the context of the author's recent work on implementing distributed-memory interior point methods in Elemental.

2:15pm to 3:00pm

Guenther Walther is a professor in the Department of Mathematics at Stanford. Guenther Walther studied mathematics, economics, and computer science at the University of Karlsruhe in Germany and received his Ph.D. in Statistics from UC Berkeley in 1994. His research has focused on statistical methodology for detection problems, shape-restricted inference, and mixture analysis, and on statistical problems in astrophysics and in flow cytometry.

4:15pm to 5:30pm

Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments

In this era of large-scale data, distributed systems built on top of clusters of commodity hardware provide cheap and reliable storage and scalable processing of massive data. In this talk, we review recent work on developing and implementing randomized matrix algorithms in large-scale parallel and distributed environments. Our main focus is on the underlying theory and practical implementation of random projection and random sampling algorithms for very large very overdetermined least squares regression problems. Theoretical results demonstrate that in near input-sparsity time and with only a few passes through the data one can obtain very strong relative-error approximate solutions, with high probability. We evaluate the performance of these algorithms on terabyte-sized data in existing distributed systems using Spark. These empirical results highlight the importance of various trade-offs (e.g., between the time to construct an embedding and the conditioning quality of the embedding, between the relative importance of computation versus communication, etc.) and demonstrate that least squares problems can be solved to low, medium, or high precision.
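For the random projection side, the basic "sketch-and-solve" step can be written down directly. This is a minimal NumPy sketch, assuming a dense Gaussian embedding (the paper's large-scale implementations favor sparser, cheaper embeddings, and the higher-precision regimes typically use the sketch to build a preconditioner rather than solving the sketched problem outright); all sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 10000, 50                     # very overdetermined: n >> d
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

# Sketch-and-solve: compress the rows with a random projection S (k x n),
# then solve the much smaller k x d least squares problem.
k = 1000
S = rng.standard_normal((k, n)) / np.sqrt(k)   # Gaussian embedding
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# Residual of the sketched solution relative to the optimal residual:
# close to 1, i.e. a low-precision but cheap approximate solution.
rel_err = np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b)
```

The trade-off the abstract mentions is visible here: a larger k gives a better-conditioned embedding (smaller rel_err) at a higher cost of forming S @ A.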

4:15pm to 5:30pm

Various applications in signal processing and machine learning give rise to highly structured spectral optimization problems characterized by low-rank solutions. Two examples include optimization problems from phase retrieval and matrix completion. I will describe how the particular structure of these problems allows for a special kind of duality framework that leads to computationally convenient formulations.

Please visit https://www.math.ucdavis.edu/~mpf/ for more information about Michael Friedlander.

2:15pm to 3:00pm

Eric Darve, Associate Professor of Mechanical Engineering at Stanford, will be talking about various projects in his group including fast linear solvers, hierarchical matrices, and task-based parallel programming systems.

4:00pm to 5:00pm

Speaker: Margot Gerritsen

4:15pm to 5:30pm

The problem of solving a nonsmooth nonconvex program with nonsmooth constraints differs significantly from its smooth counterpart. In case the objective and constraint functions are no longer C^2 but only locally Lipschitz, the first-order optimality conditions no longer lead to a system of equations but to a set relation involving subdifferentials. This has the consequence that algorithmic approaches that differ from the smooth case have to be taken for computing solutions. One of the most promising classes of solvers are bundle methods.

In this talk I will present the first step on a roadmap towards an algorithm for solving general nonsmooth nonconvex programs. Taking inspiration from the SQP-method for smooth optimization we develop a second-order bundle method for minimizing a nonsmooth objective function subject to nonsmooth inequality constraints starting from a strictly feasible point. Instead of using a penalty function or filter or a merit function to deal with the constraints, we determine the search direction by solving a convex quadratically constrained quadratic program to obtain good iteration points. Furthermore, global convergence of the method is proved under certain mild assumptions. For a concrete implementation numerical results will be presented, as well as an application to certificates of infeasibility and exclusion boxes for numerical constraint satisfaction problems.

Bio: Hermann Schichl is currently an associate professor at the computer mathematics group at the faculty of mathematics of the University of Vienna, Austria. He received his PhD working on infinite-dimensional differential geometry at the same university in 1998. His research interests include mixed-integer global optimization, nonsmooth optimization, mathematical modeling, rigorous computing, operations research, and computational science. He is the main developer of the COCONUT environment, a modular software platform for global optimization algorithms.

2:15pm to 3:00pm

The talk will cover the basics of satellite electric propulsion, focus on recent developments in hardware design and integration and on understanding of the fundamental plasma effects, and discuss existing modeling capabilities and challenges to be addressed in future research.

4:15pm to 5:30pm

Fayadhoi Ibrahima

Stochastic methods for flow and transport in porous media

We present stochastic methods for analyzing water saturation (and hence oil production) in highly heterogeneous reservoirs during oil recovery via water flooding under uncertainty. We model the problem as a two-phase (oil/water) Buckley-Leverett flow problem with possible uncertainties in the porosity field and in the total Darcy flux (and ultimately the permeability field). After describing a few possible strategies (MCS, moment equations, collocation and PDF/CDF) to get statistics of the water saturation, we will focus on a one-point CDF approach using streamlines. In 1D, we provide comparisons with direct MCS to illustrate the exactness of this CDF method, as well as comparisons with a low-order approximation approach.

Tania Bakhos

Fast Krylov solvers for parabolic PDE-based inverse problems

We propose a fast iterative solver for large-scale PDE-based nonlinear inverse problems, where measurements are used to reconstruct the spatial variation of parameters. Our motivation is transient hydraulic tomography, which is a method to estimate hydraulic parameters related to the subsurface from pressure measurements obtained by a series of pumping tests. Standard approaches for solving the inverse problem require repeated construction of the Jacobian, which represents the sensitivity of the measurements to the unknown parameters. This is prohibitively expensive because it requires repeated solutions of time-dependent parabolic partial differential equations corresponding to multiple sources and receivers. Additionally, solving the differential equations by time-stepping methods requires computing and storing the entire time history. We use a Laplace Transform-based exponential time integrator to independently compute the solution at multiple times, thus reducing the computational cost.

4:00pm to 5:00pm

Speaker: Margot Gerritsen

4:15pm to 5:30pm

Gaussian elimination is universally known as "the" method for solving simultaneous linear equations. As Leonhard Euler remarked in 1771, it is "the most natural way" of proceeding. The method was invented in China about 2000 years ago, then reinvented in Europe much more recently, so it is surprising that the primary European sources have not been identified until now. It is an interesting story in the history of computing and technology that Carl Friedrich Gauss came to be mistakenly identified as the inventor of Gaussian elimination even though he was not born until 1777.

The European development has three phases. First came the "schoolbook" method that began with algebra lessons written by Isaac Newton; what we learn in high school or middle school is still basically Newton's creation. Second were methods that professional hand computers used to solve the normal equations of least squares problems; until comparatively recently the chief societal use for Gaussian elimination was to solve normal equations for statistical estimation. Third was the adoption of matrix notation in the middle of the 20th century. Henceforth the schoolbook lesson and the professional algorithms were understood to be trivially related in that all can be interpreted as computing triangular decompositions.

Bio:

Dr Grcar has written several articles on the history and sociology of science focussing on mathematical sciences and making use of bibliometric data. Formerly he was a research scientist specializing in reacting fluid flow (combustion) calculations at Sandia National Lab and at Lawrence Berkeley National Lab. His expertise in numerical analysis is matrix calculations. A test matrix and a polynomial preconditioner are named after him.

2:15pm to 3:00pm

4:15pm to 5:30pm

Interpolatory factorizations provide alternatives to the singular value decomposition for low-rank matrix approximations; this class includes the CUR factorization, where the C and R matrices are formed from subsets of columns and rows of the matrix being approximated. While interpolatory approximations lack the SVD's optimality, their ingredients are easier to interpret than singular vectors: since they come directly from the matrix itself, they inherit the data's key properties (e.g., nonnegative/integer values, sparsity, etc.). We shall provide an overview of these approximate factorizations, describe how they can be analyzed using interpolatory projectors, and introduce a new method for their construction based on the Discrete Empirical Interpolation Method (DEIM). This talk describes joint work with Dan Sorensen (Rice).
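A compact NumPy sketch of the DEIM-based construction: greedily select one row (or column) index per singular vector, then assemble C, R from those indices with the middle factor U = C⁺ A R⁺. This follows the general Sorensen-Embree recipe only in outline; the index-selection details, the rank choice, and the test matrix below are illustrative assumptions.

```python
import numpy as np

def deim_indices(U):
    """DEIM greedy selection: one index per basis vector, each time
    taking the largest entry of the current interpolation residual."""
    m, k = U.shape
    idx = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, k):
        c = np.linalg.solve(U[np.ix_(idx, range(j))], U[idx, j])
        r = U[:, j] - U[:, :j] @ c            # residual after interpolation
        idx.append(int(np.argmax(np.abs(r))))
    return idx

rng = np.random.default_rng(2)
A = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 80))
A += 1e-8 * rng.standard_normal((100, 80))    # numerically rank-5 matrix

k = 5
U, s, Vt = np.linalg.svd(A, full_matrices=False)
rows = deim_indices(U[:, :k])                 # indices from left singular vectors
cols = deim_indices(Vt[:k, :].T)              # indices from right singular vectors

C, R = A[:, cols], A[rows, :]                 # actual columns and rows of A
Uc = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
err = np.linalg.norm(A - C @ Uc @ R) / np.linalg.norm(A)
```

Note that C and R inherit whatever structure A has (sparsity, nonnegativity, integrality), which is exactly the interpretability advantage over the SVD.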

NOTE: This talk will replace the CME 500 seminar on 2/2 and will be held in a different location.

4:15pm to 5:30pm

CME 500 will be replaced by the Distinguished Speaker Series on 2/2.

4:00pm to 5:00pm

Speaker: Margot Gerritsen

4:15pm to 5:30pm

For challenging numerical problems, William Kahan has said that "default evaluation in Quad is the humane option". Fortunately the gfortran compiler allows us to change "real(8)" to "real(16)" everywhere. This is the humane option for producing Quad-precision software.

We describe experiments on multiscale linear and nonlinear optimization problems using a Quad implementation of MINOS. On a range of examples we find that 34-digit Quad floating-point achieves exceptionally small primal and dual infeasibilities (of order 1e-30) when "only" 1e-15 is requested.
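MINOS's Quad build is Fortran (real(16)), but the reason 1e-30 infeasibilities are out of reach for double precision can be illustrated with Python's standard decimal module set to Quad's ~34 significant digits; the specific magnitude 1e-20 below is just a convenient example.

```python
from decimal import Decimal, getcontext

getcontext().prec = 34              # IEEE quad carries ~34 significant digits

x = 1e-20
double_result = (1.0 + x) - 1.0     # double (~16 digits): x is lost entirely

d = Decimal("1e-20")
quad_result = (Decimal(1) + d) - Decimal(1)  # 34 digits: x survives exactly

print(double_result)   # 0.0
print(quad_result)     # 1E-20
```

Quantities smaller than about 1e-16 relative to the working values simply vanish in double arithmetic, so residuals of order 1e-30 are representable only in something like Quad.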

2:15pm to 3:00pm

Andrew Spakowitz is an Associate Professor of Chemical Engineering and (by courtesy) of Materials Science and Engineering and of Applied Physics at Stanford University. The Spakowitz lab is engaged in projects that address fundamental chemical and physical processes that underlie a range of key biological mysteries and cutting-edge materials applications. Current research in our lab focuses on three main research themes: DNA Biophysics, Protein Self Assembly, and Charge Transport in Conjugated Polymers. These broad research areas offer complementary perspectives on chemical and physical processes, and we leverage this complementarity throughout our research. Our approach draws from a diverse range of theoretical and computational methods, including analytical theory of semiflexible polymers, polymer field theory, continuum elastic mechanics, Brownian dynamics simulation, equilibrium and dynamic Monte Carlo simulations, and analytical theory and numerical simulations of reaction-diffusion phenomena. A common thread in our work is the need to capture phenomena over many length and time scales, and our flexibility in research methodologies allows us to address these problems at an unprecedented level of precision.

9:45am to 5:30pm

George Bosilca

Building blocks for resilient applications

The Message Passing Interface (MPI) is the API of the message-passing parallel programming model, which is widely used by the scientific computing community because it provides an efficient and portable programming environment for distributed applications. But as parallel programming becomes increasingly mainstream, even indispensable, MPI is being asked to address an ever broader and more diverse application community; consequently, it must evolve to embrace the needs of these new users. However, since many of these new application communities utilize cloud and/or grid computing resources, in order to serve them, MPI must overcome a major limitation it currently has: it must be able to gracefully handle the type and the frequency of failures that such cloud and grid infrastructures typically exhibit. This talk describes User Level Fault Mitigation (ULFM), which is an extension to the MPI standard that offers a path forward to the kind of dynamic and resilient MPI implementation that so many scientific communities will need to parallelize their applications.

Austin Benson

Building blocks for resilient applications

Errors due to hardware or low level software problems, if detected, can be fixed by various schemes, such as recomputation from a checkpoint. Silent errors are errors in application state that have escaped low-level error detection. At extreme scale, where machines can perform astronomically many operations per second, silent errors threaten the validity of computed results. We propose a new paradigm for detecting silent errors at the application level. Our central idea is to frequently compare computed values to those provided by a cheap checking computation, and to build error detectors based on the difference between the two output sequences. Numerical analysis provides us with usable checking computations for the solution of initial-value problems in ODEs and PDEs, arguably the most common problems in computational science. Here, we investigate methods based on Runge-Kutta and linear multistep methods for ODEs, and on implicit and explicit finite difference schemes for PDEs. In tests with artificially injected errors, this approach effectively detects almost all meaningful errors, without significant slowdown.
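The detector can be sketched on a toy ODE: run the main integrator (RK4) alongside a cheaper checking integrator (Heun), and flag any step where the two disagree by more than their expected truncation-error gap. The test problem, step size, and the 10·h³ threshold are illustrative choices, not taken from the paper.

```python
def rk4_step(f, t, y, h):
    """Main integrator: classical 4th-order Runge-Kutta."""
    k1 = f(t, y); k2 = f(t + h/2, y + h/2 * k1)
    k3 = f(t + h/2, y + h/2 * k2); k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2*k2 + 2*k3 + k4)

def heun_step(f, t, y, h):
    """Cheap checking integrator (2nd order) run alongside the main one."""
    k1 = f(t, y)
    return y + h / 2 * (k1 + f(t + h, y + h * k1))

f = lambda t, y: -y                 # test problem y' = -y
h, y, flagged = 0.01, 1.0, []
for i in range(200):
    y_new = rk4_step(f, i * h, y, h)
    if i == 100:
        y_new += 1e-3               # inject a silent, bit-flip-like error
    check = heun_step(f, i * h, y, h)
    # The integrators agree to O(h^3) per step; a larger gap is suspicious.
    if abs(y_new - check) > 10 * h**3:
        flagged.append(i)
    y = y_new
```

Only the corrupted step trips the detector; clean steps stay well under the threshold because the disagreement is bounded by the checker's truncation error.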

4:00pm to 5:00pm

*BIG MATH IN SMALL COMPANIES: A Quant Perspective on Entrepreneurship*

ICME TGIF Seminar. January 23rd 4-5pm, Y2E2 Room 111.

Hear from ICME alumni and others who have founded or joined start-ups after grad school.

- How did they end up in a start-up?
- What role does "big math" play in their work?
- What unique challenges do founders face?
- And what do they find most valuable from their ICME days at Stanford?
- Come with your burning questions as well!
- Panelists will stay for further questions and discussion during the ICME Happy Hour

Panelists will include:

- Esteban Arcaute, Adchemy (acquired by @WalmartLabs)
- Ian Christopher, analyticsMD
- Adam Guetz, Kelvin Inc.
- Montse Medina, Jetlore
- James Piette, Two Six Capital

Guests are encouraged to arrive on time, as space is limited.

Watch the event recording online here: https://www.youtube.com/watch?v=ASe9VLrYFjg

4:15pm to 5:30pm

The Tall-Skinny QR (TSQR) algorithm is more communication efficient than the standard Householder algorithm for QR decomposition of dense matrices with many more rows than columns. However, TSQR produces a different representation of the orthogonal factor and therefore requires more software development to support the new representation. Further, implicitly applying the orthogonal factor to the trailing matrix in the context of factoring a square matrix is more complicated and costly than with the Householder representation.

In this talk, I'll show how to perform TSQR and then reconstruct the Householder vector representation with the same asymptotic communication efficiency and little extra computational cost. I'll discuss the high performance and numerical stability of this algorithm both theoretically and empirically. The new Householder reconstruction algorithm allows us to design more efficient parallel QR algorithms, with significantly lower latency cost compared to Householder QR and lower bandwidth and latency costs compared with the Communication-Avoiding QR (CAQR) algorithm. As a result, our final parallel QR algorithm outperforms ScaLAPACK and Elemental implementations of Householder QR, as well as our implementation of CAQR, on Cray XE6 and XC30 systems at NERSC.
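The TSQR step itself is short to write down. This NumPy sketch uses a flat one-level reduction (the communication-optimal variant uses a binary reduction tree, and the talk's Householder-reconstruction step is not shown here); the block count is arbitrary.

```python
import numpy as np

def tsqr(A, blocks=4):
    """Tall-Skinny QR: local QR on row blocks, then one QR of the
    stacked small R factors (flat reduction instead of a tree)."""
    parts = np.array_split(A, blocks, axis=0)
    Qs, Rs = zip(*(np.linalg.qr(p) for p in parts))
    Q2, R = np.linalg.qr(np.vstack(Rs))       # QR of the (blocks*n) x n stack
    n = A.shape[1]
    # Assemble the final Q: each local Q times its slice of Q2.
    Q = np.vstack([q @ Q2[i * n:(i + 1) * n] for i, q in enumerate(Qs)])
    return Q, R

rng = np.random.default_rng(3)
A = rng.standard_normal((1000, 20))           # many more rows than columns
Q, R = tsqr(A)
```

Each block's QR is independent, so only the small n-by-n R factors need to be communicated; this is the source of TSQR's latency advantage, and also why its Q comes back in a nonstandard (non-Householder) representation.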

Bio: Grey Ballard is currently a Truman Fellow at Sandia National Labs in Livermore, CA. He received his PhD in 2013 from UC Berkeley, where he worked in the BeBOP group and Parallel Computing Laboratory under advisor James Demmel. His research interests include numerical linear algebra, high performance computing, and computational science, particularly in developing algorithmic ideas that translate to improved implementations and more efficient software. His work has been recognized with the SIAM Linear Algebra Prize and two conference best paper awards, at SPAA and IPDPS; he received the C.V. Ramamoorthy Distinguished Research Award at UC Berkeley, and his PhD thesis was recognized by the ACM Doctoral Dissertation Award - Honorable Mention.

2:15pm to 3:00pm

Sanjeeb Bose is a computational scientist at Cascade Technologies and a Consulting Assistant Professor in the Institute for Computational and Mathematical Engineering. His research expertise is in the areas of modeling and high fidelity simulation of complex turbulent flows and large scale parallel computing. Large-eddy simulations of turbulent flows to predict unsteady phenomena such as boundary layer separation and heat transfer for energy systems will be presented. The design of efficient and scalable algorithms (including scalability results up to 786,000 processors) and approaches to quantifying errors in numerical simulations of turbulent flows will also be briefly discussed.

4:00pm to 5:00pm

Speaker: Margot Gerritsen

4:15pm to 5:30pm

I compare Krylov subspace methods by their performance on square test equations Ax = b with naturally occurring matrices A. LSQR proves superior to the other methods in terms of reliability.

Numerical analysis developed to study the mathematics of computation after World War 2. The subject is still growing in the 21st century. Reasons for the growth and geographic distribution of the researchers are examined.

Bio:

Dr Grcar has written several articles on the history and sociology of science focussing on mathematical sciences and making use of bibliometric data. Formerly he was a research scientist specializing in reacting fluid flow (combustion) calculations at Sandia National Lab and at Lawrence Berkeley National Lab. His expertise in numerical analysis is matrix calculations. A test matrix and a polynomial preconditioner are named after him.

2:15pm to 3:00pm

Marco Pavone is an Assistant Professor of Aeronautics and Astronautics at Stanford University, where he also holds courtesy appointments in the Department of Electrical Engineering, in the Institute for Computational and Mathematical Engineering, and in the Information Systems Laboratory. He is a Research Affiliate at the NASA Jet Propulsion Laboratory (JPL), California Institute of Technology. Before joining Stanford, he was a Research Technologist within the Robotics Section at JPL. He received a Ph.D. degree in Aeronautics and Astronautics from the Massachusetts Institute of Technology in 2010. Dr. Pavone's areas of expertise lie in the fields of controls and robotics.

Nick Henderson is a Research Associate in the Institute for Computational and Mathematical Engineering. He received his Ph.D. degree from Stanford University in 2012. Dr. Henderson was one of six awardees of "Most Excellent Young Researchers Presentation" at the Joint International Conference on Supercomputing in Nuclear Applications in 2013.

4:15pm to 5:30pm

Yingzhou Li

Butterfly Factorization

This talk focuses on a fast algorithm accelerating an important class of matrix-vector multiplications. A matrix of size N by N in this class enjoys a special low-rank property that any contiguous p by q submatrix has a numerical rank bounded by a constant independent of N if pq<=N. We refer to these matrices as butterfly matrices and present an efficient butterfly factorization algorithm to represent them as products of sparse matrices. This butterfly factorization can be used for efficient operator compression and admits a matrix-vector multiplication in O(N log N) operations. The application of the factorization is significantly faster than that of the butterfly algorithm, as demonstrated by several numerical examples.

Akshay Mittal

An efficient non-intrusive uncertainty propagation method for stochastic multi-physics models.

Multi-physics systems are mathematically represented as coupled partial differential equation (PDE) systems, which are naturally suited for module-based (partitioned) numerical solution methods. While modularization is well established for solving deterministic PDE systems, several challenges arise in extending the framework to uncertainty propagation, i.e., solving stochastic PDE systems. Monte-Carlo-based methods are blind to the module-based structure of the solver and, in general, they require millions of expensive PDE solves to yield accurate statistics. Moreover, although standard (traditional) implementations of spectral methods can be modularized, they can easily succumb to the curse of dimension (COD), since the coupled nature of the model dictates that each module should handle the global (combined) stochastic space for uncertainty propagation.

Therefore, we present a reduced non-intrusive spectral projection (NISP) method for uncertainty propagation which addresses the COD and facilitates a reuse of deterministic solver modules. Our method is a modification of the standard NISP method with the intermediate construction of reduced dimensional (and order) approximations of the input data entering each respective module. Assuming a generalized polynomial chaos (gPC) approximation of the raw input data, the construction methods are based on straightforward linear algebraic computations, the costs of which are negligible in comparison to repeated module calls. We implement the reduced NISP method on some benchmark problems and demonstrate its performance gains over the standard NISP method.