
2018-19 CMI Seminars

October 2, 2018
Ben Lee Volk
Algebraic Complexity Theory: Lower bounds and Derandomization (1st of 2 parts)
    Algebraic complexity theory is a subfield of complexity theory that studies computational problems of an algebraic flavor, such as multiplying matrices, computing the determinant, and, more generally, computing polynomials over a field, where the complexity of a polynomial is measured by the number of algebraic operations needed to compute it.
    Two basic open problems in the area are (1) proving superpolynomial circuit lower bounds for explicit polynomials (the algebraic analog of the P vs. NP problem), and (2) designing efficient deterministic algorithms for the Polynomial Identity Testing (PIT) problem, which asks, given a circuit computing a polynomial, to determine whether it computes the zero polynomial. Studying these questions often leads to beautiful mathematical questions of a combinatorial, algebraic and geometric nature.
    In this talk, we will discuss several old and new results in the area, explore the mysterious, bidirectional and intricate connections between the two problems above, and give some open problems.
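    For concreteness, the following is a minimal sketch of the randomized algorithm whose derandomization is problem (2): the Schwartz-Zippel test evaluates the circuit at random points of a large field and declares the polynomial nonzero if any evaluation is nonzero. (The circuit interface and parameters are illustrative, not from the talk.)

        import random

        def probably_zero(poly, num_vars, degree_bound, trials=20, prime=2**61 - 1):
            # Schwartz-Zippel: a nonzero polynomial of degree <= degree_bound vanishes at a
            # uniformly random point mod a prime with probability <= degree_bound / prime.
            for _ in range(trials):
                point = [random.randrange(prime) for _ in range(num_vars)]
                if poly(point, prime) != 0:
                    return False      # witness found: definitely not the zero polynomial
            return True               # zero polynomial, except with tiny error probability

        # Example circuit: det of a 2x2 matrix minus its cofactor expansion -- identically zero.
        def example(x, p):
            a, b, c, d = x
            return ((a * d - b * c) - (a * d - b * c)) % p

        print(probably_zero(example, num_vars=4, degree_bound=2))   # True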

October 9, 2018
Ben Lee Volk
Algebraic Complexity Theory: Lower bounds and Derandomization (2nd of 2 parts)

October 16, 2018
Alessandro Zocca
Hard-core interactions on graphs
    In this talk I will introduce the hard-core model with Metropolis transition probabilities on finite graphs and its multi-state generalization. Motivated by the study of random-access networks performance, I will focus on the low-temperature regime and study the asymptotic behavior of these interacting particle systems by looking at first hitting times between stable states and mixing times. In particular, I will show how the order-of-magnitude of these hitting and mixing times depends on the underlying graph structure and derive precise asymptotics for various regular lattices. These results were obtained by extending the so-called "pathwise approach" developed in the statistical physics literature to study metastability phenomena, and they yield a rigorous understanding of the root cause of the poor delay performance of random-access networks.
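    For reference, here is a minimal sketch (illustrative, not the talk's exact setup) of single-site Metropolis dynamics for the hard-core model at activity lam, whose stationary distribution weights an independent set I proportionally to lam^|I|; the low-temperature regime corresponds to large lam.

        import random

        def metropolis_hard_core_step(adj, config, lam):
            # One Metropolis update: pick a vertex uniformly at random, propose to flip its
            # occupancy, and accept with the Metropolis probability; any move that would
            # occupy two adjacent vertices is rejected outright.
            v = random.randrange(len(adj))
            if config[v] == 1:
                if random.random() < min(1.0, 1.0 / lam):       # removal
                    config[v] = 0
            elif all(config[u] == 0 for u in adj[v]):
                if random.random() < min(1.0, lam):             # addition, if allowed
                    config[v] = 1
            return config

        # Toy example: a 4-cycle, whose two "checkerboard" independent sets play the role
        # of the stable states between which the dynamics tunnels slowly for large lam.
        adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
        config = [0, 0, 0, 0]
        for _ in range(10000):
            config = metropolis_hard_core_step(adj, config, lam=20.0)
        print(config)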

October 23, 2018
Alessandro Achille (UCLA)
Complexity, Noise, and Emergent Properties of Learning Deep Representations
    I will show that Information Theoretic quantities control and describe a large part of the training process of Deep Neural Networks, and can be used to explain how properties, such as invariance to nuisance variability and disentanglement of semantic factors, emerge naturally in the learned representation. The resulting theory has connections with several fields ranging from algorithmic complexity to variational inference. This framework not only predicts the asymptotic behavior of deep networks, but also shows that the initial learning transient has a large irreversible effect on the outcome of the training, which gives rise to critical learning periods akin to those observed in biological systems. This urges us to study the complex, and so far neglected, initial phase of learning.

October 30, 2018
Omer Tamuz
Combining and Comparing Experiments
    A classical statistical model of hypothesis testing is the Blackwell-Le Cam experiment, in which an observer makes inferences regarding the veracity of a hypothesis by observing a random outcome whose distribution depends on whether or not the hypothesis holds. We address a number of natural, classical questions: When does one experiment have more information than another? How can we quantify the amount of information an experiment holds? Our results include an answer to an old question of Blackwell, as well as a novel axiomatization of Kullback-Leibler divergence.
    Joint work with Xiaosheng Mu, Luciano Pomatto and Philipp Strack.

November 27, 2018
Rasmus Kyng (Harvard)
Approximate Gaussian Elimination for Laplacians
    We show how to perform sparse approximate Gaussian elimination for Laplacian matrices. We present a simple, nearly linear time algorithm that approximates a Laplacian matrix by a sparse LU factorization. We compute this factorization by subsampling standard Gaussian elimination. This gives the simplest known nearly linear time solver for Laplacian equations. The crux of our proof is the use of matrix martingales to analyze the algorithm. Finally, we will take a look at ongoing efforts to implement robust and fast Laplacian solvers based on these ideas.
    Based on joint work with Sushant Sachdeva and Daniel Spielman.
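    The fact the algorithm exploits can be seen directly: eliminating a vertex from a Laplacian leaves another Laplacian (the Schur complement) in which the eliminated vertex's neighbors have become a weighted clique; the sparse factorization of the talk comes from subsampling those clique edges. Below is a sketch of one exact (unsampled) elimination step; the example graph is illustrative.

        import numpy as np

        def eliminate_vertex(L, v):
            # Exact Gaussian elimination of vertex v from a Laplacian L: the Schur
            # complement is again a Laplacian on the remaining vertices.
            keep = [i for i in range(L.shape[0]) if i != v]
            col = L[keep, v][:, None]
            return L[np.ix_(keep, keep)] - col @ col.T / L[v, v]

        # Laplacian of a path on 4 vertices.
        A = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
        L = np.diag(A.sum(axis=1)) - A

        S = eliminate_vertex(L, 0)
        print(S)                                  # symmetric, nonpositive off-diagonal
        print(np.allclose(S.sum(axis=1), 0))      # zero row sums: still a Laplacian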

December 4, 2018
Jan Hazla (MIT)
Reasoning in Bayesian Opinion Exchange Networks is Computationally Hard
    Bayesian models of opinion exchange are extensively studied in economics, dating back to the work of Aumann on the agreement theorem. An important class of such models features agents arranged on a network (representing, e.g., social interactions), with the network structure determining which agents communicate with each other. It is often argued that the Bayesian computations needed by agents in such models are difficult, but prior to our work there were no rigorous arguments for such hardness.
    We consider a well-studied model where fully rational agents receive private signals indicative of an unknown state of the world. Then, they repeatedly announce the state of the world they consider most likely to their neighbors, at the same time updating their beliefs based on their neighbors' announcements.
    I will discuss our complexity-theoretic results establishing hardness of agents' computations in this model. Specifically, we show that these computations are NP-hard and extend this result to PSPACE-hardness. We show hardness not only for exact computations, but also that it is computationally difficult even to approximate the rational opinion in any meaningful way.
    Joint work with Ali Jadbabaie, Elchanan Mossel and Amin Rahimian.

January 22, 2019
Gautam Kamath
Privately Learning High-Dimensional Distributions
    We present novel, computationally efficient, and differentially private algorithms for two fundamental high-dimensional learning problems: learning a multivariate Gaussian in R^d and learning a product distribution in {0,1}^d in total variation distance. The sample complexity of our algorithms nearly matches the sample complexity of the optimal non-private learners for these tasks in a wide range of parameters. Thus, our results show that privacy comes essentially for free for these problems, providing a counterpoint to the many negative results showing that privacy is often costly in high dimensions. Our algorithms introduce a novel technical approach to reducing the sensitivity of the estimation procedure that we call recursive private preconditioning, which may find additional applications.
    Based on joint work with Jerry Li, Vikrant Singhal, and Jonathan Ullman.
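    The recursive private preconditioning of the paper is beyond a short sketch, but a basic primitive in this area (not the paper's algorithm) is easy to state: clip the samples to a ball and add Gaussian noise scaled to the sensitivity of the clipped average. A minimal sketch with illustrative parameters:

        import numpy as np

        def private_mean(samples, radius, eps, delta, rng=None):
            # (eps, delta)-differentially private mean: clip each sample to an L2 ball,
            # average, and add Gaussian noise calibrated to the sensitivity 2*radius/n
            # of the clipped average under replacement of one sample.
            rng = rng or np.random.default_rng(0)
            n, d = samples.shape
            norms = np.maximum(np.linalg.norm(samples, axis=1, keepdims=True), 1e-12)
            clipped = samples * np.minimum(1.0, radius / norms)
            sigma = (2.0 * radius / n) * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
            return clipped.mean(axis=0) + rng.normal(0.0, sigma, size=d)

        data = np.random.default_rng(1).normal(loc=1.0, size=(10000, 5))
        print(private_mean(data, radius=10.0, eps=1.0, delta=1e-6))   # close to (1, ..., 1)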

February 26, 2019
    CMI Special Lunch Seminar
            ★ 12 noon ★
            ★ Room 213 ★

Brent Waters (UT Austin)
Cryptographic Code Obfuscation
    A cryptographic code obfuscator takes as input the description of a program P and outputs a program P' that is functionally equivalent to the original, but should hide the inner workings of the original program to the maximal extent possible. Until very recently no general purpose obfuscators existed that were based on cryptographically hard problems; however, in 2013 researchers proposed the first candidate obfuscator for "indistinguishability obfuscation". Since then there has been a tremendous interest in the subject from the cryptography community.
    In this talk I will first introduce the concept and define indistinguishability obfuscation. Then I will show techniques for building cryptographic applications from it. Finally, I will conclude by discussing the challenging open problems in the area.

February 26, 2019
Miles Lopes (UC Davis)
Two New Bootstrap Methods for High-Dimensional and Large-Scale Data
    Bootstrap methods are among the most broadly applicable tools for statistical inference and uncertainty quantification. Although these methods have an extensive literature, much remains to be understood about their applicability in modern settings, where observations are high-dimensional, or where the quantity of data outstrips computational resources. In this talk, I will present a couple of new bootstrap methods that are tailored to these settings. First, I will discuss the topic of "spectral statistics" arising from high-dimensional sample covariance matrices, and describe a method for approximating the distributions of such statistics. Second, in the context of large-scale data, I will discuss a more unconventional application of the bootstrap -- dealing with the tradeoff between accuracy and computational cost for randomized numerical linear algebra. This talk includes joint work from a paper with Alexander Aue and Andrew Blandino, https://arxiv.org/abs/1709.08251 (to appear in Biometrika), and a paper with Michael Mahoney and Shusen Wang, https://arxiv.org/abs/1708.01945 (to appear in JMLR).
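    As background, the nonparametric bootstrap itself is a one-liner: resample the observations with replacement and recompute the statistic. The sketch below applies it to a simple spectral statistic, the largest eigenvalue of the sample covariance matrix (an illustration only, not the tailored methods of the talk).

        import numpy as np

        def bootstrap_top_eigenvalue(X, B=500, rng=None):
            # Approximate the sampling distribution of the top eigenvalue of the
            # sample covariance matrix by resampling the rows of X with replacement.
            rng = rng or np.random.default_rng(0)
            n = X.shape[0]
            stats = np.empty(B)
            for b in range(B):
                idx = rng.integers(0, n, size=n)
                stats[b] = np.linalg.eigvalsh(np.cov(X[idx], rowvar=False))[-1]
            return stats

        X = np.random.default_rng(1).normal(size=(200, 10))
        print(np.percentile(bootstrap_top_eigenvalue(X), [2.5, 97.5]))   # bootstrap interval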

March 5, 2019
★ NOTE: will be held in
Hall Auditorium (135 Gates Thomas) ★
Gitta Kutyniok (TU Berlin)
Approximation Theory meets Deep Learning
    Despite the outstanding success of deep neural networks in real-world applications, most of the related research is empirically driven and a mathematical foundation is almost completely missing. One central task of a neural network is to approximate a function, which for instance encodes a classification task. In this talk, we will be concerned with the question of how well a function can be approximated by a neural network with sparse connectivity. Using methods from approximation theory and applied harmonic analysis, we will derive a fundamental lower bound on the sparsity of a neural network. By explicitly constructing neural networks based on certain representation systems, so-called $\alpha$-shearlets, we will then demonstrate that this lower bound can in fact be attained. Finally, we present numerical experiments, which surprisingly show that the standard backpropagation algorithm already generates deep neural networks obeying those optimal approximation rates.

March 12, 2019
Nisheeth Vishnoi (Yale)
Towards Controlling Bias in AI Systems
    Powerful AI systems, which are driven by machine learning tools, are increasingly controlling various aspects of modern society: from social interactions (e.g., Facebook, Twitter, Google, YouTube), economics (e.g., Uber, Airbnb, Banking), learning (e.g., Wikipedia, MOOCs), to governance (Judgements, Policing, Voting). These systems have a tremendous potential to change our lives for the better, but, via the ability to mimic and nudge human behavior, they also have the potential to be discriminatory, reinforce societal prejudices, and polarize opinions. Indeed, recent studies have demonstrated that these systems can be quite brittle and generally lack the required qualities to be deployed in various human-centric/societal contexts. The reason is that considerations such as fairness, explainability, and accountability have largely been an afterthought in the development of such AI systems.
    In this talk, I will outline our efforts towards incorporating some of the above-mentioned issues in a principled manner for core machine learning tasks such as classification, data summarization, ranking, personalization, and online advertisement. Our work leads to new algorithms that can control and alleviate bias in their outputs, come with provable guarantees, and often have a low "price of fairness".
    Based on several joint works with Elisa Celis.

May 7, 2019
         ★ Room 213 ★
Alistair Sinclair (UC Berkeley)
The Lee-Yang Theorem and the Ising Model
    The celebrated Lee-Yang Theorem of the 1950s says that the zeros of the partition function of the ferromagnetic Ising model (viewed as a polynomial in the field parameter) lie on the unit circle in the complex plane. In this talk I will discuss a recent revival of interest in this result, inspired by computational questions. I will discuss three developments. First, I will explain how a generalization of the Lee-Yang theorem to rule out repeated zeros leads to hardness results for computing averages of observables associated with the Ising model. Then I will show how to combine the theorem with recent technology of Barvinok and others to obtain the first polynomial time deterministic approximation algorithms for the partition function. And finally I will discuss the relationship of the theorem to decay of correlations. The talk will be self-contained.
    This is joint work with Jingcheng Liu and Piyush Srivastava.
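    The Lee-Yang statement is easy to check numerically on a toy instance. In one standard parameterization, the ferromagnetic partition function, written as a polynomial in the field variable lam = exp(-2h) with b = exp(-2*beta) in (0,1), has all of its zeros on the unit circle; the small example below (graph and parameters chosen for illustration) verifies this.

        import itertools
        import numpy as np

        def ising_field_polynomial(edges, n, b):
            # Coefficients (in lam = exp(-2h)) of Z, up to a positive constant:
            # Z is proportional to the sum over vertex subsets S of  b^cut(S) * lam^|S|,
            # where S is the set of down-spins and cut(S) counts disagreeing edges.
            coeffs = np.zeros(n + 1)
            for bits in itertools.product([0, 1], repeat=n):
                cut = sum(1 for (u, v) in edges if bits[u] != bits[v])
                coeffs[sum(bits)] += b ** cut
            return coeffs

        edges = [(0, 1), (1, 2), (2, 3), (3, 0)]            # ferromagnetic Ising on a 4-cycle
        coeffs = ising_field_polynomial(edges, n=4, b=0.4)  # b = exp(-2*beta) in (0, 1)
        zeros = np.roots(coeffs[::-1])                      # np.roots wants highest degree first
        print(np.abs(zeros))                                # all equal to 1, as Lee-Yang predicts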

May 14, 2019
         ★ Room 213 ★
Adam Smith (Boston University)
The Structure of Optimal Private Tests for Simple Hypotheses
    Hypothesis testing plays a central role in statistical inference, and is used in many settings where privacy concerns are paramount. This work answers a basic question about testing simple hypotheses subject to a stability, or "privacy", constraint: given two distributions P and Q, and a privacy level e, how many i.i.d. samples are needed to distinguish P from Q by a test whose acceptance probability changes by at most e when one of the samples is changed? What sort of tests have optimal sample complexity? We characterize this sample complexity up to constant factors in terms of the structure of P and Q and the privacy level e, and show that this sample complexity is achieved by a certain randomized and clamped variant of the log-likelihood ratio test. Our result is an analogue of the classical Neyman-Pearson lemma in the setting of private hypothesis testing. Our analysis also sheds light on classical hypothesis testing, giving a new interpretation of the Hellinger distance between distributions.
    Joint work with Clément Canonne, Gautam Kamath, Audra McMillan, and Jonathan Ullman. To appear in STOC 2019. Available at https://arxiv.org/abs/1811.11148
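    In the spirit of the result (though not the exact test analyzed in the paper), a clamped log-likelihood-ratio statistic can be privatized with the Laplace mechanism: clamping each per-sample contribution to [-c, c] bounds the sensitivity of the sum by 2c, so adding Laplace noise of scale 2c/e makes the decision e-differentially private. A minimal sketch:

        import numpy as np

        def private_llr_test(samples, log_p, log_q, clamp, eps, threshold=0.0, rng=None):
            # Clamp each per-sample log-likelihood ratio to [-clamp, clamp], sum, and add
            # Laplace noise of scale 2*clamp/eps; replacing one sample moves the clamped
            # sum by at most 2*clamp, so the noisy decision is eps-DP (post-processing).
            rng = rng or np.random.default_rng(0)
            llrs = np.clip(log_p(samples) - log_q(samples), -clamp, clamp)
            stat = llrs.sum() + rng.laplace(scale=2.0 * clamp / eps)
            return "P" if stat > threshold else "Q"

        # Toy example: P = Bernoulli(0.6), Q = Bernoulli(0.4), data drawn from P.
        log_p = lambda x: np.where(x == 1, np.log(0.6), np.log(0.4))
        log_q = lambda x: np.where(x == 1, np.log(0.4), np.log(0.6))
        data = (np.random.default_rng(1).random(500) < 0.6).astype(int)
        print(private_llr_test(data, log_p, log_q, clamp=1.0, eps=0.5))   # "P"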

May 21, 2019
         ★ Room 213 ★
Vijay Vazirani (UCI)
Matching is as Easy as the Decision Problem, in the NC Model
    Is matching in NC, i.e., is there a deterministic fast parallel algorithm for it? This has been an outstanding open question in TCS for over three decades, ever since the discovery of Random NC matching algorithms. Over the last five years, the TCS community has launched a relentless attack on this question, leading to the discovery of numerous powerful ideas. We give what appears to be the culmination of this line of work:  An NC algorithm for finding a minimum weight perfect matching in a general graph with polynomially bounded edge weights, provided it is given an oracle for the decision problem. Consequently, for settling the main open problem, it suffices to obtain an NC algorithm for the decision problem.
    We believe this new fact has qualitatively changed the nature of this open problem.
    Our result builds on the work of Anari and Vazirani (2018), which used planarity of the input graph critically; in fact, in three different ways. Our main challenge was to adapt these steps to general graphs by appropriately trading planarity with the use of the decision oracle. The latter was made possible by the use of several of the ideas discovered over the last five years.
    The difficulty of obtaining an NC perfect matching algorithm led researchers to study matching vis-a-vis clever relaxations of the class NC. In this vein, Goldwasser and Grossman (2015) gave a pseudo-deterministic RNC algorithm for finding a perfect matching in a bipartite graph, i.e., an RNC algorithm with the additional requirement that on the same graph, it should return the same (i.e., unique) perfect matching for almost all choices of random bits. A corollary of our reduction is an analogous algorithm for general graphs.
    This talk is fully self-contained.
    Based on the following joint paper with Nima Anari: https://arxiv.org/pdf/1901.10387.pdf

2017-18 CMI Seminars

October 24, 2017
Rohit Gurjar
Derandomization: Polynomial Identity Testing and Isolation Lemma
(Part 1 of 2)
    Derandomization is one of the central questions in the theory of computing. The question asks whether all problems with efficient randomized algorithms also have deterministic algorithms with comparable time complexity. The question also has connections with circuit lower bounds.
    One of the randomized solutions for PIT goes via the Isolation Lemma. The lemma states that given a family F of subsets of a set E, assigning random weights to the elements of E ensures that, with high probability, there is a unique minimum-weight set in the family F.
    Derandomizing the Isolation Lemma means constructing such weights in a deterministic way. One of the many applications of PIT and Isolation Lemma is a randomized parallel algorithm for the graph matching problem. Recently, we gave a geometric approach for derandomizing the Isolation Lemma. Applying it to a specific case gives a deterministic parallel algorithm for bipartite matching.
    The first talk will give an introduction to these topics. In the second talk, I will describe our isolation approach, which works for totally unimodular polytopes. I will end with an open question about short vectors in lattices, whose answer determines the settings in which our isolation approach works.
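    The Isolation Lemma is easy to check empirically. A minimal sketch (with an arbitrary illustrative family): drawing integer weights uniformly from {1, ..., 2|E|} isolates a unique minimum-weight set with probability at least 1/2.

        import itertools
        import random

        def has_unique_min(family, weights):
            # True if the family has a unique minimum-weight set under the given weights.
            totals = sorted(sum(weights[e] for e in s) for s in family)
            return len(totals) == 1 or totals[0] < totals[1]

        ground = list(range(6))
        family = [frozenset(c) for c in itertools.combinations(ground, 3)]   # all 3-subsets

        trials, isolated = 2000, 0
        for _ in range(trials):
            weights = {e: random.randint(1, 2 * len(ground)) for e in ground}
            isolated += has_unique_min(family, weights)
        print(isolated / trials)   # empirically well above the guaranteed 1/2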

October 31, 2017
Rohit Gurjar
Derandomization: Polynomial Identity Testing and Isolation Lemma
(Part 2 of 2)

November 14, 2017
Netanel Raviv
From Network Coding to Sidon Spaces
    Network coding is an efficient method for data transmission in networks. However, in many networks even a single hardware failure might render an entire transmission unreadable. Hence, in order to combat errors and erasures in network coding, subspace codes were introduced.
    A subspace code is a family of subspaces over a finite field that intersect mutually in a small dimension. Since the set of all linear subspaces lacks a linear structure, cyclic subspace codes were defined through the natural action of the group of invertible matrices.
    In this talk, several constructions of cyclic subspace codes will be presented. Particular attention will be given to Sidon spaces, a newly defined algebraic notion, and their potential applications in post-quantum cryptography will be discussed.

November 28, 2017
Netanel Raviv
Gradient coding - When coding theory and distributed computing meet
    Gradient descent and its variants are popular methods for solving empirical risk minimization problems in machine learning. However, if the size of the training set is large, the computation of the gradient becomes a bottleneck, and hence it is common to distribute the training set among worker nodes. Doing this in a synchronous fashion faces yet another challenge: stragglers (i.e., slow or unavailable nodes), which might cause considerable delay. Hence, schemes for mitigating stragglers are essential.
    It was recently shown by Tandon et al. that stragglers can be avoided by carefully assigning redundant computations to the worker nodes and coding across partial gradients, and a randomized construction for the coding matrix was given. In this talk we present a more efficient deterministic scheme by employing cyclic MDS codes over the real or complex field. In addition, we propose to replace the exact computation of the gradient with an approximate one; a technique which drastically increases the straggler tolerance, and stems from adjacency matrices of expander graphs.

February 13, 2018
Richard Kueng
Spherical t-designs as a general purpose tool for partial de-randomization
(Part 1 of 2)
     Spherical t-designs are ensembles of vectors which reproduce the first 2t moments of the uniform distribution on the complex unit sphere. As such they provide notions of "evenly distributed" sets of vectors that range from tight frames (t=1) to the full sphere (as t approaches infinity). Such configurations have been studied in algebraic combinatorics, coding theory, and quantum information. Similar to t-wise independent functions, they are a general purpose tool for (partial) derandomization. To emphasize this, we will consider two applications:
    i.) Optimally distinguishing quantum states: The optimal probability of correctly distinguishing two (biased) coins with a single coin toss is proportional to the total variation distance. This classical result has a quantum mechanical analogue: the optimal probability of correctly distinguishing two quantum states is proportional to the trace distance. In contrast to the classical result, achieving this bound requires choosing a particular type of quantum measurement that depends on the states to be distinguished. It is therefore natural to ask which universal quantum measurements are optimal in the sense that they perform well at distinguishing arbitrary pairs of quantum states. We will review pioneering work by Ambainis and Emerson, who showed that spherical 4-designs have this property.
    ii.) Phase retrieval: The problem of retrieving phase information from amplitude measurements alone has appeared in many scientific disciplines over the last century. More recently, several new recovery algorithms have been proposed and rigorous performance guarantees have been established. The strongest results of this type are probabilistic in nature and require measurements that are chosen uniformly from the complex unit sphere. We will show that this result can be (partially) derandomized: choosing measurements independently from a spherical 4-design already allows for drawing similar conclusions.
    From a practical point of view, the usefulness of these concepts hinges on the availability of constructions for spherical designs. Despite non-constructive existence proofs and approximate randomized constructions, exact families of spherical t-designs are notoriously difficult to find. We will present a family of structured vectors that form spherical 3-designs. These vectors, called stabilizer states, have a rich combinatorial structure and are ubiquitous in quantum information theory. What is more, they perform almost as well as spherical 4-designs in various applications.
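    The defining property can be tested numerically via a standard equivalent characterization, the frame potential: a set of unit vectors in C^d is a t-design exactly when the average of |<x,y>|^(2t) over ordered pairs equals 1/C(d+t-1, t), the value attained by the uniform (Haar) measure. The sketch below (illustrative code, not from the talk) checks the final claim in the smallest case, the six single-qubit stabilizer states.

        import numpy as np
        from math import comb

        # The six single-qubit stabilizer states: eigenstates of the Pauli X, Y, Z matrices.
        s = 1 / np.sqrt(2)
        states = [np.array(v, dtype=complex) for v in
                  [[1, 0], [0, 1], [s, s], [s, -s], [s, 1j * s], [s, -1j * s]]]

        def is_t_design(vectors, t, d):
            # Frame-potential test: equality with 1/C(d+t-1, t) characterizes t-designs.
            pot = np.mean([abs(np.vdot(x, y)) ** (2 * t) for x in vectors for y in vectors])
            return np.isclose(pot, 1.0 / comb(d + t - 1, t))

        print(is_t_design(states, t=3, d=2))   # True: stabilizer states form a 3-design
        print(is_t_design(states, t=4, d=2))   # False: they are not quite a 4-design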

February 20, 2018
Richard Kueng
Spherical t-designs as a general purpose tool for partial de-randomization
(Part 2 of 2)
    
    

2016-17: not held


2015-16 CMI Seminars

October 27, 2015
Piyush Srivastava
Stability of Causal Inference
    Consider a system consisting of a binary "stimulus" variable X (e.g., whether an individual consumes tobacco), a binary "response" variable Y (e.g., whether the individual develops a particular disease) and a "hidden" confounding variable U (e.g., a hypothetical genetic factor that affects the individual's propensity to both consume tobacco as well as to develop the disease). The system can be modeled by a directed graphical model with X, Y and U as nodes, and U → X, U → Y and X → Y as edges. Given P, the observed probability distribution of the pair (X, Y), we ask: what is the causal effect of X on Y? In the early 1990s, Pearl proposed that an appropriate formalization of the above question is to compute the conditional probability of Y given X in a different graphical model in which the incoming edges to X have been deleted, given only the observed distribution P on the original model. (In the above example, this new model will only have the edges X → Y and U → Y.) This deletion simulates an experimental intervention on X while not making any changes to the other interactions in the model. In the general case, the graphical model can comprise several other nodes, and X and Y themselves might be disjoint subsets of these nodes instead of being singleton vertices.
    It is not hard to see that the above problem is not solvable for all graphical models and all pairs X and Y of subsets of nodes. However, a long line of work by several researchers culminated in 2006 in a complete algorithmic characterization of graphical models in which the problem is solvable. Further, this characterization also included an algorithmic procedure which takes the observed distribution and outputs the requisite conditional distribution in those cases where the problem is solvable [Huang and Valtorta, 2006 and Shpitser and Pearl, 2006].
    These characterizations assumed an exact knowledge of both the underlying graphical model and the observed distribution P. However, in practice, we may expect imprecision both in our description of the model (which may arise from failing to include an edge that should be present), and in our knowledge of the observed distribution P (which may arise due to round-off errors in storage and due to statistical noise). In our work, we study the stability of the above causal inference procedures with respect to such imprecisions in the input.
    On the positive side, we show that the effect of removing an edge on the accuracy of these procedures is not large, as long as the removed edges are themselves "weak"; in particular, the error introduced in the output is proportional to the "strength" of the edge for a natural measure of strength. We also show that examples solvable by some sound but "incomplete" algorithms that preceded the complete algorithm described above are well-conditioned with respect to errors in the observed distribution P.
    On the negative side, we construct examples solvable by the complete algorithm described above that are remarkably ill-conditioned. More formally, we give a sequence of instances such that for any algorithm that solves the causal inference problem on an instance with n nodes, the relative error in the output is larger than the relative error in the input by a factor that is at least exp(Ω(n^{1/3})).
    Joint work with Leonard J. Schulman.

November 13, 2015
NOTE: 12:00 noon in Annenberg 314
Gil Cohen
Two-Source Dispersers for Polylogarithmic Entropy and Improved Ramsey Graphs
(part 1 of 2)
    In his 1947 paper that inaugurated the probabilistic method, Erdős proved the existence of 2 log(n)-Ramsey graphs on n vertices. Matching Erdős' result with a constructive proof is a central problem in combinatorics that has gained significant attention in the literature. In this talk we will present recent work towards this goal (http://eccc.hpi-web.de/report/2015/095/).
    No prior knowledge will be assumed.

November 24, 2015
Gil Cohen
Two-Source Dispersers for Polylogarithmic Entropy and Improved Ramsey Graphs
(part 2 of 2)

December 1, 2015
Madeleine Udell
Generalized Low Rank Models
(part 1 of 2)
    Generalized Low Rank Models (GLRM) extend the idea of Principal Components Analysis (PCA) to embed arbitrary data tables consisting of numerical, Boolean, categorical, ordinal, and other data types into a low dimensional space. This framework encompasses many well known techniques in data analysis, such as nonnegative matrix factorization, matrix completion, sparse and robust PCA, k-means, k-SVD, and maximum margin matrix factorization, and leads to coherent schemes for compressing, denoising, and imputing missing entries across all data types simultaneously.
    In these two talks, we introduce the GLRM framework with an eye towards open problems, spending time at the interface between practical large scale data analysis and statistical and optimization theory. We'll discuss     

  • what can you model as a GLRM?
  • what can they be used for?
  • when can we trust them?
  • how can we fit them?
  • what kinds of information or control might be needed to do better?

    The first talk in the series focuses on understanding the modeling flexibility and large scale data analysis applications of low rank models, and on connecting theoretical statistical results with the practical setting of data analysis.
    The second talk delves further into optimization theory, and into recent results using these ideas in a "bandit" setting in which data table entries can be actively queried.
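    To make the alternating structure concrete, the simplest member of the family, quadratic loss with quadratic regularization, is fit by alternating ridge updates for the two factors; heterogeneous losses and other regularizers keep the same alternating pattern. A minimal sketch under those assumptions (not the talks' general framework):

        import numpy as np

        def fit_quadratic_glrm(A, rank, reg=0.1, iters=50, rng=None):
            # Alternating minimization for  min_{X,Y} ||A - X Y||_F^2 + reg (||X||_F^2 + ||Y||_F^2),
            # the quadratic-loss special case of a generalized low rank model.
            rng = rng or np.random.default_rng(0)
            m, n = A.shape
            X = rng.normal(size=(m, rank))
            Y = rng.normal(size=(rank, n))
            for _ in range(iters):
                Y = np.linalg.solve(X.T @ X + reg * np.eye(rank), X.T @ A)     # ridge update for Y
                X = np.linalg.solve(Y @ Y.T + reg * np.eye(rank), Y @ A.T).T   # ridge update for X
            return X, Y

        A = np.random.default_rng(1).normal(size=(100, 30))
        X, Y = fit_quadratic_glrm(A, rank=5)
        print(np.linalg.norm(A - X @ Y) / np.linalg.norm(A))   # relative reconstruction error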

December 8, 2015
Madeleine Udell
Generalized Low Rank Models
(part 2 of 2)
NOTE: location change to Annenberg 314

March 29, 2016
Hu Fu
Towards simple, robust and approximately optimal auctions
(part 1 of 2)
    In these two talks I will describe recent developments in the design and analysis of simple and robust auctions that provably extract near-optimal revenue. In the first talk I will cover the basics of optimal auction design and give several proofs of Myerson's seminal result from 1981, offering different perspectives on this classic result. Some of the proofs lead to recent generalizations. In the second talk I will discuss some recent work on prior-independent and simple auction formats.
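    The special case of Myerson's result most often quoted: with i.i.d. values uniform on [0,1], the revenue-optimal single-item auction is a second-price auction with reserve 1/2, the zero of the virtual value 2v - 1. A small simulation (illustrative, two bidders) makes the point:

        import numpy as np

        def second_price_revenue(values, reserve):
            # Simulated revenue of a second-price auction with a reserve: the item sells
            # iff the highest bid clears the reserve, at max(reserve, second-highest bid).
            top = np.sort(values, axis=1)[:, -2:]
            sells = top[:, 1] >= reserve
            return np.mean(np.where(sells, np.maximum(top[:, 0], reserve), 0.0))

        values = np.random.default_rng(0).random(size=(200000, 2))   # two bidders, U[0, 1]
        for r in [0.0, 0.25, 0.5, 0.75]:
            print(r, second_price_revenue(values, r))
        # Revenue is about 1/3 with no reserve and peaks near 5/12 at r = 0.5,
        # where the virtual value 2v - 1 crosses zero.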

April 5, 2016
Hu Fu
Towards simple, robust and approximately optimal auctions
(part 2 of 2)
    

April 12, 2016
Benny Chor
Conservation Profiles, Functional Enrichment, and the Tree of Life
    Analysis of proteins and genes based solely on their sequences was one of the tools employed successfully in the early days of bioinformatics. Conservation of a gene or protein across different species is usually defined as the corresponding sequences being highly similar. Such sequence conservation usually implies a similar function.
    In this work, we explore conservation, based on a very weak notion: the existence of a minimal similarity among sequences. We study this question in the context of 18 species, including mammals and insects. We show that even this minimal requirement yields interesting observations relating conservation, function, and the evolutionary history as represented by the tree of life.
    The main tools we use are enrichment analysis, based on the hypergeometric distribution, and clustering.
    The talk will be self contained.
    Joint work with Jonathan Witztum, Erez Persi, David Horn, and Metsada Pasmanik-Chor.
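    The enrichment test mentioned above is a one-line hypergeometric tail computation. A minimal sketch with made-up numbers:

        from scipy.stats import hypergeom

        def enrichment_pvalue(N, K, n, k):
            # P(overlap >= k) when n genes are drawn without replacement from N genes,
            # K of which belong to the functional category (hypergeometric upper tail).
            return hypergeom.sf(k - 1, N, K, n)

        # Illustrative numbers: 18000 genes, a category of size 300, a conserved set of
        # 500 genes, 25 of which fall in the category (about 8 expected by chance).
        print(enrichment_pvalue(N=18000, K=300, n=500, k=25))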

April 26, 2016
Anand Kumar Narayanan
Computing Isomorphisms between Finite Fields using Elliptic Curves
(part 1 of 2)
    Given a prime power p^n, a finite field of cardinality p^n may be constructed as the quotient of the ring of polynomials over the prime order field Z/pZ modulo an irreducible polynomial of degree n. This construction, however, is non-canonical, for all known (unconditional) algorithms for finding irreducible polynomials are randomized. Irrespective of the choice of irreducible polynomial, one obtains the "same" field since two finite fields of the same cardinality are isomorphic. This poses the following algorithmic problem in several applications: Given two irreducible polynomials of degree n over Z/pZ, compute an isomorphism between the respective finite fields obtained modulo them.
    We begin by surveying the fastest known (randomized) algorithm for the problem, which has run time quadratic in n. We then present a new (randomized) algorithm based on elliptic curve isogenies with run time subquadratic in n for most n. The crux of our approach is finding pre-images under the Lang map on elliptic curves over finite fields. We conclude by posing an open computational problem concerning Lang maps whose resolution would solve the isomorphism problem with run time linear in n. The discussion will be self-contained and familiarity with elliptic curves, while helpful, is not required.
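    For intuition only (not the quadratic-time or isogeny-based algorithms of the talk), one naive approach is to find a root of the first irreducible polynomial inside the field defined by the second; the isomorphism sends the generator to that root. For tiny parameters the root can be found by brute force, as in this sketch for p = 3, n = 2 with illustrative polynomials:

        def mulmod(a, b, g, p):
            # Multiply two elements of F_p[y]/(g); coefficient lists are low-degree first,
            # g is monic of degree n, and elements have length n.
            n = len(g) - 1
            prod = [0] * (2 * n - 1)
            for i in range(n):
                for j in range(n):
                    prod[i + j] = (prod[i + j] + a[i] * b[j]) % p
            for i in range(len(prod) - 1, n - 1, -1):   # use y^n = -(g[0] + ... + g[n-1] y^(n-1))
                c, prod[i] = prod[i], 0
                for j in range(n):
                    prod[i - n + j] = (prod[i - n + j] - c * g[j]) % p
            return prod[:n]

        def evaluate(poly, elem, g, p):
            # Horner evaluation of a polynomial over F_p at an element of F_p[y]/(g).
            acc = [0] * (len(g) - 1)
            for c in reversed(poly):
                acc = mulmod(acc, elem, g, p)
                acc[0] = (acc[0] + c) % p
            return acc

        p = 3
        f = [1, 0, 1]   # x^2 + 1,     irreducible over F_3
        g = [2, 1, 1]   # y^2 + y + 2, irreducible over F_3

        # Find a root of f in F_3[y]/(g); the isomorphism sends x to this root.
        root = next(e for e in ([a, b] for a in range(p) for b in range(p))
                    if evaluate(f, e, g, p) == [0, 0])

        def phi(elem):   # image of c0 + c1*x under the isomorphism
            return [(elem[0] + elem[1] * root[0]) % p, (elem[1] * root[1]) % p]

        elements = [[i, j] for i in range(p) for j in range(p)]
        print(root, all(phi(mulmod(a, b, f, p)) == mulmod(phi(a), phi(b), g, p)
                        for a in elements for b in elements))   # [1, 2] True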

May 3, 2016
Anand Kumar Narayanan
Computing Isomorphisms between Finite Fields using Elliptic Curves
(part 2 of 2)

2014-15 CMI Seminars

April 7 & 14, 2015
Enrique Mallada
Nonlinear Linear Systems Analysis Applied to Networked Dynamical Systems
    This two-part seminar will overview several traditional and novel analysis tools for stability and control of nonlinear dynamical systems. Topics to be covered include Lyapunov theory, passivity and monotonicity, as well as their differential versions. Emphasis will be placed on the limitations of each technique and how these limitations affect its range of application.
    Throughout the talks I will describe the importance of these tools in the study of networked dynamical systems. In particular, I will use a canonical model for synchronization of networked oscillators as a motivating example, and describe how to apply these tools in their study. Along the way, I will highlight a few common mistakes and possible workarounds.

Feb 24, 2015
Madeleine Udell
Generalized Low Rank Models
    Principal components analysis (PCA) is a well-known technique for approximating a data set represented by a low rank matrix. Here, we extend the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types. This framework encompasses many well known techniques in data analysis, such as nonnegative matrix factorization, matrix completion, sparse and robust PCA, k-means, k-SVD, and maximum margin matrix factorization. The method handles heterogeneous data sets, and leads to coherent schemes for compressing, denoising, and imputing missing entries across all data types simultaneously. It also admits a number of interesting interpretations of the low rank factors, which allow clustering of examples or of features.
    We propose several parallel algorithms for fitting generalized low rank models, and describe implementations and numerical results.

Feb 10 & 17, 2015
Krishnamurthy Dvijotham (Dj)
Tools from Optimization Theory and Applications to Nonlinear Dynamical Systems
    The lectures will give an overview of various tools from optimization and control theory, and of applications of these ideas to studying properties of nonlinear dynamical systems. The topics to be covered include:
    1) Quadratically Constrained Quadratic Programs (QCQPs), a class of optimization problems that are known to be hard in general. We will cover various special cases under which QCQPs are known to be efficiently solvable, and describe the relationship of these results to various "hidden convexity" results.
    2) Modern tools from nonlinear control theory: Integral Quadratic Constraints (IQCs), Contraction and Differential Positivity.
    3) Applications of these ideas to analyzing properties of nonlinear dynamical systems, including some applications to studying properties of numerical algorithms.
References:
1) Pólik, Imre, and Tamás Terlaky. "A survey of the S-lemma." SIAM Review 49.3 (2007): 371-418.
2) Lessard, Laurent, Benjamin Recht, and Andrew Packard. "Analysis and design of optimization algorithms via integral quadratic constraints." arXiv preprint arXiv:1408.3595 (2014).
3) Forni, Fulvio, and Rodolphe Sepulchre. "Differentially positive systems." arXiv preprint arXiv:1405.6298 (2014).

Jan 20 & 27, 2015
Piyush Srivastava
Zeros of polynomials and their applications
    A multivariate polynomial over the complex numbers is said to be stable with respect to a region H if it does not vanish when its arguments lie in H. An important special case is when H is the product of unit disks in the complex plane (a "polydisk"), in which case stability with respect to H is equivalent to the statement that the polynomial is non-zero when all its arguments lie in the unit disk. Another important case is when H is the product of upper half planes, in which case stability w.r.t H is equivalent to the statement that the polynomial is non-zero when all its arguments have positive imaginary parts.
    While the term "stability" itself originated in control theory, stable polynomials and their properties have had several applications in fields as diverse as linear algebra, combinatorics and probability theory, and computational complexity theory, among others. In these two talks, I will attempt to discuss the following three applications:
    1) #P-hardness of computing the magnetization of the Ising model.
    This requires an extension to the Lee-Yang theorem -- a stability theorem w.r.t the unit disk for the partition function of the Ising model that was itself proven in order to study phase transitions.
    2) Exact characterization of probabilities for which the Lovász local lemma is valid.
    This uses a statement about the stability of the independent set polynomial w.r.t a polydisk. We will discuss a related question regarding the effective verification of stability properties and its possible implications for a different proof of the Lee-Yang theorem.
    3) Negative association for probability distributions on spanning trees.
    Here, one uses the stability of the generating function of the given probability distribution w.r.t. the upper half plane (also known as the "strongly Rayleigh" property in this context). This connection was used in a breakthrough result of Oveis-Gharan, Saberi and Singh [O-GSS11] on approximate solutions to the Traveling Salesperson Problem.

(1) is based on joint work with Alistair Sinclair [SS14]. (2) is based on a paper of Scott and Sokal [SS05] and some ongoing joint work with Mario Szegedy. (3) is based on papers by Borcea, Brändén and Liggett [BBL09] and Oveis-Gharan, Saberi and Singh [O-GSS11].

References:
[BBL09] Borcea, J., Brändén, P. & Liggett, T. Negative dependence and the geometry of polynomials. J. Amer. Math. Soc. 22, 521-567 (2009).
[O-GSS11] Oveis-Gharan, S., Saberi, A. & Singh, M. A Randomized Rounding Approach to the Traveling Salesman Problem. In Proc. IEEE Annual Symposium on Foundations of Computer Science (FOCS) 550-559 (2011).
[SS05] Scott, A. D. & Sokal, A. D. The Repulsive Lattice Gas, the Independent-Set Polynomial, and the Lovász Local Lemma. J Stat Phys 118, 1151-1261 (2005).
[SS14] Sinclair, A. & Srivastava, P. Lee-Yang theorems and the complexity of computing averages. Comm Math Phys 329 (3), 827-858 (2014). Extended abstract in Proc. ACM Annual Symposium on Theory of Computing (STOC) 2013, 625-634.

Jan 6, 2015
Ruta Mehta
Leontief Exchange Markets Can Solve Multivariate Polynomial Equations, Yielding FIXP and ETR Hardness
    We show FIXP-hardness of computing equilibria in Arrow-Debreu exchange markets under Leontief utility functions, and Arrow-Debreu markets under linear utility functions and Leontief production, thereby settling open questions of Vazirani and Yannakakis (2009). In both cases, as required under FIXP, the set of instances mapped onto will admit equilibria, i.e., will be "yes" instances. If all instances are under consideration, then in both cases we prove that the problem of deciding if a given instance admits an equilibrium is ETR-complete, where ETR is the class Existential Theory of Reals.
    The main technical part of our result is the following reduction:
    Given a set 'F' of simultaneous multivariate polynomial equations in which the variables are constrained to be in a closed bounded region in the positive orthant, we construct a Leontief market 'M' which has one good corresponding to each variable in 'F'. We prove that the equilibria of 'M', when projected onto prices of these latter goods, are in one-to-one correspondence with the set of solutions of the polynomials. (Joint work with J. Garg, V. V. Vazirani, S. Yazdanbod.)

2013-14 CMI Seminars

Nov 25 & Dec 2, 2014
Siddharth Barman
An Approximate Version of Caratheodory's Theorem and Its Algorithmic Applications
    In this talk I will present an approximate version of Caratheodory's theorem and its algorithmic applications. In particular, I will show that for every vector in the convex hull of a set of vectors X there exists a nearby (under p-norm distance, for p at least 2) vector that can be expressed as a convex combination of at most b vectors of X, where the bound b is independent of the dimension of the vectors.
    Given the significance of Caratheodory's theorem, this approximate version is interesting in its own right. It also has notable algorithmic applications. Specifically, I will describe how this approximate version of Caratheodory's theorem leads to an algorithm for computing near optimal solutions of sparse bilinear and quadratic forms over the simplex. This algorithm, in particular, provides a polynomial-time approximation scheme for Nash equilibrium in games where the sum of the payoff matrices is sparse. Moreover, for arbitrary two-player games the running time of the algorithm matches the best-known upper bound, which is obtained in (Lipton, Markakis, and Mehta 2003).
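    The sampling argument behind such bounds is short enough to demonstrate: draw b indices i.i.d. from the convex weights and average the corresponding vectors; the result is a convex combination of at most b vectors whose 2-norm distance to the original point decays like 1/sqrt(b). An illustrative sketch:

        import numpy as np

        def sparse_approximation(X, mu, b, rng):
            # Average of b columns of X sampled i.i.d. from the convex weights mu:
            # a convex combination of at most b columns approximating u = X @ mu.
            idx = rng.choice(X.shape[1], size=b, p=mu)
            return X[:, idx].mean(axis=1)

        rng = np.random.default_rng(0)
        d, N = 50, 1000
        X = rng.normal(size=(d, N))
        X /= np.linalg.norm(X, axis=0)                  # columns on the unit sphere
        mu = rng.random(N); mu /= mu.sum()              # a dense convex combination
        u = X @ mu
        for b in [10, 100, 1000]:
            print(b, np.linalg.norm(u - sparse_approximation(X, mu, b, rng)))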

Nov 4 & 11, 2014
Georgios Piliouras
Beyond Equilibria
    The notion of equilibrium is a central concept that lies at the intersection of game theory, control theory, dynamical systems and computer science. It provides a palpable and easy-to-understand handle for arguing about decentralized systems. This simplicity, however, comes at a cost, since in practice this framework is often too restrictive and cannot address key real-life phenomena. These include the effects of transient system phenomena, the plausibility of persistent far-from-equilibrium behavior, as well as the quantitative analysis of systems with multiple equilibria.
    We will discuss a number of different techniques to address such challenges and pinpoint some key mathematical tools for:
    a) online learning/optimization based approaches (e.g., regret minimization) and their connections to convex relaxations of the notion of (Nash) equilibrium;
    b) the techniques and limitations of robust price of anarchy (including recent couplings of these ideas with stochastic/robust optimization);
    c) average case analysis techniques for systems with many attracting equilibria (from identifying system invariants to approximating the volumes of regions of attraction);
    d) novel frameworks for analyzing disequilibrium systems (e.g., persistent patterns).
    No background will be assumed in the respective disciplines and the talk will be self-contained.

Oct 28, 2014
Bernhard von Stengel
Department of Mathematics
London School of Economics
Nash Codes for Noisy Channels
    When a sender transmits a message as a "codeword" over a noisy channel, the receiver's optimal "decoding" of the possibly garbled channel output is to identify the message that has most likely been sent. We extend this standard approach by asking if the sender's action is also optimal. That is, given the receiver's decoding strategy, is the sender most likely understood correctly by transmitting the original codeword, rather than any other channel input? A "Nash code" has this desirable stability property, where encoding and decoding define a Nash equilibrium in this sender-receiver game with fully aligned interests of the players. We study this concept with examples, and show two sufficient conditions for Nash codes: receiver-optimal codes, and, somewhat surprisingly, _arbitrary_ codes for the classic binary channel (if the receiver breaks ties "monotonically," which holds for generic priors or payoffs). Joint work with Penelope Hernandez (Valencia).
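    The Nash condition can be checked by enumeration on a toy instance. The sketch below (an illustrative check, not from the paper) takes the 3-bit repetition code on a binary symmetric channel with likelihood (majority) decoding and verifies that each codeword is a best response for the sender given the receiver's decoder.

        import itertools

        def bsc_prob(x, y, q):
            # Probability that a binary symmetric channel with flip probability q turns x into y.
            flips = sum(a != b for a, b in zip(x, y))
            return (q ** flips) * ((1 - q) ** (len(x) - flips))

        q = 0.2
        codewords = {0: (0, 0, 0), 1: (1, 1, 1)}
        outputs = list(itertools.product([0, 1], repeat=3))
        decode = {y: max(codewords, key=lambda m: bsc_prob(codewords[m], y, q)) for y in outputs}

        def prob_decoded_as(m, x):
            # Probability that channel input x is decoded as message m.
            return sum(bsc_prob(x, y, q) for y in outputs if decode[y] == m)

        # Nash condition: for each message, transmitting its codeword maximizes the
        # probability of being decoded as that message, over all possible channel inputs.
        nash = all(prob_decoded_as(m, codewords[m]) >= prob_decoded_as(m, x)
                   for m in codewords for x in itertools.product([0, 1], repeat=3))
        print(nash)   # True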

Oct 7 & 14, 2014
Quentin Berthet
Statistical and Computational Complexities
    In statistics and theoretical computer science, the word complexity is often used to measure the difficulty of a problem and to describe its intrinsic nature.
    I will give a short introduction to the notion of complexity in these two domains, and show some of the challenges that arise when we try to consider them simultaneously. I will present a setting that allows us to describe the tradeoffs between statistical and computational efficiency, and illustrate it through various high-dimensional problems on random vectors, graphs, and satisfiability formulas.

Sept 23, 2014
Umang Bhaskar
Minimizing Latency in Network Congestion Games using Equilibrium Outcomes

2012-13 CMI Seminars

December 11, 2012
Sachin Adlakha

December 4, 2012
Yakov Babichenko

November 27, 2012
Yakov Babichenko

November 13, 2012

October 30, 2012
Rita Ackerman - Clustering Axioms and Properties: Towards Theoretical Foundations

October 23, 2012
Umang Bhaskar - Online Mixed Packing and Covering

October 16, 2012
Umang Bhaskar - Uniqueness of Equilibria in Atomic Splittable Routing Games

October 9, 2012
Siddharth Barman - Applications of Linear Programming in Combinatorial Optimization

October 2, 2012
Siddharth Barman - Applications of Linear Programming in Coding Theory