# Previous Seminars

Motivated by problems arising from the study of certain relative trace formulas, I will discuss a notion of endoscopy in a relative setting. The main example is that of unitary Friedberg-Jacquet periods, which are related to special cycles in certain unitary Shimura varieties. After introducing the endoscopic symmetric spaces in this case, I will sketch the proof of the fundamental lemma.

https://weizmann.zoom.us/j/98304397425

This talk concerns Gabrielov's rank theorem, a fundamental result in local complex and real-analytic geometry, proved in the 1970s. In contrast with the algebraic case, it is not in general true that the analytic rank of an analytic map (that is, the dimension of the analytic-Zariski closure of its image) is equal to the generic rank of the map (that is, the generic dimension of its image). This phenomenon is behind several pathological examples in local real-analytic geometry. Gabrielov's rank theorem provides a formal condition for the equality to hold.

https://weizmann.zoom.us/j/9183356684?pwd=QlN4T1VtcXc4QXdXNEYxUlQ4dmR5dz09

Arthur (1989) conjectured that the discrete spectrum of automorphic representations of a connected reductive group over a number field can be decomposed into global A-packets, in terms of which he also conjectured a multiplicity formula. Arthur (2013) proved his conjectures for symplectic and orthogonal groups, in which case the global A-packets are parametrized by self-dual automorphic representations of general linear groups. In this talk, I will give a construction of the local A-packets for general symplectic and general even orthogonal groups in the nonarchimedean case. This is based on our earlier works in the tempered case, and it follows a construction by Moeglin for symplectic and orthogonal groups.

Zoom: https://weizmann.zoom.us/j/98304397425

Let G be a connected reductive group defined over a field k of characteristic 0. Recently Knop and Krötz showed that one can attach a Weyl group to any algebraic homogeneous G-variety defined over k. This Weyl group is called the little Weyl group. In this talk I will discuss a geometric construction of the little Weyl group for a real spherical space G/H. Our technique is based on a fine analysis of limits of conjugates of the subalgebra Lie(H) along one-parameter subgroups in the Grassmannian of subspaces of Lie(G).

Zoom: https://weizmann.zoom.us/j/98304397425

By old results with Millson, the generating series for the cohomology classes of special cycles on orthogonal Shimura varieties over a totally real field are Hilbert-Siegel modular forms. These forms arise via theta series. Using this result and the Siegel-Weil formula, we show that the products in the subring of cohomology generated by the special cycles are controlled by the Fourier coefficients of triple pullbacks of certain Siegel-Eisenstein series.

As a consequence, there are comparison isomorphisms between special subrings for different Shimura varieties. In the case in which the signature of the quadratic space V is (m,2) at an even number d_+ of archimedean places, the comparison gives a `combinatorial model' for the special cycle ring in terms of the associated totally positive definite space.

Zoom: https://weizmann.zoom.us/j/98304397425

I will review how automorphic representations arise in the calculation of string theory scattering amplitudes. As follows from the work of Green, Miller, Vanhove, Pioline and others, the automorphic representations are associated with split real groups of a certain exceptional family. In the cases that are well understood, these representations have very small Gelfand-Kirillov dimension. Their Fourier expansions can be calculated using different methods and confirm physical expectations on the wavefront set. In work with Gourevitch, Gustafsson, Persson and Sahi, the method of Whittaker pairs was employed to systematize this analysis. I will also comment on the cases that are less well understood in physics and that appear to go beyond the standard notion of automorphic representations, since the usual Z-finiteness condition is violated.

Zoom: https://weizmann.zoom.us/j/98304397425

We explain by elementary means why the existence of a discrete series representation of a real reductive group G implies the existence of a compact Cartan subgroup of G. The presented approach has the potential to generalize to real spherical spaces.

The talk will be based on https://arxiv.org/pdf/2007.15312.pdf

Zoom: https://weizmann.zoom.us/j/98304397425

In this talk, I will introduce the functorial descent from cuspidal automorphic representations \pi of GL7(A) with L^S(s, \pi, \wedge^3) having a pole at s=1 to the split exceptional group G2(A), using Fourier coefficients associated to two nilpotent orbits of E7. We show that one descent module is generic and, under mild assumptions on the unramified components of \pi, cuspidal, with \pi as a weak functorial lift of each of its irreducible summands. However, we show that the other descent module supports not only the non-degenerate Whittaker integral on G2(A), but also every degenerate Whittaker integral. Thus it is generic, but not cuspidal. This is a new phenomenon, compared to the theory of functorial descent for classical and GSpin groups. This work is joint with Joseph Hundley.

Zoom: https://weizmann.zoom.us/j/98304397425

I will review old and new results on the global structure of singular holomorphic codimension-one foliations on projective manifolds and their moduli spaces.

ZOOM: https://weizmann.zoom.us/j/97197369277?pwd=c0kyTVVDaWhLQ1NaWDJLVE9MTzBVUT09

We present a novel GAN-based model that utilizes the space of deep features learned by a pre-trained classification model. Inspired by classical image pyramid representations, we construct our model as a Semantic Generation Pyramid - a hierarchical framework which leverages the continuum of semantic information encapsulated in such deep features; this ranges from low level information contained in fine features to high level, semantic information contained in deeper features. More specifically, given a set of features extracted from a reference image, our model generates diverse image samples, each with matching features at each semantic level of the classification model. We demonstrate that our model results in a versatile and flexible framework that can be used in various classic and novel image generation tasks. These include: generating images with a controllable extent of semantic similarity to a reference image, and different manipulation tasks such as semantically-controlled inpainting and compositing; all achieved with the same model, with no further training.

https://arxiv.org/abs/2003.06221

Zoom: https://weizmann.zoom.us/j/96905267652?pwd=MGxhem1WVG1WbzRQSTVnYXZGLy81dz09

One core problem in relative harmonic analysis is to study the space of H-invariant linear functionals on an admissible representation, where H is a spherical subgroup of a reductive group G over a local field. In this talk, I will focus on the Archimedean case in the setting of Harish-Chandra modules. I will review the interpretation of these Hom spaces in terms of certain regular holonomic D-modules on G/H (arXiv:1905.08135), under mild conditions on H. Then I will try to sketch a possible extension of this strategy to the Ext-analogues and the Euler-Poincaré numbers. This is a work in progress.

Zoom: https://weizmann.zoom.us/j/98304397425

The usual Poincare problem consists in determining whether or not there exist upper bounds for the degree of an invariant algebraic curve of a foliation of the complex projective plane in terms of the degree of the foliation. We are interested in the local analogue, in which degrees are replaced with vanishing multiplicities at the origin. Since cusps of arbitrary degree can be invariant curves of foliations of degree one in the global setting and of multiplicity one in the local one, there are no universal upper bounds. In order to deal with this issue, extra hypotheses were considered, either on the invariant curve (Cerveau - Lins Neto,...) or on the desingularization of the foliation (Carnicer...). We consider an irreducible germ of invariant curve and no extra hypotheses. We provide a lower bound for the multiplicity of a germ of complex foliation in dimension 2 in terms of Puiseux characteristics of an irreducible invariant curve. Obviously, the lower bound is trivial for a cusp but it is valuable if the curve has at least two Puiseux characteristic exponents. This is a joint work with Jose Cano and Pedro Fortuny.

Join Zoom Meeting

**https://weizmann.zoom.us/j/9183356684?pwd=QlN4T1VtcXc4QXdXNEYxUlQ4dmR5dz09**

The classical theta correspondence establishes a relationship between automorphic representations on special orthogonal groups and automorphic representations on symplectic groups or their double covers. This correspondence is achieved by using as integral kernel a theta series that is constructed from the Weil representation. In this talk I will briefly survey earlier work on (local and global, classical and other) theta correspondences and then present an extension of the classical theta correspondence to higher degree metaplectic covers. The key issue here is that for higher degree covers there is no analogue of the Weil representation (or even a minimal representation), so additional ingredients are needed. Joint work with David Ginzburg.

Zoom: https://weizmann.zoom.us/j/98304397425

The second part of Hilbert's sixteenth problem asks for a study of the number and relative positions of the limit cycles of an ODE system associated to a planar vector field with polynomial component functions. Seeking a probabilistic perspective on Hilbert's problem, we present recent results on the distribution of limit cycles when the vector field component functions are random polynomials. We present a lower bound for the average number of limit cycles for the Kostlan-Shub-Smale model, and we present asymptotic results for a special class of limit cycle enumeration problems concerning a randomly perturbed center focus. We will discuss some ideas from the proofs which combine techniques from dynamical systems (such as transverse annuli, Poincare maps, and Melnikov functions) with those coming from the theory of random analytic functions (such as real zeros of random series, the Kac-Rice formula, and the barrier construction from the study of nodal sets of random eigenfunctions). We conclude by discussing future directions and open problems.

Relevant text: preprint on arXiv

**Zoom link**: https://weizmann.zoom.us/j/94594972747?pwd=cWl0bjB1T2ZYSUp0dzdaK3NNa3Mzdz09

I will present a joint work with A. Alvarez, J.L. Bravo and C. Christopher.

We study the analogue of the classical infinitesimal center problem in the plane, but for zero cycles. We define the displacement function in this context and prove that it is identically zero if and only if the deformation has a composition factor. That is, we prove that here the composition conjecture is true, in contrast with the tangential center problem on zero cycles. Finally, we give examples of applications of our results.

Zoom link: https://us02web.zoom.us/j/88255165329

Let Sp(W) x O(V) be a dual reductive pair. If \pi is an irreducible representation of Sp(W), say, then one may consider its theta lift \Theta(\pi) on O(V). In this talk, we discuss how the Harish-Chandra characters of \pi and \Theta(\pi) are related (when the representation of the smaller group is tempered).

Zoom: https://weizmann.zoom.us/j/98304397425

Let G be a reductive group over a local field F of characteristic zero, and H be a spherical subgroup. An irreducible representation of G is said to be distinguished by H if it has an H-invariant continuous linear functional. The study of distinguished representations is of much current interest, because of their relation to the Plancherel measure on G/H and to periods of automorphic forms.

While a complete classification seems to be out of reach, we established simple micro-local necessary conditions for distinction. The conditions are formulated in terms of the nilpotent orbits associated to the representation, in the spirit of the orbit method. Our results are strongest for Archimedean F. In this case, Rossmann showed that for any irreducible Casselman-Wallach representation, the Zariski closure of the wave-front set is the closure of a unique nilpotent complex orbit. We have shown that the restriction of this orbit to the complexified Lie algebra of H includes zero.

We apply this result to symmetric pairs, branching problems, and parabolic induction. We also have a twisted version for the case when π has a functional invariant with respect to an "additive" character of H. As an application of our theorem we derive necessary conditions for the existence of Rankin-Selberg, Bessel, Klyachko and Shalika models. Our results are compatible with the recent Gan-Gross-Prasad conjectures for non-generic representations. Our necessary conditions are rarely sufficient, but they are sufficient for one class of models: the Klyachko models for unitary representations of general linear groups.

This is a joint work with Eitan Sayag.

Zoom: https://weizmann.zoom.us/j/98304397425

A general theme in geometry is the classification of algebraic/differential geometric structures which satisfy a positivity property. I will describe an "asymptotic" version of this theme that lies at the crossroads of algebraic geometry, differential geometry, and convexity. On the algebraic side, we introduce the class of asymptotically log Fano varieties and state a program towards classification in dimension 2, generalizing the classical efforts of the Italian School of Algebraic Geometry, Manin and Hitchin and making contact, somewhat surprisingly, with linear programming.

On the differential side, we explain the resolution of the Calabi-Tian conjecture, which implies that such a classification also classifies Kahler spaces with positive Ricci curvature away from a divisor and with conical singularities near the divisor, and discuss relations to "small angle limits" in Riemannian geometry, stability, non-compact Ricci-flat fibrations, Ricci solitons, and Kahler-Einstein edge metrics.

link to seminar: meet.google.com/ipv-hqqo-oys

The emergence of deep generative models has recently enabled the automatic generation of massive amounts of graphical content, both in 2D and in 3D.

Generative Adversarial Networks (GANs) and style control mechanisms, such as Adaptive Instance Normalization (AdaIN), have proved particularly effective in this context, culminating in the state-of-the-art StyleGAN architecture.

While such models are able to learn diverse distributions, provided a sufficiently large training set, they are not well-suited for scenarios where the distribution of the training data exhibits a multi-modal behavior. In such cases, reshaping a uniform or normal distribution over the latent space into a complex multi-modal distribution in the data domain is challenging, and the generator might fail to sample the target distribution well. Furthermore, existing unsupervised generative models are not able to control the mode of the generated samples independently of the other visual attributes, despite the fact that they are typically disentangled in the training data.

In this work, we introduce UMMGAN, a novel architecture designed to better model multi-modal distributions, in an unsupervised fashion. Building upon the StyleGAN architecture, our network learns multiple modes, in a completely unsupervised manner, and combines them using a set of learned weights. We demonstrate that this approach is capable of effectively approximating a complex distribution as a superposition of multiple simple ones.
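As a toy illustration of this last point (this is not the UMMGAN architecture, just a sketch of approximating a multi-modal distribution as a weighted superposition of simple unimodal ones, with weights standing in for the learned weights):

```python
import numpy as np

# Toy illustration (not the UMMGAN architecture): a complex multi-modal
# target approximated as a superposition of simple 1-D Gaussians combined
# with a set of weights. All numbers here are made up for illustration.

rng = np.random.default_rng(0)

means = np.array([-4.0, 0.0, 5.0])    # three simple "modes"
stds = np.array([0.5, 1.0, 0.8])
weights = np.array([0.2, 0.5, 0.3])   # would be *learned* in UMMGAN

def sample(n):
    """Sample from the superposition: pick a mode, then sample within it."""
    ks = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[ks], stds[ks])

xs = sample(100_000)
# The sample mean matches the mixture mean sum_i w_i * mu_i = 0.7.
assert abs(xs.mean() - (weights * means).sum()) < 0.1
```

Controlling the mode independently of the within-mode sample is then just choosing `ks` explicitly rather than at random, which is the kind of independent degree of control the abstract refers to.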

We further show that UMMGAN effectively disentangles between modes and style, thereby providing an independent degree of control over the generated content.

This is a joint work with Prof. Dani Lischinski and Prof. Daniel Cohen-Or.

Zoom Link: https://weizmann.zoom.us/j/91197729259

Recently, Gan-Gross-Prasad formulated new restriction problems for the non-tempered representations of classical groups. In this talk, I shall explain a proof of a conjecture on general linear groups over non-Archimedean fields. The main ingredients of the proof include a use of filtration on parabolically induced representations when restricted to the mirabolic subgroups, and realizing the product with a Speh representation as a functor. The proof also uses a result of Lapid-Minguez on the irreducibility of a product of representations. If time permits, we shall also discuss generalizations to Bessel and Fourier-Jacobi models and towards Ext-branching laws.

Zoom link: https://weizmann.zoom.us/j/98304397425

Let G be a connected algebraic group.

We define and study a convolution operation between algebraic morphisms into G. We show that this operation yields morphisms with improved singularity properties, and in particular, that under reasonable assumptions one can always obtain a flat morphism with reduced fibers of rational singularities (termed an FRS morphism) after enough convolutions.

The FRS property is of high importance since (FRS) morphisms can be characterized by good asymptotic behaviour of the number of points of their fibers over finite rings of the form Z/p^kZ.

This further allows us to interpret the FRS property through probabilistic lenses.

We discuss some of the above, motivated by the special case of word maps, which can be viewed as a relative analogue, in the setting of p-adic groups, of Waring's problem from 1770 (see arXiv:1912.12556).

Joint work with Itay Glazer.

Zoom link: https://weizmann.zoom.us/j/98304397425

Single image super resolution (SR) has seen major performance leaps in recent years. However, existing methods do not allow exploring the infinitely many plausible reconstructions that might have given rise to the observed low-resolution (LR) image. These different explanations to the LR image may dramatically vary in their textures and fine details, and may often encode completely different semantic information. In this work, we introduce the task of explorable super resolution. We propose a framework comprising a graphical user interface with a neural network backend, allowing editing the SR output so as to explore the abundance of plausible HR explanations to the LR input. At the heart of our method is a novel module that can wrap any existing SR network, analytically guaranteeing that its SR outputs would precisely match the LR input, when downsampled. Besides its importance in our setting, this module is guaranteed to decrease the reconstruction error of any SR network it wraps, and can be used to cope with blur kernels that are different from the one the network was trained for. We illustrate our approach in a variety of use cases, ranging from medical imaging and forensics, to graphics.
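To illustrate the kind of guarantee such a consistency-enforcing module provides, here is a minimal back-projection sketch (my own simplification with a box-average downsampler, not the paper's actual module): any candidate SR image is corrected so that downsampling it reproduces the LR input exactly.

```python
import numpy as np

# Minimal sketch of enforcing LR consistency by back-projection. This is an
# illustrative simplification, not the module from the paper: we assume a
# box-average downsampler, for which nearest-neighbor upsampling is a right
# inverse, so one correction step already gives exact consistency.

def downsample(x, s):
    """Box-average downsampling by integer factor s."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upsample(x, s):
    """Nearest-neighbor upsampling, a right inverse of the box average."""
    return np.kron(x, np.ones((s, s)))

def enforce_consistency(sr, lr, s):
    """Correct an SR output so that downsampling it reproduces lr exactly."""
    return sr + upsample(lr - downsample(sr, s), s)

s = 4
lr = np.random.rand(8, 8)        # observed low-resolution image
sr = np.random.rand(32, 32)      # any (inconsistent) candidate SR output
sr_fixed = enforce_consistency(sr, lr, s)
assert np.allclose(downsample(sr_fixed, s), lr)  # exact LR consistency
```

Since the correction only adds the (upsampled) residual `lr - downsample(sr, s)`, it can wrap any SR network's output without retraining, which is the spirit of the wrapping guarantee described above.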

Zoom link: https://weizmann.zoom.us/j/96418208541

Jim Arthur has conjectured the existence of some exotic "unipotent" representations of real reductive Lie groups, which are expected to form building blocks of the unitary dual. Though falling short of a full classification of the unitary dual itself, Arthur's conjectures touch on the essence of some of the most difficult questions concerning unitarity. In another direction, automorphic realizations of these representations are expected to have delicate arithmetic properties.

However, Arthur's unipotent representations are hard to identify, much less show to be unitary. I'll present a status report, including the unitarity of the "Langlands element" Arthur describes directly (in joint work with Joe Hundley), and the full identification of the unipotent representations for exceptional real groups (joint work with Jeff Adams, Marc van Leeuwen, Annegret Paul, and David Vogan).

Zoom meeting: https://weizmann.zoom.us/j/98304397425

Consider the function field $F$ of a smooth curve over $\mathbb{F}_q$, with $q \neq 2$.

L-functions of automorphic representations of $\GL(2)$ over $F$ are important objects for studying the arithmetic properties of the field $F$. Unfortunately, they can be defined in two different ways: one by Godement-Jacquet, and one by Jacquet-Langlands. Classically, one shows that the resulting L-functions coincide using a complicated computation.

I will present a conceptual proof that the two families coincide, by categorifying the question. This correspondence will necessitate comparing two very different sets of data, which will have significant implications for the representation theory of $\GL(2)$. In particular, we will obtain an exotic symmetric monoidal structure on the category of representations of $\GL(2)$.

It turns out that an appropriate space of automorphic functions is a commutative algebra with respect to this symmetric monoidal structure. I will outline this construction, and show how it can be used to construct a category of automorphic representations.

Zoom meeting: https://weizmann.zoom.us/j/98304397425

Robust recovery of lost colors in underwater images remains a challenging problem. We recently showed that this was partly due to the prevalent use of an atmospheric image formation model for underwater images and proposed a physically accurate model. The revised model showed: 1) the attenuation coefficient of the signal is not uniform across the scene but depends on object range and reflectance, 2) the coefficient governing the increase in backscatter with distance differs from the signal attenuation coefficient. Here, we present the first method that recovers color with our revised model, using RGBD images. The Sea-thru method estimates backscatter using the dark pixels and their known range information. Then, it uses an estimate of the spatially varying illuminant to obtain the range-dependent attenuation coefficient. Using more than 1,100 images from two optically different water bodies, which we make available, we show that our method with the revised model outperforms those using the atmospheric model. Consistent removal of water will open up large underwater datasets to powerful computer vision and machine learning algorithms, creating exciting opportunities for the future of underwater exploration and conservation. (Paper published in CVPR 19).
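The two observations above can be summarized in a simple per-channel forward model. The following sketch (with made-up coefficient values, and not the authors' code) shows how separate attenuation and backscatter coefficients enter, and how recovery inverts the model when the range z is known from the RGBD input:

```python
import numpy as np

# Illustrative sketch of the revised image-formation model described above
# (not the Sea-thru implementation): the direct-signal attenuation
# coefficient beta_D differs from the backscatter coefficient beta_B, and
# backscatter saturates with range z. All numeric values are made up.

def forward_model(J, z, beta_D, beta_B, B_inf):
    """Observed color I from true color J at range z (per RGB channel)."""
    return J * np.exp(-beta_D * z) + B_inf * (1.0 - np.exp(-beta_B * z))

def recover(I, z, beta_D, beta_B, B_inf):
    """Invert the model: remove backscatter, then undo attenuation."""
    direct = I - B_inf * (1.0 - np.exp(-beta_B * z))
    return direct * np.exp(beta_D * z)

# Round-trip check on a toy pixel: recovery is exact when the
# coefficients and the range are known.
J = np.array([0.6, 0.4, 0.2])         # true (water-free) RGB reflectance
z = 3.0                               # range in meters (from the depth map)
beta_D = np.array([0.4, 0.12, 0.05])  # red attenuates fastest
beta_B = np.array([0.3, 0.2, 0.1])    # note: different from beta_D
B_inf = np.array([0.05, 0.2, 0.3])    # blue-ish veiling light

I = forward_model(J, z, beta_D, beta_B, B_inf)
J_hat = recover(I, z, beta_D, beta_B, B_inf)
assert np.allclose(J, J_hat)
```

In practice the coefficients are of course unknown; the method's contribution is estimating them (backscatter from dark pixels with known range, attenuation from the spatially varying illuminant), after which the inversion above applies.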

Zoom link: https://weizmann.zoom.us/j/99608045732

For a Nash manifold X and a Nash vector bundle E on X, one can form the topological vector space of Schwartz sections of E, i.e. the smooth sections which decay fast along with all derivatives. It was shown by Aizenbud and Gourevitch, and independently by Luca Prelli, that for a Nash manifold X, the complex of Schwartz sections of the de Rham complex of X has cohomologies isomorphic to the compactly supported cohomologies of X.

In my talk I will present a work in progress, joint with Avraham Aizenbud, to generalize this result to the relative case, replacing the Nash manifold with a Nash submersion f:M-->N. Using infinity-categorical methods, I will define the notion of a Schwartz section of a Nash bundle E over a complex of sheaves with constructible cohomologies, generalizing the notion of Schwartz section on an open semialgebraic set. I will then relate the Schwartz sections of the relative de Rham complex of a Nash submersion f:M-->N with the Schwartz functions on N over the derived push-forward with proper support of the constant sheaf on M. Finally, I will conclude with some applications to the relation between the Schwartz sections of the relative de Rham complex and the topology of the fibers of f.

Zoom meeting: https://weizmann.zoom.us/j/98304397425

We explain how a doubled version of the Beilinson-Bernstein localization functor can be understood using the geometry of the wonderful compactification of a group. Specifically, bimodules for the Lie algebra give rise to monodromic D-modules on the horocycle space, and to filtered D-modules on the group that respect a certain matrix coefficients filtration. These two categories of D-modules are related via an associated graded construction in a way compatible with localization, Verdier specialization, the Vinberg semigroup, and additional structures. This talk is based on joint work with David Ben-Zvi.

Zoom meeting: https://weizmann.zoom.us/j/98304397425

Randomness is an essential resource in computer science. In most applications perfect, and sometimes private, randomness is needed, while it is not even clear that such a resource exists. It is well known that the tools of classical computer science do not allow us to create perfect randomness from a single weak source of randomness - a fact that initiated the study of special functions called randomness extractors.

The first part of the talk will be focused on the performance of randomness extractors in the presence of a quantum adversary and present a new model for two-source extractors, termed the quantum Markov model.

In the second part of the talk I will describe a task called "randomness amplification and privatization", where a single and public Santha-Vazirani source of randomness is being transformed into uniform and private randomness - a cryptographic task that cannot be accomplished using the tools of classical computer science. I will then present a protocol that builds on the extractors discussed in the first part of the talk and explain the main ingredients of its security proof.
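For readers unfamiliar with two-source extractors, the classical inner-product extractor (due to Chor and Goldreich) is a minimal example of the kind of object studied here. This sketch illustrates only the classical notion; it does not capture the quantum Markov model of the talk:

```python
import random

# Toy classical two-source extractor (the inner-product extractor of
# Chor-Goldreich), shown only to illustrate the classical notion the talk
# builds on. If each source independently has min-entropy rate above 1/2,
# the output bit is close to uniform.

def inner_product_extractor(x, y):
    """Extract one bit from two independent weak sources x, y (bit lists)."""
    assert len(x) == len(y)
    return sum(a & b for a, b in zip(x, y)) % 2

# Two independent n-bit samples standing in for weak sources.
n = 16
x = [random.randint(0, 1) for _ in range(n)]
y = [random.randint(0, 1) for _ in range(n)]
bit = inner_product_extractor(x, y)
assert bit in (0, 1)
```

The point of the models discussed in the talk is precisely to ask whether such guarantees survive when the adversary holding side information about the sources is quantum.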

Zoom meeting : https://weizmann.zoom.us/j/99368552173

In this talk, I will explain a new way to construct smooth convolution operators on adelic groups that isolate (certain) cuspidal representations from the rest of the automorphic spectrum. Then, I will explain an application of this construction to the global Gan-Gross-Prasad conjecture for unitary groups.

This is joint work with Yifeng Liu, Wei Zhang and Xinwen Zhu.

Zoom meeting: https://weizmann.zoom.us/j/98304397425

Speaker #1: Emanuel Milman (Technion)

Title: Functional Inequalities on sub-Riemannian manifolds via QCD

Abstract: We are interested in obtaining Poincar\'e and log-Sobolev inequalities on domains in sub-Riemannian manifolds (equipped with their natural sub-Riemannian metric and volume measure).

It is well-known that strictly sub-Riemannian manifolds do not satisfy any type of Curvature-Dimension condition CD(K,N), introduced by Lott-Sturm-Villani some 15 years ago, so we must follow a different path. We show that while ideal (strictly) sub-Riemannian manifolds do not satisfy any type of CD condition, they do satisfy a quasi-convex relaxation thereof, which we name QCD(Q,K,N). As a consequence, these spaces satisfy numerous functional inequalities with exactly the same quantitative dependence (up to a factor of Q) as their CD counterparts. We achieve this by extending the localization paradigm to completely general interpolation inequalities, and a one-dimensional comparison of QCD densities with their "CD upper envelope". We thus obtain the best known quantitative estimates for (say) the L^p-Poincar\'e and log-Sobolev inequalities on domains in the ideal sub-Riemannian setting, which in particular are independent of the topological dimension. For instance, the classical Li-Yau / Zhong-Yang spectral-gap estimate holds on all Heisenberg groups of arbitrary dimension up to a factor of 4.

No prior knowledge will be assumed, and we will (hopefully) explain all of the above notions during the talk.

Speaker #2: Tal Orenshtein (TU Berlin)

Title: Rough walks in random environment

Abstract: In this talk we shall review scaling limits for random walks in random environment, lifted to the rough path space, to the enhanced Brownian motion. Apart from the immediate application to SDEs, this adds new information on the structure of the limiting path. Time permitting, we shall elaborate on the tools used to tackle these problems. Based on joint works with Olga Lopusanschi, with Jean-Dominique Deuschel and Nicolas Perkowski, and with Johannes Bäumler, Noam Berger and Martin Slowik.

The Duflo-Serganova functor is a cohomology functor relating representation theory of Lie superalgebras of different ranks.

This is a tensor functor preserving superdimension.

Serganova conjectured that the image of a finite-dimensional simple module L under the Duflo-Serganova functor is semisimple. Heidersdorf and Weissauer established this conjecture in the gl case and described DS(L). In my previous talk I sketched a proof of semisimplicity in the osp case. In this talk I will explain how to compute the multiplicities in DS(L).

This is a joint project with Thorsten Heidersdorf.

Let G be a reductive group over a local field, and H be a spherical subgroup. An irreducible representation of G is said to be distinguished by H if it has an H-invariant continuous linear functional. The study of distinguished representations is of much current interest, because of their relation to the Plancherel measure on G/H and to periods of automorphic forms.

While a complete classification seems to be out of reach, in a joint work with E. Sayag we established simple geometric necessary conditions for distinction. The conditions are formulated in terms of the nilpotent orbit associated to the representation. In the talk I will focus on the case of real reductive G, based on the recent preprint arXiv:2001.11746. Our main tool is the theory of associated varieties of modules over the Lie algebra of G.

For a $\Delta$-regular connected graph ${\sf H}$, the problem of determining the upper tail large deviations for the number of copies of ${\sf H}$ in an Erd\H{o}s-R\'{e}nyi graph on $n$ vertices with edge probability $p$ has generated significant interest. In the sparse regime, i.e. for $p \ll 1$, when $np^{\Delta/2} \gg (\log n)^{1/(v_{\sf H}-2)}$, where $v_{\sf H}$ is the number of vertices in ${\sf H}$, the upper tail large deviation event is believed to occur due to the presence of localized structures. For $p$ below this threshold, the large deviation is instead expected to be given by that of a Poisson random variable. In this talk, we will discuss some progress in resolving this conjecture.

This is based on joint work with Riddhipratim Basu.

Interactive oracle proofs (IOPs) are a hybrid between interactive proofs and PCPs. In an IOP the prover is allowed to interact with a verifier (like in an interactive proof) by sending relatively long messages to the verifier, who in turn is only allowed to query a few of the bits that were sent (like in a PCP).

For any NP relation for which membership can be decided in polynomial-time and bounded polynomial space (e.g., SAT, Hamiltonicity, Clique, Vertex-Cover, etc.) and for any constant \gamma>0, we construct an IOP with communication complexity (1+\gamma) \cdot n, where n is the original witness length. The number of rounds as well as the number of queries made by the IOP verifier are constant.

Joint work with Noga Ron-Zewi

The plan is to discuss anti-concentration of the inner product between two independent random vectors in Euclidean space. We shall go over some background and applications, and see some proofs.

My goal is to demonstrate how geometric priors can replace the requirement for annotated 3D data. In this context I'll present two of my works. First, I will present a deep learning framework that estimates dense correspondence between articulated 3D shapes without using any ground-truth labeling, presented as an oral at CVPR 2019: "Unsupervised Learning of Dense Shape Correspondence". We demonstrated that our method is applicable to full and partial 3D models as well as to realistic scans. The problem of incomplete 3D data arises in many other scenarios as well. One such interesting problem of great practical importance is shape completion, to which I'll dedicate the second part of my lecture.

It is common to encounter situations where there is a considerable distinction between the scanned 3D model and the final rendered one. The distinction can be attributed to occlusions or a directional view of the target when the scanning device is localized in space. I will demonstrate that geometric priors can guide learning algorithms in the task of 3D model completion from partial observation. To this end, I'll present our recent work: "The Whole is Greater than the Sum of its Non-Rigid Parts". This famous declaration of Aristotle was adopted to explain human perception by the Gestalt psychology school of thought in the twentieth century. Here, we claim that by observing part of an object which was previously acquired as a whole, one can deal with both partial matching and shape completion in a holistic manner. More specifically, given the geometry of a full, articulated object in a given pose, as well as a partial scan of the same object in a different pose, we address the problem of matching the part to the whole while simultaneously reconstructing the new pose from its partial observation. Our approach is data-driven, and takes the form of a Siamese autoencoder without the requirement of a consistent vertex labeling at inference time; as such, it can be used on unorganized point clouds as well as on triangle meshes. We demonstrate the practical effectiveness of our model in the applications of single-view deformable shape completion and dense shape correspondence, both on synthetic and real-world geometric data, where we outperform prior work on these tasks by a large margin.

Local differential privacy (LDP) is a model where users send privatized data to an untrusted central server whose goal is to solve some data analysis task. In the non-interactive version of this model the protocol consists of a single round in which the server sends requests to all users and then receives their responses. This version is deployed in industry due to its practical advantages and has attracted significant research interest. Our main result is an exponential lower bound on the number of samples necessary to solve the standard task of learning a large-margin linear separator in the non-interactive LDP model. Via a standard reduction this lower bound implies an exponential lower bound for stochastic convex optimization and specifically, for learning linear models with a convex, Lipschitz and smooth loss. These results answer the questions posed in \citep{SmithTU17,DanielyF18}. Our lower bound relies on a new technique for constructing pairs of distributions with nearly matching moments but whose supports can be nearly separated by a large margin hyperplane. These lower bounds also hold in the model where communication from each user is limited and follow from a lower bound on learning using non-adaptive \emph{statistical queries}.

Link: https://arxiv.org/abs/1911.04014

Parameterized Analysis leads both to deeper understanding of intractability results and to practical solutions for many NP-hard problems. Informally speaking, Parameterized Analysis is a mathematical paradigm to answer the following fundamental question: What makes an NP-hard problem hard? Specifically, how do different parameters (being formal quantifications of structure) of an NP-hard problem relate to its inherent difficulty? Can we exploit these relations algorithmically, and to which extent? Over the past three decades, Parameterized Analysis has grown to be a mature field of outstandingly broad scope with significant impact from both theoretical and practical perspectives on computation.

In this talk, I will first give a brief introduction to the field of Parameterized Analysis. Then, I will discuss some recent work in this field, where I mainly address (i) problems at the core of this field, rooted at Graph Minors and Graph Modification Problems, and (ii) applications of tools developed in (i) in particular, and of parameterized analysis in general, to Computational Geometry and Computational Social Choice. Additionally, I will zoom into a specific result, namely, the first single-exponential time parameterized algorithm for the Disjoint Paths problem on planar graphs. An efficient algorithm for the Disjoint Paths problem in general, and on "almost planar" graphs in particular, is a critical part in the quest for the establishment of an Efficient Graph Minors Theory. As the Graph Minors Theory is the origin of Parameterized Analysis and ubiquitous in the design of parameterized algorithms, making this theory efficient is a holy grail in the field.

The metrical task systems (MTS) problem (Borodin, Linial, and Saks 1992) is a general model for the analysis of algorithms that optimize in the presence of information arriving over time, where the state space is equipped with a metric. I will discuss a relatively new approach to this area, where both the algorithm and method of analysis are derived canonically from a choice of a "regularizer" on the probability simplex. The regularizer can be interpreted as a Riemannian structure on the simplex, and then the algorithm is simply a gradient flow.

Defining the regularizer as an appropriate "noisy" multiscale metric entropy (similar to, e.g., the Talagrand \gamma_1 functional) yields the best-known competitive ratio for every metric space. For ultrametrics (aka, "HST metrics"), this achieves the conjectured bound, but the algorithm falls short of resolving the MTS conjecture for many other spaces, including subsets of the real line.
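As a toy instance of this regularizer-to-algorithm recipe, mirror descent on the probability simplex with the plain negative-entropy regularizer (a much simpler choice than the multiscale metric entropy of the talk) reduces to multiplicative-weights updates. The function name and step size below are illustrative assumptions, not part of the talk:

```python
import math

def mirror_descent_entropy(costs, eta=0.1):
    """Entropic mirror descent on the simplex (= multiplicative weights).

    costs: list of cost vectors, one per time step; eta: hypothetical step size.
    Each step applies x_i <- x_i * exp(-eta * c_i) and renormalizes, the
    discrete-time analogue of the entropy-regularized gradient flow.
    """
    n = len(costs[0])
    x = [1.0 / n] * n  # start at the uniform distribution
    for c in costs:
        x = [xi * math.exp(-eta * ci) for xi, ci in zip(x, c)]
        z = sum(x)
        x = [xi / z for xi in x]
    return x
```

With the entropy regularizer the Bregman projection back onto the simplex is just normalization, which is why each step is a componentwise exponential reweighting.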

Processing large point clouds is a challenging task. Therefore, the data is often sampled to a size that can be processed more easily. The question is how to sample the data? A popular sampling technique is Farthest Point Sampling (FPS). However, FPS is agnostic to a downstream application (classification, retrieval, etc.). The underlying assumption seems to be that minimizing the farthest point distance, as done by FPS, is a good proxy to other objective functions.

We show that it is better to learn how to sample. To do that, we propose a deep network to simplify 3D point clouds. The network, termed S-NET, takes a point cloud and produces a smaller point cloud that is optimized for a particular task. The simplified point cloud is not guaranteed to be a subset of the original point cloud. Therefore, we match it to a subset of the original points in a post-processing step. We contrast our approach with FPS by experimenting on two standard data sets and show significantly better results for a variety of applications. Our code is publicly available.
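For reference, the FPS baseline discussed above is a short greedy loop; this NumPy sketch is our own illustration (function name and signature assumed), not the paper's code:

```python
import numpy as np

def farthest_point_sampling(points, k, seed=0):
    """Greedy FPS: repeatedly add the point farthest from the current sample.

    points: (n, d) array; k: sample size. Returns an index array of length k.
    """
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    chosen = [int(rng.integers(n))]  # arbitrary starting point
    # dist[i] = distance from point i to its nearest chosen point
    dist = np.linalg.norm(points - points[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))  # current farthest point
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(chosen)
```

Note that FPS never consults the downstream task, which is exactly the gap the learned sampler above is designed to close.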

Unramified principal series representations of p-adic GL(r) and its metaplectic covers are important in the theory of automorphic forms. I will present a method of relating the Whittaker coinvariants of such a representation with representations of quantum affine gl_n. This involves using a Schur-Weyl duality result due to Chari and Pressley and it allows us to compute the dimension of the Whittaker model of every irreducible smooth representation with Iwahori fixed vectors.

If time permits I will explain a conjectured version of this result for the symplectic group Sp(2r) which involves quantum symmetric pairs.

Many data analysis pipelines are adaptive: the choice of which analysis to run next depends on the outcome of previous analyses. Common examples include variable selection for regression problems and hyper-parameter optimization in large-scale machine learning problems: in both cases, common practice involves repeatedly evaluating a series of models on the same dataset. Unfortunately, this kind of adaptive re-use of data invalidates many traditional methods of avoiding over-fitting and false discovery, and has been blamed in part for the recent flood of non-reproducible findings in the empirical sciences. An exciting line of work beginning with Dwork et al. 2015 establishes the first formal model and first algorithmic results providing a general approach to mitigating the harms of adaptivity, via a connection to the notion of differential privacy. Unfortunately, until now, those results were primarily of information theoretic interest, only beating out the simple approach of gathering fresh data for every computation ("sample-splitting") at the scale of many millions of datapoints.

In this work, we give a new proof of the transfer theorem that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our new proof is elementary and gives structural insights that we expect will be useful elsewhere. We show: 1) that differential privacy ensures that the expectation of any query on the conditional distribution on datasets induced by the transcript of the interaction is close to its expectation on the data distribution, and 2) sample accuracy on its own ensures that any query answer produced by the mechanism is close to the expectation of the query on the conditional distribution. This second claim follows from a thought experiment in which we imagine that the dataset is resampled from the conditional distribution after the mechanism has committed to its answers. The transfer theorem then follows by summing these two bounds, and in particular, avoids the "monitor argument" used to derive high probability bounds in prior work.

An upshot of our new proof technique is that the concrete bounds we obtain are substantially better than the best previously known bounds, even though the improvements are in the constants, rather than the asymptotics (which are known to be tight). As we show, our new bounds outperform the naive "sample-splitting" baseline at dramatically smaller dataset sizes compared to the previous state of the art, bringing techniques from this literature closer to practicality.

Joint work with: Christopher Jung, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi (UPenn), and Moshe Shenfeld (HUJI).

This work appeared at ITCS 2020.

This talk will address the following question: In what cases does learning a good hypothesis require more resources than verifying a hypothesis proposed by someone else?** If verification is significantly cheaper than learning, that could have important practical implications for delegation of machine learning tasks in commercial settings, and might also have consequences for verification of scientific publications, and for AI safety. Two results will be discussed: (1) There exists a learning problem where verification requires quadratically fewer random samples than learning. (2) The broad class of Fourier-sparse functions (which includes decision trees, for example) can be efficiently verified using random samples only, even though it is widely believed to be impossible to efficiently learn this class from random samples alone.

This is joint work with Shafi Goldwasser (UC Berkeley), Guy Rothblum (WIS), and Amir Yehudayoff (Technion).

** "Learning" -- as defined in Valiant's Probably Approximately Correct (PAC) framework

Formal verification of infinite-state systems, and distributed systems in particular, is a long standing research goal. I will describe a series of works that develop a methodology for verifying distributed algorithms and systems using decidable logics, employing decomposition, abstraction, and user interaction. This methodology is implemented in an open-source tool, and has resulted in the first mechanized proofs of several important distributed protocols. I will also describe a novel approach to the problem of invariant inference based on a newly formalized duality between reachability and mathematical induction. The duality leads to a primal-dual search algorithm, and a prototype implementation already handles challenging examples that other state-of-the-art techniques cannot handle. I will briefly describe several other works, including a new optimization technique for deep learning computations that achieves significant speedups relative to existing deep learning frameworks.

Dialog is an effective way to exchange information, but subtle details and nuances are extremely important. While significant progress has paved a path to address visual dialog with algorithms, details and nuances remain a challenge. Attention mechanisms have demonstrated compelling results to extract details in visual question answering and also provide a convincing framework for visual dialog due to their interpretability and effectiveness. However, the many data utilities that accompany visual dialog challenge existing attention techniques. We address this issue and develop a general attention mechanism for visual dialog which operates on any number of data utilities. To this end, we design a factor graph based attention mechanism which combines any number of utility representations. We illustrate the applicability of the proposed approach on the challenging and recently introduced VisDial datasets, outperforming recent state-of-the-art methods by 1.1% for VisDial0.9 and by 2% for VisDial1.0 on MRR. Our ensemble model improved the MRR score on VisDial1.0 by more than 6%.

The Higher Criticism (HC) test is a useful tool for detecting the presence of a signal spread across a vast number of features, especially in the sparse setting when only a few features are useful while the rest are pure noise. We adapt the HC test to the two-sample setting of detecting changes between two frequency tables. We apply this adaptation to authorship attribution challenges, where the goal is to identify the author of a document using other documents whose authorship is known. The method is simple yet performs well without handcrafting and tuning. Furthermore, as an inherent side effect, the HC calculation identifies a subset of discriminating words, which allows additional interpretation of the results. Our examples include authorship in the Federalist Papers, machine-generated texts, and the identity of the creator of Bitcoin.

We take two approaches to analyze the success of our method. First, we show that, in practice, the discriminating words identified by the test have low variance across documents belonging to a corpus of homogeneous authorship. We conclude that in testing a new document against the corpus of an author, HC is mostly affected by words characteristic of that author and is relatively unaffected by topic structure. Finally, we analyze the power of the test in discriminating two multinomial distributions under a sparse and weak perturbation model. We show that our test has maximal power in a wide range of the model parameters, even though these parameters are unknown to the user.
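To make the scan concrete, here is a minimal sketch of the Donoho-Jin HC statistic applied to a list of per-word p-values (which, in the two-sample setting above, would come from comparing each word's counts in the two frequency tables). The function name and the gamma0 cutoff are our assumptions, not the authors' implementation:

```python
import math

def higher_criticism(pvals, gamma0=0.5):
    """Donoho-Jin Higher Criticism scan over per-feature p-values.

    Returns (HC statistic, indices of the features whose sorted p-values lie
    at or before the maximizer) -- the latter serving as the discriminating
    set. gamma0 limits the scan to the smallest gamma0-fraction of p-values.
    """
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    kmax = max(1, int(gamma0 * n))
    best, best_k = -math.inf, 1
    for k in range(1, kmax + 1):
        p = pvals[order[k - 1]]  # k-th smallest p-value
        if 0.0 < p < 1.0:
            # standardized excess of small p-values over the uniform baseline
            z = math.sqrt(n) * (k / n - p) / math.sqrt(p * (1.0 - p))
            if z > best:
                best, best_k = z, k
    return best, order[:best_k]
```

The index set returned alongside the statistic plays the role of the discriminating word subset mentioned above.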

An important role in modular representation theory is played by the Frobenius twist functor, twisting the k-linear structure of a representation by the Frobenius automorphism F(a)=a^p of the (algebraically closed) ground field k of characteristic p. I will define an analog of this functor for any symmetric tensor category of characteristic p. One of the main new features is that unlike the classical Frobenius twist functor, this functor need not be left or right exact. I will give examples when it is not and describe a replacement of the exactness property. I will also describe applications of this notion to formulating and proving analogs of Deligne's theorem in positive characteristic. This is joint work with V. Ostrik.

We prove that for all constants a, NQP = NTIME[n^polylog(n)] cannot be (1/2 + 2^(-log^a n))-approximated by 2^(log^a n)-size ACC^0 ∘ THR circuits (ACC^0 circuits with a bottom layer of threshold gates). Previously, it was even open whether E^NP can be (1/2 + 1/sqrt(n))-approximated by AC^0[2] circuits.

More generally, we establish a connection showing that, for a typical circuit class C, non-trivial nondeterministic CAPP algorithms imply strong (1/2 + 1/n^{omega(1)}) average-case lower bounds for nondeterministic time classes against C circuits. The existence of such (deterministic) algorithms is much weaker than the widely believed conjecture PromiseBPP = PromiseP.

Our new results build on a line of recent works, including [Murray and Williams, STOC 2018], [Chen and Williams, CCC 2019], and [Chen, FOCS 2019]. In particular, it strengthens the corresponding (1/2 + 1/polylog(n))-inapproximability average-case lower bounds in [Chen, FOCS 2019]. The two important technical ingredients are techniques from Cryptography in NC^0 [Applebaum et al., SICOMP 2006], and Probabilistic Checkable Proofs of Proximity with NC^1-computable proofs.

In this talk we introduce Newton non-degenerate codimension one foliations, defined in terms of Newton polyhedra following the classical ideas of Kouchnirenko for hypersurfaces. We characterize Newton non-degenerate foliations in geometric terms, as the foliations admitting a reduction of singularities of a combinatorial nature, which we call (weak) toric type foliations. In addition, we provide a positive answer to Thom's question about the existence of an invariant hypersurface for germs of codimension one foliations, in the three-dimensional case of toric type. To do so, we pass through a study of a global character in dimension two. The Khovanskii-Kouchnirenko-Bernstein result, which relates the number of solutions of a Laurent polynomial system to the mixed volume of the associated polyhedra, leads us to conclude that isolated branches of Newton non-degenerate foliations defined over projective toric surfaces extend in a global way. This prolongation property is the key to spreading the invariant surfaces in dimension three through the compact dicritical components.

Modern machine learning algorithms have achieved remarkable performance in a myriad of applications, and are increasingly used to make impactful decisions in the hiring process, criminal sentencing, healthcare diagnostics and even to make new scientific discoveries. The use of data-driven algorithms in high-stakes applications is exciting yet alarming: these methods are extremely complex, often brittle, and notoriously hard to analyze and interpret. Naturally, concerns have been raised about the reliability, fairness, and reproducibility of the output of such algorithms. This talk introduces statistical tools that can be wrapped around any "black-box" algorithm to provide valid inferential results while taking advantage of their impressive performance. We present novel developments in conformal prediction and quantile regression, which rigorously guarantee the reliability of complex predictive models, and show how these methodologies can be used to treat individuals equitably. Next, we focus on reproducibility and introduce an operational selective inference tool that builds upon the knockoff framework and leverages recent progress in deep generative models. This methodology allows for reliable identification of a subset of important features that is likely to explain a phenomenon under study in a challenging setting where the data distribution is unknown, e.g., mutations that are truly linked to changes in drug resistance.

Let B denote the range of Brownian motion in R^d. For a deterministic Borel measure nu, we wish to find a random measure mu such that the support of mu is contained in B and the expectation of mu is nu. We discuss exactly when such a random measure exists and construct it in those cases. We establish a formula for the expectation of the double integral with respect to mu, which is a strong tool for the geometric measure theory of the Brownian path.

One of the unresolved questions in deep learning is the nature of the solutions that are being discovered. We investigated the collection of solutions reached by the same neural network (NN) architecture, with different random initialization of weights and random mini-batches. These solutions are shown to be rather similar -- more often than not, each train and test example is either classified correctly by all NN instances, or by none at all. Furthermore, all NNs seem to share the same learning dynamics, whereby initially the same train and test examples are correctly recognized by the learned model, followed by other examples that are learned in roughly the same order. When extending the investigation to heterogeneous collections of NN architectures, once again examples are seen to be learned in the same order irrespective of architecture, although the more powerful architecture may continue to learn and thus achieve higher accuracy. Finally, I will discuss cases where this pattern of similarity breaks down, which show that the reported similarity is not an artifact of optimization by gradient descent.

This talk presents new computational and statistical barriers in machine learning, along with the algorithmic developments that they inspire.

The computational barriers arise in nonconvex optimization: we prove lower bounds on the (oracle) complexity of finding stationary points using (stochastic) gradient methods, showing that gradient descent is unimprovable for a natural class of problems. We bypass this barrier by designing an algorithm that outperforms gradient descent for a large subclass of problems with high-order smoothness. Our algorithm leverages classical momentum techniques from convex optimization using a "convex until proven guilty" principle that we develop.

The statistical barrier is the large amount of data required for adversarially robust learning. In a Gaussian model, we prove that unlabeled data allows us to circumvent an information theoretic gap between robust and standard classification. Our analysis directly leads to a general robust self-training procedure; we use it to significantly improve the state of the art.

Numerous researchers recently applied empirical spectral analysis to the study of modern deep learning classifiers, observing spectral outliers and small but distinct bumps often seen beyond the edge of a "main bulk". This talk presents an important formal class/cross-class structure and shows how it lies at the origin of these visually striking patterns. The structure is shown to permeate the spectra of deepnet features, backpropagated errors, gradients, weights, Fisher Information matrix and Hessian, whether these are considered in the context of an individual layer or the concatenation of them all. The significance of the structure is illustrated by (i) demonstrating empirically that the feature class means separate gradually from the bulk as a function of depth and become increasingly more orthogonal, (ii) proposing a correction to KFAC, a well known second-order optimization algorithm for training deepnets, and (iii) proving in the context of multinomial logistic regression that the ratio of outliers to bulk in the spectrum of the Fisher information matrix is predictive of misclassification.
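A toy simulation conveys the class-mean picture: features drawn as unit-variance noise around well-separated per-class means produce a covariance spectrum with roughly (classes - 1) outliers above the noise bulk. All names and parameters below are hypothetical illustrations, not the speaker's experiments:

```python
import numpy as np

def class_structured_spectrum(n_per_class=200, dim=50, classes=3, sep=4.0, seed=0):
    """Toy model: Gaussian noise around random per-class means.

    Returns covariance eigenvalues in decreasing order; after centering, the
    class means contribute ~(classes - 1) large outliers above an O(1) bulk.
    """
    rng = np.random.default_rng(seed)
    means = sep * rng.normal(size=(classes, dim))
    feats = np.concatenate([m + rng.normal(size=(n_per_class, dim)) for m in means])
    feats -= feats.mean(axis=0)               # remove the global mean
    cov = feats.T @ feats / feats.shape[0]
    return np.sort(np.linalg.eigvalsh(cov))[::-1]
```

The sharp drop after the first (classes - 1) eigenvalues is the simplest version of the outlier/bulk separation the talk attributes to class structure.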

Fine-grained complexity utilizes a small set of conjectures to derive conditional lower bounds for a large collection of problems. These conjectures concern the time complexity of a few core problems such as k-SAT, Orthogonal Vectors, 3SUM, k-Clique, and Set Cover. The relationships between these conjectures are poorly understood.

This talk will discuss some connections between the conjectures, including a tight reduction from Weighted k-Clique to Orthogonal Vectors and new (quantum-inspired) findings about the Set Cover Conjecture.

In recent years there has been an increasing gap between the success of machine learning algorithms and our ability to explain their success theoretically.

Namely, many of the problems that are solved to a satisfactory degree of precision are computationally hard in the worst case. Fortunately, there are often reasonable assumptions which help us to get around these worst-case impediments and allow us to rigorously analyze heuristics that are used in practice.

In this talk I will advocate a complementary approach, where instead of explicitly characterizing some desired "niceness" properties of the data, we assume access to an optimization oracle that solves a relatively simpler task. This allows us to identify the sources of hardness and extend our theoretical understanding to new domains. Furthermore, we will show that seemingly innocent (and arguably justifiable) modifications to the oracle can lead to tractable reductions and even bypass hardness results.

We demonstrate these ideas using the following results: (i) an efficient algorithm for non-convex online learning using an optimization oracle; (ii) a faster boosting algorithm using a "simple" weak learner; (iii) an efficient reduction from online to private learning.

Joint works with Naman Agarwal, Noga Alon, Elad Hazan, and Shay Moran.

It is well known that the Central Limit Theorem is very effective, giving a reasonable approximation even for sums of a fairly small number of terms. Edgeworth expansions provide a convenient way to control the error in the Central Limit Theorem. In this talk I will review some recent results on this subject related to the interface between probability and dynamics.
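For orientation, the classical first-order Edgeworth expansion for sums of i.i.d. random variables (the starting point that the dynamical results generalize) reads:

```latex
% First-order Edgeworth expansion for i.i.d. X_i with mean \mu, variance
% \sigma^2, and skewness \gamma = E[(X_1-\mu)^3]/\sigma^3, S_n = X_1+\dots+X_n
% (valid under Cram\'er's condition):
\[
  \mathbb{P}\!\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \le x\right)
  = \Phi(x) - \varphi(x)\,\frac{\gamma\,(x^2 - 1)}{6\sqrt{n}}
  + o\!\left(n^{-1/2}\right),
\]
% where \Phi and \varphi denote the standard normal CDF and density.
```

The correction term refines the plain normal approximation Φ(x) by accounting for the skewness of the summands.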

Recent developments in Natural Language Processing (NLP) allow models to leverage large, unprecedented amounts of raw text, culminating in impressive performance gains in many of the field's long-standing challenges, such as machine translation, question answering, and information retrieval.

In this talk, I will show that despite these advances, state-of-the-art NLP models often fail to capture crucial aspects of text understanding. Instead, they excel by finding spurious patterns in the data, which lead to biased and brittle performance. For example, machine translation models are prone to translate doctors as men and nurses as women, regardless of context. Next, I will discuss an approach that could help overcome these challenges by explicitly representing the underlying meaning of texts in formal data structures. Finally, I will present robust models that use such explicit representations to effectively identify meaningful patterns in real-world texts, even when training data is scarce.

The rediscovery of the Archimedes Palimpsest in 1998 was a unique event: one of the most important authors of antiquity was read afresh, with many transformations made in the established text. Twenty years later, what have we learned from the Archimedes Palimpsest? What changed in our picture of the ancient history of mathematics?

Reviel Netz is Pat Suppes Professor of Greek Mathematics and Astronomy at Stanford University. Besides editing the Archimedes Palimpsest, he has published many books and articles on the history of Greek mathematics and on other subjects. His most recent book, "Scale, Space and Canon in Ancient Literary Culture", is now forthcoming from Cambridge University Press.

Refreshments will be served at 11:00am.

Define a relation between labeled ideal polygons in the hyperbolic space by requiring that the complex distances (a combination of the distance and the angle) between their respective sides equal c; the complex number c is a parameter of the relation. This defines a 1-parameter family of maps on the moduli space of ideal polygons in the hyperbolic space (or, in its real version, in the hyperbolic plane). I shall discuss complete integrability of this family of maps and mention related topics, including Coxeter's frieze patterns and elements of the theory of cluster algebras

The Caffarelli contraction theorem states that the Brenier optimal transport map sending the Gaussian measure onto a uniformly log-concave probability measure is Lipschitz. In this talk, I will present a new proof, using entropic regularization and a variational characterization of Lipschitz transport maps due to Gozlan and Juillet. Based on joint work with Nathael Gozlan and Maxime Prod'homme.

I will present a new approach for image parsing, Pixel Consensus Voting (PCV). The core of PCV is a framework for instance segmentation based on the Generalized Hough transform. Pixels cast discretized, probabilistic votes for the likely regions that contain instance centroids. At the detected peaks that emerge in the voting heatmap, backprojection is applied to collect pixels and produce instance masks. Unlike a sliding window detector that densely enumerates object proposals, our method detects instances as a result of the consensus among pixel-wise votes. We implement vote aggregation and backprojection using native operators of a convolutional neural network. The discretization of centroid voting reduces the training of instance segmentation to pixel labeling, analogous and complementary to fully convolutional network-style semantic segmentation, leading to an efficient and unified architecture that jointly models things and stuff. We demonstrate the effectiveness of our pipeline on COCO and Cityscapes Panoptic Segmentation and obtain competitive results. This joint work with Haochen Wang (TTIC/CMU), Ruotian Luo (TTIC), and Michael Maire (TTIC/UChicago) received an Innovation Award at the COCO/Mapillary workshop at ICCV 2019.

The network line rate is constantly on the rise to support the exploding amounts of data. This means that we have less time to process individual packets, despite a growing demand for better network telemetry. Moreover, CPU speeds are not rising at the same rate as we near the end of Moore's law, making it harder to rely on software computations. Programmable hardware switches are an emerging technology that enables flexible packet processing while optimizing for throughput and latency.

In this talk, I will introduce algorithms that leverage programmable switches for accelerating database operations and for improving network telemetry. Switches are orders of magnitude more efficient than traditional hardware accelerators, exist in the datapath, and are ideal for computation offloading. For telemetry, we will discuss how switches can probabilistically encode information across multiple packets to provide fine-grained network visibility with minimal overheads.

We construct a delegation scheme for delegating polynomial time computations. Our scheme is publicly verifiable and non-interactive in the common reference string (CRS) model. The soundness of our scheme is based on an efficiently falsifiable decisional assumption on groups with bilinear maps. Prior to this work, publicly verifiable non-interactive delegation schemes were only known under knowledge assumptions (or in the Random Oracle model) or under non-standard assumptions related to obfuscation or multilinear maps.

In addition, our scheme has two desirable features: the proofs are unambiguous, in the sense that it is hard to find two distinct proofs for the same statement, and updatable, in the sense that given a proof for the statement that a Turing machine M transitions from configuration C_0 to C_T in T steps, one can efficiently generate a proof for the statement that M transitions from configuration C_0 to C_{T+1} in T+1 steps.

We show that such a delegation scheme implies PPAD hardness, by following a similar approach to that of Choudhuri et al. (STOC 2019), who obtained PPAD hardness based on an assumption related to the soundness of the Fiat-Shamir paradigm.

This is based on two joint works, both with Omer Paneth and Lisa Yang.

We develop differentially private mechanisms that achieve nearly instance-optimal loss: lower loss than all appropriately unbiased mechanisms on any possible instance. We show that our mechanisms, with a modest increase in sample size (logarithmic or constant), are instance-optimal for a large family of functions. In contrast to existing mechanisms, which use the global or local sensitivity of the function being estimated and are therefore necessarily instance-suboptimal, the key to our construction is to use the inverse of the sensitivity. This allows a simple instance-optimal algorithm, and we develop several representative private mechanisms, including for the median and regression problems.
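The inverse-sensitivity idea can be illustrated with the exponential mechanism for the median. The score below, how many data points would have to change before a candidate becomes the median, is a crude proxy for the inverse sensitivity (the precise definition in this line of work is more careful), and all names, the candidate grid, and the weighting are hypothetical choices for the sketch.

```python
import bisect
import math
import random

def inverse_sensitivity(y, data):
    """Crude proxy: roughly how many points must change before y is the median."""
    xs = sorted(data)
    r = bisect.bisect_left(xs, y)      # number of points strictly below y
    return abs(r - len(xs) // 2)

def private_median(data, candidates, eps, rng=random.Random(0)):
    """Exponential mechanism scored by (a proxy for) the inverse sensitivity."""
    scores = [inverse_sensitivity(c, data) for c in candidates]
    weights = [math.exp(-0.5 * eps * s) for s in scores]
    total = sum(weights)
    u, acc = rng.random() * total, 0.0   # sample proportionally to the weights
    for c, w in zip(candidates, weights):
        acc += w
        if u <= acc:
            return c
    return candidates[-1]
```

Note how the score of an outlier like 100 is large even though the local sensitivity of the median near it may be huge; this is the sense in which scoring outputs, rather than perturbing by a sensitivity, can be instance-adaptive.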

**Esty Kelman [TAU] Title: Boolean Functions and Their Effective Degree**

Abstract: The KKL Theorem (there is always a significantly influential variable) is sometimes tight, as for the Tribes function, but often is not, as for the Majority function. We propose a generalized version of KKL with a new parameter, the effective degree, which replaces the role of the average degree in the KKL statement. This allows us to prove a result that is tight in many cases, showing how large the influence of the most influential variable must be. We also generalize KKL in another direction, finding a significantly influential variable within a subset of the variables. This generalization, in turn, implies a generalized version of Friedgut's Junta Theorem. If time allows, we will see how easily our theorem implies Friedgut's Junta Theorem and stronger junta results.
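For concreteness, the influences that KKL bounds can be computed by brute force for small n; this is a standard textbook computation (the influence of variable i is the probability that flipping it changes the output), not code from the work being presented.

```python
from itertools import product

def influences(f, n):
    """Influences of f: {0,1}^n -> {0,1}: for each variable, the fraction
    of inputs on which flipping that variable flips the output."""
    counts = [0] * n
    for x in product((0, 1), repeat=n):
        fx = f(x)
        for i in range(n):
            y = list(x)
            y[i] ^= 1                     # flip coordinate i
            if f(tuple(y)) != fx:
                counts[i] += 1
    return [c / 2 ** n for c in counts]

# Majority on 3 bits: every variable has influence 1/2 (KKL is far from
# tight here), illustrating the gap the effective degree is meant to close.
maj3 = lambda x: int(sum(x) >= 2)
```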

**Asaf Katz [U Chicago] Title: Measure rigidity for Anosov flows via the factorization method**

Abstract: We show how the factorization method, pioneered by Eskin and Mirzakhani in their groundbreaking work about measure classification for the moduli space of translation surfaces, can be adapted to smooth ergodic theory.

Using this adaptation, we show that for a quantitatively non-integrable Anosov flow, every generalized u-Gibbs measure is absolutely continuous with respect to the whole unstable manifold.

In the talk I will present the Hue-Net - a novel Deep Learning framework for Intensity-based Image-to-Image Translation.

The key idea is a new technique we term network augmentation which allows a __differentiable__ construction of intensity histograms from images.

We further introduce __differentiable__ representations of (__1D__) cyclic and joint (__2D__) histograms and use them for defining loss functions based on cyclic Earth Mover's Distance (__EMD__) and Mutual Information (MI). While the Hue-Net can be applied to several image-to-image translation tasks, we choose to demonstrate its strength on color transfer problems, where the aim is to paint a source image with the colors of a different target image. Note that the desired output image does not exist and therefore cannot be used for supervised pixel-to-pixel learning.

This is accomplished by using the __HSV__ color-space and defining an intensity-based loss that is built on the __EMD__ between the cyclic hue histograms of the output and the target images. To enforce color-free similarity between the source and the output images, we define a semantic-based loss by a __differentiable__ approximation of the MI of these images.

The incorporation of histogram loss functions in addition to an adversarial loss enables the construction of semantically meaningful and realistic images.

Promising results are presented for different __datasets__.
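The differentiable-histogram idea can be sketched as soft binning: each intensity value spreads a normalized Gaussian weight over all bin centers instead of making a hard 0/1 assignment, so the histogram becomes a smooth function of the pixel values. This is an illustration of the general technique only; the Hue-Net's exact kernel, binning, and normalization may differ, and the names here are invented.

```python
import math

def soft_histogram(values, n_bins=8, lo=0.0, hi=1.0, sigma=0.05):
    """Differentiable (soft) histogram of scalar values in [lo, hi].
    Each value contributes Gaussian weights to every bin center,
    normalized so that each value contributes total mass 1."""
    centers = [lo + (i + 0.5) * (hi - lo) / n_bins for i in range(n_bins)]
    hist = [0.0] * n_bins
    for v in values:
        w = [math.exp(-0.5 * ((v - c) / sigma) ** 2) for c in centers]
        z = sum(w)
        for i in range(n_bins):
            hist[i] += w[i] / z          # soft assignment instead of argmin
    return [h / len(values) for h in hist]
```

Because every operation is smooth in the input values, losses built on such histograms (e.g. an EMD between hue histograms) can be backpropagated through, which is exactly what hard binning forbids.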

In a joint work with Erez Lapid we constructed a new class of representations based on applying the RSK algorithm to Zelevinsky's multisegments. These constructions have the potential to be an alternative to the commonly used basis of standard representations. Intriguingly, this class also turned out to categorify a 45-year-old development in invariant theory: the Rota basis of standard bitableaux.

I will talk about this classical theme and its relation to representations of p-adic GL_n, as well as the expected properties of our new class.

A classical theorem of Jordan asserts that each finite subgroup of the complex general linear group GL(n) is "almost commutative": it contains a commutative normal subgroup whose index is bounded by a universal constant depending only on n.

We discuss an analogue and variants of this property for the groups of birational (and biregular) automorphisms of complex algebraic varieties, the diffeomorphism groups of real manifolds, and the groups of bimeromorphic (and biholomorphic) automorphisms of compact complex manifolds.

In this talk, we will discuss a new type of pseudo-random object called a "pseudo-random pseudo-distribution". This object was introduced in the context of the BPL vs. L problem, and I will sketch a space-efficient construction of such pseudo-distributions for read-once branching programs with near-optimal dependence on the error parameter. The talk is a distillation of joint work with Mark Braverman and Sumegha Garg (the paper is available online: https://eccc.weizmann.ac.il/report/2017/161/).

The past decade has witnessed the emergence of single-cell technologies that measure the expression level of genes at a single-cell resolution. These developments have revolutionized our understanding of the rich heterogeneity, structure, and dynamics of cellular populations, by probing the states of millions of cells, and their change under different conditions or over time. However, in standard experiments, information about the spatial context of cells, along with additional layers of information they encode about their location along dynamic processes (e.g. cell cycle or differentiation trajectories), is either lost or not explicitly accessible. This poses a fundamental problem for elucidating collective tissue function and mechanisms of cell-to-cell communication.

In this talk I will present computational approaches for addressing these challenges, by learning interpretable representations of structure, context and design principles for multicellular systems from single-cell information. I will first describe how the locations of cells in their tissue of origin and the resulting spatial gene expression can be probabilistically inferred from single-cell information by a generalized optimal-transport optimization framework, that can flexibly incorporate prior biological assumptions or knowledge derived from experiments. Inference in this case is based on an organization principle for spatial gene expression, namely a structural correspondence between distances of cells in expression and physical space, which we hypothesized and supported for different tissues. We used this framework to spatially reconstruct diverse tissues and organisms, including the fly embryo, mammalian intestinal epithelium and cerebellum, and further inferred spatially informative genes. Since cells encode multiple layers of information, in addition to their spatial context, I will also discuss several approaches for the disentanglement of single-cell gene expression into distinct biological processes, based on ideas rooted in random matrix theory and manifold learning. I will finally discuss how these results can be generalized to reveal principles underlying self-organization of cells into multicellular structures, setting the foundation for the computationally-directed design of cell-to-cell interactions optimized for specific tissue structure or function.

Let Z(t) be a Gaussian stationary function on the real line, and fix a level L>0.

We are interested in the asymptotic behavior of the persistence probability: P(T) = P(Z(t) > L for all t in [0,T]).

One would guess that for "nice" processes the behavior of P(T) should be exponential. For non-negative correlations this may be established via sub-additivity arguments. However, until now, not a single example with sign-changing correlations was known to exhibit existence of the limit of (log P(T))/T as T approaches infinity (that is, to have a true "persistence exponent").

In the talk I will present a proof of existence of the persistence exponent, for processes whose spectral measure is monotone on [0,∞) and is continuous and non-vanishing at 0. This includes, for example, the sinc-kernel process (whose covariance function is sin(t)/t ).

Joint work with Ohad Feldheim and Sumit Mukherjee.
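The persistence probability for the sinc-kernel process can be estimated by Monte Carlo using its spectral representation: with a flat spectral measure on [0,1], Z(t) = N^{-1/2} * sum_k (a_k cos(w_k t) + b_k sin(w_k t)) with standard Gaussian a_k, b_k and uniform w_k has covariance sin(t)/t. The discretization parameters below are illustrative choices for a sketch, not from the paper.

```python
import math
import random

def persistence_estimates(L=0.0, horizons=(1.0, 5.0), n_paths=100,
                          n_freqs=32, dt=0.05, rng=random.Random(1)):
    """Monte Carlo estimate of P(T) = P(Z(t) > L for all t in [0, T])
    for the sinc-kernel process, via its spectral representation
    (flat spectral measure on [0, 1]); shared paths across horizons."""
    t_max = max(horizons)
    grid = [k * dt for k in range(int(t_max / dt) + 1)]
    survive = {T: 0 for T in horizons}
    for _ in range(n_paths):
        om = [rng.random() for _ in range(n_freqs)]      # frequencies
        a = [rng.gauss(0, 1) for _ in range(n_freqs)]
        b = [rng.gauss(0, 1) for _ in range(n_freqs)]
        c = 1.0 / math.sqrt(n_freqs)
        first_dip = t_max + 1.0
        for t in grid:
            z = c * sum(a[k] * math.cos(om[k] * t) + b[k] * math.sin(om[k] * t)
                        for k in range(n_freqs))
            if z <= L:                   # the path failed to stay above L
                first_dip = t
                break
        for T in horizons:
            if first_dip > T:
                survive[T] += 1
    return {T: survive[T] / n_paths for T in horizons}
```

Since the events are nested in T, the estimates are automatically monotone when the same sample paths are reused; the conjectured exponential decay would show up as (log P(T))/T stabilizing for large T.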

Two popular objectives optimized in clustering algorithms are k-means and k-median. The k-means (resp. k-median) problem in the L_p-metric is specified by n points as input, and the goal is to classify the input point-set into k clusters such that the k-means (resp. k-median) objective is minimized. The best-known inapproximability factors in the literature for these NP-hard problems in L_p-metrics were well below 1.01. In this talk, we take a significant step toward improving the hardness of approximating these problems in various L_p-metrics.
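For reference, the two objectives differ only in whether each point pays its distance or its squared distance to the nearest center. The function below is the standard textbook definition with hypothetical names, stated for the L_p-metric.

```python
def clustering_cost(points, centers, p=2, objective="means"):
    """Cost of a clustering in the L_p metric: each point pays the distance
    (k-median) or squared distance (k-means) to its nearest center."""
    total = 0.0
    for x in points:
        d = min(sum(abs(xi - ci) ** p for xi, ci in zip(x, c)) ** (1.0 / p)
                for c in centers)
        total += d * d if objective == "means" else d
    return total
```

The example in the test shows why the two objectives can prefer different solutions: squaring penalizes the far point (0,4) nine times as much as k-median does.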

When a very fast dynamic event is recorded with a low-framerate camera, the resulting video suffers from severe motion blur (due to exposure time) and motion aliasing (due to low sampling rate in time). True Temporal Super-Resolution (TSR) is more than just Temporal-Interpolation (increasing framerate). It also recovers new high temporal frequencies beyond the temporal Nyquist limit of the input video, thus resolving both motion-blur and motion-aliasing. In this work we propose a "Deep Internal Learning" approach for true TSR. We train a video-specific CNN on examples extracted directly from the low-framerate input video. Our method exploits the strong recurrence of small space-time patches inside a single video sequence, both within and across different spatio-temporal scales of the video. We further observe (for the first time) that small space-time patches recur also across dimensions of the video sequence, i.e., by swapping the spatial and temporal dimensions. In particular, the higher spatial resolution of video frames provides strong examples as to how to increase the temporal resolution of that video. Such internal video-specific examples give rise to strong self-supervision, requiring no data but the input video itself. This results in Zero-Shot Temporal-SR of complex videos, which removes both motion blur and motion aliasing, outperforming previous supervised methods trained on external video datasets.

* Joint work with Shai Bagon, Eyal Naor, George Pisha, Michal Irani

The question of determining if a given algebraic variety is rational is a notoriously difficult problem in algebraic geometry, and attempts to solve rationality problems have often produced powerful new techniques. A well-known open rationality problem is the determination of a criterion for when a cubic hypersurface of five-dimensional projective space is rational. After discussing the history of this problem, I will introduce the two conjectural rationality criteria that have been put forth and then discuss a package of tools I have developed with my collaborators to bring these two conjectures together. Our theory of Relative Bridgeland Stability has a number of other beautiful consequences, such as a new proof of the integral Hodge Conjecture for cubic fourfolds and the construction of full-dimensional families of projective hyperkähler manifolds. Time permitting, I'll discuss applications of the theory of relative stability conditions to problems other than cubic fourfolds.

The doubling method, first introduced by Piatetski-Shapiro and Rallis in the 80s, has had numerous applications, e.g. to the theta correspondence and to arithmetic problems. In a series of recent works this method was generalized in several aspects, with an application to functoriality from classical groups to GL(N).

One crucial ingredient for the development of the theory is a multiplicity one result, obtained recently in a joint work with Dima and Rami.

I will briefly survey the method, discuss the multiplicity one result, and talk about applications to covering groups.

Parts of the talk are also based on a collaboration with Cai, Friedberg and Ginzburg.

We present a new/old approach to the design of online algorithms via Bregman projections. This approach is applicable to a wide range of online problems and we discuss connections to previous work on online primal-dual algorithms. In particular, the k-server problem on trees and HSTs is considered. The projection-based algorithm for this problem turns out to have a competitive ratio that matches some of the recent results given by Bubeck et al. (STOC 2018), whose algorithm uses mirror-descent-based continuous dynamics prescribed via a differential inclusion.

Joint work with Niv Buchbinder, Anupam Gupta, and Marco Molinaro.

The ideal boundary of a negatively curved manifold naturally carries two types of measures.

On the one hand, we have conditionals for equilibrium (Gibbs) states associated to Hoelder potentials; these include the Patterson-Sullivan measure and the Liouville measure. On the other hand, we have stationary measures coming from random walks on the fundamental group.

We compare and contrast these two classes. First, we show that both of these measures can be associated to geodesic-flow-invariant measures on the unit tangent bundle, with respect to which closed geodesics satisfy different equidistribution properties. Second, we show that absolute continuity between a harmonic measure and a Gibbs measure is equivalent to a relation between entropy, (generalized) drift and critical exponent, generalizing previous formulas of Guivarc'h, Ledrappier, and Blachere-Haissinsky-Mathieu. This shows that if the manifold (or more generally, a CAT(-1) quotient) is geometrically finite but not convex cocompact, stationary measures are always singular with respect to Gibbs measures.

A major technical tool is a generalization of a deviation inequality due to Ancona, saying that the so-called Green distance associated to the random walk is nearly additive along geodesics in the universal cover.

Part of this is based on joint work with Gerasimov-Potyagailo-Yang and part on joint work with Tiozzo.

Amir Dembo (Stanford) "Dynamics for spherical spin glasses: Disorder dependent initial conditions"

Abstract: In this talk, based on a joint work with Eliran Subag, I will explain how to rigorously derive the integro-differential equations that arise in the thermodynamic limit of the empirical correlation and response functions for Langevin dynamics in mixed spherical p-spin disordered mean-field models.

I will then compare the large time asymptotic of these equations in case of a uniform (infinite-temperature) starting point, to what one obtains when starting within one of the spherical bands on which the Gibbs measure concentrates at low temperature, commenting on the existence of an aging phenomenon, and on the relations with the recently discovered geometric structure of the Gibbs measures at low temperature.

Alexander Fish (Sydney) "Finite configurations in trees of positive growth rate"

Abstract: We will talk about the relation between the abundance of finite configurations that we observe in trees and their growth rate.

We will survey the Furstenberg-Weiss correspondence principle, which relates a tree of positive growth rate to a Markov process and thereby provides a quantitative relationship between the density of a finite subtree appearing in the tree and a quantity defined on the Markov space that can be estimated by ergodic methods. We will sketch proofs of one direct and one inverse theorem relating the abundance of certain finite configurations to the growth rate of a tree. Some open problems related to this work will be discussed. Based on joint work with Leo Jiang (Toronto).

Super resolution (SR) methods typically assume that the low-resolution (LR) image was downscaled from the unknown high-resolution (HR) image by a fixed "ideal" downscaling kernel (e.g. Bicubic downscaling). However, this is rarely the case in real LR images, in contrast to synthetically generated SR datasets. When the assumed downscaling kernel deviates from the true one, the performance of SR methods significantly deteriorates. This gave rise to Blind-SR - namely, SR when the downscaling kernel ("SR-kernel") is unknown. It was further shown that the true SR-kernel is the one that maximizes the recurrence of patches across scales of the LR image. In this paper we show how this powerful cross-scale recurrence property can be realized using Deep Internal Learning. We introduce "KernelGAN", an image-specific Internal-GAN, which trains solely on the LR test image at test time, and learns its internal distribution of patches. Its Generator is trained to produce a downscaled version of the LR test image, such that its Discriminator cannot distinguish between the patch distribution of the downscaled image, and the patch distribution of the original LR image. The Generator, once trained, constitutes the downscaling operation with the correct image-specific SR-kernel. KernelGAN is fully unsupervised, requires no training data other than the input image itself, and leads to state-of-the-art results in Blind-SR when plugged into existing SR algorithms.
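The downscaling operation that KernelGAN's generator learns to mimic, convolution with the SR-kernel followed by subsampling, can be sketched directly. This is a toy list-based version for illustration, not the actual tensor implementation, and the stride and kernel are example choices.

```python
def downscale(image, kernel, stride=2):
    """'Valid' 2D convolution with a downscaling kernel, then subsampling
    by `stride`: the image-formation model assumed in Blind-SR, and the
    operation KernelGAN's generator recovers image-specifically."""
    H, W = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(0, H - kh + 1, stride):
        row = []
        for j in range(0, W - kw + 1, stride):
            row.append(sum(image[i + u][j + v] * kernel[u][v]
                           for u in range(kh) for v in range(kw)))
        out.append(row)
    return out
```

KernelGAN's discriminator then compares small patches of `downscale(image, kernel)` against patches of the original image; the kernel for which the two patch distributions are indistinguishable is taken as the SR-kernel.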

The Zariski Cancellation Problem asks when a stable isomorphism of affine varieties over an algebraically closed field implies an isomorphism. This is true for affine curves (Abhyankar, Eakin, and Heinzer 1972) and for the affine plane in characteristic zero (Miyanishi-Sugie and Fujita 1979-80), but false for general affine surfaces in characteristic zero (Danielewski 1988) and for the affine space A^3 in positive characteristic (N. Gupta 2013). The talk is devoted to recent progress in the surface case over a field of characteristic zero (Bandman-Makar-Limanov, Dubouloz, Flenner and Kaliman, et al.). It turns out to be possible to describe the moduli space of pairs of surfaces with isomorphic cylinders.

Benjamini-Schramm convergence is a probabilistic notion useful in studying the asymptotic behavior of sequences of metric spaces. The goal of this talk is to discuss this notion and some of its applications from various perspectives, e.g. for groups, graphs, hyperbolic manifolds and locally symmetric spaces, emphasizing the distinction between the hyperbolic rank-one case and the rigid high-rank case. Understanding the "sofic" part of the Benjamini-Schramm space, i.e. all limit points of "finitary" objects, will play an important role. From the group-theoretic perspective, I will talk about sofic groups, i.e. groups which admit a probabilistic finitary approximation, as well as a companion notion of permutation stability. Several results and open problems will also be discussed.

Photos and videos are now a main mode of communication, used to tell stories, share experiences and convey ideas. However, common media editing tools are often either too complex to master, or oversimplified and limited.

In this talk I will present my strategy towards the creation of media editing techniques that are easy to learn, yet expressive enough to reflect unique creative objectives. We will mostly discuss one specific domain --- human heads --- which are both extremely common (i.e. people care about people) and technologically challenging. I will present several works on editing video by editing text, perspective manipulation, and in-camera lighting feedback. I will also discuss exciting future opportunities related to neural rendering and digital representations of humans.

BIO

Ohad Fried is a postdoctoral research scholar at Stanford University. His work lies in the intersection of computer graphics, computer vision, and human-computer interaction. He holds a PhD in computer science from Princeton University, and an M.Sc. in computer science and a B.Sc. in computational biology from The Hebrew University. Ohad's research focuses on tools, algorithms, and new paradigms for photo and video editing. Ohad is the recipient of several awards, including a Siebel Scholarship and a Google PhD Fellowship. If you own a cable modem, there's a non-negligible chance that Ohad's code runs within it, so feel free to blame him for your slow internet connection.

www.ohadf.com

It is becoming increasingly clear that the implicit regularization afforded by optimization algorithms plays a central role in machine learning, especially when using large, deep neural networks. We have a good understanding of the implicit regularization afforded by stochastic approximation algorithms, such as SGD, for convex problems; we understand and can characterize the implicit bias of different algorithms, and can design algorithms with specific biases. In this talk I will focus on the implicit biases of local search algorithms for non-convex underdetermined problems, such as deep networks. In an effort to uncover the implicit biases of gradient-based optimization of neural networks, which hold the key to their empirical success, I will discuss recent work on implicit regularization for matrix factorization, linear convolutional networks, and two-layer ReLU networks, as well as a general bottom-up understanding of implicit regularization in terms of optimization geometry.

The broad range of demands which next generation communications systems are required to meet, spanning from increased rates to efficient power consumption, cannot be satisfied by conventional architectures. These requirements give rise to a multitude of new challenges in modern communications and signal processing systems. In this talk we focus on two fundamental aspects of digital communications receivers, signal acquisition and symbol detection, discussing methods to improve their performance and reliability. For signal acquisition, we show how analog-to-digital converters can be designed in light of the overall system task, achieving performance approaching the optimum while utilizing efficient bit-constrained samplers. Then, we discuss how recent advances in machine learning can be exploited for symbol detection, a task which is typically addressed using channel-model-based methods, such as the Viterbi algorithm and interference cancellation methods. We present a new framework for combining artificial intelligence and model-based algorithms, allowing the latter to be implemented in a data-driven manner. The resulting architectures are capable of approaching the established performance of model-based algorithms, while being independent of the underlying statistical model, learning it from a small amount of training samples.
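As a concrete example of the model-based baseline mentioned above, here is a minimal Viterbi sequence detector for a two-tap intersymbol-interference channel y_t = h0*x_t + h1*x_{t-1} + noise. This is a textbook sketch with known channel taps, not the speaker's implementation; the data-driven frameworks discussed in the talk aim to reproduce such detectors without knowing the taps.

```python
def viterbi_detect(received, symbols, taps):
    """ML sequence detection for y_t = h0*x_t + h1*x_{t-1} + noise.
    State = previous transmitted symbol; x_{-1} is fixed to symbols[0].
    Branch metric = squared error against the noiseless channel output."""
    h0, h1 = taps
    costs = {s: (received[0] - h0 * s - h1 * symbols[0]) ** 2 for s in symbols}
    back = []                                   # backpointers per step
    for y in received[1:]:
        new, ptr = {}, {}
        for s in symbols:
            p = min(symbols, key=lambda q: costs[q] + (y - h0 * s - h1 * q) ** 2)
            new[s] = costs[p] + (y - h0 * s - h1 * p) ** 2
            ptr[s] = p
        costs = new
        back.append(ptr)
    # Trace back from the cheapest terminal state.
    s = min(costs, key=costs.get)
    seq = [s]
    for ptr in reversed(back):
        s = ptr[s]
        seq.append(s)
    return seq[::-1]
```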

The moduli space of quadratic differentials on punctured Riemann surfaces admits an integral piecewise linear structure. It gives rise to a measure of finite total mass, called the Masur-Veech volume.

These volumes are also related to the asymptotic growth of the number of multicurves on hyperbolic surfaces, once averaged against the Weil-Petersson metric. Adopting this point of view, we show that the Masur-Veech volumes can be obtained as the constant term in a family of polynomials computed by the topological recursion, and as integration of a class on the moduli space of curves. Based on work of Moeller, we can prove the same statement but for a different family of polynomials (and different initial data for the topological recursion). This is based on joint works with Jorgen Andersen, Severin Charbonnier, Vincent Delecroix, Alessandro Giacchetto, Danilo Lewanski and Campbell Wheeler.

To describe different levels of stochastic behavior, Ergodic Theory introduces a hierarchy of ergodic properties. Among them, ergodicity, mixing and the Bernoulli property are the ones most used to study models in science with complicated "turbulent-like" behavior. However, "typical" systems in science are modeled by smooth (differentiable) dynamical systems acting on smooth phase spaces, and the classical "smooth realization problem" asks whether any smooth phase space admits a dynamical system with a prescribed collection of ergodic (stochastic) properties; in other words, whether the topology of the phase space may impose obstructions on a system to exhibit certain stochastic behavior. In this connection I will also discuss two more important characteristics of stochastic behavior, the rate of decay of correlations (rate of mixing) and the Central Limit Theorem, and whether they too can be realized on any smooth phase space.

We study the sample complexity of learning threshold functions under the constraint of differential privacy. Unlike the non-private case, where the sample complexity is independent of the domain size, it turns out that for private learning the sample complexity must depend on the size of the domain $X$. Our current understanding of this task places its sample complexity somewhere between $\log^*|X|$ and $2^{\log^*|X|}$, where at least three different algorithms are known with sample complexity exponential in $\log^*|X|$. In this work we reduce this gap significantly, and show that the sample complexity is at most polynomial in $\log^*|X|$.

Joint work with Haim Kaplan, Katrina Ligett, Yishay Mansour, and Moni Naor.

For X a symplectic manifold and L a Lagrangian submanifold, genus zero open Gromov-Witten (OGW) invariants count configurations of pseudoholomorphic disks in X with boundary conditions in L and various constraints at boundary and interior marked points. In a joint work with Jake Solomon from 2016, we define OGW invariants using bounding chains, a concept that comes from Floer theory. In a recent work, also joint with Solomon, we find that the generating function of OGW satisfies a system of PDE called open WDVV equation. This PDE translates to an associativity relation for a quantum product we define on the relative cohomology H^*(X,L). For the projective space, open WDVV gives rise to recursions that, together with other properties, allow the computation of all OGW invariants.

Automated agents and humans increasingly interact and collaborate: at home, at the workplace, on the road, and in many other everyday settings. In all these settings, effectively recognizing what users try to achieve, providing relevant assistance (or, depending on the application, taking relevant preventive measures), and supporting an effective collaboration in the system are essential tasks. All these tasks can be enhanced via efficient system redesign, and often even subtle changes can yield great benefits. However, since these systems are typically complex, hand crafting good design solutions is hard.

Utility Maximizing Design (UMD) addresses this challenge by automating the design process. It does so by offering informed search strategies to automatically and efficiently find optimal design solutions for maximizing a variety of targeted objectives. One example is Goal Recognition Design (GRD), which seeks a redesign to an environment that minimizes the maximal progress an agent can make before its goal is revealed. A second is Equi-Reward Utility Maximizing Design (ER-UMD), which seeks to maximize the performance of a planning agent in a stochastic environment. The third, Reinforcement Learning Design (RLD), parallels ER-UMD, but considers an environment with a reinforcement learning agent.

Among the different ways that may be available to change a setting, the talk will focus on information shaping, which consists of selectively revealing information to a partially informed agent in order to maximize its performance. As an example application, I will demonstrate how information shaping can be applied to enhance the performance of a robot that is navigating in an unfamiliar environment. The talk will conclude with a discussion of future extensions and various applications related to the agenda of Utility Maximizing Design, including human-robot collaboration, intrusion detection, and assisted cognition.

We consider the random Cayley graph of a finite group $G$ formed by picking $k$ generators uniformly at random:

- We prove universality of cutoff (for the random walk) and a concentration of measure phenomenon in the Abelian setup (namely, that all but $o(|G|)$ elements lie at distance $[R-o(R),R+o(R)]$ from the origin, where $R$ is the radius of the smallest ball in $Z^k$ containing at least $|G|$ lattice points), provided $k \gg 1$ is large in terms of the size of the smallest generating set of $G$. As conjectured by Aldous and Diaconis, the cutoff time is independent of the algebraic structure (it is given by the time at which the entropy of a random walk on $Z^k$ is $\log|G|$).
- We prove analogous results for the Heisenberg $H_{p,d}$ groups of $d \times d$ uni-upper triangular matrices with entries defined mod $p$, for $p$ prime and $d$ fixed or diverging slowly.
- Lastly, we resolve a conjecture of D. Wilson that if $G$ is a group of size at most $2^d$ then for all $k$ its mixing time in this model is as rapid as that of $Z_2^d$ and likewise, that the slowest mixing $p$-group of a given size is $Z_p^d$.

(Joint work with Sam Thomas.)


A matrix is called totally nonnegative (TN) if all its minors are nonnegative, and totally positive (TP) if all its minors are positive. Multiplying a vector by a TN matrix does not increase the number of sign variations in the vector. In a largely forgotten paper, Schwarz (1970) considered matrices whose exponentials are TN or TP. He also analyzed the evolution of the number of sign changes in the vector solutions of the corresponding linear system.

In a seemingly different line of research, Smillie (1984), Smith (1991), and others analyzed the stability of nonlinear tridiagonal cooperative systems by using the number of sign variations in the derivative vector as an integer-valued Lyapunov function.

We provide a tutorial on these fascinating research topics and show that they are intimately related. This allows us to derive generalizations of the results by Smillie (1984) and Smith (1991) while simplifying the proofs. This also opens the door to many new and interesting research directions.

Joint work with Eduardo D. Sontag, Northeastern University.
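The variation-diminishing property mentioned above is easy to check numerically. A minimal sketch, in which the symmetric Pascal matrix stands in as a standard example of a totally positive matrix (the test vectors are illustrative):

```python
import random

def sign_changes(v):
    """Number of sign variations in v, ignoring zero entries."""
    signs = [1 if x > 0 else -1 for x in v if x != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# The symmetric Pascal matrix is a classical totally positive matrix
# (all of its minors are positive).
P = [[1, 1, 1],
     [1, 2, 3],
     [1, 3, 6]]

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(3)]
    # Multiplying by a TP matrix never increases the sign variation.
    assert sign_changes(matvec(P, x)) <= sign_changes(x)
```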

We consider (finitely supported) transportation problems on a metric space M. They form a vector space TP(M). The optimal transportation cost for such transportation problems is a norm on this space. This normed space is of interest for the theory of metric embeddings because the space M embeds into it isometrically. I am going to talk about the geometry of such normed spaces. The most important questions for this talk are the relations of these spaces with $L_1$ and $L_\infty$ spaces.

Meromorphic differentials define flat metrics with singularities on Riemann surfaces. Directions of geodesic segments between singularities form a closed subset of the circle. The Cantor-Bendixson rank of their set of directions is a descriptive set-theoretic measure of complexity. Drawing on previous work of David Aulicino, we prove a sharp upper bound that depends only on the genus of the underlying topological surface.

We will be discussing a Fourier-analytic approach to optimal matching between independent samples, with an elementary proof of the Ajtai-Komlos-Tusnady theorem.

The talk is based on a joint work with Michel Ledoux.

The goal of my talk (based on joint work with Vladimir Retakh) is to introduce noncommutative analogs of Catalan numbers c_n which belong to the free Laurent polynomial algebra L_n in n generators. Our noncommutative Catalan numbers C_n admit interesting (commutative and noncommutative) specializations, one of them related to Garsia-Haiman (q,t)-versions, another -- to solving noncommutative quadratic equations. We also establish total positivity of the corresponding (noncommutative) Hankel matrices H_n and introduce two kinds of noncommutative binomial coefficients which are instrumental in computing the inverse of H_n and its positive factorizations, and other combinatorial identities involving C_n.

If time permits, I will explain the relationship of the C_n with:

1. noncommutative Laurent Phenomenon, which was previously established for Kontsevich rank 2 recursions and all marked surfaces

2. noncommutative orthogonal polynomials, which can be viewed as noncommutative determinants of an extended matrix H_n.

Let $\mathfrak{g}=\mathfrak{sl}_n$ and $U_q(\widehat{\mathfrak{g}})$ the corresponding quantum affine algebra. Hernandez and Leclerc proved that there is an isomorphism $\Phi$ from the Grothendieck ring $\mathcal{R}_{\ell}$ of a certain subcategory $\mathcal{C}_{\ell}$ of finite-dimensional $U_q(\widehat{\mathfrak{g}})$-modules to a certain quotient $\mathbb{C}[{\rm Gr}(n, n+\ell+1, \sim)]$ of a Grassmannian cluster algebra. We proved that this isomorphism induces an isomorphism $\widetilde{\Phi}$ from the monoid of dominant monomials to the monoid of semi-standard Young tableaux. Using this result and the results of Qin and of Kashiwara, Kim, Oh, and Park, we have that every cluster monomial (resp. cluster variable) in a Grassmannian cluster algebra is of the form $ch(T)$ for some real (resp. prime real) rectangular semi-standard Young tableau $T$, where $ch(T)$ is a certain map obtained from a formula of Arakawa--Suzuki. We also translated Arakawa--Suzuki's formula to the setting of $q$-characters and applied it to study real modules, prime modules, and compatibility of cluster variables.

This is joint work with Wen Chang, Bing Duan, and Chris Fraser.

The quotient of a monoidal category by its largest tensor ideal - given by the so-called negligible morphisms - is often a semisimple category.

I will introduce a generalization of the notion of negligible morphism for some monoidal categories and discuss the associated tensor ideals in the setting of Deligne categories and tilting modules for quantum groups and algebraic groups. It turns out that they are related to other notions from representation theory like modified dimensions and the a-function.

Let g be a Lie algebra of type ADE. To a pair of weights of g (one dominant, the other arbitrary) we associate a group G and a representation N consisting of framed quiver representations of the Dynkin diagram of g. From (G,N) we can construct two varieties. The Higgs branch is the categorical quotient of N by G, which in this case is the Nakajima quiver variety and has been studied for over 25 years. The Coulomb branch has a much more complicated definition that was only recently discovered by Braverman, Finkelberg, and Nakajima. There is a duality between these spaces, which is sometimes referred to as 3d mirror symmetry or symplectic duality.

In this talk I'll try to explain the definition of the Coulomb branch, and why you might care. I will discuss its deformation quantization, which appears naturally from the construction. I'll also describe our recent result, which provides an equivalence between representations of the deformation quantization and modules over a seemingly very different algebra which is defined combinatorially and arises in categorical representation theory. This equivalence has several interesting consequences; e.g., it provides a classification of the irreducible Gelfand-Tsetlin modules of gl(n), which was previously known only up to n=3.

This talk is based on https://arxiv.org/abs/1806.07519

Learning of layered or "deep" representations has recently enabled low-cost sensors for autonomous vehicles and efficient automated analysis of visual semantics in online media. But these models have typically required prohibitive amounts of training data, and thus may only work well in the environment they have been trained in. I'll describe recent methods in adversarial adaptive learning that excel when learning across modalities and domains. Further, these models have been unsatisfying in their complexity--with millions of parameters--and their resulting opacity. I'll report approaches which achieve explainable deep learning models, including both introspective approaches that visualize compositional structure in a deep network, and third-person approaches that can provide a natural language justification for the classification decision of a deep model.

Representations of braid groups have appeared in many different areas such as topology, statistical mechanics, conformal field theory, braided tensor categories and others. In order to compare these, intrinsic characterizations of such representations are desirable. These have been known for some time for representations in connection with vector representations of classical Lie types, in terms of Hecke algebras and so-called BMW algebras. We review these and show how these results can be extended to include more cases related to exceptional Lie types. In particular, we obtain new classes of braid representations where the images of the generators satisfy a cubic equation. Time permitting, we discuss applications of these results such as Schur-Weyl type duality theorems and classification of braided tensor categories.

Dropout is a simple yet effective regularization technique that has been applied to various machine learning tasks, including linear classification, matrix factorization and deep learning. However, the theoretical properties of dropout as a regularizer remain quite elusive. This talk will present a theoretical analysis of dropout for single hidden-layer linear neural networks. We demonstrate that dropout is a stochastic gradient descent method for minimizing a certain regularized loss. We show that the regularizer induces solutions that are low-rank, in the sense of minimizing the number of neurons. We also show that the global optimum is balanced, in the sense that the products of the norms of incoming and outgoing weight vectors of all the hidden nodes are equal. Finally, we provide a complete characterization of the optimization landscape induced by dropout.
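A toy illustration of the "dropout loss = plain loss + regularizer" phenomenon, in the much simpler classical case of input dropout for linear regression rather than the hidden-layer networks of the talk (all numbers are illustrative); the identity can be verified exactly by enumerating every dropout mask:

```python
import itertools

def expected_dropout_loss(w, x, y, p):
    """Exact expectation, over all dropout masks with keep probability p
    and inverted scaling, of the squared loss (y - w . (m * x / p))^2."""
    n = len(w)
    total = 0.0
    for mask in itertools.product([0, 1], repeat=n):
        prob = 1.0
        for m in mask:
            prob *= p if m else (1 - p)
        pred = sum(wi * xi * mi / p for wi, xi, mi in zip(w, x, mask))
        total += prob * (y - pred) ** 2
    return total

w, x, y, p = [0.5, -1.0, 2.0], [1.0, 3.0, -0.5], 1.5, 0.8
plain = (y - sum(wi * xi for wi, xi in zip(w, x))) ** 2
# Dropout adds a data-dependent ridge-like penalty:
reg = (1 - p) / p * sum((wi * xi) ** 2 for wi, xi in zip(w, x))
assert abs(expected_dropout_loss(w, x, y, p) - (plain + reg)) < 1e-9
```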

Let G be a transitive nonamenable graph, and consider supercritical Bernoulli bond percolation on G. We prove that the probability that the origin lies in a finite cluster of size n decays exponentially in n. We deduce that:

- Every infinite cluster has anchored expansion almost surely. This answers positively a question of Benjamini, Lyons, and Schramm (1997).
- Various observables, including the percolation probability and the truncated susceptibility, are analytic functions of p throughout the entire supercritical phase.

Joint work with Tom Hutchcroft.

There are many formulas that express interesting properties of a finite group G in terms of sums over its characters. For evaluating or estimating these sums, one of the most salient quantities to understand is the *character ratio*:

$$
\mathrm{trace}(\rho(g)) / \dim(\rho),
$$

for an irreducible representation $\rho$ of G and an element g of G. For example, Diaconis and Shahshahani used a formula of this type to analyze certain random walks on G.

Recently, we discovered that for classical groups G over finite fields there is a natural invariant of representations that provides strong information on the character ratio. We call this invariant rank. This talk will discuss the notion of rank for GLn over finite fields, and explain how one can apply the results to obtain the mixing time and rate for certain random walks.

The talk will assume basic notions of linear algebra in Hilbert spaces, and the definition of a group.

This is joint work with Roger Howe (Yale and Texas A&M).

We present a Monte Carlo rendering framework for the physically-accurate simulation of speckle patterns arising from volumetric scattering of coherent waves. These noise-like patterns are characterized by strong statistical properties, such as the so-called memory effect. These properties are at the core of imaging techniques for applications as diverse as tissue imaging, motion tracking, and non-line-of-sight imaging. Our rendering framework can replicate these properties computationally, in a way that is orders of magnitude more efficient than alternatives based on directly solving the wave equations. At the core of our framework is a path-space formulation for the covariance of speckle patterns arising from a scattering volume, which we derive from first principles. We use this formulation to develop two Monte Carlo rendering algorithms, for computing speckle covariance as well as speckle fields directly. While approaches based on wave equation solvers require knowing the microscopic position of wavelength-sized scatterers, our approach takes as input only bulk parameters describing the statistical distribution of these scatterers inside a volume. We validate the accuracy of our framework by comparing against speckle patterns simulated using wave equation solvers, use it to simulate memory effect observations that were previously only possible through lab measurements, and demonstrate its applicability for computational imaging tasks.

Joint work with Chen Bar, Marina Alterman, Ioannis Gkioulekas

Root-reductive Lie algebras form a special type of reasonably well-behaved infinite-dimensional Lie algebras. In this talk, we shall define a version of the Bernstein-Gelfand-Gelfand category O for root-reductive Lie algebras, which we call extended categories O, and briefly discuss some properties of these categories. Let g be a root-reductive Lie algebra containing a splitting Borel subalgebra b satisfying a special additional condition called the Dynkin condition. The extended category O corresponding to g and b is denoted by O-bar.

The category O-bar can be decomposed analogously to the finite-dimensional cases into blocks. The main object of this talk is to give a construction of translation functors of O-bar. Then we shall see that some objects such as tilting modules arise by applying the translation functors to Verma modules just as in the finite-dimensional cases. Furthermore, the translation functors establish equivalences between some blocks of the category O-bar.

The model of interactive proofs, introduced nearly two and a half decades ago, is now widely used to design computation-outsourcing protocols. In an interactive proof, an honest party interacts with powerful but strategic provers to elicit from them the correct answer to a computational question. Classical interactive proofs assume that the provers are adversarial (i.e., they want to mislead the verifier) and cooperative (they work together as a team).

In this talk, I will present my work on a new payment-based interactive-proof system, called rational proofs. In rational proofs, the provers are not adversarial but rational, that is, they want to maximize the payment received from the verifier. Using principles from mechanism design, I will show how these payments can be used to leverage correctness from multiple provers who are either cooperative or non-cooperative in nature. I will also present how the guarantees of rational proofs are related to the soundness and completeness guarantees of classical interactive proofs.

Bio: Shikha Singh is currently an Assistant Professor of Computer Science at Wellesley College and will be joining Williams College as an Assistant Professor in Fall 2019. She obtained her PhD in Computer Science from Stony Brook University and her Integrated MSc. in Mathematics and Computing from IIT Kharagpur. Her broad research interests include algorithmic game theory, algorithms and data structures for big data, and complexity theory.

We construct and study a stationary version of the Hastings-Levitov(0) model. We prove that unlike the classical model, in the stationary case particle sizes are constant in expectation, so this model can be seen as a tractable off-lattice Diffusion Limited Aggregation (DLA). The stationary setting, together with a geometric interpretation of the harmonic measure, yields new geometric results such as stabilization, finiteness of arms, and unbounded width in mean of arms. Moreover, we present an exact conjecture for the fractal dimension.


The recurring context in which objects appear holds valuable information that can be employed to predict their existence. This intuitive observation indeed led many researchers to endow appearance-based detectors with explicit reasoning about context. The underlying thesis suggests that stronger contextual relations would facilitate greater improvements in detection capacity. In practice, however, the observed improvement in many cases is modest at best, and often only marginal. In this work we seek to improve our understanding of this phenomenon, in part by pursuing an opposite approach. Instead of attempting to improve detection scores by employing context, we treat the utility of context as an optimization problem: to what extent can detection scores be improved by considering context or any other kind of additional information? With this approach we explore the bounds on improvement by using contextual relations between objects and provide a tool for identifying the most helpful ones. We show that simple co-occurrence relations can often provide large gains, while in other cases a significant improvement is simply impossible or impractical with either co-occurrence or more precise spatial relations. To better understand these results we then analyze the ability of context to handle different types of false detections, revealing that tested contextual information cannot ameliorate localization errors, severely limiting its gains. These and additional insights further our understanding of where and why utilization of context for object detection succeeds and fails.

__Speaker 1: Mark Rudelson (UMich)__

Title: Circular law for sparse random matrices.

Abstract: Consider a sequence of $n$ by $n$ random matrices $A_n$ whose entries are independent identically distributed random variables. The circular law asserts that the distribution of the eigenvalues of properly normalized matrices $A_n$ converges to the uniform measure on the unit disc as $n$ tends to infinity. We prove this law for sparse random matrices under the optimal sparsity assumption. Joint work with Konstantin Tikhomirov.

__Speaker 2: Serguei Popov (IMECC)__

Title: On the range of a two-dimensional conditioned random walk

Abstract: We consider the two-dimensional simple random walk conditioned on never hitting the origin. This process is a Markov chain, namely, it is the Doob $h$-transform of the simple random walk with respect to the potential kernel. It is known to be transient and we show that it is "almost recurrent" in the sense that each infinite set is visited infinitely often, almost surely. We prove that, for a "typical large set", the proportion of its sites visited by the conditioned walk is approximately a Uniform$[0,1]$ random variable. Also, given a set $G\subset\mathbb{R}^2$ that does not "surround" the origin, we prove that a.s. there is an infinite number of $k$'s such that $kG\cap \mathbb{Z}^2$ is unvisited. These results suggest that the range of the conditioned walk has "fractal" behavior. This is joint work with Nina Gantert and Marina Vachkovskaia, see arxiv.org/abs/1804.00291. Also, there is much more about conditioned walks in my new book (www.ime.unicamp.br/~popov/2srw.pdf, work in progress). Comments and suggestions on the latter are very welcome!

We examine several empirical and theoretical results on the training of deep networks. For example: Why are common "over-fitting" indicators (e.g., very low training error, high validation loss) misleading? Why does the early-stopping time sometimes never arrive? Why can adaptive-rate methods (e.g., Adam) degrade generalization? Why do some commonly used loss functions exhibit better generalization than others? Why use weight decay before batch norm? When can we use low numerical precision, and how low can we get? We will also discuss the practical implications of these results.

Bio: Since October 2017, Daniel Soudry has been an assistant professor (Taub Fellow) in the Department of Electrical Engineering at the Technion, working in the areas of machine learning and theoretical neuroscience. Before that, he did his post-doc (as a Gruss Lipper fellow) with Prof. Liam Paninski in the Department of Statistics, the Center for Theoretical Neuroscience, and the Grossman Center for Statistics of the Mind at Columbia University. He did his Ph.D. in the Department of Electrical Engineering at the Technion, Israel Institute of Technology, under the guidance of Prof. Ron Meir. He received his B.Sc. degree in Electrical Engineering and Physics from the Technion.

Periods of automorphic forms have an important place in the theory of automorphic representations. They are often related to (special values of) L-functions and have applications to arithmetic geometry and analytic number theory. For an automorphic form on a group G, a period is its integral over a subgroup of G. If the automorphic form is not cuspidal such integrals are usually divergent. It is nonetheless possible in many cases to extend the definition of the period to almost all automorphic forms which has direct applications to the study of the given period. In this talk I will describe a general procedure of defining such periods in the case when the subgroup is reductive.

I will also discuss the joint work with A. Pollack and C. Wan that applies this to the study of certain periods and their relations to special values of L-functions confirming predictions of Sakellaridis and Venkatesh.

Existing proofs that deduce BPP=P from circuit lower bounds convert randomized algorithms to deterministic algorithms with a large polynomial slowdown. We convert randomized algorithms to deterministic algorithms with nearly minimal slowdown. Specifically, assuming exponential lower bounds against non-deterministic circuits we convert randomized algorithms that err rarely to deterministic algorithms with a similar run-time, and general randomized algorithms to deterministic algorithms whose run-time is slower by only a nearly linear factor.

Our results follow from a new connection between pseudo-entropy generators and locally list recoverable codes.

This is joint work with Dean Doron, Justin Oh and David Zuckerman

A longstanding problem in machine learning is to find unsupervised methods that can learn the statistical structure of high dimensional signals. In recent years, GANs have gained much attention as a possible solution to the problem, and in particular have shown the ability to generate remarkably realistic high resolution sampled images. At the same time, many authors have pointed out that GANs may fail to model the full distribution ("mode collapse") and that using the learned models for anything other than generating samples may be very difficult. In this paper, we examine the utility of GANs in learning statistical models of images by comparing them to perhaps the simplest statistical model, the Gaussian Mixture Model. First, we present a simple method to evaluate generative models based on relative proportions of samples that fall into predetermined bins. Unlike previous automatic methods for evaluating models, our method does not rely on an additional neural network nor does it require approximating intractable computations. Second, we compare the performance of GANs to GMMs trained on the same datasets. While GMMs have previously been shown to be successful in modeling small patches of images, we show how to train them on full sized images despite the high dimensionality. Our results show that GMMs can generate realistic samples (although less sharp than those of GANs) but also capture the full distribution, which GANs fail to do. Furthermore, GMMs allow efficient inference and explicit representation of the underlying statistical structure. Finally, we discuss how GMMs can be used to generate sharp images.
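The bin-proportion idea can be sketched in one dimension. A minimal illustration, hedged heavily: the bins, distributions, and threshold below are all made up for the example (the actual method bins high-dimensional images, e.g. via clustering), but the mechanics are the same: compare per-bin sample proportions of real versus generated data, and count bins where they disagree.

```python
import random, bisect

def bin_proportions(samples, edges):
    """Fraction of samples falling in each bin defined by sorted edges."""
    counts = [0] * (len(edges) + 1)
    for s in samples:
        counts[bisect.bisect_right(edges, s)] += 1
    return [c / len(samples) for c in counts]

random.seed(1)
edges = [-1.0, 0.0, 1.0]
real = [random.gauss(0, 1) for _ in range(20000)]
# A "mode-collapsed" model that only ever outputs positive values:
fake = [abs(random.gauss(0, 1)) for _ in range(20000)]

pr, pf = bin_proportions(real, edges), bin_proportions(fake, edges)
# Count bins whose proportions differ by more than a crude threshold;
# mode collapse shows up as several mismatched bins.
ndb = sum(1 for a, b in zip(pr, pf) if abs(a - b) > 0.05)
print(ndb)
```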

We consider an algebraic variety $V$ and its foliation, both defined over a number field. Given a (compact piece of a) leaf $L$ of the foliation, and a subvariety $W$ of complementary codimension, we give an upper bound for the number of intersections between $L$ and $W$. The bound depends polynomially on the degree of $W$, the logarithmic height of $W$, and the logarithmic distance between $L$ and the locus of points where leaves of the foliation intersect $W$ improperly.

Using this theory we prove the Wilkie conjecture for sets defined using leaves of foliations, under a certain assumption on the algebraicity locus. For example, we prove that if none of the leaves contains an algebraic curve, then the number of algebraic points of degree $d$ and log-height $h$ on a (compact piece of a) leaf grows polynomially with $d$ and $h$. This statement and its generalizations have many applications in diophantine geometry following the Pila-Zannier strategy.

I will focus mostly on the proof of the main statement, which uses a combination of differential-algebraic methods related to foliations with some ideas from complex geometry and value distribution theory. If time permits I will briefly discuss the applications to counting algebraic points and diophantine geometry at the end.

20 questions is one of the simplest examples of a combinatorial search game:

Alice thinks of an English word, and Bob's job is to figure it out by asking Yes/No questions. The worst-case complexity of the game is clearly log n, so to spice things up, we assume that Alice chooses her input according to some distribution known to Bob, and Bob now aims to minimize the expected number of questions.

An optimal solution to this problem was found already in 1952. However, the optimal strategy could ask arbitrarily complex Yes/No questions. We ask what happens when we constrain Bob to asking only "simple" questions, and what happens if Alice is allowed to lie a few times.

Joint work with Yuval Dagan (MIT), Ariel Gabizon, Daniel Kane (UCSD) and Shay Moran (IAS).
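The 1952 solution alluded to above is Huffman's code: merging the two least likely items repeatedly yields a question strategy minimizing the expected number of Yes/No questions. A minimal sketch (the example distributions are illustrative):

```python
import heapq

def huffman_expected_questions(probs):
    """Expected codeword length (= expected number of Yes/No questions)
    of an optimal Huffman strategy for the given distribution."""
    heap = [(p, i) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        p1, _ = heapq.heappop(heap)
        p2, i = heapq.heappop(heap)
        # Each merge adds one extra question for every item below it,
        # so the expected length is the sum of merged probabilities.
        total += p1 + p2
        heapq.heappush(heap, (p1 + p2, i))
    return total

# Uniform over 4 words: exactly 2 questions on average.
print(huffman_expected_questions([0.25] * 4))                  # 2.0
# A skewed distribution needs fewer questions on average.
print(huffman_expected_questions([0.5, 0.25, 0.125, 0.125]))   # 1.75
```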

We consider statistical questions concerning colored sequences of partitions, produced by applying a partition process which was first introduced by Kakutani for the 1-dimensional case. This process can be generalized as the application of a fixed multiscale substitution rule, defined on a finite set of colored sets in R^d, to elements of maximal measure in each partition. Colored sets appearing in the sequence are modeled by certain flows on an associated directed weighted graph, and natural statistical questions can be reformulated as questions on the distribution of paths on graphs. Under some incommensurability assumptions, we show that special properties of Laplace transforms of the relevant counting functions imply explicit statistical results.
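The original 1-dimensional Kakutani process that these multiscale substitution rules generalize is easy to simulate: repeatedly split the longest interval of the current partition of [0,1] in a fixed ratio. A minimal sketch (the choice alpha = 1/3 is illustrative):

```python
import heapq

def kakutani_partition(alpha, steps):
    """Kakutani's splitting procedure on [0,1]: at each step, the longest
    interval of the partition is split in the ratio alpha : (1 - alpha).
    Returns the partition as a sorted list of (left endpoint, length)."""
    heap = [(-1.0, 0.0)]  # max-heap of (-length, left endpoint)
    for _ in range(steps):
        neg_len, left = heapq.heappop(heap)
        length = -neg_len
        heapq.heappush(heap, (-length * alpha, left))
        heapq.heappush(heap, (-length * (1 - alpha), left + length * alpha))
    return sorted((left, -neg_len) for neg_len, left in heap)

parts = kakutani_partition(1 / 3, 100)
assert len(parts) == 101                                    # steps + 1 intervals
assert abs(sum(length for _, length in parts) - 1.0) < 1e-9  # still a partition
```

Kakutani showed that the resulting division points equidistribute in [0,1]; the talk's statistical questions concern far-reaching higher-dimensional colored generalizations.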

A searchable symmetric encryption (SSE) scheme enables a client to store data on an untrusted server while supporting keyword searches in a secure manner. Recent experiments have indicated that the practical relevance of such schemes heavily relies on the tradeoff between their space overhead, locality (the number of non-contiguous memory locations that the server accesses with each query), and read efficiency (the ratio between the number of bits the server reads with each query and the actual size of the answer). In this talk, we survey recent constructions and matching lower bounds, and discuss their underlying techniques.

Based on joint works with Gilad Asharov, Moni Naor, and Gil Segev.

We consider the geo-localization task - finding the pose (position & orientation) of a camera in a large 3D scene from a single image. We aim to experimentally explore the role of geometry in geo-localization in a convolutional neural network (CNN) solution. We do so by ignoring the often-available texture of the scene. We therefore deliberately avoid using texture or rich geometric details and use images projected from a simple 3D model of a city, which we term lean images. Lean images contain solely information that relates to the geometry of the area viewed (edges, faces, or relative depth). We find that the network is capable of estimating the camera pose from the lean images, and it does so not by memorization but by some measure of geometric learning of the geographical area. The main contributions of this work are: (i) providing insight into the role of geometry in the CNN learning process; and (ii) demonstrating the power of CNNs for recovering camera pose using lean images.

This is a joint work with Moti Kadosh & Ariel Shamir

This will be a very introductory talk about the Virasoro Lie algebra and its super-analogues: the Ramond and Neveu-Schwarz Lie superalgebras.

In this talk I will describe some recent work on distributed property testing in networks with bounded bandwidth (the CONGEST model): we have a network of computing nodes communicating over some initially-unknown network graph, where every communication link can carry a bounded number of bits per round. Some simple-looking problems, such as checking if the network contains a 4-cycle, are known to be very hard in this model, and this motivates us to consider property testing instead of exact solutions.

I will describe distributed property testing algorithms for two problems: subgraph-freeness, where we wish to determine whether the network graph contains some fixed constant-sized subgraph H; and uniformity testing, where every node of the network draws samples from an unknown distribution, and our goal is to determine whether the distribution is uniform or far from uniform. I will also discuss lower bounds.

Understanding statistical properties of zeros of Laplacian eigenfunctions is a program which is attracting much attention from mathematicians and physicists. We will discuss this program in the setting of "quantum graphs", self-adjoint differential operators acting on functions living on a metric graph. Numerical studies of quantum graphs motivated a conjecture that the distribution of nodal surplus (a suitably rescaled number of zeros of the n-th eigenfunction) has a universal form: it approaches Gaussian as the number of cycles grows. The first step towards proving this conjecture is a result established for graphs which are composed of cycles separated by bridges. For such graphs we use the nodal-magnetic theorem of the speaker, Colin de Verdiere and Weyand to prove that the distribution of the nodal surplus is binomial with parameters p=1/2 and n equal to the number of cycles. Based on joint work with Lior Alon and Ram Band.

Universal learning is considered from an information-theoretic point of view, following the universal prediction approach pursued in the 90's by Feder and Merhav. Interestingly, the extension to learning is not straightforward.

We prove a new connection between gentle measurement (where one wants to measure n quantum states, in a way that damages the states only by a little) and differential privacy (where one wants to query a database about n users, in a way that reveals only a little about any individual user). The connection is bidirectional, though with loss of parameters in going from DP to gentle measurement. Exploiting this connection, we present a new algorithm for approximating the outcomes of many measurements on a collection of quantum states, a task called "shadow tomography". The new algorithm has the advantages of being gentle and online (the measurements can be chosen adaptively).

Joint work with Scott Aaronson.

No prior knowledge about quantum mechanics or computing will be assumed.

This work studies the relationship between the classification performed by deep neural networks (DNNs) and the decision of various classic classifiers, namely k-nearest neighbors (k-NN), support vector machines (SVM), and logistic regression (LR). This is studied at various layers of the network, providing us with new insights on the ability of DNNs to both memorize the training data and generalize to new data at the same time, where k-NN serves as the ideal estimator that perfectly memorizes the data.

First, we show that DNNs' generalization improves gradually along their layers and that memorization of non-generalizing networks happens only at the last layers.

We also observe that the behavior of DNNs compared to the linear classifiers SVM and LR is quite the same on the training and test data, regardless of whether the network generalizes. On the other hand, the similarity to k-NN holds only in the absence of overfitting. This suggests that the k-NN behavior of the network on new data is a good sign of generalization. Moreover, this allows us to use existing k-NN theory for DNNs.

Complex classifiers may exhibit "embarrassing" failures even in "easy" cases where humans can provide a simple justified explanation. Avoiding such failures is obviously of key importance. In this work, we focus on one such setting, where a label is perfectly predictable if the input contains certain features, or rules, and otherwise it is predictable by a linear classifier. We define a hypothesis class that captures this notion and determine its sample complexity. We also give evidence that efficient algorithms cannot achieve this sample complexity. We then derive a simple and efficient algorithm and show that its sample complexity is close to optimal, among efficient algorithms. Experiments on synthetic and sentiment analysis data demonstrate the efficacy of the method, both in terms of accuracy and interpretability.

A graph is automatically also a metric space, but is there anything interesting to say about such metric spaces? Many fascinating and concrete questions are encapsulated in the more general (and vague) question "to what extent can a finite graph emulate the properties of an infinite regular tree?". We will see how this leads us to investigate expansion in graphs and questions about the large-scale aspects of graph metrics, including girth and diameter, and the relations between the two. If time permits (probably not) I may also say a little about the local geometry of graphs.

This talk is based on many collaborations which I have had over the years; among my more recent collaborators are Michael Chapman, Yuval Peled and Yonatan Bilu. The talk requires no particular previous background and should be accessible to a general mathematical audience.

In 1981 P. Arnoux and J.-C. Yoccoz constructed their famous foliation on a surface of genus three with zero flux. Later it was shown that this example can be generalized, and in particular that there is an interesting fractal set of parameters that give rise to foliations with similar properties. This fractal was named in honour of Gerard Rauzy.

In my talk I will briefly discuss how such a family of foliations appeared in different branches of mathematics (symbolic dynamics, Teichmuller dynamics, low-dimensional topology, geometric group theory) and even in theoretical physics (conductivity theory in monocrystals), and explain what we know about this family from the ergodic point of view.

The talk is based on joint work with Pascal Hubert and Artur Avila and on a work in progress with Ivan Dynnikov and Pascal Hubert.

I'll revisit decompositions of de Rham complexes in positive characteristic (Deligne-Illusie), by discussing relations between cotangent complexes, liftings mod $p^2$, and de Rham-Witt and derived de Rham complexes. Such relations have been recently observed independently by Bhargav Bhatt.

A linear threshold function (LTF) is a Boolean function f:{-1,1}^n -> {0,1} of the form f(x) = 1_{ \sum a_i x_i > t}, for some fixed coefficients a_i and some threshold t. LTFs play an important role in complexity theory, machine learning, and other fields.
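For concreteness, here is a small Python sketch (illustrative only; the function names are ours) of an LTF and a Monte Carlo estimate of its bias E[f]:

```python
import numpy as np

def ltf(x, a, t):
    """LTF f(x) = 1 iff sum_i a_i * x_i > t, for x in {-1,1}^n."""
    return int(np.dot(a, x) > t)

def estimate_bias(a, t, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E[f] under the uniform distribution on {-1,1}^n."""
    rng = np.random.default_rng(seed)
    x = rng.choice([-1, 1], size=(n_samples, len(a)))
    return float(np.mean(x @ a > t))

# Majority on 3 bits: a = (1,1,1), t = 0; its bias is 1/2 by symmetry.
a = np.ones(3)
print(ltf(np.array([1, 1, -1]), a, 0.0))  # sum = 1 > 0, so f = 1
print(estimate_bias(a, 0.0))              # close to 0.5
```

A "biased" LTF in the sense of the talk is one whose expectation is far from 1/2, e.g. a large threshold t.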

In this talk we present a new approach that allows us to obtain sharp results on Fourier-theoretic properties of biased LTFs. In particular, we determine the exact asymptotic order of the total influence and of the degree-1 Fourier weight of any biased LTF, in terms of its maximal (normalized) coefficient and its expectation. This provides a sharp generalization of theorems proved by Matulef, O'Donnell, Rubinfeld, and Servedio (in the context of property testing), and settles a conjecture posed by Kalai et al.

Our main tools are 'local' forms of the classical Chernoff inequality, like the following one proved by Devroye and Lugosi (2008): Let {x_i} be independent random variables uniformly distributed in {-1, 1}, and let a_i be nonnegative numbers such that \sum a_i^2 =1. If for some t > 0, we have Pr [\sum a_i x_i > t] = b, then Pr[\sum a_i x_i > t + delta] < b/2 holds for delta < c/ \sqrt {log(1/b)}, where c is a universal constant. Such inequalities seem to be little-known and probably can be useful in other contexts as well.

Joint work with Ohad Klein.

We will consider two different types of hypergeometric decompositions of special data associated to five pencils of quartic surfaces. We see that over the complex numbers, the middle cohomology of these five pencils yields hypergeometric Picard-Fuchs equations. Using these parameters, we then consider the same pencils over finite fields, decomposing their rational point counts using the finite-field hypergeometric functions with the same parameters as above. This is joint work with Charles Doran, Adriana Salerno, Steven Sperber, John Voight, and Ursula Whitcher.

MPC-in-the-head is a novel paradigm introduced in the work of Ishai et al. [IshaiKOS09] that, roughly speaking, allows the design of a zero-knowledge proof system for any NP-relation by relying on any multiparty computation (MPC) protocol in a modular way. On a high level, in this transformation, the prover emulates ``in-its-head'' an execution of an MPC protocol that securely evaluates the NP-relation on the witness and commits to the views of the parties induced by this run. The verifier then tests the veracity of the computation by challenging the prover to open (decommit) the views of a subset of the parties. The key insight in the compilation is that the soundness and zero-knowledge properties directly reduce to the robustness and simulation guarantees of the underlying MPC protocol.

In order to prove concentration estimates for (products of) measures with heavier tails than the standard Gaussian measure one can use several variants of the classical log-Sobolev inequality, e.g., Beckner-type inequalities of Latala and Oleszkiewicz or modified log-Sobolev inequalities of Gentil, Guillin, and Miclo. The main result I plan to present asserts that a probability measure on R^d which satisfies the former inequality satisfies also the latter. Based on joint work with Franck Barthe.

When one considers the functor of parabolic induction in various contexts, there immediately arises the question of the existence of left or right adjoints. For example, in the p-adic setting there is a natural left adjoint, but it was shown by Bernstein that in fact there is also a right adjoint, and they turn out to be isomorphic - this phenomenon is called "second adjointness". We explain how second adjointness is directly related to a natural braiding on a categorified version of the Hall algebra, describe the interplay between the two settings, and lay out a strategy for how this connection can help understand both sides better.

Branching Brownian motion is a classical process in probability theory belonging to the class of "log-correlated random fields". It is well known, due to Bramson, that the order of the maximum has a different logarithmic correction than in the corresponding independent setting.

In this talk we look at a version of branching Brownian motion where we slightly vary the diffusion parameter in a way that, when looking at the order of the maximum, we can smoothly interpolate between the logarithmic correction for independent random variables ($\frac{1}{2\sqrt 2}\ln(t)$) and the logarithmic correction of BBM ($\frac{3}{2\sqrt 2}\ln(t)$) and the logarithmic correction of 2-speed BBM with increasing variances ($\frac{6}{2\sqrt 2}\ln(t)$). We also establish in all cases the asymptotic law of the maximum and characterise the extremal process, which turns out to coincide essentially with that of standard BBM. We will see that the key to the above results is a precise understanding of the entropic repulsion experienced by an extremal particle. (joint work with A. Bovier)

Deep learning has been among the most rapidly emerging fields in computer science and engineering. In recent years it has been shown that deep networks are vulnerable to attacks by adversarial examples. I will introduce a novel flexible approach named Houdini for generating adversarial examples for complex and structured tasks and demonstrate successful attacks on different applications such as speech recognition, pose estimation, semantic image segmentation, speaker verification, and malware detection. Then I will discuss how this weakness can be turned into two secure applications. The first is a new technique for watermarking deep network models in a black-box way, that is, concealing information within the model that can be used by the owner of the model to claim ownership. The second application is a novel method for Speech Steganography, namely hiding a secret spoken message within an ordinary public spoken message. I will conclude the talk with a brief discussion of our attempts to detect such adversarial attacks, based on multiple semantic label representations.

The derandomization of MA, the probabilistic version of NP, is a long-standing open question. In this talk, we connect this problem to a variant of another major problem: the quantum PCP conjecture. Our connection goes through the surprising quantum characterization of MA by Bravyi and Terhal. They proved the MA-completeness of the problem of deciding whether the ground energy of a uniform stoquastic local Hamiltonian is zero or inverse polynomial. We show that the gapped version of this problem, i.e. deciding if a given uniform stoquastic local Hamiltonian is frustration-free or has energy at least some constant, is in NP. Thus, if there exists a gap-amplification procedure for uniform stoquastic Local Hamiltonians (in analogy to the gap amplification procedure for constraint satisfaction problems in the original PCP theorem), then MA = NP (and vice versa). Furthermore, if this gap amplification procedure exhibits some additional (natural) properties, then P = RP. This is a joint work with Dorit Aharonov.

Inverse problems appear in many applications, such as image deblurring, inpainting and super-resolution. The common approach to address them is to design a specific algorithm (or recently - a deep neural network) for each problem. The Plug-and-Play (P&P) framework, which has been recently introduced, allows solving general inverse problems by leveraging the impressive capabilities of existing denoising algorithms. While this fresh strategy has found many applications, a burdensome parameter tuning is often required in order to obtain high-quality results. In this work, we propose an alternative method for solving inverse problems using off-the-shelf denoisers, which requires less parameter tuning (and can also be translated into fewer pre-trained denoising neural networks). First, we transform a typical cost function, composed of fidelity and prior terms, into a closely related, novel optimization problem. Then, we propose an efficient minimization scheme with a plug-and-play property, i.e., the prior term is handled solely by a denoising operation. Finally, we present an automatic tuning mechanism to set the method's parameters. We provide a theoretical analysis of the method, and empirically demonstrate its impressive results for image inpainting, deblurring and super-resolution. For the latter, we also present an image-adaptive learning approach that further improves the results.
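As a toy illustration of the plug-and-play property (not the method proposed in the talk; the stand-in denoiser, names, and parameter values here are ours), a half-quadratic-splitting-style iteration for 1-D inpainting in which the prior is handled solely by a denoising step:

```python
import numpy as np

def denoise(z):
    """Stand-in denoiser: 3-tap moving average with replicated edges.
    In practice this would be an off-the-shelf denoiser (e.g. a CNN)."""
    zp = np.pad(z, 1, mode="edge")
    return (zp[:-2] + zp[1:-1] + zp[2:]) / 3.0

def pnp_inpaint(y, mask, mu=1.0, iters=100):
    """Alternate a closed-form data-fidelity step with a denoising step."""
    x = np.where(mask, y, 0.0)
    for _ in range(iters):
        z = denoise(x)                                  # prior term: denoising only
        x = np.where(mask, (y + mu * z) / (1 + mu), z)  # fidelity on observed entries
    return x

# Smooth signal with roughly 30% of the samples missing.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 200)
truth = np.sin(2 * np.pi * t)
mask = rng.random(200) > 0.3
y = np.where(mask, truth, 0.0)
x_hat = pnp_inpaint(y, mask)
print(np.max(np.abs(x_hat - truth)))  # small error on this toy problem
```

The point of the pattern is that only the two scalar parameters (mu, iters) are exposed; the prior lives entirely inside `denoise`.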

We study the Excluded Grid Theorem, a fundamental structural result in graph theory that was proved by Robertson and Seymour in their seminal work on graph minors. The theorem states that there is a function f, such that for every integer g > 0, every graph of treewidth at least f(g) contains the (g x g)-grid as a minor. For every integer g > 0, let f(g) be the smallest value for which the theorem holds. Establishing tight bounds on f(g) is an important graph-theoretic question. Robertson and Seymour showed that f(g) is at least of order g^2 log g. For a long time, the best known upper bounds on f(g) were super-exponential in g. The first polynomial upper bound of f(g) = O(g^98 poly log g) was proved by Chekuri and Chuzhoy. It was later improved to f(g) = O(g^36 poly log g), and then to f(g) = O(g^19 poly log g). In this talk we present our recent work that further improves this bound to f(g) = O(g^9 poly log g) via a simpler proof. Moreover, while there are natural barriers that seem to prevent the previous methods from yielding tight bounds for the theorem, it seems conceivable that the techniques proposed in this thesis can lead to even tighter bounds on f(g).

As was observed a few years ago, there exists a certain signed count of real lines on real projective hypersurfaces of degree 2n+1 and dimension n that, contrary to the honest "cardinal" count, is independent of the choice of a hypersurface, and for this reason provides a strong lower bound on the honest count. Originally, in this invariant signed count the sign of a line was given by its local contribution to the Euler number of an appropriate auxiliary universal vector bundle.

The aim of the talk is to present other, in a sense more geometric, interpretations of the signs involved in the invariant count. In particular, this provides certain generalizations of Segre indices of real lines on cubic surfaces and Welschinger-Solomon weights of real lines on quintic threefolds.

This is a joint work with S. Finashin.

The theory of locally compact quantum groups grew out of the need to extend Pontryagin's duality for locally compact abelian groups to a wider class of objects, as well as from a modern "quantum" point of view suggesting the replacement of some algebras of functions on a group by non-commutative objects, namely operator algebras. In this talk, which will be split into two parts, we will show how several fundamental notions from probability and geometric group theory fit in this framework.

The first part will be an introduction to locally compact quantum groups. We will present the rationale and the definitions, give examples, and explain how the theory is related to other branches of math. If time permits, we will also touch upon more specific notions related to the second part.

In the second part we will discuss convolution semigroups of states, as well as generating functionals, on locally compact quantum groups. One type of examples comes from probability: the family of distributions of a L\'evy process form a convolution semigroup, which in turn admits a natural generating functional. Another type of examples comes from (locally compact) group theory, involving semigroups of positive-definite functions and conditionally negative-definite functions, which provide important information about the group's geometry. We will explain how these notions are related and how all this extends to the quantum world; derive geometric characterizations of two approximation properties of locally compact quantum groups; see how generating functionals may be (re)constructed and study their domains; and indicate how our results can be used to study cocycles.

Based on joint work with Adam Skalski.

No background in operator algebras will be assumed.

Modern robotic and vision systems are often equipped with a direct 3D data acquisition device, e.g. a LiDAR or RGBD camera, which provides a rich 3D point cloud representation of the surroundings. Point clouds have been used successfully for localization and mapping tasks, but their use in semantic understanding has not been fully explored. Recent advances in deep learning methods for images along with the growing availability of 3D point cloud data have fostered the development of new 3D deep learning methods that use point clouds for semantic understanding. However, their unstructured and unordered nature makes them an unnatural input to deep learning methods. In this work we propose solutions to three semantic understanding and geometric processing tasks: point cloud classification, segmentation, and normal estimation. We first propose a new global representation for point clouds called the 3D Modified Fisher Vector (3DmFV). The representation is structured and independent of order and sample size. As such, it can be used with 3DmFV-Net, a newly designed 3D CNN architecture for classification. The representation introduces a conceptual change for processing point clouds by using a global and structured spatial distribution. We demonstrate the classification performance on the ModelNet40 CAD dataset and the Sydney outdoor dataset obtained by LiDAR. We then extend the architecture to solve a part segmentation task by performing per point classification. The results here are demonstrated on the ShapeNet dataset. We use the proposed representation to solve a fundamental and practical geometric processing problem of normal estimation using a new 3D CNN (Nesti-Net). To that end, we propose a local multi-scale representation called Multi Scale Point Statistics (MuPS) and show that using structured spatial distributions is also as effective for local multi-scale analysis as for global analysis.
We further show that multi-scale data integrates well with a Mixture of Experts (MoE) architecture. The MoE enables the use of semi-supervised scale prediction to determine the appropriate scale for a given local geometry. We evaluate our method on the PCPNet dataset. For all methods we achieved state-of-the-art performance without using an end-to-end learning approach.

Generative Adversarial Networks (GANs) are a recent algorithmic framework that has won considerable attention. In a nutshell, a GAN receives as input an IID sample and outputs synthetic data that should resemble data from the true underlying distribution. For example, consider an algorithm that receives as input some tunes from a specific music genre (e.g. jazz, rock, pop) and then outputs a new, original, tune from that genre.

From a theoretical perspective, the distinction between algorithms that genuinely generate original new examples and algorithms that perform naive manipulations (or even mere memorization) of the input sample is elusive. This makes the theoretical analysis of GAN algorithms challenging.

In this work we introduce two mathematical frameworks for the task of generating synthetic data. The first model we consider is inspired by GANs, and the learning algorithm has only an indirect access to the target distribution via a discriminator. The second model, called DP-Foolability, exploits the notion of differential privacy as a criterion for "non-memorization".

We characterize learnability in each of these models and discuss the interrelations between them. As an application we prove that privately PAC learnable classes are DP-foolable. As we will discuss, this can be seen as an analogue of the equivalence between uniform convergence and learnability in classical PAC learning.

Joint work with Olivier Bousquet and Shay Moran.

https://arxiv.org/pdf/1902.03468.pdf

Let F be a function field and G a connected split reductive group over F. We define a "strange" operator between different spaces of automorphic functions on G(A)/G(F), and show that this operator is natural from the viewpoint of the geometric Langlands program via the functions-sheaves dictionary. We discuss how to define this operator over a number field by relating it to pseudo-Eisenstein series and inversion of the standard intertwining operator. This operator is also connected to Deligne-Lusztig duality and cohomological duality of representations over a local field.

The prediction of interactions between nonlinear laser beams is a longstanding open problem. A traditional assumption is that these interactions are deterministic. We have shown, however, that in the nonlinear Schrodinger equation (NLS) model of laser propagation, beams lose their initial phase information in the presence of input noise. Thus, the interactions between beams become unpredictable as well. Not all is lost, however. The statistics of many interactions are predictable by a universal model.

Computationally, the universal model is efficiently solved using a novel spline-based stochastic computational method. Our algorithm efficiently estimates probability density functions (PDFs) that result from differential equations with random input. This is a new and general problem in numerical uncertainty-quantification (UQ), which leads to surprising results and analysis at the intersection of probability and approximation theory.
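For orientation, here is the plain Monte Carlo baseline for this kind of UQ problem (ours, not the spline-based method of the talk): estimate the PDF of the solution of a toy ODE with a random coefficient, in a case where the answer is known in closed form.

```python
import numpy as np

# Toy problem: x'(s) = -a x(s), x(0) = 1, with a uniform on [0, 1];
# we estimate the density of x(1) = e^{-a} by sampling.
rng = np.random.default_rng(0)
a = rng.uniform(0.0, 1.0, size=200_000)
x1 = np.exp(-a)

# Exact check: for u in [e^{-1}, 1], P(x(1) <= u) = P(a >= -ln u) = 1 + ln u,
# so the exact pdf of x(1) is 1/u on that interval.
hist, edges = np.histogram(x1, bins=50, range=(np.exp(-1), 1.0), density=True)
mids = (edges[:-1] + edges[1:]) / 2
print(np.max(np.abs(hist - 1.0 / mids)))  # histogram approximates 1/u
```

A spline-based density estimator would replace the histogram with a smooth representation, which is where the approximation-theoretic questions enter.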

A homomorphic secret-sharing scheme is a secret-sharing scheme that allows locally mapping shares of a secret to compact shares of a function of the secret. The talk will survey the current state of the art on homomorphic secret sharing, covering efficient constructions, applications in cryptography and complexity theory, and open questions.

We establish two-sided bounds for expectations of order statistics (k-th maxima) of moduli of coordinates of centred log-concave random vectors with uncorrelated coordinates. Our bounds are exact up to multiplicative universal constants in the unconditional case for all k and in the isotropic case for k < c n^{5/6}. We also derive two-sided estimates for expectations of sums of k largest moduli of coordinates for some classes of random vectors. The talk will be based on the joint work with Rafal Latala.

A classical theorem of Mittag-Leffler asserts that on a given Riemann surface X, for any pattern of multiplicities of poles and any configuration of residues (summing to zero), there is a meromorphic 1-form on X that realizes them. The only obstruction is that residues at simple poles should be nonzero.

If we require that the multiplicities of the zeroes also be prescribed, the problem can be reformulated in terms of strata of meromorphic differentials. Using the dictionary between complex analysis and flat geometry, we are able to provide a complete characterization of the configurations of residues that are realized for a given pattern of singularities. Two nontrivial obstructions appear, concerning the combinatorics of the multiplicities of the zeroes and the arithmetic of the residues. This is a joint work with Quentin Gendron.

We will discuss several results relating the behavior of a random walk on a planar graph to the geometric properties of a nice embedding of the graph in the plane (specifically, a circle packing of the graph). One such result is that for a bounded degree graph, the simple random walk is recurrent if and only if the boundary of the nice embedding is a polar set (that is, Brownian motion misses it almost surely). If the degrees are unbounded, this is no longer true, but for the case of a circle packing of a triangulation, there are weights which are obtained naturally from the circle packing, such that when the boundary is polar, the weighted random walk is recurrent (we believe the converse also holds). These weights arise also in the context of discrete holomorphic and harmonic functions, a discrete analog of complex holomorphic functions. We show that as the sizes of circles, or more generally, the lengths of edges in the nice embedding of the graph tend to zero, the discrete harmonic functions converge to their continuous counterparts with the same boundary conditions. Equivalently, the exit measure of the weighted random walk converges to the exit measure of standard Brownian motion. This improves previous results of Skopenkov 2013 and Werness 2015, who proved similar results under additional local and global assumptions on the embedding. In particular, we make no assumptions on the degrees of the graph, making the result applicable to models of random planar maps.

Based on joint works with Daniel Jerison, Asaf Nachmias, Matan Seidel and Juan Souto.

Speaker 1: Fanny Augeri (WIS)

Title: Nonlinear large deviations bounds with applications to sparse Erdos-Renyi graphs.

Abstract: In this talk, I will present the framework of the so-called nonlinear large deviations introduced by Chatterjee and Dembo. In a seminal paper, they provided a sufficient criterion for the large deviations of a function on the discrete hypercube to be due only to a change of the mean of the background measure. This sufficient condition was formulated in terms of the complexity of the gradient of the function of interest. I will present general nonlinear large deviation estimates similar to Chatterjee-Dembo's original bounds, except that we do not require any second-order smoothness. The approach relies on convex analysis arguments and is valid for a broad class of distributions. Then, I will detail an application of these nonlinear large deviation bounds to the problem of estimating the upper tail of cycle counts in sparse Erdos-Renyi graphs down to the connectivity parameter $n^{-1/2}$.

Speaker 2: Elliot Paquette (Ohio)

Title: The Gaussian analytic function is either bounded or covers the plane

Abstract: The Gaussian analytic function (GAF) is a power series with independent Gaussian coefficients. In the case that this power series has radius of convergence 1, under mild regularity assumptions on the coefficients, it is a classical theorem that the power series is a.s. bounded on the open disk if and only if it a.s. extends continuously to a function on the closed unit disk. Nonetheless, there exists a natural range of coefficients in which the GAF has boundary values in L^p, but is a.s. unbounded. How wild are these boundary values? Well, Kahane asked whether a GAF either a.s. extends continuously to the closed disk or a.s. has range covering the whole plane. We show partial progress toward establishing this in the affirmative.

Joint with Alon Nishry.

TBA

An (\epsilon,\phi)-expander decomposition of a graph G=(V,E) with m edges is a partition of the vertices into clusters such that (1) each cluster induces a subgraph with conductance at least \phi, and (2) the number of inter-cluster edges is at most \epsilon m. This decomposition has a wide range of applications, including approximation algorithms for Unique Games, fast algorithms for flow and cut problems, and dynamic graph algorithms.
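To fix notation, here is a small Python sketch (ours, not the algorithm of the talk) of the two quantities in the definition: the inter-cluster edge fraction of a partition, and the conductance of a given cut inside one cluster's induced subgraph.

```python
def inter_cluster_fraction(edges, cluster_of):
    """Fraction of edges whose endpoints lie in different clusters."""
    crossing = sum(1 for u, v in edges if cluster_of[u] != cluster_of[v])
    return crossing / len(edges)

def cut_conductance(edges, cluster, s):
    """Conductance of the cut (s, cluster \\ s) in the subgraph induced
    by `cluster`: |E(S, rest)| / min(vol(S), vol(rest))."""
    inside = [(u, v) for u, v in edges if u in cluster and v in cluster]
    cut = sum(1 for u, v in inside if (u in s) != (v in s))
    deg = {}
    for u, v in inside:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    vol_s = sum(deg.get(u, 0) for u in s)
    vol_rest = sum(d for u, d in deg.items() if u not in s)
    return cut / min(vol_s, vol_rest)

# Two triangles joined by one edge; the natural 2-clustering cuts 1 of 7 edges.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cluster_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(inter_cluster_fraction(edges, cluster_of))  # 1/7
print(cut_conductance(edges, {0, 1, 2}, {0}))     # 2 / min(2, 4) = 1.0
```

The conductance of a cluster is the minimum of `cut_conductance` over all cuts inside it; certifying that minimum is the hard part an expander-decomposition algorithm must handle.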

I will describe a new algorithm for constructing (~O(\phi),\phi)-expander decomposition in ~O(m/\phi) time. This is the first nearly linear time algorithm when \phi is at least 1/polylog(m), which is the case in most practical settings and theoretical applications. Previous results either take \Omega(m^{1+o(1)}) time, or attain nearly linear time but with a weaker expansion guarantee where each output cluster is guaranteed to be contained inside some unknown expander.

Our technique can be easily extended to the dynamic setting, where the graph undergoes updates. This talk is based on joint work with Di Wang [Saranurak Wang SODA'19].

Following Ulam, Banach asked about the existence of a dynamical system on the real line with simple Lebesgue spectrum. I will discuss the connection of this problem to the famous Banach problem in ergodic theory, due essentially to Rokhlin. I will further present my recent contribution to this problem and the connection to the so-called Erdos flat polynomials problem in harmonic analysis, due to J. Bourgain. My talk is based on my recent work and joint work with Mahendra Nadkarni (CBS Mumbai, India).

Poincare duality provides an isomorphism between the homology and cohomology of a compact manifold, up to a shift. For \pi-finite spaces, i.e. spaces with finitely many non-zero homotopy groups, all of which are finite, there is a similar duality only for Q-coefficients, but no such duality exists with F_p coefficients. However, as shown by Michael Hopkins and Jacob Lurie, there is a duality between the homology and cohomology of \pi-finite spaces with coefficients in some extraordinary cohomology theories called Morava K-theories. This property of Morava K-theory is called ambidexterity. I will explain what ambidexterity is, some of its consequences, and our contribution to the subject.

This is a joint work with Lior Yanovski and Tomer Schlank.

The purpose of this talk is to understand the behavior of the extremal zeros of random polynomials of the form $ P_N(z) = \sum_{k=0}^{N} a_k R_k(z)$ where the family $(R_k)_{k \leq N}$ is an orthonormal basis for the scalar product $\langle P,Q \rangle = \int P(z) \overline{Q(z)} e^{-2N V^{\nu}(z)} d\nu(z)$ with $\nu$ a radial probability measure on $\CC$ and $V^{\nu}(z)= \int \log |z-w|d\nu(w)$.

Although the zeros of these polynomials spread like the measure $\nu$, the zeros of maximum modulus lie outside of the support. More precisely, the point process of the roots outside of the support of the equilibrium measure converges towards the Bergman point process of the complement of the support.

We also study similar results on a model of Coulomb gases in dimension $2$ where the confining potential is generated by the presence of a fixed background of positive charges. If $\nu$ is a probability measure, we study the system of particles with joint distribution on $\CC^N$, $\frac{1}{Z_N} \prod_{i < j} |x_i-x_j|^2 e^{-2(N+1)\sum_{k=1}^{N}V^{\nu}(x_k)} d\ell_{\CC^N}(x_1,\dots,x_N).$ This model is closely related to the study of the zeros of random polynomials. We show that the extremal particles of this Coulomb gas present a similar behavior to the random polynomial setting.

All the results mentioned above are done in collaboration with David Garcia-Zelada.

Consider a simple random walk on $\mathbb{Z}$ with a random coloring of $\mathbb{Z}$. Look at the sequence of the first $N$ steps taken and the colors of the visited locations. From it, you can deduce the coloring of approximately $\sqrt{N}$ integers. Suppose an adversary may change $\delta N$ entries in that sequence. What can be deduced now? We show that for any $\theta<0.5$, $p>0$, there are $N_{0},\delta_{0}$ such that if $N>N_{0}$ and $\delta<\delta_{0}$ then with probability $>1-p$ we can reconstruct the coloring of $>N^{\theta}$ integers.
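As a warm-up for the adversarial question, a small simulation (ours) of the noiseless observation: the steps and observed colors determine the coloring of every visited site, and a walk of N steps typically visits on the order of sqrt(N) distinct sites.

```python
import random

def observe(colors, steps):
    """Colors seen along a walk started at 0 (colors: dict site -> color)."""
    pos, seen = 0, [colors[0]]
    for s in steps:
        pos += s
        seen.append(colors[pos])
    return seen

def reconstruct(steps, seen):
    """Recover the coloring of all visited sites by replaying the steps."""
    pos, known = 0, {0: seen[0]}
    for s, c in zip(steps, seen[1:]):
        pos += s
        known[pos] = c
    return known

random.seed(0)
N = 10_000
colors = {i: random.randint(0, 1) for i in range(-N, N + 1)}
steps = [random.choice([-1, 1]) for _ in range(N)]
known = reconstruct(steps, observe(colors, steps))
assert all(colors[i] == c for i, c in known.items())  # every recovered color is correct
print(len(known))  # on the order of sqrt(N) distinct sites
```

With an adversary corrupting delta*N entries, this naive replay fails (a single flipped step shifts every later position), which is what makes the theorem nontrivial.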

In this talk I will present two recent works:

1) **Dynamic Temporal Alignment of Speech to Lips**

Many speech segments in movies are re-recorded in a studio during post-production, to compensate for poor sound quality as recorded on location. Manual alignment of the newly-recorded speech with the original lip movements is a tedious task. We present an audio-to-video alignment method for automating speech to lips alignment, stretching and compressing the audio signal to match the lip movements. This alignment is based on deep audio-visual features, mapping the lips video and the speech signal to a shared representation. Using this shared representation we compute the lip-sync error between every short speech period and every video frame, followed by the determination of the optimal corresponding frame for each short sound period over the entire video clip. We demonstrate successful alignment both quantitatively, using a human perception-inspired metric, as well as qualitatively. The strongest advantage of our audio-to-video approach is in cases where the original voice is unclear, and where a constant shift of the sound can not give perfect alignment. In these cases, state-of-the-art methods will fail.

2) **Neural separation of observed and unobserved distributions**

Separating mixed distributions is a long-standing challenge for machine learning and signal processing. Most current methods either rely on making strong assumptions on the source distributions or rely on having training samples of each source in the mixture. In this work, we introduce a new method - Neural Egg Separation - to tackle the scenario of extracting a signal from an unobserved distribution additively mixed with a signal from an observed distribution. Our method iteratively learns to separate the known distribution from progressively finer estimates of the unknown distribution. In some settings, Neural Egg Separation is initialization-sensitive; we therefore introduce Latent Mixture Masking, which ensures a good initialization. Extensive experiments on audio and image separation tasks show that our method outperforms current methods that use the same level of supervision, and often achieves similar performance to full supervision.

We will discuss the field of definition of a rational function and in what ways it can change under iteration, in particular when the degree over the base field drops. We present two families of rational functions with the property above, and prove that in the special case of polynomials, only one of these families is possible. We also explain how this relates to Ritt's decomposition theorem on polynomials. Joint work with Francesco Veneziano (SNS Pisa).

We explore the power of interactive proofs with a distributed verifier. In this setting, the verifier consists of n nodes and a graph G that defines their communication pattern. The prover is a single entity that communicates with all nodes by short messages. The goal is to verify that the graph G belongs to some language in a small number of rounds, and with small communication bound, i.e., the proof size.

This interactive model was introduced by Kol, Oshman and Saxena (PODC 2018) as a generalization of non-interactive distributed proofs. They demonstrated the power of interaction in this setting by constructing protocols for problems such as Graph Symmetry and Graph Non-Isomorphism -- both of which require proofs of Omega(n^2) bits without interaction.

In this work, we provide a new general framework for distributed interactive proofs that allows one to translate standard interactive protocols (i.e., with a centralized verifier) to ones where the verifier is distributed with a proof size that depends on the computational complexity of the verification algorithm run by the centralized verifier. We show the following:

* Every (centralized) computation performed in time O(n) on a RAM can be translated into a three-round distributed interactive protocol with O(log n) proof size. This implies that many graph problems for sparse graphs have succinct proofs (e.g., testing planarity).

* Every (centralized) computation implemented in either small space or by a uniform NC circuit can be translated into a distributed protocol with O(1) rounds and O(log n)-bit proof size in the small-space case, and polylog(n) rounds and proof size for NC.

* We show that for Graph Non-Isomorphism, one of the striking demonstrations of the power of interaction, there is a 4-round protocol with O(log n) proof size, improving upon the O(n log n) proof size of Kol et al.

* For many problems, we show how to reduce proof size below the seemingly natural barrier of log n. By employing our RAM compiler, we get a 5-round protocol with proof size O(loglog n) for a family of problems including Fixed Automorphism, Clique and Leader Election (for the latter two problems we actually get O(1) proof size).

* Finally, we discuss how to make these proofs non-interactive *arguments* via random oracles.

Our compilers capture many natural problems and demonstrate the difficulty in showing lower bounds in these regimes.

Joint work with Moni Naor and Merav Parter.

A conjunction over a binary alphabet is a boolean function specified by a length-n pattern of 0's, 1's and wildcards. On input bit strings of length n, the function outputs 1 if the input matches the pattern at all non-wildcard positions. At CRYPTO 2018, Bishop et al. proposed a simple and elegant construction to obfuscate this class of functions by embedding the pattern in the error positions of a noisy Reed-Solomon codeword, and placing the codeword in a group exponent. They prove their scheme achieves a notion of security called "distributional virtual black box" in the generic group model for random conjunctions with at most 0.774n wildcards.

In this talk, I'll show how to abstract the Bishop et al. scheme to obtain a significantly more efficient "dual" scheme. In the generic group model, our scheme admits an intuitive proof of security and does not place any restrictions on the number of wildcards.

Next, I'll describe a simple modification to the construction that avoids encoding in a group exponent and is secure under the Learning Parity with Noise (LPN) assumption. At the heart of our security proof is a reduction from standard LPN to LPN with "structured error."

No prior knowledge of the Bishop et al. paper will be assumed.
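For concreteness, the class of functions being obfuscated can be evaluated in a few lines (plain evaluation only; the obfuscation itself is of course the hard part):

```python
def make_conjunction(pattern):
    """Boolean function from a pattern over {'0', '1', '*'}, where '*' is a wildcard."""
    fixed = [(i, c) for i, c in enumerate(pattern) if c != '*']
    return lambda x: int(all(x[i] == c for i, c in fixed))

f = make_conjunction("1*0*")
print(f("1101"), f("1000"), f("0100"))  # outputs 1 only on non-wildcard matches: 1 1 0
```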

We introduce a new method for obtaining quantitative convergence rates for the central limit theorem (CLT) in the high dimensional regime. The method is based on the notion of a martingale embedding, a multivariate analogue of Skorokhod's embedding. Using the method we are able to obtain several new bounds for convergence in transportation distance and in entropy, and in particular: (a) We improve the best known bound, for convergence in quadratic Wasserstein transportation distance for bounded random vectors; (b) We derive a non-asymptotic convergence rate for the entropic CLT in arbitrary dimension, for log-concave random vectors; (c) We give an improved bound for convergence in transportation distance under a log-concavity assumption and improvements for both metrics under the assumption of strong log-concavity.

In this talk, we will review the method, and explain how one might use it in order to prove quantitative statements about rates of convergence.
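For reference, the quadratic Wasserstein distance appearing in (a) is

$$ W_2(\mu,\nu) = \Big( \inf_{\pi \in \Pi(\mu,\nu)} \int \|x-y\|^2 \, d\pi(x,y) \Big)^{1/2}, $$

where $\Pi(\mu,\nu)$ denotes the set of couplings of $\mu$ and $\nu$, and the object of study is the distance between the law of the normalized sum $S_n = n^{-1/2}\sum_{i=1}^n X_i$ and the corresponding Gaussian.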

We introduce a 3D shape generative model based on deep neural networks. A new image-like (i.e., tensor) data representation for genus-zero 3D shapes is devised. It is based on the observation that complicated shapes can be well represented by multiple parameterizations (charts), each focusing on a different part of the shape.

The new tensor data representation is used as input to Generative Adversarial Networks for the task of 3D shape generation. The 3D shape tensor representation is based on a multi-chart structure that enjoys a shape covering property and scale-translation rigidity. Scale-translation rigidity facilitates high quality 3D shape learning and guarantees unique reconstruction. The multi-chart structure takes as input a dataset of 3D shapes (with arbitrary connectivity) and a sparse correspondence between them.

The output of our algorithm is a generative model that learns the shape distribution and is able to generate novel shapes, interpolate shapes, and explore the generated shape space. The effectiveness of the method is demonstrated for the task of anatomic shape generation including human body and bone (teeth) shape generation.

Recently, the Riemannian geometry of the space of symmetric positive-definite matrices

* Joint work with Or Yair, Ori Katz, and Miri Ben-Chen

Computing shortest paths is one of the fundamental problems of graph algorithms. The goal of *dynamic* all pairs shortest paths (APSP) is to maintain shortest path trees from all vertices as the edges of the graph change over time. The algorithm is said to be decremental if it handles only deletions, incremental if it handles only insertions and fully dynamic if it handles both deletions and insertions.

In this talk I will present a near optimal decremental algorithm that maintains approximate all pairs shortest paths.
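As a point of contrast (this is not the near-optimal algorithm of the talk), decremental APSP can always be handled by recomputing from scratch after each deletion, at the cost of one BFS per source per update; the dynamic algorithms are designed to beat exactly this baseline:

```python
from collections import deque

def bfs_dist(adj, s):
    """Single-source distances in an unweighted graph given as {v: set(neighbors)}."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def apsp(adj):
    # naive baseline: a full BFS from every source after each update
    return {s: bfs_dist(adj, s) for s in adj}

adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}  # a 4-cycle
before = apsp(adj)
adj[0].discard(3); adj[3].discard(0)                # decremental update: delete edge {0, 3}
after = apsp(adj)
```

Deleting the edge {0, 3} stretches the distance from 0 to 3 from 1 to 3; a decremental algorithm maintains such distances without the full recomputation.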

The sharp threshold phenomenon is a central topic of research in the analysis of Boolean functions. Here, one aims to give sufficient conditions for a monotone Boolean function $f$ to satisfy $\mu_p(f)=o(\mu_q(f))$, where $q = p + o(p)$, and $\mu_p(f)$ is the probability that $f=1$ on an input with independent coordinates, each taking the value $1$ with probability $p$. The dense regime, where $\mu_p(f)=\Theta(1)$, is somewhat understood due to seminal works by Bourgain, Friedgut, Hatami, and Kalai. On the other hand, the sparse regime, where $\mu_p(f)=o(1)$, was out of reach of the available methods, although its potential power was envisioned by Kahn and Kalai already in 2006.

In this talk we show that if a monotone Boolean function $f$ with $\mu_p(f)=o(1)$ satisfies some mild pseudo-randomness conditions, then it exhibits a sharp threshold in the interval $[p,q]$, with $q = p+o(p)$. More specifically, our mild pseudo-randomness hypothesis is that the $p$-biased measure of $f$ does not bump up to $\Theta(1)$ whenever we restrict $f$ to a sub-cube of constant co-dimension, and our conclusion is that we can find $q=p+o(p)$ such that $\mu_p(f)=o(\mu_q(f))$.

At its core, this theorem stems from a novel hypercontractive theorem for Boolean functions satisfying pseudorandom conditions, which we call `small generalized influences'. This result takes on the role of the usual hypercontractivity theorem, but is significantly more effective in the regime where $p = o(1)$.

We demonstrate the power of our sharp threshold result by reproving the recent breakthrough result of Frankl on the celebrated Erdos matching conjecture, and by proving conjectures of Huang--Loh--Sudakov and Furedi--Jiang for a new wide range of the parameters. Based on a joint work with Keevash, Long, and Minzer.

In recent years, deep neural networks (DNNs) achieved unprecedented performance in many vision tasks. However, state-of-the-art results are typically achieved by very deep networks, which can reach tens of layers with tens of millions of parameters. To make DNNs implementable on platforms with limited resources, it is necessary to weaken the tradeoff between performance and efficiency. In this work, we propose a new activation unit, which is suitable for both high and low level vision problems. In contrast to the widespread per-pixel activation units, like ReLUs and sigmoids, our unit implements a learnable nonlinear function with spatial connections. This enables the net to capture much more complex features, thus requiring a significantly smaller number of layers in order to reach the same performance. We illustrate the effectiveness of our units through experiments with state-of-the-art nets for classification, denoising, and super resolution. With our approach, we are able to reduce the size of these models by nearly 50% without incurring any degradation in performance.

*Spotlight presentation at CVPR'18 (+ submitted extension)

A joint work with Tamar Rott Shaham and Tomer Michaeli

I will talk about my joint work with Dave Benson which constructs new symmetric tensor categories in characteristic 2 arising from modular representation theory of elementary abelian 2-groups, and about its conjectural generalization to characteristic p>2. I will also discuss my work with Gelaki and Coulembier which shows that integral symmetric tensor categories in characteristic p>2 whose simple objects form a fusion category are super-Tannakian (i.e., representation categories of a supergroup scheme).

The Blind Source Separation (BSS) problem consists of retrieving signals of interest, termed the sources, from a dataset consisting of their mixtures. One of the most popular and common paradigms for solving the BSS problem is Independent Component Analysis (ICA), where the sources are assumed to be (only) mutually statistically independent random processes, and the mixtures are assumed to be linear combinations thereof, where the linear mixing operator is unknown. In this talk, we shall start with the Gaussian Maximum Likelihood (GML) approach for the semi-blind problem, in which the sources are assumed to be temporally-diverse

The talk will be about two classical problems in the theory of Carleman classes of smooth functions. The first one is to describe the image of a Carleman class under the Borel map (which maps a smooth function to its jet of Taylor coefficients at a given point). The second one concerns possible ways to construct a function in a given Carleman class with prescribed Taylor coefficients. I will present solutions to both problems. If time permits, I will also discuss related problems which originate in the singularity theory of Carleman classes.

The reducibility and structure of parabolic inductions is a basic problem in the representation theory of p-adic groups. Of particular interest are principal series and degenerate principal series representations, that is, parabolic inductions of 1-dimensional representations of Levi subgroups.

In this talk, I will start by describing the functor of normalized induction and its left adjoint, the Jacquet functor, and then, by going through several examples in the group SL_4(Q_p), I will describe an algorithm which can be used to determine the reducibility of such representations.

This algorithm is the core of a joint project with Hezi Halawi, in which we study the structure of degenerate principal series of exceptional groups of type En (see https://arxiv.org/abs/1811.02974).

Recent years have seen the emergence of new kinds of software including deep learning, programmable computer networks, and blockchains. Unfortunately, these systems have been shown to suffer from critical safety and security errors, affecting their wider adoption. The goal of my research is to develop new automated program verification and synthesis techniques which ensure safety and reliability of these systems.

In this talk, I will start by introducing AI2, the first automated verifier for neural networks able to certify large convolutional models. The key idea behind AI2 is to bridge abstract interpretation and neural networks, enabling a sound over-approximation of a network’s behavior in a scalable manner. I will then briefly discuss DL2, a system which enables clean interaction with deep learning models by allowing users to pose queries in a declarative manner. Finally, I will demonstrate how automated program analysis and synthesis can address key security and reliability challenges in domains such as computer networks and blockchains, preventing severe outages and financial losses.

Bio: Dana Drachsler-Cohen is an ETH Postdoctoral Fellow at the department of Computer Science, ETH Zurich. Her research interests span automated reasoning, program synthesis and machine learning. She obtained her PhD from the Computer Science Department at the Technion in 2017.

To compute the spectral norm of a p-tensor one needs to optimize a homogeneous polynomial of degree p over the sphere. When p=2 (the matrix case) it is algorithmically easy, but for p>2 it can be NP-hard. In this talk I will focus on (randomized) optimization in high-dimensions when the objective function is a linear combination of homogeneous polynomials with Gaussian coefficients. Such random functions are called spherical spin glasses in physics and have been extensively studied since the 80s. I will describe certain geometric properties of spherical spin glasses unique to the full-RSB case, and explain how they can be used to design a polynomial time algorithm that finds points within a small multiplicative error from the global maximum.
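A minimal numerical sketch of the setup: a random degree-3 homogeneous polynomial with Gaussian coefficients, maximized over the unit sphere by plain projected gradient ascent (an illustrative baseline, not the polynomial-time algorithm of the talk):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
A = rng.normal(size=(n, n, n))
# symmetrize the coefficient tensor so the gradient formula below is exact
A = sum(A.transpose(p) for p in
        [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]) / 6

def H(x):
    """Homogeneous degree-3 polynomial with Gaussian coefficients (a 3-spin Hamiltonian)."""
    return np.einsum('ijk,i,j,k->', A, x, x, x)

x = rng.normal(size=n)
x /= np.linalg.norm(x)
h0 = H(x)
for _ in range(300):                       # projected (Riemannian) gradient ascent on the sphere
    g = 3 * np.einsum('ijk,j,k->i', A, x, x)
    g -= (g @ x) * x                       # project the gradient onto the tangent space
    x = x + 1e-3 * g
    x /= np.linalg.norm(x)                 # retract back to the sphere
```

Such local ascent finds some critical point; the point of the talk is that in the full-RSB case one can provably get within a small multiplicative error of the global maximum.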

We tackle the problem of multiple image alignment and 3D reconstruction under extreme noise.

Modern alignment schemes, based on similarity measures, feature matching and optical flow, are often pairwise, or assume global alignment. Nevertheless, under extreme noise, alignment success sharply deteriorates, since each image does not contain enough information. Yet, when multiple images are well aligned, the signal emerges from the stack.

As the problems of alignment and 3D reconstruction are coupled, we constrain the solution by taking into account only alignments that are geometrically feasible and solve for the entire stack simultaneously. The solution is formulated as a variational problem where only a single variable per pixel, the scene's distance, is solved for. Thus, the complexity of the algorithm is independent of the number of images. Our method outperforms state-of-the-art techniques, as indicated by our simulations and by experiments on real-world scenes.

Finally, I will discuss how this algorithm relates to our current and planned work with underwater robots.

We show that when data are endogenously truncated, the widely-used IV strategy fails to render the relationship causal and introduces bias into the exogenous covariates. We offer a newly-introduced

A family of sets F is said to satisfy the (p,q)-property if among any p sets in F, some q have a non-empty intersection. Hadwiger and Debrunner (1957) conjectured that for any p > q > d there exists a constant c = c_d(p,q), such that any family of compact convex sets in R^d that satisfies the (p,q)-property, can be pierced by at most c points. The classical Helly's Theorem is equivalent to the fact that c_d(p,p)=1 (p > d).

In a celebrated result from 1992, Alon and Kleitman proved the conjecture. However, obtaining sharp bounds on the minimal such c_d(p,q), called `the Hadwiger-Debrunner numbers', is still a major open problem in combinatorial geometry.

In this talk we present improved upper and lower bounds on the Hadwiger-Debrunner numbers, the latter using the hypergraph container method.

Based on joint works with Shakhar Smorodinsky and Gabor Tardos.

Godofredo Iommi (PUC Chile)

**Title: Upper semi-continuity of the entropy map for Markov shifts**

Abstract: In this talk I will show that for finite entropy countable Markov shifts the entropy map is upper semi-continuous when restricted to the set of ergodic measures. This is joint work with Mike Todd and Anibal Velozo.

Paul Dario (ENS)

**Title: Homogenization on supercritical percolation cluster**

Abstract: The standard theory of stochastic homogenization requires an assumption of uniform ellipticity on the environment. In this talk, we investigate how one can remove this assumption in a specific case: the infinite cluster of supercritical Bernoulli percolation on Z^d. We will present a renormalization argument for the infinite cluster and show how one can use it to adapt the theory developed in the uniformly elliptic setting. We will then review some results which can be obtained through this technique: a homogenization theorem, large-scale regularity, a Liouville theorem and bounds on the corrector.

The common practice in image generation is to train a network and fix it to a specific working point. In contrast, in this work we would like to provide control at inference time in a generic way. We propose to add a second training phase, where we train an additional tuning block. The goal of these tuning blocks is to provide us with control over the output image by modifying the latent space representation. I will first present our latest attempt in the domain of generative adversarial networks, where we aim to improve the quality of the results using the discriminator information -- we name this the adversarial feedback loop. Second, I will present Dynamic-Net, where we train a single network such that it can emulate many networks trained with different objectives. The main assumption of both works is that we can learn how to change the latent space in order to achieve a specific goal. Evaluation on many applications shows that the latent space can be manipulated such that it gives us diversified control at inference time.

* Joint work with Firas Shama*, Alon Shoshan* and Lihi Zelnik Manor.

Deep learning has become pervasive in everyday life, powering language applications like Apple's Siri, Amazon's Alexa, and Google Translate. The inherent limitation of these deep learning systems, however, is that they often function as a "black box," preventing researchers and users from discerning the roles of different components and what they learn during the training process. In this talk, I will describe my research on interpreting deep learning models for language along three lines. First, I will present a methodological framework for investigating how these models capture various language properties. The experimental evaluation will reveal a learned hierarchy of internal representations in deep models for machine translation and speech recognition. Second, I will demonstrate that despite their success, deep models of language fail to deal even with simple kinds of noise, of the type that humans are naturally robust to. I will then propose simple methods for improving their robustness to noise. Finally, I will turn to an intriguing problem in language understanding, where dataset biases enable trivial solutions to complex language tasks. I will show how to design models that are more robust to such biases, and learn less biased latent representations.

Random curves in space and how they are knotted give an insight into the behavior of "typical" knots and links. They have been studied by biologists and physicists in the context of the structure of random polymers. Several randomized models have been suggested and investigated, and many results have been obtained via computational experiment. The talk will begin with a review of this area.

In work with Hass, Linial, and Nowik, we study knots based on petal projections, potholder diagrams, and more. We have found explicit formulas for the limit distribution of finite type invariants of random knots and links in the Petaluma model. I will discuss these results and sketch proof ideas as time permits.

Tensor categories are abelian k-linear monoidal categories modelled on the representation categories of affine (super)group schemes over k. Deligne gave very succinct intrinsic criteria for a tensor category to be equivalent to such a representation category, over fields k of characteristic zero. These descriptions are known to fail badly in prime characteristics. In this talk, I will present analogues in prime characteristic of these intrinsic criteria. Time permitting, I will comment on the link with a recent conjecture of V. Ostrik which aims to extend Deligne's work in a different direction.

Inspired by empirical observations on honey bee colonies, we analyze the emergence of task differentiation in a model complex system, characterized by an absence of hierarchical control, yet able to exhibit coordinated behavior and collective function. The analysis considers the steady-state response of a mechanical interaction network to exogenous resonant excitation. It demonstrates how an optimal allocation of excitation sensitivities across the network nodes that results in a maximal response may be constructed either using global information about the network topology or, alternatively, through iterated dynamics of an intuitive learning paradigm that only relies on local information within the network. Importantly, the analysis derives explicit conditions on the topology that guarantee optimal selection using only local dynamics, but also describes circumstances when the system may naturally evolve to a condition of collapse, in which all the excitation sensitivities equal zero, at least over intermediate times. The discussion considers the implications to other network designs, including naturally occurring networks, as well as the effects of noise and nonlinearity.

In the classical Node-Disjoint Paths (NDP) problem, we are given an n-vertex graph G, and a collection of pairs of its vertices, called demand pairs. The goal is to route as many of the demand pairs as possible, where to route a pair we need to select a path connecting it, so that all selected paths are disjoint in their vertices.

The best current algorithm for NDP achieves an $O(\sqrt{n})$-approximation, while, until recently, the best negative result was a roughly $\Omega(\sqrt{\log n})$-hardness of approximation. Recently, an improved $2^{\Omega(\sqrt{\log n})}$-hardness of approximation for NDP was shown, even if the underlying graph is a subgraph of a grid graph, and all source vertices lie on the boundary of the grid. Unfortunately, this result does not extend to grid graphs.

The approximability of NDP in grids has remained a tantalizing open question, with the best upper bound of $\tilde{O}(n^{1/4})$, and the best lower bound of APX-hardness. In this talk we come close to resolving this question, by showing an almost polynomial hardness of approximation for NDP in grid graphs.

Our hardness proof performs a reduction from the 3COL(5) problem to NDP, using a new graph partitioning problem as a proxy. Unlike the more standard approach of employing Karp reductions to prove hardness of approximation, our proof is a Cook-type reduction, where, given an input instance of 3COL(5), we produce a large number of instances of NDP, and apply an approximation algorithm for NDP to each of them. The construction of each new instance of NDP crucially depends on the solutions to the previous instances that were found by the approximation algorithm.

Joint work with David H.K. Kim and Rachit Nimavat.

We define a functor from the category of representations of algebraic supergroups with reductive even part to the category of equivariant sheaves and show several applications of this construction to representation theory, in particular projectivity criterion, classification of blocks and computation of dimensions of irreducible representations.

Every $k$ entries in a permutation can have one of $k!$ different relative orders, called patterns. How many times does each pattern occur in a large random permutation of size $n$? The distribution of this $k!$-dimensional vector of pattern densities was studied by Janson, Nakamura, and Zeilberger (2015). Their analysis showed that some component of this vector is asymptotically multinormal of order $1/\sqrt{n}$, while the orthogonal component is smaller. Using representations of the symmetric group, and the theory of U-statistics, we refine the analysis of this distribution. We show that it decomposes into $k$ asymptotically uncorrelated components of different orders in $n$, that correspond to representations of $S_k$. Some combinations of pattern densities that arise in this decomposition have interpretations as practical nonparametric statistical tests.
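The vector of pattern densities for $k=3$ can be computed by brute force (the helper names are illustrative, and the $O(n^k)$ enumeration is only for small examples):

```python
from itertools import combinations, permutations
import random

def pattern_densities(perm, k=3):
    """Empirical density of each relative-order pattern among all k-subsets of positions."""
    counts = {p: 0 for p in permutations(range(k))}
    for idx in combinations(range(len(perm)), k):
        vals = [perm[i] for i in idx]
        pat = tuple(sorted(range(k), key=lambda j: vals[j]))  # relative order as an argsort
        counts[pat] += 1
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}

random.seed(0)
perm = random.sample(range(60), 60)   # a uniform random permutation of size 60
dens = pattern_densities(perm)        # 3! = 6 densities, each concentrating near 1/6
```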

Imbalanced data is problematic for many machine learning applications. Some classifiers, for example, can be strongly biased due to the disparity in class populations. In unsupervised learning, imbalanced sampling of ground truth clusters can lead to distortions of the learned clusters. Removing points (undersampling) is a simple solution, but this strategy leads to information loss and can decrease generalization performance. We instead focus on data generation to artificially balance the data.

In this talk, we present a novel data synthesis method, which we call SUGAR (Synthesis Using Geometrically Aligned Random-walks) [1], for generating data in its original feature space while following its intrinsic geometry. This geometry is inferred by a diffusion kernel [2] that captures a data-driven manifold and reveals underlying structure in the full range of the data space -- including undersampled regions that can be augmented by new synthesized data.

We demonstrate the potential advantages of the approach by improving both classification and clustering performance on numerous datasets. Finally, we show that this approach is especially useful in biology, where rare subpopulations and gene-interaction relationships are affected by biased sampling.

[1] Lindenbaum, Ofir, Jay S. Stanley III, Guy Wolf, and Smita Krishnaswamy. "Geometry-Based Data Generation." *NIPS* (2018).

[2] Coifman, Ronald R., and Stéphane Lafon. "Diffusion maps." *Applied and computational harmonic analysis *(2006).
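A minimal sketch of the diffusion operator of [2] that such methods build on (the SUGAR random-walk synthesis step itself is omitted; the bandwidth `eps` is an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                              # toy point cloud

def diffusion_operator(X, eps=1.0):
    """Row-stochastic diffusion operator built from a Gaussian affinity kernel."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)    # pairwise squared distances
    K = np.exp(-D2 / eps)                                  # Gaussian affinities
    return K / K.sum(axis=1, keepdims=True)                # normalize rows to transition probabilities

P = diffusion_operator(X)
```

Powers of `P` describe random walks on the data; their dominant eigenvectors reveal the manifold structure that the generation step then follows.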

In the last 35 years, geometric flows have proven to be a powerful tool in geometry and topology. The Mean Curvature Flow is, in many ways, the most natural flow for surfaces in Euclidean space.

In this talk, which will assume no prior knowledge, I will illustrate how mean curvature flow could be used to address geometric questions. I will then explain why the formation of singularities of the mean curvature flow poses difficulties for such applications, and how recent new discoveries about the structure of singularities (including a work joint with Kyeongsu Choi and Robert Haslhofer) may help overcome those difficulties.
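For reference, a family of hypersurfaces $x(\cdot, t)$ evolves by mean curvature flow if

$$ \frac{\partial x}{\partial t} = \vec{H}(x) = \Delta_{g(t)} x, $$

where $\vec{H}$ is the mean curvature vector and $\Delta_{g(t)}$ is the Laplace-Beltrami operator of the evolving induced metric. The model singularity is the round sphere: a sphere of initial radius $R_0$ in $\mathbb{R}^{n+1}$ stays round, shrinks with $R(t) = \sqrt{R_0^2 - 2nt}$, and becomes extinct at time $T = R_0^2/(2n)$.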

The problem of computational super-resolution asks to recover fine features of a signal from inaccurate and bandlimited data, using an a-priori model as a regularization. I will describe several situations for which sharp bounds for stable reconstruction are known, depending on signal complexity, noise/uncertainty level, and available data bandwidth. I will also discuss optimal recovery algorithms, and some open questions.

We show that static data structure lower bounds in the group (linear) model imply semi-explicit lower bounds on matrix rigidity. In particular, we prove that an explicit lower bound of t >> log^2(n) on the cell-probe complexity of linear data structures in the group model, even against arbitrarily small linear space (s = (1+eps)n), would already imply a semi-explicit (P^NP) construction of rigid matrices with significantly better parameters than the current state of the art (Alon, Panigrahy, and Yekhanin, 2009). Our result further asserts that polynomial (t > n^eps) data structure lower bounds against near-maximal space would imply super-linear circuit lower bounds for log-depth linear circuits (a four-decade open question). In the succinct space regime (s = n+o(n)), we show that any improvement on current cell-probe lower bounds in the linear model would also imply new rigidity bounds. Our main result relies on a new connection between the "inner" and "outer" dimensions of a matrix (Paturi and Pudlak, 2006), and on a new worst-to-average case reduction for rigidity, which is of independent interest.

Joint work with Zeev Dvir (Princeton) and Alexander Golovnev (Harvard).

Consider an instance of Euclidean k-means or k-medians clustering. We show that the cost of the optimal solution is preserved up to a factor of $(1+\epsilon)$ under a projection onto a random $O(\log(k/\epsilon)/\epsilon^2)$-dimensional subspace with high probability. Further, the cost of every clustering is preserved within $(1+\epsilon)$. Crucially, the dimension does not depend on the total number of points n in the instance. Additionally, our result applies to Euclidean k-clustering with the distances raised to the p-th power for any constant $p$.

For k-means, our result resolves an open problem posed by Cohen, Elder, Musco, Musco, and Persu (STOC 2015); for k-medians, it answers a question raised by Kannan.

Joint work with Konstantin Makarychev and Ilya Razenshteyn.
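The phenomenon is easy to check numerically for a fixed, arbitrary clustering (all constants below are illustrative, and a plain Gaussian matrix stands in for the projection):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k, m = 300, 100, 5, 20
X = rng.normal(size=(n, d))
labels = rng.integers(k, size=n)              # any fixed assignment of points to k clusters

def kmeans_cost(X, labels, k):
    """Sum of squared distances of points to their cluster centroids."""
    return sum(((X[labels == j] - X[labels == j].mean(0)) ** 2).sum()
               for j in range(k))

G = rng.normal(size=(d, m)) / np.sqrt(m)      # random Gaussian projection to m dimensions
ratio = kmeans_cost(X @ G, labels, k) / kmeans_cost(X, labels, k)
```

For this clustering the cost ratio concentrates near 1 even though m is far smaller than d; the theorem makes this uniform over all clusterings with m depending only on k and epsilon.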

The goal of style transfer algorithms is to render the content of one image using the style of another. We propose Style Transfer by Relaxed Optimal Transport and Self-Similarity (STROTSS), a new optimization-based style transfer algorithm. We extend our method to allow user-specified point-to-point or region-to-region control over visual similarity between the style image and the output. Such guidance can be used to either achieve a particular visual effect or correct errors made by unconstrained style transfer. In order to quantitatively compare our method to prior work, we conduct a large-scale user study designed to assess the style-content tradeoff across settings in style transfer algorithms. Our results indicate that for any desired level of content preservation, our method provides higher quality stylization than prior work.

Joint work with Nick Kolkin and Jason Salavon.

Understanding deep learning calls for addressing three fundamental questions: expressiveness, optimization and generalization. Expressiveness refers to the ability of compactly sized deep neural networks to represent functions capable of solving real-world problems. Optimization concerns the effectiveness of simple gradient-based algorithms in solving non-convex neural network training programs. Generalization treats the phenomenon of deep learning models not overfitting despite having many more parameters than examples to learn from. This talk will describe a series of works aimed at unraveling some of the mysteries behind optimization and expressiveness. I will begin by discussing recent analyses of optimization for deep linear neural networks. By studying the trajectories of gradient descent, we will derive the most general guarantee to date for efficient convergence to global minimum of a gradient-based algorithm training a deep network. Moreover, in stark contrast to conventional wisdom, we will see that, sometimes, gradient descent can train a deep linear network faster than a classic linear model. In other words, depth can accelerate optimization, even without any gain in expressiveness, and despite introducing non-convexity to a formerly convex problem. In the second (shorter) part of the talk, I will present an equivalence between convolutional and recurrent networks --- the most successful deep learning architectures to date --- and hierarchical tensor decompositions. The equivalence brings forth answers to various questions concerning expressiveness, resulting in new theoretically-b

Optimization works covered in this talk were in collaboration with Sanjeev Arora, Elad Hazan, Noah Golowich and Wei Hu. Expressiveness works were with Amnon Shashua, Or Sharir, Yoav Levine, Ronen Tamari and David Yakira.

In this talk, I will discuss a question which originates in complex analysis but is really a problem in non-linear elliptic PDE. A finite Blaschke product is a proper holomorphic self-map of the unit disk, just like a polynomial is a proper holomorphic self-map of the complex plane. A celebrated theorem of Heins says that up to post-composition with a Möbius transformation, a finite Blaschke product is uniquely determined by the set of its critical points. Konstantin Dyakonov suggested that it may be interesting to extend this result to infinite degree. However, one must be a little careful, since two distinct infinite Blaschke products may have identical critical sets. I will show that an infinite Blaschke product is uniquely determined by its "critical structure" and describe all possible critical structures which can occur. By Liouville's correspondence, this question is equivalent to studying nearly-maximal solutions of the Gauss curvature equation $\Delta u = e^{2u}$. This problem can then be solved using PDE techniques, using the method of sub- and super-solutions.
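For orientation (up to normalization conventions), a finite Blaschke product of degree $d$ has the form

$$ B(z) = e^{i\theta} \prod_{k=1}^{d} \frac{z - a_k}{1 - \overline{a_k}\, z}, \qquad |a_k| < 1, $$

and under the Liouville correspondence the function $u = \log \frac{2|B'(z)|}{1-|B(z)|^2}$, the logarithmic density of the pullback of the hyperbolic metric, solves $\Delta u = e^{2u}$ away from the critical points of $B$, where $u \to -\infty$.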

The hydrogen atom system is one of the most thoroughly studied examples of a quantum mechanical system. It can be fully solved, and the main reason is its (hidden) symmetry. In this talk I shall explain how the symmetries of the Schrödinger equation for the hydrogen atom, both visible and hidden, give rise to an example in the recently developed theory of algebraic families of Harish-Chandra modules. I will show how the algebraic structure of these symmetries completely determines the spectrum of the Schrödinger operator and sheds new light on the quantum nature of the system. No prior knowledge of quantum mechanics will be assumed.

As software has grown increasingly critical to our society's infrastructure, mechanically-verified software has grown increasingly important, feasible, and prevalent. Proof assistants have seen tremendous growth in recent years because of their success in the mechanical verification of high-value applications in many areas, including cyber security, cyber-physical systems, operating systems, compilers, and microkernels. These proof assistants are built on top of constructive type theory whose computational interpretation is given by the proofs-as-programs paradigm, which establishes a correspondence between formal proofs and computer programs. However, while both proof theory and programming languages have evolved significantly over the past years, the cross-fertilization of the independent new developments in each of these fields has yet to be explored in the context of this paradigm. This naturally gives rise to the following questions: how can modern notions of computation influence and contribute to formal foundations, and how can modern reasoning techniques improve the way we design and reason about programs?

In this talk I first demonstrate how using programming principles that go beyond the standard lambda-calculus, namely state and non-determinism, promotes the specification and verification of modern systems, e.g. distributed systems. I then illustrate the surprising fragility of proof assistants in the presence of such new computational capabilities, and outline my ongoing efforts to develop a more robust foundation. For the converse direction, I show how incorporating modern proof-theoretic techniques offers a more congenial framework for reasoning about hard programming problems and hence facilitates the verification effort.

Representation theory of non-compact real groups, such as SL(2,R), is a fundamental discipline with uses in harmonic analysis, number theory, physics, and more. This theory is analytical in nature, but in the course of the 20th century it was algebraized and geometrized (the key contributions are by Harish-Chandra for the former and by Beilinson-Bernstein for the latter). Roughly and generally speaking, algebraization strips layers from the objects of study until we are left with a bare skeleton, amenable to symbolic manipulation. Geometrization, again very roughly, reveals how algebraic objects have secret lives over spaces, making them more amenable to human intuition. In this talk, I will try to motivate and present one example - the calculation of the Casselman-Jacquet module of a principal series representation (I will explain the terms in the talk).

The Bakry-Emery theorem asserts that uniformly log-concave probability measures satisfy certain functional inequalities, with constants that are better than those associated with the Gaussian measure. In this talk, I will explain how if the constant is almost that of the Gaussian, then the measure almost splits off a Gaussian factor, with explicit quantitative bounds. The proof is based on a combination of Stein's method and simple arguments from calculus of variations. Joint work with Thomas Courtade.

This talk will describe my past and ongoing work on translating images and words between very different datasets without supervision. Humans often do not require supervision to make connections between very different sources of data, but this is still difficult for machines. Recently, great progress was made by using adversarial training, a powerful yet tricky method. Although adversarial methods have had great success, they have critical failings which significantly limit their breadth of applicability and motivate research into alternative non-adversarial methods. In this talk, I will describe novel non-adversarial methods for unsupervised word translation and for translating images between very different datasets (joint work with Lior Wolf). As image generation models are an important component of our method, I will present a non-adversarial image generation approach, which is often better than current adversarial approaches (joint work with Jitendra Malik).

We all know what is an algebraically closed field and that any field can be embedded into an algebraically closed field. But what happens if multiplication is not commutative? In my talk I'll suggest a definition of an algebraically closed skew field, give an example of such a skew field, and show that not every skew field can be embedded into an algebraically closed one.

It is still unknown whether an algebraically closed skew field exists in the finite characteristic case!

Ruelle's operator theorem states that the Ruelle operator $L$, which is a positive operator acting on Holder functions, is conjugate to $P+R$, where $P$ is a one-dimensional projection and the norm of $R$ is smaller than 1. This decomposition, also known as a spectral gap, is of interest as it allows one to characterise the underlying dynamical system through, e.g., central limit theorems or continuous response to perturbations. However, the conjugation depends on the existence of a positive eigenfunction of $L$, which might not exist in more general, fibred situations due to purely functorial reasons. A possibility to circumvent this problem is to consider quotients of operators of the form $f \mapsto \frac{L^m(f L^n (1))}{L^{m+n}(1)}.$ In fact, it is possible to provide reasonable conditions such that their dual operators contract the Wasserstein distance exponentially in $m$. The result gives rise, for example, to a law of the iterated logarithm for continued fractions with sequentially restricted entries, and to a topology on the set of equilibrium states for semigroups of expanding maps. This is joint work with Paulo Varandas and Xuan Zhang.
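A finite-dimensional analogue of such a spectral-gap decomposition can be computed directly: a positive matrix, normalized by its Perron eigenvalue, splits into a rank-one projection plus a remainder of spectral radius strictly below 1, so its powers converge to the projection (toy numbers, not the fibred setting of the talk):

```python
import numpy as np

# Finite-dimensional spectral-gap analogue: L positive, Perron eigenvalue 3,
# Perron eigenvector (1,1)/sqrt(2). Normalizing gives P = Pi + R with Pi a
# rank-one projection and R of spectral radius 1/3, so P^n -> Pi.
L = np.array([[2.0, 1.0], [1.0, 2.0]])    # positive "transfer operator"
lam = 3.0                                  # Perron eigenvalue of L
v = np.array([1.0, 1.0]) / np.sqrt(2.0)    # Perron eigenvector
P = L / lam                                # normalized operator
Pi = np.outer(v, v)                        # rank-one spectral projection
R = P - Pi                                 # remainder, spectral radius 1/3
print(np.linalg.matrix_power(P, 50))       # ~ Pi
```

The dynamical theorem plays the same game in infinite dimensions, where the existence of the Perron eigenfunction is the delicate point.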

We all capture the world around us through digital data such as images, videos and sound. However, in many cases, we are interested in certain properties of the data that are either not available or difficult to perceive directly from the input signal. My goal is to "Re-render Reality", i.e., to develop algorithms that analyze digital signals and then create new versions of them that allow us to see and hear better. In this talk, I'll present a variety of methodologies aimed at enhancing the way we perceive our world through modified, re-rendered output. These works combine ideas from signal processing, optimization, computer graphics, and machine learning, and address a wide range of applications. More specifically, I'll demonstrate how we can automatically reveal subtle geometric imperfections in images, visualize human motion in 3D, and use visual signals to help us separate and mute interfering sounds in a video. Finally, I'll discuss some of my future directions and work in progress.

BIO: Tali is a Senior Research Scientist at Google, Cambridge, developing algorithms at the intersection of computer vision and computer graphics. Before Google, she was a Postdoctoral Associate at the Computer Science and Artificial Intelligence Lab (CSAIL) at MIT, working with Prof. William T. Freeman. Tali completed her Ph.D. studies at the School of Electrical Engineering, Tel-Aviv University, Israel, under the supervision of Prof. Shai Avidan and Prof. Yael Moses. Her research interests include computational photography, image synthesis, geometry and 3D reconstruction.

In this talk we survey some formulations of fairness requirements for decision making under uncertainty. We then discuss results from 3 recent papers:

- Treating individuals fairly is not in conflict with long-term scientific learning goals if the population is sufficiently diverse.
- When there is a pipeline of decisions, end-to-end fairness is impossible to achieve even in a very simple model.
- Exploiting the knowledge acquired by others can unfairly advantage the free rider.

These papers are joint work with a number of co-authors:

Christopher Jung, Neil Lutz, Jamie Morgenstern, Aaron Roth, Bo Waggoner, Steven Wu, and Juba Ziani

Despite their superb empirical performance, deep learning models for natural language processing (NLP) are often considered black boxes, as relatively little is known as to what accounts for their success. This lack of understanding turns model development into a slow and expensive trial-and-error process, which limits many researchers from developing state-of-the-art models.

This is joint work with Hao Peng, Sam Thomson and Noah A. Smith

Zero knowledge protocols are spectacular, allowing one to prove NP statements without revealing anything but their validity. An essential element that enables this wonder is interaction. But how much interaction exactly is needed? This question has long challenged cryptographers and is yet to be settled under standard assumptions. In fact, the question appears to be equally challenging for natural relaxations of the zero knowledge requirement. The difficulty in answering the round complexity question stems from a foundational question in cryptography --- what is the power of non-black-box reductions?

In this talk, I will explain this difficulty and present a new non-black-box technique that resolves, under standard assumptions, the round complexity of weak zero knowledge protocols (Dwork-Naor-Reingold-Stockmeyer '98). Specifically, assuming quasipolynomial hardness of the Learning with Errors problem and fully-homomorphic encryption, we construct a two-message protocol, a setting where (full-fledged) zero knowledge is impossible.

The talk will assume no prior knowledge in cryptography. It is based on joint work with Dakshita Khurana and Omer Paneth (the paper is available online).

A long-standing open problem, known as Hadwiger's covering problem, asks what is the smallest natural number $N(n)$ such that every convex body in $\mathbb{R}^n$ can be covered by a union of the interiors of at most $N(n)$ of its translates. Despite continuous efforts, the best general upper bound known for this number remains as it was more than sixty years ago, of the order of ${2n \choose n} n \ln n$.

In this talk, I will discuss some history of this problem and present a new result in which we improve this bound by a sub-exponential factor. Our approach combines ideas from previous work, with tools from Asymptotic Geometric Analysis. As a key step, we use thin-shell estimates for isotropic log-concave measures to prove a new lower bound for the maximum volume of the intersection of a convex body $K$ with a translate of $-K$. We further show that the same bound holds for the volume of $K\cap(-K)$ if the center of mass of $K$ is at the origin.

If time permits we shall discuss some other methods and results concerning this problem and its relatives.

Joint work with H. Huang, B. Vritsiou, and T. Tkocz

Visual repetitions are abundant in our surrounding physical world: small image patches tend to reoccur within a natural image, and across different rescaled versions thereof. Similarly, semantic repetitions appear naturally inside an object class within image datasets, as a result of different views and scales of the same object. In my thesis I studied deviations from these expected repetitions, and demonstrated how this information can be exploited to tackle both low-level and high-level vision tasks. These include “blind” image reconstruction tasks (e.g. dehazing, deblurring), image classification confidence estimation, and more.

For fixed t>1 and L>3 we establish a sharp asymptotic formula for the log-probability that the number of cycles of length L in the Erdos-Renyi random graph G(N,p) exceeds its expectation by a factor t, assuming only that p >> log N/sqrt(N). We obtain such sharp upper tail bounds also for the Schatten norms of the corresponding adjacency matrices, and in a narrower range of p=p(N), also for general subgraph counts. In this talk, based on a recent joint work with Nick Cook, I will explain our approach and in particular our quantitative refinement of Szemeredi's regularity lemma for sparse random graphs in the large deviations regime.

Deep convolutional network architectures are often assumed to guarantee generalization for small image translations and deformations. In this paper we show that modern CNNs (VGG16, ResNet50, and InceptionResNetV2) can drastically change their output when an image is translated in the image plane by a few pixels, and that this failure of generalization also happens with other realistic small image transformations. Furthermore, we see these failures to generalize more frequently in more modern networks. We show that these failures are related to the fact that the architecture of modern CNNs ignores the classical sampling theorem, so that generalization is not guaranteed. We also show that biases in the statistics of commonly used image datasets make it unlikely that CNNs will learn to be invariant to these transformations. Taken together, our results suggest that the performance of CNNs in object recognition falls far short of the generalization capabilities of humans.

Joint work with Aharon Azulay
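The sampling-theorem point can be seen in a few lines: stride-2 subsampling, the downsampling used throughout modern CNN architectures, is not equivariant to one-pixel shifts. A minimal numpy sketch (illustrative only, not the paper's experiments):

```python
import numpy as np

# Stride-2 subsampling is not shift-equivariant: a one-pixel shift of the
# input can change the subsampled output completely -- the aliasing effect
# behind the failures described above.
x = np.zeros(16)
x[4] = 1.0                       # an impulse at position 4
down = lambda v: v[::2]          # stride-2 "pooling" (plain subsampling)
a = down(x)                      # impulse survives (position 4 is even)
b = down(np.roll(x, 1))          # one-pixel shift: impulse lands on an odd
                                 # index and disappears entirely
print(a.sum(), b.sum())          # -> 1.0 0.0
```

Anti-aliasing (blurring before subsampling) restores approximate equivariance, which is the sampling-theorem remedy the architectures ignore.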

This talk concerns the well-studied model of "cake-cutting" for studying questions regarding notions of fair division of resources. We focus on discrete versions rather than using infinite-precision real values, specifically, focusing on the communication complexity. Using general discrete simulations of classical infinite-precision protocols (Robertson-Webb and moving-knife), we roughly partition the various fair-allocation problems into 3 classes: "easy" (constant number of rounds of logarithmic many bits), "medium" (poly-log total communication), and "hard".

Our main technical result concerns two of the "medium" problems (perfect allocation for 2 players and equitable allocation for any number of players) which we prove are not in the "easy" class. Our main open problem is to separate the "hard" from the "medium" classes.

Joint work with Simina Branzei

For a vector space F^n over a field F, an (η, ß)-dimension expander of degree d is a collection of d linear maps Γ_j : F^n \to F^n such that for every subspace U of F^n of dimension at most ηn, the image of U under all the maps, ∑_{j=1}^d Γ_j(U), has dimension at least ß·dim(U). Over a finite field, a random collection of d=O(1) maps Γ_j offers excellent “lossless” expansion with high probability: ß ≈ d for η ≥ Ω(1/d). When it comes to a family of explicit constructions (for growing n), however, achieving even expansion factor β = 1 + ε with constant degree is a non-trivial goal. We present an explicit construction of dimension expanders over finite fields based on linearized polynomials and subspace designs, drawing inspiration from recent progress on list decoding in the rank-metric. Our approach yields the following:

- Lossless expansion over large fields; more precisely ß ≥ (1–ε)d and η ≥ (1–ε)/d with d=O_ε(1), when |F| ≥ Ω(n).
- Optimal up to constant factors expansion over fields of arbitrarily small polynomial size; more precisely ß ≥ Ω(δd) and η ≥ Ω(1/(δd)) with d = O_δ(1), when |F| ≥ n^δ.

Previously, an approach reducing to monotone expanders (a form of vertex expansion that is highly non-trivial to establish) gave (Ω(1), 1+Ω(1))-dimension expanders of constant degree over all fields. An approach based on “rank condensing via subspace designs” led to dimension expanders with ß ≥ Ω(√d) over large finite fields. Ours is the first construction to achieve lossless dimension expansion, or even expansion proportional to the degree. Based on joint work with Venkatesan Guruswami and Chaoping Xing.
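The definition can be checked empirically over F_2 with toy parameters: pick linear maps, take a low-dimensional subspace U, and measure dim ∑_j Γ_j(U) by Gaussian elimination mod 2. This is only a sanity check of the definition (one map is taken to be the identity so the image surely contains U), not a construction; all parameters are hypothetical:

```python
import numpy as np

def rank_gf2(M):
    """Rank of a 0/1 matrix over GF(2) via Gaussian elimination."""
    M = (M.copy() % 2).astype(np.int64)
    r = 0
    for c in range(M.shape[1]):
        piv = next((i for i in range(r, M.shape[0]) if M[i, c]), None)
        if piv is None:
            continue
        M[[r, piv]] = M[[piv, r]]        # move pivot row into place
        for i in range(M.shape[0]):
            if i != r and M[i, c]:
                M[i] = (M[i] + M[r]) % 2  # eliminate column c elsewhere
        r += 1
    return r

rng = np.random.default_rng(1)
n, d, k = 20, 4, 2
# d linear maps F_2^n -> F_2^n; the identity is included so that the sum of
# images always contains U itself (sanity check only).
maps = [np.eye(n, dtype=np.int64)] + [rng.integers(0, 2, (n, n)) for _ in range(d - 1)]
U = rng.integers(0, 2, (n, k))             # basis of a k-dim subspace (w.h.p.)
images = np.hstack([(G @ U) % 2 for G in maps])
r = rank_gf2(images)                        # dim of sum of images, <= d*k
print(r)
```

For random maps one expects r close to the "lossless" value d·k, matching the random-construction benchmark quoted above.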

As algorithmic prediction systems have become more widespread, so too have concerns that these systems may be discriminatory against groups of people protected by laws and ethics. We present a recent line of work that takes a complexity theoretic perspective towards combating discrimination in prediction systems. We'll focus on fair classification within the versatile framework of Dwork et al. [ITCS'12], which assumes the existence of a metric that measures similarity between pairs of individuals. Unlike earlier work, we do not assume that the entire metric is known to the learning algorithm; instead, the learner can query this metric a bounded number of times. We propose a new notion of fairness called *metric multifairness* and show how to achieve this notion in our setting. Metric multifairness is parameterized by a similarity metric d on pairs of individuals to classify and a rich collection C of (possibly overlapping) "comparison sets" over pairs of individuals. At a high level, metric multifairness guarantees that *similar subpopulations are treated similarly*, as long as these subpopulations are identified within the class C.

In this talk I will present an algorithm for approximating the number of k-cliques in a graph when given query access to the graph. This problem was previously studied for the cases of k=2 (edges) and k=3 (triangles). We give an algorithm that works for any k >= 3, and is actually conceptually simpler than the k=3 algorithm. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries and (3) pair queries. Let n denote the number of vertices in the graph, m the number of edges, and C_k the number of k-cliques. We design an algorithm that outputs a (1+\epsilon)-approximation (with high probability) for C_k, whose expected query complexity and running time are O (\frac{n}{C_k^{1/k}}+\frac{m^{k/2}}{C_k}) poly (\log n, 1/\epsilon,k).

Hence, the complexity of the algorithm is sublinear in the size of the graph for C_k = \omega(m^{k/2-1}). Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on \log n, 1/\epsilon and k).

This is joint work with Talya Eden and C. Seshadhri.
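For context, the quantity C_k being approximated is the exact clique count, which for tiny graphs can be computed by brute force (exponential in general, which is what motivates sublinear-query algorithms like the one above):

```python
from itertools import combinations

def count_k_cliques(adj, k):
    """Exact number of k-cliques; adj maps each vertex to a set of neighbors.
    Brute force over all k-subsets of vertices -- fine for tiny graphs,
    hopeless at scale."""
    verts = sorted(adj)
    return sum(
        all(v in adj[u] for u, v in combinations(S, 2))
        for S in combinations(verts, k)
    )

# K_4: every pair adjacent -> 4 triangles and 1 four-clique.
K4 = {v: {u for u in range(4) if u != v} for v in range(4)}
print(count_k_cliques(K4, 3), count_k_cliques(K4, 4))  # -> 4 1
```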

In typical imaging systems, an image/video is first acquired, then compressed for transmission or storage, and eventually presented to human observers using different and often imperfect display devices. While the resulting quality of the perceived output image may be severely affected by the acquisition and display processes, these degradations are usually ignored in the compression stage, leading to overall sub-optimal system performance. In this work we propose a compression methodology to optimize the system's end-to-end reconstruction error with respect to the compression bit-cost. Using the alternating direction method of multipliers (ADMM) technique, we show that the design of the new globally-optimized compression reduces to a standard compression of a "system adjusted" signal. Essentially, we propose a new practical framework for the information-theoretic problem of remote source coding. The main ideas of our method are further explained using rate-distortion theory for Gaussian signals. We experimentally demonstrate our framework for image and video compression using the state-of-the-art HEVC standard, adjusted to several system layouts including acquisition and rendering models. The experiments established our method as the best approach for optimizing the system performance at high bit-rates from the compression standpoint.

In addition, we relate the proposed approach also to signal restoration using complexity regularization, where the likelihood of candidate solutions is evaluated based on their compression bit-costs.

Using our ADMM-based approach, we present new restoration methods relying on repeated applications of standard compression techniques. Thus, we restore signals by leveraging state-of-the-art models designed for compression. The presented experiments show good results for image deblurring and inpainting using the JPEG2000 and HEVC compression standards.
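The ADMM template behind this splitting can be illustrated on a toy complexity-regularized problem, min ½‖x−b‖² + λ‖z‖₁ subject to x = z, where the z-update plays the role of the "standard compression" black box. Soft thresholding here only stands in for an actual codec, and all parameters are hypothetical:

```python
import numpy as np

# ADMM for min_x 0.5*||x - b||^2 + lam*||x||_1, split as x = z.
# The z-update is the "compression" step (prox of the l1 penalty); in the
# actual work this slot is filled by a standard codec such as HEVC.
b = np.array([3.0, 0.1])
lam, rho = 1.0, 1.0
x = z = u = np.zeros_like(b)
soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
for _ in range(200):
    x = (b + rho * (z - u)) / (1.0 + rho)   # quadratic (fidelity) step
    z = soft(x + u, lam / rho)              # "compression" step
    u = u + x - z                           # dual (scaled multiplier) update
print(z)                                    # -> [2. 0.], the soft-threshold of b
```

The point of the splitting is exactly this modularity: the fidelity step can encode the acquisition/display models while the compression step stays a standard encoder applied to a "system adjusted" signal.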

* Joint work with Prof. Alfred Bruckstein and Prof. Michael Elad.

** More details about the speaker and his research work are available at http://ydar.cswp.cs.technion.ac.il/

We consider the goal of social welfare maximization, where a single item is to be assigned to one of n potential agents with interdependent values.

That is, each agent has her own private signal, and the valuation of each agent is a known function of all n private signals. This captures settings such as valuations for artwork, oil drilling rights, broadcast rights, and many more. In the interdependent value setting, all previous work has assumed a so-called single-crossing condition. Single-crossing means that the impact of agent i’s private signal, s_i, on her own valuation is greater than the impact of s_i on the valuation of any other agent. It is known that without the single-crossing condition an efficient outcome cannot be obtained. We study welfare maximization for interdependent valuations through the lens of approximation.

We show that, in general, without the single-crossing condition, one cannot hope to approximate the optimal social welfare any better than the approximation given by assigning the item to a random bidder. Consequently, we introduce a relaxed version of single-crossing, c-single-crossing, parameterized by c ≥ 1, which means that the impact of s_i on the valuation of agent i is at least 1/c times the impact of s_i on the valuation of any other agent (c = 1 is single-crossing). Using this parameterized notion, we obtain a host of positive results. We also consider interdependent settings when valuations are concave and give improved results.
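The c-single-crossing condition can be checked numerically for explicit valuation functions. The valuations below, v_i(s) = s_i + 0.5·Σ_{j≠i} s_j, are hypothetical and satisfy plain single-crossing (c = 1):

```python
import numpy as np

# Numerical check of c-single-crossing for hypothetical valuations
# v_i(s) = s_i + 0.5 * sum_{j != i} s_j: the impact of s_i on v_i (= 1)
# must be at least 1/c times its impact on any other v_j (= 0.5).
n, eps = 3, 1e-6

def v(i, s):
    return s[i] + 0.5 * (s.sum() - s[i])

def impact(j, i, s):
    """Approximate d v_j / d s_i by a finite difference."""
    sp = s.copy()
    sp[i] += eps
    return (v(j, sp) - v(j, s)) / eps

s = np.array([1.0, 2.0, 3.0])
c = 1.0
ok = all(
    impact(i, i, s) >= impact(j, i, s) / c - 1e-9
    for i in range(n) for j in range(n) if j != i
)
print(ok)  # -> True
```

Scaling the cross terms above 1 would break single-crossing but could still satisfy c-single-crossing for a suitable c > 1, which is the parameterized regime the positive results address.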

Joint work with Alon Eden, Michal Feldman, and Kira Goldner.

Given an underlying finite point set P in the plane, we seek a small set Q that would hit any convex set that contains at least an Epsilon-fraction of P. Such a set Q is called a weak Epsilon-net. The study of Epsilon-nets is central to Computational and Combinatorial Geometry, and it bears important connections to Statistical Learning Theory, Extremal Combinatorics, Discrete Optimization, and other areas.

It is an outstanding open problem to determine tight asymptotic bounds on weak Epsilon-nets with respect to convex sets. For any underlying point set in the plane we describe such a net whose cardinality is roughly proportional to Epsilon^{-3/2}. This is the first improvement of the over-25-year-old bound of Alon, Barany, Furedi, and Kleitman.

The tremendous success of the Machine Learning paradigm heavily relies on the development of powerful optimization methods. The canonical algorithm for training learning models is SGD (Stochastic Gradient Descent), yet this method has its limitations. It is often unable to exploit useful statistical/geometric structure, it might degrade upon encountering prevalent non-convex phenomena, and it is hard to parallelize. In this talk I will discuss an ongoing line of research where we develop alternative methods that resolve some of SGD's limitations. The methods that I describe are as efficient as SGD, and implicitly adapt to the underlying structure of the problem in a data dependent manner.

In the first part of the talk, I will discuss a method that is able to take advantage of hard/easy training samples. In the second part, I will discuss a method that enables an efficient parallelization of SGD. Finally, I will briefly describe a method that implicitly adapts to the smoothness and noise properties of the learning objective.

In the minimum k-edge-connected spanning subgraph (k-ECSS) problem the goal is to find the minimum weight subgraph resistant to up to k-1 edge failures. This is a central problem in network design, and a natural generalization of the minimum spanning tree (MST) problem. In this talk, I will present fast randomized distributed approximation algorithms for k-ECSS in the CONGEST model.

SRB measures are an important object in dynamical systems and mathematical physics. Named after Sinai, Ruelle and Bowen, these measures have the important properties of being equilibrium states which describe chaotic behaviour, yet may also describe the asymptotics of ``observable'' events in the phase space. An open and important question is: in what generality do systems admit SRB measures?

We present the notion of generalised SRB measures (GSRB in short), mention some of their important properties, and present a new condition to characterise their existence on a general setup.

The first part of the talk will describe some of the motivation leading to define and to study SRB measures; and so we will define GSRB measures and compare their properties with the properties sought for SRB measures. We will also describe a case study of examples motivating to study GSRB measures. Our new result is a characterisation of systems admitting GSRB measures.

In the second part of the talk, as much as time permits, we will present some key steps in the construction of GSRB measures.

The "Duty of Care"is a legal obligation requiring one to adhere to a standard of reasonable care while performing acts that could harm others. We propose a formal, mathematical interpretation of the duty of care in the context of "being cautious while driving". The framework enables tackling moral questions in a formal way and is an example of a new research field which we call regulatory science.

Joint work with Shaked Shammah and Amnon Shashua.

Single-particle cryo-electron microscopy (cryo-EM) is an innovative technology for elucidating structures of biological molecules at atomic-scale resolution. In a cryo-EM experiment, tomographic projections of a molecule, taken at unknown viewing directions, are embedded in highly noisy images at unknown locations. The cryo-EM problem is to estimate the 3-D structure of a molecule from these noisy images.

Inspired by cryo-EM, the talk will focus on two estimation problems: *multi-reference alignment* and *blind deconvolution*. These problems abstract away much of the intricacies of cryo-EM, while retaining some of its essential features. In multi-reference alignment, we aim to estimate a signal from its noisy, rotated observations. While the rotations and the signal are unknown, the goal is only to estimate the signal. In the blind deconvolution problem, the goal is to estimate a signal from its convolution with an unknown, sparse signal in the presence of noise. Focusing on the low SNR regime, I will propose the *method of moments* as a computationally efficient estimation framework for both problems and will introduce its properties. In particular, I will show that the method of moments allows estimating the sought signal accurately in any noise level, provided sufficiently many observations are collected, with only one pass over the data. I will then argue that the same principles carry through to cryo-EM, show examples, and draw potential implications.
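The key idea behind the method of moments for multi-reference alignment can be sketched in a few lines: the power spectrum (the second moment in the Fourier domain) is invariant to cyclic shifts, so averaging it over many noisy, randomly shifted copies recovers |FFT(x)|² without estimating any shift, in one pass over the data. Signal, noise level and sample size below are illustrative, not from the talk:

```python
import numpy as np

# Multi-reference alignment via the second moment: the power spectrum is
# shift-invariant, so its average over noisy shifted copies, after removing
# the additive noise bias L*sigma^2 per frequency, estimates |FFT(x)|^2.
rng = np.random.default_rng(0)
L, N, sigma = 32, 5000, 0.5
x = np.sin(2 * np.pi * np.arange(L) / L)           # the unknown signal
obs = np.stack([
    np.roll(x, rng.integers(L)) + sigma * rng.standard_normal(L)
    for _ in range(N)
])                                                  # shifted, noisy copies
ps_est = (np.abs(np.fft.fft(obs, axis=1)) ** 2).mean(axis=0) - L * sigma**2
ps_true = np.abs(np.fft.fft(x)) ** 2
print(np.argsort(ps_est)[-2:], np.argsort(ps_true)[-2:])
```

Recovering the signal itself additionally needs phase information, which is where the third moment (bispectrum) enters; the sketch only shows the shift-invariance that makes moment averaging possible at any noise level.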

The representation theory of GL(n,F), F non-archimedean is a classical subject initiated by Bernstein and Zelevinsky in the 1970s.

I will review some recent results and conjectures which aim to characterize irreducibility of parabolic induction, in terms of geometry. Joint with Alberto Minguez

Ramsey theory assures us that in any graph there is a clique or independent set of a certain size, roughly logarithmic in the graph size. But how difficult is it to find the clique or independent set? If the graph is given explicitly, then it is possible to do so while examining a linear number of edges. If the graph is given by a black box that must be queried to determine whether a given edge exists, then a large number of queries must be issued. But what if one is given a program or circuit for computing the existence of an edge? What if we are assured that the program is small without being given it explicitly?
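The explicit-graph case mentioned above is witnessed by the classical greedy argument: repeatedly keep the larger of a vertex's neighbor/non-neighbor sets, which yields a clique or an independent set of size roughly ½·log₂ n while examining only the edges at each chosen vertex. A minimal sketch:

```python
def find_homogeneous(adj):
    """Greedy Ramsey: returns (clique, independent_set); the larger of the
    two has size roughly log2(n)/2, since the vertex pool at least halves
    at every step. adj maps vertex -> set of neighbors."""
    clique, indep = [], []
    verts = list(adj)
    while verts:
        v = verts.pop(0)
        nbrs = [u for u in verts if u in adj[v]]
        non = [u for u in verts if u not in adj[v]]
        if len(nbrs) >= len(non):
            clique.append(v)
            verts = nbrs      # every later pick is adjacent to v
        else:
            indep.append(v)
            verts = non       # every later pick is non-adjacent to v
    return clique, indep

K5 = {v: {u for u in range(5) if u != v} for v in range(5)}
print(find_homogeneous(K5))   # -> ([0, 1, 2, 3, 4], [])
```

The interesting regimes in the talk are precisely those where no such explicit access is available and the edge relation is hidden behind a small circuit.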

In this talk I will explore recent work on the complexity of search problems with guaranteed solution (the class TFNP) and the tight relationship with cryptographic assumptions and techniques.

Based on joint works with Pavel Hubacek, Ilan Komargodski and Eylon Yogev

In this talk I will survey the ways machine learning research is affecting the field of security, and the ways security research is affecting the field of machine learning. After giving several examples of each type, I will discuss some of the latest developments in adversarial machine learning research, which are used by hackers to defeat security mechanisms based on regular machine learning techniques.

In the vector balancing problem, we are given N vectors v_1,..., v_N in an n-dimensional normed space, and our goal is to assign signs to them, so that the norm of their signed sum is as small as possible. The balancing constant of the vectors is the smallest number beta, such that any subset of the vectors can be balanced so that their signed sum has norm at most beta.

The vector balancing constant generalizes combinatorial discrepancy, and is related to rounding problems in combinatorial optimization, and to the approximate Caratheodory theorem. We study the question of efficiently approximating the vector balancing constant of any set of vectors, with respect to an arbitrary norm. We show that the vector balancing constant can be approximated in polynomial time to within factors logarithmic in the dimension, and is characterized by (an appropriately optimized version of) a known volumetric lower bound. Our techniques draw on results from geometric functional analysis and the theory of Gaussian processes. Our results also imply an improved approximation algorithm for hereditary discrepancy.
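For tiny instances the balancing constant of a fixed set of vectors in a given norm can be computed by brute force over sign patterns, which makes the definition concrete (exponential time, of course; the talk's contribution is a polynomial-time approximation):

```python
import itertools
import numpy as np

def best_signed_norm(vs, norm=np.inf):
    """Min over sign patterns of ||sum_i eps_i v_i|| (brute force, 2^N)."""
    vs = np.asarray(vs, dtype=float)
    return min(
        np.linalg.norm(np.asarray(signs) @ vs, norm)
        for signs in itertools.product([-1, 1], repeat=len(vs))
    )

def balancing_constant(vs, norm=np.inf):
    """beta: the worst nonempty subset still balances to norm <= beta."""
    vs = list(vs)
    return max(
        best_signed_norm([vs[i] for i in S], norm)
        for r in range(1, len(vs) + 1)
        for S in itertools.combinations(range(len(vs)), r)
    )

vecs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(balancing_constant(vecs))   # -> 1.0 (singleton subsets force beta >= 1)
```

Taking the max over all subsets is what makes this the hereditary quantity mentioned above, as opposed to the discrepancy of the full set alone.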

Joint work with Aleksandar Nikolov, Nicole Tomczak-Jaegermann and Kunal Talwar.

In the talk I will tell about a theory of cyclic elements in semisimple Lie algebras. This notion was introduced by Kostant, who associated a cyclic element with the principal nilpotent and proved that it is regular semisimple. In particular, I will tell about the classification of all nilpotents giving rise to semisimple and regular semisimple cyclic elements. The results are from my joint work with V. Kac and E. Vinberg.

There is a movement in design of experiments away from the classic randomization put forward by Fisher, Cochran and others to one based on optimization. In fixed-sample trials comparing two groups, measurements of subjects are known in advance and subjects can be divided optimally into two groups based on a criterion of homogeneity or "imbalance" between the two groups. These designs are far from random. This talk seeks to understand the benefits and the costs over classic randomization in the context of different performance criteria such as Efron's worst-case analysis. In the criterion that we motivate, randomization beats optimization. However, the optimal design is shown to lie between these two extremes. Much-needed further work will provide a procedure for finding these optimal designs in different practical scenarios. Until then, it is best to randomize.
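The two extremes can be made concrete with a toy imbalance criterion, the absolute difference in covariate means between the two arms, comparing a single random split against the best of many candidate splits (a crude stand-in for an optimized design; all numbers are hypothetical):

```python
import numpy as np

# Toy comparison: one random split vs. an "optimized" split chosen to
# minimize covariate imbalance (absolute difference in group means).
# A crude stand-in for optimal design, not the talk's actual criterion.
rng = np.random.default_rng(0)
n = 40                                    # subjects, n/2 per arm
x = rng.standard_normal(n)                # a baseline covariate

def imbalance(assign):
    return abs(x[assign].mean() - x[~assign].mean())

def random_split():
    a = np.zeros(n, dtype=bool)
    a[rng.choice(n, n // 2, replace=False)] = True
    return a

candidates = [random_split() for _ in range(1000)]
single = imbalance(candidates[0])         # classic randomization
best = min(imbalance(a) for a in candidates)  # "optimized" design
print(best <= single)                     # -> True
```

Optimization always reduces imbalance on the observed covariates; the talk's point is that this gain must be weighed against worst-case (Efron-style) performance, where pure randomization has its own protection.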

I will define the Enright functor for contragredient Lie superalgebras and discuss its properties. If time permits, we may discuss a proof of Arakawa's Theorem for osp(1|2l).

This work is a joint project with V. Serganova.

The goal of my talk (based on joint work with Vladimir Retakh) is to introduce noncommutative clusters and their mutations, which can be viewed as generalizations of both classical and quantum cluster structures.

Each noncommutative cluster S is built on a torsion-free group G and a certain collection of its automorphisms. We assign to S a noncommutative algebra A(S) related to the group algebra of G, which is an analogue of the cluster algebra, and establish a noncommutative version of Laurent Phenomenon in some algebras A(S).

"Cluster groups" G for which the Noncommutative Laurent Phenomenon holds include triangular groups of marked surfaces (closely related to the fundamental groups of their ramified double covers), the free group of rank 2, and principal noncommutative tori, which exist for any exchange matrix B.

Various algebraic and topological situations give rise to compatible sequences of representations of different groups, such as the symmetric groups, with stable asymptotic behavior. Representation stability is a recent approach to studying such sequences, which has proved effective for extracting important invariants. Coming from this point of view, I will introduce the associated character theory, which formally explains many of the approach's strengths (in char 0). Central examples are simultaneous characters of all symmetric groups, or of all Gl(n) over some finite field. Their mere existence gives applications to statistics of random matrices over finite fields, and raises many combinatorial questions.

We will show that there exists a metric space that does not admit a coarse embedding into any Alexandrov space of global nonpositive curvature, thus answering a question of Gromov (1993). In contrast, any metric space embeds coarsely into an Alexandrov space of nonnegative curvature. Based on joint works with Andoni and Neiman, and with Eskenazis and Mendel.

Let $X$ be an $n$-dimensional smooth projective variety over a non-Archimedean local field $K$, such as the $p$-adic numbers, and let $\omega$ be a global $n$-form on $X$. The set $X(K)$ of $K$-points on $X$ has the structure of a $K$-analytic manifold, and $\omega$ induces a measure $|\omega|$ on $X(K)$. For any finite extension $K'$ of $K$, there is a natural continuous map from $X(K')$ to the Berkovich analytification $X^{\mathrm{an}}$ of $X$. We study the asymptotics of the images of the measures $|\omega\otimes_KK'|$ on $X^{\mathrm{an}}$ as $K'$ runs through towers of finite extensions of $K$. This is joint work with Johannes Nicaise.

Let F be a fixed infinite, vertex-transitive graph. We say a graph G is `r-locally F' if for every vertex v of G, the ball of radius r and centre v in G is isometric to the ball of radius r in F. For each positive integer n, let G_n be a graph chosen uniformly at random from the set of all unlabelled, n-vertex graphs that are r-locally F. We investigate the properties that the random graph G_n has with high probability --- i.e., how these properties depend on the fixed graph F.

We show that if F is a Cayley graph of a torsion-free group of polynomial growth, then there exists a positive integer r_0 such that for every integer r at least r_0, with high probability the random graph G_n = G_n(F,r) defined above has largest component of size between n^{c_1} and n^{c_2}, where 0 < c_1 < c_2 < 1 are constants depending upon F alone, and moreover that G_n has a rather large automorphism group. This contrasts sharply with the random d-regular graph G_n(d) (which corresponds to the case where F is replaced by the infinite d-regular tree).

Our proofs use a mixture of results and techniques from group theory, geometry and combinatorics.

We obtain somewhat more precise results in the case where F is L^d (the standard Cayley graph of Z^d): for example, we obtain quite precise estimates on the number of n-vertex graphs that are r-locally L^d, for r at least linear in d.

Many intriguing open problems remain: concerning groups with torsion, groups with faster than polynomial growth, and what happens for more general structures than graphs.

This is joint work with Itai Benjamini (WIS).

I will describe a new approach to the study of spherical spin glass models via free energy landscapes, defined by associating to interior points of the sphere the free energy computed only over the spherical band around them.

They are closely related to several fundamental objects from spin glass theory: the TAP free energy, pure states decomposition, overlap distribution, and temperature chaos. I will explain some of those connections.

TBA


In this talk I explore various limitations of sensors and displays, and suggest new ways to overcome them.

These include:

- The limited 2D displays of today's screens - I will show how we can design new displays that enable us to see in 3D *without* any glasses.
- The limited spatial resolution of images - I will discuss the crucial factors for successful Super-Resolution.
- The poor image quality due to motion blur (due to camera motion or scene motion) - I will present a new approach for Blind Non-Uniform Deblurring.

A classical problem in causal inference is that of matching treatment units to control units in an observational dataset. This problem is distinct from simple estimation of treatment effects as it provides additional practical interpretability of the underlying causal mechanisms that is not available without matching. Some of the main challenges in developing matching methods arise from the tension among (i) inclusion of as many covariates as possible in defining the matched groups, (ii) having matched groups with enough treated and control units for a valid estimate of average treatment effect in each group, (iii) computing the matched pairs efficiently for large datasets, and (iv) dealing with complicating factors such as non-independence among units. We propose the Fast Large-scale Almost Matching Exactly (FLAME) framework to tackle these problems for categorical covariates. At its core this framework proposes an optimization objective for match quality that captures covariates that are integral for making causal statements while encouraging as many matches as possible. We demonstrate that this framework is able to construct good matched groups on relevant covariates and further extend the methodology to incorporate continuous and other complex covariates. related papers: https://arxiv.org/abs/1707.06315, https://arxiv.org/abs/1806.06802
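As a toy illustration of the exact-matching core (this is not the FLAME algorithm itself, which additionally learns which covariates can be dropped and optimizes match quality at scale; all names below are illustrative):

```python
from collections import defaultdict

def exact_match_ate(units):
    """Group units by their categorical covariate tuple, then average
    the per-group treated-vs-control outcome differences.  A group is
    valid only if it contains both treated and control units."""
    groups = defaultdict(lambda: {"t": [], "c": []})
    for covs, treated, outcome in units:
        groups[covs]["t" if treated else "c"].append(outcome)
    effects = []
    for g in groups.values():
        if g["t"] and g["c"]:  # need both arms for a valid estimate
            effects.append(sum(g["t"]) / len(g["t"]) - sum(g["c"]) / len(g["c"]))
    return sum(effects) / len(effects)

# Synthetic data: outcome = 2*treatment + 3*covariate, so the true effect is 2
data = [((x,), t, 2 * t + 3 * x) for x in (0, 1) for t in (0, 1) for _ in range(5)]
print(exact_match_ate(data))  # -> 2.0
```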

Moosa and Scanlon defined a general notion of "fields with operators" that generalizes those of difference and differential fields. In the case of "free" operators in characteristic zero, they also analyzed the basic model-theoretic properties of the theory of such fields. In particular, they showed in this case the existence of the model companion, a construction analogous to that of algebraically closed fields for usual fields. In positive characteristic, they provided an example showing that the model companion need not exist.

I will discuss work, joint with Beyarslan, Hoffman and Kowalski, that completes the description of the free case, namely, it provides a full classification of those free operators for which the model companion exists. Though the motivating question is model theoretic, the description and the proof are completely algebraic and geometric. If time permits, I will discuss additional properties, such as quantifier elimination. All notions related to model theory and to fields with operators will be explained (at least heuristically).

SPECIAL NOTE: this will be part of a model theory day. Thus, the talk will be preceded by an introduction to algebraic geometry by the same speaker, 10-10:45 (in Room 1), and followed by a talk by Nick Ramsey, "Classification Theory and the Construction of PAC Fields", 14-16 (in Room 155). See https://mt972.weebly.com/ for more information.

Technological advances have changed every aspect of our lives in recent decades, yet, for the most part, the same systems of democratic decision making have been in place for centuries. I will argue that computer scientists can help rethink the practice of democracy, as well as its potential applications. I will focus on three emerging paradigms that go far beyond your run-of-the-mill election: (i) liquid democracy, an approach that allows voters to transitively delegate their votes; (ii) participatory budgeting, whereby residents collectively decide how to spend their local government's budget; and (iii) virtual democracy, which employs instant elections among machine learning models of real voters to address the grand AI challenge of ethical decision making.

We present a joint audio-visual model for isolating a single speech signal from a mixture of sounds such as other speakers and background noise. Solving this task using only audio as input is extremely challenging and does not provide an association of the separated speech signals with speakers in the video. In this paper, we present a deep network-based model that incorporates both visual and auditory signals to solve this task. The visual features are used to "focus" the audio on desired speakers in a scene and to improve the speech separation quality. To train our joint audio-visual model, we introduce AVSpeech, a new dataset comprised of thousands of hours of video segments from the Web. We demonstrate the applicability of our method to classic speech separation tasks, as well as real-world scenarios involving heated interviews, noisy bars, and screaming children, only requiring the user to specify the face of the person in the video whose speech they want to isolate. Our method shows clear advantage over state-of-the-art audio-only speech separation in cases of mixed speech. In addition, our model, which is speaker-independent (trained once, applicable to any speaker), produces better results than recent audio-visual speech separation methods that are speaker-dependent (require training a separate model for each speaker of interest).

Joint work with Inbar Mosseri, Oran Lang, Tali Dekel, Kevin Wilson, Avinatan Hassidim, Bill Freeman and Miki Rubinstein of Google Research.

Cryo-electron microscopy (cryo-EM) is an imaging technology that is revolutionizing structural biology, enabling reconstruction of molecules at near-atomic resolution.

Cryo-EM produces a large number of noisy two-dimensional tomographic projection images of a molecule, taken at unknown viewing directions.

The extreme levels of noise make classical tasks in statistics and signal processing, such as alignment, detection and clustering, very challenging.

I will start the talk by studying the multi-reference alignment problem, which can be interpreted as a simplified model for cryo-EM. In multi-reference alignment, we aim to estimate multiple signals from circularly-translated, unlabeled, noisy copies.

In high noise regimes, the measurements cannot be aligned or clustered. Nonetheless, accurate and efficient estimation can be achieved via group-invariant representations (invariant polynomials). Furthermore, such estimators achieve the optimal estimation rate.
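A minimal numerical sketch of the invariant-features idea, using the power spectrum (the simplest shift-invariant polynomial; recovering the signal itself would also need phase information, e.g. from the bispectrum, which the estimators in the talk handle):

```python
import numpy as np

rng = np.random.default_rng(0)
L, N, sigma = 32, 20000, 0.5
x = 2.0 * np.cos(2 * np.pi * 3 * np.arange(L) / L)   # ground-truth signal

# Observations: random circular shift plus Gaussian noise; the shifts
# are never estimated or used.
obs = np.stack([np.roll(x, rng.integers(L)) + sigma * rng.standard_normal(L)
                for _ in range(N)])

# |FFT|^2 is invariant to circular shifts, so it can be averaged across
# unaligned copies; subtracting the noise bias L*sigma^2 (for numpy's
# unnormalized FFT of i.i.d. noise) debiases the estimate.
ps_est = (np.abs(np.fft.fft(obs, axis=1)) ** 2).mean(axis=0) - L * sigma**2
ps_true = np.abs(np.fft.fft(x)) ** 2

rel_err = np.linalg.norm(ps_est - ps_true) / np.linalg.norm(ps_true)
print(rel_err)  # small: the invariant concentrates as N grows
```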

Then, I will show how this framework can be applied to the problem of 2-D classification in cryo-EM. In the last part of the talk, I will introduce the analog invariants of the cryo-EM problem and discuss how they can be used for ab initio modeling.

The mechanisms underlying the generation of neuronal network oscillations are not well understood. We view this process as the individual neurons' oscillations being communicated among the nodes in the network, mediated by the impedance profiles of the isolated (uncoupled) individual neurons. In order to test this idea, we developed a mathematical tool that we refer to as the Frequency Response Alternating Map (FRAM). The FRAM describes how the impedances of the individual oscillators interact to generate network responses to oscillatory inputs. It consists of decoupling the non-autonomous part of the coupling term and substituting the reciprocal coupling by a sequence of alternating one-directional forcing effects (cell 1 forces cell 2, which in turn forces cell 1, and so on). The end result is an expression of the network impedance for each node in the network as a power series, each term involving a product of the impedances of the autonomous parts of the individual oscillators. For linear systems we provide analytical expressions for the FRAM and characterize its convergence properties and limitations. We illustrate numerically that this convergence is relatively fast. We apply the FRAM to the phenomenon of network resonance in the simplest type of oscillatory network: two non-oscillatory nodes receiving oscillatory inputs in one or both nodes. We discuss extensions of the FRAM to nonlinear systems and other types of network architectures.

The use of latent variables in probabilistic modeling is a standard approach in numerous data analysis applications. In recent years, there has been a surge of interest in spectral methods for latent variable models, where inference is done by analyzing the lower order moments of the observed data. In contrast to iterative approaches such as the EM algorithm, under appropriate conditions spectral methods are guaranteed to converge to the true model parameters given enough data samples.

The focus of the seminar is the development of novel spectral based methods for two problems in statistical machine learning. In the first part, we address unsupervised ensemble learning, where one obtains predictions from different sources or classifiers, yet without knowing the reliability and expertise of each source, and with no labeled data to directly assess it. We develop algorithms to estimate the reliability of the classifiers based on a common assumption that different classifiers make statistically independent errors. In addition, we show how one can detect subsets of classifiers that strongly violate the model of independent errors, in a fully unsupervised manner.
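A minimal sketch of the rank-one idea behind such reliability estimation, in the balanced binary case (the setup and names below are illustrative; the actual algorithms handle class imbalance and the covariance diagonal more carefully):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 10, 50000
p = rng.uniform(0.1, 0.4, n)          # unknown error rates of n classifiers
y = rng.choice([-1, 1], m)            # hidden balanced labels

# Each classifier flips the true label independently with probability p_i
flips = rng.random((n, m)) < p[:, None]
f = np.where(flips, -y, y).astype(float)

# Under independent errors and balanced labels, the off-diagonal
# covariance is rank one: Cov(f_i, f_j) = v_i v_j with v_i = 1 - 2 p_i.
# So the leading eigenvector of the hollowed covariance matrix
# estimates the classifier reliabilities, with no labeled data.
C = np.cov(f)
np.fill_diagonal(C, 0.0)
w, V = np.linalg.eigh(C)
v_est = V[:, -1] * np.sign(V[:, -1].sum())   # fix the global sign

v_true = 1 - 2 * p
corr = np.corrcoef(v_est, v_true)[0, 1]
print(corr)   # close to 1
```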

In the second part of the seminar we show how one can use spectral methods to learn the parameters of binary latent variable models. This model has many applications such as overlapping clustering and Gaussian-Bernoulli restricted Boltzmann machines. Our methods are based on computing the eigenvectors of both the second and third moments of the observed variables.

For both problems, we show that spectral based methods can be applied effectively, achieving results that are state of the art in various problems in computational biology and population genetics.

Information transfer between triangle meshes is of great importance in computer graphics and geometry processing. To facilitate this process, a smooth and accurate map is typically required between the two meshes. While such maps can sometimes be computed between nearly-isometric meshes, the more general case of meshes with diverse geometries remains challenging. This talk describes a novel approach for direct map computation between triangle meshes without mapping to an intermediate domain, which optimizes for the harmonicity and reversibility of the forward and backward maps. Our method is general both in the information it can receive as input, e.g. point landmarks, a dense map or a functional map, and in the diversity of the geometries to which it can be applied. We demonstrate that our maps exhibit lower conformal distortion than the state-of-the-art, while succeeding in correctly mapping key features of the input shapes.

(joint with Noam Aigerman, Raz Sluzky and Yaron Lipman)

Computing homeomorphisms between surfaces is an important task in shape analysis fields such as computer graphics, medical imaging and morphology. A fundamental tool for these tasks is solving Dirichlet's problem on an arbitrary Jordan domain with disc topology, where the boundary of the domain is mapped homeomorphically to the boundary of a specific target domain: a convex polygon. By the Rado-Kneser-Choquet Theorem such harmonic mappings are homeomorphisms onto the convex polygon. Standard finite element approximations of harmonic mappings lead to discrete harmonic mappings, which have been proven to be homeomorphisms as well. Computing the discrete harmonic mappings is very efficient and reliable as the mappings are obtained as the solution of a sparse linear system of equations.

In this talk we show that the methodology above can be used to compute *conformal* homeomorphisms, for domains with either disc or sphere topology:

By solving Dirichlet's problem with the correct boundary conditions, we can compute conformal homeomorphisms from arbitrary Jordan domains to a specific canonical domain: a triangle. The discrete conformal mappings we compute are homeomorphisms, and approximate the conformal homeomorphism uniformly and in H^1. Similar methodology can also be used to conformally map a sphere-type surface to a planar Jordan domain, whose edges are identified so that the planar domain has the topology of a sphere.
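A minimal numerical sketch of the discrete Dirichlet approach, on a toy 5x5 grid "mesh" with uniform weights standing in for the cotangent weights an actual conformal computation would use:

```python
import numpy as np

# Pin the boundary of a 5x5 grid to the unit square and require every
# interior vertex to be the (uniform-weight) average of its 4 neighbors.
# With convex boundary positions, Rado-Kneser-Choquet / Tutte-type
# results guarantee the resulting discrete harmonic map is injective.
k = 5
idx = lambda i, j: i * k + j
A = np.eye(k * k)
b = np.zeros((k * k, 2))
for i in range(k):
    for j in range(k):
        if i in (0, k - 1) or j in (0, k - 1):
            b[idx(i, j)] = (j / (k - 1), i / (k - 1))  # boundary condition
        else:
            A[idx(i, j), idx(i, j)] = 4.0              # 4*u_v - sum(nbrs) = 0
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                A[idx(i, j), idx(i + di, j + dj)] = -1.0

uv = np.linalg.solve(A, b)      # sparse solvers would be used in practice
print(uv[idx(2, 2)])            # center vertex maps to approx (0.5, 0.5)
```

By the discrete maximum principle, every interior vertex lands inside the convex hull of the boundary positions.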

In knot theory and more generally the theory of 3-manifolds various "quantum invariants" like the Witten-Reshetikhin-Turaev or the Kashaev invariant have been much studied in recent years, in particular because of the famous "volume conjecture" related to the asymptotic growth of the Kashaev invariant. Rather surprisingly, it transpired a few years ago that these invariants also have very non-trivial number-theoretical properties, including a kind of weak invariance under the modular group SL(2,Z) ("quantum modular forms") and the experimental discovery of the appearance of certain units in cyclotomic extensions as factors in the asymptotic expansions. The talk will report on this and specifically on recent joint work with Frank Calegari and Stavros Garoufalidis that constructs such units in a purely algebraic way starting from elements in algebraic K-theory or in the more elementary "Bloch group". As an unexpected application, this result led to the proof of a well-known conjecture of Nahm on the modularity of certain q-hypergeometric series.

The classical Gaussian isoperimetric inequality, established in the 70's independently by Sudakov-Tsirelson and Borell, states that the optimal way to decompose $\mathbb{R}^n$ into two sets of prescribed Gaussian measure, so that the (Gaussian) area of their interface is minimal, is by using two complementing half-planes. This is the Gaussian analogue of the classical Euclidean isoperimetric inequality, and is therefore referred to as the "single-bubble" case.
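In formulas (with $\gamma_n$ the standard Gaussian measure, $\gamma_n^+$ the Gaussian boundary measure, and $\varphi, \Phi$ the one-dimensional Gaussian density and CDF), the single-bubble statement reads:

```latex
\gamma_n^+(\partial A) \;\ge\; \varphi\!\left(\Phi^{-1}(\gamma_n(A))\right),
\qquad
\gamma_n(A) = \int_A \frac{e^{-|x|^2/2}}{(2\pi)^{n/2}}\,dx,
```

with equality exactly when $A$ is a half-space.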

A natural generalization is to decompose $\mathbb{R}^n$ into $q \geq 3$ sets of prescribed Gaussian measure. It is conjectured that when $q \leq n+1$, the configuration whose interface has minimal (Gaussian) area is given by the Voronoi cells of $q$ equidistant points. For example, for $q=3$ (the "double-bubble" conjecture) in the plane ($n=2$), the interface is conjectured to be a "tripod" or "Y" - three rays meeting at a single point in 120 degree angles. For $q=4$ (the "triple-bubble" conjecture) in $\mathbb{R}^3$, the interface is conjectured to be a tetrahedral cone.

We confirm the Gaussian double-bubble and, more generally, multi-bubble conjectures for all $3 \leq q \leq n+1$. The double-bubble case $q=3$ is simpler, and we will explain why.

None of the numerous methods discovered over the years for establishing the classical $q=2$ case seem amenable to the $q \geq 3$ cases, and our method consists of establishing a Partial Differential Inequality satisfied by the isoperimetric profile. To treat $q > 3$, we first prove that locally minimal configurations must have flat interfaces, and thus convex polyhedral cells. Uniqueness of minimizers up to null-sets is also established.

This is joint work with Joe Neeman (UT Austin).

Typically, when semi-discrete approximations to time-dependent partial differential equations (PDEs) or schemes for ordinary differential equations (ODEs) are constructed, they are derived such that they are stable and have a specified truncation error $\tau$. Under these conditions, the Lax-Richtmyer equivalence theorem assures that the scheme converges and that the error is, at most, of the order of $|| \tau ||$. In most cases, the error is indeed of the order of $|| \tau ||$. We demonstrate that schemes can be constructed whose truncation errors are $\tau$, yet whose actual errors are much smaller. This error reduction is made by constructing the schemes such that they inhibit the accumulation of the local errors; therefore they are called Error Inhibiting Schemes (EIS).
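The standard accumulation argument behind this can be sketched for a linear one-step scheme $u_{n+1} = Q u_n$ (notation introduced here for illustration): the global error $e_n$ satisfies

```latex
e_{n+1} = Q\, e_n + \Delta t\, \tau_n
\quad\Longrightarrow\quad
e_n = \Delta t \sum_{j=0}^{n-1} Q^{\,n-1-j}\, \tau_j .
```

Stability bounds this sum by $C\,T\,\|\tau\|$; an error-inhibiting scheme is designed so that powers of $Q$ nearly annihilate the modes carrying the local errors $\tau_j$, making the accumulated sum, and hence the global error, smaller than $\|\tau\|$ itself.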

In the Minimum Hypergraph Bisection problem, the vertex set of a hypergraph has to be partitioned into two parts of equal size so that the number of hyperedges intersecting both parts is minimized.

This problem is a natural generalization of the well-studied Minimum Bisection problem in graphs.
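For concreteness, the objective can be stated as a brute force over balanced partitions (feasible only for tiny instances; the talk's algorithms are of course far more sophisticated):

```python
from itertools import combinations

def min_hypergraph_bisection(n, hyperedges):
    """Exact Minimum Hypergraph Bisection by enumeration: split the
    vertex set [0, n) into two halves of size n/2, minimizing the
    number of hyperedges that intersect both sides."""
    best = len(hyperedges)
    for half in combinations(range(n), n // 2):
        side = set(half)
        cut = sum(1 for e in hyperedges if e & side and e - side)
        best = min(best, cut)
    return best

edges = [{0, 1}, {2, 3}, {0, 1, 2}]
print(min_hypergraph_bisection(4, edges))  # -> 1 (split {0,1} | {2,3})
```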

We present a sharp distinction between Minimum Bisection in hypergraphs and graphs.

Whereas it is well-known that all bi-criteria approximation algorithms for Minimum Bisection in graphs can be extended to hypergraphs with the exact same guarantees, we prove that this is not the case when considering true (i.e., non bi-criteria) approximation algorithms.

Specifically, we show that Minimum Hypergraph Bisection admits an $\tilde{\mathcal{O}}(\sqrt{n})$ approximation algorithm.

However, we also show that any $\alpha$-approximation algorithm for Minimum Hypergraph Bisection implies an approximation of $\Omega(\alpha^{-2})$ for Densest $k$-Subgraph.

Thus, assuming the exponential time hypothesis, there is no efficient approximation algorithm for Minimum Hypergraph Bisection with an approximation ratio $n^{poly(\log{\log{n}})}$.

In particular, Minimum Hypergraph Bisection is much harder to approximate than Minimum Bisection in graphs, for which a logarithmic approximation algorithm exists.

If time permits, the problem of constructing trees that are cut sparsifiers for hypergraph and vertex cuts will also be discussed.

While similar trees lie at the heart of powerful algorithms for Minimum Bisection in graphs, we prove that this is not the case for hypergraphs.

Joint work with Harald Räcke and Richard Stotz.

**Ohad Feldheim: Convergence of a quantile admission processes**

**Abstract**: Consider the following stochastic model for a growing set. At time 0 the model consists of the singleton S = {-infty}. At every subsequent time, two i.i.d. samples, distributed according to some distribution D on R, are suggested as candidates for S. If the smaller among the two is closer to at least a fraction r of the current elements of S (in comparison with the larger one), then it is admitted into S.

How will the distribution of the members of S evolve over time as a function of r and D?
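A simulation sketch of the process (the abstract leaves implicit what happens when the smaller candidate fails the test; here we assume the larger candidate is admitted instead, and use a large negative sentinel for the initial -infty element):

```python
import random

def quantile_admission(r, sample, steps, seed=0):
    """Simulate the quantile admission process: at each step draw two
    i.i.d. candidates, and admit the smaller one iff it is closer than
    the larger one to at least a fraction r of the current elements."""
    random.seed(seed)
    S = [-1e18]                       # sentinel standing in for -infty
    for _ in range(steps):
        a, b = sorted((sample(), sample()))
        closer = sum(1 for s in S if abs(a - s) <= abs(b - s))
        S.append(a if closer >= r * len(S) else b)
    return S

# Example: r = 1/2 and D = Uniform(0,1); exactly one candidate is
# admitted per step, so the set grows linearly.
S = quantile_admission(r=0.5, sample=random.random, steps=2000)
print(len(S), sum(S[1:]) / (len(S) - 1))
```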

This model was suggested by Alon, Feldman, Mansour, Oren and Tennenholtz as a model for the evolution of an exclusive social group. We'll show that the empirical distribution of the elements of S converges to a (not-necessarily deterministic) limit distribution for any r and D.

This we do by relating the process to a random walk in changing environment. The analysis of this random walk involves various classical exponential concentration inequalities as well as a new general inequality concerning mean and minimum of independent random variables.

Joint work with Naomi Feldheim

**Eviatar Procaccia:** **Stabilization of Diffusion Limited Aggregation in a Wedge**

**Abstract**: We prove a discrete Beurling estimate for the harmonic measure in a wedge in $\mathbf{Z}^2$, and use it to show that Diffusion Limited Aggregation (DLA) in a wedge of angle smaller than $\pi/4$ stabilizes. This allows one to consider the infinite DLA and questions about the number of arms, growth and dimension. I will present some conjectures and open problems.

Artificial illumination plays a vital role in human civilization. In computer vision, artificial light is extensively used to recover 3D shape, reflectance, and further scene information. However, in most computer vision applications using artificial light, some additional structure is added to the illumination to facilitate the task at hand. In this work, we show that the ubiquitous alternating current (AC) lights already have a valuable inherent structure that stems from bulb flicker. By passively sensing scene flicker, we reveal new scene information which includes: the type of bulbs in the scene, the phases of the electric grid up to city scale, and the light transport matrix. This information yields unmixing of reflections and semi-reflections, nocturnal high dynamic range, and scene rendering with bulbs not observed during acquisition. Moreover, we provide methods that enable capturing scene flicker using almost any off-the-shelf camera, including smartphones.

In underwater imaging, similar structures are added to illumination sources to overcome the limited imaging range. In this setting, we show that by optimizing camera and light positions while taking the effect of light propagation in scattering media into account we achieve superior imaging of underwater scenes while using simple, unstructured illumination.

The fisherman caught a quantum fish. "Fisherman, please let me go", begged the fish, "and I will grant you three wishes". The fisherman agreed. The fish gave the fisherman a quantum computer, three quantum signing tokens and his classical public key. The fish explained: "to sign your three wishes, use the tokenized signature scheme on this quantum computer, then show your valid signature to the king, who owes me a favor".

The fisherman used one of the signing tokens to sign the document "give me a castle!" and rushed to the palace. The king executed the classical verification algorithm using the fish's public key, and since it was valid, the king complied.

The fisherman's wife wanted to sign ten wishes using their two remaining signing tokens. The fisherman did not want to cheat, and secretly sailed to meet the fish. "Fish, my wife wants to sign ten more wishes". But the fish was not worried: "I have learned quantum cryptography following the previous story (The Fisherman and His Wife by the brothers Grimm). The quantum tokens are consumed during the signing. Your polynomial wife cannot even sign four wishes using the three signing tokens I gave you".

"How does it work?" wondered the fisherman. "Have you heard of quantum money? These are quantum states which can be easily verified but are hard to copy. This tokenized quantum signature scheme extends Aaronson and Christiano's quantum money scheme, and a variant by Zhandry, which is why the signing tokens cannot be copied".

"Does your scheme have additional fancy properties?" the fisherman asked. "Yes, the scheme has other security guarantees: revocability, testability and everlasting security. Furthermore, if you're at sea and your quantum phone has only classical reception, you can use this scheme to transfer the value of the quantum money to shore", said the fish, and swam away.

Joint work with Shalev Ben-David. https://arxiv.org/abs/1609.09047

**Bhaswar Bhattacharya**: Large Deviation Variational Problems in Random Combinatorial Structures

The upper tail problem in the Erdos-Renyi random graph $G\sim\mathcal{G}_{n,p}$, where every edge is included independently with probability $p$, is to estimate the probability that the number of copies of a graph $H$ in $G$ exceeds its expectation by a factor of $1+\delta$. The arithmetic analog of this problem counts the number of $k$-term arithmetic progressions in a random subset of $\{1, 2, \ldots, N\}$, where every element is included independently with probability $p$. The recently developed framework of non-linear large deviations (Chatterjee and Dembo (2016) and Eldan (2017)) shows that the logarithm of these tail probabilities can be reduced to a natural variational problem on the space of weighted graphs/functions. In this talk we will discuss methods for solving these variational problems in the sparse regime ($p \rightarrow 0$), and show how the solutions are often related to extremal problems in combinatorics. (This is based on joint work with Shirshendu Ganguly, Eyal Lubetzky, Xuancheng Shao, and Yufei Zhao.)

**Elliot Paquette**: Random matrix point processes via stochastic processes

In 2007, Virag and Valko introduced the Brownian carousel, a dynamical system that describes the eigenvalues of a canonical class of random matrices. This dynamical system can be reduced to a diffusion, the stochastic sine equation, a beautiful probabilistic object requiring no random matrix theory to understand. Many features of the limiting eigenvalue point process, the Sine--beta process, can then be studied via this stochastic process. We will sketch how this stochastic process is connected to the eigenvalues of a random matrix, and sketch an approach to two questions about the stochastic sine equation: deviations of the Sine--beta counting function, and a functional central limit theorem.

The celebrated Gan-Gross-Prasad conjectures aim to describe the branching behavior of representations of classical groups, i.e., the decomposition of irreducible representations when restricted to a lower rank subgroup.

These conjectures, whose global/automorphic version bear significance in number theory, have thus far been formulated and resolved for the generic case.

In this talk, I will present a newly formulated rule in the p-adic setting (again conjectured by G-G-P) for restriction of representations in non-generic Arthur packets of GL_n.

Progress towards the proof of the new rule takes the problem into the rapidly developing subject of quantum affine algebras. These techniques use a version of the Schur-Weyl duality for affine Hecke algebras, combined with new combinatorial information on parabolic induction extracted by Lapid-Minguez.

All known algorithms for solving NP-complete problems require exponential time in the worst case; however, these algorithms nevertheless solve many problems of practical importance astoundingly quickly, and are hence relied upon in a broad range of applications. This talk is built around the observation that "Empirical Hardness Models" - statistical models that predict algorithm runtime on novel instances from a given distribution - work surprisingly well. These models can serve as powerful tools for algorithm design, specifically by facilitating automated methods for algorithm design and for constructing algorithm portfolios. They also offer a statistical alternative to beyond-worst-case analysis and a starting point for theoretical investigations.

bio at http://www.cs.ubc.ca/~kevinlb/bio.html

In analysis, a convolution of two functions usually results in a smoother, better behaved function. Given two morphisms f,g from algebraic varieties X,Y to an algebraic group G, one can define a notion of convolution of these morphisms. Analogously to the analytic situation, this operation yields a morphism (from X x Y to G) with improved smoothness properties.

In this talk, I will define a convolution operation and discuss some of its properties. I will then present a recent result: if G is an algebraic group, X is smooth and absolutely irreducible, and f:X-->G is a dominant map, then after finitely many self convolutions of f, we obtain a morphism with the property of being flat with fibers of rational singularities (a property which we call (FRS)).
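Concretely, in the notation of the abstract, the convolution of two morphisms into an algebraic group is the morphism obtained by multiplying their values in $G$:

```latex
f * g \colon X \times Y \longrightarrow G,
\qquad
(f * g)(x, y) := f(x) \cdot g(y).
```

The result above then says that, for $X$ smooth and absolutely irreducible and $f$ dominant, the $n$-fold self-convolution $f * \cdots * f \colon X^n \to G$ satisfies the (FRS) property for all sufficiently large $n$.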

Moreover, Aizenbud and Avni showed that the (FRS) property has an equivalent analytic characterization, which leads to various applications such as counting points of schemes over finite rings, representation growth of certain compact p-adic groups and arithmetic groups of higher rank, and random walks on (algebraic families of) finite groups. We will discuss some of these applications, and maybe some of the main ideas of the proof of the above result.

Joint with Yotam Hendel.

Random surfaces in statistical physics are commonly modeled by a real-valued function phi on a lattice, whose probability density penalizes nearest-neighbor fluctuations. Precisely, given an even function V, termed the potential, the energy H(phi) is computed as the sum of V over the nearest-neighbor gradients of phi, and the probability density of phi is set proportional to exp(-H(phi)). The most-studied case is when V is quadratic, resulting in the so-called Gaussian free field. Brascamp, Lieb and Lebowitz initiated in 1975 the study of the global fluctuations of random surfaces for other potential functions and noted that understanding is lacking even in the case of the quartic potential, V(x)=x^4. We will review the state of the art for this problem and present recent work with Alexander Magazinov which finally settles the question of obtaining upper bounds for the fluctuations for the quartic and many other potential functions.

Fast and robust three-dimensional reconstruction of facial geometric structure from a single image is a challenging task with numerous applications in computer vision and graphics. We propose to leverage the power of convolutional neural networks (CNNs) to produce highly detailed face reconstruction directly from a single image. For this purpose, we introduce an end-to-end CNN framework which constructs the shape in a coarse-to-fine fashion. The proposed architecture is composed of two main blocks, a network which recovers the coarse facial geometry (CoarseNet), followed by a CNN which refines the facial features of that geometry (FineNet). To alleviate the lack of training data for face reconstruction, we train our model using synthetic data as well as unlabeled facial images collected from the internet. The proposed model successfully recovers facial geometries from real images, even for faces with extreme expressions and under varying lighting conditions. In this talk, I will summarize three papers that were published at 3DV 2016, CVPR 2017 (as an oral presentation), and ICCV 2017.

Bio: Matan Sela holds a Ph.D. in Computer Science from the Technion - Israel Institute of Technology. He received his B.Sc. and M.Sc. degrees (both with honors) in electrical engineering from the Technion in 2012 and 2015, respectively. During summer 2017, he was a research intern at Google, Mountain View, California. His interests are Machine Learning, Computer Vision, Computer Graphics, Geometry Processing, and any combination thereof.

The Baum-Bott indexes are important local invariants for singular holomorphic foliations by curves with isolated singularities. On a compact complex manifold, we consider foliations with fixed cotangent bundle. Then, the Baum-Bott map associates to a foliation its Baum-Bott indexes on its singularities. We focus on foliations on the projective space and we are interested in the generic rank of that map. The generic rank for foliations on the projective plane is known. For higher-dimensional projective spaces, we give an upper bound and in some cases we determine the generic rank.

This paper makes progress on the problem of explicitly constructing a binary tree code with constant distance and constant alphabet size.

For every constant delta < 1 we give an explicit binary tree code with distance delta and alphabet size poly(log n), where n is the depth of the tree. This is the first improvement over a two-decade-old construction that has an exponentially larger alphabet of size poly(n).

As part of the analysis, we prove a bound on the number of positive integer roots a real polynomial can have in terms of its sparsity with respect to the Newton basis---a result of independent interest.

Joint work with G. Cohen and B. Haeupler

**David Jerison:** Localization of eigenfunctions via an effective potential

We discuss joint work with Douglas Arnold, Guy David, Marcel Filoche and Svitlana Mayboroda. Consider the Neumann boundary value problem for the operator $L u = - \mbox{div} (A \nabla u) + Vu$ on a Lipschitz domain $\Omega$ and, more generally, on a manifold with or without boundary. The eigenfunctions of $L$ are often localized, as a result of disorder of the potential $V$, the matrix of coefficients $A$, irregularities of the boundary, or all of the above. In earlier work, Filoche and Mayboroda introduced the function $u$ solving $Lu = 1$, and showed numerically that it strongly reflects this localization. Here, we deepen the connection between the eigenfunctions and this {\em landscape} function $u$ by proving that its reciprocal $1/u$ acts as an {\em effective potential}. The effective potential governs the exponential decay of the eigenfunctions of the system and delivers information on the distribution of eigenvalues near the bottom of the spectrum.
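As a hedged illustration (a one-dimensional finite-difference toy version of mine, far simpler than the setting of the paper), one can compute the landscape function u solving Lu = 1 and read off the effective potential 1/u:

```python
import numpy as np

# 1D toy version: solve L u = -u'' + V u = 1 on (0, 1) with zero
# boundary conditions, for a disordered potential V.
n = 200
h = 1.0 / (n + 1)
rng = np.random.default_rng(1)
V = rng.uniform(0.0, 50.0, size=n)       # disordered potential (illustrative)

# standard three-point finite-difference discretization of -d^2/dx^2 + V
L = (np.diag(2.0 / h**2 + V)
     - np.diag(np.ones(n - 1) / h**2, k=1)
     - np.diag(np.ones(n - 1) / h**2, k=-1))

u = np.linalg.solve(L, np.ones(n))       # landscape function: L u = 1
W = 1.0 / u                              # effective potential 1/u
```

Since L is an M-matrix and the right-hand side is positive, u is strictly positive, so the effective potential W = 1/u is well defined; its wells indicate where low eigenfunctions localize.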

**Ron Rosenthal:** Eigenvector correlation in the complex Ginibre ensemble

The complex Ginibre ensemble is a non-Hermitian random matrix on C^N with i.i.d. complex Gaussian entries normalized to have mean zero and variance 1/N. Unlike the Gaussian unitary ensemble, for which the eigenvectors are orthogonal, the geometry of the eigenbases of the Ginibre ensemble is not particularly well understood. We will discuss some results regarding the analytic and algebraic structure of eigenvector correlations in this matrix ensemble. In particular, we uncover an extended algebraic structure which describes the asymptotic behavior (as N goes to infinity) of these correlations. Our work extends previous results of Chalker and Mehlig [CM98], in which the correlation for pairs of eigenvectors was computed. Based on a joint work with Nick Crawford.
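For a quick numerical look (an illustration of mine, not from the talk; the overlap matrix O_{nm} = (l_n . conj(l_m))(conj(r_m) . r_n) is the standard Chalker-Mehlig object), one can sample a Ginibre matrix and form the eigenvector overlaps:

```python
import numpy as np

N = 50
rng = np.random.default_rng(0)
# i.i.d. complex Gaussian entries with mean zero and variance 1/N
G = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)

vals, R = np.linalg.eig(G)      # columns of R are right eigenvectors
L = np.linalg.inv(R)            # rows of L are left eigenvectors (L @ R = I)

# Chalker-Mehlig overlaps O_{nm} = (l_n . conj(l_m)) * (conj(r_m) . r_n)
O = (L @ L.conj().T) * (R.conj().T @ R).T
diag_overlaps = O.diagonal().real   # O_{nn} = |l_n|^2 |r_n|^2 >= 1
```

The normalization l_n . r_n = 1 forces each diagonal overlap O_{nn} to be at least 1 (Cauchy-Schwarz), and the biorthogonality relation makes each row of O sum to exactly 1.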

Recent work has shown impressive success in automatically creating new images with desired properties such as transferring painterly style, modifying facial expressions or manipulating the center of attention of the image. In this talk I will discuss two of the standing challenges in image synthesis and how we tackle them:

- I will describe our efforts in making the synthesized images more photo-realistic.

- I will further show how we can broaden the scope of data that can be used for training synthesis networks, and with that provide a solution to new applications.

This talk will include an introduction to the topic of V(D)J rearrangements of particular subsets of T cells and B cells of the adaptive human immune system, in particular of IgG heavy chains. There are many statistical problems that arise in trying to understand these cells. They involve estimating aspects of functionals of discrete probabilities on (random) finite sets. Topics include but are not limited to exchangeability, estimating non-centrality parameters, and estimating covariance matrices from what are called "replicates" that have been amplified by the PCR process and (partially) sequenced.

I have received considerable assistance from Lu Tian and Yi Liu; I have also been helped considerably by Andrew Fire, Scott Boyd, and Jorg Goronzy.

Single cells and cells in a tissue respond to stimuli by deforming, changing their shape, and/or moving. Some of these responses can be understood from the underlying biochemical signaling, a topic that has been of interest to both biologists and modelers. In our recent work, my group has studied the link between mechanical tension on cells and their internal chemical signaling. (Our primary focus has been, and remains, that of Rho proteins.) Here I will describe a simple "toy model" that captures key aspects of what is known biologically. The model is simple enough to understand mathematically, and yet capable of displaying several regimes of behavior consistent with experimental observations. I describe how we investigated the model in a single cell, and how this was then used to capture multiple cells that interact with one another mechanically. We find waves of expansion and contraction that sweep through the model "tissue" in certain parameter regimes. This work is joint with Cole Zmurchok and Dhanajay Bhaskar.

We study the problem of identifying correlations in multivariate data under information constraints: either on the amount of memory that can be used by the algorithm, or on the amount of communication when the data is distributed across several machines. We prove a tight trade-off between the memory/communication complexity and the sample complexity, implying (for example) that to detect pairwise correlations with optimal sample complexity, the number of required memory/communication bits is at least quadratic in the dimension. Our results substantially improve those of Shamir (2014), which studied a similar question in a much more restricted setting. To the best of our knowledge, these are the first provable sample/memory/communication trade-offs for a practical estimation problem, using standard distributions, and in the natural regime where the memory/communication budget is larger than the size of a single data point. To derive our theorems, we prove a new information-theoretic result, which may be relevant for studying other information-constrained learning problems.

Joint work with Ohad Shamir

Non-signaling strategies are collections of distributions with certain non-local correlations that have been studied recently in the context of delegation of computation.

In this talk I will discuss a recent result showing that the Hadamard-based PCP of [ALMSS'92] is sound against non-signaling strategies. As part of the proof, we study the classical problem of linearity testing [BLR'93] in the setting of non-signaling strategies, and prove that any non-signaling strategy that passes the linearity test with high probability must be close to a quasi-distribution over linear functions.
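In the classical (single-function, non-adversarial) setting, the BLR linearity test is simple to state and simulate; the following sketch (illustrative code of my own) checks f(x) XOR f(y) = f(x XOR y) on random pairs:

```python
import numpy as np

def blr_pass_rate(f, n, trials, rng):
    """Fraction of random (x, y) pairs for which f passes the BLR
    linearity check f(x) XOR f(y) == f(x XOR y)."""
    passes = 0
    for _ in range(trials):
        x = rng.integers(0, 2, n)
        y = rng.integers(0, 2, n)
        if (f(x) ^ f(y)) == f(x ^ y):
            passes += 1
    return passes / trials

rng = np.random.default_rng(0)
n = 10
a = rng.integers(0, 2, n)

linear = lambda x: int(a @ x) % 2           # a genuine GF(2)-linear function
nonlinear = lambda x: int(x.sum()) % 3 % 2  # clearly not linear over GF(2)

rate_linear = blr_pass_rate(linear, n, 500, rng)
rate_nonlinear = blr_pass_rate(nonlinear, n, 500, rng)
```

A truly linear function passes every check, while a function far from linear fails a constant fraction of them; the talk's contribution is an analogue of this soundness guarantee for non-signaling strategies.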

Joint work with Alessandro Chiesa and Peter Manohar (UC Berkeley).

Consider a sequence of growing graphs converging locally weakly to an infinite (possibly random) tree. As there are uncountably many Ising and Potts Gibbs measures on the limiting tree in the low-temperature regime, it is not a priori clear whether the local weak limit of such measures exists; and if so, identifying the limit remains a challenge. In this talk, I will describe these limits. The talk is based on joint works with Amir Dembo and Allan Sly.

Recently, there has been increasing interest in leveraging the competence of neural networks to analyze data. In particular, new clustering methods that employ deep embeddings have been presented.

In this talk, we depart from centroid-based models and suggest a new framework, called Clustering-driven deep embedding with PAirwise Constraints (CPAC), for non-parametric clustering using a neural network. We present a clustering-driven embedding based on a Siamese network that encourages pairs of data points to output similar representations in the latent space. Our pair-based model allows augmenting the information with labeled pairs to constitute a semi-supervised framework. Our approach is based on analyzing the losses associated with each pair to refine the set of constraints.

We show that clustering performance increases when using this scheme, even with a limited amount of user queries.

We present state-of-the-art results on different types of datasets and compare our performance to parametric and non-parametric techniques.

Our first theorem is a hierarchy theorem for the query complexity of testing graph properties with one-sided error; more precisely, we show that for every sufficiently fast-growing function f from (0,1) to the natural numbers, there is a graph property whose one-sided-error query complexity is precisely f(\Theta(\epsilon)). No result of this type was previously known for any f which is super-polynomial. Goldreich [ECCC 2005] asked to exhibit a graph property whose query complexity is exponential in 1/\epsilon. Our hierarchy theorem partially resolves this problem by exhibiting a property whose one-sided-error query complexity is exponential in 1/\epsilon. We also use our hierarchy theorem in order to resolve a problem raised by Alon and Shapira [STOC 2005] regarding testing relaxed versions of bipartiteness.

Our second theorem states that for any function f there is a graph property whose one-sided-error query complexity is at least f(\epsilon) while its two-sided-error query complexity is only polynomial in 1/\epsilon. This is the first indication of the surprising power that two-sided-error testing algorithms have over one-sided-error ones, even when restricted to properties that are testable with one-sided error. Again, no result of this type was previously known for any f that is super-polynomial.

The above theorems are derived from a graph theoretic result which we think is of independent interest, and might have further applications. Alon and Shikhelman [JCTB 2016] introduced the following generalized Turan problem: for fixed graphs H and T, and an integer n, what is the maximum number of copies of T, denoted by ex(n,T,H), that can appear in an n-vertex H-free graph? This problem received a lot of attention recently, with an emphasis on T = C_3, H=C_{2m+1}. Our third theorem gives tight bounds for ex(n,C_k,C_m) for all the remaining values of k and m.

Joint work with Asaf Shapira.

Non-Euclidean, or incompatible elasticity is an elastic theory for bodies that do not have a reference (stress-free) configuration. It applies to many systems, in which the elastic body undergoes inhomogeneous growth (e.g. plants, self-assembled molecules). Mathematically, it is a question of finding the "most isometric" immersion of a Riemannian manifold (M,g) into Euclidean space of the same dimension, by minimizing an appropriate energy functional.
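A typical energy functional in this setting (a schematic form; precise constants and scalings vary between models) penalizes the deviation of the differential from being an isometry:

```latex
E(f) \;=\; \int_M \operatorname{dist}^2\!\left(df,\; \mathrm{SO}(g, \mathfrak{e})\right) \mathrm{dvol}_g ,
```

where f maps (M, g) into Euclidean space and SO(g, e) denotes the orientation-preserving linear isometries between the corresponding tangent spaces; E(f) vanishes exactly on isometric immersions, so a positive infimum quantifies the incompatibility.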

Much of the research in non-Euclidean elasticity is concerned with elastic bodies that have one or more slender dimensions (such as leaves), and finding appropriate dimensionally-reduced models for them.

In this talk I will give an introduction to non-Euclidean elasticity, and then focus on thin bodies and present some recent results on the relations between their elastic behavior and their curvature.

Based on a joint work with Asaf Shachar.

We prove a dichotomy theorem for two-party protocols, and show that for every poly-time two-party protocol with single-bit output, at least one of the following holds:

- The protocol can be used to construct a key-agreement protocol.
- For every constant ρ > 0, the parties' output is ρ-uncorrelated: let (X, Y, T) denote the parties' outputs and the protocol's transcript, respectively. A protocol is ρ-uncorrelated if there exists an efficient "decorrelator" algorithm Decor that, given a random transcript T, produces two numbers P_A and P_B such that no efficient algorithm can distinguish (U_{P_A}, U_{P_B}, T) from (X, Y, T) with distinguishing advantage larger than ρ, where U_p denotes a biased coin with bias p.

Namely, if the protocol cannot be used to construct key-agreement, then its output distribution (X; Y; T) is trivial: it can be simulated non-interactively by the parties given public randomness (used to sample T). (The precise statement also has qualifiers of the form: "on infinitely many choices of the security parameter").

We use the above characterization to prove that an (α = 24ε^{2})-correct ε-differentially private symmetric protocol for computing XOR implies the existence of a key-agreement protocol. The above dependency between α and ε is tight, since a Θ(ε^{2})-correct ε-differentially private protocol for computing XOR is known to exist unconditionally. It also improves, in the (ε, α)-dependency aspect, upon Goyal et al. [ICALP '16], who showed that, for some constant c > 0, a c-correct ε-differentially private protocol for computing XOR implies oblivious transfer. Our result extends to a weaker notion of differential privacy in which the privacy is only required to hold against an external observer. Interestingly, the reductions used for proving the above results are non-black-box.

Joint work with Eran Omri, Kobbi Nissim, Ronen Shaltiel, and Jad Silbak.

Talk 1: David Ellis (Queen Mary U)

**Title: The edge-isoperimetric problem for antipodally symmetric subsets of the discrete cube.**

Abstract: A major open problem in geometry is to solve the isoperimetric problem for n-dimensional real projective space, i.e. to determine, for each real number V, the minimum possible size of the boundary of a (well-behaved) set of volume V, in n-dimensional real projective space. We study a discrete analogue of this question: namely, among all antipodally symmetric subsets of {0,1}^n of fixed size, which sets have minimal edge-boundary? We obtain a complete answer to the latter question. This is joint work with Imre Leader (Cambridge).

Talk 2: Benjamin Fehrman (Max Planck Institute)

**Title: Well-posedness of stochastic porous media equations with nonlinear, conservative noise.**

Abstract: In this talk, which is based on joint work with Benjamin Gess, I will describe a pathwise well-posedness theory for stochastic porous media equations driven by nonlinear, conservative noise. Such equations arise in the theory of mean field games, as an approximation to the Dean-Kawasaki equation in fluctuating hydrodynamics, to describe the fluctuating hydrodynamics of a zero range process, and as a model for the evolution of a thin film in the regime of negligible surface tension. Our methods are loosely based on the theory of stochastic viscosity solutions, where the noise is removed by considering a class of test functions transported along underlying stochastic characteristics. We apply these ideas after passing to the equation's kinetic formulation, for which the noise enters linearly and can be inverted using the theory of rough paths.

The harnessing of modern computational abilities for many-body wave-function representations is naturally placed as a prominent avenue in contemporary condensed matter physics. Specifically, highly expressive computational schemes that are able to efficiently represent the entanglement properties which characterize many-particle quantum systems are of interest. In the seemingly unrelated field of machine learning, deep network architectures have exhibited an unprecedented ability to tractably encompass the convoluted dependencies which characterize hard learning tasks such as image classification or speech recognition. However, theory is still lagging behind these rapid empirical advancements, and key questions regarding deep learning architecture design have no adequate answers. In the presented work, we establish a Tensor Network (TN) based common language between the two disciplines, which allows us to offer bidirectional contributions. By showing that many-body wave-functions are structurally equivalent to mappings of convolutional and recurrent arithmetic circuits, we construct their TN descriptions in the form of Tree and Matrix Product State TNs, and bring forth quantum entanglement measures as natural quantifiers of dependencies modeled by such networks. Accordingly, we propose a novel entanglement based deep learning design scheme that sheds light on the success of popular architectural choices made by deep learning practitioners, and suggests new practical prescriptions. Specifically, our analysis provides prescriptions regarding connectivity (pooling geometry) and parameter allocation (layer widths) in deep convolutional networks, and allows us to establish a first of its kind theoretical assertion for the exponential enhancement in long term memory brought forth by depth in recurrent networks. 
In the other direction, we identify that an inherent re-use of information in state-of-the-art deep learning architectures is a key trait that distinguishes them from TN based representations. Therefore, we suggest a new TN manifestation of information re-use, which enables TN constructs of powerful architectures such as deep recurrent networks and overlapping convolutional networks. This allows us to theoretically demonstrate that the entanglement scaling supported by state-of-the-art deep learning architectures can surpass that of commonly used expressive TNs in one dimension, and can support volume law entanglement scaling in two dimensions with an amount of parameters that is a square root of that required by Restricted Boltzmann Machines. We thus provide theoretical motivation to shift trending neural-network based wave-function representations closer to state-of-the-art deep learning architectures.

Interacting systems are prevalent in nature, from dynamical systems in physics to complex societal dynamics. In this talk I will introduce our neural relational inference model: an unsupervised model that learns to infer interactions while simultaneously learning the dynamics purely from observational data. Our model takes the form of a variational auto-encoder, in which the latent code represents the underlying interaction graph and the reconstruction is based on graph neural networks.

I will describe work with C. E. Wayne in which we study how dissipation leads to a very slow drift in secular parameters for the non-linear Schroedinger equation. The nice thing about this model is that one can describe a relatively complex phenomenon in almost explicit terms as a perturbation from an integrable Hamiltonian system.

The advances in quantum computing, where Google, IBM, Microsoft, and Intel are competing on the (exponentially growing) number of qubits in their (in some cases already commercial) quantum computers, require a reexamination of Internet security and the public-key infrastructure. The talk will describe the concept of overlay security, together with blockchain-based directories for establishing symmetric keys. When combined with nested Lamport signatures and Merkle trees for digital signatures, the result is a complete, easily implementable architecture with information-theoretically secure communication and hash-based signatures.

**Mark Rudelson (UMich)**

**Title**: Invertibility of the adjacency matrices of random graphs.

**Abstract**: Consider an adjacency matrix of a bipartite, directed, or undirected Erdos-Renyi random graph. If the average degree of a vertex is large enough, then such a matrix is invertible with high probability. As the average degree decreases, the probability of the matrix being singular increases, and for a sufficiently small average degree, it becomes singular with probability close to 1. We will discuss when this transition occurs, and what the main reason for the singularity of the adjacency matrix is.

This is a joint work with Anirban Basak.
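As a small illustrative experiment (mine, not from the talk), one can watch this transition empirically by estimating the singularity probability at two different average degrees:

```python
import numpy as np

def singular_fraction(n, avg_deg, trials, rng):
    """Fraction of undirected Erdos-Renyi G(n, p) adjacency matrices
    that are singular, with edge probability p = avg_deg / n."""
    p = avg_deg / n
    singular = 0
    for _ in range(trials):
        upper = (rng.random((n, n)) < p).astype(float)
        A = np.triu(upper, 1)
        A = A + A.T                          # symmetric adjacency matrix
        if np.linalg.matrix_rank(A) < n:     # singular <=> rank deficient
            singular += 1
    return singular / trials

rng = np.random.default_rng(0)
sparse_rate = singular_fraction(40, 1.0, 40, rng)    # small average degree
dense_rate = singular_fraction(40, 20.0, 40, rng)    # large average degree
```

At average degree 1 the graph typically has isolated vertices (zero rows), which already force singularity; at large average degree the matrix is invertible in most samples.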

**Yinon Spinka (TAU)**

**Title**: Finitary codings of Markov random fields

**Abstract**: Let X be a stationary Z^d-process. We say that X can be coded by an i.i.d. process if there is a (deterministic and translation-invariant) way to construct a realization of X from i.i.d. variables associated to the sites of Z^d. That is, if there is an i.i.d. process Y and a measurable map F from the underlying space of Y to that of X, which commutes with translations of Z^d and satisfies that F(Y)=X in distribution. Such a coding is called finitary if, in order to determine the value of X at a given site, one only needs to look at a finite (but random) region of Y.

It is known that a phase transition (existence of multiple Gibbs states) is an obstruction for the existence of such a finitary coding. On the other hand, we show that when X is a Markov random field satisfying certain spatial mixing conditions, then X can be coded by an i.i.d. process in a finitary manner. Moreover, the coding radius has exponential tails, so that typically the value of X at a given site is determined by a small region of Y.

We give applications to models such as the Potts model, proper colorings and the hard-core model.

Image restoration algorithms are typically evaluated by some distortion measure (e.g. PSNR, SSIM, IFC, VIF) or by human opinion scores that quantify perceived perceptual quality. In this work, we prove mathematically that distortion and perceptual quality are at odds with each other. Specifically, we study the optimal probability for correctly discriminating the outputs of an image restoration algorithm from real images. We show that as the mean distortion decreases, this probability must increase (indicating worse perceptual quality). As opposed to the common belief, this result holds true for any distortion measure, and is not only a problem of the PSNR or SSIM criteria. However, as we show experimentally, for some measures it is less severe (e.g. distance between VGG features). We also show that generative-adversarial-nets (GANs) provide a principled way to approach the perception-distortion bound. This constitutes theoretical support to their observed success in low-level vision tasks. Based on our analysis, we propose a new methodology for evaluating image restoration methods, and use it to perform an extensive comparison between recent super-resolution algorithms. Our study reveals which methods are currently closest to the theoretical perception-distortion bound.

* Joint work with Yochai Blau.

We present novel oblivious routing algorithms for the splittable and the unsplittable multicommodity flow settings. Our algorithms for both models improve upon the state-of-the-art, in terms of running time and performance with respect to graphs that exhibit good expansion guarantees. As an intermediate step towards the unsplittable setting, we present a novel generalization of Valiant's classical load balancing scheme for packet-switched networks to arbitrary graphs, which is of independent interest. Our approach relies on diffusing traffic throughout the network and then regathering it to its destination, via iterative applications of the random walk operator. Consequently, the performance guarantees of our algorithms are derived from the convergence of the random walk operator to the stationary distribution and are expressed in terms of the spectral gap of the graph (which dominates the mixing time).
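The diffusion-and-regather idea rests on the convergence of the random-walk operator to the stationary distribution; here is a minimal sketch (a toy graph of my choosing, not the paper's construction) of that convergence:

```python
import numpy as np

# Toy undirected graph on 4 nodes (adjacency matrix is illustrative).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
deg = A.sum(axis=1)

W = A / deg[:, None]                    # random-walk operator (row-stochastic)
W = 0.5 * (np.eye(len(A)) + W)          # lazy version sidesteps periodicity

load = np.array([1.0, 0.0, 0.0, 0.0])   # all traffic injected at node 0
for _ in range(200):                     # diffuse via iterated applications
    load = load @ W

stationary = deg / deg.sum()            # pi(v) proportional to deg(v)
```

The larger the spectral gap of the graph, the fewer iterations are needed before the traffic profile is indistinguishable from the stationary distribution, which is exactly the quantity the paper's performance guarantees are expressed in.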

Genome-wide forces of mutation and selection create local and global sequence patterns that carry information on functional role of genomic DNA. I will describe algorithmic approaches to genome analysis that decode sequence patterns and identify features of structural gene organization and gene expression (e.g. leaderless transcription). The algorithms make unsupervised inference of the structure of underlying graphical model and its parameters.

The subsequently developed software tools were, among other applications, used at NCBI (Bethesda, MD) to annotate and re-annotate 2,500+ fungal genomes and 130,000+ prokaryotic genomes. Yet another algorithm was employed by DOE JGI (Walnut Creek, CA) to annotate the largest collection of metagenomes.

Speaker's short bio

Mark Borodovsky, PhD, is a Regents' Professor at the joint Georgia Tech and Emory University Department of Biomedical Engineering and the School of Computational Science and Engineering. Since arriving in Atlanta from Moscow in 1990, Dr. Borodovsky has also held joint faculty appointments at the School of Biology and the School of Mathematics. Dr. Borodovsky is a founder of the Georgia Tech Bioinformatics graduate program.

Suppose a language L can be decided by a bounded-error randomized algorithm that runs in space S and time n * poly(S). We give a randomized algorithm for L that still runs in space O(S) and time n * poly(S) that uses only O(S) random bits; our algorithm has a low failure probability on all but a negligible fraction of inputs of each length. An immediate corollary is a deterministic algorithm for L that runs in space O(S) and succeeds on all but a negligible fraction of inputs of each length. We also discuss additional complexity-theoretic applications of our technique.

Points of infinite multiplicity are particular points which the Brownian motion visits infinitely often. Following a work of Bass, Burdzy and Khoshnevisan, we construct and study a measure carried by these points. Joint work with Yueyun Hu and Zhan Shi.

Scattering effects in images, including those related to haze, fog, and appearance of clouds, are fundamentally dictated by microphysical characteristics of the scatterers. We define and derive recovery of these characteristics, in a three-dimensional heterogeneous medium. Recovery is based on a novel tomography approach. Multiview (multi-angular) and multi-spectral data are linked to the underlying microphysics using 3D radiative transfer, accounting for multiple-scattering. Despite the nonlinearity of the tomography model, inversion is enabled using a few approximations that we describe. As a case study, we focus on passive remote sensing of the atmosphere, where scatterer retrieval can benefit modeling and forecasting of weather, climate, and pollution.

We present a new way to derive the replica symmetric solution for the free energy in mean-field spin glasses. Only the Sherrington-Kirkpatrick case has been worked out in detail, but the method also works in other cases, for instance for the perceptron (work in progress), and probably also for the Hopfield net. The method is closely related to the TAP equations (for Thouless-Anderson-Palmer). It does not presently give any new results, but it gives a new viewpoint, and it looks quite promising. As the TAP equations are widely discussed in the physics literature, also at low temperature, it is hoped that the method could be extended to this case, too. But this is open, and probably very difficult.

We will introduce the (new) notion of approximability in triangulated categories and show its power.

The brief summary is that the derived category of quasicoherent sheaves on a separated, quasicompact scheme is an approximable triangulated category.

As relatively easy corollaries one can: (1) prove an old conjecture of Bondal and Van den Bergh about strong generation in D^{perf}(X), (2) generalize an old theorem of Rouquier about strong generation in D^b_{coh}(X) (Rouquier proved the result only in equal characteristic; we can extend it to mixed characteristic), and (3) generalize a representability theorem of Bondal and Van den Bergh, from proper schemes of finite type over fields to proper schemes of finite type over any noetherian ring.

After stating these results and explaining what they mean, we will (time permitting) also mention structural theorems. It turns out that approximable triangulated categories have a fair bit of intrinsic, internal structure that comes for free.

Recent success stories of using machine learning for diagnosing skin cancer, diabetic retinopathy, pneumonia, and breast cancer may give the impression that artificial intelligence (AI) is on the cusp of radically changing all aspects of health care. However, many of the most important problems, such as predicting disease progression, personalizing treatment to the individual, drug discovery, and finding optimal treatment policies, all require a fundamentally different way of thinking. Specifically, these problems require a focus on *causality* rather than simply prediction. Motivated by these challenges, my lab has been developing several new approaches for causal inference from observational data. In this talk, I describe our recent work on the deep Markov model (Krishnan, Shalit, Sontag AAAI '17) and TARNet (Shalit, Johansson, Sontag, ICML '17).


The generalized Jacobian Jac_m(C') of a smooth hyperelliptic curve C' associated with a module m is an algebraic group that can be described by using line bundles of the curve C', or by using a symmetric product of the curve C' equipped with a law of composition. This second definition of the Jacobian Jac_m(C') is directly related to the fibres of a Mumford system. To be precise, it is a subset of the compactified Jac_m(C') which is related to the fibres. This presentation will help us demystify the relationship between these two mathematical objects.

We show that any area-preserving C^{r}-diffeomorphism of a two-dimensional surface displaying an elliptic fixed point can be C^{r}-perturbed to one exhibiting a chaotic island whose metric entropy is positive, for every 1 ≤ r ≤ ∞. This proves a conjecture of Herman stating that the identity map of the disk can be C^{∞}-perturbed to a conservative diffeomorphism with positive metric entropy. It also implies that the Chirikov standard map, for large and small parameter values, can be C^{∞}-approximated by conservative diffeomorphisms displaying positive metric entropy (a weak version of Sinai's positive metric entropy conjecture). Finally, this sheds light on a question of Herman on the density of C^{r}-conservative diffeomorphisms displaying positive metric entropy: we show the existence of a dense set formed by conservative diffeomorphisms which either are weakly stable (so, conjecturally, uniformly hyperbolic) or display a chaotic island of positive metric entropy.

This is a joint work with Pierre Berger.

Many wireless sensing tasks, e.g. radar detection for driver safety, involve estimating the "channel", i.e. the relationship between the transmitted and received signals.

In this talk I will describe the standard mathematical model for the radar channel. In the case where the channel is sparse, I will present a channel estimation algorithm that is sub-linear in both sampling and arithmetic complexity (and convince you of the need for such an algorithm).

The main ingredients in the algorithm will be the use of an intrinsic algebraic structure known as the Heisenberg group, and recent developments in the theory of the sparse Fast Fourier Transform (sFFT, due to Indyk et al.).
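As a toy illustration of the channel model only (not the talk's sub-linear algorithm), the sketch below, with invented parameters, builds a sparse channel out of time-frequency shifts (the operators generating the finite Heisenberg group) and recovers its support by brute-force matched filtering; the point of the sFFT-based approach is precisely to avoid this exhaustive search:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 127  # signal length

def tf_shift(s, tau, omega):
    """Apply a time shift tau and a frequency (Doppler) shift omega to s.
    These two kinds of shifts generate the finite Heisenberg group."""
    t = np.arange(len(s))
    return np.roll(s, tau) * np.exp(2j * np.pi * omega * t / len(s))

# A sparse channel: a few (delay, Doppler, attenuation) triples.
channel = [(5, 17, 0.9), (40, 3, 0.5)]

s = np.exp(2j * np.pi * rng.random(N))  # random-phase transmit signal
r = sum(a * tf_shift(s, tau, om) for tau, om, a in channel)

# Brute-force matched filter: correlate r against every time-frequency
# shift of s; peaks sit at the true (delay, Doppler) pairs.  As written
# this costs O(N^3) operations -- sparsity is what permits sub-linear time.
M = np.abs(np.array([[np.vdot(tf_shift(s, tau, om), r) / N
                      for om in range(N)] for tau in range(N)]))
tau_hat, om_hat = np.unravel_index(np.argmax(M), M.shape)
```

For a random-phase signal, correlations between distinct shifts are O(1/sqrt(N)), so the two channel taps stand out clearly above the background.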

The talk will assume minimal background knowledge.

Understanding the ways in which the Brain gives rise to the Mind (memory, behavior, cognition, intelligence, language) is the most challenging problem facing science today. As the answer seems likely to be at least partly computational, computer scientists should work on this problem --- except there is no obvious place to start. I shall recount recent work (with W. Maass and S. Vempala) on a model for the formation and association of memories in humans, and reflect on why it may be a clue about language.

Variational problems in geometry, fabrication, learning, and related applications lead to structured numerical optimization problems for which generic algorithms exhibit inefficient or unstable performance. Rather than viewing the particularities of these problems as barriers to scalability, in this talk we embrace them as added structure that can be leveraged to design large-scale and efficient techniques specialized to applications with geometrically structured variables. We explore this theme through the lens of case studies in surface parameterization, optimal transport, and multi-objective design.

A long line of research studies the space complexity of estimating a norm l(x) in the data-stream model, i.e., when x is the frequency vector of an input stream consisting of insertions and deletions of items of n types. I will focus on norms l (in R^n) that are *symmetric*, meaning that l is invariant under sign-flips and coordinate-permutations, and show that the streaming space complexity is essentially determined by the measure-concentration characteristics of l. These characteristics are known to govern other phenomena in high-dimensional spaces, such as the critical dimension in Dvoretzky's Theorem.

The family of symmetric norms contains several well-studied norms, such as all l_p norms, and indeed we provide a new explanation for the disparity in space complexity between p<=2 and p>2. We also obtain bounds for other norms that are useful in applications.
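To see concretely what symmetry means here, the following quick numerical check (illustrative only; the norms and names are chosen for this sketch) verifies that sign-flips and coordinate permutations leave such norms unchanged, since a symmetric norm sees only the multiset of absolute values:

```python
import numpy as np

rng = np.random.default_rng(1)

def top_k_norm(v, k=3):
    """A symmetric norm that is not an l_p norm: sum of the k largest |v_i|."""
    return np.sort(np.abs(v))[-k:].sum()

x = rng.standard_normal(10)
# Sign-flip each coordinate, then permute the coordinates.
y = (rng.choice([-1.0, 1.0], size=10) * x)[rng.permutation(10)]

# Every symmetric norm agrees on x and y.
for norm in (lambda v: np.linalg.norm(v, 1),
             lambda v: np.linalg.norm(v, 2),
             lambda v: np.abs(v).max(),
             top_k_norm):
    assert np.isclose(norm(x), norm(y))
```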

Joint work with Jaroslaw Blasiok, Vladimir Braverman, Stephen R. Chestnut, and Lin F. Yang.

Multi-finger caging offers a robust approach to object grasping. To securely grasp an object, the fingers are first placed in caging regions surrounding a desired immobilizing grasp. This prevents the object from escaping the hand, and allows for great position uncertainty of the fingers relative to the object. The hand is then closed until the desired immobilizing grasp is reached.

While efficient computation of two-finger caging grasps for polygonal objects is well developed, the computation of three-finger caging grasps has remained a challenging open problem. We will discuss the caging of polygonal objects using three-finger hands that maintain similar triangle finger formations during the grasping process. While the configuration space of such hands is four dimensional, their contact space which represents all two and three finger contacts along the grasped object's boundary forms a two-dimensional stratified manifold.

We will present a caging graph that can be constructed in the hand's relatively simple contact space. Starting from a desired immobilizing grasp of the object by a specific triangular finger formation, the caging graph is searched for the largest formation scale value that ensures a three-finger cage about the object. This value determines the caging regions, and if the formation scale is kept below this value, any finger placement within the caging regions will guarantee robust object grasping.

An extremal metric, as defined by Calabi, is a canonical Kahler metric: it minimizes the curvature within a given Kahler class. According to the Yau-Tian-Donaldson conjecture, polarized Kahler manifolds admitting an extremal metric should correspond to stable manifolds in a Geometric Invariant Theory sense.

In this talk, we will explain that a projective extremal Kahler manifold is asymptotically relatively Chow stable. This fact was conjectured by Apostolov and Huang, and its proof relies on quantization techniques. We will explain various implications, such as uniqueness and splitting results for extremal metrics.

Joint work with Yuji Sano (Fukuoka University).

A curious property of randomized log-space search algorithms is that their outputs are often longer than their workspace. One consequence is that there is no clear way to reproduce the same output when running the algorithm twice on the same input. It is not feasible to store the random bits (or the output) of the previous run in log-space, and using new random bits in another execution can result in a different output. This leads to the question: how can we reproduce the results of a randomized log-space computation of a search problem?

We will give a precise definition of this notion of "reproducibility". Then we will show that every problem in search-RL has a randomized log-space algorithm where the output can be reproduced. Reproducibility can be thought of as an extension of pseudo-determinism. Indeed, for some problems in search-RL we show pseudo-deterministic algorithms whose running times significantly improve on known deterministic algorithms.
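As a toy illustration of the flavor of pseudo-determinism (this is not the paper's construction, which must work with genuinely fresh randomness and in logarithmic space; all names here are invented), compare a plain randomized search, which may return a different witness on each run, with a variant that derives its random probes from the input itself and is therefore trivially reproducible:

```python
import hashlib
import random

def find_witness(bits, seed=None):
    """Randomized search for an index i with bits[i] == 1.  With many
    witnesses this takes O(1) expected probes, but two independent runs
    may return different indices."""
    rng = random.Random(seed)
    while True:
        i = rng.randrange(len(bits))
        if bits[i]:
            return i

def find_witness_reproducible(bits):
    """Toy 'reproducible' variant: derive the randomness from a hash of
    the input, so every run on the same input probes the same sequence
    and returns the same witness.  (The notion in the talk is stronger:
    the same output with high probability over fresh randomness.)"""
    seed = hashlib.sha256(bytes(bits)).hexdigest()
    return find_witness(bits, seed=seed)

bits = [i % 2 for i in range(100)]  # half the indices are witnesses
assert find_witness_reproducible(bits) == find_witness_reproducible(bits)
```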

Joint work with Yang Liu.

Oren Louidor (Technion)

**Title: Dynamical freezing in a spin-glass with logarithmic correlations**.

Abstract: We consider a continuous time random walk on the 2D torus, governed by the exponential of the discrete Gaussian free field acting as potential. This process can be viewed as Glauber dynamics for a spin-glass system with logarithmic correlations. Taking temperature to be below the freezing point, we then study this process both at pre-equilibrium and in-equilibrium time scales. In the former case, we show that the system exhibits aging and recover the arcsine law as asymptotics for a natural two point temporal correlation function. In the latter case, we show that the dynamics admits a functional scaling limit, with the limit given by a variant of Kolmogorov's K-process, driven by the limiting extremal process of the field, or alternatively, by a super-critical Liouville Brownian motion. Joint work with A. Cortines, J. Gold and A. Svejda.

Alexander Glazman (Tel Aviv)

**Title: Level lines of a random Lipschitz function**

Abstract: We consider the uniform distribution on Lipschitz functions on the triangular lattice, i.e. all integer-valued functions which differ by 0 or 1 on any two adjacent vertices. We show that with a positive probability such a function exhibits macroscopic level lines. Instead of working directly with Lipschitz functions we map this model to the loop $O(2)$ model with parameter $x=1$. The loop $O(n)$ model is a model for a random collection of non-intersecting loops on the hexagonal lattice, which is believed to be in the same universality class as the spin $O(n)$ model.

A main tool in the proof is a positive association (FKG) property that was recently shown to hold when $n \ge 1$ and $0<x\le\frac{1}{\sqrt{n}}$. Though the case $n=2$, $x=1$ is not in the FKG regime, it turns out that when loops are assigned one of two colours independently, the marginal on loops of either colour does satisfy the FKG property. The colouring of loops allows one to view the loop $O(2)$ model with $x=1$ as a coupling of two percolation configurations. Studying each of them independently and using the XOR operation, we establish the existence of macroscopic loops, i.e. level lines in the original setting.

(Based on a joint work with H. Duminil-Copin, and I. Manolescu)

In unsupervised domain mapping, the learner is given two unmatched datasets A and B. The goal is to learn a mapping G_AB that translates a sample in A to the analogous sample in B. Recent approaches have shown that when learning both G_AB and the inverse mapping G_BA simultaneously, convincing mappings are obtained. In this work, we present a method of learning G_AB without learning G_BA. This is done by learning a mapping that maintains the distance between a pair of samples. Moreover, good mappings are obtained even by maintaining the distance between different parts of the same sample before and after mapping. We present experimental results showing that the new method not only allows for one-sided mapping learning, but also leads to better numerical results than the existing circularity-based constraint.
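The distance-preservation idea can be sketched numerically as follows (a simplified numpy version of the penalty; the actual method trains a neural network G_AB with a loss of this shape on standardized distances, and the function names here are invented):

```python
import numpy as np

def distance_loss(A, GA):
    """Penalize the change in pairwise distances under the mapping:
    for every pair (i, j), compare |A_i - A_j| with |G(A_i) - G(A_j)|."""
    n = len(A)
    loss = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_in = np.linalg.norm(A[i] - A[j])
            d_out = np.linalg.norm(GA[i] - GA[j])
            loss += abs(d_in - d_out)
    return loss / (n * (n - 1) / 2)

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 16))

# An isometry (here, a fixed permutation of coordinates) incurs ~zero loss,
# while a non-isometric map (scaling) is penalized.
perm = rng.permutation(16)
assert distance_loss(A, A[:, perm]) < 1e-9
assert distance_loss(A, 2 * A) > 0.1
```

The key design point is that this loss constrains G_AB using only samples from A and their images, so no inverse mapping G_BA is needed.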

In formal verification, one uses mathematical tools in order to prove that a system satisfies a given specification.

In this talk I consider three limitations of traditional formal verification:

The first is that treating systems as finite-state machines becomes problematic as systems grow more complex; thus, we must consider infinite-state machines.

The second is that Boolean specifications are not rich enough to specify properties required by today's systems. Thus, we need to go beyond the Boolean setting.

The third concerns certifying the results of verification: while a verifier may answer that a system satisfies its specification, a designer often needs some palpable evidence of correctness.

I will present several works addressing the above limitations, and the mathematical tools involved in overcoming them.

No prior knowledge is assumed. Posterior knowledge is guaranteed.

We develop the theory of semisimplifications of tensor categories defined by Barrett and Westbury. By definition, the semisimplification of a tensor category is its quotient by the tensor ideal of negligible morphisms, i.e., morphisms f such that Tr(fg)=0 for any morphism g in the opposite direction. In particular, we compute the semisimplification of the category of representations of a finite group in characteristic p in terms of representations of the normalizer of its Sylow p-subgroup. This allows us to compute the semisimplification of the representation category of the symmetric group S_{n+p} in characteristic p, where n=0,...,p-1, and of the Deligne category Rep^{ab} S_t, t in N. We also compute the semisimplification of the category of representations of the Kac-De Concini quantum group of the Borel subalgebra of sl_2. Finally, we study tensor functors between Verlinde categories of semisimple algebraic groups arising from the semisimplification construction, and objects of finite type in categories of modular representations of finite groups (i.e., objects generating a fusion category in the semisimplification).

This is joint work with Victor Ostrik.

In the area of distributed graph algorithms, a number of network entities with local views solve some computational task by exchanging messages with their neighbors. Quite unfortunately, an inherent property of most existing distributed algorithms is that throughout their execution, the nodes learn not only their own output but also quite a lot about the inputs or outputs of many other entities. This leakage of information can be a major obstacle in settings where the output (or input) of an individual node is private information (e.g. distributed networks of selfish agents, decentralized digital currencies such as Bitcoin, voting systems).

While a rather unfamiliar notion in the classical distributed setting, secure multi-party computation (MPC) is one of the main themes in the cryptography community. Yet despite extensive work in the area, no existing algorithm fits the framework of classical distributed models, in which there are no assumptions on the graph topology and only messages of bounded size are sent on the edges in each round.

In this work, we introduce a new framework for \emph{secure distributed graph algorithms} and provide the first \emph{general compiler} that takes any "natural" non-secure distributed algorithm that runs in $r$ rounds, and turns it into a secure algorithm that runs in $\widetilde{O}(r \cdot D \cdot poly(\Delta))$ rounds where $\Delta$ is the maximum degree in the graph and $D$ is its diameter. This round complexity is nearly optimal for bounded degree graphs.

The main technical part of our compiler is based on a new cycle cover theorem: We show that the edges of every bridgeless graph $G$ of diameter $D$ can be covered by a collection of cycles such that each cycle is of length $\widetilde{O}(D)$ and each edge of the graph $G$ appears in $\widetilde{O}(1)$ many cycles. This provides the basis for additional combinatorial constructions required by our compiler and might be of independent combinatorial and algorithmic interest.

Joint work with Merav Parter.

In recent years, robots have played an active role in everyday life: medical robots assist in complex surgeries, low-cost commercial robots clean houses and fleets of robots are used to efficiently manage warehouses. A key challenge in these systems is motion planning, where we are interested in planning a collision-free path for a robot in an environment cluttered with obstacles. While the general problem has been studied for several decades now, these new applications introduce an abundance of new challenges.

In this talk I will describe some of these challenges as well as algorithms developed to address them. I will overview general challenges such as compression and graph-search algorithms in the context of motion planning. I will show why traditional Computer Science tools are ill-suited for these problems and introduce alternative algorithms that leverage the unique characteristics of robot motion planning. In addition, I will describe domain-specific challenges such as those that arise when planning for assistive robots and for humanoid robots, and overview algorithms tailored for these specific domains.

Language technology has become pervasive in everyday life, powering applications like Apple's Siri or Google's Assistant. Neural networks are a key component in these systems thanks to their ability to model large amounts of data. Contrary to traditional systems, models based on deep neural networks (a.k.a. deep learning) can be trained in an end-to-end fashion on input-output pairs, such as a sentence in one language and its translation in another language, or a speech utterance and its transcription. The end-to-end training paradigm simplifies the engineering process while giving the model flexibility to optimize for the desired task. This, however, often comes at the expense of model interpretability: understanding the role of different parts of the deep neural network is difficult, and such models are often perceived as "black-box". In this work, we study deep learning models for two core language technology tasks: machine translation and speech recognition. We advocate an approach that attempts to decode the information encoded in such models while they are being trained. We perform a range of experiments comparing different modules, layers, and representations in the end-to-end models. Our analyses illuminate the inner workings of end-to-end machine translation and speech recognition systems, explain how they capture different language properties, and suggest potential directions for improving them. The methodology is also applicable to other tasks in the language domain and beyond.

In this talk, I present an analogue of the Hardy-Littlewood conjecture on the asymptotic distribution of prime constellations in the setting of short intervals in function fields of smooth projective curves over finite fields.

I will discuss the definition of a "short interval" on a curve as an additive translation of the space of global sections of a sufficiently positive divisor E by a suitable rational function f, and show how this definition generalizes the definition of a short interval in the polynomial setting.

I will give a sketch of the proof, which includes a computation of a certain Galois group and a counting argument, namely a Chebotarev-density-type theorem.

This is a joint work with Tyler Foster.

Computational problems whose input is a program are central in Cryptography, as well as Complexity, Learning, and Optimization. The nature of such problems crucially depends on the way the program is accessed -- as a black box or explicitly by its implementation.

In which settings can we exploit code to gain an advantage over black-box access? In Cryptography, we explore this question from two opposing perspectives:

Protecting Code: Can we obfuscate a program's code so that its functionality is preserved but it is otherwise unintelligible? Intuitively, such obfuscated code reveals nothing more than black-box access to the program. Obfuscation is, therefore, a powerful tool with numerous applications in software protection and Cryptography.

Exploiting Code: Most security proofs in cryptography consist of a reduction that translates any potential attacker into an algorithm solving some underlying hard problem. While most security reductions only require black-box access to the attacker, for many applications black-box reductions are provably insufficient. Can we exploit the attacker's code to prove security where black-box reductions fail?

In this talk, I will describe new techniques for protecting and exploiting code, taking advantage of the inherent tension between these two tasks. I will also demonstrate applications of these techniques in and beyond cryptography.

Sebastien Bubeck (Microsoft Research)

**Title: k-server via multiscale entropic regularization**

Abstract: I will start by describing how mirror descent is a natural strategy for online decision making, specifically in online learning and metrical task systems. To motivate the k-server problem I will also briefly recall what we know and what we don't know for structured state/action spaces in these models. Using the basic mirror descent calculations I will show how to easily obtain a log(k)-competitive algorithm for k-paging. I will then introduce our new parametrization of fractional k-server on a tree, and explain how to analyze the movement cost of entropy-regularized mirror descent on this parametrization. This leads to a depth*log(k)-competitive (fractional) algorithm for general trees, and log^2(k) for HSTs. I will also briefly mention dynamic embeddings to go beyond the standard log(n) loss in the reduction from general metrics to HSTs.

Joint work with Michael B. Cohen, James R. Lee, Yin Tat Lee, and Aleksander Madry.

Percy Deift (Courant Institute )

**Title: Universality in numerical analysis with some examples of cryptographic algorithms.**

Abstract: We show that a wide variety of numerical algorithms with random data exhibit universality. Most of the results are computational, but in some important cases universality is established rigorously. We also discuss universality for some cryptographic algorithms.

Joint work with C. Pfrang, G. Menon, S. Olver, T. Trogdon and S. Miller.

Given a p-adic group G, number theorists are interested in producing admissible representations of G: representations which have a well-defined character functional. One way to produce such representations is by "Jacquet induction" from smaller groups, whose characters can be understood inductively. The complementary space of "new" characters which are not obtained by induction (complementary with respect to a natural metric on the space of characters) is given by what is called "elliptic" characters. Given a representation V of G, the "new" input from its character is captured by the operator Ax(V), with A (the Bernstein-Deligne-Kazhdan A-operator) the projector to the elliptic component (note that this is different from the component of the character lattice valued in elliptic elements). I will talk about my ongoing work with Xuhua He on extending this operator to a trace functional Ax(V) for V a finitely-generated representation (whose Grothendieck group is well understood), which works by first constructing a virtual elliptic admissible representation from any finitely generated representation.

Being the gradient flow of the area functional, the mean curvature flow can be thought of as a greedy algorithm for simplifying embedded shapes. But how successful is this algorithm?

In this talk, I will describe three examples for how mean curvature flow, as well as its variants and weak solutions, can be used to achieve this desired simplification.

The first is a short-time smoothing effect of the flow, allowing one to smooth out some rough, potentially fractal, initial data.

The second is an application of mean curvature flow with surgery to smooth differential topology, allowing one to conclude Schoenflies-type theorems about the moduli space of smooth embedded spheres and tori satisfying certain curvature conditions.

The third is an application of (weak, modified) mean curvature flow to differential geometry, allowing one to relate bounds on the Gaussian entropy functional to the topology of a closed hypersurface.

In this talk, which will assume no prior knowledge of PDE or mean curvature flow, I will try to highlight the relation between the analysis of the flow, in particular its singularity formation, and both "time dependent" and "classical" geometry.

Some of the results described in this talk are joint works with Reto Buzano, Robert Haslhofer and Brian White.

We recall the notion of a Hall algebra associated to a category, and explain how this construction can be carried out in a way that naturally includes a higher algebra structure, motivated by work of Toen and Dyckerhoff-Kapranov. We will then explain how this leads to new insights about the bi-algebra structure and related concepts.

The problem of computational super-resolution asks to recover high-frequency features of an object from the noisy and blurred/band-limited samples of its Fourier transform, based on some a-priori information about the object class. A core theoretical question is to quantify the possible amount of bandwidth extension and the associated stability of recovering the missing frequency components, as a function of the sample perturbation level and the object prior.

In this work we assume that the object has a compact space/time extent in one dimension (but otherwise can be fairly arbitrary), while the low-pass window can have a super-exponentially decaying "soft" shape (such as a Gaussian). In contrast, previously known results considered only the ideal "hard" window (a characteristic function of the band) and objects of finite energy. The super-resolution problem in this case is equivalent to a stable analytic continuation of an entire function of finite exponential type. We show that a weighted least-squares polynomial approximation with equispaced samples and a judiciously chosen number of terms allows one to have a super-resolution factor which scales logarithmically with the noise level, while the pointwise extrapolation error exhibits a Hölder-type continuity with an exponent derived from weighted potential theory. The algorithm is asymptotically minimax, in the sense that there is essentially no better algorithm yielding meaningfully lower error over the same smoothness class.
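To make the recipe concrete, here is a minimal numerical sketch of least-squares polynomial extrapolation; the test function, sampling grid, and the ~log(1/eps) choice of degree are invented for this illustration and stand in for the paper's analysis, not reproduce it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the setting: f is entire of exponential type (a cosine),
# sampled at equispaced points on [-1, 1] with noise level eps.
f = lambda t: np.cos(3 * t)
eps = 1e-6
ts = np.linspace(-1, 1, 200)
samples = f(ts) + eps * rng.standard_normal(ts.size)

# Least-squares polynomial fit with the number of terms tied to the noise
# level (a heuristic ~log(1/eps) choice for this sketch).
degree = int(np.log(1 / eps))
coeffs = np.polynomial.chebyshev.chebfit(ts, samples, degree)

# Extrapolate a little beyond the sampled interval; with a judicious
# degree, the error degrades gracefully rather than blowing up.
t_out = 1.2
err = abs(np.polynomial.chebyshev.chebval(t_out, coeffs) - f(t_out))
```

Taking the degree much larger would let the noise be amplified by the growth of the polynomials outside [-1, 1], which is the trade-off the abstract's "judiciously chosen number of terms" refers to.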

The results can be considered as a first step towards analyzing the much more realistic model of a sparse combination of compactly-supported "approximate spikes", which appears in applications such as synthetic aperture radar, seismic imaging and direction of arrival estimation, and for which only limited special cases are well-understood.

Joint work with L.Demanet and H.Mhaskar.

The use of a computational PIR scheme has been instrumental in reducing interaction from interactive proofs, and in converting multi-prover interactive proofs to (single prover) 2-message computationally sound proofs (also known as arguments).

In this talk we will focus on the secrecy guarantees of this transformation.

We show that if we start with an interactive proof which is only *honest-verifier* zero-knowledge, and we use a quasi-poly secure *symmetric* PIR scheme (or a 2-message OT scheme) to reduce interaction, then the resulting 2-message argument is witness indistinguishable, and in the delayed-input setting it is distributional weak zero-knowledge (which implies strong witness indistinguishability and witness hiding in the delayed-input setting). Moreover, under the same assumption (which can be instantiated from the quasi-poly DDH/QR/N'th residuosity assumptions), we construct a two-message argument with (similar) *statistical* secrecy guarantees. For the latter, we apply the PIR heuristic to a computationally sound proof which is honest-verifier statistical zero-knowledge.

This is based on joint works with Abhishek Jain, Dakshita Khurana, Ron Rothblum and Amit Sahai.

Graph matching is one of the most well-studied problems in combinatorial optimization, with applications ranging from scheduling and object recognition to numerical analysis and computational chemistry.

Nevertheless, until recently very little was known about this problem in real-life **dynamic networks**, which aim to model the constantly changing physical world.

In the first part of the talk we'll discuss our work on dynamic graph matching, and in the second part we'll highlight our work on a few related problems.

Talk 1: Amir Dembo (Stanford)

**Title: Large deviations theory for chemical reaction networks.**

The microscopic dynamics of well-stirred networks of chemical reactions are modeled as jump Markov processes. At large volume, one may expect in this framework to have a straightforward application of large deviation theory. This is not at all true, for the jump rates are typically neither globally Lipschitz, nor bounded away from zero, with both blowup and absorption as quite possible scenarios. In joint work with Andrea Agazzi and Jean-Pierre Eckmann, we utilize Lyapunov stability theory to bypass this challenge and to characterize a large class of network topologies that satisfy the full Wentzell-Freidlin theory of asymptotic rates of exit from domains of attraction.

Talk 2: Yuval Peres (Microsoft)

Title: **Trace reconstruction for the deletion channel**

Abstract: In the trace reconstruction problem, an unknown string $x$ of $n$ bits is observed through the deletion channel, which deletes each bit with some constant probability q, yielding a contracted string. How many independent outputs (traces) of the deletion channel are needed to reconstruct $x$ with high probability?

The best lower bound known is linear in $n$. Until 2016, the best upper bound was exponential in the square root of $n$. We improve the square root to a cube root using statistics of individual output bits and some inequalities for Littlewood polynomials on the unit circle. This bound is sharp for reconstruction algorithms that only use this statistical information. (Similar results were obtained independently and concurrently by De, O'Donnell, and Servedio.) If the string $x$ is random, we can show that a subpolynomial number of traces suffices, by comparison to a random walk. (Joint works with Fedor Nazarov, STOC 2017; with Alex Zhai, FOCS 2017; and with Nina Holden & Robin Pemantle, preprint (2017).)
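The deletion channel and the kind of single-position statistics mentioned above are easy to simulate (a toy sketch with invented names and parameters, not the reconstruction algorithm itself):

```python
import random

def deletion_channel(x, q, rng):
    """Pass the bit-string x through the deletion channel: each bit is
    deleted independently with probability q and the survivors are
    concatenated into a (shorter) trace."""
    return [b for b in x if rng.random() >= q]

def mean_trace_bit(x, q, num_traces, position, seed=0):
    """Average the bit observed at a fixed output position over many
    independent traces -- the kind of single-position statistic the
    cube-root bound is built from."""
    rng = random.Random(seed)
    total = count = 0
    for _ in range(num_traces):
        t = deletion_channel(x, q, rng)
        if position < len(t):
            total += t[position]
            count += 1
    return total / count

x = [1, 0] * 50  # an unknown 100-bit string
m = mean_trace_bit(x, q=0.5, num_traces=2000, position=0)
```

For this alternating string and q = 0.5, the first surviving bit is a 1 with probability 2/3 (it comes from an even original index, which happens with probability (1-q) * sum_i q^{2i} = 2/3), so m concentrates near that value.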

Image captioning -- automatic production of text describing a visual scene -- has received a lot of attention recently. However, the objective of captioning, evaluation metrics, and the training protocol remain somewhat unsettled. The general goal seems to be for machines to describe visual signal like humans do. We pursue this goal by incorporating a discriminability loss in training caption generators. This loss is explicitly "aware" of the need for the caption to convey information, rather than appear fluent or reflect word distribution in the human captions. Specifically, the loss in our work is tied to discriminative tasks: producing a referring expression (text that allows a recipient to identify a region in the given image) or producing a discriminative caption which allows the recipient to identify an image within a set of images. In both projects, use of the discriminability loss does not require any additional human annotations, and relies on collaborative training between the caption generator and a comprehension model, which is a proxy for a human recipient. In experiments on standard benchmarks, we show that adding discriminability objectives not only improves the discriminative quality of the generated image captions, but, perhaps surprisingly, also makes the captions better under a variety of traditional metrics.

Joint work with Ruotian Luo (TTIC), Brian Price and Scott Cohen (Adobe).

Let K be a commutative ring. Consider the groups GL_n(K). Bernstein and Zelevinsky have studied the representations of the general linear groups in the case where the ring K is a finite field. Instead of studying the representations of GL_n(K) for each n separately, they studied all the representations of all the groups GL_n(K) simultaneously. They considered on R := ⊕_n R(GL_n(K)) structures called parabolic (or Harish-Chandra) induction and restriction, and showed that these enrich R with the structure of a so-called positive self-adjoint Hopf algebra (or PSH algebra). They use this structure to reduce the study of representations of the groups GL_n(K) to the following two tasks:

1. Study a special family of representations of GL_n(K), called cuspidal representations. These are representations which do not arise as direct summands of parabolic inductions of smaller representations.

2. Study representations of the symmetric groups. These representations also have a nice combinatorial description, using partitions.

In this talk I will discuss the study of representations of GL_n(K) where K is a finite quotient of a discrete valuation ring (such as Z/p^r or k[x]/x^r, where k is a finite field). One reason to study such representations is that all continuous complex representations of the groups GL_n(Z_p) and GL_n(k[[x]]) (where Z_p denotes the p-adic integers) arise from these finite quotients. I will explain why the natural generalization of the Harish-Chandra functors does not furnish a PSH algebra in this case, and how this is related to the Bruhat decomposition and Gauss elimination. In order to overcome this issue we have constructed a generalization of the Harish-Chandra functors. I will explain this generalization, describe some of the new functors' properties, and explain how they can be applied to studying complex representations.

The talk will be based on a joint work with Tyrone Crisp and Uri Onn.

Today's technological world is increasingly dependent upon the reliability, robustness, quality of service, and timeliness of networks, including power distribution, financial, transportation, communication, biological, and social networks. For the time-critical functionality in transferring resources and information, a key requirement is the ability to adapt and reconfigure in response to structural and dynamic changes, while avoiding disruption of service and catastrophic failures. We will outline some of the major problems in the development of the necessary theory and tools that will permit the understanding of network dynamics in a multiscale manner.

Many interesting networks consist of a finite but very large number of nodes or agents that interact with each other. The main challenge when dealing with such networks is to understand and regulate the collective behavior. Our goal is to develop mathematical models and optimization tools for treating the **Big Data** nature of large-scale networks while providing the means to understand and regulate the collective behavior and the dynamical interactions (short- and long-range) across such networks.

The key mathematical technique will be based upon the use of optimal mass transport theory and resulting notions of curvature applied to weighted graphs in order to characterize network robustness. Examples will be given from biology, finance, and transportation.

I shall review the framework of algebraic families of Harish-Chandra modules, introduced recently by Bernstein, Higson, and the speaker. Then, I shall describe three of their applications.

The first is contraction of representations of Lie groups. Contractions are certain deformations of representations with applications in mathematical physics.

The second is the Mackey bijection, a (partially conjectural) bijection between the admissible dual of a real reductive group and the admissible dual of its Cartan motion group.

The third is the hidden symmetry of the hydrogen atom as an algebraic family of Harish-Chandra modules.

Consider a random sequence of n bits that has entropy at least n-k, where k << n. A commonly used observation is that an average coordinate of this random sequence is close to being uniformly distributed, that is, the coordinate "looks random''. In this work, we prove a stronger result that says, roughly, that the average coordinate looks random to an adversary that is allowed to query about n/k other coordinates of the sequence, even if the adversary is non-deterministic.

As an application of this result, we prove a new result on depth-3 circuits, which recovers as a direct corollary the known lower bounds for the parity and majority functions, as well as a lower bound on sensitive functions due to Boppana.
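The entropy-deficiency setup in the abstract above can be made concrete with a small numerical toy (not from the talk): a distribution over n-bit strings with entropy n - k, obtained here simply by fixing k bits, whose average per-coordinate bias is then k/(2n) and hence small when k << n.

```python
import itertools
import math

# Toy illustration: a distribution over n-bit strings with entropy n - k,
# and the average per-coordinate distance from uniform that it induces.
n, k = 8, 1

# Fix the first k bits to 0: uniform over 2^(n-k) strings, entropy = n - k.
support = [bits for bits in itertools.product([0, 1], repeat=n)
           if all(b == 0 for b in bits[:k])]
p = 1.0 / len(support)

entropy = -sum(p * math.log2(p) for _ in support)   # equals n - k
assert abs(entropy - (n - k)) < 1e-9

# Per-coordinate distance from uniform: |Pr[bit_i = 1] - 1/2|.
biases = []
for i in range(n):
    pr_one = sum(p for bits in support if bits[i] == 1)
    biases.append(abs(pr_one - 0.5))

avg_bias = sum(biases) / n
print(f"entropy = {entropy:.3f}, average coordinate bias = {avg_bias:.4f}")
# The k fixed coordinates are maximally biased, the rest exactly uniform,
# so the average bias is k/(2n) -- small when k << n.
```

This extreme example (some coordinates fully fixed, the rest exactly uniform) is only the simplest instance; the result in the talk is the far stronger statement that the average coordinate looks random even to an adversary querying about n/k other coordinates.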

In this talk we discuss the fine scale $L^2$-mass distribution of toral Laplace eigenfunctions with respect to random position. For the 2-dimensional torus, under certain flatness assumptions on the Fourier coefficients of the eigenfunctions and generic restrictions on energy levels, both the asymptotic shape of the variance and the limiting Gaussian law are established, in the optimal Planck-scale regime. We also discuss the 3-dimensional case, where the asymptotic behaviour of the variance is analysed in a more restrictive scenario. This is joint work with Igor Wigman.

Deep Learning has led to a dramatic leap in Super-Resolution (SR) performance in the past few years. However, being supervised, these SR methods are restricted to specific training data, where the acquisition of the low-resolution (LR) images from their high-resolution (HR) counterparts is predetermined (e.g., bicubic downscaling), without any distracting artifacts (e.g., sensor noise, image compression, non-ideal PSF, etc.). Real LR images, however, rarely obey these restrictions, resulting in poor SR results by SotA (State of the Art) methods. In this paper we introduce "Zero-Shot" SR, which exploits the power of Deep Learning, but does not rely on prior training. We exploit the internal recurrence of information inside a single image, and *train a small image-specific CNN at test time, on examples extracted solely from the input image itself*. As such, it can adapt itself to different settings per image. This allows us to perform SR of real old photos, noisy images, biological data, and other images where the acquisition process is unknown or non-ideal. On such images, our method outperforms SotA CNN-based SR methods, as well as previous unsupervised SR methods. To the best of our knowledge, this is the first unsupervised CNN-based SR method.
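The core "zero-shot" trick above can be sketched in a few lines (an illustration only, not the paper's implementation): the test image itself serves as the HR target, and a further-downscaled copy of it serves as the LR input, yielding an image-specific training pair. Average pooling stands in here for the paper's downscaling kernel.

```python
import numpy as np

def downscale(img, factor=2):
    """Simple average-pooling downscale (stand-in for the true kernel)."""
    h, w = img.shape
    h2, w2 = h - h % factor, w - w % factor
    img = img[:h2, :w2]
    return img.reshape(h2 // factor, factor,
                       w2 // factor, factor).mean(axis=(1, 3))

# "Zero-shot": the only training data is the test image itself.
rng = np.random.default_rng(0)
test_image = rng.random((64, 64))       # stands in for the real LR input

hr_target = test_image                  # treat the input as the "HR" side
lr_input = downscale(test_image, 2)     # its downscaled "LR" counterpart

print(hr_target.shape, lr_input.shape)  # (64, 64) (32, 32)
# A small CNN would now be trained to map lr_input -> hr_target, then
# applied to test_image itself to produce the super-resolved output.
```

The point of the construction is that the (LR, HR) statistics are extracted from the input image alone, so the learned mapping automatically matches that image's unknown acquisition process.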

As is well known and easy to prove, the Weyl algebras A_n over a field of characteristic zero are simple. Hence any non-zero homomorphism from A_n to A_m is an embedding and m \geq n. V. Bavula conjectured that the same is true over fields of finite characteristic. It turned out that exactly one half of his conjecture is correct (m \geq n, but there are homomorphisms which are not embeddings).

If we replace the Weyl algebra by its close relative, the symplectic Poisson algebra (the polynomial algebra F[x_1, ..., x_n; y_1, ..., y_n] with Poisson bracket given by {x_i, y_i} = 1 and zero on the remaining pairs of generators), then independently of the characteristic all homomorphisms are embeddings.
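The bracket defined above can be checked symbolically (a toy verification, not part of the talk): on F[x_1..x_n, y_1..y_n] it is {f, g} = sum_i (df/dx_i dg/dy_i - df/dy_i dg/dx_i), which indeed gives {x_i, y_i} = 1 and zero on the other pairs of generators.

```python
import sympy as sp

# The symplectic Poisson bracket on the polynomial algebra in x_i, y_i:
#   {f, g} = sum_i (df/dx_i * dg/dy_i - df/dy_i * dg/dx_i)
n = 2
xs = sp.symbols(f"x1:{n + 1}")
ys = sp.symbols(f"y1:{n + 1}")

def bracket(f, g):
    return sp.expand(sum(sp.diff(f, xs[i]) * sp.diff(g, ys[i])
                         - sp.diff(f, ys[i]) * sp.diff(g, xs[i])
                         for i in range(n)))

assert bracket(xs[0], ys[0]) == 1   # {x_1, y_1} = 1
assert bracket(xs[0], ys[1]) == 0   # zero on the other pairs
assert bracket(xs[0], xs[1]) == 0

# Leibniz rule: {f, g*h} = {f, g}*h + g*{f, h}
f, g, h = xs[0] * ys[1], ys[0]**2, xs[1] + ys[0]
assert sp.expand(bracket(f, g * h)
                 - (bracket(f, g) * h + g * bracket(f, h))) == 0
```

Since the bracket is built from partial derivatives, it is automatically a biderivation, which the Leibniz check above confirms on a sample triple.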

In this talk I will describe a family of integral representations for the standard twisted L-function of a cuspidal representation of the exceptional group of type G_2. These integral representations are unusual in the sense that they unfold with a non-unique model. A priori such an integral is not Eulerian, but using remarkable machinery proposed by I. Piatetski-Shapiro and S. Rallis we prove that the integral does in fact factor. In the course of the local unramified calculation we use another non-standard method: approximations of generating functions. I will then describe a few applications of these integral representations to the study of the analytic behaviour of this L-function and to various functorial lifts associated with the group G_2.

We present an explicit pseudorandom generator with polylog(n) seed length for read-once constant-width branching programs that can read their $n$ input bits in any order. This improves upon the work of Impagliazzo, Meka, and Zuckerman (FOCS, 2012), where they required seed length $n^{1/2+o(1)}$.

A central ingredient in our work is a bound on the Fourier spectrum of constant-width branching programs, settling a conjecture posed by Reingold, Steinke, and Vadhan (RANDOM, 2013).

Our analysis crucially uses a notion of local monotonicity on the edge labeling of the branching program. We carry out critical parts of our proof under the assumption of local monotonicity and show how to deduce our results for unrestricted branching programs.

(Joint work with Eshan Chattopadhyay, Pooya Hatami, and Omer Reingold)

Machine learning has recently been revolutionized by the introduction of Deep Neural Networks. However, from a theoretical viewpoint these methods are still poorly understood. Indeed, the key challenge in Machine Learning today is to derive rigorous results for optimization and generalization in deep learning. In this talk I will present several tractable approaches to training neural networks. In the second part I will discuss a new sequential algorithm for decision making that can take into account the structure in the action space and is better attuned to realistic decision-making scenarios.

I will present our work that provides some of the first positive results and yields new, provably efficient, and practical algorithms for training certain types of neural networks. In a second work I will present a new online algorithm that learns by sequentially sampling random networks and asymptotically converges, in performance, to the optimal network. Our approach improves on previous random-features-based learning in terms of sample/computational complexity and expressiveness. In a more recent work we take a different perspective on this problem. I will provide sufficient conditions that guarantee tractable learning, using the notion of refutation complexity. I will then discuss how this new idea can lead to new interesting generalization bounds that can potentially explain generalization in settings that are not always captured by classical theory.

In the setting of reinforcement learning, I will present a recently developed algorithm for decision making in a metrical action space. As an application, we consider a dynamic pricing problem in which a seller is faced with a stream of patient buyers. Each buyer buys at the lowest price in a certain time window. We use our algorithm to achieve an optimal regret, improving on the previously known regret bound.

Schwartz functions are classically defined as smooth functions such that they, and all their (partial) derivatives, decay at infinity faster than the inverse of any polynomial. This was formulated on $\mathbb{R}^n$ by Laurent Schwartz, and later on Nash manifolds (smooth semi-algebraic varieties) by Fokko du Cloux and by Rami Aizenbud and Dima Gourevitch. In a joint work with Boaz Elazar we have extended the theory of Schwartz functions to the category of (possibly singular) real algebraic varieties. The basic idea is to define Schwartz functions on a (closed) algebraic subset of $\mathbb{R}^n$ as restrictions of Schwartz functions on $\mathbb{R}^n$.

Both in the Nash and the algebraic categories there exists a very useful characterization of Schwartz functions on open subsets, in terms of Schwartz functions on the embedding space: loosely speaking, Schwartz functions on an open subset are exactly restrictions of Schwartz functions on the embedding space, which are zero "to infinite order" on the complement to this open subset. This characterization suggests a very intuitive way to attach a space of Schwartz functions to an arbitrary (not necessarily semi-algebraic) open subset of $\mathbb{R}^n$.

In this talk, I will explain this construction, and more generally the construction of the category of tempered smooth manifolds. This category is in a sense the "largest" category whose objects "look" locally like open subsets of $\mathbb{R}^n$ (for some $n$), and on which Schwartz functions may be defined. In the development of this theory some classical results of Whitney are used, mainly Whitney type partition of unity (this will also be explained in the talk). As time permits, I will show some properties of Schwartz functions, and describe some possible applications. This is a work in progress.

Over the past two decades, machine learning has rapidly evolved and emerged as a highly influential discipline of computer science and engineering. One of the pillars of machine learning is mathematical optimization, and the connection between the two fields has been a primary focus of research. In this talk, I will present two recent works that contribute to this study, focusing on online learning---a central model in machine learning for sequential decision making and learning under uncertainty. In the first part of the talk, I will describe a foundational result concerned with the power of optimization in online learning, and give answer to the question: does there exist a generic and efficient reduction from online learning to black-box optimization? In the second part, I will discuss a recent work that employs online learning techniques to design a new efficient and adaptive preconditioned algorithm for large-scale optimization. Despite employing preconditioning, the algorithm is practical even in modern optimization scenarios such as those arising in training state-of-the-art deep neural networks. I will present the new algorithm along with its theoretical guarantees and demonstrate its performance empirically.

The random-cluster model is a dependent percolation model where the weight of a configuration is proportional to q to the power of the number of connected components. It is highly related to the ferromagnetic q-Potts model, where every vertex is assigned one of q colors, and monochromatic neighbors are encouraged. Through non-rigorous means, Baxter showed that the phase transition is first-order whenever $q > 4$ - i.e. there are multiple Gibbs measures at criticality. We provide a rigorous proof of this claim. Like Baxter, our proof uses the correspondence between the above models and the six-vertex model, which we analyze using the Bethe ansatz and transfer matrix techniques. We also prove Baxter's formula for the correlation length of the models at criticality.

This is joint work with Hugo Duminil-Copin, Maxime Gangebin, Ioan Manolescu, and Vincent Tassion.
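The weight of the random-cluster model described above can be illustrated by brute force on a tiny graph (a toy computation, not from the talk): each edge configuration w gets unnormalized weight p^{#open} (1-p)^{#closed} q^{k(w)}, where k(w) counts connected components.

```python
import itertools
from math import isclose

# Brute-force random-cluster partition function on a 4-cycle (toy parameters).
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
p, q = 0.5, 4.0

def components(open_edges):
    """Number of connected components, via a small union-find."""
    parent = list(range(len(vertices)))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for u, v in open_edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

Z = 0.0
for config in itertools.product([0, 1], repeat=len(edges)):
    open_edges = [e for e, bit in zip(edges, config) if bit]
    n_open = len(open_edges)
    Z += p**n_open * (1 - p)**(len(edges) - n_open) * q**components(open_edges)

print(f"partition function Z = {Z:.4f}")  # 39.25 for this graph
```

Dividing each configuration's weight by Z gives its probability; the q^{k(w)} factor is exactly what rewards (for q > 1) configurations with many components and ties the model to the q-Potts model.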

We present a general approach to video understanding, inspired by semantic transfer techniques that have been successfully used for 2D image analysis. Our method considers a video to be a 1D sequence of clips, each one associated with its own semantics. The nature of these semantics -- natural language captions or other labels -- depends on the task at hand. A test video is processed by forming correspondences between its clips and the clips of reference videos with known semantics, following which, reference semantics can be transferred to the test video. We describe two matching methods, both designed to ensure that (a) reference clips appear similar to test clips and (b), taken together, the semantics of the selected reference clips is consistent and maintains temporal coherence. We use our method for video captioning on the LSMDC'16 benchmark, video summarization on the SumMe and TVSum benchmarks, Temporal Action Detection on the Thumos2014 benchmark, and sound prediction on the Greatest Hits benchmark. Our method not only surpasses the state of the art, in four out of five benchmarks, but importantly, it is the only single method we know of that was successfully applied to such a diverse range of tasks.

Digital geometry processing (DGP) is one of the core topics of computer graphics, and has been an active line of research for over two decades. On one hand, the field introduces theoretical studies in topics such as vector-field design, preservative maps, and deformation theory. On the other hand, the tools and algorithms developed by this community are applicable in fields ranging from computer-aided design, to multimedia, to computational biology and medical imaging. Throughout my work, I have sought to bridge the gap between the theoretical aspects of DGP and their applications. In this talk, I will demonstrate how DGP concepts can be leveraged to facilitate real-life applications with the right adaptation. More specifically, I will portray how I have employed deformation theory to support problems in animation and augmented reality. I will share my thoughts and the first steps I have taken to enlist DGP in the aid of machine learning, and, perhaps most excitingly, I will discuss my own and the graphics community's contributions to the computational fabrication field, as well as my vision for its future.

Bio: Dr. Amit H. Bermano is a postdoctoral Researcher at the Princeton Graphics Group, hosted by Professor Szymon Rusinkiewicz and Professor Thomas Funkhouser. Previously, he was a postdoctoral researcher at Disney Research Zurich in the computational materials group, led by Dr. Bernhard Thomaszewski. He conducted his doctoral studies at ETH Zurich under the supervision of Prof. Dr. Markus Gross, in collaboration with Disney Research Zurich. His Masters and Bachelors degrees were obtained at The Technion - Israel Institute of Technology under the supervision of Prof. Craig Gotsman. His research focuses on connecting the geometry processing field with other fields in computer graphics and vision, mainly by using geometric methods to facilitate other applications. His interests in this context include computational fabrication, animation, augmented reality, medical imaging and machine learning.

We present new heavy-hitters algorithms satisfying local-differential-privacy, with optimal or near-optimal worst-case error, running time, and memory. In our algorithms, the server running time is $\tilde O(n)$ and user running time is $\tilde O(1)$, hence improving on the prior state-of-the-art result of Bassily and Smith [STOC 2015] requiring $O(n^{5/2})$ server time and $O(n^{3/2})$ user time. With a typically large number of participants in local algorithms ($n$ in the millions), this reduction in time complexity is crucial for making locally-private heavy-hitters algorithms usable in practice.

Joint work with Raef Bassily, Kobbi Nissim, and Abhradeep Thakurta.
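For readers unfamiliar with the local model referenced above, the following sketch illustrates its basic mechanics with simple binary randomized response (this is NOT the heavy-hitters algorithm of the talk, just the standard textbook baseline): each user perturbs their own bit before it ever reaches the server, and the server debiases the aggregate.

```python
import math
import random

# Binary randomized response: each user reports truthfully with probability
# e^eps / (e^eps + 1), so the server never sees raw data.
random.seed(1)
eps = 1.0
keep = math.exp(eps) / (math.exp(eps) + 1)

def randomize(bit):
    """Local randomizer run on the user's device."""
    return bit if random.random() < keep else 1 - bit

# n users, 30% of whom hold the item (bit = 1).
n = 200_000
true_bits = [1 if i < 0.3 * n else 0 for i in range(n)]
reports = [randomize(b) for b in true_bits]

# Server-side debiasing: E[report] = keep*f + (1-keep)*(1-f), solve for f.
observed = sum(reports) / n
estimate = (observed - (1 - keep)) / (2 * keep - 1)
print(f"true frequency 0.300, estimate {estimate:.3f}")
```

The heavy-hitters problem of the talk generalizes this single-bit setting to large item domains, where the challenge addressed by the new algorithms is doing the aggregation with near-optimal error in $\tilde O(n)$ server time and $\tilde O(1)$ user time.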

In this talk, I will try to present some techniques to handle the problem of large deviations of the spectrum of random matrices. I will focus on the case of macroscopic statistics of the spectrum of Hermitian matrices - in particular Wigner matrices - such as the empirical distribution of the eigenvalues, the largest eigenvalue, or the traces of powers.

In the first part, I will be concerned with the so-called "objective method''. Coined by David Aldous, this method consists in introducing, given a sequence of random objects such as random finite graphs, a new infinite random object from which one can deduce asymptotic properties of the original sequence. In the context of random matrices, this method has been mainly advertised by Balint Virag, and has proven effective in showing universality results for the so-called beta-ensembles. Regarding large deviations of random matrices, this "objective method'' amounts to embedding our sequence of matrices of growing size into an appropriate space on which one can understand the large deviations, and carrying out a contraction principle. I will review several large deviations principles obtained by this method, given by interpretations of random matrices as either dense or sparse graphs, and point out the limits of this strategy.

Cryo-EM is an imaging technology that is revolutionizing structural biology; the Nobel Prize in Chemistry 2017 was recently awarded to Jacques Dubochet, Joachim Frank and Richard Henderson "for developing cryo-electron microscopy for the high-resolution structure determination of biomolecules in solution".

Cryo-electron microscopes produce a large number of very noisy two-dimensional projection images of individual frozen molecules. Unlike related methods, such as computed tomography (CT), the viewing direction of each image is unknown. The unknown directions, together with extreme levels of noise and additional technical factors, make the determination of the structure of molecules challenging.

While other methods for structure determination, such as x-ray crystallography and nuclear magnetic resonance (NMR), measure ensembles of molecules together, cryo-EM produces measurements of individual molecules. Therefore, cryo-EM could potentially be used to study mixtures of different conformations of molecules. Indeed, current algorithms have been very successful at analyzing homogeneous samples, and can recover some distinct conformations mixed in solution, but the determination of multiple conformations, and in particular, continuums of similar conformations (continuous heterogeneity), remains one of the open problems in cryo-EM.

I will discuss a one-dimensional discrete model problem, Heterogeneous Multireference Alignment, which captures many of the group properties and other mathematical properties of the cryo-EM problem. I will then discuss different components which we are introducing in order to address the problem of continuous heterogeneity in cryo-EM: 1. "hyper-molecules", the first mathematical formulation of truly continuously heterogeneous molecules, 2. The optimal representation of objects that are highly concentrated in both the spatial domain and the frequency domain using high-dimensional prolate spheroidal functions, and 3. Bayesian algorithms for inverse problems with an unsupervised-learning component for recovering such hyper-molecules in cryo-EM.

Weyl group multiple Dirichlet series are Dirichlet series in r complex variables which initially converge on a cone in C^r, possess analytic continuation to a meromorphic function on the whole complex space, and satisfy functional equations whose action on C^r is isomorphic to the Weyl group of a reduced root system. I will review different constructions of such series and discuss the relations between them.

The talk will discuss informational lower bounds of approximate Nash equilibrium in two complexity models: Query Complexity and Communication Complexity.

For both models we prove an exponential (in the number of players) lower bound on the complexity of computing an ε-Nash equilibrium, for a constant value of the approximation ε.

Consider a real Gaussian stationary process, either on Z or on R.

What is the probability that it remains positive on [0,N] for large N?

The relation between this probability, known as the persistence probability, and the covariance kernel of the process has been investigated since the 1950s with motivations stemming from probability, engineering and mathematical physics. Nonetheless, until recently, good estimates were known only for particular cases, or when the covariance kernel is either non-negative or summable.

In the first hour of the talk we will discuss new spectral methods which greatly simplify the analysis of persistence. We will then describe its qualitative behavior in a very general setting.

In the second hour, we will describe (very) recent progress. In particular we will show the proof of the "spectral gap conjecture'', which states: if the spectral measure vanishes on an interval containing 0, then the persistence is less than e^{-cN^2}, and this bound is tight if the measure is non-singular and compactly supported.

Time permitting, we will also discuss "tiny persistence'' phenomena (of the order of e^{-e^{cN}}).

Based on joint works with Ohad Feldheim, Benjamin Jaye, Fedor Nazarov and Shahaf Nitzan.
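The persistence probability discussed above is easy to probe by Monte Carlo for a simple stationary sequence (a toy experiment, not from the talk). Here we use the moving average X_t = (Z_t + Z_{t+1})/sqrt(2) with iid standard Gaussians Z_t, whose covariance is r(0) = 1, r(±1) = 1/2, and r(k) = 0 otherwise.

```python
import numpy as np

# Monte Carlo estimate of P(X_0 > 0, ..., X_{N-1} > 0) for a Gaussian
# stationary MA(1) sequence with positive one-step correlation.
rng = np.random.default_rng(0)
N, trials = 10, 200_000

Z = rng.standard_normal((trials, N + 1))
X = (Z[:, :-1] + Z[:, 1:]) / np.sqrt(2)   # stationary, r(1) = 1/2

persistence = np.mean(np.all(X > 0, axis=1))
print(f"estimated persistence over {N} steps: {persistence:.4f}")
# For independent Gaussians this probability would be 2^-10 ~ 0.001;
# the positive correlation makes staying positive noticeably more likely.
```

Simulations of this kind show the phenomenon but degrade exponentially in N, which is precisely why the spectral methods of the talk are needed for quantitative estimates at large N.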

Isolating the voice of a specific person while filtering out other voices or background noises is challenging when video is shot in noisy environments, using a single microphone. For example, video conferences from home or office are disturbed by other voices, TV reporting from city streets is mixed with traffic noise, etc. We propose audio-visual methods to isolate the voice of a single speaker and eliminate unrelated sounds. Face motions captured in the video are used to estimate the speaker's voice, which is applied as a filter on the input audio. This approach avoids using mixtures of sounds in the learning process, as the number of such possible mixtures is huge, and would inevitably bias the trained model.

In the first part of this talk, I will describe a few techniques to predict speech signals by a silent video of a speaking person. In the second part of the talk, I will propose a method to separate overlapping speech of several people speaking simultaneously (known as the cocktail-party problem), based on the speech predictions generated by video-to-speech system.

Assume that $(X,f)$ is a dynamical system and $\phi$ is a real non-negative potential such that the corresponding $f$-invariant measure $\mu_\phi$ is infinite. Under assumptions of good inducing schemes, we give conditions under which the pressure of $f$ for a perturbed potential $\phi+s\psi$ relates to the pressure of the induced system.

This extends some of Sarig's results to the setting of infinite "equilibrium states".

In addition, limit properties of the family of measures $\mu_{\phi+s\psi}$ as $s\to 0$ are studied and statistical properties (e.g. correlation coefficients) under the limit measure are derived. I will discuss several examples.

This is based on joint work with H. Bruin and M. Todd.

As a concrete variant of motivic integration, I will discuss uniform p-adic integration and constructive aspects of results involved. Uniformity is in the p-adic fields, and, for large primes p, in the fields F_p((t)) and all their finite field extensions. Using real-valued Haar measures on such fields, one can study integrals, Fourier transforms, etc. We follow a line of research that Jan Denef started in the eighties, with in particular the use of model theory to study various questions related to p-adic integration. A form of uniform p-adic quantifier elimination is used. Using the notion of definable functions, one builds constructively a class of complex-valued functions which one can integrate (w.r.t. any of the variables) without leaving the class. One can also take Fourier transforms in the class. Recent applications in the Langlands program are based on Transfer Principles for uniform p-adic integrals, which allow one to get results for F_p((t)) from results for Q_p, once p is large, and vice versa. These Transfer Principles are obtained via the study of general kinds of loci, some of them being zero loci. More recently, these loci are playing a role in the uniform study of p-adic wave front sets for (uniformly definable) p-adic distributions, a tool often used in real analysis.

This talk covers various joint works with Gordon, Hales, Halupczok, Loeser, and Raibaut, and may mention some work in progress with Aizenbud about WF-holonomicity of these distributions, in relation to a question raised by Aizenbud and Drinfeld.

Covariance matrix estimation is essential in many areas of modern Statistics and Machine Learning, including Graphical Models, Classification/Discriminant Analysis, Principal Component Analysis, and many others. Classical statistics suggests using the Sample Covariance Matrix (SCM), which is the Maximum Likelihood Estimator (MLE) in Gaussian populations. Real-world data, however, usually exhibits heavy-tailed behavior and/or contains outliers, making the SCM non-efficient or even useless. This problem and many similar ones gave rise to the field of Robust Statistics in the early 60s, where the main goal was to develop estimators stable under reasonable deviations from the basic Gaussian assumptions. One of the most prominent robust covariance matrix estimators was introduced and thoroughly studied by D. Tyler in the mid-80s. This important representative of the family of M-estimators can be defined as an MLE of a certain population. The problem of robust covariance estimation becomes even more involved in the high-dimensional scenario, where the number of samples n is of the order of the dimension p, or even less. In such cases, prior knowledge, often referred to as structure, is utilized to decrease the number of degrees of freedom and make the estimation possible. Unlike the Gaussian setting, in Tyler's case even the imposition of linear structure becomes challenging due to the non-convexity of the negative log-likelihood. Recently, Tyler's target function was shown to become convex under a certain change of metric (geodesic convexity), which has stimulated further investigation of the estimator.

In this work, we focus on the so-called group symmetry structure, which essentially means that the true covariance matrix commutes with a group of unitary matrices. In engineering applications such structures appear due to the natural symmetries of the physical processes; examples include circulant, perHermitian, proper quaternion matrices, etc. Group symmetric constraints are linear, and thus convex in the regular Euclidean metric. We show that they are also convex in the geodesic metric. These properties allow us to develop symmetric versions of the SCM and Tyler's estimator and build a general framework for their performance analysis. The classical results claim that at least n = p and n = p+1 samples in general position are necessary to ensure the existence and uniqueness of the SCM and Tyler's estimator, respectively. We significantly improve the sample complexity requirements for both estimators under the symmetry structure and show that in some cases even 1 or 2 samples are enough to guarantee the existence and uniqueness regardless of the ambient dimension.

In this paper we study the adaptive complexity of submodular optimization. Informally, the adaptive complexity of a problem is the minimal number of sequential rounds required to achieve a constant factor approximation when polynomially-many queries can be executed in parallel at each round. Adaptivity is a fundamental concept that is heavily studied in computer science, largely due to the need for parallelizing computation. Somewhat surprisingly, very little is known about adaptivity in submodular optimization. For the canonical problem of maximizing a monotone submodular function under a cardinality constraint, to the best of our knowledge, all that is known to date is that the adaptive complexity is between 1 and Ω(n).

Our main result in this paper is a tight characterization showing that the adaptive complexity of maximizing a monotone submodular function under a cardinality constraint is, up to lower order terms, Θ(log n): we describe an algorithm which requires O(log n) sequential rounds and achieves an approximation that is arbitrarily close to 1/3; we show that no algorithm can achieve an approximation better than O(1/log n) with fewer than O(log n / log log n) rounds. Thus, when allowing for parallelization, our algorithm achieves a constant factor approximation exponentially faster than any known existing algorithm for submodular maximization.

Importantly, the approximation algorithm is achieved via adaptive sampling and complements a recent line of work on optimization of functions learned from data. In many cases, we do not know the functions we optimize and learn them from labeled samples. Recent results show that no algorithm can obtain a constant factor approximation guarantee using polynomially-many labeled samples as in the PAC and PMAC models, drawn from any distribution. Since learning with non-adaptive samples over any distribution results in a sharp impossibility, we consider learning with adaptive samples, where the learner obtains poly(n) samples drawn from a distribution of her choice in every round. Our result implies that in the realizable case, where there is a true underlying function generating the data, Θ(log n) batches of adaptive samples, up to lower order terms, are necessary and sufficient to approximately "learn to optimize" a monotone submodular function under a cardinality constraint.

This is joint work with Yaron Singer.
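For context on the problem above, here is the classical sequential baseline that the adaptive-sampling algorithm improves upon (this is NOT the talk's algorithm): the standard greedy for maximizing a monotone submodular function, shown on coverage, which uses one adaptive round per selected element.

```python
# Classical greedy for monotone submodular maximization under a cardinality
# constraint, on the coverage function f(S) = |union of chosen sets|.
# Greedy achieves a (1 - 1/e)-approximation but is inherently sequential:
# k adaptive rounds, versus the O(log n) rounds of the talk's algorithm.

def greedy_max_cover(sets, k):
    chosen, covered = [], set()
    for _ in range(k):                       # one adaptive round per pick
        best = max(sets, key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:         # no remaining marginal gain
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}, 3: {1, 7}}
chosen, covered = greedy_max_cover(sets, k=2)
print(chosen, sorted(covered))   # picks set 2 then set 0, covering 7 elements
```

Each greedy step depends on all previous choices, which is exactly the adaptivity bottleneck; the talk's algorithm instead evaluates many candidate elements in parallel per round via adaptive sampling, trading a slightly weaker (close to 1/3) guarantee for exponentially fewer rounds.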

Given an unknown D-dimensional quantum state rho, as well as M two-outcome measurements E_1,...,E_M, how many copies of rho do we need, if we want to learn the approximate probability that E_i accepts rho for *every* i? In this talk, I'll prove the surprising result -- I didn't believe it myself at first -- that one can achieve this using a number of copies that's polylogarithmic in both M and D. So, e.g., one can learn whether *every* size-n^3 quantum circuit accepts or rejects an n-qubit state, given only poly(n) copies of the state. To prove this will require first surveying previous results on measuring quantum states and succinctly describing them, including my 2004 postselected learning theorem, and my 2006 "Quantum OR Bound" (with an erroneous proof fixed in 2016 by Harrow, Lin, and Montanaro).

As time permits, I'll also discuss new joint work with Xinyi Chen, Elad Hazan, and Ashwin Nayak, which takes my 2006 result on PAC-learnability of quantum states, and extends to the setting of online learning. Here we show that, given a sequence of T two-outcome measurements on an n-qubit state, even if the sequence is chosen adversarially, one can still learn to predict the outcomes of those measurements with total regret O(n) (in the "realizable" case) or O(sqrt(Tn)) (in the "non-realizable" case).

No quantum computing background will be assumed.

Calabi conjectured that the complex Monge-Ampere equation on compact Kaehler manifolds has a unique solution.

This was solved by Yau in 1978. In this talk, we present a non-archimedean version on projective Berkovich spaces.

In joint work with Burgos, Jell, Kunnemann and Martin, we improve a result of Boucksom, Favre and Jonsson in the equicharacteristic 0 case. We also give a result in positive equicharacteristic using test ideals.

We study non-vanishing gradient-like vector fields $v$ on smooth compact manifolds $X$ with boundary. We call such fields traversing. With the help of a boundary generic field $v$, we divide the boundary $\d X$ of $X$ into two complementary compact manifolds, $\d^+X(v)$ and $\d^-X(v)$. Then we introduce the causality map $C_v: \d^+X(v) \to \d^-X(v)$, a distant relative of the Poincare return map. Let $\mathcal F(v)$ denote the oriented 1-dimensional foliation on $X$ produced by a traversing $v$-flow.

Our main result, the Holography Theorem, claims that, for boundary generic traversing vector fields $v$, the knowledge of the causality map $C_v$ allows for a reconstruction of the pair $(X, \mathcal F(v))$, up to a homeomorphism $\Phi: X \to X$ which is the identity on the boundary $\d X$. In other words, for a massive class of ODEs, we show that the topology of their solutions, satisfying a given boundary value problem, is rigid. We call these results "holographic" since the $(n+1)$-dimensional $X$ and the un-parameterized dynamics of the flow on it are captured by a single correspondence $C_v$ between two $n$-dimensional screens, $\d^+X(v)$ and $\d^-X(v)$.

This holography of traversing flows has numerous applications to the dynamics of general flows. Time permitting, we will discuss some applications of the Holography Theorem to the geodesic flows and the inverse scattering problems on Riemannian manifolds with boundary.

Let Rep(GL(m|n)) denote the category of finite-dimensional algebraic representations of the supergroup GL(m|n). Nowadays the abelian structure (Ext^1 between irreducibles, block description, ...) is well understood. Kazhdan-Lusztig theory gives an algorithmic solution for the character problem, and in special cases even explicit character formulas. However, the monoidal structure (e.g. the decomposition of tensor products into their indecomposable constituents) is hardly understood at all. I will talk about the problem of decomposing tensor products "up to superdimension 0", i.e. about the structure of Rep(GL(m|n))/N where N is the ideal of indecomposable representations of superdimension 0.

When dealing with a highly variable problem such as action recognition, focusing on a small area, such as the hand region, makes the problem more manageable, and enables us to invest the relatively high amount of resources needed for interpretation in a small but highly informative area of the image. In order to detect this region of interest in the image and properly analyze it, I have built a process that includes several steps, starting with a state-of-the-art hand detector that combines detection of the hand by appearance with estimation of the human body pose. The hand detector is built upon a fully convolutional neural network, detecting hands efficiently and accurately. The human body pose estimation starts with a state-of-the-art head detector and continues with a novel approach in which each location in the image votes for the position of each body keypoint, utilizing information from the whole image. Using dense, multi-target votes enables us to compute image-dependent joint keypoint probabilities by looking at consensus voting, and to estimate the body pose accurately. Once the hands are detected, an additional step segments the hand and fingers, labeling each hand pixel in the image using a dense fully convolutional network. Finally, a further step segments and identifies the held object, since understanding the hand-object interaction is an important step toward understanding the action taking place in the image. Together, these steps enable fine interpretation of hand-object interaction images as an essential step towards understanding human-object interaction and recognizing human activities.
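The consensus-voting step can be illustrated with a minimal sketch (the function name and the toy vote format are ours, not the actual implementation): every image location casts a vote for where a keypoint lies, votes are accumulated in a heatmap, and the mode of the heatmap is taken as the consensus estimate.

```python
import numpy as np

def consensus_keypoint(votes, img_shape):
    """Accumulate per-location votes for one body keypoint and
    return the consensus location (the mode of the vote heatmap)."""
    heat = np.zeros(img_shape)
    for r, c in votes:
        if 0 <= r < img_shape[0] and 0 <= c < img_shape[1]:
            heat[int(r), int(c)] += 1.0
    return np.unravel_index(np.argmax(heat), img_shape)
```

Because every location votes, a few wildly wrong votes (from occluded or ambiguous regions) are outvoted by the consensus.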

Recent progress in imaging technologies leads to a continuous growth in biomedical data, which can provide better insight into important clinical and biological questions. Advanced machine learning techniques, such as artificial neural networks, are brought to bear on addressing fundamental medical image computing challenges such as segmentation, classification and reconstruction, required for meaningful analysis of the data. Nevertheless, the main bottleneck, which is the lack of annotated examples or 'ground truth' to be used for training, still remains.

In my talk, I will give a brief overview of some biomedical image analysis problems we aim to address, and suggest how prior information about the problem at hand can be utilized to compensate for insufficient, or even absent, ground-truth data. I will then present a framework based on deep neural networks for the denoising of dynamic contrast-enhanced MRI (DCE-MRI) sequences of the brain. DCE-MRI is an imaging protocol in which MRI scans are acquired repetitively throughout the injection of a contrast agent; it is mainly used for quantitative assessment of blood-brain barrier (BBB) permeability. BBB dysfunction is associated with numerous brain pathologies including stroke, tumor, traumatic brain injury, and epilepsy. Existing techniques for DCE-MRI analysis are error-prone, as the dynamic scans are subject to non-white, spatially-dependent and anisotropic noise. To address these denoising challenges we use an ensemble of expert DNNs constructed as deep autoencoders, where each is trained on a specific subset of the input space to accommodate different noise characteristics and dynamic patterns. Since clean DCE-MRI sequences (ground truth) for training are not available, we present a sampling scheme for generating realistic training sets with nonlinear dynamics that faithfully model clean DCE-MRI data and account for spatial similarities. The proposed approach has been successfully applied to full and even temporally down-sampled DCE-MRI sequences, from two different databases, of stroke and brain tumor patients, and is shown to compare favorably to state-of-the-art denoising methods.
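The expert-ensemble idea can be sketched in a drastically simplified form: linear least-squares "experts" in place of deep autoencoders, and a trivial intensity-based gate in place of a learned partition of the input space. All names and the gating rule are illustrative, not the method from the talk.

```python
import numpy as np

def train_experts(noisy, clean, threshold):
    """Split training patches by mean intensity and fit one linear
    denoiser (least squares mapping noisy -> clean) per subset."""
    experts = {}
    for label, mask in [("low", noisy.mean(1) <= threshold),
                        ("high", noisy.mean(1) > threshold)]:
        W, *_ = np.linalg.lstsq(noisy[mask], clean[mask], rcond=None)
        experts[label] = W
    return experts

def denoise(x, experts, threshold):
    """Route each input to the expert trained on its region of input space."""
    label = "low" if x.mean() <= threshold else "high"
    return x @ experts[label]
```

Each expert only has to model the noise characteristics of its own subset, which is the motivation for the ensemble in the full deep-network version.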

We consider the problem of hidden common manifold extraction from multiple data sets, which have observation-specific distortions and artifacts. A new manifold learning method is presented based on alternating products of diffusion operators and local kernels. We provide theoretical analysis showing that our method is able to build a variant of the Laplacian of the hidden common manifold, while suppressing the observation-specific artifacts. The generality of this method is demonstrated in data analysis applications, where different types of devices are used to measure the same activity. In particular, we present applications to problems in biomedicine, neuroscience, and audio analysis.

This is joint work with Roy Lederman and Hau-tieng Wu.
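A minimal numpy sketch of the alternating-products construction (Gaussian kernels only; the local kernels and further normalizations of the actual method are omitted, and all names are illustrative): build a row-stochastic diffusion operator per sensor and alternate them; the leading eigenvectors of the product embed the common variable while sensor-specific artifacts are suppressed.

```python
import numpy as np

def diffusion_operator(X, eps):
    """Row-stochastic Gaussian diffusion operator on the samples of X."""
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-D2 / eps)
    return K / K.sum(axis=1, keepdims=True)

def alternating_diffusion(X1, X2, eps=1.0):
    """Alternate the two sensors' diffusion steps; eigenvectors of the
    product capture the hidden common manifold."""
    P = diffusion_operator(X1, eps) @ diffusion_operator(X2, eps)
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    return P, vecs[:, order].real
```

X1 and X2 are two simultaneous observations of the same N samples by different devices; the product of the two row-stochastic operators is again row-stochastic, so its spectrum behaves like a diffusion on the common variable.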

We prove the first super-logarithmic lower bounds on the cell probe complexity of dynamic *boolean* (a.k.a. decision) data structure problems, a long-standing milestone in data structure lower bounds. We introduce a new technique and use it to prove a ~ log^{1.5}(n) lower bound on the operational time of a wide range of boolean data structure problems, most notably, on the query time of dynamic range counting *over F_2* ([Patrascu07]). Proving a super-logarithmic lower bound for this problem was explicitly posed as one of five important open problems in the late Mihai Patrascu's obituary [Tho13]. This result also implies the first super-logarithmic lower bound for the classical 2D range counting problem, one of the most fundamental data structure problems in computational geometry and spatial databases. We derive similar lower bounds for boolean versions of dynamic polynomial evaluation and 2D "rectangle stabbing", and for the (non-boolean) problems of range selection and range median. Our technical centerpiece is a new way of "weakly" simulating dynamic data structures using efficient *one-way* communication protocols with small advantage over random guessing. This simulation involves a surprising excursion to low-degree (Chebyshev) polynomials which may be of independent interest, and offers an entirely new algorithmic angle on the "cell sampling" method of Panigrahy et al. [PTW10].

Joint work with Kasper Green-Larsen and Huacheng Yu.

We study the classic bipartite matching problem in the online setting, first introduced in the seminal work of Karp, Vazirani and Vazirani. Specifically, we consider the problem for the well-studied class of regular graphs. Matching in this class of graphs was studied extensively in the offline setting. In the online setting, an optimal deterministic algorithm, as well as efficient algorithms under stochastic input assumptions were known. In this work, we present a novel randomized algorithm with competitive ratio tending to one on this family of graphs, under adversarial arrival order. Our main contribution is a novel algorithm which achieves competitive ratio 1-O(\sqrt{\log d}/\sqrt{d}) in expectation on d-regular graphs. In contrast, we show that all previously-known online algorithms, such as the generally worst-case optimal ranking algorithm of Karp et al., are restricted to a competitive ratio strictly bounded away from one, even as d grows. Moreover, we show the convergence rate of our algorithm's competitive ratio to one is nearly tight, as no algorithm achieves competitive ratio better than 1-O(1/\sqrt{d}). Finally, we show that our algorithm yields a similar competitive ratio with high probability, as well as guaranteeing each offline vertex a probability of being matched tending to one.
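For concreteness, here is a sketch of the classic ranking algorithm of Karp, Vazirani and Vazirani mentioned above (this is the baseline our new algorithm improves on for regular graphs, not the new algorithm itself; the input format and names are illustrative): offline vertices are ranked once by a random permutation, and each arriving online vertex is matched to its highest-ranked free neighbor.

```python
import random

def kvv_ranking(offline, arrivals, seed=0):
    """arrivals: list of (online_vertex, offline_neighbors) pairs,
    in (possibly adversarial) arrival order."""
    rng = random.Random(seed)
    perm = list(offline)
    rng.shuffle(perm)                       # one random ranking, fixed upfront
    rank = {v: i for i, v in enumerate(perm)}
    match = {}                              # offline vertex -> online vertex
    for u, nbrs in arrivals:
        free = [v for v in nbrs if v not in match]
        if free:
            match[min(free, key=rank.get)] = u   # best-ranked free neighbor
    return match
```

Ranking achieves competitive ratio 1 - 1/e on general graphs; the point of the abstract above is that on d-regular graphs one can do much better, with a ratio tending to one as d grows.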

The recent success of convolutional neural networks (CNNs) for image processing tasks is inspiring research efforts attempting to achieve similar success for geometric tasks. One of the main challenges in applying CNNs to surfaces is defining a natural convolution operator on surfaces. In this paper we present a method for applying deep learning to sphere-type shapes using a global seamless parameterization to a planar flat-torus, for which the convolution operator is well defined. As a result, the standard deep learning framework can be readily applied for learning semantic, high-level properties of the shape. An indication of our success in bridging the gap between images and surfaces is the fact that our algorithm succeeds in learning semantic information from an input of raw low-dimensional feature vectors.

We demonstrate the usefulness of our approach by presenting two applications: human body segmentation, and automatic landmark detection on anatomical surfaces. We show that our algorithm compares favorably with competing geometric deep-learning algorithms for segmentation tasks, and is able to produce meaningful correspondences on anatomical surfaces where hand-crafted features are bound to fail.

Joint work with: Meirav Galun, Noam Aigerman, Miri Trope, Nadav Dym, Ersin Yumer, Vladimir G. Kim and Yaron Lipman.

The past five years have seen a dramatic increase in the performance of recognition systems due to the introduction of deep architectures for feature learning and classification. However, the mathematical reasons for this success remain elusive. In this talk we will briefly survey some existing theory of deep learning. In particular, we will focus on data structure based theory and discuss two recent developments.

The first work studies the generalization error of deep neural networks. We will show how the generalization error of deep networks can be bounded via their classification margin. We will also discuss the implications of our results for the regularization of the networks. For example, the popular weight decay regularization guarantees margin preservation, but leads to a loose bound on the classification margin. We show that a better regularization strategy can be obtained by directly controlling the properties of the network's Jacobian matrix.
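The Jacobian-based idea can be illustrated with a small finite-difference sketch (a toy illustration, not the regularizer from the paper; the function name is ours): penalize the squared Frobenius norm of the network's input-output Jacobian, which directly controls how much the output can move under small input perturbations.

```python
import numpy as np

def jacobian_frobenius_penalty(f, x, eps=1e-5):
    """Finite-difference estimate of ||J_f(x)||_F^2, usable as a
    regularization term that controls the network's local sensitivity."""
    fx = f(x)
    total = 0.0
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        total += float(np.sum(((f(x + e) - fx) / eps) ** 2))
    return total
```

In a real training loop this penalty (or an automatic-differentiation version of it) would be added to the classification loss.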

The second work focuses on solving minimization problems with neural networks. Relying on recent recovery techniques developed for settings in which the desired signal belongs to some low-dimensional set, we show that using a coarse estimate of this set leads to faster convergence of certain iterative algorithms with an error related to the accuracy of the set approximation. Our theory ties to recent advances in sparse recovery, compressed sensing and deep learning. In particular, it provides an explanation for the successful approximation of the ISTA (iterative shrinkage and thresholding algorithm) solution by neural networks with layers representing iterations.
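For reference, a minimal implementation of ISTA in its standard textbook form (this is the iterative algorithm that the neural-network layers approximate, not the network itself):

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the l1 norm (shrinkage)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(A, y, lam, n_iter=200):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by iterative
    shrinkage and thresholding with step size 1/L."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - A.T @ (A @ x - y) / L, lam / L)
    return x
```

Each iteration is a gradient step on the data-fidelity term followed by shrinkage, which is exactly the layer structure (linear map plus pointwise nonlinearity) that makes the neural approximation natural.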

Joint work with Guillermo Sapiro, Miguel Rodrigues, Jure Sokolic, Alex Bronstein and Yonina Eldar.

Computational social choice deals with algorithms for aggregating individual preferences or opinions towards collective decisions. AI researchers (including myself) have long argued that such algorithms could play a crucial role in the design and implementation of multiagent systems. However, in the last few years I have come to realize that the "killer app" of computational social choice is helping people -- not software agents -- make joint decisions. I will illustrate this theme through two recent endeavors: Spliddit.org, a website that offers provably fair solutions to everyday problems; and Robovote.org, which provides optimization-driven voting methods.

Throughout the talk, I will devote special attention to the theoretical foundations and results that make these services possible.

The theory of hypergeometric functions with matrix argument was developed in the 1950s by S. Bochner for Hermitian matrices, and by C. Herz for symmetric matrices. This theory admits a common generalization to the setting of symmetric cones, which is discussed in the book by Faraut-Koranyi. It also has applications to the study of non-central distributions in statistics and to the theory of random matrices.

In the 1980s, I.G. Macdonald introduced a one parameter family of multivariate hypergeometric functions, which, for special values of the parameter, are the *radial* parts of the matrix hypergeometric functions. He also formulated a number of natural conjectures about these functions, which in the matrix case can be proved by appropriate integral formulas. However this technique is unavailable in the general setting and as a result these conjectures have remained open.
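In one common normalization (following, e.g., the Koev-Edelman conventions; notational details vary between authors), Macdonald's one-parameter family can be written as a series over partitions $\kappa$ in Jack polynomials $C_\kappa^{(\alpha)}$ with generalized Pochhammer symbols $(a)_\kappa^{(\alpha)}$:

```latex
{}_pF_q^{(\alpha)}(a_1,\dots,a_p;\, b_1,\dots,b_q;\, x)
  \;=\; \sum_{k=0}^{\infty}\; \sum_{\kappa \vdash k}
  \frac{(a_1)_\kappa^{(\alpha)} \cdots (a_p)_\kappa^{(\alpha)}}
       {(b_1)_\kappa^{(\alpha)} \cdots (b_q)_\kappa^{(\alpha)}}
  \cdot \frac{C_\kappa^{(\alpha)}(x)}{k!}
```

For the special parameter values $\alpha = 2, 1, 1/2$ this recovers the radial parts of the matrix hypergeometric functions over the reals, complexes and quaternions, respectively.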

In recent work with G. Olafsson we have solved most of these conjectures, using ideas from the theory of Cherednik algebras and Jack polynomials. Among other results we obtain sharp estimates for the exponential kernel that allow us to establish a rigorous theory of the Fourier and Laplace transforms, and we prove an explicit formula for the Laplace transform of a Jack polynomial conjectured by Macdonald. This opens the door for several future developments in the associated harmonic analysis, some of which we also treat. This includes (1) the Paley-Wiener theorem, (2) Laplace transform identities for hypergeometric functions, and (3) the so-called Ramanujan master theorem.

**Note the unusual room: De Picciotto Building, Room 25**

We show that there exist binary locally testable codes (for all rates) and locally correctable codes (for low rates) with rate and distance approaching the Gilbert-Varshamov bound (which is the best rate-distance tradeoff known for general binary error-correcting codes). Our constructions use a number of ingredients: Thommesen's random concatenation technique, the Guruswami-Sudan-Indyk strategy for list-decoding concatenated codes, the Alon-Edmonds-Luby distance amplification method, and the local list-decodability and local testability of Reed-Muller codes. Interestingly, this seems to be the first time that local testability is used in the construction of locally correctable codes.

Joint work with Sivakanth Gopi, Rafael Oliveira, Noga Ron-Zewi and Shubhangi Saraf.

This is a joint work with Bezrukavnikov and Kazhdan. The goal of my talk is to give an explicit formula for the Bernstein projector to representations of depth $\leq r$. As a consequence, we show that the depth zero Bernstein projector is supported on topologically unipotent elements and it is equal to the restriction of the character of the Steinberg representation. As another application, we deduce that the depth $r$ Bernstein projector is stable. Moreover, for integral depths our proof is purely local.

Image denoising is the most fundamental problem in image enhancement, and it is largely solved: it has reached impressive heights in performance and quality -- almost as good as it can ever get. But interestingly, it turns out that we can solve many other problems using the image denoising "engine". I will describe the Regularization by Denoising (RED) framework: using the denoising engine to define the regularization of any inverse problem. The idea is to define an explicit image-adaptive regularization functional directly using a high performance denoiser. Surprisingly, the resulting regularizer is guaranteed to be convex, and the overall objective functional is explicit, clear and well-defined. With complete flexibility to choose the iterative optimization procedure for minimizing this functional, RED is capable of incorporating any image denoising algorithm as a regularizer, treating general inverse problems very effectively, and is guaranteed to converge to the globally optimal result.

* Joint work with Peyman Milanfar (Google Research) and Yaniv Romano (EE-Technion).
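A 1-D toy sketch of the RED idea (the box-filter "denoising engine", the parameter values and the names are illustrative; real RED uses a high-performance denoiser and any of several optimization schemes): the regularizer is rho(x) = 0.5 * x^T (x - f(x)), and under the RED conditions its gradient is simply x - f(x), so plain gradient descent on the regularized objective is straightforward.

```python
import numpy as np

def box_denoiser(x):
    """Toy denoising engine: 3-tap moving average with edge padding."""
    return np.convolve(np.pad(x, 1, mode="edge"), np.ones(3) / 3, mode="valid")

def red_denoise(y, denoiser, lam=0.5, step=0.2, n_iter=100):
    """Gradient descent on 0.5*||x - y||^2 + lam * 0.5 * x^T (x - f(x)).
    Under the RED conditions the regularizer's gradient is lam*(x - f(x))."""
    x = y.copy()
    for _ in range(n_iter):
        x -= step * ((x - y) + lam * (x - denoiser(x)))
    return x
```

Replacing the identity data-fidelity term with 0.5*||H x - y||^2 for a forward operator H turns the same loop into a solver for general inverse problems (deblurring, super-resolution, etc.).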

We propose a procedure (the first of its kind) for computing a fully data-dependent interval that traps the mixing time t_mix of a finite reversible ergodic Markov chain at a prescribed confidence level. The interval is computed from a single finite-length sample path from the Markov chain, and does not require the knowledge of any parameters of the chain. This stands in contrast to previous approaches, which either only provide point estimates, or require a reset mechanism, or additional prior knowledge.

The interval is constructed around the relaxation time t_relax, which is strongly related to the mixing time, and the width of the interval converges to zero at a rate of roughly 1/sqrt{n}, where n is the length of the sample path. Upper and lower bounds are given on the number of samples required to achieve constant-factor multiplicative accuracy. The lower bounds indicate that, unless further restrictions are placed on the chain, no procedure can achieve this accuracy level before seeing each state at least \Omega(t_relax) times on average. Future directions of research are identified. Time permitting, we will mention some recent further developments by D. Levin and Y. Peres.

Joint work with Daniel Hsu and Csaba Szepesvari.
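A toy point estimate of t_relax from a single sample path can be sketched as follows (this is not the interval construction from the talk, which comes with confidence guarantees; the symmetrization step and names are illustrative): form the empirical transition matrix, symmetrize it with respect to the empirical stationary weights (valid for reversible chains), and invert the absolute spectral gap.

```python
import numpy as np

def relaxation_time_estimate(path, n_states):
    """Plug-in estimate of t_relax = 1 / (absolute spectral gap)
    from a single sample path of a reversible ergodic chain."""
    C = np.zeros((n_states, n_states))
    for a, b in zip(path[:-1], path[1:]):
        C[a, b] += 1.0                       # empirical transition counts
    P = C / np.maximum(C.sum(axis=1, keepdims=True), 1.0)
    pi = np.maximum(C.sum(axis=1), 1.0)
    pi /= pi.sum()                           # empirical stationary weights
    S = np.sqrt(pi)[:, None] * P / np.sqrt(pi)[None, :]
    w = np.linalg.eigvalsh((S + S.T) / 2)    # sorted ascending
    gap = 1.0 - max(abs(w[0]), abs(w[-2]))   # absolute spectral gap
    return 1.0 / gap
```

The talk's contribution is precisely to wrap such a point estimate in a fully data-dependent confidence interval, which this sketch does not attempt.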

Graphs that are prevalent in current applications (the Internet, Facebook etc.) are not only very large and highly dynamic, but also distributed between many servers, none of which sees the graph in its entirety. The distributed monitoring problem deals with the question of imposing conditions on the local graphs, such that as long as they hold, it is guaranteed that some desired property holds for the global graph.

While defining local conditions for linear properties (e.g. average degree) is relatively easy, they are more difficult to derive for non-linear functions over the graph. We propose a solution and a general definition of solution optimality, and demonstrate how to apply it to two important graph properties -- spectral gap and number of triangles. We also define an absolute lower bound on the communication overhead for distributed monitoring, and compare our algorithm to it, with good results. Performance improves as the graph becomes larger and denser -- that is, when distributing it is more important.
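The "linear properties are easy" point can be made concrete with a minimal sketch (names and the synchronization fallback are illustrative): for a linear property such as the average degree, a global threshold condition decomposes exactly into per-server local conditions, so each server can monitor its own share and only trigger communication on a local violation.

```python
def global_average_safe(local_sums, local_counts, tau):
    """Local condition for a linear global property: if every server's
    local average is <= tau, then the global (weighted) average is <= tau,
    so no synchronization is needed."""
    for s, c in zip(local_sums, local_counts):
        if c and s / c > tau:
            return False   # local violation -> trigger a global recount
    return True
```

No such exact decomposition exists for non-linear functions like the spectral gap or the triangle count, which is where the techniques of the talk come in.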

I will talk about my recent adventures with ants. Together with biologists we study P. longicornis ants as they collaboratively transport a large food item to their nest. This collective navigation process is guided by pheromones which are laid by individual ants. Using a new methodology to detect scent marks, we identify a new kind of ant trail characterized by very short and dynamic pheromone markings and highly stochastic navigation response to them. We argue that such a trail can be highly beneficial in conditions in which knowledge of individual ants regarding the underlying topological structure is unreliable. This gives rise to a new theoretical search model on graphs under unreliable guiding instructions, which is of independent computational interest. To illustrate the model, imagine driving a car in an unknown cou