Foundations of Computer Science Seminar
We prove a few new lower bounds on the randomized competitive ratio for the k-server problem and other related problems, resolving some long-standing conjectures. In particular, for metrical task systems (MTS) we asymptotically settle the competitive ratio and obtain the first improvement to an existential lower bound since the introduction of the model 35 years ago (in 1987).
More concretely, we show:
1. There exist (k+1)-point metric spaces in which the randomized competitive ratio for the k-server problem is Omega(log^2 k). This refutes the folklore conjecture (which is known to hold in some families of metrics) that in all metric spaces with at least k+1 points, the competitive ratio is Theta(log k).
2. Consequently, there exist n-point metric spaces in which the randomized competitive ratio for MTS is Omega(log^2 n). This matches the upper bound that holds for all metrics. The previously best existential lower bound was Omega(log n) (which was known to be tight for some families of metrics).
3. For all k < n \in \mathbb{N}, for *all* n-point metric spaces the randomized k-server competitive ratio is at least Omega(log k), and consequently the randomized MTS competitive ratio is at least Omega(log n). These universal lower bounds are asymptotically tight. The previous bounds were Omega(log k / loglog k) and Omega(log n / loglog n), respectively.
4. The randomized competitive ratio for the w-set metrical service systems problem, and its equivalent width-w layered graph traversal problem, is Omega(w^2). This slightly improves the previous lower bound and matches the recently discovered upper bound.
5. Our results imply improved lower bounds for other problems like k-taxi, distributed paging and metric allocation.
These lower bounds share a common thread and, except for the third one, also a common construction.
Joint work with Sebastien Bubeck and Christian Coester.
The edit distance, also known as Levenshtein distance, is the minimum number of character insertions, deletions, and substitutions needed to transform one string into another. The textbook dynamic programming algorithm computes the edit distance of two length-n strings in O(n²) time, which is optimal up to sub-polynomial factors and conditioned on the Strong Exponential Time Hypothesis (SETH). In a bounded setting, where the running time is parameterized by the edit distance k, the celebrated algorithm by Landau and Vishkin (JCSS’88) achieves a running time of O(n+k²), which is optimal as a function of n and k (again, up to sub-polynomial factors and conditioned on SETH). While the theory community thoroughly studied the Levenshtein distance, most practical applications rely on a more general weighted edit distance, where each edit has a weight depending on its type and the involved characters from the alphabet Σ. This is formalized through a weight function w : Σ∪{ε} × Σ∪{ε} → ℝ normalized so that w(a,a) = 0 for a∊Σ∪{ε} and w(a,b) ≥ 1 for a,b∊Σ∪{ε} with a ≠ b. The classic O(n²)-time algorithm supports this setting seamlessly, but for many decades only a straightforward O(nk)-time solution was known for the bounded version of the weighted edit distance problem. In this talk, I will present an O(n+k⁵)-time algorithm (joint work with Das, Gilbert, Hajiaghayi, and Saha; STOC'23; arXiv:2302.04229) and a very recent Õ(n+√{nk³})-time algorithm (joint work with Cassis and Wellnitz). I will also sketch a lower bound that proves the optimality of the latter algorithm for √n ≤ k ≤ n (up to sub-polynomial factors and conditioned on the All-Pairs Shortest Paths Hypothesis).
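For reference, the textbook quadratic-time dynamic program mentioned above handles an arbitrary weight function w with essentially no changes; below is a minimal sketch (the function name and the encoding of the empty symbol ε as None are illustrative choices, not notation from the talk).

```python
# Minimal sketch of the textbook O(n^2)-time dynamic program for weighted
# edit distance. w(a, b) is the substitution cost, w(a, None) the deletion
# cost, and w(None, b) the insertion cost.

def weighted_edit_distance(s, t, w):
    n, m = len(s), len(t)
    # dist[i][j] = cheapest way to transform s[:i] into t[:j]
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + w(s[i - 1], None)      # delete s[i-1]
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + w(None, t[j - 1])      # insert t[j-1]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j] + w(s[i - 1], None),          # deletion
                dist[i][j - 1] + w(None, t[j - 1]),          # insertion
                dist[i - 1][j - 1] + w(s[i - 1], t[j - 1]),  # substitution / match
            )
    return dist[n][m]

# Unit weights recover the classic Levenshtein distance.
unit = lambda a, b: 0 if a == b else 1
assert weighted_edit_distance("kitten", "sitting", unit) == 3
```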
We consider the numerical taxonomy problem of fitting an SxS distance matrix D with a tree metric T. Here T is a weighted tree spanning S where the path lengths in T induce a metric on S. If there is a tree metric matching D exactly, then it is easily found. If there is no exact match, then for some k, we want to minimize the L_k norm of the errors, that is, pick T so as to minimize ||D-T||_k = (Sum_{i,j in S} |D(i,j)-T(i,j)|^k) ^{1/k}.
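To make the objective concrete, the following minimal sketch evaluates ||D-T||_k for a candidate weighted tree T spanning S (the adjacency-dictionary representation and the helper names are illustrative choices, not part of the results discussed in the talk).

```python
# Sketch: given a distance matrix D on S and a weighted tree T spanning S,
# compute the L_k error ||D - T||_k = (sum_{i,j} |D(i,j) - T(i,j)|^k)^(1/k).

def tree_distances(adj, source):
    """Path lengths from `source` in a weighted tree given as {u: [(v, w), ...]}."""
    dist, stack = {source: 0.0}, [source]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + w
                stack.append(v)
    return dist

def lk_error(D, adj, k):
    nodes = sorted(adj)
    total = 0.0
    for i in nodes:
        ti = tree_distances(adj, i)
        for j in nodes:
            total += abs(D[i][j] - ti[j]) ** k
    return total ** (1.0 / k)

# Tiny example: a unit-weight path a-b-c fits this D exactly (error 0).
adj = {'a': [('b', 1)], 'b': [('a', 1), ('c', 1)], 'c': [('b', 1)]}
D = {'a': {'a': 0, 'b': 1, 'c': 2},
     'b': {'a': 1, 'b': 0, 'c': 1},
     'c': {'a': 2, 'b': 1, 'c': 0}}
print(lk_error(D, adj, k=2))  # -> 0.0
```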
This problem was raised in biology in the 1960s for k = 1,2. The biological interpretation is that T represents a possible evolution behind the species in S matching some measured distances in D. Sometimes, it is required that T is an ultrametric, meaning that all species are at the same distance from the root.
An evolutionary tree induces a hierarchical classification of species and this is not just tied to biology. Medicine, ecology and linguistics are just some of the fields where this concept appears, and it is even an integral part of machine learning and data science. Fundamentally, if we can approximate distances with a tree, then they are much easier to reason about: many questions that are NP-hard for general metrics can be answered in linear time on tree metrics. In fact, humans have appreciated hierarchical classifications at least since Plato and Aristotle (350 BC).
The numerical taxonomy problem is important in practice and many heuristics have been proposed. In this talk we will review the basic algorithmic theory, results, and techniques for the problem, including the most recent result from FOCS'21 [Vincent Cohen-Addad et al., 2021]. These results paint a varied landscape with big differences between different moments, and with some very nice open problems remaining.
- At STOC'93, Farach, Kannan, and Warnow [Martin Farach et al., 1995] proved that under L_\infty, we can find the optimal ultrametric. Almost all other variants of the problem are APX-hard.
- At SODA'96, Agarwala, Bafna, Farach, Paterson, and Thorup [Richa Agarwala et al., 1999] showed that for any norm L_k, k >= 1, if the best ultrametric can be \alpha-approximated, then the best tree metric can be 3\alpha-approximated. In particular, this implied a 3-approximation for tree metrics under L_\infty.
- At FOCS'05, Ailon and Charikar [Nir Ailon and Moses Charikar, 2011] showed that for any L_k, k >= 1, we can get an approximation factor of O(((log n)(log log n))^{1/k}) for both tree and ultrametrics. Their paper was focused on the L_1 norm, and they wrote "Determining whether an O(1) approximation can be obtained is a fascinating question".
- At FOCS'21, Cohen-Addad, Das, Kipouridis, Parotsidis, and Thorup [Vincent Cohen-Addad et al., 2021] showed that indeed a constant factor is possible for L_1 for both tree and ultrametrics. This uses the special structure of L_1 in relation to hierarchies.
- The status of L_k is wide open for 1 < k < \infty. All we know is that the approximation factor is between Omega(1) and O((log n)(log log n)).
One thing that distinguishes (theoretical) computer science from other scientific disciplines is its full-throated support of a fundamentally adversarial view of the universe. Malicious adversaries, with unbounded computational advantages, attempt to foil our algorithms at every turn and destroy their quantitative guarantees. However, there is one strange exception to this world view and it is this: the algorithm must accept its input as sacrosanct, and may never simply reject its input as illegitimate. But what if some inputs really are illegitimate? Is building a "bullshit detector" for algorithms a good idea?
To illustrate the power of the Bullshit Detection worldview, we give the first polynomial-time protocol for Byzantine Agreement that is resilient to f<n/3 corruptions against an omniscient, computationally unbounded adversary in an asynchronous message-passing model. (This is the first improvement to Ben-Or and Bracha's exponential time protocols from the 1980s that are resilient to f<n/3 corruptions.) The key problem is to design a coin-flipping protocol in which corrupted parties (who choose the outcomes of their coins maliciously) are eventually detected via statistical tests.
We will also discuss other algorithmic contexts in which bullshit detection might be useful.
Linear Programming is the backbone of algorithm design and combinatorial optimization. The fastest known algorithms for solving linear programs, both in theory and practice, are based on Interior-Point Methods: the core idea behind IPMs is to iteratively use Newton approximation to reduce a linear program to a sequence of ~\sqrt{n} linear systems. Whether this number of iterations can be improved is one of the major open problems in convex optimization. A long line of work has shown that \sqrt{n} is indeed optimal for the special class of *self-concordant* barrier IPMs [Nesterov-Nemirovski94], but for general IPMs very little is known.
We propose a purely information-theoretic query model for studying the rate of convergence of IPMs, via linear-system queries: each iteration of the algorithm can adaptively specify an arbitrary diagonal matrix H (an ellipsoid) and a vector v, and the oracle returns the least-squares minimizer of the linear system, \arg\min_x ||Ax - v||_H. We show this model captures all known (deterministic) IPMs. Our main result is an \Omega(\sqrt{n}) lower bound on the iterations of any deterministic algorithm in this model, albeit for an (exponentially) ill-conditioned LP. In this talk I will describe this progress, assuming no prior knowledge of IPMs.
Based on joint work with Shunhua Jiang, Rasmus Kyng and Ming Ding.
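For concreteness, a single query in the linear-system model above can be sketched as follows: the oracle solves the normal equations (A^T H A) x = A^T H v. The use of numpy and the helper name are illustrative choices, not part of the lower-bound construction.

```python
# Sketch of the linear-system oracle: given the diagonal of a weight matrix H
# (an ellipsoid) and a vector v, return argmin_x ||Ax - v||_H.
import numpy as np

def linear_system_oracle(A, h_diag, v):
    """h_diag: the diagonal of H (nonnegative weights)."""
    AtH = A.T * h_diag                      # A^T H via broadcasting over columns
    x, *_ = np.linalg.lstsq(AtH @ A, AtH @ v, rcond=None)
    return x

# A deterministic IPM in this model issues ~sqrt(n) such adaptive queries,
# which is exactly the quantity the lower bound addresses.
A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
v = np.array([1.0, 2.0, 2.0])
print(linear_system_oracle(A, np.array([1.0, 1.0, 1.0]), v))
```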
We present a randomized algorithm that computes single-source shortest paths (SSSP) in O(m\log^8(n)\log W) time when edge weights are integral, can be negative, and are >= -W. This essentially resolves the classic negative-weight SSSP problem. The previous bounds are ~O((m+n^{1.5})\log W) and m^{4/3+o(1)}\log W. Near-linear time algorithms were known previously only for the special case of planar directed graphs. Also, independently of our work, the recent breakthrough on min-cost flow [Chen, Kyng, Liu, Peng, Probst-Gutenberg and Sachdeva] implies an algorithm for negative-weight SSSP with running time m^{1+o(1)}.
In contrast to all recent developments that rely on sophisticated continuous optimization methods and dynamic algorithms, our algorithm is simple: it requires only a simple graph decomposition and elementary combinatorial tools. In fact, ours is the first combinatorial algorithm for negative-weight SSSP to break through the classic O(m\sqrt{n}\log nW) bound from over three decades ago [Gabow and Tarjan SICOMP'89].
In the coin problem (also known as bias amplification) we are asked to distinguish, with probability at least 2/3, between n i.i.d. coins which are heads with probability (1/2+\beta) from ones which are heads with probability (1/2-\beta). We are interested in the space complexity of the coin problem, corresponding to the width of a read-once branching program solving the problem.
The coin problem becomes more difficult as \beta becomes smaller.
Statistically, it can be solved whenever \beta = \Omega(n^{-1/2}), using counting.
It has been previously shown that for \beta = O(n^{-1/2}), counting is essentially optimal (equivalently, width poly(n) is necessary [Braverman-Garg-Woodruff FOCS'20]).
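For intuition, the counting test just sums the coins and compares the total to n/2; the toy simulation below (an illustrative sketch, not one of the branching-program constructions from this work) shows it succeeding once \beta is around n^{-1/2} and degrading well below that threshold.

```python
# Illustrative counting test for the coin problem: as a read-once branching
# program it keeps a counter (width ~n), and it distinguishes bias +beta from
# -beta with probability >= 2/3 once beta is on the order of n^{-1/2}.
import random

def counting_test(n, beta, trials=1000, seed=0):
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        sign = rng.choice([+1, -1])              # nature picks which coin it is
        p = 0.5 + sign * beta
        heads = sum(rng.random() < p for _ in range(n))
        correct += ((+1 if heads > n / 2 else -1) == sign)
    return correct / trials

n = 2500
print(counting_test(n, beta=2 * n ** -0.5))    # well above 2/3
print(counting_test(n, beta=0.1 * n ** -0.5))  # barely better than guessing
```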
On the other hand, the coin problem only requires O(\log n) width for \beta > n^{-c}, for any constant c < \log_2(\sqrt{5}-1) \approx 0.306 (following the low-width simulation of the AND-OR tree of [Valiant JAlg'84]).
In this work, we close the gap between the bounds, showing a tight threshold between the values of \beta=n^{-c} where O(\log n) width suffices and the regime where poly(n) width is needed, with a transition at c=1/3.
This gives a complete characterization (up to constant factors) of the memory complexity of solving the coin problem, for all values of bias \beta.
We introduce new techniques in both bounds. For the upper bound, we give a construction based on recursive majority that does not require a memory stack of size \log n bits.
For the lower bound, we introduce new combinatorial techniques for analyzing progression of the success probabilities in read-once branching programs.
The talk is based on joint work with Mark Braverman and Sumegha Garg.
Reed-Muller codes are an important family of error correcting codes that plays a crucial role in a number of results in theoretical computer science. For the field of size $q$ and degree $d$, it was proved [Bhattacharyya et al., Haramaty et al.] that the corresponding Reed-Muller code is testable with $C_q * q^{d}/\eps$ queries. We give a new, more natural proof of this result that achieves a better dependency of C_q on the field size q -- from a tower-type bound to a polynomial bound. Our proof applies more generally to the setting of affine lifted codes, and uses the notion of global hypercontractivity, a recent generalization of classical hypercontractive inequalities.
In this talk, we will give a gentle introduction to the concept of global hypercontractivity, and explain its relation to the problem of optimal testing of Reed-Muller codes.
Based on a joint work with Tali Kaufman.
For an $n$-vertex digraph $G=(V,E)$, a *shortcut set* is a (small) subset of edges $H$ taken from the transitive closure of $G$ that, when added to $G$, guarantees that the diameter of $G \cup H$ is small. Shortcut sets, introduced by Thorup in 1993, have a wide range of applications in algorithm design, especially in the context of parallel, distributed and dynamic computation on directed graphs. A folklore result in this context shows that every $n$-vertex digraph admits a shortcut set of linear size (i.e., of $O(n)$ edges) that reduces the diameter to $\widetilde{O}(\sqrt{n})$. Despite extensive research over the years, the question of whether one can reduce the diameter to $o(\sqrt{n})$ with $\widetilde{O}(n)$ shortcut edges has been left open.
In this talk, I will present the first improved diameter-sparsity tradeoff for this problem, breaking the $\sqrt{n}$ diameter barrier. Specifically, we show an $O(n^{\omega})$-time randomized algorithm for computing a linear shortcut set that reduces the diameter of the digraph to $\widetilde{O}(n^{1/3})$. We also extend our algorithms to provide improved $(\beta,\epsilon)$ hopsets for $n$-vertex weighted directed graphs.
Joint work with Shimon Kogan.
In this talk I will discuss two related data structure problems. The first is to preprocess two strings S and T, each of length n, into a data structure that can report the optimal alignment score (Longest common subsequence / edit distance) between any substring of S and any substring of T. I will describe a data structure with query time (log n)^(2+o(1)), and construction time and space n^(2+o(1)). This is optimal up to polylogarithmic/subpolynomial factors since computing just the alignment between S and T cannot be done in subquadratic time assuming the strong exponential time hypothesis.
The second problem is to preprocess a directed weighted planar graph into a data structure that reports the distance between any two vertices. Using the concept of Voronoi diagrams, dramatic progress was made on such distance oracles in the past few years. We now have oracles of almost linear n^{1+o(1)} size that answer queries in (log n)^(2+o(1)) time. However, the construction time of these oracles is not nearly linear, but O(n^{3/2}), which is not known to be optimal.
I will describe how ideas developed for planar distance oracles lead to the data structure for the string alignment problem. The structure of the alignment graph allows for a simpler presentation of some of the techniques involved, and is further exploited to obtain a faster construction time. It is not uncommon that techniques originally developed for pattern matching problems are later extended and generalized to planar graphs. Here ideas propagate in the opposite direction; techniques originally developed for planar graphs are specialized to a pattern matching problem.
Based on joint works with Panos Charalampopoulos, Pawel Gawrychowski and Oren Weimann (SODA 18, STOC 2019, ICALP 2021)
Consider the algorithm designer for a two-sided market with buyers and sellers, such as Airbnb or Uber. The platform's goal is to design a mechanism that optimizes the "gains from trade": the aggregate gain in the market due to the trades (e.g. in items, labor) that have occurred.
In this talk, I will present the first guarantees on gains from trade for a market that contains a multi-dimensional buyer, who is interested in multiple different kinds of items, and single-dimensional sellers, each of whom is only interested in selling the single item that they own. We present a logarithmic approximation to the optimal gains from trade in a parameter of the market. We also provide a logarithmic approximation in the number of sellers to the constrained-optimal gains from trade: optimal subject to standard truthfulness constraints. Our techniques come from a variety of domains, including online algorithms and revenue maximization.
Joint work with Yang Cai, Steven Ma, and Mingfei Zhao
Interactive oracle proofs (IOPs) are a hybrid between interactive proofs and PCPs. In an IOP the prover is allowed to interact with a verifier (like in an interactive proof) by sending relatively long messages to the verifier, who in turn is only allowed to query a few of the bits that were sent (like in a PCP).
For any NP relation for which membership can be decided in polynomial-time and bounded polynomial space (e.g., SAT, Hamiltonicity, Clique, Vertex-Cover, etc.) and for any constant \gamma>0, we construct an IOP with communication complexity (1+\gamma) \cdot n, where n is the original witness length. The number of rounds as well as the number of queries made by the IOP verifier are constant.
Joint work with Noga Ron-Zewi
Many data analysis pipelines are adaptive: the choice of which analysis to run next depends on the outcome of previous analyses. Common examples include variable selection for regression problems and hyper-parameter optimization in large-scale machine learning problems: in both cases, common practice involves repeatedly evaluating a series of models on the same dataset. Unfortunately, this kind of adaptive re-use of data invalidates many traditional methods of avoiding over-fitting and false discovery, and has been blamed in part for the recent flood of non-reproducible findings in the empirical sciences. An exciting line of work beginning with Dwork et al. 2015 establishes the first formal model and first algorithmic results providing a general approach to mitigating the harms of adaptivity, via a connection to the notion of differential privacy. Unfortunately, until now, those results were primarily of information theoretic interest, only beating out the simple approach of gathering fresh data for every computation ("sample-splitting") at the scale of many millions of datapoints.
In this work, we give a new proof of the transfer theorem that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our new proof is elementary and gives structural insights that we expect will be useful elsewhere. We show: 1) that differential privacy ensures that the expectation of any query on the conditional distribution on datasets induced by the transcript of the interaction is close to its expectation on the data distribution, and 2) sample accuracy on its own ensures that any query answer produced by the mechanism is close to the expectation of the query on the conditional distribution. This second claim follows from a thought experiment in which we imagine that the dataset is resampled from the conditional distribution after the mechanism has committed to its answers. The transfer theorem then follows by summing these two bounds, and in particular, avoids the "monitor argument" used to derive high probability bounds in prior work.
An upshot of our new proof technique is that the concrete bounds we obtain are substantially better than the best previously known bounds, even though the improvements are in the constants, rather than the asymptotics (which are known to be tight). As we show, our new bounds outperform the naive "sample-splitting" baseline at dramatically smaller dataset sizes compared to the previous state of the art, bringing techniques from this literature closer to practicality.
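As a point of reference for how such mechanisms look in code, the sketch below answers adaptively chosen statistical queries by adding Gaussian noise to empirical means. The class name and noise scale are illustrative choices, not the paper's mechanism; calibrating sigma to a target privacy and accuracy level is exactly what the bounds discussed above govern.

```python
import numpy as np

class NoisyStatisticalQueries:
    """Answers statistical queries q : X -> [0, 1] by perturbing the empirical
    mean with Gaussian noise; the transfer theorem is what lets such noisy,
    sample-accurate answers track the population means under adaptivity."""

    def __init__(self, dataset, sigma, seed=0):
        self.data = list(dataset)
        self.sigma = sigma
        self.rng = np.random.default_rng(seed)

    def answer(self, q):
        empirical = float(np.mean([q(x) for x in self.data]))
        return float(np.clip(empirical + self.rng.normal(0.0, self.sigma), 0.0, 1.0))

# 1000 samples from a Bernoulli(0.3) population; the analyst may choose the
# next query as a function of the (noisy) answer to this one.
data = np.random.default_rng(1).binomial(1, 0.3, size=1000)
mech = NoisyStatisticalQueries(data, sigma=0.02)
print(mech.answer(lambda x: float(x)))  # close to 0.3
```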
Joint work with: Christopher Jung, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi (UPenn), and Moshe Shenfeld (HUJI).
This work appeared at ITCS 2020.
Fine-grained complexity utilizes a small set of conjectures to derive conditional lower bounds for a large collection of problems. These conjectures concern the time complexity of a few core problems such as k-SAT, Orthogonal Vectors, 3SUM, k-Clique, and Set Cover. The relationships between these conjectures are poorly understood.
This talk will discuss some connections between the conjectures, including a tight reduction from Weighted-k Clique to Orthogonal Vectors and new (quantum-inspired) findings about the Set Cover Conjecture.
We construct a delegation scheme for delegating polynomial time computations. Our scheme is publicly verifiable and non-interactive in the common reference string (CRS) model. The soundness of our scheme is based on an efficiently falsifiable decisional assumption on groups with bilinear maps. Prior to this work, publicly verifiable non-interactive delegation schemes were only known under knowledge assumptions (or in the Random Oracle model) or under non-standard assumptions related to obfuscation or multilinear maps.
In addition, our scheme has two desirable features: The proofs are unambiguous, in the sense that it is hard to find two distinct proofs for the same statement, and are updatable in the sense that given a proof for the statement that a Turing machine M transitions from configuration C_0 to C_T in T steps, one can efficiently generate a proof for the statement that M transitions from configuration C_0 to C_{T+1} in T+1 steps.
We show that such a delegation scheme implies PPAD hardness, by following a similar approach to that of Choudhuri et al. (STOC 2019), who obtained PPAD hardness based on an assumption related to the soundness of the Fiat-Shamir paradigm.
This is based on two joint works, both with Omer Paneth and Lisa Yang.
In this talk, we will discuss a new type of pseudo-random object called a "pseudo-random pseudo-distribution". This object was introduced in the context of the BPL vs. L problem, and I will sketch a space-efficient construction of such an object for read-once branching programs that has near-optimal dependence on the error parameter. The talk is a distillation of joint work with Mark Braverman and Sumegha Garg (the paper is available online: https://eccc.weizmann.ac.il/report/2017/161/).
We present a new/old approach to the design of online algorithms via Bregman projections. This approach is applicable to a wide range of online problems and we discuss connections to previous work on online primal-dual algorithms. In particular, the k-server problem on trees and HSTs is considered. The projection-based algorithm for this problem turns out to have a competitive ratio that matches some of the recent results given by Bubeck et al. (STOC 2018), whose algorithm uses mirror-descent-based continuous dynamics prescribed via a differential inclusion.
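As background for readers unfamiliar with Bregman projections (textbook material, not the algorithm from the talk): on the probability simplex with the entropic regularizer, the Bregman (KL) projection of a positive vector is plain renormalization, which is what makes mirror-descent-style updates so cheap.

```python
import numpy as np

def entropic_mirror_descent_step(p, grad, eta):
    """One step on the probability simplex: multiplicative update followed by
    the Bregman (KL) projection, which here is just renormalization."""
    w = p * np.exp(-eta * grad)   # unconstrained mirror step
    return w / w.sum()            # KL-Bregman projection back onto the simplex

p = np.ones(4) / 4
print(entropic_mirror_descent_step(p, np.array([1.0, 0.0, 0.0, 0.0]), eta=0.5))
```

The algorithms discussed in the talk project onto more structured polytopes of fractional server configurations rather than the plain simplex, but the simplex case already shows the mechanics.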
Joint work with Niv Buchbinder, Anupam Gupta, and Marco Molinaro.
We study the sample complexity of learning threshold functions under the constraint of differential privacy. Unlike the non-private case, where the sample complexity is independent of the domain size, it turns out that for private learning the sample complexity must depend on the size of the domain $X$. Our current understanding of this task places its sample complexity somewhere between $\log^*|X|$ and $2^{\log^*|X|}$, where at least three different algorithms are known with sample complexity exponential in $\log^*|X|$. In this work we reduce this gap significantly, and show that the sample complexity is at most polynomial in $\log^*|X|$.
Joint work with Haim Kaplan, Katrina Ligett, Yishay Mansour, and Moni Naor.
The model of interactive proofs, introduced nearly two and a half decades ago, is now increasingly being used to design computation-outsourcing protocols. In an interactive proof, an honest party interacts with powerful but strategic provers, to elicit from them the correct answer to a computational question. Classical interactive proofs assume that the provers are adversarial (i.e., they want to mislead the verifier) and cooperative (work together as a team).
In this talk, I will present my work on a new payment-based interactive-proof system, called rational proofs. In rational proofs, the provers are not adversarial but rational, that is, they want to maximize the payment received from the verifier. Using principles from mechanism design, I will show how these payments can be used to leverage correctness from multiple provers who are either cooperative or non-cooperative in nature. I will also present how the guarantees of rational proofs are related to the soundness and completeness guarantees of classical interactive proofs.
Bio: Shikha Singh is currently an Assistant Professor of Computer Science at Wellesley College and will be joining Williams College as an Assistant Professor in Fall 2019. She obtained her PhD in Computer Science from Stony Brook University and her Integrated MSc. in Mathematics and Computing from IIT Kharagpur. Her broad research interests include algorithmic game theory, algorithms and data structures for big data, and complexity theory.
Existing proofs that deduce BPP=P from circuit lower bounds convert randomized algorithms to deterministic algorithms with a large polynomial slowdown. We convert randomized algorithms to deterministic algorithms with nearly minimal slowdown. Specifically, assuming exponential lower bounds against non-deterministic circuits we convert randomized algorithms that err rarely to deterministic algorithms with a similar run-time, and general randomized algorithms to deterministic algorithms whose run-time is slower by only a nearly linear factor.
Our results follow from a new connection between pseudo-entropy generators and locally list recoverable codes.
This is joint work with Dean Doron, Justin Oh and David Zuckerman
20 questions is one of the simplest examples of a combinatorial search game:
Alice thinks of an English word, and Bob's job is to figure it out by asking Yes/No questions. The worst-case complexity of the game is clearly log n, so to spice things up, we assume that Alice chooses her input according to some distribution known to Bob, and Bob now aims to minimize the expected number of questions.
An optimal solution to this problem was found already in 1952. However, the optimal strategy could ask arbitrarily complex Yes/No questions. We ask what happens when we constrain Bob to asking only "simple" questions, and what happens if Alice is allowed to lie a few times.
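The 1952 solution alluded to above is Huffman coding: an optimal unrestricted strategy corresponds to an optimal prefix code, where each internal node of the code tree is a Yes/No question and the expected number of questions equals the expected codeword length. A minimal sketch of computing that optimum (the helper name is illustrative):

```python
# Huffman's 1952 algorithm: repeatedly merge the two least likely words; each
# merge pushes both subtrees one level deeper, so the expected number of
# Yes/No questions is the sum of the merged weights.
import heapq

def huffman_expected_questions(probs):
    """probs: probabilities of Alice's possible words (summing to 1)."""
    if len(probs) == 1:
        return 0.0
    heap = list(probs)
    heapq.heapify(heap)
    expected = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        expected += a + b
        heapq.heappush(heap, a + b)
    return expected

print(huffman_expected_questions([0.5, 0.25, 0.125, 0.125]))  # 1.75 questions
```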
Joint work with Yuval Dagan (MIT), Ariel Gabizon, Daniel Kane (UCSD) and Shay Moran (IAS).
A searchable symmetric encryption (SSE) scheme enables a client to store data on an untrusted server while supporting keyword searches in a secure manner. Recent experiments have indicated that the practical relevance of such schemes heavily relies on the tradeoff between their space overhead, locality (the number of non-contiguous memory locations that the server accesses with each query), and read efficiency (the ratio between the number of bits the server reads with each query and the actual size of the answer). In this talk, we survey recent constructions and matching lower bounds, and discuss their underlying techniques.
Based on joint works with Gilad Asharov, Moni Naor, and Gil Segev.
In this talk I will describe some recent work on distributed property testing in networks with bounded bandwidth (the CONGEST model): we have a network of computing nodes communicating over some initially-unknown network graph, where every communication link can carry a bounded number of bits per round. Some simple-looking problems, such as checking if the network contains a 4-cycle, are known to be very hard in this model, and this motivates us to consider property testing instead of exact solutions.
I will describe distributed property testing algorithms for two problems: subgraph-freeness, where we wish to determine whether the network graph contains some fixed constant-sized subgraph H; and uniformity testing, where every node of the network draws samples from an unknown distribution, and our goal is to determine whether the distribution is uniform or far from uniform. I will also discuss lower bounds.
We prove a new connection between gentle measurement (where one wants to measure n quantum states, in a way that damages the states only by a little) and differential privacy (where one wants to query a database about n users, in a way that reveals only a little about any individual user). The connection is bidirectional, though with loss of parameters in going from DP to gentle measurement. Exploiting this connection, we present a new algorithm for approximating the outcomes of many measurements on a collection of quantum states, a task called "shadow tomography". The new algorithm has the advantages of being gentle and online (the measurements can be chosen adaptively).
Joint work with Scott Aaronson.
No prior knowledge about quantum mechanics or computing will be assumed.
A graph is automatically also a metric space, but is there anything interesting to say about such metric spaces? Many fascinating and concrete questions are encapsulated in the more general (and vague) question "to what extent can a finite graph emulate the properties of an infinite regular tree?". We will see how this leads us to investigate expansion in graphs and questions about the large scale aspects of graph metrics including girth and diameter and the relations between the two. If time permits (probably not) I may also say a little about the local geometry of graphs.
This talk is based on many collaborations which I have had over the years; among my more recent collaborators are Michael Chapman, Yuval Peled and Yonatan Bilu. The talk requires no particular previous background and should be accessible to a general mathematical audience.
A linear threshold function (LTF) is a Boolean function f:{-1,1}^n -> {0,1} of the form f(x) = 1_{ \sum a_i x_i > t}, for some fixed coefficients a_i and some threshold t. LTFs play an important role in complexity theory, machine learning, and other fields.
In this talk we present a new approach that allows us to obtain sharp results on Fourier-theoretic properties of biased LTFs. In particular, we determine the exact asymptotic order of the total influence and of the degree-1 Fourier weight of any biased LTF, in terms of its maximal (normalized) coefficient and its expectation. This provides a sharp generalization of theorems proved by Matulef, O'Donnell, Rubinfeld, and Servedio (in the context of property testing), and settles a conjecture posed by Kalai et al.
Our main tools are 'local' forms of the classical Chernoff inequality, like the following one proved by Devroye and Lugosi (2008): Let {x_i} be independent random variables uniformly distributed in {-1, 1}, and let a_i be nonnegative numbers such that \sum a_i^2 = 1. If for some t > 0, we have Pr[\sum a_i x_i > t] = b, then Pr[\sum a_i x_i > t + delta] < b/2 holds for delta > c / \sqrt{log(1/b)}, where c is a universal constant. Such inequalities seem to be little-known and can probably be useful in other contexts as well.
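A quick Monte Carlo illustration of this local behaviour on a near-Gaussian instance (all coefficients equal); the particular t and the constant 0.7 are illustrative choices for this instance, not the universal constant from the inequality.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
a = np.ones(n) / np.sqrt(n)                                  # sum a_i^2 = 1
x = rng.integers(0, 2, size=(100_000, n)).astype(np.float64) * 2 - 1   # uniform +-1
s = x @ a

t = 1.5
b = float(np.mean(s > t))
delta = 0.7 / np.sqrt(np.log(1 / b))       # 0.7 is an illustrative constant
print(b, float(np.mean(s > t + delta)))    # the second value falls below b / 2
```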
Joint work with Ohad Klein.
MPC-in-the-head is a novel paradigm introduced in the work of Ishai et al. [IshaiKOS09] and, roughly speaking, allows the design of a zero-knowledge proof system for any NP-relation by relying on any multiparty computation (MPC) protocol in a modular way. On a high level, in this transformation, the prover emulates "in its head" an execution of an MPC protocol that securely evaluates the NP-relation on the witness and commits to the views of the parties induced by this run. The verifier then tests the veracity of the computation by challenging the prover to open (decommit) the views of a subset of the parties. The key insight in the compilation is that the soundness and zero-knowledge properties directly reduce to the robustness and simulation guarantees of the underlying MPC protocol.
The derandomization of MA, the probabilistic version of NP, is a long-standing open question. In this talk, we connect this problem to a variant of another major problem: the quantum PCP conjecture. Our connection goes through the surprising quantum characterization of MA by Bravyi and Terhal. They proved the MA-completeness of the problem of deciding whether the ground energy of a uniform stoquastic local Hamiltonian is zero or inverse polynomial. We show that the gapped version of this problem, i.e. deciding if a given uniform stoquastic local Hamiltonian is frustration-free or has energy at least some constant, is in NP. Thus, if there exists a gap-amplification procedure for uniform stoquastic Local Hamiltonians (in analogy to the gap amplification procedure for constraint satisfaction problems in the original PCP theorem), then MA = NP (and vice versa). Furthermore, if this gap amplification procedure exhibits some additional (natural) properties, then P = RP. This is joint work with Dorit Aharonov.
We study the Excluded Grid Theorem, a fundamental structural result in graph theory, that was proved by Robertson and Seymour in their seminal work on graph minors. The theorem states that there is a function f, such that for every integer g > 0, every graph of treewidth at least f(g) contains the (g x g)-grid as a minor. For every integer g > 0, let f(g) be the smallest value for which the theorem holds. Establishing tight bounds on f(g) is an important graph-theoretic question. Robertson and Seymour showed that f(g) is at least of order g^2 log g. For a long time, the best known upper bounds on f(g) were super-exponential in g. The first polynomial upper bound of f(g) = O(g^98 poly log g) was proved by Chekuri and Chuzhoy. It was later improved to f(g) = O(g^36 poly log g), and then to f(g) = O(g^19 poly log g). In this talk we present our recent work that further improves this bound to f(g) = O(g^9 poly log g) via a simpler proof. Moreover, while there are natural barriers that seem to prevent the previous methods from yielding tight bounds for the theorem, it seems conceivable that the techniques proposed in this thesis can lead to even tighter bounds on f(g).
A homomorphic secret-sharing scheme is a secret-sharing scheme that allows locally mapping shares of a secret to compact shares of a function of the secret. The talk will survey the current state of the art on homomorphic secret sharing, covering efficient constructions, applications in cryptography and complexity theory, and open questions.
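As a minimal concrete instance of the syntax (a toy example for linear functions only, far weaker than the schemes surveyed in the talk): additive secret sharing over a prime field lets each party locally map its shares of a vector to a share of any public linear function of that vector.

```python
# Toy homomorphic secret sharing for linear functions: additive shares over
# Z_p.  Each party applies the same public linear map locally, and the
# resulting shares still reconstruct to f(secret).  The modulus is an
# illustrative choice.
import random

P = 2**61 - 1  # a prime modulus

def share(secret, n_parties, rng=random):
    shares = [rng.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def local_eval_linear(share_vec, coeffs):
    """Locally map a share of a vector secret to a share of <coeffs, secret>."""
    return sum(c * s for c, s in zip(coeffs, share_vec)) % P

def reconstruct(shares):
    return sum(shares) % P

secret = [3, 1, 4]
coeffs = [2, 5, 7]
per_party = list(zip(*[share(x, 3) for x in secret]))   # party j holds one share of each coordinate
out_shares = [local_eval_linear(sv, coeffs) for sv in per_party]
assert reconstruct(out_shares) == (2*3 + 5*1 + 7*4) % P  # = 39
```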
An (\epsilon,\phi)-expander decomposition of a graph G=(V,E) with m edges is a partition of the vertices into clusters such that (1) each cluster induces a subgraph with conductance at least \phi, and (2) the number of inter-cluster edges is at most \epsilon m. This decomposition has a wide range of applications, including approximation algorithms for Unique Games, fast algorithms for flow and cut problems, and dynamic graph algorithms.
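For reference, the conductance of a cut S is the number of edges leaving S divided by the smaller of the two volumes; the helper below (an illustrative sketch) evaluates a single candidate cut, while condition (1) of the decomposition asserts this value is at least \phi for every cut inside a cluster.

```python
# Sketch: conductance phi(S) = |E(S, V \ S)| / min(vol(S), vol(V \ S)),
# where vol(.) is the sum of degrees.

def conductance(adj, S):
    """adj: {v: set(neighbors)}; S: a proper nonempty subset of the vertices."""
    S = set(S)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    vol_S = sum(len(adj[u]) for u in S)
    vol_rest = sum(len(adj[u]) for u in adj) - vol_S
    return cut / min(vol_S, vol_rest)

# 4-cycle: the balanced cut {0, 1} has conductance 2 / 4 = 0.5.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(conductance(adj, {0, 1}))
```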
I will describe a new algorithm for constructing (~O(\phi),\phi)-expander decomposition in ~O(m/\phi) time. This is the first nearly linear time algorithm when \phi is at least 1/polylog(m), which is the case in most practical settings and theoretical applications. Previous results either take \Omega(m^{1+o(1)}) time, or attain nearly linear time but with a weaker expansion guarantee where each output cluster is guaranteed to be contained inside some unknown expander.
Our technique can be easily extended to the dynamic setting where the graph undergoes updates. This talk is based on joint work with Di Wang [Saranurak Wang SODA'19].
We explore the power of interactive proofs with a distributed verifier. In this setting, the verifier consists of n nodes and a graph G that defines their communication pattern. The prover is a single entity that communicates with all nodes by short messages. The goal is to verify that the graph G belongs to some language in a small number of rounds, and with small communication bound, i.e., the proof size.
This interactive model was introduced by Kol, Oshman and Saxena (PODC 2018) as a generalization of non-interactive distributed proofs. They demonstrated the power of interaction in this setting by constructing protocols for problems such as Graph Symmetry and Graph Non-Isomorphism -- both of which require proofs of Omega(n^2) bits without interaction.
In this work, we provide a new general framework for distributed interactive proofs that allows one to translate standard interactive protocols (i.e., with a centralized verifier) to ones where the verifier is distributed with a proof size that depends on the computational complexity of the verification algorithm run by the centralized verifier. We show the following:
* Every (centralized) computation performed in time O(n) on a RAM can be translated into a three-round distributed interactive protocol with O(log n) proof size. This implies that many graph problems for sparse graphs have succinct proofs (e.g., testing planarity).
* Every (centralized) computation implemented either in small space or by a uniform NC circuit can be translated into a distributed protocol with O(1) rounds and O(log n)-bit proofs in the small-space case, and with polylog(n) rounds and proof size for NC.
* We show that for Graph Non-Isomorphism, one of the striking demonstrations of the power of interaction, there is a 4-round protocol with O(log n) proof size, improving upon the O(n log n) proof size of Kol et al.
* For many problems, we show how to reduce proof size below the seemingly natural barrier of log n. By employing our RAM compiler, we get a 5-round protocol with proof size O(loglog n) for a family of problems including Fixed Automorphism, Clique and Leader Election (for the latter two problems we actually get O(1) proof size).
* Finally, we discuss how to make these proofs non-interactive *arguments* via random oracles.
Our compilers capture many natural problems and demonstrate the difficulty in showing lower bounds in these regimes.
Joint work with Moni Naor and Merav Parter.
A conjunction over a binary alphabet is a boolean function specified by a length-n pattern of 0's, 1's and wildcards. On input bit strings of length n, the function outputs 1 if the input matches the pattern at all non-wildcard positions. At CRYPTO 2018, Bishop et al. proposed a simple and elegant construction to obfuscate this class of functions by embedding the pattern in the error positions of a noisy Reed-Solomon codeword, and placing the codeword in a group exponent. They prove their scheme achieves a notion of security called "distributional virtual black box" in the generic group model for random conjunctions with at most 0.774n wildcards.
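To fix the functionality being obfuscated, here is a minimal evaluator for such a pattern (illustrative code, unrelated to the constructions themselves; the obfuscation question is how to publish something functionally equivalent without revealing the pattern).

```python
# A conjunction pattern over {0, 1, '*'} accepts an input bit string iff they
# agree on every non-wildcard position.

def conjunction(pattern):
    return lambda x: all(p == '*' or p == b for p, b in zip(pattern, x))

f = conjunction("1*0*1")
print(f("11001"), f("11111"))  # True False
```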
In this talk, I'll show how to abstract the Bishop et al. scheme to obtain a significantly more efficient "dual" scheme. In the generic group model, our scheme admits an intuitive proof of security and does not place any restrictions on the number of wildcards.
Next, I'll describe a simple modification to the construction that avoids encoding in a group exponent and is secure under the Learning Parity with Noise (LPN) assumption. At the heart of our security proof is a reduction from standard LPN to LPN with "structured error."
No prior knowledge of the Bishop et al. paper will be assumed.
Computing shortest paths is one of the fundamental problems of graph algorithms. The goal of *dynamic* all pairs shortest paths (APSP) is to maintain shortest path trees from all vertices as the edges of the graph change over time. The algorithm is said to be decremental if it handles only deletions, incremental if it handles only insertions and fully dynamic if it handles both deletions and insertions.
In this talk I will present a near optimal decremental algorithm that maintains approximate all pairs shortest paths.
Recent years have seen the emergence of new kinds of software including deep learning, programmable computer networks, and blockchains. Unfortunately, these systems have been shown to suffer from critical safety and security errors, affecting their wider adoption. The goal of my research is to develop new automated program verification and synthesis techniques which ensure safety and reliability of these systems.
In this talk, I will start by introducing AI2, the first automated verifier for neural networks able to certify large convolutional models. The key idea behind AI2 is to bridge abstract interpretation and neural networks, enabling a sound over-approximation of a network’s behavior in a scalable manner. I will then briefly discuss DL2, a system which enables clean interaction with deep learning models by allowing users to pose queries in a declarative manner. Finally, I will demonstrate how automated program analysis and synthesis can address key security and reliability challenges in domains such as computer networks and blockchains, preventing severe outages and financial losses.
Bio: Dana Drachsler-Cohen is an ETH Postdoctoral Fellow at the department of Computer Science, ETH Zurich. Her research interests span automated reasoning, program synthesis and machine learning. She obtained her PhD from the Computer Science Department at the Technion in 2017.
A family of sets F is said to satisfy the (p,q)-property if among any p sets in F, some q have a non-empty intersection. Hadwiger and Debrunner (1957) conjectured that for any p > q > d there exists a constant c = c_d(p,q), such that any family of compact convex sets in R^d that satisfies the (p,q)-property, can be pierced by at most c points. The classical Helly's Theorem is equivalent to the fact that c_d(p,p)=1 (p > d).
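As a toy illustration of the definition, the brute-force checker below tests the (p,q)-property for a family of closed intervals (convex sets in R^1), where q intervals have a common point iff the largest left endpoint is at most the smallest right endpoint; the helper names and the example family are illustrative.

```python
from itertools import combinations

def has_pq_property(intervals, p, q):
    """Among any p intervals, do some q of them share a common point?"""
    def intersect(subset):
        return max(l for l, _ in subset) <= min(r for _, r in subset)
    return all(any(intersect(sub) for sub in combinations(group, q))
               for group in combinations(intervals, p))

ivs = [(0, 2), (1, 3), (2, 4), (5, 6)]
print(has_pq_property(ivs, p=3, q=2))  # True: any 3 of these contain 2 that meet
```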
In a celebrated result from 1992, Alon and Kleitman proved the conjecture. However, obtaining sharp bounds on the minimal such c_d(p,q), called `the Hadwiger-Debrunner numbers', is still a major open problem in combinatorial geometry.
In this talk we present improved upper and lower bounds on the Hadwiger-Debrunner numbers, the latter using the hypergraph container method.
Based on joint works with Shakhar Smorodinsky and Gabor Tardos.
In the classical Node-Disjoint Paths (NDP) problem, we are given an n-vertex graph G, and a collection of pairs of its vertices, called demand pairs. The goal is to route as many of the demand pairs as possible, where to route a pair we need to select a path connecting it, so that all selected paths are disjoint in their vertices.
The best current algorithm for NDP achieves an $O(\sqrt{n})$-approximation, while, until recently, the best negative result was a roughly $\Omega(\sqrt{\log n})$-hardness of approximation. Recently, an improved $2^{\Omega(\sqrt{\log n})}$-hardness of approximation for NDP was shown, even if the underlying graph is a subgraph of a grid graph, and all source vertices lie on the boundary of the grid. Unfortunately, this result does not extend to grid graphs.
The approximability of NDP in grids has remained a tantalizing open question, with the best upper bound of $\tilde{O}(n^{1/4})$, and the best lower bound of APX-hardness. In this talk we come close to resolving this question, by showing an almost polynomial hardness of approximation for NDP in grid graphs.
Our hardness proof performs a reduction from the 3COL(5) problem to NDP, using a new graph partitioning problem as a proxy. Unlike the more standard approach of employing Karp reductions to prove hardness of approximation, our proof is a Cook-type reduction, where, given an input instance of 3COL(5), we produce a large number of instances of NDP, and apply an approximation algorithm for NDP to each of them. The construction of each new instance of NDP crucially depends on the solutions to the previous instances that were found by the approximation algorithm.
Joint work with David H.K. Kim and Rachit Nimavat.
We show that static data structure lower bounds in the group (linear) model imply semi-explicit lower bounds on matrix rigidity. In particular, we prove that an explicit lower bound of t >> log^2(n) on the cell-probe complexity of linear data structures in the group model, even against arbitrarily small linear space (s = (1+eps)n), would already imply a semi-explicit (P^NP) construction of rigid matrices with significantly better parameters than the current state of the art (Alon, Panigrahy, and Yekhanin, 2009). Our result further asserts that polynomial (t > n^eps) data structure lower bounds against near-maximal space would imply super-linear circuit lower bounds for log-depth linear circuits (a four-decade open question). In the succinct space regime (s = n+o(n)), we show that any improvement on current cell-probe lower bounds in the linear model would also imply new rigidity bounds. Our main result relies on a new connection between the "inner" and "outer" dimensions of a matrix (Paturi and Pudlak, 2006), and on a new worst-to-average case reduction for rigidity, which is of independent interest.
Joint work with Zeev Dvir (Princeton) and Alexander Golovnev (Harvard).
As software has grown increasingly critical to our society's infrastructure, mechanically-verified software has grown increasingly important, feasible, and prevalent. Proof assistants have seen tremendous growth in recent years because of their success in the mechanical verification of high-value applications in many areas, including cyber security, cyber-physical systems, operating systems, compilers, and microkernels. These proof assistants are built on top of constructive type theory whose computational interpretation is given by the proofs-as-programs paradigm, which establishes a correspondence between formal proofs and computer programs. However, while both proof theory and programming languages have evolved significantly over the past years, the cross-fertilization of the independent new developments in each of these fields has yet to be explored in the context of this paradigm. This naturally gives rise to the following questions: how can modern notions of computation influence and contribute to formal foundations, and how can modern reasoning techniques improve the way we design and reason about programs?
In this talk I first demonstrate how using programming principles that go beyond the standard lambda-calculus, namely state and non-determinism, promotes the specification and verification of modern systems, e.g. distributed systems. I then illustrate the surprising fragility of proof assistants in the presence of such new computational capabilities, and outline my ongoing efforts to develop a more robust foundation. For the converse direction, I show how incorporating modern proof-theoretic techniques offers a more congenial framework for reasoning about hard programming problems and hence facilitates the verification effort.
Zero knowledge protocols are spectacular, allowing to prove NP statements without revealing anything but their validity. An essential element that enables this wonder is interaction. But how much interaction exactly is needed? This question has long challenged cryptographers and is yet to be settled under standard assumptions. In fact, the question appears to be equally challenging also for natural relaxations of the zero knowledge requirement. The difficulty in answering the round complexity question stems from that of a foundational question in cryptography --- what is the power of non-black-box reductions?
In this talk, I will explain this difficulty and present a new non-black-box technique that resolves, under standard assumptions, the round complexity of weak zero knowledge protocols (Dwork-Naor-Reingold-Stockmeyer '98). Specifically, assuming quasipolynomial hardness of the Learning with Errors problem and fully-homomorphic encryption, we construct a two message protocol, a setting where (full-fledged) zero knowledge is impossible.
The talk will assume no prior knowledge in cryptography. It is based on joint work with Dakshita Khurana and Omer Paneth (the paper can be found on
For a vector space F^n over a field F, an (η, β)-dimension expander of degree d is a collection of d linear maps Γ_j : F^n \to F^n such that for every subspace U of F^n of dimension at most ηn, the image of U under all the maps, ∑_{j=1}^d Γ_j(U), has dimension at least β·dim(U). Over a finite field, a random collection of d=O(1) maps Γ_j offers excellent “lossless” expansion with high probability: β ≈ d for η ≥ Ω(1/d). When it comes to a family of explicit constructions (for growing n), however, achieving even expansion factor β = 1 + ε with constant degree is a non-trivial goal. We present an explicit construction of dimension expanders over finite fields based on linearized polynomials and subspace designs, drawing inspiration from recent progress on list decoding in the rank-metric. Our approach yields the following:
- Lossless expansion over large fields; more precisely β ≥ (1–ε)d and η ≥ (1–ε)/d with d=O_ε(1), when |F| ≥ Ω(n).
- Expansion optimal up to constant factors over fields of arbitrarily small polynomial size; more precisely β ≥ Ω(δd) and η ≥ Ω(1/(δd)) with d = O_δ(1), when |F| ≥ n^δ.
Previously, an approach reducing to monotone expanders (a form of vertex expansion that is highly non-trivial to establish) gave (Ω(1), 1+Ω(1))-dimension expanders of constant degree over all fields. An approach based on “rank condensing via subspace designs” led to dimension expanders with β ≥ Ω(√d) over large finite fields. Ours is the first construction to achieve lossless dimension expansion, or even expansion proportional to the degree. Based on joint work with Venkatesan Guruswami and Chaoping Xing.
In this talk I will present an algorithm for approximating the number of k-cliques in a graph when given query access to the graph. This problem was previously studied for the cases of k=2 (edges) and k=3 (triangles). We give an algorithm that works for any k >= 3, and is actually conceptually simpler than the k=3 algorithm. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries and (3) pair queries. Let n denote the number of vertices in the graph, m the number of edges, and C_k the number of k-cliques. We design an algorithm that outputs a (1+\epsilon)-approximation (with high probability) for C_k, whose expected query complexity and running time are O(\frac{n}{C_k^{1/k}}+\frac{m^{k/2}}{C_k}) \cdot poly(\log n, 1/\epsilon, k).
Hence, the complexity of the algorithm is sublinear in the size of the graph for C_k = \omega(m^{k/2-1}). Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on \log n, 1/\epsilon and k).
This is joint work with Talya Eden and C. Seshadhri.
Given an underlying finite point set P in the plane, we seek a small set Q that would hit any convex set that contains at least an Epsilon-fraction of P. Such a set Q is called a weak Epsilon-net. The study of Epsilon-nets is central to Computational and Combinatorial Geometry, and it bears important connections to Statistical Learning Theory, Extremal Combinatorics, Discrete Optimization, and other areas.
It is an outstanding open problem to determine tight asymptotic bounds on weak Epsilon-nets with respect to convex sets. For any underlying point set in the plane we describe such a net whose cardinality is roughly proportional to Epsilon^{-3/2}. This is the first improvement of the over-25-year-old bound of Alon, Barany, Furedi, and Kleitman.
In the minimum k-edge-connected spanning subgraph (k-ECSS) problem the goal is to find the minimum weight subgraph resistant to up to k-1 edge failures. This is a central problem in network design, and a natural generalization of the minimum spanning tree (MST) problem. In this talk, I will present fast randomized distributed approximation algorithms for k-ECSS in the CONGEST model.
Ramsey theory assures us that in any graph there is a clique or independent set of a certain size, roughly logarithmic in the graph size. But how difficult is it to find the clique or independent set? If the graph is given explicitly, then it is possible to do so while examining a linear number of edges. If the graph is given by a black-box, where the box must be queried to determine whether a given edge exists, then a large number of queries must be issued. But what if one is given a program or circuit for computing the existence of an edge? What if we are assured that the program is small without being given explicitly?
In this talk I will explore recent work on the complexity of search problems with guaranteed solution (the class TFNP) and the tight relationship with cryptographic assumptions and techniques.
Based on joint works with Pavel Hubacek, Ilan Komargodski and Eylon Yogev
Technological advances have changed every aspect of our lives in recent decades, yet, for the most part, the same systems of democratic decision making have been in place for centuries. I will argue that computer scientists can help rethink the practice of democracy, as well as its potential applications. I will focus on three emerging paradigms that go far beyond your run-of-the-mill election: (i) liquid democracy, an approach that allows voters to transitively delegate their votes; (ii) participatory budgeting, whereby residents collectively decide how to spend their local government's budget; and (iii) virtual democracy, which employs instant elections among machine learning models of real voters to address the grand AI challenge of ethical decision making.
In the Minimum Hypergraph Bisection problem, the vertex set of a hypergraph has to be partitioned into two parts of equal size so that the number of hyperedges intersecting both parts is minimized.
This problem is a natural generalization of the well-studied Minimum Bisection problem in graphs.
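For concreteness, here is a brute-force reference implementation of the objective on tiny instances (illustrative code only; the algorithms discussed in the talk are of course far more sophisticated).

```python
# Minimum Hypergraph Bisection by exhaustive search: split the vertices into
# two equal-size parts, minimizing the number of hyperedges that intersect
# both parts.
from itertools import combinations

def min_hypergraph_bisection(vertices, hyperedges):
    vertices = list(vertices)
    best = None
    for half in combinations(vertices, len(vertices) // 2):
        left = set(half)
        cut = sum(1 for e in hyperedges
                  if any(v in left for v in e) and any(v not in left for v in e))
        best = cut if best is None else min(best, cut)
    return best

# Two disjoint triangles joined by one big hyperedge: the optimum cuts only it.
print(min_hypergraph_bisection(range(6), [{0, 1, 2}, {3, 4, 5}, {0, 1, 2, 3, 4, 5}]))  # 1
```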
We present a sharp distinction between Minimum Bisection in hypergraphs and graphs.
Whereas it is well-known that all bi-criteria approximation algorithms for Minimum Bisection in graphs can be extended to hypergraphs with the exact same guarantees, we prove that this is not the case when considering true (i.e., non bi-criteria) approximation algorithms.
Specifically, we show that Minimum Hypergraph Bisection admits an $\tilde{\mathcal{O}}(\sqrt{n})$ approximation algorithm.
However, we also show that any $\alpha$-approximation algorithm for Minimum Hypergraph Bisection implies an approximation of $\Omega(\alpha^{-2})$ for Densest $k$-Subgraph.
Thus, assuming the Exponential Time Hypothesis there is no efficient approximation algorithm for Minimum Hypergraph Bisection with an approximation ratio of $n^{1/poly(\log{\log{n}})}$.
In particular, Minimum Hypergraph Bisection is much harder to approximate than Minimum Bisection in graphs, for which a logarithmic approximation algorithm exists.
If time permits, the problem of constructing trees that are cut sparsifiers for hypergraph and vertex cuts will also be discussed.
While similar trees lie at the heart of powerful algorithms for Minimum Bisection in graphs, we prove that this is not the case for hypergraphs.
Joint work with Harald Räcke and Richard Stotz.
The fisherman caught a quantum fish. "Fisherman, please let me go", begged the fish, "and I will grant you three wishes". The fisherman agreed. The fish gave the fisherman a quantum computer, three quantum signing tokens and his classical public key. The fish explained: "to sign your three wishes, use the tokenized signature scheme on this quantum computer, then show your valid signature to the king, who owes me a favor".
The fisherman used one of the signing tokens to sign the document "give me a castle!" and rushed to the palace. The king executed the classical verification algorithm using the fish's public key, and since it was valid, the king complied.
The fisherman's wife wanted to sign ten wishes using their two remaining signing tokens. The fisherman did not want to cheat, and secretly sailed to meet the fish. "Fish, my wife wants to sign ten more wishes". But the fish was not worried: "I have learned quantum cryptography following the previous story (The Fisherman and His Wife by the brothers Grimm). The quantum tokens are consumed during the signing. Your polynomial wife cannot even sign four wishes using the three signing tokens I gave you".
"How does it work?" wondered the fisherman. "Have you heard of quantum money? These are quantum states which can be easily verified but are hard to copy. This tokenized quantum signature scheme extends Aaronson and Christiano's quantum money scheme, and a variant by Zhandry, which is why the signing tokens cannot be copied".
"Does your scheme have additional fancy properties?" the fisherman asked. "Yes, the scheme has other security guarantees: revocability, testability and everlasting security. Furthermore, if you're at sea and your quantum phone has only classical reception, you can use this scheme to transfer the value of the quantum money to shore", said the fish, and swam away.
Joint work with Shalev Ben-David. https://arxiv.org/abs/1609.09047
All known algorithms for solving NP-complete problems require exponential time in the worst case; however, these algorithms nevertheless solve many problems of practical importance astoundingly quickly, and are hence relied upon in a broad range of applications. This talk is built around the observation that "Empirical Hardness Models" - statistical models that predict algorithm runtime on novel instances from a given distribution - work surprisingly well. These models can serve as powerful tools for algorithm design, specifically by facilitating automated methods for algorithm design and for constructing algorithm portfolios. They also offer a statistical alternative to beyond-worst-case analysis and a starting point for theoretical investigations.
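As a rough illustration of the idea (not the speaker's actual pipeline; the features and data below are synthetic), one can fit an off-the-shelf regressor mapping instance features to observed log-runtimes and use it to predict the runtime of new instances:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Synthetic stand-in: each row holds features of an instance (e.g. number of
    # variables, number of clauses, clause/variable ratio); y holds measured
    # log-runtimes of a fixed solver on those instances.
    rng = np.random.default_rng(0)
    X = rng.random((500, 3))
    y = 2.0 * X[:, 2] + 0.5 * X[:, 0] + 0.1 * rng.standard_normal(500)

    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    print("predicted log-runtime:", model.predict(np.array([[0.3, 0.7, 0.9]]))[0])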
bio at http://www.cs.ubc.ca/~kevinlb/bio.html
This paper makes progress on the problem of explicitly constructing a binary tree code with constant distance and constant alphabet size.
For every constant delta < 1 we give an explicit binary tree code with distance delta and alphabet size poly(log n), where n is the depth of the tree. This is the first improvement over a two-decade-old construction that has an exponentially larger alphabet of size poly(n).
As part of the analysis, we prove a bound on the number of positive integer roots a real polynomial can have in terms of its sparsity with respect to the Newton basis---a result of independent interest.
Joint work with G. Cohen and B. Haeupler
We study the problem of identifying correlations in multivariate data, under information constraints:
Either on the amount of memory that can be used by the algorithm, or the amount of communication when the data is distributed across several machines. We prove a tight trade-off between the memory/communication complexity and the sample complexity, implying (for example) that to detect pairwise correlations with optimal sample complexity, the number of required memory/communication bits is at least quadratic in the dimension. Our results substantially improve those of Shamir (2014), which studied a similar question in a much more restricted setting. To the best of our knowledge, these are the first provable sample/memory/communication trade-offs for a practical estimation problem, using standard distributions, and in the natural regime where the memory/communication budget is larger than the size of a single data point. To derive our theorems, we prove a new information-theoretic result, which may be relevant for studying other information-constrained learning problems.
Joint work with Ohad Shamir
Non-signaling strategies are collections of distributions with certain non-local correlations that have been studied recently in the context of delegation of computation.
In this talk I will discuss a recent result showing that the Hadamard based PCP of [ALMSS'92] is sound against non-signaling strategies. As part of the proof, we study the classical problem of linearity testing [BLR'93] in the setting of non-signaling strategies, and prove that any no-signaling strategy that passes the linearity test with high probability must be close to a quasi-distribution over linear functions.
Joint work with Alessandro Chiesa and Peter Manohar (UC Berkeley).
Our first theorem is a hierarchy theorem for the query complexity of testing graph properties with one-sided error; more precisely, we show that for every sufficiently fast-growing function f from (0,1) to the natural numbers, there is a graph property whose one-sided-error query complexity is precisely f(\Theta(\epsilon)). No result of this type was previously known for any f which is super-polynomial. Goldreich [ECCC 2005] asked to exhibit a graph property whose query complexity is exponential in 1/\epsilon. Our hierarchy theorem partially resolves this problem by exhibiting a property whose one-sided-error query complexity is exponential in 1/\epsilon. We also use our hierarchy theorem in order to resolve a problem raised by Alon and Shapira [STOC 2005] regarding testing relaxed versions of bipartiteness.
Our second theorem states that for any function f there is a graph property whose one-sided-error query complexity is at least f(\epsilon) while its two-sided-error query complexity is only polynomial in 1/\epsilon. This is the first indication of the surprising power that two-sided-error testing algorithms have over one-sided-error ones, even when restricted to properties that are testable with one-sided error. Again, no result of this type was previously known for any f that is super-polynomial.
The above theorems are derived from a graph theoretic result which we think is of independent interest, and might have further applications. Alon and Shikhelman [JCTB 2016] introduced the following generalized Turan problem: for fixed graphs H and T, and an integer n, what is the maximum number of copies of T, denoted by ex(n,T,H), that can appear in an n-vertex H-free graph? This problem received a lot of attention recently, with an emphasis on T = C_3, H=C_{2m+1}. Our third theorem gives tight bounds for ex(n,C_k,C_m) for all the remaining values of k and m.
Joint work with Asaf Shapira.
We prove a dichotomy theorem for two-party protocols, and show that for every poly-time two-party protocol with single-bit output, at least one of the following holds:
- The protocol can be used to construct a key-agreement protocol.
- For every constant ρ > 0, the parties' output is ρ-uncorrelated: let (X, Y, T) denote the parties' outputs and the protocol's transcript, respectively. A protocol is ρ-uncorrelated if there exists an efficient "decorrelator" algorithm Decor that, given a random transcript T, produces two numbers p_A, p_B such that no efficient algorithm can distinguish (U_{p_A}, U_{p_B}, T) from (X, Y, T) with distinguishing advantage larger than ρ, where U_p denotes a biased coin with bias p.
Namely, if the protocol cannot be used to construct key-agreement, then its output distribution (X; Y; T) is trivial: it can be simulated non-interactively by the parties given public randomness (used to sample T). (The precise statement also has qualifiers of the form: "on infinitely many choices of the security parameter").
We use the above characterization to prove that an (α = 2^4·ε^2)-correct ε-differentially private symmetric protocol for computing XOR implies the existence of a key-agreement protocol. The above dependency between α and ε is tight, since a Θ(ε^2)-correct ε-differentially private protocol for computing XOR is known to exist unconditionally. It also improves, in the (ε, α)-dependency aspect, upon Goyal et al. [ICALP '16], who showed that, for some constant c > 0, a c-correct ε-differentially private protocol for computing XOR implies oblivious transfer. Our result extends to a weaker notion of differential privacy in which privacy is only required to hold against an external observer. Interestingly, the reductions used for proving the above results are non-black-box.
Joint work with: Eran Omri and Kobbi Nissim and Ronen Shaltiel and Jad Silbak
Advances in quantum computing, with Google, IBM, Microsoft and Intel competing over the (exponentially growing) number of qubits in their quantum computers (some already commercial), call for a reexamination of Internet security and the public-key infrastructure. The talk will describe the concept of overlay security together with blockchain-based directories for establishing symmetric keys. When combined with nested Lamport signatures and Merkle trees for digital signatures, the result is a complete, easily implementable architecture with information-theoretically secure communication and hash-based signatures.
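As a small self-contained illustration of the hash-based signature building block mentioned above, here is a toy Lamport one-time signature over a 256-bit message digest (a textbook sketch, not the architecture from the talk):

    import hashlib, os

    K = 256                                    # sign a 256-bit digest
    H = lambda b: hashlib.sha256(b).digest()

    def keygen():
        sk = [(os.urandom(32), os.urandom(32)) for _ in range(K)]     # secret pairs
        pk = [(H(s0), H(s1)) for (s0, s1) in sk]                      # their hashes
        return sk, pk

    def bits(msg):
        d = H(msg)
        return [(d[i // 8] >> (i % 8)) & 1 for i in range(K)]

    def sign(sk, msg):
        return [sk[i][b] for i, b in enumerate(bits(msg))]            # reveal one secret per bit

    def verify(pk, msg, sig):
        return all(H(sig[i]) == pk[i][b] for i, b in enumerate(bits(msg)))

    sk, pk = keygen()
    sig = sign(sk, b"establish symmetric key with server X")
    print(verify(pk, b"establish symmetric key with server X", sig))  # True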
We present novel oblivious routing algorithms for the splittable and the unsplittable multicommodity flow settings. Our algorithms for both models improve upon the state-of-the-art, in terms of running time and performance with respect to graphs that exhibit good expansion guarantees. As an intermediate step towards the unsplittable setting, we present a novel generalization of Valiant's classical load balancing scheme for packet-switched networks to arbitrary graphs, which is of independent interest. Our approach relies on diffusing traffic throughout the network and then regathering it to its destination, via iterative applications of the random walk operator. Consequently, the performance guarantees of our algorithms are derived from the convergence of the random walk operator to the stationary distribution and are expressed in terms of the spectral gap of the graph (which dominates the mixing time).
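A minimal numerical sketch of the random-walk primitive described above (ours; it shows only the diffusion step on an unweighted graph, not the full routing scheme):

    import numpy as np

    def lazy_walk(A):
        # Lazy random walk operator W = (I + D^{-1} A) / 2 for adjacency matrix A.
        P = A / A.sum(axis=1)[:, None]
        return 0.5 * (np.eye(len(A)) + P)

    def diffuse(A, demand, t):
        # Spread a demand vector through the network by t applications of W;
        # as t grows the result approaches the stationary distribution.
        W = lazy_walk(A)
        x = demand.astype(float)
        for _ in range(t):
            x = W.T @ x
        return x

    A = np.array([[0,1,0,1],[1,0,1,0],[0,1,0,1],[1,0,1,0]], float)   # a 4-cycle
    print(diffuse(A, np.array([1.0, 0, 0, 0]), t=20))                # ~0.25 per node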
Suppose a language L can be decided by a bounded-error randomized algorithm that runs in space S and time n * poly(S). We give a randomized algorithm for L that still runs in space O(S) and time n * poly(S) that uses only O(S) random bits; our algorithm has a low failure probability on all but a negligible fraction of inputs of each length. An immediate corollary is a deterministic algorithm for L that runs in space O(S) and succeeds on all but a negligible fraction of inputs of each length. We also discuss additional complexity-theoretic applications of our technique.
A long line of research studies the space complexity of estimating a norm l(x) in the data-stream model, i.e., when x is the frequency vector of an input stream consisting of insertions and deletions of items of n types. I will focus on norms l (in R^n) that are *symmetric*, meaning that l is invariant under sign-flips and coordinate-permutations, and show that the streaming space complexity is essentially determined by the measure-concentration characteristics of l. These characteristics are known to govern other phenomena in high-dimensional spaces, such as the critical dimension in Dvoretzky's Theorem.
The family of symmetric norms contains several well-studied norms, such as all l_p norms, and indeed we provide a new explanation for the disparity in space complexity between p<=2 and p>2. We also obtain bounds for other norms that are useful in applications.
Joint work with Jaroslaw Blasiok, Vladimir Braverman, Stephen R. Chestnut, and Lin F. Yang.
A curious property of randomized log-space search algorithms is that their outputs are often longer than their workspace. One consequence is that there is no clear way to reproduce the same output when running the algorithm twice on the same input. It is not feasible to store the random bits (or the output) of the previous run in log-space, and using new random bits in another execution can result in a different output. This leads to the question: how can we reproduce the results of a randomized log space computation of a search problem?
We will give a precise definition of this notion of "reproducibility". Then we will show that every problem in search-RL has a randomized log-space algorithm where the output can be reproduced. Reproducibility can be thought of as an extension of pseudo-determinism. Indeed, for some problems in search-RL we show pseudo-deterministic algorithms whose running time significantly improve on known deterministic algorithms.
Joint work with Yang Liu.
In the area of distributed graph algorithms, a number of network entities with local views solve some computational task by exchanging messages with their neighbors. Quite unfortunately, an inherent property of most existing distributed algorithms is that throughout the course of their execution, the nodes learn not only their own output but also quite a lot about the inputs or outputs of many other entities. This leakage of information might be a major obstacle in settings where the output (or input) of an individual node is private information (e.g., distributed networks of selfish agents, decentralized digital currencies such as Bitcoin, voting systems).
While this is a rather unfamiliar notion in the classical distributed setting, secure multi-party computation (MPC) is one of the main themes in the cryptography community. Yet despite the extensive work in the area, no existing algorithm fits the framework of classical distributed models, in which there are no assumptions on the graph topology and only messages of bounded size are sent on the edges in each round.
In this work, we introduce a new framework for \emph{secure distributed graph algorithms} and provide the first \emph{general compiler} that takes any "natural" non-secure distributed algorithm that runs in $r$ rounds, and turns it into a secure algorithm that runs in $\widetilde{O}(r \cdot D \cdot poly(\Delta))$ rounds where $\Delta$ is the maximum degree in the graph and $D$ is its diameter. This round complexity is nearly optimal for bounded degree graphs.
The main technical part of our compiler is based on a new cycle cover theorem: We show that the edges of every bridgeless graph $G$ of diameter $D$ can be covered by a collection of cycles such that each cycle is of length $\widetilde{O}(D)$ and each edge of the graph $G$ appears in $\widetilde{O}(1)$ many cycles. This provides the basis for additional combinatorial constructions required by our compiler and might be of independent combinatorial and algorithmic interest.
Joint work with Merav Parter.
Computational problems whose input is a program are central in Cryptography, as well as Complexity, Learning, and Optimization. The nature of such problems crucially depends on the way the program is accessed -- as a black box or explicitly by its implementation.
In which settings can we exploit code to gain an advantage over black-box access? In Cryptography, we explore this question from two opposing perspectives:
Protecting Code: Can we obfuscate a program's code so that its functionality is preserved but it is otherwise unintelligible? Intuitively, such obfuscated code reveals nothing more than black-box access to the program. Obfuscation is, therefore, a powerful tool with numerous applications in software protection and Cryptography.
Exploiting Code: Most security proofs in cryptography consist of a reduction that translates any potential attacker into an algorithm solving some underlying hard problem. While most security reductions only require black-box access to the attacker, for many applications black-box reductions are provably insufficient. Can we exploit the attacker's code to prove security where black-box reductions fail?
In this talk, I will describe new techniques for protecting and exploiting code, taking advantage of the inherent tension between these two tasks. I will also demonstrate applications of these techniques in and beyond cryptography.
The use of a computational PIR scheme has been instrumental in reducing interaction from interactive proofs, and in converting multi-prover interactive proofs to (single prover) 2-message computationally sound proofs (also known as arguments).
In this talk we will focus on the secrecy guarantees of this transformation.
We show that if we start with an interactive proof which is only *honest-verifier* zero-knowledge, and we use a quasi-poly secure *symmetric* PIR scheme (or a 2-message OT scheme) to reduce interaction, then the resulting 2-message argument is witness indistinguishable, and in the delayed-input setting it is distributional weak zero-knowledge (which implies strong witness indistinguishability and witness hiding in the delayed-input setting). Moreover, under the same assumption (which can be instantiated from quasi-poly DDH/QR/N-th residuosity assumptions), we construct a two-message argument with (similar) *statistical* secrecy guarantees. For the latter, we apply the PIR heuristic to a computationally sound proof which is honest-verifier statistical zero-knowledge.
This is based on joint works with Abhishek Jain, Dakshita Khurana, Ron Rothblum and Amit Sahai.
Graph matching is one of the most well-studied problems in combinatorial optimization, with applications ranging from scheduling and object recognition to numerical analysis and computational chemistry.
Nevertheless, until recently very little was known about this problem in real-life **dynamic networks**, which aim to model the constantly changing physical world.
In the first part of the talk we'll discuss our work on dynamic graph matching, and in the second part we'll highlight our work on a few related problems.
Consider a random sequence of n bits that has entropy at least n-k, where k << n. A commonly used observation is that an average coordinate of this random sequence is close to being uniformly distributed, that is, the coordinate "looks random". In this work, we prove a stronger result that says, roughly, that the average coordinate looks random to an adversary that is allowed to query about n/k other coordinates of the sequence, even if the adversary is non-deterministic.
As an application of this result, we prove a new result on depth-3 circuits, which recovers as a direct corollary the known lower bounds for the parity and majority functions, as well as a lower bound on sensitive functions due to Boppana.
We present an explicit pseudorandom generator with polylog(n) seed length for read-once constant-width branching programs that can read their $n$ input bits in any order. This improves upon the work of Impagliazzo, Meka, and Zuckerman (FOCS, 2012), where they required seed length $n^{1/2+o(1)}$.
A central ingredient in our work is a bound on the Fourier spectrum of constant-width branching programs, settling a conjecture posed by Reingold, Steinke, and Vadhan (RANDOM, 2013).
Our analysis crucially uses a notion of local monotonicity on the edge labeling of the branching program. We carry critical parts of our proof under the assumption of local monotonicity and show how to deduce our results for unrestricted branching programs.
(Joint work with Eshan Chattopadhyay, Pooya Hatami, and Omer Reingold)
Over the past two decades, machine learning has rapidly evolved and emerged as a highly influential discipline of computer science and engineering. One of the pillars of machine learning is mathematical optimization, and the connection between the two fields has been a primary focus of research. In this talk, I will present two recent works that contribute to this study, focusing on online learning---a central model in machine learning for sequential decision making and learning under uncertainty. In the first part of the talk, I will describe a foundational result concerned with the power of optimization in online learning, and give answer to the question: does there exist a generic and efficient reduction from online learning to black-box optimization? In the second part, I will discuss a recent work that employs online learning techniques to design a new efficient and adaptive preconditioned algorithm for large-scale optimization. Despite employing preconditioning, the algorithm is practical even in modern optimization scenarios such as those arising in training state-of-the-art deep neural networks. I will present the new algorithm along with its theoretical guarantees and demonstrate its performance empirically.
We present new heavy-hitters algorithms satisfying local-differential-privacy, with optimal or near-optimal worst-case error, running time, and memory. In our algorithms, the server running time is $\tilde O(n)$ and user running time is $\tilde O(1)$, hence improving on the prior state-of-the-art result of Bassily and Smith [STOC 2015] requiring $O(n^{5/2})$ server time and $O(n^{3/2})$ user time. With a typically large number of participants in local algorithms ($n$ in the millions), this reduction in time complexity is crucial for making locally-private heavy-hitters algorithms usable in practice.
Joint work with Raef Bassily, Kobbi Nissim, and Abhradeep Thakurta.
The talk will discuss informational lower bounds for approximate Nash equilibrium in two complexity models: query complexity and communication complexity.
For both models we prove an exponential (in the number of players) lower bound on the complexity of computing an ε-Nash equilibrium, for a constant value of the approximation parameter ε.
In this paper we study the adaptive complexity of submodular optimization. Informally, the adaptive complexity of a problem is the minimal number of sequential rounds required to achieve a constant-factor approximation when polynomially many queries can be executed in parallel in each round. Adaptivity is a fundamental concept that is heavily studied in computer science, largely due to the need for parallelizing computation. Somewhat surprisingly, very little is known about adaptivity in submodular optimization. For the canonical problem of maximizing a monotone submodular function under a cardinality constraint, to the best of our knowledge, all that is known to date is that the adaptive complexity is between 1 and Ω(n).

Our main result in this paper is a tight characterization showing that the adaptive complexity of maximizing a monotone submodular function under a cardinality constraint is, up to lower-order terms, Θ(log n): we describe an algorithm which requires O(log n) sequential rounds and achieves an approximation that is arbitrarily close to 1/3, and we show that no algorithm can achieve an approximation better than O(1/log n) with fewer than O(log n / log log n) rounds. Thus, when allowing for parallelization, our algorithm achieves a constant-factor approximation exponentially faster than any known existing algorithm for submodular maximization.

Importantly, the approximation algorithm is achieved via adaptive sampling and complements a recent line of work on optimization of functions learned from data. In many cases, we do not know the functions we optimize and learn them from labeled samples. Recent results show that no algorithm can obtain a constant-factor approximation guarantee using polynomially many labeled samples as in the PAC and PMAC models, drawn from any distribution. Since learning with non-adaptive samples over any distribution results in a sharp impossibility, we consider learning with adaptive samples, where the learner obtains poly(n) samples drawn from a distribution of her choice in every round. Our result implies that in the realizable case, where there is a true underlying function generating the data, Θ(log n) batches of adaptive samples, up to lower-order terms, are necessary and sufficient to approximately "learn to optimize" a monotone submodular function under a cardinality constraint.

This is joint work with Yaron Singer.
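To give a feel for what adaptivity means here, the following toy sketch runs rounds in which all marginal-value queries of a round are independent (hence parallelizable) and low-value elements are filtered out. It is only an illustration of adaptive rounds over a coverage function, not the paper's algorithm or its guarantees:

    import random

    def coverage(sets, S):
        # Monotone submodular: number of elements covered by the chosen sets.
        return len(set().union(*[sets[i] for i in S])) if S else 0

    def adaptive_max(sets, k, rounds=10, seed=0):
        rng = random.Random(seed)
        S, ground = set(), set(range(len(sets)))
        for _ in range(rounds):                  # each loop iteration = one adaptive round
            if len(S) >= k or not ground:
                break
            base = coverage(sets, S)
            # these marginal-value queries are independent, so they could run in parallel
            marg = {i: coverage(sets, S | {i}) - base for i in ground}
            thr = max(marg.values()) / 2
            ground = {i for i in ground if marg[i] >= thr}      # filter low-value elements
            batch = rng.sample(sorted(ground), min(len(ground), k - len(S)))
            S |= set(batch)
            ground -= set(batch)
        return S

    sets = [{1, 2, 3}, {3, 4}, {5}, {1, 5, 6}, {7}]
    print(adaptive_max(sets, k=2))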
Given an unknown D-dimensional quantum state rho, as well as M two-outcome measurements E_1,...,E_M, how many copies of rho do we need, if we want to learn the approximate probability that E_i accepts rho for *every* i? In this talk, I'll prove the surprising result --I didn't believe it myself at first -- that one can achieve this using a number of copies that's polylogarithmic in both M and D. So, e.g., one can learn whether *every* size-n^3 quantum circuit accepts or rejects an n-qubit state, given only poly(n) copies of the state. To prove this will require first surveying previous results on measuring quantum states and succinctly describing them, including my 2004 postselected learning theorem, and my 2006 "Quantum OR Bound" (with an erroneous proof fixed in 2016 by Harrow, Lin, and Montanaro).
As time permits, I'll also discuss new joint work with Xinyi Chen, Elad Hazan, and Ashwin Nayak, which takes my 2006 result on PAC-learnability of quantum states, and extends to the setting of online learning. Here we show that, given a sequence of T two-outcome measurements on an n-qubit state, even if the sequence is chosen adversarially, one can still learn to predict the outcomes of those measurements with total regret O(n) (in the "realizable" case) or O(sqrt(Tn)) (in the "non-realizable" case).
No quantum computing background will be assumed.
We prove the first super-logarithmic lower bounds on the cell probe complexity of dynamic *boolean* (a.k.a. decision) data structure problems, a long-standing milestone in data structure lower bounds. We introduce a new technique and use it to prove a ~ log^{1.5}(n) lower bound on the operational time of a wide range of boolean data structure problems, most notably, on the query time of dynamic range counting *over F_2* ([Patrascu07]). Proving a super-logarithmic lower bound for this problem was explicitly posed as one of five important open problems in the late Mihai Patrascu's obituary [Tho13]. This result also implies the first super-logarithmic lower bound for the classical 2D range counting problem, one of the most fundamental data structure problems in computational geometry and spatial databases. We derive similar lower bounds for boolean versions of dynamic polynomial evaluation and 2D "rectangle stabbing", and for the (non-boolean) problems of range selection and range median. Our technical centerpiece is a new way of "weakly" simulating dynamic data structures using efficient *one-way* communication protocols with small advantage over random guessing. This simulation involves a surprising excursion to low-degree (Chebyshev) polynomials which may be of independent interest, and offers an entirely new algorithmic angle on the "cell sampling" method of Panigrahy et al. [PTW10].
Joint work with Kasper Green-Larsen and Huacheng Yu.
We study the classic bipartite matching problem in the online setting, first introduced in the seminal work of Karp, Vazirani and Vazirani. Specifically, we consider the problem for the well-studied class of regular graphs. Matching in this class of graphs was studied extensively in the offline setting. In the online setting, an optimal deterministic algorithm, as well as efficient algorithms under stochastic input assumptions were known. In this work, we present a novel randomized algorithm with competitive ratio tending to one on this family of graphs, under adversarial arrival order. Our main contribution is a novel algorithm which achieves competitive ratio 1-O(\sqrt{\log d}/\sqrt{d}) in expectation on d-regular graphs. In contrast, we show that all previously-known online algorithms, such as the generally worst-case optimal ranking algorithm of Karp et al., are restricted to a competitive ratio strictly bounded away from one, even as d grows. Moreover, we show the convergence rate of our algorithm's competitive ratio to one is nearly tight, as no algorithm achieves competitive ratio better than 1-O(1/\sqrt{d}). Finally, we show that our algorithm yields a similar competitive ratio with high probability, as well as guaranteeing each offline vertex a probability of being matched tending to one.
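For reference, a short toy implementation (ours) of the classical ranking algorithm of Karp, Vazirani and Vazirani mentioned above: offline vertices are ranked by a single random permutation, and each arriving vertex is matched to its highest-ranked free neighbor.

    import random

    def ranking(offline, arrivals, seed=0):
        # arrivals: list of (online_vertex, list_of_offline_neighbors).
        rng = random.Random(seed)
        rank = {v: r for r, v in enumerate(rng.sample(offline, len(offline)))}
        matched_to = {}                                  # offline vertex -> online vertex
        for u, nbrs in arrivals:
            free = [v for v in nbrs if v not in matched_to]
            if free:
                matched_to[min(free, key=rank.get)] = u  # best (lowest) rank wins
        return matched_to

    offline = ["a", "b", "c"]
    arrivals = [(1, ["a", "b"]), (2, ["a"]), (3, ["b", "c"])]
    print(ranking(offline, arrivals))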
Computational social choice deals with algorithms for aggregating individual preferences or opinions towards collective decisions. AI researchers (including myself) have long argued that such algorithms could play a crucial role in the design and implementation of multiagent systems. However, in the last few years I have come to realize that the "killer app" of computational social choice is helping people -- not software agents -- make joint decisions. I will illustrate this theme through two recent endeavors: Spliddit.org, a website that offers provably fair solutions to everyday problems; and Robovote.org, which provides optimization-driven voting methods.
Throughout the talk, I will devote special attention to the theoretical foundations and results that make these services possible.
We show that there exist binary locally testable codes (for all rates) and locally correctable codes (for low rates) with rate and distance approaching the Gilbert-Varshamov bound (which is the best rate-distance tradeoff known for general binary error-correcting codes). Our constructions use a number of ingredients: Thommesen's random concatenation technique, the Guruswami-Sudan-Indyk strategy for list-decoding concatenated codes, the Alon-Edmonds-Luby distance amplification method, and the local list-decodability and local testability of Reed-Muller codes. Interestingly, this seems to be the first time that local testability is used in the construction of locally correctable codes.
Joint work with Sivakanth Gopi, Rafael Oliveira, Noga Ron-Zewi and Shubhangi Saraf
Graphs that are prevalent in current applications (the Internet, Facebook etc.) are not only very large and highly dynamic, but also distributed between many servers, none of which sees the graph in its entirety. The distributed monitoring problem deals with the question of imposing conditions on the local graphs, such that as long as they hold, it is guaranteed that some desired property holds for the global graph.
While defining local conditions for linear properties (e.g. average degree) is relatively easy, they are more difficult to derive for non-linear functions over the graph. We propose a solution and a general definition of solution optimality, and demonstrate how to apply it to two important graph properties -- spectral gap and number of triangles. We also define an absolute lower bound on the communication overhead for distributed monitoring, and compare our algorithm to it, with good results. Performance improves as the graph becomes larger and denser -- that is, when distributing it is more important.
I will talk about my recent adventures with ants. Together with biologists we study P. longicornis ants as they collaboratively transport a large food item to their nest. This collective navigation process is guided by pheromones which are laid by individual ants. Using a new methodology to detect scent marks, we identify a new kind of ant trail characterized by very short and dynamic pheromone markings and highly stochastic navigation response to them. We argue that such a trail can be highly beneficial in conditions in which knowledge of individual ants regarding the underlying topological structure is unreliable. This gives rise to a new theoretical search model on graphs under unreliable guiding instructions, which is of independent computational interest. To illustrate the model, imagine driving a car in an unknown country that is in the aftermath of a major hurricane which has randomly flipped a certain small fraction of the road-signs. Under such conditions of unreliability, how can you still reach your destination fast? I will discuss the limits of unreliability that allow for efficient navigation. In trees, for example, there is a phase transition phenomenon that occurs roughly around the inverse of the square root of the maximal degree. That is, if noise is above this threshold then any algorithm cannot avoid finding the target in exponential time (in the original distance), while below the threshold we identify an optimal, almost linear, walking algorithm. Finally, I will discuss algorithms that under such a noisy model aim to minimize the number of queries to find a target (rather than the number of moves).
This talk is based on joint works with biologists from the Weizmann Institute: Ofer Feinerman, Udi Fonio, and others, and with CS researchers: Lucas Bockowski, Adrian Kosowski, and Yoav Rodeh.
While there has been a lot of progress in designing efficient custom protocols for computing Private Set Intersection (PSI), there has been less research on using generic MPC protocols for this task. However, there are many variants of the set intersection functionality which seem hard to compute with existing custom protocols and are easy to compute with generic MPC based solutions (for example comparing the cardinality of the intersection with a threshold or measuring ad conversion rates). Generic protocols work over circuits which compute the intersection. For sets of size n the best known circuit constructions compute O(n log n) comparisons. In this work we propose new circuit-based protocols for computing variants of the intersection, with circuits computing only O(n) comparisons. Our constructions are based on a new variant of Cuckoo hashing in two dimensions. We employ several optimizations and determine experimentally the required sizes of tables and circuits, and measure the runtime, showing that our protocol is more efficient in concrete terms than existing constructions. The proof technique is new and can be generalized to analyzing simple Cuckoo hashing as well as new variants.
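For readers unfamiliar with the primitive, here is a compact sketch of plain two-table Cuckoo hashing insertion (ours; the paper's two-dimensional variant is not reproduced here):

    import random

    class Cuckoo:
        def __init__(self, size, seed=0):
            self.t = [[None] * size, [None] * size]
            r = random.Random(seed)
            self.salt = [r.getrandbits(32), r.getrandbits(32)]
            self.size = size

        def _h(self, which, key):
            return hash((self.salt[which], key)) % self.size

        def insert(self, key, max_kicks=100):
            which = 0
            for _ in range(max_kicks):
                i = self._h(which, key)
                if self.t[which][i] is None:
                    self.t[which][i] = key
                    return True
                # slot taken: evict its occupant and try to place it in the other table
                self.t[which][i], key = key, self.t[which][i]
                which ^= 1
            return False          # give up; a real implementation would rehash

        def contains(self, key):
            return any(self.t[w][self._h(w, key)] == key for w in (0, 1))

    c = Cuckoo(11)
    for k in ["psi", "ot", "mpc"]:
        c.insert(k)
    print(c.contains("psi"), c.contains("zk"))   # True False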
Joint work with Srikanth Srinivasan.
The algorithm referred to in the title builds on Luks's powerful group-theoretic divide-and-conquer method (1980) and addresses the bottleneck situation where Luks's method fails to "divide".
Luks will continue to "conquer" if an alternative method "divides"; we develop such a partitioning technique.
In the talk we shall outline the algorithm and explain in some detail its group theoretic core, the "Unaffected Stabilizers Lemma" and the "Local Certificates" routine. The Lemma is used to construct, somewhat implausibly, global automorphisms out of local information -- a key step toward the construction of combinatorial structures to which the partitioning method from the previous day's lecture will be applied, providing the required "divide" step.
I will describe two branches of my work related to algorithms for distributed networks. The main focus will be on Fault-Tolerant (FT) Network Structures.
The undisrupted operation of structures and services is a crucial requirement in modern day communication networks. As the vertices and edges of the network may occasionally fail or malfunction, it is desirable to make those structures robust against failures.
FT Network Structures are low cost highly resilient structures, constructed on top of a given network, that satisfy certain desirable performance requirements concerning, e.g., connectivity, distance or capacity. We will overview some results on fault tolerant graph structures with a special focus on FT Breadth-First-Search.
The second part of the talk will discuss distributed models and algorithms for large-scale networks. Towards the end, we will see some connections between distributed computing and other areas such as EE and Biology.
Pseudo-deterministic algorithms are randomized search algorithms that on different executions on the same input, output the same solution with high probability.
We will discuss how pseudo-deterministic algorithms bridge the gap between randomized search and decision problems for problems in P and in NC. Next, we will show a pseudo-deterministic NC algorithm for bipartite matching. Finally, we will show how pseudo-determinism can be used to save on random bits used by classical randomized algorithms, and apply the method to obtain an algorithm for RNC depth first search using only O(log^2 n) random bits. This is joint work with Shafi Goldwasser.
The Operator Scaling problem asks whether a set of complex matrices can be jointly moved to a certain canonical (isotropic) position. This problem has myriad incarnations: non-commutative algebra, invariant theory, arithmetic complexity, quantum information theory, analytic inequalities and more. We will describe an efficient algorithm solving all these related problems, and explain how its analysis combines ideas from all these areas.
Through these connections, the algorithm can be shown to solve some non-convex optimization problems, some systems of quadratic equations, and some linear programs with exponentially many inequalities - all these, and concrete examples we will give, suggest that it might be a powerful algorithmic tool via reductions to these problems.
No special background will be assumed!
Based on two joint works with Ankit Garg, Leonid Gurvits and Rafael Oliveira.
This talk is longer than usual and has a two-hour slot.
The sensitivity conjecture is a famous open problem in the theory of boolean functions. Let f be a boolean function defined on the hypercube. The sensitivity of a node x is the number of its neighbours in the hypercube on which f takes the opposite value to the one it takes on x. The sensitivity conjecture speculates that if all nodes have low sensitivity, then the function f must be simple; concretely, all its Fourier mass is supported on levels with low Hamming weight.
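To fix the definition, here is a tiny sketch (ours) that computes the sensitivity of a point, and the maximum sensitivity, of a boolean function given on {0,1}^n:

    def sensitivity_at(f, x, n):
        # Number of coordinates i such that flipping bit i of x changes f.
        return sum(f(x ^ (1 << i)) != f(x) for i in range(n))

    def max_sensitivity(f, n):
        return max(sensitivity_at(f, x, n) for x in range(2 ** n))

    # Example: OR on 3 bits has sensitivity 3, attained at the all-zeros input.
    OR = lambda x: int(x != 0)
    print(max_sensitivity(OR, 3))   # 3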
Recently, Gopalan et al [CCC 2016] conjectured a robust analogue of the sensitivity conjecture: if most of the nodes have low sensitivity, then most of the Fourier mass is supported on levels with low hamming weight. They also prove it under the stronger assumption that all nodes have low sensitivity. In this work, we prove this conjecture, with near tight quantitative bounds.
Joint work with Avishay Tal (IAS) and Jiapeng Zhang (UCSD).
We show an efficient method for converting a logic circuit of gates with fan-out 1 into an equivalent circuit that works even if some fraction of its gates are short-circuited, i.e., their output is short-circuited to one of their inputs. Our conversion can be applied to any circuit with fan-in k>= 2, yielding a resilient circuit whose size is polynomial in the size of the (non-resilient) input circuit.
The resilient circuit gives the correct output as long as less than 1/3 of the gates in any of its input-to-output paths are corrupted. Furthermore, we prove that a resilience level of 1/3 is optimal (maximal) for this type of faulty gates. This fully answers an open question by Kalai et al. (FOCS 2012).
Joint work with Mark Braverman and Michael Yitayew.
We discuss recent progress on hardness of 2-to-2 games, with applications to the inapproximability of the Vertex-Cover problem.
A 2-to-2 game (a variant of Khot's well-known unique games) is defined by a graph with a variable at each node and a constraint of a specific structure on each edge. While in unique games each edge-constraint must be a one-to-one correspondence -- i.e., for each assignment to one node there is exactly one assignment to the other node that satisfies the constraint -- in 2-to-2 games the correspondence has a "two-to-two" structure.
The goal is to distinguish between instances in which almost all of the edge-constraints can be satisfied, and instances in which almost none of them can be satisfied simultaneously.
We present a new combinatorial hypothesis regarding Grassmann graphs, and show that it implies that 2-to-2 games are NP-hard *in a certain sense*. As a consequence, the hypothesis implies that it is NP-hard to distinguish between graphs that have an independent set of fractional size (1 - 1/\sqrt{2}), and graphs with no independent set of any constant fractional size. This easily implies that it is NP-hard to approximate the Vertex Cover problem within a factor of \sqrt{2} - o(1).
The talk is mostly based on a joint work with Subhash Khot and Muli Safra; nevertheless, we will also mention results from a more recent extension, which is joint work with Irit Dinur, Subhash Khot, Guy Kindler and Muli Safra.
We characterize the communication complexity of truthful mechanisms. Our departure point is the well known taxation principle. The taxation principle asserts that every truthful mechanism can be interpreted as follows: every player is presented with a menu that consists of a price for each bundle (the prices depend only on the valuations of the other players). Each player is allocated a bundle that maximizes his profit according to this menu. We define the taxation complexity of a truthful mechanism to be the logarithm of the maximum number of menus that may be presented to a player.
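In code, the taxation-principle view is simply a per-player menu followed by profit maximization; the following toy sketch (ours, for a single player with additive values over items) makes this concrete:

    from itertools import chain, combinations

    def bundles(items):
        return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

    def choose_from_menu(value, menu):
        # Taxation principle: the player receives a profit-maximizing bundle.
        # value: dict item -> value (additive, for simplicity); menu: dict bundle -> price.
        def profit(b):
            return sum(value[i] for i in b) - menu[b]
        return max(menu, key=profit)

    items = ("a", "b")
    menu = {b: 3 * len(b) for b in bundles(items)}   # hypothetical prices (set by the others' valuations)
    value = {"a": 5, "b": 2}
    print(choose_from_menu(value, menu))             # ('a',): profit 2 beats all other bundles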
Our main finding is that in general the taxation complexity essentially equals the communication complexity. The proof consists of two main steps. First, we prove that for rich enough domains the taxation complexity is at most the communication complexity. We then show that the taxation complexity is much smaller than the communication complexity only in "pathological" cases and provide a formal description of these extreme cases.
Our approach yields several applications, including strengthening the solution concept with low communication overhead, fast computation of prices, and hardness of approximation by computationally efficient truthful mechanisms.
In the classical Node-Disjoint Paths (NDP) problem, the input consists of an undirected n-vertex graph G and a collection M of pairs of its vertices, called source-destination, or demand, pairs. The goal is to route the largest possible number of the demand pairs via node-disjoint paths. The best current approximation for the problem is achieved by a simple greedy algorithm, whose approximation factor is O(\sqrt{n}), while the best current negative result is a roughly \Omega(\log^{1/2} n)-hardness of approximation. Even seemingly simple special cases of the problem are still poorly understood: when the input graph is a grid, the best current algorithm achieves a \tilde{O}(n^{1/4})-approximation, and when it is a general planar graph, the best current approximation ratio of an efficient algorithm is \tilde{O}(n^{9/19}). The best currently known lower bound for both these versions of the problem is APX-hardness.
In this talk we will show that NDP is 2^{\Omega(\sqrt{\log n})}-hard to approximate, unless all problems in NP have algorithms with running time n^{O(\log n)}. Our result holds even when the underlying graph is a planar graph with maximum vertex degree 3, and all source vertices lie on the boundary of a single face. We extend this result to the closely related Edge-Disjoint Paths problem, showing the same hardness of approximation ratio even for sub-cubic planar graphs with all sources lying on the boundary of a single face.
This is joint work with David H.K. Kim and Rachit Nimavat.
One of the most common tasks in cryptography and cryptanalysis is to find some interesting event (a needle) in an exponentially large collection (haystack) of N=2^n possible events, or to demonstrate that no such event is likely to exist. In particular, we are interested in finding needles which are defined as events that happen with an unusually high probability of p>>1/N in a haystack which is an almost uniform distribution on N possible events. When the search algorithm can only sample values from this distribution, the best known time/memory tradeoff for finding such an event requires O(1/(M p^2)) time given O(M) memory.
In this talk I will describe much faster needle searching algorithms in the common cryptographic setting in which the distribution is defined by applying some deterministic function f to random inputs. Such a distribution can be modeled by a random directed graph with N vertices in which almost all the vertices have O(1) predecessors while the vertex we are looking for has an unusually large number of O(pN) predecessors. When we are given only a constant amount of memory, we propose a new search methodology which we call NestedRho. As p increases, such random graphs undergo several subtle phase transitions, and thus the log-log dependence of the time complexity T on p becomes a piecewise linear curve which bends four times. Our new algorithm is faster than the O(1/p^2) time complexity of the best previous algorithm in the full range of 1/N < p < 1, and in particular it improves the previous time complexity by a significant factor of \sqrt{N} for any p in the range N^(-0.75) < p < N^(-0.5). When we are given more memory, we show how to combine the NestedRho technique with the parallel collision search technique in order to further reduce its time complexity. Finally, we show how to apply our new search technique to more complicated distributions with multiple peaks when we want to find all the peaks whose probabilities are higher than p.
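For context, here is the classical constant-memory collision-finding (Rho) primitive that such search techniques build on; the sketch is ours and is not the NestedRho algorithm from the talk:

    def rho_collision(f, x0):
        # Find x != y with f(x) == f(y) using O(1) memory (Floyd cycle detection).
        # Assumes x0 is not itself on the cycle, which holds w.h.p. for a random start.
        tortoise, hare = f(x0), f(f(x0))
        while tortoise != hare:
            tortoise, hare = f(tortoise), f(f(hare))
        prev_a, prev_b, a, b = None, None, x0, hare
        while a != b:                      # walk to the cycle entrance from both sides
            prev_a, prev_b, a, b = a, b, f(a), f(b)
        return prev_a, prev_b              # two distinct preimages of the cycle entrance

    f = lambda x: (x * x + 1) % 10007      # toy function on a small domain
    x, y = rho_collision(f, 2)
    print(x, y, f(x) == f(y), x != y)      # ... True True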
Joint work with Itai Dinur, Orr Dunkelman and Nathan Keller.
Suppose that you have n truly random bits X=(X1,...,Xn) and you wish to use them to generate m >> n pseudorandom bits Y=(Y1,...,Ym) using a local mapping, i.e., each Yi should depend on at most d=O(1) bits of X. In the polynomial regime of m=n^s, s>1, the only known solution, which originates from (Goldreich, ECCC 2000), is based on Random Local Functions: compute Yi by applying some fixed (public) d-ary predicate P to a random (public) tuple of distinct inputs. In this talk, we will try to understand, for any value of s, how the pseudorandomness of the resulting sequence depends on the choice of the underlying predicate.
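A minimal sketch of the Random Local Functions construction just described (the 5-ary predicate below, the XOR of three bits plus an AND of two, is only an illustrative choice):

    import random

    def random_local_prg(x, m, d=5, seed=0):
        # Each output bit applies a fixed d-ary predicate P to a public random
        # d-tuple of distinct positions of the seed x (a list of 0/1 values).
        rng = random.Random(seed)                       # the tuples are public
        P = lambda b: b[0] ^ b[1] ^ b[2] ^ (b[3] & b[4])
        tuples = [rng.sample(range(len(x)), d) for _ in range(m)]
        return [P([x[j] for j in S]) for S in tuples]

    n = 64
    x = [random.getrandbits(1) for _ in range(n)]       # n truly random bits
    y = random_local_prg(x, m=n ** 2)                   # n^2 "pseudorandom" bits
    print(len(y), y[:16])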
Based on joint work with Shachar Lovett.
Interactive proofs have had a dramatic impact on Complexity Theory and Cryptography. The celebrated IP=PSPACE Theorem [LFKN92,Shamir92] allows an all-powerful but untrusted prover to convince a polynomial-time verifier of the validity of extremely complicated statements (as long as they can be evaluated using polynomial space). The interactive proof system designed for this purpose requires a polynomial number of communication rounds. This talk will focus on studying the power of more efficient interactive proof systems.
Our main result is that for every statement that can be evaluated in polynomial time and bounded-polynomial space, there exists an interactive proof that satisfies the following strict efficiency requirements:
(1) The honest prover runs in polynomial time.
(2) The verifier runs in almost-linear time (and, under some conditions, even in sublinear time).
(3) The interaction consists of only a constant number of communication rounds.
To obtain this result, we introduce and study several new notions for interactive proofs, which may be of independent interest. One of these notions is that of unambiguous interactive proofs, where the prover has a unique successful strategy. Another notion is that of probabilistically checkable interactive proofs (PCIPs) where the verifier only reads a few bits of the transcript in checking the proof (this could be viewed as an interactive extension of PCPs).
Joint work with Omer Reingold and Ron Rothblum.
Over the past 30 years numerous algorithms have been designed for symmetry breaking problems in the LOCAL model, such as maximal matching, MIS, vertex coloring, and edge-coloring. For most problems the best randomized algorithm is at least exponentially faster than the best deterministic algorithm. In this paper we prove that these exponential gaps are necessary and establish connections between the deterministic and randomized complexities in the LOCAL model. Each result has a very compelling take-away message:
1. Fast Δ-coloring of trees requires random bits: Building on the recent lower bounds of Brandt et al., we prove that the randomized complexity of Δ-coloring a tree with maximum degree Δ≥55 is Θ(log_Δ log n), whereas its deterministic complexity is Θ(log_Δ n) for any Δ≥3. This also establishes a large separation between the deterministic complexity of Δ-coloring and (Δ+1)-coloring trees.
2. Randomized lower bounds imply deterministic lower bounds: We prove that any deterministic algorithm for a natural class of problems that runs in O(1) + o(log_Δ n) rounds can be transformed to run in O(log* n − log* Δ + 1) rounds. If the transformed algorithm violates a lower bound (even allowing randomization), then one can conclude that the problem requires Ω(log_Δ n) time deterministically.
3. Deterministic lower bounds imply randomized lower bounds: We prove that the randomized complexity of any natural problem on instances of size n is at least its deterministic complexity on instances of size √(log n). This shows that a deterministic Ω(log_Δ n) lower bound for any problem implies a randomized Ω(log_Δ log n) lower bound. It also illustrates that the graph shattering technique is absolutely essential to the LOCAL model.
Joint work with Tsvi Kopelowitz and Seth Pettie. http://arxiv.org/abs/1602.08166
In distributed systems, communication between the participants in the computation is usually the most expensive part of the computation. Theoretical models of distributed systems usually reflect this by neglecting the cost of local computation, and charging only for messages sent between the participants; in particular, we usually assume that the computation proceeds in rounds, and in each round, each participant can send only a limited number of bits. We are interested in characterizing the number of rounds required to perform various tasks.
In this talk I will describe two sets of results. The first concerns the complexity of distributed subgraph detection: we have n servers, each representing a node in an undirected graph, and each server receives as input its adjacent edges in the graph. The goal of the computation is to determine whether the global input graph contains some fixed subgraph. I will describe upper and lower bounds for several classes of subgraphs, through a connection to Turan numbers. The general case remains open.
In the second part of the talk I will describe recent work on multi-party number-in-hand communication and information complexity, and show a tight upper and lower bound for set disjointness in the shared blackboard model.
Joint work with Mark Braverman, Andrew Drucker and Fabian Kuhn.
The task of finding heavy hitters is one of the best-known and well-studied problems in the area of data streams. In a sense, the strongest guarantee available is the L2 guarantee, which requires finding all items that occur at least eps*||f||_2 times in the stream, where the i-th coordinate of the vector f is the number of occurrences of i in the stream. The first algorithm to achieve the L2 guarantee was the CountSketch (Charikar, Chen, and Farach-Colton, ICALP'02), which, for constant eps, requires O(log n) words of memory and O(log n) update time. It is known to be space-optimal if the stream includes deletions.
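For reference, a compact unoptimized sketch of the CountSketch estimator mentioned above, using pairwise-independent hashing and a median over rows (ours, for intuition only):

    import random
    from statistics import median

    class CountSketch:
        def __init__(self, rows=5, width=256, seed=0, p=2 ** 31 - 1):
            r = random.Random(seed)
            self.p, self.width = p, width
            self.C = [[0] * width for _ in range(rows)]
            self.h = [(r.randrange(1, p), r.randrange(p)) for _ in range(rows)]   # bucket hash
            self.s = [(r.randrange(1, p), r.randrange(p)) for _ in range(rows)]   # sign hash

        def _bucket(self, r, i):
            a, b = self.h[r]
            return ((a * i + b) % self.p) % self.width

        def _sign(self, r, i):
            a, b = self.s[r]
            return 1 if ((a * i + b) % self.p) % 2 == 0 else -1

        def update(self, i, delta=1):
            for r in range(len(self.C)):
                self.C[r][self._bucket(r, i)] += self._sign(r, i) * delta

        def estimate(self, i):
            return median(self._sign(r, i) * self.C[r][self._bucket(r, i)]
                          for r in range(len(self.C)))

    cs = CountSketch()
    for item in [7] * 1000 + list(range(100)):
        cs.update(item)
    print(cs.estimate(7))    # close to 1001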
In this talk I will discuss recent improvements that allow us to find L2 heavy hitters in O(1) memory and O(1) update time in insertion only streams. The improvements rely on a deeper understanding of the AMS sketch (Alon, Matias, and Szegedy STOC'96) and similar sketches and draw on the theory of Gaussian processes. This talk is based on joint work with Vladimir Braverman, Nikita Ivkin, Jelani Nelson, Zhengyu Wang, and David P. Woodruff in arxiv:1511.00661 and arxiv:1603.00759.
In many situations, sample data is obtained from a noisy or imperfect source. In order to address such corruptions, we propose the methodology of sampling correctors. Such algorithms use structure that the distribution is purported to have, in order to allow one to make "on-the-fly" corrections to samples drawn from probability distributions. These algorithms may then be used as filters between the noisy data and the end user. We show connections between sampling correctors, distribution learning algorithms, and distribution property testing algorithms. We show that these connections can be utilized to expand the applicability of known distribution learning and property testing algorithms as well as to achieve improved algorithms for those tasks. Warning: This talk contains more questions than answers...
Joint work with Clement Canonne and Themis Gouleakis.
We present a randomized algorithm that computes a Minimum Spanning Tree (MST) in O(log^* n) rounds, with high probability, in the Congested Clique model of distributed computing. In this model, the input is a graph on n nodes, initially each node knows only its incident edges, and per round each two nodes can exchange O(log n) bits.
Our key technical novelty is an O(log^* n) Graph Connectivity algorithm, the heart of which is a (recursive) forest growth method, based on a combination of two ideas: a sparsity-sensitive sketching aimed at sparse graphs and a random edge sampling aimed at dense graphs.
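For intuition, here is a sequential toy simulation of forest growth for connectivity: in each round every current component grabs one outgoing edge, so the number of components at least halves per round. The sketching and sampling machinery that implements such rounds in the Congested Clique is omitted; this is our illustration, not the paper's algorithm.

    def forest_growth_components(n, edges):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        while True:
            pick = {}                                   # component root -> one outgoing edge
            for u, v in edges:
                ru, rv = find(u), find(v)
                if ru != rv:
                    pick.setdefault(ru, (u, v))
                    pick.setdefault(rv, (u, v))
            if not pick:                                # no outgoing edges left: done
                break
            for u, v in pick.values():                  # merge along the chosen edges
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv
        return {find(v) for v in range(n)}              # one representative per component

    print(len(forest_growth_components(6, [(0, 1), (1, 2), (3, 4)])))   # 3 components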
Our result improves significantly over the $O(\log \log \log n)$ algorithm of Hegeman et al. [PODC 2015] and the $O(\log \log n)$ algorithm of Lotker et al. [SPAA 2003; SICOMP 2005].
Joint work with Mohsen Ghaffari.
Estimating the number of distinct elements in a dataset by examining only a fraction of the data is known to be a hard problem, both theoretically and in practice.
Our work explores a breakthrough theoretical result by Valiant and Valiant from 2011 that presents a provably accurate method for doing such estimations.
Our goal is to put this theory into practice for the important task of estimating the deduplication ratio of a very large dataset. However, deploying this technique in a real world setting runs into significant obstacles.
In the talk I will describe new techniques that help bridge this gap and enable the use of this exciting approach. Our work achieves a major improvement over the current state-of-the-art practical solutions.
The talk is for a general audience, no prior knowledge is assumed.
Based on joint work with Dmitry Sotnikov and Ety Khaitzin that appeared at Usenix FAST 2016.
Expander graphs are widely studied, and various methods are known for obtaining bounded-degree expander graphs. Recently, there has been growing interest in understanding combinatorial expansion in higher dimensions (higher-dimensional simplicial complexes). However, bounded-degree combinatorial expanders (random or explicit) were not known until our work.
We present a local to global criterion on a complex that implies combinatorial expansion. We use our criterion to present explicit bounded degree high dimensional expanders. This solves in the affirmative an open question raised by Gromov, who asked whether bounded degree high dimensional expanders could at all exist.
We expect that the emerging theory of high-dimensional expansion is likely to have various applications in the theory of computation. Thus, one of the goals of this talk is to introduce this concept to the theory community.
Based on joint works with David Kazhdan and Alex Lubotzky, and with Shai Evra.
Raz's celebrated Parallel Repetition Theorem shows that the probability of simultaneously winning n independent instances of a two-player one-round game G is exponentially small in n, when the maximum success probability of G is less than 1. Though the statement is intuitive, the proof is rather nontrivial and has found important application in hardness of approximation, cryptography, and communication complexity.
There are two major open problems regarding the parallel repetition of games: does an analogue of Raz's theorem hold for (a) games with more than two players, and (b) games with quantumly entangled players? Extending Raz’s theorem to these settings is a challenging problem for a number of reasons: techniques for attacking direct sum/direct product problems in multiparty settings are lacking, and our understanding of quantum entanglement as an information theoretic resource is quite limited.
In this work, we show how to sidestep these barriers and make progress on the two open problems. We first prove exponential-decay parallel repetition theorems for a class of games we call "anchored games" in the multiplayer and entangled-player settings. Then, we show how to efficiently transform any game into an equivalent anchored game. Together, our results provide a simple hardness-amplification technique for games in both the classical multiplayer and quantum settings.
Joint work with Mohammad Bavarian and Thomas Vidick.
Adaptivity is an important feature of data analysis - the choice of questions to ask about a dataset often depends on previous interactions with the same dataset. However, statistical validity is typically studied in a nonadaptive model, where all questions are specified before the dataset is drawn. Recent work by Dwork et al. (STOC, 2015) initiated the formal study of this problem, and gave the first upper bounds on the achievable generalization error for adaptive data analysis.
The results of Dwork et al. are based on a connection with algorithmic stability in the form of differential privacy. We extend their work by giving a quantitatively optimal, more general, and simpler proof of their main theorem that stable algorithms of the kind guaranteed by differential privacy imply low generalization error. We also show that weaker stability guarantees such as bounded KL divergence and total variation distance lead to correspondingly weaker generalization guarantees.
Joint work with Raef Bassily, Kobbi Nissim, Adam Smith, Thomas Steinke, and Jonathan Ullman.
For the past 40 years computer scientists generally believed that NP-complete problems are intractable. In particular, Boolean satisfiability (SAT), as a paradigmatic NP-complete problem, has been considered to be intractable. Over the past 20 years, however, there has been a quiet, but dramatic, revolution, and very large SAT instances are now being solved routinely as part of software and hardware design.
In this talk I will review this amazing development and show that we can leverage SAT solving to accomplish other Boolean reasoning tasks. Counting the number of satisfying truth assignments of a given Boolean formula, or sampling such assignments uniformly at random, are fundamental computational problems in computer science with numerous applications. While the theory of these problems was thoroughly investigated in the 1980s, approximation algorithms developed by theoreticians do not scale up to industrial-sized instances. Algorithms used by industry offer better scalability, but give up certain correctness guarantees to achieve it. We describe a novel approach, based on universal hashing and Satisfiability Modulo Theories, that scales to formulas with hundreds of thousands of variables without giving up correctness guarantees.
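To convey the flavor of the hashing-based approach, here is a toy sketch (a simplification of mine, not the algorithm from the talk): add random XOR (parity) constraints to the formula and check satisfiability with an off-the-shelf solver; roughly, if the formula survives m random parity constraints, it has on the order of 2^m models. The sketch below uses the z3 solver and omits the repetitions and thresholds that real counters need for (epsilon, delta) guarantees.

```python
# Toy sketch of hashing-based model "counting": repeatedly add random XOR
# constraints and test satisfiability. NOT the algorithm from the talk;
# real counters add careful repetition and thresholds for real guarantees.
import random
from z3 import Bools, Solver, Xor, Or, BoolVal, sat

def random_xor(vars_):
    """A random parity constraint over a random subset of the variables."""
    subset = [v for v in vars_ if random.random() < 0.5]
    acc = BoolVal(random.random() < 0.5)   # random right-hand side
    for v in subset:
        acc = Xor(acc, v)
    return acc

def rough_log2_count(formula, vars_, max_m=20):
    """Largest m such that the formula plus m random XORs is still satisfiable."""
    for m in range(max_m + 1):
        s = Solver()
        s.add(formula)
        for _ in range(m):
            s.add(random_xor(vars_))
        if s.check() != sat:
            return m - 1   # crude estimate: roughly 2^(m-1) models
    return max_m

if __name__ == "__main__":
    x = Bools("x0 x1 x2 x3 x4 x5")
    formula = Or(x[0], x[1])               # a tiny example formula
    print("log2(#models) is roughly", rough_log2_count(formula, list(x)))
```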
In today’s world there are huge amounts of data that need to get reliably stored or transmitted. However, some amount of noise or corruption is inevitable. An error-correcting code is a scheme for robustly representing data in the form of a codeword that allows one to detect and correct errors in transmission. Locally-testable and locally-decodable codes are special families of error-correcting codes that admit highly efficient algorithms that detect and correct errors in sublinear time with high probability, probing only a small number of entries of the corrupted codeword. While locally-testable and locally-decodable codes have been intensely studied in the past 2 decades, in recent years there has been even further incentive for their study due to their relevance for transmission and storage of massive data and the successful implementation of local codes in cloud storage systems.
In this talk, I will show an exponential improvement on the best-known running time of error detection and correction algorithms for locally-testable and locally-decodable codes. Specifically, I will describe new families of locally-testable codes with constant rate that can detect a constant fraction of errors in time (log n)^{O(log log n)} and new families of locally-decodable codes of constant rate that can correct a constant fraction of errors in time exp(\sqrt{log n}). Prior to that, the best known running time for such codes was n^{epsilon} (for a constant epsilon) using several, quite different, constructions.
(Based on joint work with Swastik Kopparty, Or Meir and Shubhangi Saraf)
Locality-Sensitive Hashing (LSH) is a powerful technique for the approximate nearest neighbor search (ANN) in high dimensions. In this talk I will present two recent results:
1) I will show a data structure for ANN for the Euclidean distance that provably outperforms the best possible LSH-based data structure. We proceed via designing a good *data-dependent* hash family.
2) I will show a practical and optimal LSH family for the cosine similarity (a.k.a. Euclidean distance on a sphere). It substantially outperforms the celebrated Hyperplane LSH family. Along the way, I will try to debunk two popular myths about LSH:
* LSH-based data structures consume too much memory and are thus impractical;
* Optimal LSH constructions are too complicated to be made practical.
The talk is based on two papers: arXiv: 1501.01062 (joint with Alexandr Andoni, STOC 2015) and arXiv: 1509.02897 (joint with Alexandr Andoni, Piotr Indyk, Thijs Laarhoven and Ludwig Schmidt, NIPS 2015).
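For readers unfamiliar with the baseline, here is a minimal sketch of the Hyperplane LSH family mentioned above: each hash bit is the sign of a projection onto a random Gaussian direction, and two points collide on a bit with probability 1 - theta/pi, where theta is the angle between them. The parameters below are illustrative, not tuned; the data-dependent and practical families from the talk replace this sign-of-projection step.

```python
# Minimal sketch of Hyperplane LSH for angular/cosine similarity:
# each hash code is k sign bits of random Gaussian projections.
import numpy as np

def hyperplane_lsh(points, k=16, seed=0):
    """Return a k-bit hash code (tuple of 0/1) for each row of `points`."""
    rng = np.random.default_rng(seed)
    d = points.shape[1]
    planes = rng.standard_normal((k, d))        # k random hyperplanes
    bits = (points @ planes.T > 0).astype(int)  # sign pattern per point
    return [tuple(row) for row in bits]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal(128)
    y = x + 0.1 * rng.standard_normal(128)      # a near-duplicate of x
    z = rng.standard_normal(128)                # an unrelated point
    codes = hyperplane_lsh(np.stack([x, y, z]))
    # Close points tend to share most bits; unrelated points share about half.
    print("x vs y agree on", sum(a == b for a, b in zip(codes[0], codes[1])), "bits")
    print("x vs z agree on", sum(a == b for a, b in zip(codes[0], codes[2])), "bits")
```

In a full data structure, many such k-bit codes serve as bucket keys and a query probes the buckets it hashes to.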
We show techniques for decreasing the error probability of randomized algorithms and for converting randomized algorithms to deterministic (non-uniform) algorithms. Unlike most existing techniques, which involve repetition of the randomized algorithm and hence a slowdown, our techniques produce algorithms whose running time is similar to that of the original randomized algorithms.
The amplification technique is related to a certain stochastic multi-armed bandit problem. The derandomization technique -- which is the main contribution of this work -- points to an intriguing connection between derandomization and sketching/sparsification.
We demonstrate the techniques by showing applications to max-cut on dense graphs, approximate clique, constraint satisfaction problems on dense bipartite graphs, and reducing list decoding to unique decoding for Reed-Muller codes.
This is joint work with Ofer Grossman.
The class P attempts to capture the efficiently solvable computational tasks. It is full of practically relevant problems, with varied and fascinating combinatorial structure.
In this talk, I will give an overview of a rapidly growing body of work that seeks a better understanding of the structure within P. Inspired by NP-hardness, the main tool in this approach is combinatorial reductions. Combining these reductions with a small set of plausible conjectures, we obtain tight lower bounds on the time complexity of many of the most important problems in P.
I will present the most recent landscape of P and the conjectures on which this project is based (e.g. the Strong Exponential Time Hypothesis). I will discuss recent attempts at identifying new conjectures: either more reliable ones, or ones that will get us closer to a full classification of the important problems in P.
Finally, I will highlight a surprising new reduction from Circuit-SAT to natural problems in P like Edit-Distance which proves that minor improvements over the quadratic running time of Edit-Distance are enough to prove major complexity separations.
We study the two-party communication complexity of the geometric problem of finding an approximate Brouwer fixed-point of a composition of two Lipschitz functions g*f, where Alice knows f and Bob knows g.
We prove an essentially tight communication lower bound for this problem, using a novel adaptation of the Raz-McKenzie simulation theorem to geometric settings.
We show that a slightly stronger version of this communication problem would imply (essentially) tight communication lower bounds for the problem of finding an approximate Nash equilibrium in 2-player (and n-player) games, where each player initially knows only his own payoff matrix.
Joint work with Tim Roughgarden.
Online d-dimensional vector packing models many settings such as minimizing resources in data centers where jobs have multiple resource requirements (CPU, Memory, etc.). However, no online d-dimensional vector packing algorithm can achieve a competitive ratio better than d. Fortunately, in many natural applications, vectors are relatively small, and thus the lower bound does not hold. For sufficiently small vectors, an O(log d)-competitive algorithm was known. We improve this to a constant competitive ratio, arbitrarily close to e (where e is the base of the natural logarithm), given that vectors are sufficiently small.
We give improved results for the two dimensional case. For arbitrarily small vectors, the First Fit algorithm for two dimensional vector packing is no better than 2-competitive. We present a natural family of First Fit variants, and for optimized parameters get a competitive ratio of approximately 1.48 for sufficiently small vectors.
We improve upon the 1.48 competitive ratio - not via a First Fit variant - and give a competitive ratio arbitrarily close to 4/3 for packing small, two dimensional vectors. We show that no algorithm can achieve better than a 4/3 competitive ratio for two dimensional vectors, even if one allows the algorithm to split vectors among arbitrarily many bins.
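For concreteness, here is a minimal sketch of the plain First Fit baseline for online d-dimensional vector packing; the variants analyzed in the talk refine the bin-selection rule, and the names below are mine.

```python
# Plain First Fit for online d-dimensional vector bin packing:
# place each arriving vector into the first open bin where it still fits
# (componentwise load <= 1); otherwise open a new bin.
def first_fit(vectors):
    bins = []         # each bin is represented by its current load vector
    assignment = []   # bin index chosen for each arriving vector
    for v in vectors:
        for i, load in enumerate(bins):
            if all(l + x <= 1.0 for l, x in zip(load, v)):
                bins[i] = [l + x for l, x in zip(load, v)]
                assignment.append(i)
                break
        else:
            bins.append(list(v))
            assignment.append(len(bins) - 1)
    return bins, assignment

if __name__ == "__main__":
    items = [(0.3, 0.1), (0.2, 0.4), (0.6, 0.5), (0.05, 0.05)]
    bins, assignment = first_fit(items)
    print("bins used:", len(bins), "assignment:", assignment)
```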
Much of the theory of mathematical programs for combinatorial optimization can be described in the following way: A polytope of interest has exponentially many (in the dimension) facets, but can be written as the linear projection of a simpler convex body in a higher-dimensional space. Simple might mean a polytope with a much smaller number of facets, or a spectrahedron (the intersection of an affine subspace with the PSD cone) of small dimension. This allows one to optimize linear functionals over the base polytope by instead optimizing a lifted functional over the lifted body.
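In symbols (a standard formulation, not specific to this work): if the polytope of interest is the linear projection P = pi(Q) of a simpler convex body Q, then linear optimization over P reduces to linear optimization over Q.

```latex
% Linear optimization over P via a lift P = \pi(Q) (standard formulation).
\[
  \max_{x \in P} \langle c, x \rangle
  = \max_{y \in Q} \langle c, \pi(y) \rangle
  = \max_{y \in Q} \langle \pi^{*} c,\, y \rangle,
  \qquad \text{where } \pi \text{ is linear and } \pi^{*} \text{ is its adjoint.}
\]
```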
Unless P=NP, one does not expect certain polytopes--like the convex hull of indicators of traveling salesman tours in a graph--to have a small lift. But it remained open to prove any non-trivial lower bound on the necessary dimension for a spectrahedral lift, i.e. to prove that semi-definite programs do not yield efficient optimization procedures over these polytopes.
We show that the cut, TSP, and stable set polytopes on n-vertex graphs are not the linear image of a spectrahedron of dimension less than exp(n^c) for some constant c > 0. In the process, many interesting phenomena emerge: Factorization of operators through the PSD cone, quantum information theory, discrete Fourier analysis, and real algebraic geometry.
This is based on joint work with Prasad Raghavendra and David Steurer.
An Oblivious RAM (ORAM), introduced by Goldreich and Ostrovsky (JACM 1996), is a (probabilistic) RAM that hides its access pattern, i.e. for every input the observed locations accessed are similarly distributed. Great progress has been made in recent years in minimizing the overhead of ORAM constructions, with the goal of obtaining the smallest overhead possible.
We revisit the lower bound, due to Goldreich and Ostrovsky, on the overhead required to obliviously simulate programs. While the lower bound is fairly general, and in particular covers the offline case, where the simulator is given the reads and writes ahead of time, it does assume that the simulator behaves in a "balls and bins" fashion. That is, the simulator must act by shuffling data items around, and is not allowed to use sophisticated encodings of the data.
We prove that for the OFFLINE case, showing a lower bound without the above restriction is related to the size of circuits for sorting. Our proof is constructive, and uses a bit-slicing approach which manipulates the bit representations of data in the simulation. This implies that, without first obtaining currently unknown superlinear lower bounds on the size of such circuits, we cannot hope to prove lower bounds for offline (unrestricted) ORAMs.
The algorithmic task of computing the Hamming distance between a given pattern of length m and each location in a text of length n is one of the most fundamental tasks in string algorithms. Unfortunately, there is evidence that, for a given text and pattern, one cannot compute the exact Hamming distance at all locations in the text in time polynomially smaller than n\sqrt m. Nevertheless, Karloff showed that if one is willing to settle for a 1±\epsilon approximation, then the problem can be solved with high probability in O~(n / \epsilon^2) time.
Due to related lower bounds for computing the Hamming distance of two strings in the one-way communication complexity model, it is strongly believed that obtaining an algorithm for solving the approximation version cannot be done much faster as a function of 1 / \epsilon. We will show that this belief is false by introducing a new O~(n / \epsilon) time algorithm that succeeds with high probability.
The main idea behind our algorithm, which is common in sparse recovery problems, is to reduce the variance of a specific randomized experiment by (approximately) separating heavy hitters from non-heavy hitters. However, while known sparse recovery techniques work very well on vectors, they do not seem to apply here, where we are dealing with mismatches between pairs of characters. We introduce two main algorithmic ingredients. The first is a new sparse recovery method that applies to pair inputs (such as in our setting). The second is a new construction of hash/projection functions, which allows one to count the number of projections that induce mismatches between two characters exponentially faster than by brute force. We expect these algorithmic techniques to be of independent interest.
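To make the starting point concrete, here is a rough rendering of the random-projection idea behind the classical (1±\epsilon)-approximation (a simplification of mine, not the new O~(n/\epsilon) algorithm): project the alphabet to {0,1} at random, count binary mismatches at every alignment via convolution, rescale by 2 (distinct symbols receive different bits with probability 1/2), and average over repetitions.

```python
# Rough sketch of the random-projection idea behind the classical
# (1 +/- eps)-approximation for all-alignments Hamming distance.
# (A simplified rendering for illustration, not the new algorithm.)
import numpy as np

def binary_mismatches(text_bits, pat_bits):
    """Exact mismatch counts of a 0/1 pattern against every text alignment."""
    t = np.asarray(text_bits, dtype=float)
    p = np.asarray(pat_bits, dtype=float)
    m = len(p)
    ones_vs_zeros = np.convolve(t, (1 - p)[::-1])[m - 1:len(t)]   # text 1, pattern 0
    zeros_vs_ones = np.convolve(1 - t, p[::-1])[m - 1:len(t)]     # text 0, pattern 1
    return ones_vs_zeros + zeros_vs_ones

def approx_hamming(text, pattern, reps=30, seed=0):
    rng = np.random.default_rng(seed)
    alphabet = sorted(set(text) | set(pattern))
    est = np.zeros(len(text) - len(pattern) + 1)
    for _ in range(reps):
        proj = {c: rng.integers(2) for c in alphabet}   # random 0/1 projection
        t = [proj[c] for c in text]
        p = [proj[c] for c in pattern]
        est += 2 * binary_mismatches(t, p)   # distinct symbols differ w.p. 1/2
    return est / reps

if __name__ == "__main__":
    print(approx_hamming("abracadabra", "abra"))
```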
We show a 2^{n+o(n)}-time algorithm for the Shortest Vector Problem on n-dimensional lattices (improving on the previous best-known algorithm of Micciancio and Voulgaris, which runs in time 4^{n+o(n)}). The algorithm uses the elementary yet powerful observation that, by properly combining samples from a Gaussian distribution over the lattice, we can produce exact samples from a narrower Gaussian distribution on the lattice. We use such a procedure repeatedly to obtain samples from an arbitrarily narrow Gaussian distribution over the lattice, allowing us to find a shortest vector.
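Informally, and glossing over the care needed to make the discrete statement exact, the combining step rests on the continuous-Gaussian fact below, applied to pairs of lattice samples that agree modulo 2L so that their average stays in the lattice.

```latex
% Continuous-Gaussian intuition for the combining step (informal sketch only;
% the exact statement over the lattice requires restricting to pairs in a
% common coset of 2L and further care supplied by the actual analysis).
\[
  X_1, X_2 \sim \mathcal{N}(0, s^2 I_n) \text{ i.i.d.}
  \;\Longrightarrow\;
  \tfrac{1}{2}(X_1 + X_2) \sim \mathcal{N}\!\left(0, \tfrac{s^2}{2} I_n\right),
  \qquad
  X_1, X_2 \in \mathcal{L},\ X_1 \equiv X_2 \pmod{2\mathcal{L}}
  \;\Longrightarrow\;
  \tfrac{1}{2}(X_1 + X_2) \in \mathcal{L}.
\]
```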
Both the algorithm and the analysis are quite simple in hindsight. (The main technical tool is an identity on Gaussian measures with a simple geometric proof originally due to Riemann.) If time allows and interest merits, we will discuss a more technical variant of this algorithm that solves the Closest Vector Problem (a seemingly harder problem) in the same asymptotic running time.
Based on joint work with Divesh Aggarwal, Daniel Dadush, and Oded Regev. (See http://arxiv.org/abs/1412.7994 and http://arxiv.org/abs/1504.01995.)
The Hamming and edit metrics are two common notions of distance between pairs of strings x,y lying in the Boolean hypercube. The edit distance between x and y is defined as the minimum number of character insertions, deletions, and bit flips needed to convert x into y, whereas the Hamming distance between x and y is the number of bit flips needed to convert x into y.
In this paper we study a randomized injective embedding of the edit distance into the Hamming distance with a small distortion. This question was studied by Jowhari (ESA 2012) and is mainly motivated by two questions in communication complexity: the document exchange problem and deciding edit distance using a sketching protocol.
We show a randomized embedding with quadratic distortion. Namely, for any $x,y$ with edit distance $k$, the Hamming distance between the embeddings of $x$ and $y$ is $O(k^2)$ with high probability. This improves over the distortion ratio of $O(\log n \log^* n)$ obtained by Jowhari for small values of $k$. Moreover, the embedding's output size is linear in the input size, and the embedding can be computed using a single pass over the input.
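To give a feel for such embeddings, here is a rough random-walk-style sketch in this spirit; the output length, padding symbol, and other parameters are illustrative choices of mine rather than the paper's exact construction. Shared random bits indexed by (step, current character) decide whether a pointer into the string advances, so two strings at small edit distance tend to re-synchronize shortly after each edit.

```python
# Rough sketch of a random-walk-style embedding from edit distance to
# Hamming distance (parameters are illustrative, not the paper's exact ones).
import random

def embed(x, r, out_len, pad="#"):
    """Single-pass embedding of string x using shared random bits r[(t, c)]."""
    out, i = [], 0
    for t in range(out_len):
        c = x[i] if i < len(x) else pad
        out.append(c)
        if i < len(x) and r[(t, c)]:   # advance the pointer with probability 1/2
            i += 1
    return "".join(out)

def hamming(a, b):
    return sum(ca != cb for ca, cb in zip(a, b))

if __name__ == "__main__":
    x = "abracadabra"
    y = "abracadabrX"                  # edit distance 1 from x
    alphabet = set(x) | set(y) | {"#"}
    rng = random.Random(0)
    L = 3 * max(len(x), len(y))
    r = {(t, c): rng.randrange(2) for t in range(L) for c in alphabet}
    print(hamming(embed(x, r, L), embed(y, r, L)))   # typically small
```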
Assortment planning is a major operational issue that arises in many industries, such as retailing, airlines and consumer electronics. Given a set of products that are differentiated by price, quality and possibly other attributes, one has to decide on the subset of products and the respective quantities that will be stocked and offered to heterogeneous customers, who exhibit substitution behavior.
The general problem can be shown to be NP-hard to approximate better than a factor linear in the number of products. In this talk we discuss how for a range of practically interesting special cases, one could design conceptually simple policies that admit provably near-optimal solutions. The analysis reveals interesting structural properties, including hidden submodularity and decomposition properties.
The talk is based on several papers which are joint work with Ali Aouad, Vineet Goyal and Danny Segev.
The theory of structure and pseudo-randomness has been very influential in several areas of mathematics, such as number theory, graph theory and harmonic analysis. It has also been influential in theoretical computer science, with applications in complexity theory, cryptography and property testing. At a high level, it allows one to analyze arbitrary objects by decomposing them into a "structural" component and a "pseudo-random" component. The pseudo-random component behaves in many ways like random noise, while the structural component has a concise representation which makes it amenable to analysis and algorithmic manipulation.
In this talk, I will describe applications of this paradigm to coding theory. I will describe a new general approach to list decoding, which proceeds by decomposing an arbitrary received word into a structured word and pseudo-random noise. This allows for a simplified analysis of the list decoding problem. In particular, I will describe how this approach leads to a resolution of a conjecture of Gopalan, Klivans and Zuckerman [STOC 2008] that the list decoding radius of Reed-Muller codes (in certain regimes) is equal to the minimum distance of the code.
Based on joint work with Abhishek Bhowmick.
There are at least 2 aspects to learning: predicting the outcome of unseen events, and finding simple explanations of observed systems. We shall discuss 2 formal abstractions of these aspects: PAC learning and sample compression schemes. We shall start with an introduction to these notions, and then discuss the equivalence between them.
Based on a joint project with Shay Moran, Amir Shpilka and Avi Wigderson.
We generalize the technique of smoothed analysis to distributed algorithms in dynamic network models. Whereas standard smoothed analysis studies the impact of small random perturbations of input values on algorithm performance metrics, dynamic graph smoothed analysis studies the impact of random perturbations of the underlying changing network graph topologies. Similar to the original application of smoothed analysis, our goal is to study whether known strong lower bounds in dynamic network models are robust or fragile: do they withstand small (random) perturbations, or do such deviations push the graphs far enough from a precise pathological instance to enable much better performance? Fragile lower bounds are likely not relevant for real-world deployment, while robust lower bounds represent a true difficulty caused by dynamic behavior. We apply this technique to three standard dynamic network problems with known strong worst-case lower bounds: random walks, flooding, and aggregation. We prove that these bounds provide a spectrum of robustness when subjected to smoothing---some are extremely fragile (random walks), some are moderately fragile / robust (flooding), and some are extremely robust (aggregation).
Joint work with Jeremy Fineman (Georgetown), Seth Gilbert (National University of Singapore), and Calvin Newport (Georgetown).
Given an input $x$ and a search problem $F$, local computation algorithms (LCAs) implement query access to specified locations of a legal output $y \in F(x)$, using polylogarithmic time and space. Previous work on LCAs restricted its focus to graphs of bounded degree, or to graphs whose degrees are binomially distributed with bounded expectation.
Using a new palette of techniques, we show that it is possible to obtain LCAs for maximal independent set (MIS) and maximal matching (MM) on trees with degree bounded by polylog(n). Our result immediately extends to all graphs with degree bounded by polylog(n), as long as they do not contain short cycles (of length polylog(n)).
We define a family of graphs called $d$-light graphs, and show how to convert a large class of online algorithms (including MIS and MM) to LCAs on these graphs. We then show that applying the MIS (or MM) LCA on appropriately selected $d$-light subgraphs allows us to quickly address all of the vertices of the polylog(n)-degree graph.
In addition to expanding the range of LCAs to graphs of polylogarithmic degree, our new techniques also significantly improve the running times and space requirements, expand the family of graphs, and better define the family of algorithms to which previous results apply. Furthermore our proofs are simpler and more concise than the previous proof methods.
Joint work with Omer Reingold.
The interplay between algorithms and proofs has been one of the most fascinating and fruitful themes in theoretical computer science. In this talk I will describe a different connection between these two concepts: a general methodology for algorithm design via an automatic transformation of a proof into an algorithm. This methodology yields a systematic way to design and analyze algorithms across many domains. In particular, we and others have used it for problems in combinatorial optimization, machine learning, and quantum information theory, and it shows promise for making progress on important open questions such as settling Khot's Unique Games Conjecture.
I will demonstrate this methodology by presenting an algorithm for the Sparse Coding / Dictionary Learning problem that handles much more general inputs than were known before, and an algorithm for the Unique Games problem that can solve all the previously known candidate hard instances.
Key to our approach is the notion of "pseudo-distributions": objects that are emphatically different from actual distributions but behave like them in the eyes of low degree polynomials. We use these pseudo-distributions to "lift" a proof into an algorithm via the Shor-Parrilo-Lasserre "Sum-of-Squares" semidefinite programming hierarchy. The talk will be based on several joint works with (varying subsets of) Fernando Brandao, Aram Harrow, Jonathan Kelner, David Steurer and Yuan Zhou.
In many settings, people exhibit behavior that is inconsistent across time — we allocate a block of time to get work done and then procrastinate, or put effort into a project and then later fail to complete it. An active line of research in behavioral economics and related fields has developed and analyzed models for this type of time-inconsistent behavior.
Here we propose a graph-theoretic model of tasks and goals, in which dependencies among actions are represented by a directed graph, and a time-inconsistent agent constructs a path through this graph. We first show how instances of this path-finding problem on different input graphs can reconstruct a wide range of qualitative phenomena observed in the literature on time-inconsistency, including procrastination, abandonment of long-range tasks, and the benefits of reduced sets of choices. We then explore a set of analyses that quantify over the set of all graphs; among other results, we find that in any graph, there can be only polynomially many distinct forms of time-inconsistent behavior; and any graph in which a time-inconsistent agent incurs significantly more cost than an optimal agent must contain a large “procrastination” structure as a minor. Finally, we use this graph-theoretic model to explore ways in which tasks can be designed to help motivate agents to reach designated goals.
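One simple way to make the traversal rule concrete (my own rendering of this class of present-bias models, with an illustrative parameter b): the agent evaluates the immediate edge as b > 1 times its cost while pricing the remaining path at its true shortest-path cost, and repeatedly moves to the neighbor minimizing this perceived cost; this convention is equivalent, up to scaling, to discounting all future costs by 1/b.

```python
# Sketch of a present-biased agent walking a task graph toward a goal.
# Convention used here (one of the equivalent parametrizations): the immediate
# edge cost is inflated by a bias factor b > 1, while the remaining path is
# evaluated at its true (shortest-path) cost. The graph and names are illustrative.
import heapq

def true_dist(graph, goal):
    """Exact cost-to-go from every node to `goal` (Dijkstra on reversed edges)."""
    rev = {}
    for u, nbrs in graph.items():
        for v, c in nbrs.items():
            rev.setdefault(v, {})[u] = c
    dist, heap = {goal: 0.0}, [(0.0, goal)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, c in rev.get(u, {}).items():
            if d + c < dist.get(v, float("inf")):
                dist[v] = d + c
                heapq.heappush(heap, (d + c, v))
    return dist

def biased_walk(graph, start, goal, b=2.0):
    dist = true_dist(graph, goal)
    path, v, total = [start], start, 0.0
    while v != goal:
        # Perceived cost of moving to w: inflated immediate edge + true remainder.
        w = min(graph[v], key=lambda w: b * graph[v][w] + dist.get(w, float("inf")))
        total += graph[v][w]
        path.append(w)
        v = w
    return path, total

if __name__ == "__main__":
    # Two routes to the goal: a cheap-up-front route and a "procrastination" route.
    g = {"s": {"work": 4.0, "delay": 1.0},
         "work": {"t": 1.0},
         "delay": {"t": 5.0},
         "t": {}}
    print(biased_walk(g, "s", "t", b=2.0))   # the biased agent prefers "delay"
```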
This is joint work with Jon Kleinberg.
I will describe recent applications of "birthday repetition" that can give (conditional) quasi-polynomial time hardness results for Densest k-Subgraph and for $\epsilon$-Nash in 2-player games. (For the latter result we use a non-standard "PCP for PPAD" assumption which I will also discuss.) Both results are tight by [FS97], [LMM03], [Barman15].
Based on:
- DkS: http://eccc.hpi-web.de/report/2015/074/ (joint work with Mark Braverman, Young Kun Ko, and Omri Weinstein)
- Nash: http://arxiv.org/abs/1504.02411 (joint work with Yakov Babichenko and Christos Papadimitriou)
Research can be viewed as a search for an accepting state in an exponential space. When we look back in hindsight after finding an accepting state, can we identify a much shorter path than the one discovered by trial and error as the search actually proceeded?
In this talk I'll show that this is the case for Distributed Computability: the discovery that different distributed problems have different levels of difficulty, and the identification of the weakest model of distributed computation that allows one to solve a given problem. I'll explain the essence of 40 years of research in an hour, by showing that had the right questions been asked at the right time, all the results could have been obtained in a span of time an order of magnitude shorter.
Some of the major ideas in the talk were developed in works with Afek (TAU), Borowsky (Akamai), Lynch (MIT), and Rajsbaum (UNAM).
The PCP theorem (AS,ALMSS 1991) guarantees that every NP language has a Probabilistically Checkable Proof (PCP) system allowing a verifier to check a witness very efficiently using randomness, and allowing for small error.
Most of the talk will not assume prior knowledge, but I will also devote some time to recent joint work with Harsha and Kindler.
In this work we make (some) progress towards proving the so-called "sliding-scale conjecture". This is a conjecture of BGLR from 1993 about the tradeoff between the number of bits read from the PCP proof and the error of the verifier.
Our work revisits older constructions and analyzes them using the more modern "modular-composition" approach.
Based on joint work with Prahladh Harsha and Guy Kindler.
Motivated by the goal of securely searching and updating distributed data, we introduce the notion of function secret sharing (FSS), a form of “additive secret sharing” for {\em functions} f: {0,1}^n → G, where G is an abelian group.
An m-party FSS scheme for a function class F allows one to split any function f from F into m succinctly described functions f_i, such that: (1) for every input x, f(x) is equal to the sum of evaluations \sum_i f_i(x), and (2) any strict subset of the "share functions" f_i hides f. FSS provides a natural generalization of distributed point functions, introduced by Gilboa and Ishai (Eurocrypt 2014), which coincide with the special case of two parties and the class F of point functions (functions that evaluate to 0 at all but one point).
We present two types of results:
- We obtain efficiency improvements and extensions of the original distributed point function construction.
- We then initiate a systematic study of general FSS, providing constructions for richer function classes, and establishing relations with other cryptographic primitives.
Joint work with Niv Gilboa and Yuval Ishai.
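To illustrate the syntax only, here is a toy two-party scheme for point functions in which the keys are additive secret sharings of the entire truth table. It satisfies the correctness and hiding conditions above, but its key size is exponential in n, so it misses the succinctness that makes real FSS and DPF constructions interesting; the names and modulus are mine.

```python
# Toy 2-party "FSS" for point functions f_{alpha,beta}: additively share the
# whole truth table. Correct (shares sum to f(x)) and hiding (each key alone is
# uniform), but the key size is 2^n, so it is NOT succinct like real DPF/FSS.
import random

Q = 2**31 - 1   # an arbitrary prime modulus for the group Z_Q

def share_point_function(n, alpha, beta, rng=random):
    size = 2 ** n
    table = [0] * size
    table[alpha] = beta % Q                        # truth table of f_{alpha,beta}
    k0 = [rng.randrange(Q) for _ in range(size)]   # uniformly random share
    k1 = [(table[x] - k0[x]) % Q for x in range(size)]
    return k0, k1

def eval_share(key, x):
    return key[x]

if __name__ == "__main__":
    n, alpha, beta = 4, 6, 42
    k0, k1 = share_point_function(n, alpha, beta)
    for x in (5, 6, 7):
        value = (eval_share(k0, x) + eval_share(k1, x)) % Q
        print(x, value)   # 42 at x = 6, 0 elsewhere
```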
We introduce and construct a pseudo-random object which we call a local correlation breaker (LCB). This is an algorithm that gets as input a sequence of (possibly correlated) random variables and an independent weak source of randomness. The output of the LCB is a sequence of random variables with the following property: if the i'th input random variable is uniform, then the i'th output variable is uniform even given a bounded number of the other output variables. That is, an LCB uses the weak source to "break" local correlations between random variables.
Based on LCBs we obtain improved constructions of mergers with weak-seeds and multi-source extractors. In particular, we construct a 3-source extractor for entropies delta*n, O(log n) and O(loglog n), for any constant delta. We further construct a 7-source extractor for poly-logarithmic entropy.
Joint work with Guy Rothblum.
No prior knowledge is assumed.
Bloom filters and Counting Bloom Filters (CBFs) are widely used in networking device algorithms. They implement fast set representations to support membership queries with limited error. Unlike Bloom filters, CBFs also support element deletions. In the first part of the talk, I will introduce a new general method based on variable increments to improve the efficiency of CBFs and their variants. I will demonstrate that this method can always achieve a lower false positive rate and a lower overflow probability bound than CBFs in practical systems.
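As background, here is a minimal sketch of a standard Counting Bloom Filter with fixed increments of 1; the variable-increment refinement from the talk is not shown, and the hashing scheme is an illustrative choice.

```python
# Minimal standard Counting Bloom Filter: k hashed counters per element,
# increment on insert, decrement on delete, and report "present" only if all
# k counters are positive. (Fixed increments; the variable-increment
# refinement from the talk is not shown.)
import hashlib

class CountingBloomFilter:
    def __init__(self, m=1024, k=4):
        self.m, self.k = m, k
        self.counters = [0] * m

    def _indexes(self, item):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def insert(self, item):
        for idx in self._indexes(item):
            self.counters[idx] += 1

    def delete(self, item):
        for idx in self._indexes(item):
            self.counters[idx] -= 1   # caller must only delete inserted items

    def might_contain(self, item):
        return all(self.counters[idx] > 0 for idx in self._indexes(item))

if __name__ == "__main__":
    cbf = CountingBloomFilter()
    cbf.insert("10.0.0.1")
    print(cbf.might_contain("10.0.0.1"), cbf.might_contain("10.0.0.2"))
    cbf.delete("10.0.0.1")
    print(cbf.might_contain("10.0.0.1"))
```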
Next, I will present ongoing research on data center networking. In particular, I will introduce a new approach to providing network isolation, so customers can feel alone in shared clouds, without any network contention from other customers. I will also demonstrate theoretical conditions for the isolation algorithm.
An instance of the E2-Lin(2) problem is a system of equations of the form "x_i + x_j = b (mod 2)". Given such a system in which it is possible to satisfy all but an epsilon fraction of the equations, we would like to find an assignment that violates as few equations as possible. In this paper, we show that it is NP-hard to satisfy all but a C*epsilon fraction of the equations, for any C < 11/8 and 0 < epsilon <= 1/8. Our result holds also for the special case of Max-Cut. The previous best NP-hardness result, standing for over 15 years, had 5/4 in place of 11/8.
Our proof is by a modified gadget reduction from a predicate that supports a pairwise independent distribution. We also show an inherent limitation to this type of gadget reduction. In particular, we show that no such reduction can establish a hardness factor C greater than ~2.54.
Joint work with Johan Hastad, Rajsekar Manokaran, Ryan O'Donnell, John Wright.