results for au:Pokutta_S in:cs

- We study reinforcement learning under model misspecification, where we do not have access to the true environment but only to a reasonably close approximation to it. We address this problem by extending the framework of robust MDPs to the model-free Reinforcement Learning setting, where we do not have access to the model parameters, but can only sample states from it. We define robust versions of Q-learning, SARSA, and TD-learning and prove convergence to an approximately optimal robust policy and approximate value function respectively. We scale up the robust algorithms to large MDPs via function approximation and prove convergence under two different settings. We prove convergence of robust approximate policy iteration and robust approximate value iteration for linear architectures (under mild assumptions). We also define a robust loss function, the mean squared robust projected Bellman error and give stochastic gradient descent algorithms that are guaranteed to converge to a local minimum.
- In this work we introduce a conditional accelerated lazy stochastic gradient descent algorithm with an optimal number of calls to a stochastic first-order oracle and convergence rate $O\left(\frac{1}{\varepsilon^2}\right)$, improving over the projection-free, Online Frank-Wolfe based stochastic gradient descent of Hazan and Kale [2012] with convergence rate $O\left(\frac{1}{\varepsilon^4}\right)$.
- Oct 31 2016 cs.LG arXiv:1610.09269v1: We study the cost function for hierarchical clusterings introduced by [arXiv:1510.05043] where hierarchies are treated as first-class objects rather than deriving their cost from projections into flat clusters. It was also shown in [arXiv:1510.05043] that a top-down algorithm returns a hierarchical clustering of cost at most $O\left(\alpha_n \log n\right)$ times the cost of the optimal hierarchical clustering, where $\alpha_n$ is the approximation ratio of the Sparsest Cut subroutine used. Thus, using the best known approximation algorithm for Sparsest Cut due to Arora-Rao-Vazirani, the top-down algorithm returns a hierarchical clustering of cost at most $O\left(\log^{3/2} n\right)$ times the cost of the optimal solution. We improve this by giving an $O(\log{n})$-approximation algorithm for this problem. Our main technical ingredients are a combinatorial characterization of ultrametrics induced by this cost function, deriving an Integer Linear Programming (ILP) formulation for this family of ultrametrics, and showing how to iteratively round an LP relaxation of this formulation by using the idea of \emph{sphere growing}, which has been extensively used in the context of graph partitioning. We also prove that our algorithm returns an $O(\log{n})$-approximate hierarchical clustering for a generalization of this cost function also studied in [arXiv:1510.05043]. Experiments show that the hierarchies found by using the ILP formulation as well as our rounding algorithm often have better projections into flat clusters than the standard linkage-based algorithms. We also give constant factor inapproximability results for this problem.
- Conditional gradient algorithms (also often called Frank-Wolfe algorithms) are popular due to their simplicity of only requiring a linear optimization oracle and more recently they also gained significant traction for online learning. While simple in principle, in many cases the actual implementation of the linear optimization oracle is costly. We show a general method to lazify various conditional gradient algorithms, which in actual computations leads to several orders of magnitude of speedup in wall-clock time. This is achieved by using a faster separation oracle instead of a linear optimization oracle, relying only on few linear optimization oracle calls.
- For the linear bandit problem, we extend the analysis of algorithm CombEXP from [R. Combes, M. S. Talebi Mazraeh Shahi, A. Proutiere, and M. Lelarge. Combinatorial bandits revisited. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28, pages 2116--2124. Curran Associates, Inc., 2015. URL http://papers.nips.cc/paper/5831-combinatorial-bandits-revisited.pdf] to the high-probability case against adaptive adversaries, allowing actions to come from an arbitrary polytope. We prove a high-probability regret of $O(T^{2/3})$ for time horizon $T$. While this bound is weaker than the optimal $O(\sqrt{T})$ bound achieved by GeometricHedge in [P. L. Bartlett, V. Dani, T. Hayes, S. Kakade, A. Rakhlin, and A. Tewari. High-probability regret bounds for bandit online linear optimization. In 21st Annual Conference on Learning Theory (COLT 2008), July 2008. http://eprints.qut.edu.au/45706/1/30-Bartlett.pdf], CombEXP is computationally efficient, requiring only an efficient linear optimization oracle over the convex hull of the actions.
- Sep 20 2016 cs.CY arXiv:1609.05821v1: Transportation systems are currently being transformed by advances in information and communication technologies. The development of autonomous transportation holds the promise of providing revolutionary improvements in speed, efficiency, safety and reliability along with concomitant benefits for society and the economy. It is anticipated that these changes will soon affect household activity patterns, public safety, supply chains and logistics, manufacturing, and quality of life in general.
- Dec 16 2015 cs.CC arXiv:1512.04932v3: We generalize the reduction mechanism for linear programming problems and semidefinite programming problems from [arXiv:1410.8816] in two ways: (1) relaxing the requirement of affineness and (2) extending to fractional optimization problems. As applications we provide several new LP-hardness and SDP-hardness results, e.g., for the SparsestCut problem, the BalancedSeparator problem, the MaxCut problem and the Matching problem on 3-regular graphs. We also provide a new, very strong Lasserre integrality gap result for the IndependentSet problem, which is strictly greater than the best known LP approximation, showing that the Lasserre hierarchy does not always provide the tightest SDP relaxation.
- We study the value of information in sequential compressed sensing by characterizing the performance of sequential information guided sensing in practical scenarios when information is inaccurate. In particular, we assume the signal distribution is parameterized through Gaussian or Gaussian mixtures with estimated mean and covariance matrices, and we can measure compressively through a noisy linear projection or using one-sparse vectors, i.e., observing one entry of the signal each time. We establish a set of performance bounds for the bias and variance of the signal estimator via posterior mean, by capturing the conditional entropy (which is also related to the size of the uncertainty), and the additional power required due to inaccurate information to reach a desired precision. Based on this, we further study how to estimate covariance based on direct samples or covariance sketching. Numerical examples also demonstrate the superior performance of Info-Greedy Sensing algorithms compared with their random and non-adaptive counterparts.
- In this note, we present an information diffusion inequality derived from an elementary argument, which gives rise to a very general Fano-type inequality. The latter unifies and generalizes the distance-based Fano inequality and the continuous Fano inequality established in [Corollary 1, Propositions 1 and 2, arXiv:1311.2669v2], as well as the generalized Fano inequality in [Equation following (10); T. S. Han and S. Verdú. Generalizing the Fano inequality. IEEE Transactions on Information Theory, 40(4):1247-1251, July 1994].
- Apr 06 2015 cs.CC arXiv:1504.00703v5: Yannakakis showed that the matching problem does not have a small symmetric linear program. Rothvoß recently proved that any, not necessarily symmetric, linear program also has exponential size. It is natural to ask whether the matching problem can be expressed compactly in a framework such as semidefinite programming (SDP) that is more powerful than linear programming but still allows efficient optimization. We answer this question negatively for symmetric SDPs: any symmetric SDP for the matching problem has exponential size. We also show that an $O(k)$-round Lasserre SDP relaxation for the metric traveling salesperson problem yields at least as good an approximation as any symmetric SDP relaxation of size $n^k$. The key technical ingredient underlying both these results is an upper bound on the degree needed to derive polynomial identities that hold over the space of matchings or traveling salesperson tours.
- Mar 04 2015 cs.CC arXiv:1503.00753v2: The vertex cover problem is one of the most important and intensively studied combinatorial optimization problems. Khot and Regev (2003) proved that the problem is NP-hard to approximate within a factor $2 - \epsilon$, assuming the Unique Games Conjecture (UGC). This is tight because the problem has an easy 2-approximation algorithm. Without resorting to the UGC, the best inapproximability result for the problem is due to Dinur and Safra (2002): vertex cover is NP-hard to approximate within a factor 1.3606. We prove the following unconditional result about linear programming (LP) relaxations of the problem: every LP relaxation that approximates vertex cover within a factor $2-\epsilon$ has super-polynomially many inequalities. As a direct consequence of our methods, we also establish that LP relaxations (as well as SDP relaxations) that approximate the independent set problem within any constant factor have super-polynomial size.
- We characterize the performance of sequential information guided sensing, Info-Greedy Sensing, when there is a mismatch between the true signal model and the assumed model, which may be a sample estimate. In particular, we consider a setup where the signal is low-rank Gaussian and the measurements are taken in the directions of eigenvectors of the covariance matrix in a decreasing order of eigenvalues. We establish a set of performance bounds when a mismatched covariance matrix is used, in terms of the gap of signal posterior entropy, as well as the additional amount of power required to achieve the same signal recovery precision. Based on this, we further study how to choose an initialization for Info-Greedy Sensing using the sample covariance matrix, or using an efficient covariance sketching scheme.
- Nov 03 2014 cs.CC arXiv:1410.8816v5: We define a reduction mechanism for LP and SDP formulations that degrades approximation factors in a controlled fashion. Our reduction mechanism is a minor restriction of classical reductions establishing inapproximability in the context of PCP theorems. As a consequence we establish strong linear programming inapproximability (for LPs with a polynomial number of constraints) for many problems. In particular we obtain a $3/2-\varepsilon$ inapproximability for VertexCover, answering an open question in [arXiv:1309.0563], and we answer a weak version of our sparse graph conjecture posed in [arXiv:1311.4001], showing an inapproximability factor of $1/2+\varepsilon$ for bounded-degree IndependentSet. In the case of SDPs, we obtain inapproximability results for these problems relative to the SDP-inapproximability of MaxCUT. Moreover, using our reduction framework we are able to reproduce various results for CSPs from [arXiv:1309.0563] via simple reductions from Max-2-XOR.
- We present an information-theoretic approach to lower bound the oracle complexity of nonsmooth black-box convex optimization, unifying previous lower bounding techniques by identifying a combinatorial problem, namely string guessing, as a single source of hardness. As a measure of complexity we use distributional oracle complexity, which subsumes randomized oracle complexity as well as worst-case oracle complexity. We obtain strong lower bounds on distributional oracle complexity for the box $[-1,1]^n$, as well as for the $L^p$-ball for $p \geq 1$ (for both low-scale and large-scale regimes), matching worst-case upper bounds, and hence we close the gap between distributional complexity, and in particular, randomized complexity, and worst-case complexity. Furthermore, the bounds remain essentially the same for high-probability and bounded-error oracle complexity, and even for a combination of the two, i.e., bounded-error high-probability oracle complexity. This considerably extends the applicability of known bounds.
- We present an information-theoretic framework for sequential adaptive compressed sensing, Info-Greedy Sensing, where measurements are chosen to maximize the extracted information conditioned on the previous measurements. We show that the widely used bisection approach is Info-Greedy for a family of $k$-sparse signals by connecting compressed sensing and blackbox complexity of sequential query algorithms, and present Info-Greedy algorithms for Gaussian and Gaussian Mixture Model (GMM) signals, as well as ways to design sparse Info-Greedy measurements. Numerical examples demonstrate the good performance of the proposed algorithms using simulated and real data: Info-Greedy Sensing shows significant improvement over random projection for signals with sparse and low-rank covariance matrices, and adaptivity brings robustness when there is a mismatch between the assumed and the true distributions.
- The groundbreaking work of Rothvoß [arxiv:1311.2369] established that every linear program expressing the matching polytope has an exponential number of inequalities (formally, the matching polytope has exponential extension complexity). We generalize this result by deriving strong bounds on the polyhedral inapproximability of the matching polytope: for fixed $0 < \varepsilon < 1$, every polyhedral $(1 + \varepsilon / n)$-approximation requires an exponential number of inequalities, where $n$ is the number of vertices. This is sharp given the well-known $\rho$-approximation of size $O(\binom{n}{\rho/(\rho-1)})$ provided by the odd-sets of size up to $\rho/(\rho-1)$. Thus matching is the first problem in $P$ whose natural linear encoding does not admit a fully polynomial-size relaxation scheme (the polyhedral equivalent of an FPTAS), which provides a sharp separation from the polynomial-size relaxation scheme obtained, e.g., via the constant-sized odd-sets mentioned above. Our approach reuses ideas from Rothvoß [arxiv:1311.2369]; however, the main lower bounding technique is different. While the original proof is based on the hyperplane separation bound (also called the rectangle corruption bound), we employ the information-theoretic notion of common information as introduced in Braun and Pokutta [http://eccc.hpi-web.de/report/2013/056/], which allows us to analyze perturbations of slack matrices. It turns out that the high extension complexity of the matching polytope stems from the same source of hardness as for the correlation polytope: a direct sum structure.
- Nov 19 2013 cs.CC arXiv:1311.4001v4: We study the minimum number of constraints needed to formulate random instances of the maximum stable set problem via linear programs (LPs), in two distinct models. In the uniform model, the constraints of the LP are not allowed to depend on the input graph, which should be encoded solely in the objective function. There we prove a $2^{\Omega(n/ \log n)}$ lower bound with probability at least $1 - 2^{-2^n}$ for every LP that is exact for a randomly selected set of instances; each graph on at most $n$ vertices being selected independently with probability $p \geq 2^{-\binom{n/4}{2}+n}$. In the non-uniform model, the constraints of the LP may depend on the input graph, but we allow weights on the vertices. The input graph is sampled according to the $G(n, p)$ model. There we obtain upper and lower bounds holding with high probability for various ranges of $p$. We obtain a super-polynomial lower bound all the way from $p = \Omega(\log^{6+\varepsilon} n / n)$ to $p = o (1 / \log n)$. Our upper bound is close to this as there is only an essentially quadratic gap in the exponent, which currently also exists in the worst-case model. Finally, we state a conjecture that would close this gap, both in the average-case and worst-case models.
- In Rothvoß it was shown that there exists a 0/1 polytope (a polytope whose vertices are in $\{0,1\}^n$) such that any higher-dimensional polytope projecting to it must have $2^{\Omega(n)}$ facets, i.e., its linear extension complexity is exponential. The question whether there exists a 0/1 polytope with high PSD extension complexity was left open. We answer this question in the affirmative by showing that there is a 0/1 polytope such that any spectrahedron projecting to it must be the intersection of a semidefinite cone of dimension $2^{\Omega(n)}$ and an affine space. Our proof relies on a new technique to rescale semidefinite factorizations.
- Recently Schrijver's open problem, whether the Chvátal--Gomory closure of an irrational polytope is polyhedral, was answered independently in the affirmative by Dadush, Dey, and Vielma (even for arbitrary compact convex sets) as well as by Dunkel and Schulz. We present a very short, easily accessible proof that the Chvátal--Gomory closure of a compact convex set is a polytope.
- Extended formulations are an important tool to obtain small (even compact) formulations of polytopes by representing them as projections of higher dimensional ones. It is an important question whether a polytope admits a small extended formulation, i.e., one involving only a polynomial number of inequalities in its dimension. For the case of symmetric extended formulations (i.e., preserving the symmetries of the polytope) Yannakakis established a powerful technique to derive lower bounds and rule out small formulations. We rephrase the technique of Yannakakis in a group-theoretic framework. This provides a different perspective on symmetric extensions and considerably simplifies several lower bound constructions.
- We develop a framework for approximation limits of polynomial-size linear programs from lower bounds on the nonnegative ranks of suitably defined matrices. This framework yields unconditional impossibility results that are applicable to any linear program as opposed to only programs generated by hierarchies. Using our framework, we prove that $O(n^{1/2-\varepsilon})$-approximations for CLIQUE require linear programs of size $2^{n^{\Omega(\varepsilon)}}$. (This lower bound applies to linear programs using a certain encoding of CLIQUE as a linear optimization problem.) Moreover, we establish a similar result for approximations of semidefinite programs by linear programs. Our main ingredient is a quantitative improvement of Razborov's rectangle corruption lemma for the high error regime, which gives strong lower bounds on the nonnegative rank of certain perturbations of the unique disjointness matrix.
- We solve a 20-year-old problem posed by Yannakakis and prove that there exists no polynomial-size linear program (LP) whose associated polytope projects to the traveling salesman polytope, even if the LP is not required to be symmetric. Moreover, we prove that this holds also for the cut polytope and the stable set polytope. These results were discovered through a new connection that we make between one-way quantum communication protocols and semidefinite programming reformulations of LPs.
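
Several abstracts in this listing concern conditional gradient (Frank-Wolfe) algorithms, which access the feasible region only through a linear optimization oracle. The following is a minimal sketch of the textbook (non-lazy) variant, not the lazified or stochastic methods from the abstracts. The feasible region (the probability simplex), the objective, the step size $2/(t+2)$, and all function names are assumptions chosen purely for illustration.

```python
import numpy as np


def lmo_simplex(gradient):
    """Linear optimization oracle for the probability simplex (illustrative).

    Returns the vertex e_i minimizing <gradient, v> over the simplex,
    i.e. the unit vector at the smallest gradient coordinate.
    """
    vertex = np.zeros_like(gradient)
    vertex[np.argmin(gradient)] = 1.0
    return vertex


def frank_wolfe(grad_f, x0, lmo, num_iters=2000):
    """Vanilla Frank-Wolfe with the standard step size 2 / (t + 2).

    Each iterate is a convex combination of feasible points, so no
    projection onto the feasible region is ever needed.
    """
    x = x0.copy()
    for t in range(num_iters):
        v = lmo(grad_f(x))          # one oracle call per iteration
        gamma = 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * v
    return x


# Hypothetical test problem: minimize 0.5 * ||x - b||^2 over the simplex.
# Since b lies in the simplex, the minimizer is b itself.
b = np.array([0.2, 0.5, 0.3])
x0 = np.array([1.0, 0.0, 0.0])
x_final = frank_wolfe(lambda x: x - b, x0, lmo_simplex)
```

In this sketch every iteration calls the oracle once. The lazification idea described in the listing replaces most of these calls with a cheaper separation-style check, which is not implemented here.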