A polynomial threshold function (PTF) $f:\mathbb{R}^n \rightarrow \mathbb{R}$ is a function of the form $f(x) = \mathsf{sign}(p(x))$ where $p$ is a polynomial of degree at most $d$. PTFs are a classical and well-studied complexity class with applications across complexity theory, learning theory, approximation theory, quantum complexity and more. We address the question of designing pseudorandom generators (PRGs) for PTFs in Gaussian space: design a PRG that takes a seed of few random bits and outputs an $n$-dimensional vector whose distribution is indistinguishable from a standard multivariate Gaussian by a degree-$d$ PTF. Our main result is a PRG that takes a seed of $d^{O(1)}\log ( n / \varepsilon)\log(1/\varepsilon)/\varepsilon^2$ random bits and whose output cannot be distinguished from an $n$-dimensional Gaussian distribution with advantage better than $\varepsilon$ by degree-$d$ PTFs. The best previous generator, due to O'Donnell, Servedio, and Tan (STOC'20), had a quasi-polynomial dependence on the degree $d$ (i.e., a seed length of $d^{O(\log d)}$). Along the way we prove a few nearly-tight structural properties of restrictions of PTFs that may be of independent interest.
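To make the notion of "distinguishing advantage" concrete, the following is a minimal sketch, not the construction from the paper. It estimates $|\mathbb{E}[f(g)] - \mathbb{E}[f(z)]|$ by Monte Carlo, where $g$ is a standard Gaussian vector and $z$ comes from a candidate distribution. The names `ptf`, `advantage`, `sign_sampler`, `p_linear` and `p_quadratic` are hypothetical, and the convention $\mathsf{sign}(0) = +1$ is an assumption made for the example.

```python
import random


def ptf(p, x):
    """Evaluate sign(p(x)); sign(0) is taken to be +1 (an assumed convention)."""
    return 1 if p(x) >= 0 else -1


def gaussian_sampler(rng, n):
    """Draw a standard n-dimensional Gaussian vector."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def sign_sampler(rng, n):
    """Toy candidate distribution: independent uniform +/-1 coordinates."""
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def mean_under(p, sampler, n, trials, rng):
    """Monte Carlo estimate of E[sign(p(x))] for x drawn from sampler."""
    return sum(ptf(p, sampler(rng, n)) for _ in range(trials)) / trials


def advantage(p, sampler, n, trials=20000, seed=0):
    """Estimated advantage of the PTF sign(p) in telling sampler from Gaussian."""
    rng = random.Random(seed)
    gauss_mean = mean_under(p, gaussian_sampler, n, trials, rng)
    other_mean = mean_under(p, sampler, n, trials, rng)
    return abs(gauss_mean - other_mean)


def p_linear(x):
    """Degree-1 polynomial: p(x) = x_0."""
    return x[0]


def p_quadratic(x):
    """Degree-2 polynomial: p(x) = x_0^2 - 1."""
    return x[0] ** 2 - 1.0


if __name__ == "__main__":
    print("degree 1:", advantage(p_linear, sign_sampler, n=3))
    print("degree 2:", advantage(p_quadratic, sign_sampler, n=3))
```

In this toy setting the degree-1 PTF barely separates the two distributions, while the degree-2 PTF does so with large advantage, since $x_0^2 - 1$ vanishes identically on sign vectors. A PRG in the sense of the abstract must keep this quantity below $\varepsilon$ for every polynomial of degree at most $d$.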


Related content

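The normal (or Gaussian, or Gauss, or Laplace-Gauss) distribution is a continuous probability distribution for a real-valued random variable. Gaussian distributions have some unique properties that are valuable in analytic studies. For example, any linear combination of a fixed collection of independent normal deviates is itself a normal deviate. When the relevant variables are normally distributed, many results and methods (such as propagation of uncertainty and least-squares parameter fitting) can be derived analytically in explicit form.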
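As a worked instance of the linear-combination property, assume $X_1,\dots,X_k$ are independent with $X_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$ and let $a_1,\dots,a_k$ be fixed real coefficients (the symbols here are introduced only for illustration). Then

$$\sum_{i=1}^{k} a_i X_i \;\sim\; \mathcal{N}\!\left(\sum_{i=1}^{k} a_i \mu_i,\; \sum_{i=1}^{k} a_i^2 \sigma_i^2\right).$$

For instance, with $X_1, X_2$ independent standard normals, $3X_1 - 4X_2 \sim \mathcal{N}(0, 25)$.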

Renewal equations are a popular approach used in modelling the number of new infections, i.e., incidence, in an outbreak. We develop a stochastic model of an outbreak based on a time-varying variant of the Crump-Mode-Jagers branching process. This model accommodates a time-varying reproduction number and a time-varying distribution for the generation interval. We then derive renewal-like integral equations for incidence, cumulative incidence and prevalence under this model. We show that the equations for incidence and prevalence are consistent with the so-called back-calculation relationship. We analyse two particular cases of these integral equations, one that arises from a Bellman-Harris process and one that arises from an inhomogeneous Poisson process model of transmission. We outline an argument to show that the incidence integral equations that arise from both of these specific models agree with the renewal equation used ubiquitously in infectious disease modelling. We present a numerical discretisation scheme to solve these equations, and use this scheme to estimate rates of transmission from both a Bellman-Harris process and Poisson process for Influenza, Measles, Smallpox and SARS incidence.
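For orientation, the renewal equation commonly used in infectious disease modelling can be written in discrete time as $I_t = R_t \sum_{s \ge 1} I_{t-s}\, g_s$, where $R_t$ is the reproduction number and $g_s$ is the generation-interval distribution. The sketch below iterates this recursion. It illustrates that standard form only and is not the discretisation scheme of the paper; the function name `renewal_incidence` and its parameters are hypothetical.

```python
def renewal_incidence(r, g, i0, steps):
    """Iterate the discrete renewal equation I_t = R_t * sum_s I_{t-s} * g_s.

    r     -- reproduction numbers, r[t] for t = 0..steps-1 (r[0] is unused)
    g     -- generation-interval weights, g[s-1] is the weight of lag s >= 1
    i0    -- incidence at time 0
    steps -- number of time points to produce
    """
    incidence = [i0]
    for t in range(1, steps):
        max_lag = min(t, len(g))
        # Infectious pressure: past incidence weighted by the generation interval.
        pressure = sum(incidence[t - s] * g[s - 1] for s in range(1, max_lag + 1))
        incidence.append(r[t] * pressure)
    return incidence


if __name__ == "__main__":
    # Generation interval of exactly one step and a constant R = 2:
    # incidence doubles at every step.
    print(renewal_incidence(r=[2.0] * 5, g=[1.0], i0=1.0, steps=5))
```

Time-varying $R_t$ is handled by passing a non-constant list `r`; a time-varying generation interval, as studied in the abstract, would need `g` to depend on `t` as well.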


We study the problem of generating a hyperplane tessellation of an arbitrary set $T$ in $\mathbb{R}^n$, ensuring that the Euclidean distance between any two points corresponds to the fraction of hyperplanes separating them up to a pre-specified error $\delta$. We focus on random Gaussian tessellations with uniformly distributed shifts and derive sharp bounds on the number of hyperplanes $m$ that are required. Surprisingly, our lower estimates falsify the conjecture that $m\sim \ell_*^2(T)/\delta^2$, where $\ell_*(T)$ is the Gaussian width of $T$, is optimal.
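The link between separating hyperplanes and Euclidean distance can be sketched as follows. Take hyperplanes $\{z : \langle g_i, z\rangle + \tau_i = 0\}$ with $g_i$ standard Gaussian and $\tau_i$ uniform on $[-\lambda, \lambda]$. If $\lambda$ is large compared with $\|x\|$ and $\|y\|$, the probability that one hyperplane separates $x$ and $y$ is approximately $\|x-y\| / (\sqrt{2\pi}\,\lambda)$, because $\mathbb{E}|\langle g, x-y\rangle| = \sqrt{2/\pi}\,\|x-y\|$. The code below is a minimal Monte Carlo illustration under that assumption; the names `separating_fraction` and `estimated_distance` and the parameter `lam` are hypothetical, and the normalisation is this sketch's own rather than quoted from the paper.

```python
import math
import random


def separating_fraction(x, y, m, lam, seed=0):
    """Fraction of m random shifted Gaussian hyperplanes that separate x and y."""
    rng = random.Random(seed)
    n = len(x)
    separated = 0
    for _ in range(m):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]  # random normal direction
        tau = rng.uniform(-lam, lam)                 # uniform shift
        side_x = sum(gi * xi for gi, xi in zip(g, x)) + tau >= 0
        side_y = sum(gi * yi for gi, yi in zip(g, y)) + tau >= 0
        if side_x != side_y:
            separated += 1
    return separated / m


def estimated_distance(x, y, m, lam, seed=0):
    """Rescale the separating fraction into an estimate of ||x - y||.

    Assumes lam is large relative to the norms of x and y.
    """
    return math.sqrt(2.0 * math.pi) * lam * separating_fraction(x, y, m, lam, seed)


if __name__ == "__main__":
    x, y = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
    print("true distance:     ", math.dist(x, y))
    print("estimated distance:", estimated_distance(x, y, m=40000, lam=5.0))
```

The accuracy of the estimate improves as `m` grows, which is the trade-off between the number of hyperplanes $m$ and the error $\delta$ that the abstract quantifies.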


The Gromov-Hausdorff distance ($d_{GH}$) has proved to be a useful distance measure between shapes. In order to approximate $d_{GH}$ for compact subsets $X,Y\subset\mathbb{R}^d$, we look into its relationship with $d_{H,iso}$, the infimum Hausdorff distance under Euclidean isometries. It is already known that, for dimension $d\geq 2$, $d_{H,iso}$ cannot be bounded above by a constant factor times $d_{GH}$. For $d=1$, however, we prove that $d_{H,iso}\leq\frac{5}{4}d_{GH}$. We also show that the bound is tight. In effect, this gives rise to an $O(n\log{n})$-time algorithm to approximate $d_{GH}$ with an approximation factor of $\left(1+\frac{1}{4}\right)$.


DeepONets have recently been proposed as a framework for learning nonlinear operators mapping between infinite dimensional Banach spaces. We analyze DeepONets and prove estimates on the resulting approximation and generalization errors. In particular, we extend the universal approximation property of DeepONets to include measurable mappings in non-compact spaces. By a decomposition of the error into encoding, approximation and reconstruction errors, we prove both lower and upper bounds on the total error, relating it to the spectral decay properties of the covariance operators associated with the underlying measures. We derive almost optimal error bounds with very general affine reconstructors and with random sensor locations, as well as bounds on the generalization error, using covering number arguments. We illustrate our general framework with four prototypical examples of nonlinear operators, namely those arising in a nonlinear forced ODE, an elliptic PDE with variable coefficients, and nonlinear parabolic and hyperbolic PDEs. While the approximation of arbitrary Lipschitz operators by DeepONets to accuracy $\epsilon$ is argued to suffer from a "curse of dimensionality" (requiring neural networks of exponential size in $1/\epsilon$), for all the above concrete examples of interest we rigorously prove that DeepONets can break this curse of dimensionality (achieving accuracy $\epsilon$ with neural networks whose size grows only algebraically in $1/\epsilon$). Thus, we demonstrate the efficient approximation of a potentially large class of operators with this machine learning framework.


Airfoil shape design is a classical problem in engineering and manufacturing. Our motivation is to combine principled physics-based considerations for the shape design problem with modern computational techniques informed by a data-driven approach. Traditional analyses of airfoil shapes emphasize a flow-based sensitivity to deformations which can be represented generally by affine transformations (rotation, scaling, shearing, translation). We present a novel representation of shapes which decouples affine-style deformations from a rich set of data-driven deformations over a submanifold of the Grassmannian. The Grassmannian representation, informed by a database of physically relevant airfoils, offers (i) a rich set of novel 2D airfoil deformations not previously captured in the data, (ii) improved low-dimensional parameter domain for inferential statistics informing design/manufacturing, and (iii) consistent 3D blade representation and perturbation over a sequence of nominal shapes.


The paper deals with the distribution of singular values of the input-output Jacobian of deep untrained neural networks in the limit of their infinite width. The Jacobian is the product of random matrices where the independent rectangular weight matrices alternate with diagonal matrices whose entries depend on the corresponding column of the nearest neighbor weight matrix. The problem was considered in \cite{Pe-Co:18} for Gaussian weights and biases and also for weights that are Haar distributed orthogonal matrices with Gaussian biases. Based on a free probability argument, it was claimed that in these cases the singular value distribution of the Jacobian in the limit of infinite width (matrix size) coincides with that of the analog of the Jacobian with special random but weight-independent diagonal matrices, a case well known in random matrix theory. The claim was rigorously proved in \cite{Pa-Sl:21} for a quite general class of weights and biases with i.i.d. (including Gaussian) entries by using a version of the techniques of random matrix theory. In this paper we use another version of the techniques to justify the claim for random Haar distributed weight matrices and Gaussian biases.


We consider the problem of high-dimensional filtering of state-space models (SSMs) at discrete times. This problem is particularly challenging as analytical solutions are typically not available and many numerical approximation methods can have a cost that scales exponentially with the dimension of the hidden state. Inspired by lag-approximation methods for the smoothing problem, we introduce a lagged approximation of the smoothing distribution that is necessarily biased. For certain classes of SSMs, particularly those that forget the initial condition exponentially fast in time, the bias of our approximation is shown to be uniformly controlled in the dimension and exponentially small in time. We develop a sequential Monte Carlo (SMC) method to recursively estimate expectations with respect to our biased filtering distributions. Moreover, we prove, for a class of SSMs that can contain dependencies amongst coordinates, that as the dimension $d\rightarrow\infty$ the cost to achieve a stable mean square error in estimation, for classes of expectations, is $\mathcal{O}(Nd^2)$ per unit time, where $N$ is the number of simulated samples in the SMC algorithm. Our methodology is implemented on several challenging high-dimensional examples including the conservative shallow-water model.


Linear error-correcting codes can be used for constructing secret sharing schemes; however, finding the access structures of these secret sharing schemes in general and, in particular, determining efficient access structures is difficult. Here we investigate the properties of certain algebraic hypersurfaces over finite fields whose intersection numbers with any hyperplane take only a few values; these varieties give rise to $q$-divisible linear codes with at most $5$ weights. Furthermore, for $q$ odd these codes turn out to be minimal and we characterize the access structures of the secret sharing schemes based on their dual codes. Indeed, the secret sharing schemes thus obtained are democratic, that is, each participant belongs to the same number of minimal access sets, and can easily be described.


The problem of Approximate Nearest Neighbor (ANN) search is fundamental in computer science and has benefited from significant progress in the past couple of decades. However, most work has been devoted to pointsets whereas complex shapes have not been sufficiently treated. Here, we focus on distance functions between discretized curves in Euclidean space: they appear in a wide range of applications, from road segments to time-series in general dimension. For $\ell_p$-products of Euclidean metrics, for any $p$, we design simple and efficient data structures for ANN, based on randomized projections, which are of independent interest. They serve to solve proximity problems under a notion of distance between discretized curves, which generalizes both discrete Fr\'echet and Dynamic Time Warping distances. These are the most popular and practical approaches to comparing such curves. We offer the first data structures and query algorithms for ANN with arbitrarily good approximation factor, at the expense of increasing space usage and preprocessing time over existing methods. The query time of our algorithms is comparable to, or significantly better than, that of existing methods; they are especially efficient when the length of the curves is bounded.


Robust estimation is much more challenging in high dimensions than it is in one dimension: Most techniques either lead to intractable optimization problems or estimators that can tolerate only a tiny fraction of errors. Recent work in theoretical computer science has shown that, in appropriate distributional models, it is possible to robustly estimate the mean and covariance with polynomial time algorithms that can tolerate a constant fraction of corruptions, independent of the dimension. However, the sample and time complexity of these algorithms is prohibitively large for high-dimensional applications. In this work, we address both of these issues by establishing sample complexity bounds that are optimal, up to logarithmic factors, as well as giving various refinements that allow the algorithms to tolerate a much larger fraction of corruptions. Finally, we show on both synthetic and real data that our algorithms have state-of-the-art performance and suddenly make high-dimensional robust estimation a realistic possibility.
