** A key tool for building differentially private systems is adding Gaussian noise to the output of a function evaluated on a sensitive dataset. Unfortunately, using a continuous distribution presents several practical challenges. First and foremost, finite computers cannot exactly represent samples from continuous distributions, and previous work has demonstrated that seemingly innocuous numerical errors can entirely destroy privacy. Moreover, when the underlying data is itself discrete (e.g., population counts), adding continuous noise makes the result less interpretable. With these shortcomings in mind, we introduce and analyze the discrete Gaussian in the context of differential privacy. Specifically, we theoretically and experimentally show that adding discrete Gaussian noise provides essentially the same privacy and accuracy guarantees as the addition of continuous Gaussian noise. We also present an simple and efficient algorithm for exact sampling from this distribution. This demonstrates its applicability for privately answering counting queries, or more generally, low-sensitivity integer-valued queries. **

让 iOS 8 和 OS X Yosemite 无缝切换的一个新特性。
> Apple products have always been designed to work together beautifully. But now they may really surprise you. With iOS 8 and OS X Yosemite, you’ll be able to do more wonderful things than ever before.

Source: Apple - iOS 8

Source: Apple - iOS 8

** Centered Gaussian random fields (GRFs) indexed by compacta such as smooth, bounded Euclidean domains or smooth, compact and orientable manifolds are determined by their covariance operators. We consider centered GRFs given as variational solutions to coloring operator equations driven by spatial white noise, with an elliptic self-adjoint pseudodifferential coloring operator from the H\"ormander class. This includes the Mat\'ern class of GRFs as a special case. Using biorthogonal multiresolution analyses on the manifold, we prove that the precision and covariance operators, respectively, may be identified with bi-infinite matrices and finite sections may be diagonally preconditioned rendering the condition number independent of the dimension $p$ of this section. We prove that a tapering strategy by thresholding applied on finite sections of the bi-infinite precision and covariance matrices results in optimally numerically sparse approximations. That is, asymptotically only linearly many nonzero matrix entries are sufficient to approximate the original section of the bi-infinite covariance or precision matrix using this tapering strategy to arbitrary precision. The locations of these nonzero matrix entries are known a priori. The tapered covariance or precision matrices may also be optimally diagonally preconditioned. Analysis of the relative size of the entries of the tapered covariance matrices motivates novel, multilevel Monte Carlo (MLMC) oracles for covariance estimation, in sample complexity that scales log-linearly with respect to the number $p$ of parameters. In addition, we propose and analyze a novel compressive algorithm for simulating and kriging of GRFs. The complexity (work and memory vs. accuracy) of these three algorithms scales near-optimally in terms of the number of parameters $p$ of the sample-wise approximation of the GRF in Sobolev scales. **

** We consider stochastic optimization problems where a smooth (and potentially nonconvex) objective is to be minimized using a stochastic first-order oracle. These type of problems arise in many settings from simulation optimization to deep learning. We present Retrospective Approximation (RA) as a universal sequential sample-average approximation (SAA) paradigm where during each iteration $k$, a sample-path approximation problem is implicitly generated using an adapted sample size $M_k$, and solved (with prior solutions as "warm start") to an adapted error tolerance $\epsilon_k$, using a "deterministic method" such as the line search quasi-Newton method. The principal advantage of RA is that decouples optimization from stochastic approximation, allowing the direct adoption of existing deterministic algorithms without modification, thus mitigating the need to redesign algorithms for the stochastic context. A second advantage is the obvious manner in which RA lends itself to parallelization. We identify conditions on $\{M_k, k \geq 1\}$ and $\{\epsilon_k, k\geq 1\}$ that ensure almost sure convergence and convergence in $L_1$-norm, along with optimal iteration and work complexity rates. We illustrate the performance of RA with line-search quasi-Newton on an ill-conditioned least squares problem, as well as an image classification problem using a deep convolutional neural net. **

** Differential privacy allows bounding the influence that training data records have on a machine learning model. To use differential privacy in machine learning, data scientists must choose privacy parameters $(\epsilon,\delta)$. Choosing meaningful privacy parameters is key since models trained with weak privacy parameters might result in excessive privacy leakage, while strong privacy parameters might overly degrade model utility. However, privacy parameter values are difficult to choose for two main reasons. First, the upper bound on privacy loss $(\epsilon,\delta)$ might be loose, depending on the chosen sensitivity and data distribution of practical datasets. Second, legal requirements and societal norms for anonymization often refer to individual identifiability, to which $(\epsilon,\delta)$ are only indirectly related. We transform $(\epsilon,\delta)$ to a bound on the Bayesian posterior belief of the adversary assumed by differential privacy concerning the presence of any record in the training dataset. The bound holds for multidimensional queries under composition, and we show that it can be tight in practice. Furthermore, we derive an identifiability bound, which relates the adversary assumed in differential privacy to previous work on membership inference adversaries. We formulate an implementation of this differential privacy adversary that allows data scientists to audit model training and compute empirical identifiability scores and empirical $(\epsilon,\delta)$. **

** The direct Gaussian copula model with discrete marginal distributions is an appealing data-analytic tool but poses difficult computational challenges due to its intractable likelihood. A number of approximations/surrogates for the likelihood have been proposed, including the continuous extension-based approximation (CE) and the distributional transform-based approximation (DT). The continuous extension approach is exact up to Monte Carlo error but does not scale well computationally. The distributional transform approach permits efficient computation but offers no theoretical guarantee that it is exact. In practice, though, the distributional transform-based approximate likelihood is so very nearly exact for some variants of the model as to permit genuine maximum likelihood or Bayesian inference. We demonstrate the exactness of the distributional transform-based objective function for two interesting variants of the model, and propose a quantity that can be used to assess exactness for experimentally observed datasets. Said diagnostic will permit practitioners to determine whether genuine Bayesian inference or ordinary maximum likelihood inference using the DT-based likelihood is possible for a given dataset. **

** Approximate Bayesian computation (ABC) is a likelihood-free inference method that has been employed in various applications. However, ABC can be sensitive to outliers if a data discrepancy measure is chosen inappropriately. In this paper, we propose to use a nearest-neighbor-based $\gamma$-divergence estimator as a data discrepancy measure. We show that our estimator possesses a suitable theoretical robustness property called the redescending property. In addition, our estimator enjoys various desirable properties such as high flexibility, asymptotic unbiasedness, almost sure convergence, and linear-time computational complexity. Through experiments, we demonstrate that our method achieves significantly higher robustness than existing discrepancy measures. **

** Mendelian randomization (MR) is a statistical method exploiting genetic variants as instrumental variables to estimate the causal effect of modifiable risk factors on an outcome of interest. Despite wide uses of various popular two-sample MR methods based on genome-wide association study summary level data, however, those methods could suffer from potential power loss or/and biased inference when the chosen genetic variants are in linkage disequilibrium (LD), and have relatively large direct effects on the outcome whose distribution might be heavy-tailed which is commonly referred to as the idiosyncratic pleiotropy. To resolve those two issues, we propose a novel Robust Bayesian Mendelian Randomization (RBMR) model that uses the more robust multivariate generalized $t$-distribution to model such direct effects in a probabilistic model framework which can also incorporate the LD structure explicitly. The generalized $t$-distribution can be represented as a Gaussian scaled mixture so that our model parameters can be estimated by the EM-type algorithms. We compute the standard errors by calibrating the evidence lower bound (ELBO) using the likelihood ratio test. Through extensive simulation studies, we show that our RBMR has robust performance compared to other competing methods. We also apply our RBMR method to two benchmark data sets and find that RBMR has smaller bias and standard errors. Using our proposed RBMR method, we found that coronary artery disease (CAD) is associated with increased risk of coronavirus disease 2019 (COVID-19). We also develop a user-friendly R package RBMR for public use. **

** We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models. **