Standard likelihood penalties to learn Gaussian graphical models are based on regularising the off-diagonal entries of the precision matrix. Such methods, and their Bayesian counterparts, are not invariant to scalar multiplication of the variables, unless one standardises the observed data to unit sample variances. We show that such standardisation can have a strong effect on inference and introduce a new family of penalties based on partial correlations. We show that the latter, as well as the maximum likelihood, $L_0$ and logarithmic penalties are scale invariant. We illustrate the use of one such penalty, the partial correlation graphical LASSO, which sets an $L_{1}$ penalty on partial correlations. The associated optimization problem is no longer convex, but is conditionally convex. We show via simulated examples and in two real datasets that, besides being scale invariant, there can be important gains in terms of inference.
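The scale-invariance claim can be checked numerically. The sketch below is not the authors' code: it uses the standard definition of partial correlations from a precision matrix $\Theta$, $\rho_{ij} = -\Theta_{ij}/\sqrt{\Theta_{ii}\Theta_{jj}}$, together with an arbitrary illustrative covariance matrix and hypothetical unit changes (`scales`). Rescaling the variables changes the off-diagonal precision entries but leaves the partial correlations unchanged.

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlation matrix computed from a covariance matrix."""
    theta = np.linalg.inv(cov)            # precision matrix
    d = np.sqrt(np.diag(theta))
    rho = -theta / np.outer(d, d)         # rho_ij = -theta_ij / sqrt(theta_ii * theta_jj)
    np.fill_diagonal(rho, 1.0)
    return rho

# Illustrative covariance matrix (an assumption, not data from the paper).
cov = np.array([[2.0, 0.6, 0.3],
                [0.6, 1.0, 0.4],
                [0.3, 0.4, 1.5]])

# Hypothetical change of measurement units for each variable.
scales = np.array([1.0, 10.0, 0.1])
cov_scaled = cov * np.outer(scales, scales)

theta = np.linalg.inv(cov)
theta_scaled = np.linalg.inv(cov_scaled)
rho = partial_correlations(cov)
rho_scaled = partial_correlations(cov_scaled)

print("precision entries change:", not np.allclose(theta, theta_scaled))
print("partial correlations unchanged:", np.allclose(rho, rho_scaled))
```

A penalty placed on the entries of `theta` therefore depends on the units of measurement, while a penalty placed on `rho` does not.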


Related content

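Graphical Models (GMOD) is an internationally recognized, highly rated journal devoted to the creation, geometric processing, animation, and visualization of graphical models, and to their applications in engineering, science, culture, and entertainment. GMOD provides its readers with thoroughly reviewed and carefully selected papers that disseminate exciting innovations, teach rigorous theoretical foundations, propose robust and efficient solutions, or describe ambitious systems or applications across a variety of topics. Official website: http://dblp.uni-trier.de/db/journals/cvgip/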

We introduce generalized spatially coupled parallel concatenated codes (GSC-PCCs), a class of spatially coupled turbo-like codes obtained by coupling parallel concatenated codes (PCCs) with a fraction of information bits repeated before the PCC encoding. GSC-PCCs can be seen as a generalization of the original spatially coupled parallel concatenated convolutional codes (SC-PCCs) proposed by Moloudi et al. [1]. To characterize the asymptotic performance of GSC-PCCs, we derive the corresponding density evolution equations and compute their decoding thresholds. We show that the proposed codes have some nice properties such as threshold saturation and that their decoding thresholds improve with the repetition factor $q$. Most notably, our analysis suggests that the proposed codes asymptotically approach the capacity as $q$ tends to infinity with any given constituent convolutional code.


Learning individualized treatment rules (ITRs) is an important topic in precision medicine. Current literature mainly focuses on deriving ITRs from a single source population. We consider the observational data setting when the source population differs from a target population of interest. We assume subject covariates are available from both populations, but treatment and outcome data are only available from the source population. Although adjusting for differences between source and target populations can potentially lead to an improved ITR for the target population, it can substantially increase the variability in ITR estimation. To address this dilemma, we develop a weighting framework that aims to tailor an ITR for a given target population and protect against high variability due to superfluous covariate shift adjustments. Our method seeks covariate balance over a nonparametric function class characterized by a reproducing kernel Hilbert space and can improve many ITR learning methods that rely on weights. We show that the proposed method encompasses importance weights and the so-called overlap weights as two extreme cases, allowing for a better bias-variance trade-off in between. Numerical examples demonstrate that the use of our weighting method can greatly improve ITR estimation for the target population compared with other weighting methods.


Contemporary data-driven methods are typically trained with full supervision on large-scale datasets, which limits their applicability. In real systems, however, measurement error and data acquisition problems usually leave the data incomplete. Although data completion has attracted wide attention, the underlying data patterns and relations remain underexplored. Latent variable models currently allow learning deep latent variables over observed variables by fitting the marginal distribution. As far as we know, current methods fail to capture the relations within the data under partial observation. To model incomplete data, this work uses relational inference to fill in the missing entries. Specifically, we aim to approximate the true joint distribution over the partial observation and the latent variables, and thereby infer the unseen targets. To this end, we propose the Omni-Relational Network (OR-Net), which models pointwise relations in two ways: (i) an inner relationship is built among the context points in the partial observation; (ii) the unseen targets are inferred by learning their cross-relationship with the observed data points. We further find that the proposed method generalizes to different scenarios regardless of whether the physical structure can be observed. OR-Net is shown to generalize well across data completion tasks of various modalities, including function regression, image completion on the MNIST and CelebA datasets, and sequential motion generation conditioned on observed poses.


In this paper, we propose a simple and easy-to-implement Bayesian hypothesis test for the presence of an association, described by Kendall's tau coefficient, between two variables measured on at least an ordinal scale. Because likelihood functions for the data are not available, we use the asymptotic sampling distributions of the test statistic as working likelihoods and then specify a truncated normal prior distribution on the noncentrality parameter of the alternative hypothesis. The resulting Bayes factor is available in closed form in terms of the cumulative distribution function of the standard normal distribution. Investigating the asymptotic behavior of the Bayes factor, we find conditions on the priors under which it is consistent whichever hypothesis is true. Simulation studies and a real-data application illustrate the effectiveness of the proposed Bayes factor. It is worth mentioning that the proposed method can easily be covered in undergraduate and graduate courses in nonparametric statistics that emphasize students' Bayesian thinking for data analysis.


[This paper was initially published at the PHME conference in 2016 and selected for further publication in the International Journal of Prognostics and Health Management.] This paper describes an Autoregressive Partially-hidden Markov model (ARPHMM) for fault detection and prognostics of equipment based on sensor data. It is a particular dynamic Bayesian network that represents the dynamics of a system by means of a Hidden Markov Model (HMM) and an autoregressive (AR) process. The Markov chain assumes that the system switches back and forth between internal states, while the AR process ensures temporal coherence of the sensor measurements. A sound learning procedure for the standard ARHMM, based on maximum likelihood, iteratively estimates all parameters simultaneously. This paper suggests a modification of the learning procedure for the case where prior knowledge about the structure is available, so that the structure becomes partially hidden. The integration of the prior is based on the Theory of Weighted Distributions, which is compatible with the Expectation-Maximization algorithm in the sense that the convergence properties are still satisfied. We show how to apply this model to estimate the remaining useful life based on health indicators. The autoregressive parameters can be used for prediction, while the latent structure provides information about the degradation level. The value of the proposed method for prognostics and health assessment is demonstrated on the CMAPSS datasets.


Recently, Chatterjee (2021) introduced a new rank-based correlation coefficient which can be used to test for independence between two random variables. His test has already attracted much attention as it is distribution-free, consistent against all fixed alternatives, asymptotically normal under the null hypothesis of independence, and computable in (near) linear time, which makes it appropriate for large-scale applications. However, not much is known about the power properties of this test beyond consistency against fixed alternatives. In this paper, we bridge this gap by obtaining the asymptotic distribution of Chatterjee's correlation under any changing sequence of alternatives "converging" to the null hypothesis (of independence). We further obtain a general result that gives exact detection thresholds and limiting power for Chatterjee's test of independence under natural nonparametric alternatives "converging" to the null. As applications of this general result, we prove a non-standard $n^{-1/4}$ detection boundary for this test and compute explicitly the limiting local power on the detection boundary for alternatives widely studied in the literature, such as mixture models, rotation models and noisy nonparametric regression. Moreover, our convergence results provide explicit finite-sample bounds that depend on the "distance" between the null and the alternative. Our proof techniques rely on second-order Poincaré-type inequalities and a non-asymptotic projection theorem.
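For readers unfamiliar with the coefficient, here is a minimal sketch of Chatterjee's rank correlation in the no-ties case, where $\xi_n = 1 - 3\sum_{i=1}^{n-1}|r_{i+1}-r_i|/(n^2-1)$ and $r_i$ is the rank of the $Y$ value paired with the $i$-th smallest $X$. The function name `chatterjee_xi` is illustrative, ties are assumed absent, and the sort makes this version $O(n \log n)$.

```python
import numpy as np

def chatterjee_xi(x, y):
    """Chatterjee's rank correlation, assuming no ties in x or y."""
    x = np.asarray(x)
    y = np.asarray(y)
    n = len(x)
    order = np.argsort(x, kind="stable")            # sort the pairs by x
    y_sorted = y[order]
    ranks = np.argsort(np.argsort(y_sorted)) + 1    # ranks of y in x-sorted order
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)

# A perfectly monotone relationship with n = 5 gives 1 - 3/(n + 1) = 0.5.
xi_monotone = chatterjee_xi([1, 2, 3, 4, 5], [10, 20, 30, 40, 50])
print(xi_monotone)
```

For a noiseless monotone relationship the value tends to 1 as $n$ grows, whereas under independence it concentrates around 0.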


In a multiple linear regression model, the algebraic formula of the decomposition theorem explains, in geometric terms, the relationship between the univariate regression coefficient and the partial regression coefficients. Univariate regression coefficients are found to decompose into their respective partial regression coefficients according to the parallelogram rule. Multicollinearity is analyzed with the help of the decomposition theorem. It is also shown that non-significant partial regression coefficients of important explanatory variables are a sample phenomenon, whereas a coefficient sign that deviates from expectation may be caused either by the population structure between the explained variables and the explanatory variables or by sample selection. At present, some methods for diagnosing multicollinearity consider only the correlation among explanatory variables, so these methods are basically unreliable, and handling multicollinearity is blind until the causes are distinguished. Increasing the sample size can help identify the causes of multicollinearity, and the difference method can play an auxiliary role.
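The abstract does not state the decomposition theorem explicitly. As a hedged illustration, the sketch below checks the textbook identity for two regressors, which may or may not coincide with the paper's formulation: the univariate slope of $y$ on $x_1$ equals $b_1 + b_2\,d_{21}$, where $b_1, b_2$ are the partial regression coefficients and $d_{21}$ is the slope from regressing $x_2$ on $x_1$. The simulated data and the helper `ols` are assumptions made for the example.

```python
import numpy as np

def ols(y, *cols):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated data with correlated regressors (illustrative only).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

b0, b1, b2 = ols(y, x1, x2)      # partial regression coefficients
simple_slope = ols(y, x1)[1]     # univariate regression coefficient of y on x1
aux_slope = ols(x2, x1)[1]       # slope of x2 on x1

print(simple_slope, b1 + b2 * aux_slope)
```

The identity holds exactly in any sample, so the two printed numbers agree up to floating-point error. When `x1` and `x2` are strongly correlated, the univariate and partial coefficients can differ substantially, even in sign.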


The existence of simple, uncoupled no-regret dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form (that is, tree-form) games generalize normal-form games by modeling both sequential and simultaneous moves, as well as private information. Because of the sequential nature of the game and the presence of partial information, extensive-form correlation has significantly different properties from its normal-form counterpart, many of which are still open research directions. Extensive-form correlated equilibrium (EFCE) has been proposed as the natural extensive-form counterpart to normal-form correlated equilibrium. However, it was previously unknown whether EFCE emerges as the result of uncoupled agent dynamics. In this paper, we give the first uncoupled no-regret dynamics that converge to the set of EFCEs in $n$-player general-sum extensive-form games with perfect recall. First, we introduce a notion of trigger regret in extensive-form games, which extends that of internal regret in normal-form games. When each player has low trigger regret, the empirical frequency of play is close to an EFCE. Then, we give an efficient no-trigger-regret algorithm. Our algorithm decomposes trigger regret into local subproblems at each decision point for the player, and constructs a global strategy of the player from the local solutions at each decision point.


Machine learning methods are powerful in distinguishing different phases of matter in an automated way and provide a new perspective on the study of physical phenomena. We train a Restricted Boltzmann Machine (RBM) on data constructed with spin configurations sampled from the Ising Hamiltonian at different values of temperature and external magnetic field using Monte Carlo methods. From the trained machine we obtain the flow of iterative reconstruction of spin state configurations to faithfully reproduce the observables of the physical system. We find that the flow of the trained RBM approaches the spin configurations of the maximal possible specific heat which resemble the near criticality region of the Ising model. In the special case of the vanishing magnetic field the trained RBM converges to the critical point of the Renormalization Group (RG) flow of the lattice model. Our results suggest an alternative explanation of how the machine identifies the physical phase transitions, by recognizing certain properties of the configuration like the maximization of the specific heat, instead of associating directly the recognition procedure with the RG flow and its fixed points. Then from the reconstructed data we deduce the critical exponent associated to the magnetization to find satisfactory agreement with the actual physical value. We assume no prior knowledge about the criticality of the system and its Hamiltonian.
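The abstract does not say which Monte Carlo scheme generates the training configurations. As one common possibility, the sketch below uses single-spin-flip Metropolis updates for the 2D Ising Hamiltonian $H = -J\sum_{\langle ij\rangle} s_i s_j - h\sum_i s_i$ with $J = 1$ and periodic boundaries. The lattice size, temperature, field and number of sweeps are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def metropolis_sweep(spins, beta, h, rng):
    """One Metropolis sweep (L*L attempted flips) on an L x L periodic lattice, J = 1."""
    L = spins.shape[0]
    for _ in range(spins.size):
        i, j = rng.integers(0, L, size=2)
        # Sum of the four nearest neighbours with periodic boundaries.
        nb = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
              + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2.0 * spins[i, j] * (nb + h)   # energy change if spin (i, j) is flipped
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1
    return spins

rng = np.random.default_rng(0)
L, beta, h = 8, 0.4, 0.0                    # illustrative values only
spins = rng.choice(np.array([-1, 1]), size=(L, L))
for _ in range(50):
    metropolis_sweep(spins, beta, h, rng)

print("magnetization per spin:", spins.mean())
```

Flattened configurations collected at several values of `beta` and `h` could then serve as binary training vectors for an RBM after mapping $\pm 1$ to $\{0, 1\}$.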


Knowledge graphs contain rich relational structures of the world, and thus complement data-driven machine learning in heterogeneous data. One of the most effective methods for representing knowledge graphs is to embed symbolic relations and entities into continuous spaces, where relations are approximately linear translations between projected images of entities in the relation space. However, state-of-the-art relation projection methods such as TransR, TransD or TransSparse do not model the correlation between relations, and thus do not scale to complex knowledge graphs with thousands of relations, in terms of either computational demand or statistical robustness. To this end, we introduce TransF, a novel translation-based method which mitigates the burden of relation projection by explicitly modeling the basis subspaces of projection matrices. As a result, TransF is far more lightweight than the existing projection methods, and is robust when facing a large number of relations. Experimental results on the canonical link prediction task show that our proposed model outperforms competing methods by a large margin and achieves state-of-the-art performance. In particular, TransF improves by 9%/5% in the head/tail entity prediction task for N-to-1/1-to-N relations over the best-performing translation-based method.

Related papers
Min Qiu, Xiaowei Wu, Jinhong Yuan, Alexandre Graell i Amat · May 3
Rui Chen, Jared D. Huling, Guanhua Chen, Menggang Yu · May 3
Qianyu Feng, Linchao Zhu, Bang Zhang, Pan Pan, Yi Yang · May 2
Pablo Juesas, Emmanuel Ramasso, Sébastien Drujont, Vincent Placet · May 1
Arnab Auddy, Nabarun Deb, Sagnik Nandy · April 30
Andrea Celli, Alberto Marchesi, Gabriele Farina, Nicola Gatti · June 20, 2020
Shotaro Shiba Funai, Dimitrios Giataganas · October 18, 2018
Kien Do, Truyen Tran, Svetha Venkatesh · January 26, 2018