Common reporting styles for statistical results, such as $p$-values and confidence intervals (CI), have been reported to be prone to dichotomous interpretations, especially with respect to null hypothesis testing frameworks. For example, when the $p$-value is small enough or the CIs of the mean effects of a studied drug and a placebo are not overlapping, scientists tend to claim significant differences while often disregarding the magnitudes and absolute differences in the effect sizes. Techniques relying on the visual estimation of the strength of evidence have been recommended to reduce such dichotomous interpretations but their effectiveness has also been challenged. We ran two experiments to compare several alternative representations of confidence intervals and used Bayesian multilevel models to estimate the effects of the representation styles on differences in subjective confidence in the results. We also asked the respondents' opinions and preferences in representation styles. Our results suggest that adding visual information to classic CI representation can decrease the tendency towards dichotomous interpretations $-$ measured as the "cliff effect": the sudden drop in confidence around $p$-value 0.05 $-$ compared with classic CI visualization and textual representation of the CI with $p$-values. As a contribution to open science, our data and all analyses are publicly available at https://github.com/helske/statvis .

### 相关内容

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

A comprehensive artificial intelligence system needs to not only perceive the environment with different `senses' (e.g., seeing and hearing) but also infer the world's conditional (or even causal) relations and corresponding uncertainty. The past decade has seen major advances in many perception tasks such as visual object recognition and speech recognition using deep learning models. For higher-level inference, however, probabilistic graphical models with their Bayesian nature are still more powerful and flexible. In recent years, Bayesian deep learning has emerged as a unified probabilistic framework to tightly integrate deep learning and Bayesian models. In this general framework, the perception of text or images using deep learning can boost the performance of higher-level inference and in turn, the feedback from the inference process is able to enhance the perception of text or images. This survey provides a comprehensive introduction to Bayesian deep learning and reviews its recent applications on recommender systems, topic models, control, etc. Besides, we also discuss the relationship and differences between Bayesian deep learning and other related topics such as Bayesian treatment of neural networks.

Detection and recognition of text in natural images are two main problems in the field of computer vision that have a wide variety of applications in analysis of sports videos, autonomous driving, industrial automation, to name a few. They face common challenging problems that are factors in how text is represented and affected by several environmental conditions. The current state-of-the-art scene text detection and/or recognition methods have exploited the witnessed advancement in deep learning architectures and reported a superior accuracy on benchmark datasets when tackling multi-resolution and multi-oriented text. However, there are still several remaining challenges affecting text in the wild images that cause existing methods to underperform due to there models are not able to generalize to unseen data and the insufficient labeled data. Thus, unlike previous surveys in this field, the objectives of this survey are as follows: first, offering the reader not only a review on the recent advancement in scene text detection and recognition, but also presenting the results of conducting extensive experiments using a unified evaluation framework that assesses pre-trained models of the selected methods on challenging cases, and applies the same evaluation criteria on these techniques. Second, identifying several existing challenges for detecting or recognizing text in the wild images, namely, in-plane-rotation, multi-oriented and multi-resolution text, perspective distortion, illumination reflection, partial occlusion, complex fonts, and special characters. Finally, the paper also presents insight into the potential research directions in this field to address some of the mentioned challenges that are still encountering scene text detection and recognition techniques.

Over the past few years, we have seen fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks. At the same time, the amount of data collected in a wide array of scientific domains is dramatically increasing in both size and complexity. Taken together, this suggests many exciting opportunities for deep learning applications in scientific settings. But a significant challenge to this is simply knowing where to start. The sheer breadth and diversity of different deep learning techniques makes it difficult to determine what scientific problems might be most amenable to these methods, or which specific combination of methods might offer the most promising first approach. In this survey, we focus on addressing this central issue, providing an overview of many widely used deep learning models, spanning visual, sequential and graph structured data, associated tasks and different training methods, along with techniques to use deep learning with less data and better interpret these complex models --- two central considerations for many scientific use cases. We also include overviews of the full design process, implementation tips, and links to a plethora of tutorials, research summaries and open-sourced deep learning pipelines and pretrained models, developed by the community. We hope that this survey will help accelerate the use of deep learning across different scientific domains.

Since deep neural networks were developed, they have made huge contributions to everyday lives. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks are still challenging and unpredictable procedures. To lower the technical thresholds for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academic and industrial areas. This paper provides a review of the most essential topics on HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods to define the value range. Then, the research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy especially for deep learning networks. This study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art searching algorithms, feasibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.

Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy and economics, for decades. Nowadays, estimating causal effect from observational data has become an appealing research direction owing to the large amount of available data and low budget requirement, compared with randomized controlled trials. Embraced with the rapidly developed machine learning area, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well known causal inference framework. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including the applications in advertising, recommendation, medicine and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are also summarized, which facilitate researchers and practitioners to explore, evaluate and apply the causal inference methods.

Breast cancer remains a global challenge, causing over 1 million deaths globally in 2018. To achieve earlier breast cancer detection, screening x-ray mammography is recommended by health organizations worldwide and has been estimated to decrease breast cancer mortality by 20-40%. Nevertheless, significant false positive and false negative rates, as well as high interpretation costs, leave opportunities for improving quality and access. To address these limitations, there has been much recent interest in applying deep learning to mammography; however, obtaining large amounts of annotated data poses a challenge for training deep learning models for this purpose, as does ensuring generalization beyond the populations represented in the training dataset. Here, we present an annotation-efficient deep learning approach that 1) achieves state-of-the-art performance in mammogram classification, 2) successfully extends to digital breast tomosynthesis (DBT; "3D mammography"), 3) detects cancers in clinically-negative prior mammograms of cancer patients, 4) generalizes well to a population with low screening rates, and 5) outperforms five-out-of-five full-time breast imaging specialists by improving absolute sensitivity by an average of 14%. Our results demonstrate promise towards software that can improve the accuracy of and access to screening mammography worldwide.

Deep learning has penetrated all aspects of our lives and brought us great convenience. However, the process of building a high-quality deep learning system for a specific task is not only time-consuming but also requires lots of resources and relies on human expertise, which hinders the development of deep learning in both industry and academia. To alleviate this problem, a growing number of research projects focus on automated machine learning (AutoML). In this paper, we provide a comprehensive and up-to-date study on the state-of-the-art AutoML. First, we introduce the AutoML techniques in details according to the machine learning pipeline. Then we summarize existing Neural Architecture Search (NAS) research, which is one of the most popular topics in AutoML. We also compare the models generated by NAS algorithms with those human-designed models. Finally, we present several open problems for future research.

One of the main challenges in ranking is embedding the query and document pairs into a joint feature space, which can then be fed to a learning-to-rank algorithm. To achieve this representation, the conventional state of the art approaches perform extensive feature engineering that encode the similarity of the query-answer pair. Recently, deep-learning solutions have shown that it is possible to achieve comparable performance, in some settings, by learning the similarity representation directly from data. Unfortunately, previous models perform poorly on longer texts, or on texts with significant portion of irrelevant information, or which are grammatically incorrect. To overcome these limitations, we propose a novel ranking algorithm for question answering, QARAT, which uses an attention mechanism to learn on which words and phrases to focus when building the mutual representation. We demonstrate superior ranking performance on several real-world question-answer ranking datasets, and provide visualization of the attention mechanism to otter more insights into how our models of attention could benefit ranking for difficult question answering challenges.

Because of continuous advances in mathematical programing, Mix Integer Optimization has become a competitive vis-a-vis popular regularization method for selecting features in regression problems. The approach exhibits unquestionable foundational appeal and versatility, but also poses important challenges. We tackle these challenges, reducing computational burden when tuning the sparsity bound (a parameter which is critical for effectiveness) and improving performance in the presence of feature collinearity and of signals that vary in nature and strength. Importantly, we render the approach efficient and effective in applications of realistic size and complexity - without resorting to relaxations or heuristics in the optimization, or abandoning rigorous cross-validation tuning. Computational viability and improved performance in subtler scenarios is achieved with a multi-pronged blueprint, leveraging characteristics of the Mixed Integer Programming framework and by means of whitening, a data pre-processing step.

We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component, we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than solving the overall original problem, where one learns parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each one of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy, and also improved computation complexity than existing moment based mixture model algorithms (e.g. tensor methods). We also illustrate several natural ways one can obtain such side information, for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms showing significant improvement in runtime and accuracy.

Hao Wang,Dit-Yan Yeung
26+阅读 · 7月2日
Zobeir Raisi,Mohamed A. Naiel,Paul Fieguth,Steven Wardell,John Zelek
10+阅读 · 6月8日
Maithra Raghu,Eric Schmidt
26+阅读 · 3月26日
Liuyi Yao,Zhixuan Chu,Sheng Li,Yaliang Li,Jing Gao,Aidong Zhang
40+阅读 · 2月5日
William Lotter,Abdul Rahman Diab,Bryan Haslam,Jiye G. Kim,Giorgia Grisot,Eric Wu,Kevin Wu,Jorge Onieva Onieva,Jerrold L. Boxerman,Meiyun Wang,Mack Bandler,Gopal Vijayaraghavan,A. Gregory Sorensen
8+阅读 · 2019年12月27日
Xin He,Kaiyong Zhao,Xiaowen Chu
27+阅读 · 2019年8月14日
3+阅读 · 2018年8月8日
Ana Kenney,Francesca Chiaromonte,Giovanni Felici
4+阅读 · 2018年8月7日
Avik Ray,Joe Neeman,Sujay Sanghavi,Sanjay Shakkottai
3+阅读 · 2018年2月24日

22+阅读 · 7月27日

27+阅读 · 7月26日

17+阅读 · 2019年10月17日

CreateAMind
8+阅读 · 2019年5月22日
CreateAMind
25+阅读 · 2019年1月3日
CreateAMind
8+阅读 · 2019年1月2日

10+阅读 · 2018年12月24日
CreateAMind
20+阅读 · 2018年9月12日
CreateAMind
3+阅读 · 2018年4月15日

22+阅读 · 2017年11月16日

5+阅读 · 2017年9月24日

3+阅读 · 2017年8月6日
CreateAMind
5+阅读 · 2017年8月4日
Top