Top 20 Recent Research Papers on Machine Learning and Deep Learning

February 7, 2018 | Deep Learning | Thuy T. Pham

Machine learning, and especially its subfield of deep learning, has seen many astonishing advances in recent years, and important research papers may lead to breakthroughs in technology used by billions of people. Research in this field is developing rapidly; to help our readers keep track of it, we present a list of the most important recent scientific papers published since 2014.

The criterion we used to select the 20 papers is citation counts from three academic sources: scholar.google.com, academic.microsoft.com, and semanticscholar.org. Since citation counts vary among sources and are estimates, we list the counts from academic.microsoft.com, which are slightly lower than the others.

For each paper we also give its year of publication and two metrics provided by semanticscholar.org: the Highly Influential Citations count (HIC) and Citation Velocity (CV). HIC, which captures how publications build upon and relate to each other, is the result of identifying meaningful citations. CV is the weighted average number of citations per year over the last three years. For some references, a CV of zero means the value was blank or not shown by semanticscholar.org.

Most (but not all) of these 20 papers, including the top eight, are on the topic of deep learning. Still, there is strong diversity: only one author (Yoshua Bengio) has two papers, and the papers appeared in many different venues: CoRR (3), ECCV (3), IEEE CVPR (3), NIPS (2), ACM Comp Surveys, ICML, IEEE PAMI, IEEE TKDE, Information Fusion, Int. J. on Computers & EE, JMLR, KDD, and Neural Networks. The top two papers have by far the highest citation counts; note that the second one was published only last year. Read (or re-read) them and learn about the latest advances.

  1. Dropout: a simple way to prevent neural networks from overfitting, by Hinton, G.E., Krizhevsky, A., Srivastava, N., Sutskever, I., & Salakhutdinov, R. (2014). Journal of Machine Learning Research, 15, 1929-1958. (cited 2084 times, HIC: 142, CV: 536).

    Summary: The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much, significantly reduces overfitting, and gives major improvements over other regularization methods.
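
A minimal NumPy sketch of the idea, using the common "inverted dropout" formulation that scales at training time (the paper instead rescales weights at test time; the two are equivalent in expectation):

```python
import numpy as np

def dropout(x, p=0.5, train=True, rng=np.random.default_rng(0)):
    """Inverted dropout: zero each unit with probability p during training
    and scale the survivors by 1/(1-p), so no rescaling is needed at test time."""
    if not train or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)

h = np.ones((2, 4))                   # a batch of hidden activations
print(dropout(h, p=0.5))              # about half zeroed, the rest scaled to 2.0
```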

  2. Deep Residual Learning for Image Recognition, by He, K., Ren, S., Sun, J., & Zhang, X. (2016). CoRR, abs/1512.03385. (cited 1436 times, HIC: 137, CV: 582).
    Summary: We present a residual learning framework to ease the training of deep neural networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
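
For illustration, here is a basic residual block in PyTorch with an identity shortcut; the paper's full architectures stack many such blocks, with projection shortcuts where dimensions change:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Computes y = F(x) + x: the block only has to learn the residual F."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)    # identity shortcut: gradients flow through "+ x"

x = torch.randn(1, 64, 32, 32)
print(ResidualBlock(64)(x).shape)    # torch.Size([1, 64, 32, 32])
```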

  3. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, by Sergey Ioffe, Christian Szegedy (2015). ICML. (cited 946 times, HIC: 56, CV: 0).
    Summary: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
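
A minimal training-mode sketch of the transform in NumPy (omitting the running statistics used at inference and the gradient updates to gamma and beta):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift
    with the learnable parameters gamma and beta."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.randn(32, 4) * 10 + 5      # a badly scaled batch of activations
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0).round(6), y.std(axis=0).round(3))   # ~0 mean, ~1 std per feature
```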

  4. Large-Scale Video Classification with Convolutional Neural Networks, by Fei-Fei, L., Karpathy, A., Leung, T., Shetty, S., Sukthankar, R., & Toderici, G. (2014). IEEE Conference on Computer Vision and Pattern Recognition. (cited 865 times, HIC: 24, CV: 239).
    Summary: Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image recognition problems. Encouraged by these results, we provide an extensive empirical evaluation of CNNs on large-scale video classification using a new dataset of 1 million YouTube videos belonging to 487 classes.

  5. Microsoft COCO: Common Objects in Context, by Belongie, S.J., Dollár, P., Hays, J., Lin, T., Maire, M., Perona, P., Ramanan, D., & Zitnick, C.L. (2014). ECCV. (cited 830 times, HIC: 78, CV: 279).
    Summary: We present a new dataset with the goal of advancing the state of the art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. Our dataset contains photos of 91 object types that would be easily recognizable by a 4-year-old. Finally, we provide baseline performance analysis for bounding box and segmentation detection results using a Deformable Parts Model.

  6. Learning deep features for scene recognition using places database, by Lapedriza, À., Oliva, A., Torralba, A., Xiao, J., & Zhou, B. (2014). NIPS. (cited 644 times, HIC: 65, CV: 0).
    Summary: We introduce a new scene-centric database called Places with over 7 million labeled pictures of scenes. We propose new methods to compare the density and diversity of image datasets and show that Places is as dense as other scene datasets and has more diversity.

  7. Generative adversarial nets, by Bengio, Y., Courville, A.C., Goodfellow, I.J., Mirza, M., Ozair, S., Pouget-Abadie, J., Warde-Farley, D., & Xu, B. (2014). NIPS. (cited 463 times, HIC: 55, CV: 0).
    Summary: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
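
A toy PyTorch sketch of the adversarial game on 1-D data; the architectures and hyperparameters here are placeholders, not the paper's:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))                # generator
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())  # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

real = torch.randn(64, 1) * 0.5 + 2.0    # "data" distribution: N(2.0, 0.5)
for step in range(500):
    fake = G(torch.randn(64, 8))
    # D learns to label real samples 1 and generated samples 0
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()
    # G learns to make D label its samples 1
    loss_g = bce(D(G(torch.randn(64, 8))), torch.ones(64, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

print(float(G(torch.randn(256, 8)).mean()))  # drifts toward the real mean of 2.0
```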

  8. High-Speed Tracking with Kernelized Correlation Filters, by Batista, J., Caseiro, R., Henriques, J.F., & Martins, P. (2015). CoRR, abs/1404.7584. (cited 439 times, HIC: 43, CV: 0).
    Summary: In most modern trackers, to cope with natural image changes, a classifier is typically trained with translated and scaled sample patches. We propose an analytic model for datasets of thousands of translated patches. By showing that the resulting data matrix is circulant, we can diagonalize it with the discrete Fourier transform, reducing both storage and computation by several orders of magnitude.
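
The core trick can be shown in a few lines of NumPy for the 1-D, linear-kernel case: ridge regression over all circular shifts of a sample reduces to element-wise operations in the Fourier domain (a simplified sketch, not the paper's full kernelized 2-D tracker):

```python
import numpy as np

def train_filter(x, y, lam=1e-2):
    """Ridge regression over every circular shift of x in O(n log n):
    the circulant data matrix is diagonalized by the DFT."""
    X, Y = np.fft.fft(x), np.fft.fft(y)
    W = np.conj(X) * Y / (np.conj(X) * X + lam)   # element-wise closed-form solution
    return np.real(np.fft.ifft(W))

n = 128
x = np.random.randn(n)                  # base sample
y = np.exp(-0.5 * np.arange(n) ** 2)    # desired response: peak at shift 0
w = train_filter(x, y)

z = np.roll(x, 30)                      # the same sample, circularly shifted
resp = np.real(np.fft.ifft(np.fft.fft(w) * np.fft.fft(z)))
print(resp.argmax())                    # 30: the shift is recovered as the response peak
```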

  9. A Review on Multi-Label Learning Algorithms, by Zhang, M., & Zhou, Z. (2014). IEEE TKDE. (cited 436 times, HIC: 7, CV: 91).
    Summary: This paper aims to provide a timely review of multi-label learning, which studies the problem where each example is represented by a single instance while associated with a set of labels simultaneously.
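
For concreteness, here is the simplest problem-transformation strategy covered by such reviews, binary relevance (one independent binary classifier per label), in scikit-learn; the dataset and base model are illustrative:

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Binary relevance: decompose the multi-label task into one binary task per label.
X, Y = make_multilabel_classification(n_samples=200, n_classes=5, random_state=0)
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
print(clf.predict(X[:3]))   # a 0/1 indicator per label for each example
```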

  10. How transferable are features in deep neural networks?, by Bengio, Y., Clune, J., Lipson, H., & Yosinski, J. (2014). CoRR, abs/1411.1792. (cited 402 times, HIC: 14, CV: 0).
    Summary: We experimentally quantify the generality versus specificity of neurons in each layer of a deep convolutional neural network and report a few surprising results. Transferability is negatively affected by two distinct issues: (1) the specialization of higher layer neurons to their original task at the expense of performance on the target task, which was expected, and (2) optimization difficulties related to splitting networks between co-adapted neurons, which was not expected.
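
An illustrative PyTorch/torchvision sketch of the transfer recipe the paper studies: keep the generic lower layers and retrain a task-specific top. The pretrained ResNet-18 is a stand-in for the paper's networks, and the snippet assumes a recent torchvision with the `weights` API:

```python
import torch
import torch.nn as nn
from torchvision import models

net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in net.parameters():
    p.requires_grad = False                        # freeze the transferred features
net.fc = nn.Linear(net.fc.in_features, 10)         # new head for a 10-class target task
optimizer = torch.optim.SGD(net.fc.parameters(), lr=1e-2)  # train only the head
```

Fine-tuning the whole network instead of freezing it is what the paper finds recovers the co-adaptation losses, at the cost of more compute.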

  11. Do we need hundreds of classifiers to solve real world classification problems?, by Amorim, D.G., Barro, S., Cernadas, E., & Delgado, M.F. (2014). Journal of Machine Learning Research. (cited 387 times, HIC: 3, CV: 0).
    Summary: We evaluate 179 classifiers arising from 17 families (discriminant analysis, Bayesian, neural networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging, stacking, random forests and other ensembles, generalized linear models, nearest-neighbors, partial least squares and principal component regression, logistic and multinomial regression, multiple adaptive regression splines and other methods). We use 121 data sets from the UCI database to study classifier behavior independently of the data set collection. The winners are the random forest (RF) versions implemented in R (and accessed via caret) and the SVM with Gaussian kernel implemented in C using LibSVM.
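
The study's two winners can be compared in a few lines of scikit-learn on a single UCI-style dataset (a toy illustration; one dataset proves nothing by itself, which is exactly why the paper uses 121):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
rf = RandomForestClassifier(n_estimators=500, random_state=0)
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))   # Gaussian kernel
print("RF :", cross_val_score(rf, X, y, cv=5).mean().round(3))
print("SVM:", cross_val_score(svm, X, y, cv=5).mean().round(3))
```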

  12. Knowledge vault: a web-scale approach to probabilistic knowledge fusion, by Dong, X., Gabrilovich, E., Heitz, G., Horn, W., Lao, N., Murphy, K., ... & Zhang, W. (2014, August). In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM. (cited 334 times, HIC: 7, CV: 107).
    Summary: We introduce Knowledge Vault, a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories for constructing knowledge bases. We employ supervised machine learning methods for fusing distinct information sources. The Knowledge Vault is substantially bigger than any previously published structured knowledge repository, and features a probabilistic inference system that computes calibrated probabilities of fact correctness.

  13. Scalable Nearest Neighbor Algorithms for High Dimensional Data, by Lowe, D.G., & Muja, M. (2014). IEEE Trans. Pattern Anal. Mach. Intell. (cited 324 times, HIC: 11, CV: 69).
    Summary: We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper.
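
The authors' library is FLANN; as a stand-in, here is a minimal scikit-learn sketch of tree-based nearest-neighbor matching on high-dimensional descriptors:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
db = rng.standard_normal((10_000, 32))     # database of 32-D feature descriptors
queries = rng.standard_normal((5, 32))

index = NearestNeighbors(n_neighbors=3, algorithm="kd_tree").fit(db)
dist, idx = index.kneighbors(queries)
print(idx)    # indices of the 3 closest database vectors for each query
```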

  14. Trends in extreme learning machines: a review, by Huang, G., Huang, G., Song, S., & You, K. (2015). Neural Networks. (cited 323 times, HIC: 0, CV: 0).
    Summary: We aim to report the current state of theoretical research and practical advances on the extreme learning machine (ELM). Apart from classification and regression, ELM has recently been extended to clustering, feature selection, representational learning, and many other learning tasks. Due to its remarkable efficiency, simplicity, and impressive generalization performance, ELM has been applied in a variety of domains, such as biomedical engineering, computer vision, system identification, and control and robotics.
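
The basic ELM recipe fits in a few lines of NumPy: a random, untrained hidden layer plus a closed-form least-squares output layer (a minimal regression sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)   # toy regression target

W = rng.standard_normal((5, 100))     # random input weights: fixed, never trained
b = rng.standard_normal(100)
H = np.tanh(X @ W + b)                # hidden-layer activations
# Output weights by regularized least squares: the only "training" step.
beta = np.linalg.solve(H.T @ H + 1e-3 * np.eye(100), H.T @ y)
print(np.mean((H @ beta - y) ** 2))   # training MSE
```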

  15. A survey on concept drift adaptation, by Bifet, A., Bouchachia, A., Gama, J., Pechenizkiy, M., & Zliobaite, I. (2014). ACM Comput. Surv. (cited 314 times, HIC: 4, CV: 23).
    Summary: This work aims at providing a comprehensive introduction to concept drift adaptation, which addresses an online supervised learning scenario in which the relation between the input data and the target variable changes over time.
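
One basic adaptation strategy from this literature, forgetting by sliding window, can be sketched with scikit-learn: keep only the most recent examples so the model tracks a changing concept (the drift point and model below are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
WINDOW = 200                                   # forget anything older than this
model = SGDClassifier()
seen_X, seen_y = [], []

for t in range(1000):
    x = rng.standard_normal(2)
    sign = 1.0 if t < 500 else -1.0            # abrupt concept drift at t = 500
    seen_X.append(x)
    seen_y.append(int(sign * x[0] + x[1] > 0))
    seen_X, seen_y = seen_X[-WINDOW:], seen_y[-WINDOW:]
    if t % 100 == 99:                          # periodically refit on the window only
        model.fit(np.array(seen_X), np.array(seen_y))

test_X = rng.standard_normal((500, 2))
test_y = (-test_X[:, 0] + test_X[:, 1] > 0).astype(int)   # post-drift concept
print(model.score(test_X, test_y))             # high: the window tracked the drift
```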

  16. Multi-scale Orderless Pooling of Deep Convolutional Activation Features, by Gong, Y., Guo, R., Lazebnik, S., & Wang, L. (2014). ECCV. (cited 293 times, HIC: 23, CV: 95).
    Summary: To improve the invariance of CNN activations without degrading their discriminative power, this paper presents a simple but effective scheme called multi-scale orderless pooling (MOP-CNN).

  17. Simultaneous Detection and Segmentation, by Arbeláez, P.A., Girshick, R.B., Hariharan, B., & Malik, J. (2014). ECCV. (cited 286 times, HIC: 23, CV: 94).
    Summary: We aim to detect all instances of a category in an image and, for each instance, mark the pixels that belong to it. We call this task Simultaneous Detection and Segmentation (SDS).

  18. A survey on feature selection methods, by Chandrashekar, G., & Sahin, F. Int. J. on Computers & Electrical Engineering. (cited 279 times, HIC: 1, CV: 58).
    Summary: Plenty of feature selection methods are available in the literature, driven by the availability of data with hundreds of variables and hence very high dimension.
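
As a concrete example of the simplest family such surveys cover, a univariate filter method in scikit-learn ranks each feature independently of any model:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)        # 30 features
selector = SelectKBest(f_classif, k=5).fit(X, y)  # ANOVA F-score per feature
print(selector.get_support(indices=True))         # indices of the 5 top-ranked features
```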

  19. One Millisecond Face Alignment with an Ensemble of Regression Trees, by Kazemi, Vahid, and Josephine Sullivan (2014). Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (cited 277 times, HIC: 15, CV: 0).
    Summary: This paper addresses the problem of Face Alignment for a single image. We show how an ensemble of regression trees can be used to estimate the face's landmark positions directly from a sparse subset of pixel intensities, achieving super-realtime performance with high quality predictions.

  20. A survey of multiple classifier systems as hybrid systems, by Corchado, E., Graña, M., & Wozniak, M. (2014). Information Fusion, 16, 3-17. (cited 269 times, HIC: 1, CV: 22).
    Summary: A current focus of intense research in pattern classification is the combination of several classifier systems, which can be built following either the same or different models and/or dataset-building approaches.
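
A miniature hybrid multiple-classifier system can be built in scikit-learn by combining three different model families with majority voting (illustrative, not a method from the survey itself):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=5000)),
        ("dt", DecisionTreeClassifier(max_depth=5)),
        ("nb", GaussianNB()),
    ],
    voting="hard",                # majority vote over the three predictions
)
print(cross_val_score(ensemble, X, y, cv=5).mean().round(3))
```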


Article: https://www.kdnuggets.com/2017/04/top-20-papers-machine-learning.html
