【AI Daily】2019-03-01 Friday

March 1 | 好东西传送门 (Good Stuff Portal)

【Machine Learning】

1) Douglas Hofstadter: the man teaching machines to learn to think

https://mp.weixin.qq.com/s/ZGOe6dJ9gb7tXogDit7Xmw

2) F-Principle: a first look at applying deep learning to computational mathematics

https://mp.weixin.qq.com/s/yXGuoBPmIeA3SteD_TliXg


【New Technologies & Applications】

1) Why internet companies follow a chicken-farming model while AI companies follow a child-raising model

https://mp.weixin.qq.com/s/64PjHAi7yzqmNciJKFmylQ

2) Is a new technology actually reliable? Just try it in China and you'll know

https://mp.weixin.qq.com/s/jzCJZ9mfTCgdYfwEygNhAg

3) Li Renjie, head of NetEase Fuxi AI Lab: empowering games with AI and putting it into practice

https://mp.weixin.qq.com/s/1FJcKskI2LPhwhIREVtlCg


【Fintech】 

1) Jianwei Data (见微数据): a powerful tool for searching corporate announcements

https://dwz.cn/WUh2njfl

2) General manager of the Agricultural Bank of China's R&D center: how can technology R&D break through in digital transformation?

https://dwz.cn/8YGsovUU

3) A study of the fintech strategies of 22 regional banks: perceptions, paths, and scenarios

https://dwz.cn/Fw4ayWI3 


【Natural Language Processing】

1) 【BERT-based text generation】 Pretraining-Based Natural Language Generation for Text Summarization

http://www.weibo.com/2678093863/HiHJibA4n 

2) A toolkit for building text datasets: crawls, cleans, and deduplicates web pages to create large-scale monolingual datasets

http://www.weibo.com/1402400261/HiHSweAHK 

3) Topic modeling examples with spaCy/Gensim/Textacy

http://www.weibo.com/1402400261/HiHPjF9VM 

4) Master's & PhD thesis series | Natural language understanding over knowledge bases, #04

https://mp.weixin.qq.com/s/hBcsPcs1z9GyYK2RoJeCFg

5) WeChat AI takes the global championship in an NLP competition

https://mp.weixin.qq.com/s/Jnp6jmy-8lloI7p4dAofKg



Neural text classification models typically treat output labels as categorical variables that lack descriptions and semantics. This forces their parametrization to depend on the label-set size, so they cannot scale to large label sets or generalize to unseen ones. Existing joint input-label text models overcome these issues by exploiting label descriptions, but they are unable to capture complex label relationships, have rigid parametrization, and their gains on unseen labels often come at the expense of weak performance on the labels seen during training. In this paper, we propose a new input-label model which generalizes over previous such models, addresses their limitations, and does not compromise performance on seen labels. The model consists of a joint non-linear input-label embedding with controllable capacity and a joint-space-dependent classification unit which is trained with cross-entropy loss to optimize classification performance. We evaluate the models on full-resource and low- or zero-resource text classification of multilingual news and biomedical text with a large label set. Our model outperforms both monolingual and multilingual models that do not leverage label semantics, as well as previous joint input-label space models, in both scenarios.
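
The model is only described at a high level above; the following minimal PyTorch sketch (with purely illustrative layer sizes and encoders, not the paper's exact parametrization) shows the general idea of a joint non-linear input-label embedding: documents and label descriptions are both projected into a shared space and scored by dot product, so the parameter count does not grow with the number of labels.

```python
import torch
import torch.nn as nn

class JointInputLabelModel(nn.Module):
    """Toy joint input-label embedding classifier (illustrative sketch only)."""
    def __init__(self, d_doc, d_label, d_joint):
        super().__init__()
        # Non-linear maps from document and label-description encodings into the joint space.
        self.doc_proj = nn.Sequential(nn.Linear(d_doc, d_joint), nn.Tanh())
        self.label_proj = nn.Sequential(nn.Linear(d_label, d_joint), nn.Tanh())

    def forward(self, doc_vecs, label_vecs):
        d = self.doc_proj(doc_vecs)       # (batch, d_joint)
        l = self.label_proj(label_vecs)   # (num_labels, d_joint)
        return d @ l.t()                  # (batch, num_labels) label scores

# Toy usage: unseen labels only require new label-description vectors, not new parameters.
model = JointInputLabelModel(d_doc=128, d_label=64, d_joint=96)
logits = model(torch.randn(4, 128), torch.randn(6, 64))
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 2, 5, 1]))
loss.backward()
```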


Rapidly developed neural models have achieved performance in Chinese word segmentation (CWS) competitive with their traditional counterparts. However, most methods suffer from computational inefficiency, especially on long sentences, because of increasing model complexity and slow decoders. This paper presents a simple neural segmenter that directly labels whether a word boundary exists in the gap between adjacent characters, alleviating this drawback. Our segmenter is fully end-to-end and performs segmentation very fast. We also show how performance differs across tag sets. The experiments show that our segmenter provides performance comparable to the state of the art.
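
As a rough illustration of the gap-labeling idea (not the paper's exact architecture; the bidirectional GRU encoder and all sizes are assumptions), each gap between adjacent characters gets a binary boundary/no-boundary decision:

```python
import torch
import torch.nn as nn

class GapSegmenter(nn.Module):
    def __init__(self, vocab_size, d_emb=64, d_hid=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d_emb)
        self.enc = nn.GRU(d_emb, d_hid, bidirectional=True, batch_first=True)
        self.gap_scorer = nn.Linear(4 * d_hid, 2)  # boundary vs. no boundary

    def forward(self, char_ids):
        h, _ = self.enc(self.emb(char_ids))               # (batch, seq_len, 2*d_hid)
        left, right = h[:, :-1], h[:, 1:]                 # states around each gap
        return self.gap_scorer(torch.cat([left, right], dim=-1))  # (batch, seq_len-1, 2)

# Toy usage: a sentence of 5 characters has 4 gaps to label.
model = GapSegmenter(vocab_size=5000)
logits = model(torch.randint(0, 5000, (1, 5)))
gold = torch.tensor([[1, 0, 1, 0]])                       # 1 = word boundary in that gap
loss = nn.functional.cross_entropy(logits.reshape(-1, 2), gold.reshape(-1))
loss.backward()
```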


Computing universal distributed representations of sentences is a fundamental task in natural language processing. We propose ConsSent, a simple yet surprisingly powerful unsupervised method to learn such representations by enforcing consistency constraints on sequences of tokens. We consider two classes of such constraints -- whether a single sequence forms a sentence, and whether two sequences form a sentence when merged. We learn sentence encoders by training them to distinguish between consistent and inconsistent examples, the latter generated by randomly perturbing consistent examples in six different ways. Extensive evaluation on several transfer learning and linguistic probing tasks shows improved performance over strong unsupervised and supervised baselines, substantially surpassing them in several cases. Our best results are achieved by training sentence encoders in a multitask setting and by an ensemble of encoders trained on the individual tasks.
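
A minimal sketch of the training signal, assuming a toy bag-of-embeddings encoder and only two of the six perturbations (swap and delete) for brevity; none of this is the paper's exact setup:

```python
import random
import torch
import torch.nn as nn

def perturb(tokens):
    """Create an inconsistent example by randomly corrupting a sentence."""
    t = list(tokens)
    if random.random() < 0.5 and len(t) > 1:
        i, j = random.sample(range(len(t)), 2)
        t[i], t[j] = t[j], t[i]          # swap two tokens
    elif len(t) > 1:
        del t[random.randrange(len(t))]  # delete a token
    return t

class ConsistencyModel(nn.Module):
    def __init__(self, vocab_size, d=64):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab_size, d)   # toy sentence encoder
        self.clf = nn.Linear(d, 2)                  # consistent vs. inconsistent

    def forward(self, token_ids):
        return self.clf(self.emb(token_ids.unsqueeze(0)))

# Toy usage with integer token ids standing in for words.
model = ConsistencyModel(vocab_size=1000)
sentence = [3, 17, 256, 42, 7]
pos, neg = torch.tensor(sentence), torch.tensor(perturb(sentence))
loss = (nn.functional.cross_entropy(model(pos), torch.tensor([1])) +
        nn.functional.cross_entropy(model(neg), torch.tensor([0])))
loss.backward()
```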


Large-scale probabilistic representations, including statistical knowledge bases and graphical models, are increasingly in demand. They are built by mining massive sources of structured and unstructured data, the latter often derived from natural language processing techniques. The very nature of the enterprise makes the extracted representations probabilistic. In particular, inducing relations and facts from noisy and incomplete sources via statistical machine learning models means that the labels are either already probabilistic, or that probabilities approximate confidence. While the progress is impressive, extracted representations essentially enforce the closed-world assumption (CWA): every fact in the database is accorded its corresponding probability, but all other facts have probability zero. The CWA is deeply problematic in most machine learning contexts. A principled solution is needed for representing incomplete and indeterminate knowledge in such models; imprecise probability models such as credal networks are one example. In this work, we are interested in the foundational problem of learning such open-world probabilistic models. However, since exact inference in probabilistic graphical models is intractable, the paradigm of tractable learning has emerged to learn data structures (such as arithmetic circuits) that support efficient probabilistic querying. We show here how the computational machinery underlying tractable learning has to be generalized for imprecise probabilities. Our empirical evaluations demonstrate that our regime is also effective.
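
The contrast between the closed-world assumption and an open-world, imprecise-probability reading can be made concrete with a toy example; the facts, numbers, and interval semantics below are illustrative only and are not the paper's learning algorithm:

```python
# Facts mined by some extractor, with their estimated probabilities (made up here).
extracted = {
    ("paris", "capital_of", "france"): 0.97,
    ("lyon",  "capital_of", "france"): 0.04,
}

def query_closed_world(fact):
    # CWA: any fact not in the database is assumed false.
    return extracted.get(fact, 0.0)

def query_open_world(fact):
    # Open world: missing facts are unknown, reported as a probability interval.
    p = extracted.get(fact)
    return (p, p) if p is not None else (0.0, 1.0)

unseen = ("kyoto", "capital_of", "japan")
print(query_closed_world(unseen))   # 0.0 -- silently asserted false under the CWA
print(query_open_world(unseen))     # (0.0, 1.0) -- genuinely unknown
```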


The field of natural language processing has seen impressive progress in recent years, with neural network models replacing many of the traditional systems. A plethora of new models have been proposed, many of which are thought to be opaque compared to their feature-rich counterparts. This has led researchers to analyze, interpret, and evaluate neural networks in novel and more fine-grained ways. In this survey paper, we review analysis methods in neural language processing, categorize them according to prominent research trends, highlight existing limitations, and point to potential directions for future work.


Natural Language Inference (NLI) is a fundamental and challenging task in Natural Language Processing (NLP). Most existing methods apply only a one-pass inference process on a mixed matching feature, i.e., a concatenation of different matching features between a premise and a hypothesis. In this paper, we propose a new model called Multi-turn Inference Matching Network (MIMN) to perform multi-turn inference over the different matching features. In each turn, the model focuses on one particular matching feature instead of the mixed matching feature. To enhance interaction between the different matching features, a memory component stores the inference history, and the inference of each turn is performed on the current matching feature and the memory. We conduct experiments on three NLI datasets. The results show that our model outperforms or matches state-of-the-art performance on all three datasets.
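
A minimal sketch of multi-turn inference over separate matching features, using a GRU cell as an illustrative memory update (not necessarily the paper's exact inference or memory component):

```python
import torch
import torch.nn as nn

class MultiTurnInference(nn.Module):
    def __init__(self, d_match, d_mem, n_classes=3):
        super().__init__()
        self.memory_update = nn.GRUCell(d_match, d_mem)
        self.clf = nn.Linear(d_mem, n_classes)   # entailment / neutral / contradiction

    def forward(self, matching_features):
        # matching_features: list of (batch, d_match) tensors, one per inference turn.
        batch = matching_features[0].size(0)
        memory = torch.zeros(batch, self.memory_update.hidden_size)
        for m in matching_features:              # one turn per matching feature
            memory = self.memory_update(m, memory)
        return self.clf(memory)

# Toy usage: three matching features (e.g. concatenation-, difference-, and
# product-based) for a batch of two premise-hypothesis pairs.
model = MultiTurnInference(d_match=128, d_mem=128)
logits = model([torch.randn(2, 128) for _ in range(3)])
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 2]))
loss.backward()
```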


This paper proposes a variational self-attention model (VSAM) that employs variational inference to derive self-attention. We model the self-attention vector as random variables by imposing a probabilistic distribution. The self-attention mechanism summarizes source information as an attention vector via a weighted sum, where the weights follow a learned probabilistic distribution. Compared with its conventional deterministic counterpart, the stochastic units incorporated by VSAM allow multi-modal attention distributions. Furthermore, by marginalizing over the latent variables, VSAM is more robust against overfitting. Experiments on the stance detection task demonstrate the superiority of our method.
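
One illustrative reading of the idea (not necessarily the paper's exact formulation): sample a Gaussian latent query via the reparameterization trick, use it to weight the tokens, and regularize with a KL term to a standard-normal prior:

```python
import torch
import torch.nn as nn

class VariationalSelfAttention(nn.Module):
    def __init__(self, d_model, d_latent):
        super().__init__()
        self.to_mu = nn.Linear(d_model, d_latent)
        self.to_logvar = nn.Linear(d_model, d_latent)
        self.key = nn.Linear(d_model, d_latent)

    def forward(self, h):
        # h: (batch, seq_len, d_model) token representations
        pooled = h.mean(dim=1)                                     # summary used to infer the latent
        mu, logvar = self.to_mu(pooled), self.to_logvar(pooled)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)    # reparameterization trick
        scores = torch.einsum("bld,bd->bl", self.key(h), z)        # stochastic attention scores
        weights = torch.softmax(scores, dim=-1)
        context = torch.einsum("bl,bld->bd", weights, h)           # attended summary vector
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return context, kl.mean()

# Toy usage: the KL term would be added to the task loss (e.g. stance classification).
layer = VariationalSelfAttention(d_model=64, d_latent=32)
context, kl = layer(torch.randn(2, 10, 64))
```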


The paper presents a first attempt at unsupervised neural text simplification that relies only on unlabeled text corpora. The core framework comprises a shared encoder and a pair of attentional decoders that gain knowledge of both text simplification and complexification through discriminator-based losses, back-translation, and denoising. The framework is trained on unlabeled text collected from an English Wikipedia dump. Our analysis (both quantitative and qualitative, involving human evaluators) on public test data shows that our model performs simplification at both the lexical and syntactic levels, competitive with existing supervised methods. We open-source our implementation for academic use.
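
A compact sketch of the unsupervised training signals (denoising reconstruction plus back-translation between a shared encoder and two style-specific heads). The discriminator losses and real attentional decoders are omitted; the token-replacement noise, per-position output heads, and module sizes are illustrative simplifications, not the paper's architecture:

```python
import torch
import torch.nn as nn

VOCAB, D = 1000, 64
emb = nn.Embedding(VOCAB, D)
encoder = nn.GRU(D, D, batch_first=True)                      # shared encoder
heads = nn.ModuleDict({"simple": nn.Linear(D, VOCAB),
                       "complex": nn.Linear(D, VOCAB)})       # toy style "decoders"

def decode(ids, style):
    """Encode a token sequence and produce per-position logits for one style."""
    h, _ = encoder(emb(ids))
    return heads[style](h)                                    # (batch, len, VOCAB)

def noisy(ids, p=0.1):
    """Denoising corruption: replace tokens with random ones, keeping length."""
    mask = torch.rand(ids.shape) < p
    return torch.where(mask, torch.randint_like(ids, VOCAB), ids)

def training_step(simple_ids, complex_ids):
    ce = lambda logits, tgt: nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), tgt.reshape(-1))
    # 1) Denoising: each style reconstructs its own corrupted sentence.
    loss = ce(decode(noisy(simple_ids), "simple"), simple_ids)
    loss = loss + ce(decode(noisy(complex_ids), "complex"), complex_ids)
    # 2) Back-translation: pseudo-simplify a complex sentence without gradients,
    #    then train the complex head to recover the original from it.
    with torch.no_grad():
        pseudo_simple = decode(complex_ids, "simple").argmax(-1)
    loss = loss + ce(decode(pseudo_simple, "complex"), complex_ids)
    return loss

# Toy usage with random token ids standing in for real sentences.
loss = training_step(torch.randint(0, VOCAB, (1, 8)), torch.randint(0, VOCAB, (1, 8)))
loss.backward()
```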
