
Title: Integrating Deep Learning with Logic Fusion for Information Extraction

Abstract:

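Information extraction (IE) aims to produce structured information from input text; named entity recognition and relation extraction are typical examples. Various approaches to IE have been proposed, based on either feature engineering or deep learning. However, most of these approaches fail to capture the complex relationships inherent in the task itself, even though such relationships have proven to be especially important. For example, the relation between two entities depends heavily on their entity types. These dependencies can be regarded as complex constraints that can be expressed effectively as logic rules. To combine this logical reasoning capability with the learning capability of deep neural networks, we propose to integrate logical knowledge in the form of first-order logic into a deep learning system, trained jointly in an end-to-end manner. The integrated framework enhances the neural outputs with knowledge regularization via logic rules, and at the same time updates the weights of the logic rules according to the characteristics of the training data. We demonstrate the effectiveness and generalization of the proposed model on multiple IE tasks.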
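The idea of regularizing neural outputs with a weighted logic rule can be illustrated with a small sketch. It is not the paper's model: it assumes a Lukasiewicz soft-logic relaxation, a single hypothetical rule of the form relation(e1, e2) -> type(e1) AND type(e2), and made-up probabilities standing in for the network's outputs. The function names and the `weight` parameter are illustrative only.

```python
def soft_and(a, b):
    """Lukasiewicz t-norm: truth value of (a AND b), inputs in [0, 1]."""
    return max(0.0, a + b - 1.0)


def soft_implies(a, b):
    """Lukasiewicz implication: truth value of (a -> b), inputs in [0, 1]."""
    return min(1.0, 1.0 - a + b)


def rule_penalty(p_relation, p_type_head, p_type_tail, weight=1.0):
    """Penalty for violating the hypothetical rule
    relation(e1, e2) -> type(e1) AND type(e2).

    The arguments are assumed to be probabilities predicted by a neural
    model; `weight` plays the role of a rule weight that a full system
    could learn from data. The penalty is zero when the rule is satisfied.
    """
    head = soft_and(p_type_head, p_type_tail)
    return weight * (1.0 - soft_implies(p_relation, head))


# Predictions that agree with the rule incur a small penalty ...
consistent = rule_penalty(p_relation=0.9, p_type_head=0.95, p_type_tail=0.9)
# ... while a confident relation with an unlikely entity type is penalized.
inconsistent = rule_penalty(p_relation=0.9, p_type_head=0.2, p_type_tail=0.9)
print(consistent, inconsistent)
```

In a trainable setting such a penalty would be added to the task loss, so that gradient updates push the entity-type and relation predictions toward mutual consistency.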

Author:

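Sinno Jialin Pan is a Provost's Chair Associate Professor in the School of Computer Science and Engineering at Nanyang Technological University. His research interests are transfer learning, data mining, artificial intelligence, and machine learning.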


Popular content

The questions "can machines think?" and "can machines do what humans do?" drive the development of artificial intelligence. Although recent artificial intelligence succeeds in many data-intensive applications, it still lacks the ability to learn from limited exemplars and to generalize quickly to new tasks. To tackle this problem, one has to turn to machine learning, which supports the scientific study of artificial intelligence. In particular, a machine learning problem called Few-Shot Learning (FSL) targets this case. By drawing on prior knowledge, FSL can rapidly generalize to new tasks that offer only limited supervised experience, mimicking the human ability to acquire knowledge from few examples through generalization and analogy. It has been seen as a test-bed for real artificial intelligence, a way to reduce laborious data gathering and computationally costly training, and an antidote for learning rare cases. With extensive work on FSL emerging, we give a comprehensive survey of it. We first give the formal definition of FSL. Then we point out the core issues of FSL, which turns the problem from "how to solve FSL" into "how to deal with the core issues". Accordingly, existing works, from the birth of FSL to the most recently published ones, are categorized in a unified taxonomy, with a thorough discussion of the pros and cons of the different categories. Finally, we envision possible future directions for FSL in terms of problem setup, techniques, applications and theory, hoping to provide insights to both beginners and experienced researchers.


Latest content

Broadcast/multicast communication systems are typically designed to optimize the outage rate criterion, which neglects the performance of the fraction of clients with the worst channel conditions. Targeting ultra-reliable communication scenarios, this paper takes a complementary approach by introducing the conditional value-at-risk (CVaR) rate as the expected rate of a worst-case fraction of clients. To support differential quality-of-service (QoS) levels in this class of clients, layered division multiplexing (LDM) is applied, which enables decoding at different rates. Focusing on a practical scenario in which the transmitter does not know the fading distribution, layer allocation is optimized based on a dataset sampled during deployment. The optimality gap caused by the availability of limited data is bounded via a generalization analysis, and the sample complexity is shown to increase as the designated fraction of worst-case clients decreases. Considering this theoretical result, meta-learning is introduced as a means to reduce sample complexity by leveraging data from previous deployments. Numerical experiments demonstrate that LDM improves spectral efficiency even for small datasets; that, for sufficiently large datasets, the proposed mirror-descent-based layer optimization scheme achieves a CVaR rate close to that achieved when the transmitter knows the fading distribution; and that meta-learning can significantly reduce data requirements.
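The definition of the CVaR rate as the expected rate of a worst-case fraction of clients can be made concrete with a simple empirical estimate. The sketch below is an assumption-laden illustration, not the paper's estimator: it takes a list of hypothetical per-client rates, keeps the lowest `alpha` fraction, and averages them. The function name, the rounding of the fraction with `math.ceil`, and the sample rates are all made up for illustration.

```python
import math


def empirical_cvar_rate(rates, alpha):
    """Average rate over the worst `alpha` fraction of clients.

    `rates` is a non-empty list of per-client rates and `alpha` lies in
    (0, 1]. At least one client is always kept, so very small fractions
    reduce to the single worst rate.
    """
    if not rates:
        raise ValueError("rates must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    k = max(1, math.ceil(alpha * len(rates)))
    worst = sorted(rates)[:k]  # the k lowest rates
    return sum(worst) / k


# Hypothetical per-client rates (for example, in bit/s/Hz).
rates = [0.5, 2.0, 1.0, 3.0, 0.25, 1.5, 2.5, 4.0, 0.75, 3.5]
cvar_20 = empirical_cvar_rate(rates, 0.2)   # mean of the two worst clients
mean_rate = empirical_cvar_rate(rates, 1.0)  # alpha = 1 recovers the plain mean
print(cvar_20, mean_rate)
```

Shrinking `alpha` leaves fewer samples in the average, which gives some intuition for the abstract's statement that sample complexity grows as the designated worst-case fraction decreases.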

