Interpretable and Low-Resource Entity Matching via Decoupling Feature Learning from Decision Making - Zhuanzhi Papers


June 8, 2021

Interpretable and Low-Resource Entity Matching via Decoupling Feature Learning from Decision Making


Zijun Yao,Chengjiang Li,Tiansi Dong,Xin Lv,Jifan Yu,Lei Hou,Juanzi Li,Yichi Zhang,Zelin Dai

Entity Matching (EM) aims to recognize entity records that denote the same real-world object. Neural EM models learn vector representations of entity descriptions and match entities end-to-end. Though robust, these methods require substantial resources for training and lack interpretability. In this paper, we propose a novel EM framework that consists of Heterogeneous Information Fusion (HIF) and Key Attribute Tree (KAT) Induction to decouple feature representation from matching decision. Using self-supervised learning and the masking mechanism from pre-trained language modeling, HIF learns the embeddings of noisy attribute values through inter-attribute attention on unlabeled data. Using a set of comparison features and a limited amount of annotated data, KAT Induction learns an efficient decision tree that can be interpreted by generating entity matching rules whose structure is advocated by domain experts. Experiments on 6 public datasets and 3 industrial datasets show that our method is highly efficient and outperforms SOTA EM models in most cases. Our code and datasets can be obtained from https://github.com/THU-KEG/HIF-KAT.
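
The idea of separating comparison features from an interpretable tree-based decision can be illustrated with a small sketch. This is not the authors' HIF or KAT Induction implementation: as a stand-in for learned embeddings it assumes a plain token-level Jaccard similarity per attribute, and it assumes a generic shallow decision tree grown by Gini impurity. The attribute names, toy records, and all function names below are hypothetical.

```python
from collections import Counter

def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (a stand-in for learned embeddings)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def compare(left: dict, right: dict, attributes: list) -> list:
    """One comparison feature per attribute for a pair of records."""
    return [jaccard(left.get(k, ""), right.get(k, "")) for k in attributes]

def gini(labels) -> float:
    n = len(labels)
    return 0.0 if n == 0 else 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(y)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best

def build_tree(X, y, depth=2):
    """Grow a shallow tree; a leaf is a label, a node is (feature, threshold, left, right)."""
    split = best_split(X, y) if depth > 0 else None
    if split is None:
        return Counter(y).most_common(1)[0][0]
    j, t = split
    lo = [(r, lab) for r, lab in zip(X, y) if r[j] <= t]
    hi = [(r, lab) for r, lab in zip(X, y) if r[j] > t]
    return (j, t,
            build_tree([r for r, _ in lo], [lab for _, lab in lo], depth - 1),
            build_tree([r for r, _ in hi], [lab for _, lab in hi], depth - 1))

def predict(tree, x):
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree

def rules(tree, attributes, path=()):
    """Flatten the tree into human-readable (condition, label) matching rules."""
    if not isinstance(tree, tuple):
        return [(" AND ".join(path) or "TRUE", tree)]
    j, t, left, right = tree
    name = f"sim({attributes[j]})"
    return (rules(left, attributes, path + (f"{name} <= {t:.2f}",))
            + rules(right, attributes, path + (f"{name} > {t:.2f}",)))

# Toy labelled pairs (hypothetical data): 1 = same entity, 0 = different.
attributes = ["title", "brand"]
pairs = [
    ({"title": "apple iphone 12 64gb", "brand": "apple"},
     {"title": "iphone 12 apple 64gb black", "brand": "apple"}, 1),
    ({"title": "sony wh-1000xm4 headphones", "brand": "sony"},
     {"title": "sony wh-1000xm4 wireless headphones", "brand": "sony"}, 1),
    ({"title": "apple iphone 12 64gb", "brand": "apple"},
     {"title": "apple ipad air 64gb", "brand": "apple"}, 0),
    ({"title": "sony wh-1000xm4 headphones", "brand": "sony"},
     {"title": "bose qc35 headphones", "brand": "bose"}, 0),
]
X = [compare(a, b, attributes) for a, b, _ in pairs]
y = [label for _, _, label in pairs]
tree = build_tree(X, y, depth=2)
for condition, label in rules(tree, attributes):
    print(f"IF {condition} THEN match={label}")
```

Because the decision step only sees per-attribute comparison features, each root-to-leaf path reads as a rule over named attributes. In this sketch the feature extractor could be swapped for learned embeddings without changing how the rules are produced.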


