OA-Mine: Open-World Attribute Mining for E-Commerce Products with Weak Supervision - 专知 (Zhuanzhi) Papers

April 29, 2022

OA-Mine: Open-World Attribute Mining for E-Commerce Products with Weak Supervision

Xinyang Zhang, Chenwei Zhang, Xian Li, Xin Luna Dong, Jingbo Shang, Christos Faloutsos, Jiawei Han

From arXiv, WWW 2022

Automatic extraction of product attributes from their textual descriptions is essential for the online shopping experience. One inherent challenge of this task is the emerging nature of e-commerce products: new types of products, each with its own set of new attributes, appear constantly. Most prior work on this problem mines new values for a set of known attributes but cannot handle new attributes that arise from constantly changing data. In this work, we study the attribute mining problem in an open-world setting to extract novel attributes and their values. Instead of providing comprehensive training data, the user only needs to provide a few examples for a few known attribute types as weak supervision. We propose a principled framework that first generates attribute value candidates and then groups them into clusters of attributes. The candidate generation step probes a pre-trained language model to extract phrases from product titles. Then, an attribute-aware fine-tuning method optimizes a multitask objective and shapes the language model representation to be attribute-discriminative. Finally, we discover new attributes and values through the self-ensemble of our framework, which handles the open-world challenge. We run extensive experiments on a large distantly annotated development set and a gold-standard human-annotated test set that we collected. Our model significantly outperforms strong baselines and can generalize to unseen attributes and product types.
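The grouping step described in the abstract (assigning candidate values to known attributes from a few seed examples, and opening new clusters for unseen attributes) can be illustrated with a toy sketch. This is not the authors' method: character trigram similarity stands in for the fine-tuned language model representation, the candidate phrases are assumed to be given rather than extracted by probing a language model, and the function names and the threshold value of 0.3 are hypothetical choices made for illustration only.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Bag of character n-grams; a crude stand-in for a learned embedding."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_candidates(candidates, seeds, threshold=0.3):
    """Assign each candidate value to the most similar seeded attribute,
    or open a new cluster when nothing is similar enough (open-world case).

    `seeds` maps an attribute name to a few example values (weak supervision).
    `threshold` is an arbitrary illustrative cut-off, not a value from the paper.
    """
    clusters = {name: list(values) for name, values in seeds.items()}
    new_count = 0
    for cand in candidates:
        vec = char_ngrams(cand)
        best_name, best_sim = None, 0.0
        for name, members in clusters.items():
            # Similarity to a cluster = similarity to its closest member.
            sim = max(cosine(vec, char_ngrams(m)) for m in members)
            if sim > best_sim:
                best_name, best_sim = name, sim
        if best_name is not None and best_sim >= threshold:
            clusters[best_name].append(cand)
        else:
            new_count += 1
            clusters[f"new_attribute_{new_count}"] = [cand]
    return clusters


if __name__ == "__main__":
    seeds = {"flavor": ["dark chocolate", "vanilla"], "size": ["12 oz", "16 oz"]}
    candidates = ["milk chocolate", "24 oz", "gluten free"]
    for attribute, values in group_candidates(candidates, seeds).items():
        print(attribute, values)
```

In this toy run, "milk chocolate" and "24 oz" join the seeded clusters, while "gluten free" matches no seed and starts a new cluster. A real system would replace the trigram similarity with representations that separate attributes by meaning rather than by surface form.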
