OSCAR: A Semantic-based Data Binning Approach - 专知论文


Statistics · Identifiability · Categorical data · Binning · Comprehensibility

July 15, 2022

OSCAR: A Semantic-based Data Binning Approach


Vidya Setlur, Michael Correll, Sarah Battersby

From arXiv; 5 pages (4 pages text + 1 page references), 3 figures

Binning is applied to categorize data values or to see distributions of data. Existing binning algorithms often rely on statistical properties of data. However, there are semantic considerations for selecting appropriate binning schemes. Surveys, for instance, gather respondent data for demographic-related questions such as age, salary, number of employees, etc., that are bucketed into defined semantic categories. In this paper, we leverage common semantic categories from survey data and Tableau Public visualizations to identify a set of semantic binning categories. We employ these semantic binning categories in OSCAR: a method for automatically selecting bins based on the inferred semantic type of the field. We conducted a crowdsourced study with 120 participants to better understand user preferences for bins generated by OSCAR vs. binning provided in Tableau. We find that maps and histograms using binned values generated by OSCAR are preferred by users as compared to binning schemes based purely on the statistical properties of the data.
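The idea of selecting bins from a field's inferred semantic type, rather than from the statistics of its values, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not list the semantic categories, the bin boundaries, or the inference method, so the `SEMANTIC_BINS` edges, the keyword lookup in `infer_semantic_type`, and the equal-width fallback are hypothetical stand-ins, not the authors' implementation.

```python
import re
from bisect import bisect_right

# Hypothetical bin edges per semantic type (illustrative only, not from the paper).
SEMANTIC_BINS = {
    "age": [18, 25, 35, 45, 55, 65],
    "salary": [25_000, 50_000, 75_000, 100_000, 150_000],
}

# Hypothetical keyword lookup standing in for semantic type inference.
KEYWORDS = {
    "age": {"age"},
    "salary": {"salary", "income", "wage"},
}


def infer_semantic_type(field_name):
    """Return a semantic type if a whole word of the field name matches, else None."""
    tokens = set(re.split(r"[^a-z0-9]+", field_name.lower()))
    for semantic_type, words in KEYWORDS.items():
        if tokens & words:
            return semantic_type
    return None


def equal_width_edges(values, bin_count=5):
    """Purely statistical fallback: interior edges of equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bin_count
    return [lo + width * i for i in range(1, bin_count)]


def choose_edges(field_name, values):
    """Prefer semantic edges when the field type is recognized."""
    semantic_type = infer_semantic_type(field_name)
    if semantic_type is not None:
        return SEMANTIC_BINS[semantic_type]
    return equal_width_edges(values)


def bin_labels(edges):
    """Human-readable labels for the len(edges) + 1 bins."""
    labels = [f"< {edges[0]}"]
    labels += [f"{a} to < {b}" for a, b in zip(edges, edges[1:])]
    labels.append(f">= {edges[-1]}")
    return labels


def assign_bin(value, edges):
    """Index of the bin containing value; each bin includes its lower edge."""
    return bisect_right(edges, value)
```

For a field named "Respondent Age", `choose_edges` returns the hypothetical age edges regardless of the data, so a value of 30 lands in the bin labeled "25 to < 35". An unrecognized field such as "height_cm" falls back to equal-width bins computed from the observed minimum and maximum.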


