Towards systematic intraday news screening: a liquidity-focused approach

News · Volatility · Automatic evaluation · Naive Bayes · Sentiment classification

April 11, 2023

Jianfei Zhang, Mathieu Rosenbaum

News can convey bearish or bullish views on financial assets. Institutional investors need to automatically evaluate the news sentiment implied by textual data. Given the huge number of news articles published each day, most of which are neutral, we present a systematic news screening method to identify the "true" impactful ones, aiming for more effective development of news sentiment learning methods. Based on several liquidity-driven variables, including volatility, turnover, bid-ask spread, and book size, we associate each 5-min time bin with one of two specific liquidity modes. One represents the "calm" state in which the market stays most of the time; the other, characterized by relatively higher levels of volatility and trading volume, describes the regime driven by exogenous events. We then focus on the moments where the liquidity mode switches from the former to the latter and consider the news articles published nearby to be impactful. As an illustrative example, we apply naive Bayes to these filtered samples for news sentiment classification. We show that the screened dataset leads to more effective feature capturing and thus superior performance on short-term asset return prediction compared to the original dataset.
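The screening idea in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' method: the abstract does not say how bins are assigned to liquidity modes, so the sketch substitutes a simple threshold on the average z-score of volatility and turnover. The function names, the `threshold` and `window` parameters, and the sample data are all hypothetical.

```python
import math


def zscores(xs):
    """Standardise a list of floats using the population standard deviation."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs)) or 1.0
    return [(x - mean) / std for x in xs]


def liquidity_modes(volatility, turnover, threshold=1.0):
    """Label each 5-min bin as 0 (calm) or 1 (active).

    ASSUMPTION: a bin is 'active' when the mean of its volatility and
    turnover z-scores exceeds `threshold`. This rule is a stand-in; the
    abstract does not describe the actual mode-assignment model.
    """
    zv, zt = zscores(volatility), zscores(turnover)
    return [1 if (a + b) / 2 > threshold else 0 for a, b in zip(zv, zt)]


def switch_bins(modes):
    """Return indices of bins where the mode switches from calm to active."""
    return [i for i in range(1, len(modes))
            if modes[i - 1] == 0 and modes[i] == 1]


def screen_news(news, switches, window=1):
    """Keep (bin_index, text) items published within `window` bins of a switch."""
    return [(b, text) for b, text in news
            if any(abs(b - s) <= window for s in switches)]


# Hypothetical data: eight 5-min bins with a burst of activity in bins 4 and 5.
volatility = [0.1, 0.1, 0.1, 0.1, 0.9, 0.8, 0.1, 0.1]
turnover = [10, 11, 10, 9, 50, 45, 10, 10]
news = [(1, "routine filing"), (4, "profit warning"),
        (5, "analyst downgrade"), (7, "product launch")]

modes = liquidity_modes(volatility, turnover)
switches = switch_bins(modes)
print(modes)                        # [0, 0, 0, 0, 1, 1, 0, 0]
print(switches)                     # [4]
print(screen_news(news, switches))  # items from bins 4 and 5
```

The retained items would then serve as the training set for a sentiment classifier such as the naive Bayes model mentioned in the abstract. A fuller version would also bring in bid-ask spread and book size, which the abstract lists among the liquidity variables.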

