Crude oil news tone and futures market returns: An integrated approach based on large language models and domain-specific lexicons

    Abstract:

    The verticalization of large language models (LLMs) has become an important trend. In highly specialized and cross-market contexts, such as crude oil futures, general-purpose LLMs struggle to accurately interpret domain-specific terminology and complex semantics, and they are prone to look-ahead bias. This paper proposes an integrated approach that combines LLMs with a domain-specific sentiment lexicon to measure the tone of crude oil news. By integrating the prior knowledge of a domain lexicon with the semantic understanding of LLMs, the proposed method enhances both the interpretability and predictive power of tone measures. Specifically, based on 93,004 Chinese and English crude oil news articles from InfoBank and Factiva between 2018 and 2022, this study constructs a vertical domain lexicon, refines high signal-to-noise corpora, performs domain-specific pre-training on a general BERT model, and applies weakly supervised fine-tuning guided by futures return signs to generate an integrated tone index. Empirical results show that the tone measure derived from the proposed method significantly outperforms dictionary-based and general LLM methods in explaining and predicting Shanghai crude oil futures returns. Robustness tests confirm consistent results, and the method also exhibits superior out-of-sample predictive performance for 2023 to 2024. Further analysis reveals that the impact of news tone on crude oil futures returns is transmitted through investor attention as a mediating channel, and under the integrated framework, this mechanism aligns closely with economic logic in both direction and significance. Moreover, the risk spillover analysis indicates that the integrated tone measure effectively captures the tail-risk transmission from international news sentiment to China’s crude oil futures market.
This study contributes by proposing a reproducible vertical application framework for financial large language models, revealing the dual role of news tone in market return formation and risk transmission, and providing new quantitative tools for risk identification and policy formulation in the crude oil futures market.
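
The pipeline summarized in the abstract (lexicon prior, return-sign weak labels, integrated tone index) can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration only: the toy lexicon, the function names `lexicon_tone`, `weak_label`, and `integrated_tone`, and the linear blending weight are hypothetical and are not taken from the paper, which does not specify its integration rule in the abstract.

```python
from typing import Dict, List

# Hypothetical toy lexicon (+1 = price-supportive term, -1 = price-negative term).
# The paper's actual vertical-domain lexicon is not given in the abstract.
TOY_LEXICON: Dict[str, int] = {
    "surge": 1, "rally": 1, "shortage": 1,
    "glut": -1, "slump": -1, "oversupply": -1,
}


def lexicon_tone(tokens: List[str], lexicon: Dict[str, int] = TOY_LEXICON) -> float:
    """Net tone in [-1, 1]: (positive - negative) / (positive + negative) matched terms."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    if not hits:
        return 0.0  # no lexicon terms matched: treat the text as neutral
    return sum(hits) / len(hits)


def weak_label(futures_return: float) -> int:
    """Weak supervision label from the sign of a futures return: 1 if positive, else 0."""
    return 1 if futures_return > 0 else 0


def integrated_tone(lex_tone: float, model_prob_up: float, weight: float = 0.5) -> float:
    """Blend the lexicon tone with a classifier probability (assumed linear blend).

    model_prob_up stands in for the output of a fine-tuned model and is
    rescaled from [0, 1] to [-1, 1] before blending.
    """
    model_tone = 2.0 * model_prob_up - 1.0
    return weight * lex_tone + (1.0 - weight) * model_tone


if __name__ == "__main__":
    tokens = "crude prices surge amid supply shortage".split()
    tone = lexicon_tone(tokens)
    print(tone, weak_label(-0.012), integrated_tone(tone, model_prob_up=0.5))
```

In this sketch the weak label would serve as the training target for the fine-tuned classifier, while the lexicon tone acts as an interpretable prior; a real implementation would replace `model_prob_up` with the model's predicted probability for each article.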

History
  • Online: March 26, 2026
Address: Room 908, Building A, 25th Teaching Building, Tianjin University, 92 Weijin Road, Nankai District, Tianjin Postcode: 300072
Telephone: 022-27403197 Email: jmsc@tju.edu.cn