Abstract: The verticalization of large language models (LLMs) has become an important trend. In highly specialized and cross-market contexts, such as crude oil futures, general-purpose LLMs struggle to accurately interpret domain-specific terminology and complex semantics, and they are prone to look-ahead bias. This paper proposes an integrated approach that combines LLMs with a domain-specific sentiment lexicon to measure the tone of crude oil news. By integrating the prior knowledge of a domain lexicon with the semantic understanding of LLMs, the proposed method enhances both the interpretability and predictive power of tone measures. Specifically, based on 93,004 Chinese and English crude oil news articles from InfoBank and Factiva between 2018 and 2022, this study constructs a vertical domain lexicon, refines high signal-to-noise corpora, performs domain-specific pre-training on a general BERT model, and applies weakly supervised fine-tuning guided by futures return signs to generate an integrated tone index. Empirical results show that the tone measure derived from the proposed method significantly outperforms dictionary-based and general LLM methods in explaining and predicting Shanghai crude oil futures returns. Robustness tests confirm consistent results, and the method also exhibits superior out-of-sample predictive performance for 2023 to 2024. Further analysis reveals that the impact of news tone on crude oil futures returns is transmitted through investor attention as a mediating channel, and under the integrated framework, this mechanism aligns closely with economic logic in both direction and significance. Moreover, the risk spillover analysis indicates that the integrated tone measure effectively captures the tail-risk transmission from international news sentiment to China’s crude oil futures market.
This study contributes by proposing a reproducible vertical application framework for financial large language models, revealing the dual role of news tone in market return formation and risk transmission, and providing new quantitative tools for risk identification and policy formulation in the crude oil futures market.