CPI forecast and model comparison in a data-rich environment
    Abstract:

    The advent of the Big Data era has brought unprecedented opportunities as well as challenges to CPI forecasting. Making full use of high-dimensional data and developing interpretable machine learning methods for forecasting are of great theoretical and practical significance. This paper therefore constructs a large monthly macroeconomic dataset for China, consisting of 239 variables across 9 categories. Using this dataset, the paper evaluates the CPI forecasting performance of 13 common methods, including traditional time series models, regularized regressions, factor models, and ensemble methods. Further, based on the idea of control variables, a derivative algorithm for machine learning is constructed to explain the results and conduct the mechanism analysis. The results show that random forest and XGBoost exhibit superior predictive performance, especially at medium- and long-term horizons. Further investigation indicates that the non-linearity and non-sparsity of ensemble methods play a vital role in improving forecast accuracy. Meanwhile, according to the variable importance measures of the two ensemble methods, variables in the autoregressive, price, and employment categories contribute a large share of the predictive power, which is in line with economic theory and stylized facts.
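
    The abstract does not say how the 13 methods are scored. One common way to compare forecasts across horizons is an expanding-window pseudo-out-of-sample exercise, sketched below under stated assumptions. The data are simulated, not the paper's 239-variable dataset. A direct AR(1) regression and a historical-mean benchmark stand in for the methods actually evaluated. The names `ols_fit`, `expanding_window_rmse`, and `simulate_ar1` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def ols_fit(x, y):
    """Return (intercept, slope) of a least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope


def expanding_window_rmse(series, horizon, first_origin):
    """RMSE of a direct AR(1) forecast and of a historical-mean benchmark.

    At each forecast origin t, only observations up to t are used, so the
    exercise mimics real-time forecasting (no look-ahead).
    """
    err_ar, err_mean = [], []
    for t in range(first_origin, len(series) - horizon):
        history = series[: t + 1]      # data available at origin t
        x = history[:-horizon]         # regressor: y_s
        y = history[horizon:]          # target:    y_{s+h}
        a, b = ols_fit(x, y)
        target = series[t + horizon]
        err_ar.append(target - (a + b * series[t]))
        err_mean.append(target - sum(history) / len(history))

    def rmse(errors):
        return math.sqrt(sum(e * e for e in errors) / len(errors))

    return rmse(err_ar), rmse(err_mean)


def simulate_ar1(n, phi=0.8, seed=0):
    """Simulated stand-in for a monthly inflation series (assumption)."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


series = simulate_ar1(240)
results = {h: expanding_window_rmse(series, h, first_origin=120)
           for h in (1, 6, 12)}
for h, (rmse_ar, rmse_mean) in results.items():
    print(f"h={h:2d}  AR(1) RMSE={rmse_ar:.3f}  mean RMSE={rmse_mean:.3f}")
```

    In a comparison like the one the abstract describes, each candidate method would replace the AR(1) step inside the loop, and the per-horizon RMSE values would be compared against the benchmark.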

History
  • Online: September 23, 2025
Address: Room 908, Building A, 25th Teaching Building, Tianjin University, 92 Weijin Road, Nankai District, Tianjin. Postcode: 300072
Telephone: 022-27403197. Email: jmsc@tju.edu.cn