STLG Jun 19, 2025

News Sentiment Embeddings for Stock Price Forecasting

arXiv:2507.01970v1, 1 citation, h-index: 1
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of predicting stock prices for investors and analysts, but it is incremental: it applies existing embedding methods to a specific financial dataset.

The paper tackled stock price forecasting for the SPY ETF by using news headline embeddings from the Wall Street Journal, combined with financial data, and found that this approach improved prediction performance by at least 40% compared to models without headline data.

This paper discusses how headline data can be used to predict stock prices. The stock in question is the SPDR S&P 500 ETF Trust, also known as SPY, which tracks the performance of the 500 largest publicly traded corporations in the United States. A key focus is using news headlines from the Wall Street Journal (WSJ) to predict the movement of stock prices on a daily timescale. OpenAI-based text embedding models create a vector encoding of each headline, and principal component analysis (PCA) extracts the key features. The challenge of this work is to capture the time-dependent and time-independent, nuanced impacts of news on stock prices while handling potential lag effects and market noise. Financial and economic data were collected to improve model performance; these sources include the U.S. Dollar Index (DXY) and Treasury interest yields. Over 390 machine-learning inference models were trained. The preliminary results show that headline data embeddings improve stock price prediction by at least 40% compared to training and optimizing a machine learning system without headline data embeddings.
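The feature pipeline described in the abstract (headline embeddings, PCA, financial series, and a lag between features and target) can be illustrated with a minimal sketch. This is not the authors' code. The embeddings here are random stand-ins for the output of a text embedding model, and the per-day mean aggregation, the number of components, the one-day lag, and the function names `daily_mean_embeddings`, `pca_reduce`, and `build_features` are all assumptions made for illustration.

```python
import numpy as np

def daily_mean_embeddings(embeddings, day_index, n_days):
    """Average headline embeddings per trading day (aggregation is an assumption)."""
    dim = embeddings.shape[1]
    sums = np.zeros((n_days, dim))
    counts = np.zeros(n_days)
    np.add.at(sums, day_index, embeddings)   # accumulate rows by day
    np.add.at(counts, day_index, 1)
    counts[counts == 0] = 1                  # days with no headlines keep a zero vector
    return sums / counts[:, None]

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)                  # PCA requires centered data
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def build_features(headline_pcs, financial, lag=1):
    """Stack headline and financial features for day t; the target is day t + lag."""
    X = np.hstack([headline_pcs, financial])
    return X[:-lag]

# Synthetic data only: none of these values come from the paper.
rng = np.random.default_rng(0)
n_days, n_headlines, dim = 30, 200, 64
embeddings = rng.normal(size=(n_headlines, dim))     # stand-in for embedding model output
day_index = rng.integers(0, n_days, size=n_headlines)
financial = rng.normal(size=(n_days, 2))             # stand-ins for DXY and a Treasury yield
close = 500 + np.cumsum(rng.normal(size=n_days))     # synthetic closing prices

daily = daily_mean_embeddings(embeddings, day_index, n_days)
pcs = pca_reduce(daily, n_components=5)
X = build_features(pcs, financial, lag=1)
y = close[1:]                                        # next-day close aligned with X
print(X.shape, y.shape)                              # (29, 7) (29,)
```

Dropping the `pcs` columns from `X` gives the headline-free baseline, so the same model can be trained on both feature sets and the two errors compared.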

