1.2 ST Mar 22, 2023
Forecasting Large Realized Covariance Matrices: The Benefits of Factor Models and Shrinkage
Rafael Alves, Diego S. de Brito, Marcelo C. Medeiros et al.
We propose a model to forecast large realized covariance matrices of returns, applying it daily to the constituents of the S&P 500. To address the curse of dimensionality, we decompose the return covariance matrix using standard firm-level factors (e.g., size, value, and profitability) and use sectoral restrictions in the residual covariance matrix. This restricted model is then estimated using vector heterogeneous autoregressive (VHAR) models with the least absolute shrinkage and selection operator (LASSO). Our methodology improves forecasting precision relative to standard benchmarks and leads to better estimates of minimum variance portfolios.
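As a rough illustration of the heterogeneous autoregressive (HAR) idea that VHAR models build on, the sketch below computes the three lagged regressors for a single realized-variance series. The window lengths of 1, 5, and 22 trading days are the conventional HAR choice and are an assumption here, since the abstract does not state them; the function name `har_features` is hypothetical, and the factor decomposition, sectoral restrictions, and LASSO step of the paper are not shown.

```python
from statistics import mean


def har_features(rv, t):
    """Return the (daily, weekly, monthly) HAR regressors for forecasting rv[t].

    Only observations up to t - 1 are used. The 1/5/22-day windows are the
    conventional HAR choice (an assumption, not taken from the abstract).
    """
    if t < 22:
        raise ValueError("need at least 22 past observations")
    daily = rv[t - 1]                # yesterday's realized variance
    weekly = mean(rv[t - 5:t])       # average over the last 5 days
    monthly = mean(rv[t - 22:t])     # average over the last 22 days
    return daily, weekly, monthly


# Toy series for illustration only; real inputs would be realized variances.
toy_rv = [float(i) for i in range(30)]
print(har_features(toy_rv, 22))
```

In a vector version, each element of the (restricted) covariance matrix would get regressors of this kind, and a penalized regression would then select among them.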
3.3 EM Apr 7, 2021
The Proper Use of Google Trends in Forecasting Models
Marcelo C. Medeiros, Henrique F. Pires
It is widely known that Google Trends has become one of the most popular free tools used by forecasters in academia and in the private and public sectors. Many papers, from several different fields, conclude that Google Trends improves forecast accuracy. What seems to be far less known, however, is that each sample of Google search data differs from the others, even if you set the same search term, data and location. This means that it is possible to reach arbitrary conclusions merely by chance. This paper aims to show why and when this becomes a problem and how to overcome it.
19.1 EM Dec 23, 2020
Machine Learning Advances for Time Series Forecasting
Ricardo P. Masini, Marcelo C. Medeiros, Eduardo F. Mendes
In this paper we survey the most recent advances in supervised machine learning and high-dimensional models for time series forecasting. We consider both linear and nonlinear alternatives. Among the linear methods, we pay special attention to penalized regressions and ensembles of models. The nonlinear methods considered in the paper include shallow and deep neural networks, in their feed-forward and recurrent versions, and tree-based methods, such as random forests and boosted trees. We also consider ensemble and hybrid models that combine ingredients from different alternatives. Tests for superior predictive ability are briefly reviewed. Finally, we discuss applications of machine learning in economics and finance and provide an illustration with high-frequency financial data.
2.3 EM Nov 8, 2020
Do We Exploit all Information for Counterfactual Analysis? Benefits of Factor Models and Idiosyncratic Correction
Jianqing Fan, Ricardo P. Masini, Marcelo C. Medeiros
Optimal pricing, i.e., determining the price level that maximizes the profit or revenue of a given product, is a vital task for the retail industry. To select such a price, one first needs to estimate the price elasticity of demand for the product. Regression methods usually fail to recover such elasticities due to confounding effects and price endogeneity. Therefore, randomized experiments are typically required. However, elasticities can be highly heterogeneous, depending, for example, on the location of stores. As the randomization frequently occurs at the municipal level, standard difference-in-differences methods may also fail. Possible solutions are based on methodologies that measure the effects of treatments on a single (or just a few) treated unit(s) using counterfactuals constructed from artificial controls. For example, for each city in the treatment group, a counterfactual may be constructed from the untreated locations. In this paper, we apply a novel high-dimensional statistical method to measure the effects of price changes on daily sales from a major retailer in Brazil. The proposed methodology combines principal components (factors) and sparse regressions, resulting in a method called Factor-Adjusted Regularized Method for Treatment evaluation (FarmTreat). The data consist of daily sales and prices of five different products across more than 400 municipalities. The products considered belong to the sweets and candies category, and the experiments were conducted during 2016 and 2017. Our results confirm the hypothesis of a high degree of heterogeneity, yielding very different pricing strategies across municipalities.
1.4 ML Sep 29, 2020
Online Action Learning in High Dimensions: A Conservative Perspective
Claudio Cardoso Flores, Marcelo Cunha Medeiros
Sequential learning problems are common in several fields of research and practical applications. Examples include dynamic pricing and assortment and the design of auctions and incentives, and such problems permeate a large number of sequential treatment experiments. In this paper, we extend one of the most popular learning solutions, the $ε_t$-greedy heuristic, to high-dimensional contexts under a conservative directive. We do this by reallocating part of the time the original rule spends adopting completely new actions to a more focused search in a restricted set of promising actions. The resulting rule might be useful for practical applications that still value surprises, although at a decreasing rate, while also restricting the adoption of unusual actions. With high probability, we find reasonable bounds for the cumulative regret of a conservative high-dimensional decaying $ε_t$-greedy rule. We also provide a lower bound for the cardinality of the set of viable actions that implies an improved regret bound for the conservative version compared to its non-conservative counterpart. Additionally, we show that end-users have sufficient flexibility when establishing how much safety they want, since it can be tuned without impacting theoretical properties. We illustrate our proposal both in a simulation exercise and using a real dataset.
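For readers unfamiliar with the baseline rule, the following is a minimal sketch of a plain decaying $ε_t$-greedy rule on a simple multi-armed bandit. It is not the conservative high-dimensional version proposed in the paper: there are no covariates, no restricted set of promising actions, and the schedule `min(1, c * k / t)`, the Gaussian rewards, and all function names are assumptions chosen for illustration.

```python
import random


def epsilon(t, k, c=1.0):
    """Exploration probability at round t (hypothetical schedule c * k / t)."""
    return min(1.0, c * k / t)


def decaying_epsilon_greedy(true_means, horizon, c=1.0, seed=0):
    """Run a plain decaying epsilon-greedy rule; return pull counts per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k     # number of times each arm was pulled
    sums = [0.0] * k     # cumulative reward of each arm
    for t in range(1, horizon + 1):
        # Explore with probability epsilon_t, or while some arm is untried.
        if rng.random() < epsilon(t, k, c) or 0 in counts:
            arm = rng.randrange(k)
        else:
            # Exploit: choose the arm with the highest average reward so far.
            arm = max(range(k), key=lambda a: sums[a] / counts[a])
        reward = rng.gauss(true_means[arm], 1.0)  # simulated noisy reward
        counts[arm] += 1
        sums[arm] += reward
    return counts


print(decaying_epsilon_greedy([0.1, 0.5, 0.9], horizon=2000))
```

Because the exploration probability shrinks with `t`, the rule tries completely new actions less and less often. The conservative variant described in the abstract redirects part of that exploration budget toward a restricted set of promising actions.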
2.3 AP Sep 28, 2020
Lockdown effects in US states: an artificial counterfactual approach
Carlos B. Carneiro, Iúri H. Ferreira, Marcelo C. Medeiros et al.
We adopt an artificial counterfactual approach to assess the impact of lockdowns on the short-run evolution of the number of cases and deaths in some US states. To do so, we exploit the different timing with which US states adopted lockdown policies and divide them into treated and control groups. For each treated state, we construct an artificial counterfactual. On average, and in the very short run, the counterfactual accumulated number of cases would have been two times larger had lockdown policies not been implemented.
5.9 ST Dec 19, 2019
Regularized Estimation of High-Dimensional Vector AutoRegressions with Weakly Dependent Innovations
Ricardo P. Masini, Marcelo C. Medeiros, Eduardo F. Mendes
There has been considerable progress in understanding the properties of sparse regularization procedures in high-dimensional models. In the time series context, this understanding is mostly restricted to Gaussian autoregressions or mixing sequences. We study oracle properties of LASSO estimation of weakly sparse vector-autoregressive models with heavy-tailed, weakly dependent innovations and virtually no assumption on the conditional heteroskedasticity. In contrast to the current literature, our innovation process satisfies an $L^1$ mixingale-type condition on the centered conditional covariance matrices. This condition covers $L^1$-NED sequences and strong ($α$-) mixing sequences as particular examples.
1.9 ML Aug 10, 2018
BooST: Boosting Smooth Trees for Partial Effect Estimation in Nonlinear Regressions
Yuri Fonseca, Marcelo Medeiros, Gabriel Vasconcelos et al.
In this paper, we introduce a new machine learning (ML) model for nonlinear regression called the Boosted Smooth Transition Regression Trees (BooST), which is a combination of boosting algorithms with smooth transition regression trees. The main advantage of the BooST model is the estimation of the derivatives (partial effects) of very general nonlinear models. Therefore, the model can provide more interpretation about the mapping between the covariates and the dependent variable than other tree-based models, such as Random Forests. We present several examples with both simulated and real data.
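To see why smooth transitions make derivatives available, consider a single split in which the usual hard indicator is replaced by a logistic function. The sketch below is only an illustration under that assumption: the logistic form, the parameter names (`gamma` for the slope, `c` for the location, `b_left` and `b_right` for the two regime values), and the function names are hypothetical, and the boosting step of BooST is not shown.

```python
import math


def logistic_transition(x, gamma, c):
    """Smooth weight in (0, 1) that replaces a hard split at location c."""
    return 1.0 / (1.0 + math.exp(-gamma * (x - c)))


def smooth_split(x, gamma, c, b_left, b_right):
    """Prediction of one smooth split: a weighted mix of two regime values."""
    g = logistic_transition(x, gamma, c)
    return b_left * (1.0 - g) + b_right * g


def smooth_split_derivative(x, gamma, c, b_left, b_right):
    """Analytical partial effect d(prediction)/dx of the smooth split."""
    g = logistic_transition(x, gamma, c)
    return (b_right - b_left) * gamma * g * (1.0 - g)


# At the split location the weight is 0.5, so the prediction is the midpoint.
print(smooth_split(0.0, gamma=2.0, c=0.0, b_left=1.0, b_right=3.0))
print(smooth_split_derivative(0.0, gamma=2.0, c=0.0, b_left=1.0, b_right=3.0))
```

A hard split would have a derivative of zero everywhere except at the threshold, where it is undefined. The smooth version gives a well-defined partial effect at every point, which is the property the abstract highlights.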