Copycat vs. Original: Multi-modal Pretraining and Variable Importance in Box-office Prediction
This work provides automated tools for the movie industry to predict revenue and assess commercial viability, though it is incremental in applying multimodal methods to a specific domain.
The study tackled box-office revenue prediction by developing a multimodal neural network that grounds descriptive keywords in movie poster visuals, reducing prediction error by 14.5%. It also analyzed copycat movies, finding a positive revenue effect that diminishes with increased similarity and competition.
The movie industry carries an elevated level of risk, which calls for automated tools that predict box-office revenue and support human decision-making. In this study, we build a multimodal neural network that predicts box-office revenue by grounding the crowdsourced descriptive keywords of each movie in the visual information of its poster. This grounding enhances the learned keyword representations and reduces box-office prediction error by 14.5%. The improved revenue prediction model enables us to analyze the commercial viability of "copycat movies," that is, movies with substantial similarity to successful movies released recently. We do so by computing the influence of copycat features in box-office prediction. We find a positive relationship between copycat status and movie revenue. However, this effect diminishes as the number of similar movies and the similarity of their content increase. Overall, our work develops deep learning tools for studying the movie industry and provides business insight.
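To make the notion of "copycat features" concrete, the following is a minimal sketch of one way to derive them from keyword data. It is not the method used in this study. The `Movie` record, the Jaccard similarity measure, the three-year window, and the revenue and similarity thresholds are all assumptions chosen for illustration, and the catalog is toy data. The sketch returns two quantities that mirror the ones discussed above: the number of recent successful movies that are similar, and how similar the closest one is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movie:
    # Hypothetical record; field names are illustrative, not from the study.
    title: str
    year: int
    revenue: float        # box-office revenue in arbitrary units
    keywords: frozenset   # crowdsourced descriptive keywords


def jaccard(a, b):
    """Jaccard similarity between two keyword sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def copycat_features(movie, catalog, window_years=3,
                     min_revenue=100.0, min_similarity=0.5):
    """Return (n_similar, max_similarity) for `movie`.

    Only earlier movies released within `window_years` and earning at
    least `min_revenue` are considered. All thresholds are assumptions.
    """
    sims = [
        jaccard(movie.keywords, other.keywords)
        for other in catalog
        if other.title != movie.title
        and 0 < movie.year - other.year <= window_years
        and other.revenue >= min_revenue
    ]
    similar = [s for s in sims if s >= min_similarity]
    return len(similar), max(similar, default=0.0)


# Toy catalog (made-up titles, years, and revenues).
catalog = [
    Movie("Movie A", 2010, 500.0, frozenset({"heist", "team", "casino"})),
    Movie("Movie B", 2011, 50.0, frozenset({"heist", "team", "casino"})),   # below revenue threshold
    Movie("Movie C", 2005, 900.0, frozenset({"heist", "team", "casino"})),  # outside time window
    Movie("Movie D", 2011, 300.0, frozenset({"heist", "team", "bank", "twist"})),
]
new_movie = Movie("Movie N", 2012, 0.0,
                  frozenset({"heist", "team", "casino", "twist"}))

n_similar, max_similarity = copycat_features(new_movie, catalog)
print(n_similar, max_similarity)
```

Features of this kind could then be passed to a revenue model alongside the other inputs, so that their influence on the prediction can be measured. Set-based Jaccard similarity is used here only because it is easy to inspect; a learned keyword representation, as described above, would replace it in practice.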