Experimenting with Multi-modal Information to Predict Success of Indian IPOs
This work addresses investors' need for data-driven decisions in the Indian IPO market, but it appears incremental, as it applies existing methods to a new domain-specific dataset.
The paper tackles the problem of predicting the success of Indian IPOs with a machine learning and NLP approach that uses multi-modal data, including prospectus text and market factors. It introduces two new datasets and estimates direction and underpricing with respect to the opening, high, and closing prices on listing day.
With consistent growth in the Indian economy, Initial Public Offerings (IPOs) have become a popular avenue for investment. As modern technology simplifies investing, more investors are interested in making data-driven decisions when subscribing to IPOs. In this paper, we describe an approach based on machine learning and natural language processing for estimating whether an IPO will be successful. We have extensively studied the impact on IPO success of various facts mentioned in the IPO filing prospectus, macroeconomic factors, market conditions, Grey Market Price, and other factors. We created two new datasets relating to the IPOs of Indian companies. Finally, we investigated how information from multiple modalities (texts, images, numbers, and categorical features) can be used for estimating the direction and underpricing with respect to the opening, high, and closing prices of stocks on the IPO listing day.
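The prediction targets named above can be made concrete with a small sketch. The abstract does not define underpricing or direction, so this assumes the common convention that underpricing is the listing-day price's relative gain over the issue price, and that direction is a binary label for whether that gain is positive. The names `ListingDay`, `underpricing`, and `listing_day_targets` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ListingDay:
    """Prices for one IPO on its listing day (hypothetical container)."""
    issue_price: float
    open_price: float
    high_price: float
    close_price: float


def underpricing(issue_price: float, reference_price: float) -> float:
    """Relative gain of a listing-day price over the issue price.

    Assumed convention: (reference_price - issue_price) / issue_price.
    """
    if issue_price <= 0:
        raise ValueError("issue_price must be positive")
    return (reference_price - issue_price) / issue_price


def listing_day_targets(day: ListingDay) -> dict:
    """Build underpricing and direction targets for open, high, and close."""
    targets = {}
    for name, price in (
        ("open", day.open_price),
        ("high", day.high_price),
        ("close", day.close_price),
    ):
        gain = underpricing(day.issue_price, price)
        # Assumed labeling: direction is 1 when the price is above the issue price.
        targets[name] = {"underpricing": gain, "direction": 1 if gain > 0 else 0}
    return targets
```

Under these assumptions, each IPO yields three regression targets (underpricing) and three classification targets (direction), one pair per listing-day price.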