Roberta Goes for IPO: Prospectus Analysis with Language Models for Indian Initial Public Offerings

Authors

Abhishek Mishra¹ and Yogendra Sisodia², ¹Trust Group, India, ²Conga, India

Abstract

With the advent of large-scale language models in natural language processing (NLP), extracting valuable information from financial documents has gained popularity among researchers, and deep learning has boosted the development of effective text mining models. Prospectus text mining matters to the investor community because it helps identify major risk factors and evaluate how the funds raised in an initial public offering (IPO) will be used. In this paper, we investigate how the pre-trained language model RoBERTa can be adapted for this task. We also introduce prospectus-specific sentence transformers for semantic textual similarity, along with a dataset to verify the efficacy of our work.

Keywords

IPO, Prospectus, Large Language Models, Semantic Textual Similarity.

CS&IT Conference Proceedings
