Authors
Abhishek Mishra¹ and Yogendra Sisodia², ¹Trust Group, India, ²Conga, India
Abstract
With the advent of large-scale language models in natural language processing (NLP), extracting valuable information from financial documents has attracted growing research interest, and deep learning has accelerated the development of effective text mining models. Prospectus text mining helps investors identify major risk factors and evaluate how the funds raised in an initial public offering (IPO) will be used. In this paper, we investigate how the recently introduced pre-trained language model RoBERTa can be adapted for this task. We also introduce prospectus-specific sentence transformers for semantic textual similarity, along with a dataset to verify the efficacy of our work.
Keywords
IPO, Prospectus, Large Language Models, Semantic Textual Similarity.