The Annual Report Algorithm: Retrieval of Financial Statements and Extraction of Textual Information

Authors

Jorg Hering, University of Erlangen-Nurnberg, Germany

Abstract

U.S. corporations are obligated to file financial statements with the U.S. Securities and Exchange Commission (SEC). The SEC's Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system, which contains millions of financial statements, is one of the most important sources of corporate information available. The paper illustrates which financial statements are publicly available by analyzing the entire SEC EDGAR database since its implementation in 1993. It shows how to retrieve financial statements from EDGAR quickly and efficiently. The key contribution, however, is a platform-independent algorithm, intended for business and research use, that extracts textual information embedded in financial statements. The dynamic extraction algorithm, which is capable of identifying structural changes within financial statements, is applied to more than 180,000 annual reports on Form 10-K filed with the SEC, both to produce descriptive statistics and to validate the algorithm.

Keywords

Textual analysis, Textual sentiment, 10-K parsing rules, Information extraction, EDGAR search engine

Volume 7, Number 4