CL, DB · Jun 13, 2018

OpenEDGAR: Open Source Software for SEC EDGAR Analysis

arXiv:1806.04973v1 · Has Code
Originality: Synthesis-oriented
AI Analysis

OpenEDGAR provides a tool for academic and industrial researchers working with financial regulatory data, but the contribution is incremental because it builds on existing frameworks such as Django.

The authors tackled the problem of efficiently analyzing SEC EDGAR data by developing OpenEDGAR, an open-source Python framework that enables rapid construction of research databases with features for retrieval, parsing, and search of filing documents.

OpenEDGAR is an open source Python framework designed to rapidly construct research databases based on the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system operated by the US Securities and Exchange Commission (SEC). OpenEDGAR is built on the Django application framework, supports distributed compute across one or more servers, and includes functionality to (i) retrieve and parse index and filing data from EDGAR, (ii) build tables for key metadata like form type and filer, (iii) retrieve, parse, and update CIK to ticker and industry mappings, (iv) extract content and metadata from filing documents, and (v) search filing document contents. OpenEDGAR is designed for use in both academic research and industrial applications, and is distributed under MIT License at https://github.com/LexPredict/openedgar.
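As a rough illustration of steps (i) and (ii) above, the sketch below parses a quarterly EDGAR index and tallies filings by form type. It does not use OpenEDGAR's own API. It assumes the pipe-delimited `master.idx` layout (CIK, company name, form type, date filed, filename) that follows a dashed rule in the public EDGAR full-index files. The names `IndexEntry` and `parse_master_index` are hypothetical, and the sample rows are synthetic.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Iterator


@dataclass(frozen=True)
class IndexEntry:
    """One filing record from an EDGAR master index (hypothetical type)."""
    cik: int
    company_name: str
    form_type: str
    date_filed: date
    path: str


def parse_master_index(text: str) -> Iterator[IndexEntry]:
    """Yield records from master.idx-style text.

    Assumes the header ends at a line of dashes and that each record
    has exactly five pipe-separated fields.
    """
    in_body = False
    for line in text.splitlines():
        if not in_body:
            # Skip header lines until the dashed separator is reached.
            if line.startswith("-----"):
                in_body = True
            continue
        parts = line.strip().split("|")
        if len(parts) != 5:
            continue  # ignore blank or malformed lines
        cik, name, form_type, filed, path = parts
        yield IndexEntry(int(cik), name, form_type,
                         date.fromisoformat(filed), path)


# Synthetic sample in the assumed layout; not real filings.
SAMPLE_INDEX = """\
Description:           Master Index of EDGAR Dissemination Feed

CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1234567|EXAMPLE CORP|10-K|2018-02-15|edgar/data/1234567/0001234567-18-000001.txt
1234567|EXAMPLE CORP|8-K|2018-03-01|edgar/data/1234567/0001234567-18-000002.txt
7654321|SAMPLE HOLDINGS INC|10-K|2018-03-20|edgar/data/7654321/0007654321-18-000003.txt
"""

entries = list(parse_master_index(SAMPLE_INDEX))

# Step (ii): a simple metadata table keyed by form type.
form_counts = Counter(entry.form_type for entry in entries)
print(form_counts)
```

In a Django-based framework such as the one described, records like these would presumably be stored as model rows rather than held in memory. The in-memory `Counter` here only stands in for that kind of metadata table.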

Code Implementations: 1 repo
