https://github.com/datasets/edgar
Securities and Exchange Commission (SEC) EDGAR database which contains regulatory filings from publicly-traded US corporations.
- Host: GitHub
- URL: https://github.com/datasets/edgar
- Owner: datasets
- Created: 2014-03-03T20:17:23.000Z (over 12 years ago)
- Default Branch: main
- Last Pushed: 2024-10-25T14:25:56.000Z (over 1 year ago)
- Last Synced: 2025-03-04T17:34:47.122Z (over 1 year ago)
- Language: HTML
- Homepage: https://datahub.io/
- Size: 26.4 KB
- Stars: 330
- Watchers: 33
- Forks: 68
- Open Issues: 3
Metadata Files:
- Readme: README.md
README
Securities and Exchange Commission (SEC) EDGAR database. EDGAR contains
regulatory filings from publicly-traded US corporations including their annual
and quarterly reports:
[edgar]: http://www.sec.gov/edgar.shtml
> All companies, foreign and domestic, are required to file registration
> statements, periodic reports, and other forms electronically through EDGAR.
> Anyone can access and download this information for free. [from the [SEC
> website][edgar]]
## Human Interface
See

## Bulk Data
~~EDGAR provides bulk access via FTP: - [official
documentation][ftp-doc]. We summarize here the main points.~~
Each company in EDGAR gets an identifier known as the CIK, which is a 10-digit
number. You can find the CIK by searching EDGAR using a name or stock market
ticker.
For example, [searching for IBM by ticker][ibm-search] shows us that
the CIK is `0000051143`.
Note that leading zeroes are often omitted ~~(e.g. in the ftp access)~~ so this
would become `51143`.
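Converting between the padded and unpadded forms can be sketched as below. The helper names `pad_cik` and `strip_cik` are hypothetical and not part of this repository.

```python
def pad_cik(cik):
    """Return the 10-digit, zero-padded form of a CIK (hypothetical helper)."""
    digits = str(cik).strip()
    # A CIK is at most 10 ASCII digits; reject anything else early.
    if not (digits.isascii() and digits.isdigit()) or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    return digits.zfill(10)


def strip_cik(cik):
    """Return the CIK with leading zeroes omitted (hypothetical helper)."""
    return str(int(pad_cik(cik)))


print(pad_cik(51143))
print(strip_cik("0000051143"))
```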
[ibm-search]: http://www.sec.gov/cgi-bin/browse-edgar?CIK=ibm&action=getcompany

Next each submission receives an 'Accession Number' (acc-no). For example,
IBM's quarterly financial filing (form 10-Q) in October 2013 had accession
number: `0000051143-13-000007`.
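As a rough sketch, an accession number can be split into its three dash-separated parts. The field names below (`filer_id`, `year`, `sequence`) are an assumed reading of the format (a 10-digit filer identifier, a 2-digit year, a 6-digit sequence number) and are not taken from this README.

```python
import re
from typing import NamedTuple

# Assumed layout: 10 digits, 2 digits, 6 digits, separated by dashes.
ACCESSION_RE = re.compile(r"(\d{10})-(\d{2})-(\d{6})")


class AccessionNumber(NamedTuple):
    filer_id: str   # assumed: identifier of the submitting filer
    year: str       # assumed: two-digit filing year
    sequence: str   # assumed: per-filer sequence number within that year


def parse_accession(acc_no):
    """Split an accession number into its parts (hypothetical helper)."""
    match = ACCESSION_RE.fullmatch(acc_no.strip())
    if match is None:
        raise ValueError(f"unexpected accession number format: {acc_no!r}")
    return AccessionNumber(*match.groups())


acc = parse_accession("0000051143-13-000007")
print(acc)
```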
### HTTPS File Paths
Given a company with CIK (company ID) XXX (omitting leading zeroes) and
document accession number YYY (acc-no on search results), file paths are of the
form:

    /edgar/data/XXX/YYY.txt
For example, for the IBM data above, XXX is `51143` and YYY is
`0000051143-13-000007`.
EDGAR has retired HTTP services. Instead use the HTTPS equivalent.
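A minimal sketch of assembling the full HTTPS URL for a filing from the path pattern above. The base `https://www.sec.gov/Archives` is an assumption borrowed from the index URL form given in the Indices section, and `filing_text_url` is a hypothetical helper name.

```python
# Assumption: filings live under the same base as the full-index URLs.
ARCHIVES_BASE = "https://www.sec.gov/Archives"


def filing_text_url(cik, acc_no):
    """Build the .txt URL for a filing (hypothetical helper)."""
    cik_path = str(int(str(cik)))  # omit leading zeroes, as the path requires
    return f"{ARCHIVES_BASE}/edgar/data/{cik_path}/{acc_no}.txt"


url = filing_text_url("0000051143", "0000051143-13-000007")
print(url)
```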
Note, if you are looking for a nice HTML version you can find it in the
Archives section with a similar URL (just add `-index.html`):
### Indices
If you want to get a list of all filings you'll want to grab an Index. As the help page explains:
> The EDGAR indices are a helpful resource for HTTPS retrieval, listing the
> following information for each filing: Company Name, Form Type, CIK, Date
> Filed, and File Name (including folder path).
>
> Four types of indexes are available:
>
> * company — sorted by company name
> * form — sorted by form type
> * master — sorted by CIK number
> * XBRL — list of submissions containing XBRL financial files, sorted by CIK
> number; these include Voluntary Filer Program submissions
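Once an index has been downloaded and decompressed, its rows can be turned into records. The sketch below assumes the master index is pipe-delimited with the columns in the order `CIK|Company Name|Form Type|Date Filed|Filename`, preceded by a free-text preamble; check this against a real file before relying on it. The sample text is illustrative only and its filing date is a placeholder.

```python
FIELDS = ["cik", "company_name", "form_type", "date_filed", "file_name"]


def parse_master_index(text):
    """Return one dict per filing row of a master index (assumed layout)."""
    rows = []
    for line in text.splitlines():
        parts = line.split("|")
        # Data rows have exactly five fields and a numeric CIK; this skips
        # the preamble, the column header line, and the dashed separator.
        if len(parts) == 5 and parts[0].strip().isdigit():
            rows.append(dict(zip(FIELDS, (p.strip() for p in parts))))
    return rows


# Illustrative sample, not real index content; the date is a placeholder.
SAMPLE = """\
Description: illustrative sample of a master index
CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
51143|EXAMPLE COMPANY NAME|10-Q|2013-10-01|edgar/data/51143/0000051143-13-000007.txt
"""

rows = parse_master_index(SAMPLE)
print(rows[0]["cik"], rows[0]["form_type"], rows[0]["file_name"])
```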
Index URLs have the following general form:
~~ftp://ftp.sec.gov/edgar/full-index/{YYYY}/QTR{1-4}/{index-name}.[gz|zip]~~
https://www.sec.gov/Archives/edgar/full-index/{YYYY}/QTR{1-4}/{index-name}.[gz|zip]
So for XBRL in the 3rd quarter of 2010 we'd use `2010` for the year, `QTR3` for
the quarter, and the XBRL index as the index name.
~~[ftp-doc]: https://www.sec.gov/edgar/searchedgar/ftpusers.htm~~
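The general form above can be filled in programmatically. This is a sketch only: `index_url` is a hypothetical helper, and the lowercase index file names (`company`, `form`, `master`, `xbrl`) are an assumption about how the four index types are named on the server.

```python
INDEX_BASE = "https://www.sec.gov/Archives/edgar/full-index"
INDEX_NAMES = {"company", "form", "master", "xbrl"}  # assumed file names


def index_url(year, quarter, name, ext="gz"):
    """Build a quarterly index URL from the general form (hypothetical helper)."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError(f"quarter must be 1-4, got {quarter!r}")
    if name not in INDEX_NAMES:
        raise ValueError(f"unknown index name: {name!r}")
    if ext not in ("gz", "zip"):
        raise ValueError(f"unsupported extension: {ext!r}")
    return f"{INDEX_BASE}/{year}/QTR{quarter}/{name}.{ext}"


# XBRL index for the 3rd quarter of 2010.
url = index_url(2010, 3, "xbrl")
print(url)
```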
### CIK lists and lookup
There's a full list of all companies along with their CIK code here:
If you want to look up a CIK or company by its ticker you can do the following query against the normal search system:
Then parse the atom to grab the CIK. (If you prefer HTML output just omit output=atom).
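A sketch of the ticker lookup, under stated assumptions: the query reuses the `browse-edgar` parameters from the IBM search link earlier in this README plus `output=atom`, served over HTTPS, and the feed is assumed to expose the CIK in an element whose local name is `cik`. The sample feed is a hypothetical stand-in, not a captured response, and no network request is made here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumption: HTTPS form of the search endpoint used in the IBM link above.
SEARCH_URL = "https://www.sec.gov/cgi-bin/browse-edgar"


def lookup_url(ticker):
    """Build the atom search URL for a ticker (hypothetical helper)."""
    query = {"CIK": ticker, "action": "getcompany", "output": "atom"}
    return f"{SEARCH_URL}?{urlencode(query)}"


def extract_cik(atom_xml):
    """Return the text of the first element named 'cik', or None."""
    root = ET.fromstring(atom_xml)
    for element in root.iter():
        # Drop any '{namespace}' prefix before comparing the tag name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name.lower() == "cik" and element.text:
            return element.text.strip()
    return None


# Hypothetical feed shaped like the assumed response.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <company-info>
    <cik>0000051143</cik>
  </company-info>
</feed>"""

print(lookup_url("ibm"))
print(extract_cik(SAMPLE_FEED))
```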
There is also a full-text company name to CIK lookup here:
(Note this does a POST to a 'text' API at )
## Parsing XBRL Data
See `scripts` and README file there.
## References
* CorpWatch have an excellent API and DB dump covering a lot of EDGAR info - see the [CorpWatch DataHub Entry][corpwatch]
[corpwatch]: http://datahub.io/dataset/corpwatch