---
output:
github_document:
html_preview: false
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r echo=FALSE}
# We jump through these hoops to make sure we're re-using the same cache path
cache.path = if (file.exists(file.path(getwd(),"docs"))) {
file.path(getwd(),"docs","README_cache/")
} else {
file.path(getwd(),"README_cache/")
}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-",
cache.path = cache.path
)
library(edgarWebR)
```
# edgarWebR
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/edgarWebR)](https://cran.r-project.org/package=edgarWebR)
![R-CMD-check](https://github.com/mwaldstein/edgarWebR/workflows/R-CMD-check/badge.svg)
[![codecov.io](https://codecov.io/github/mwaldstein/edgarWebR/coverage.svg?branch=master)](https://codecov.io/github/mwaldstein/edgarWebR?branch=master)
* Author/Maintainer: [Micah J Waldstein](https://micah.waldste.in)
* License: [MIT](https://opensource.org/licenses/MIT)
## Introduction
edgarWebR provides an interface to access the [SEC's EDGAR
system](https://www.sec.gov/edgar/search-and-access) for company
financial filings.
edgarWebR does *not* provide any functionality to extract financial data or
other information from filings, only the metadata and company information. For
processing the financial data itself, see the packages listed under Related
Packages below.
## Ethical Use & Fair Access
Because this library has been used abusively, the SEC is likely to block
requests made with its default settings, that is, without a custom 'User Agent'
identifier. Details for setting a custom agent are below.
Users of this library are required to follow the SEC's [Privacy and Security
Policy](https://www.sec.gov/privacy.htm#security). Failure to follow that
guidance may result in having your requests blocked. Per the SEC's [Developer
Resources](https://www.sec.gov/developer):
> To ensure that everyone has equitable access to SEC EDGAR content, please use
> efficient scripting, downloading only what you need and please moderate
> requests to minimize server load. Current guidelines limit each user to a
> total of no more than 10 requests per second, regardless of the number of
> machines used to submit requests.
>
> To ensure that SEC.gov remains available to all users, we reserve the right to
> block IP addresses that submit excessive requests. The SEC does not allow
> "unclassified" bots or automated tools to crawl the site. Any request that has
> been identified as part of an unclassified bot or an automated tool outside of
> the acceptable policy will be managed to ensure fair access for all users.
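
One way to stay within the request limit quoted above is to space out calls on
the client side. The following is a minimal sketch, not part of edgarWebR:
`throttle()` and its `min_interval` argument are hypothetical names introduced
here for illustration, and the 0.2 second default is an arbitrary conservative
choice.

```{r throttle_sketch, eval=FALSE}
# Hypothetical helper (not part of edgarWebR): wrap a function so that
# successive calls are spaced at least `min_interval` seconds apart.
throttle <- function(f, min_interval = 0.2) {
  last_call <- NULL  # time of the previous call, kept in the closure
  function(...) {
    if (!is.null(last_call)) {
      elapsed <- as.numeric(difftime(Sys.time(), last_call, units = "secs"))
      if (elapsed < min_interval) {
        # Wait out the remainder of the interval before calling again.
        Sys.sleep(min_interval - elapsed)
      }
    }
    last_call <<- Sys.time()
    f(...)
  }
}
```

For example, `slow_filings <- throttle(company_filings, min_interval = 0.2)`
would give a drop-in replacement that issues at most about five requests per
second from a single R session. It does not coordinate across multiple sessions
or machines, which the SEC limit also covers.
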
Users of this library are advised to set a custom user-agent by setting the
environment variable `EDGARWEBR_USER_AGENT`.
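
As a minimal sketch, the variable can be set from within R using base R's
`Sys.setenv()`. The value shown is a placeholder, not a recommended string; a
descriptive name plus a contact email address is assumed here to be a
reasonable choice.

```{r user_agent, eval=FALSE}
# Identify yourself to the SEC before making any requests. The value is a
# placeholder: substitute your own organization name and contact address.
Sys.setenv(EDGARWEBR_USER_AGENT = "Example Organization admin@example.com")

# Confirm the variable is visible to the current R session.
Sys.getenv("EDGARWEBR_USER_AGENT")
```

To keep the setting across sessions, the same `EDGARWEBR_USER_AGENT=...` entry
can instead be placed in your `~/.Renviron` file.
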
## EDGAR Tools
The EDGAR System provides a number of
[tools](https://www.sec.gov/edgar/search-and-access)
for filing and entity lookup and examination. As of v1.0, edgarWebR supports
all public search and browse interfaces.
*Search Interfaces:*
| Tool | URL | edgarWebR function(s) |
|-------------------------------|-----------------------------------------------------------------|-----------------------|
| Company | https://www.sec.gov/edgar/searchedgar/companysearch.html | `company_search()`, `company_information()`, `company_details()`, `company_filings()` |
| Recent Filings | https://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent | `latest_filings()` |
| Full Text | https://www.sec.gov/edgar/search/ | `full_text()` |
| Header Search | https://www.sec.gov/cgi-bin/srch-edgar | `header_search()` |
| Fund Disclosures | https://www.sec.gov/edgar/searchedgar/prospectus.htm | Use `company_search()` and specify the 'type' parameter as 485 |
| Fund Voting Records | https://www.sec.gov/edgar/searchedgar/n-px.htm | Use `company_search()` and specify the 'type' parameter as 'N-PX' |
| Fund Search | https://www.sec.gov/edgar/searchedgar/mutualsearch.html | `fund_search()`, `fund_fast_search()` |
| Var. Insurance Products | https://www.sec.gov/edgar/searchedgar/vinsurancesearch.html | `variable_insurance_search()`, `variable_insurance_fast_search()` |
| Confidential treatment orders | https://www.sec.gov/edgar/searchedgar/ctorders.htm | Use `header_search()`, `company_search()`, `latest_filings()`, or `full_text()` and use form types 'CT ORDER'|
| Effectiveness notices | https://www.sec.gov/cgi-bin/browse-edgar?action=geteffect | `effectiveness()` |
| CIK | https://www.sec.gov/edgar/searchedgar/cik.htm | `cik_search()` |
| Daily Filings | https://www.sec.gov/edgar/searchedgar/currentevents.htm | `current_events()` |
| Correspondence | https://www.sec.gov/answers/edgarletters.htm | Use `header_search()`, `company_search()`, `latest_filings()`, or `full_text()` and use form types 'upload' or 'corresp'|
Once a filing is found via any of the above, there are a number of functions to
process the result:
* `filing_documents()`
* `filing_filers()`
* `filing_funds()`
* `filing_information()`
* `filing_details()` - returns all 4 of the filing components in a list.
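
The search and filing functions can be chained, roughly as sketched below.
`latest_filing_details()` is a hypothetical helper written for this example,
and the sketch assumes that the data frame returned by `company_filings()` has
an `href` column holding the URL of each filing's index page, which is what
`filing_details()` is assumed to take as its argument.

```{r filing_details_sketch, eval=FALSE}
# Hypothetical helper (not part of edgarWebR): fetch the details of the first
# filing of a given type returned for a company.
latest_filing_details <- function(ticker, type = "10-K") {
  filings <- edgarWebR::company_filings(ticker, type = type, count = 10)
  # Assumption: `href` holds the URL of each filing's index page.
  edgarWebR::filing_details(filings$href[1])
}

# Example call (makes requests to the SEC, so it is left commented out):
# details <- latest_filing_details("AAPL")
# names(details)
```
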
### Parsing Tools
While edgarWebR is primarily focused on providing an interface to the online
SEC tools, there are a few activities for handling filing documents for which
no current tools exist.
* `parse_submission()` - takes a full submission SGML document and parses out
component documents. Most of the time, the documents of interest in a
particular submission will be online and accessible via
`filing_documents()` - this function is to unpack the raw submission to get
all the documents. You may also find it more efficient if you're regularly
downloading all of the files in a given submission.
* `parse_filing()` - takes an HTML narrative filing and annotates each
paragraph with item and part numbers.
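
A possible way to combine these with the filing functions is sketched below.
`annotate_filing_document()` is a hypothetical helper, and the sketch rests on
two assumptions that should be checked against the function documentation:
that the table returned by `filing_documents()` has `type` and `href` columns,
and that `parse_filing()` accepts a document URL as its first argument.

```{r parse_filing_sketch, eval=FALSE}
# Hypothetical helper (not part of edgarWebR): locate the main document of a
# filing by its type and annotate its paragraphs.
annotate_filing_document <- function(filing_href, doc_type = "10-K") {
  docs <- edgarWebR::filing_documents(filing_href)
  # Assumption: `type` and `href` are columns of the documents table.
  doc_href <- docs$href[docs$type == doc_type][1]
  edgarWebR::parse_filing(doc_href)
}

# Example call (makes requests to the SEC, so it is left commented out):
# filings <- company_filings("AAPL", type = "10-K", count = 10)
# annotated <- annotate_filing_document(filings$href[1])
# str(annotated)
```
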
### Data Sets
There is one dataset provided with edgarWebR - `sic_codes`, providing a
catalog of SIC codes and their hierarchy.
### URL Tools
There are also a number of utility functions to help construct useful URLs
once you have a company CIK, submission accession number, or specific file.
* `company_href()` for linking to the company page
* `submission_index_href()` and its family of related functions for linking
to a specific submission and file.
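
As a rough sketch of how these fit together, the hypothetical helper below
builds both links for a filing. It assumes that `company_href()` takes a CIK
and that `submission_index_href()` takes a CIK followed by an accession number,
passed positionally; check the function documentation for the exact argument
names.

```{r url_tools_sketch, eval=FALSE}
# Hypothetical helper (not part of edgarWebR): collect the company page and
# submission index links for one filing.
filing_links <- function(cik, accession_number) {
  list(
    company = edgarWebR::company_href(cik),
    # Assumption: arguments are the CIK, then the accession number.
    submission_index = edgarWebR::submission_index_href(cik, accession_number)
  )
}

# Example call, using an accession number taken from a search result:
# filings <- company_filings("AAPL", type = "10-K", count = 10)
# filing_links("0000320193", filings$accession_number[1])
```

Here `"0000320193"` is Apple Inc.'s CIK, and `accession_number` is assumed to
be a column of the `company_filings()` result.
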
## Installation
edgarWebR is available from CRAN, so it can be installed with
```{r install, eval=FALSE}
install.packages("edgarWebR")
```
To install the development version from GitHub,
```{r install_dev, eval=FALSE}
# Install the development version from GitHub:
# install.packages("devtools")
devtools::install_github("mwaldstein/edgarWebR")
```
## Example
```{r example, cache=TRUE}
company_filings("AAPL", type = "10-K", count = 10)
```
## Related Packages
* [XBRL](https://CRAN.R-project.org/package=XBRL) - Low-level
extraction of data from XBRL financial files
* [finstr](https://github.com/bergant/finstr) - Process XBRL to extract data,
combine periods, and make basic financial calculations.
* [finreportr](https://github.com/sewardlee337/finreportr) - All-in-one tool to
pull financials and information from EDGAR
## Code of Conduct
Please note that this project is released with a [Contributor Code of
Conduct](https://mwaldstein.github.io/edgarWebR/CONDUCT.html). By participating in this project you agree to abide by
its terms. Report violations to (micah@waldste.in).