https://github.com/ekzhu/xbrl2rdf
Publishing XBRL document as RDF data
https://github.com/ekzhu/xbrl2rdf
Last synced: 6 months ago
JSON representation
Publishing XBRL document as RDF data
- Host: GitHub
- URL: https://github.com/ekzhu/xbrl2rdf
- Owner: ekzhu
- License: mit
- Created: 2014-02-28T00:52:57.000Z (over 11 years ago)
- Default Branch: master
- Last Pushed: 2017-05-15T18:16:04.000Z (over 8 years ago)
- Last Synced: 2025-04-10T21:15:09.881Z (6 months ago)
- Language: Java
- Homepage:
- Size: 568 KB
- Stars: 8
- Watchers: 3
- Forks: 2
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
xbrl2rdf
========This is a command line tool for converting XBRL documents to Resource Description Framework (RDF) format. It uses [xCurator](https://github.com/ekzhu/xcurator). It can be used together with [secxbrl](https://github.com/ekzhu/secxbrl), which downloads the URLs of XBRL documents from SEC.
##Build
Building this tool requires [Ant](http://ant.apache.org). Run `ant -version` to see if it is installed.
1. Check out the repo.
2. Run `ant init` to resolve dependencies
3. Currently the xCurator dependency cannot be resolved automatically. You need to build it yourself. See instruction [here](https://github.com/ekzhu/xcurator). Once built, copy the `xcurator.jar` to the `lib` directory.
4. Run `ant clean dist` to build afresh.
5. The runnable JAR file `xbrl2rdf.jar` is in `dist` directory##Quick start
Follow the instructions for building. Once done, in the directory of `xbrl2rdf.jar`, run the following commands
mkdir tdb
java -jar xbrl2rdf.jar -d http://www.sec.gov/Archives/edgar/data/1326801/000132680114000007/fb-20131231.xml -h http://corpbase.org -m fb-20131231-mapping.xml -t tdbThis will download Facebook Inc.'s 2013 annual filing, and convert the document into RDF data. The RDF data is stored in a [TDB](http://jena.apache.org/documentation/tdb/), which is in the `tdb` directory. The entities and relations in the data are serialized into the mapping file `fb-20131231-mapping.xml`. The domain name of the RDF data is set to `http://corpbase.org`.
Local XBRL documents can also be used, just use local file path for `-d` argument.
##Serve data over HTTP
[Fuseki](http://jena.apache.org/documentation/serving_data) can be used for serving the RDF data over HTTP through [SPARQL](http://www.w3.org/TR/sparql11-query/) endpoint. It also provides a nice web interface for running SPARQL queries.
For quick start, [download](http://jena.apache.org/download) Fuseki, then run the following command:
bash fuseki-server --loc /path/to/tdb/directory /corpbase
The web query interface can now be accessed at `http://localhost:3030/control-panel.tpl`, select `/corpbase` dataset.
##Query the data
The data can be queried directly through the SPARQL endpoint served by Fuseki server, either using [ARQ](http://jena.apache.org/documentation/query/) or the web query interface provided by Fuseki server.
You can use the following SPARQL query to get the net income information in the filings.
prefix class:
prefix property:
prefix rdf:
prefix rdfs:select distinct ?name ?netincome ?start ?end
where
{
?a rdf:type class:xbrl.
?a property:EntityRegistrantName ?n.
?n property:value ?name.
?a property:NetIncomeLoss ?z.
?z property:value ?netincome.
?z property:context ?context.
?context property:entity ?contextentity.
filter not exists {?contextentity property:segment ?x}.
?context property:period ?period.
?period property:startDate ?start.
?period property:endDate ?end.
}
order by ?name[Apache Jena](http://jena.apache.org) provides many other tools and libraries to work with RDF data, vist their website for more information.