https://github.com/paulhoule/infovore
RDF-Centric Map/Reduce Framework and Freebase data conversion tool
- Host: GitHub
- URL: https://github.com/paulhoule/infovore
- Owner: paulhoule
- License: other
- Created: 2012-11-02T01:56:00.000Z (about 13 years ago)
- Default Branch: master
- Last Pushed: 2021-11-15T06:03:33.000Z (about 4 years ago)
- Last Synced: 2025-10-20T15:52:35.742Z (about 1 month ago)
- Language: Java
- Size: 4.18 MB
- Stars: 148
- Watchers: 21
- Forks: 21
- Open Issues: 49
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Awesome Lists containing this project
- awesome-semantic-web - infovore - RDF-Centric Map/Reduce Framework and Freebase data conversion tool. (Machine Learning / BBedit)
- awesome-bigdata - Infovore - RDF-centric Map/Reduce framework. (Graph Data Model)
- fucking-awesome-bigdata - Infovore - RDF-centric Map/Reduce framework. (Graph Data Model)
- A-curated-list-of-awesome-big-data-frameworks-ressources-and-other-awesomeness.- - Infovore - RDF-centric Map/Reduce framework. (Graph Data Model)
- data-engineering-collection - Infovore - RDF-centric Map/Reduce framework. (Graph Data Model)
README
Overview
--------
Infovore is an RDF processing system that uses Hadoop to process RDF data
sets in the billion-triple range and beyond. Infovore was originally designed to process
the (old) proprietary Freebase dump into RDF, but once Freebase came out with an official RDF
dump, Infovore gained the ability to clean and purify the dump, making it not just possible
but easy to process Freebase data with triple stores such as Virtuoso 7.
Every week we run Infovore on Amazon Elastic MapReduce to produce a product known as
[:BaseKB](http://basekb.com/).
Infovore depends on the [Centipede](https://github.com/paulhoule/centipede/wiki) framework for packaging
and processing command-line arguments. The [Telepath](https://github.com/paulhoule/telepath/wiki) project
extends the Infovore project in order to process Wikipedia usage information to produce a product called
[:SubjectiveEye3D](https://github.com/paulhoule/telepath/wiki/SubjectiveEye3D).
Supporting
----------
It costs several hundred dollars per month to process and store the files connected with this work.
Please join Gittip and make a small weekly donation to keep this data free.
Building
--------
Infovore requires JDK 7. Build it from the top-level directory with Maven:

    mvn clean install
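
As a rough sketch, the build step can be wrapped in a small script that checks the installed JDK before invoking Maven. The function names `java_major_version` and `build_infovore` are hypothetical helpers, not part of Infovore. The README does not say whether JDKs newer than 7 work, so the sketch only warns on a mismatch rather than refusing to build.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Infovore) that wrap the documented build step.

# Extract the major Java version from a version string.
# Legacy scheme "1.7.0_80" yields 7; modern scheme "17.0.2" yields 17.
java_major_version() {
  local v="$1"
  v="${v#1.}"        # drop the legacy "1." prefix if present
  v="${v%%[._-]*}"   # keep only what precedes the first '.', '_' or '-'
  echo "$v"
}

# Check the JDK, warn if it is not version 7, then run the documented build.
build_infovore() {
  if ! command -v java >/dev/null 2>&1; then
    echo "java not found on PATH" >&2
    return 1
  fi
  # `java -version` writes to stderr; the first line quotes the version string.
  local raw major
  raw="$(java -version 2>&1 | head -n 1 | sed -E 's/.*"([^"]+)".*/\1/')"
  major="$(java_major_version "$raw")"
  if [ "$major" != "7" ]; then
    # Assumption: only JDK 7 is documented, so other versions are untested.
    echo "warning: README asks for JDK 7, found '$raw'" >&2
  fi
  mvn clean install
}
```

Run `build_infovore` from the top-level "infovore" directory after sourcing this script.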
Installing
----------
The following cantrip, run from the top-level "infovore" directory, initializes the bash shell
for the use of the "haruhi" program, which can be used to run Infovore applications
packaged in the Bakemono Jar.

    source haruhi/target/path.sh
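
Because `path.sh` lives under Maven's `target` output directory, it is presumably only there after a successful build. The sketch below guards against sourcing it too early. `setup_haruhi` is a hypothetical helper, and the check that `haruhi` becomes resolvable afterwards is an assumption about what `path.sh` does, not documented behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Infovore): source path.sh defensively.
setup_haruhi() {
  # Default to the path given in the README; allow an override for testing.
  local script="${1:-haruhi/target/path.sh}"
  if [ ! -f "$script" ]; then
    echo "missing $script; run 'mvn clean install' first" >&2
    return 1
  fi
  # Source in the current shell so any PATH changes persist.
  . "$script"
  # Assumption: after sourcing, `haruhi` resolves as a command.
  if ! command -v haruhi >/dev/null 2>&1; then
    echo "warning: haruhi is still not resolvable after sourcing $script" >&2
    return 2
  fi
}
```

Source this script in an interactive bash session, then call `setup_haruhi` from the top-level "infovore" directory.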
More Information
----------------
See
https://github.com/paulhoule/infovore/wiki
for documentation and join the discussion group at
https://groups.google.com/forum/#!forum/infovore-basekb