Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dki-lab/Freebase-Setup
The last data dump of Freebase with introductory explanation of its schema
https://github.com/dki-lab/Freebase-Setup
Last synced: 3 months ago
JSON representation
The last data dump of Freebase with introductory explanation of its schema
- Host: GitHub
- URL: https://github.com/dki-lab/Freebase-Setup
- Owner: dki-lab
- License: cc0-1.0
- Created: 2020-09-08T03:44:30.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2023-07-16T03:21:42.000Z (over 1 year ago)
- Last Synced: 2024-07-14T12:37:56.320Z (4 months ago)
- Language: Python
- Size: 117 KB
- Stars: 70
- Watchers: 3
- Forks: 13
- Open Issues: 5
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-starred - dki-lab/Freebase-Setup - The last data dump of Freebase with introductory explanation of its schema (others)
README
This repository contains instructions for setting up a SPARQL endpoint to Freebase via Virtuoso as well as some useful materials about Freebase.
## Requirements
* OpenLink Virtuoso 7.2.5 (download from this [link](https://sourceforge.net/projects/virtuoso/files/virtuoso/7.2.5/virtuoso-opensource.x86_64-generic_glibc25-linux-gnu.tar.gz/download))
* Python 3 (required if using the provided Python script)## Setup
### Freebase data dump
The latest offical data dump of Freebase can be downloaded [here](https://developers.google.com/freebase). However, in the official dump, the format of some literal types is not fully compatible with the N-Triples RDF standard (it's missing type decoration such as `^^`), which may cause it to fail to load into triplestores like Virtuoso. We fixed this issue. Our processed Virtuoso DB file can be downloaded from either this [OneDrive link](https://buckeyemailosu-my.sharepoint.com/:u:/g/personal/su_809_osu_edu/Ed0SY7sAS_ZGqNTovDYhVCcBxEmZfhL3B-chAiuoZCrpVg?e=vpHUei), [Dropbox link](https://www.dropbox.com/s/q38g0fwx1a3lz8q/virtuoso_db.zip) or via wget (WARNING: 53G+ disk space is needed):
```
wget https://www.dropbox.com/s/q38g0fwx1a3lz8q/virtuoso_db.zip
```In case you'd like to load your own RDF into Virtuoso, see [here](http://vos.openlinksw.com/owiki/wiki/VOS/VirtBulkRDFLoader) for instructions. If you prefer some other triplestore, try fixing the literal format issue of Freebase using the script `fix_freebase_literal_format.py` to get the N-Triples-formatted data.
### Managing the Virtuoso service
We provide a wrapper script (`virtuoso.py`, adapted from [Sempre](https://github.com/percyliang/sempre)) for managing the Virtuoso service. To use it, first change the `virtuosoPath` in the script to your local Virtuoso directory. Assuming the Virtuoso db file is located in a directory named `virtuoso_db` under the same directory as the script `virtuoso.py` and 3001 is the intended HTTP port for the service, to start the Virtuoso service:
```
python3 virtuoso.py start 3001 -d virtuoso_db
```and to stop a currently running service at the same port:
```
python3 virtuoso.py stop 3001
```A server with at least 100 GB RAM is recommended. You may adjust the maximum amount of RAM the service may use and other configurations via the provided script.
## Useful materials about Freebase
- Some basic concepts: https://developers.google.com/freebase/guide/basic_concepts
- A brief review of Freebase: https://github.com/nchah/freebase-mql
- [RDF](https://www.w3.org/TR/rdf11-concepts/) and [SPARQL](https://www.w3.org/TR/sparql11-query/)
- Freebase is strongly typed (in terms of terminology, type=class, property=relation=predicate). Each property has one and only one _domain type_ (the type of the subject entity) and _range type_ (the type of the object entity). For example, the property `people.person.place_of_birth`'s domain type is `people.person` and range type is `location.location`.
- A list of frequently used relations for meta-information:
- `common.topic`: the type of all entities. An item is an entity if it has this type.
- `type.object.name`: the canonical name of an item (entity/type/property). Filter by the language decorator (e.g., `@en`) if only a certain language is of interest.
- `common.topic.alias`: a list of common aliases of an entity.
- `common.topic.description`: a textual description of an entity.
- `type.object.type`: the list of types of an entity.
- `type.type.properties`: the list of properties with a type as the domain type.
- `type.type.expected_by`: the list of properties with a type as the range type.
- `type.property.expected_type`: the range type of a property.
- `freebase.type_profile.strict_included_types` and `freebase.type_profile.included_types`: the parent types of a type.
- `freebase.type_hints.mediator`: true if a type is a CVT (Compound Value Type), Freebase's way of model Neo-Davidsonian event semantics via event reification.
- `type.property.master_property` and `type.property.reverse_property`: Freebase has twin properties. These two properties let you find the twin.