Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/mneedham/neo4j-bbc


https://github.com/mneedham/neo4j-bbc

Last synced: 12 days ago
JSON representation

Awesome Lists containing this project

README

        

= BBC Champions League Graph

This project takes us from BBC live text commentary for Champions League 2014/2015 matches to a Neo4j graph containing the events of each match.

== Quick start

* Install the latest version of Neo4j from http://neo4j.com/download/
* Windows users: Install desktop application & then click the 'start' button
* Mac/Linux: Unpack the tarball & then `./bin/neo4j start`
* Download link:https://raw.githubusercontent.com/mneedham/neo4j-bbc/master/import.cql[import.cql] to your machine

Import the data into Neo4j:

[source, bash]
----
cd neo4j-community-2.2.2
./bin/neo4j-shell --file import.cql
----

Open http://localhost:7474 and you're good to go

== Working with the data

If you want to play with the raw data you'll first need to setup a Python environment.

Install link:https://virtualenv.pypa.io/en/latest/[virtualenv] and create a sandbox for this project:

[source, bash]
----
virtualenv bbc
source bbc/bin/activate
----

Install the appropriate libraries:

[source, bash]
----
pip install -r requirements.txt
----

Download all the matches:

[source, bash]
----
python find_all_matches.py | xargs wget -P data/raw
----

Generate the CSV files that we import into Neo4j:

[source, bash]
----
python extract_players.py
# players will be written to data/players.csv
----

[source, bash]
----
python extract_events.py
# the other CSV files in data/ will be written
----