https://github.com/volfpeter/graphscraper
Python 3 graph implementation designed to be turned into a web scraper for graph data.
- Host: GitHub
- URL: https://github.com/volfpeter/graphscraper
- Owner: volfpeter
- License: mit
- Created: 2017-09-29T13:31:53.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2023-01-22T10:04:35.000Z (about 2 years ago)
- Last Synced: 2024-07-08T05:49:40.228Z (7 months ago)
- Topics: cache-storage, graph, graph-algorithms, graph-database, python, python3, social-network-analysis, web, web-scraping
- Language: Python
- Homepage:
- Size: 31.3 KB
- Stars: 6
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- License: LICENSE.txt
|Downloads|

GraphScraper
=================

GraphScraper is a Python 3 library that contains a base graph implementation designed
to be turned into a web scraper for graph data. It has two major features:

1) The graph automatically manages a database (using either SQLAlchemy or
   Flask-SQLAlchemy) where it stores all the nodes and edges the graph has seen.

2) The base graph implementation provides hook methods that, if implemented,
   turn the graph into a web scraper.

Yet another graph implementation - why
-------------------------------------------

There are many excellent graph libraries available for different purposes. I started
implementing this one because I couldn't find a graph library that is dynamic (I don't
need the whole graph in memory - or on disk - before I start working with it), that
can be used as a web scraper (to seamlessly load nodes and edges from some remote
data source when that piece of data is needed) and that keeps all data (the graph)
automatically up-to-date on the disk. GraphScraper aims to satisfy these requirements.

Examples
----------------------

Besides the base graph implementation, the library also includes the following working
examples, which show how to implement and use an actual graph scraper:

- `igraphwrapper`: Instead of web scraping, this example uses an igraph_ graph
  instance as the "remote" source to scrape data from.
- `spotifyartist`: This example uses the Spotify_ web API to load artists;
  edges are defined by artist similarity.
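The idea behind these examples (a hook method that fetches neighbors on demand, plus a
database that remembers everything the graph has seen) can be illustrated with a minimal,
self-contained sketch. This is not GraphScraper's actual API: the class names, the
``load_neighbors`` hook, and the table layout are all hypothetical, and the standard
library's ``sqlite3`` stands in for SQLAlchemy to keep the example dependency-free. As in
`igraphwrapper`, a local object (here a plain dict) plays the role of the "remote" source.

```python
import sqlite3
from typing import Dict, List


class CachedGraph:
    """Hypothetical base graph that stores every node and edge it has seen."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS node ("
            "name TEXT PRIMARY KEY, loaded INTEGER NOT NULL DEFAULT 0)"
        )
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS edge ("
            "source TEXT NOT NULL, target TEXT NOT NULL, "
            "PRIMARY KEY (source, target))"
        )

    def load_neighbors(self, name: str) -> List[str]:
        """Hook method: subclasses fetch the neighbors of `name` remotely."""
        raise NotImplementedError

    def neighbors(self, name: str) -> List[str]:
        """Return the neighbors of `name`, calling the hook at most once per node."""
        row = self._db.execute(
            "SELECT loaded FROM node WHERE name = ?", (name,)
        ).fetchone()
        if row is None or not row[0]:
            # First request for this node: ask the remote source, then cache.
            for target in self.load_neighbors(name):
                self._db.execute(
                    "INSERT OR IGNORE INTO node (name) VALUES (?)", (target,)
                )
                self._db.execute(
                    "INSERT OR IGNORE INTO edge (source, target) VALUES (?, ?)",
                    (name, target),
                )
            self._db.execute("INSERT OR IGNORE INTO node (name) VALUES (?)", (name,))
            self._db.execute("UPDATE node SET loaded = 1 WHERE name = ?", (name,))
            self._db.commit()
        rows = self._db.execute(
            "SELECT target FROM edge WHERE source = ? ORDER BY target", (name,)
        ).fetchall()
        return [r[0] for r in rows]


class DictBackedGraph(CachedGraph):
    """Uses an in-memory dict as the 'remote' source instead of a web API."""

    def __init__(self, remote: Dict[str, List[str]]) -> None:
        super().__init__()
        self._remote = remote
        self.remote_calls = 0  # counts how often the hook was actually invoked

    def load_neighbors(self, name: str) -> List[str]:
        self.remote_calls += 1
        return list(self._remote.get(name, []))


graph = DictBackedGraph({"a": ["b", "c"], "b": ["a"], "c": []})
first = graph.neighbors("a")   # triggers the hook and fills the database
second = graph.neighbors("a")  # answered from the database only
```

In this sketch the second call never reaches the "remote" source, which is the behavior a
scraper wants when each remote request is slow or rate-limited. Passing a file path instead
of ``":memory:"`` would keep the cached graph on disk between runs.
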
More graph implementations
-----------------------------------

- USPTO_ patent citation graph
- Mastodon_ social graph

Related projects
-------------------------

- localclustering_: a local graph clustering algorithm

Dependencies
-----------------

If you wish to use one of the included graph implementations, then please read the
corresponding module's description for additional requirements.

Contribution
-----------------

Any form of constructive contribution (feedback, features, bug fixes, tests, additional
documentation, etc.) is welcome.

.. _igraph: http://igraph.org
.. _localclustering: https://github.com/volfpeter/localclustering
.. _Spotify: https://developer.spotify.com/web-api/
.. |Downloads| image:: https://pepy.tech/badge/graphscraper
.. _USPTO: https://github.com/volfpeter/uspto-patent-citation-graph
.. _Mastodon: https://github.com/volfpeter/mastodon-social-graph