https://github.com/pysemtec/semantic-python-overview

(subjective) overview of projects which are related both to python and semantic technologies (RDF, OWL, Reasoning, ...)

collection community-driven datalog knowledge-graph ontology owl python rdf semantic-web semantics sparql swrl

[![join community](https://pysemtec.org/img/join-community.svg "join community")](https://pysemtec.org)
# Semantic Python Overview

This repository aims to collect and curate a list of projects which are related both to python and semantic technologies (RDF, OWL, SPARQL, Reasoning, ...). It is inspired by collections like [awesome lists](https://github.com/sindresorhus/awesome#readme). The list might be incomplete and biased, due to the limited knowledge of its authors. Improvements are very welcome. Feel free to file an issue or a pull request. Every section is alphabetically sorted.

Furthermore, this repository might serve as a **crystallization point for a community** interested in such projects – and how they might productively interact. See [this discussion](https://github.com/cknoll/semantic-python-overview/discussions/1) for more information.

## Established Projects

- [Bioregistry](https://github.com/biopragmatics/bioregistry) - The Bioregistry
- docs: https://bioregistry.readthedocs.io
- website: https://bioregistry.io/
- features:
- Open source (and CC 0) repository of prefixes, their associated metadata, and mappings to external registries' prefixes
- Standardization of prefixes and CURIEs
- Interconversion between CURIEs and IRIs
- Generation of context-specific prefix maps for usage in RDF, LinkML, SSSOM, OWL, etc.
- [brickschema](https://github.com/BrickSchema/py-brickschema) – Brick Ontology Python package
- Brick is an open-source effort to standardize semantic descriptions of the physical, logical and virtual assets in buildings and the relationships between them.
- docs: https://brickschema.readthedocs.io/en/latest/
- website: https://brickschema.org/
- features:
- basic inference with different reasoners
- web based interaction (by means of [Yasgui](https://github.com/TriplyDB/Yasgui))
- Translations from different formats (Haystack, VBIS)
- [Cooking with Python and KBpedia](https://www.mkbergman.com/cooking-with-python-and-kbpedia/)
- Tutorial series on "how to pick tools and then use Python for using and manipulating the KBpedia knowledge graph"
- [Material in form of Jupyter Notebooks](https://github.com/Cognonto/CWPK)
- accompanying python package [cowpoke](https://github.com/Cognonto/cowpoke)
- [CubicWeb](https://www.cubicweb.org/) a framework to build semantic web applications
- website: https://www.cubicweb.org
- docs: https://cubicweb.readthedocs.io/en/latest/
- features:
- An engine driven by the explicit data model of the application
- RQL, an intuitive query language close to the business vocabulary
- An architecture that separates data selection and visualisation
- Data security by design
- An efficient data storage
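
The interconversion between CURIEs and IRIs listed under Bioregistry above can be illustrated with a minimal, standard-library-only sketch. This is not the Bioregistry API: the prefix map and the function names `expand` and `compress` are assumptions made for illustration, and a real registry covers far more prefixes and edge cases.

```python
# Small hand-written prefix map for illustration; a registry would supply this.
PREFIX_MAP = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "GO": "http://purl.obolibrary.org/obo/GO_",
}


def expand(curie, prefix_map=PREFIX_MAP):
    """Turn a CURIE such as 'owl:Class' into a full IRI (hypothetical helper)."""
    prefix, separator, local_id = curie.partition(":")
    if not separator or prefix not in prefix_map:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefix_map[prefix] + local_id


def compress(iri, prefix_map=PREFIX_MAP):
    """Turn an IRI into a CURIE, preferring the longest matching namespace."""
    matches = [(ns, prefix) for prefix, ns in prefix_map.items() if iri.startswith(ns)]
    if not matches:
        raise ValueError(f"no known namespace matches {iri!r}")
    namespace, prefix = max(matches, key=lambda match: len(match[0]))
    return f"{prefix}:{iri[len(namespace):]}"


go_iri = expand("GO:0008150")
owl_curie = compress("http://www.w3.org/2002/07/owl#Class")
```

Preferring the longest matching namespace avoids ambiguous results when one namespace is a prefix of another.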

- [Eddy](https://github.com/obdasystems/eddy) - graphical ontology editor
- website: https://www.obdasystems.com/eddy
- features:
- graphical ontology editing
- uses bespoke Graphol format but has an OWL2 export
- visualization built on PyQt5
- literature references:
- [*Lembo, D and Pantaleone, D and Santarelli, V and Savo, DF: **Eddy: A Graphical Editor for OWL 2 Ontologies**. IJCAI 2016; 4252-4253*](https://cs.unibg.it/savo/papers/LPSS-IJCAI-16.pdf)
- [fastobo-py](https://github.com/fastobo/fastobo-py): Python bindings for *fastobo* (rust library to parse OBO 1.4)
- features:
- load, edit and serialize ontologies in the OBO 1.4 format
- [FunOwl](https://github.com/hsolbrig/funowl) – functional OWL syntax for Python
- features:
- provide a pythonic API that follows the OWL functional model for constructing OWL
- [Gastrodon](https://github.com/paulhoule/gastrodon) - puts RDF data at your fingertips in Pandas; gateway to matplotlib, scikit-learn and other visualization tools.
- features:
- interpolate variables into SPARQL queries
- access local RDFlib graphs and remote SPARQL protocol endpoints
- convert SPARQL result set to pandas dataframes
- understandable error messages
- input/output graphs in Turtle form
- conversion between RDF collections and Python collections
- Sphinx domain to incorporate RDF data into documentation
- [gizmos](https://github.com/ontodev/gizmos) – Utilities for ontology development
- features:
- modules for "export", "extract", "tree"-rendering
- [Jabberwocky](https://github.com/sap218/jabberwocky) – a toolkit for ontologies
- features:
- text mining using an ontology's terms & synonyms
- tf-idf for synonym curation then adding those synonyms into an ontology
- [kglab](https://github.com/DerwenAI/kglab) - Graph Data Science
- docs: https://derwen.ai/docs/kgl/
- tutorial: https://derwen.ai/docs/kgl/tutorial/
- features:
- an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries
- perspective: there are several "camps" of graph technologies, with little discussion between them
- focus on supporting "Hybrid AI" approaches that combine two or more graph technologies with other ML work
- PyData stack – e.g., Pandas, scikit-learn, etc. – allows for graph work within data science workflows
- scale-out tools – e.g., RAPIDS, Arrow/Parquet, Dask – provide for scaling graph computation (not necessarily databases)
- graph algorithm libraries include NetworkX, iGraph, cuGraph – plus related visualization libraries in PyVis, Cairo, etc.
- W3C libraries in Py also lacked full integration: RDFlib, pySHACL, OWL-RL, etc.
- pslpython provides for _probabilistic soft logic_, working with uncertainty in probabilistic graphs
- additional integration paths and examples show how to work with deep learning (PyG)
- import paths from graph databases, such as Neo4j
- import paths from note-taking tools, such as Roam Research
- usage in [MkRefs](https://github.com/DerwenAI/mkrefs) to add semantic features into MkDocs so that open source projects can federate bibliographies, shared glossaries, etc.
- kglab team provides hands-on workshops at technology conferences for people to gain experience with these different graph approaches
- [KGX](https://github.com/biolink/kgx) - Library for building and exchanging knowledge graphs
- docs: https://kgx.readthedocs.io/
- features:
- Load graphs into an in-memory model to facilitate data integration, validation, and graph operations
- Provides an easy way to bring data into Biolink Model, a high-level data model for biomedical knowledge graphs
- The core data structure is a Property Graph (PG), represented internally using a `networkx.MultiDiGraph`
- Supports various input and output formats, including:
- RDF serializations
- SPARQL endpoints
- Neo4j endpoints
- CSV/TSV and JSON
- OWL
- OBOGraph JSON format
- SSSOM
- [LangChain](https://github.com/langchain-ai/langchain)'s GraphSparqlQAChain – A LangChain module for making RDF and OWL accessible via natural language
- docs: https://python.langchain.com/docs/use_cases/graph/graph_sparql_qa
- features:
- Generates SPARQL SELECT and UPDATE queries from natural language
- Runs the generated queries against local files, endpoints, or triple stores
- Returns natural language responses
- [LinkML](https://github.com/linkml/linkml) – Linked Open Data Modeling Language
- features:
- A high level simple way of specifying data models, optionally enhanced with semantic annotations
- A python framework for compiling these data models to json-ld, json-schema, shex, shacl, owl, sql-ddl
- A python framework for data conversion and validation, as well as generated Python dataclasses
- [Macleod](https://github.com/thahmann/macleod) – Ontology development environment for Common Logic (CL)
- features:
- Translating a CLIF file to formats supported by FOL reasoners
- Extracting an OWL approximation of a CLIF ontology
- Verifying (non-trivial) logical consistency of a CLIF ontology
- Proving theorems/lemmas, such as properties of concepts and relations or competency questions
- GUI (alpha state)
- [Morph-KGC](https://github.com/oeg-upm/morph-kgc) – System to create RDF and RDF-star knowledge graphs from heterogeneous sources with R2RML, RML and RML-star
- docs: https://morph-kgc.readthedocs.io
- features:
- support for relational databases, tabular files (e.g. CSV, Excel, Parquet) and hierarchical files (XML and JSON)
- generates RDF and RDF-star knowledge graphs by running through the command line or as a library
- integrates with RDFlib and Oxigraph to load the generated RDF directly to those libraries
- [nxontology](https://github.com/related-sciences/nxontology) – NetworkX-based library for representing ontologies
- features:
- load ontologies into a `networkx.DiGraph` or `MultiDiGraph` from `.obo`, `.json`, or `.owl` formats
(powered by pronto / fastobo)
- compute information content scores for nodes and semantic similarity scores for node pairs
- [obonet](https://github.com/dhimmel/obonet) – read OBO-formatted ontologies into NetworkX
- features:
- Load an `.obo` file into a `networkx.MultiDiGraph`
- Users should try [nxontology](https://github.com/related-sciences/nxontology) first, as a more general purpose successor to this project
- [OnToology](https://github.com/OnToology/OnToology) – System for collaborative ontology development process
- docs: http://ontoology.linkeddata.es/stepbystep
- live version: http://ontoology.linkeddata.es/
- citable reference: https://doi.org/10.1016/j.websem.2018.09.003
- [OntoPilot](https://github.com/stuckyb/ontopilot) – software for ontology development and deployment
- docs: https://github.com/stuckyb/ontopilot/wiki
- features:
- support end users in ontology development, documentation and maintenance
- convert spreadsheet data (one entity per row) to owl files
- call a reasoner before triple-store insertion
- [ontospy](https://github.com/lambdamusic/Ontospy) – Python library and command-line interface for inspecting and visualizing RDF models
- docs: http://lambdamusic.github.io/Ontospy/
- features:
- extract and print out any ontology-related information
- convert different OWL syntax variants
- generate html documentation for an ontology
- [ontor](https://github.com/felixocker/ontor) – Python library for manipulating and visualizing OWL ontologies
- features:
- tool set based on owlready2 and networkx
- [owlready2](https://bitbucket.org/jibalamy/owlready2/src/master/README.rst) – ontology oriented programming in Python
- docs: https://owlready2.readthedocs.io/en/latest/index.html
- features:
- parse owl files (RDF/XML or OWL/XML)
- parse SWRL rules
- call reasoner (via java)
- literature references:
- [*Lamy, JB: Owlready: **Ontology-oriented programming in Python with automatic classification and high level constructs for biomedical ontologies**. Artificial Intelligence In Medicine 2017;80:11-28*](http://www.lesfleursdunormal.fr/_downloads/article_owlready_aim_2017.pdf)
- [*Lamy, JB: **Ontologies with Python**, Apress, 2020*](https://www.apress.com/fr/book/9781484265512)
- accompanying material:
- [Oxrdflib](https://github.com/oxigraph/oxrdflib) – Oxrdflib provides rdflib stores using pyoxigraph (rust-based)
- could be used as drop-in replacements of the rdflib default ones
- [pronto](https://github.com/althonos/pronto): library to parse, browse, create, and export ontologies
- features:
- supports several ontology languages and formats
- docs: https://pronto.readthedocs.io/en/latest/api.html
- [pycottas](https://github.com/arenas-guerrero-julian/pycottas) – Library for working with compressed COTTAS files
- docs: https://pycottas.readthedocs.io
- features:
- compress RDF files to COTTAS format
- evaluate triple patterns over compressed RDF
- integrates with RDFlib as a store backend to query COTTAS files with SPARQL
- [pyfactxx](https://github.com/tilde-lab/pyfactxx) – Python bindings for FaCT++ OWL 2 C++ reasoner
- features:
- well-optimized reasoner for SROIQ(D) description logic, with additional improvements
- [rdflib](https://github.com/RDFLib/rdflib) integration
- easy cross-platform installation
- [PyFuseki](https://github.com/yubinCloud/pyfuseki) – Library that interacts with Jena Fuseki (SPARQL server)
- docs: https://yubincloud.github.io/pyfuseki/
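
Several entries above, nxontology in particular, mention information content scores for nodes and semantic similarity scores for node pairs. The following standard-library sketch shows one common way to define both: an intrinsic information content derived from descendant counts, the Resnik similarity (information content of the most informative common ancestor) and the Lin similarity. It is not nxontology's implementation; the toy hierarchy and all function names are assumptions.

```python
import math

# Toy is-a hierarchy (child -> parents); all names are made up for illustration.
PARENTS = {
    "entity": [],
    "animal": ["entity"],
    "plant": ["entity"],
    "dog": ["animal"],
    "cat": ["animal"],
}


def ancestors(node):
    """Return the node itself plus all of its transitive superclasses."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def n_descendants(node):
    """Count the strict descendants of a node."""
    return sum(1 for other in PARENTS if other != node and node in ancestors(other))


def information_content(node):
    """Intrinsic IC in [0, 1]: leaves score 1.0, the root scores 0.0."""
    return 1.0 - math.log(n_descendants(node) + 1) / math.log(len(PARENTS))


def resnik(a, b):
    """IC of the most informative common ancestor of a and b."""
    return max(information_content(n) for n in ancestors(a) & ancestors(b))


def lin(a, b):
    """Resnik similarity normalised by the IC of both nodes."""
    denominator = information_content(a) + information_content(b)
    return 2 * resnik(a, b) / denominator if denominator else 1.0
```

In this toy hierarchy `dog` and `cat` share the fairly specific ancestor `animal`, whereas `dog` and `plant` only share the root, so the second pair gets a similarity of zero.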

- [PyKEEN](https://github.com/pykeen/pykeen) (**Py**thon **K**nowl**E**dge **E**mbeddi**N**gs) – Python package to train and evaluate knowledge graph embedding models
- features:
- 44 Models
- 37 Datasets
- 5 Inductive Datasets
- support for multi-modal information
- [PyLD](https://github.com/digitalbazaar/pyld) - A JSON-LD processor written in Python
- conforms:
- JSON-LD 1.1, W3C Candidate Recommendation, 2019-12-12 or newer
- JSON-LD 1.1 Processing Algorithms and API, W3C Candidate Recommendation, 2019-12-12 or newer
- JSON-LD 1.1 Framing, W3C Candidate Recommendation, 2019-12-12 or newer
- [pyLoDStorage](https://github.com/WolfgangFahl/pyLoDStorage) – python library to interchange data between SPARQL-, JSON and SQL-endpoints
- features:
- Integration of [tabulate library](https://pypi.org/project/tabulate/)
- QueryManager class for handling named queries
- Basic data structure: **l**ists of **d**icts (thus: "LoD")
- docs: https://wiki.bitplan.com/index.php/PyLoDStorage
- [PyOBO](https://github.com/pyobo/pyobo)
- docs: https://pyobo.readthedocs.io
- features:
- Provides unified, high-level access to names, descriptions, synonyms, xrefs, hierarchies, properties, relationships, etc. in ontologies from many sources listed in the Bioregistry
- Converts databases into OWL and OBO ontologies
- Wrapper around ROBOT for using Java tooling to convert between OBO and OWL
- Internal DSL for generating OBO ontology
- [Pyoxigraph](https://oxigraph.org/pyoxigraph/stable/index.html) – Python graph database library implementing the SPARQL standard.
- built on top of [Oxigraph](https://github.com/oxigraph/oxigraph) using [PyO3](https://pyo3.rs/)
- docs: https://oxigraph.org/pyoxigraph/stable/index.html
- two stores with SPARQL 1.1 capabilities: in-memory and disk-based
- [PyRes](https://github.com/eprover/PyRes)
- resolution-based theorem provers for first-order logic
- focus on good comprehensibility of the code
- Literature: [Teaching Automated Theorem Proving by Example](https://link.springer.com/chapter/10.1007/978-3-030-51054-1_9)
- [pystardog](https://github.com/stardog-union/pystardog)
- Python bindings for the [Stardog Knowledge Graph platform](https://www.stardog.com/)
- [Quit Store](https://github.com/AKSW/QuitStore) – workspace for distributed collaborative Linked Data knowledge engineering ("Quads in Git")
- features:
- read and write RDF Datasets
- create multiple branches of the Dataset
- literature references:
- [*Decentralized Collaborative Knowledge Management using Git*](https://natanael.arndt.xyz/bib/arndt-n-2018--jws)
by Natanael Arndt, Patrick Naumann, Norman Radtke, Michael Martin, and Edgard Marx in Journal of Web Semantics, 2018
[[@sciencedirect](https://www.sciencedirect.com/science/article/pii/S1570826818300416)] [[@arXiv](https://arxiv.org/abs/1805.03721)]
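
The "lists of dicts" data structure named in the pyLoDStorage entry above is easy to move between JSON and SQL. The sketch below uses only the standard library (`sqlite3`, `json`) and is not the pyLoDStorage API; the helper names `store` and `load`, the table name and the sample records are assumptions made for illustration.

```python
import json
import sqlite3

# Made-up sample records: every dict is one row, every key one column.
records = [
    {"id": 1, "label": "alpha"},
    {"id": 2, "label": "beta"},
]


def store(lod, table, connection):
    """Create a table from the keys of the first record and insert all rows.

    The table and column names are interpolated into SQL, so they must come
    from trusted input in this sketch.
    """
    columns = list(lod[0])
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join(f":{c}" for c in columns)
    connection.execute(f"CREATE TABLE {table} ({column_sql})")
    connection.executemany(
        f"INSERT INTO {table} ({column_sql}) VALUES ({placeholders})", lod
    )


def load(table, connection):
    """Read a table back into a list of dicts."""
    cursor = connection.execute(f"SELECT * FROM {table}")
    names = [description[0] for description in cursor.description]
    return [dict(zip(names, row)) for row in cursor]


connection = sqlite3.connect(":memory:")
store(records, "items", connection)
restored = load("items", connection)
as_json = json.dumps(restored)
```

Because each record is a plain dict, the same list can be serialized with `json.dumps` without any conversion step.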

- [RaiseWikibase](https://github.com/UB-Mannheim/RaiseWikibase) – A tool for speeding up multilingual knowledge graph construction with Wikibase
- fast inserts into a Wikibase instance: creates up to a million entities and wikitexts per hour
- docs: https://ub-mannheim.github.io/RaiseWikibase/
- ships with `docker-compose.yml` for Wikibase (Database, PHP-code)
- publication: https://link.springer.com/chapter/10.1007%2F978-3-030-80418-3_11
- [Reasonable](https://github.com/gtfierro/reasonable) – An OWL 2 RL reasoner with reasonable performance
- written in Rust with Python-Bindings (via [pyo3](https://pyo3.rs/))
- [ROBOT](https://github.com/ontodev/robot) – Java-tool for automating ontology workflow with several reasoners (ELK, HermiT, ...) and Python interface
- General docs: https://robot.obolibrary.org/
- Python interfaces: https://robot.obolibrary.org/python
- Docs on reasoning: https://robot.obolibrary.org/reason
- [rdflib](https://github.com/RDFLib/rdflib) – Python package for working with RDF
- docs: https://rdflib.readthedocs.io/
- graphical package overview: https://rdflib.dev/
- features:
- parsers and serializers for RDF/XML, NTriples, Turtle, JSON-LD and more
- a graph interface which can be backed by any one of a number of store implementations
- store implementations for in-memory storage and persistent storage
- a SPARQL 1.1 implementation – supporting SPARQL 1.1 Queries and Update statements
- [rdflib-endpoint](https://github.com/vemonet/rdflib-endpoint) – Python package for easily deploying SPARQL endpoints for RDFLib Graphs
- features:
- exposing machine learning models or any other logic implemented in Python through a SPARQL endpoint, using custom functions
- serving local RDF files using the command line interface
- [serd](https://gitlab.com/drobilla/python-serd) – Python serd module, providing bindings for Serd, a lightweight C library for working with RDF data
- docs: https://drobilla.gitlab.io/python-serd/singlehtml/
- [sparqlfun](https://github.com/linkml/sparqlfun)
- LinkML based SPARQL template library and execution engine
- modularized core library of SPARQL templates
- Fully FAIR description of templates
- Rich expressive language for modeling templates
- uses [LinkML](https://linkml.io/linkml/) as base language
- optional python bindings / [object model](https://github.com/linkml/sparqlfun/blob/main/sparqlfun/model.py) using LinkML
- supports both SELECT and CONSTRUCT
- optional export to TSV, JSON, YAML, RDF
- extensive [endpoint metadata](https://github.com/linkml/sparqlfun/tree/main/sparqlfun/config)
- [SPARQL kernel](https://github.com/paulovn/sparql-kernel) for Jupyter
- features:
- sending queries to a SPARQL endpoint
- fetching and presenting the results in a notebook
- [SPARQLing Unicorn QGIS Plugin](https://github.com/sparqlunicorn/sparqlunicornGoesGIS) – QGIS plugin which adds a GeoJSON layer from SPARQL endpoint queries
- docs: https://sparqlunicorn.github.io/sparqlunicornGoesGIS/
- QGIS plugin page: https://plugins.qgis.org/plugins/sparqlunicorn/
- features:
- Querying geospatial vector layers from SPARQL endpoints
- Conversion of geoformats (GeoJSON, SHP, KML, GML, etc.) to geospatial RDF
- Conversion of RDF geodata (GeoSPARQL-formatted) from one coordinate reference system to another
- SHACL validation of geospatial RDF graphs including validation of geoliteral (WKT, GML) contents
- [SPARQLWrapper](https://github.com/RDFLib/sparqlwrapper) – A wrapper for a remote SPARQL endpoint
- docs: https://sparqlwrapper.readthedocs.io/en/latest/index.html
- features:
- Creating a query invocation
- Optionally converting the result into a more manageable format
- [WikidataIntegrator](https://github.com/SuLab/WikidataIntegrator) – Library for reading and writing to Wikidata/Wikibase
- features:
- high integration with the Wikidata SPARQL endpoint
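
Many entries above revolve around two steps that the SPARQLWrapper entry names explicitly: creating a query invocation and converting the result into a more manageable format. As a rough, standard-library-only sketch of what these steps involve (not the SPARQLWrapper API), the code below prepares a SPARQL Protocol GET request and flattens a SPARQL JSON result document into a list of dicts. The endpoint URL, the helper names and the sample response are assumptions; nothing is sent over the network.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://example.org/sparql"  # hypothetical endpoint


def build_request(query, endpoint=ENDPOINT):
    """Prepare (but do not send) a SPARQL Protocol GET request."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(result_json):
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    document = json.loads(result_json)
    variables = document["head"]["vars"]
    return [
        {var: (binding[var]["value"] if var in binding else None) for var in variables}
        for binding in document["results"]["bindings"]
    ]


# Hand-written sample response in the SPARQL JSON results layout.
SAMPLE_RESPONSE = """
{"head": {"vars": ["s", "label"]},
 "results": {"bindings": [
   {"s": {"type": "uri", "value": "http://example.org/a"},
    "label": {"type": "literal", "value": "A"}},
   {"s": {"type": "uri", "value": "http://example.org/b"}}
 ]}}
"""

request = build_request("SELECT ?s ?label WHERE { ?s ?p ?label } LIMIT 2")
rows = bindings_to_rows(SAMPLE_RESPONSE)
# Sending would be urllib.request.urlopen(request) against a real endpoint.
```

Unbound variables are mapped to `None` so that every row has the same keys, which makes the result easy to hand to tabular tools.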

## Probably Stalled or Outdated Projects

- [Athene](https://github.com/dityas/Athene) DL reasoner in pure python
- "[C]urrent version is a beta and only supports ALC. But it can easily be extended by adding tableau rules."
- Last update: 2017
- [cwm](https://en.wikipedia.org/wiki/Cwm_(software))
- Self description: "\[cwm is a\] forward chaining semantic reasoner that can be used for querying, checking, transforming and filtering information".
- Created in 2000 by Tim Berners-Lee and Dan Connolly, see [w3.org](https://www.w3.org/2000/10/swap/doc/cwm)
- [air-reasoner](https://github.com/mit-dig/air-reasoner)
- Self description: "Reasoner for the AIR policy language, based on cwm"
- based on cwm
- Last update: 2013
- [FuXi](https://pypi.org/project/FuXi/)
- Self description: "An OWL / N3-based in-memory, logic reasoning system for RDF"
- based on cwm
- Last update: 2013
- see also (hg-repo)
- [pysumo](https://github.com/pySUMO/pysumo)
- Ontology IDE for the Suggested Upper Merged Ontology (SUMO)
- Docs: https://pysumo.readthedocs.io/
- Last update: 2015
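
Several of the reasoners above are described as forward chaining. The general idea can be sketched in a few lines of plain Python: rules are applied to the known triples again and again until no new triple can be derived. This is a toy illustration with two hard-coded rules, not the algorithm of cwm or of any other listed project; the facts and function names are assumptions.

```python
# Facts are (subject, predicate, object) triples; names are illustrative only.
facts = {
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
    ("rex", "type", "Dog"),
}


def apply_rules(triples):
    """Return the triples that two hard-coded rules derive in a single pass."""
    derived = set()
    for (a, p1, b) in triples:
        for (c, p2, d) in triples:
            if b != c or p2 != "subClassOf":
                continue
            if p1 == "subClassOf":  # subClassOf is transitive
                derived.add((a, "subClassOf", d))
            elif p1 == "type":  # instances also belong to superclasses
                derived.add((a, "type", d))
    return derived


def forward_chain(triples):
    """Apply the rules repeatedly until no new triple appears (fixpoint)."""
    closure = set(triples)
    while True:
        new = apply_rules(closure) - closure
        if not new:
            return closure
        closure |= new


closure = forward_chain(facts)
```

Here the fixpoint adds three triples to the three given facts, among them `("rex", "type", "Animal")`, which needs two rule applications to derive.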

## Further Projects / Links

- [ontology](https://github.com/ozekik/awesome-ontology) – A curated list of ontology things (with some python-related entries)
- [awesome-semantic-web#python](https://github.com/semantalytics/awesome-semantic-web#python) Python section of awesome list for semantic-web-related projects
- [github-semantic-web-python](https://github.com/topics/semantic-web?l=python) – github project search with `topic=semantic-web` and `language=python`
- "Graph Thinking" – Talk by Paco Nathan ([@ceteri](https://github.com/ceteri)) PyData Global 2021; [slides](https://derwen.ai/s/kcgh#84), [video](https://www.youtube.com/watch?v=bqku2a7ScXg)
- [Hydra Ecosystem](https://github.com/HTTP-APIs) - Semantically Linked REST APIs
- docs: https://www.hydraecosystem.org/
- tutorials: the stack has three major layers ([server](https://github.com/HTTP-APIs/hydrus), [client](https://github.com/HTTP-APIs/hydra-python-agent), [GUI](https://github.com/HTTP-APIs/hydra-python-agent-gui)); each repo has its own README
- features:
- deploy a server automatically from API Documentation (JSON-LD and W3C Hydra)
- client automatically reads the documentation and provides access to endpoints
- GUI allows visualization of the network generated by the servers and external resources
- a [parser](https://github.com/HTTP-APIs/hydra-openapi-parser) for OpenAPI specs translation
- notes:
- under development, experimental
- part of Google Summer of Code
- [Pywikibot](https://github.com/wikimedia/pywikibot)
- Library to interact with Wikidata and Wikimedia API
- see also: https://www.wikidata.org/wiki/Wikidata:Creating_a_bot#Pywikibot
- [semantic](https://github.com/crm416/semantic) – Python library for extracting semantic information from text, such as dates and numbers
- [Solving Einstein Puzzle](https://github.com/cknoll/demo-material/blob/main/expertise_system/einstein-zebra-puzzle-owlready-solution1.ipynb) – jupyter notebook demonstrating how to use owlready2 to solve a logic puzzle
- [W3C-Link-List1](https://www.w3.org/2001/sw/wiki/SemanticWebTools#Python_Developers) – link list "SemanticWebTools", section "Python_Developers" (wiki page)
- might be outdated
- [W3C-Link-List2](https://www.w3.org/2001/sw/wiki/Python) – list of tools usable from, or with, Python (wiki page)
- [wikidata-mayors](https://github.com/njanakiev/wikidata-mayors)
- Python code to ask Wikidata for European mayors and where they were born
- Article: https://towardsdatascience.com/where-do-mayors-come-from-querying-wikidata-with-python-and-sparql-91f3c0af22e2
- [yamlpyowl](https://github.com/cknoll/yamlpyowl) – read a YAML-specified ontology into python by means of owlready2 (experimental)
- [Notebook, which generates quiz questions from wikidata](https://gist.github.com/ak314/fc6c6f911cb4f39453b575854cdc4869)
- [related presentation slides](https://www.slideshare.net/robertoturrin/how-to-turn-wikipedia-into-a-quiz-game)