Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dlr-sc/gitlab2prov
๐๏ธ Extract provenance information (W3C PROV) from GitLab projects.
https://github.com/dlr-sc/gitlab2prov
extract-provenance-information git gitlab graphs knowledge-graph prov-generation provenance python software-analytics w3c-prov
Last synced: 3 months ago
JSON representation
๐๏ธ Extract provenance information (W3C PROV) from GitLab projects.
- Host: GitHub
- URL: https://github.com/dlr-sc/gitlab2prov
- Owner: DLR-SC
- License: mit
- Created: 2019-10-14T12:50:04.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-09-05T14:28:30.000Z (over 1 year ago)
- Last Synced: 2024-10-08T14:40:32.659Z (4 months ago)
- Topics: extract-provenance-information, git, gitlab, graphs, knowledge-graph, prov-generation, provenance, python, software-analytics, w3c-prov
- Language: Python
- Homepage:
- Size: 2.61 MB
- Stars: 16
- Watchers: 9
- Forks: 3
- Open Issues: 8
-
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Citation: CITATION.cff
Awesome Lists containing this project
README
Welcome to
gitlab2prov
! ๐
> `gitlab2prov` is a Python library and command line tool that extracts provenance information from GitLab projects.
---
The `gitlab2prov` data model has been designed according to [W3C PROV](https://www.w3.org/TR/prov-overview/) specification.
The model documentation can be found [here](https://github.com/DLR-SC/gitlab2prov/tree/master/docs).## ๏ธ๐๏ธ ๏ธInstallation
Please note that this tool requires Git to be installed on your machine.
Clone the project and install using `pip`:
```bash
pip install .
```Or install the latest release from [PyPi](https://pypi.org/project/gitlab2prov/):
```bash
pip install gitlab2prov
```To install `gitlab2prov` with all extra dependencies require the `[dev]` extras:
```bash
pip install .[dev] # clone repo, install with extras
pip install gitlab2prov[dev] # PyPi, install with extras
```## โก Getting started
`gitlab2prov` needs a [personal access token](https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html) to clone git repositories and to authenticate with the GitLab API.
Follow [this guide](./docs/guides/tokens.md) to create an access token with the required [scopes](https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html#personal-access-token-scopes).## ๐โ Usage
`gitlab2prov` can be configured using the command line interface or by providing a configuration file in `.yaml` format.
### Command Line Usage
The command line interface consists of commands that can be chained together like a unix pipeline.```
Usage: gitlab2prov [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...Extract provenance information from GitLab projects.
Options:
--version Show the version and exit.
--verbose Enable logging to 'gitlab2prov.log'.
--config FILE Read config from file.
--validate FILE Validate config file and exit.
--help Show this message and exit.Commands:
combine Combine multiple graphs into one.
extract Extract provenance information for one or more...
load Load provenance files.
merge-duplicated-agents Merge duplicated agents based on a name to...
pseudonymize Pseudonymize a provenance graph.
save Save provenance information to a file.
stats Print statistics such as node counts and...
```### Configuration Files
`gitlab2prov` supports configuration files in `.yaml` format that are functionally equivalent to command line invocations.To read configuration details from a file instead of specifying on the command line, use the `--config` option:
```ini
# initiate a run using a config file
gitlab2prov --config config/example.yaml
```
You can validate your config file using the provided JSON-Schema `gitlab2prov/config/schema.json` that comes packaged with every installation:
```ini
# check config file for syntactical errors
gitlab2prov --validate config/example.yaml
```Config file example:
```yaml
- extract:
url: ["https://gitlab.com/example/foo"]
token: tokenA
- extract:
url: ["https://gitlab.com/example/bar"]
token: tokenB
- load:
input: [example.rdf]
- pseudonymize:
- combine:
- save:
output: combined
format: [json, rdf, xml, dot]
- stats:
fine: true
explain: true
formatter: table
```The config file example is functionally equivalent to this command line invocation:
```
gitlab2prov extract -u https://gitlab.com/example/foo -t tokenFoo \
extract -u https://gitlab.com/example/bar -t tokenBar \
load -i example.rdf \
pseudonymize \
combine \
save -o combined -f json -f rdf -f xml -f dot \
stats --fine --explain --formatter table
```### ๐จ Provenance Output Formats
`gitlab2prov` supports output formats that the [`prov`](https://github.com/trungdong/prov) library provides:
* [PROV-N](http://www.w3.org/TR/prov-n/)
* [PROV-O](http://www.w3.org/TR/prov-o/) (RDF)
* [PROV-XML](http://www.w3.org/TR/prov-xml/)
* [PROV-JSON](http://www.w3.org/Submission/prov-json/)
* [Graphviz](https://graphviz.org/) (DOT)## ๐ค Contributing
Contributions and pull requests are welcome!
For major changes, please open an issue first to discuss what you would like to change.## โจ How to cite
If you use GitLab2PROV in a scientific publication, we would appreciate citations to the following paper:
* Schreiber, A., de Boer, C. and von Kurnatowski, L. (2021). [GitLab2PROVโProvenance of Software Projects hosted on GitLab](https://www.usenix.org/conference/tapp2021/presentation/schreiber). 13th International Workshop on Theory and Practice of Provenance (TaPP 2021), USENIX Association
Bibtex entry:
```BibTeX
@InProceedings{SchreiberBoerKurnatowski2021,
author = {Andreas Schreiber and Claas de~Boer and Lynn von~Kurnatowski},
booktitle = {13th International Workshop on Theory and Practice of Provenance (TaPP 2021)},
title = {{GitLab2PROV}{\textemdash}Provenance of Software Projects hosted on GitLab},
year = {2021},
month = jul,
publisher = {{USENIX} Association},
url = {https://www.usenix.org/conference/tapp2021/presentation/schreiber},
}
```You can also cite specific releases published on Zenodo: [![DOI](https://zenodo.org/badge/215042878.svg)](https://zenodo.org/badge/latestdoi/215042878)
## โ๏ธ References
**Influencial Software for `gitlab2prov`**
* Martin Stoffers: "Gitlab2Graph", v1.0.0, October 13. 2019, [GitHub Link](https://github.com/DLR-SC/Gitlab2Graph), DOI 10.5281/zenodo.3469385* Quentin Pradet: "How do you rate limit calls with aiohttp?", [GitHub Gist](https://gist.github.com/pquentin/5d8f5408cdad73e589d85ba509091741), MIT LICENSE
**Influencial Papers for `gitlab2prov`**:
* De Nies, T., Magliacane, S., Verborgh, R., Coppens, S., Groth, P., Mannens, E., and Van de Walle, R. (2013). [Git2PROV: Exposing Version Control System Content as W3C PROV](https://dl.acm.org/doi/abs/10.5555/2874399.2874431). In *Poster and Demo Proceedings of the 12th International Semantic Web Conference* (Vol. 1035, pp. 125โ128).
* Packer, H. S., Chapman, A., and Carr, L. (2019). [GitHub2PROV: provenance for supporting software project management](https://dl.acm.org/doi/10.5555/3359032.3359039). In *11th International Workshop on Theory and Practice of Provenance (TaPP 2019)*.
**Papers that refer to `gitlab2prov`**:
* Andreas Schreiber, Claas de Boer (2020). [Modelling Knowledge about Software Processes using Provenance Graphs and its Application to Git-based VersionControl Systems](https://dl.acm.org/doi/10.1145/3387940.3392220). In *ICSEW'20: Proceedings of the IEEE/ACM 42nd Conference on Software Engineering Workshops* (pp. 358โ359).
* Tim Sonnekalb, Thomas S. Heinze, Lynn von Kurnatowski, Andreas Schreiber, Jesus M. Gonzalez-Barahona, and Heather Packer (2020). [Towards automated, provenance-driven security audit for git-based repositories: applied to germany's corona-warn-app: vision paper](https://doi.org/10.1145/3416507.3423190). In *Proceedings of the 3rd ACM SIGSOFT International Workshop on Software Security from Design to Deployment* (pp. 15โ18).
* Andreas Schreiber (2020). [Visualization of contributions to open-source projects](https://doi.org/10.1145/3430036.3430057). In *Proceedings of the 13th International Symposium on Visual Information Communication and Interaction*. ACM, USA.
## ๐ Dependencies
`gitlab2prov` depends on several open source packages that are made freely available under their respective licenses.| Package | License |
| --------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------- |
| [GitPython](https://github.com/gitpython-developers/GitPython) | [![License](https://img.shields.io/badge/License-BSD_3--Clause-orange.svg)](https://opensource.org/licenses/BSD-3-Clause) |
| [click](https://github.com/pallets/click) | [![License](https://img.shields.io/badge/License-BSD_3--Clause-orange.svg)](https://opensource.org/licenses/BSD-3-Clause) |
| [python-gitlab](https://github.com/python-gitlab/python-gitlab) | [![License: LGPL v3](https://img.shields.io/badge/License-LGPL_v3-blue.svg)](https://www.gnu.org/licenses/lgpl-3.0) |
| [prov](https://pypi.org/project/prov/) | [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) |
| [jsonschema](https://github.com/python-jsonschema/jsonschema) | [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) |
| [ruamel.yaml](https://pypi.org/project/ruamel.yaml/) | [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) |
| [pydot](https://github.com/pydot/pydot) | [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) |## ๐ License
This project is [MIT](https://github.com/dlr-sc/gitlab2prov/blob/master/LICENSE) licensed.
Copyright ยฉ 2019 German Aerospace Center (DLR) and individual contributors.