https://github.com/tubaf-ifi-dipit/github2pandas_manager
Aggregation of github activities on multiple repositories based on github2pandas
- Host: GitHub
- URL: https://github.com/tubaf-ifi-dipit/github2pandas_manager
- Owner: TUBAF-IFI-DiPiT
- License: bsd-2-clause
- Created: 2021-08-22T14:47:37.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2022-06-13T07:15:59.000Z (almost 4 years ago)
- Last Synced: 2024-10-28T23:06:55.336Z (over 1 year ago)
- Topics: git-miner, git-mining-tool, github, learning-analytics, python
- Language: Python
- Homepage:
- Size: 191 KB
- Stars: 1
- Watchers: 2
- Forks: 3
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# github2pandas_manager Introduction
`github2pandas_manager` coordinates data aggregation across multiple GitHub repositories. The user selects repositories by name, name pattern, organization or an individual query, and specifies which data to collect (versions, releases, pull requests, etc.). To do this, `github2pandas_manager` reads a YAML configuration file, collects the referenced repositories and provides the requested information as pandas data frames or CSV files.
See the documentation of [github2pandas](https://github.com/TUBAF-IFI-DiPiT/github2pandas) to become familiar with the individual aggregation classes.
## Application example
https://user-images.githubusercontent.com/10922356/144754607-fcf170eb-a632-4dbe-875c-fb73e0689928.mp4
## Concept

## Installation
`github2pandas-manager` is available on [PyPI](https://pypi.org/project/github2pandas-manager/). Use pip to install the package.
### Global
On Linux (use whichever command matches your pip setup):
```
sudo pip3 install github2pandas-manager
sudo pip install github2pandas-manager
```
On Windows, as administrator or for a single user:
```
pip install github2pandas-manager
pip install --user github2pandas-manager
```
### In a virtual environment
```
pipenv install github2pandas-manager
```
In addition, a GitHub token is required for authentication. The [GitHub documentation](https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token) describes how to generate one for your account. Add your token to a hidden `.env` file; an example is given in `.env.example`.
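As a rough illustration of the `.env` mechanism, the following sketch parses `KEY=VALUE` lines with the standard library only. The helper name `read_env_file` is hypothetical, and the sketch does not know which variable name the project expects; check `.env.example` for that.

```python
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values
```

Keeping the token in `.env` rather than in the YAML configuration means the configuration files can be shared or committed without exposing credentials.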
## Run examples
The [examples](https://github.com/TUBAF-IFI-DiPiT/github2pandas_manager/tree/main/examples) folder contains four types of query configurations for different purposes:
| Focus | Keywords | Example |
| -------| -----------| ----- |
| Repo names | List all relevant repositories by username and repository name - `repo_names` | [ProjectsByRepoNames.yml](https://github.com/TUBAF-IFI-DiPiT/github2pandas_manager/blob/main/examples/ProjectsByRepoNames.yml) |
| Repo name patterns | Describe relevant repositories by white- and black-patterns - `repo_white_pattern`, `repo_black_pattern` | [ProjectsByRepoNamePatterns.yml](https://github.com/TUBAF-IFI-DiPiT/github2pandas_manager/blob/main/examples/ProjectsByRepoNamePatterns.yml)|
| Repos by organizations | Select all repositories of an organization account - `organization_names` | [ProjectsByOrganizations.yml](https://github.com/TUBAF-IFI-DiPiT/github2pandas_manager/blob/main/examples/ProjectsByOrganizations.yml) |
| Repos by a set of query parameters | Select all repositories according to programming language, stars etc. - `language`, `start_date`, `end_date`, `star_filter` | [ProjectsByQuery.yml](https://github.com/TUBAF-IFI-DiPiT/github2pandas_manager/blob/main/examples/ProjectsByQuery.yml) |
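
To make the white- and black-pattern idea concrete, here is a minimal sketch of one possible selection rule. It assumes the patterns are regular expressions and that a repository is kept when it matches at least one white pattern and no black pattern; how `github2pandas_manager` actually evaluates `repo_white_pattern` and `repo_black_pattern` may differ. The function `filter_repos` and the repository names are hypothetical.

```python
import re


def filter_repos(names, white_patterns, black_patterns=()):
    """Keep names matching any white pattern and no black pattern."""
    selected = []
    for name in names:
        whitelisted = any(re.search(p, name) for p in white_patterns)
        blacklisted = any(re.search(p, name) for p in black_patterns)
        if whitelisted and not blacklisted:
            selected.append(name)
    return selected


# Hypothetical repository names, for illustration only.
repos = ["course-2021-exercise1", "course-2021-solution", "misc-notes"]
print(filter_repos(repos, [r"^course-2021"], [r"solution"]))
# prints ['course-2021-exercise1']
```
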
To run one of the examples, execute:
```
pipenv run python -m github2pandas_manager -path ./examples/ProjectsByQuery.yml
```
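If several configurations should be processed in one go, the command above can be wrapped in a small script. This is a sketch only: `build_command` and `run_all` are hypothetical helpers, and the sketch assumes `pipenv` is on the path and that every `.yml` file in the folder is a valid configuration.

```python
import subprocess
from pathlib import Path


def build_command(config_path):
    """Return the command line shown above for one configuration file."""
    return [
        "pipenv", "run", "python", "-m", "github2pandas_manager",
        "-path", str(config_path),
    ]


def run_all(folder="./examples"):
    """Run the manager once per .yml file found in the folder."""
    for config in sorted(Path(folder).glob("*.yml")):
        # check=True raises CalledProcessError if a run fails.
        subprocess.run(build_command(config), check=True)
```

Calling `run_all()` from the repository root would then process each example configuration in alphabetical order.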
## YAML-Configuration schema
In addition to the selection parameters listed above, each configuration defines three further keys: `project_name`, `project_folder` and `content`.
The first two determine the folder structure that holds the data, while `content` lists the repository data to be aggregated:
+ `Repository`
+ `Issues`
+ `Version`
+ `PullRequests`
+ `Workflows`
+ `GitReleases`
An overview of the information contained in each data frame can be found in the [wiki of the github2pandas](https://github.com/TUBAF-IFI-DiPiT/github2pandas/wiki) project.
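
The schema described above can be checked before a long aggregation run. The sketch below works on a configuration that has already been parsed into a Python dictionary; it assumes that `content` is a collection of the names listed above. `check_config` is a hypothetical helper and not part of the package.

```python
REQUIRED_KEYS = {"project_name", "project_folder", "content"}
KNOWN_CONTENT = {
    "Repository", "Issues", "Version",
    "PullRequests", "Workflows", "GitReleases",
}


def check_config(config):
    """Return a list of problems found in an already-parsed configuration."""
    problems = []
    for key in sorted(REQUIRED_KEYS - set(config)):
        problems.append(f"missing key: {key}")
    # Assumption: "content" holds the names of the data sets to aggregate.
    for item in config.get("content", []):
        if item not in KNOWN_CONTENT:
            problems.append(f"unknown content entry: {item}")
    return problems
```

An empty list means the three common keys are present and every `content` entry is one of the six documented names; the selection keywords such as `repo_names` are not checked here.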