{"id":20319725,"url":"https://github.com/imlegend19/gras","last_synced_at":"2025-04-11T18:20:32.329Z","repository":{"id":40959621,"uuid":"258821596","full_name":"imlegend19/GRAS","owner":"imlegend19","description":"Git Repositories Archiving Service","archived":false,"fork":false,"pushed_at":"2022-12-08T10:56:02.000Z","size":13106,"stargazers_count":8,"open_issues_count":6,"forks_count":0,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-25T14:12:11.462Z","etag":null,"topics":["abstract-syntax-tree","antlr4","data-mining","file-dependency","git","github-api","gras","java","javascipt","javascript","mailing-list","mining-software-repositories","neo4j","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/imlegend19.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-04-25T16:25:58.000Z","updated_at":"2023-12-05T15:25:40.000Z","dependencies_parsed_at":"2023-01-25T10:45:51.752Z","dependency_job_id":null,"html_url":"https://github.com/imlegend19/GRAS","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imlegend19%2FGRAS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imlegend19%2FGRAS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imlegend19%2FGRAS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imlegend19%2FGRAS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/imlegend19","download_url":"h
ttps://codeload.github.com/imlegend19/GRAS/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248456382,"owners_count":21106605,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["abstract-syntax-tree","antlr4","data-mining","file-dependency","git","github-api","gras","java","javascipt","javascript","mailing-list","mining-software-repositories","neo4j","python"],"created_at":"2024-11-14T18:47:47.249Z","updated_at":"2025-04-11T18:20:32.277Z","avatar_url":"https://github.com/imlegend19.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Python 3.6](https://img.shields.io/badge/python-3.6-blue.svg)](https://www.python.org/downloads/release/python-360/) [![License](https://img.shields.io/badge/License-BSD%203--Clause-orange.svg)](https://opensource.org/licenses/BSD-3-Clause)\n\n# GRAS (Git Repositories Archiving Service)\n\nGit Repositories Archiving Service (GRAS) is a tool designed to thoroughly mine software repositories. Software repository data has become the foundation of many empirical software engineering research projects, such as sentiment mining and developer social networks. However, mining and collecting this data is challenging because of the density and connectedness of its components. GRAS was built to provide a cleaner, more coherent organization of mined data and to optimize the mining process. It can mine a repository from various version control systems and store the data in several database types (SQLite, MySQL, PostgreSQL, and Neo4j) in a harmonized way. 
GRAS also has a built-in file dependency analyzer (currently, only Java and Python are supported) that parses the entire project and creates a heterogeneous graph representing the file relationships and their dependencies on each other. The file dependency graph is stored in a Neo4j database. We are currently working on the Identity Merging Component. \n\n## Installation\n\n1. Clone this repository: `$ git clone https://github.com/imlegend19/GRAS.git`\n2. `$ cd GRAS`\n3. Create a virtual environment: \n   ```\n   $ python3 -m venv venv\n   $ source venv/bin/activate\n   ``` \n4. Install the requirements: `$ pip install -r requirements.txt`\n   - *Note:* If you get an error while installing mysqlclient, try running the following:\n      ```\n      sudo apt-get install python3-dev libmysqlclient-dev\n      ```\n   - *Note:* If you get an `error building wheel` error while installing `pycparser` or `neo4j`, try running this:\n      ```\n      pip install wheel\n      ```\n5. Bingo! The setup is now complete.\n\n## Usage\n\n```\nusage: python3 main.py [-g [GENERATE]] [-m [MINE]] [-s [STATS]] [-B [BASIC]]\n               [-BE [BASIC_EXTRA]] [-IT [ISSUE_TRACKER]] [-CD [COMMIT]]\n               [-PT [PULL_TRACKER]] [-CS CHUNK_SIZE] [-f [FULL]] [--path PATH]\n               [--aio [AIO]] [-t TOKENS [TOKENS ...]] [-yk YANDEX_KEY]\n               [-i {github,git,identity-merging,java-cda,java-miner}]\n               [-RO REPO_OWNER] [-RN REPO_NAME] [-SD START_DATE]\n               [-ED END_DATE] [-c CONFIG]\n               [-dbms {sqlite,mysql,postgresql,neo4j}] [-DB DB_NAME]\n               [-U DB_USERNAME] [-P [DB_PASSWORD]] [-H DB_HOST] [-p DB_PORT]\n               [-dbo DB_OUTPUT] [-dbl [DB_LOG]] [-h]\n               [-a {arrow,eclipse,dots_1,dots_2,birds,dash,cycle,rod,bar,balloon}]\n               [-OP {1,2,3}] [-CL [CLEAR_LOGS]] [-o OUTPUT]\n\nGRAS - GIT REPOSITORIES ARCHIVING SERVICE\n\nGRAS-COMMANDS:\n  -g [GENERATE], --generate [GENERATE]\n                        Generate 
a config file template\n  -m [MINE], --mine [MINE]\n                        Mine the repository\n  -s [STATS], --stats [STATS]\n                        View the stats of the repository\n  -B [BASIC], --basic [BASIC]\n                        Mining Stage 1-A: Basic\n  -BE [BASIC_EXTRA], --basic-extra [BASIC_EXTRA]\n                        Mining Stage 1-B: Basic Extra\n  -IT [ISSUE_TRACKER], --issue-tracker [ISSUE_TRACKER]\n                        Mining Stage 2: Issue Tracker\n  -CD [COMMIT], --commit [COMMIT]\n                        Mining Stage 3: Commit Data\n  -PT [PULL_TRACKER], --pull-tracker [PULL_TRACKER]\n                        Mining Stage 4: Pull Request Tracker\n  -CS CHUNK_SIZE, --chunk-size CHUNK_SIZE\n                        Time Period Chunk Size (in Days)\n  -f [FULL], --full [FULL]\n                        Mine the complete repository\n  --path PATH           Path to the directory to mine\n  --aio [AIO]           If added, git-miner would use asyncio architecture\n\nGRAS-SETTINGS:\n  -t TOKENS [TOKENS ...], --tokens TOKENS [TOKENS ...]\n                        List of Personal API Access Tokens for parsing\n  -yk YANDEX_KEY, --yandex-key YANDEX_KEY\n                        Yandex Translator API Key\n                        (https://translate.yandex.com/developers/keys)\n  -i {github,git,identity-merging,java-cda,java-miner}, --interface {github,git,identity-merging,java-cda,java-miner}\n                        Interface of choice\n  -RO REPO_OWNER, --repo-owner REPO_OWNER\n                        Owner of the repository\n  -RN REPO_NAME, --repo-name REPO_NAME\n                        Name of the repository\n  -SD START_DATE, --start-date START_DATE\n                        Start Date for mining the data (in any ISO 8601\n                        format, e.g., 'YYYY-MM-DD HH:mm:SS +|-HH:MM')\n  -ED END_DATE, --end-date END_DATE\n                        End Date for mining the data (in any ISO 8601 format,\n                        e.g., 'YYYY-MM-DD 
HH:mm:SS +|-HH:MM')\n  -c CONFIG, --config CONFIG\n                        Path to the config file\n\nDATABASE-SETTINGS:\n  -dbms {sqlite,mysql,postgresql,neo4j}\n                        DBMS to dump the data into\n  -DB DB_NAME, --db-name DB_NAME\n                        Name of the database\n  -U DB_USERNAME, --db-username DB_USERNAME\n                        The user name that is used to connect and operate the\n                        selected database\n  -P [DB_PASSWORD], --db-password [DB_PASSWORD]\n                        The password for the user name entered\n  -H DB_HOST, --db-host DB_HOST\n                        The database server IP address or DNS name\n  -p DB_PORT, --db-port DB_PORT\n                        The database server db_port that allows communication\n                        to your database\n  -dbo DB_OUTPUT, --db-output DB_OUTPUT\n                        The path to the .db file in case of sqlite dbms\n  -dbl [DB_LOG], --db-log [DB_LOG]\n                        DB-log flag to log the generated SQL produced\n\nOTHER:\n  -h, --help            show this help message and exit\n  -a {arrow,eclipse,dots_1,dots_2,birds,dash,cycle,rod,bar,balloon}, --animator {arrow,eclipse,dots_1,dots_2,birds,dash,cycle,rod,bar,balloon}\n                        Loading animator\n  -OP {1,2,3}, --operation {1,2,3}\n                        Choose the operation to perform for retrieving the\n                        stats.: 1. CREATE, 2. UPDATE, 3. APPEND\n  -CL [CLEAR_LOGS], --clear-logs [CLEAR_LOGS]\n                        Clear the logs directory\n  -o OUTPUT, --output OUTPUT\n                        The output path where the config file is to be\n                        generated\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fimlegend19%2Fgras","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fimlegend19%2Fgras","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fimlegend19%2Fgras/lists"}