{"id":41698218,"url":"https://github.com/grip-on-software/monetdb-import","last_synced_at":"2026-01-24T20:55:06.806Z","repository":{"id":171609876,"uuid":"648149323","full_name":"grip-on-software/monetdb-import","owner":"grip-on-software","description":"Importer of gathered data into a MonetDB database","archived":false,"fork":false,"pushed_at":"2024-07-13T15:38:05.000Z","size":6632,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-09-04T21:59:14.035Z","etag":null,"topics":["database-importer","monetdb"],"latest_commit_sha":null,"homepage":"https://gros.liacs.nl","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/grip-on-software.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-06-01T10:12:48.000Z","updated_at":"2024-07-13T15:33:27.000Z","dependencies_parsed_at":null,"dependency_job_id":"3693aa0d-ea21-4661-add9-8af974b02b3c","html_url":"https://github.com/grip-on-software/monetdb-import","commit_stats":null,"previous_names":["grip-on-software/monetdb-import"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/grip-on-software/monetdb-import","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grip-on-software%2Fmonetdb-import","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grip-on-software%2Fmonetdb-import/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grip-on-software%2Fmonetdb-import/r
eleases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grip-on-software%2Fmonetdb-import/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/grip-on-software","download_url":"https://codeload.github.com/grip-on-software/monetdb-import/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grip-on-software%2Fmonetdb-import/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28736791,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-24T19:23:36.361Z","status":"ssl_error","status_checked_at":"2026-01-24T19:23:28.966Z","response_time":89,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database-importer","monetdb"],"created_at":"2026-01-24T20:55:06.278Z","updated_at":"2026-01-24T20:55:06.795Z","avatar_url":"https://github.com/grip-on-software.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MonetDB import and management\n\n[![Build \nstatus](https://github.com/grip-on-software/monetdb-import/actions/workflows/monetdb-import-tests.yml/badge.svg)](https://github.com/grip-on-software/monetdb-import/actions/workflows/monetdb-import-tests.yml)\n[![Coverage 
\nStatus](https://coveralls.io/repos/github/grip-on-software/monetdb-import/badge.svg?branch=master)](https://coveralls.io/github/grip-on-software/monetdb-import?branch=master)\n[![Quality Gate\nStatus](https://sonarcloud.io/api/project_badges/measure?project=grip-on-software_monetdb-import\u0026metric=alert_status)](https://sonarcloud.io/project/overview?id=grip-on-software_monetdb-import)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.12583197.svg)](https://doi.org/10.5281/zenodo.12583197)\n\nThis repository contains a Java application which can interact with a MonetDB \ndatabase for a Grip on Software data collection in order to import JSON \nrepresentations of source information.\n\nThe repository also contains scripts that validate schemas, perform schema \nupgrades, allow imports of backups and exchange formats, as well as generate \nexports, partially via the \n[monetdb-dumper](https://github.com/grip-on-software/monetdb-dumper) \nrepository.\n\n## Importer application \n\n### Requirements, configuration and building\n\nThe importer application has been tested with Semeru OpenJDK 21. In order to \nbuild the application, we use Ant 1.10.1+ with the JDK (a package with \n`javac`). Make sure your `JAVA_HOME` environment variable points to the correct \nJDK directory if you have multiple possible installations.\n\nBefore building, ensure you have created a file at the path \n`Code/importerjson/nbproject/private/config.properties`, for example by copying \nthe `config.properties.example` file there and editing it, containing the \nfollowing properties:\n\n```\nimporter.url=jdbc:monetdb://MONETDB_HOST/gros\nimporter.user=MONETDB_USER\nimporter.password=MONETDB_PASSWORD\nimporter.relPath=export\nimporter.email_domain=EMAIL_DOMAIN\n```\n\nReplace the values to properly connect to the MonetDB database and define the \ninternal domain used at the organization. 
Take care to use a proper \nJDBC-MonetDB URL for the `importer.url` property, including the correct port to \nconnect to the database and the name of the database (`gros` by default). The \n`importer.relPath` property should be a relative path from the working \ndirectory when the application is run. These properties can be overridden \nduring execution using defines.\n\nNow run the following command in order to build the MonetDB import application: \n```\nant -buildfile Code/importerjson/build.xml \\\n    -propertyfile Code/importerjson/nbproject/private/config.properties\n```\n\nThe JAR is then made available in `Code/importerjson/dist/importerjson.jar`.\n\n### Running importer\n\nThe MonetDB importer works on JSON data files that contain representations of \ndata acquired from software development systems which have been processed by \na [data-gathering](https://github.com/grip-on-software/data-gathering) agent. \nJSON schemas for those files are also available in that repository. Typically, \na released version of the importer is compatible with the same version of the \ndata-gathering agent, with backward compatibility for the same major version.\n\nRun the application as follows:\n\n```\njava -Dimporter.log=LEVEL [...other defines] -jar \\\n    Code/importerjson/dist/importerjson.jar PROJECT TASKS\n```\n\nIn this command, replace `LEVEL` with an appropriate log level (for example \n`INFO`), the `PROJECT` with a project key to import for (must be a subdirectory \nof the relative path), and the `TASKS` with the import tasks (`all` by \ndefault). In case there is a complex selection of tasks required, then they may \nbe provided as a comma-separated list of tasks or group names, where a minus \nsign before a task or group excludes that task from operation again. Special \ntasks (often data corrections) are not performed by default and should be added \nto the list if they are to be performed. 
If the project argument is `--`, then \nonly special tasks may be performed and they are done as organization-wide \nchanges (if not already). The project argument can also be `--files`, in which \ncase a list of files involved for the selected tasks is printed instead of \nperforming them. The import tasks and configuration aspects are also described \nwhen `--help` is provided for the project argument, which then also exits out.\n\nYou may possibly add other defines at the start of the command, including \nreplacement values for the properties defined in the `config.properties` \nproperty file included during the build (`importer.url`, `importer.user`, \n`importer.password`, `importer.relPath` and `importer.email_domain`), as well \nas the following:\n\n- `importer.update`: A space-separated list of update tracker files to import \n  for the `update` task, such that subsequent data gathering can continue from \n  this state and thus support incremental collection and import. Update \n  trackers are also used by some GROS visualizations to determine source age.\n- `importer.encrypt_tables`: A comma-separated list of table names to perform \n  encryption of personally identifying information on for the `encrypt` task. 
\n  By default, this task encrypts project-specific tables with developer \n  information with the project encryption key if a project is selected or \n  organization-common tables with the global encryption key if no project is \n  selected (with `--` as `PROJECT` argument).\n\n### Testing\n\nTests can be performed during the build using:\n\n```\nant -buildfile Code/importerjson/build.xml \\\n    -propertyfile Code/importerjson/nbproject/private/config.properties test\n```\n\nNote that one test is an integration test, which requires a few things to be \nset up beforehand, otherwise it will detect and skip the test:\n\n- Import data files should be placed in directories \n  `Code/importerjson/export/TEST1` through `Code/importerjson/export/TEST10`, \n  as well as `Code/importerjson/data_vcsdev_to_dev.json`.\n- A MonetDB database instance should be running on `localhost` on the default \n  port (`50000`) and a database with the name `gros_test` should be created and \n  pre-filled with the database schema.\n\nThese two steps can be simplified by [running scripts](#running-scripts), \nrespectively the `generate_test_files.py` and `recreate_database.py` scripts. \nNote that the check whether a database is available may take a long time before \nskipping, as this depends on database pooling options which try to reconnect to \nthe database several times until it gives up.\n\nTest output should indicate the successful, failed and skipped tests. Once the \ntest is complete, test result and coverage information is made available in \n`Code/importerjson/build/test`, with JUnit XML files in `junit/junit.xml` in \nthat directory and JaCoCo coverage XML in `jacoco.xml` and HTML reports in \n`jacoco/index.html`.\n\n[GitHub Actions](https://github.com/grip-on-software/monetdb-import/actions) is \nused to run the unit tests and report on coverage on commits and pull requests. 
\nThis includes quality gate scans tracked by \n[SonarCloud](https://sonarcloud.io/project/overview?id=grip-on-software_monetdb-import) \nand [Coveralls](https://coveralls.io/github/grip-on-software/monetdb-import) \nfor coverage history.\n\n## Management scripts\n\n### Requirements\n\nThe scripts can be run using Python 3.7+ and Bash. The Python installation \nshould have Pip or virtualenv installed so that dependencies can be installed, \nfor example using `pip install -r Scripts/requirements.txt`. If there are not \nenough permissions to install the dependencies on the system, then you can add \n`--user` in the command, after `pip install`. Otherwise, you can create \na `virtualenv ENV`, activate it with `source ENV/bin/activate` and install the \ndependencies there.\n\n### Configuration\n\nCopy the `Scripts/settings.cfg.example` to `Scripts/settings.cfg`. From this \npoint on, we assume you are in the `Scripts` working directory, otherwise place \nthe `settings.cfg` in another working directory and adjust any further paths to \nthe scripts.\n\nIn the `settings.cfg` file, replace the variables in the configuration items of \nthe groups in order to properly configure the environment of the scripts:\n\n- `monetdb`: MonetDB database connection settings\n  - `hostname` (`$MONETDB_HOSTNAME`): Domain name of the database host.\n  - `passphrase` (`$MONETDB_PASSPHRASE`): Passphrase for administrative remote \n    control (not the same as the client password).\n  - `username` (`$MONETDB_USERNAME`): Username that has authorization to \n    create and alter tables.\n  - `password` (`$MONETDB_PASSWORD`): Password of the user that can create and \n    alter tables.\n  - `database` (`$MONETDB_DATABASE`): Database name that can be (re)created or \n    have its schema altered.\n- `jenkins`: Jenkins connection settings\n  - `host` (`$JENKINS_HOST`): Base URL of the Jenkins instance.\n  - `job` (`$JENKINS_JOB`): Jenkins job that has a workspace that can be \n    deleted in order 
to clean up any leftover tracking data.\n  - `username` (`$JENKINS_USERNAME`): Username to log in to Jenkins.\n  - `token` (`$JENKINS_TOKEN`): Password or API token to log in to Jenkins.\n  - `crumb`: Whether to request a CSRF crumb before performing other API \n    requests. This should be \"yes\", as Jenkins instances that did not support \n    (or require) this are ancient.\n- `schema`: Database table schema validation\n  - `url` (`$SCHEMA_URL`): URL to an external MediaWiki or JSON resource that \n    documents the table schema.\n  - `path`: The path to an SQL file that can be used to create the table schema \n    in an empty database.\n  - `verify` (`$SCHEMA_VERIFY`): Whether to verify the SSL certificate when \n    obtaining the schema documentation from an external URL. If this is set to \n    a file path, then that file is used to verify the certificate with.\n  - `username` (`$SCHEMA_USERNAME`): Username to use for Basic authorization \n    when obtaining the schema documentation from an external URL realm.\n  - `password` (`$SCHEMA_PASSWORD`): Password to use for Basic authorization \n    when obtaining the schema documentation from an external URL realm.\n\nSome configuration can be adjusted through command line arguments in the \nscripts (and some scripts do not use the configuration file).\n\n### Running scripts\n\nThe following scripts are available to manage the database:\n\n- `dump_tables.sh`: Perform a (partial) database dump using compressed SQL/CSV \n  files plus the schema, placed into a timestamped output directory by default \n  (uses the `monetdb-dumper` application)\n- `import_tables.sh`: Perform an import of a database dump (assumes an empty \n  database as this also creates the schema provided with the dump)\n- `recreate_database.py`: Destroy the database and create it again, usually \n  with the current schema, possibly wiping out a Jenkins workspace as well\n- `update_database.py`: Perform schema upgrades on an existing database.\n- 
`validate_schema.py`: Compare a documentation resource against the database \n  table schema file in order to check for validation errors and differences.\n\nUse the `--help` argument for the scripts to receive more details on running \nthe scripts and their arguments.\n\nAdditionally, the file `generate_test_files.py` is usable for setting up the \nintegration tests for the importer application by creating generated files \nbased on JSON Schemas of the data imports, which requires a clone of the \n[data-gathering](https://github.com/grip-on-software/data-gathering) repository \nto be available.\n\nThe script `workbench_group.py` only works within the MySQL Workbench Scripting \nShell and is meant to alter the model file for entity-relationship diagrams.\n\n### Schema documentation\n\nWithin the `Scripts` directory, several versions of documentation of the \ndatabase schema can be found:\n\n- `Database_structure.md`: Exhaustive documentation on tables, keys, attributes \n  and references, including what each means and in which cases columns can be \n  `NULL` or other specific values.\n- `Sensitive_data.md`: Additional documentation on what (future) steps can be \n  taken for specific fields stored within the database to keep \n  project-sensitive data and personal data secure.\n- `create-tables.sql`: The actual schema for MonetDB in `CREATE TABLE` SQL \n  statements.\n- `database-model.mwb`: A MySQL Workbench file containing a converted version \n  of the schema.\n\nSome of these files are used by the scripts in order to perform validation or \nconversion to other formats, such as JSON.\n\n## License\n\nThe MonetDB importer is licensed under the Apache 2.0 License. 
Dependency \nlibraries are included in object form (some libraries are only used in tests) \nand have the following licenses:\n\n- CopyLibs: Part of NetBeans, distributed under Apache 2.0 License\n- [ahocorasick](https://github.com/robert-bor/aho-corasick): Apache 2.0 License\n- [c3p0](https://github.com/swaldman/c3p0): LGPL v2.1 (or any later version) or \n  EPL v1.0\n- [joda-time](https://github.com/JodaOrg/joda-time): Apache 2.0 License\n- [json-simple](https://github.com/fangyidong/json-simple): Apache 2.0 License\n- [mchange-commons-java](https://github.com/swaldman/mchange-commons-java): \n  LGPL v2.1 (or any later version) or EPL v1.0\n- [monetdb-jdbc](https://github.com/MonetDB/monetdb-java): MPL v2.0, available \n  from [MonetDB Java Download Area](https://www.monetdb.org/downloads/Java/)\n\nTest libraries:\n\n- [hamcrest-core](https://github.com/hamcrest/JavaHamcrest): BSD License, see \n  [LICENSE.txt](Code/importerjson/lib/hamcrest/LICENSE.txt)\n- [jacoco](https://github.com/jacoco/jacoco) (agent and ant task): EPL v2.0\n- [junit4](https://github.com/junit-team/junit4): EPL v1.0\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgrip-on-software%2Fmonetdb-import","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgrip-on-software%2Fmonetdb-import","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgrip-on-software%2Fmonetdb-import/lists"}