https://github.com/diverse-project/maven-miner
This project mines Maven Central and creates a global dependency graph.
- Host: GitHub
- URL: https://github.com/diverse-project/maven-miner
- Owner: diverse-project
- Created: 2018-07-25T14:53:07.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2022-09-01T23:03:44.000Z (almost 2 years ago)
- Last Synced: 2024-01-08T11:14:59.214Z (6 months ago)
- Language: Java
- Homepage:
- Size: 2.02 MB
- Stars: 30
- Watchers: 14
- Forks: 9
- Open Issues: 14
Metadata Files:
- Readme: README.md
Lists
- awesome-msr - Maven-miner - Java tools and infrastructure to resolve the whole Maven dependency graph, hosted in Maven Central, in the form of a [Neo4j](https://neo4j.com/) Graph. (Tools)
README
# Maven-Miner
Maven-Miner is a set of Java tools that programmatically resolve all Maven dependencies hosted in the Maven Central repository and store them in a graph database. First, the maven-indexer tool resolves the Maven Central index and transforms it into a flat file listing all the hosted artifacts. Note that this tool is inspired by the [aether-examples](https://github.com/eclipse/aether-demo) project. This file can then be passed to the maven-miner tool, which collects dependency requests for each artifact listed in the file. Each artifact is then visited and persisted in a graph database. We rely on Neo4j, a well-known graph database, to persist the Maven dependency graph.
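To make the target data model concrete, here is a minimal in-memory sketch of a dependency graph keyed by `groupId:artifactId:version` strings. It is an illustration only: the class `DependencyGraphSketch` and its methods are hypothetical names, plain Java collections stand in for Neo4j, and nothing here reflects how maven-miner actually resolves or persists artifacts.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for the dependency graph (the real tool uses Neo4j). */
final class DependencyGraphSketch {
    // Maps each artifact (groupId:artifactId:version) to the artifacts it depends on.
    private final Map<String, Set<String>> edges = new LinkedHashMap<>();

    /** Registers an artifact as a node, even if it has no dependencies. */
    void addArtifact(String coordinates) {
        edges.computeIfAbsent(coordinates, k -> new LinkedHashSet<>());
    }

    /** Records a directed edge "from depends on to", creating both nodes if needed. */
    void addDependency(String from, String to) {
        addArtifact(from);
        addArtifact(to);
        edges.get(from).add(to);
    }

    /** Returns the direct dependencies of an artifact, or an empty set if it is unknown. */
    Set<String> dependenciesOf(String coordinates) {
        return Collections.unmodifiableSet(
                edges.getOrDefault(coordinates, Collections.<String>emptySet()));
    }

    /** Number of distinct artifacts seen so far. */
    int nodeCount() {
        return edges.size();
    }
}
```

Storing dependencies in a set means that visiting the same artifact twice does not duplicate edges, which is the property one would also want from the persisted graph.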
## User guide
### General Prerequisites
- Docker (1.13.0+)
- Docker-compose to run the maven-miner in docker-compose mode (Optional)
- Docker swarm to run the maven-miner in docker swarm mode (Optional)
- Maven
- bash
### Maven indexer
```
usage: java -jar maven-miner-indexer.jar
 -f,--to-file   File path to retrieve artifacts coordinates list.
                If not specified, the maven central index is used
                instead. Note, artifacts are per line and come in
                the form groupId:artifactId:version.
 -q,--queue     Hostname and port of the RabbitMQ broker. Note, URI
                comes in the form hostname:port
 -t,--to-file   Dumping the index into a file with name
                allArtifacsInfo. Note the args 't' and 'q' are
                mutually exclusive, only one should be provided
 -h,--help      Show help
```
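The coordinates file holds one artifact per line in the form `groupId:artifactId:version`. The sketch below shows one way such lines could be parsed and validated in Java. The class `ArtifactCoordinates` and its methods are hypothetical and are not taken from the maven-miner code base. It also assumes exactly three colon-separated parts, as stated in the help text.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for lines of the form groupId:artifactId:version. */
final class ArtifactCoordinates {
    final String groupId;
    final String artifactId;
    final String version;

    ArtifactCoordinates(String groupId, String artifactId, String version) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.version = version;
    }

    /** Parses one line; rejects anything that is not exactly three non-empty parts. */
    static ArtifactCoordinates parse(String line) {
        // The -1 limit keeps trailing empty parts so "a:b:" is detected as invalid.
        String[] parts = line.trim().split(":", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "Expected groupId:artifactId:version, got: " + line);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty component in: " + line);
            }
        }
        return new ArtifactCoordinates(parts[0], parts[1], parts[2]);
    }

    /** Parses every non-blank line of the given list. */
    static List<ArtifactCoordinates> parseAll(List<String> lines) {
        List<ArtifactCoordinates> result = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                result.add(parse(line));
            }
        }
        return result;
    }

    @Override
    public String toString() {
        return groupId + ":" + artifactId + ":" + version;
    }
}
```

For example, `ArtifactCoordinates.parse("junit:junit:4.12")` yields a value whose `groupId` and `artifactId` are both `junit` and whose `version` is `4.12`.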
### Maven miner in standalone mode
```
usage: java -cp maven-miner-aether.jar
fr.inria.diverse.maven.resolver.launcher.BatchResolverApp
 -db,--database        Path to store the neo4j database. REQUIRED!
 -f,--file             Path to artifacts coordinates list file. Note,
                       artifacts are per line
 -p,--pretty-printer   Path to the output file stream. Optional
 -r,--resolve-jars     Enables JAR resolution and class counting.
                       Not activated by default!
 -h,--help             Show help
```
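When the standalone resolver is driven from another Java program, the command line above can be assembled as a list of arguments. The following is a sketch under stated assumptions: the helper class `BatchResolverCommand` is hypothetical, `-db` and `-f` are assumed to take one value each, and `-r` is assumed to be a plain switch, as the help text suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical helper that assembles the BatchResolverApp command line. */
final class BatchResolverCommand {
    /**
     * @param databasePath  value for -db (required by the tool)
     * @param artifactsFile value for -f, or null to omit the flag
     * @param resolveJars   whether to pass the -r switch
     */
    static List<String> build(String databasePath, String artifactsFile, boolean resolveJars) {
        Objects.requireNonNull(databasePath, "-db is required");
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add("maven-miner-aether.jar");
        cmd.add("fr.inria.diverse.maven.resolver.launcher.BatchResolverApp");
        cmd.add("-db");
        cmd.add(databasePath);
        if (artifactsFile != null) {
            cmd.add("-f");
            cmd.add(artifactsFile);
        }
        if (resolveJars) {
            cmd.add("-r"); // assumed to be a switch without a value
        }
        return cmd;
    }
}
```

The resulting list can be handed to `new ProcessBuilder(cmd).inheritIO().start()` to launch the resolver as a child process. The database and file paths passed in are up to the caller.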
### Maven miner in message-passing mode (only when the indexer is used in producer mode with the argument -q)
```
usage: java -cp maven-miner-aether.jar
fr.inria.diverse.maven.resolver.launcher.ConsumerResolverApp
 -db,--database        Hostname and port of the neo4j server. REQUIRED!
 -q,--queue            Hostname and port of the RabbitMQ broker. Note, URI
                       comes in the form hostname:port
 -p,--pretty-printer   Path to the output file stream. Optional
 -r,--resolve-jars     Enables JAR resolution and class counting.
                       Not activated by default!
 -h,--help             Show help
```
```
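Both `-db` and `-q` expect an address in the form `hostname:port`. The sketch below shows one way to split and validate such a value before using it. The class `BrokerAddress` is a hypothetical name and is not part of maven-miner.

```java
/** Hypothetical holder for an address given as hostname:port. */
final class BrokerAddress {
    final String host;
    final int port;

    BrokerAddress(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Splits on the last colon and checks that the port is a valid TCP port number. */
    static BrokerAddress parse(String address) {
        int idx = address.lastIndexOf(':');
        if (idx <= 0 || idx == address.length() - 1) {
            throw new IllegalArgumentException("Expected hostname:port, got: " + address);
        }
        String host = address.substring(0, idx);
        int port;
        try {
            port = Integer.parseInt(address.substring(idx + 1));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid port in: " + address, e);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port out of range in: " + address);
        }
        return new BrokerAddress(host, port);
    }
}
```

For instance, `BrokerAddress.parse("localhost:5672")` gives host `localhost` and port `5672`, the default AMQP port of RabbitMQ.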
### Using Maven-miner with Docker
This repository comes with a set of scripts to prepare a ready-to-use Docker machine, create and launch a container, and mine Maven Central.
Regardless of the Docker execution mode you choose, it is recommended to package the tool first using the script below.
#### Packaging the Maven project
After cloning the repository on your local machine or a remote server, simply execute the *prepare.sh* script.
It is responsible for packaging the Maven project and moving it to the files folder.
```
usage: prepare.sh
user@ubuntu/path/to/repository$ ./prepare.sh
```
#### Running maven-miner inside a container
Once the packages are built and moved to the files folder, you may build and run the maven miner using the *buildAndRun.sh* script as shown below:
```
Usage: buildAndRun-batch.sh
user@ubuntu$ ~/path/to/repository/buildAndRun.sh
 --file           Path to the artifacts info file. Optional!
 --database       Path to database. Optional!
                  |maven-index.db/| is used by default.
 --results        Path to the host results folder. Required
 --resolve-jars   Enables JAR resolution and class counting. Optional
 --rebuild        Forces a rebuild of the Docker image
```
#### Running maven-miner inside a docker node using docker-compose
The following script assumes that you didn't change the default port numbers of Neo4j and RabbitMQ:
```
user@ubuntu/path/to/repository$ ./run-swarm.sh
--n-consumer Number of consumers to be deployed
--neo4j-dump Local path to dump neo4j data and logs
```
#### Running maven-miner inside a docker swarm node using docker stack
The following script assumes that you didn't change the default port numbers of Neo4j and RabbitMQ:
```
user@ubuntu/path/to/repository$ ./run-swarm.sh
--n-consumer Number of consumers to be deployed
--neo4j-dump Local path to dump neo4j data and logs
```