https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium

Dependency Scanning Analyzer based on Gemnasium.

This project's issue tracker has been disabled. If you wish to create an issue or report a bug, please [follow these directions](/CONTRIBUTING.md#issue-tracker).

[TOC]

# Gemnasium analyzer

Dependency Scanning analyzer that uses the [GitLab Advisory Database](https://gitlab.com/gitlab-org/security-products/gemnasium-db).

This analyzer is written in Go using the [common library] shared by most Secure analyzers.

To create an issue or report a bug, please [follow these directions](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/blob/master/CONTRIBUTING.md#issue-tracker); do not create it in this project.

## Usage

The [common library] documents
[how to use the analyzer](https://gitlab.com/gitlab-org/security-products/analyzers/common/#how-to-use-the-analyzers).

## Testing

The [common library] documents [how to test the Docker image](https://gitlab.com/gitlab-org/security-products/analyzers/common/#analyzers-development) of the analyzer using `docker run`.

In addition, this project also provides image integration tests.

### Image integration tests

Image integration tests are executed on CI to check the Docker image of the analyzer using [RSpec](https://rspec.info/).
They check the output and exit code of the analyzer, as well as the Dependency Scanning report it generates.
The image integration tests can also be executed locally, for example to check an image that was built locally using `docker build` (see also [Unable to build image](#unable-to-build-image)).

There are two ways of running the image integration tests locally:

1. [Using the `integration-test` Docker image](#running-image-integration-tests-using-the-integration-test-docker-image) (recommended)
1. [Directly on your local machine using ruby](#running-image-integration-tests-using-ruby)

#### Running image integration tests using the integration-test Docker image

See the [instructions](https://gitlab.com/gitlab-org/security-products/analyzers/integration-test/-/blob/main/README.md#how-to-run-the-integration-test-docker-container-locally) from the `integration-test` project.

#### Running image integration tests using ruby

To run the image integration tests, you need ruby, bundler, as well as some ruby extensions.
You also need git in order to fetch some test projects locally.

Here's how to install these packages on Alpine Linux:

```shell
apk add ruby ruby-bundler ruby-json ruby-bigdecimal git
```

Once ruby and bundler are installed, change to the root directory of the analyzer project, and install the gems needed to run RSpec:

```shell
bundle install --path vendor/ruby
```

Then copy the [Dependency Scanning Report schema](https://gitlab.com/gitlab-org/security-products/security-report-schemas/-/blob/master/dist/dependency-scanning-report-format.json) to the analyzer project.
This schema is used to perform JSON schema validation.
Here's how to fetch the latest version of the schema using curl:

```shell
curl -o dependency-scanning-report-format.json https://gitlab.com/gitlab-org/security-products/security-report-schemas/-/raw/master/dist/dependency-scanning-report-format.json
```

Finally, you can test the Docker image you've built using bundler and the `rspec` command.
The name of the image being tested should be set in the environment variable `TMP_IMAGE`.

```shell
TMP_IMAGE=gemnasium:latest bundle exec rspec
```

## Implementation

Gemnasium is a CLI written using the [urfave/cli](https://github.com/urfave/cli) package.

The CLI exposes a `run` command that proceeds as follows:
1. check whether the target directory is supported
1. scan the supported dependency files, and build a list of vulnerabilities
1. look for solutions for these vulnerabilities (auto-remediation)
1. generate a JSON report

The generated JSON report describes:
- the dependency files and their dependencies
- the vulnerabilities found in these files
- the solutions to these vulnerabilities (remediations), if any

Currently, Gemnasium is NOT built on top of the `command` package of the [common library],
even though its `run` command is very similar to `command.Run`.

The scan itself is implemented in the [`scanner`](scanner) package.
The `Scanner` proceeds as follows:
1. **configure** the advisory repository, and update it if requested
1. **find** the supported dependency files, along with the compatible parsers
1. **parse** these dependency files, and build a list of dependencies (type, name, and version)
1. **match** the dependencies with the advisories, and add affections to dependency files

An `Affection` is a struct that combines a security advisory with a dependency affected by it.
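
The **match** step can be pictured with a minimal sketch. This is not the scanner's actual code: the field names of `Dependency`, `Advisory`, and `Affection`, the `match` function, and the `inRange` callback (standing in for the `vrange` package) are all assumptions made for illustration.

```go
package main

import "fmt"

// Dependency, Advisory and Affection are hypothetical shapes;
// the real structs in the analyzer may carry different fields.
type Dependency struct {
	Type, Name, Version string
}

type Advisory struct {
	PackageType, PackageName string
	Identifier               string
	AffectedRange            string
}

// Affection pairs an advisory with a dependency it affects.
type Affection struct {
	Dependency Dependency
	Advisory   Advisory
}

// match keeps every (dependency, advisory) pair where the advisory targets
// the same package and the dependency version falls in the affected range.
func match(deps []Dependency, advisories []Advisory, inRange func(version, rng string) bool) []Affection {
	var out []Affection
	for _, d := range deps {
		for _, a := range advisories {
			if a.PackageType != d.Type || a.PackageName != d.Name {
				continue
			}
			if inRange(d.Version, a.AffectedRange) {
				out = append(out, Affection{Dependency: d, Advisory: a})
			}
		}
	}
	return out
}

func main() {
	deps := []Dependency{
		{Type: "gem", Name: "rails", Version: "5.0.0"},
		{Type: "gem", Name: "rake", Version: "13.0.0"},
	}
	advisories := []Advisory{
		{PackageType: "gem", PackageName: "rails", Identifier: "EXAMPLE-0001", AffectedRange: "<5.0.1"},
	}
	// Toy range check for the demo only; real evaluation is delegated to vrange.
	inRange := func(version, rng string) bool { return version == "5.0.0" && rng == "<5.0.1" }

	for _, a := range match(deps, advisories, inRange) {
		fmt.Printf("%s %s is affected by %s\n", a.Dependency.Name, a.Dependency.Version, a.Advisory.Identifier)
	}
}
```

Passing the range check as a callback keeps the sketch independent of any particular version syntax, which mirrors the split between the scanner and the `vrange` sub-packages.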

The scanner relies on several sub-packages to perform the scan:
- [parser](scanner/parser) to find supported dependency files, and parse them
- [advisory](advisory) to find security advisories, and read them
- [vrange](vrange) to evaluate the affected range, and tell whether a version is affected

## Development

The [common library] covers the generic aspects
of [analyzers development](https://gitlab.com/gitlab-org/security-products/analyzers/common/#analyzers-development).

Supporting a new package manager generally involves:
1. adding a new [dependency file parser](#dependency-file-parsers)
1. adding a specific package name resolver to the [advisory repository](#advisory-repository)
1. adding a new [version range solver](#version-range-evaluation) or reusing an existing one
1. adding QA jobs to the CI pipeline

### Dependency file parsers

The [parser](scanner/parser) package implements a collection of dependency file parsers.
A parser is registered with the filenames and package type it supports.
It reads a lock file, a dependency graph, or any other file that lists the transitive project dependencies
along with the exact versions of these dependencies.

A parser generates a list of dependencies.
Each dependency has a name and version. The list is unordered and contains no duplicates.

#### Implementing a parser

Before implementing a new dependency file parser,
it might be necessary to declare a new `PackageType` in the [parser](scanner/parser) package.
Note that the package type might already be declared if it's already supported via another file parser.

Implementing a new dependency file parser consists of the following:
1. create a sub-package under the [parser](scanner/parser) package
1. create a struct that implements the `parser.Parser` interface
1. register the struct using `parser.Register`, in the `init` function of the new package
1. provide fixtures and unit tests

The parser must detect whether the version of the dependency file syntax is supported,
and return a specific error if it's not.
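
As a rough illustration of the version check, here is a minimal sketch that parses a made-up JSON lock file. The file format, the `Package` struct, the `Parse` signature, and the `ErrUnsupportedVersion` sentinel are assumptions for this example; the real `parser.Parser` interface and its error values may look different.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ErrUnsupportedVersion is a hypothetical sentinel error returned when the
// lock file syntax version is not the one this parser understands.
var ErrUnsupportedVersion = errors.New("unsupported lock file version")

const supportedVersion = 1

// Package is a hypothetical parsed dependency: a name and an exact version.
type Package struct {
	Name    string `json:"name"`
	Version string `json:"version"`
}

// lockFile models an invented lock file format used only in this sketch.
type lockFile struct {
	LockfileVersion int       `json:"lockfileVersion"`
	Packages        []Package `json:"packages"`
}

// Parse decodes the document, rejects unsupported syntax versions,
// and returns the packages without duplicates.
func Parse(data []byte) ([]Package, error) {
	var doc lockFile
	if err := json.Unmarshal(data, &doc); err != nil {
		return nil, fmt.Errorf("decoding lock file: %w", err)
	}
	if doc.LockfileVersion != supportedVersion {
		return nil, fmt.Errorf("%w: got %d, want %d", ErrUnsupportedVersion, doc.LockfileVersion, supportedVersion)
	}
	seen := make(map[Package]bool, len(doc.Packages))
	pkgs := make([]Package, 0, len(doc.Packages))
	for _, p := range doc.Packages {
		if seen[p] {
			continue // same name and version already listed
		}
		seen[p] = true
		pkgs = append(pkgs, p)
	}
	return pkgs, nil
}

func main() {
	ok := []byte(`{"lockfileVersion": 1, "packages": [
		{"name": "rails", "version": "5.0.0"},
		{"name": "rails", "version": "5.0.0"},
		{"name": "rake", "version": "13.0.0"}]}`)
	pkgs, err := Parse(ok)
	fmt.Println(len(pkgs), err)

	_, err = Parse([]byte(`{"lockfileVersion": 2, "packages": []}`))
	fmt.Println(errors.Is(err, ErrUnsupportedVersion))
}
```

Wrapping the sentinel with `%w` lets callers distinguish "unsupported syntax version" from a malformed file using `errors.Is`, which is also what a unit test for the unsupported case could assert.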

A package implementing a parser should only export symbols that are absolutely necessary for external packages.

The parser is registered with:
- a name
- the filenames it supports, used when scanning a directory
- a package type, used to match the dependencies with security advisories
- a package manager reported in the dependency list

The package type and the filenames a parser is registered with
are thus critical to dependency scanning,
but the name of the parser and the supported package manager are not.
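
To make the role of each registration field concrete, here is a minimal sketch of a filename-keyed registry. It is not the real `parser.Register` API: the `Registration` struct, the `register` and `lookup` functions, and the `bundler` package manager value are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// PackageType is a hypothetical stand-in for the package type declared
// in the parser package.
type PackageType string

// Registration groups the four pieces of information a parser is
// registered with, as listed above.
type Registration struct {
	Name           string      // informational only
	Filenames      []string    // used to find dependency files in a directory
	PackageType    PackageType // used to match dependencies with advisories
	PackageManager string      // reported in the dependency list
}

// byFilename indexes registrations by the base filename they support.
var byFilename = map[string]Registration{}

func register(r Registration) {
	for _, f := range r.Filenames {
		byFilename[f] = r
	}
}

// lookup returns the registration that handles the file at path, if any.
func lookup(path string) (Registration, bool) {
	r, ok := byFilename[filepath.Base(path)]
	return r, ok
}

func main() {
	register(Registration{
		Name:           "gemfile",
		Filenames:      []string{"Gemfile.lock"},
		PackageType:    "gem",
		PackageManager: "bundler",
	})

	if r, ok := lookup("app/Gemfile.lock"); ok {
		fmt.Printf("app/Gemfile.lock is handled by %q (package type %q)\n", r.Name, r.PackageType)
	}
	if _, ok := lookup("app/README.md"); !ok {
		fmt.Println("app/README.md is ignored")
	}
}
```

In this sketch only `Filenames` and `PackageType` influence what gets scanned and matched, which is why a mistake in either one silently changes the scan results.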

The unit tests should cover at least two cases:
- file is successfully parsed and returns a list of dependencies that contains no duplicates
- file is not supported (incompatible version of the syntax)

Unit tests can be written simply by copying the test of another parser (e.g. [gemfile_test.go](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/blob/master/scanner/parser/gemfile/gemfile_test.go)) and by supplying a fixture lock file (e.g. [Gemfile.lock](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/blob/master/scanner/parser/gemfile/fixtures/simple/Gemfile.lock)) and the expectation for the parsed data (e.g. [packages.json](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/blob/master/scanner/parser/gemfile/expect/simple/packages.json)).

Note: For a parser that returns the parsed packages and the graph of dependencies, both expectations must be supplied (e.g. [expectations](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/tree/master/scanner/parser/nuget/expect/duplicates)).

To enable a parser in the analyzer CLI, import it as an anonymous package in [main.go](main.go).
If not enabled, the dependency files supported by the parser are ignored during the scan.

### Advisory repository

The [advisory](advisory) package is used to interact with the [vulnerability database],
a GitLab project that contains security advisories in the form of YAML files.

Main features:
- update a git clone of the vulnerability database
- list the advisory files matching a given package
- parse advisory files

#### Advisory path resolution

A dependency file parser is registered with a package type, and it returns package names when parsing a file.
Combined together, the type and name are used to find the directory that contains the advisories for a given package.
In most cases, the type and name match the directory path.
For instance, the advisories of a `gem` named `rails` are the YAML files found
in the `gem/rails` directory of the vulnerability database.

However, some parsers might return package names
that don't necessarily match directories of the vulnerability database.
For instance, the parser that handles `Pipfile.lock` (Python) returns non-canonical package names,
and these must be resolved in accordance with [PEP 0426](https://www.python.org/dev/peps/pep-0426/#name).

If the package type and name don't directly match a directory of the vulnerability database,
this exception has to be implemented in `Repo.PackageAdvisories`.
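
The resolution rule can be sketched as follows. This is an illustration, not the code of `Repo.PackageAdvisories`: the `advisoryDir` function is hypothetical, the `pypi` type name is an assumption, and the normalization shown (lowercasing and treating underscores as hyphens) is a simplified reading of the name comparison rules in PEP 0426.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// advisoryDir returns the directory, relative to the root of the
// vulnerability database, expected to hold the advisories of a package.
func advisoryDir(pkgType, pkgName string) string {
	name := pkgName
	// Assumption: Python packages are stored under "pypi" with a canonical
	// name, so non-canonical names must be normalized first.
	if pkgType == "pypi" {
		name = strings.ReplaceAll(strings.ToLower(name), "_", "-")
	}
	return path.Join(pkgType, name)
}

func main() {
	fmt.Println(advisoryDir("gem", "rails"))        // gem/rails
	fmt.Println(advisoryDir("pypi", "Flask_Login")) // pypi/flask-login
}
```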

### Version range evaluation

The [vrange](vrange) package is used to determine whether a version matches a version range.
It's composed of sub-packages that cover the various version syntaxes used by the package managers Gemnasium supports.
Most of these sub-packages are wrappers around simple CLIs implemented in the language they support.
For instance, [vrange/gem](vrange/gem) evaluates Ruby gem versions,
and it's built on top of a Ruby script.
The exceptions are Conan and Maven: Conan is handled by the npm sub-package, and Maven by the semver sub-package.
See [version range modules](#version-range-modules) for the supported languages and package managers.

#### Version range modules

| Language/Package Manager | Supporting sub-package |
|--------------------------|------------------------|
| Conan (C, C++) | `vrange/npm` |
| Gem (Ruby) | `vrange/gem` |
| Golang | `vrange/golang` |
| npm | `vrange/npm` |
| NuGet (C#) | `vrange/nuget` |
| PHP | `vrange/php` |
| Python | `vrange/python` |
| Maven | `vrange/semver` |

#### Implementing a CLI-based resolver

Implementation steps:

1. create a sub-package under the [vrange](vrange) package
1. implement a CLI that implements the vrange API
1. register the CLI using the `RegisterCmd` function, or register the Go native resolver with `Register`
1. make the path of the vrange CLI configurable by setting an environment variable or a CLI flag
1. check the [Dockerfile](Dockerfile) and make sure the vrange CLI is part of the Docker image
1. if needed, update the [Dockerfile](Dockerfile) to install the dependencies of the vrange CLI
1. provide unit tests

If the vrange CLI is compiled to a binary, you can either:
- update the [CI config](.gitlab-ci.yml) and add a job that compiles the binary, and passes it as an artifact
- update the [Dockerfile](Dockerfile) and add a stage that compiles the binary, which is then copied to the final image

The only argument of the vrange CLI is the path of a JSON document.
This document is an array of query objects.
A query has two keys:
- `version` (string, required)
- `range` (string, required)

The output of the vrange CLI is a JSON document.
This document is an array of result objects.
A result has four keys:
- `version` (string)
- `range` (string)
- `satisfies` (boolean) tells if the version is in range
- `error` (string) reports a parsing error for the version or the range

The `satisfies` and `error` keys are mutually exclusive.

The output document must contain results for every query of the input document.

The order of the result objects doesn't have to match the order of the query objects.

The vrange CLI fails when it cannot process the input document,
but it must not fail when it cannot parse a version or a range
(it must report an error instead).
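
The input and output contract described above can be sketched as a small Go program. This is a toy, not one of the real vrange CLIs: the range syntax it understands (a single `<` followed by a dotted numeric version of at most three parts) and all function names are assumptions chosen to keep the example short.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Query and Result mirror the JSON objects described above.
type Query struct {
	Version string `json:"version"`
	Range   string `json:"range"`
}

type Result struct {
	Version   string `json:"version"`
	Range     string `json:"range"`
	Satisfies *bool  `json:"satisfies,omitempty"` // set only when parsing succeeded
	Error     string `json:"error,omitempty"`     // set only when parsing failed
}

// parseVersion turns "1.2" into [3]int{1, 2, 0}; missing parts count as 0.
func parseVersion(s string) ([3]int, error) {
	var v [3]int
	parts := strings.Split(s, ".")
	if len(parts) > len(v) {
		return v, fmt.Errorf("invalid version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, fmt.Errorf("invalid version %q", s)
		}
		v[i] = n
	}
	return v, nil
}

// less reports whether version a sorts strictly before version b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// satisfies evaluates the toy range syntax "<X.Y.Z".
func satisfies(version, rng string) (bool, error) {
	if !strings.HasPrefix(rng, "<") {
		return false, fmt.Errorf("invalid range %q", rng)
	}
	v, err := parseVersion(version)
	if err != nil {
		return false, err
	}
	bound, err := parseVersion(strings.TrimPrefix(rng, "<"))
	if err != nil {
		return false, err
	}
	return less(v, bound), nil
}

// evaluate returns one result per query; parse failures are reported in the
// result's error key instead of aborting the whole run.
func evaluate(queries []Query) []Result {
	results := make([]Result, 0, len(queries))
	for _, q := range queries {
		r := Result{Version: q.Version, Range: q.Range}
		if ok, err := satisfies(q.Version, q.Range); err != nil {
			r.Error = err.Error()
		} else {
			r.Satisfies = &ok
		}
		results = append(results, r)
	}
	return results
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: vrange-sketch QUERIES.json")
		os.Exit(2)
	}
	var queries []Query
	data, err := os.ReadFile(os.Args[1])
	if err == nil {
		err = json.Unmarshal(data, &queries)
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, err) // the input document could not be processed
		os.Exit(1)
	}
	json.NewEncoder(os.Stdout).Encode(evaluate(queries))
}
```

Using a pointer for `satisfies` together with `omitempty` is one way to keep `satisfies` and `error` mutually exclusive in the encoded output, since a plain `bool` would drop a legitimate `false`.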

#### Implementing a Go-native resolver

Implementation steps:

1. create a sub-package under the [vrange](vrange) package
1. create a Go struct that implements the `Resolver` interface
1. register the resolver using the `Register` function
1. provide unit tests
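
The shape of a Go-native resolver might look like the sketch below. The real `Resolver` interface and `Register` function live in the `vrange` package and their signatures are not shown in this document, so the `Resolver` interface, the `register` function, and the `exactResolver` type here are hypothetical stand-ins.

```go
package main

import "fmt"

// Query is a hypothetical (version, range) pair to evaluate.
type Query struct {
	Version, Range string
}

// Result is a hypothetical outcome: either Satisfies is meaningful,
// or Err explains why the version or range could not be parsed.
type Result struct {
	Satisfies bool
	Err       error
}

// Resolver is an assumed interface; the real one may differ.
type Resolver interface {
	Resolve(queries []Query) (map[Query]Result, error)
}

// resolvers is a toy registry keyed by version syntax name.
var resolvers = map[string]Resolver{}

func register(syntax string, r Resolver) { resolvers[syntax] = r }

// exactResolver is a deliberately trivial resolver: a range is a single
// version, and a version satisfies it only when both strings are equal.
type exactResolver struct{}

func (exactResolver) Resolve(queries []Query) (map[Query]Result, error) {
	out := make(map[Query]Result, len(queries))
	for _, q := range queries {
		if q.Version == "" || q.Range == "" {
			out[q] = Result{Err: fmt.Errorf("empty version or range")}
			continue
		}
		out[q] = Result{Satisfies: q.Version == q.Range}
	}
	return out, nil
}

// Compile-time check that exactResolver implements Resolver.
var _ Resolver = exactResolver{}

// Registration happens at package initialization in this sketch.
func init() { register("exact", exactResolver{}) }

func main() {
	q := Query{Version: "1.0.0", Range: "1.0.0"}
	res, err := resolvers["exact"].Resolve([]Query{q})
	fmt.Println(res[q].Satisfies, err)
}
```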

## Lefthook

[Lefthook](https://github.com/Arkweid/lefthook) is a Git hooks manager that allows custom logic to be executed prior to Git committing or pushing. This project comes with a `lefthook.yml` configuration file, but there are two steps that must be performed before it can be used:

### Installing Lefthook

1. Install the Lefthook Git hook manager. Please follow [these directions](https://github.com/evilmartians/lefthook/blob/master/docs/other.md) to install the Lefthook binary for your environment. On Mac OS X or Linux, this can be achieved using the following command:

```shell
$ go install github.com/evilmartians/lefthook@latest
```

Note: Before installing the Lefthook binary, check to see if it's already installed by using `which lefthook`, since if you're using the [GitLab Development Kit (GDK)](https://gitlab.com/gitlab-org/gitlab-development-kit) or have contributed to the [gitlab-org/gitlab](https://gitlab.com/gitlab-org/gitlab) project, you may have already installed Lefthook.

1. Install Lefthook managed Git hooks:

```shell
$ lefthook install
```

This command will create new Git hook files in the `.git/hooks/` directory that will execute the commands specified in the `lefthook.yml` file for the given Git hook event, such as `pre-push` or `pre-commit`.

1. Confirm that Lefthook is working by running the Lefthook `pre-push` Git hook:

```shell
$ lefthook run pre-push

Lefthook v0.7.7
RUNNING HOOKS GROUP: pre-push

EXECUTE > go-mod-tidy
EXECUTE > go-lint
EXECUTE > go-test

SUMMARY: (done in 3.16 seconds)
✔️ go-mod-tidy
✔️ go-test
✔️ go-lint
```

Please see the [Pre-push static analysis with Lefthook](https://docs.gitlab.com/ee/development/contributing/style_guides.html#pre-push-static-analysis-with-lefthook) docs for more details.

### Updating the lefthook scripts

```shell
$ lefthook install
```

## Conditionally triggering child pipelines using labels

You can conditionally trigger different child pipelines by assigning the following labels to a merge request before pushing new code or running a pipeline:

- ~"trigger-gemnasium"
- ~"trigger-gemnasium-python"
- ~"trigger-gemnasium-maven"
- ~"trigger-sbomgen-golang"

Only the child pipelines matching the applied labels will be triggered.

Multiple child pipelines can be triggered at once by applying a combination of labels.

Conditionally triggering child pipelines may result in failed container scanning jobs. This is because there's no easy way to skip the container scanning jobs for the images we're not building.

## Release Process

`gemnasium` uses scripts from the [ci-templates](https://gitlab.com/gitlab-org/security-products/ci-templates/) project to automate the release of new analyzer images.

The process works as follows:

1. An MR is merged to `gemnasium`. This kicks off a pipeline in the `master branch`, for example [this pipeline](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/pipelines/787699855).
1. The `master branch pipeline` executed in step `1.` above triggers child pipelines for the following analyzer and SBOM-generation tools:

   - `gemnasium`
   - `gemnasium-maven`
   - `gemnasium-python`
   - `sbomgen-golang`

1. Child pipelines for each analyzer are executed:

   1. The `build-image` stage is executed which triggers the [build tmp image](https://gitlab.com/gitlab-org/security-products/ci-templates/blob/d37268e/includes-dev/docker.yml#L31-42) job from the `ci-templates` project.

      The `build tmp image` job builds, tags, and pushes new `tmp` Docker images for each analyzer. For example:

      - `gemnasium`
        - `registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium/tmp/main:`
        - `registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium/tmp/main:-fips`
      - `gemnasium-maven`
        - `registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium/tmp/maven:`
        - `registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium/tmp/maven:-fips`
      - `gemnasium-python`
        - `registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium/tmp/python:`
        - `registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium/tmp/python:-python-3.10`
        - `registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium/tmp/python:-fips`
      - `sbomgen-golang`
        - `registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium/tmp/sbomgen-golang:`
        - `registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium/tmp/sbomgen-golang:-fips`

   1. The `test` stage is executed for each child pipeline, using the `tmp` Docker images produced in step `3.1` above.

      1. The [check analyzer version](https://gitlab.com/gitlab-org/security-products/ci-templates/blob/d37268e/includes-dev/docker-test.yml#L94-124) job ensures that the latest version in the changelog matches the version reported by executing the analyzer.

      1. The [check image size](https://gitlab.com/gitlab-org/security-products/ci-templates/blob/d37268e/includes-dev/docker-test.yml#L28-47) job ensures that the size of the newly built Docker image doesn't exceed a given threshold.

      1. The `image test` and `image test fips` jobs use the [integration-test](https://gitlab.com/gitlab-org/security-products/analyzers/integration-test/) project to execute the Docker images against fixture files in the `qa/fixtures` directory and check their output against expectation files located in the `qa/expect` directory.

      1. Various `*-qa` and `*-qa fips` downstream QA jobs are triggered for tests that cannot be implemented using the `image integration-test` approach above.

   1. The `release-version` stage is executed, which tags `edge` versions of the analyzers, for example:

      - `gemnasium`
        - `registry.gitlab.com/security-products/gemnasium:edge`
      - `gemnasium-maven`
        - `registry.gitlab.com/security-products/gemnasium-maven:edge`
      - `gemnasium-python`
        - `registry.gitlab.com/security-products/gemnasium-python:edge`
      - `sbomgen-golang`
        - `registry.gitlab.com/security-products/sbomgen/golang:edge`

1. The `test` stage is executed for the parent pipeline, which runs various static analysis, container scanning and dependency scanning analyzers against the repository and newly built analyzer Docker images.

1. The `tag` stage is executed, which runs the [upsert git tag](https://gitlab.com/gitlab-org/security-products/ci-templates/blob/d37268e/includes-dev/upsert-git-tag.yml#L1-60) job.

   The `upsert git tag` job creates a new [release](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/releases) of `gemnasium` using the [GitLab Releases API](https://docs.gitlab.com/ee/api/releases/#create-a-release).

   When the new release is created, a new `git tag` is automatically created using the latest version from the [`CHANGELOG.md`](CHANGELOG.md) file, which points to the `SHA` for the git merge commit of the MR merged in step `1.`.

1. When the new `git tag` is created in step `5.` above, a new `git tag pipeline` is executed, for example [this pipeline](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/pipelines/787734943) for the `v3.11.3` git tag.

   The `git tag pipeline` is executed by the [`analyzer-automated-pipeline-management group_2452873_bot`](https://gitlab.com/group_2452873_bot) internal user.

   This pipeline repeats _all of the above steps_ (there's an [open issue](https://gitlab.com/gitlab-org/gitlab/-/issues/350448) to remove this duplication), except for the following differences:

   - It does not execute the `test` or `tag` stages that were previously executed in step `4.` and `5.` respectively.

   - Instead of executing the `release-version` stage of step `3.3`, it executes a `release-major` stage which tags and pushes the following Docker images:

     - `gemnasium`
       - `release latest`
         - `registry.gitlab.com/security-products/gemnasium:latest`
       - `release major`
         - `registry.gitlab.com/security-products/gemnasium:3`
       - `release major fips`
         - `registry.gitlab.com/security-products/gemnasium:3-fips`
       - `release minor`
         - `registry.gitlab.com/security-products/gemnasium:3.11`
       - `release minor fips`
         - `registry.gitlab.com/security-products/gemnasium:3.11-fips`
       - `release patch`
         - `registry.gitlab.com/security-products/gemnasium:3.11.3`
       - `release patch fips`
         - `registry.gitlab.com/security-products/gemnasium:3.11.3-fips`
     - `gemnasium-maven`
       - same pattern as above, using `gemnasium-maven` as the image name.
     - `gemnasium-python`
       - same pattern as above, using `gemnasium-python` as the image name.
     - `sbomgen-golang`
       - same pattern as above, using `gemnasium-golang` as the image name.

   - Special permissions are needed to execute the pipeline. Please see [Permissions required for running a release pipeline](#permissions-required-for-running-a-release-pipeline) for more details about the required permissions.
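
The tag pattern pushed by the `release-major` stage can be expressed as a small function. This is only an illustration of the naming scheme listed above; the actual tagging is performed by CI jobs from the `ci-templates` project, and the `releaseTags` function is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// releaseTags derives the image references listed above from an image name
// and a git tag such as "v3.11.3": latest, then major, minor and patch,
// each followed by its "-fips" variant.
func releaseTags(image, gitTag string) ([]string, error) {
	parts := strings.Split(strings.TrimPrefix(gitTag, "v"), ".")
	if len(parts) != 3 {
		return nil, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", gitTag)
	}
	tags := []string{image + ":latest"}
	for i := 1; i <= len(parts); i++ {
		v := strings.Join(parts[:i], ".")
		tags = append(tags, image+":"+v, image+":"+v+"-fips")
	}
	return tags, nil
}

func main() {
	tags, err := releaseTags("registry.gitlab.com/security-products/gemnasium", "v3.11.3")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, t := range tags {
		fmt.Println(t)
	}
}
```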

### Permissions required for running a release pipeline

Because the `git tag pipeline` described in the [Release Process](#release-process) section above is run by the `group_2452873_bot` internal user, it needs special permissions to:

- `Run CI/CD pipeline for a protected branch`

  This requires the `Allowed to merge` permission for the `master` branch in [`Settings -> Repository -> Protected branches`](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/settings/repository) for `gemnasium`.

  See [GitLab CI/CD permissions](https://docs.gitlab.com/ee/user/permissions.html#gitlab-cicd-permissions) for more details.

- `Create release for project`

  This requires the `Allowed to create` permission in [`Settings -> Repository -> Protected tags`](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/settings/repository) for `gemnasium`.

  See [Project members permissions](https://docs.gitlab.com/ee/user/permissions.html#project-members-permissions) for more details.

- `trigger downstream QA jobs`

  In order to trigger the downstream QA jobs, the `GITLAB_TOKEN` for the [gemnasium CI/CD project variables](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/settings/ci_cd) must be set to the token for the `group_2452873_bot` user, a member of the `gitlab-org/security-products` group.

### Permissions required for running downstream QA jobs in a release pipeline

In addition to the permissions listed in the [Permissions required for running a release pipeline](#permissions-required-for-running-a-release-pipeline) section above, the `group_2452873_bot` user must also have the `Allowed to merge` permission in `Settings -> Repository -> Protected branches` for the given branches in the following downstream QA jobs:

- Downstream jobs triggered by `gemnasium`
  * [js-npm](https://gitlab.com/gitlab-org/security-products/tests/js-npm/-/settings/repository):
    * `master`
    * `*-FREEZE`

    These same settings must be configured for the rest of the branches below.

  * [js-yarn](https://gitlab.com/gitlab-org/security-products/tests/js-yarn/-/settings/repository):
    * `ds-remediate-top-level`
- Downstream jobs triggered by `gemnasium-python`
  * [python-pip](https://gitlab.com/gitlab-org/security-products/tests/python-pip/-/settings/repository):
    * `master`
  * [python-pipenv](https://gitlab.com/gitlab-org/security-products/tests/python-pipenv/-/settings/repository):
    * `master`
    * `*-FREEZE`
- Downstream jobs triggered by `gemnasium-maven`
  * [java-gradle](https://gitlab.com/gitlab-org/security-products/tests/java-gradle/-/settings/repository):
    * `master`
  * [java-maven](https://gitlab.com/gitlab-org/security-products/tests/java-maven/-/settings/repository):
    * `*-FREEZE`
- Downstream jobs triggered by all `gemnasium` analyzers
  * [custom-ca](https://gitlab.com/gitlab-org/security-products/tests/custom-ca/-/settings/repository):
    * `master`

For more details on the required permissions, please see the following issues:

- https://gitlab.com/gitlab-org/gitlab/-/issues/374032
- https://gitlab.com/gitlab-org/gitlab/-/issues/396973

### Manually triggering a failed release

If the release process fails, for example because incorrect permissions let the tag be created but prevent the analyzer's Docker images from being pushed, you can manually release an existing version of the analyzer by doing one of the following:

- Manually run the [Republish images](https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/pipeline_schedules) scheduled job.
- Run a `master` pipeline in `gemnasium` and set the `PUBLISH_IMAGES` variable to "true".

## Backports

We target the last two major releases when backporting bug fixes. For example, if we're on `v4` we should merge bug fixes into the `master` and `v3` branches. In this example, the `master` branch holds all code used in `v4.x.x` releases, and `v3` is used by us to backport fixes for customers that have not yet upgraded to the latest GitLab major milestone, e.g. 16.0.

## Contributing

Contributions are welcome, see [`CONTRIBUTING.md`](CONTRIBUTING.md) for more details.

## Troubleshooting

### Unable to build image

If you encounter the error message `Unknown machine architecture: aarch64` while attempting to build a `gemnasium` analyzer Docker image locally, it is because building is currently only supported on the `amd64` architecture, such as an Intel Mac. Other architectures, such as the `ARM`-based Apple Silicon M1 chip, are not currently supported. See [Unable to build gemnasium-maven on non-amd64 machines](https://gitlab.com/gitlab-org/gitlab/-/issues/378669) for more information.

## License

This code is distributed under the GitLab Enterprise Edition (EE) license, see the [LICENSE](LICENSE) file.

[common library]: https://gitlab.com/gitlab-org/security-products/analyzers/common
[vulnerability database]: https://gitlab.com/gitlab-org/security-products/gemnasium-db