https://github.com/package-url/purl-spec

A minimal specification for purl, a.k.a. a package "mostly universal" URL. Join the discussion at https://gitter.im/package-url/Lobby

Context
=======

We build and release software by massively consuming and producing software
packages such as NPMs, RPMs, Rubygems, etc.

Each package manager, platform, type or ecosystem has its own conventions and
protocols to identify, locate and provision software packages.

Problem
=======

When tools, APIs and databases process or store multiple package types, it is
difficult to reference the same software package across tools in a uniform way.

For example, these tools, specifications and APIs use relatively similar
approaches to identify and locate software packages, each with subtle
differences in syntax, naming and conventions:

- Grafeas uses a scheme, namespace, name and version in a URL-like string.
https://github.com/Grafeas/Grafeas

- Here.com OSRK uses package manager, name and version fields and a
colon-separated URL-like string
https://github.com/heremaps/oss-review-toolkit

- JFrog XRay uses a scheme, namespace, name and version in a URL-like string
https://www.jfrog.com/confluence/display/XRAY/Xray+REST+API#XrayRESTAPI-ComponentIdentifiers

- Libraries.io uses a platform, name and version
https://libraries.io/

- OpenShift fabric8 analytics uses ecosystem, name and version
https://github.com/fabric8-analytics/

- ScanCode and AboutCode.org use a type, name and version
https://github.com/nexB/scancode-toolkit

- SPDX has an appendix for external repository references and uses a type and a
locator with a type-specific syntax for component separators in a URL-like
string
https://spdx.github.io/spdx-spec/latest/package-information/

- VersionEye uses a type, name and version
https://github.com/versioneye/

- Sonatype Lifecycle uses a format id followed by format-specific coordinates
https://links.sonatype.com/products/nxiq/doc/component-identifier

Solution
========

A `purl` or package URL is an attempt to standardize existing approaches to
reliably identify and locate software packages.

A `purl` is a URL string used to identify and locate a software package in a
mostly universal and uniform way across programming languages, package managers,
packaging conventions, tools, APIs and databases.

Such a package URL is useful to reliably reference the same software package
using a simple and expressive syntax and conventions based on familiar URLs.

For an overview, see also this short `purl` presentation (with video) from
FOSDEM 2018: https://fosdem.org/2018/schedule/event/purl/

purl
~~~~~

`purl` stands for **package URL**.

A `purl` is a URL composed of seven components::

    scheme:type/namespace/name@version?qualifiers#subpath

Components are separated by a specific character for unambiguous parsing.

The definition of each component is:

- **scheme**: this is the URL scheme with the constant value of "pkg". One of
the primary reasons for this single scheme is to facilitate the future official
registration of the "pkg" scheme for package URLs. Required.
- **type**: the package "type" or package "protocol" such as maven, npm, nuget,
gem, pypi, etc. Required.
- **namespace**: some name prefix such as a Maven groupid, a Docker image owner,
a GitHub user or organization. Optional and type-specific.
- **name**: the name of the package. Required.
- **version**: the version of the package. Optional.
- **qualifiers**: extra qualifying data for a package such as an OS,
architecture, a distro, etc. Optional and type-specific.
- **subpath**: extra subpath within a package, relative to the package root.
Optional.

Components are designed such that they form a hierarchy from the most significant component
on the left to the least significant component on the right.

A `purl` must NOT contain a URL Authority, i.e. there is no support for
`username`, `password`, `host` and `port` components. A `namespace` segment may
sometimes look like a `host`, but its interpretation is specific to a `type`.
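
The component layout above can be illustrated with a small parsing sketch. This
is not one of the known implementations: the function name `parse_purl`, the
returned dictionary and the right-to-left splitting order are assumptions made
for illustration, and type-specific rules are ignored.

```python
from urllib.parse import unquote


def parse_purl(purl):
    """Split a purl string into its seven components (illustrative sketch)."""
    scheme, sep, rest = purl.partition(":")
    if not sep or scheme.lower() != "pkg":
        raise ValueError(f"missing 'pkg' scheme: {purl!r}")

    # Peel the optional components off the right-hand side first.
    subpath = None
    if "#" in rest:
        rest, raw_subpath = rest.rsplit("#", 1)
        segments = [unquote(s) for s in raw_subpath.split("/")
                    if s not in ("", ".", "..")]
        subpath = "/".join(segments) or None

    qualifiers = {}
    if "?" in rest:
        rest, query = rest.rsplit("?", 1)
        for pair in query.split("&"):
            key, _, value = pair.partition("=")
            if key and value:
                qualifiers[key.lower()] = unquote(value)

    # A purl has no authority, so leading slashes carry no meaning here.
    rest = rest.strip("/")
    type_, sep, rest = rest.partition("/")
    if not sep or not type_:
        raise ValueError(f"missing type or name: {purl!r}")

    version = None
    if "@" in rest:
        rest, raw_version = rest.rsplit("@", 1)
        version = unquote(raw_version) or None

    # The last path segment is the name; anything before it is the namespace.
    raw_namespace, _, raw_name = rest.rpartition("/")
    name = unquote(raw_name)
    if not name:
        raise ValueError(f"missing name: {purl!r}")
    namespace = "/".join(unquote(s) for s in raw_namespace.split("/") if s) or None

    return {
        "scheme": "pkg",
        "type": type_.lower(),
        "namespace": namespace,
        "name": name,
        "version": version,
        "qualifiers": qualifiers,
        "subpath": subpath,
    }


parsed = parse_purl("pkg:golang/google.golang.org/genproto#googleapis/api/annotations")
print(parsed["namespace"], parsed["name"], parsed["subpath"])
```

Note that `google.golang.org` ends up in `namespace` even though it looks like a
host: the sketch never tries to interpret an authority.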

Some `purl` examples
~~~~~~~~~~~~~~~~~~~~

::

    pkg:bitbucket/birkenfeld/pygments-main@244fd47e07d1014f0aed9c

    pkg:deb/debian/curl@7.50.3-1?arch=i386&distro=jessie

    pkg:docker/cassandra@sha256:244fd47e07d1004f0aed9c
    pkg:docker/customer/dockerimage@sha256:244fd47e07d1004f0aed9c?repository_url=gcr.io

    pkg:gem/jruby-launcher@1.1.2?platform=java
    pkg:gem/ruby-advisory-db-check@0.12.4

    pkg:github/package-url/purl-spec@244fd47e07d1004f0aed9c

    pkg:golang/google.golang.org/genproto#googleapis/api/annotations

    pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?packaging=sources
    pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?repository_url=repo.spring.io/release

    pkg:npm/%40angular/animation@12.3.1
    pkg:npm/foobar@12.3.1

    pkg:nuget/EnterpriseLibrary.Common@6.0.1304

    pkg:pypi/django@1.11.1

    pkg:rpm/fedora/curl@7.50.3-1.fc25?arch=i386&distro=fedora-25
    pkg:rpm/opensuse/curl@7.56.1-1.1.?arch=i386&distro=opensuse-tumbleweed

(NB: some checksums are truncated for brevity)
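
Going the other way, strings like the examples above can be assembled from
their components. The following is a minimal sketch, not a reference
implementation: `build_purl` and its parameters are hypothetical names, and the
choices to sort qualifiers by key and to leave ``:`` unencoded in versions are
assumptions of this sketch.

```python
from urllib.parse import quote


def build_purl(type_, name, namespace=None, version=None,
               qualifiers=None, subpath=None):
    """Assemble a purl string from its components (illustrative sketch)."""
    if not type_ or not name:
        raise ValueError("type and name are required")

    parts = ["pkg:", type_.lower(), "/"]
    if namespace:
        # Encode each namespace segment separately so '/' stays a separator.
        segments = namespace.strip("/").split("/")
        parts.append("/".join(quote(s, safe="") for s in segments))
        parts.append("/")
    parts.append(quote(name, safe=""))
    if version:
        # Assumption: keep ':' readable, as in the docker digest examples.
        parts.append("@" + quote(version, safe=":"))
    if qualifiers:
        # Assumption: sort by key so the same package yields the same string.
        pairs = sorted((k.lower(), v) for k, v in qualifiers.items() if v)
        parts.append("?" + "&".join(f"{k}={quote(v, safe='/:')}" for k, v in pairs))
    if subpath:
        segments = subpath.strip("/").split("/")
        parts.append("#" + "/".join(quote(s, safe="") for s in segments))
    return "".join(parts)


print(build_purl("docker", "dockerimage", namespace="customer",
                 version="sha256:244fd47e07d1004f0aed9c",
                 qualifiers={"repository_url": "gcr.io"}))
# A scoped npm name: the '@' of the scope is percent-encoded as %40.
print(build_purl("npm", "animation", namespace="@angular"))
```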

Specification details
~~~~~~~~~~~~~~~~~~~~~

The `purl` specification consists of a core syntax definition and independent
type definitions:

- Package URL core: Defines a versioned and
formalized format, syntax, and rules used to represent and validate `purl`.

- Type definitions: Defines `purl` types (e.g. maven, npm,
cargo, rpm) independent of the core specification. Definitions also
include types reserved for future use.
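
One way to picture the split between the core syntax and the type definitions
is a generic routine that delegates to per-type rules. The sketch below is only
an illustration: `TYPE_RULES` and `normalize` are hypothetical names, and the
two rules shown (lowercasing and dash-normalizing ``pypi`` names, lowercasing
``github`` namespaces and names) are assumptions paraphrased for this example,
not quoted from this document.

```python
def _normalize_pypi(namespace, name):
    # Assumed rule: pypi names are lowercased, underscores become dashes.
    return namespace, name.lower().replace("_", "-")


def _normalize_github(namespace, name):
    # Assumed rule: github namespaces and names are case-insensitive.
    return (namespace.lower() if namespace else namespace), name.lower()


# Hypothetical registry: type-specific rules live outside the core logic.
TYPE_RULES = {
    "pypi": _normalize_pypi,
    "github": _normalize_github,
}


def normalize(type_, namespace, name):
    """Apply the core rule (lowercase type), then any type-specific rule."""
    type_ = type_.lower()
    rule = TYPE_RULES.get(type_)
    if rule is not None:
        namespace, name = rule(namespace, name)
    return type_, namespace, name


print(normalize("github", "Package-URL", "Purl-Spec"))
print(normalize("PyPI", None, "Example_Package"))
```

Types without a registered rule pass through unchanged apart from the
lowercased type, which keeps the core routine independent of any one ecosystem.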

Known implementations
~~~~~~~~~~~~~~~~~~~~~

- .NET: https://github.com/package-url/packageurl-dotnet
- Elixir: https://github.com/maennchen/purl
- Go: https://github.com/package-url/packageurl-go
- Java: https://github.com/package-url/packageurl-java,
https://github.com/sonatype/package-url-java
- JavaScript: https://github.com/package-url/packageurl-js
- Perl: https://github.com/giterlizzi/perl-URI-PackageURL
- PHP: https://github.com/package-url/packageurl-php
- Python: https://github.com/package-url/packageurl-python
- Ruby: https://github.com/package-url/packageurl-ruby
- Rust: https://github.com/package-url/packageurl.rs
- Swift: https://github.com/package-url/packageurl-swift

Users, adopters and links (alphabetical order)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- CycloneDX: A lightweight software
bill of materials (SBOM) specification
- GitHub Dependency Submission API: Allows third-party tools
to submit dependency data to GitHub for inclusion in a repository's dependency graph
- OWASP Dependency-Track:
Open source component analysis platform
- OSS Index: A free catalog of open source
components and scanning tools to help developers identify vulnerable components
- OSV Schema and OSV.dev:
Open Source Vulnerability Schema and distributed vulnerability database
- ScanCode Toolkit: Reports
`purl` from parsed package manifests using https://github.com/package-url/packageurl-python
- Sonatype Nexus Lifecycle:
Enterprise-grade open source component management
- SPDX: A data exchange standard for human-readable and
machine-processable software bills of materials (SBOM)

License
~~~~~~~

This document is licensed under the MIT license.