
# @kitschpatrol/codemeta

[![NPM Package @kitschpatrol/codemeta](https://img.shields.io/npm/v/@kitschpatrol/codemeta.svg)](https://npmjs.com/package/@kitschpatrol/codemeta)
[![License: Apache-2.0](https://img.shields.io/badge/License-Apache--2.0-yellow.svg)](https://opensource.org/licenses/Apache-2.0)

**A CLI tool and TypeScript library to discover and map software project metadata from various ecosystems to the CodeMeta standard.**

## Overview

Discover, parse, and merge metadata from a variety of project manifests and files into a single `codemeta.json` file describing the software.

The [CodeMeta](https://codemeta.github.io/) vocabulary provides a standard way to describe software using [JSON-LD](https://json-ld.org/) and [schema.org](https://schema.org/) terms. Most software projects already have rich metadata in manifests and other files (e.g. `package.json`, `Cargo.toml`, `pyproject.toml`, `LICENSE`, etc.), but the name and structure of semantically equivalent metadata is often inconsistent across ecosystems.

This tool reads those manifests and merges metadata from the software development diaspora into one canonical [CodeMeta v3.1](https://w3id.org/codemeta/3.1) JSON-LD document.

More mature Python-based tools like [codemetapy](https://github.com/proycon/codemetapy) and [codemeta-harvester](https://github.com/proycon/codemeta-harvester) perform a similar task, and either of these is emphatically recommended over this project for any use case not limited to a Node.js runtime.

This project should be considered "unofficial" in the sense that its author is not affiliated with the CodeMeta project / governing bodies. The package is released under the `@kitschpatrol` namespace on NPM to leave the `codemeta` package name available for the CodeMeta project core contributors.

## Getting started

### Dependencies

[Node](https://nodejs.org/) 22.17 or newer.

### Installation

Invoke directly in a local project repository directory:

```sh
npx @kitschpatrol/codemeta
```

Or, install globally for access across your system:

```sh
npm install --global @kitschpatrol/codemeta
```

Or, install locally to access the CLI commands in a single project or to import the provided TypeScript APIs:

```sh
npm install @kitschpatrol/codemeta
```

### Running

Navigate to the root of a local project and run the CLI to generate and emit CodeMeta JSON to stdout:

```sh
codemeta
```

Or save directly to a file:

```sh
codemeta -o codemeta.json
```

## Supported metadata formats

This tool leverages the [crosswalk](https://codemeta.github.io/crosswalk/) data generously compiled by CodeMeta contributors to assist in automating the mapping of various metadata formats to the CodeMeta standard. Where crosswalk data is unavailable or incomplete, heuristics are used instead.
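
As a rough illustration of what a crosswalk-driven mapping involves, the sketch below renames a few `package.json` keys to CodeMeta properties. The `mapPackageJson` function, the `simpleCrosswalk` table, and both types are hypothetical names invented for this example. The key pairs are a small hand-picked subset assumed to match the NodeJS crosswalk, and none of this is the tool's actual implementation.

```typescript
// Hypothetical, simplified shapes for illustration only.
type PackageJson = {
  name?: string
  version?: string
  description?: string
  license?: string
  homepage?: string
  keywords?: string[]
  devDependencies?: Record<string, string>
}

type CodeMetaSketch = {
  '@context': string
  '@type': string
  [property: string]: unknown
}

// Assumption: a subset of package.json key to CodeMeta property pairs.
// The real crosswalk covers many more fields.
const simpleCrosswalk: Record<string, string> = {
  name: 'name',
  version: 'version',
  description: 'description',
  license: 'license',
  homepage: 'url',
  keywords: 'keywords',
}

function mapPackageJson(pkg: PackageJson): CodeMetaSketch {
  const meta: CodeMetaSketch = {
    '@context': 'https://w3id.org/codemeta/3.1',
    '@type': 'SoftwareSourceCode',
  }

  // Simple renames: copy each defined value under its CodeMeta property name.
  for (const [from, to] of Object.entries(simpleCrosswalk)) {
    const value = pkg[from as keyof PackageJson]
    if (value !== undefined) meta[to] = value
  }

  // Some fields need more than a rename. Here devDependencies become a plain
  // list of package names under softwareSuggestions (a simplification).
  const devNames = Object.keys(pkg.devDependencies ?? {})
  if (devNames.length > 0) meta['softwareSuggestions'] = devNames

  return meta
}

console.log(JSON.stringify(mapPackageJson({ name: 'demo', homepage: 'https://example.com' }), null, 2))
```

Fields without a direct equivalent are where the heuristics mentioned above would come in; this sketch simply leaves them out.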

The green-checked entries below indicate metadata file formats and sources that `@kitschpatrol/codemeta` can discover, parse, and merge into a `codemeta.json` file for a given directory:

| Status | Ecosystem | Organization or Registry | Specifications | Crosswalk |
| ------ | --------------- | --------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- |
| ✅ | Agnostic | [CodeMeta (v1)](https://codemeta.github.io/) | [`codemeta.json`](https://raw.githubusercontent.com/codemeta/codemeta/1.0/codemeta.jsonld) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'codemeta-V1') |
| ✅ | Agnostic | [CodeMeta (v2)](https://codemeta.github.io/) | [`codemeta.json`](https://raw.githubusercontent.com/codemeta/codemeta/2.0/codemeta.jsonld) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'codemeta-V2') |
| ✅ | Agnostic | [CodeMeta (v3)](https://codemeta.github.io/) | [`codemeta.json`](https://raw.githubusercontent.com/codemeta/codemeta/3.0/codemeta.jsonld) | No |
| ✅ | Agnostic | [CodeMeta (v3.1)](https://codemeta.github.io/) | [`codemeta.json`](https://raw.githubusercontent.com/codemeta/codemeta/3.1/codemeta.jsonld) | No |
| ✅ | Go | [Go Modules](https://go.dev/ref/mod) | [`go.mod`](https://go.dev/doc/modules/gomod-ref) | No |
| ✅ | Go | [GoReleaser](https://goreleaser.com/) | [`.goreleaser.yaml`](https://goreleaser.com/customization/) (Also matches `.yml`) | No |
| ✅ | Java | [Maven](https://search.maven.org/) | [`pom.xml`](https://maven.apache.org/pom.html) | [Yes](https://codemeta.github.io/crosswalk/java/ 'Java (Maven)') |
| ✅ | JavaScript | [NPM](https://www.npmjs.com/) | [`package.json`](https://docs.npmjs.com/cli/v11/configuring-npm/package-json) | [Yes](https://codemeta.github.io/crosswalk/node/ 'NodeJS') |
| ✅ | Agnostic | [Public Code](https://publiccode.net/) | [`publiccode.yml`](https://yml.publiccode.tools/schema.core.html) (Also matches `.yaml`) | [Yes](https://codemeta.github.io/crosswalk/publiccode/ 'publiccode') |
| ✅ | Python | [PyPI (Distutils)](https://pypi.org/) | [`setup.py`](https://docs.python.org/3/distutils/setupscript.html) [`setup.cfg`](https://docs.python.org/3/distutils/apiref.html#distutils.config) | [Yes](https://codemeta.github.io/crosswalk/python/ 'Python Distutils (PyPI)') |
| ✅ | Python | [PyPI (PKG-INFO)](https://pypi.org/) | [`.egg-info/PKG-INFO`](https://packaging.python.org/en/latest/specifications/) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'Python PKG-INFO') |
| ✅ | Python | [PyPI (PEP 621)](https://pypi.org/) | [`pyproject.toml`](https://peps.python.org/pep-0621/) | No |
| ✅ | Ruby | [Ruby Gems](https://rubygems.org/) | [`*.gemspec`](https://guides.rubygems.org/specification-reference/) | [Yes](https://codemeta.github.io/crosswalk/ruby/ 'Ruby Gem') |
| ✅ | Rust | [Crates](https://crates.io/) | [`Cargo.toml`](https://doc.rust-lang.org/cargo/reference/manifest.html) | [Yes](https://codemeta.github.io/crosswalk/cargo/ 'Rust Package Manager') |
| ✅ | Agnostic | | `README.md` (and variants) | No |
| ✅ | Agnostic | [Documented below](#metadatajson) | `metadata.json` (and `.yaml`/`.yml` variants) | No |
| ✅ | Agnostic | [SPDX](https://spdx.org/) | `LICENSE`, `LICENCE`, `COPYING`, `UNLICENSE` (and `.md`/`.txt` variants) | No |
| ❌ | .NET | [NuGet](https://www.nuget.org/) | [`*.nuspec`](https://learn.microsoft.com/nuget/reference/nuspec) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'NuGet') |
| ❌ | Scholarly | [Citation File Format (v1.2.0)](https://github.com/citation-file-format/citation-file-format) | [`CITATION.cff`](https://github.com/citation-file-format/citation-file-format/blob/main/CITATION.cff) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'Citation File Format (1.2.0)') |
| ❌ | Scholarly | [DOAP (Description of a Project)](https://github.com/edumbill/doap) | [`doap.rdf`](https://github.com/edumbill/doap/blob/master/doap.rdf) | [Yes](https://codemeta.github.io/crosswalk/doap/ 'DOAP') |
| ❌ | Astronomy | [ASCL](https://ascl.net/) | [`pom.xml`](https://maven.apache.org/pom.html) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'ASCL') |
| ❌ | Biomedical | [SciCrunch Registry](https://scicrunch.org/) | _platform metadata_ | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'SciCrunchRegistry') |
| ❌ | Clojure | [Leiningen](https://github.com/technomancy/leiningen) | [`project.clj`](https://github.com/technomancy/leiningen/blob/master/doc/PROFILES.md) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'Leiningen (Clojure)') |
| ❌ | Dart | [pub.dev](https://pub.dev/) | [`pubspec.yaml`](https://dart.dev/tools/pub/pubspec) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'Pubspec') |
| ❌ | Data Catalog | [W3C DCAT-2](https://www.w3.org/TR/vocab-dcat-2/) | [`*.ttl`, `*.rdf`, `*.jsonld`](https://www.w3.org/TR/vocab-dcat-2/) | [Yes](https://codemeta.github.io/crosswalk/dcat-2/ 'DCAT-2') |
| ❌ | Data Catalog | [W3C DCAT-3](https://www.w3.org/TR/vocab-dcat-3/) | [`*.ttl`, `*.rdf`, `*.jsonld`](https://www.w3.org/TR/vocab-dcat-3/) | [Yes](https://codemeta.github.io/crosswalk/dcat-3/ 'DCAT-3') |
| ❌ | Debian | [Debian Package](https://www.debian.org/distrib/packages) | [`debian/control`](https://www.debian.org/doc/manuals/debian-policy/ch-controlfields.html) | [Yes](https://codemeta.github.io/crosswalk/debian/ 'Debian Package') |
| ❌ | Earth Science | [CSDMS Model Metadata](https://csdms.colorado.edu/) | [`model_metadata.xml`](https://csdms.colorado.edu/wiki/Model_Metadata_Specification) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'csdms') |
| ❌ | Geoscience | [OntoSoft Software Repository](https://ontosoft.org/portal/#list) | [`*.json`, `*.xml`](https://ontosoft-earthcube.github.io/ontosoft/ontosoft%20ontology/v1.0.1/doc/) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'OntoSoft') |
| ❌ | Geoscience | [USGS Model Catalog](https://data.usgs.gov/modelcatalog/) | _portal metadata_ | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'usgs-modelcatalog') |
| ❌ | Geospatial | [ISO 19115-1:2014](https://www.iso.org/standard/53798.html) | [`*.xml`](https://standards.iso.org/ittf/PubliclyAvailableStandards/iso_19115-1_2014.html) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'ISO 19115-1:2014 Geographic information - Metadata') |
| ❌ | Haskell | [Hackage](https://hackage.haskell.org/) | [`*.cabal`](https://cabal.readthedocs.io/en/latest/specification.html) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'Cabal (Haskell)') |
| ❌ | Julia | [Pkg](https://pkgdocs.julialang.org/v1/) | [`Project.toml`](https://github.com/JuliaRegistries/General/blob/master/Registry/Package.toml) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'Julia Project.toml') |
| ❌ | Knowledge Graph | [Wikidata](https://www.wikidata.org/) | _Wikidata entity model_ | [Yes](https://codemeta.github.io/crosswalk/wikidata/ 'Wikidata') |
| ❌ | Library | [MODS](https://www.loc.gov/standards/mods/) | [`*.xml`](https://www.loc.gov/standards/mods/) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'MODS') |
| ❌ | Licensing | [SPDX 2.3](https://spdx.org/specifications) | [`*.spdx`, `*.spdx.json`, `*.spdx.rdf`](https://spdx.org/specifications) | [Yes](https://codemeta.github.io/crosswalk/spdx-2-3/ 'SPDX 2.3') |
| ❌ | Life Sciences | [bio.tools](https://bio.tools/) | [`biotools.json`](https://bio.tools/schema) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'bio.tools') |
| ❌ | Mathematics | [swMATH](https://swmath.org/) | _portal metadata_ | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'swMATH') |
| ❌ | Octave | [Octave Package](https://octave.sourceforge.io/) | [`DESCRIPTION`](https://octave.sourceforge.io/pack/pack.html) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'Octave') |
| ❌ | Perl | [CPAN::Meta](https://www.cpan.org/) | [`META.json`](https://metacpan.org/dist/CPAN-Meta/source/lib/CPAN/Meta/Spec.pm) [`META.yml`](https://metacpan.org/dist/CPAN-Meta/source/lib/CPAN/Meta/Spec.pm) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'Perl Module Description (CPAN::Meta)') |
| ❌ | R | [R Package Description](https://cran.r-project.org/) | [`DESCRIPTION`](https://cran.r-project.org/doc/manuals/r-release/R-exts.html#DESCRIPTION-file) | [Yes](https://codemeta.github.io/crosswalk/r/ 'R Package Description') |
| ❌ | Scholarly | [BibTeX](https://www.bibtex.org) | [`*.bib`](https://www.bibtex.org/Format/) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'BibTeX (@softwareversion)') |
| ❌ | Scholarly | [DataCite Metadata Schema](https://datacite.org/schema/kernel-4) | [`datacite.xml`](https://schema.datacite.org/meta/kernel-4) | [Yes](https://codemeta.github.io/crosswalk/datacite/ 'DataCite') |
| ❌ | Scholarly | [Dublin Core](https://www.dublincore.org/specifications/dublin-core/) | [`*.xml`, `*.rdf`](https://www.dublincore.org/specifications/dublin-core/dcmi-terms/) | [Yes](https://codemeta.github.io/crosswalk/dublincore/ 'Dublin Core') |
| ❌ | Scholarly | [Figshare Metadata](https://figshare.com/) | _platform metadata_ | [Yes](https://codemeta.github.io/crosswalk/figshare/ 'Figshare') |
| ❌ | Scholarly | [Software Discovery Index](https://discoveryindex.org/) | _no public format spec_ | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'Software Discovery Index') |
| ❌ | Bioinformatics | [Software Ontology](https://theswo.sourceforge.net/) | [`*.owl`, `*.rdf`](https://www.ebi.ac.uk/ols/ontologies/swo) | [Yes](https://codemeta.github.io/crosswalk/swo/ 'Software Ontology') |
| ❌ | Scholarly | [Trove Software Map](https://trove.nla.gov.au/) | _portal metadata_ | [Yes](https://codemeta.github.io/crosswalk/trove/ 'Trove Software Map') |
| ❌ | Scholarly | [VIVO](https://vivoweb.org/) | [`*.rdf`](https://vivoweb.org/ontology) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'VIVO') |
| ❌ | Scholarly | [Zenodo Metadata](https://zenodo.org/) | [`*.zenodo.json`](https://zenodo.org/api/doc) | [Yes](https://codemeta.github.io/crosswalk/zenodo/ 'Zenodo') |
| ❌ | Space Physics | [SPASE](https://www.spase-group.org/) | [`*.xml`](https://www.spase-group.org/data/schema) | [Yes](https://github.com/codemeta/codemeta/blob/3.1/crosswalk.csv 'SPASE') |
| ❌ | Agnostic | [GitHub Repository Metadata](https://docs.github.com/rest/repos/repos#get-a-repository) | _GitHub REST metadata_ | [Yes](https://codemeta.github.io/crosswalk/github/ 'GitHub') |

### metadata.json

Additionally, a minimalist `metadata.json` (or `.yaml`/`.yml`) file is supported. It captures the minimal metadata required to populate the description, homepage, and topics shown on a GitHub repository page.

| Key | Key Aliases | CodeMeta Property | Notes |
| ------------- | ---------------------------- | ----------------- | ----------------------------------------------------------------------------- |
| `description` | _None_ | `description` | String description of project |
| `homepage` | `url` `repository` `website` | `url` | For repository values, the `git+` prefix and `.git` suffix are automatically stripped |
| `keywords` | `tags` `topics` | `keywords` | Array of strings, or a single comma-delimited string |

If multiple key aliases are present in the object, priority for populating the associated `codemeta.json` goes to the key, then falls through to the key aliases in the order shown above. (E.g. `homepage` takes priority over `url`.)
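
The fall-through rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: `resolveMetadata`, `firstDefined`, `cleanUrl`, and the `RepoMetadata` type are hypothetical names, and the URL cleanup is applied to whichever alias wins rather than to repository values only.

```typescript
// Hypothetical result shape for illustration.
type RepoMetadata = { description?: string; url?: string; keywords?: string[] }

// The key first, then its aliases, in the order of the table above.
const aliasOrder = {
  description: ['description'],
  url: ['homepage', 'url', 'repository', 'website'],
  keywords: ['keywords', 'tags', 'topics'],
} as const

// Return the first defined value among the candidate keys.
function firstDefined(source: Record<string, unknown>, keys: readonly string[]): unknown {
  for (const key of keys) {
    if (source[key] !== undefined) return source[key]
  }
  return undefined
}

// Strip a leading "git+" and a trailing ".git".
function cleanUrl(value: string): string {
  return value.replace(/^git\+/, '').replace(/\.git$/, '')
}

function resolveMetadata(source: Record<string, unknown>): RepoMetadata {
  const resolved: RepoMetadata = {}

  const description = firstDefined(source, aliasOrder.description)
  if (typeof description === 'string') resolved.description = description

  const url = firstDefined(source, aliasOrder.url)
  if (typeof url === 'string') resolved.url = cleanUrl(url)

  // Keywords may be an array of strings or one comma-delimited string.
  const keywords = firstDefined(source, aliasOrder.keywords)
  if (Array.isArray(keywords)) {
    resolved.keywords = keywords.map(String)
  } else if (typeof keywords === 'string') {
    resolved.keywords = keywords
      .split(',')
      .map((keyword) => keyword.trim())
      .filter((keyword) => keyword.length > 0)
  }

  return resolved
}

console.log(resolveMetadata({ url: 'https://example.com', homepage: 'https://example.com/home', topics: 'cli, metadata' }))
```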

This is a non-standard format that exists primarily for use in combination with [github-action-repo-sync](https://github.com/kitschpatrol/github-action-repo-sync).

## Usage

### Library

#### API

The library exports the following functions and types:

##### `generate(paths, options?)`

Main entry point. Discovers metadata files in the given paths (files or directories), parses them, and returns a single composed `CodeMeta` object.

```typescript
function generate(paths: string | string[], options?: GenerateOptions): Promise<CodeMeta>
```

This function is idempotent. By default, an existing `codemeta.json` file in the scanned directory is treated as a generated artifact and excluded from input **if** primary metadata sources are present (e.g. `package.json`, `Cargo.toml`, `pyproject.toml`, etc.). This means the output is always a pure function of your project's existing source metadata files.

If no primary sources are found, an existing `codemeta.json` file is automatically kept as the source of truth.

In the rare case that you're maintaining parts of the `codemeta.json` by hand _alongside_ other primary metadata sources, you can protect your additions to the file while still merging updates from the other sources by setting the `retain` option to `true`.
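
The inclusion rule for an existing `codemeta.json` amounts to a small decision, sketched here. `selectInputs` and `primaryManifests` are hypothetical names for this example, and the set of primary manifests is deliberately limited to the three named above; the library's real list is longer.

```typescript
// Assumption: only the manifests named in the text count as primary here.
const primaryManifests = new Set(['package.json', 'Cargo.toml', 'pyproject.toml'])

// Decide which discovered file names are used as input.
function selectInputs(fileNames: string[], retain = false): string[] {
  const hasPrimary = fileNames.some((name) => primaryManifests.has(name))
  // codemeta.json is dropped only when primary sources exist and retain is off.
  const dropCodemeta = hasPrimary && !retain
  return fileNames.filter((name) => !(dropCodemeta && name === 'codemeta.json'))
}

// Primary source present: codemeta.json is treated as a generated artifact.
console.log(selectInputs(['package.json', 'codemeta.json']))
// No primary source: codemeta.json stays as the source of truth.
console.log(selectInputs(['codemeta.json']))
// retain: hand-maintained codemeta.json is merged alongside package.json.
console.log(selectInputs(['package.json', 'codemeta.json'], true))
```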

**`GenerateOptions`:**

| Option | Type | Default | Description |
| ----------- | ------------------- | ------- | -------------------------------------------------------------------------------- |
| `baseUri` | `string` | | Base URI for `@id`. Auto-detected from `codemeta.json` if present. |
| `enrich` | `boolean` | `false` | Infer missing properties from existing metadata. |
| `exclude` | `string[]` | | Glob patterns to exclude during directory discovery. |
| `retain` | `boolean` | `false` | Include existing `codemeta.json` as input even when primary sources are present. |
| `overrides` | `Partial<CodeMeta>` | | Property values to set, overriding anything parsed from files. |
| `recursive` | `boolean` | `false` | Scan subdirectories when a path is a directory. |

When a directory is provided, `generate` calls `discover()` internally to find parseable files, then merges them in priority order. By default, existing `codemeta.json` files are excluded when primary metadata sources are present to ensure idempotent generation.
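
One way to picture a priority-ordered merge with overrides is the sketch below. It rests on assumptions that this README does not confirm: sources are ordered from highest to lowest priority, the first defined value for a property wins, and values are never combined. `mergeByPriority` and `Properties` are hypothetical names, and the library's actual merge logic may differ, for example for array-valued properties.

```typescript
type Properties = Record<string, unknown>

// Assumption: `sources` runs from highest to lowest priority, and the first
// defined value for each property wins.
function mergeByPriority(sources: Properties[], overrides: Properties = {}): Properties {
  const merged: Properties = {}
  for (const source of sources) {
    for (const [key, value] of Object.entries(source)) {
      if (value !== undefined && !(key in merged)) merged[key] = value
    }
  }
  // Overrides are applied last, so they replace anything parsed from files.
  return { ...merged, ...overrides }
}

console.log(
  mergeByPriority(
    [
      { name: 'from-package-json', version: '1.0.0' },
      { name: 'from-readme', license: 'Apache-2.0' },
    ],
    { name: 'My Project' },
  ),
)
```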

##### `discover(directory, recursive?, ignore?, retain?)`

Auto-detect metadata files in a directory. Returns an array of discovered files sorted by parser priority.

```typescript
function discover(
  directory: string,
  recursive?: boolean,
  ignore?: string[],
  retain?: boolean,
): Promise
```

By default, `codemeta.json` files are excluded from discovery when primary metadata sources (project manifests like `package.json`, `Cargo.toml`, etc.) are also found. Set `retain` to `true` to always include them.

Common build artifacts and dot-directories (`node_modules`, `dist`, `target`, `__pycache__`, `venv`, etc.) are ignored by default.

##### `validate(meta)`

Validate a `CodeMeta` object for completeness and consistency.

```typescript
function validate(meta: Partial<CodeMeta>): ValidationResult
```

Returns a `ValidationResult` with `valid` (boolean) and `warnings` (array). Checks for missing `codeRepository`, `author`, and `license`, and detects license conflicts.

#### Examples

##### Generate metadata from the current directory:

```typescript
import { generate } from '@kitschpatrol/codemeta'

const meta = await generate('.')
console.log(JSON.stringify(meta, null, 2))
```

##### Compose metadata from multiple specific files:

```typescript
import { generate } from '@kitschpatrol/codemeta'

const meta = await generate(['package.json', 'codemeta.json'])
console.log(meta.name, meta.version, meta.license)
```

##### Enrich, override, and validate:

```typescript
import { generate, validate } from '@kitschpatrol/codemeta'

const meta = await generate('/path/to/project', {
  baseUri: 'https://github.com/user/my-project',
  enrich: true,
  overrides: { name: 'My Project' },
  recursive: true,
})

const { valid, warnings } = validate(meta)
for (const w of warnings) {
  console.warn(`[${w.severity}] ${w.property}: ${w.message}`)
}
```

##### Discover files without parsing them:

```typescript
import { discover } from '@kitschpatrol/codemeta'

const files = await discover('/path/to/project')
for (const f of files) {
  console.log(`${f.parserName}: ${f.filePath}`)
}
```

### CLI

#### Command: `codemeta`

Discover and parse software metadata from files and directories into CodeMeta JSON-LD.

Usage:

```txt
codemeta [paths..]
```

| Positional Argument | Description | Type | Default |
| ------------------- | --------------------------------------------------- | -------- | ------- |
| `paths` | Paths to files or directories to scan for metadata. | `string` | `["."]` |

| Option | Description | Type | Default |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | ------- |
| `--verbose` `-V` | Enable verbose logging | `boolean` | |
| `--output` `-o` | Write output to file | `string` | |
| `--basic` | Output simplified metadata with predictable types (no JSON-LD boilerplate) | `boolean` | `false` |
| `--enrich` | Enable automatic inference and enrichment | `boolean` | `false` |
| `--validate` | Validate and report on metadata quality | `boolean` | `false` |
| `--exclude` | Filenames or globs to exclude from automatic discovery in directories | `array` | |
| `--retain` | Retain existing codemeta.json as input alongside primary metadata sources. Without this flag, an existing codemeta.json is only used when no primary sources (package.json, Cargo.toml, etc.) are found. | `boolean` | `false` |
| `--recursive` `-r` | Scan subdirectories for metadata | `boolean` | `false` |
| `--set` `-s` | Override a property (e.g. --set name="My Project") | `array` | |
| `--base-uri` | Base URI for identifiers | `string` | |
| `--help` `-h` | Show help | `boolean` | |
| `--version` `-v` | Show version number | `boolean` | |

#### Examples

Generate `codemeta.json` from the current directory, emitting to stdout:

```sh
codemeta
```

Scan a project recursively with enrichment, writing to a file:

```sh
codemeta /path/to/project -r --enrich -o codemeta.json
```

Compose from specific files:

```sh
codemeta package.json pyproject.toml
```

Override a property:

```sh
codemeta --set name="My Project" --set version="2.0.0"
```

Validate the output:

```sh
codemeta --validate
```

Set a base URI for the `@id` field:

```sh
codemeta --base-uri https://github.com/user/my-project
```

Exclude files from discovery:

```sh
codemeta -r --exclude "examples/**" --exclude "vendor/**"
```

## Background

### Motivation

Having a native JavaScript/TypeScript tool for generating `codemeta.json` makes it easy to integrate into Node.js-based CI pipelines or toolchains without introducing a Python dependency or requiring containerization.

My [MetaScope](https://github.com/kitschpatrol/metascope) and [github-action-repo-sync](https://github.com/kitschpatrol/github-action-repo-sync) projects both needed a Node-based tool for generating `codemeta.json` files.

### Implementation notes

The behavior and output of the [codemetapy](https://github.com/proycon/codemetapy) binary served as a functional reference during development. This project is an independent clean-room implementation of similar functionality. Correctness was validated by comparing CLI output from this tool and codemetapy against representative test fixtures. Where the behavior of codemetapy is inconsistent with the CodeMeta spec, the CodeMeta spec takes precedence. One example is the treatment of `devDependencies` in `package.json` files as `softwareSuggestions` instead of `softwareRequirements`.

This tool always outputs [CodeMeta v3.1](https://w3id.org/codemeta/3.1) files. When ingesting `codemeta.json` files defined in the older [CodeMeta v1](https://doi.org/10.5063/SCHEMA/CODEMETA-1.0) and [CodeMeta v2](https://doi.org/10.5063/SCHEMA/CODEMETA-2.0) contexts, all simple key re-mappings defined in the crosswalk table are applied. However, some more nuanced conditional transformations (like the reassignment of copyright holding agents in v1) are not implemented.

For development and building the project itself, we're stuck on Node.js version ^22 specifically until Node Tree-sitter [issues](https://github.com/tree-sitter/node-tree-sitter/issues/268) [related](https://github.com/tree-sitter/node-tree-sitter/issues/276) to more recent versions of Node get resolved.

### Related projects

- [codemetapy](https://github.com/proycon/codemetapy)\
Translate software metadata into the CodeMeta vocabulary (Python)
- [codemeta-harvester](https://github.com/proycon/codemeta-harvester)\
Aggregate software metadata into the CodeMeta vocabulary from source repositories and service endpoints (Python)
- [bibliothecary](https://github.com/librariesio/bibliothecary)\
Manifest discovery and parsing for [libraries.io](https://libraries.io/) (Ruby)
- [diggity](https://github.com/carbonetes/diggity)\
Generates SBOMs for container images, filesystems, archives, and more (Go)
- [SOMEF](https://github.com/KnowledgeCaptureAndDiscovery/somef/)\
Software Metadata Extraction Framework (Python)
- [Upstream Ontologist](https://github.com/jelmer/upstream-ontologist)\
A common interface for finding metadata about upstream software projects (Rust)

## Slop factor

_Medium._

The architecture, test fixture curation, and documentation required manual care and feeding, but the implementation was driven pretty heavily by Claude Code and has been subject to only moderate post-facto human scrutiny.

## Maintainers

@kitschpatrol

## Acknowledgments

Thank you to the [CodeMeta Project Management Committee and contributors](https://codemeta.github.io/governance/people/) for their development and stewardship of the standard.

Jacob Peddicord's [askalono](https://github.com/jpeddicord/askalono) project inspired the [Dice-Sørensen](https://en.wikipedia.org/wiki/Dice-S%C3%B8rensen_coefficient) scoring strategy used for classifying arbitrary license text.

## Contributing

[Issues](https://github.com/kitschpatrol/codemeta/issues) and pull requests are welcome.

## License

[Apache-2.0](license.txt) © Eric Mika