https://github.com/hesa/scarfer
Source Code scan report file reporter
- Host: GitHub
- URL: https://github.com/hesa/scarfer
- Owner: hesa
- Created: 2021-12-30T14:33:11.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2025-07-29T09:07:35.000Z (11 months ago)
- Last Synced: 2025-07-29T11:40:57.941Z (11 months ago)
- Topics: compliance, compliance-automation, tool
- Language: Python
- Homepage:
- Size: 6.9 MB
- Stars: 5
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: NEWS
- License: LICENSES/CC-BY-4.0.txt
README
# scarfer
Source code scan report file reporter
# Introduction
Scarfer outputs compliance-related information from a scan report.
A scan report contains a lot of information. Scancode, for example, has 37
top-level entries for each file, and it is often cumbersome to open the
report in an editor to extract the information you want. Scarfer provides
quick command line access to scan reports.
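To give a feel for what Scarfer saves you from doing by hand, here is a minimal Python sketch that lists the file paths in a Scancode-style JSON report. It assumes the report has a top-level `files` list whose entries carry `path` and `type` keys, as Scancode's JSON output does. The sample report and the `list_paths` helper are made up for illustration and are not part of Scarfer.

```python
import json

# A tiny, made-up stand-in for a Scancode JSON report (real reports are far larger).
SAMPLE_REPORT = """
{
  "headers": [{"tool_name": "scancode-toolkit"}],
  "files": [
    {"path": "cairo/src/cairo.c", "type": "file"},
    {"path": "cairo/src", "type": "directory"},
    {"path": "cairo/COPYING", "type": "file"}
  ]
}
"""


def list_paths(report):
    """Return the paths of all regular files (not directories) in a parsed report."""
    return [
        entry["path"]
        for entry in report.get("files", [])
        if entry.get("type") == "file"
    ]


report = json.loads(SAMPLE_REPORT)
for path in list_paths(report):
    print(path)
```

Even this simple listing needs a few lines of code, which is the kind of boilerplate a single `scarfer` call replaces.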
# Features
Scarfer can output the following information per file:
* copyright (using `-c`)
* license (using `-l`)
* text that caused the license detection (`-m`)
Scarfer can output the following summaries:
* license summary (using `-ls`)
* copyright summary (using `-cs`)
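As a rough idea of what a license summary means, the Python sketch below counts how many files carry each license expression. The `(path, license)` pairs and the `license_summary` helper are hypothetical; the actual layout of Scarfer's `-ls` output is not shown here and may well differ.

```python
from collections import Counter

# Hypothetical (path, license) pairs; real data would come from a scan report.
FILES = [
    ("cairo/src/cairo.c", "mpl-1.1 OR lgpl-2.1"),
    ("cairo/src/cairo-drm.c", "mpl-1.1 OR lgpl-2.1"),
    ("cairo/util/tool.c", "gpl-3.0"),
]


def license_summary(files):
    """Count how many files carry each license expression."""
    return Counter(lic for _path, lic in files)


summary = license_summary(FILES)
for lic, count in summary.most_common():
    print(f"{count:3d}  {lic}")
```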
## Filter
Scarfer can filter files:
* include files with:
    * license name (`-il`) using Python's regular expressions
    * files (`-if`) using Python's regular expressions
    * files (`-iff`) by reading a file, containing file names, using Python's regular expressions
    * copyright (`-ec`) using Python's regular expressions
* exclude files with:
    * license name (`-el`) using Python's regular expressions
    * files (`-ef`) using Python's regular expressions
    * files (`-eff`) by reading a file, containing file names, using Python's regular expressions
    * copyright (`-ec`) using Python's regular expressions
*Note: if you use more than one filter, the filters are combined with a logical AND, so a file must pass every filter to be output.*
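The AND behaviour can be pictured with the following Python sketch. It is not Scarfer's implementation: the `keep_file` helper, its parameter names, and the sample data are hypothetical, and it assumes patterns are matched anywhere in the string (as `re.search` does), which is consistent with examples such as `-if drm` but is not confirmed by this README.

```python
import re


def keep_file(path, license_name,
              include_file=None, exclude_file=None,
              include_license=None, exclude_license=None):
    """Return True only if the file passes every filter that was given."""
    checks = []
    if include_file is not None:
        checks.append(re.search(include_file, path) is not None)
    if exclude_file is not None:
        checks.append(re.search(exclude_file, path) is None)
    if include_license is not None:
        checks.append(re.search(include_license, license_name) is not None)
    if exclude_license is not None:
        checks.append(re.search(exclude_license, license_name) is None)
    # all() implements the logical AND; with no filters, every file is kept.
    return all(checks)


# Hypothetical (path, license) pairs for illustration only.
FILES = [
    ("cairo/src/drm/cairo-drm.c", "mpl-1.1 OR lgpl-2.1"),
    ("cairo/src/cairo.c", "mpl-1.1 OR lgpl-2.1"),
    ("cairo/util/drm-tool.c", "gpl-3.0"),
]

kept = [
    path for path, lic in FILES
    if keep_file(path, lic, include_file="drm", include_license="mpl")
]
print(kept)
```

Only the first file is kept: the second fails the path filter and the third fails the license filter.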
## Curate
Scarfer can curate (fix, amend) license identifications:
* curate license (`-cml`) for all files with a missing license
* curate license (`-cfl`) for all files matching a Python regular expression
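Conceptually, the two kinds of curation can be sketched in Python as below. This is only an illustration under stated assumptions: the `curate` helper, its parameters, and the sample data are hypothetical, and the sketch does not show how `-cml` and `-cfl` actually take their arguments.

```python
import re


def curate(files, missing_license=None, file_pattern=None, file_license=None):
    """Return a new list of (path, license) pairs with curations applied.

    missing_license: license assigned to files that have no license.
    file_pattern / file_license: files whose path matches the pattern
    get file_license, whatever was detected before.
    """
    curated = []
    for path, lic in files:
        if file_pattern is not None and re.search(file_pattern, path):
            lic = file_license
        elif not lic and missing_license is not None:
            lic = missing_license
        curated.append((path, lic))
    return curated


# Hypothetical scan results: the second file has no detected license.
FILES = [
    ("cairo/src/cairo.c", "mpl-1.1"),
    ("cairo/src/generated.h", None),
    ("cairo/doc/intro.txt", "unknown"),
]

result = curate(FILES, missing_license="mpl-1.1",
                file_pattern=r"/doc/", file_license="cc-by-4.0")
print(result)
```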
## Configuration file
Scarfer can write and read configuration files:
* output the current command line options (`-oc`) as a configuration
* read configuration file (`--config`)
# Example use
Output the file names (full path) of all the files in the Scancode report `example-data/cairo-1.16.0-scan.json`:
```
$ scarfer example-data/cairo-1.16.0-scan.json
```
As above but output only files with path matching `drm`:
```
$ scarfer example-data/cairo-1.16.0-scan.json -if drm
```
Output the file names (full path) of all the files in the Scancode report `example-data/cairo-1.16.0-scan.json` with a license matching `gpl-3`:
```
$ scarfer example-data/cairo-1.16.0-scan.json -il gpl-3
```
Output the file names (full path) of all the files in the Scancode report `example-data/cairo-1.16.0-scan.json` that have both a license matching `mpl` and a path matching `drm`. The output also contains the license and copyright information for each file:
```
$ scarfer example-data/cairo-1.16.0-scan.json -il mpl -if drm -c -l
```
To include only files whose path contains a `/` followed somewhere later by `pdi`, and that end with `.c`:
```
$ scarfer example-data/cairo-1.16.0-scan.json -if "/.*pdi.*\.c$"
```
To exclude all files whose path contains a `/` followed somewhere later by `pdi`, and that end with `.c`:
```
$ scarfer example-data/cairo-1.16.0-scan.json -ef "/.*pdi.*\.c$"
```
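To check what a pattern such as `/.*pdi.*\.c$` selects before passing it to Scarfer, you can try it in Python directly. The sketch below uses made-up paths and assumes the pattern may match anywhere in the path (`re.search` semantics), which this README does not state explicitly.

```python
import re

# The pattern from the examples above: a "/", later "pdi", and a ".c" suffix.
pattern = re.compile(r"/.*pdi.*\.c$")

# Hypothetical paths, chosen to show both matches and non-matches.
paths = [
    "src/pdi/reader.c",    # match
    "src/pdi/reader.h",    # no match: does not end with ".c"
    "pdi-main.c",          # no match: no "/" before "pdi"
    "lib/foo-pdi-bar.c",   # match
    "lib/other.c",         # no match: no "pdi"
]

included = [p for p in paths if pattern.search(p)]      # what an include filter would keep
excluded = [p for p in paths if not pattern.search(p)]  # what an exclude filter would keep

print("include filter keeps:", included)
print("exclude filter keeps:", excluded)
```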
# Supported scan report formats
* [Scancode](https://github.com/nexB/scancode-toolkit) Toolkit, version 21 and upwards
* [Scancode](https://github.com/nexB/scancode-toolkit) Output Format version 1.0.0, 2.0.0, 3.0, 3.2, 4.0, 4.1.0
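To find out which output format version a report uses, you can inspect its header. The Python sketch below assumes the version is stored as `output_format_version` in an entry of the top-level `headers` list, as recent Scancode JSON output does; the sample report and the `output_format_version` helper are illustrative and not part of Scarfer.

```python
import json

# Made-up, heavily trimmed report header for illustration.
SAMPLE_REPORT = """
{
  "headers": [
    {
      "tool_name": "scancode-toolkit",
      "tool_version": "32.0.0",
      "output_format_version": "3.0.0"
    }
  ],
  "files": []
}
"""


def output_format_version(report):
    """Return the first output format version found in the headers, or None."""
    for header in report.get("headers", []):
        version = header.get("output_format_version")
        if version:
            return version
    return None


report = json.loads(SAMPLE_REPORT)
print(output_format_version(report))
```

Reports from older Scancode releases may lack this field, in which case the helper returns `None`.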
# Hints on source code scanners
## Scancode 32.0*
Assuming you want to scan a directory called `cairo` and store the output in `cairo-scan.json`:
```
# Use all available CPU cores (nproc is part of GNU coreutils)
scancode -clipe \
--license-text --license-text-diagnostics \
--classify --license-clarity-score --summary \
-n "$(nproc)" \
--json-pp cairo-scan.json cairo
```