{"id":13846579,"url":"https://github.com/boyter/lc","last_synced_at":"2025-09-02T18:37:08.907Z","repository":{"id":45781074,"uuid":"118385312","full_name":"boyter/lc","owner":"boyter","description":"licensechecker (lc) a command line application which scans directories and identifies what software license things are under producing reports as either SPDX, CSV, JSON, XLSX or CLI Tabular output. Dual-licensed under MIT or the UNLICENSE.","archived":false,"fork":false,"pushed_at":"2024-02-01T00:44:55.000Z","size":54490,"stargazers_count":117,"open_issues_count":4,"forks_count":16,"subscribers_count":8,"default_branch":"master","last_synced_at":"2024-05-02T00:56:02.855Z","etag":null,"topics":["classifier","cli","command-line-tool","commandline","go","golang","license","license-management","licensechecker","open-source-licensing","spdx"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/boyter.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-01-22T00:06:16.000Z","updated_at":"2024-05-30T04:34:30.160Z","dependencies_parsed_at":"2024-01-13T17:10:48.942Z","dependency_job_id":"ce4577f3-ef72-41fa-a406-988be432c31c","html_url":"https://github.com/boyter/lc","commit_stats":null,"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boyter%2Flc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boyter%2Flc/tags","releases_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/boyter%2Flc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boyter%2Flc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/boyter","download_url":"https://codeload.github.com/boyter/lc/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243848102,"owners_count":20357491,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classifier","cli","command-line-tool","commandline","go","golang","license","license-management","licensechecker","open-source-licensing","spdx"],"created_at":"2024-08-04T18:00:40.932Z","updated_at":"2025-03-17T07:31:00.057Z","avatar_url":"https://github.com/boyter.png","language":"Go","readme":"licensechecker (lc)\n-------------------\n\n# NOTE - this is under heavy development, and as such master does not currently work, see a release for a working solution!\n\n`lc` is a command line tool that recursively iterates over a supplied directory or file \nattempting to identify what software license each file is under using the list\nof licenses supplied by the SPDX (Software Package Data Exchange) Project. It will pick up \nlicense files named appropriately or inline licenses such as the below in source files\n\n`SPDX-License-Identifier: GPL-3.0-only`\n\nIn a nutshell this project is a reimplementation of http://www.boyter.org/2017/05/identify-software-licenses-python-vector-space-search-ngram-keywords/ using Go while I attempt to nut out the nuances of the language. 
\n\nIt can produce reports as valid [SPDX](https://spdx.org/), CSV, XLSX, JSON or formatted CLI output. It has been designed to work inside CI systems that capture either stdout or file artifacts.\n\n[![Go](https://github.com/boyter/lc/actions/workflows/go.yml/badge.svg)](https://github.com/boyter/lc/actions/workflows/go.yml)\n[![Scc Count Badge](https://sloc.xyz/github/boyter/lc/)](https://github.com/boyter/lc/)\n\nDual-licensed under MIT or the [UNLICENSE](http://unlicense.org).\n\n### Support\n\nUsing `lc` commercially? If you want priority support for `lc` you can purchase a year's worth at https://boyter.gumroad.com/l/vixqn which entitles you to priority direct email support from the developer.\n\n### Why\n\nIn short, taken from http://ben.balter.com/licensee/\n\n * You've got an open source project. How do you know what you can and can't do with the software?\n * You've got a bunch of open source projects, how do you know what their licenses are?\n * You've got a project with a license file, but which license is it? Has it been modified?\n\nWhy should you care about what licenses your code runs under? 
See \n\n * http://www.openlogic.com/resources/enterprise-blog/archive/use-spdx-for-open-source-license-compliance \n * https://thenewstack.io/spdx-open-source-cheap-compliance-license-can-expensive/\n * https://www.infoworld.com/article/2839560/open-source-software/sticking-a-license-on-everything.html\n\n### Installation\n\nThe binary name for `licensechecker` is `lc`.\n\nFor binary files see the releases page at https://github.com/boyter/lc/releases. To build from source you need to have Go set up with a working GOPATH and your Go binary path exported like so,\n\n```\nexport PATH=$PATH:$(go env GOPATH)/bin\n```\n\nthen to install\n\n```\n$ go install\n```\n\n\n### Usage\n\nCommand line usage of `licensechecker` is designed to be as simple as possible.\nFull details can be found in `lc --help`.\n\n```\n$ lc --help\nNAME:\n   licensechecker - Check directory for licenses and list what license(s) a file is under\n\nUSAGE:\n   lc [global options] [DIRECTORY|FILE] [DIRECTORY|FILE]\n\nVERSION:\n   1.3.0\n\nCOMMANDS:\n     help, h  Shows a list of commands or help for one command\n\nGLOBAL OPTIONS:\n   --format csv, -f csv                                Set output format, supports progress, tabular, json, spdx, summary, xlsx or csv (default: \"tabular\")\n   --output FILE, -o FILE                              Set output file if not set will print to stdout FILE\n   --confidence 0.95, -c 0.95                          Set required confidence level for licence matching between 0 and 1 E.G. 
0.95 (default: \"0.85\")\n   --deepguess true, --dg true                         Should attempt to deep guess the licence false or true true (default: \"true\")\n   --filesize 50000, --fs 50000                        How large a file in bytes should be processed 50000 (default: \"50000\")\n   --licensefiles copying,readme, --lf copying,readme  Possible license files to inspect for over-arching license as comma seperated list copying,readme (default: \"license,licence,copying,readme\")\n   --pathblacklist .git,.hg,.svn, --pbl .git,.hg,.svn  Which directories should be ignored as comma seperated list .git,.hg,.svn (default: \".git,.hg,.svn\")\n   --extblacklist gif,jpg,png, --xbl gif,jpg,png       Which file extensions should be ignored for deep analysis as comma seperated list E.G. gif,jpg,png (default: \"woff,eot,cur,dm,xpm,emz,db,scc,idx,\nmpp,dot,pspimage,stl,dml,wmf,rvm,resources,tlb,docx,doc,xls,xlsx,ppt,pptx,msg,vsd,chm,fm,book,dgn,blines,cab,lib,obj,jar,pdb,dll,bin,out,elf,so,msi,nupkg,pyc,ttf,woff2,jpg,jpeg,png,gif,bmp,psd,tif,tif\nf,yuv,ico,xls,xlsx,pdb,pdf,apk,com,exe,bz2,7z,tgz,rar,gz,zip,zipx,tar,rpm,bin,dmg,iso,vcd,mp3,flac,wma,wav,mid,m4a,3gp,flv,mov,mp4,mpg,rm,wmv,avi,m4v,sqlite,class,rlib,ncb,suo,opt,o,os,pch,pbm,pnm,ppm\n,pyd,pyo,raw,uyv,uyvy,xlsm,swf\")\n   --documentname LicenseChecker, --dn LicenseChecker  SPDX only. Sets DocumentName E.G. LicenseChecker (default: \"Unknown\")\n   --packagename LicenseChecker, --pn LicenseChecker   SPDX only. Sets PackageName E.G. LicenseChecker (default: \"Unknown\")\n   --documentnamespace value, --dns value              SPDX only. 
Sets DocumentNamespace, if not set will default to http://spdx.org/spdxdocs/[packagename]-[HASH]\n   --help, -h                                          show help\n   --version, -v                                       print the version\n```\n\nMore information about [what licensechecker looks at and how it works](what-we-look-at.md)\n\nProbably the most useful functionality is the `-f` modifier which specifies the output format.\nBy default `licencechecker` will print out results in a tabular CLI format. However as it was designed\nto run at the end of CI tasks you may want to change it. This can be done like so.\n\n```\n$ lc -f tabular .\n$ lc -f progress .\n$ lc -f spdx .\n$ lc -f csv .\n$ lc -f summary .\n```\n\nThe above will process starting in the current directory and print out a formatted list of results to the CLI when finished.\n\nExample output of `licencechecker` running against itself in tabular format while ignoring the .git, licenses and vendor directories\n\n```\n$ lc -pbl .git,vendor,licenses -f tabular .\n-----------------------------------------------------------------------------------------------------------\nDirectory            File                    License                                      Confidence  Size\n-----------------------------------------------------------------------------------------------------------\n.                    .gitignore              (MIT OR Unlicense)                           100.00%     278B\n.                    .travis.yml             (MIT OR Unlicense)                           100.00%     192B\n.                    CODE_OF_CONDUCT.md      (MIT OR Unlicense)                           100.00%     3.1K\n.                    CONTRIBUTING.md         (MIT OR Unlicense)                           100.00%     1.2K\n.                    Gopkg.lock              (MIT OR Unlicense)                           100.00%     1.4K\n.                    
Gopkg.toml              (MIT OR Unlicense)                           100.00%     972B\n.                    LICENSE                 Unlicense AND MIT                            94.83%      1.1K\n.                    README.md               (MIT OR Unlicense)                           100.00%     10.6K\n.                    UNLICENSE               MIT AND Unlicense                            95.16%      1.2K\n.                    database_keywords.json  (MIT OR Unlicense)                           100.00%     3.6M\n.                    licensechecker.spdx     (MIT OR Unlicense)                           100.00%     9.3K\n.                    main.go                 (MIT OR Unlicense)                           100.00%     3.4K\n.                    what-we-look-at.md      (MIT OR Unlicense)                           100.00%     3.7K\nexamples/identifier  LICENSE                 GPL-3.0+ AND MIT                             95.40%      1K\nexamples/identifier  LICENSE2                MIT AND GPL-3.0+                             99.65%      35K\nexamples/identifier  has_identifier.py       (MIT OR GPL-3.0+) AND GPL-2.0                100.00%     409B\nparsers              constants.go            (MIT OR Unlicense)                           100.00%     4.8M\nparsers              formatter.go            (MIT OR Unlicense)                           100.00%     8.5K\nparsers              formatter_test.go       (MIT OR Unlicense)                           100.00%     1.3K\nparsers              guesser.go              (MIT OR Unlicense)                           100.00%     9.8K\nparsers              guesser_test.go         (MIT OR Unlicense) AND GPL-2.0 AND GPL-3.0+  100.00%     4.8K\nparsers              helpers.go              (MIT OR Unlicense) AND Apache-2.0            100.00%     2.4K\nparsers              helpers_test.go         (MIT OR Unlicense)                           100.00%     2.8K\nparsers              structs.go              (MIT OR Unlicense)                
           100.00%     679B\nscripts              build_database.py       (MIT OR Unlicense)                           100.00%     4.6K\nscripts              include.go              (MIT OR Unlicense)                           100.00%     951B\n-----------------------------------------------------------------------------------------------------------\n```\n\nTo write out the results to a CSV file\n\n```\n$ lc --format csv -output licences.csv --pathblacklist .git,licenses,vendor .\n```\n\nOr to a SPDX 2.1 file\n\n```\n$ lc -f spdx -o licensechecker.spdx --pbl .git,vendor,licenses -dn licensechecker -pn licensechecker .\n```\n\nYou can specify multiple directories as additional arguments and all results will be merged into a single output\n\n```\n$ lc -f tabular ./examples/identifier ./scripts\n```\n\nYou can also specify files and directories as additional arguments \n\n```\n$ lc -f tabular README.md LICENSE ./examples/identifier\n------------------------------------------------------------------------------------------\nDirectory              File               License                        Confidence  Size\n------------------------------------------------------------------------------------------\n                       README.md          NOASSERTION                    100.00%     11.3K\n                       LICENSE            MIT                            94.83%      1.1K\n./examples/identifier  LICENSE            GPL-3.0+ AND MIT               95.40%      1K\n./examples/identifier  LICENSE2           MIT AND GPL-3.0+               99.65%      35K\n./examples/identifier  has_identifier.py  (MIT OR GPL-3.0+) AND GPL-2.0  100.00%     409B\n------------------------------------------------------------------------------------------\n```\n\n### SPDX\n\nThe ouput of SPDX is a valid SPDX 2.1 document. 
Validation was checked against the tools supplied by the SPDX group.\nRunning master against itself to produce a SPDX and the validating using the tools from https://github.com/spdx/tools\n\n```\n$ go run main.go  -f spdx -o spdx_example.spdx --pbl .git,vendor,licenses -dn licensechecker -pn licensechecker . \u0026\u0026 java -jar ./spdx-tools-2.1.12-SNAPSHOT-jar-with-dependencies.jar Verify ./spdx_example.spdx\nERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.\n03:49:29.479 [main] ERROR org.apache.jena.rdf.model.impl.RDFReaderFImpl - Rewired RDFReaderFImpl - configuration changes have no effect on reading\n03:49:29.482 [main] ERROR org.apache.jena.rdf.model.impl.RDFReaderFImpl - Rewired RDFReaderFImpl - configuration changes have no effect on reading\nThis SPDX Document is valid.\n```\n\n### Package\n\nRun go build for windows and linux then the following in linux, keep in mind need to update the version\n\n```\nzip -r9 lc-1.0.0-x86_64-pc-windows.zip lc.exe \u0026\u0026 zip -r9 lc-1.0.0-x86_64-unknown-linux.zip lc\n\nGOOS=darwin GOARCH=amd64 go build \u0026\u0026 zip -r9 lc-1.0.0-x86_64-apple-darwin.zip lc\nGOOS=windows GOARCH=amd64 go build \u0026\u0026 zip -r9 lc-1.0.0-x86_64-pc-windows.zip lc.exe\nGOOS=linux GOARCH=amd64 go build \u0026\u0026 zip -r9 lc-1.0.0-x86_64-unknown-linux.zip lc\n```\n\n### Most Common Software Licences\n\nSource https://www.blackducksoftware.com/top-open-source-licenses\n\nSource https://blog.sourced.tech/post/gld/pga-licenses.csv\n\n```\nRank \tOpen Source License \t                            %\n1.      MIT License \t                                    38%\n2.      GNU General Public License (GPL 2.0) \t            14%\n3.      Apache License 2.0                                  13%\n4.      ISC License \t                                    10%\n5.      
GNU General Public License (GNU) 3.0 \t            6%\n6.      BSD License 2.0 (3-clause, New or Revised) License  5%\n7.      Artistic License (Perl)                             3%\n8.      GNU Lesser General Public License (LGPL) 2.1 \t    3%\n9.      GNU Lesser General Public License (LGPL) 3.0 \t    1%\n10. \tEclipse Public License (EPL) \t                    1%\n11. \tMicrosoft Public License                            1%\n12. \tSimplified BSD License (BSD) \t                    1%\n13. \tCode Project Open License 1.02 \t                    \u003c 1%\n14. \tMozilla Public License (MPL) 1.1                    \u003c 1%\n15. \tGNU Affero General Public License v3 or later \t    \u003c 1%\n16. \tCommon Development and Distribution License (CDDL)  \u003c 1%\n17. \tDO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE \t    \u003c 1%\n18. \tMicrosoft Reciprocal License \t                    \u003c 1%\n19. \tSun GPL with Classpath Exception v2.0 \t            \u003c 1%\n20. \tzlib/libpng License \t                            \u003c 1%\n```\n\n### TODO\n\n* Add error handling for all the file operations and just in general. 
Most are currently ignored\n* Add logic to guess the file type for SPDX value FileType\n* Add additional unit and integration tests\n* Investigate using \"github.com/gosuri/uitable\" for formatting https://github.com/gosuri/uitable\n* https://web.archive.org/web/20180822173147/https://blog.sourced.tech/post/gld/\n* https://github.com/boyter/boyter.org/blob/01601a2cafc2b2788b29b6943ad45ad40316d9a8/content/posts/improving-lc-performance.md\n* https://reuse.software/\n* /Users/boyter/Documents/projects/linux/LICENSES/preferred \u003c-- add to the list\n* https://github.com/go-enry/go-license-detector/blob/master/FAILURES.md\n","funding_links":[],"categories":["Static Application Security Testing"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fboyter%2Flc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fboyter%2Flc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fboyter%2Flc/lists"}