Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/github-linguist/linguist
Language Savant. If your repository's language is being reported incorrectly, send us a pull request!
language-grammars language-statistics linguistic syntax-highlighting
Last synced: 5 days ago
- Host: GitHub
- URL: https://github.com/github-linguist/linguist
- Owner: github-linguist
- License: mit
- Created: 2011-05-09T22:53:13.000Z (over 13 years ago)
- Default Branch: master
- Last Pushed: 2024-05-01T17:24:12.000Z (9 months ago)
- Last Synced: 2024-05-01T22:57:32.977Z (9 months ago)
- Topics: language-grammars, language-statistics, linguistic, syntax-highlighting
- Language: Ruby
- Homepage:
- Size: 38.1 MB
- Stars: 11,819
- Watchers: 514
- Forks: 4,059
- Open Issues: 181
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
Awesome Lists containing this project
- awesome-github-repos - github-linguist/linguist - Language Savant. If your repository's language is being reported incorrectly, send us a pull request! (Ruby)
- my-awesome - github-linguist/linguist - grammars,language-statistics,linguistic,syntax-highlighting pushed_at:2025-01 star:12.5k fork:4.4k Language Savant. If your repository's language is being reported incorrectly, send us a pull request! (Ruby)
- StarryDivineSky - github-linguist/linguist
- jimsghstars - github-linguist/linguist - Language Savant. If your repository's language is being reported incorrectly, send us a pull request! (Ruby)
README
# Linguist
[![Actions Status](https://github.com/github/linguist/workflows/Run%20Tests/badge.svg)](https://github.com/github/linguist/actions)
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/github-linguist/linguist)
This library is used on GitHub.com to detect blob languages, ignore binary or vendored files, suppress generated files in diffs, and generate language breakdown graphs.
## Documentation
- [How Linguist works](/docs/how-linguist-works.md)
- [Change Linguist's behaviour with overrides](/docs/overrides.md)
- [Troubleshooting](/docs/troubleshooting.md)
- [Contributing guidelines](CONTRIBUTING.md)

## Installation
Install the gem:
```bash
gem install github-linguist
```

### Dependencies
Linguist is a Ruby library so you will need a recent version of Ruby installed.
The macOS/Xcode-supplied version of Ruby has known problems that cause trouble when installing some of the dependencies.
Accordingly, we highly recommend you install a version of Ruby using Homebrew, `rbenv`, `rvm`, `ruby-build`, `asdf` or another packaging system before attempting to install Linguist and its dependencies.

Linguist uses [`charlock_holmes`](https://github.com/brianmario/charlock_holmes) for character encoding and [`rugged`](https://github.com/libgit2/rugged) for libgit2 bindings for Ruby.
These components have their own dependencies.

1. charlock_holmes
    * cmake
    * pkg-config
    * [ICU](http://site.icu-project.org/)
    * [zlib](https://zlib.net/)
2. rugged
    * [libcurl](https://curl.haxx.se/libcurl/)
    * [OpenSSL](https://www.openssl.org)

You may need to install missing dependencies before you can install Linguist.
For example, on macOS with [Homebrew](http://brew.sh/):

```bash
brew install cmake pkg-config icu4c
```

On Ubuntu:
```bash
sudo apt-get install build-essential cmake pkg-config libicu-dev zlib1g-dev libcurl4-openssl-dev libssl-dev ruby-dev
```

## Usage
### Application usage
Linguist can be used in your application as follows:
```ruby
require 'rugged'
require 'linguist'

repo = Rugged::Repository.new('.')
project = Linguist::Repository.new(repo, repo.head.target_id)
project.language #=> "Ruby"
project.languages #=> { "Ruby" => 119387 }
```

### Command line usage
#### Git Repository
A repository's language stats can also be assessed from the command line using the `github-linguist` executable.
Without any options, `github-linguist` will output the language breakdown by percentage and file size.

```bash
cd /path-to-repository
github-linguist
```

You can try running `github-linguist` on the root directory in this repository itself:
```console
$ github-linguist
66.84% 264519 Ruby
24.68% 97685 C
6.57% 25999 Go
1.29% 5098 Lex
0.32% 1257 Shell
0.31% 1212 Dockerfile
```

#### Additional options
##### `--rev REV`
The `--rev REV` flag will change the git revision being analyzed to any [gitrevisions(1)](https://git-scm.com/docs/gitrevisions#_specifying_revisions) compatible revision you specify.
This is useful to analyze the makeup of a repo as of a certain tag, or in a certain branch.
For example, here is the popular [Jekyll open source project](https://github.com/jekyll/jekyll).
```console
$ github-linguist jekyll

70.64% 709959 Ruby
23.04% 231555 Gherkin
3.80% 38178 JavaScript
1.19% 11943 HTML
0.79% 7900 Shell
0.23% 2279 Dockerfile
0.13% 1344 Earthly
0.10% 1019 CSS
0.06% 606 SCSS
0.02% 234 CoffeeScript
0.01% 90 Hack
```

And here is Jekyll's published website, from the gh-pages branch inside their repository.
```console
$ github-linguist jekyll --rev origin/gh-pages
100.00% 2568354 HTML
```

##### `--breakdown`
The `--breakdown` or `-b` flag will additionally show the breakdown of files by language.
You can try running `github-linguist` on the root directory in this repository itself:
```console
$ github-linguist --breakdown
66.84% 264519 Ruby
24.68% 97685 C
6.57% 25999 Go
1.29% 5098 Lex
0.32% 1257 Shell
0.31% 1212 Dockerfile

Ruby:
Gemfile
Rakefile
bin/git-linguist
bin/github-linguist
ext/linguist/extconf.rb
github-linguist.gemspec
lib/linguist.rb
…
```

##### `--json`
The `--json` or `-j` flag outputs the data in JSON format.
```console
$ github-linguist --json
{"Dockerfile":{"size":1212,"percentage":"0.31"},"Ruby":{"size":264519,"percentage":"66.84"},"C":{"size":97685,"percentage":"24.68"},"Lex":{"size":5098,"percentage":"1.29"},"Shell":{"size":1257,"percentage":"0.32"},"Go":{"size":25999,"percentage":"6.57"}}
```

This option can be used in conjunction with `--breakdown` to get a full list of files along with the size and percentage data.
```console
$ github-linguist --breakdown --json
{"Dockerfile":{"size":1212,"percentage":"0.31","files":["Dockerfile","tools/grammars/Dockerfile"]},"Ruby":{"size":264519,"percentage":"66.84","files":["Gemfile","Rakefile","bin/git-linguist","bin/github-linguist","ext/linguist/extconf.rb","github-linguist.gemspec","lib/linguist.rb",...]}}
```
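
The JSON output is convenient to post-process in a script. The following is a minimal sketch, not part of Linguist itself, that parses the sample `--json` output shown above with Ruby's standard `json` library, ranks the languages by size, and recomputes each share from the raw byte counts. The `share` helper and the variable names are hypothetical, and the sketch assumes the output shape matches the sample (a `size` integer and a `percentage` string per language).

```ruby
require 'json'

# Sample data copied from the `github-linguist --json` example in this README.
# In a real script you might capture it instead with:
#   raw = `github-linguist --json`
raw = '{"Dockerfile":{"size":1212,"percentage":"0.31"},' \
      '"Ruby":{"size":264519,"percentage":"66.84"},' \
      '"C":{"size":97685,"percentage":"24.68"},' \
      '"Lex":{"size":5098,"percentage":"1.29"},' \
      '"Shell":{"size":1257,"percentage":"0.32"},' \
      '"Go":{"size":25999,"percentage":"6.57"}}'

stats = JSON.parse(raw)

# Rank languages by byte size, largest first.
ranked = stats.sort_by { |_name, info| -info['size'] }

# Total number of bytes across all detected languages.
total = stats.sum { |_name, info| info['size'] }

# Hypothetical helper: a language's share of the total, as a percentage
# rounded to two decimal places.
def share(size, total)
  (size * 100.0 / total).round(2)
end

# Print one line per language: name, size in bytes, recomputed share.
ranked.each do |name, info|
  puts format('%-12s %8d bytes  %6.2f%%', name, info['size'], share(info['size'], total))
end
```

Note that the `percentage` values in the JSON are strings, so convert them with `to_f` before doing arithmetic on them; recomputing from `size`, as above, avoids that step.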
#### Single file
Alternatively you can find stats for a single file using the `github-linguist` executable.
You can try running `github-linguist` on files in this repository itself:
```console
$ github-linguist grammars.yml
grammars.yml: 884 lines (884 sloc)
type: Text
mime type: text/x-yaml
language: YAML
```

#### Docker
If you have Docker installed you can build an image and run Linguist within a container:
```console
$ docker build -t linguist .
$ docker run --rm -v $(pwd):$(pwd) -w $(pwd) -t linguist
66.84% 264519 Ruby
24.68% 97685 C
6.57% 25999 Go
1.29% 5098 Lex
0.32% 1257 Shell
0.31% 1212 Dockerfile
$ docker run --rm -v $(pwd):$(pwd) -w $(pwd) -t linguist github-linguist --breakdown
66.84% 264519 Ruby
24.68% 97685 C
6.57% 25999 Go
1.29% 5098 Lex
0.32% 1257 Shell
0.31% 1212 Dockerfile

Ruby:
Gemfile
Rakefile
bin/git-linguist
bin/github-linguist
ext/linguist/extconf.rb
github-linguist.gemspec
lib/linguist.rb
…
```

## Contributing
Please check out our [contributing guidelines](CONTRIBUTING.md).
## License
The language grammars included in this gem are covered by their repositories' respective licenses.
[`vendor/README.md`](/vendor/README.md) lists the repository for each grammar.

All other files are covered by the MIT license, see [`LICENSE`](./LICENSE).