Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/codespell-project/codespell

check code for common misspellings
https://github.com/codespell-project/codespell

Last synced: 6 days ago
JSON representation

check code for common misspellings

Awesome Lists containing this project

README

        

codespell
=========

Fix common misspellings in text files. It's designed primarily for checking
misspelled words in source code (backslash escapes are skipped), but it can be used with other files as well.
It does not check for word membership in a complete dictionary, but instead
looks for a set of common misspellings. Therefore it should catch errors like
"adn", but it will not catch "adnasdfasdf". This also means it shouldn't
generate false-positives when you use a niche term it doesn't know about.

Useful links
------------

* `GitHub project `_

* `Repository `_

* `Releases `_

Requirements
------------

Python 3.8 or above.

Installation
------------

You can use ``pip`` to install codespell with e.g.:

.. code-block:: sh

pip install codespell

Usage
-----

Below are some simple usage examples to demonstrate how the tool works.
For exhaustive usage information, please check the output of ``codespell -h``.

Run codespell in all files of the current directory:

.. code-block:: sh

codespell

Run codespell in specific files or directories (specified via their names or glob patterns):

.. code-block:: sh

codespell some_file some_dir/ *.ext

Some noteworthy flags:

.. code-block:: sh

codespell -w, --write-changes

The ``-w`` flag will actually implement the changes recommended by codespell. Running without the ``-w`` flag is the same as doing a dry run. It is recommended to run this with the ``-i`` or ``--interactive`` flag.

.. code-block:: sh

codespell -I FILE, --ignore-words=FILE

The ``-I`` flag can be used for a list of certain words to allow that are in the codespell dictionaries. The format of the file is one word per line. Invoke using: ``codespell -I path/to/file.txt`` to execute codespell referencing said list of allowed words. See `Ignoring Words`_ for more details.

.. code-block:: sh

codespell -L word1,word2,word3,word4

The ``-L`` flag can be used to allow certain words that are comma-separated placed immediately after it. See `Ignoring Words`_ for more details.

.. code-block:: sh

codespell -x FILE, --exclude-file=FILE

Ignore whole lines that match those in ``FILE``. The lines in ``FILE`` should match the to-be-excluded lines exactly.

.. code-block:: sh

codespell -S, --skip=

Comma-separated list of files to skip. It accepts globs as well. Examples:

* to skip .eps & .txt files, invoke ``codespell --skip="*.eps,*.txt"``

* to skip directories, invoke ``codespell --skip="./src/3rd-Party,./src/Test"``

Useful commands:

.. code-block:: sh

codespell -d -q 3 --skip="*.po,*.ts,./src/3rdParty,./src/Test"

List all typos found except translation files and some directories.
Display them without terminal colors and with a quiet level of 3.

.. code-block:: sh

codespell -i 3 -w

Run interactive mode level 3 and write changes to file.

We ship a collection of dictionaries that are an improved version of the one available
`on Wikipedia `_
after applying them in projects like Linux Kernel, EFL, oFono among others.
You can provide your own version of the dictionary, but patches for
new/different entries are very welcome.

Want to know if a word you're proposing exists in codespell already? It is possible to test a word against the current set dictionaries that exist in ``codespell_lib/data/dictionary*.txt`` via:

.. code-block:: sh

echo "word" | codespell -
echo "1stword,2ndword" | codespell -

You can select the optional dictionaries with the ``--builtin`` option.

Ignoring words
--------------

When ignoring false positives, note that spelling errors are *case-insensitive* but words to ignore are *case-sensitive*. For example, the dictionary entry ``wrod`` will also match the typo ``Wrod``, but to ignore it you must pass ``Wrod``.

The words to ignore can be passed in two ways:

1. ``-I``: A file with a word per line to ignore:

.. code-block:: sh

codespell -I FILE, --ignore-words=FILE

2. ``-L``: A comma separated list of words to ignore on the command line:

.. code-block:: sh

codespell -L word1,word2,word3,word4

Inline ignore
-------------

Some situation might require ignoring a specific word in a specific location. This can be achieved by adding a comment in the source code.
You can either ignore a single word or a list of words. The comment should be in the format of ``codespell:ignore ``.
Words should be separated by a comma.

1. ignore specific word:

.. code-block:: python

def wrod() # codespell:ignore wrod
pass

2. ignore multiple words:

.. code-block:: python

def wrod(wrods) # codespell:ignore
pass

Using a config file
-------------------

Command line options can also be specified in a config file.

When running ``codespell``, it will check in the current directory for an
`INI file `_
named ``setup.cfg`` or ``.codespellrc`` (or a file specified via ``--config``),
containing an entry named ``[codespell]``. Each command line argument can
be specified in this file (without the preceding dashes), for example:

.. code-block:: ini

[codespell]
skip = *.po,*.ts,./src/3rdParty,./src/Test
count =
quiet-level = 3

Python's
`configparser `_
module defines the exact format of INI config files. For example,
comments are possible using ``;`` or ``#`` as the first character.

Codespell will also check in the current directory for a ``pyproject.toml``
file (or a file specified via ``--toml``), and the ``[tool.codespell]``
entry will be used. For versions of Python prior to 3.11, this requires
the tomli_ package. For example, here is the TOML equivalent of the
previous config file:

.. code-block:: toml

[tool.codespell]
skip = '*.po,*.ts,./src/3rdParty,./src/Test'
count = true
quiet-level = 3

The above INI and TOML files are equivalent to running:

.. code-block:: sh

codespell --skip "*.po,*.ts,./src/3rdParty,./src/Test" --count --quiet-level 3

If several config files are present, they are read in the following order:

#. ``pyproject.toml`` (only if the ``tomli`` library is available for Python < 3.11)
#. ``setup.cfg``
#. ``.codespellrc``
#. any additional file supplied via ``--config``

If a codespell configuration is supplied in several of these files,
the configuration from the most recently read file overwrites previously
specified configurations. Any options specified in the command line will
*override* options from the config files.

Values in a config file entry cannot start with a ``-`` character, so if
you need to do this, structure your entries like this:

.. code-block:: ini

[codespell]
dictionary = mydict,-
ignore-words-list = bar,-foo

instead of these invalid entries:

.. code-block:: ini

[codespell]
dictionary = -,mydict
ignore-words-list = -foo,bar

.. _tomli: https://pypi.org/project/tomli/

pre-commit hook
---------------

codespell also works with `pre-commit `_, using

.. code-block:: yaml

- repo: https://github.com/codespell-project/codespell
rev: v2.2.4
hooks:
- id: codespell

If one configures codespell using the `pyproject.toml` file instead use:

.. code-block:: yaml

- repo: https://github.com/codespell-project/codespell
rev: v2.2.4
hooks:
- id: codespell
additional_dependencies:
- tomli

Dictionary format
-----------------

The format of the dictionaries was influenced by the one they originally came from,
i.e. from Wikipedia. The difference is how multiple options are treated and
that the last argument is an optional reason why a certain entry could not be
applied directly, but should instead be manually inspected. E.g.:

1. Simple entry: one wrong word / one suggestion::

calulated->calculated

2. Entry with more than one suggested fix::

fiel->feel, field, file, phial,

Note the last comma! You need to use it, otherwise the last suggestion
will be discarded (see below for why). When there is more than one
suggestion, an automatic fix is not possible and the best we can do is
to give the user the file and line where the error occurred as well as
the suggestions.

3. Entry with one word, but with automatic fix disabled::

clas->class, disabled because of name clash in c++

Note that there isn't a comma at the end of the line. The last argument is
treated as the reason why a suggestion cannot be automatically applied.

There can also be multiple suggestions but any automatic fix will again be
disabled::

clas->class, clash, disabled because of name clash in c++

Development setup
-----------------

As suggested in the `Python Packaging User Guide`_, ensure ``pip``, ``setuptools``, and ``wheel`` are up to date before installing from source. Specifically you will need recent versions of ``setuptools`` and ``setuptools_scm``:

.. code-block:: sh

pip install --upgrade pip setuptools setuptools_scm wheel

You can install required dependencies for development by running the following within a checkout of the codespell source:

.. code-block:: sh

pip install -e ".[dev]"

To run tests against the codebase run:

.. code-block:: sh

make check

.. _Python Packaging User Guide: https://packaging.python.org/en/latest/tutorials/installing-packages/#requirements-for-installing-packages

Sending pull requests
---------------------

If you have a suggested typo that you'd like to see merged please follow these steps:

1. Make sure you read the instructions mentioned in the ``Dictionary format`` section above to submit correctly formatted entries.

2. Choose the correct dictionary file to add your typo to. See `codespell --help` for explanations of the different dictionaries.

3. Sort the dictionaries. This is done by invoking (in the top level directory of ``codespell/``):

.. code-block:: sh

make check-dictionaries

If the make script finds that you need to sort a dictionary, please then run:

.. code-block:: sh

make sort-dictionaries

4. Only after this process is complete do we recommend you submit the PR.

**Important Notes:**

* If the dictionaries are submitted without being pre-sorted the PR will fail via our various CI tools.
* Not all PRs will be merged. This is pending on the discretion of the devs, maintainers, and the community.

Updating
--------

To stay current with codespell developments it is possible to build codespell from GitHub via:

.. code-block:: sh

pip install --upgrade git+https://github.com/codespell-project/codespell.git

**Important Notes:**

* Sometimes installing via ``pip`` will complain about permissions. If this is the case then run with:

.. code-block:: sh

pip install --user --upgrade git+https://github.com/codespell-project/codespell.git

* It has been reported that after installing from ``pip``, codespell can't be located. Please check the $PATH variable to see if ``~/.local/bin`` is present. If it isn't then add it to your path.
* If you decide to install via ``pip`` then be sure to remove any previously installed versions of codespell (via your platform's preferred app manager).

Updating the dictionaries
-------------------------

In the scenario where the user prefers not to follow the development version of codespell yet still opts to benefit from the frequently updated dictionary files, we recommend running a simple set of commands to achieve this:

.. code-block:: sh

wget https://raw.githubusercontent.com/codespell-project/codespell/main/codespell_lib/data/dictionary.txt
codespell -D dictionary.txt

The above simply downloads the latest ``dictionary.txt`` file and then by utilizing the ``-D`` flag allows the user to specify the freshly downloaded ``dictionary.txt`` as the custom dictionary instead of the default one.

You can also do the same thing for the other dictionaries listed here:
https://github.com/codespell-project/codespell/tree/main/codespell_lib/data

License
-------

The Python script ``codespell`` with its library ``codespell_lib`` is available
with the following terms:
(*tl;dr*: `GPL v2`_)

Copyright (C) 2010-2011 Lucas De Marchi

Copyright (C) 2011 ProFUSION embedded systems

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, see
.

.. _GPL v2: https://www.gnu.org/licenses/old-licenses/gpl-2.0.html

``dictionary.txt`` and the other ``dictionary_*.txt`` files are derivative works of English Wikipedia and are released under the `Creative Commons Attribution-Share-Alike License 3.0 `_.