https://github.com/JohannesBuchner/languagecheck
Improve the language of your paper before submission
grammar-checker language-analysis latex overleaf proof-reading publishing python
- Host: GitHub
- URL: https://github.com/JohannesBuchner/languagecheck
- Owner: JohannesBuchner
- License: bsd-2-clause
- Created: 2016-05-19T20:23:29.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2023-07-24T02:13:53.000Z (over 2 years ago)
- Last Synced: 2025-07-16T09:48:24.910Z (6 months ago)
- Topics: grammar-checker, language-analysis, latex, overleaf, proof-reading, publishing, python
- Language: Python
- Size: 4.81 MB
- Stars: 106
- Watchers: 7
- Forks: 14
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- License: LICENSE
Awesome Lists containing this project
- awesome-scientific-writing - LanguageCheck - Analyses scientific LaTeX papers, suggesting improvements from a list of common mistakes/ambiguities, tense consistency, a vs. an, spell check, and paragraph topic sentences. (Spell Checking and Linting)
README
Language checking for scientific papers
--------------------------------------------
This program attempts to assist you in improving your paper before submission.
Features
---------
* Can analyse any LaTeX paper, including Overleaf projects.
* Makes automated reports to point you to improvements:
  * Word level:
    * find common grammar mistakes, like wrong prepositions
    * find wordy phrases and suggest replacements
    * a vs an
    * spell-check (using hunspell)
  * Sentence level:
    * find long, wordy sentences
    * check topic sentences
  * Paragraph level:
    * find tense inconsistencies
  * Paper level:
    * check visual impression of paper
* All analysis is done offline -- your text does not leave your computer.
* Supports British and American English, but focusses on issues applying to both.
Note that there are false positives: the reports only point out potential
issues, and only you can decide whether a change would make sense.
If you find some rules useless (too many false positives), or you want to add more, please send a pull request!
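To give a feel for what a sentence-level check involves, here is a minimal sketch of flagging long sentences. It is not the code used by languagecheck: the function name ``find_long_sentences``, the 40-word threshold, and the naive regex sentence splitter are all assumptions made for illustration.

```python
import re


def find_long_sentences(text, max_words=40):
    """Return (word_count, sentence) pairs for sentences above max_words.

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    # Naive split: sentence-ending punctuation followed by whitespace.
    # Abbreviations such as "e.g." will be split incorrectly.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        n_words = len(sentence.split())
        if n_words > max_words:
            flagged.append((n_words, sentence))
    return flagged


if __name__ == "__main__":
    sample = "We present a method. It " + "really " * 45 + "works."
    for n_words, sentence in find_long_sentences(sample):
        print(f"{n_words} words: {sentence[:60]}...")
```

As with the real reports, a flagged sentence is only a candidate for rewriting; a long sentence can still be the right choice.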
Demo output
-------------
Example analysis (of an early draft of `this paper `_):
* `Example report for misused phrases `_
* `Overview of all reports `_
Requirements
-------------
* python
* convert command (ImageMagick): Install with your distribution's package manager
* nltk: Install with pip
* nltk data: Install with python -m nltk.downloader all
* detex command (usually comes with LaTeX)
* pyhunspell (optional): Install with pip
Installation
--------------
These commands should not give you an error::
    $ which convert
    $ which python
    $ which detex
    $ which hunspell
    $ ls /usr/share/hunspell/{en_US,en_UK}.{dic,aff}
Then install the python packages and data::
    $ pip install pyhunspell --user
    $ pip install nltk --user
    $ python -m nltk.downloader cmudict stopwords punkt_tab averaged_perceptron_tagger_eng
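The same prerequisite checks can be scripted. The sketch below is not part of languagecheck; the helper names ``missing_commands`` and ``missing_modules`` are hypothetical. It looks up the command-line tools listed above with ``shutil.which`` and tests whether nltk is importable. hunspell is included in the list even though spell-checking is optional.

```python
import importlib.util
import shutil

# Commands named in the installation checks above.
REQUIRED_COMMANDS = ["convert", "python", "detex", "hunspell"]
# Python packages that must be importable.
REQUIRED_MODULES = ["nltk"]


def missing_commands(commands=REQUIRED_COMMANDS):
    """Return the commands that cannot be found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level modules that cannot be imported."""
    return [mod for mod in modules if importlib.util.find_spec(mod) is None]


if __name__ == "__main__":
    for cmd in missing_commands():
        print(f"missing command: {cmd}")
    for mod in missing_modules():
        print(f"missing Python module: {mod}")
```

If the script prints nothing, the tools and nltk are in place. It does not check the hunspell dictionary files or the nltk data.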
Usage
--------------
*Using directly*:
* Create a PDF from your LaTeX file -> mypaper.pdf
  * For example, run "pdflatex mypaper.tex".
* Use detex to create a plain text file -> mypaper.txt
  * For example, run "detex mypaper.tex > mypaper.txt". You need detex installed.
  * This does not capture figure captions. The detex.sh script can help you include those texts: "bash detex.sh mypaper.tex". You still need detex installed.
* Run "python languagecheck.py mydir/mypaper.txt mydir/mypaper.pdf".
* Open mypaper_index.html in a web browser to see all reports.
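The manual steps above can be chained in a small wrapper. This is a sketch under assumptions, not a script shipped with languagecheck: ``plan_commands`` and ``run_pipeline`` are hypothetical names, and ``languagecheck.py`` is assumed to be in the current directory. It uses plain detex, so figure captions are not captured.

```python
import subprocess
import sys
from pathlib import Path


def plan_commands(tex_file, script="languagecheck.py"):
    """Build the three commands from the usage steps, without running them."""
    tex = Path(tex_file)
    txt = tex.with_suffix(".txt")
    pdf = tex.with_suffix(".pdf")
    return {
        "pdf": ["pdflatex", tex.name],   # run inside the paper's directory
        "text": ["detex", tex.name],     # stdout is redirected to the .txt file
        "check": [sys.executable, script, str(txt), str(pdf)],
    }


def run_pipeline(tex_file, script="languagecheck.py"):
    """Run pdflatex, detex and languagecheck.py in order; stop on first error."""
    tex = Path(tex_file)
    plan = plan_commands(tex_file, script)
    subprocess.run(plan["pdf"], cwd=tex.parent, check=True)
    with open(tex.with_suffix(".txt"), "w") as out:
        subprocess.run(plan["text"], cwd=tex.parent, stdout=out, check=True)
    subprocess.run(plan["check"], check=True)
```

Calling ``run_pipeline("mydir/mypaper.tex")`` would then write mydir/mypaper.pdf and mydir/mypaper.txt before starting the analysis.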
*Using with Overleaf*::
    $ bash languagecheck_overleaf.sh
    # for example:
    $ bash languagecheck_overleaf.sh https://www.overleaf.com/123456789 mypaper.tex
See also
---------
* style-check, a similar program written in Ruby: https://github.com/nspring/style-check/
* Statistics checklist: Check for common statistics mistakes with this checklist: http://astrost.at/istics/minimal-statistics-checklist.html