{"id":21180028,"url":"https://github.com/mmguero/montag","last_synced_at":"2025-07-09T23:31:58.470Z","repository":{"id":57443058,"uuid":"156806970","full_name":"mmguero/montag","owner":"mmguero","description":"Montag is a utility which reads e-book files and scrubs them of profanity","archived":false,"fork":false,"pushed_at":"2024-11-19T03:12:25.000Z","size":80,"stargazers_count":7,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-01T08:04:36.425Z","etag":null,"topics":["e-book","ebook","objectional-language","obscenity","profanity","profanity-detection","profanity-filter","profanityfilter","python","python3","swear-filter"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mmguero.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-11-09T03:46:29.000Z","updated_at":"2024-11-19T03:12:29.000Z","dependencies_parsed_at":"2024-08-14T00:41:22.826Z","dependency_job_id":null,"html_url":"https://github.com/mmguero/montag","commit_stats":{"total_commits":87,"total_committers":3,"mean_commits":29.0,"dds":0.5057471264367817,"last_synced_commit":"a0c4997923179863a8b379366c7c5f856a6cbc3c"},"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/mmguero/montag","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mmguero%2Fmontag","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mmguero%2Fmontag/tags","releases_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/mmguero%2Fmontag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mmguero%2Fmontag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mmguero","download_url":"https://codeload.github.com/mmguero/montag/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mmguero%2Fmontag/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263553523,"owners_count":23479405,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["e-book","ebook","objectional-language","obscenity","profanity","profanity-detection","profanity-filter","profanityfilter","python","python3","swear-filter"],"created_at":"2024-11-20T17:35:43.875Z","updated_at":"2025-07-09T23:31:58.235Z","avatar_url":"https://github.com/mmguero.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Montag\n\n[![Latest Version](https://img.shields.io/pypi/v/montag-cleaner)](https://pypi.python.org/pypi/montag-cleaner/) [![Docker Image](https://github.com/mmguero/montag/workflows/montag-build-push-ghcr/badge.svg)](https://github.com/mmguero/montag/pkgs/container/montag)\n\n*\"Didn't firemen prevent fires rather than stoke them up and get them going?\"*\n\n**montag** is a utility which reads an e-book file (in any format supported by [Calibre's ebook-convert](https://manual.calibre-ebook.com/generated/en/ebook-convert.html)) and scrubs it of profanity (or words from any other list you can provide).\n\nThere are all sorts of 
arguments to be had about obscenity filters, censorship, etc. That's okay! I'm not really interested in having those arguments. My 13-year-old daughter asked me if I could take some swear words out of a young adult novel she was reading, so I wrote this for her. If it's useful to you, great. If not, carry on my wayward son.\n\n**montag** is part of a family of projects with similar goals:\n\n* 📼 [cleanvid](https://github.com/mmguero/cleanvid) for video files (using [SRT-formatted](https://en.wikipedia.org/wiki/SubRip#Format) subtitles)\n* 🎤 [monkeyplug](https://github.com/mmguero/monkeyplug) for audio and video files (using either [Whisper](https://openai.com/research/whisper) or the [Vosk](https://alphacephei.com/vosk/)-[API](https://github.com/alphacep/vosk-api) for speech recognition)\n* 📕 [montag](https://github.com/mmguero/montag) for ebooks\n\n## Installation\n\nUsing `pip`, to install the latest [release from PyPI](https://pypi.org/project/montag-cleaner/):\n\n```\npython3 -m pip install -U montag-cleaner\n```\n\nOr to install directly from GitHub:\n\n```\npython3 -m pip install -U 'git+https://github.com/mmguero/montag'\n```\n\n## Prerequisites\n\n### Python Prerequisites\n\n[Montag](montag.py) requires Python 3 and the [EbookLib](https://github.com/aerkalov/ebooklib) and [python-magic](https://github.com/ahupp/python-magic) libraries. It also uses some utilities from the [Calibre](https://calibre-ebook.com/) project.\n\nOn a Debian-based Linux distribution, these requirements can be installed with:\n```\n$ sudo apt-get install libmagic1 imagemagick calibre-bin python3 python3-magic python3-ebooklib\n```\n\nOn Windows, you'll need DLLs for `libmagic`. 
One option for installing these libraries is [`python-magic-bin`](https://pypi.org/project/python-magic-bin/):\n\n```\npython3 -m pip install python-magic-bin\n```\n\nThe Python dependencies *should* be installed automatically if you are using `pip` to install montag.\n\n### Docker\n\nAlternately, a [Dockerfile](./docker/Dockerfile) is provided to allow you to run Montag in Docker. You can build the `oci.guero.org/montag:latest` Docker image with [`build_docker.sh`](./docker/build_docker.sh), then use [`montag-docker.sh`](./docker/montag-docker.sh) to process your e-book files.\n\n## Usage\n\nMontag is easy to use. Specify the input and output e-book filenames, and, optionally, the file containing the words to be censored (one per line) and the text encoding.\n```\n$ ./montag.py \nusage: montag.py [options]\n\ne-book profanity scrubber\n\nrequired arguments:\n  -i \u003cSTR\u003e, --input \u003cSTR\u003e\n                        Input file\n  -o \u003cSTR\u003e, --output \u003cSTR\u003e\n                        Output file\n  -w \u003cSTR\u003e, --word-list \u003cSTR\u003e\n                        Profanity list text file (default: swears.txt)\n  -e \u003cSTR\u003e, --encoding \u003cSTR\u003e\n                        Text encoding (default: utf-8)\n```\n\nSo, using Andy Weir's \"The Martian\" as an example:\n```\n$ ./montag.py -i \"The Martian - Andy Weir.mobi\" -o \"The Martian - Andy Weir (scrubbed).mobi\"\nProcessing \"The Martian - Andy Weir.mobi\" of type \"Mobipocket E-book \"The Martian\", 775003 bytes uncompressed, version 6, codepage 65001\"\nExtracting metadata...\nConverting to EPUB...\nProcessing book contents...\nGenerating output...\nConverting...\nRestoring metadata...\n```\n\nUpon opening the book, you will find the text reads something like this:\n\u003e CHAPTER 1\n\u003e \n\u003e LOG ENTRY: SOL 6\n\u003e \n\u003e I’m pretty much ******.\n\u003e \n\u003e That’s my considered opinion.\n\u003e \n\u003e ******.\n\u003e \n\u003e Six days into what 
should be the greatest two months of my life, and it’s turned into a nightmare.\n\u003e \n\u003e ...\n\nAlternately, if you are using the Docker method described above, use [`montag-docker.sh`](./docker/montag-docker.sh) rather than [`montag.py`](./src/montag_cleaner/montag.py) directly.\n\n## Known Limitations\n\nMontag is not smart enough to do any in-depth language analysis or deep filtering. For a while I was trying to use the [rominf/profanity-filter](https://github.com/rominf/profanity-filter) library for the word detection and filtering, but I ran into issues and ended up just going with a simpler method that works but presents a few limitations:\n\n* Only whole words are matched and censored. In other words, if the word `frick` is in your list of profanity, `Frick you!` will be censored, but `Absofrickenlutely` will not. As such, if you wish to catch all of the variations of the word `frick`, you have to list them individually in your `swears.txt` word list.\n* Having phrases (e.g., multiple space-separated words) in your `swears.txt` word list won't do you any good.\n* Montag can't tell the difference between different meanings of the same word. For example, if the word `ass` is in your list, both \"And he said unto his sons, Saddle me the ass. So they saddled him the ass: and he rode thereon\" (from the KJV of *The Bible*) and \"Then the high king carefully turned the golden screw. Once: Nothing. Twice: Nothing. 
Then he turned it the third time, and the boy’s ass fell off\" (from Patrick Rothfuss' *The Wise Man's Fear*) will be censored.\n\n## Contributing\n\nIf you'd like to help improve Montag, pull requests will be welcomed!\n\n## Authors\n\n* **Seth Grover** - *Initial work* - [mmguero](https://github.com/mmguero)\n\n## License\n\nThis project is licensed under the BSD 3-Clause License - see the [LICENSE](LICENSE) file for details.\n\n## Acknowledgments\n\nThanks to:\n* [Calibre](https://calibre-ebook.com/about) developer Kovid Goyal and contributors\n* the contributors to [EbookLib](https://github.com/aerkalov/ebooklib/blob/master/AUTHORS.txt)\n* [python-magic](https://github.com/ahupp/python-magic) developer Adam Hupp and contributors\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmmguero%2Fmontag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmmguero%2Fmontag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmmguero%2Fmontag/lists"}