https://github.com/yymao/html-cleaner
Simple HTML cleaner
- Host: GitHub
- URL: https://github.com/yymao/html-cleaner
- Owner: yymao
- License: mit
- Created: 2019-11-16T02:59:28.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-11-08T18:39:17.000Z (over 2 years ago)
- Last Synced: 2025-02-01T19:14:16.792Z (3 months ago)
- Topics: cleaner, google-docs, webapp
- Language: HTML
- Homepage: https://yymao.github.io/html-cleaner/
- Size: 37.1 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# html-cleaner
_Author: [Yao-Yuan Mao](https://yymao.github.io/)_
If you draft emails in Google Docs and then copy and paste the content into your email client, your recipients may sometimes see annoying, unexpected line breaks that seem to be added at random. This web app fixes this issue for you.
[yymao.github.io/html-cleaner](https://yymao.github.io/html-cleaner/)
## Usage
1. Copy formatted text from Google Docs and paste it into the box.
2. Click "Remove white-space preservation" (this option fixes the "Google Docs -> Email" formatting issue but preserves all other formatting), or click "Remove all non-schematic formatting" (this removes all non-schematic formatting such as font size/family, line width, etc.).
3. The cleaned text will be automatically re-copied. You can simply paste it into your email client.

## A few things to note
- This software is provided "as is." Use it at your own risk.
- There are several other generic tools online (you can search for "html cleaner"), and they usually have more features than mine. Mine is limited to solving the "Google Docs -> email" formatting issue.
- All operations are done locally on the client side. No data is transmitted to the server or a third party. You can [check the source code](index.html) to be sure.
- You can also use this tool to fix an ill-formatted email that you received. While that is not the intended usage, it should still fix most formatting issues.
- It works best in Firefox, but should work fine in Chrome and Safari too. I am not certain whether it works in IE/Edge at all.
- Feel free to [open issues](https://github.com/yymao/html-cleaner/issues) if you see any; I'll try my best to address them (no guarantees though).

## Technical details behind this formatting issue
_(No, you don't really care about this.)_
When you copy formatted text from Google Docs, Google Docs needs to insert some inline CSS (i.e., `style` attributes) into the copied text to preserve the formatting.
Among those CSS properties is `white-space: pre-wrap;`, which preserves sequences of white space and renders newline characters as line breaks ([ref](https://developer.mozilla.org/en-US/docs/Web/CSS/white-space)).
While this setting is useful in some cases, it becomes problematic when used in emails, because some email servers/clients may insert some newline characters in the HTML message when transmitting/rendering the message.
Usually these newline characters have no effect in HTML (they are just treated as regular spaces), but with the `white-space: pre-wrap;` CSS property, they become actual line breaks!
That's why the formatting issue manifests as additional, unexpected line breaks.

I should note that this issue is not really about Google Docs but about the interplay between email services and HTML, which is a mess to say the least. In practice, the simple fix is to just remove any occurrence of `white-space: pre-*;`, which is exactly what the "Remove white-space preservation" option in this tool does.
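
To make the fix concrete, here is a minimal sketch of the idea in TypeScript. It is not the code in `index.html`; the function name `stripWhiteSpacePreservation` is hypothetical, and the sketch assumes the styles arrive as double-quoted inline `style="..."` attributes whose values contain no semicolons other than declaration separators.

```typescript
// Hypothetical helper (an illustration, not this tool's actual source):
// drops any `white-space: pre*` declaration from inline style attributes
// and leaves every other declaration untouched.
function stripWhiteSpacePreservation(html: string): string {
  // Assumption: styles are written as double-quoted style="..." attributes.
  return html.replace(/\sstyle="([^"]*)"/gi, (_match: string, css: string) => {
    const kept = css
      .split(";")
      .map((decl) => decl.trim())
      .filter((decl) => decl.length > 0)
      // Matches `white-space: pre`, `pre-wrap`, `pre-line`, and so on.
      .filter(
        (decl) =>
          !/^white-space\s*:\s*pre(-[a-z]+)?(\s*!important)?$/i.test(decl)
      );
    // If nothing is left, remove the style attribute altogether.
    return kept.length > 0 ? ` style="${kept.join("; ")};"` : "";
  });
}

// Usage: the newline inside the span no longer forces a line break,
// because the remaining style falls back to normal white-space handling.
const pasted =
  '<span style="font-size:11pt;white-space:pre-wrap;">Hello\nworld</span>';
console.log(stripWhiteSpacePreservation(pasted));
```

A regular expression keeps the sketch short, but it can also match text that merely looks like a `style` attribute outside of a tag. Walking the parsed DOM and editing each element's `style` would be more robust in a browser.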