https://github.com/joshuaquek/docusite-to-pdf
Provide a URL and this will generate multiple PDF documents of the whole site within the bounds of the URL path. This code repo is for educational purposes only.
crawler documentation-generator html2pdf pdf pdf-converter pdf-document pdf-generation scraper
Last synced: 12 days ago
- Host: GitHub
- URL: https://github.com/joshuaquek/docusite-to-pdf
- Owner: joshuaquek
- License: mit
- Created: 2023-11-02T13:15:30.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-27T14:46:51.000Z (8 months ago)
- Last Synced: 2024-11-13T11:25:07.290Z (2 months ago)
- Topics: crawler, documentation-generator, html2pdf, pdf, pdf-converter, pdf-document, pdf-generation, scraper
- Language: JavaScript
- Homepage:
- Size: 10.7 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# 🌐📝 Docusite to PDF Converter
Provide a URL and this will generate multiple PDF documents of the whole site within the bounds of the URL path. This code repo is for educational purposes only.

## 📋 Pre-requisites
Node.js v20.0.0 or higher

## 🚀 Setup
Clone this repo. Then install the dependencies by running:
```
npm install --legacy-peer-deps
```

## 🛠️ Usage
Create a `config.json` file in the root of the project with the following structure:
```json
{
  "url": "https://urlYouWannaCrawl.com/docu/latest"
}
```

Then run the following command:
```bash
npm start
```

This starts the crawl and saves a static HTML copy of every page found under the URL you provided. The HTML pages are written to the `./outputs/html` folder.
Next, it generates one PDF for each HTML page. The PDF files are written to the `./outputs/pdf` folder.
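
To make "within the bounds of the URL path" and the HTML-to-PDF folder mapping more concrete, here is a minimal sketch of how these two steps could work. It is not taken from this repo's source: `isWithinBounds` and `pdfPathFor` are hypothetical helper names, and the rule "same origin, path at or below the configured path" is an assumption about how the crawl scope is decided.

```javascript
const path = require('node:path');

// Hypothetical helper (not from this repo): returns true when `candidate`
// has the same origin as `base` and its path sits at or below the base path.
function isWithinBounds(base, candidate) {
  const baseUrl = new URL(base);
  // Passing baseUrl as the second argument also resolves relative links.
  const candidateUrl = new URL(candidate, baseUrl);
  if (candidateUrl.origin !== baseUrl.origin) return false;

  // Compare with a trailing slash on both sides so that
  // "/docu/latest" does not accidentally match "/docu/latest-old".
  const basePath = baseUrl.pathname.replace(/\/+$/, '') + '/';
  const candidatePath = candidateUrl.pathname.replace(/\/+$/, '') + '/';
  return candidatePath.startsWith(basePath);
}

// Hypothetical helper (not from this repo): maps a file under the HTML
// output folder to the matching .pdf path under the PDF output folder.
function pdfPathFor(htmlFile, htmlDir = './outputs/html', pdfDir = './outputs/pdf') {
  const relative = path.relative(htmlDir, htmlFile);
  const parsed = path.parse(relative);
  return path.join(pdfDir, parsed.dir, `${parsed.name}.pdf`);
}

const base = 'https://urlYouWannaCrawl.com/docu/latest';
console.log(isWithinBounds(base, 'https://urlYouWannaCrawl.com/docu/latest/intro')); // true
console.log(isWithinBounds(base, 'https://urlYouWannaCrawl.com/blog/post'));         // false
console.log(pdfPathFor('./outputs/html/guide/intro.html'));
```

Comparing on a trailing-slash form of the path is a deliberate choice in this sketch: a plain prefix check would treat a sibling path such as `/docu/latest-old` as being in scope.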
## 🗣️ Other Commands
Generate only the HTML pages:
```bash
npm run html
```

Generate the PDFs, assuming that you have already generated the HTML pages inside the `./outputs/html` folder:
```bash
npm run html2pdf
```

## 🤝 Contributing
Contributions, issues, and feature requests are welcome! Feel free to check the issues page.

## License
MIT License