https://github.com/hellomouse/apps-site-service
Website downloader service for Hellomouse Apps
- Host: GitHub
- URL: https://github.com/hellomouse/apps-site-service
- Owner: hellomouse
- License: MIT
- Created: 2023-09-05T21:56:57.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2024-05-14T21:07:21.000Z (about 1 year ago)
- Last Synced: 2025-01-10T20:15:49.976Z (5 months ago)
- Language: JavaScript
- Size: 1.21 MB
- Stars: 1
- Watchers: 5
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# Hellomouse Apps Site Queue
A fun microservice for scraping content from websites for Hellomouse Apps.
## Features:
- Download webpages as HTML (with assets such as CSS, videos, and images embedded as base64), PDF, or WEBP (screenshot)
- Special handling for certain websites. Currently supported:
  - **Twitter / X:** Tweets are downloaded as HTML + attached media (images, videos)
  - **Reddit:** Posts and comments are downloaded with any attached assets
  - **SoundCloud:** Songs are downloaded with metadata (HTML + audio)
  - **Newgrounds:** Songs are downloaded with metadata (HTML + audio)
  - **Imgur:** Albums and galleries are downloaded with all images and metadata (HTML + images / videos)
  - **YouTube:** Videos are downloaded
  - **Pixiv:** Albums are downloaded
  - **Bilibili:** Videos are downloaded
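As a rough illustration of what "embedded as base64" means for downloaded pages, the sketch below turns an asset into a `data:` URI and substitutes it for the asset's URL in an HTML string. The helper names `toDataUri` and `inlineAsset` are hypothetical and are not taken from this repository; the service's actual embedding logic may work differently.

```js
// Hypothetical helper (not from the repository): encode raw bytes or a
// string as a base64 data URI with the given MIME type.
function toDataUri(bytes, mimeType) {
    const base64 = Buffer.from(bytes).toString('base64');
    return `data:${mimeType};base64,${base64}`;
}

// Hypothetical helper: replace every occurrence of one asset URL in an
// HTML string with the asset's data URI, so the page no longer depends
// on the external file.
function inlineAsset(html, assetUrl, bytes, mimeType) {
    return html.split(assetUrl).join(toDataUri(bytes, mimeType));
}

// Example: inline a small stylesheet into a page.
const page = '<link rel="stylesheet" href="style.css">';
const css = 'body{color:red}';
console.log(inlineAsset(page, 'style.css', css, 'text/css'));
```

The resulting HTML file is self-contained, at the cost of being roughly a third larger than the original assets because of base64 overhead.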
## Setup
Install dependencies:

```
npm install
```

Set up the config. You will need a running PostgreSQL database as well as the `hellomouse-apps-api` server (run the server first to generate the required tables).

There is an example config in the root directory. Copy it and rename it to `config.js`. Here are the properties:

```js
export const dbUser = 'hellomouse_board'; // PostgreSQL user
export const dbIp = '127.0.0.1'; // PostgreSQL server address
export const dbPort = 5433; // PostgreSQL server port
export const dbPassword = 'my password'; // PostgreSQL password
export const dbName = 'hellomouse_board'; // PostgreSQL database name

export const fileDir = './saves'; // Root path for all stored files; web files generally go under <fileDir>/site_downloads/file.ext
```

To set up yt-dlp (optional), place your browser cookies in `secret/yt-cookies.txt` for downloading YouTube videos and in `secret/bilibili-cookies.txt` for downloading Bilibili videos.

To set up Pixiv cookies (optional, for bypassing rate limiting and age restrictions), export your browser cookies as a JS array of objects like `[{ name: ... }]` and put the result in `secret/pixiv-cookies.txt`.

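To check the database settings in `config.js` before starting the service, it can help to assemble them into a standard PostgreSQL connection URI. The sketch below is only an illustration: `buildConnectionString` is a hypothetical helper, and the README does not say how the service itself connects to the database.

```js
// Hypothetical helper (not from the repository): combine the config.js
// database settings into a PostgreSQL connection URI. User, password,
// and database name are percent-encoded so that spaces and special
// characters remain valid inside the URI.
function buildConnectionString({ dbUser, dbPassword, dbIp, dbPort, dbName }) {
    const user = encodeURIComponent(dbUser);
    const password = encodeURIComponent(dbPassword);
    const database = encodeURIComponent(dbName);
    return `postgresql://${user}:${password}@${dbIp}:${dbPort}/${database}`;
}

// Values copied from the example config.
const uri = buildConnectionString({
    dbUser: 'hellomouse_board',
    dbPassword: 'my password',
    dbIp: '127.0.0.1',
    dbPort: 5433,
    dbName: 'hellomouse_board'
});
console.log(uri);
// prints "postgresql://hellomouse_board:my%20password@127.0.0.1:5433/hellomouse_board"
```

Passing the printed URI to `psql` is a quick way to confirm that the user, password, port, and database name are correct.
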
Run the server:
```
node index.js
```