Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tnypxl/rollup
Compiles text-based files into a single markdown document
https://github.com/tnypxl/rollup
Last synced: about 2 months ago
JSON representation
Compiles text-based files into a single markdown document
- Host: GitHub
- URL: https://github.com/tnypxl/rollup
- Owner: tnypxl
- License: mit
- Created: 2024-09-02T22:58:19.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-10-15T20:01:07.000Z (3 months ago)
- Last Synced: 2024-10-17T05:30:55.314Z (3 months ago)
- Language: Go
- Size: 110 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Rollup
Rollup aggregates the contents of text-based files and webpages into a markdown file.
## Features
- File type filtering for targeted content aggregation
- Ignore patterns for excluding specific files or directories
- Support for code-generated file detection and exclusion
- Advanced web scraping functionality with depth control
- Verbose logging option for detailed operation insights
- Exclusionary CSS selectors for precise web content extraction
- Support for multiple URLs in web scraping operations
- Configurable output format for web scraping (single file or separate files)
- Flexible configuration file support (YAML)
- Automatic generation of default configuration file
- Custom output file naming
- Rate limiting for web scraping to respect server resources## Installation
To install Rollup, make sure you have Go installed on your system, then run:
```bash
go get github.com/tnypxl/rollup
```## Usage
Basic usage:
```bash
rollup [command] [flags]
```### Commands
- `rollup files`: Rollup files into a single Markdown file
- `rollup web`: Scrape main content from webpages and convert to Markdown
- `rollup generate`: Generate a rollup.yml config file### Flags for `files` command
- `--path, -p`: Path to the project directory (default: current directory)
- `--types, -t`: Comma-separated list of file extensions to include (default: .go,.md,.txt)
- `--codegen, -g`: Comma-separated list of glob patterns for code-generated files
- `--ignore, -i`: Comma-separated list of glob patterns for files to ignore### Flags for `web` command
- `--urls, -u`: URLs of the webpages to scrape (comma-separated)
- `--output, -o`: Output type: 'single' for one file, 'separate' for multiple files (default: single)
- `--depth, -d`: Depth of link traversal (default: 0, only scrape the given URLs)
- `--css`: CSS selector to extract specific content
- `--exclude`: CSS selectors to exclude from the extracted content (comma-separated)### Global flags
- `--config, -f`: Path to the configuration file (default: rollup.yml in the current directory)
- `--verbose, -v`: Enable verbose logging## Configuration
Rollup can be configured using a YAML file. By default, it looks for `rollup.yml` in the current directory. You can specify a different configuration file using the `--config` flag.
Example `rollup.yml`:
```yaml
file_extensions:
- go
- md
ignore_paths:
- node_modules/**
- vendor/**
- .git/**
code_generated_paths:
- **/generated/**
sites:
- base_url: https://example.com
css_locator: .content
exclude_selectors:
- .ads
- .navigation
max_depth: 2
allowed_paths:
- /blog
- /docs
exclude_paths:
- /admin
output_alias: example
path_overrides:
- path: /special-page
css_locator: .special-content
exclude_selectors:
- .special-ads
output_type: single
requests_per_second: 1.0
burst_limit: 3
```## Examples
1. Rollup files with default configuration:
```bash
rollup files
```2. Web scraping with multiple URLs:
```bash
rollup web --urls=https://example.com,https://another-example.com
```3. Generate a default configuration file:
```bash
rollup generate
```4. Use a custom configuration file:
```bash
rollup files --config=my-config.yml
```5. Web scraping with separate output files:
```bash
rollup web --urls=https://example.com,https://another-example.com --output=separate
```6. Rollup files with specific types and ignore patterns:
```bash
rollup files --types=go,md --ignore=vendor/**,*_test.go
```7. Web scraping with depth and CSS selector:
```bash
rollup web --urls=https://example.com --depth=2 --css=.main-content
```## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License.