https://github.com/rtulke/rp
rp aka red pencil is a simple multicolor command-line tool to highlight the filtered output text.
- Host: GitHub
- URL: https://github.com/rtulke/rp
- Owner: rtulke
- License: gpl-2.0
- Created: 2014-05-06T00:19:48.000Z (about 12 years ago)
- Default Branch: master
- Last Pushed: 2024-05-15T17:28:42.000Z (about 2 years ago)
- Last Synced: 2024-05-16T05:56:30.382Z (about 2 years ago)
- Topics: color, command-line, command-line-tool, egrep, grep, highlight, mark, pencil, python, rp, rpen, select, text
- Language: Python
- Homepage:
- Size: 188 KB
- Stars: 10
- Watchers: 2
- Forks: 4
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
rp
==
rp (red pencil) is a simple multicolor command-line tool that filters text and highlights the matches, with multi-pattern highlighting, regex support, and match statistics. Each search pattern gets its own color, making it easy to visually distinguish multiple patterns in log files and text output.
I often struggled to spot what I was looking for in long, continuous text, such as log files piped through `cat` or something similar. So I needed a tool that makes it easy to see what I'm actually looking for.

## Features
- **Multi-color highlighting** - Each pattern gets a unique color
- **Full regex support** - Use powerful regular expressions
- **Statistics mode** - Get detailed match counts per pattern
- **File and directory support** - Search single files or recursively
- **Context lines** - Show lines before/after matches (like `grep -A/-B/-C`)
- **Multiple output modes** - Count, only-matching, invert, and more
- **Performance optimized** - Single-pass processing
- **Robust** - Handles binary files, unicode, broken pipes
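
To illustrate the "one color per pattern" idea, here is a minimal sketch and not rp's actual implementation. The helper name `highlight`, the palette, and its order are assumptions made for this example. It joins all patterns into one regex with a named group per pattern, so a single scan of the line can tell which pattern matched.

```python
import re

# ANSI color codes; the palette and its order are illustrative assumptions.
COLORS = ["\033[31m", "\033[32m", "\033[33m", "\033[34m", "\033[35m", "\033[36m"]
RESET = "\033[0m"


def highlight(line, patterns, ignore_case=False):
    """Wrap every match in the color assigned to its pattern (hypothetical helper)."""
    flags = re.IGNORECASE if ignore_case else 0
    # One named group per pattern, so one scan reports which pattern matched.
    combined = re.compile(
        "|".join(f"(?P<p{i}>{p})" for i, p in enumerate(patterns)), flags
    )

    def colorize(match):
        # lastgroup is the name of the outermost pattern group, e.g. "p1".
        index = int(match.lastgroup[1:])
        return COLORS[index % len(COLORS)] + match.group(0) + RESET

    return combined.sub(colorize, line)


print(highlight("ERROR in module, WARN ignored", ["ERROR", "WARN"]))
```

In this sketch, patterns that carry their own named groups or numbered backreferences would need extra care, because joining them shifts the group numbering.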
## Requirements
* Python 3.6
* `egrep` or `grep`; `grep` should be GNU grep 3.x
## Installation (Linux, macOS)
```bash
# clone repository
git clone https://github.com/rtulke/rp.git
# for system wide installation
cp rp/rp.py /usr/local/bin/rp
chmod +x /usr/local/bin/rp
# or, optionally, use a symlink instead of copying to /usr/local/bin
# ln -s "$(pwd)/rp/rp.py" /usr/local/bin/rp
```
macOS ships BSD `grep` and `egrep`, which are not 100% compatible with the GNU `grep`/`egrep` found on Linux. Most functions should work. If you encounter any problems, please create an issue.
## Quick Start
```bash
# Basic usage with stdin
cat error.log | rp ERROR WARN INFO
# Search in files
rp "exception" application.log
# Multiple patterns with colors
rp "error|ERROR" "warn|WARN" "info|INFO" system.log
# Recursive directory search
rp -r -i "todo|fixme|hack" ~/projects/
# With line numbers and context
rp -n -C 3 "CRITICAL" system.log
```
## Usage
```
rp [OPTIONS] PATTERN [PATTERN...] [FILE...]
```
### Basic Options
| Option | Description |
|--------|-------------|
| `-i, --ignore-case` | Ignore case distinctions in patterns |
| `-k, --display-all` | Display all lines, only highlight matches |
| `-v, --invert-match` | Select non-matching lines |
| `-w, --word-regexp` | Match only whole words |
### Output Control
| Option | Description |
|--------|-------------|
| `-n, --line-number` | Print line numbers with output |
| `-c, --count` | Print only count of matching lines |
| `-o, --only-matching` | Print only matched parts of lines |
| `--stats` | Display detailed statistics about matches |
### Context Control
| Option | Description |
|--------|-------------|
| `-A NUM, --after-context NUM` | Print NUM lines after match |
| `-B NUM, --before-context NUM` | Print NUM lines before match |
| `-C NUM, --context NUM` | Print NUM lines before and after match |
### File Handling
| Option | Description |
|--------|-------------|
| `-r, --recursive` | Search directories recursively |
| `-I, --no-binary` | Skip binary files automatically |
| `-f FILE, --file FILE` | Read patterns from FILE (one per line) |
## Examples
### Basic Searching
```bash
# Single pattern
cat app.log | rp "ERROR"
# Multiple patterns with different colors
cat app.log | rp "ERROR" "WARN" "INFO"
# Case-insensitive search
cat app.log | rp -i "error" "warning"
# Show all lines, only highlight
cat config.txt | rp -k "TODO" "FIXME"
```
### Regex Patterns
```bash
# IP addresses
cat access.log | rp "\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"
# Email addresses
cat dump.txt | rp "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
# Timestamps
cat system.log | rp "\d{4}-\d{2}-\d{2}" "\d{2}:\d{2}:\d{2}"
# Multiple error types
cat error.log | rp "err(or)?" "warn(ing)?" "exception"
```
### File Operations
```bash
# Single file
rp "pattern" file.log
# Multiple files
rp "error" *.log
# Recursive directory search
rp -r "TODO" ~/projects/
# Recursive with case-insensitive
rp -r -i "fixme" /var/log/
# Skip binary files
rp -r -I "search term" /usr/share/
```
### Context Lines
```bash
# 3 lines after each match
rp -A 3 "ERROR" app.log
# 2 lines before each match
rp -B 2 "Exception" debug.log
# 3 lines before and after
rp -C 3 "CRITICAL" system.log
# Context with line numbers
rp -n -C 2 "error" application.log
```
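
The before/after context behavior can be sketched with a bounded `collections.deque`, which the Performance section names as the context buffer. The function below is a hypothetical illustration, not rp's code, and it omits the `--` group separators that `grep` prints.

```python
import re
from collections import deque


def grep_with_context(lines, pattern, before=0, after=0):
    """Yield (line_number, line) for matches plus surrounding context lines."""
    regex = re.compile(pattern)
    previous = deque(maxlen=before)  # keeps at most `before` unprinted lines
    remaining_after = 0
    for number, line in enumerate(lines, start=1):
        if regex.search(line):
            yield from previous      # flush the buffered "before" lines
            previous.clear()
            yield number, line
            remaining_after = after  # restart the "after" countdown
        elif remaining_after > 0:
            yield number, line
            remaining_after -= 1
        else:
            previous.append((number, line))


sample = ["start", "loading", "ERROR disk full", "retrying", "done", "exit"]
for number, line in grep_with_context(sample, "ERROR", before=1, after=1):
    print(f"{number}: {line}")
```

Because the deque has a fixed `maxlen`, memory use stays bounded by the context size no matter how long the input is.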
### Output Modes
```bash
# Count matching lines only
rp -c "error" *.log
# Count per file
rp -c "warning" file1.log file2.log file3.log
# Show only matching parts
rp -o "\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}" access.log
# Invert match (exclude lines)
rp -v "DEBUG" application.log
# Line numbers
rp -n "error" app.log
```
### Statistics Mode
```bash
# Basic statistics
rp --stats "ERROR" "WARN" "INFO" application.log
# Output:
# [... colored matches ...]
#
# === Statistics ===
# Total lines processed: 10000
# Lines with matches: 847
# Match rate: 8.47%
#
# Pattern matches:
# ERROR: 42
# WARN : 128
# INFO : 677
# Statistics with count mode
rp --stats -c "error|exception" *.log
# Statistics without match output
rp --stats -c "pattern" large.log
```
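
As a rough sketch of how such a report could be computed (the helper `collect_stats` is hypothetical and not taken from rp), the figures come from one pass over the input. The match rate is lines with matches divided by total lines, so 847 / 10000 gives the 8.47% shown in the sample output. Whether rp counts every occurrence or only one per line is not stated here; the sketch assumes every occurrence.

```python
import re


def collect_stats(lines, patterns):
    """Return totals in the shape of the statistics report (hypothetical helper)."""
    compiled = [(p, re.compile(p)) for p in patterns]
    total = matched_lines = 0
    per_pattern = {p: 0 for p in patterns}
    for line in lines:
        total += 1
        hit = False
        for text, regex in compiled:
            # Assumption: count every occurrence, not just one per line.
            count = sum(1 for _ in regex.finditer(line))
            per_pattern[text] += count
            hit = hit or count > 0
        matched_lines += hit
    rate = 100.0 * matched_lines / total if total else 0.0
    return {"total": total, "matched": matched_lines, "rate": rate, "patterns": per_pattern}


stats = collect_stats(["ERROR a", "WARN b ERROR", "ok", "ok"], ["ERROR", "WARN"])
print(f"Match rate: {stats['rate']:.2f}%")
```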
### Pattern Files
Create a pattern file (`patterns.txt`):
```text
# Security-related patterns
authentication.*failed
permission.*denied
unauthorized.*access
# Error patterns
\b(error|exception|fatal)\b
# Network patterns
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
```
Use the pattern file:
```bash
# From file only
rp -f patterns.txt access.log
# Combine file and command-line patterns
rp -f patterns.txt "CRITICAL" system.log
# With statistics
rp -f security-patterns.txt --stats -r /var/log/
# Recursive with pattern file
rp -f patterns.txt -r -I ~/logs/
```
### Advanced Combinations
```bash
# Full log analysis
rp -f patterns.txt --stats -n -C 2 -i -r /var/log/
# Security audit
rp -f security-patterns.txt -r -I --stats /var/log/ 2>/dev/null
# Development workflow
rp -r -i "TODO|FIXME|HACK|XXX" --stats ~/project/src/
# Monitor logs in real-time
tail -f app.log | rp "ERROR" "EXCEPTION" "CRITICAL"
# Find specific issues with context
rp -n -C 5 "OutOfMemory|StackOverflow" *.log
```
## Pattern File Format
Pattern files support:
- One pattern per line
- Comments starting with `#`
- Empty lines (ignored)
- Full regex syntax
Example (`patterns.txt`):
```text
# HTTP status codes
\b[45]\d{2}\b
# Common log levels
ERROR
WARNING
CRITICAL
# IP addresses
\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b
```
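
The format rules above can be sketched as a small parser. This is an illustrative assumption about how such a file might be read, not rp's actual loader; the name `load_patterns` is made up, and it takes the file's text instead of a path to keep the example self-contained.

```python
import re


def load_patterns(text):
    """Parse pattern-file text: one regex per line, skipping blanks and # comments."""
    patterns = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # empty line or comment
        patterns.append(re.compile(line))
    return patterns


example = "# HTTP status codes\n\\b[45]\\d{2}\\b\n\n# Common log levels\nERROR\n"
for regex in load_patterns(example):
    print(regex.pattern)
```

One consequence of this scheme is that a pattern starting with a literal `#` would have to be written differently, for example as `[#]`.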
## Exit Codes
| Code | Meaning |
|------|---------|
| 0 | Matches found |
| 1 | No matches found |
| 2 | Error occurred |
| 130 | Interrupted (Ctrl+C) |
| 141 | Broken pipe |
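
The codes 130 and 141 follow the common shell convention of 128 plus the signal number (SIGINT is 2, SIGPIPE is 13). A minimal sketch of how a Python program might map outcomes to the codes in this table is shown below; the wrapper `run` and its `search` callback are hypothetical and not part of rp.

```python
def run(search):
    """Call search() (True if anything matched) and map the outcome to an exit code."""
    try:
        return 0 if search() else 1
    except KeyboardInterrupt:
        return 130  # 128 + SIGINT (2)
    except BrokenPipeError:
        return 141  # 128 + SIGPIPE (13)
    except Exception:
        return 2    # any other error


print(run(lambda: True))
```

A script would typically pass the result to `sys.exit()`, which lets shell pipelines such as `rp "ERROR" app.log && notify` branch on whether anything matched.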
## Performance
**Optimizations:**
- Single-pass processing (no repeated pipe overhead like v3)
- Pre-compiled regex patterns
- Efficient context buffer with deque
- Lazy file iteration
- Binary file detection (first 8KB)
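
The binary-file check on the first 8 KB could look roughly like the sketch below. The NUL-byte heuristic is an assumption borrowed from how tools like `grep` commonly classify binary data; the source does not say which test rp applies, and both function names are hypothetical.

```python
def looks_binary(chunk):
    """Heuristic (assumed): data is binary if it contains a NUL byte."""
    return b"\x00" in chunk


def is_binary_file(path, sample_size=8192):
    """Inspect only the first `sample_size` bytes (8 KB by default) of the file."""
    with open(path, "rb") as handle:
        return looks_binary(handle.read(sample_size))


print(looks_binary(b"plain text\n"), looks_binary(b"\x7fELF\x00\x01"))
```

Reading a fixed-size sample keeps the check cheap even for very large files.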
**Benchmarks** (1M lines, 4 patterns):
```
grep (4 calls): ~3.5s
rp v3: ~4.2s (grep-based, multiple passes)
rp v5: ~1.8s (Python, single pass)
```
## Comparison with grep
| Feature | grep | rp |
|---------|------|-----|
| Multi-color highlighting | ❌ | ✅ |
| Regex support | ✅ | ✅ |
| Context lines | ✅ | ✅ |
| Statistics | ❌ | ✅ |
| Pattern files | ✅ | ✅ |
| Performance | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Single pattern | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Multiple patterns | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
## Use Cases
### Security Analysis
```bash
rp -f security-patterns.txt --stats -r /var/log/
```
### Development
```bash
rp -r "TODO|FIXME|HACK" --stats src/
```
### Log Monitoring
```bash
tail -f app.log | rp "ERROR" "WARN" "EXCEPTION"
```
### Incident Response
```bash
rp -n -C 5 "OutOfMemory|Segfault|Panic" *.log
```
### Code Review
```bash
rp -r -i "password|secret|key|token" --stats .
```
## Tips & Tricks
### Color Output in Less
```bash
rp "pattern" file.log | less -R
```
### Save Colored Output
```bash
rp "pattern" file.log > output.txt # No colors
script -c 'rp "pattern" file.log' output.txt # With colors
```
### Combine with Other Tools
```bash
# With find
find . -name "*.log" -exec rp "ERROR" {} +
# With xargs
cat file-list.txt | xargs rp "pattern"
# With awk
cat data.txt | awk '{print $3}' | rp "pattern"
```
### Word Boundaries
```bash
# Match "test" but not "testing"
rp -w "test" file.txt
# Multiple whole words
rp -w "error" "warning" "info" log.txt
```
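
In regex terms, whole-word matching is usually equivalent to wrapping the pattern in `\b` word boundaries. The sketch below shows that idea in Python; it is an assumption about how `-w` could be implemented, and `whole_word` is a hypothetical helper.

```python
import re


def whole_word(pattern, ignore_case=False):
    """Compile `pattern` so it only matches between word boundaries."""
    flags = re.IGNORECASE if ignore_case else 0
    # The non-capturing group keeps alternations such as "a|b" inside the boundaries.
    return re.compile(rf"\b(?:{pattern})\b", flags)


regex = whole_word("test")
print(bool(regex.search("run the test now")), bool(regex.search("testing")))
```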
## Troubleshooting
### Binary File Warnings
```bash
# Skip binary files automatically
rp -I "pattern" /path/
# Or redirect stderr
rp "pattern" /path/ 2>/dev/null
```
### Performance Issues
```bash
# Use simpler patterns when possible, e.g. "ERROR" instead of "E[RR]*O[RR]*"
rp "ERROR" app.log
# Limit recursion depth (use find)
find /path -maxdepth 2 -name "*.log" -exec rp "pattern" {} +
```
### Unicode Issues
Input is decoded with `errors='replace'`, so invalid byte sequences are replaced with the Unicode replacement character � (U+FFFD) instead of raising an error.
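
The effect of `errors='replace'` can be seen with Python's standard `bytes.decode`. The sample bytes below are made up for illustration; the UTF-8 codec is an assumption, since the encoding rp uses is not stated here.

```python
# Valid UTF-8 for "café", followed by a stray 0xFF byte that is not valid UTF-8.
raw = b"caf\xc3\xa9 \xff ok"

# errors="replace" substitutes U+FFFD for each undecodable byte instead of raising.
text = raw.decode("utf-8", errors="replace")
print(text)
```

With the default `errors="strict"`, the same call would raise `UnicodeDecodeError` and abort processing of the line.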
## Contributing
Contributions welcome! Please ensure:
- Code follows PEP 8
- All features have examples
- Error handling is robust
- Performance is tested
- One single file ;)