https://github.com/anyparser/anyparser_llamaindex
Instantly access Anyparser's robust document processing and data extraction capabilities directly within your LlamaIndex workflows. Enhance your AI applications with superior content understanding and data quality.
anyparser artificial-intelligence cache-augmented-generation cag kag knowledge-graph llama-index llamaindex llamaindex-rag rag retrieval-augmented-generation
- Host: GitHub
- URL: https://github.com/anyparser/anyparser_llamaindex
- Owner: anyparser
- License: apache-2.0
- Created: 2025-02-17T08:22:02.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2025-02-17T08:24:18.000Z (about 1 year ago)
- Last Synced: 2025-12-22T06:36:40.489Z (2 months ago)
- Topics: anyparser, artificial-intelligence, cache-augmented-generation, cag, kag, knowledge-graph, llama-index, llamaindex, llamaindex-rag, rag, retrieval-augmented-generation
- Language: Python
- Homepage: https://anyparser.com
- Size: 371 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: changelogs/v0.0.1-changelog.md
- License: LICENSE.md
README
# Anyparser LlamaIndex: Seamless Integration of Anyparser with LlamaIndex
https://anyparser.com
**Integrate Anyparser's content extraction capabilities with LlamaIndex.** This integration package exposes Anyparser's document processing and data extraction features to your LlamaIndex applications for use in AI pipelines.
## Installation
```bash
pip install anyparser-llamaindex
```
## Setup
Before running the examples, make sure to set your Anyparser API credentials as environment variables:
```bash
export ANYPARSER_API_KEY="your-api-key"
export ANYPARSER_API_URL="https://anyparserapi.com"
```
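To confirm that both variables are visible to Python before running an example, a check along these lines may help. This is a minimal sketch using only the standard library; the helper name `load_anyparser_settings` is hypothetical and is not part of the `anyparser-llamaindex` package.

```python
import os


def load_anyparser_settings(env=None):
    """Return (api_key, api_url) from the environment, or raise if either is missing.

    Hypothetical helper for illustration; not part of anyparser-llamaindex.
    """
    env = os.environ if env is None else env
    required = ("ANYPARSER_API_KEY", "ANYPARSER_API_URL")
    # Treat unset and empty values the same way.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["ANYPARSER_API_KEY"], env["ANYPARSER_API_URL"]
```

Calling `load_anyparser_settings()` with no arguments reads from `os.environ`; passing a plain dict makes the check easy to test.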
## Anyparser LlamaIndex Examples
The `examples` directory contains scripts demonstrating different ways to use the Anyparser LlamaIndex integration:
```bash
python examples/01_basic_usage.py
python examples/02_single_file_json.py
python examples/03_single_file_markdown.py
python examples/04_multiple_files_json.py
python examples/05_multiple_files_markdown.py
python examples/06_load_folder.py
python examples/07_ocr_markdown.py
python examples/08_ocr_json.py
python examples/09_web_crawler.py
```
## Features Demonstrated
### Document Processing
- Different output formats (markdown, JSON)
- Multiple file handling
- Folder processing
- Metadata handling
### Web Crawling
- Basic crawling with depth and scope control
- Advanced URL and content filtering
- Crawling strategies (BFS, LIFO)
- Rate limiting and robots.txt respect
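To illustrate how a BFS strategy and a LIFO strategy differ in visiting order under a depth limit, here is a generic sketch over a tiny in-memory link graph. It does not use the Anyparser crawler API; the `LINKS` graph, the `crawl_order` function, and its parameters are assumptions made up for this example.

```python
from collections import deque

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/a1", "/a2"],
    "/b": ["/b1"],
    "/a1": [],
    "/a2": [],
    "/b1": [],
}


def crawl_order(start, strategy="BFS", max_depth=2):
    """Return the order in which pages are visited for the given frontier strategy."""
    frontier = deque([(start, 0)])
    seen = {start}
    order = []
    while frontier:
        # BFS takes the oldest entry (queue); LIFO takes the newest entry (stack).
        url, depth = frontier.popleft() if strategy == "BFS" else frontier.pop()
        order.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow links from this page
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                frontier.append((link, depth + 1))
    return order
```

With this graph, BFS visits every page at one depth before moving deeper, while LIFO follows the most recently discovered link first and so reaches deep pages sooner.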
## Notes
- All examples use async/await for better performance
- Error handling is included in all examples
- Each example includes detailed comments explaining the options used
- OCR examples support multiple languages
- Crawler examples demonstrate various filtering and control options
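The combination of async/await and error handling mentioned in these notes can be sketched as follows. This is one common `asyncio` pattern, not necessarily the one the example scripts use; `parse_file` is a hypothetical stand-in for a real parsing call and is not the Anyparser API.

```python
import asyncio


async def parse_file(path):
    """Stand-in for a real parsing call; hypothetical, not the Anyparser API."""
    await asyncio.sleep(0)  # yield control, as a network request would
    if not path.endswith((".pdf", ".docx")):
        raise ValueError(f"unsupported file type: {path}")
    return f"parsed:{path}"


async def parse_many(paths):
    """Parse files concurrently, collecting failures instead of aborting the batch."""
    results = await asyncio.gather(
        *(parse_file(p) for p in paths), return_exceptions=True
    )
    ok = {p: r for p, r in zip(paths, results) if not isinstance(r, Exception)}
    failed = {p: str(r) for p, r in zip(paths, results) if isinstance(r, Exception)}
    return ok, failed
```

A call such as `ok, failed = asyncio.run(parse_many(["a.pdf", "b.xyz"]))` returns the successful results and the error messages separately, so one bad file does not discard the rest of the batch.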
## Additional Features Demonstrated
- OCR capabilities with language support
- OCR performance presets
- Image extraction
- Table extraction
- Error handling
- Async/await usage
## License
Apache-2.0