https://github.com/anyparser/anyparser_llamaindex

Instantly access Anyparser's robust document processing and data extraction capabilities directly within your LlamaIndex workflows. Enhance your AI applications with superior content understanding and data quality.

Host: GitHub
URL: https://github.com/anyparser/anyparser_llamaindex
Owner: anyparser
License: apache-2.0
Created: 2025-02-17T08:22:02.000Z
Default Branch: master
Last Pushed: 2025-02-17T08:24:18.000Z
Last Synced: 2025-12-22T06:36:40.489Z
Topics: anyparser, artificial-intelligence, cache-augmented-generation, cag, kag, knowledge-graph, llama-index, llamaindex, llamaindex-rag, rag, retrieval-augmented-generation
Language: Python
Homepage: https://anyparser.com
Size: 371 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: changelogs/v0.0.1-changelog.md
- License: LICENSE.md

# Anyparser LlamaIndex: Seamless Integration of Anyparser with LlamaIndex

https://anyparser.com

**Integrate Anyparser's content extraction capabilities with LlamaIndex.** This integration package lets you use Anyparser's document processing and data extraction features directly within your LlamaIndex applications and AI pipelines.

## Installation

```bash
pip install anyparser-llamaindex
```

## Setup

Before running the examples, make sure to set your Anyparser API credentials as environment variables:

```bash
export ANYPARSER_API_KEY="your-api-key"
export ANYPARSER_API_URL="https://anyparserapi.com"
```
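
If you prefer to fail fast when the credentials are missing, a small check like the following can run before any example. This is a minimal sketch: the helper name `missing_anyparser_vars` is hypothetical and not part of the package; only the two variable names come from the setup step above.

```python
import os

# Variable names taken from the setup step above.
REQUIRED_VARS = ("ANYPARSER_API_KEY", "ANYPARSER_API_URL")


def missing_anyparser_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_anyparser_vars()` returns an empty list when both variables are set, so a script can stop with a clear message whenever the list is non-empty.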

## Anyparser LlamaIndex Examples

The `examples` directory contains scripts that demonstrate different ways to use the Anyparser LlamaIndex integration. Run them from the repository root:

```bash
python examples/01_basic_usage.py
python examples/02_single_file_json.py
python examples/03_single_file_markdown.py
python examples/04_multiple_files_json.py
python examples/05_multiple_files_markdown.py
python examples/06_load_folder.py
python examples/07_ocr_markdown.py
python examples/08_ocr_json.py
python examples/09_web_crawler.py
```

## Features Demonstrated

### Document Processing
- Different output formats (markdown, JSON)
- Multiple file handling
- Folder processing
- Metadata handling

### Web Crawling
- Basic crawling with depth and scope control
- Advanced URL and content filtering
- Crawling strategies (BFS, LIFO)
- Rate limiting and robots.txt respect
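
To illustrate how the two crawling strategies differ, the sketch below walks a toy in-memory link graph with a depth limit. It is not the crawler's implementation: `crawl_order`, `LINKS`, and the `strategy` and `max_depth` parameters are assumptions made for this example, and no network requests are made.

```python
from collections import deque


def crawl_order(links, start, strategy="bfs", max_depth=2):
    """Return the visit order for a toy link graph (illustrative only)."""
    frontier = deque([(start, 0)])
    seen = {start}
    order = []
    while frontier:
        # BFS takes the oldest queued page (FIFO); LIFO takes the newest.
        url, depth = frontier.popleft() if strategy == "bfs" else frontier.pop()
        order.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow links from here
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return order


# Hypothetical site structure used only for this example.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/a/1"],
    "/b": ["/b/1"],
}

print(crawl_order(LINKS, "/", strategy="bfs"))   # ['/', '/a', '/b', '/a/1', '/b/1']
print(crawl_order(LINKS, "/", strategy="lifo"))  # ['/', '/b', '/b/1', '/a', '/a/1']
```

BFS visits every page at one depth before moving deeper, while LIFO follows the most recently discovered link first, so it reaches deep pages sooner.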

## Notes

- All examples use async/await for better performance
- Error handling is included in all examples
- Each example includes detailed comments explaining the options used
- OCR examples support multiple languages
- Crawler examples demonstrate various filtering and control options
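
The async/await and error-handling pattern mentioned above can be sketched as follows. This is a generic `asyncio` illustration, not the code from the examples: `fake_parse` is a hypothetical stand-in for a real parsing call, and `parse_all` is a name chosen for this sketch.

```python
import asyncio


async def fake_parse(path):
    """Hypothetical stand-in for a parsing call; not part of Anyparser."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if not path.endswith(".pdf"):
        raise ValueError(f"unsupported file: {path}")
    return f"parsed:{path}"


async def parse_all(paths):
    """Parse files concurrently and separate successes from failures."""
    # return_exceptions=True keeps one failure from cancelling the rest.
    results = await asyncio.gather(
        *(fake_parse(p) for p in paths), return_exceptions=True
    )
    ok = {p: r for p, r in zip(paths, results) if not isinstance(r, Exception)}
    failed = {p: str(r) for p, r in zip(paths, results) if isinstance(r, Exception)}
    return ok, failed
```

Running `asyncio.run(parse_all(["a.pdf", "b.txt"]))` returns the parsed result for `a.pdf` and the error message for `b.txt`, so a single bad file does not abort the whole batch.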

## OCR and Extraction Features Demonstrated

- OCR capabilities with language support
- OCR performance presets
- Image extraction
- Table extraction

## License

Apache-2.0
