{"id":25843010,"url":"https://github.com/anyparser/anyparser_llamaindex","last_synced_at":"2026-02-11T06:31:02.741Z","repository":{"id":277962890,"uuid":"934071951","full_name":"anyparser/anyparser_llamaindex","owner":"anyparser","description":"Instantly access Anyparser's robust document processing and data extraction capabilities directly within your LlamaIndex workflows. Enhance your AI applications with superior content understanding and data quality.","archived":false,"fork":false,"pushed_at":"2025-02-17T08:24:18.000Z","size":380,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-12-22T06:36:40.489Z","etag":null,"topics":["anyparser","artificial-intelligence","cache-augmented-generation","cag","kag","knowledge-graph","llama-index","llamaindex","llamaindex-rag","rag","retrieval-augmented-generation"],"latest_commit_sha":null,"homepage":"https://anyparser.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/anyparser.png","metadata":{"files":{"readme":"README.md","changelog":"changelogs/v0.0.1-changelog.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-17T08:22:02.000Z","updated_at":"2025-02-17T11:31:32.000Z","dependencies_parsed_at":null,"dependency_job_id":"db3108c5-96a6-4147-b89a-9021127d7693","html_url":"https://github.com/anyparser/anyparser_llamaindex","commit_stats":null,"previous_names":["anyparser/anyparser_llamaindex"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/anyparser/anyparser_llamaindex","repository_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/anyparser%2Fanyparser_llamaindex","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anyparser%2Fanyparser_llamaindex/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anyparser%2Fanyparser_llamaindex/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anyparser%2Fanyparser_llamaindex/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/anyparser","download_url":"https://codeload.github.com/anyparser/anyparser_llamaindex/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anyparser%2Fanyparser_llamaindex/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29328261,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-11T06:13:03.264Z","status":"ssl_error","status_checked_at":"2026-02-11T06:12:55.843Z","response_time":97,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anyparser","artificial-intelligence","cache-augmented-generation","cag","kag","knowledge-graph","llama-index","llamaindex","llamaindex-rag","rag","retrieval-augmented-generation"],"created_at":"2025-03-01T06:35:42.725Z","updated_at":"2026-02-11T06:31:02.737Z","avatar_url":"https://github.com/anyparser.png","language":"Python","readme":"# Anyparser LlamaIndex: Seamless Integration of Anyparser 
with LlamaIndex\n\nhttps://anyparser.com\n\n**Integrate Anyparser's powerful content extraction capabilities with LlamaIndex for enhanced AI workflows.** This integration package enables seamless use of Anyparser's document processing and data extraction features within your LlamaIndex applications, making it easier than ever to build sophisticated AI pipelines.\n\n## Installation\n\n```bash\npip install anyparser-llamaindex\n```\n\n## Setup\n\nBefore running the examples, set your Anyparser API credentials as environment variables:\n\n```bash\nexport ANYPARSER_API_KEY=\"your-api-key\"\nexport ANYPARSER_API_URL=\"https://anyparserapi.com\"\n```\n\n## Anyparser LlamaIndex Examples\n\nThe `examples` directory contains scripts that demonstrate different ways to use the Anyparser LlamaIndex integration.\n\n```bash\npython examples/01_basic_usage.py\npython examples/02_single_file_json.py\npython examples/03_single_file_markdown.py\npython examples/04_multiple_files_json.py\npython examples/05_multiple_files_markdown.py\npython examples/06_load_folder.py\npython examples/07_ocr_markdown.py\npython examples/08_ocr_json.py\npython examples/09_web_crawler.py\n```\n\n## Features Demonstrated\n\n### Document Processing\n- Different output formats (markdown, JSON)\n- Multiple file handling\n- Folder processing\n- Metadata handling\n\n### OCR and Extraction\n- OCR capabilities with language support\n- OCR performance presets\n- Image extraction\n
- Table extraction\n\n### Web Crawling\n- Basic crawling with depth and scope control\n- Advanced URL and content filtering\n- Crawling strategies (BFS, LIFO)\n- Rate limiting and robots.txt respect\n\n## Notes\n\n- All examples use async/await for better performance\n- Error handling is included in all examples\n- Each example includes detailed comments explaining the options used\n- OCR examples support multiple languages\n- Crawler examples demonstrate various filtering and control options\n\n## License\n\nApache-2.0","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanyparser%2Fanyparser_llamaindex","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fanyparser%2Fanyparser_llamaindex","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanyparser%2Fanyparser_llamaindex/lists"}