https://github.com/comparedge/langchain-comparedge
ComparEdge data connector for langchain-comparedge
Last synced: 14 days ago
- Host: GitHub
- URL: https://github.com/comparedge/langchain-comparedge
- Owner: comparedge
- Created: 2026-04-27T17:21:35.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-04-27T17:22:14.000Z (about 2 months ago)
- Last Synced: 2026-04-27T19:20:55.007Z (about 2 months ago)
- Language: Python
- Size: 10.7 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# ComparEdge Data Loader for LangChain
Pulls SaaS product data from the ComparEdge API into LangChain Documents: real pricing plans, features, and ratings. No API key required.
## Quick start
```python
from comparedge_loader import ComparEdgeLoader
# Load all LLM products
loader = ComparEdgeLoader(category="llm", include_pricing=True)
docs = loader.load()
# Each doc: product name, description, pricing plans
for doc in docs[:3]:
    print(doc.metadata["name"], doc.metadata.get("starting_price"))
```
## Parameters
| Param | Type | Default | What it does |
|-------|------|---------|-------------|
| `category` | str or None | None | Filter by slug: `"crm"`, `"llm"`, `"project-management"`, etc. `None` = all products |
| `include_pricing` | bool | True | Add pricing plans to document text + `starting_price` to metadata |
| `include_features` | bool | False | Append feature list to document text (capped at 20 per product) |
### Available category slugs
`accounting`, `ai-agents`, `analytics`, `bi-tools`, `cms`, `crm`, `customer-support`,
`data-pipeline`, `design`, `devops`, `email-marketing`, `erp`, `helpdesk`, `hr`,
`llm`, `marketing-automation`, `monitoring`, `project-management`, `sales`,
`security`, `seo`, `social-media`, `storage`, `video-conferencing`, and more.
Full list: `GET https://comparedge-api.up.railway.app/api/v1/categories`
## Document schema
**page_content**: Markdown-formatted text with the product name, category, description, an optional pricing list, and an optional features list.
**metadata**:
| Key | Type | Description |
|-----|------|-------------|
| `source` | str | Canonical URL on comparedge.com |
| `name` | str | Product display name |
| `slug` | str | URL-safe identifier |
| `category` | str | Category slug |
| `g2_rating` | float or None | G2 rating, `None` when the product has no rating |
| `has_free_tier` | bool | Product has a free plan |
| `starting_price` | float | Lowest paid plan price (when `include_pricing=True`) |
| `website` | str | Vendor homepage |
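
The metadata keys above are enough to filter documents before indexing them. The sketch below is one way to do that. It uses a small stand-in `Doc` class so it runs without LangChain installed, and the helper name `affordable` and the sample values are made up for illustration; they are not part of the loader.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class Doc:
    """Stand-in for a LangChain Document: same two attributes, nothing else."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def affordable(docs: Iterable[Doc], max_price: float,
               min_rating: Optional[float] = None) -> list:
    """Keep docs with a known starting_price <= max_price, cheapest first.

    starting_price is read with .get() because it is only present when the
    loader ran with include_pricing=True; g2_rating may be None.
    """
    selected = []
    for doc in docs:
        price = doc.metadata.get("starting_price")
        if price is None or price > max_price:
            continue
        rating = doc.metadata.get("g2_rating")
        if min_rating is not None and (rating is None or rating < min_rating):
            continue
        selected.append(doc)
    return sorted(selected, key=lambda d: d.metadata["starting_price"])


# Illustrative values only, not real ComparEdge data.
docs = [
    Doc("# Alpha", {"name": "Alpha", "starting_price": 12.0, "g2_rating": 4.5}),
    Doc("# Beta", {"name": "Beta", "starting_price": 30.0, "g2_rating": 4.8}),
    Doc("# Gamma", {"name": "Gamma", "starting_price": 8.0, "g2_rating": None}),
    Doc("# Delta", {"name": "Delta", "g2_rating": 4.1}),
]
print([d.metadata["name"] for d in affordable(docs, max_price=15.0)])
```

Documents without a `starting_price` are skipped rather than treated as free, since a missing key only means pricing was not loaded.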
## Sample document
```
# Notion
Category: project-management
All-in-one workspace for notes, docs, and projects.
## Pricing
- Free: Free
- Plus: $12/user/mo
- Business: $18/user/mo
- Enterprise: Free
```
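
If you only have `page_content` and want the plans back as structured data, something like the following could work. It assumes the text follows the layout of the sample above (a `## Pricing` heading followed by `- Plan: price` lines); the loader's real output may vary, and `parse_pricing` is a hypothetical helper, not part of this package.

```python
import re

SAMPLE = """# Notion
Category: project-management
All-in-one workspace for notes, docs, and projects.
## Pricing
- Free: Free
- Plus: $12/user/mo
- Business: $18/user/mo
- Enterprise: Free
"""

# One bullet per plan: "- <plan name>: <price text>"
PLAN_LINE = re.compile(r"^- (?P<plan>[^:]+): (?P<price>.+)$")


def parse_pricing(page_content: str) -> dict:
    """Return {plan name: price text} from the '## Pricing' section."""
    plans = {}
    in_pricing = False
    for line in page_content.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Any level-2 heading opens or closes the pricing section.
            in_pricing = stripped == "## Pricing"
            continue
        if in_pricing:
            match = PLAN_LINE.match(stripped)
            if match:
                plans[match["plan"].strip()] = match["price"].strip()
    return plans


for plan, price in parse_pricing(SAMPLE).items():
    print(f"{plan}: {price}")
```

Prices stay as text here because values like `Free` and `$12/user/mo` do not share one numeric format; for numeric comparisons the `starting_price` metadata key is the safer source.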
## Use cases
- RAG pipeline for software recommendation chatbots
- Automated vendor evaluation reports
- Price monitoring agents
- SaaS stack analysis
- Competitive intelligence
## Pagination
The loader paginates automatically. All matching products are streamed via `lazy_load()` without loading the full dataset into memory at once.
```python
# Memory-efficient streaming
loader = ComparEdgeLoader()
for doc in loader.lazy_load():
    index(doc)  # placeholder: replace with your own indexing function
```
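
For intuition, lazy pagination of this kind can be written as a generator that requests the next page only when the consumer asks for more items. This is a sketch of the general pattern, not the loader's actual code: the README does not say how the API paginates, so the page-number scheme, the stop-on-empty-page rule, and the `fetch_page` callable are all assumptions.

```python
from typing import Callable, Iterator


def iter_products(fetch_page: Callable[[int], list],
                  start_page: int = 1) -> Iterator[dict]:
    """Yield products page by page until a page comes back empty."""
    page = start_page
    while True:
        items = fetch_page(page)
        if not items:
            return
        yield from items
        page += 1


# Fake three-page backend standing in for the HTTP call (hypothetical data).
PAGES = {1: [{"slug": "a"}, {"slug": "b"}], 2: [{"slug": "c"}], 3: []}
calls = []


def fake_fetch(page: int) -> list:
    calls.append(page)  # record which pages were actually requested
    return PAGES.get(page, [])


stream = iter_products(fake_fetch)
first = next(stream)  # at this point only page 1 has been requested
```

Because nothing is fetched until the generator is advanced, stopping the loop early also stops further requests.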
## API
Base URL: `https://comparedge-api.up.railway.app/api/v1`
No auth required. Be reasonable with request rate.
Docs: https://comparedge-api.up.railway.app/docs
## Integration with LangChain (PR target)
This loader targets `langchain_community.document_loaders`. The PR-ready file is at `langchain_pr/comparedge.py`.
Expected import after merge:
```python
from langchain_community.document_loaders import ComparEdgeLoader
```
## Testing
```bash
# Unit tests (mocked, no network)
python langchain_pr/test_comparedge.py
# Live test against the API
python example.py
```