Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/spider-rs/web-crawling-guides
How to guides on web-crawling or scraping
agents ai-agents ai-scraping clean-markdown crawler fast-webcrawler html-to-markdown llm-webcrawler scraper web-scraping
Last synced: 1 day ago
- Host: GitHub
- URL: https://github.com/spider-rs/web-crawling-guides
- Owner: spider-rs
- Created: 2024-06-16T21:13:38.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-07-21T17:17:09.000Z (4 months ago)
- Last Synced: 2024-07-21T19:05:53.715Z (4 months ago)
- Topics: agents, ai-agents, ai-scraping, clean-markdown, crawler, fast-webcrawler, html-to-markdown, llm-webcrawler, scraper, web-scraping
- Homepage: https://spider.cloud/guides
- Size: 58.6 KB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Spider Web Crawling and Scraping Guides
This repo contains a collection of guides on how to effectively use the Spider service to crawl or scrape. Contributors are welcome! 😁

## Collection
- [Using the Spider API](spider-api.md)
- [How to Use Proxy Mode](proxy-mode.md)
- [LangChain + Groq + Spider = 🚀 (Integration Guide)](langchain-groq.md)
- [CrewAI Spider Stock Research](crewai-spider-research-agent.md)
- [Extracting Contacts](extracting-contacts.md)
- [Automated Cold Email Outreach Using Spider](auto-email-response-outreach.md)
- [How to Archive Full Website](website-archiving.md)
- Building A Speedy Resilient Web Scraper for RAG AI ([Part 1](building-a-speedy-resilient-web-scraper-for-rag-ai-part1-preparing.md), [Part 2](building-a-speedy-resilient-web-scraper-for-rag-ai-part2-scaling-up.md))
- [Agents from Scratch](ai-agent-from-scratch.md)

## Contribute
We're happy to accept requests in the issue tracker, improvements to the existing content, and additional guides.