https://github.com/pythonicshariful/pythonicshariful
Python developer specializing in web scraping, automation, and machine learning. Experienced in building scalable scrapers, bypassing anti-bot systems, processing large datasets, and applying ML/NLP models to real-world problems.
- Host: GitHub
- URL: https://github.com/pythonicshariful/pythonicshariful
- Owner: pythonicshariful
- Created: 2025-09-27T18:59:15.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2026-05-18T02:09:52.000Z (20 days ago)
- Last Synced: 2026-05-18T04:28:35.210Z (20 days ago)
- Homepage:
- Size: 9.77 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
---
## 🕸️ What I Do
- **Web scraping at scale** with Python (Selenium, Playwright, Scrapy, BeautifulSoup)
- **Automation pipelines** for data collection, cleaning, and storage
- **API integrations** (REST/GraphQL) and browser automation
- **Data wrangling** with Pandas, exporting to CSV/JSON/DB
- **Learning ML & AI** to build smarter data products
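As a rough illustration of the cleaning and export step, here is a minimal sketch that uses only the standard library rather than Pandas so it runs anywhere. The field names (`title`, `url`), the `pages` table, and all function names are hypothetical, chosen only for this example.

```python
import csv
import io
import json
import sqlite3


def clean_records(rows):
    """Strip whitespace, drop rows without a title, and de-duplicate by URL."""
    seen = set()
    cleaned = []
    for row in rows:
        title = (row.get("title") or "").strip()
        url = (row.get("url") or "").strip()
        if not title or url in seen:
            continue
        seen.add(url)
        cleaned.append({"title": title, "url": url})
    return cleaned


def to_csv(rows):
    """Serialize cleaned rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["title", "url"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_json(rows):
    """Serialize cleaned rows to pretty-printed JSON."""
    return json.dumps(rows, indent=2)


def to_sqlite(rows, conn):
    """Upsert cleaned rows into a hypothetical `pages` table keyed by URL."""
    conn.execute("CREATE TABLE IF NOT EXISTS pages (title TEXT, url TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR REPLACE INTO pages (title, url) VALUES (:title, :url)", rows
    )
    conn.commit()
```

Keying the table on the URL means re-running a scrape overwrites existing rows instead of duplicating them.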
### Tech Stack
---
## ✨ Highlights
- Built bots that **extract thousands of pages/day** with rotating proxies & retries
- Designed resilient **anti-bot bypass** flows (stealth drivers, human-like waits, CAPTCHAs handled via third-party services)
- Delivered **clean datasets** ready for analysis & model training
- Currently exploring **feature engineering**, **vector databases**, and **LLM-powered** scraping assistants
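The "rotating proxies & retries" pattern can be sketched roughly as below. This is an assumption-laden illustration, not the code behind the bots described above: `fetch` is a hypothetical callable supplied by the caller (for example a thin wrapper around an HTTP client), and the parameter names and backoff schedule are invented for the example.

```python
import itertools
import time


def fetch_with_retries(url, fetch, proxies, max_attempts=4, base_delay=1.0,
                       sleep=time.sleep):
    """Call `fetch(url, proxy)` until it succeeds, rotating through `proxies`.

    `fetch` is a hypothetical caller-supplied function that returns the page
    body or raises on failure. Waits base_delay * 2**attempt between tries.
    """
    if not proxies:
        raise ValueError("at least one proxy is required")
    proxy_cycle = itertools.cycle(proxies)
    last_error = None
    for attempt in range(max_attempts):
        proxy = next(proxy_cycle)  # move to the next proxy on every attempt
        try:
            return fetch(url, proxy)
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # exponential backoff
    raise RuntimeError(f"all {max_attempts} attempts failed for {url}") from last_error
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.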
---
## GitHub Stats
---
## 🧪 ML & AI Learning Journey
- 🎯 Current focus: **data labeling, feature engineering, small ML models for classification/regression**
- Next up: **LLM-assisted scraping**, **RAG for document-heavy sites**, **agent workflows**
- Notes & experiments live here → [`/labs`](https://github.com/pythonicshariful/labs)
---
## Example Services I Offer
- Full-site data extraction (anti-bot aware) → CSV/JSON/DB
- PDF/image capture & text extraction (OCR)
- API discovery & reverse engineering for private endpoints
- Dashboard/API to deliver data (FastAPI + simple UI)
- Ongoing monitoring for **price changes**, **stock**, **new listings**
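Monitoring of this kind usually comes down to comparing two snapshots of the same listing. A minimal sketch, assuming each snapshot is a plain `{item_id: price}` dictionary (the function name and the shape of the result are hypothetical):

```python
def diff_snapshots(previous, current):
    """Compare two {item_id: price} snapshots and report what changed."""
    changes = {"new": [], "removed": [], "price_changed": []}
    for item_id, price in current.items():
        if item_id not in previous:
            changes["new"].append(item_id)
        elif previous[item_id] != price:
            # Record (item, old price, new price) for alerting or logging.
            changes["price_changed"].append((item_id, previous[item_id], price))
    for item_id in previous:
        if item_id not in current:
            changes["removed"].append(item_id)
    return changes


# Example usage with made-up data:
yesterday = {"sku-1": 19.99, "sku-2": 5.00}
today = {"sku-1": 17.99, "sku-3": 42.00}
report = diff_snapshots(yesterday, today)
```

Treating a disappeared item as "removed" is a simplification; a real monitor would likely distinguish out-of-stock items from delisted ones.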
> **Need data?** Open an issue or reach out!
---
## Connect
---
## Fun
---
Made with ❤️, Python, and a lot of headless browsers.