{"id":29906694,"url":"https://github.com/shivah12/webscrapper-python","last_synced_at":"2026-06-20T12:32:18.961Z","repository":{"id":307701068,"uuid":"1030429771","full_name":"shivah12/WebScrapper-python","owner":"shivah12","description":"AI-Powered Web Scraper built with Streamlit that extracts tables, headings, and rows from any webpage. Users can input a URL and data type, visualize results, and export to CSV. Powered by  BeautifulSoup, and Pandas.","archived":false,"fork":false,"pushed_at":"2025-08-01T17:45:02.000Z","size":24,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-03T17:47:04.107Z","etag":null,"topics":["beautifulsoup","matplotlib","python3","streamlit","webscraping"],"latest_commit_sha":null,"homepage":"https://webscrapper-python.streamlit.app/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shivah12.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-01T16:14:46.000Z","updated_at":"2025-08-01T17:45:06.000Z","dependencies_parsed_at":"2025-08-01T18:33:48.208Z","dependency_job_id":"d54d30e6-79b7-40bd-b67c-d7152a72c9c7","html_url":"https://github.com/shivah12/WebScrapper-python","commit_stats":null,"previous_names":["shivah12/webscrapper-python"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/shivah12/WebScrapper-python","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shivah12%2FWebScrapper-python","tags_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/shivah12%2FWebScrapper-python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shivah12%2FWebScrapper-python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shivah12%2FWebScrapper-python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shivah12","download_url":"https://codeload.github.com/shivah12/WebScrapper-python/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shivah12%2FWebScrapper-python/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34570534,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-20T02:00:06.407Z","response_time":98,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup","matplotlib","python3","streamlit","webscraping"],"created_at":"2025-08-01T21:12:12.985Z","updated_at":"2026-06-20T12:32:18.945Z","avatar_url":"https://github.com/shivah12.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI Web Scraper with Streamlit\n\nThis project is a web-based tool built using Streamlit and BeautifulSoup that allows users to extract specific data (tables, headings, or rows) from webpages. 
The tool enables quick preview, visualization, and CSV export of the scraped data.\n\n## Features\n\n- Input any webpage URL\n- Select the type of data to extract: full table, headings, or specific rows\n- Visualize the extracted data using basic charts\n- Export the data to a CSV file\n- Simple and user-friendly interface\n\n## Getting Started\n\n### 1. Clone the repository\n\n```bash\ngit clone https://github.com/shivah12/WebScrapper-python.git\ncd WebScrapper-python\n```\n\n### 2. Create and activate a virtual environment (recommended)\n\n**On Windows:**\n\n```bash\npython -m venv venv\nvenv\\Scripts\\activate\n```\n\n**On macOS/Linux:**\n\n```bash\npython3 -m venv venv\nsource venv/bin/activate\n```\n\n### 3. Install dependencies\n\n```bash\npip install -r requirements.txt\n```\n\n### 4. Run the app\n\n```bash\nstreamlit run app.py\n```\n\n## Project Structure\n\n```\napp/\n├── app.py               # Main Streamlit app\n├── scraper.py           # Contains scraping logic\n├── interpreter.py       # AI prompt → selector logic (optional)\n├── utils.py             # CSV export, error handling\n└── requirements.txt\n```\n\n## Secrets Setup (Optional)\n\nIf using an OpenAI API key or other environment variables, create `.streamlit/secrets.toml` like this:\n\n```toml\n[openai]\napi_key = \"your-openai-api-key\"\n```\n\nAccess it in your code as:\n\n```python\nst.secrets[\"openai\"][\"api_key\"]\n```\n\n## Dependencies\n\n* streamlit\n* pandas\n* beautifulsoup4\n* lxml\n* matplotlib\n\n## Deployment\n\nYou can deploy this project on Streamlit Cloud. 
Just upload your code and add your `secrets.toml` in the Settings → Secrets section of your app.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshivah12%2Fwebscrapper-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshivah12%2Fwebscrapper-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshivah12%2Fwebscrapper-python/lists"}