{"id":25452798,"url":"https://github.com/yllvar/modalaiscraper","last_synced_at":"2025-05-16T13:10:32.538Z","repository":{"id":270952850,"uuid":"911950250","full_name":"yllvar/modalAiscraper","owner":"yllvar","description":"Using Modal Ai model to scrapping website with Puppeteer","archived":false,"fork":false,"pushed_at":"2025-01-04T09:44:59.000Z","size":69,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-17T23:41:59.825Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yllvar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-04T09:31:15.000Z","updated_at":"2025-01-04T09:45:02.000Z","dependencies_parsed_at":"2025-01-04T11:00:19.835Z","dependency_job_id":"9e7ecf10-6e2b-468b-9eb4-d60248d9bf60","html_url":"https://github.com/yllvar/modalAiscraper","commit_stats":null,"previous_names":["yllvar/modalaiscraper"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yllvar%2FmodalAiscraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yllvar%2FmodalAiscraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yllvar%2FmodalAiscraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yllvar%2FmodalAiscraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yllvar","download_ur
l":"https://codeload.github.com/yllvar/modalAiscraper/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254535822,"owners_count":22087399,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-17T23:41:38.986Z","updated_at":"2025-05-16T13:10:32.518Z","avatar_url":"https://github.com/yllvar.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Modal AI Web Scraper\n\n## Overview\n**Modal AI Web Scraper** is a tool that combines web scraping capabilities with AI-powered analysis using Modal's advanced GPU infrastructure. 
This project enables users to input a URL, scrape its content, and perform AI analysis on the extracted data, all through a user-friendly web interface.\n\n\n\u003cimg width=\"1350\" alt=\"Screenshot 2025-01-04 at 17 37 34\" src=\"https://github.com/user-attachments/assets/2e248d04-c86a-4d40-968b-61b7e1d2774e\" /\u003e\n\n---\n\n\u003cimg width=\"809\" alt=\"Screenshot 2025-01-04 at 17 32 43\" src=\"https://github.com/user-attachments/assets/23237abd-a1aa-4073-a97f-d583cf771d23\" /\u003e\n\n\n## Features\n- **Web scraping** using server-side fetch API.\n- **AI-powered content analysis** leveraging Modal's L40S GPU.\n- **Real-time logging** and error handling.\n- **User-friendly interface** built with React and Next.js.\n- **Responsive design** powered by Tailwind CSS.\n\n---\n\n## Prerequisites\nEnsure you have the following installed before starting:\n- **Node.js** (v14 or later).\n- **npm** or **yarn**.\n- A **Modal account** with an API key.\n- A **Vercel account** (for deployment).\n\n---\n\n## Installation\n\n### 1. Clone the repository:\n```bash\ngit clone https://github.com/yourusername/modal-ai-web-scraper.git\ncd modal-ai-web-scraper\n```\n---\n\n### 2. Install dependencies:\n```bash\nnpm install\n# or\nyarn install\n```\n---\n\n### 3. Set up environment variables:\nCreate a `.env.local` file in the root directory and add your Modal API key:\n```env\nMODAL_API_KEY=your_modal_api_key_here\n```\n\n---\n\n## Usage\n\n### Run the development server:\n```bash\nnpm run dev\n# or\nyarn dev\n```\n\nVisit [http://localhost:3000](http://localhost:3000) in your browser to access the application.\n\n### Using the web scraper:\n1. Enter a URL in the input field.\n2. Click the **\"Scrape and Analyze Content\"** button.\n3. 
Explore the results in the tabs below:\n   - **Analysis Results**: AI-generated insights about the content.\n   - **Raw HTML**: The scraped HTML content.\n   - **Links**: List of links found on the page.\n   - **Process Logs**: Detailed logs of the scraping and analysis process.\n\n---\n\n## Deployment\n\nThis project is optimized for deployment on **Vercel**. To deploy:\n1. Push your code to a GitHub repository.\n2. Connect the repository to Vercel.\n3. Set the `MODAL_API_KEY` environment variable in your Vercel project settings.\n4. Deploy your project.\n\n---\n\n## Project Structure\n```plaintext\napp/                 # Next.js app directory\napi/                 # API routes for scraping and logging\npage.tsx             # Main page component\ncomponents/          # Reusable React components\nlib/                 # Utility functions and modules\npublic/              # Static assets\nmodal_functions.py   # Python script for Modal AI functions\n```\n\n---\n\n## Contributing\nWe welcome contributions! Follow these steps to contribute:\n1. **Fork** the repository.\n2. Create a **new branch**: `git checkout -b feature/your-feature-name`.\n3. Make your changes and **commit them**: `git commit -m 'Add some feature'`.\n4. **Push** to the branch: `git push origin feature/your-feature-name`.\n5. Submit a **pull request**.\n\n---\n\n## License\nThis project is licensed under the MIT License. 
See the [LICENSE](LICENSE) file for details.\n\n---\n\n## Acknowledgements\n- **[Next.js](https://nextjs.org/)** for the React framework.\n- **[Modal](https://modal.com/)** for AI infrastructure.\n- **[Tailwind CSS](https://tailwindcss.com/)** for styling.\n- **[Vercel](https://vercel.com/)** for hosting and deployment.\n\n---\n\n## Contact\nFor questions or feedback, open an issue on the [GitHub repository](https://github.com/yourusername/modal-ai-web-scraper).\n\n**Happy scraping and analyzing!**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyllvar%2Fmodalaiscraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyllvar%2Fmodalaiscraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyllvar%2Fmodalaiscraper/lists"}