{"id":31214346,"url":"https://github.com/justcodeit7/go-fetch","last_synced_at":"2025-10-04T09:21:53.118Z","repository":{"id":305996620,"uuid":"1023567567","full_name":"JustCodeIt7/Go-Fetch","owner":"JustCodeIt7","description":null,"archived":false,"fork":false,"pushed_at":"2025-07-23T03:18:38.000Z","size":21,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-23T05:23:15.563Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JustCodeIt7.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-21T11:01:54.000Z","updated_at":"2025-07-23T03:18:42.000Z","dependencies_parsed_at":"2025-07-23T05:23:34.582Z","dependency_job_id":"6dfea149-45c4-4b59-bc46-66612cd64cab","html_url":"https://github.com/JustCodeIt7/Go-Fetch","commit_stats":null,"previous_names":["justcodeit7/go-fetch"],"tags_count":null,"template":false,"template_full_name":"JustCodeIt7/Python_Template","purl":"pkg:github/JustCodeIt7/Go-Fetch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JustCodeIt7%2FGo-Fetch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JustCodeIt7%2FGo-Fetch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JustCodeIt7%2FGo-Fetch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JustCodeIt7%2FGo-Fetch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/owners/JustCodeIt7","download_url":"https://codeload.github.com/JustCodeIt7/Go-Fetch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JustCodeIt7%2FGo-Fetch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":276223224,"owners_count":25605793,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-21T02:00:07.055Z","response_time":72,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-21T09:35:05.240Z","updated_at":"2025-09-21T09:35:10.088Z","avatar_url":"https://github.com/JustCodeIt7.png","language":"Python","readme":"# Go-Fetch - Web Crawler\n\nGo-Fetch is a user-friendly web crawling application built with Streamlit and powered by the `crawl4ai` library. 
It provides an intuitive graphical interface to configure and execute web crawls, allowing you to extract content from websites and download it in various formats.\n\n## Features\n\n- **Configurable Crawl Settings**: Easily set the target URL, maximum crawl depth, and the number of pages to crawl.\n- **Multiple Crawling Strategies**: Choose between Breadth-First Search (BFS), Depth-First Search (DFS), and Best-First crawling strategies to suit your needs.\n- **External Link Inclusion**: Option to include or exclude external links during the crawl.\n- **Keyword Relevance Scoring**: For the Best-First strategy, prioritize pages based on specified keywords.\n- **Real-time Progress Updates**: Monitor the crawling process with live progress indicators.\n- **Content Download**: Download all crawled content as a single Markdown file or as a ZIP archive containing individual Markdown files for each page.\n- **Page Preview**: Quickly preview the content of the first crawled page directly within the application.\n\n## Installation\n\nTo get started with Go-Fetch, follow these steps:\n\n1.  **Clone the repository**:\n\n    ```bash\n    git clone https://github.com/your-username/Go-Fetch.git\n    cd Go-Fetch\n    ```\n\n    *(Note: Replace `https://github.com/your-username/Go-Fetch.git` with the actual repository URL if it's hosted elsewhere.)*\n\n2.  **Install dependencies**:\n\n    Go-Fetch requires `streamlit` and `crawl4ai`. You can install them using pip:\n\n    ```bash\n    pip install streamlit crawl4ai\n    ```\n\n3.  **Set up `crawl4ai`**:\n\n    Before your first use, you need to run the `crawl4ai-setup` command:\n\n    ```bash\n    crawl4ai-setup\n    ```\n\n## Usage\n\n1.  **Run the Streamlit application**:\n\n    Navigate to the project directory in your terminal and run:\n\n    ```bash\n    streamlit run main.py\n    ```\n\n2.  
**Access the application**:\n\n    Your web browser will automatically open to the Streamlit application (usually at `http://localhost:8501`).\n\n3.  **Configure and Crawl**:\n\n    -   Enter the target website URL in the sidebar.\n    -   Adjust the crawl settings (Max Crawl Depth, Max Pages to Crawl, Strategy, etc.) as desired.\n    -   Click the \"Start Crawling\" button to begin the process.\n\n4.  **Download and Preview**:\n\n    Once the crawl is complete, you will see options to download the content as a combined Markdown file or a ZIP archive. You can also preview the first crawled page.\n\n## License\n\nThis project is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for details.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjustcodeit7%2Fgo-fetch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjustcodeit7%2Fgo-fetch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjustcodeit7%2Fgo-fetch/lists"}