{"id":26740626,"url":"https://github.com/joelstephen97/tracr","last_synced_at":"2025-03-28T05:20:14.817Z","repository":{"id":284836115,"uuid":"956225617","full_name":"joelstephen97/tracr","owner":"joelstephen97","description":"useful for searching for images in a recursive fashion given starting url","archived":false,"fork":false,"pushed_at":"2025-03-27T22:46:16.000Z","size":0,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-27T23:28:36.130Z","etag":null,"topics":["beautifulsoup4","image-extraction","python3"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/joelstephen97.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-27T22:44:20.000Z","updated_at":"2025-03-27T22:46:20.000Z","dependencies_parsed_at":"2025-03-27T23:38:42.614Z","dependency_job_id":null,"html_url":"https://github.com/joelstephen97/tracr","commit_stats":null,"previous_names":["joelstephen97/tracr"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joelstephen97%2Ftracr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joelstephen97%2Ftracr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joelstephen97%2Ftracr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joelstephen97%2Ftracr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners
/joelstephen97","download_url":"https://codeload.github.com/joelstephen97/tracr/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245973683,"owners_count":20702883,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup4","image-extraction","python3"],"created_at":"2025-03-28T05:20:14.372Z","updated_at":"2025-03-28T05:20:14.805Z","avatar_url":"https://github.com/joelstephen97.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Tracr Tool\n\nTracer is a Python tool that follows a given URL, scrapes images from the website along with their metadata (EXIF), and organizes them into a folder tree reflecting the link hierarchy. If the link contains further links, Tracer will recursively follow them up to a specified depth (default is 5, configurable via command-line arguments).\n\n## Features\n\n- **Recursive Traversal:** Follow links up to a configurable maximum depth.\n- **Image Downloading:** Downloads all images found on each page.\n- **Metadata Extraction:** Saves image metadata (format, size, mode, and EXIF data if available) into a corresponding text file.\n- **Folder Organization:** Creates a folder structure that mirrors the link hierarchy.\n\n## Requirements\n\n- Python 3.x\n- [requests](https://pypi.org/project/requests/)\n- [beautifulsoup4](https://pypi.org/project/beautifulsoup4/)\n- [Pillow](https://pypi.org/project/Pillow/)\n\n## Installation\n\n1. **Clone the Repository:**\n   ```bash\n   git clone \u003crepository_url\u003e\n   cd tracr\n   ```\n\n2. 
**Create a Virtual Environment:**\n   On Windows, open PowerShell and run:\n   ```powershell\n   python -m venv venv\n   ```\n\n3. **Activate the Virtual Environment:**\n\n   - **Using PowerShell:**  \n     If you encounter an execution policy error, run the following in your own PowerShell session (the `CurrentUser` scope does not require Administrator rights):\n     ```powershell\n     Set-ExecutionPolicy RemoteSigned -Scope CurrentUser\n     ```\n     Then activate the virtual environment:\n     ```powershell\n     .\\venv\\Scripts\\Activate.ps1\n     ```\n\n   - **Using Command Prompt:**  \n     ```cmd\n     venv\\Scripts\\activate\n     ```\n\n   Once activated, your command prompt is prefixed with `(venv)`.\n\n4. **Install Dependencies:**\n   With the virtual environment activated, install all required packages using:\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n## Usage\n\nRun the Tracer tool with the following command:\n```bash\npython tracer.py \u003cstarting_url\u003e --depth \u003cmax_depth\u003e --output \u003coutput_folder\u003e\n```\n- `\u003cstarting_url\u003e`: The URL where the tracer begins.\n- `--depth \u003cmax_depth\u003e`: (Optional) The maximum depth to traverse. Default is 5.\n- `--output \u003coutput_folder\u003e`: (Optional) The folder where output will be stored. 
Default is `output`.\n\nExample:\n```bash\npython tracer.py https://example.com --depth 5 --output tracer_output\n```\n\n## Virtual Environment Management\n\n### Activating the Virtual Environment\n- **PowerShell:**\n  ```powershell\n  .\\venv\\Scripts\\Activate.ps1\n  ```\n- **Command Prompt:**\n  ```cmd\n  venv\\Scripts\\activate\n  ```\n\n### Deactivating the Virtual Environment\n\nTo deactivate the virtual environment, simply run:\n```bash\ndeactivate\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoelstephen97%2Ftracr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjoelstephen97%2Ftracr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoelstephen97%2Ftracr/lists"}