{"id":18310214,"url":"https://github.com/dedsecinside/gotor","last_synced_at":"2025-04-09T20:00:22.210Z","repository":{"id":38893903,"uuid":"135831463","full_name":"DedSecInside/gotor","owner":"DedSecInside","description":"This program provides efficient web scraping services for Tor and non-Tor sites. The program has both a CLI and  REST API. ","archived":false,"fork":false,"pushed_at":"2025-03-13T00:26:47.000Z","size":11206,"stargazers_count":166,"open_issues_count":4,"forks_count":44,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-04-02T19:00:48.454Z","etag":null,"topics":["cli","command-line","command-line-tool","docker","go","golang","golang-server","hacktoberfest","http-server","information-extraction","osint","osint-tools","rest-api","service","tor","torbot","webcrawler","webcrawling","webscraping"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DedSecInside.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-06-02T15:55:03.000Z","updated_at":"2025-03-13T07:19:47.000Z","dependencies_parsed_at":"2023-10-13T12:45:23.187Z","dependency_job_id":"4fca7274-c153-4a17-9b19-c7d1983c4b54","html_url":"https://github.com/DedSecInside/gotor","commit_stats":null,"previous_names":["kingakeem/gotor"],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DedSecInside%2Fgotor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DedSecInsid
e%2Fgotor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DedSecInside%2Fgotor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DedSecInside%2Fgotor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DedSecInside","download_url":"https://codeload.github.com/DedSecInside/gotor/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248103877,"owners_count":21048245,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","command-line","command-line-tool","docker","go","golang","golang-server","hacktoberfest","http-server","information-extraction","osint","osint-tools","rest-api","service","tor","torbot","webcrawler","webcrawling","webscraping"],"created_at":"2024-11-05T16:13:41.278Z","updated_at":"2025-04-09T20:00:22.152Z","avatar_url":"https://github.com/DedSecInside.png","language":"Go","readme":"# GoTor - HTTP REST API and Web Crawling Tool with TOR Integration\n\nThis repository contains an HTTP REST API and a command-line program designed for efficient data gathering and analysis through web crawling using the TOR network. 
While the program is primarily designed to work seamlessly with TorBot, the API and CLI can also operate independently.\n\n## Status/Social Links\n[![Go](https://github.com/DedSecInside/gotor/actions/workflows/go.yml/badge.svg)](https://github.com/DedSecInside/gotor/actions/workflows/go.yml)\n[![Open Source Helpers](https://www.codetriage.com/kingakeem/gotor/badges/users.svg)](https://www.codetriage.com/kingakeem/gotor)\n[![](https://img.shields.io/badge/Made%20with-Go-blue.svg?style=flat-square)]()\n![image](https://github.com/DedSecInside/gotor/assets/13573860/9705fcbf-055c-4024-9f36-1bd4bea71442)\n\n## Features and Options\n\n### Main Arguments\n* `-url`: URL to crawl. (Required Argument)\n* `-depth`: Depth of the tree. (default: 1)\n\n### TOR Integration\nThe program employs the TOR network for enhanced privacy and security during web crawling. TOR settings can be configured using environment variables or overridden using CLI flags.\n\n* `-socks5-host`: Specify the SOCKS5 proxy host (default: localhost / 127.0.0.1)\n* `-socks5-port`: Specify the SOCKS5 proxy port (default: 9050)\n* `-disable-socks5`: Run the program without the SOCKS5 proxy. \n\n### REST API\n* `-server-host`: Specify the host that the server runs on (default: localhost / 127.0.0.1)\n* `-server-port`: Specify the port that the server runs on (default: 8081)\n* `-s`: Run the program as a service\n\n### Other options\n* `-d`: Download the results to an Excel spreadsheet (.xlsx)\n* `-f`: Output format for the results. Options are list or tree. (default: list)\n\n### Available Crawling Mechanisms\n1. **Building Relationship Tree of Links**: Generates a hierarchical tree of links, with child nodes representing links found on a website.\n2. **Getting Tor Client IP**: Retrieves the IP address of the current TOR client.\n3. **Retrieving Phone Numbers**: Collects phone numbers found on websites.\n4. 
**Retrieving Emails**: Gathers email addresses found on websites.\n\n#### Example Usage\nTo start the HTTP server and initiate crawling, use the following command:\n```bash\ngo run cmd/main/gotor.go -s\n```\n\nTo use an alternate host and port for the server and the SOCKS5 proxy:\n```bash\ngo run cmd/main/gotor.go -s -server-host 192.6.8.124 -server-port 8088 -socks5-host 127.0.0.1 -socks5-port 9051\n```\n\nTo crawl directly using the CLI and output the results to an Excel file, use the following command:\n```bash\ngo run cmd/main/gotor.go -url https://example.com -depth 2 -d\n```\n\n## Running with Docker\nTo run the server using Docker, a convenience script, `scripts/build.sh`, is provided. This script builds a Docker network service for Tor and connects it to the \"gotor\" Docker container.\nThe script uses the port set in `SOCKS5_PORT`, so make sure no other service is using that port.\n\n### To build and start the Docker containers:\n```bash\n./scripts/build.sh\n```\n\n### To stop and destroy the Docker containers:\n```bash\n./scripts/destroy.sh\n```\n\n## Documentation\nThis project includes comprehensive code comments to facilitate documentation generation with godoc. To generate and access documentation, use the following command:\n\n```bash\ngodoc -v -http=:6060\n```\nThe documentation is then available at http://127.0.0.1:6060.\n\n## License\nThis project is licensed under the GNU General Public License v3.0.\n\nFeel free to contribute, report issues, or suggest improvements!\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdedsecinside%2Fgotor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdedsecinside%2Fgotor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdedsecinside%2Fgotor/lists"}