{"id":21824831,"url":"https://github.com/martelogan/cftrace","last_synced_at":"2026-04-10T22:50:02.750Z","repository":{"id":265046237,"uuid":"894959541","full_name":"martelogan/cftrace","owner":"martelogan","description":"Experiments with worldwide traceroutes","archived":false,"fork":false,"pushed_at":"2024-12-16T07:07:48.000Z","size":103,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-26T08:14:17.602Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/martelogan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-27T10:05:37.000Z","updated_at":"2024-12-16T07:07:52.000Z","dependencies_parsed_at":"2024-11-27T11:23:27.557Z","dependency_job_id":"04ddae55-2378-465c-a143-f70d61e88d37","html_url":"https://github.com/martelogan/cftrace","commit_stats":null,"previous_names":["martelogan/cftrace"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martelogan%2Fcftrace","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martelogan%2Fcftrace/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martelogan%2Fcftrace/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/martelogan%2Fcftrace/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/martelogan","download_url":"https://codeload.g
ithub.com/martelogan/cftrace/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244795220,"owners_count":20511519,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-27T18:00:26.033Z","updated_at":"2026-04-10T22:49:57.706Z","avatar_url":"https://github.com/martelogan.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Traceroute Data Collection Tool\n\nThis tool collects traceroute data from Cloudflare colos to a specified target IP or domain and organizes the results into structured JSON files and aggregate CSVs for analysis. It supports geo-location lookups, hop analysis, and optional GCP region identification for targets hosted on Google Cloud.\n\n---\n\n## **Directory Structure**\nThe results are saved in a structured directory hierarchy based on colos and regions:\n\n- **Regions**: `na/` (North America), `emea/` (Europe, Middle East, and Africa), `apac/` (Asia-Pacific), `latam/` (Latin America).\n- **Files**: Each file is named using a human-readable target name and the target IP, e.g., `c1-gclb_34.49.121.93.json`.\n\n---\n\n## **CSV Formats**\n\n### **Traceroute Summary CSV**\n\nThe `traceroute_summary.csv` summarizes traceroute data for each `\u003ccolo, target\u003e` pair. 
Below is the list of columns in the CSV, ordered as they appear:\n\n| **Column Name**             | **Description**                                                                                      |\n|-----------------------------|--------------------------------------------------------------------------------------------------|\n| `start_region`              | Region grouping for the colo, e.g., `na`, `emea`, etc.                                           |\n| `start_colo`                | The colo's short name, e.g., `pdx`.                                                              |\n| `trace_target`              | Human-readable name for the target IP, e.g., `c1-gclb`.                                          |\n| `rtt_ms`                    | Mean round-trip time across all hops in milliseconds (rounded to an integer).                    |\n| `hops_count`                | Total number of hops in the traceroute.                                                          |\n| `start_city`                | City, state, and country of the colo, e.g., `Portland, OR, US`.                                  |\n| `approx_final_hop`          | Geo-location of the final hop, e.g., `Mountain View, CA, US`.                                    |\n| `approx_nearest_gcp`        | Closest GCP region to the final hop, e.g., `us-east1`.                                           |\n| `target_distance_km`        | Approximate distance (in kilometers) between the colo and the target. Defaults to `unknown`.     |\n| `start_subcolo`             | The sub-colo (if provided in the API response), e.g., `pdx02`.                                   |\n| `target_ip`                 | Target IP address, e.g., `34.49.121.93`.                                                         |\n| `target_domain`             | Domain name of the target, e.g., `google.com`. Defaults to `unknown` if unavailable.             |\n| `traceroute_time_ms`        | Total time for the traceroute in milliseconds.  
                                                 |\n| `traceroute_packet_count`   | Total packets sent to the target.                                                                |\n| `min_rtt_ms`                | Minimum RTT observed across all hops.                                                           |\n| `max_rtt_ms`                | Maximum RTT observed across all hops.                                                           |\n| `std_dev_rtt_ms`            | Standard deviation of RTTs across all hops.                                                     |\n| `colo_lat`                  | Latitude of the colo. Defaults to `unknown` if unavailable.                                      |\n| `colo_long`                 | Longitude of the colo. Defaults to `unknown` if unavailable.                                     |\n| `colo_country`              | Country of the colo, e.g., `US`. Defaults to `unknown`.                                          |\n| `target_lat`                | Latitude of the target. Defaults to `unknown` if unavailable.                                    |\n| `target_long`               | Longitude of the target. Defaults to `unknown` if unavailable.                                   |\n| `target_country`            | Country of the target, e.g., `US`. Defaults to `unknown`.                                        |\n\n### **Skipped Colos CSV**\n\nThe `skipped_colos.csv` tracks colos that were skipped during data collection. Below is the list of columns:\n\n| **Column Name**             | **Description**                                                                                  |\n|-----------------------------|--------------------------------------------------------------------------------------------------|\n| `start_region`              | Region grouping for the colo, e.g., `na`, `emea`, etc.                                          |\n| `start_colo`                | The colo's short name, e.g., `pdx`.                          
                                   |\n| `trace_target`              | Human-readable name for the target IP, e.g., `c1-gclb`.                                         |\n| `target_ip`                 | Target IP address, e.g., `34.49.121.93`.                                                        |\n| `target_domain`             | Domain name of the target, e.g., `google.com`. Defaults to `unknown` if unavailable.            |\n| `skipped_reason`            | Reason the colo was skipped, e.g., `no_traceroute_response`.                                    |\n\n---\n\n## **Geo-Location and GCP Region Lookup**\n\n### Geo-Location\n- Geo-location data (`colo_lat`, `colo_long`, `colo_country`, `target_lat`, `target_long`, `target_country`) is fetched using a GeoIP API.\n- Defaults to `unknown` if the API call fails or data is missing.\n\n### GCP Region\n- When the `target_is_gcp=true` flag is enabled, the `approx_nearest_gcp` column identifies the closest GCP region (e.g., `us-east1`) based on the final hop's geo-location.\n- This is implemented using an in-memory lookup table mapping GCP regions to approximate lat/long coordinates.\n\n---\n\n## **Hop Analysis**\n\n### Congested Hops\n- Hops with **\u003e50% packet loss** are flagged as congested.\n- Stored in the JSON result files.\n\n### Slowest Hops\n- Hops with RTT \u003e1 standard deviation above the mean are flagged as slow.\n- Stored in the JSON result files.\n\n---\n\n## **Usage**\n\n```bash\nruby traceroute_collector.rb --traceroute-uri URI --output-dir DIR --cf-colo-file FILE --targets IP:NAME:DOMAIN --region REGION\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmartelogan%2Fcftrace","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmartelogan%2Fcftrace","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmartelogan%2Fcftrace/lists"}