{"id":23431037,"url":"https://github.com/jroakes/npath","last_synced_at":"2026-04-28T08:02:57.292Z","repository":{"id":203295006,"uuid":"709266844","full_name":"jroakes/Npath","owner":"jroakes","description":"Exploring path sequences in GA4 BigQuery data","archived":false,"fork":false,"pushed_at":"2023-11-27T14:46:46.000Z","size":186,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2026-03-29T21:44:31.375Z","etag":null,"topics":["analytics","bigquery","pathfinding-algorithm"],"latest_commit_sha":null,"homepage":"https://locomotive.agency","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jroakes.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-24T11:23:25.000Z","updated_at":"2026-01-09T23:55:07.000Z","dependencies_parsed_at":null,"dependency_job_id":"814b66c4-291e-4eb7-9bbd-edbc70594d0a","html_url":"https://github.com/jroakes/Npath","commit_stats":null,"previous_names":["jroakes/npath"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jroakes/Npath","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jroakes%2FNpath","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jroakes%2FNpath/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jroakes%2FNpath/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jroakes%2FNpath/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jroakes","downl
oad_url":"https://codeload.github.com/jroakes/Npath/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jroakes%2FNpath/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32371673,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-27T20:07:02.737Z","status":"online","status_checked_at":"2026-04-28T02:00:07.250Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics","bigquery","pathfinding-algorithm"],"created_at":"2024-12-23T09:49:14.024Z","updated_at":"2026-04-28T08:02:57.261Z","avatar_url":"https://github.com/jroakes.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# NPath\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1BKrdrLrWdxUZFPnSxUJWZW4wfoulavUx?usp=sharing)\n\n\n## Description\nExploring path sequences in GA4 BigQuery data\n\n## Setup\n1. Create a new Google Cloud project\n2. Enable the BigQuery API\n3. Ensure that GA4 data is being sent to BigQuery\n4. Get the dataset ID and table ID for the GA4 data\n5. Create a service account with BigQuery read access. [Here](https://docs.aws.amazon.com/dms/latest/sbs/bigquery-redshift-migration-step-1.html) is a good guide.\n6. Download the service account key as a JSON file\n7. 
Create a new file in the root directory called `service_account.json` and paste the contents of the JSON file into it\n8. Run `pip install -r requirements.txt` to install the required Python packages\n9. Get an API key from OpenAI if you want to run `analyze_clusters`.\n10. Open `demo.ipynb` in Jupyter Notebook and run the cells\n\n## Components\n* `plot_important_features_prefixspan`: Plots the important conversion path sequences of the PrefixSpan model.\n* `convertor_review`: Sequence patterns of similarity and anomalies in non-convertors that are clustered with convertors.\n* `analyze_divergence`: Scores the similarity of non-convertor sequences to convertor sequences.\n* `analyze_clusters`: Clusters users based on their navigational paths and labels the clusters.\n\n\n## To Do\n- [ ] Add more documentation\n- [ ] Update sequence importance for sequences that pass through certain pages\n- [ ] Add more sequence mining algorithms\n- [x] Add attribution models\n- [x] Remove sequences after conversion\n- [x] Add sequence divergence\n- [ ] Analysis by section\n- [ ] Analysis through product/service page\n- [ ] Analysis through blog\n- [ ] Analysis through pricing page\n- [ ] Analysis by source/medium\n- [ ] Analysis to score pages based on their importance (presence in conversion and closeness to conversion)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjroakes%2Fnpath","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjroakes%2Fnpath","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjroakes%2Fnpath/lists"}