{"id":15936806,"url":"https://github.com/tyleryep/yelp","last_synced_at":"2025-04-03T19:30:04.364Z","repository":{"id":83210630,"uuid":"155815525","full_name":"TylerYep/yelp","owner":"TylerYep","description":"Community Detection on the Yelp Dataset","archived":false,"fork":false,"pushed_at":"2021-01-20T08:10:36.000Z","size":14523,"stargazers_count":3,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-09T07:41:35.402Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TylerYep.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-11-02T04:50:47.000Z","updated_at":"2020-12-07T06:12:48.000Z","dependencies_parsed_at":"2023-05-30T20:30:52.931Z","dependency_job_id":null,"html_url":"https://github.com/TylerYep/yelp","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TylerYep%2Fyelp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TylerYep%2Fyelp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TylerYep%2Fyelp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TylerYep%2Fyelp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TylerYep","download_url":"https://codeload.github.com/TylerYep/yelp/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.co
m","kind":"github","repositories_count":247064978,"owners_count":20877683,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-07T04:40:51.159Z","updated_at":"2025-04-03T19:30:04.330Z","avatar_url":"https://github.com/TylerYep.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Yelp\n## Predicting Restaurant Success using Attribute-Specific Spatial Clusters\n### Community Detection on the Yelp Dataset\nBy Heidi Chen (heidichen7), Edward Lee (ed-w-lee), Tyler Yep (tyleryep)  \n\n# Preprocess/Clean Dataset\n```\npython src/preprocesssing/data-clean.py\n```\nExtracts the relevant fields from the Yelp data files.\n\n# Graph Construction / Spatial Clustering\n```\npython src/graph-construction/knn.py or /louvain.py\n```\nCreates NetworkX or SNAP graphs.\n\n# Graph Visualization\n```\npython src/visualization/graph-viz.py\n```\n\nGiven a Yelp CSV of latitude/longitude coordinates, plots the points on a map.\nOutlier points will distort the plot, so make sure all data passed in lies within a couple of degrees of latitude/longitude of each other.\n\n# Community Detection\n```\npython src/clustering/comm.py\n```\nRuns several variations of community detection, with varying degrees of success.\n\n# Supervised Learning\n```\npython src/ml/runner.py\n```\nRuns every scikit-learn model to look for the best prediction.\n\n# Administrivia\n\nTo generate the distance matrix:\n```\npython create_distance_matrix.py\n```\n\nTo generate fake graphs:\n---\nDownload these as png:\n\n`complex-with`:\ndensity: 
https://docs.google.com/drawings/d/18clYRmUvQUYK4-6_25sIb3hop1rvRhZS60e_rc0yoNE/edit?usp=sharing\ncategory: https://docs.google.com/drawings/d/1HJFULW14yUPwTRJEUaYNT_TrOrV4JNJ8C7BnC75LOoM/edit?usp=sharing\n\n`simple`:\ndensity: https://docs.google.com/drawings/d/1cgTknj5zB51gxLiLbsQbujIPEOQG-tveB36-kDSmrvA/edit?usp=sharing\ncategory: https://docs.google.com/drawings/d/1upoeZrkXGolW8HyuKzxikNGQT-2xSH2Z5NQ8yNVYZCM/edit?usp=sharing\n\n`ripme`:\ndensity: https://docs.google.com/drawings/d/1z78tWqMCXoW6ApREiQvWca92PcX9Fdfwks94NrgxEX8/edit?usp=sharing\ncategory: https://docs.google.com/drawings/d/1inpZScOC-Qdlez4hvsy4X25c8p1QF2QP_aVR8EU4G9g/edit?usp=sharing\n\n`ripme-more` and `ripme-more2`:\ndensity: https://docs.google.com/drawings/d/1CVAvbzfV5gTjYO5szOMbn9kQ5lVDvacXoPtbGjSbpvw/edit?usp=sharing\ncategory (ripme-more2): https://docs.google.com/drawings/d/10mLzaC4qDZW_Vy4LrZ4qIR7_w4sN8fKz56SzP4v9XcY/edit?usp=sharing\ncategory: swap the colors of ripme-more2\n\n`sectioned`:\ndensity: https://docs.google.com/drawings/d/10mLzaC4qDZW_Vy4LrZ4qIR7_w4sN8fKz56SzP4v9XcY/edit?usp=sharing\ncategory: https://docs.google.com/drawings/d/1kFr55sA33yvxNUw341eSKf5pHttXQda7tyUgOyKWmbE/edit?usp=sharing\n\nMove to directory called `data/fake-graphs/$GRAPH_DIRNAME_HERE` as `category.png` and `density.png`\n\nTo generate some relevant graphs:\n```\nsh src/experimental/lazy.sh $GRAPH_DIRNAME_HERE\n```\n\nTo evaluate:\n```\nmkdir src/clustering/graphs\nmkdir src/clustering/figures\nmkdir -p data/results\npython src/clustering/evaluate.py --dir data/fake-graphs/$GRAPH_DIRNAME_HERE\n```\nThe resulting graphs will be in `src/clustering/graphs` and the F1 scores will be in `data/results`\n\nTo visualize:\n`python src/visualization/fake-graph-viz.py -n data/fake-graphs/$GRAPH_DIRNAME_HERE/points.csv -e 
data/fake-graphs/$GRAPH_DIRNAME_HERE/categories/edges-knn_10_*.csv`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftyleryep%2Fyelp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftyleryep%2Fyelp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftyleryep%2Fyelp/lists"}