{"id":19563292,"url":"https://github.com/vdutts7/reddit-map","last_synced_at":"2026-06-18T14:33:01.802Z","repository":{"id":260576563,"uuid":"881721931","full_name":"vdutts7/reddit-map","owner":"vdutts7","description":"Visualizing maps of Reddit comments based on semantic similarity","archived":false,"fork":false,"pushed_at":"2024-11-01T18:48:27.000Z","size":18269,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-11-20T14:33:25.488Z","etag":null,"topics":["data-visualization","embeddings","nomic","reddit","semantic-similarity"],"latest_commit_sha":null,"homepage":"https://atlas.nomic.ai/data/auth0thread765/reddit-dataset","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vdutts7.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-11-01T05:01:53.000Z","updated_at":"2025-04-27T03:59:17.000Z","dependencies_parsed_at":null,"dependency_job_id":"56f10c3a-a938-42ac-813d-50ed842431b7","html_url":"https://github.com/vdutts7/reddit-map","commit_stats":null,"previous_names":["vdutts7/reddit-map"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/vdutts7/reddit-map","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vdutts7%2Freddit-map","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vdutts7%2Freddit-map/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/vdutts7%2Freddit-map/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vdutts7%2Freddit-map/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vdutts7","download_url":"https://codeload.github.com/vdutts7/reddit-map/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vdutts7%2Freddit-map/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34495378,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-18T02:00:06.871Z","response_time":128,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-visualization","embeddings","nomic","reddit","semantic-similarity"],"created_at":"2024-11-11T05:17:12.711Z","updated_at":"2026-06-18T14:33:01.793Z","avatar_url":"https://github.com/vdutts7.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n\n  \u003cimg src=\"public/reddit.png\" alt=\"Reddit Logo\" width=\"80\" height=\"80\" /\u003e\n  \u003cimg src=\"public/obama.png\" alt=\"Obama\" width=\"\" height=\"67\"/\u003e\n\n  \u003ch1 align=\"center\"\u003e\n        Visual Map | Reddit comments\n    \u003c/h1\u003e\n    \u003cp align=\"center\"\u003e \n        \u003ci\u003e\u003cb\u003eVisualizing maps of Reddit comments based on semantic 
similarity\u003c/b\u003e\u003c/i\u003e\n        \u003cbr /\u003e \n    \u003c/p\u003e\n\n[![Github][github]][github-url]\n\n\n  \u003cimg src=\"public/map.png\" alt=\"Reddit Logo\" width=\"200\" height=\"\" /\u003e\n\n\u003c/div\u003e\n\n\n\n## Table of Contents\n\n  \u003col\u003e\n    \u003ca href=\"#FREE-200-USD-cloud-credits\"\u003e💸 FREE 200 USD cloud credits\u003c/a\u003e\u003cbr/\u003e\n    \u003ca href=\"#about\"\u003e📝 About\u003c/a\u003e\u003cbr/\u003e\n    \u003ca href=\"#how-to-build\"\u003e💻 How to build\u003c/a\u003e\u003cbr/\u003e\n    \u003ca href=\"#tools-used\"\u003e🔧 Tools used\u003c/a\u003e\n        \u003cul\u003e\n        \u003c/ul\u003e\n    \u003ca href=\"#contact\"\u003e👤 Contact\u003c/a\u003e\n  \u003c/ol\u003e\n\n\u003cbr/\u003e\n\n\n## 💸FREE 200 USD cloud credits\n\nClick the banner to activate $200 free personal cloud credits on DigitalOcean (deploy anything).\n\n\u003cdiv style=\"display: flex; align-items: center; justify-content: center; width: 400px;\"\u003e \n    \u003ca href=\"https://www.digitalocean.com/?refcode=2aa0ec7cfd0e\u0026utm_campaign=Referral_Invite\u0026utm_medium=Referral_Program\u0026utm_source=badge\"\u003e\n        \u003cimg src=\"https://res.cloudinary.com/dnz16usmk/image/upload/v1709301461/digitalocean-referral.png\"\n            width=\"150\"\n        /\u003e\n    \u003c/a\u003e\n\u003c/div\u003e\n\n\n\n## 📝About\n- How to automate the extraction, processing, and mapping of Reddit comments using Python and Nomic Atlas\n- Use the Reddit API to fetch comments from a Reddit post URL\n- Store the data in Nomic Atlas\n- Create an Atlas map on the dataset to produce a visualization\n\n\n## 💻How to build\n\n\n### 1. Setup:\n```\npython3 -m venv venv\nsource venv/bin/activate\npip install -r requirements.txt\n```\n### 2. 
Environment variables\n\nGet your Reddit developer credentials: https://www.reddit.com/prefs/apps \n```\nREDDIT_CLIENT_ID=\u003cyour_client_id\u003e\nREDDIT_CLIENT_SECRET=\u003cyour_client_secret\u003e\nREDDIT_USER_AGENT=\u003cyour_user_agent\u003e\n```\n\n### 3. Data Collection\n\nRun the script to collect Reddit comments:\n```\npython reddit.py\n```\n\n\nThe script will:\n- Prompt for a Reddit post URL\n- Extract all comments recursively\n- Show real-time progress\n- Save data to a CSV file\n\n### 4. Data Processing\nThe script automatically:\n- Extracts comment metadata (author, score, timestamp)\n- Handles nested comment structures\n- Implements rate limiting and error handling\n- Saves processed data in a structured format\n\n### 5. Visualization\nAfter data collection, the comments are visualized using Nomic Atlas:\n- Creates semantic embeddings of comments\n- Generates interactive 2D/3D visualizations\n- Clusters similar comments together\n- Allows exploration of comment relationships\n\n### Example Output\n```\n(venv) (base) vdutts7@Vacbook-Vro reddit-map % python reddit.py             \nEnter Reddit post URL: https://www.reddit.com/r/pics/comments/5bx4bx/thanks_obama/\nLoading comments...\nFound 5936 comments to process\nProgress: 1.7% (100/5936 comments processed)\nProgress: 3.4% (200/5936 comments processed)\nProgress: 5.1% (300/5936 comments processed)\nProgress: 6.7% (400/5936 comments processed)\nProgress: 8.4% (500/5936 comments processed)\nProgress: 10.1% (600/5936 comments processed)\nProgress: 11.8% (700/5936 comments processed)\nProgress: 13.5% (800/5936 comments processed)\nProgress: 15.2% (900/5936 comments processed)\nProgress: 16.8% (1000/5936 comments processed)\nProgress: 18.5% (1100/5936 comments processed)\nProgress: 20.2% (1200/5936 comments processed)\nProgress: 21.9% (1300/5936 comments processed)\nProgress: 23.6% (1400/5936 comments processed)\nProgress: 25.3% (1500/5936 comments processed)\nProgress: 27.0% (1600/5936 comments 
processed)\nProgress: 28.6% (1700/5936 comments processed)\nProgress: 30.3% (1800/5936 comments processed)\nProgress: 32.0% (1900/5936 comments processed)\nProgress: 33.7% (2000/5936 comments processed)\nProgress: 35.4% (2100/5936 comments processed)\nProgress: 37.1% (2200/5936 comments processed)\nProgress: 38.7% (2300/5936 comments processed)\nProgress: 40.4% (2400/5936 comments processed)\nProgress: 42.1% (2500/5936 comments processed)\nProgress: 43.8% (2600/5936 comments processed)\nProgress: 45.5% (2700/5936 comments processed)\nProgress: 47.2% (2800/5936 comments processed)\nProgress: 48.9% (2900/5936 comments processed)\nProgress: 50.5% (3000/5936 comments processed)\nProgress: 52.2% (3100/5936 comments processed)\nProgress: 53.9% (3200/5936 comments processed)\nProgress: 55.6% (3300/5936 comments processed)\nProgress: 57.3% (3400/5936 comments processed)\nProgress: 59.0% (3500/5936 comments processed)\nProgress: 60.6% (3600/5936 comments processed)\nProgress: 62.3% (3700/5936 comments processed)\nProgress: 64.0% (3800/5936 comments processed)\nProgress: 65.7% (3900/5936 comments processed)\nProgress: 67.4% (4000/5936 comments processed)\nProgress: 69.1% (4100/5936 comments processed)\nProgress: 70.8% (4200/5936 comments processed)\nProgress: 72.4% (4300/5936 comments processed)\nProgress: 74.1% (4400/5936 comments processed)\nProgress: 75.8% (4500/5936 comments processed)\nProgress: 77.5% (4600/5936 comments processed)\nProgress: 79.2% (4700/5936 comments processed)\nProgress: 80.9% (4800/5936 comments processed)\nProgress: 82.5% (4900/5936 comments processed)\nProgress: 84.2% (5000/5936 comments processed)\nProgress: 85.9% (5100/5936 comments processed)\nProgress: 87.6% (5200/5936 comments processed)\nProgress: 89.3% (5300/5936 comments processed)\nProgress: 91.0% (5400/5936 comments processed)\nProgress: 92.7% (5500/5936 comments processed)\nProgress: 94.3% (5600/5936 comments processed)\nProgress: 96.0% (5700/5936 comments processed)\nProgress: 97.7% 
(5800/5936 comments processed)\nProgress: 99.4% (5900/5936 comments processed)\nCompleted! Total comments fetched: 5936\nComments saved to reddit_comments_1730443051.csv\n```\nDemo: https://atlas.nomic.ai/data/auth0thread765/reddit-dataset\n\n\u003cvideo width=\"100%\" controls\u003e\n  \u003csource src=\"public/reddit-map.mp4\" type=\"video/mp4\"\u003e\n\u003c/video\u003e\n\n\n## 🔧Tools Used\n\n[![Python][python]][python-url]\n[![PRAW][praw]][praw-url]\n[![Pandas][pandas]][pandas-url]\n[![Nomic][nomic]][nomic-url]\n\n## 👤Contact\n\n\u003c!-- Replace placeholders with your actual contact information --\u003e\n[![Email][email]][email-url]\n[![Twitter][twitter]][twitter-url]\n\n\u003c!-- MARKDOWN LINKS \u0026 IMAGES --\u003e\n\u003c!-- https://www.markdownguide.org/basic-syntax/#reference-style-links --\u003e\n\n[python]: https://img.shields.io/badge/Python-3776AB?style=for-the-badge\u0026logo=python\u0026logoColor=white\n[python-url]: https://www.python.org/\n[praw]: https://img.shields.io/badge/PRAW-ff4301?style=for-the-badge\u0026logo=reddit\u0026logoColor=white\n[praw-url]: https://praw.readthedocs.io/\n[pandas]: https://img.shields.io/badge/Pandas-150458?style=for-the-badge\u0026logo=pandas\u0026logoColor=white\n[pandas-url]: https://pandas.pydata.org/\n[nomic]: 
https://img.shields.io/badge/Nomic_Atlas-000000?style=for-the-badge\u0026logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAA4AAAAOCAYAAAAfSC3RAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAAAAZdEVYdFNvZnR3YXJlAHBhaW50Lm5ldCA0LjAuMjHxIGmVAAAAEGNhZ3BhY2tldCBiZWdpbj0i77u/IiBpZD0iVzVNME1wQ2VoaUh6cmVTek5UY3prYzlkIj8+IDx4OnhtcG1ldGEgeG1sbnM6eD0iYWRvYmU6bnM6bWV0YS8iIHg6eG1wdGs9IkFkb2JlIFhNUCBDb3JlIDUuMy1jMDExIDY2LjE0NTY2MSwgMjAxMi8wMi8wNi0xNDo1NjoyNyAgICAgICAgIj4gPHJkZjpSREYgeG1sbnM6cmRmPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5LzAyLzIyLXJkZi1zeW50YXgtbnMjIj4gPHJkZjpEZXNjcmlwdGlvbiByZGY6YWJvdXQ9IiIgeG1sbnM6eG1wPSJodHRwOi8vbnMuYWRvYmUuY29tL3hhcC8xLjAvIiB4bWxuczp4bXBNTT0iaHR0cDovL25zLmFkb2JlLmNvbS94YXAvMS4wL21tLyIgeG1sbnM6c3RSZWY9Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC9zVHlwZS9SZXNvdXJjZVJlZiMiIHhtcDpDcmVhdG9yVG9vbD0iQWRvYmUgUGhvdG9zaG9wIENTNiAoV2luZG93cykiIHhtcE1NOkluc3RhbmNlSUQ9InhtcC5paWQ6RjY3NjM0QTY5NzNFMTFFNUI2QUQ4NTY1OTg1QTRFMUQiIHhtcE1NOkRvY3VtZW50SUQ9InhtcC5kaWQ6RjY3NjM0QTc5NzNFMTFFNUI2QUQ4NTY1OTg1QTRFMUQiPiA8eG1wTU06RGVyaXZlZEZyb20gc3RSZWY6aW5zdGFuY2VJRD0ieG1wLmlpZDpGNjc2MzRBNDk3M0UxMUU1QjZBRDg1NjU5ODVBNEUxRCIgc3RSZWY6ZG9jdW1lbnRJRD0ieG1wLmRpZDpGNjc2MzRBNTk3M0UxMUU1QjZBRDg1NjU5ODVBNEUxRCIvPiA8L3JkZjpEZXNjcmlwdGlvbj4gPC9yZGY6UkRGPiA8L3g6eG1wbWV0YT4gPD94cGFja2V0IGVuZD0iciI/Pg==\n[nomic-url]: https://atlas.nomic.ai/\n[email]: https://img.shields.io/badge/me@vd7.io-FFCA28?style=for-the-badge\u0026logo=Gmail\u0026logoColor=00bbff\u0026color=black\n[email-url]: #\n[github]: https://img.shields.io/badge/💻Github-000000?style=for-the-badge\n[github-url]: https://github.com/vdutts7/blockchain-js\n[twitter]: https://img.shields.io/badge/Twitter-FFCA28?style=for-the-badge\u0026logo=Twitter\u0026logoColor=00bbff\u0026color=black\n[twitter-url]: 
https://twitter.com/vdutts7/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvdutts7%2Freddit-map","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvdutts7%2Freddit-map","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvdutts7%2Freddit-map/lists"}