{"id":27334885,"url":"https://github.com/dmuth/splunk-glassdoor","last_synced_at":"2025-08-23T16:08:52.410Z","repository":{"id":138471253,"uuid":"193252967","full_name":"dmuth/splunk-glassdoor","owner":"dmuth","description":"Splunk app to graph Glassdoor reviews of companies","archived":false,"fork":false,"pushed_at":"2023-05-22T22:16:38.000Z","size":1333,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-12T14:51:26.905Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dmuth.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-06-22T16:02:57.000Z","updated_at":"2022-01-20T04:36:42.000Z","dependencies_parsed_at":null,"dependency_job_id":"47f70867-3573-45e0-ac6b-bc13d6e9f3b1","html_url":"https://github.com/dmuth/splunk-glassdoor","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dmuth/splunk-glassdoor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmuth%2Fsplunk-glassdoor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmuth%2Fsplunk-glassdoor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmuth%2Fsplunk-glassdoor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmuth%2Fsplunk-glassdoor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/owners/dmuth","download_url":"https://codeload.github.com/dmuth/splunk-glassdoor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmuth%2Fsplunk-glassdoor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271755413,"owners_count":24815399,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-23T02:00:09.327Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-12T14:46:33.530Z","updated_at":"2025-08-23T16:08:52.399Z","avatar_url":"https://github.com/dmuth.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Splunking Glassdoor Reviews\n\nThis project is based largely on the work I did for \nmy \u003ca href=\"https://github.com/dmuth/splunk-yelp-reviews\"\u003eSplunk Yelp project\u003c/a\u003e\nnot too long ago.  The impetus for it came about when a local tech company pinged \nme about the possibility of working for them, and I wanted to see what other people\nthought of their company.  
Thus, Splunk Glassdoor was born!\n\nThis app will tell you the following:\n\n- Average ratings/number of ratings over time\n- Recent pros, cons, and advice to management\n- Tag cloud of words from pros, cons, and advice to management\n\nIn real life, I've used this app to check out potential employers.\n\nThis app uses \u003ca href=\"https://github.com/dmuth/splunk-lab\"\u003eSplunk Lab\u003c/a\u003e, an open-source \napp I built to effortlessly run Splunk in a Docker container.\n\n\n## Screenshots\n\n\u003ca href=\"img/facebook-glassdoor.png\"\u003e\u003cimg src=\"img/facebook-glassdoor.png\" width=\"250\" alt=\"Facebook Glassdoor Reviews\" /\u003e\u003c/a\u003e\n\u003ca href=\"img/netflix-glassdoor.png\"\u003e\u003cimg src=\"img/netflix-glassdoor.png\" width=\"250\" alt=\"Netflix Glassdoor Reviews\" /\u003e\u003c/a\u003e\n\u003ca href=\"img/qvc-glassdoor.png\"\u003e\u003cimg src=\"img/qvc-glassdoor.png\" width=\"250\" alt=\"QVC Glassdoor Reviews\" /\u003e\u003c/a\u003e\n\n\n## Requirements\n\n- Docker\n\n\n## Running The App\n\n- `SPLUNK_START_ARGS=--accept-license bash \u003c(curl -s https://raw.githubusercontent.com/dmuth/splunk-glassdoor/master/go.sh ) ./urls.txt`\n   - The file `urls.txt` should contain one URL per line, and each URL should be a business's review page from Glassdoor.\n   - Since some businesses can have thousands of reviews, this script will pick up where it left off if interrupted.\n   - This grabs the HTML from review pages and uses \u003ca href=\"https://www.crummy.com/software/BeautifulSoup/bs4/doc/\"\u003eBeautiful Soup\u003c/a\u003e to parse the reviews, then exports them to the `logs/` directory.  I looked into using Glassdoor's API, but when I went to the signup page, it was a broken page that was mostly blank.  So I tried 🤷.\n   - The script is single-threaded, but reasonably efficient. 
(And I don't want to DoS Glassdoor's website.)  I've clocked downloads at 5,000 reviews in a little over 8 minutes, or about 600 reviews a minute.\n- Go to \u003ca href=\"https://localhost:8000/\"\u003ehttps://localhost:8000/\u003c/a\u003e, log in with the password you set, and you'll see the Glassdoor Reviews Dashboard.\n\n\n## Troubleshooting\n\n- Q: Dashboards show ` Search is waiting for input...`\n- A: You need to select a venue in the dropdown!  If no items are in the dropdown, that means no data was ingested.  Did you run the command to download some Glassdoor reviews?\n\n\n## Development\n\nMostly for my benefit, these are the scripts that I use to make my life easier:\n\n- `./bin/build.sh` - Build the Python and Splunk Docker containers\n- `./bin/push.sh` - Upload the Docker containers to Docker Hub\n- `./bin/devel.sh` - Build and run the Splunk Docker container with an interactive shell\n- `./bin/run-download-reviews.sh` - Run the script to download reviews directly\n- `./bin/stop.sh` - Stop the Splunk container\n- `./bin/clean.sh` - Stop Splunk, and remove the data and logs\n\n\n## Credits\n\nI'd like to thank \u003ca href=\"http://splunk.com/\"\u003eSplunk\u003c/a\u003e, for having such a kick-ass data\nanalytics platform, and the operational excellence which it embodies.\n\nAlso:\n- \u003ca href=\"https://www.ascii-art-generator.org/\"\u003eThis text to ASCII art generator\u003c/a\u003e, for the logo I used in the script.\n\n\n## Bugs\n\n- Excessive CPU Usage\n   - In Docker on OS/X, if you have thousands and thousands of files, Splunk persistently uses like 70% of the CPU.  Not good.  
I think it's more a Docker thing than a Splunk thing, but I could write a workaround as follows:\n      - Download reviews to a SQLite database with SQLAlchemy\n      - When downloads are done, dump all reviews for that business to a single JSON file in the `logs/` directory\n   - Workaround: Run `index=main earliest=-10y | stats count` and when the number of events stops going up, stop Splunk, remove the contents of the `logs/` directory, and restart Splunk.\n- Sometimes you'll see a yellow exclamation point with the text \"Field 'words' does not exist in the data\" on the Advice Tag Cloud.  The underlying search appears to be executing normally, so I'm still trying to sort this one out.\n\n\n## Copyright\n\nSplunk is copyright by Splunk.  Apps within Splunk Lab are copyright their creators,\nand made available under the respective license.  \n\n\n## Contact\n\n- \u003ca href=\"mailto:doug.muth@gmail.com\"\u003eEmail me\u003c/a\u003e\n- \u003ca href=\"https://twitter.com/dmuth\"\u003eTwitter\u003c/a\u003e\n- \u003ca href=\"https://facebook.com/dmuth\"\u003eFacebook\u003c/a\u003e\n\n\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdmuth%2Fsplunk-glassdoor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdmuth%2Fsplunk-glassdoor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdmuth%2Fsplunk-glassdoor/lists"}