{"id":18853700,"url":"https://github.com/rwynn/monstache-showcase","last_synced_at":"2025-04-14T10:24:18.943Z","repository":{"id":49843371,"uuid":"175726927","full_name":"rwynn/monstache-showcase","owner":"rwynn","description":"monstache showcase to visualize open data","archived":false,"fork":false,"pushed_at":"2021-11-14T23:57:10.000Z","size":56140,"stargazers_count":31,"open_issues_count":3,"forks_count":18,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-27T23:41:46.742Z","etag":null,"topics":["elasticsearch","kibana","mongodb","monstache","open-data","visualization"],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rwynn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-03-15T01:30:15.000Z","updated_at":"2024-06-07T15:46:29.000Z","dependencies_parsed_at":"2022-09-19T01:51:55.476Z","dependency_job_id":null,"html_url":"https://github.com/rwynn/monstache-showcase","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rwynn%2Fmonstache-showcase","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rwynn%2Fmonstache-showcase/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rwynn%2Fmonstache-showcase/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rwynn%2Fmonstache-showcase/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rwynn","download_url":"https://codeload.github.com/rwynn/monstache-showcase/tar.gz/re
fs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248860780,"owners_count":21173506,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["elasticsearch","kibana","mongodb","monstache","open-data","visualization"],"created_at":"2024-11-08T03:45:19.657Z","updated_at":"2025-04-14T10:24:18.897Z","avatar_url":"https://github.com/rwynn.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Monstache showcase\n\nThis project shows how monstache can be applied to real data from data.gov.  The `mongoimport` tool will be used\nto import 6.5 million records of crime data.\n\nDuring the import monstache will be listening for change events on the entire MongoDB deployment and indexing \nthose documents into Elasticsearch.  Before importing monstache will do a little bit of transformation on the \ndata using a golang plugin to enable certain aggregations in Kibana. \n\nThe golang plugin was used over a Javascript plugin after noticing a dramatic performance increase.\n\nI recommend that your machine has at least 16GB RAM, 20GB free disk, and 4 or more CPU cores. You may be able to \nget away with less by decreasing the heap sizes for Elasticsearch in the docker-compose files.\n\nFirst you will need to make sure you have `docker` and `docker-compose` installed.  
On desktop systems like \nDocker Desktop for Mac and Windows, Docker Compose is included as part of those desktop installs.\n\nThe versions at the time this project was created were:\n\n```\nClient:\n Version:           18.09.3\n API version:       1.39\n Go version:        go1.10.8\n Git commit:        774a1f4\n Built:             Thu Feb 28 06:40:58 2019\n OS/Arch:           linux/amd64\n Experimental:      false\n\nServer: Docker Engine - Community\n Engine:\n  Version:          18.09.3\n  API version:      1.39 (minimum version 1.12)\n  Go version:       go1.10.8\n  Git commit:       774a1f4\n  Built:            Thu Feb 28 05:59:55 2019\n  OS/Arch:          linux/amd64\n  Experimental:     false\n\ndocker-compose version 1.23.1, build b02f1306\ndocker-py version: 3.5.0\nCPython version: 3.6.7\nOpenSSL version: OpenSSL 1.1.0f  25 May 2017\n```\n\nNext you will want to download the public [dataset](https://catalog.data.gov/dataset/2003-ward-dataset-csvs-crimes-2001-to-present). You will\nwant the .CSV format.  Please read all the rules and caveats associated with the public dataset before proceeding.\n\nWhen you have downloaded this large 1.5GB file you should copy it to the following location:\n\n```\nmonstache-showcase/mongodb/scripts/data/crimes.csv\n```\n\nUse the following command to note the number of documents to expect later during the import.\n\n```\n# subtract 1 for the csv header\nwc -l monstache-showcase/mongodb/scripts/data/crimes.csv\n```\n\nYou are now ready to run docker-compose and start the import. \n\n```\ncd monstache-showcase\n./import-showcase.sh\n```\n\nThe import will take a while.  During the process you will see a line like this coming from `mongoimport`:\n\n```\nc-data       | 2019-03-12T20:34:57.586+0000     imported 6820156 documents\n```\n\nThat means that all the data has been loaded into MongoDB.  Now you must wait for the indexing to complete in \nElasticsearch.  The process will periodically query the document count in Elasticsearch.  
 \n\nYou will see lines like this repeating forever:\n\n```\nc-config     | [\nc-config     |   {\nc-config     |     \"health\" : \"green\",\nc-config     |     \"status\" : \"open\",\nc-config     |     \"index\" : \"chicago.crimes\",\nc-config     |     \"uuid\" : \"4wShbV-LTq6-6paRsWataQ\",\nc-config     |     \"pri\" : \"1\",\nc-config     |     \"rep\" : \"0\",\nc-config     |     \"docs.count\" : \"1198982\",\nc-config     |     \"docs.deleted\" : \"0\",\nc-config     |     \"store.size\" : \"359mb\",\nc-config     |     \"pri.store.size\" : \"359mb\"\nc-config     |   }\nc-config     | ]\n\n```\n\nThe `docs.count` field in the response should eventually reach 1 less than the number you recorded from `wc -l`.\n\nOnce all the data is loaded into Elasticsearch you can bring down the containers with Ctrl-C or:\n\n```\ncd monstache-showcase\n./stop-showcase.sh\n```\n\nAt this point you have indexed all the data and should no longer run `import-showcase.sh`, as that will index all the data\nagain. The import process stores the Elasticsearch data in a docker volume so it will persist between runs until you \ndelete the volume.\n\nThe last step is to fire up Kibana to analyze it. To do this start only Elasticsearch and Kibana with:\n\n```\ncd monstache-showcase\n./view-showcase.sh\n```\n\nOnce the containers are up and healthy you can go to http://localhost:5601 on the host to load Kibana and explore data.  \n\nIn Kibana you can start from scratch and define an index-pattern. 
However, I recommend that you import the \nfile named `export.json` from the root of monstache-showcase to get a head start.\n\nTo import you will want to go to `Management` -\u003e `Saved Objects` and then click `Import` and upload `export.json`.\n\nYou will also want to go under `Management` -\u003e `Advanced Settings` in Kibana and set `Timezone for date formatting`\nto `UTC` to display dates correctly.\n\nWhen you are finished analyzing in Kibana you can run `./stop-showcase.sh` to bring down the containers.\n\nIf you want to tear down everything and delete all the associated data you can run `./clean-showcase.sh`.  \nThis stops the containers and deletes the associated docker volumes.  \n\nPlease open an issue with any feedback you might have.  Thanks!\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frwynn%2Fmonstache-showcase","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frwynn%2Fmonstache-showcase","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frwynn%2Fmonstache-showcase/lists"}