{"id":23502955,"url":"https://github.com/GatorEducator/GatorMiner","last_synced_at":"2025-08-28T01:31:46.015Z","repository":{"id":50760931,"uuid":"208187903","full_name":"GatorEducator/GatorMiner","owner":"GatorEducator","description":"A visualized text mining and analysis tool for student markdown reflection documents based on Natural language processing in the Dept of CS at Allegheny College.","archived":false,"fork":false,"pushed_at":"2023-11-03T21:57:16.000Z","size":13838,"stargazers_count":11,"open_issues_count":31,"forks_count":8,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-12-19T23:06:51.582Z","etag":null,"topics":["nlp","spacy","streamlit","textmining"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GatorEducator.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-09-13T03:20:46.000Z","updated_at":"2024-07-03T22:02:14.000Z","dependencies_parsed_at":"2024-09-28T13:00:49.993Z","dependency_job_id":"89eec001-21a3-4dd7-90aa-9771b1fb58f7","html_url":"https://github.com/GatorEducator/GatorMiner","commit_stats":{"total_commits":793,"total_committers":30,"mean_commits":"26.433333333333334","dds":"0.22698612862547285","last_synced_commit":"851cd329421eb88942e42655165f42fc8706c12a"},"previous_names":["allegheny-ethical-cs/gatorminer"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GatorEducator%2FGatorMiner","tags_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/GatorEducator%2FGatorMiner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GatorEducator%2FGatorMiner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GatorEducator%2FGatorMiner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GatorEducator","download_url":"https://codeload.github.com/GatorEducator/GatorMiner/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":231205328,"owners_count":18341698,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["nlp","spacy","streamlit","textmining"],"created_at":"2024-12-25T08:12:01.429Z","updated_at":"2024-12-25T08:12:02.815Z","avatar_url":"https://github.com/GatorEducator.png","language":"Python","readme":"# GatorMiner\n\n[![Build Status](https://travis-ci.com/Allegheny-Ethical-CS/GatorMiner.svg?branch=master)](https://travis-ci.com/Allegheny-Ethical-CS/GatorMiner)\n[![codecov](https://codecov.io/gh/Allegheny-Ethical-CS/GatorMiner/branch/master/graph/badge.svg)](https://codecov.io/gh/Allegheny-Ethical-CS/GatorMiner)\n[![Built with spaCy](https://img.shields.io/badge/built%20with-spaCy-09a3d5.svg)](https://spacy.io)\n[![Built with Streamlit](https://img.shields.io/badge/built%20with-Streamlit-09a3d5.svg)](https://www.streamlit.io/)\n\nAn automated text-mining tool written in Python to measure the technical\nresponsibility of students in computer science courses, being used to analyze\nstudents' markdown reflection documents and five 
questions survey based on\nNatural Language Processing in the Department of Computer Science at Allegheny\nCollege.\n\n## Installation\n\nYou can clone the repository by running the following command:\n\n```bash\ngit clone git@github.com:Allegheny-Ethical-CS/GatorMiner.git\n```\n\n`cd` into the project root folder:\n\n```bash\ncd GatorMiner\n```\n\nThis program uses [Pipenv](https://github.com/pypa/pipenv) for dependency management.\n\n- If needed, install and upgrade `pipenv` with `pip`:\n\n  ```bash\n  pip install pipenv -U\n  ```\n\n- To create a default virtual environment and use the program:\n\n  ```bash\n  pipenv install\n  ```\n\nGatorMiner relies on `en_core_web_sm` and/or `en_core_web_md`, English models\ntrained on written web text (blogs, news, comments) that include vocabulary,\nvectors, syntax and entities.\n\nTo install the pre-trained model, you can run (one of) the following commands:\n\n```bash\npipenv run python -m spacy download en_core_web_sm\npipenv run python -m spacy download en_core_web_md\n```\n\n## Web Interface\n\nGatorMiner is mainly developed on its web interface with [Streamlit](https://www.streamlit.io)\nin order to provide fast text analysis and visualizations.\n\nIn order to run the `Streamlit` interface, type and execute the following command\nin your terminal:\n\n```bash\npipenv run streamlit run streamlit_web.py\n```\n\nYou will then see something like this in your terminal window:\n\n```bash\nYou can now view your Streamlit app in your browser.\n\nLocal URL: http://localhost:8501\nNetwork URL: http://xxx.xxx.x.x:8501\n```\n\nThe web interface will be automatically opened in your browser:\n\n\u003cimg src=\"resources/images/landing_page.png\" alt=\"browser\" style=\"width:100%\"/\u003e\n\n### Data Retrieving\n\nThere are currently two ways to import text data for analysis: through the local\nfile system or AWS DynamoDB.\n\n#### Local File System\n\nYou can type in the path(s) to the directory(ies) that hold reflection 
markdown\ndocuments. You are welcome to try the tool with the sample documents we\nprovided in `resources`, for example:\n\n```shell\nresources/sample_md_reflections/lab1, resources/sample_md_reflections/lab2, resources/sample_md_reflections/lab3\n```\n\n#### AWS\n\nRetrieving reflection documents from AWS is a feature integrated with the use\nof [GatorGrader](https://github.com/GatorEducator/gatorgrader) where students'\nmarkdown reflection documents are collected and stored inside a\npre-configured DynamoDB database. In order to use this feature, you will need\nto have some credential tokens (listed below) stored as environment variables:\n\n```bash\nexport GATOR_ENDPOINT=\u003cYour Endpoint\u003e\nexport GATOR_API_KEY=\u003cYour API Key\u003e\nexport AWS_ACCESS_KEY_ID=\u003cYour Access Key ID\u003e\nexport AWS_SECRET_ACCESS_KEY=\u003cYour Secret Access Key\u003e\n```\n\nIt is likely that you already have these prepared when using GatorMiner in\nconjunction with GatorGrader, since these would already be exported when\nsetting up the AWS services. You can read more about setting up an AWS service\nwith GatorGrader [here](https://github.com/enpuyou/script-api-lambda-dynamodb).\n\nOnce the documents are successfully imported, you can then navigate through\nthe select box in the sidebar to view the text analysis:\n\n\u003cimg src=\"resources/images/select_box.png\" alt=\"select box\" style=\"width:100%\"/\u003e\n\n##### Reflection Documents\n\nWe are using markdown format for the student reflection documents.\nIts organized structure allows us to parse and perform text analysis easily.\nWith that said, there are a few requirements for the reflection document before it\ncan be seamlessly processed and analyzed with GatorMiner. A\n[template](resources/reflection_template.md) is provided within the repo. 
Note\nthat the headers with the assignment's and student's ID/name are required.\nGatorMiner is set by default to take the first header as the assignment name and the\nsecond header as the student name.\n\nYou can also check out the\n[sample JSON report](resources/sample_json_report/report%201.json) to see the\nformat of the JSON reports GatorMiner gathers from AWS.\n\n### Analysis\n\n\u003cimg src=\"resources/images/frequency.png\" alt=\"frequency\" style=\"width:100%\"/\u003e\n\u003cimg src=\"resources/images/sentiment.png\" alt=\"sentiment\" style=\"width:100%\"/\u003e\n\u003cimg src=\"resources/images/similarity.png\" alt=\"similarity\" style=\"width:100%\"/\u003e\n\u003cimg src=\"resources/images/topic.png\" alt=\"topic\" style=\"width:100%\"/\u003e\n\n### Contribution\n\nWe are excited that you would take the time to contribute to GatorMiner! We have\nprovided a [contributing guideline](CONTRIBUTING.md) that will help you\neffectively get started and make contributions to the project.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGatorEducator%2FGatorMiner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FGatorEducator%2FGatorMiner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGatorEducator%2FGatorMiner/lists"}