{"id":19262471,"url":"https://github.com/cdeck3r/gradepredictiondata","last_synced_at":"2025-07-22T20:33:00.754Z","repository":{"id":171614463,"uuid":"648164432","full_name":"cdeck3r/GradePredictionData","owner":"cdeck3r","description":"Data registry for grade prediction (part of the AML lecture)","archived":false,"fork":false,"pushed_at":"2023-07-04T12:03:30.000Z","size":27,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-23T18:46:59.829Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cdeck3r.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-01T10:55:15.000Z","updated_at":"2023-06-01T14:48:51.000Z","dependencies_parsed_at":null,"dependency_job_id":"7e52f0a2-7829-4074-a4fb-c3afc05779d1","html_url":"https://github.com/cdeck3r/GradePredictionData","commit_stats":null,"previous_names":["cdeck3r/gradepredictiondata"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/cdeck3r/GradePredictionData","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cdeck3r%2FGradePredictionData","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cdeck3r%2FGradePredictionData/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cdeck3r%2FGradePredictionData/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cdeck3r%2FGradePredictionData/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cdeck3r","download_url":"https://codeload.github.com/cdeck3r/GradePredictionData/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cdeck3r%2FGradePredictionData/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266567539,"owners_count":23949375,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-22T02:00:09.085Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-09T19:31:52.222Z","updated_at":"2025-07-22T20:33:00.731Z","avatar_url":"https://github.com/cdeck3r.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Grade Prediction Data Registry (GPDR)\n\nThis project contains the data for a project within the Applied Machine Learning (AML) lecture.\n\nThe use case is grade prediction from log files sourced from the learning management system. The lecturer naively observes that students who actively participate in class achieve better grades. Participation may comprise different facets of activity. We focus on the activity log from the learning management system. It records which resources (e.g., files, videos, assignments, students) have been accessed during the semester.\n\nTwo students are working on an ML system for grade prediction. 
Their focus is on the MLOps aspect and less on the machine learning model.\n\n## Data Registry\n\nThe ML system for grade prediction works on anonymized data only. This repository contains the code to anonymize the log files. The output is organized in a [DVC data registry](https://dvc.org/doc/use-cases/data-registry/tutorial).\n\n**Important note:** Neither the raw data nor the processed data is stored in the repository.\n\nThe data registry stores references to the remote storage for anonymized data. Students may access data using the following DVC commands. Note that access requires additional credentials.\n\nTo use this registry, we assume you have a project under `git` and `dvc` version control. List the content of this registry using the command\n```\ndvc list -R https://github.com/cdeck3r/GradePredictionData\n```\n\nIt displays the repository content.\n\n```\n.dvcignore\n.gitignore\nDockerfiles/Dockerfile.jupyops\nLICENSE\nREADME.md\ndata-registry/.gitignore\ndata-registry/RTWIBNet_W22.dvc\ndata-registry/RTWIBNet_W22/RTWIBNet_W22_grades.csv\ndata-registry/RTWIBNet_W22/RTWIBNet_W22_log.csv\ndata-registry/RTWIBSE_W22.dvc\ndata-registry/RTWIBSE_W22/RTWIBSE_W22_grades.csv\ndata-registry/RTWIBSE_W22/RTWIBSE_W22_log.csv\ndata-registry/RTWIBStat_W22.dvc\ndata-registry/RTWIBStat_W22/RTWIBStat_W22_grades.csv\ndata-registry/RTWIBStat_W22/RTWIBStat_W22_log.csv\ndata/.gitkeep\ndocker-compose.yml\n...\n```\n\nDownload the data specified in the registry as `data-registry/RTWIBNet_W22` into your current working directory from the `dvc remote default`.\n```\ndvc get https://github.com/cdeck3r/GradePredictionData data-registry/RTWIBNet_W22\n```\n\n\nThe next command downloads the data from the registry into the workspace and tracks it.\n```\ndvc import https://github.com/cdeck3r/GradePredictionData data-registry/RTWIBNet_W22\n```\nOne may issue `dvc update` to retrieve updated versions.\n\n\n## Software System Setup\n\nA `docker-compose` file provides a setup for a multi-container 
application.\nThe diagram below depicts the containers, their dependencies and ports (numbers in circles). The diagram is based on this [medium.com article](https://medium.com/@krishnakummar/creating-block-diagrams-from-your-docker-compose-yml-da9d5a2450b4).\n\n![docker-compose dependencies](docs/docker-compose.png)\n\n\n**Setup:** Start in the project's root directory and create a `.env` file with the content shown below.\n```\n# .env file\n\n# In the container, this is the directory where the code is found\n# Example:\nAPP_ROOT=/GradePredictionData\n\n# the HOST directory containing directories to be mounted into containers\n# Example:\nVOL_DIR=/dev/GradePredictionData\n```\n\n**Create** the docker image. Please see the [Dockerfiles directory](Dockerfiles/) for details.\n\n```bash\ndocker-compose build jupyops\n```\n\n**Spin up** the containers and get a shell from a container\n```bash\ndocker-compose up -d jupyops\n```\n\nFinally, point your browser to http://localhost:8888 to load the JupyterLab editor.\n\n## License\n\nLicense information is provided in the [LICENSE](LICENSE) file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcdeck3r%2Fgradepredictiondata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcdeck3r%2Fgradepredictiondata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcdeck3r%2Fgradepredictiondata/lists"}