{"id":15937073,"url":"https://github.com/tyleryep/sage","last_synced_at":"2025-10-29T10:13:36.508Z","repository":{"id":106214203,"uuid":"220703234","full_name":"TylerYep/sage","owner":"TylerYep","description":"CS 398 Final Project ","archived":false,"fork":false,"pushed_at":"2021-01-20T08:07:06.000Z","size":5152,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-09T08:17:05.647Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TylerYep.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-09T21:00:56.000Z","updated_at":"2019-12-18T02:21:43.000Z","dependencies_parsed_at":"2023-05-30T15:45:17.921Z","dependency_job_id":null,"html_url":"https://github.com/TylerYep/sage","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TylerYep%2Fsage","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TylerYep%2Fsage/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TylerYep%2Fsage/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TylerYep%2Fsage/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TylerYep","download_url":"https://codeload.github.com/TylerYep/sage/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","reposi
tories_count":247070780,"owners_count":20878581,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-07T04:42:05.874Z","updated_at":"2025-10-15T22:51:02.196Z","avatar_url":"https://github.com/TylerYep.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# sage\nCS 398 Final Project\nBy Tyler Yep, Jesse Doan\n\n# Instructions\nAdding more documentation because, heavens, we have guests!\nStep 0: Obtain the Code Studio data from Chris Piech (cpiech) and put it in `data/`.\n\n\n## Data Exploration\nJump into generate with `cd generate/`.\n\nRun `python data_loader.py` to convert the data to json.\n\nSet `USE_FEEDBACK_NN = False` in explore.py.\n\nThen, run `python explore.py` to examine specific data examples.\n\n\n## Autograder\nJump into generate with `cd generate/`.\n\nRun `python sample.py [problem_num]` to get training data. This will also give the top 50 submissions in the source data that were not represented in your grammar. 
(Fix your grammar!)\n\nYou can also run `python sample.py 0 -a` to get all training data.\n\nRun `python preprocess.py [problem_num] [data_path_here.pkl]` to convert/split the data into train/val/test.\n\nRun `python train.py [problem_num]` to train the model.\n\nRun `python report_card.py` to get the student's aggregated report card.\n\nRun `python explore.py` to examine specific data examples.\n\n\n## Milestones\n- Wrote a grammar for p1-p4.\n  - Learned what samples we were missing while we were sampling!\n    - P1 - 82%\n    - P2 - 99%\n    - P3 - 52%\n    - P4 - 56%\n- Created a data exploration tool.\n  - View a submission in its original code form.\n  - Manually inspect student progress.\n\n- Create visualizations for rubric sampled AI.\n\n- Anomaly Detection.\n  - Transition model\n    - Find breakthrough moments = a single large transition score\n    - Find backtracking moments = (not as important, save for later)\n  - Bucketed learning\n    - Use the categories of learning that we identified across all problems, and get transition scores for each of those categories instead. This is a student's \"report card\".\n\n  - Deep Learning\n    - Does the amount of learning we arbitrarily calculated correlate with the number of submissions later?\n    - Plot final score vs (number of submissions later * time spent)\n\n- Ability Gradient Estimation.\n  - Can you backprop student success on the next problem to train transition weights for learning?\n  - Model that predicts future success (number of submissions on the next problem) based on the calculated score?\n\n\n# Project Info\n\n## Motivating Question\nHow can we measure a student’s growth in Hour of Code? Can we find the moments when the student has learned, or in other words, advanced to a greater ability?\n\nBased on our predicted student ability, we can better place students in the zone of proximal development, and can then give better recommendations (feedback/next problem to try). 
We can also evaluate students via a different metric (grit rather than recall).\n\n## Method\nFor each problem on Code.org, we can build a rubric of mistakes the students are making. Given these rubric items, we can see whether the same students stop making these mistakes on a later problem, implying some measure of growth. If we identify this change in ability, we can make informed recommendations to increase the amount / rate of learning.\n\n## Steps\n1. Create rubric items for the mistakes students make, and also for when they don't make mistakes. These should align with intuitive notions of student growth, such as:\n    * time to problem completion? (might not be true)\n    * number of submissions\n    * number of backtracks\n    * amount of code increasing vs decreasing\n    * code style?\n2. Simulate a student with some ability a_{init} answering all Hour of Code questions.\n  * Their ability grows randomly and their rubric items change; this builds our dataset.\n  * Zero-shot learning problem.\n3. Validation - given a real student, predict student ability a_{final} using marked rubric items, and get a sense of the slope of a student's growth. See which rubric buckets change the most over time.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftyleryep%2Fsage","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftyleryep%2Fsage","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftyleryep%2Fsage/lists"}