{"id":28166567,"url":"https://github.com/gregorybchris/ngrams","last_synced_at":"2025-05-15T13:13:52.239Z","repository":{"id":289567354,"uuid":"971302083","full_name":"gregorybchris/ngrams","owner":"gregorybchris","description":"N-grams assignment for Park Tudor","archived":false,"fork":false,"pushed_at":"2025-04-23T23:58:01.000Z","size":439,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-24T00:18:42.575Z","etag":null,"topics":["assignment","auto-regressive","generation","language","lesson","llm","model","n-gram","teach","text"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gregorybchris.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-23T10:09:52.000Z","updated_at":"2025-04-23T23:58:05.000Z","dependencies_parsed_at":"2025-04-24T00:18:44.429Z","dependency_job_id":"c1d8fe49-d112-4677-a96f-cdcf0297400c","html_url":"https://github.com/gregorybchris/ngrams","commit_stats":null,"previous_names":["gregorybchris/ngrams"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gregorybchris%2Fngrams","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gregorybchris%2Fngrams/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gregorybchris%2Fngrams/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gregorybchris%2Fngrams/mani
fests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gregorybchris","download_url":"https://codeload.github.com/gregorybchris/ngrams/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254346569,"owners_count":22055809,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["assignment","auto-regressive","generation","language","lesson","llm","model","n-gram","teach","text"],"created_at":"2025-05-15T13:13:26.852Z","updated_at":"2025-05-15T13:13:52.223Z","avatar_url":"https://github.com/gregorybchris.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# N-grams\n\nIntroduction to generative language modeling using an n-gram model.\n\nThis project is an assignment for the Park Tudor data science class. 
See [assignment.md](./assignment.md) for detailed instructions.\n\n## Files\n\n| Name                                           | Description                                     |\n| ---------------------------------------------- | ----------------------------------------------- |\n| [assignment.md](./assignment.md)               | The instructions for the assignment             |\n| [tiny_shakespeare.txt](./tiny_shakespeare.txt) | The dataset we use to train our language model  |\n| --                                             | --                                              |\n| [dataset.py](./dataset.py)                     | Utilities for loading and splitting the dataset |\n| [model.py](./model.py)                         | The n-gram model implementation                 |\n| --                                             | --                                              |\n| [train.py](./train.py)                         | A CLI script to train the model                 |\n| [generate.py](./generate.py)                   | A CLI script to generate text with the model    |\n| [grade.py](./grade.py)                         | A CLI script to grade the assignment            |\n| --                                             | --                                              |\n| [grading_utils.py](./grading_utils.py)         | Utilities for grading; can be ignored           |\n\n## Dataset\n\nThe [Tiny Shakespeare dataset](./tiny_shakespeare.txt) was downloaded from [Andrej Karpathy's char-rnn repository on GitHub](https://github.com/karpathy/char-rnn/blob/master/data/tinyshakespeare/input.txt).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgregorybchris%2Fngrams","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgregorybchris%2Fngrams","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgregorybchris%2Fngrams/lists"}