https://github.com/oelin/github-404s
A dataset containing nonexistent GitHub usernames. Useful for generating usernames that haven't already been taken.
data-science dataset github nlp statistics
Last synced: 9 months ago
- Host: GitHub
- URL: https://github.com/oelin/github-404s
- Owner: oelin
- License: MIT
- Created: 2023-01-20T11:13:00.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-01-20T11:28:49.000Z (over 3 years ago)
- Last Synced: 2025-03-12T05:33:05.940Z (over 1 year ago)
- Topics: data-science, dataset, github, nlp, statistics
- Homepage:
- Size: 7.81 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# GitHub 404s
A dataset containing nonexistent GitHub usernames. Useful for generating usernames that haven't already been taken. The dataset is currently quite small, with only ~1.1k examples, but we plan to expand it in the future. Each example is a lowercase alphabetical 4-gram, that is, a string of four lowercase letters. The examples have been concatenated into a single newline-separated string.
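
As a rough illustration of how the newline-separated format might be consumed, here is a minimal Python sketch. The README does not name the data file, so the path passed to `load_usernames` is up to the caller, and the helper names (`parse_usernames`, `load_usernames`, `pick_username`) are hypothetical rather than part of this repository.

```python
import random
import re
from typing import List, Optional

# Each entry is described as a lowercase alphabetical 4-gram.
FOUR_LETTERS = re.compile(r"[a-z]{4}")


def parse_usernames(text: str) -> List[str]:
    """Split the newline-separated dataset and keep only well-formed entries."""
    candidates = (line.strip() for line in text.splitlines())
    return [name for name in candidates if FOUR_LETTERS.fullmatch(name)]


def load_usernames(path: str) -> List[str]:
    """Read the dataset from a local file (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as handle:
        return parse_usernames(handle.read())


def pick_username(names: List[str], rng: Optional[random.Random] = None) -> str:
    """Return one candidate username chosen uniformly at random."""
    if not names:
        raise ValueError("no usernames to choose from")
    return (rng or random.Random()).choice(names)


# Made-up sample in the same format; blank and malformed lines are dropped.
sample = "qzxv\nabcd\n\nABCD\ntoolong\nab1d\n"
names = parse_usernames(sample)
candidate = pick_username(names, random.Random(0))
```

Because the repository was last pushed in January 2023, any name drawn this way may have been registered since, so treat the result as a candidate and confirm its availability on GitHub before relying on it.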