{"id":15116344,"url":"https://github.com/albino/shithead-X","last_synced_at":"2025-09-27T22:30:33.911Z","repository":{"id":42002605,"uuid":"265393309","full_name":"albino/shithead-X","owner":"albino","description":"GPT-2 SUPER NEXT GENERATION MACHINE LEARNING irc shitposting bot ","archived":false,"fork":false,"pushed_at":"2022-06-26T00:35:48.000Z","size":30,"stargazers_count":11,"open_issues_count":1,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-09-27T01:51:28.548Z","etag":null,"topics":["gpt-2","gpt2-chatbot","irc","irc-bot","ml"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc0-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/albino.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-05-19T23:31:10.000Z","updated_at":"2024-06-10T22:21:03.000Z","dependencies_parsed_at":"2022-08-12T02:00:51.214Z","dependency_job_id":null,"html_url":"https://github.com/albino/shithead-X","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albino%2Fshithead-X","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albino%2Fshithead-X/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albino%2Fshithead-X/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albino%2Fshithead-X/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/albino","download_url":"https://codeload.github.com/albino/shithead-X/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://githu
b.com","kind":"github","repositories_count":234461940,"owners_count":18837203,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gpt-2","gpt2-chatbot","irc","irc-bot","ml"],"created_at":"2024-09-26T01:44:18.716Z","updated_at":"2025-09-27T22:30:28.661Z","avatar_url":"https://github.com/albino.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# shithead-X\n\nGPT-2 SUPER NEXT GENERATION MACHINE LEARNING irc shitposting bot\n\n*`shithead-X` is an IRC chatbot capable of producing remarkably realistic output. It is based on [aitextgen](https://github.com/minimaxir/aitextgen), a library implementing the [GPT-2](https://en.wikipedia.org/wiki/OpenAI#GPT-2) model.*\n\n### The problem\n\n[`shithead-ng`](https://github.com/albino/shithead-ng) is a cool Markov chain-based chatbot, but its funniness peaks around 750,000 keys, it could do with a much more intelligent weighting system, and I am bored of simple Markov chains and don't want to work on it any more.\n\n### The solution\n\nLet's implement another chatbot, this time using the latest hip machine learning technology.\n\n## Getting started\n\nFirst, set up the dependencies:\n\n**Note**: This will pull in some outdated dependencies because of breaking API changes in newer versions.\nYou really should use a `virtualenv` to avoid breaking your Python installation!\n\n```bash\n# Best to use a python virtualenv\npython3 -m venv venv\n. 
venv/bin/activate\npip3 install -r requirements.txt\n```\n\n### Building a model\n\nYou can use any pytorch GPT-2 model, but for functionality as a chatbot, it's best to train one using existing IRC logs. This way, your chatbot will mimic the culture of the channel you're deploying it in.\n\nThis can take a while to get just right - a bit of trial and error will be worth it here.\n\n#### Filtering the logs\n\nThis is not as simple as it sounds, as any 'unhelpful' input that is not filtered at this stage can cause problems with the model later on. **A simple, well thought-out filter is crucial to training a good model later.** Obvious things to filter include names, timestamps, URLs and bot output - I have included some filter scripts I used as examples in `dump_logs.py` and `filter.sh`.\n\n#### Training the model\n\nI don't actually know anything about natural language processing or artificial intelligence, I just played about with it until I was happy with the result. Training the provided 124M model worked the best for me, although this will require a powerful GPU to train (you can train it on Google's servers if you don't have one at home) and text generation can take a while on a weak CPU.\n\nMainly I followed [this tutorial](https://colab.research.google.com/drive/15qBZx5y9rdaQSyWpsreMDnTiZ5IlN0zD?usp=sharing) provided by aitextgen's author.\n\n### Configuration\n\nCopy `config.default.ini` to `config.ini` and edit accordingly. Run `gpt2_bot.py` to start.\n\nThe bot will only connect using SSL to servers with valid certificates. This is a feature. If a network can't set up SSL properly, it doesn't deserve to be graced by shithead-X. If you really want to change this behaviour, edit `irc.py`.\n\n## Commands\n\n* .shitposting - controls the % of messages the bot will reply to\n```\n\u003c user\u003e .shitposting 1.5\n\u003c shithead-X\u003e OK\n```\n\n* .temp - controls the temperature. 
This is a value which has something to do with randomness, I don't really know.\n```\n\u003c user\u003e .temp\n\u003c shithead-X\u003e Current temperature: 0.9\n\u003c shithead-X\u003e Regular temperature range is 0.7 - 1.0 (higher values are crazier)\n\u003c user\u003e .temp 0.7\n\u003c shithead-X\u003e New temperature is 0.7\n```\n\n* .ignore - make the bot totally ignore a user (useful for other bots)\n```\n\u003c user\u003e .ignore bot\n\u003c shithead-X\u003e Now ignoring bot\n```\n\n* .unignore - removes a user from the ignore list\n```\n\u003c user\u003e .unignore bot\n\u003c shithead-X\u003e No longer ignoring bot\n```\n\n* .ping - checks that shithead-X is still alive\n```\n\u003c user\u003e .ping\n\u003c shithead-X\u003e user: Pong!\n```\n\n## FAQ\n\n**Q**: Can you send me a model to use?    \n**A**: No.\n\n**Q**: Can you help me with something else?    \n**A**: Yes. Open an issue, or contact me via email or IRC.\n\n## Caveats\n\n* I don't know anything about artificial intelligence  \n* I don't know anything about Python  \n* This is a new project and it probably has some issues, I'm running it in a small channel myself so I will endeavour to get it working as well as I can. Bug reports welcome, PRs even welcomer.\n\n## Thanks\n\nThanks to Max Woolf for writing aitextgen, the team at OpenAI for making GPT-2 available to the public, and meiscoffee for helping train the model on his GPU.\n\n## Ethics\n\nPlease read the [ethics](https://github.com/minimaxir/aitextgen#ethics) section of aitextgen's README if you intend to seriously deploy shithead-X. 
In short, while it's funny to trick people with Markov chain bots, GPT-2 is 'real' AI and you should probably let people know they are talking to a robot.\n\n## License\n\nCC0/Public Domain; for more information please see the `LICENSE` file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falbino%2Fshithead-X","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falbino%2Fshithead-X","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falbino%2Fshithead-X/lists"}