{"id":21418523,"url":"https://github.com/jef1056/jv6","last_synced_at":"2025-07-14T05:31:07.040Z","repository":{"id":56156818,"uuid":"265716943","full_name":"JEF1056/Jv6","owner":"JEF1056","description":"Jade V6 - Based on GPT","archived":false,"fork":false,"pushed_at":"2023-06-12T21:27:53.000Z","size":10379,"stargazers_count":16,"open_issues_count":1,"forks_count":1,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-05-01T18:18:04.169Z","etag":null,"topics":["ai","chatbot","jadeai","machine-learning"],"latest_commit_sha":null,"homepage":"https://jadeai.ml","language":"CSS","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JEF1056.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-05-21T00:36:38.000Z","updated_at":"2024-02-01T06:32:43.000Z","dependencies_parsed_at":"2022-08-15T13:50:14.837Z","dependency_job_id":null,"html_url":"https://github.com/JEF1056/Jv6","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JEF1056%2FJv6","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JEF1056%2FJv6/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JEF1056%2FJv6/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JEF1056%2FJv6/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JEF1056","download_url":"https://codeload.github.com/JEF1056/Jv6/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225952172,"owners_count":1755050
4,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","chatbot","jadeai","machine-learning"],"created_at":"2024-11-22T19:22:06.142Z","updated_at":"2024-11-22T19:22:08.199Z","avatar_url":"https://github.com/JEF1056.png","language":"CSS","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Jade AI (V6)\r\n### What is Jade? \r\nJade is a contextual chatbot that uses OpenAI's GPT-2 transformer model to generate logical conversation. I wrote a more detailed description of Jade's capabilities (with screenshots!!) [here](https://dev.to/jef1056/jade-ai-549i).\r\n\r\n----------------------------\r\n## Usage\r\n### To deploy, it's rather simple.\r\n#### Requirements\r\n- Python 3+\r\n- Pip 3\r\n- 1 vCPU (more cores don't matter, but high single-core performance does)\r\n- 1.25GB or more of RAM\r\n- ~10GB of storage\r\n#### Instructions\r\n1. Download the [pretrained model](haha-i-havent-uploaded-it-yet) or [train your own](#train)\r\n2. Run `pip3 install -r reqirements.txt` in this folder\r\n3. Create a `config.json` file based on `config_example.json`\r\n4. 
Run `interact.py`\r\n\r\n### Here is a table of commands for `interact.py` (in Discord)\r\nPrefix is `JD`\r\n|    Command    |   Short Explanation  | Usage Example |\r\n|---------------|:----------------:|:-------------:|\r\n| JD -h |  Help menu | `JDT -h` |\r\n| JD -p | Ping menu | `JDT -p` |\r\n| JD -v |  Voting menu | `JDT -v` |\r\n| JD -s | Settings menu | `JDT -s` or `JDT -s [setting] [new value]` |\r\n| JD -r | Reset history or settings | `JDT -r` or `JDT -r [history or settings]` |\r\n| JD [message] | Talk to Jade! | `JDT [message]` |\r\n\r\n### Here is a table of arguments for `interact.py`\r\nArgument | Type | Default value | Description\r\n---------|------|---------------|------------\r\nmodel | `str` | `\"openai-gpt\"` | Path, URL or short name of the model\r\nmax_history | `int` | `4` | Number of previous utterances to keep in history\r\ndevice | `str` | `cuda` if `torch.cuda.is_available()` else `cpu` | Device (cuda or cpu)\r\nno_sample | action `store_true` | `False` | Set to use greedy decoding instead of sampling\r\nmax_length | `int` | `20` | Maximum length of the output utterances\r\nmin_length | `int` | `1` | Minimum length of the output utterances\r\nseed | `int` | `0` | Seed\r\ntemperature | `int` | `0.7` | Sampling softmax temperature\r\ntop_k | `int` | `0` | Filter top-k tokens before sampling (`\u003c=0`: no filtering)\r\ntop_p | `float` | `0.9` | Nucleus filtering (top-p) before sampling (`\u003c=0.0`: no filtering)\r\nf | `idk...` | `N/A` | Colab is weird ok\r\n\r\n##### All of these (except for the server settings) are available for user modification via the `JD -s` command.\r\n\u003ca name=\"train\"\u003e\u003c/a\u003e\r\n\r\n## Training\r\n### Training is a bit more difficult.\r\n#### Requirements\r\n\r\n- Python 3+\r\n- Pip 3\r\n- (Preferred) a server with GPU functionality. (Multi-GPU setups are supported!)\r\n  - A single Tesla T4 will take ~12 hours to finetune the model without FP16. The formatted dataset size was ~400 MB. 
See the dataset format [here](train/formatting/example_data.json)\r\n  - FP16 will reduce the VRAM usage and increase the speed of the model training. Depending on your hardware configuration, your results may vary.\r\n- (Preferred) 25+ GB of RAM\r\n  - \u003cspan style=\"color:red\"\u003eWARNING!!\u003c/span\u003e Large datasets (\u003e 200 MB formatted datasets) may require more than 25GB of RAM during tokenization.\r\n  - The pretrained model took ~45GB of RAM to train\r\n- (Recommended) 30GB of storage\r\n\r\n#### Instructions\r\nSee the [README.md](train/README.md) in the train folder for all the possible arguments for the training script.\r\n\r\n```bash\r\npython3 train.py --dataset_path data.json --model gpt2 --gradient_accumulation_steps=4 \\\r\n--lm_coef=2.0 --fp16=O1 --max_history=4 --n_epochs=1 --num_candidates=0 \\\r\n--personality_permutations=2 --train_batch_size=4 --valid_batch_size=4\r\n```\r\n\r\n1. Run `pip3 install -r reqirements-train.txt` in this folder\r\n2. Use the [formatter](train/formatting/README.md) to create the dataset in the correct format.\r\n3. Run the command above (which will replicate my results), or add `-m torch.distributed.launch --nproc_per_node=8` after `python3` if you're using more than one GPU.\r\n   - If the program is crashing with no errors during padding, use `htop` to check your RAM usage. Most cloud solutions will allow you to edit the machine memory size, or you can add swap.\r\n4. Use [nvtop](https://github.com/Syllo/nvtop) or `nvidia-smi` to monitor GPU usage after tokenization.\r\n   - Use FP16 if VRAM usage is really high\r\n   - Try to max out your GPU(s) by increasing or decreasing `--train_batch_size=4 --valid_batch_size=4`. A Tesla T4 will not be maxed out until it receives batches of 4.\r\n5. Complete training.\r\n   - \u003cspan style=\"color:red\"\u003eWARNING!!\u003c/span\u003e TRAINING IS NOT DONE UNTIL ALL VALIDATION IS COMPLETE. 
Try to keep the validation set small.\r\n   - The model will be saved to `runs/[DATE]`\r\n\r\n#### Post-Training\r\nThere's a bit more work to make the completed model compatible with `interact.py`.\r\nUnfortunately, I need to re-code this part, but I'll have it uploaded soon!\r\n\r\n(if you want to do it yourself)\r\n1. Read the tokenizer file and use it to tokenize the personalities in your dataset file\r\n2. Save the tokenized personalities in a list\r\n3. Use pickle to save the list to `versions.p` in the model folder","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjef1056%2Fjv6","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjef1056%2Fjv6","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjef1056%2Fjv6/lists"}