{"id":17109162,"url":"https://github.com/dkackman/chattahoochie","last_synced_at":"2025-06-13T03:40:59.262Z","repository":{"id":185671298,"uuid":"606809678","full_name":"dkackman/chattahoochie","owner":"dkackman","description":null,"archived":false,"fork":false,"pushed_at":"2023-03-13T22:11:17.000Z","size":14,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-14T20:45:34.298Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dkackman.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-02-26T16:25:49.000Z","updated_at":"2023-02-26T16:25:55.000Z","dependencies_parsed_at":null,"dependency_job_id":"378bdc2c-c37b-425a-8650-fd7050d1cd76","html_url":"https://github.com/dkackman/chattahoochie","commit_stats":null,"previous_names":["dkackman/chattahoochie"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkackman%2Fchattahoochie","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkackman%2Fchattahoochie/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkackman%2Fchattahoochie/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkackman%2Fchattahoochie/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dkackman","download_url":"https://codeload.github.com/dkackman/chattahoochie/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"git
hub","repositories_count":245171773,"owners_count":20572296,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-14T16:22:20.981Z","updated_at":"2025-03-23T21:28:26.716Z","avatar_url":"https://github.com/dkackman.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# chattahoochie\n\nA code playground/scratch pad/whatever for learning about running language models. It uses Hugging Face libraries, so if you're familiar with Stable Diffusion code \nit should feel relatively familiar. LLMs are at a point where they will soon be as accessible as SD was in late 2022, so exciting times are coming!\n\n## Some initial things I've learned\n\n- VRAM size is much more important here than with SD.\n- Models can be run at 8-bit precision with little loss of fidelity. The [bitsandbytes library](https://github.com/TimDettmers/bitsandbytes) enables this.\n- This allows large models to fit into less memory. \n- https://huggingface.co/PygmalionAI/pygmalion-6b can run in 16GB and performs pretty well. \n- Models with fewer than 6B parameters are pretty nonsensical. 
They are good for quick testing though.\n- [DeepSpeed](https://github.com/microsoft/DeepSpeed) and [FlexGen](https://github.com/FMInference/FlexGen) look very interesting for LLM compression and optimization on smaller GPUs.\n- I'm working with an NVIDIA 3090 with 24GB VRAM, and the LLMs I've played with are starting to be coherent, though they can still drift into the nonsensical.\n\n## Fun LLMs I've found so far\n\n- [The 7B parameter versions of BLOOM](https://huggingface.co/bigscience) can run on commodity hardware and are fun. \n- [Pygmalion 6B](https://huggingface.co/PygmalionAI/pygmalion-6b) also seems to perform well.\n\n## Hello Chat\n\nChat apps work by building up an array of strings that represents the conversation. The entire conversation is sent to the model each time, \nwhich is how it knows the context. The longer the conversation, the larger the context and the more VRAM is needed.\n\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\nimport torch\n\n\ntokenizer = AutoTokenizer.from_pretrained(\"microsoft/DialoGPT-large\")\nmodel = AutoModelForCausalLM.from_pretrained(\"microsoft/DialoGPT-large\")\n\n# Let's chat for 5 lines\nfor step in range(5):\n    # encode the new user input, add the eos_token and return a PyTorch tensor\n    new_user_input_ids = tokenizer.encode(input(\"\u003e\u003e User:\") + tokenizer.eos_token, return_tensors='pt')\n\n    # append the new user input tokens to the chat history (there is no history on the first step)\n    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step \u003e 0 else new_user_input_ids\n\n    # generate a response while limiting the total chat history to 1000 tokens\n    chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)\n\n    # pretty print the last output tokens from the bot\n    print(\"DialoGPT: {}\".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], 
skip_special_tokens=True)))\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdkackman%2Fchattahoochie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdkackman%2Fchattahoochie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdkackman%2Fchattahoochie/lists"}