https://github.com/fatemafaria142/Instructions-Tuning-Across-Various-LLMs-with-Alpaca-Dataset
Improved-Language-Model-Instructions-Tuning-using-Alpaca-Dataset

In this project, I explored instruction tuning with different prompt formats across several Large Language Models.

Alpaca Dataset

I used the "Alpaca" dataset, which comprises 52,000 instruction-following demonstrations generated by OpenAI's text-davinci-003 engine. This data is well suited to instruction tuning, improving a language model's ability to follow instructions.

Dataset Link: Alpaca Dataset
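Each Alpaca record has `instruction`, `input`, and `output` fields. A minimal sketch of turning a record into a training prompt, assuming the standard template wording from the original Stanford Alpaca release (the sample record below is hypothetical):

```python
def build_alpaca_prompt(record: dict) -> str:
    """Format one Alpaca record into the standard instruction prompt."""
    if record.get("input"):
        # Variant used when the record carries additional context in `input`.
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{record['instruction']}\n\n"
            f"### Input:\n{record['input']}\n\n"
            "### Response:\n"
        )
    # Variant used when `input` is empty.
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{record['instruction']}\n\n"
        "### Response:\n"
    )

# Hypothetical record in the Alpaca schema:
sample = {"instruction": "Give three tips for staying healthy.", "input": "", "output": "..."}
prompt = build_alpaca_prompt(sample)
```

During training, the model's target (`output`) is appended after the `### Response:` marker.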

Large Language Models (LLMs)


I fine-tuned six Large Language Models for this task. Here are the details along with their respective links:

  1. GPT2
     Model Link: GPT2 Documentation

  2. GPT2-Medium
     Model Link: GPT2-Medium

  3. Mistral-7B-v0.1
     Model Link: Mistral-7B-v0.1

  4. TinyLlama-1.1B-Chat-v1.0
     Model Link: TinyLlama-1.1B-Chat-v1.0

  5. Mistral-7B-Instruct-v0.2
     Model Link: Mistral-7B-Instruct-v0.2

  6. Starling-LM-7B-alpha
     Model Link: Starling-LM-7B-alpha

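Whichever of the six models is used, supervised instruction tuning typically concatenates the prompt and response token ids and masks the prompt positions in the labels, so the loss is computed only on the response. A minimal sketch under that assumption, using made-up token ids in place of a real tokenizer (`-100` is the label value PyTorch's cross-entropy loss ignores):

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss

def build_training_example(prompt_ids: list[int], response_ids: list[int]):
    """Concatenate prompt and response ids; mask the prompt in the labels
    so the loss is computed on response tokens only."""
    input_ids = prompt_ids + response_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return input_ids, labels

# Hypothetical token ids, as a real tokenizer would produce:
input_ids, labels = build_training_example([5, 9, 12], [7, 3])
# input_ids == [5, 9, 12, 7, 3]; labels == [-100, -100, -100, 7, 3]
```

With real models, the same idea applies after tokenizing the Alpaca prompt and response with each model's own Hugging Face tokenizer.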
Feel free to explore these models and the Alpaca dataset for a deeper understanding of the project's advancements in language model instruction tuning.