Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/aadisrivastava05/pravachakai-hindi-article-generation
This project is a Hindi Article Generator, developed using natural language processing (NLP) techniques to create contextually relevant and coherent articles in Hindi. The model was trained on a custom dataset collected via web scraping. The model has been fine-tuned on a dataset of Hindi news headlines and articles.
fine-tuning huggingface-transformers llama3 nlp unsloth
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/aadisrivastava05/pravachakai-hindi-article-generation
- Owner: AadiSrivastava05
- License: mit
- Created: 2024-09-05T10:25:46.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-09-05T11:21:14.000Z (4 months ago)
- Last Synced: 2024-10-19T22:15:26.111Z (3 months ago)
- Topics: fine-tuning, huggingface-transformers, llama3, nlp, unsloth
- Language: Jupyter Notebook
- Homepage: https://www.kaggle.com/code/aadisrivastava/hindi-article-generator
- Size: 42 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Hindi Article Generator
This project is a Hindi Article Generator that uses natural language processing (NLP) techniques to create contextually relevant and coherent articles in Hindi. The model was fine-tuned on a custom dataset of Hindi news headlines and articles collected via web scraping, and is intended to generate fluent articles for use cases such as news, blogs, and creative content.

## Direct link to Kaggle Notebook with outputs
https://www.kaggle.com/code/aadisrivastava/hindi-article-generator

(I highly recommend checking it out, as it contains all the cells along with their outputs.)

# Dataset
The model is trained on a dataset that I created through web scraping from the BBC Hindi website. The dataset includes a wide variety of articles, ensuring that the generator produces diverse and contextually accurate content.
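
As a rough illustration of how scraped headline/article pairs might be turned into fine-tuning text, the sketch below formats each pair into one training string. The field names `headline` and `article`, the prompt template, the helper names, and the sample records are all assumptions made for illustration; the actual column names and prompt format are in the dataset repo and the Kaggle notebook.

```python
# Hypothetical prompt template; the notebook's real template may differ.
PROMPT_TEMPLATE = "### Headline:\n{headline}\n\n### Article:\n{article}"


def build_training_text(record: dict, eos_token: str = "") -> str:
    """Format one scraped record into a single training string.

    Assumes the record has 'headline' and 'article' keys (hypothetical names).
    """
    headline = record["headline"].strip()
    article = record["article"].strip()
    return PROMPT_TEMPLATE.format(headline=headline, article=article) + eos_token


def build_corpus(records: list[dict], eos_token: str = "") -> list[str]:
    """Format every record that has a non-empty headline and article."""
    return [
        build_training_text(r, eos_token)
        for r in records
        if r.get("headline", "").strip() and r.get("article", "").strip()
    ]


# Made-up sample records, not taken from the real dataset.
sample = [
    {"headline": "  दिल्ली में भारी बारिश ", "article": "राजधानी में आज सुबह से तेज़ बारिश हो रही है।"},
    {"headline": "", "article": "बिना शीर्षक का लेख"},
]

corpus = build_corpus(sample, eos_token="</s>")
print(len(corpus))
print(corpus[0])
```

Appending an end-of-sequence token is a common choice when fine-tuning causal language models so that generation learns where an article stops; in practice the token would come from the tokenizer rather than a hard-coded string.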
## Link to the dataset repo containing the dataset and scraping code:
https://github.com/AadiSrivastava05/BBC-Hindi-News-Dataset-with-web-scraping-script
## The dataset is also available on Kaggle, where you can use it directly in your code without downloading:
https://www.kaggle.com/datasets/aadisrivastava/bbc-hindi-news-articles-dataset-detailed

# Features
* Generates fluent and contextually accurate Hindi articles.
* Can be used for content creation in media platforms, blogs, and creative writing.
* Built on **Llama 3**, fine-tuned for the Hindi language.
* The same dataset and approach can be used for other tasks, for example:
  1) Generating a headline for an article
  2) Classifying an article into categories
  3) Classifying an article headline into categories
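
One way to picture this reuse is as choosing a different input and target from the same record for each task. The sketch below is only an illustration: the keys `headline`, `article`, and `category`, the task names, and the `to_task_example` helper are hypothetical, and whether the dataset actually carries a category label should be checked in the dataset repo.

```python
def to_task_example(record: dict, task: str) -> dict:
    """Map one record to an input/target pair for the given task.

    Assumes hypothetical keys 'headline', 'article', and 'category'.
    """
    # Each task reads one field as input and another as the target.
    field_map = {
        "article_generation": ("headline", "article"),
        "headline_generation": ("article", "headline"),
        "article_classification": ("article", "category"),
        "headline_classification": ("headline", "category"),
    }
    if task not in field_map:
        raise ValueError(f"unknown task: {task}")
    input_key, target_key = field_map[task]
    return {"input": record[input_key], "target": record[target_key]}


# Made-up record for illustration only.
record = {
    "headline": "दिल्ली में भारी बारिश",
    "article": "राजधानी में आज सुबह से तेज़ बारिश हो रही है।",
    "category": "india",
}

for task in ("headline_generation", "article_classification"):
    print(task, to_task_example(record, task))
```

Keeping the mapping in a single table means a new task only needs one extra entry, while the scraping and cleaning steps stay unchanged.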