https://github.com/mrdbourke/learn-huggingface
Repo designed to help learn the Hugging Face ecosystem (transformers, datasets, accelerate + more).
- Host: GitHub
- URL: https://github.com/mrdbourke/learn-huggingface
- Owner: mrdbourke
- License: apache-2.0
- Created: 2024-06-06T22:47:12.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-10T07:48:17.000Z (7 months ago)
- Last Synced: 2025-04-13T05:13:21.946Z (6 months ago)
- Language: Jupyter Notebook
- Homepage: http://www.learnhuggingface.com/
- Size: 67.2 MB
- Stars: 69
- Watchers: 3
- Forks: 12
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Learn Hugging Face 🤗 (work in progress)
I'd like to learn the Hugging Face ecosystem better (transformers, datasets, accelerate + more).
So this repo is to help me learn it and simultaneously teach others.
Each example will include an end-to-end approach of starting with a dataset (custom or existing), building and evaluating a model and creating a demo to share.
Teaching style:
A machine learning cooking show! 👨🍳
Mottos:
* ***If in doubt, run the code.*** -- Machine learning is very experimental. So it's good to get in the habit of continually trying things (even if you think they won't work).
* ***Visualize, visualize, visualize!*** - If you're not sure of some dataset or some operation or some predictions, visualize it/them.
* ***Experiment, experiment, experiment!*** - Again, machine learning is very experimental. So keep trying different things!
* ***Data, model, demo!*** - Create/get a dataset, build/train/evaluate a model, create a demo to share.

Project style:
Data, model, demo!
* Create a new/reuse an existing dataset.
* Train/evaluate a model.
* Build a demo to share.

This will be our (rough) workflow:
A general Hugging Face workflow from idea to shared model and demo using tools from the Hugging Face ecosystem. These kinds of workflows are not set in stone and are more of a guide than specific directions. See information on each of the tools in the Hugging Face documentation.

## Contents
All code and text will be free/open-source. Step-by-step video walkthroughs are available as a paid upgrade.
| Project | Description | Dataset | Model | Demo | Video Course |
| ----- | ----- | ----- | ----- | ----- | ----- |
| [Text classification](https://www.learnhuggingface.com/notebooks/hugging_face_text_classification_tutorial) | Build project "Food Not Food", a text classification model to classify image captions into "food" if they're about food or "not_food" if they're not about food. This is the ideal place to get started if you've never used the Hugging Face ecosystem. | [Dataset](https://huggingface.co/datasets/mrdbourke/learn_hf_food_not_food_image_captions) | [Model](https://huggingface.co/mrdbourke/learn_hf_food_not_food_text_classifier-distilbert-base-uncased) | [Demo](https://huggingface.co/spaces/mrdbourke/learn_hf_food_not_food_text_classifier_demo) | [Video Course](https://dbourke.link/ZTM-HF-Text-Classification) |
| [Object Detection](https://www.learnhuggingface.com/notebooks/hugging_face_object_detection_tutorial) | Build Trashify 🚮, an object detection model to detect "trash", "hand", "bin" to incentivize people to clean up their local area. Start with a dataset, customize an open-source object detection model and turn it into a demo application that others can use and try out on their own images. | [Dataset](https://huggingface.co/datasets/mrdbourke/trashify_manual_labelled_images) | [Model](https://huggingface.co/mrdbourke/rt_detrv2_finetuned_trashify_box_detector_v1) | [Demo](https://huggingface.co/spaces/mrdbourke/trashify_demo_v4) | Video Course (coming soon) |
| More to come soon! | Let me know if you'd like to see anything specific by [leaving an issue](https://github.com/mrdbourke/learn-huggingface/issues). | | | | |

## Who is it for?
Ideal for:
* Beginners who love things explained in detail.
* Someone who wants to create more of their own end-to-end machine learning projects.

Not ideal for:
* People with 2-3+ years of machine learning projects & experience^.
^Note: This being said, you may actually find some things helpful along the way. Best to explore and see!
## Prerequisites
* 3-6 months Python experience.
* 1x beginner machine learning or deep learning course (see my [beginner-friendly ML course](https://dbourke.link/ZTMMLcourse) to learn Python + important ML concepts in one).
* PyTorch experience is a bonus (see my [Learn PyTorch in a Day video](https://youtu.be/Z_ikDlimN6A?si=Glf8q383cV0P9hEO) or [learnpytorch.io](https://www.learnpytorch.io/)).

## What is Hugging Face?
Hugging Face is a platform that offers access to many different kinds of open-source machine learning models and datasets.
They're also the creators of the popular [`transformers` library](https://huggingface.co/docs/transformers/en/index), a Python library for working with pre-trained models as well as custom models, along with many more helpful libraries.
If you're getting into the world of AI and machine learning, you're going to come across Hugging Face.
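To make "working with pre-trained models" concrete, here's a minimal sketch (not taken from the course notebooks) that wraps a `transformers` pipeline around the Food Not Food model listed in the Contents table. The helper names `load_classifier` and `label_captions` are made up for illustration, and running `load_classifier` assumes `transformers` plus a backend such as PyTorch are installed and that the model can be downloaded from the Hub.

```python
from typing import Callable

# Model ID copied from the Contents table in this README.
MODEL_ID = "mrdbourke/learn_hf_food_not_food_text_classifier-distilbert-base-uncased"


def load_classifier(model_id: str = MODEL_ID):
    """Download a model from the Hugging Face Hub and wrap it in a pipeline."""
    # Imported lazily so the rest of this file runs without transformers installed.
    from transformers import pipeline

    return pipeline(task="text-classification", model=model_id)


def label_captions(captions: list[str], classifier: Callable) -> dict[str, str]:
    """Map each caption to its top predicted label (hypothetical helper)."""
    # A text-classification pipeline returns one {"label": ..., "score": ...}
    # dictionary per input string.
    predictions = classifier(captions)
    return {caption: pred["label"] for caption, pred in zip(captions, predictions)}
```

Usage would look something like `label_captions(["A plate of pasta with basil"], load_classifier())`. If in doubt, run the code and inspect what comes back.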
A handful of pieces from the Hugging Face ecosystem. There are many more available in Hugging Face documentation.

## Why Hugging Face?
Many of the biggest companies in the world use Hugging Face for their open-source machine learning projects including [Apple](https://huggingface.co/apple), [Google](https://huggingface.co/google), [Facebook](https://huggingface.co/facebook) (Meta), [Microsoft](https://huggingface.co/microsoft), [OpenAI](https://huggingface.co/openai), [ByteDance](https://huggingface.co/ByteDance) and more.
Not only does Hugging Face make it easy to use state-of-the-art machine learning models such as [Stable Diffusion](https://huggingface.co/stabilityai/stable-diffusion-2-1) (for image generation) and [Whisper](https://huggingface.co/openai/whisper-large-v3) (for audio transcription), it also lets you share your own models, datasets and resources.
Aside from your own website, consider Hugging Face the homepage of your AI/machine learning profile.
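Shared resources can be pulled back down just as easily. As a rough sketch, assuming the `datasets` library is installed and the dataset from the Contents table is still public, the snippet below loads it from the Hub and counts the rows in each split. The helper names `load_captions` and `split_sizes` are hypothetical, and I make no assumptions here about the dataset's column names or split names.

```python
# Dataset ID copied from the Contents table in this README.
DATASET_ID = "mrdbourke/learn_hf_food_not_food_image_captions"


def load_captions(dataset_id: str = DATASET_ID):
    """Download a dataset from the Hugging Face Hub."""
    # Imported lazily so the rest of this file runs without datasets installed.
    from datasets import load_dataset

    # Without a `split` argument this typically returns a dict-like object
    # keyed by split name.
    return load_dataset(dataset_id)


def split_sizes(dataset_dict) -> dict[str, int]:
    """Count the rows in each split of a dict-like dataset (hypothetical helper)."""
    return {name: len(split) for name, split in dataset_dict.items()}
```

Calling `split_sizes(load_captions())` is a quick first look at how much data there is before visualizing individual samples.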
## TODO
- [ ] Prerequisites
- [ ] Ecosystem overview (transformers, datasets, accelerate, tokenizers, Spaces, demos, models, hub etc.)
- [x] Text classification
- [ ] Object detection
- [ ] Image classification
- [ ] Named entity recognition
- [ ] LLM fine-tuning
- [ ] VLM fine-tuning
- [ ] RAG workflow
- [ ] Zero-shot image classification/multi-modal workflows (CLIP)

## Setup
See [setup](https://github.com/mrdbourke/learn-huggingface/blob/main/extras/setup.md).
## Resources
* Hugging Face documentation - https://huggingface.co/
* Hugging Face cookbook - https://github.com/huggingface/cookbook

## FAQ
> Is this an official Hugging Face website?
No, it's a personal project by myself ([Daniel Bourke](https://www.mrdbourke.com)) to learn and help others learn the Hugging Face ecosystem.
## Log
11 Feb 2025 - Add object detection base tutorial (code works, but this is a first draft, stay tuned for updates to make it cleaner)