https://github.com/frostbyte012/gemma-studio
A web-based application for managing datasets, fine-tuning machine learning models, and monitoring training progress. Built with React, TypeScript, and FastAPI, it simplifies workflows for dataset preprocessing, hyperparameter configuration, and model export.
docker fastapi reactjs typescript
- Host: GitHub
- URL: https://github.com/frostbyte012/gemma-studio
- Owner: frostbyte012
- Created: 2025-03-25T11:37:52.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-22T00:00:18.000Z (about 1 year ago)
- Last Synced: 2025-04-22T01:19:09.593Z (about 1 year ago)
- Topics: docker, fastapi, reactjs, typescript
- Language: TypeScript
- Homepage:
- Size: 273 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Gemma Studio
Gemma Studio is a web-based application designed to streamline dataset management, model training, and deployment workflows. Built with modern technologies like **React**, **TypeScript**, **Vite**, and **Tailwind CSS**, it provides an intuitive interface for managing datasets, configuring training parameters, monitoring training progress, and exporting trained models.
## Inspiration
Inspired by [GSoC 25 from Google DeepMind](https://gist.github.com/dynamicwebpaige/92f7739ad69d2863ac7e2032fe52fbad), the goal is to create a user-friendly Gemma model fine-tuning UI using tools like Streamlit or Gradio. The UI is intended to provide:
- **Dataset Uploading**: Support various formats (CSV, JSONL, text files) with validation, preprocessing, and optional data augmentation.
- **Hyperparameter Configuration**: Adjust key parameters (learning rate, batch size, epochs) with sensible defaults and tooltips.
- **Training Progress Visualization**: Display real-time metrics (loss curves, accuracy, F1-score) and examples of generated text.
- **Model Download/Export**: Export fine-tuned models in formats like TensorFlow SavedModel, PyTorch, or GGUF.
- **Cloud Integration**: Optionally integrate with Google Cloud Storage and Vertex AI for scalable training and data storage.
- **Documentation**: Provide clear documentation and step-by-step examples for ease of use.
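
As an illustration of the dataset validation step, here is a minimal sketch of a JSONL validator. The record shape (`prompt` and `completion` string fields) and the names `TrainingExample`, `ValidationResult`, and `validateJsonl` are assumptions made for this example, not the project's actual schema or code.

```typescript
// Hypothetical record shape: each JSONL line is an object with string
// "prompt" and "completion" fields. The real schema is an assumption.
interface TrainingExample {
  prompt: string;
  completion: string;
}

interface ValidationResult {
  examples: TrainingExample[]; // rows that passed validation
  errors: string[];            // messages with 1-based line numbers
}

// Parse JSONL text line by line, collecting valid rows and per-line errors.
function validateJsonl(text: string): ValidationResult {
  const examples: TrainingExample[] = [];
  const errors: string[] = [];

  text.split(/\r?\n/).forEach((line, index) => {
    const lineNo = index + 1;
    if (line.trim() === "") return; // skip blank lines

    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      errors.push(`line ${lineNo}: invalid JSON`);
      return;
    }

    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      errors.push(`line ${lineNo}: expected a JSON object`);
      return;
    }

    const { prompt, completion } = parsed as Record<string, unknown>;
    if (typeof prompt !== "string" || typeof completion !== "string") {
      errors.push(`line ${lineNo}: "prompt" and "completion" must be strings`);
      return;
    }

    examples.push({ prompt, completion });
  });

  return { examples, errors };
}
```

Reporting errors per line, rather than failing on the first bad row, lets an upload screen show everything that needs fixing in one pass.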
---
## Project Structure
The project is organized as follows:
```
gemma-studio/
├── public/                                # Static assets
├── src/
│   ├── components/                        # React components
│   │   ├── dashboard/                     # Dashboard components
│   │   │   └── WelcomeCard.tsx            # Welcome component on dashboard
│   │   ├── dataset/                       # Dataset related components
│   │   │   ├── DatasetPreview.tsx         # Preview uploaded datasets
│   │   │   └── DatasetUpload.tsx          # Upload datasets UI
│   │   ├── layout/                        # Layout components
│   │   │   ├── Layout.tsx                 # Main layout wrapper
│   │   │   └── Navbar.tsx                 # Navigation bar
│   │   ├── models/                        # Model components
│   │   │   └── ModelExport.tsx            # Export trained models
│   │   ├── training/                      # Training components
│   │   │   ├── HyperparameterConfig.tsx   # Configure training parameters
│   │   │   └── TrainingProgress.tsx       # Training progress visualization
│   │   └── ui/                            # shadcn/ui components
│   │       └── ...                        # Various UI components (buttons, cards, etc.)
│   ├── hooks/                             # Custom React hooks
│   │   ├── use-mobile.tsx                 # Hook for responsive design
│   │   └── use-toast.ts                   # Toast notification hook
│   ├── lib/                               # Utility functions
│   │   ├── animations.ts                  # Animation utilities
│   │   └── utils.ts                       # General utilities
│   ├── pages/                             # Page components
│   │   ├── Dashboard.tsx                  # Dashboard page
│   │   ├── Datasets.tsx                   # Datasets management page
│   │   ├── Index.tsx                      # Landing page
│   │   ├── Models.tsx                     # Model export page
│   │   ├── NotFound.tsx                   # 404 page
│   │   ├── Settings.tsx                   # Settings page
│   │   └── Training.tsx                   # Training configuration and monitoring page
│   ├── services/                          # Backend service integrations
│   │   ├── datasetService.ts              # Dataset management functionality
│   │   ├── modelService.ts                # Model export functionality
│   │   └── trainingService.ts             # Training functionality
│   ├── App.css                            # App-wide styles
│   ├── App.tsx                            # Main application component with routing
│   ├── index.css                          # Global styles
│   ├── main.tsx                           # Application entry point
│   └── vite-env.d.ts                      # Vite environment types
├── eslint.config.js                       # ESLint configuration
├── tailwind.config.ts                     # Tailwind CSS configuration
├── tsconfig.json                          # TypeScript configuration
└── vite.config.ts                         # Vite configuration
```
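
To make the role of the `services/` layer more concrete, the sketch below shows how a module such as `trainingService.ts` might turn the hyperparameters from `HyperparameterConfig.tsx` into a validated request payload. Everything here is hypothetical: the default values, the `TrainingRequest` shape, and the function names are illustrative assumptions, not the project's actual code.

```typescript
// Hyperparameters named in the feature list. The default values are
// illustrative assumptions, not the project's actual defaults.
interface Hyperparameters {
  learningRate: number;
  batchSize: number;
  epochs: number;
}

const DEFAULT_HYPERPARAMETERS: Hyperparameters = {
  learningRate: 2e-5,
  batchSize: 8,
  epochs: 3,
};

// True for finite whole numbers greater than or equal to 1.
function isPositiveInteger(value: number): boolean {
  return isFinite(value) && Math.floor(value) === value && value >= 1;
}

// Merge user overrides onto the defaults and reject out-of-range values.
function resolveHyperparameters(overrides: Partial<Hyperparameters> = {}): Hyperparameters {
  const params: Hyperparameters = { ...DEFAULT_HYPERPARAMETERS, ...overrides };
  if (!(params.learningRate > 0 && params.learningRate < 1)) {
    throw new RangeError("learningRate must be between 0 and 1 (exclusive)");
  }
  if (!isPositiveInteger(params.batchSize)) {
    throw new RangeError("batchSize must be a positive integer");
  }
  if (!isPositiveInteger(params.epochs)) {
    throw new RangeError("epochs must be a positive integer");
  }
  return params;
}

// Hypothetical payload that a training service might send to the backend.
interface TrainingRequest {
  datasetId: string;
  hyperparameters: Hyperparameters;
}

// Build a request for a dataset, applying defaults to any omitted parameter.
function buildTrainingRequest(
  datasetId: string,
  overrides?: Partial<Hyperparameters>
): TrainingRequest {
  if (datasetId.trim() === "") {
    throw new Error("datasetId is required");
  }
  return { datasetId, hyperparameters: resolveHyperparameters(overrides) };
}
```

For example, `buildTrainingRequest("my-dataset", { epochs: 5 })` keeps the default learning rate and batch size and overrides only the epoch count. Keeping validation in the service layer means the UI component only has to collect values and display any error that is thrown.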
---
## Technologies Used
- **Vite**: Fast build tool for modern web projects.
- **TypeScript**: Strongly typed JavaScript for better code quality.
- **React**: Component-based UI library.
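- **shadcn/ui**: Pre-built UI components.
- **Tailwind CSS**: Utility-first CSS framework.
- **FastAPI**: Python web framework used for the backend API.
- **Docker**: Container packaging for deployment.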
---
## Installation and Usage
### Prerequisites
- **Node.js** and **npm** installed on your system. You can install them using [nvm](https://github.com/nvm-sh/nvm#installing-and-updating).
- **Python** and **pip**, if you also want to run the backend.
### Steps to Install and Run Locally
1. Clone the repository:
```bash
git clone https://github.com/frostbyte012/gemma-studio.git
```
2. Navigate to the project directory:
```bash
cd gemma-studio
```
3. Install dependencies:
```bash
npm install
```
4. Start the development server:
```bash
npm run dev
```
5. Open your browser and navigate to `http://localhost:8080` to view the application.
---
### Starting the Backend
1. Navigate to the backend directory:
```bash
cd backend
```
2. Install the required Python dependencies:
```bash
pip install -r requirements.txt
```
3. Start the FastAPI backend server:
```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```
4. The backend will be available at:
```
http://localhost:8000
```
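
With the frontend on port 8080 and the backend on port 8000, the frontend needs a consistent way to build backend URLs. The helper below is a minimal sketch under that assumption. The names `BACKEND_BASE_URL`, `apiUrl`, and `withQuery` are made up for this example, and `/datasets` is a hypothetical endpoint. `/docs` is the path where FastAPI serves its interactive API documentation by default.

```typescript
// Base URL from the backend steps above. Adjust it if you change --port.
const BACKEND_BASE_URL = "http://localhost:8000";

// Join a base URL and a path with exactly one slash between them.
function apiUrl(path: string, base: string = BACKEND_BASE_URL): string {
  const trimmedBase = base.replace(/\/+$/, "");
  const trimmedPath = path.replace(/^\/+/, "");
  return `${trimmedBase}/${trimmedPath}`;
}

// Append query parameters, URL-encoding both keys and values.
function withQuery(url: string, params: Record<string, string | number>): string {
  const pairs = Object.keys(params).map(
    (key) => `${encodeURIComponent(key)}=${encodeURIComponent(String(params[key]))}`
  );
  return pairs.length === 0 ? url : `${url}?${pairs.join("&")}`;
}

// FastAPI's default interactive docs page, handy as a quick reachability check.
const docsUrl = apiUrl("/docs");

// Hypothetical endpoint, shown only to illustrate query handling.
const datasetsUrl = withQuery(apiUrl("datasets"), { limit: 10 });
```

Because the two servers run on different origins, browser requests from the frontend will only succeed if the backend allows them through CORS.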
## Deployment
### Deploying to Hugging Face Spaces
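1. Package the application as a Docker container and confirm that the image builds:
```bash
docker build -t gemma-studio .
```
2. Deploy to Hugging Face Spaces:
- Create a Space that uses the Docker SDK and push the project, including its `Dockerfile`, to the Space repository. Spaces builds the image from the `Dockerfile`. See the [Hugging Face Spaces Docker documentation](https://huggingface.co/docs/hub/spaces-docker) for details.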
### Deploying to Google Cloud Run
1. Authenticate with Google Cloud:
```bash
gcloud auth login
gcloud auth configure-docker
```
2. Build and push the Docker image, replacing `PROJECT_ID` with your Google Cloud project ID:
```bash
docker build -t gcr.io/PROJECT_ID/gemma-studio .
docker push gcr.io/PROJECT_ID/gemma-studio
```
3. Deploy to Cloud Run:
```bash
gcloud run deploy gemma-studio \
  --image gcr.io/PROJECT_ID/gemma-studio \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated
```
### Integrating with Vertex AI
- Use Vertex AI for advanced model training and deployment.
- Integrate the service modules (`trainingService.ts`, `modelService.ts`) with Vertex AI APIs to run training and deployment workflows.
- Refer to the [Vertex AI documentation](https://cloud.google.com/vertex-ai/docs) for more details.
---
## Future Scope
1. **Hugging Face Integration**:
- Deploy models directly to Hugging Face Spaces for easy sharing and inference.
2. **Google Cloud Integration**:
- Use Google Cloud Storage for dataset management.
- Leverage Vertex AI for scalable model training and deployment.
3. **Custom Domain Support**:
- Integrate with platforms like Netlify or Vercel for hosting under a custom domain.
4. **Enhanced UI/UX**:
- Add more interactive visualizations for training progress and dataset insights.
5. **Multi-Cloud Support**:
- Extend deployment options to AWS and Azure.
---
## Contributing
Contributions are welcome! Please follow these steps:
1. Fork the repository.
2. Create a new branch:
```bash
git checkout -b feature/your-feature-name
```
3. Commit your changes:
```bash
git commit -m "Add your message here"
```
4. Push to your branch:
```bash
git push origin feature/your-feature-name
```
5. Open a pull request.
---
## License
This project is licensed under the MIT License. See the `LICENSE` file for details.
---
## Contact
For any questions or feedback, feel free to reach out at [ai.frostbyte012@gmail.com](mailto:ai.frostbyte012@gmail.com).