https://github.com/thediymaker/slurm-node-dashboard
Slurm HPC node status page
dashboard hpc hpc-clusters slurm slurm-cluster slurm-job-scheduler
Last synced: 4 months ago
- Host: GitHub
- URL: https://github.com/thediymaker/slurm-node-dashboard
- Owner: thediymaker
- License: gpl-3.0
- Created: 2024-04-20T06:34:27.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2026-02-01T23:52:45.000Z (5 months ago)
- Last Synced: 2026-02-02T04:24:10.320Z (4 months ago)
- Topics: dashboard, hpc, hpc-clusters, slurm, slurm-cluster, slurm-job-scheduler
- Language: TypeScript
- Homepage: https://slurmdash.com
- Size: 4.48 MB
- Stars: 62
- Watchers: 2
- Forks: 15
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# HPC Dashboard
> Powerful monitoring for your SLURM-based HPC cluster
The HPC Dashboard is a Next.js application designed to provide comprehensive monitoring of SLURM nodes. With a focus on performance and usability, this dashboard offers real-time insights into your HPC resources.

## Key Features
### Core Functionality
- Real-time monitoring of CPU and GPU node utilization
- Detailed individual node status
- Comprehensive Slurm job details and history
- Dynamic data updates with refresh countdown
### Advanced Integrations
Enable these features by configuring your environment file:
- LMOD module display and details
- Prometheus metrics integration
- OpenAI-powered chat and embeddings
## Quick Start
```bash
git clone https://github.com/thediymaker/slurm-node-dashboard.git
cd slurm-node-dashboard
npm install
# Set up your .env file (see Configuration section)
npm run dev
```
Visit `http://localhost:3000` to see your dashboard in action.
## Detailed Setup - Base
### Prerequisites
- Node.js (v18 or later)
- npm or Yarn
- PM2 (for production deployment)
- Slurm API (enabled and configured)
- Slurm API user
- Slurm API token
### Enabling the Slurm API
To use this dashboard, you need to have the Slurm API enabled on your HPC cluster. Follow these steps to set it up:
1. Start by reviewing the [SchedMD quickstart guide](https://slurm.schedmd.com/rest_quickstart.html).
2. Ensure that `slurmrestd` is running on your cluster.
3. Once the Slurm API is running, you need to generate an API key for authentication.
### Generating an API Key
The API key needs permissions to read all data. Here's an example of generating a key for the slurm user with a lifespan of 1 year:
```bash
scontrol token username=slurm lifespan=31536000
```
Note: This command generates a JWT. The token carries its own expiration date, so you can set a reminder to renew it or automate the renewal (even with a shorter lifespan). The token's expiration will be shown in a future admin section of the dashboard.
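To make the expiration check concrete, here is a minimal sketch that reads the `exp` claim from a token. It assumes the token is a standard three-part JWT with a numeric `exp` claim in seconds, and that any `SLURM_JWT=` prefix printed by `scontrol token` has already been stripped. The names `jwtExpiry` and `daysUntilExpiry` are hypothetical and not part of the dashboard. The signature is not verified.

```typescript
// Decode the payload of a JWT and return its expiry as a Date.
// No signature verification is done here; this only reads the `exp` claim.
function jwtExpiry(token: string): Date {
  const parts = token.split(".");
  if (parts.length !== 3) {
    throw new Error("expected a JWT with three dot-separated parts");
  }
  // JWT segments are base64url-encoded: map back to standard base64 and re-pad.
  const b64 = parts[1].replace(/-/g, "+").replace(/_/g, "/");
  const padded = b64 + "=".repeat((4 - (b64.length % 4)) % 4);
  const payload = JSON.parse(atob(padded)) as { exp?: number };
  if (typeof payload.exp !== "number") {
    throw new Error("token has no numeric exp claim");
  }
  // `exp` is expressed in seconds since the Unix epoch (RFC 7519).
  return new Date(payload.exp * 1000);
}

// Hypothetical helper: whole days remaining before the token expires.
function daysUntilExpiry(token: string, now: Date = new Date()): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((jwtExpiry(token).getTime() - now.getTime()) / msPerDay);
}
```

A renewal reminder could call `daysUntilExpiry(token)` on a schedule and alert when the result drops below a threshold of your choosing.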
### Configuration
Create a `.env` file in the root directory:
```env
# BASE
COMPANY_NAME="Acme Corp"
NEXT_PUBLIC_BASE_URL="http://localhost:3000" # Update for your url and port
VERSION=1.1.2
CLUSTER_NAME="Cluster"
CLUSTER_LOGO="/cluster.png"
# DEV
NODE_ENV="dev"
REACT_EDITOR="code"
# SLURM
SLURM_API_VERSION="v0.0.40"
SLURM_SERVER="192.168.1.5"
SLURM_SERVER_PORT="6820"
SLURM_API_TOKEN=""
SLURM_API_ACCOUNT=""
# PLUGINS
NEXT_PUBLIC_ENABLE_OPENAI_PLUGIN=false
NEXT_PUBLIC_ENABLE_PROMETHEUS_PLUGIN=false
# ADVANCED FEATURES
OPENAI_API_KEY=""
OPENAI_API_URL="https://api.openai.com/v1"
OPENAI_API_MODEL="gpt-4o-mini"
PROMETHEUS_URL="" # Format http://192.168.1.5:9090
```
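As an illustration of how the `SLURM_*` values fit together, the sketch below builds a request for `slurmrestd`. It assumes plain HTTP, the `/slurm/<version>/<endpoint>` path layout, and the `X-SLURM-USER-NAME` / `X-SLURM-USER-TOKEN` headers that `slurmrestd` uses for JWT authentication. Treating `SLURM_API_ACCOUNT` as the user name is also an assumption about how the dashboard uses it. `SlurmEnv` and `buildSlurmRequest` are hypothetical names, not the dashboard's own code.

```typescript
// Subset of the .env values that a request to slurmrestd would need.
interface SlurmEnv {
  SLURM_API_VERSION: string;
  SLURM_SERVER: string;
  SLURM_SERVER_PORT: string;
  SLURM_API_TOKEN: string;
  SLURM_API_ACCOUNT: string;
}

// Hypothetical helper: build the URL and headers for a slurmrestd endpoint
// such as "nodes" or "jobs". Throws if any required value is empty.
function buildSlurmRequest(
  env: SlurmEnv,
  endpoint: string
): { url: string; headers: Record<string, string> } {
  const required: (keyof SlurmEnv)[] = [
    "SLURM_API_VERSION",
    "SLURM_SERVER",
    "SLURM_SERVER_PORT",
    "SLURM_API_TOKEN",
    "SLURM_API_ACCOUNT",
  ];
  for (const key of required) {
    if (!env[key]) throw new Error(key + " is empty; set it in .env");
  }
  // Assumption: slurmrestd is reachable over plain HTTP on the configured port.
  const url =
    "http://" + env.SLURM_SERVER + ":" + env.SLURM_SERVER_PORT +
    "/slurm/" + env.SLURM_API_VERSION + "/" + endpoint;
  return {
    url,
    headers: {
      "X-SLURM-USER-NAME": env.SLURM_API_ACCOUNT,
      "X-SLURM-USER-TOKEN": env.SLURM_API_TOKEN,
    },
  };
}
```

The result can be passed to `fetch(url, { headers })` to confirm that the server, port, API version, and token are all valid before starting the dashboard.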
### Production Deployment
For production environments, we recommend using PM2:
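```bash
npm install -g pm2
npm run build   # assumes the standard Next.js "build" script; `npm start` serves the built app
pm2 start npm --name "hpc-dashboard" -- start
pm2 save
pm2 startup     # prints the command that registers PM2 as a boot service
```
PM2 keeps the dashboard running and restarts it if the process exits. Once you have run the command printed by `pm2 startup`, the saved process list is also restored when the server reboots.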
## Advanced Usage
### Custom Data Collection
### Historical Node Data
Collect historical node data with this script (run hourly via cron):
```bash
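#!/bin/bash
# Snapshot the dashboard's node data and prune snapshots older than 30 days.
set -euo pipefail
SAVE_DIR="/path/to/data/directory"
mkdir -p "$SAVE_DIR"
# -u keeps the timestamp in UTC so the trailing "Z" in the name is accurate.
FILENAME=$(date -u +"%Y-%m-%dT%H-%M-%S.000Z.json.gz")
# -f makes curl fail on HTTP errors instead of saving an error page.
curl -sf "http://localhost:3000/api/slurm/nodes" | gzip > "$SAVE_DIR/$FILENAME"
find "$SAVE_DIR" -name "*.json.gz" -type f -mtime +30 -delete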
```
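The snapshot filenames encode an ISO-style timestamp with the colons replaced by hyphens. As a sketch of how a consumer might recover that timestamp, the helpers below parse the name back into a `Date`. They treat the time as UTC because the name ends in `Z`. `snapshotTime` and `sortSnapshots` are hypothetical names and may not match how the dashboard reads these files.

```typescript
// Matches names like "2024-05-01T12-00-00.000Z.json.gz".
const SNAPSHOT_NAME = /^(\d{4}-\d{2}-\d{2})T(\d{2})-(\d{2})-(\d{2})\.(\d{3})Z\.json\.gz$/;

// Hypothetical helper: recover the timestamp encoded in a snapshot filename.
// Returns null for names that do not follow the pattern or are not real dates.
function snapshotTime(filename: string): Date | null {
  const m = SNAPSHOT_NAME.exec(filename);
  if (!m) return null;
  // Put the colons back to get a standard ISO 8601 string.
  const iso = m[1] + "T" + m[2] + ":" + m[3] + ":" + m[4] + "." + m[5] + "Z";
  const parsed = new Date(iso);
  return isNaN(parsed.getTime()) ? null : parsed;
}

// Hypothetical helper: keep only snapshot files and sort them oldest-first.
function sortSnapshots(filenames: string[]): string[] {
  return filenames
    .filter((name) => snapshotTime(name) !== null)
    .sort((a, b) => snapshotTime(a)!.getTime() - snapshotTime(b)!.getTime());
}
```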
### Module Data
Collect module data with this script (run daily via cron):
```bash
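#!/bin/bash
# Export the Lmod module tree as JSON for the dashboard's module display.
set -euo pipefail
json_dir="/path/to/public/directory"
json_output="${json_dir}/modules.json"
mkdir -p "$json_dir"
export MODULESHOME="/usr/share/lmod/lmod"
export MODULEPATH="/your/module/path"
# cron does not load the Lmod shell init, so LMOD_DIR may be unset;
# the fallback assumes the usual layout where spider lives in libexec.
LMOD_DIR="${LMOD_DIR:-$MODULESHOME/libexec}"
"$LMOD_DIR/spider" -o jsonSoftwarePage "$MODULEPATH" | python -m json.tool > "$json_output"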
```
### Open OnDemand Integration
To integrate this dashboard with Open OnDemand:
1. Clone the generic Ruby app template:
   ```bash
   git clone https://github.com/thediymaker/ood-status-iframe.git
   ```
2. Navigate to the cloned repository:
   ```bash
   cd ood-status-iframe
   ```
3. Open `views/layout.erb` in your preferred text editor and update the URL in it to point to your deployed HPC Dashboard.
4. Follow Open OnDemand's documentation to deploy this app within your Open OnDemand environment.
This integration allows you to embed the HPC Dashboard within your Open OnDemand interface, providing users with easy access to cluster status information.
## Contributing
We welcome contributions! Here's how you can help:
1. Fork the repository
2. Create a feature branch: `git checkout -b new-feature`
3. Make your changes and commit: `git commit -am 'Add new feature'`
4. Push to the branch: `git push origin new-feature`
5. Submit a pull request
## License
This project is licensed under the GNU General Public License v3.0. See the [LICENSE](LICENSE.md) file for details.
## Support and Contact
For support, please open an issue on our [GitHub repository](https://github.com/thediymaker/slurm-node-dashboard/issues).
For direct inquiries, contact Johnathan Lee at [john.lee@thediymaker.com](mailto:john.lee@thediymaker.com).
## Video
### Video Guides
- [Quick start guide](https://youtu.be/wVEhPN-IqEA)
- [Open OnDemand iframe configuration](https://youtu.be/avLUYgMya98)
## Gallery
### Additional Screenshots
Screenshots (images not included): Feature Overview, Job Details, Running Job, Completed Job, Node Hover Details.
---
Made with ❤️ for HPC