https://github.com/launchplatform/cakelens-v5
Open-source AI-gen video detection model
- Host: GitHub
- URL: https://github.com/launchplatform/cakelens-v5
- Owner: LaunchPlatform
- License: mit
- Created: 2025-07-23T00:27:51.000Z (11 months ago)
- Default Branch: master
- Last Pushed: 2025-07-30T06:49:41.000Z (11 months ago)
- Last Synced: 2025-07-30T07:25:15.271Z (11 months ago)
- Language: Python
- Size: 8.21 MB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# cakelens-v5
Open-source AI-gen video detection model
Please see the [blog post](https://fangpenlin.com/posts/2025/07/30/open-source-cakelens-v5/) for more details.
You can find the model weights in our [Hugging Face Hub repository](https://huggingface.co/fangpenlin/cakelens-v5).
## Installation
Install the package with its dependencies:
```bash
pip install cakelens-v5
```
## Command Line Interface
The package provides a command line tool `cakelens` for easy video detection:
### Basic Usage
```bash
# Using Hugging Face Hub (recommended)
cakelens video.mp4
# Using local model file
cakelens video.mp4 --model-path model.pt
```
### Options
- `--model-path`: Path to the model checkpoint file (optional - will load from Hugging Face Hub if not provided)
- `--batch-size`: Batch size for inference (default: 1)
- `--device`: Device to run inference on (`cpu`, `cuda`, `mps`) - auto-detected if not specified
- `--verbose, -v`: Enable verbose logging
- `--output`: Output file path for results (JSON format)
### Examples
```bash
# Basic detection (uses Hugging Face Hub)
cakelens video.mp4
# Using local model file
cakelens video.mp4 --model-path model.pt
# With custom batch size and device
cakelens video.mp4 --batch-size 4 --device cuda
# Save results to JSON file
cakelens video.mp4 --output results.json
# Verbose output
cakelens video.mp4 --verbose
```
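If you want to drive the CLI from a Python script, one option is to assemble the argument list and hand it to `subprocess`. The sketch below uses only the flags documented under Options; `build_command` and `run_cakelens` are hypothetical helper names, not something the package provides.

```python
import subprocess


def build_command(video, model_path=None, batch_size=None, device=None,
                  output=None, verbose=False):
    """Hypothetical helper: build a `cakelens` argument list from the documented flags."""
    cmd = ["cakelens", str(video)]
    if model_path is not None:
        cmd += ["--model-path", str(model_path)]
    if batch_size is not None:
        cmd += ["--batch-size", str(batch_size)]
    if device is not None:
        cmd += ["--device", device]
    if output is not None:
        cmd += ["--output", str(output)]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_cakelens(video, **options):
    """Run the CLI; raises subprocess.CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(video, **options), check=True)
```

For example, `run_cakelens("video.mp4", batch_size=4, device="cuda")` would run the same command as the "custom batch size and device" example above.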
### Output
The tool provides:
- Real-time prediction percentages for each label
- Final mean predictions across all frames
- Option to save results in JSON format
- Detailed logging (with `--verbose` flag)
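To make "final mean predictions across all frames" concrete, here is a minimal sketch that averages a list of per-frame probability vectors label by label. `mean_predictions` is a hypothetical helper written for illustration; the tool's own aggregation (for example, how it handles batches) may differ.

```python
def mean_predictions(per_frame):
    """Average per-frame probability vectors, one mean per label."""
    if not per_frame:
        raise ValueError("need at least one prediction vector")
    n_labels = len(per_frame[0])
    if any(len(p) != n_labels for p in per_frame):
        raise ValueError("all prediction vectors must have the same length")
    # For each label index, average that label's probability over all frames
    return [sum(p[i] for p in per_frame) / len(per_frame) for i in range(n_labels)]
```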
## Programmatic Usage
You can also use the detection functionality programmatically in your Python code:
### Basic Detection
```python
import pathlib

from cakelens.detect import Detector
from cakelens.model import Model

# Create the model and load its weights from Hugging Face Hub
model = Model()
model.load_from_huggingface_hub()
# Or, if you have a local model file (requires `import torch`):
# model.load_state_dict(torch.load("model.pt")["model_state_dict"])

# Create the detector
detector = Detector(
    model=model,
    batch_size=1,
    device="cpu",  # or "cuda", "mps", or None for auto-detection
)

# Run detection
video_path = pathlib.Path("video.mp4")
verdict = detector.detect(video_path)

# Access results
print(f"Video: {verdict.video_filepath}")
print(f"Frame count: {verdict.frame_count}")
print("Predictions:")
for i, prob in enumerate(verdict.predictions):
    print(f"  Label {i}: {prob * 100:.2f}%")
```
## Labels
The model can detect the following labels:
- **AI_GEN**: Is the video AI-generated or not?
- **ANIME_2D**: Is the video in 2D anime style?
- **ANIME_3D**: Is the video in 3D anime style?
- **VIDEO_GAME**: Does the video look like a video game?
- **KLING**: Is the video generated by Kling?
- **HIGGSFIELD**: Is the video generated by Higgsfield?
- **WAN**: Is the video generated by Wan?
- **MIDJOURNEY**: Is the video generated using images from Midjourney?
- **HAILUO**: Is the video generated by Hailuo?
- **RAY**: Is the video generated by Ray?
- **VEO**: Is the video generated by Veo?
- **RUNWAY**: Is the video generated by Runway?
- **SORA**: Is the video generated by Sora?
- **CHATGPT**: Is the video generated using images from ChatGPT?
- **PIKA**: Is the video generated by Pika?
- **HUNYUAN**: Is the video generated by Hunyuan?
- **VIDU**: Is the video generated by Vidu?
> **Note**: The **AI_GEN** label is the most accurate as it has the most training data. Other labels have limited training data and may be less accurate.
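The Basic Detection example prints predictions by index only. A small sketch like the one below could pair them with label names and pick out the labels above a threshold. `name_predictions` and `top_labels` are hypothetical helpers, and the sketch assumes you pass the label names in the same order the model emits its predictions. That ordering is an assumption and should be checked against the package.

```python
def name_predictions(labels, predictions):
    """Pair label names with probabilities; the two sequences must align."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    return {label: float(prob) for label, prob in zip(labels, predictions)}


def top_labels(named, threshold=0.5):
    """Return (label, probability) pairs at or above threshold, highest first."""
    hits = [(label, prob) for label, prob in named.items() if prob >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Illustrative values only, not real model output
named = name_predictions(["AI_GEN", "VIDEO_GAME"], [0.91, 0.12])
print(top_labels(named))
```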
## Accuracy
The PR curve of the model is shown below:
At a threshold of 0.5, the model has a precision of 0.77 and a recall of 0.74.
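For readers unfamiliar with these metrics, the sketch below shows how precision and recall at a given threshold are conventionally computed from scores and ground-truth labels, along with the F1 score that combines them. The helper names and the toy inputs are made up for illustration; this is not the project's evaluation code.

```python
def precision_recall(scores, truths, threshold=0.5):
    """Compute (precision, recall), treating scores >= threshold as positive."""
    tp = sum(1 for s, t in zip(scores, truths) if s >= threshold and t)
    fp = sum(1 for s, t in zip(scores, truths) if s >= threshold and not t)
    fn = sum(1 for s, t in zip(scores, truths) if s < threshold and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Toy data (made up): three predicted positives, one of them wrong, one miss
print(precision_recall([0.9, 0.8, 0.3, 0.6], [True, False, True, True]))
# F1 implied by the reported precision and recall
print(round(f1(0.77, 0.74), 3))
```

Plugging in the reported precision of 0.77 and recall of 0.74 gives an F1 score of roughly 0.755.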
The dataset size is 5,093 videos for training and 498 videos for validation.
Please note that the model is not perfect and may make mistakes.