https://github.com/launchplatform/cakelens-v5

Open-source AI-gen video detection model

Host: GitHub
URL: https://github.com/launchplatform/cakelens-v5
Owner: LaunchPlatform
License: mit
Created: 2025-07-23T00:27:51.000Z (11 months ago)
Default Branch: master
Last Pushed: 2025-07-30T06:49:41.000Z (11 months ago)
Last Synced: 2025-07-30T07:25:15.271Z (11 months ago)
Language: Python
Size: 8.21 MB
Stars: 2
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# cakelens-v5 [![CircleCI](https://dl.circleci.com/status-badge/img/gh/LaunchPlatform/cakelens-v5/tree/master.svg?style=svg)](https://dl.circleci.com/status-badge/redirect/gh/LaunchPlatform/cakelens-v5/tree/master)

Open-source AI-gen video detection model

Please see the [blog post](https://fangpenlin.com/posts/2025/07/30/open-source-cakelens-v5/) for more details.

You can find the model weights in our [Hugging Face Hub repository](https://huggingface.co/fangpenlin/cakelens-v5).

## Installation

Install the package with its dependencies:

```bash
pip install cakelens-v5
```

## Command Line Interface

The package provides a command line tool `cakelens` for easy video detection:

### Basic Usage

```bash
# Using Hugging Face Hub (recommended)
cakelens video.mp4

# Using local model file
cakelens video.mp4 --model-path model.pt
```

### Options

- `--model-path`: Path to the model checkpoint file (optional - will load from Hugging Face Hub if not provided)

- `--batch-size`: Batch size for inference (default: 1)

- `--device`: Device to run inference on (`cpu`, `cuda`, `mps`) - auto-detected if not specified

- `--verbose, -v`: Enable verbose logging

- `--output`: Output file path for results (JSON format)
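
The auto-detection behind `--device` can be pictured as a simple preference order. The sketch below is an assumption about how such a fallback could work (CUDA first, then MPS, then CPU), not the package's actual logic; `pick_device` is a hypothetical helper that is not part of `cakelens`.

```python
def pick_device(requested=None):
    """Return the requested device, or guess one when none is given.

    Hypothetical helper: the preference order (cuda, mps, cpu) is an
    assumption, not taken from the cakelens source.
    """
    if requested is not None:
        return requested
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to probe.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())        # one of "cuda", "mps", "cpu"
print(pick_device("cpu"))   # an explicit choice is returned unchanged
```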

### Examples

```bash
# Basic detection (uses Hugging Face Hub)
cakelens video.mp4

# Using local model file
cakelens video.mp4 --model-path model.pt

# With custom batch size and device
cakelens video.mp4 --batch-size 4 --device cuda

# Save results to JSON file
cakelens video.mp4 --output results.json

# Verbose output
cakelens video.mp4 --verbose
```
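
To call the CLI from a script, the flags above can be assembled into an argument list. This is a minimal sketch: `build_cakelens_command` and `run_cakelens` are hypothetical helpers, and only the flags documented in this README are used.

```python
import subprocess


def build_cakelens_command(video, model_path=None, batch_size=None,
                           device=None, output=None, verbose=False):
    """Build the argv list for the `cakelens` CLI (hypothetical helper)."""
    cmd = ["cakelens", str(video)]
    if model_path is not None:
        cmd += ["--model-path", str(model_path)]
    if batch_size is not None:
        cmd += ["--batch-size", str(batch_size)]
    if device is not None:
        cmd += ["--device", device]
    if output is not None:
        cmd += ["--output", str(output)]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_cakelens(video, **options):
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_cakelens_command(video, **options), check=True)


print(build_cakelens_command("video.mp4", batch_size=4, device="cuda"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.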

### Output

The tool provides:

- Real-time prediction percentages for each label

- Final mean predictions across all frames

- Option to save results in JSON format

- Detailed logging (with `--verbose` flag)
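
As an illustration of the "final mean predictions" item, the sketch below averages per-batch label probabilities into one value per label. It assumes a plain unweighted mean; whether the tool weights batches differently is not stated here, and `mean_predictions` is a hypothetical helper.

```python
def mean_predictions(per_batch):
    """Average a list of per-batch probability rows, label by label."""
    if not per_batch:
        raise ValueError("need at least one row of predictions")
    n_labels = len(per_batch[0])
    totals = [0.0] * n_labels
    for row in per_batch:
        if len(row) != n_labels:
            raise ValueError("all rows must have the same number of labels")
        for i, prob in enumerate(row):
            totals[i] += prob
    return [total / len(per_batch) for total in totals]


# Two batches, two labels each (illustrative numbers only)
print(mean_predictions([[0.25, 0.5], [0.75, 1.0]]))
```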

## Programmatic Usage

You can also use the detection functionality programmatically in your Python code:

### Basic Detection

```python
import pathlib

from cakelens.detect import Detector
from cakelens.model import Model

# Create the model and load its weights from Hugging Face Hub
model = Model()
model.load_from_huggingface_hub()

# Or, if you have a local model file (requires `import torch`):
# model.load_state_dict(torch.load("model.pt")["model_state_dict"])

# Create detector
detector = Detector(
    model=model,
    batch_size=1,
    device="cpu",  # or "cuda", "mps", or None for auto-detection
)

# Run detection
video_path = pathlib.Path("video.mp4")
verdict = detector.detect(video_path)

# Access results
print(f"Video: {verdict.video_filepath}")
print(f"Frame count: {verdict.frame_count}")
print("Predictions:")
for i, prob in enumerate(verdict.predictions):
    print(f"  Label {i}: {prob * 100:.2f}%")
```
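
If you want to store a verdict yourself, the three attributes used above can be copied into a plain dictionary. This is a sketch under assumptions: `verdict_to_dict` is a hypothetical helper, the dictionary layout is not the format written by `--output`, and the `SimpleNamespace` object only stands in for a real verdict so the example runs without the model.

```python
import json
from types import SimpleNamespace


def verdict_to_dict(verdict, label_names=None):
    """Convert a verdict-like object into a JSON-serializable dict.

    `label_names` is supplied by the caller; this sketch does not assume
    any particular index-to-label order.
    """
    predictions = [float(p) for p in verdict.predictions]
    if label_names is None:
        label_names = [f"label_{i}" for i in range(len(predictions))]
    if len(label_names) != len(predictions):
        raise ValueError("label_names must match the number of predictions")
    return {
        "video": str(verdict.video_filepath),
        "frame_count": int(verdict.frame_count),
        "predictions": dict(zip(label_names, predictions)),
    }


# Stand-in for a real verdict, with made-up values
example = SimpleNamespace(
    video_filepath="video.mp4", frame_count=120, predictions=[0.5, 0.25]
)
print(json.dumps(verdict_to_dict(example), indent=2))
```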

## Labels

The model can detect the following labels:

- **AI_GEN**: Is the video AI-generated or not?

- **ANIME_2D**: Is the video in 2D anime style?

- **ANIME_3D**: Is the video in 3D anime style?

- **VIDEO_GAME**: Does the video look like a video game?

- **KLING**: Is the video generated by Kling?

- **HIGGSFIELD**: Is the video generated by Higgsfield?

- **WAN**: Is the video generated by Wan?

- **MIDJOURNEY**: Is the video generated using images from Midjourney?

- **HAILUO**: Is the video generated by Hailuo?

- **RAY**: Is the video generated by Ray?

- **VEO**: Is the video generated by Veo?

- **RUNWAY**: Is the video generated by Runway?

- **SORA**: Is the video generated by Sora?

- **CHATGPT**: Is the video generated using images from ChatGPT?

- **PIKA**: Is the video generated by Pika?

- **HUNYUAN**: Is the video generated by Hunyuan?

- **VIDU**: Is the video generated by Vidu?

> **Note**: The **AI_GEN** label is the most accurate as it has the most training data. Other labels have limited training data and may be less accurate.
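
Because the non-`AI_GEN` labels are less reliable, one option is to require a higher probability before reporting them. The sketch below shows that idea; `flag_labels` is a hypothetical helper and the threshold values are illustrative, not tuned on any data.

```python
def flag_labels(predictions, default_threshold=0.5, overrides=None):
    """Return the label names whose probability meets their threshold.

    `predictions` maps label name to probability; `overrides` maps label
    name to a label-specific threshold.
    """
    overrides = overrides or {}
    return sorted(
        name
        for name, prob in predictions.items()
        if prob >= overrides.get(name, default_threshold)
    )


# Made-up probabilities for three of the labels listed above
scores = {"AI_GEN": 0.82, "KLING": 0.55, "SORA": 0.10}
stricter = {"KLING": 0.8, "SORA": 0.8}
print(flag_labels(scores))                      # default threshold for all
print(flag_labels(scores, overrides=stricter))  # stricter for generator labels
```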

## Accuracy

The PR curve of the model is shown below:

[Figure: precision-recall (PR) curve of the model]

At threshold 0.5, the model has a precision of 0.77 and a recall of 0.74.

The dataset size is 5,093 videos for training and 498 videos for validation.
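
For readers who want to reproduce this kind of measurement on their own labeled videos, the sketch below computes precision and recall at a fixed threshold. It assumes a score at or above the threshold counts as a positive prediction; the function names and sample data are made up. The F1 value at the end is derived from the two reported numbers and is not a figure published by the authors.

```python
def precision_recall(y_true, scores, threshold=0.5):
    """Compute (precision, recall) for binary labels at a threshold."""
    tp = fp = fn = 0
    for truth, score in zip(y_true, scores):
        predicted = score >= threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif truth:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Made-up ground truth (1 = AI-generated) and model scores
y_true = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.1, 0.8]
print(precision_recall(y_true, scores))

# F1 implied by the reported precision 0.77 and recall 0.74
print(round(f1_score(0.77, 0.74), 3))
```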

Please note that the model is not perfect and may make mistakes.
