https://github.com/kazuhito00/mediapipe-python-sample

Sample scripts for the MediaPipe Python package. Samples are provided for the 15 features that had a Python implementation as of September 1, 2024.

Host: GitHub
URL: https://github.com/kazuhito00/mediapipe-python-sample
Owner: Kazuhito00
License: apache-2.0
Created: 2020-12-08T13:43:05.000Z (almost 5 years ago)
Default Branch: main
Last Pushed: 2024-09-02T11:30:47.000Z (about 1 year ago)
Last Synced: 2025-08-28T10:46:02.514Z (about 1 month ago)
Topics: face-detection, face-mesh, facemesh, hands, holistic, mediapipe, mediapipe-python-sample, opencv, pose, python, selfie-segmentation
Language: Python
Homepage:
Size: 6.2 MB
Stars: 321
Watchers: 6
Forks: 91
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

> [!IMPORTANT]
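> Support for MediaPipe legacy solutions ended on March 1, 2023.

> Samples for the legacy solutions have been moved to the [_legacy](_legacy) directory.

> MediaPipe remains backward compatible, so the legacy solution samples can still be run with the current package.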

# mediapipe-python-sample
A collection of sample scripts for the Python package of [google-ai-edge/mediapipe](https://github.com/google-ai-edge/mediapipe).

Samples are provided for the following 15 features, which had a Python implementation as of September 1, 2024.
* [Object Detection](https://ai.google.dev/edge/mediapipe/solutions/vision/object_detector?hl=ja)
* [Image Classification](https://ai.google.dev/mediapipe/solutions/vision/image_classifier?hl=ja)
* [Image Segmentation](https://ai.google.dev/mediapipe/solutions/vision/image_segmenter?hl=ja)
* [Interactive Segmentation](https://ai.google.dev/mediapipe/solutions/vision/interactive_segmenter?hl=ja)
* [Hand Landmark Detection](https://ai.google.dev/mediapipe/solutions/vision/hand_landmarker?hl=ja)
* [Gesture Recognition](https://ai.google.dev/mediapipe/solutions/vision/gesture_recognizer?hl=ja)
* [Image Embedding](https://ai.google.dev/mediapipe/solutions/vision/image_embedder?hl=ja)
* [Face Detection](https://ai.google.dev/mediapipe/solutions/vision/face_detector?hl=ja)
* [Face Landmark Detection](https://ai.google.dev/mediapipe/solutions/vision/face_landmarker?hl=ja)
* [Face Stylization](https://ai.google.dev/mediapipe/solutions/vision/face_stylizer?hl=ja)
* [Pose Landmark Detection](https://ai.google.dev/mediapipe/solutions/vision/pose_landmarker?hl=ja)
* [Text Classification](https://ai.google.dev/mediapipe/solutions/text/text_classifier?hl=ja)
* [Text Embedding](https://ai.google.dev/mediapipe/solutions/text/text_embedder?hl=ja)
* [Language Detector](https://ai.google.dev/mediapipe/solutions/text/language_detector?hl=ja)
* [Audio Classification](https://ai.google.dev/mediapipe/solutions/audio/audio_classifier?hl=ja)

# Requirement
* mediapipe 0.10.14 or later
* opencv-python 4.10.0.84 or later
* tqdm 4.66.5 or later (used to download weight files)
* requests 2.32.3 or later (used to download weight files)
* scipy 1.14.1 or later (only needed to run the Audio Classification sample)
* numpy 1.26.4 (NumPy must be a 1.x release)

```
pip install -r requirements.txt
```

# Demo
Run the demos as follows.

### Object Detection
```bash
python sample_object_detection.py
```

Command-line options

* --device

  Camera device number

  Default: 0
* --video

  Path to a video file. When specified, it takes priority over the camera.

  Default: None
* --width

  Capture width for the camera

  Default: 960
* --height

  Capture height for the camera

  Default: 540
* --model

  Model to use [0, 1, 2, 3, 4, 5, 6, 7]. If the selected model's weights are not in the model directory, they are downloaded.

  The weights were trained on the [COCO dataset](https://cocodataset.org/#home); the supported labels are listed in [labelmap.txt](https://storage.googleapis.com/mediapipe-tasks/object_detector/labelmap.txt).

  Default: 0

  * 0:EfficientDet-Lite0(int8)
  * 1:EfficientDet-Lite0(float 16)
  * 2:EfficientDet-Lite0(float 32)
  * 3:EfficientDet-Lite2(int8)
  * 4:EfficientDet-Lite2(float 16)
  * 5:EfficientDet-Lite2(float 32)
  * 6:SSDMobileNet-V2(int8)
  * 7:SSDMobileNet-V2(float 32)
* --score_threshold

  Score threshold

  Default: 0.5
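
The options above map naturally onto `argparse`. The following is a minimal sketch of such a parser, not the repository's actual code: `build_parser` and `select_source` are hypothetical helper names, and only the option names and defaults come from the list above. It also shows one way to give `--video` priority over `--device`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the documented options and defaults (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--device", type=int, default=0)
    parser.add_argument("--video", type=str, default=None)
    parser.add_argument("--width", type=int, default=960)
    parser.add_argument("--height", type=int, default=540)
    # Eight models are listed for object detection, so accept 0 through 7.
    parser.add_argument("--model", type=int, default=0, choices=range(8))
    parser.add_argument("--score_threshold", type=float, default=0.5)
    return parser


def select_source(args: argparse.Namespace):
    """Return the video path when one is given, otherwise the camera device number."""
    return args.video if args.video is not None else args.device


# Example: a video path wins over the camera device.
args = build_parser().parse_args(["--video", "sample.mp4", "--device", "1"])
print(select_source(args))  # sample.mp4
```

The returned value could be passed to `cv2.VideoCapture`, which accepts either a device index or a file path.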

### Image Classification
```bash
python sample_image_classification.py
```

Command-line options

* --device

  Camera device number

  Default: 0
* --video

  Path to a video file. When specified, it takes priority over the camera.

  Default: None
* --width

  Capture width for the camera

  Default: 960
* --height

  Capture height for the camera

  Default: 540
* --model

  Model to use [0, 1, 2, 3]. If the selected model's weights are not in the model directory, they are downloaded.

  The weights were trained on [ImageNet](https://www.image-net.org/); the supported labels are listed in [labels.txt](https://storage.googleapis.com/mediapipe-tasks/image_classifier/labels.txt).

  Default: 0

  * 0:EfficientNet-Lite0(int8)
  * 1:EfficientNet-Lite0(float 32)
  * 2:EfficientNet-Lite2(int8)
  * 3:EfficientNet-Lite2(float 32)
* --max_results

  Number of results to output

  Default: 5
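
To illustrate what `--max_results` controls, here is a small sketch that keeps only the highest-scoring classification results. The `top_results` helper and the example labels and scores are hypothetical. MediaPipe's classifier options also accept a `max_results` setting, so the sample may simply pass the value through rather than sort results itself.

```python
def top_results(categories, max_results=5):
    """Return the max_results highest-scoring (label, score) pairs."""
    ranked = sorted(categories, key=lambda item: item[1], reverse=True)
    return ranked[:max_results]


# Made-up scores, for illustration only.
scores = [("tabby", 0.62), ("hedgehog", 0.91), ("shoe", 0.08)]
print(top_results(scores, max_results=2))  # [('hedgehog', 0.91), ('tabby', 0.62)]
```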

### Image Segmentation
```bash
python sample_image_segmentation.py
```

Command-line options

* --device

  Camera device number

  Default: 0
* --video

  Path to a video file. When specified, it takes priority over the camera.

  Default: None
* --width

  Capture width for the camera

  Default: 960
* --height

  Capture height for the camera

  Default: 540
* --model

  Model to use [0, 1, 2, 3, 4]. If the selected model's weights are not in the model directory, they are downloaded.

  Default: 0

  * 0:SelfieSegmenter(square)
  * 1:SelfieSegmenter(landscape)
  * 2:HairSegmenter
  * 3:SelfieMulticlass(256x256)
  * 4:DeepLab-V3
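
Every `--model` option notes that missing weights are downloaded into the model directory. A minimal sketch of that download-if-missing pattern follows. It uses the standard library's `urllib` instead of the `requests` and `tqdm` packages listed under Requirement, and `ensure_model` is a hypothetical helper, so treat it as an illustration rather than the repository's implementation.

```python
import os
import urllib.request


def ensure_model(url, model_dir="model", fetch=urllib.request.urlretrieve):
    """Download url into model_dir unless the file already exists; return its path.

    fetch is injectable so the logic can be exercised without network access.
    """
    os.makedirs(model_dir, exist_ok=True)
    path = os.path.join(model_dir, os.path.basename(url))
    if not os.path.exists(path):
        # Only hit the network when the weight file is not cached locally.
        fetch(url, path)
    return path
```

Calling `ensure_model` twice with the same URL downloads the file once and then reuses the cached copy.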

### Interactive Segmentation
```bash
python sample_interactive_image_segmentation.py
```

Command-line options

* --image

  Path to the input image

  Default: asset/hedgehog01.jpg
* --model

  Model to use [0]. If the selected model's weights are not in the model directory, they are downloaded.

  Default: 0

  * 0:MagicTouch

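### Hand Landmark Detection
```bash
python sample_hand_landmarks_detection.py
```

Command-line options

* --device

  Camera device number

  Default: 0
* --video

  Path to a video file. When specified, it takes priority over the camera.

  Default: None
* --width

  Capture width for the camera

  Default: 960
* --height

  Capture height for the camera

  Default: 540
* --unuse_mirror

  Disable mirrored display

  Default: not set
* --model

  Model to use [0]. If the selected model's weights are not in the model directory, they are downloaded.

  Default: 0

  * 0:HandLandmarker (full)
* --num_hands

  Number of hands to detect

  Default: 2
* --use_world_landmark

  Display world coordinates

  Default: not set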
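
MediaPipe reports image landmarks with `x` and `y` normalized by the image width and height, so drawing them on a frame requires scaling to pixel coordinates. The sketch below shows that conversion. `to_pixel` is a hypothetical helper, and the clamping is an assumption added because a landmark can fall slightly outside the frame.

```python
def to_pixel(x_norm, y_norm, width, height):
    """Convert a normalized landmark position to integer pixel coordinates.

    Values are clamped to the frame because a landmark may lie slightly
    outside the [0, 1] range.
    """
    px = max(0, min(int(x_norm * width), width - 1))
    py = max(0, min(int(y_norm * height), height - 1))
    return px, py


# With the default 960x540 capture size, the frame centre maps to (480, 270).
print(to_pixel(0.5, 0.5, 960, 540))  # (480, 270)
```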

### Gesture Recognition
```bash
python sample_hand_gesture_recognition.py
```

Command-line options

* --device

  Camera device number

  Default: 0
* --video

  Path to a video file. When specified, it takes priority over the camera.

  Default: None
* --width

  Capture width for the camera

  Default: 960
* --height

  Capture height for the camera

  Default: 540
* --unuse_mirror

  Disable mirrored display

  Default: not set
* --model

  Model to use [0]. If the selected model's weights are not in the model directory, they are downloaded.

  The recognized gestures are "Closed fist", "Open palm", "Pointing up", "Thumbs down", "Thumbs up", "Victory", "Love", and "Unknown".

  Default: 0

  * 0:HandGestureClassifier

### Image Embedding
```bash
python sample_image_embedding.py
```

Command-line options

* --image01

  Path to image 1

  Default: asset/hedgehog01.jpg
* --image02

  Path to image 2

  Default: asset/hedgehog02.jpg
* --model

  Model to use [0, 1]. If the selected model's weights are not in the model directory, they are downloaded.

  Default: 0

  * 0:MobileNet-V3 (small)
  * 1:MobileNet-V3 (large)
* --unuse_l2_normalize

  Do not normalize the feature vector by its L2 norm

  Default: not set
* --unuse_quantize

  Do not quantize the feature vector to bytes via scalar quantization

  Default: not set
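
The sample takes two images, which suggests their embeddings are compared. A common way to compare feature vectors is cosine similarity, which reduces to a dot product once both vectors are L2-normalized. The following pure-Python sketch illustrates the idea. `l2_normalize` and `cosine_similarity` are hypothetical helpers, and whether the sample computes similarity exactly this way is an assumption.

```python
import math


def l2_normalize(vec):
    """Scale vec to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity, computed as the dot product of the normalized vectors."""
    a_unit, b_unit = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_unit, b_unit))


print(l2_normalize([3.0, 4.0]))                    # [0.6, 0.8]
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # 0.0
```

A value near 1 means the two embeddings point in nearly the same direction, and a value near 0 means they are unrelated.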

### Face Detection
```bash
python sample_face_detection.py
```

### Face Landmark Detection
```bash
python sample_face_landmark_detection.py
```

Command-line options

* --device

  Camera device number

  Default: 0
* --video

  Path to a video file. When specified, it takes priority over the camera.

  Default: None
* --width

  Capture width for the camera

  Default: 960
* --height

  Capture height for the camera

  Default: 540
* --model

  Model to use [0]. If the selected model's weights are not in the model directory, they are downloaded.

  Default: 0

  * 0:FaceLandmarker
* --num_faces

  Number of faces to detect

  Default: 1
* --unuse_output_face_blendshapes

  Do not output face blendshapes

  Default: not set
* --unuse_output_facial_transformation_matrixes

  Do not output facial transformation matrices

  Default: not set

### Face Stylization
```bash
python sample_face_stylization.py
```

### Pose Landmark Detection
```bash
python sample_pose_landmark_detection.py
```

Command-line options

* --device

  Camera device number

  Default: 0
* --video

  Path to a video file. When specified, it takes priority over the camera.

  Default: None
* --width

  Capture width for the camera

  Default: 960
* --height

  Capture height for the camera

  Default: 540
* --unuse_mirror

  Disable mirrored display

  Default: not set
* --model

  Model to use [0, 1, 2]. If the selected model's weights are not in the model directory, they are downloaded.

  Default: 0

  * 0:Pose landmarker(lite)
  * 1:Pose landmarker(Full)
  * 2:Pose landmarker(Heavy)
* --use_output_segmentation_masks

  Output segmentation masks

  Default: not set
* --use_world_landmark

  Display world coordinates

  Default: not set

### Text Classification
```bash
python sample_text_classification.py
```

Command-line options

* --input_text

  Input text

  Default: I'm looking forward to what will come next.
* --model

  Model to use [0, 1]. If the selected model's weights are not in the model directory, they are downloaded.

  Default: 0

  * 0:BERT-classifier
  * 1:Average word embedding

### Text Embedding
```bash
python sample_text_embedding.py
```

Command-line options

* --input_text01

  Input text 1

  Default: I'm feeling so good
* --input_text02

  Input text 2

  Default: I'm okay I guess
* --model

  Model to use [0]. If the selected model's weights are not in the model directory, they are downloaded.

  Default: 0

  * 0:Universal Sentence Encoder
* --unuse_l2_normalize

  Do not normalize the feature vector by its L2 norm

  Default: not set
* --use_quantize

  Quantize the feature vector to bytes via scalar quantization

  Default: not set
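
Scalar quantization stores each component of a feature vector as a single byte instead of a float. The sketch below shows one simple scheme for unit-norm vectors, whose components all lie in [-1, 1]. The scale factor of 127 and the helper names `quantize` and `dequantize` are assumptions chosen for illustration and are not necessarily the formula MediaPipe uses internally.

```python
def quantize(vec, scale=127):
    """Map floats in [-1, 1] to signed 8-bit integers (assumed scheme)."""
    return [max(-128, min(127, round(v * scale))) for v in vec]


def dequantize(quantized, scale=127):
    """Approximate inverse of quantize."""
    return [q / scale for q in quantized]


print(quantize([1.0, -1.0, 0.0]))  # [127, -127, 0]
```

The round trip loses at most half a quantization step per component, which is usually small enough that similarity rankings between embeddings are preserved.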

### Language Detector
```bash
python sample_text_language_detection.py
```

Command-line options

* --input_text

  Input text

  Default: 分久必合合久必分
* --model

  Model to use [0]. If the selected model's weights are not in the model directory, they are downloaded.

  Default: 0

  * 0:Language Detector

### Audio Classification
```bash
python sample_audio_classification.py
```

Command-line options

* --input_audio

  Path to the input audio file

  Default: asset/hyakuninisshu_02.wav
* --model

  Model to use [0]. If the selected model's weights are not in the model directory, they are downloaded.

  Default: 0

  * 0:YamNet
* --max_results

  Number of results to output

  Default: 5
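
Audio classifiers typically take samples as floats in roughly [-1, 1], so 16-bit PCM data from a WAV file has to be rescaled first. The sketch below does this with the standard library's `wave` module. The Requirement section lists `scipy` for this sample, so the repository most likely reads the file differently. `read_wav_mono16` is a hypothetical helper that only handles 16-bit mono files.

```python
import array
import sys
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono WAV file; return (sample_rate, samples as floats)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        sample_rate = wf.getframerate()
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        # WAV data is little-endian; swap bytes on big-endian machines.
        samples.byteswap()
    # Dividing by the int16 maximum maps samples to approximately [-1, 1].
    return sample_rate, [s / 32767 for s in samples]
```

The returned sample rate and float list could then be wrapped in whatever audio container the classifier expects.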

# Reference
* [google-ai-edge/mediapipe](https://github.com/google-ai-edge/mediapipe)

# Author
Kazuhito Takahashi (高橋かずひと) (https://twitter.com/KzhtTkhs)

# License
mediapipe-python-sample is under [Apache-2.0 License](LICENSE).

# License(Image, Video, Audio)
The images and other media bundled for running the samples come from the following sources.
* [ぱくたそ](https://www.pakutaso.com): [トゲトゲのサボテンとハリネズミ](https://www.pakutaso.com/20190257050post-19488.html)
* [ぱくたそ](https://www.pakutaso.com): [人間の靴にはまり込むハリネズ](https://www.pakutaso.com/20171041289post-13677.html)
* [ぱくたそ](https://www.pakutaso.com): [靴にすっぽり隠れるハリネズミ](https://www.pakutaso.com/20171039289post-13676.html)
* [NHKクリエイティブ・ライブラリー](https://www.nhk.or.jp/archives/creative/): 「[猫カフェのネコ（３）](https://www2.nhk.or.jp/archives/movies/?id=D0002161325_00000)」
* [NHKクリエイティブ・ライブラリー](https://www.nhk.or.jp/archives/creative/): 「[寅さんの像　アップ](https://www2.nhk.or.jp/archives/movies/?id=D0002022189_00000)」
* [NHKクリエイティブ・ライブラリー](https://www.nhk.or.jp/archives/creative/): 「[音声　百人一首　二](https://www2.nhk.or.jp/archives/movies/?id=D0002110102_00000)」
