{"id":13608510,"url":"https://github.com/wzh99/RealGes","last_synced_at":"2025-04-12T17:31:55.343Z","repository":{"id":69775403,"uuid":"217848539","full_name":"wzh99/RealGes","owner":"wzh99","description":"SJTU CS386 Project: Real-time Dynamic Hand Gesture Recognition","archived":false,"fork":false,"pushed_at":"2019-12-27T05:50:20.000Z","size":8879,"stargazers_count":8,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-11-07T14:39:57.502Z","etag":null,"topics":["computer-vision","convolutional-neural-networks","deep-learning","gesture-recognition","keras-tensorflow","real-time"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wzh99.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2019-10-27T12:00:36.000Z","updated_at":"2024-07-29T07:09:40.000Z","dependencies_parsed_at":null,"dependency_job_id":"5c3c525e-4ae6-413a-8415-7ff82a1135a0","html_url":"https://github.com/wzh99/RealGes","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wzh99%2FRealGes","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wzh99%2FRealGes/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wzh99%2FRealGes/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wzh99%2FRealGes/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wzh99","download_url":"https://codeload.github.
com/wzh99/RealGes/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248605142,"owners_count":21132118,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","convolutional-neural-networks","deep-learning","gesture-recognition","keras-tensorflow","real-time"],"created_at":"2024-08-01T19:01:27.838Z","updated_at":"2025-04-12T17:31:50.332Z","avatar_url":"https://github.com/wzh99.png","language":"Python","funding_links":[],"categories":["Resource list"],"sub_categories":["CS3324 (formerly CS386) - Digital Image Processing"],"readme":"# Vision-Based Real-Time Dynamic Hand Gesture Recognition System\n\n## Introduction\n\nThis is the course project for Digital Image Processing (CS386) at Shanghai Jiao Tong University, completed by [王梓涵](https://github.com/wzh99).\n\nThe project builds a complete vision-based system for real-time dynamic hand gesture recognition, covering data capture, model training and testing, and real-time prediction.\n\n## Setup\n\n### Environment\n\n* Python 3.7\n\n### Dependencies\n\n* Tensorflow 2.0.0\n* Keras 2.3.1\n* OpenCV 4.1.0\n* pyrealsense 2.29.0\n* NumPy 1.16.2\n* scikit-learn 0.21.3\n* Matplotlib 3.0.3\n* Pandas 0.25.3\n* h5py 2.10.0\n\nInstall all dependencies with `pip install -r requirements.txt`.\n\nTo use `tf.keras`, replace every `import keras` with `from tensorflow import keras`.\n\n## Running\n\nThe code follows a modular, object-oriented design, and each main program is very short. The recommended workflow is to edit the parameters directly in an IDE and run the script; passing parameters on the command line is not supported yet.\n\n## Dataset\n\n### Capture\n\nSet the dataset location in [capture.py](capture.py):\n\n```python\nrec = Recorder(path=\"train_data\")\n```\n\nRun the script to capture gesture samples. They are stored under the given directory as `${gesture class name}/${timestamp}/${channel identifier}${frame number}.jpg`, so the quality of each sample can be inspected directly. A gesture sample rendered as GIFs:\n\nDepth map:\n\n![](doc/depth.gif)\n\nGradient map:\n\n![](doc/grad.gif)\n\n### Packaging\n\nCaptured samples need some preprocessing and packaging so that data loads faster during training. Set the raw dataset directory and the output HDF5 file location in [load.py](load.py):\n\n```python\nx, y = from_directory(\"test_data\")\n_store_as_hdf5(\"test_data.h5\", x, 
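y)\n```\n\nRun the script to package the data.\n\n## Classification Models\n\n### Model Definition\n\n[model.py](model.py) defines the classification models; you can define your own as needed. This project defines three models: HRN and LRN from [Molchanov et al.](https://research.nvidia.com/publication/hand-gesture-recognition-3d-convolutional-neural-networks), and C3D from [Tran et al.](https://arxiv.org/abs/1412.0767). After adding a definition, register its type name and the location where the model weights are stored in the `network_spec` dictionary. For example, for HRN:\n\n```python\n\"hrn\": {\n    \"init\": HRN,\n    \"path\": \"weights/hrn.h5\"\n},\n```\n\n### Training\n\n[train.py](train.py) defines the training program. It takes the model's key in `model.network_spec`, the location of the packaged data, and the number of training epochs:\n\n```python\ntrainer = Trainer(\"hrn\", \"train_data.h5\")\ntrainer.train(200)\n```\n\nRun the script to train. To interrupt training, press Enter; training stops once the current epoch finishes. The training history (loss, accuracy, and so on) is stored in `log/${model name}.csv` for later analysis, for example plotting with RStudio:\n\n![](doc/loss.png)\n\n![](doc/accuracy.png)\n\n### Testing\n\n[test.py](test.py) tests a trained model; capturing separate test data is recommended. `ModelTester` can combine the results of two models. To test a single model, pass the same model name twice.\n\n```python\ndata_x, data_y = data.from_hdf5(\"test_data.h5\")\ntester = ModelTester(\"hrn\", \"lrn\")\ntester.test(data_x, data_y)\n```\n\nWhen testing finishes, the accuracy, running time, and confusion matrix are printed. The confusion matrix of HRN+LRN from the experiment:\n\n![](doc/hrn\u0026lrn.png)\n\n### Visualization\n\n[visual.py](visual.py) visualizes the convolutional layer outputs of a trained model. `Visualizer` produces the visualization for a single convolutional layer and a specific sample (the first by default) of a single gesture class. Results are stored as `visual/${model name}/${gesture name}/l${convolutional layer index}_f${frame index}.jpg`.\n\n```python\nvis = Visualizer(\"lrn\")\nfor layer_idx in range(3):\n    vis.visualize(layer_idx, \"train_data\", 14)\n```\n\nThe code above visualizes each of the three convolutional layers of LRN for the first sample of the “OK” gesture (gesture 14). Each image contains the outputs of all kernels of that layer for one frame. The output images rendered as GIFs:\n\nLayer 0:\n\n![](doc/layer0.gif)\n\nLayer 1:\n\n![](doc/layer1.gif)\n\nLayer 2:\n\n![](doc/layer2.gif)\n\n## Real-Time Prediction\n\nReal-time gesture prediction and a sample application are in [recog.py](recog.py). Callbacks are used so that the capture program can be reused and the prediction program stays separate from gesture applications. A gesture application developer only needs to subclass `GestureApp`, implement the `on_gesture` method, and pass the application object in through the `Recorder` constructor. This project implements a simple image viewer, `ImageViewer`, as an example `GestureApp`.\n\n```python\nm1, m2 = model.load_two_models(\"hrn\", \"lrn\")\nviewer = ImageViewer(\"demo\")\nrec = Recorder(callback=lambda seq: recogize_sample(m1, m2, seq, 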
viewer))\nviewer.start()\nrec.record()\n```\n\nRun this program to try it out.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwzh99%2FRealGes","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwzh99%2FRealGes","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwzh99%2FRealGes/lists"}