{"id":19001314,"url":"https://github.com/oneflow-inc/oneflow_face","last_synced_at":"2025-04-22T17:28:34.479Z","repository":{"id":42126154,"uuid":"259540098","full_name":"Oneflow-Inc/oneflow_face","owner":"Oneflow-Inc","description":null,"archived":false,"fork":false,"pushed_at":"2022-08-10T03:21:03.000Z","size":462,"stargazers_count":12,"open_issues_count":4,"forks_count":5,"subscribers_count":49,"default_branch":"master","last_synced_at":"2025-04-17T07:17:47.898Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Oneflow-Inc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-04-28T05:27:02.000Z","updated_at":"2022-10-20T16:50:07.000Z","dependencies_parsed_at":"2022-08-20T09:40:50.719Z","dependency_job_id":null,"html_url":"https://github.com/Oneflow-Inc/oneflow_face","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Oneflow-Inc%2Foneflow_face","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Oneflow-Inc%2Foneflow_face/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Oneflow-Inc%2Foneflow_face/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Oneflow-Inc%2Foneflow_face/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Oneflow-Inc","download_url":"https://codeload.github.com/Oneflow-Inc/oneflow_face/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_coun
t":250286806,"owners_count":21405505,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-08T18:10:43.968Z","updated_at":"2025-04-22T17:28:34.459Z","avatar_url":"https://github.com/Oneflow-Inc.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# InsightFace in OneFlow\n\n[English](README.md) **|** [简体中文](README_CH.md)\n\nThis document describes how to train InsightFace models in OneFlow and how to verify the trained networks on the validation datasets.\n\n## Contents\n\n\\- [InsightFace in OneFlow](#insightface-in-oneflow)\n\n \\- [Contents](#contents)\n\n \\- [Background](#background)\n\n  \\- [InsightFace opensource project](#insightface-opensource-project)\n\n  \\- [Implementation in OneFlow](#implementation-in-oneflow)\n\n \\- [Preparations](#preparations)\n\n  \\- [Install OneFlow](#install-oneflow)\n\n  \\- [Data preparations](#data-preparations)\n\n   \\- [1. Download datasets](#1-download-datasets)\n\n   \\- [2. 
Transformation from MS1M recordio to OFRecord](#2-transformation-from-ms1m-recordio-to-ofrecord)\n\n \\- [Pretrained model](#pretrained-model)\n\n \\- [Training and verification](#training-and-verification)\n\n  \\- [Training](#training)\n\n  \\- [Verification](#verification)\n\n \\- [Benchmark](#benchmark)\n\n\n## Background\n\n### InsightFace opensource project\n\n[InsightFace](https://github.com/deepinsight/insightface) is an open-source 2D\u00263D deep face analysis toolbox, mainly based on MXNet.\n\nInsightFace supports:\n\n\n\n- Datasets typically used for face recognition, such as CASIA-Webface, MS1M, and VGG2, provided as binary files that can be read by MXNet. See the [Dataset Zoo](https://github.com/deepinsight/insightface/wiki/Dataset-Zoo) for more details about the datasets and how to download them.\n\n- Backbones such as ResNet, MobileFaceNet, InceptionResNet_v2, and other deep-learning networks for face recognition.\n\n- Implementations of different loss functions, including Softmax Loss, SphereFace Loss, etc.\n\n\n\n### Implementation in OneFlow\n\nBuilding on the existing InsightFace work, OneFlow has ported its basic models and now supports:\n\n\n\n- The MS1M and Glint360k training datasets, the Lfw, Cfp_fp, and Agedb_30 validation datasets, and scripts for training and validation.\n\n- The ResNet100 and MobileFaceNet backbones for face recognition.\n\n- Loss functions, e.g. 
Softmax Loss and Margin Softmax Loss (including ArcFace, CosFace, and Combined Loss).\n\n- Model parallelism and [Partial FC](https://github.com/deepinsight/insightface/tree/760d6de043d7f654c5963391271f215dab461547/recognition/partial_fc#partial-fc) optimization.\n\n- Model conversion via MXNet.\n\n\n\nPlanned for the future:\n\n- Conversion of additional datasets.\n\n- More backbones.\n\n- A fuller set of loss function implementations.\n\n- An additional tutorial on distributed configuration.\n\n\n\nThis project is open to pull requests from every developer; new implementations and lively discussion are most welcome.\n\n\n\n## Preparations\n\nBefore running anything, please make sure that you have:\n\n1. Installed OneFlow.\n\n2. Prepared the training and validation datasets in OFRecord format.\n\n\n\n### Install OneFlow\n\n\n\nFollow the steps in [Install OneFlow](https://github.com/Oneflow-Inc/oneflow#install-oneflow) to install the newest master whl package.\n\n```\npython3 -m pip install --find-links https://release.oneflow.info oneflow_cu102 --user\n```\n\n\n\n### Data preparations\n\nAs described in [Load and Prepare OFRecord Datasets](https://docs.oneflow.org/en/extended_topics/how_to_make_ofdataset.html), datasets must be converted to OFRecord format before they can be used to test InsightFace in OneFlow.\n\n\n\n[InsightFace](https://github.com/deepinsight/insightface) provides a set of face recognition datasets that have already been pre-processed with face alignment and other steps. These datasets can be downloaded from the [Dataset Zoo](https://github.com/deepinsight/insightface/wiki/Dataset-Zoo) and should be converted to OFRecord, which performs better in OneFlow. 
Because the conversion involves several cumbersome steps, we suggest downloading the already converted OFRecord datasets:\n\n[MS1M-ArcFace(face_emore)](http://oneflow-public.oss-cn-beijing.aliyuncs.com/face_dataset/train_ofrecord.tar.gz)\n\n[MS1MV3](https://oneflow-public.oss-cn-beijing.aliyuncs.com/facedata/MS1V3/oneflow/ms1m-retinaface-t1.zip)\n\nThe rest of this section shows how to convert downloaded datasets into OFRecord, taking MS1M-ArcFace as an example.\n\n#### 1. Download datasets\n\nThe structure of the downloaded MS1M-ArcFace dataset is as follows:\n\n\n\n```\nfaces_emore/\n    train.idx\n    train.rec\n    property\n    lfw.bin\n    cfp_fp.bin\n    agedb_30.bin\n```\n\nThe first three files are the MS1M training dataset in MXNet recordio format; the last three `.bin` files are different validation datasets.\n\n\n\n#### 2. Transformation from MS1M recordio to OFRecord\n\nYou only need to run one of 2.1 or 2.2.\n\n2.1 Use Python scripts directly\n\nRun\n\n```\npython tools/mx_recordio_2_ofrecord_shuffled_npart.py  --data_dir datasets/faces_emore --output_filepath faces_emore/ofrecord/train --part_num 16\n```\n\nThis produces `part_num` OFRecord parts (16 in this example), laid out like this:\n\n```\ntree ofrecord/test/\nofrecord/test/\n|-- _SUCCESS\n|-- part-00000\n|-- part-00001\n|-- part-00002\n|-- part-00003\n|-- part-00004\n|-- part-00005\n|-- part-00006\n|-- part-00007\n|-- part-00008\n|-- part-00009\n|-- part-00010\n|-- part-00011\n|-- part-00012\n|-- part-00013\n|-- part-00014\n`-- part-00015\n\n0 directories, 17 files\n```\n\n\n2.2 Use Python scripts + Spark Shuffle + Spark partition\n\nRun\n\n```\npython tools/dataset_convert/mx_recordio_2_ofrecord.py --data_dir datasets/faces_emore --output_filepath faces_emore/ofrecord/train\n```\n\nThis produces a single OFRecord part (`part-0`) containing all the data. Then use Spark to shuffle and partition it.\n\n1. 
Get the jar package\n\nYou can download spark-oneflow-connector-assembly-0.1.0.jar via [GitHub](https://github.com/Oneflow-Inc/spark-oneflow-connector) or [OSS](https://oneflow-public.oss-cn-beijing.aliyuncs.com/spark-oneflow-connector/spark-oneflow-connector-assembly-0.1.1.jar)\n\n2. Run in Spark\n\nThis assumes that you have already installed and configured Spark.\n\nRun\n\n```\n// Start Spark\n./spark-2.4.3-bin-hadoop2.7/bin/spark-shell --jars ~/spark-oneflow-connector-assembly-0.1.0.jar --driver-memory=64G --conf spark.local.dir=/tmp/\n// Shuffle and partition into 16 parts\nimport org.oneflow.spark.functions._\nspark.read.chunk(\"data_path\").shuffle().repartition(16).write.chunk(\"new_data_path\")\nsc.formatFilenameAsOneflowStyle(\"new_data_path\")\n```\n\nThis produces 16 OFRecord parts, laid out like this:\n\n```\ntree ofrecord/test/\nofrecord/test/\n|-- _SUCCESS\n|-- part-00000\n|-- part-00001\n|-- part-00002\n|-- part-00003\n|-- part-00004\n|-- part-00005\n|-- part-00006\n|-- part-00007\n|-- part-00008\n|-- part-00009\n|-- part-00010\n|-- part-00011\n|-- part-00012\n|-- part-00013\n|-- part-00014\n`-- part-00015\n\n0 directories, 17 files\n```\n\n\n\n## Pretrained model\n\nThe 1:1 verification accuracy of the OneFlow and MXNet pretrained models on the InsightFace Recognition Test (IFRT) is compared below:\n\n| **Framework** | **African** | **Caucasian** | **Indian** | **Asian** | **All** |\n| ------------- | ----------- | ------------- | ---------- | --------- | ------- |\n| OneFlow       | 90.4076     | 94.583        | 93.702     | 68.754    | 89.684  |\n| MXNet         | 90.45       | 94.60         | 93.96      | 63.91     | 88.23   |\n\nDownload the OneFlow pretrained model: [of_005_model.tar.gz](http://oneflow-public.oss-cn-beijing.aliyuncs.com/face_dataset/pretrained_model/of_glint360k_partial_fc/of_005_model.tar.gz)\n\nWe also provide an MXNet model converted from 
OneFlow: [of_to_mxnet_model_005.tar.gz](http://oneflow-public.oss-cn-beijing.aliyuncs.com/face_dataset/pretrained_model/of_2_mxnet_glint360k_partial_fc/of_to_mxnet_model_005.tar.gz)\n\n\n\n## OneFlow2ONNX\n\n```\npip install oneflow-onnx==0.3.4\n./convert.sh\n```\n\n## Training and verification\n\n\n\n### Training\n\nTo lower the barrier for users, the scripts follow a Torch-like style; you can modify parameters directly in `configs/*.py`.\n\n```\n./run.sh\n```\n\n### Verification\n\nOneFlow also offers a separate validation script, `val.py`, which makes it easy to check the accuracy of a saved pretrained model.\n\n```\n./val.sh\n```\n\n## Benchmark\n\n### Training Speed Benchmark\n\n#### Face_emore Dataset \u0026 FP32\n\n| Backbone | GPU                      | model_parallel | partial_fc | BatchSize / it | Throughput img / sec |\n| -------- | ------------------------ | -------------- | ---------- | -------------- | -------------------- |\n| R100     | 8 * Tesla V100-SXM2-16GB | False          | False      | 64             | 1836.8               |\n| R100     | 8 * Tesla V100-SXM2-16GB | True           | False      | 64             | 1854.15              |\n| R100     | 8 * Tesla V100-SXM2-16GB | True           | True       | 64             | 1872.81              |\n| R100     | 8 * Tesla V100-SXM2-16GB | False          | False      | 96(Max)        | 1931.76              |\n| R100     | 8 * Tesla V100-SXM2-16GB | True           | False      | 115(Max)       | 1921.87              |\n| R100     | 8 * Tesla V100-SXM2-16GB | True           | True       | 120(Max)       | 1962.76              |\n| Y1       | 8 * Tesla V100-SXM2-16GB | False          | False      | 256            | 14298.02             |\n| Y1       | 8 * Tesla V100-SXM2-16GB | True           | False      | 256            | 14049.75             |\n| Y1       | 8 * Tesla V100-SXM2-16GB | False          | False      | 350(Max)       | 14756.03             |\n| 
Y1       | 8 * Tesla V100-SXM2-16GB | True           | True       | 400(Max)       | 14436.38             |\n\n#### Glint360k Dataset \u0026 FP32\n\n| Backbone | GPU                      | partial_fc sample_ratio | BatchSize / it | Throughput img / sec |\n| -------- | ------------------------ | ----------------------- | -------------- | -------------------- |\n| R100     | 8 * Tesla V100-SXM2-16GB | 0.1                     | 64             | 1858.57              |\n| R100     | 8 * Tesla V100-SXM2-16GB | 0.1                     | 115            | 1933.88              |\n\n\n\n### Evaluation on Lfw, Cfp_fp, Agedb_30\n\n- Data Parallelism\n\n| Backbone      | Dataset | Lfw    | Cfp_fp | Agedb_30 |\n| ------------- | ------- | ------ | ------ | -------- |\n| R100          | MS1M    | 99.717 | 98.643 | 98.150   |\n| MobileFaceNet | MS1M    | 99.5   | 92.657 | 95.6     |\n\n- Model Parallelism\n\n| Backbone      | Dataset | Lfw    | Cfp_fp | Agedb_30 |\n| ------------- | ------- | ------ | ------ | -------- |\n| R100          | MS1M    | 99.733 | 98.329 | 98.033   |\n| MobileFaceNet | MS1M    | 99.483 | 93.457 | 95.7     |\n\n- Partial FC\n\n| Backbone | Dataset | Lfw    | Cfp_fp | Agedb_30 |\n| -------- | ------- | ------ | ------ | -------- |\n| R100     | MS1M    | 99.817 | 98.443 | 98.217   |\n\n### Evaluation on IFRT\n\nr denotes the sampling rate of negative class centers.\n\n| Backbone | Dataset              | African | Caucasian | Indian | Asian  | ALL    |\n| -------- | -------------------- | ------- | --------- | ------ | ------ | ------ |\n| R100     | **Glint360k**(r=0.1) | 90.4076 | 94.583    | 93.702 | 68.754 | 89.684 |\n\n### Max num_classes\n\n| node_num | gpu_num_per_node | batch_size_per_device | fp16 | Model Parallel | Partial FC | num_classes |\n| -------- | ---------------- | --------------------- | ---- | -------------- | ---------- | ----------- |\n| 1        | 1                | 64                    | True | True           | True       | 
2000000     |\n| 1        | 8                | 64                    | True | True           | True       | 13500000    |\n\nFor more test details, refer to [OneFlow DLPerf](https://github.com/Oneflow-Inc/DLPerf#insightface).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foneflow-inc%2Foneflow_face","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foneflow-inc%2Foneflow_face","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foneflow-inc%2Foneflow_face/lists"}