{"id":26645898,"url":"https://github.com/xlite-dev/torchlm","last_synced_at":"2025-12-13T21:19:24.907Z","repository":{"id":39252155,"uuid":"455757012","full_name":"xlite-dev/torchlm","owner":"xlite-dev","description":"💎A high level python lib for face landmarks detection: training, eval, export, inference(Python/C++) and 100+ data augmentations.","archived":false,"fork":false,"pushed_at":"2025-02-07T08:47:27.000Z","size":161175,"stargazers_count":255,"open_issues_count":14,"forks_count":24,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-03-31T23:37:39.276Z","etag":null,"topics":["albumentations","data-augmentation","face-landmarks","heatmap","mobilenet","pip","pipnet","regression","shufflenet","torchvision","yolov5","yolov6","yolov7","yolox"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xlite-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-02-05T02:38:42.000Z","updated_at":"2025-03-23T14:02:07.000Z","dependencies_parsed_at":"2025-03-17T21:47:34.669Z","dependency_job_id":null,"html_url":"https://github.com/xlite-dev/torchlm","commit_stats":{"total_commits":186,"total_committers":3,"mean_commits":62.0,"dds":0.06451612903225812,"last_synced_commit":"99d285ee448ddcbca071428effc85646e44564a8"},"previous_names":["xlite-dev/torchlm","deftruth/torchlm"],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xlite-dev%2Ftorchlm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xlite-dev%2Ftorchlm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xlite-dev%2Ftorchlm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xlite-dev%2Ftorchlm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xlite-dev","download_url":"https://codeload.github.com/xlite-dev/torchlm/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247755560,"owners_count":20990620,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["albumentations","data-augmentation","face-landmarks","heatmap","mobilenet","pip","pipnet","regression","shufflenet","torchvision","yolov5","yolov6","yolov7","yolox"],"created_at":"2025-03-24T22:36:52.532Z","updated_at":"2025-12-13T21:19:24.897Z","avatar_url":"https://github.com/xlite-dev.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n  \u003cp align=\"center\"\u003e\n    \u003ch2\u003e💎 TorchLM: An easy-to-use PyTorch library for face landmarks detection.\u003c/h2\u003e\n  \u003c/p\u003e\n  \u003cdiv align='center'\u003e\n      
      <img src=https://img.shields.io/badge/PRs-welcome-9cf.svg >
      <img src=https://img.shields.io/pypi/v/torchlm?color=aff >
      <img src=https://static.pepy.tech/personalized-badge/torchlm?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads >
      <img src=https://img.shields.io/pypi/pyversions/torchlm?color=dfd >
      <img src=https://img.shields.io/badge/macos|linux|windows-pass-skyblue.svg >
      <img src=https://img.shields.io/badge/license-MIT-lightblue.svg >
  </div>
</div>

<p align="center">English | <a href="docs/api/transforms.md">Data Augmentations API Docs</a> | <a href="https://www.zhihu.com/column/c_1426666301352218624">ZhiHu Page</a> | <a href="https://pepy.tech/project/torchlm">PyPI Downloads</a></p>

## 🤗 Introduction
**torchlm** aims to provide a high-level pipeline for face landmarks detection. It supports **training**, **evaluating**, **exporting**, **inference (Python/C++)** and **100+ data augmentations**, and can be easily installed via **pip**.
<div align='center'>
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/40fc4421-f628-4d5b-96e4-d486711284f9' height="100px" width="720px">
</div>

## 👋 Core Features
* High-level pipelines for **training** and **inference**.
* **30+** native landmarks data augmentations.
* **Bind 80+** transforms from [torchvision](https://github.com/pytorch/vision) and [albumentations](https://github.com/albumentations-team/albumentations) with a single line of code.
* Support for [PIPNet](https://arxiv.org/pdf/2003.03771.pdf), YOLOX, ResNet, MobileNet and ShuffleNet for face landmarks detection.

## 🆕 What's New
* [2022/03/08]: Add **PIPNet**: [Towards Efficient Facial Landmark Detection in the Wild, CVPR2021](https://github.com/jhb86253817/PIPNet)
* [2022/02/13]: Add **30+** native transforms and **bind 80+** transforms from torchvision and albumentations.

## 🔥🔥 Performance (@NME)

<div align='center'>
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/74a9607a-772e-4508-8c24-efc7404a97cb' height="150px" width="400px">
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/05f06090-d751-4949-abb1-ea56182fabcd' height="150px" width="400px">
</div>

|Model|Backbone|Head|300W|COFW|AFLW|WFLW|Download|
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
|[PIPNet](https://arxiv.org/pdf/2003.03771.pdf)|MobileNetV2|Heatmap+Regression+NRM|3.40|3.43|1.52|4.79|[link](https://github.com/xlite-dev/torchlm/releases/tag/torchlm-0.1.6-alpha)|
|[PIPNet](https://arxiv.org/pdf/2003.03771.pdf)|ResNet18|Heatmap+Regression+NRM|3.36|3.31|1.48|4.47|[link](https://github.com/xlite-dev/torchlm/releases/tag/torchlm-0.1.6-alpha)|
|[PIPNet](https://arxiv.org/pdf/2003.03771.pdf)|ResNet50|Heatmap+Regression+NRM|3.34|3.18|1.44|4.48|[link](https://github.com/xlite-dev/torchlm/releases/tag/torchlm-0.1.6-alpha)|
|[PIPNet](https://arxiv.org/pdf/2003.03771.pdf)|ResNet101|Heatmap+Regression+NRM|3.19|3.08|1.42|4.31|[link](https://github.com/xlite-dev/torchlm/releases/tag/torchlm-0.1.6-alpha)|

## 🛠️ Installation
You can install **torchlm** directly from [PyPI](https://pypi.org/project/torchlm/).
```shell
pip install torchlm>=0.1.6.10  # or install the latest PyPI version: pip install torchlm
pip install torchlm>=0.1.6.10 -i https://pypi.org/simple/  # or install from a specific PyPI mirror with '-i'
```
Or install from source if you want the latest torchlm; install it in editable mode with `-e`.
```shell
git clone --depth=1 https://github.com/xlite-dev/torchlm.git
cd torchlm && pip install -e .
```
<div id="torchlm-NOTE"></div>


## 🌟🌟 Data Augmentation
**torchlm** provides **30+** native data augmentations for landmarks and can **bind** with **80+** transforms from torchvision and albumentations. The layout format of landmarks is `xy` with shape `(N, 2)`.

Use the **30+** native transforms from **torchlm** directly:
```python
import torchlm
transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(prob=0.5),
    torchlm.LandmarksRandomMask(prob=0.5),
    torchlm.LandmarksRandomBlur(kernel_range=(5, 25), prob=0.5),
    torchlm.LandmarksRandomBrightness(prob=0.),
    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
    torchlm.LandmarksRandomCenterCrop((0.5, 1.0), (0.5, 1.0), prob=0.5)
])
```
<div align='center'>
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/36c595e0-2d12-43fb-8981-f57fda62a7b4' height="100px" width="100px">
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/de6259ff-cce3-428b-a16f-369ec2ca7b35' height="100px" width="100px">
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/a4ffdc92-ce14-400a-a3dc-33bd0b6388bf' height="100px" width="100px">
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/e8039440-3e8d-4c78-b216-45b8e9996379' height="100px" width="100px">
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/a48b4793-a837-4221-964a-d3976af4c604' height="100px" width="100px">
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/7478df16-bc7d-4727-893f-4f3a8db7b442' height="100px" width="100px">
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/a263d814-1db1-4bc0-b138-c956030f5cc4' height="100px" width="100px">
</div>

Also, a user-friendly API `build_default_transform` is available to build a default transform pipeline.
```python
transform = torchlm.build_default_transform(
    input_size=(input_size, input_size),
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225],
    force_norm_before_mean_std=True,  # img /= 255. first
    rotate=30,
    keep_aspect=False,
    to_tensor=True  # array -> Tensor & HWC -> CHW
)
```
See [transforms.md](docs/api/transforms.md) for the supported transform sets; more examples can be found at [test/transforms.py](test/transforms.py).
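For context, a composed pipeline is simply called with an image and its landmarks and returns the transformed pair. Below is a minimal usage sketch, assuming the composed transform is callable as `transform(img, landmarks)` with an `(H, W, 3)` `uint8` image and `xy` landmarks of shape `(N, 2)`; the random image and landmarks are placeholders for illustration only.

```python
import numpy as np
import torchlm

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(prob=0.5),
    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
])

# Placeholder inputs: a random RGB image and 98 landmarks inside it.
img = np.random.randint(0, 255, size=(256, 256, 3), dtype=np.uint8)
landmarks = np.random.uniform(50., 200., size=(98, 2)).astype(np.float32)

new_img, new_landmarks = transform(img, landmarks)
# The landmark count is expected to stay (98, 2): unsafe transforms are skipped.
print(new_img.shape, new_landmarks.shape)
```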
<details>

<summary>💡 more details about transforms in torchlm</summary>

**torchlm** provides **30+** native data augmentations for landmarks and can **bind** with **80+** transforms from torchvision and albumentations through the **torchlm.bind** method. The layout format of landmarks is `xy` with shape `(N, 2)`, where `N` denotes the number of input landmarks. Further, **torchlm.bind** provides a `prob` param at bind-level to turn any transform or callable into a random-style augmentation. The data augmentations in **torchlm** are `safe` and `simple`: any transform that would move landmarks outside the image at runtime is automatically dropped, so the number of landmarks stays unchanged. It is also OK to pass a Tensor to an np.ndarray-style transform; **torchlm** automatically adapts to different data types and wraps the result back to the original type through an **autodtype** wrapper.

<details>
<summary>bind 80+ torchvision and albumentations transforms</summary>

**NOTE**: Please install albumentations first if you want to bind albumentations transforms. If you hit a conflict between different installed versions of OpenCV (opencv-python vs. opencv-python-headless; `albumentations` needs opencv-python-headless), uninstall both OpenCV packages first and then reinstall albumentations. See [albumentations#1140](https://github.com/albumentations-team/albumentations/issues/1140) for more details.

```shell
# first uninstall the conflicting OpenCV packages
pip uninstall opencv-python
pip uninstall opencv-python-headless
pip uninstall albumentations  # if you have installed albumentations
pip install albumentations  # then reinstall albumentations; this also installs deps, e.g. opencv
```

Then, check whether albumentations is available.
```python
torchlm.albumentations_is_available()  # True or False
```

```python
transform = torchlm.LandmarksCompose([
    torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),
    torchlm.bind(albumentations.ColorJitter(p=0.5))
])
```

</details>

<details>
<summary>bind custom callable array or Tensor transform functions</summary>

```python
# First, define your custom functions
def callable_array_noop(img: np.ndarray, landmarks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    # do some transform here ...
    return img.astype(np.uint32), landmarks.astype(np.float32)

def callable_tensor_noop(img: Tensor, landmarks: Tensor) -> Tuple[Tensor, Tensor]:
    # do some transform here ...
    return img, landmarks
```

```python
# Then, bind your functions and put them into the transforms pipeline.
transform = torchlm.LandmarksCompose([
    torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array),
    torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5)
])
```
</details>

<details>
<summary>some global debug settings for torchlm's transforms</summary>

* Setting the logging mode to `True` globally may help you figure out the runtime details:
```python
# some global settings
torchlm.set_transforms_debug(True)
torchlm.set_transforms_logging(True)
torchlm.set_autodtype_logging(True)
```

Detailed information will then be shown at each run; the logs may look like:
```shell
LandmarksRandomScale() AutoDtype Info: AutoDtypeEnum.Array_InOut
LandmarksRandomScale() Execution Flag: False
BindTorchVisionTransform(GaussianBlur())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTorchVisionTransform(GaussianBlur())() Execution Flag: True
BindAlbumentationsTransform(ColorJitter())() AutoDtype Info: AutoDtypeEnum.Array_InOut
BindAlbumentationsTransform(ColorJitter())() Execution Flag: True
BindTensorCallable(callable_tensor_noop())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTensorCallable(callable_tensor_noop())() Execution Flag: False
Error at LandmarksRandomTranslate() Skip, Flag: False Error Info: LandmarksRandomTranslate() have 98 input landmarks, but got 96 output landmarks!
LandmarksRandomTranslate() Execution Flag: False
```
* Execution Flag: `True` means the transform was executed successfully; `False` means it was not executed, either because of the random probability or because of a runtime exception (torchlm will show the error info if debug mode is `True`).
* AutoDtype Info:
  * `Array_InOut` means the transform needs an np.ndarray input and outputs an np.ndarray.
  * `Tensor_InOut` means the transform needs a torch Tensor input and outputs a torch Tensor.
  * `Array_In` means the transform needs an np.ndarray input and outputs a torch Tensor.
  * `Tensor_In` means the transform needs a torch Tensor input and outputs an np.ndarray.

  Again, it is OK to pass a Tensor to an np.ndarray-style transform; **torchlm** automatically adapts to different data types and wraps the result back to the original type through the **autodtype** wrapper.

</details>

</details>


## 🎉🎉 Training
In **torchlm**, each model has two high-level, user-friendly APIs for training: `apply_training` and `apply_freezing`. `apply_training` handles the training process, while `apply_freezing` decides whether to freeze the backbone for fine-tuning.

### Quick Start👇
Here is an example of [PIPNet](https://arxiv.org/pdf/2003.03771.pdf). You can freeze the backbone before fine-tuning through `apply_freezing`.

```python
from torchlm.models import pipnet
# will auto-download pretrained weights from the latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98, net_stride=32,
               input_size=256, meanface_type="wflw", backbone_pretrained=True)
model.apply_freezing(backbone=True)
model.apply_training(
    annotation_path="../data/WFLW/converted/train.txt",  # or fine-tune on your custom data
    num_epochs=10,
    learning_rate=0.0001,
    save_dir="./save/pipnet",
    save_prefix="pipnet-wflw-resnet18",
    save_interval=1,
    logging_interval=1,
    device="cuda",
    coordinates_already_normalized=True,
    batch_size=16,
    num_workers=4,
    shuffle=True
)
```
Please jump to the entry point of the function for the detailed documentation of the **apply_training** API of each model defined in torchlm, e.g. [pipnet/_impls.py#L166](https://github.com/xlite-dev/torchlm/blob/main/torchlm/models/pipnet/_impls.py#L166). You might see logs like the following while training is running:

```shell
Parameters for DataLoader:  {'batch_size': 16, 'num_workers': 4, 'shuffle': True}
Built _PIPTrainDataset: train count is 7500 !
Epoch 0/9
----------
[Epoch 0/9, Batch 1/468] <Total loss: 0.372885> <cls loss: 0.063186> <x loss: 0.078508> <y loss: 0.071679> <nbx loss: 0.086480> <nby loss: 0.073031>
[Epoch 0/9, Batch 2/468] <Total loss: 0.354169> <cls loss: 0.051672> <x loss: 0.075350> <y loss: 0.071229> <nbx loss: 0.083785> <nby loss: 0.072132>
[Epoch 0/9, Batch 3/468] <Total loss: 0.367538> <cls loss: 0.056038> <x loss: 0.078029> <y loss: 0.076432> <nbx loss: 0.083546> <nby loss: 0.073492>
[Epoch 0/9, Batch 4/468] <Total loss: 0.339656> <cls loss: 0.053631> <x loss: 0.073036> <y loss: 0.066723> <nbx loss: 0.080007> <nby loss: 0.066258>
[Epoch 0/9, Batch 5/468] <Total loss: 0.364556> <cls loss: 0.051094> <x loss: 0.077378> <y loss: 0.071951> <nbx loss: 0.086363> <nby loss: 0.077770>
[Epoch 0/9, Batch 6/468] <Total loss: 0.371356> <cls loss: 0.049117> <x loss: 0.079237> <y loss: 0.075729> <nbx loss: 0.086213> <nby loss: 0.081060>
...
[Epoch 0/9, Batch 33/468] <Total loss: 0.298983> <cls loss: 0.041368> <x loss: 0.069912> <y loss: 0.057667> <nbx loss: 0.072996> <nby loss: 0.057040>
```
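To resume or fine-tune from one of the checkpoints written to `save_dir`, you can pass a local weights file when constructing the model. Below is a minimal sketch; the checkpoint filename is hypothetical (use whichever file your `save_prefix` run actually produced), and it assumes the `checkpoint` parameter accepts a local path, as suggested by the runtime examples later in this README.

```python
from torchlm.models import pipnet

# Hypothetical checkpoint path: adjust to the file your training run saved.
ckpt = "./save/pipnet/pipnet-wflw-resnet18-epoch9.pth"
model = pipnet(backbone="resnet18", pretrained=False, num_nb=10, num_lms=98,
               net_stride=32, input_size=256, meanface_type="wflw",
               checkpoint=ckpt)  # assumption: load local weights instead of auto-download
model.apply_freezing(backbone=True)  # keep the backbone frozen while fine-tuning
```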
### Dataset Format👇
The `annotation_path` parameter denotes the path to a custom annotation file, whose format must be:
```shell
"img0_path x0 y0 x1 y1 ... xn-1 yn-1"
"img1_path x0 y0 x1 y1 ... xn-1 yn-1"
"img2_path x0 y0 x1 y1 ... xn-1 yn-1"
"img3_path x0 y0 x1 y1 ... xn-1 yn-1"
...
```
If the labels in `annotation_path` are already normalized by the image size, please set `coordinates_already_normalized` to `True` in the `apply_training` API.
```shell
"img0_path x0/w y0/h x1/w y1/h ... xn-1/w yn-1/h"
"img1_path x0/w y0/h x1/w y1/h ... xn-1/w yn-1/h"
"img2_path x0/w y0/h x1/w y1/h ... xn-1/w yn-1/h"
"img3_path x0/w y0/h x1/w y1/h ... xn-1/w yn-1/h"
...
```
Here is an example of [WFLW](torchlm/data/_converters.py) that shows how to prepare the dataset; also see [test/data.py](test/data.py).
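For reference, a single annotation line decomposes into an image path followed by flattened `xy` coordinates. The sketch below is a minimal illustration of this format, not torchlm's internal loader; the path and values are placeholders.

```python
import numpy as np

def parse_annotation_line(line: str):
    """Parse one 'img_path x0 y0 ... xn-1 yn-1' line into (path, (N, 2) array)."""
    parts = line.strip().split(" ")
    img_path = parts[0]
    coords = np.array(parts[1:], dtype=np.float32)
    assert coords.size % 2 == 0, "expect an even number of coordinates"
    landmarks = coords.reshape(-1, 2)  # xy layout, shape (N, 2)
    return img_path, landmarks

# Example with a normalized 2-point annotation line:
path, lms = parse_annotation_line("images/face0.jpg 0.25 0.30 0.75 0.31")
print(path, lms.shape)  # images/face0.jpg (2, 2)
```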
### Additional Custom Settings👋

Some models in torchlm support additional custom settings beyond the `num_lms` of your custom dataset. For example, [PIPNet](https://arxiv.org/pdf/2003.03771.pdf) also needs a custom meanface generated from your custom dataset. Please jump to the source code of each model defined in **torchlm** for the details of these additional settings, which give you more flexibility in training or fine-tuning. Here is an example of how to train [PIPNet](https://arxiv.org/pdf/2003.03771.pdf) on your own dataset with a custom meanface setting.

Set up your custom meanface and nearest-neighbor landmarks through the `pipnet.set_custom_meanface` method. This method calculates the Euclidean distance between the landmarks in the meanface and automatically sets up the nearest neighbors for each landmark. NOTE: PIPNet will reshape its detection heads if the number of landmarks in the custom dataset is not equal to the `num_lms` you initialized it with.

```python
def set_custom_meanface(custom_meanface_file_or_string: str) -> bool:
    """
    :param custom_meanface_file_or_string: a long string or a file containing normalized
    or un-normalized meanface coords, in the format "x0,y0,x1,y1,x2,y2,...,xn-1,yn-1".
    :return: status, True if successful.
    """
```
Also, a `generate_meanface` API is available in torchlm to help you generate a meanface from a custom dataset.
```python
# generate your custom meanface.
custom_meanface, custom_meanface_string = torchlm.data.annotools.generate_meanface(
  annotation_path="../data/WFLW/converted/train.txt",
  coordinates_already_normalized=True)
# check your generated meanface.
rendered_meanface = torchlm.data.annotools.draw_meanface(
  meanface=custom_meanface, coordinates_already_normalized=True)
cv2.imwrite("./logs/wflw_meanface.jpg", rendered_meanface)
# set up your custom meanface.
model.set_custom_meanface(custom_meanface_file_or_string=custom_meanface_string)
```
<div align='center'>
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/68eb7769-8e91-45ac-a91f-1af576ad5e68' height="200px" width="200px">
</div>

### Benchmark Dataset Converters👇
In **torchlm**, pre-defined converters for commonly used benchmark datasets are available, such as [300W](https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/), [COFW](http://www.vision.caltech.edu/xpburgos/ICCV13/), [WFLW](https://wywu.github.io/projects/LAB/WFLW.html) and [AFLW](https://www.tugraz.at/institute/icg/research/team-bischof/lrs/downloads/aflw/). These converters transform the original datasets into the standard annotation format that **torchlm** needs. Here is an example of [WFLW](https://wywu.github.io/projects/LAB/WFLW.html).
```python
from torchlm.data import LandmarksWFLWConverter
# set up the path to the original dataset downloaded from the official site
converter = LandmarksWFLWConverter(
    data_dir="../data/WFLW", save_dir="../data/WFLW/converted",
    extend=0.2, rebuild=True, target_size=256, keep_aspect=False,
    force_normalize=True, force_absolute_path=True
)
converter.convert()
converter.show(count=30)  # render some converted images with landmarks for debugging
```
Then, the output layout in `../data/WFLW/converted` will look like:
```shell
├── image
│   ├── test
│   └── train
├── show
│   ├── 16--Award_Ceremony_16_Award_Ceremony_Awards_Ceremony_16_589x456y91.jpg
│   ├── 20--Family_Group_20_Family_Group_Family_Group_20_118x458y58.jpg
...
├── test.txt
└── train.txt
```
## 🛸🚵‍️ Inference
### C++ APIs👀
The ONNXRuntime (CPU/GPU), MNN, NCNN and TNN C++ inference backends for **torchlm** will be released in **[lite.ai.toolkit](https://github.com/xlite-dev/lite.ai.toolkit)**. Here is an example of **1000 Facial Landmarks Detection** using [FaceLandmark1000](https://github.com/Single430/FaceLandmark1000). Download the model from the Model Zoo[<sup>2</sup>](https://github.com/xlite-dev/lite.ai.toolkit#lite.ai.toolkit-Model-Zoo).

```C++
#include "lite/lite.h"

static void test_default()
{
  std::string onnx_path = "../../../hub/onnx/cv/FaceLandmark1000.onnx";
  std::string test_img_path = "../../../examples/lite/resources/test_lite_face_landmarks_0.png";
  std::string save_img_path = "../../../logs/test_lite_face_landmarks_1000.jpg";

  auto *face_landmarks_1000 = new lite::cv::face::align::FaceLandmark1000(onnx_path);

  lite::types::Landmarks landmarks;
  cv::Mat img_bgr = cv::imread(test_img_path);
  face_landmarks_1000->detect(img_bgr, landmarks);
  lite::utils::draw_landmarks_inplace(img_bgr, landmarks);
  cv::imwrite(save_img_path, img_bgr);

  delete face_landmarks_1000;
}
```
The output is:
<div align='center'>
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/eca927eb-bcba-42dc-9a62-b0e14eb378bd' height="200px" width="250px">
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/2462d43f-cd6e-4f41-af1c-ee9c2b008237' height="200px" width="250px">
  <img src='https://github.com/xlite-dev/torchlm/assets/31974251/8ebddaea-b5e1-4b5d-b2c5-7015b30f9649' height="200px" width="250px">
</div>

More classes for face alignment (68 points, 98 points, 106 points, 1000 points):
```c++
auto *align = new lite::cv::face::align::PFLD(onnx_path);  // 106 landmarks, 1.0Mb only!
auto *align = new lite::cv::face::align::PFLD98(onnx_path);  // 98 landmarks, 4.8Mb only!
auto *align = new lite::cv::face::align::PFLD68(onnx_path);  // 68 landmarks, 2.8Mb only!
auto *align = new lite::cv::face::align::MobileNetV268(onnx_path);  // 68 landmarks, 9.4Mb only!
auto *align = new lite::cv::face::align::MobileNetV2SE68(onnx_path);  // 68 landmarks, 11Mb only!
auto *align = new lite::cv::face::align::FaceLandmark1000(onnx_path);  // 1000 landmarks, 2.0Mb only!
auto *align = new lite::cv::face::align::PIPNet98(onnx_path);  // 98 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet68(onnx_path);  // 68 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet29(onnx_path);  // 29 landmarks, CVPR2021!
auto *align = new lite::cv::face::align::PIPNet19(onnx_path);  // 19 landmarks, CVPR2021!
```
For more details of the C++ APIs, please check **[lite.ai.toolkit](https://github.com/xlite-dev/lite.ai.toolkit)**. ![](https://img.shields.io/github/stars/xlite-dev/lite.ai.toolkit.svg?style=social) ![](https://img.shields.io/github/forks/xlite-dev/lite.ai.toolkit.svg?style=social) ![](https://img.shields.io/github/watchers/xlite-dev/lite.ai.toolkit.svg?style=social)

### Python APIs👇
In **torchlm**, we provide pipelines for deploying models with [PyTorch](https://github.com/pytorch/pytorch) and [ONNXRuntime](https://github.com/microsoft/onnxruntime). A high-level API named `runtime.bind` binds the face detection and landmarks models together; you can then run the `runtime.forward` API to get the output landmarks and bboxes. Here is an example of [PIPNet](https://github.com/jhb86253817/PIPNet). Pretrained weights of PIPNet: [Download](https://github.com/xlite-dev/torchlm/releases/tag/torchlm-0.1.6-alpha).

#### Inference on the PyTorch Backend

```python
import torchlm
from torchlm.tools import faceboxesv2
from torchlm.models import pipnet

torchlm.runtime.bind(faceboxesv2(device="cpu"))  # set device="cuda" if you want to run with CUDA
# set map_location="cuda" if you want to run with CUDA
torchlm.runtime.bind(
  pipnet(backbone="resnet18", pretrained=True,
         num_nb=10, num_lms=98, net_stride=32, input_size=256,
         meanface_type="wflw", map_location="cpu", checkpoint=None)
)  # will auto-download pretrained weights from the latest release if pretrained=True
landmarks, bboxes = torchlm.runtime.forward(image)
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)
```

#### Inference on the ONNXRuntime Backend
```python
import torchlm
from torchlm.runtime import faceboxesv2_ort, pipnet_ort

torchlm.runtime.bind(faceboxesv2_ort())
torchlm.runtime.bind(
  pipnet_ort(onnx_path="pipnet_resnet18.onnx", num_nb=10,
             num_lms=98, net_stride=32, input_size=256, meanface_type="wflw")
)
landmarks, bboxes = torchlm.runtime.forward(image)
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)
```
<div align='center'>
  <img src='docs/assets/pipnet_300W_CELEBA_model.gif' height="200px" width="400px">
  <img src='docs/assets/pipnet_WFLW_model.gif' height="200px" width="400px">
</div>
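Both snippets above assume `image` is already loaded as a numpy array. Below is a minimal end-to-end sketch using OpenCV for the image I/O; the file names are placeholders, and it assumes `runtime.forward` accepts an `(H, W, 3)` `uint8` array as in the examples above.

```python
import cv2
import torchlm
from torchlm.tools import faceboxesv2
from torchlm.models import pipnet

torchlm.runtime.bind(faceboxesv2(device="cpu"))
torchlm.runtime.bind(
  pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98,
         net_stride=32, input_size=256, meanface_type="wflw",
         map_location="cpu", checkpoint=None)
)

image = cv2.imread("face.jpg")  # placeholder input path; (H, W, 3) uint8 array
landmarks, bboxes = torchlm.runtime.forward(image)
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)
cv2.imwrite("face_landmarks.jpg", image)  # placeholder output path
```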
## 🤠🎯 Evaluating
In **torchlm**, each model has a high-level, user-friendly API named `apply_evaluating` for evaluation. This method calculates the NME, FR and AUC on the eval dataset. Here is an example of [PIPNet](https://github.com/jhb86253817/PIPNet).

```python
from torchlm.models import pipnet
# will auto-download pretrained weights from the latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98, net_stride=32,
               input_size=256, meanface_type="wflw", backbone_pretrained=True)
NME, FR, AUC = model.apply_evaluating(
    annotation_path="../data/WFLW/converted/test.txt",
    norm_indices=[60, 72],  # the indices of the two eyeballs
    coordinates_already_normalized=True,
    eval_normalized_coordinates=False
)
print(f"NME: {NME}, FR: {FR}, AUC: {AUC}")
```
Then, you will get the **Performance (NME, FR, AUC)** results:
```shell
Built _PIPEvalDataset: eval count is 2500 !
Evaluating PIPNet: 100%|██████████| 2500/2500 [02:53<00:00, 14.45it/s]
NME: 0.04453323229181989, FR: 0.04200000000000004, AUC: 0.5732673333333334
```
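For intuition, NME here is the mean per-landmark Euclidean error normalized by the distance between the two landmarks given in `norm_indices` (the inter-ocular distance for WFLW). The sketch below shows this standard definition, not torchlm's internal implementation.

```python
import numpy as np

def nme(pred: np.ndarray, gt: np.ndarray, norm_indices=(60, 72)) -> float:
    """Mean landmark error normalized by the distance between two reference points.

    pred, gt: (N, 2) arrays of predicted and ground-truth xy landmarks.
    """
    norm_dist = np.linalg.norm(gt[norm_indices[0]] - gt[norm_indices[1]])
    per_point_err = np.linalg.norm(pred - gt, axis=1)  # (N,) Euclidean errors
    return float(per_point_err.mean() / norm_dist)

# FR (failure rate) is typically the fraction of images whose NME exceeds a
# threshold (e.g. 0.1); AUC is the area under the cumulative error curve.
```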
## ⚙️⚔️ Exporting
In **torchlm**, each model has a high-level, user-friendly API named `apply_exporting` for ONNX export. Here is an example of [PIPNet](https://github.com/jhb86253817/PIPNet).

```python
from torchlm.models import pipnet
# will auto-download pretrained weights from the latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98, net_stride=32,
               input_size=256, meanface_type="wflw", backbone_pretrained=True)
model.apply_exporting(
    onnx_path="./save/pipnet/pipnet_resnet18.onnx",
    opset=12, simplify=True, output_names=None  # use default output names
)
```
Then, you will get a static ONNX model file once the export process is done.
```shell
  ...
  %195 = Add(%259, %189)
  %196 = Relu(%195)
  %outputs_cls = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %cls_layer.weight, %cls_layer.bias)
  %outputs_x = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %x_layer.weight, %x_layer.bias)
  %outputs_y = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %y_layer.weight, %y_layer.bias)
  %outputs_nb_x = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %nb_x_layer.weight, %nb_x_layer.bias)
  %outputs_nb_y = Conv[dilations = [1, 1], group = 1, kernel_shape = [1, 1], pads = [0, 0, 0, 0], strides = [1, 1]](%196, %nb_y_layer.weight, %nb_y_layer.bias)
  return %outputs_cls, %outputs_x, %outputs_y, %outputs_nb_x, %outputs_nb_y
}
Checking 0/3...
Checking 1/3...
Checking 2/3...
```

## 📖 Documentation
* [x] [Data Augmentation API](docs/api/transforms.md)

## 🎓 License
The code of **torchlm** is released under the MIT License.

## ❤️ Contribution
Please consider ⭐ this repo if you like it, as that is the simplest way to support me.

## 👋 Acknowledgement
* The implementation of torchlm's transforms borrows code from [Paperspace](https://github.com/Paperspace/DataAugmentationForObjectDetection/blob/master/data_aug/bbox_util.py).
* **PIPNet**: [Towards Efficient Facial Landmark Detection in the Wild, CVPR2021](https://github.com/jhb86253817/PIPNet)