{"id":13408779,"url":"https://github.com/amineHorseman/facial-expression-recognition-using-cnn","last_synced_at":"2025-03-14T13:31:58.438Z","repository":{"id":45156792,"uuid":"118813572","full_name":"amineHorseman/facial-expression-recognition-using-cnn","owner":"amineHorseman","description":"Deep facial expressions recognition using Opencv and Tensorflow. Recognizing facial expressions from images or camera stream","archived":false,"fork":false,"pushed_at":"2023-06-10T03:10:17.000Z","size":477,"stargazers_count":445,"open_issues_count":13,"forks_count":141,"subscribers_count":12,"default_branch":"master","last_synced_at":"2024-07-31T20:31:58.542Z","etag":null,"topics":["cnn","cnn-classification","deep-learning","facial-expression-recognition","facial-landmarks","fer2013","hog-features","image-classification","images","machine-learning","opencv","python","tensorflow","tflearn"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amineHorseman.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-01-24T19:45:11.000Z","updated_at":"2024-07-02T13:36:08.000Z","dependencies_parsed_at":"2024-10-26T04:07:25.484Z","dependency_job_id":"b5fe75f7-4cbd-4a12-9100-02acae9d49aa","html_url":"https://github.com/amineHorseman/facial-expression-recognition-using-cnn","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amineHorseman%2Ffacial-expression-recognition-using-cnn
","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amineHorseman%2Ffacial-expression-recognition-using-cnn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amineHorseman%2Ffacial-expression-recognition-using-cnn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amineHorseman%2Ffacial-expression-recognition-using-cnn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amineHorseman","download_url":"https://codeload.github.com/amineHorseman/facial-expression-recognition-using-cnn/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243584434,"owners_count":20314760,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cnn","cnn-classification","deep-learning","facial-expression-recognition","facial-landmarks","fer2013","hog-features","image-classification","images","machine-learning","opencv","python","tensorflow","tflearn"],"created_at":"2024-07-30T20:00:55.190Z","updated_at":"2025-03-14T13:31:58.429Z","avatar_url":"https://github.com/amineHorseman.png","language":"Python","funding_links":[],"categories":["Libraries and Frameworks"],"sub_categories":[],"readme":"\n# Facial expression recognition using CNN in Tensorflow\n\nUsing a Convolutional Neural Network (CNN) to recognize facial expressions from images or video/camera stream.\n\n## Table of contents\n\n[1. Motivation](#motivation)\n\n[2. Why is Fer2013 challenging?](#fer2013)\n\n[3. Classification results](#results)\n\n[4. 
How to use?](#how-to-use) \n- [Install the dependencies](#install)\n- [Download and prepare the data](#data)\n- [Train the model](#train)\n- [Optimize the hyperparameters](#optimize)\n- [Evaluate a trained model](#evaluate)\n- [Recognizing facial expressions from an image file](#recognize-image)\n- [Recognizing facial expressions in real time from video/camera](#recognize-video)\n\n[5. Contributing](#contrib)\n\n\u003cbr /\u003e\n\n# \u003ca name=\"motivation\"\u003e1. Motivation\u003c/a\u003e\n\nThe goal is to get a quick baseline to check whether the CNN architecture performs better when it uses only the raw pixels of images for training, or whether it's better to feed some extra information to the CNN (such as face landmarks or HOG features). The results show that the extra information helps the CNN to perform better.\n\nTo train the model, we used the Fer2013 dataset, which contains 30,000 images of facial expressions grouped in seven categories: Angry, Disgust, Fear, Happy, Sad, Surprise and Neutral.\n\nThe faces are first detected using opencv, then we extract the face landmarks using dlib. We also extract the HOG features, and we input the raw image data together with the face landmarks+hog into a convolutional neural network.\n\nFor our experiments, we used 2 CNN models:\n\n![Model's architecture](img/CNN_models_architecture.png)\n\n\n# \u003ca name=\"fer2013\"\u003e2. Why is Fer2013 challenging?\u003c/a\u003e\n\nFer2013 is a challenging dataset. The images are not aligned and some of them are incorrectly labeled, as we can see from the following images. Moreover, some samples do not contain faces. \n\n![Fer2013 incorrectly labeled images](img/fer2013_incorrect_labels.png)\n\n![Fer2013 strange samples](img/FER2013_strange_samples.png)\n\nThis makes the classification harder because the model has to generalize well and be robust to incorrect data. 
The best accuracy obtained on this dataset, as far as I know, is 75.2%, described in this paper: \n[[Facial Expression Recognition using Convolutional Neural Networks: State of the Art, Pramerdorfer \u0026 al. 2016]](https://arxiv.org/abs/1612.02903)\n\n\n# \u003ca name=\"results\"\u003e3. Classification Results (training on 5 expressions)\u003c/a\u003e\n\n|       Experiments                            |    SVM    | Model A  |  Model B  |  Difference |\n|----------------------------------------------|-----------|----------|-----------|-------------|\n| CNN (on raw pixels)                          |   -----   |   72.4%  |   73.5%   |    +1.1%    |\n| CNN + Face landmarks                         |   46.9%   |   **73.5%**  |   74.4%   |    +0.9%    |\n| CNN + Face landmarks + HOG                   |   55.0%   |   68.7%  |   73.2%   |    +4.5%    |\n| CNN + Face landmarks + HOG + sliding window  |   **59.4%**   |   71.4%  |   **75.1%**   |    +3.7%    |  \n\nAs expected:\n- The CNN models give better results than the SVM (you can find the code for the SVM implementation in the following repository: [Facial Expressions Recognition using SVM](https://github.com/amineHorseman/facial-expression-recognition-svm))\n- Combining more features, such as face landmarks and HOG, *slightly* improves the accuracy.\n- Since the CNN Model B uses deep convolutions, it gives better results on all experiments (by up to 4.5%).\n\nIt's interesting to note that using HOG features in the CNN Model A decreased the results compared to using only the raw data. 
This may be caused by overfitting or by a failure to extract the correlation between the information.\n\nIn the following table, we can see the effect of batch normalization on improving the results:\n\n|   Batch norm effects                         |  on Model A  |  on Model B  |\n|----------------------------------------------|--------------|--------------|\n| CNN (on raw pixels)                          |     +7.4%    |    +39.3%    |\n| CNN + Face landmarks                         |    +26.2%    |    +50.0%    |\n| CNN + Face landmarks + HOG                   |     +1.9%    |    +50.1%    |\n| CNN + Face landmarks + HOG + sliding window  |    +16.7%    |    +16.9%    |\n\nIn the previous experiments, I used only 5 expressions for the training: Angry, Happy, Sad, Surprise and Neutral.\n\nThe accuracy of the best model trained on the whole dataset (7 emotions) dropped to 61.4%. \nThe state-of-the-art result obtained on this dataset, as far as I know, is 75.2%, described in [this paper](https://arxiv.org/abs/1612.02903).\n\n\nNote: the code was tested in python 2.7 and 3.6.\n\n# \u003ca name=\"how-to-use\"\u003e4. HOW TO USE?\u003c/a\u003e\n\n## \u003ca name=\"install\"\u003e4.1. Install dependencies\u003c/a\u003e\n\n- Tensorflow\n- Tflearn\n- Numpy\n- Argparse\n- [optional] Hyperopt + pymongo + networkx\n- [optional] dlib, imutils, opencv 3\n- [optional] scipy, pandas, skimage\n\nIt is better to use an Anaconda environment to easily install the dependencies (especially opencv and dlib).\n\n## \u003ca name=\"data\"\u003e4.2. Download and prepare the data\u003c/a\u003e\n\n1. Download the Fer2013 dataset and the Face Landmarks model\n\n    - [Kaggle Fer2013 challenge](https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data)\n    - [Dlib Shape Predictor model](http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2)\n\n2. 
Unzip the downloaded files\n\n    And put the files `fer2013.csv` and `shape_predictor_68_face_landmarks.dat` in the root folder of this package.\n\n3. Convert the dataset to extract Face Landmarks and HOG Features\n    ```\n    python convert_fer2013_to_images_and_landmarks.py\n    ```\n    \n    You can also use these optional arguments according to your needs:\n    - `-j`, `--jpg` (yes|no): **save images as .jpg files (default=no)**\n    - `-l`, `--landmarks` *(yes|no)*: **extract Dlib Face landmarks (default=yes)**\n    - `-ho`, `--hog` (yes|no): **extract HOG features (default=yes)**\n    - `-hw`, `--hog_windows` (yes|no): **extract HOG features using a sliding window (default=yes)**\n    - `-hi`, `--hog_images` (yes|no): **extract HOG images (default=no)**\n    - `-o`, `--onehot` (yes|no): **one hot encoding (default=yes)**\n    - `-e`, `--expressions` (list of numbers): **choose the facial expressions you want to use: *0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral* (default=0,1,2,3,4,5,6)**\n\n    Examples:\n    ```\n    python convert_fer2013_to_images_and_landmarks.py\n    python convert_fer2013_to_images_and_landmarks.py --landmarks=yes --hog=no --hog_windows=no --jpg=no --expressions=1,4,6\n    ```\n    The script will create a folder with the data prepared and saved as numpy arrays.\n    Make sure the `--onehot` argument is set to `yes` (default value).\n\n## \u003ca name=\"train\"\u003e4.3. Train the model\u003c/a\u003e\n1. Choose your parameters in 'parameters.py'\n\n2. Launch training:\n\n```\npython train.py --train=yes\n```\n\nThe variable `output_size` in parameters.py (line 20) should correspond to the number of facial expressions you want to train on. By default it is set to 7 expressions.\n\n3. Train and evaluate:\n\n```\npython train.py --train=yes --evaluate=yes\n```\n\nN.B: make sure the parameter \"save_model\" (in parameters.py) is set to True if you want to train and evaluate\n\n## \u003ca name=\"optimize\"\u003e4.4. 
Optimize training hyperparameters\u003c/a\u003e\n1. For this section, you'll first need to install these optional dependencies:\n```\npip install hyperopt pymongo networkx\n```\n\n2. Launch the hyperparameter search:\n```\npython optimize_hyperparams.py --max_evals=20\n```\n\n3. You should then retrain your model with the best parameters\n\nN.B: the accuracies displayed are for validation_set only (not test_set)\n\n## \u003ca name=\"evaluate\"\u003e4.5. Evaluate a trained model (calculating test accuracy)\u003c/a\u003e\n\n1. Modify 'parameters.py':\n \nSet \"save_model_path\" parameter to the path of your pretrained file.\n\n2. Launch evaluation on test_set:\n\n```\npython train.py --evaluate=yes\n```\n\n## \u003ca name=\"recognize-image\"\u003e4.6. Recognizing facial expressions from an image file\u003c/a\u003e\n\n1. For this section you will need to install `dlib` and `opencv 3` dependencies\n\n2. Modify 'parameters.py':\n\nSet \"save_model_path\" parameter to the path of your pretrained file\n\n3. Predict emotions from a file\n\n```\npython predict.py --image path/to/image.jpg\n```\n\n## \u003ca name=\"recognize-video\"\u003e4.7. Recognizing facial expressions in real time from video\u003c/a\u003e\n\n1. For this section you will need to install `dlib`, `imutils` and `opencv 3` dependencies\n\n2. Modify 'parameters.py':\n\nSet \"save_model_path\" parameter to the path of your pretrained file\n\n3. Predict emotions from the video/camera stream\n\n```\npython predict-from-video.py\n```\nA window will appear with a box around the face and the predicted expression.\nPress 'q' key to stop.\n\nN.B: If you changed the number of expressions while training the model (default 7 expressions), please update the emotions array in `parameters.py` line 51.\n\n\n# \u003ca name=\"contrib\"\u003e5. 
Contributing\u003c/a\u003e\n\nSome ideas for interested contributors:\n- Automatically downloading the data\n- Adding data augmentation?\n- Adding other features extraction techniques?\n- Improving the models\n\nFeel free to add or suggest more ideas.\nPlease report any bugs in the [issues section](https://github.com/amineHorseman/facial-expression-recognition-using-cnn/issues).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FamineHorseman%2Ffacial-expression-recognition-using-cnn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FamineHorseman%2Ffacial-expression-recognition-using-cnn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FamineHorseman%2Ffacial-expression-recognition-using-cnn/lists"}