{"id":20608904,"url":"https://github.com/jose-jaen/facialrecognizer","last_synced_at":"2026-05-04T12:31:30.249Z","repository":{"id":237752718,"uuid":"747592102","full_name":"jose-jaen/FacialRecognizer","owner":"jose-jaen","description":"Facial Recognition system with AI and Statistical Learning models","archived":false,"fork":false,"pushed_at":"2024-05-05T14:43:52.000Z","size":1019,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-27T14:08:15.453Z","etag":null,"topics":["cnn","computer-vision","data-science","deep-learning","faces","facial-recognition","lda","pca","polars","python","pytorch","statistics","uc3m"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jose-jaen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-24T08:46:50.000Z","updated_at":"2024-05-05T14:43:55.000Z","dependencies_parsed_at":"2024-05-05T15:51:01.853Z","dependency_job_id":null,"html_url":"https://github.com/jose-jaen/FacialRecognizer","commit_stats":null,"previous_names":["jose-jaen/facialrecognizer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jose-jaen/FacialRecognizer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jose-jaen%2FFacialRecognizer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jose-jaen%2FFacialRecognizer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jose-jaen%2FFacialRecognizer
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jose-jaen%2FFacialRecognizer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jose-jaen","download_url":"https://codeload.github.com/jose-jaen/FacialRecognizer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jose-jaen%2FFacialRecognizer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32607328,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-04T10:08:07.713Z","status":"ssl_error","status_checked_at":"2026-05-04T10:08:02.005Z","response_time":58,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cnn","computer-vision","data-science","deep-learning","faces","facial-recognition","lda","pca","polars","python","pytorch","statistics","uc3m"],"created_at":"2024-11-16T10:12:09.339Z","updated_at":"2026-05-04T12:31:30.235Z","avatar_url":"https://github.com/jose-jaen.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# FacialRecognizer\n\nA personal project leveraging both Statistical Learning and Deep Learning techniques to accurately recognize and classify faces. \n\n`PyTorch` and `polars` are used for efficiently handling the data in a distributed fashion. 
\n\nThe \u003ca target=\"_blank\" rel=\"noopener noreferrer\" href=\"https://cmp.felk.cvut.cz/~spacelib/faces/\"\u003e`faces94`\u003c/a\u003e dataset is used to train and evaluate the models.\n\n## Summary\n\n\u003ca target=\"_blank\" rel=\"noopener noreferrer\" href=\"https://cmp.felk.cvut.cz/~spacelib/faces/\"\u003e`faces94`\u003c/a\u003e consists of thousands of images: around 20 images for each of more than 300 distinct individuals.\n\nA facial recognition system is built with two well-known Statistical Learning methods implemented from scratch: \u003ca href=\"https://www.face-rec.org/algorithms/PCA/jcn.pdf\" target=\"_blank\" rel=\"noopener noreferrer\"\u003e\u003cstrong\u003eEigenfaces\u003c/strong\u003e\u003c/a\u003e and \n\u003ca href=\"https://cseweb.ucsd.edu/classes/wi14/cse152-a/fisherface-pami97.pdf\" target=\"_blank\" rel=\"noopener noreferrer\"\u003e\u003cstrong\u003eFisherfaces\u003c/strong\u003e\u003c/a\u003e. This system can determine whether a person belongs to the given database and, if so, identify them.\n\nFurthermore, a Deep Learning model (**Convolutional Neural Network**) replicating the \u003ca href=\"http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf\" target=\"_blank\" rel=\"noopener noreferrer\"\u003e\u003cstrong\u003eLeNet-5\u003c/strong\u003e\u003c/a\u003e architecture is trained with PyTorch.\n\n## Demo\n\nCheck the \u003ca href=\"https://github.com/jose-jaen/FacialRecognizer/blob/main/demo.ipynb\" rel=\"noreferrer noopener\" target=\"_blank\"\u003eJupyter Notebook\u003c/a\u003e, which showcases the models and gives additional information about the algorithms.\n\n## Requirements\n\nThe following software is needed to run the project:\n\n- `Python` \u003e= 3.11\n- `pip` \u003e= 23.3.2\n\nIf you have a GPU, installing `CUDA` is recommended.\n\nDownloading and automatically managing the images requires a Unix-based 
OS like Linux or macOS. A script that also supports Windows is in progress.\n\nAfter creating a virtual environment, install the Python libraries with:\n```bash\npython3 -m pip install -r requirements.txt\n```\n\n## Getting Started\n:warning: All Python scripts must be run from the directory in which they are located :warning:\n\nFirst, download and sort the images into three partitions:\n```bash\npython3 set_up_data.py\n```\n\nThis will create a `data` folder with all the relevant images split into `train`, `validation` and `test` partitions.\n\nAfter this you can train the models. For example, to train the CNN:\n```bash\npython3 train_lenet.py\n```\n\n## Project Structure\n\n### Data Engineering\n\n```bash\nFacialRecognizer/\n├── config/\n│   ├── .env\n│   └── url.py\n│\n├── controllers/\n│   ├── data_extractor.py\n│   └── data_images.py\n│     \n├── data/\n│   ├── train/\n│   ├── validation/\n│   └── test/\n│\n└── set_up_data.py\n```\n\nThe `config` folder simply stores the URL for the dataset within a variable in `url.py`. Because the URL is long, it is kept in an environment variable.\n\nThe main data extraction and partitioning process is implemented within `controllers`: `set_up_data.py` serves as a `main` file that runs the methods of the `DataExtractor` class in `data_extractor.py`. `data_images.py` turns each image into an array of pixels stored in a polars dataframe for later processing.\n\nConcretely, `faces94` is downloaded and a sample of female subjects is guaranteed to be present in every partition so that the models generalize well. 
To simulate the presence of individuals who do not belong to the original database, certain images are withheld from training and kept only in the `validation` and `test` folders.\n\n### Statistical Learning\n\n```bash\nFacialRecognizer/\n├── stats/\n│   ├── pca.py\n│   ├── lda.py\n│   ├── knn.py\n│   ├── eigenfaces.py\n│   └── fisherfaces.py\n│\n├── train_eigenfaces.py\n└── train_fisherfaces.py\n```\n\n`pca.py` includes the code for applying Principal Component Analysis (PCA) for dimensionality reduction. \n\n`lda.py` implements Fisher's Linear Discriminant Analysis (LDA) method, which maximizes the spread between classes while minimizing the within-class variance.\n\nA custom k-Nearest Neighbors (k-NN) algorithm can be found in `knn.py`. It predicts the class with the most votes among the selected candidates. The user may choose one of the following distance metrics: 'cosine', 'seuclidean', 'euclidean', 'canberra'.\n\n`eigenfaces.py` and `fisherfaces.py` define linear discriminants that feed the classifier defined in `knn.py`.\n\nThe scripts `train_eigenfaces.py` and `train_fisherfaces.py` each build a facial recognition system based on the Eigenfaces and Fisherfaces approaches, respectively, tuning the hyperparameters with Bayesian Optimization.\n\n### Deep Learning\n```bash\nFacialRecognizer/\n├── deep_learning/\n│   ├── dataset.py\n│   └── lenet.py\n│\n└── train_lenet.py\n```\n\n`dataset.py` sets up a custom PyTorch Dataset structure to store the data, and `lenet.py` defines the core LeNet-5 architecture and its training procedure. Finally, `train_lenet.py` fits a CNN to the data and performs classification, reporting the accuracy on test images.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjose-jaen%2Ffacialrecognizer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjose-jaen%2Ffacialrecognizer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjose-jaen%2Ffacialrecognizer/lists"}