{"id":16420765,"url":"https://github.com/andrew-chen-wang/age-detection","last_synced_at":"2025-02-24T16:17:16.412Z","repository":{"id":103624838,"uuid":"272011723","full_name":"Andrew-Chen-Wang/age-detection","owner":"Andrew-Chen-Wang","description":"Estimating Age using BOTH Speech and Facial Features","archived":false,"fork":false,"pushed_at":"2020-06-14T03:57:34.000Z","size":11,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-07T05:19:24.557Z","etag":null,"topics":["age-estimation","computer-vision","opencv","python","python3","speech-recognition"],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Andrew-Chen-Wang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-06-13T13:13:58.000Z","updated_at":"2020-06-14T03:57:37.000Z","dependencies_parsed_at":null,"dependency_job_id":"f31ac69c-ce52-49f4-b8cc-dc638d56a81d","html_url":"https://github.com/Andrew-Chen-Wang/age-detection","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrew-Chen-Wang%2Fage-detection","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrew-Chen-Wang%2Fage-detection/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrew-Chen-Wang%2Fage-detection/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrew
-Chen-Wang%2Fage-detection/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Andrew-Chen-Wang","download_url":"https://codeload.github.com/Andrew-Chen-Wang/age-detection/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240511299,"owners_count":19813236,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["age-estimation","computer-vision","opencv","python","python3","speech-recognition"],"created_at":"2024-10-11T07:29:07.734Z","updated_at":"2025-02-24T16:17:16.367Z","avatar_url":"https://github.com/Andrew-Chen-Wang.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Age Detection\nPublished by: Andrew Chen Wang\n\nStarted on 9 June 2020\n\nThis project aims to use computer vision alongside text-to-speech _or_ speech recognition\nto identify age during the following scenarios:\n\n1. Static images\n  - This will only work with computer vision.\n  - [Conclusion of paper 13 June 2020](https://github.com/Andrew-Chen-Wang/static-image-age-detection):\n    age detector is terrible, even with the\n    multi-image processing. It is most likely due to the model, as most of my input\n    data is of teenagers and they're not particularly well-documented in the\n    age detector model. Moving on to integrating with audio recordings to really\n    make this work much better! That's my opinion, at least.\n    \n2. 
Chatrooms and Audio Recordings\n  - This would only work with the text-to-speech or speech recognition, depending\n  on the context of the input.\n  - With audio recordings, you can more easily identify age based on pitch and\n  other factors than simple text-to-speech.\n  - Text-to-speech recognition should NOT formulate words. Instead, it should\n  formulate syllables (like Siri and Alexa) and piece those together based on\n  the number of spaces between each \"word.\" This is part of growing up: you\n  can't pronounce your TH's correctly.\n3. (Live) Videos\n  - Live videos utilize both computer vision and speech recognition.\n  - In videos, multiple people could be talking, so it is important to\n  figure out **who** is talking and try our best to correctly identify the age\n  of a person based on the inputs.\n\nThe goal is the third point. Each goal will be marked as a milestone as two set-pieces\nin tackling videos. Age detection today is very good; however, [many lack the integration\nof speech recognition and face in combination.\n\u003csup\u003e[1]\u003c/sup\u003e](https://www.psychologicabelgica.com/articles/10.5334/pb.aq/)\nAlthough the paper was published in 2014 and I believe my Googling skills to be top-notch,\nI still don't think anyone has done this.\n\nThe main problem is the integration of each component. Obviously, the computer vision\nweighs more than the speech recognition (especially if it's text-to-speech), but\nfinding a balance between these two components is what I'm looking for.\n\n---\n### Pre-Log of Initial Thoughts\n\nDated: 13 June 2020\n\nMy thoughts before I found this paper were to combine the inputs at some random\nlayer of the network. (Oh FYI, I've never learned how to do ML, so lots of this is\ntrial-and-error commits with practically zero unit tests). 
I was hoping to give more\nweight or bias to the one that had more confidence, and this could be for multiple\nreasons such as a lack of images and only voice chat, a lack of sound and only\nvideo stream, etc. in addition to just confidence of detection and running through\nthe layers themselves.\n\nIn addition to those inputs, throughout the layer, we'll be able to figure out\nother human attributes such as age and ethnicity, which could assist the speech\nrecognition. Especially when it comes to \n\nAfter reading this paper, I can safely say that there are several factors\nthat should outweigh others. A notable point was that male detection vs.\nfemale detection can lead to differences in estimation. In the paper,\nit is said that female voices lead to better estimation, whereas\nmale faces lead to better estimation.\n\nI will take into account all these factors.\n\n---\n### License\n\nCopyright 2020 Andrew Chen Wang\n\n   Licensed under the Apache License, Version 2.0 (the \"License\");\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an \"AS IS\" BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n\n---\n### Credit + References\n\n1. Moyse, E., 2014. Age Estimation from Faces and Voices: A Review. Psychologica Belgica, 54(3), pp.255–265. DOI: http://doi.org/10.5334/pb.aq\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrew-chen-wang%2Fage-detection","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandrew-chen-wang%2Fage-detection","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrew-chen-wang%2Fage-detection/lists"}