{"id":18439509,"url":"https://github.com/idiap/vfoa","last_synced_at":"2025-04-15T03:46:41.482Z","repository":{"id":144963478,"uuid":"303306127","full_name":"idiap/vfoa","owner":"idiap","description":"Methods to estimate the visual focus of attention","archived":false,"fork":false,"pushed_at":"2020-10-12T07:04:37.000Z","size":168,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-15T03:46:27.690Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/idiap.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-10-12T07:04:15.000Z","updated_at":"2024-03-20T20:40:48.000Z","dependencies_parsed_at":null,"dependency_job_id":"cab52c8b-1640-4500-95f5-0e5d5feb3075","html_url":"https://github.com/idiap/vfoa","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/idiap%2Fvfoa","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/idiap%2Fvfoa/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/idiap%2Fvfoa/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/idiap%2Fvfoa/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/idiap","download_url":"https://codeload.github.com/idiap/vfoa/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"g
ithub","repositories_count":249003941,"owners_count":21196794,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T06:25:12.874Z","updated_at":"2025-04-15T03:46:41.466Z","avatar_url":"https://github.com/idiap.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# VFOA Module\n\nThis package contains several models that estimate the visual focus of attention (VFOA) of a subject given\nthe subject's gaze and the positions of the potential visual targets (some models require additional variables, such as\nhead position and orientation, speaking status, ...). It also provides a handler class, VFOAModule, that links\nthe user to those models.\n\n### Prerequisites\n\n- NumPy 1.14.0\n\n## Installing\n\nInstall the package using\n```bash\npython setup.py install\n```\nAdd the `--record installed_files.txt` option to get the list of installed files.\n\n## Usage\n\nTo use this module, import it and instantiate it using the following code. 
A typical implementation looks like this:\n\n```python\nfrom vfoa.vfoa_module import VFOAModule, Person, Target\n\n# modelName must be one of the model names listed below\nvfoaModule = VFOAModule(modelName)\n\ndef get_vfoa(subject, targetList, frameIndex):\n    \"\"\" Return probabilities that \u003csubject\u003e looks at each target in \u003ctargetList\u003e \"\"\"\n    # Each pose is a 6-element list: position (x, y, z) followed by angles (yaw, pitch, roll)\n    headpose = [subject.headpose.x, subject.headpose.y, subject.headpose.z,\n                subject.headpose.yaw, subject.headpose.pitch, subject.headpose.roll]\n    gaze = [subject.gaze.x, subject.gaze.y, subject.gaze.z,\n            subject.gaze.yaw, subject.gaze.pitch, subject.gaze.roll]\n    bodypose = [subject.bodypose.x, subject.bodypose.y, subject.bodypose.z,\n                subject.bodypose.yaw, subject.bodypose.pitch, subject.bodypose.roll]\n    speaking = subject.isSpeaking\n    personDict = {subject.name: Person(subject.name, headpose, gaze, bodypose, speaking)}\n\n    # Targets only need a name and a 3D position\n    targetDict = {}\n    for target in targetList:\n        position = [target.position.x, target.position.y, target.position.z]\n        targetDict[target.name] = Target(target.name, position)\n\n    vfoaModule.compute_vfoa(personDict, targetDict, frameIndex)\n    return vfoaModule.get_vfoa_best(subject.name)\n```\n\nYou should provide a valid *modelName* (the list is given below).\nNote that some methods ignore some cues, such as gaze or bodypose, and return a different VFOA format.\nSee below for a description of each method.\n\n#### 1) geometricalModel\nEstimate VFOA based on a geometric model: if the angular distance between the gaze vector and the target\ndirection (i.e. the line from the eye to the target) is below a given threshold, the gaze is allocated to this target.\nIf several targets satisfy this condition, the one nearest to the gaze vector wins.\n\nNeeded inputs: **gaze direction**\n\nOutput: dictionary of {targetName: isVFOA}, i.e. 
a sort of one-hot encoding\n\nNote: you can also use this model without knowing the gaze, by approximating the gaze with the headpose\n\n\n#### 2) gazeProbability\nEstimate VFOA based on the probability that the subject is looking in the target direction. That is,\nwe compute the posterior probability of the gaze and evaluate it at each target position.\nThus, the output probabilities do not sum to 1, as they are only point-wise evaluations of the posterior.\n\nNeeded inputs: **headpose** and **gaze direction**\n\nOutput: dictionary of {targetName: probability}. Note that it is not normalized and that aversion is constant\n\nNote: you can also use this model without knowing the gaze, by approximating the gaze with the headpose\n\n\n#### 3) gaussianModel\nEstimate VFOA based on a Gaussian model: each target is modelled as a Gaussian centered on the target, with\na given variance. Aversion is also modelled as a Gaussian, centered on the head pose of the subject, with\nanother given variance. Finally, measurement noise is added to each Gaussian, and the model computes the likelihood of\nthe observed gaze with respect to each target. The final probabilities are normalized and returned, giving the probability\nthat the subject looks at each target.\n\nNeeded inputs: **headpose** (optional) and **gaze direction**\n\nOutput: dictionary of {targetName: probability}, a normalized distribution.\n\nNote: you can also use this model without knowing the gaze, by approximating the gaze with the headpose\n\n\n#### 4) HMM\nEstimate VFOA based on an HMM (hidden Markov model). Gaze is estimated from the head pose using either the Midline effect model or the dynamical\nhead reference model.\n\nNeeded inputs: **headpose**\n\nOutput: dictionary of {targetName: probability}, a normalized distribution.\n\n\n## About coordinate systems\nDepending on the method, different coordinate systems (CS) are used. 
To keep usage simple,\nthe coordinate system in use must be specified when a Person or a Target object is created. Here is the\nlist of possible CS:\n\nCS for positions in 3D space (positionCS):\n* Camera Coordinate System (CCS): this CS is attached to the camera. The z-axis goes backward (along\nthe optical axis), the y-axis goes upward and the x-axis goes to the right when one looks in the same\ndirection as the camera.\n* Optical Coordinate System (OCS): this CS corresponds to the OpenCV frame, i.e. the x-axis goes to\nthe right in the image frame, the y-axis goes downward in the image frame and the z-axis goes forward\nand is aligned with the optical axis.\n\nCS for poses in 3D space (poseCS):\n* Camera Coordinate System (CCS): this CS is attached to the camera. The z-axis goes backward (along\nthe optical axis), the y-axis goes upward and the x-axis goes to the right when one looks in the same\ndirection as the camera. Angles are defined as follows: yaw is the positive rotation about y, pitch is\nthe negative rotation about x and roll is the rotation about z\n* Frontal Coordinate System (FCS): this CS is attached to the head of a person and is oriented\ntoward the camera. This means that it depends on the person and on time. The x-axis points\ntoward the camera, the z-axis goes upward and the y-axis goes to the right when one is in front of\nthe person. Angles are defined as follows: yaw is the negative rotation about z, pitch is the negative\nrotation about y and roll is the negative rotation about x\n\n![Coordinate system (CCS, OCS, FCS)](images/coordinate_systems.png)\n\n*Example for CCS*: Let's define a setup with a subject in front of a camera. We define a coordinate\nsystem attached to the camera, with the x-axis to the right, the y-axis up and the z-axis backward (from the\ncamera point of view). If the subject is in the center of the image, standing 1 meter from the\ncamera, his xyz position will be (0, 0, -1). 
Now we can define the gaze angles:\n* if he is looking at the camera, the gaze direction yaw-pitch-roll will be (0, 0, 0);\n* if he looks 45 degrees up, it will become (0, 45, 0);\n* if he looks 45 degrees to the right, it will become (-45, 0, 0);\n* etc\n\n### Other coordinate systems\nYou can work with another coordinate system, as long as it is fixed over time (FCS is a special case\nthat requires particular handling). To do so, set positionCS and poseCS to 'CCS', and the methods\nshould work normally.\n\n## Examples\n\n...\n\n## License\n\nThis project is licensed under the BSDv3 License - see the [LICENSE](LICENSE) file for details.\n\nCopyright (c) 2018, Idiap Research Institute\nAuthor: Remy Siegfried (remy.siegfried@idiap.ch)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fidiap%2Fvfoa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fidiap%2Fvfoa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fidiap%2Fvfoa/lists"}