{"id":14238483,"url":"https://github.com/sarthak268/nesca-pytorch","last_synced_at":"2025-08-11T08:31:24.171Z","repository":{"id":246857857,"uuid":"748377990","full_name":"sarthak268/nesca-pytorch","owner":"sarthak268","description":" PyTorch Implementation for the paper \"Let Me Help You! Neuro-Symbolic Short-Context Action Anticipation\" accepted to RA-L'24. ","archived":false,"fork":false,"pushed_at":"2024-11-27T09:00:35.000Z","size":415,"stargazers_count":9,"open_issues_count":1,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-12-06T18:53:01.869Z","etag":null,"topics":["action-anticipation","attention","computer-vision","human-robot-collaboration","human-robot-interaction","pytorch","robot-and-automation","short-context","video-understanding"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sarthak268.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-25T21:01:35.000Z","updated_at":"2024-11-27T09:00:38.000Z","dependencies_parsed_at":"2024-08-09T07:25:24.110Z","dependency_job_id":"b94e5789-10b0-4daa-a81e-3bd784d3c9dd","html_url":"https://github.com/sarthak268/nesca-pytorch","commit_stats":null,"previous_names":["sarthak268/nesca-pytorch"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sarthak268%2Fnesca-pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sarthak268%2Fnesca-pytorch/tags","releases_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/sarthak268%2Fnesca-pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sarthak268%2Fnesca-pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sarthak268","download_url":"https://codeload.github.com/sarthak268/nesca-pytorch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":229516971,"owners_count":18085475,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["action-anticipation","attention","computer-vision","human-robot-collaboration","human-robot-interaction","pytorch","robot-and-automation","short-context","video-understanding"],"created_at":"2024-08-21T03:00:51.829Z","updated_at":"2024-12-13T08:30:28.619Z","avatar_url":"https://github.com/sarthak268.png","language":"Python","funding_links":[],"categories":["Robot Arm"],"sub_categories":[],"readme":"# Let Me Help You! Neuro-Symbolic Short-Context Action Anticipation\n\nThis is a PyTorch implementation of the paper \"[\u003ci\u003eLet Me Help You!\u003c/i\u003e Neuro-Symbolic Short-Context Action Anticipation](https://ieeexplore.ieee.org/document/10582423)\". 
This work has been accepted to the [IEEE Robotics and Automation Letters (RA-L)](https://www.ieee-ras.org/publications/ra-l), 2024.\n\n|📑 Original Paper|📰 Project Page|\n|:-:|:-:|\n[Paper](https://ieeexplore.ieee.org/document/10582423) | [Project Page](https://sarthak268.github.io/NeSCA/) |\n\n\u003cimg src=\"media/overview.png\" alt=\"Approach Figure\" width=\"50%\" height=\"50%\"\u003e\n\n## Abstract\n\nIn an era where robots become available to the general public, the applicability of assistive robotics extends across numerous aspects of daily life, including in-home robotics.\nThis work presents a novel approach for such systems, leveraging long-horizon action anticipation from short-observation contexts.\nIn an assistive cooking task, we demonstrate that predicting human intention leads to effective collaboration between humans and robots.\nCompared to prior approaches, our method halves the required observation time of human behavior before accurate future predictions can be made, thus, allowing for quick and effective task support from short contexts. \nTo provide sufficient context in such scenarios, our proposed method analyzes the human user and their interaction with surrounding scene objects by imbuing the system with additional domain knowledge, encoding the scene object's affordances. \nWe integrate this knowledge into a transformer-based action anticipation architecture, which alters the attention mechanism between different visual features by either boosting or attenuating the attention between them. \nThrough this approach, we achieve an up to 9% improvement on two common action anticipation benchmarks, namely \u003ci\u003e50Salads\u003c/i\u003e and \u003ci\u003eBreakfast\u003c/i\u003e.\nAfter predicting a sequence of future actions, our system selects an appropriate assistive action that is subsequently executed on a robot for a joint salad preparation task between a human and a robot. 
\n\n## Citation \n\nIn case you find our work useful, consider citing:\n```\n@ARTICLE{BhagatNeSCA,\n  author={Bhagat, Sarthak and Li, Samuel and Campbell, Joseph and Xie, Yaqi and Sycara, Katia and Stepputtis, Simon},\n  journal={IEEE Robotics and Automation Letters}, \n  title={Let Me Help You! Neuro-Symbolic Short-Context Action Anticipation}, \n  year={2024},\n  pages={1-8},\n  doi={10.1109/LRA.2024.3421848}}\n``` \n\n## Index\n\n1. [Environment Setup](#setup)\n2. [Dataset](#dataset)\n3. [Training](#training)\n4. [Testing](#testing)\n5. [License](#license)\n\n## Setup\n\nThis system was tested on Ubuntu 22.04.\nTo build a ```conda``` environment for running our model, run the following command:\n```\nconda env create -f environment.yml\n```\n\nActivate the environment using:\n```\nconda activate nesca\n```\n\n## Dataset\n\nWe train our action anticipation pipeline on two publicly available datasets, namely \u003ci\u003e50Salads\u003c/i\u003e and \u003ci\u003eBreakfast\u003c/i\u003e. You can download the features from [this link](https://mega.nz/file/O6wXlSTS#wcEoDT4Ctq5HRq_hV-aWeVF1_JB3cacQBQqOLjCIbc8). \n\u003cbr\u003e\nTo leverage the trained model on our real-world setup, we fine-tune our model on the collected demonstrations. We also open-source the dataset that we collected in our real-world setup to promote research on action anticipation in complex scenarios. The dataset can be found [here](https://drive.google.com/drive/u/1/folders/1gfhkG3zDmewIR4CnO7KbSB1Kwvu15jl0).  
\n\n## Training \n\nTo train our model on the 50Salads dataset, run the following command:\n```\nCUDA_VISIBLE_DEVICES=GPU_ID python main.py --task long --seg --anticipate --pos_emb --n_query 20 --n_encoder_layer 2 --n_decoder_layer 2 --batch_size 8 --hidden_dim 512 --workers 1 --dataset 50salads --max_pos_len 3100 --sample_rate 6 --epochs 70 --mode=train --input_type=i3d_transcript --split=SPLIT_NUM\n```\n\nMost training parameters are similar to the ones in the [FUTR codebase](https://github.com/gongda0e/FUTR), so please refer to it for more information on the training parameters. \nTo train the baseline model without NeSCA (i.e. FUTR), set the arguments `use_gsnn` and `kg_attn` to False in `opts.py`. Alternatively, you can use the weights provided [here](https://github.com/gongda0e/FUTR). \n\n## Testing\n\nYou can find the pre-trained weights for our model trained on the 50Salads dataset [here](https://drive.google.com/drive/folders/1ezfe2V_buwmu21F4DK1xdDVnJaI_dpsp?usp=sharing). \u003cbr\u003e\nTo obtain the results mentioned in the paper, place your trained models in a directory called ```ckpt/``` inside the home directory and then run the following command. 
\n\n```\nCUDA_VISIBLE_DEVICES=GPU_ID python main.py --hidden_dim 512 --n_encoder_layer 2 --n_decoder_layer 2 --n_query 20 --seg --task long --pos_emb --anticipate --max_pos_len 3100 --sample_rate 6 --dataset 50salads --predict --mode=train --split=SPLIT_NUM\n```\n\n## Resources \n\nThis implementation has been greatly inspired by the [codebase](https://github.com/gongda0e/FUTR) for the [FUTR paper](https://arxiv.org/abs/2205.14022).\n\n## License\n\nCopyright (c) 2023 Sarthak Bhagat, Samuel Li, Joseph Campbell, Yaqi Xie, Katia Sycara, Simon Stepputtis\n\nFor license information, see the license.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsarthak268%2Fnesca-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsarthak268%2Fnesca-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsarthak268%2Fnesca-pytorch/lists"}