{"id":45466770,"url":"https://github.com/marcelsamyn/thesis","last_synced_at":"2026-05-03T21:33:18.011Z","repository":{"id":74347630,"uuid":"126800975","full_name":"marcelsamyn/thesis","owner":"marcelsamyn","description":"Autonomous Production of Gestures on a Social Robot using Deep Learning","archived":false,"fork":false,"pushed_at":"2020-03-30T20:38:33.000Z","size":64051,"stargazers_count":2,"open_issues_count":3,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-02-22T13:53:35.518Z","etag":null,"topics":["gestures","machine-learning","robotics"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/marcelsamyn.png","metadata":{"files":{"readme":"README.org","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-03-26T08:55:35.000Z","updated_at":"2023-02-19T11:53:56.000Z","dependencies_parsed_at":null,"dependency_job_id":"832eca65-5107-4bbb-ba48-1f732e5892c8","html_url":"https://github.com/marcelsamyn/thesis","commit_stats":null,"previous_names":["marcelsamyn/thesis"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/marcelsamyn/thesis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcelsamyn%2Fthesis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcelsamyn%2Fthesis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcelsamyn%2Fthesis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcelsamyn%2Fthesis/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/marcelsamyn","download_url":"https://codeload.github.com/marcelsamyn/thesis/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcelsamyn%2Fthesis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32586187,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-03T06:36:36.687Z","status":"ssl_error","status_checked_at":"2026-05-03T06:36:09.306Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gestures","machine-learning","robotics"],"created_at":"2026-02-22T09:50:57.286Z","updated_at":"2026-05-03T21:33:17.980Z","avatar_url":"https://github.com/marcelsamyn.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"* Thesis\n\n** Installation\n   \n   The simplest way to get started is by using the virtualenv, managed with\n   pipenv:\n\n   #+BEGIN_SRC shell\n     pip install --user pipenv\n     pipenv install\n     pipenv shell\n   #+END_SRC\n\n   This will take in all the Python dependencies and could take a while. The\n   last command opens the virtualenv so you can start using all these\n   dependencies.\n   \n   This project was developed on a machine running Ubuntu 16.04 LTS. Most things\n   will work on other distributions too. 
However, I've found that Choregraphe,\n   SoftBank's robot simulator, might not work on distributions with newer\n   versions of certain packages (I can't remember which, but it doesn't work on\n   my machine running the latest Fedora). Some parts of the project, like\n   the Video Picker, also require system-level dependencies which aren't managed\n   by Pipenv.\n\n   If you have difficulties with dependencies, you can run the code in a Docker\n   container instead. There are two options:\n\n   - The machine specified in ~Dockerfile~ requires installation of\n     ~nvidia-docker~ and thus requires an NVIDIA GPU. This also uses\n     ~tensorflow-gpu~ to exploit your graphics card.\n   - The machine specified in ~Dockerfile.cpu~ uses the regular Ubuntu 16.04\n     image and runs the CPU variant of TensorFlow.\n\n   To set up the Docker machine, run ~make build~ or ~make cpubuild~ and to\n   start it, run ~make start~ or ~make cpustart~, respectively.\n\n   Note that these Docker containers install the NAOqi SDKs and Choregraphe, but\n   since I can't host them here, you should download them yourself from the\n   SoftBank website and place them in the root directory of this repository.\n   Double-check the file names in the Dockerfile to make sure the version\n   mentioned there is the same as the one you downloaded.\n\n*** Running OpenPose\n    For OpenPose, a separate Dockerfile is present in ~src/openpose~. 
This\n    builds OpenPose as part of the Docker build process, so you can immediately\n    start using the compiled binaries.\n\n*** Installing ~nvidia-docker~\n\n    If you are running Ubuntu, installation is easy:\n\n    #+BEGIN_SRC sh\n      git clone git@github.com:ryanolson/bootstrap.git\n      cd bootstrap\n      ./bootstrap.sh\n    #+END_SRC\n\n    If you have another setup, the installation is probably quite simple too\n    (but not /this/ simple :wink:).\n    \n** Running\n\n*** Building the Dataset\n\n**** Collect video\n\n     First, find a suitable video on YouTube. It should have the following\n     properties:\n\n     - Is in English (actually this is not strictly necessary, but don't start\n       messing with multiple languages :wink:)\n     - Has subtitles available (the download script downloads the automatically\n       generated subtitles, so if there are built-in ones that you want to use,\n       you have to modify it)\n     - Has at least one shot of a person who is fully visible in frame (from\n       head to toe)\n\n     For these steps you need ~youtube-dl~ and ~ffmpeg~ installed. Open a\n     command line in the ~src/video~ directory and run:\n     \n     #+BEGIN_SRC sh\n       ./download.sh '$YOUTUBE_URL'\n       ./detect-shots.sh $NEW_VIDEO_FILE\n     #+END_SRC\n     \n     If there are quotes in the file name created by ~download.sh~, they\n     might cause trouble in the next step. I recommend removing them first.\n     Keep the YouTube ID at the end of the file name since it is used by other\n     parts of the project.\n\n**** Extract Video Clips\n     \n     The Video Picker assists with selecting the right parts of a video and\n     saves the initial data for the dataset. 
You need some system-level\n     dependencies such as FFmpeg, Cairo, GStreamer and the Python GTK libraries.\n     You can refer to the Dockerfiles to see which packages these are or just run\n     this from the Docker container if you don't want to install them yourself.\n\n     Browse to the ~src/video-picker~ directory and run ~main.py~:\n\n     #+BEGIN_SRC sh\n       python main.py\n     #+END_SRC\n     \n     Click the =Open= icon or press =Ctrl+O= to open a video file. Then, the\n     video will start playing and the program should look something like this:\n\n     [[file:./img/video-picker-screenshot.png]]\n\n     The most ergonomic way to extract video is to hold your left hand above\n     your keyboard (you only need its left half anyway) and hold your mouse\n     with your right hand.\n\n     Point the cursor roughly at the hip of the person you're interested in. You\n     can adjust the size of the rectangle by scrolling. (This information is\n     saved but, unlike in a previous version, it is no longer used, so the size\n     of the rectangle doesn't really matter.)\n\n     #+BEGIN_QUOTE\n     *Some terminology.* The ~detect-shots.sh~ script you ran above runs a\n      /scene detection/ algorithm which detects the *shots* in a video. A shot\n      is a single continuous piece of video. So, there is another shot if the\n      camera cuts to another angle, for example. Sometimes these changes are not\n      detected, for example, when there is a fading animation in between shots.\n     #+END_QUOTE\n     \n     Navigate and record with the following shortcuts:\n\n     - ~Ctrl+D~ go to the previous shot\n     - ~Ctrl+F~ go to the next shot\n     - ~Ctrl+R~ record the current shot (rewinds to the start of this shot\n       first)\n     - ~Ctrl+R~ (while recording) stop recording immediately. 
Use this when\n       there is a new shot the scene detection algorithm didn't pick up.\n\n     While Video Picker is recording, just let it play until it stops recording.\n     The cursor changes color according to what state it's in:\n\n     - Red cursor :: recording\n     - Semi-transparent cursor :: this clip is unusable (probably because the\n          subtitle is present during a change of shot) and is thus not being\n          recorded\n     - Green cursor :: this clip is already recorded\n\n**** Perform 2D Pose Detection with OpenPose\n\n     Move the ~src/video-picker/images~ folder to ~src/openpose/src/images~.\n\n     Go to ~src/openpose~ in your command line and set up the container:\n     \n     #+BEGIN_SRC sh\n       make build\n       make start\n     #+END_SRC\n\n     Once you're in the container, run OpenPose on the images extracted by the\n     Video Picker:\n\n     #+BEGIN_SRC sh\n       cd openpose\n       ./build/examples/openpose/openpose.bin --image_dir ~/dev/images/ --write_json ~/dev/output/\n     #+END_SRC\n\n**** Lift the poses to 3D with 3D Pose Baseline\n\n     The 3D Pose Baseline will read the output directory from OpenPose and save\n     the lifted 3D poses into the ~clips.jsonl~ file created by the Video\n     Picker.\n\n     Go to the ~src/openpose-baseline~ folder and simply run ~make~.\n\n**** Clean the data\n     \n     Go to the ~src/~ folder and run:\n     \n     #+BEGIN_SRC sh\n       ./util.py dataset preprocess\n     #+END_SRC\n\n**** Detect the clusters\n\n     Go to the ~src/clustering~ folder and run:\n     \n     #+BEGIN_SRC sh\n       R detect-clusters.r\n     #+END_SRC\n\n     You need to have R installed, but the R dependencies will be installed with\n     Pacman.\n\n**** Create the TFRecord dataset\n     \n     Go to the ~src/~ folder and run:\n\n     #+BEGIN_SRC sh\n       ./util.py create-tfrecords\n     #+END_SRC\n\n     Phew! 
The dataset is ready.\n\n*** Using the model\n\n**** Training\n\n    Go to the ~src/learning~ folder and run:\n\n    #+BEGIN_SRC sh\n      python model.py --train\n    #+END_SRC\n\n    Modify the parameters at the bottom of ~model.py~ if you want to.\n\n**** Evaluating\n\n     Results are automatically evaluated at the end of training. You can inspect\n     them by starting TensorBoard:\n\n     #+BEGIN_SRC sh\n       make board\n     #+END_SRC\n     \n     Then, navigate to [[http://localhost:6006]]. Note that you can also run\n     TensorBoard during training to follow the results as they come in.\n\n**** Inference\n     \n     To plot the pose for a subtitle of your choice, run:\n     \n     #+BEGIN_SRC sh\n       python model.py --predict --subtitle 'robots are smarter with machine learning'\n     #+END_SRC\n\n*** Performing gestures on a robot\n\n    In ~src/util.py~ a few functions are implemented to play back poses. To\n    specify how to connect to the NAO, use the ~--bot_address~ and ~--bot_port~\n    options. Defaults are ~127.0.0.1~ and 9559, respectively.\n    \n    #+BEGIN_SRC sh\n      ./util.py bot play-random-clip  # Take a random clip and play the ground truth gesture\n      ./util.py bot play-clusters     # Play the clusters from `cluster-centers.json`\n    #+END_SRC\n\n**** TODO Add a method to play back predictions\n\n*** Preparing the survey\n\n    There is a single ~create question~ function that prepares the gestures\n    needed for a question in the survey. Such a question contains a video\n    recording of the robot performing three clips immediately after each other,\n    in four different scenarios:\n\n    - Ground truth (3D pose detections)\n    - Baseline (built-in robot animations)\n    - Classification-based prediction (uses the clusters)\n    - Sequence-based prediction (directly predicts the gesture)\n\n    The ~create question~ function makes the robot perform these scenarios one\n    after the other. 
While performing a gesture, its eye LEDs will be active and in\n    between performances they will be turned off. It will also print the\n    associated (combined) subtitle and save the metadata for the question in\n    ~questions.jsonl~.\n\n    Go to ~src/~ and run:\n    \n    #+BEGIN_SRC sh\n      ./util.py survey create-question\n    #+END_SRC\n\n\n**** Using a virtual robot\n\n     It is possible to generate a question using a virtual robot from a running\n     Choregraphe instance.\n\n     Run the ~create_question~ function in ~src/survey.py~ with\n     ~do_record_screen=True, do_generate_tts=True~. You will probably need to\n     update the code to make sure the correct region of your display is\n     captured. To generate the speech, the IBM Watson TTS API is used\n     (since the SoftBank TTS engine is not available in the simulator). For that\n     to work, you need to sign up for an account and set up the following\n     environment variables:\n\n     #+BEGIN_SRC sh\n       export WATSON_TTS_USERNAME='xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'\n       export WATSON_TTS_PASSWORD='xxxxxxxxxxxx'\n     #+END_SRC\n     \n     *Tip:* Save these in a ~.env~ file in the root directory of this project.\n      Pipenv will automatically load the environment variables when running\n      ~pipenv shell~. You'll need to load them manually, though, if you're running\n      this in a Docker container (since there's no virtual environment in that\n      case).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarcelsamyn%2Fthesis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmarcelsamyn%2Fthesis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarcelsamyn%2Fthesis/lists"}