{"id":17908377,"url":"https://github.com/pseudotensor/temporal_autoencoder","last_synced_at":"2025-09-06T16:37:55.936Z","repository":{"id":148324305,"uuid":"80373341","full_name":"pseudotensor/temporal_autoencoder","owner":"pseudotensor","description":"Temporal Autoencoder Project","archived":false,"fork":false,"pushed_at":"2017-03-11T03:49:34.000Z","size":133,"stargazers_count":5,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-07-10T16:31:20.900Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pseudotensor.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-01-29T21:19:18.000Z","updated_at":"2021-08-22T17:58:13.000Z","dependencies_parsed_at":null,"dependency_job_id":"f51a84d4-31ad-4c04-94f0-f114e0f4f64a","html_url":"https://github.com/pseudotensor/temporal_autoencoder","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/pseudotensor/temporal_autoencoder","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pseudotensor%2Ftemporal_autoencoder","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pseudotensor%2Ftemporal_autoencoder/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pseudotensor%2Ftemporal_autoencoder/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pseudotensor%2Ftemporal_autoencoder/manifests","
owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pseudotensor","download_url":"https://codeload.github.com/pseudotensor/temporal_autoencoder/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pseudotensor%2Ftemporal_autoencoder/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273932805,"owners_count":25193599,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-06T02:00:13.247Z","response_time":2576,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-28T19:15:41.706Z","updated_at":"2025-09-06T16:37:55.818Z","avatar_url":"https://github.com/pseudotensor.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## What: Temporal Autoencoder for Predicting Video\n\n## How: Tensorflow version of CNN to LSTM to uCNN\n\n## Why:\n\n# Inspired by papers:\n\nhttp://www.jmlr.org/proceedings/papers/v2/sutskever07a/sutskever07a.pdf\nhttps://arxiv.org/abs/1411.4389\nhttps://arxiv.org/abs/1504.08023\nhttps://arxiv.org/abs/1506.04214 (like this paper with RNN but now with 
LSTM)\nhttps://arxiv.org/abs/1511.06380\nhttps://arxiv.org/abs/1511.05440\nhttps://arxiv.org/abs/1605.08104\nhttp://file.scirp.org/pdf/AM20100400007_46529567.pdf\nhttps://arxiv.org/abs/1607.03597\nhttp://web.mit.edu/vondrick/tinyvideoa\nhttps://arxiv.org/abs/1605.07157\nhttps://arxiv.org/abs/1502.04681\nhttp://www.ri.cmu.edu/pub_files/2014/3/egpaper_final.pdf\n\n# Uses parts of (or inspired by) the following repos:\n\nhttps://github.com/tensorflow/models/blob/master/real_nvp/real_nvp_utils.py\nhttps://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py\nhttps://github.com/machrisaa/tensorflow-vgg\nhttps://github.com/loliverhennigh/\nhttps://coxlab.github.io/prednet/\nhttps://github.com/tensorflow/models/tree/master/video_prediction\nhttps://github.com/yoonkim/lstm-char-cnn\nhttps://github.com/anayebi/keras-extra\nhttps://github.com/tgjeon/TensorFlow-Tutorials-for-Time-Series\nhttps://github.com/jtoy/awesome-tensorflow\nhttps://github.com/aymericdamien/TensorFlow-Examples\n\n# Inspired by the following articles:\n\nhttp://spectrum.ieee.org/automaton/robotics/artificial-intelligence/deep-learning-ai-listens-to-machines-for-signs-of-trouble?adbsc=social_20170124_69611636\u0026adbid=823956941219053569\u0026adbpl=tw\u0026adbpr=740238495952736256\n\nhttp://www.theverge.com/2016/8/4/12369494/descartes-artificial-intelligence-crop-predictions-usda\n\nhttps://devblogs.nvidia.com/parallelforall/exploring-spacenet-dataset-using-digits/\n\n# And inspired to a lesser extent by the following papers:\n\nhttps://arxiv.org/abs/1508.01211\nhttps://arxiv.org/abs/1507.08750\nhttps://arxiv.org/abs/1505.00295\nwww.ijcsi.org/papers/IJCSI-8-4-1-139-148.pdf\ncs231n.stanford.edu/reports2016/223_Report.pdf\n\n# Program Requirements:\n\n* Tensorflow 0.12\n* Python 2.7\n* OpenCV\n\n# Post-Processing Requirements:\n\n* avconv, mencoder, MP4Box, smplayer\n\n\n# How to run:\n\npython main.py\n\n# Post-processing: making 
model vs. predicted video:\n\nsh mergemov.sh\n\nsmplayer out_all.mp4\n\nor\n\nsmplayer out_all2_fast.mp4\n\n# Some training results:\n\n* Balls, slow movie: [![Balls, slow movie](http://img.youtube.com/vi/xQdaaYogRMM/0.jpg)](https://www.youtube.com/watch?v=xQdaaYogRMM)\n\n* Balls, fast movie: [![Balls, fast movie](http://img.youtube.com/vi/wxxD4sDUEfg/0.jpg)](https://www.youtube.com/watch?v=wxxD4sDUEfg)\n\n* Training Curve in Tensorflow (norm order 80): ![Training loss curve for balls](https://github.com/pseudotensor/temporal_autoencoder/blob/master/lossexamples/loss_balls.jpg \"Training loss curve for balls prediction vs. model.\")\n\n\n* Wheel, slow movie: [![Wheel, slow movie](http://img.youtube.com/vi/8IsqTFnZ_1w/0.jpg)](https://www.youtube.com/watch?v=8IsqTFnZ_1w)\n\n* Wheel, fast movie: [![Wheel, fast movie](http://img.youtube.com/vi/lABUOLzCp-k/0.jpg)](https://www.youtube.com/watch?v=lABUOLzCp-k)\n\n* Training Curve in Tensorflow (norm order 40): ![Training loss curve for wheel](https://github.com/pseudotensor/temporal_autoencoder/blob/master/lossexamples/loss_wheel.jpg \"Training loss curve for wheel prediction vs. 
model.\")\n\n\n# Parameters:\n\n1) In main.py:\n\n* Choose global flags\n* In main():\n  * Choose whether to use checkpoints (if they exist): continuetrain\n  * Type of model: modeltype\n  * Number of balls: num_balls\n\n2) In balls.py:\n\n* SIZE: size of ball's bounding box in pixels\n\n\n# Ideas and Future Work:\n\n* Test on other models\n\n* Try more filters\n\n* Try temporal convolution\n\n* Try other LSTM architectures (C-peek, bind forget-recall, GRU, etc.)\n\n* Try adversarial loss:\n\nhttps://github.com/carpedm20/DCGAN-tensorflow\nhttp://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow/\nhttp://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow/ (pytorch)\nhttps://arxiv.org/pdf/1511.05644v2.pdf\n\n* Try more depth in time\n\n* Train with geodesic acceleration (can't be done in Python in Tensorflow)\n\n* Try homogeneous LSTM/CNN architecture\n\n* Include depth in CNN even if not explicitly 3D data, to avoid issues\n  with overlapping pixel space causing diffusion\n\n* Estimate the velocity field in RGB, so that at collisions the most likely\n  state is not averaged to no motion, as happens under L2 error when two\n  states are possible.\n\n* Use the entropy generation rate to train attention on regions where the\n  model can best predict.\n\n* Try rotation, faces, and ultimately real video.\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpseudotensor%2Ftemporal_autoencoder","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpseudotensor%2Ftemporal_autoencoder","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpseudotensor%2Ftemporal_autoencoder/lists"}