{"id":13507980,"url":"https://github.com/ryanjay0/miles-deep","last_synced_at":"2025-04-08T04:13:29.073Z","repository":{"id":52299988,"uuid":"73598162","full_name":"ryanjay0/miles-deep","owner":"ryanjay0","description":"Deep Learning Porn Video Classifier/Editor with Caffe","archived":false,"fork":false,"pushed_at":"2018-08-23T12:37:35.000Z","size":22281,"stargazers_count":2609,"open_issues_count":12,"forks_count":283,"subscribers_count":172,"default_branch":"master","last_synced_at":"2025-04-01T03:33:00.636Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ryanjay0.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-11-13T05:49:47.000Z","updated_at":"2025-03-25T17:23:51.000Z","dependencies_parsed_at":"2022-09-06T08:01:54.379Z","dependency_job_id":null,"html_url":"https://github.com/ryanjay0/miles-deep","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryanjay0%2Fmiles-deep","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryanjay0%2Fmiles-deep/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryanjay0%2Fmiles-deep/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryanjay0%2Fmiles-deep/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ryanjay0","download_url":"https://codeload.github.com/ryanjay0/miles-deep/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","rep
ositories_count":247773719,"owners_count":20993639,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T02:00:44.872Z","updated_at":"2025-04-08T04:13:29.045Z","avatar_url":"https://github.com/ryanjay0.png","language":"C++","readme":"\n#Miles Deep - AI Porn Video Editor\n\nUsing a deep convolutional neural network with residual connections, Miles Deep quickly classifies each second of a pornographic video into 6 categories based on sexual act with 95% accuracy. Then it uses that classification to automatically edit the video. It can remove all the scenes not containing sexual contact, or edit out just a specific act.\n\nUnlike Yahoo's recently released [NSFW model](https://github.com/yahoo/open_nsfw),  which uses a similar architecture, Miles Deep can tell the difference between nudity and various explicit sexual acts. As far as I know this is the first and only public pornography classification or editing tool.\n\nThis program can also be viewed as a general framework for classifying video with a [Caffe](http://caffe.berkeleyvision.org/) model, using batching and threading in C++. By replacing the weights, model definition, and mean file it can immediately be used to edit videos with other classes without recompiling. 
See below for an [example](https://github.com/ryanjay0/miles-deep#using-miles-deep-with-your-own-caffe-model).\n\n##Installation\n\n###Ubuntu Installation (16.04)\n\n####Dependencies\n\n`sudo apt install ffmpeg libopenblas-base libhdf5-serial-dev libgoogle-glog-dev libopencv-dev` \n\n#####Additional 14.04 Dependencies\n\n`sudo apt install libgflags-dev`\n\n#####CUDA (Recommended)\nFor GPU usage you need an Nvidia GPU and CUDA 8.0 drivers. Highly recommended; increases speed 10x. This can be installed from a package or by downloading from [NVIDIA directly](https://developer.nvidia.com/cuda-downloads). \n\n\n#####CuDNN (Optional)\n\nAdditional drivers from NVIDIA that make the CUDA GPU support even faster.\n[Download here](https://developer.nvidia.com/cudnn). (Requires registration)\n\n\n\n####Download Miles Deep\n\n* [miles-deep (GPU + CuDNN)](https://github.com/ryanjay0/miles-deep/releases/download/v0.4/miles-deep-xxx.v0.4.tgz)\n* [miles-deep (GPU)](https://github.com/ryanjay0/miles-deep/releases/download/v0.4/miles-deep-gpu.v0.4.tgz)\n* [miles-deep (CPU)](https://github.com/ryanjay0/miles-deep/releases/download/v0.4/miles-deep-cpu.v0.4.tgz)\n\nDownload the [model](https://github.com/ryanjay0/miles-deep/files/587616/model.v0.1.tar.gz) too. Put miles-deep in the same location as the model folder (not in it).\n\n\nVersion | Runtime\n:---:|---:\nGPU + CuDNN | 15s\nGPU | 19s\nCPU | 1m 59s\n\n*on a 24.5 minute video with a GTX 960 4GB*\n\n###Windows and OSX\nI'm working on a version for Windows. Sorry, I don't have a Mac, but it should run on OSX with few changes. [Compilation instructions](https://github.com/ryanjay0/miles-deep#compiling) below. I'll accept pull requests related to OSX or other Linux compatibility. 
Windows will likely require another repository to link with Caffe for Windows.\n\n##Usage\n\nExample:\n```bash\nmiles-deep -t sex_back,sex_front movie.mp4\n``` \n\nThis finds the scenes with sex from the back or front and outputs the result\nin `movie.cut.mp4`.\n\nExample:\n```bash\nmiles-deep -x movie.avi\n```\n\nThis edits out all the non-sexual scenes from `movie.avi`\nand outputs the result in `movie.cut.avi`.\n\nExample:\n```bash\nmiles-deep -b 16 -t cunnilingus -o /cut_movies movie.mkv\n```\n\nThis reduces the batch size to 16 (default 32), finds only scenes with cunnilingus, and outputs the result in `/cut_movies/movie.cut.mkv`.\n\n   **NOTE: Reduce the batch size if you run out of memory**\n\n\n####GPU VRAM used and runtime for various batch sizes:\n\nVRAM (GB) | batch\_size | Run time\n---: | ---: | ---:\n3.5 | 32 | 14.9s\n1.9 | 16 | 15.7s\n1.2 | 8 | 16.9s\n0.8 | 4 | 19.5s\n0.6 | 2 | 24.3s\n0.1 | 1 | 36.2s\n\nTested on an Nvidia GTX 960 with 4GB VRAM and a 24.5 minute video file. At batch\_size 32 it took approximately 0.6 seconds to process 1 minute of input video, or about 36 seconds per hour.\n\nIn addition to batching, Miles Deep also uses threading, which allows the screenshots to be captured and processed while they are classified.\n\n###Auto-Tagging Without Cutting\n\nExample:\n```bash\nmiles-deep movie.mp4 -a\n```\n\nBy popular demand I added this option, which outputs `movie.tag`:\n\n```\nmovie_name, label_1, ..., label_n\ntotal_time, label_1_time, ..., label_n_time\nlabel, start, end, score, coverage\n.\n.\n.\n\n```\n\nThe file contains the cuts for each target, ordered as they occur in the movie. The first two lines give the movie name, the labels, the total movie time, and the total seconds for each label. Then for each cut it lists the start time, end time, average score, and coverage. 
Because of the threshold and the gaps, these cuts may overlap and aren't guaranteed to cover every second.\n\n###Prediction Weights\nHere is an example of the predictions for each second of a video:\n\n![predictions for each second of a video](images/prediction_weights.jpg?raw=true)\n\n\n###Using Miles Deep with your own Caffe model\n####Cat finding\n\nHere's an example of how to use the program with your own model (or a pre-trained one):\n\n\n```bash\nMODEL_PATH=models/bvlc_reference_caffenet\n\nmiles-deep -t n02123045 \\\n  -p caffe/${MODEL_PATH}/deploy.prototxt \\\n  -m caffe/data/ilsvrc12/imagenet_mean.binaryproto \\\n  -w caffe/${MODEL_PATH}/bvlc_reference_caffenet.caffemodel \\\n  -l caffe/data/ilsvrc12/synsets.txt \\\n  movie.mp4\n```\nThis finds the scenes in `movie.mp4` with a tabby cat and returns `movie.cut.mp4` with only those parts. `n02123045` is the category for tabby cats. You can find the category names in `caffe/data/ilsvrc12/synset_words.txt`. You can use a pre-trained model from the [model zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo) instead.\n\n*Note: This example is just to show the syntax. It performs somewhat poorly in my experience, likely due to the 1000 classes. This program is ideally suited to models with a smaller number of categories, including an 'other' category.*\n\n\n##Model\n\nThe model is a CNN with [residual connections](https://arxiv.org/abs/1512.03385) created by [pynetbuilder](https://github.com/jay-mahadeokar/pynetbuilder/tree/master/models/imagenet). These models are pre-trained on ImageNet. 
Then the final layer is changed to fit the new number of classes and [fine-tuned](http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html).\n\nAs [Karpathy et al.](http://cs.stanford.edu/people/karpathy/deepvideo/) suggest, I train the weights for the top three layers, not just the top layer, which improves the accuracy slightly:\n\nRetuned layers | Accuracy (%)\n--- | ---:\nTop 3 | 94.6\nTop 1 | 93.9\n\n\nBelow are the results for fine-tuning the top three layers with different models,\ntested on 2,500 images taken from different videos than the training set.\n\nModel | Accuracy (%) | FLOPs (millions) | Params (millions)\n--- | ---: | ---: | ---:\nresnet50 | 80.0 | 3858 | 25.5\nresnet50\_1by2 | 94.6 | 1070 | 6.9\nresnet77\_1by2 | 95.7 | 1561 | 9.4\n\nThe training loss and test accuracy:\n\n![fine-tuning training loss](images/train_loss.png?raw=true) \n![test accuracy vs step](images/accuracy.png?raw=true)\n\nOf all the models tested, the resnet50\_1by2 provides the best balance between\nruntime, memory, and accuracy. I believe the full resnet50's low accuracy is due\nto overfitting because it has more parameters, though perhaps the training could\nbe done differently.\n\nThe results above were obtained with mirroring but not cropping. Using cropping slightly improves the results for the resnet50\_1by2 to **95.2%**, so it is used as the final model.\n\n[Fine-tuning](https://www.tensorflow.org/versions/r0.9/how_tos/image_retraining/index.html) the Inception V3 model with TensorFlow also only achieved 80% accuracy. However, that is with a 299x299 image size instead of 224x224, and with no mirroring or cropping, so the result is not directly comparable. 
Overfitting may be a problem with this model too.\n\n###Editing the Movie\n\nGiven the predictions for one frame per second, the editor takes the argmax of those predictions and creates cut blocks of the movie where the argmax equals the target and the score is greater than some threshold. The gap size, the minimum fraction of frames matching the target in each block, and the score threshold are all adjustable.\n\nFFmpeg supports many container formats, including mp4, avi, flv, mkv, wmv, and more.\n\n###Single Frame vs Multiple Frames\n\nThis model doesn't make use of any temporal information since it treats each image separately. *Karpathy et al.* showed that other models which use multiple frames don't perform much better. They have difficulty dealing with camera movement. It would still be interesting to compare their slow fusion model with the results here.\n\n##Data\n\nThe database consists of 36,000 training images (plus 2,500 test images) divided into 6 categories: \n\n0. blowjob\_handjob\n1. cunnilingus\n2. other\n3. sex\_back\n4. sex\_front\n5. titfuck\n\nThe images are resized to 256x256 with horizontal mirroring and random cropping\nto 224x224 for data augmentation. Many of the experiments were done without cropping, but cropping slightly improves the results for the resnet50\_1by2.\n\nFor now the dataset is limited to two heterosexual performers. But given the success of this method, I plan to expand the number of categories. Due to the nature of the material, I will not be releasing the database itself; only the trained model.\n\n####Sex back and front\n\nSex front and back are defined by the position of the camera, instead of the orientation of the performers. If the female's body is facing the camera so the front of the vagina is shown, it's sex front. If the female's rear is shown instead, it's sex back. This creates two visually distinct classes. 
No distinction is made between vaginal and anal intercourse; sex back or sex front could include either.\n\n##Compiling\n\n* Clone the Git repository, which includes Caffe as an external dependency. \n\n* Follow the step-by-step [instructions](http://caffe.berkeleyvision.org/installation.html) to install the Caffe dependencies for your platform. [Ubuntu instructions](http://caffe.berkeleyvision.org/install_apt.html). The default is OpenBLAS. Don't worry about editing the Makefile.config or making Caffe. On Ubuntu 16.04 try this in addition to the dependencies at the top:\n\n```bash\nsudo apt install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler\nsudo apt install --no-install-recommends libboost-all-dev\nsudo apt install libopenblas-dev python-numpy\n\n#Add symbolic links for the hdf5 library\n#(not necessary on Linux Mint 18)\n\ncd /usr/lib/x86_64-linux-gnu\nsudo ln -s libhdf5_serial.so libhdf5.so\nsudo ln -s libhdf5_serial_hl.so libhdf5_hl.so\n\n```\n\n* The default is GPU without CuDNN. If you want something else, edit `Makefile` and `Makefile.caffe`. Comment out or uncomment the proper lines in both files.\n\n* `make` \n\n#####License\nCode licensed under GPLv3, including the trained model. Caffe is licensed under BSD 2-Clause. \n\n####Contact\n\nIf you have problems, suggestions, or thoughts, open an issue or send me an email at nipplerdeeplearning at gmail. \n \n","funding_links":[],"categories":["C++","📦 Legacy \u0026 Inactive Projects","Part 2 : Detect"],"sub_categories":["Code Repositories"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fryanjay0%2Fmiles-deep","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fryanjay0%2Fmiles-deep","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fryanjay0%2Fmiles-deep/lists"}