{"id":13564220,"url":"https://github.com/bumble-tech/private-detector","last_synced_at":"2025-05-16T05:03:47.200Z","repository":{"id":62008718,"uuid":"543961877","full_name":"bumble-tech/private-detector","owner":"bumble-tech","description":"Bumble's Private Detector - a pretrained model for detecting lewd images","archived":false,"fork":false,"pushed_at":"2023-11-05T22:14:36.000Z","size":62,"stargazers_count":1328,"open_issues_count":5,"forks_count":98,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-05-16T05:02:04.121Z","etag":null,"topics":["bumble","efficientnet","image-classification","tensorflow"],"latest_commit_sha":null,"homepage":"https://medium.com/bumble-tech/bumble-inc-open-sources-private-detector-and-makes-another-step-towards-a-safer-internet-for-women-8e6cdb111d81","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bumble-tech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2022-10-01T09:09:51.000Z","updated_at":"2025-05-13T01:39:57.000Z","dependencies_parsed_at":"2022-10-25T03:45:18.375Z","dependency_job_id":"84783930-f7c2-48b9-82c8-8cd1562de24f","html_url":"https://github.com/bumble-tech/private-detector","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bumble-tech%2Fprivate-detector","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bumble-tech%2Fprivate-detector/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bumble-tech%2Fprivate-dete
ctor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bumble-tech%2Fprivate-detector/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bumble-tech","download_url":"https://codeload.github.com/bumble-tech/private-detector/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254471062,"owners_count":22076585,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bumble","efficientnet","image-classification","tensorflow"],"created_at":"2024-08-01T13:01:28.196Z","updated_at":"2025-05-16T05:03:47.180Z","avatar_url":"https://github.com/bumble-tech.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Private Detector\n\nThis is the repo for Bumble's *Private Detector*™ model - an image classifier that can detect lewd images.\n\nThe internal repo has been heavily refactored and released as a fully open-source project to allow for the wider community to use and finetune a Private Detector model of their own. 
You can download the pretrained SavedModel, [Frozen Model](https://github.com/bumble-tech/private-detector/issues/7) and checkpoint [here](https://storage.googleapis.com/private_detector/private_detector_with_frozen.zip).\n\n## Model\n\nThe SavedModel can be found in `saved_model/` within `private_detector.zip` above.\n\nThe model is based on EfficientNet-v2 and trained on our internal dataset of lewd images. More information can be found in the whitepaper [here](https://bumble.com/en/the-buzz/bumble-open-source-private-detector-ai-cyberflashing-dick-pics) or [here](https://medium.com/bumble-tech/bumble-inc-open-sources-private-detector-and-makes-another-step-towards-a-safer-internet-for-women-8e6cdb111d81).\n\n## Inference\n\nInference is simple, and an example is given in `inference.py`. The model is released as a SavedModel, so it can be deployed in many different ways, but here's a quick run-through of one way to get it working for those less familiar with Python/TensorFlow.\n\nFirst, install [Python](https://www.python.org/downloads/) and [Conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html) on your system and open the Terminal/Command Prompt on your machine.\n\nThen use the `environment.yaml` file to install the packages needed to run inference.\n\n```sh\nconda env create -f environment.yaml\nconda activate private_detector\n```\n\nOnce that's set up, you can run the inference script. 
Simply replace the sample `.jpg` file paths below with your own.\n\n```sh\npython3 inference.py \\\n    --model saved_model/ \\\n    --image_paths \\\n        Yes_samples/1.jpg \\\n        Yes_samples/2.jpg \\\n        Yes_samples/3.jpg \\\n        Yes_samples/4.jpg \\\n        Yes_samples/5.jpg \\\n        No_samples/1.jpg \\\n        No_samples/2.jpg \\\n        No_samples/3.jpg \\\n        No_samples/4.jpg \\\n        No_samples/5.jpg\n```\n\n\u003cdetails\u003e\n\u003csummary\u003eSample Output\u003c/summary\u003e\n\u003ccode\u003e\n\n    Probability: 93.71% - Yes_samples/1.jpg\n    Probability: 93.43% - Yes_samples/2.jpg\n    Probability: 94.06% - Yes_samples/3.jpg\n    Probability: 94.08% - Yes_samples/4.jpg\n    Probability: 91.01% - Yes_samples/5.jpg\n    Probability: 9.76% - No_samples/1.jpg\n    Probability: 7.14% - No_samples/2.jpg\n    Probability: 8.83% - No_samples/3.jpg\n    Probability: 4.87% - No_samples/4.jpg\n    Probability: 5.29% - No_samples/5.jpg\n\u003c/code\u003e\n\u003c/details\u003e\n\n## Serving\n\nSee the [TensorFlow Serving example](deployments/tensorflow-serving/README.md).\n\n## Additional Training\n\nYou can fine-tune the model on your own data. Doing so is fairly simple, though you will need the checkpoint files, which can be found in `saved_checkpoint/` in `private_detector.zip`.\n\nSet up a JSON file that points to the image path list for each class:\n\n```json\n{\n    \"Yes\": {\n        \"path\": \"/home/sofarrell/private_detector/Yes.txt\",\n        \"label\": 0\n    },\n    \"No\": {\n        \"path\": \"/home/sofarrell/private_detector/No.txt\",\n        \"label\": 1\n    }\n}\n```\n\nEach `.txt` file lists the paths to your images, one per line:\n\n```txt\n/home/sofarrell/private_detector_images/Yes/1093840880_309463828.jpg\n/home/sofarrell/private_detector_images/Yes/657954182_3459624.jpg\n/home/sofarrell/private_detector_images/Yes/1503714421_3048734.jpg\n```\n\nYou can create the training environment with 
conda:\n\n```sh\nconda env create -f environment.yaml\nconda activate private_detector\n```\n\nAnd then retrain like so:\n\n```sh\npython3 ./train.py \\\n    --train_json /home/sofarrell/private_detector/train_classes.json \\\n    --eval_json /home/sofarrell/private_detector/eval_classes.json \\\n    --checkpoint_dir saved_checkpoint/ \\\n    --train_id retrained_private_detector\n```\n\nThe training script has several parameters that can be tweaked:\n|Command|Description|Type|Default|\n|---|---|---|---|\n|`train_id`|ID for this particular training run|str||\n|`train_json`|JSON file(s) which describes classes and contains lists of filenames of data files|List[str]||\n|`eval_json`|Validation json file which describes classes and contains lists of filenames of data files|str||\n|`num_epochs`|Number of epochs to train for|int||\n|`batch_size`|Number of images to process in a batch|int|`64`|\n|`checkpoint_dir`|Directory to store checkpoints in|str||\n|`model_dir`|Directory to store graph in|str|`.`|\n|`data_format`|Data format: [channels_first, channels_last]|str|`channels_last`|\n|`initial_learning_rate`|Initial learning rate|float|`1e-4`|\n|`min_learning_rate`|Minimal learning rate|float|`1e-6`|\n|`min_eval_metric`|Minimal evaluation metric to start saving models|float|`0.01`|\n|`float_dtype`|Float Dtype to use in image tensors: [16, 32]|int|`16`|\n|`steps_per_train_epoch`|Number of steps per train epoch|int|`800`|\n|`steps_per_eval_epoch`|Number of steps per evaluation epoch|int|`1`|\n|`reset_on_lr_update`|Whether to reset to the best model after learning rate update|bool|`False`|\n|`rotation_augmentation`|Rotation augmentation angle, value \u003c= 0 disables it|float|`0`|\n|`use_augmentation`|Add speckle, v0, random or color distortion augmentation|str||\n|`scale_crop_augmentation`|Resize image to the model's size times this scale and then randomly crop needed size|float|`1.4`|\n|`reg_loss_weight`|L2 regularization weight|float|`0`|\n|`skip_saving_epochs`|Do not 
save good checkpoint and update best metric for this number of the first epochs|int|`0`|\n|`sequential`|Use sequential run over randomly shuffled filenames vs equal sampling from each class|bool|`False`|\n|`eval_threshold`|Threshold above which to consider a prediction positive for evaluation|float|`0.5`|\n|`epochs_lr_update`|Maximum number of epochs without improvement used to reset/decrease learning rate|int|`20`|\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbumble-tech%2Fprivate-detector","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbumble-tech%2Fprivate-detector","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbumble-tech%2Fprivate-detector/lists"}