https://github.com/EBazarov/nsfw_data_source_urls
Collection of NSFW images URLs for the purposes of training an NSFW Image Classifier
- Host: GitHub
- URL: https://github.com/EBazarov/nsfw_data_source_urls
- Owner: EBazarov
- License: mit
- Created: 2019-02-13T09:21:38.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2020-12-14T09:40:00.000Z (over 4 years ago)
- Last Synced: 2025-03-24T15:47:57.275Z (22 days ago)
- Size: 27.2 MB
- Stars: 3,408
- Watchers: 131
- Forks: 739
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# NSFW data source URLs
## Description
This repository contains lists of URLs for downloading NSFW images. The set can be used to build a dataset large enough to train a robust NSFW classification model.
This work was inspired by [nsfw_data_scrapper](https://github.com/alexkimxyz/nsfw_data_scrapper), and the scripts from that scrapper are the suggested way to download the images.
## Stats
The `raw_data` folder contains `txt` files, each holding a list of URLs. Here are some stats for this set:
- **159** different categories
- in total **1 589 331** URLs
- after downloading and cleaning, it's possible to end up with ~ **500GB**, or in other words ~ **1 300 000** NSFW images

| file name | number of URLs |
|--------------------------------------------------------------|----------------|
| urls_age_college.txt | 2949 |
| urls_age_mature.txt | 5942 |
| urls_age_milf.txt | 8503 |
| urls_age_teen.txt | 5389 |
| urls_amateur.txt | 13033 |
| urls_amateur_self-shots.txt | 10306 |
| urls_appearance.txt | 2734 |
| urls_appearance_appearance-modification.txt | 3795 |
| urls_appearance_appearance-modification_piercings.txt | 1339 |
| urls_appearance_appearance-modification_tattoos.txt | 1983 |
| urls_appearance_clothing.txt | 24924 |
| urls_appearance_clothing_bodyparts-through-clothes.txt | 6691 |
| urls_appearance_clothing_bottomless.txt | 2390 |
| urls_appearance_clothing_clothed-naked-pair.txt | 1274 |
| urls_appearance_clothing_dresses.txt | 4360 |
| urls_appearance_clothing_shoes.txt | 1238 |
| urls_appearance_clothing_stockings.txt | 2556 |
| urls_appearance_clothing_swimwear.txt | 741 |
| urls_appearance_clothing_tight-clothing.txt | 11522 |
| urls_appearance_clothing_topless.txt | 1009 |
| urls_appearance_clothing_underwear.txt | 3190 |
| urls_appearance_clothing_underwear_panties.txt | 9512 |
| urls_appearance_clothing_underwear_thongs.txt | 2636 |
| urls_appearance_clothing_uniforms-outfits.txt | 15390 |
| urls_appearance_clothing_uniforms-outfits_cosplay.txt | 6465 |
| urls_appearance_clothing_upskirt-downblouse.txt | 2599 |
| urls_appearance_expressions.txt | 1396 |
| urls_appearance_pose.txt | 8377 |
| urls_appearance_wet-&-messy.txt | 9169 |
| urls_artificial-images.txt | 247993 |
| urls_artificial-images_fictional-characters-shows.txt | 73349 |
| urls_artificial-images_hentai.txt | 81178 |
| urls_artificial-images_photoshop.txt | 10146 |
| urls_body-parts_head_hair.txt | 1797 |
| urls_body-parts_head_hair_blonde.txt | 6227 |
| urls_body-parts_head_hair_brunette.txt | 2022 |
| urls_body-parts_head_hair_dyed.txt | 1011 |
| urls_body-parts_head_hair_hairstyle.txt | 6946 |
| urls_body-parts_head_hair_redhead.txt | 4725 |
| urls_body-parts_head_lips-mouth.txt | 4449 |
| urls_body-parts_lower-body.txt | 2136 |
| urls_body-parts_lower-body_ass.txt | 9420 |
| urls_body-parts_lower-body_ass_large.txt | 3654 |
| urls_body-parts_lower-body_asshole.txt | 1826 |
| urls_body-parts_lower-body_feet.txt | 3539 |
| urls_body-parts_lower-body_gap.txt | 1332 |
| urls_body-parts_lower-body_genitalia_penis.txt | 6611 |
| urls_body-parts_lower-body_genitalia_penis_large.txt | 1607 |
| urls_body-parts_lower-body_genitalia_penis_small.txt | 2233 |
| urls_body-parts_lower-body_genitalia_vulva.txt | 12746 |
| urls_body-parts_lower-body_genitalia_vulva_hair.txt | 12085 |
| urls_body-parts_lower-body_genitalia_vulva_labia.txt | 5037 |
| urls_body-parts_lower-body_hips.txt | 3490 |
| urls_body-parts_lower-body_legs.txt | 3104 |
| urls_body-parts_upper-body.txt | 4465 |
| urls_body-parts_upper-body_breasts.txt | 11962 |
| urls_body-parts_upper-body_breasts_from-an-angle.txt | 7196 |
| urls_body-parts_upper-body_breasts_implants.txt | 3913 |
| urls_body-parts_upper-body_breasts_large.txt | 11582 |
| urls_body-parts_upper-body_breasts_nipples.txt | 4383 |
| urls_body-parts_upper-body_breasts_small.txt | 3094 |
| urls_body-traits_complexion_freckles.txt | 2309 |
| urls_body-traits_complexion_light-skin.txt | 1436 |
| urls_body-traits_complexion_tan.txt | 827 |
| urls_body-traits_traits.txt | 157 |
| urls_body-traits_traits_flexible.txt | 862 |
| urls_body-traits_traits_pregnant.txt | 2674 |
| urls_body-traits_types_bbw.txt | 8160 |
| urls_body-traits_types_chubby.txt | 8207 |
| urls_body-traits_types_curvy.txt | 1799 |
| urls_body-traits_types_petite.txt | 2305 |
| urls_body-traits_types_skinny-thin.txt | 4560 |
| urls_classic-vintage.txt | 16532 |
| urls_communities.txt | 12500 |
| urls_communities_identification.txt | 1507 |
| urls_communities_personals.txt | 1106 |
| urls_communities_role-play.txt | 226 |
| urls_cum-play_cum.txt | 4514 |
| urls_cum-play_cum_creampie.txt | 1493 |
| urls_cum-play_cum_cum-shot.txt | 4719 |
| urls_cum-play_cum_cum-shot_bukkake.txt | 1042 |
| urls_cum-play_cum_cum-shot_facial.txt | 2458 |
| urls_cum-play_cum_swallowing.txt | 51 |
| urls_cum-play_female.txt | 921 |
| urls_ethnicity.txt | 19675 |
| urls_ethnicity_asian.txt | 26674 |
| urls_ethnicity_black.txt | 4220 |
| urls_ethnicity_euro.txt | 3949 |
| urls_ethnicity_indian.txt | 11195 |
| urls_ethnicity_japanese.txt | 8109 |
| urls_exhibition.txt | 10 |
| urls_exhibition_gonewild.txt | 96718 |
| urls_exhibition_public.txt | 15066 |
| urls_fetish.txt | 22656 |
| urls_fetish_bdsm.txt | 3301 |
| urls_fetish_bdsm_bondage.txt | 8962 |
| urls_fetish_bdsm_domination-&-submission.txt | 13608 |
| urls_fetish_bdsm_domination-&-submission_femdom.txt | 9205 |
| urls_fetish_drugs.txt | 1171 |
| urls_fetish_role-enactment.txt | 942 |
| urls_fetish_role-enactment_age-play.txt | 2053 |
| urls_fetish_role-enactment_furry.txt | 2455 |
| urls_fetish_role-enactment_pet-play.txt | 1270 |
| urls_fetish_role-enactment_rape-abuse.txt | 1091 |
| urls_fetish_watersports.txt | 5128 |
| urls_general-categories.txt | 212869 |
| urls_general-categories_artistic-or-borderline-porn.txt | 8944 |
| urls_general-categories_desktop-wallpaper.txt | 20173 |
| urls_general-categories_gifs.txt | 1228 |
| urls_general-categories_humorous.txt | 1909 |
| urls_general-categories_p.o.v..txt | 1025 |
| urls_general-categories_passionate.txt | 781 |
| urls_general-categories_porn-for-women.txt | 31 |
| urls_general-categories_videos.txt | 400 |
| urls_groups.txt | 97 |
| urls_groups_alt.txt | 10321 |
| urls_groups_athlete.txt | 7719 |
| urls_groups_camgirl.txt | 4321 |
| urls_groups_celebrity.txt | 46437 |
| urls_groups_country.txt | 787 |
| urls_groups_nerd.txt | 3742 |
| urls_groups_pornstar.txt | 3860 |
| urls_groups_pornstar_pornstar-lookalike.txt | 0 |
| urls_groups_religious.txt | 1054 |
| urls_groups_specific-personality.txt | 4012 |
| urls_illegal-taboo.txt | 0 |
| urls_illegal-taboo_bestiality.txt | 0 |
| urls_illegal-taboo_incest.txt | 3816 |
| urls_illegal-taboo_voyeurism.txt | 439 |
| urls_lgbt_bisexual.txt | 1244 |
| urls_lgbt_crossdressing.txt | 2443 |
| urls_lgbt_gay.txt | 19812 |
| urls_lgbt_lesbian.txt | 5179 |
| urls_lgbt_transgender.txt | 719 |
| urls_lgbt_transsexual.txt | 13106 |
| urls_literary.txt | 1953 |
| urls_locations_man-made.txt | 3869 |
| urls_locations_nature.txt | 3831 |
| urls_locations_nature_beach.txt | 4698 |
| urls_non-porn-nsfw.txt | 21389 |
| urls_sex.txt | 1313 |
| urls_sex_anal.txt | 4683 |
| urls_sex_anal_gaping.txt | 754 |
| urls_sex_anal_rimming.txt | 688 |
| urls_sex_breasts.txt | 176 |
| urls_sex_fisting.txt | 1033 |
| urls_sex_group.txt | 1134 |
| urls_sex_group_large-group.txt | 2989 |
| urls_sex_group_swinging.txt | 4466 |
| urls_sex_group_threesome.txt | 1747 |
| urls_sex_insertion.txt | 4344 |
| urls_sex_interracial.txt | 906 |
| urls_sex_masturbation.txt | 2032 |
| urls_sex_oral.txt | 4155 |
| urls_sex_orgasm.txt | 327 |
| urls_sex_toys.txt | 6710 |
| urls_specific-actor-actress.txt | 52409 |
| urls_specific-company.txt | 18763 |
| urls_wtf.txt | 4001 |

## NOTE
1. After downloading, it is highly recommended to clean your dataset, for example:
   - delete duplicates
   - remove images that were banned or deleted (they are replaced by a special placeholder image)
   - find corrupted files and remove them as well
   - etc.
2. Pay attention to noise: some sources provide a heavy mix of NSFW and neutral images.
3. This repository helps in retrieving NSFW images; it contains no dedicated URLs for neutral content.
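
The cleaning steps from note 1 can be sketched in Python as follows. This is a minimal illustration, not part of the repository: the function names, the `placeholder_digests` parameter, and the short list of file signatures are all assumptions made for the example. It flags files that do not start with a known image signature, files whose SHA-256 digest matches a known placeholder image, and byte-identical duplicates.

```python
from __future__ import annotations

import hashlib
from pathlib import Path

# Magic-byte prefixes for a few common image formats (assumed, non-exhaustive).
IMAGE_SIGNATURES = (
    b"\xff\xd8\xff",       # JPEG
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"GIF87a",             # GIF (1987 variant)
    b"GIF89a",             # GIF (1989 variant)
)


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def looks_like_image(path: Path) -> bool:
    """Cheap check: does the file start with a known image signature?"""
    with path.open("rb") as fh:
        head = fh.read(8)
    return head.startswith(IMAGE_SIGNATURES)


def find_files_to_remove(
    image_dir: Path,
    placeholder_digests: frozenset[str] = frozenset(),
) -> dict[str, list[Path]]:
    """Group files under image_dir by the reason they should be removed.

    placeholder_digests is a hypothetical set of SHA-256 digests of the
    "image removed" placeholders that hosts serve for deleted content.
    """
    seen: set[str] = set()
    report: dict[str, list[Path]] = {
        "corrupted": [],
        "placeholder": [],
        "duplicate": [],
    }
    # Sort so that the first file kept among duplicates is deterministic.
    for path in sorted(p for p in image_dir.rglob("*") if p.is_file()):
        if not looks_like_image(path):
            report["corrupted"].append(path)
            continue
        digest = file_digest(path)
        if digest in placeholder_digests:
            report["placeholder"].append(path)
        elif digest in seen:
            report["duplicate"].append(path)
        else:
            seen.add(digest)
    return report
```

The function only reports candidates, so the result can be reviewed before anything is deleted. The signature check catches empty files and non-image responses such as HTML error pages, but not truncated downloads; a stricter pass could decode each file with an imaging library. Hashing also finds only byte-identical duplicates, so resized or re-encoded copies would need a perceptual hash instead.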