https://github.com/EBazarov/nsfw_data_source_urls

Collection of NSFW images URLs for the purposes of training an NSFW Image Classifier

# NSFW data source URLs

## Description

This repository contains lists of URLs for downloading NSFW images. The set can be used to build a dataset large enough to train a robust NSFW classification model.

This work was inspired by [nsfw_data_scrapper](https://github.com/alexkimxyz/nsfw_data_scrapper), and its scripts are the suggested way to download the images.

## Stats

The `raw_data` folder contains `txt` files, each holding a list of URLs. Some statistics for this set:

- **159** different categories
- **1 589 331** URLs in total
- after downloading and cleaning, it is possible to end up with ~ **500GB** of data, or ~ **1 300 000** NSFW images

| file name | number of URLs |
|--------------------------------------------------------------|----------------|
| urls_age_college.txt | 2949 |
| urls_age_mature.txt | 5942 |
| urls_age_milf.txt | 8503 |
| urls_age_teen.txt | 5389 |
| urls_amateur.txt | 13033 |
| urls_amateur_self-shots.txt | 10306 |
| urls_appearance.txt | 2734 |
| urls_appearance_appearance-modification.txt | 3795 |
| urls_appearance_appearance-modification_piercings.txt | 1339 |
| urls_appearance_appearance-modification_tattoos.txt | 1983 |
| urls_appearance_clothing.txt | 24924 |
| urls_appearance_clothing_bodyparts-through-clothes.txt | 6691 |
| urls_appearance_clothing_bottomless.txt | 2390 |
| urls_appearance_clothing_clothed-naked-pair.txt | 1274 |
| urls_appearance_clothing_dresses.txt | 4360 |
| urls_appearance_clothing_shoes.txt | 1238 |
| urls_appearance_clothing_stockings.txt | 2556 |
| urls_appearance_clothing_swimwear.txt | 741 |
| urls_appearance_clothing_tight-clothing.txt | 11522 |
| urls_appearance_clothing_topless.txt | 1009 |
| urls_appearance_clothing_underwear.txt | 3190 |
| urls_appearance_clothing_underwear_panties.txt | 9512 |
| urls_appearance_clothing_underwear_thongs.txt | 2636 |
| urls_appearance_clothing_uniforms-outfits.txt | 15390 |
| urls_appearance_clothing_uniforms-outfits_cosplay.txt | 6465 |
| urls_appearance_clothing_upskirt-downblouse.txt | 2599 |
| urls_appearance_expressions.txt | 1396 |
| urls_appearance_pose.txt | 8377 |
| urls_appearance_wet-&-messy.txt | 9169 |
| urls_artificial-images.txt | 247993 |
| urls_artificial-images_fictional-characters-shows.txt | 73349 |
| urls_artificial-images_hentai.txt | 81178 |
| urls_artificial-images_photoshop.txt | 10146 |
| urls_body-parts_head_hair.txt | 1797 |
| urls_body-parts_head_hair_blonde.txt | 6227 |
| urls_body-parts_head_hair_brunette.txt | 2022 |
| urls_body-parts_head_hair_dyed.txt | 1011 |
| urls_body-parts_head_hair_hairstyle.txt | 6946 |
| urls_body-parts_head_hair_redhead.txt | 4725 |
| urls_body-parts_head_lips-mouth.txt | 4449 |
| urls_body-parts_lower-body.txt | 2136 |
| urls_body-parts_lower-body_ass.txt | 9420 |
| urls_body-parts_lower-body_ass_large.txt | 3654 |
| urls_body-parts_lower-body_asshole.txt | 1826 |
| urls_body-parts_lower-body_feet.txt | 3539 |
| urls_body-parts_lower-body_gap.txt | 1332 |
| urls_body-parts_lower-body_genitalia_penis.txt | 6611 |
| urls_body-parts_lower-body_genitalia_penis_large.txt | 1607 |
| urls_body-parts_lower-body_genitalia_penis_small.txt | 2233 |
| urls_body-parts_lower-body_genitalia_vulva.txt | 12746 |
| urls_body-parts_lower-body_genitalia_vulva_hair.txt | 12085 |
| urls_body-parts_lower-body_genitalia_vulva_labia.txt | 5037 |
| urls_body-parts_lower-body_hips.txt | 3490 |
| urls_body-parts_lower-body_legs.txt | 3104 |
| urls_body-parts_upper-body.txt | 4465 |
| urls_body-parts_upper-body_breasts.txt | 11962 |
| urls_body-parts_upper-body_breasts_from-an-angle.txt | 7196 |
| urls_body-parts_upper-body_breasts_implants.txt | 3913 |
| urls_body-parts_upper-body_breasts_large.txt | 11582 |
| urls_body-parts_upper-body_breasts_nipples.txt | 4383 |
| urls_body-parts_upper-body_breasts_small.txt | 3094 |
| urls_body-traits_complexion_freckles.txt | 2309 |
| urls_body-traits_complexion_light-skin.txt | 1436 |
| urls_body-traits_complexion_tan.txt | 827 |
| urls_body-traits_traits.txt | 157 |
| urls_body-traits_traits_flexible.txt | 862 |
| urls_body-traits_traits_pregnant.txt | 2674 |
| urls_body-traits_types_bbw.txt | 8160 |
| urls_body-traits_types_chubby.txt | 8207 |
| urls_body-traits_types_curvy.txt | 1799 |
| urls_body-traits_types_petite.txt | 2305 |
| urls_body-traits_types_skinny-thin.txt | 4560 |
| urls_classic-vintage.txt | 16532 |
| urls_communities.txt | 12500 |
| urls_communities_identification.txt | 1507 |
| urls_communities_personals.txt | 1106 |
| urls_communities_role-play.txt | 226 |
| urls_cum-play_cum.txt | 4514 |
| urls_cum-play_cum_creampie.txt | 1493 |
| urls_cum-play_cum_cum-shot.txt | 4719 |
| urls_cum-play_cum_cum-shot_bukkake.txt | 1042 |
| urls_cum-play_cum_cum-shot_facial.txt | 2458 |
| urls_cum-play_cum_swallowing.txt | 51 |
| urls_cum-play_female.txt | 921 |
| urls_ethnicity.txt | 19675 |
| urls_ethnicity_asian.txt | 26674 |
| urls_ethnicity_black.txt | 4220 |
| urls_ethnicity_euro.txt | 3949 |
| urls_ethnicity_indian.txt | 11195 |
| urls_ethnicity_japanese.txt | 8109 |
| urls_exhibition.txt | 10 |
| urls_exhibition_gonewild.txt | 96718 |
| urls_exhibition_public.txt | 15066 |
| urls_fetish.txt | 22656 |
| urls_fetish_bdsm.txt | 3301 |
| urls_fetish_bdsm_bondage.txt | 8962 |
| urls_fetish_bdsm_domination-&-submission.txt | 13608 |
| urls_fetish_bdsm_domination-&-submission_femdom.txt | 9205 |
| urls_fetish_drugs.txt | 1171 |
| urls_fetish_role-enactment.txt | 942 |
| urls_fetish_role-enactment_age-play.txt | 2053 |
| urls_fetish_role-enactment_furry.txt | 2455 |
| urls_fetish_role-enactment_pet-play.txt | 1270 |
| urls_fetish_role-enactment_rape-abuse.txt | 1091 |
| urls_fetish_watersports.txt | 5128 |
| urls_general-categories.txt | 212869 |
| urls_general-categories_artistic-or-borderline-porn.txt | 8944 |
| urls_general-categories_desktop-wallpaper.txt | 20173 |
| urls_general-categories_gifs.txt | 1228 |
| urls_general-categories_humorous.txt | 1909 |
| urls_general-categories_p.o.v..txt | 1025 |
| urls_general-categories_passionate.txt | 781 |
| urls_general-categories_porn-for-women.txt | 31 |
| urls_general-categories_videos.txt | 400 |
| urls_groups.txt | 97 |
| urls_groups_alt.txt | 10321 |
| urls_groups_athlete.txt | 7719 |
| urls_groups_camgirl.txt | 4321 |
| urls_groups_celebrity.txt | 46437 |
| urls_groups_country.txt | 787 |
| urls_groups_nerd.txt | 3742 |
| urls_groups_pornstar.txt | 3860 |
| urls_groups_pornstar_pornstar-lookalike.txt | 0 |
| urls_groups_religious.txt | 1054 |
| urls_groups_specific-personality.txt | 4012 |
| urls_illegal-taboo.txt | 0 |
| urls_illegal-taboo_bestiality.txt | 0 |
| urls_illegal-taboo_incest.txt | 3816 |
| urls_illegal-taboo_voyeurism.txt | 439 |
| urls_lgbt_bisexual.txt | 1244 |
| urls_lgbt_crossdressing.txt | 2443 |
| urls_lgbt_gay.txt | 19812 |
| urls_lgbt_lesbian.txt | 5179 |
| urls_lgbt_transgender.txt | 719 |
| urls_lgbt_transsexual.txt | 13106 |
| urls_literary.txt | 1953 |
| urls_locations_man-made.txt | 3869 |
| urls_locations_nature.txt | 3831 |
| urls_locations_nature_beach.txt | 4698 |
| urls_non-porn-nsfw.txt | 21389 |
| urls_sex.txt | 1313 |
| urls_sex_anal.txt | 4683 |
| urls_sex_anal_gaping.txt | 754 |
| urls_sex_anal_rimming.txt | 688 |
| urls_sex_breasts.txt | 176 |
| urls_sex_fisting.txt | 1033 |
| urls_sex_group.txt | 1134 |
| urls_sex_group_large-group.txt | 2989 |
| urls_sex_group_swinging.txt | 4466 |
| urls_sex_group_threesome.txt | 1747 |
| urls_sex_insertion.txt | 4344 |
| urls_sex_interracial.txt | 906 |
| urls_sex_masturbation.txt | 2032 |
| urls_sex_oral.txt | 4155 |
| urls_sex_orgasm.txt | 327 |
| urls_sex_toys.txt | 6710 |
| urls_specific-actor-actress.txt | 52409 |
| urls_specific-company.txt | 18763 |
| urls_wtf.txt | 4001 |
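
Per-file counts like those in the table can be recomputed locally. The following is a minimal sketch, assuming each `txt` file in `raw_data` holds one URL per line; the function name `count_urls` is hypothetical and is not part of this repository.

```python
from pathlib import Path


def count_urls(folder):
    """Return {file name: number of non-empty lines} for each .txt file in folder."""
    counts = {}
    for path in sorted(Path(folder).glob("*.txt")):
        # Assumption: one URL per line; blank lines are not counted.
        with path.open(encoding="utf-8", errors="replace") as handle:
            counts[path.name] = sum(1 for line in handle if line.strip())
    return counts


if __name__ == "__main__":
    counts = count_urls("raw_data")
    for name, number in counts.items():
        print(f"{name}\t{number}")
    print(f"{len(counts)} files, {sum(counts.values())} URLs in total")
```

If the files contain duplicate or blank lines, the numbers may differ from the table above.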

## NOTE

1. After downloading, it is highly recommended to clean your dataset, for example:
   - delete duplicates
   - remove images that were banned/deleted (they are replaced by a special placeholder image)
   - find corrupted files and remove them as well
   - etc.
2. Pay attention to noise: some sources provide heavily mixed NSFW and neutral images
3. This repository only helps with retrieving NSFW images; it contains no URLs for neutral content
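
The cleaning steps in note 1 can be partly automated. The following is a minimal sketch, not the author's method: it flags byte-identical duplicates, empty files, and files whose SHA-256 digest matches a known placeholder image. The names `file_digest`, `find_removable`, and `placeholder_digests` are hypothetical, and the placeholder digests must be collected by hand from the hosts you download from.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_removable(folder, placeholder_digests=frozenset()):
    """List files that are empty, known placeholders, or exact duplicates.

    The first copy of each distinct file (in sorted path order) is kept.
    """
    seen = set()
    removable = []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        if path.stat().st_size == 0:
            removable.append(path)  # empty download, treated as corrupted
            continue
        digest = file_digest(path)
        if digest in placeholder_digests or digest in seen:
            removable.append(path)
        else:
            seen.add(digest)
    return removable
```

This only catches exact byte-for-byte copies. Resized or re-encoded duplicates and partially corrupted images need an image-aware check, for example decoding each file with an image library. Review the returned list before deleting anything.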