{"id":13578708,"url":"https://github.com/cwerner/fastclass","last_synced_at":"2025-04-05T19:33:32.287Z","repository":{"id":102973529,"uuid":"138149178","full_name":"cwerner/fastclass","owner":"cwerner","description":"Little tools to download and then weed through images, delete and classify them into groups for building deep learning image datasets (based on crawler and tkinter)","archived":false,"fork":false,"pushed_at":"2020-06-04T12:43:34.000Z","size":1030,"stargazers_count":133,"open_issues_count":7,"forks_count":25,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-11-05T16:48:14.988Z","etag":null,"topics":["deep-learning","gui","labeling"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cwerner.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2018-06-21T09:27:19.000Z","updated_at":"2024-06-26T02:33:53.000Z","dependencies_parsed_at":null,"dependency_job_id":"e7e157b7-afd3-4a4a-88e5-8554e964150a","html_url":"https://github.com/cwerner/fastclass","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cwerner%2Ffastclass","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cwerner%2Ffastclass/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cwerner%2Ffastclass/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cwerner%2Ffastclass/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/owners/cwerner","download_url":"https://codeload.github.com/cwerner/fastclass/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247393095,"owners_count":20931804,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","gui","labeling"],"created_at":"2024-08-01T15:01:33.052Z","updated_at":"2025-04-05T19:33:30.593Z","avatar_url":"https://github.com/cwerner.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# FastClass\n\n![Version](https://img.shields.io/github/v/release/cwerner/fastclass.svg)\n![Python](https://img.shields.io/badge/python-3.6%20%7C%203.7%20%7C%203.8-blue)\n![Style](https://img.shields.io/badge/code%20style-black-000000.svg)\n![GitHub stars](https://img.shields.io/github/stars/cwerner/fastclass?style=social)\n\n\nA little set of tools to batch download images and weed through, delete and\nclassify them into groups for building deep learning image datasets.\n\nI wrote up a small [blog post](https://www.christianwerner.net/tech/Build-your-image-dataset-faster/) on my site [www.christianwerner.net](https://www.christianwerner.net).\n\n## Installation\n\n`pip install git+https://github.com/cwerner/fastclass.git#egg=fastclass`\n\nThe installer will also place the executables **fcc** and **fcd** in your \\$PATH.\n\nThe package currently contains the following tools:\n\n## Download images\n\nUse **fcd** to crawl search engines (Google, Bing, Baidu, Flickr) and pull all images for\na defined set of queries. 
In addition, files are renamed, scaled and checked\nfor duplicates.\n\nYou provide queries and terms that should be excluded when naming the category folders. There\nis an example (guitars.csv) provided in the repository.\n\n### Usage\n\nCall the script from the command line. If you omit any input parameters, it\nwill show you the help page.\n\n```\nUsage: fcd [OPTIONS] INFILE\n\nOptions:\n  -c, --crawler [ALL|GOOGLE|BING|BAIDU|FLICKR]\n                                  selection of crawler (multiple invocations\n                                  supported)  [default: ALL] (Note: BAIDU and FLICKR are not included in ALL option)\n  -k, --keep                      keep original results of crawlers  [default:\n                                  False]\n  -m, --maxnum                    maximum number of images per crawler [default: 1000]\n  -s, --size INTEGER              image size for rescaling  [default: 299]\n  -o, --outpath TEXT              name of output directory  [default: dataset]\n  -h, --help                      Show this message and exit.\n\n  ::: FastClass fcd :::\n\n  ...an easy way to crawl the net for images when building a dataset for\n  deep learning.\n\n  Example: fcd -c GOOGLE -c BING -s 224 example/guitars.csv\n```\n\nIf you specify the _-k, --keep_ flag, a second folder called outpath.raw containing the original, unscaled images will be created.\n\n### Search file format\n\nThe CSV file currently requires two columns separated by a comma (,), and each row defines an image class you want to download (see the guitars.csv file in the example folder). The first row contains a header, which is skipped.\n\nColumn 1 contains the search terms. You can specify multiple search terms separated by spaces. To require a search term, enclose it in quotation marks (\"). You can also use the query syntax you would normally use in a Google search, e.g. filetype:jpg. 
In column 2 you can specify terms that should not be included in the final class names. For example, you might add guitar to your search terms to help the search but not want that term in the final folder class names. If you do not need this column, you can leave it blank (i.e., end the line with a comma).\n\n## Clean image sets\n\nOnce the images are downloaded, use **fcc** to quickly inspect them and rate or\nclassify them. You can also mark them for deletion.\n\n![FastClass cleaner: fcc](assets/fcc_screenshot.png)\n\n### Usage\n\nCall the script from the command line. If you omit any input parameters, it\nwill show you the help page.\n\n```\nUsage: fcc [OPTIONS] INFOLDER [OUTFOLDER]\n\n  FastClass fcc\n\nOptions:\n  --nocopy TEXT  disable filecopy for cleaned image set  [default: False]\n  -h, --help     Show this message and exit.\n\n  ::: FastClass fcc ::: ...a fast way to cleanup/ sort your images when\n  building a dataset for deep learning.\n\n  Note: In the application use the following keys: \u003c1\u003e, \u003c2\u003e, ... \u003c9\u003e for\n  class assignments or quality ratings \u003cspace\u003e assigns \u003c1\u003e \u003cd\u003e to mark a\n  deletion \u003cx\u003e to terminate the app/ write output\n\n  Use the buttons to navigate back and forth without changing the\n  classification. The current classification of an image is given in the\n  title bar (X indicated a mark for deletion). The counter in the titlebar\n  gives number of classified images vs the total number in the input folder.\n\n  In the output csv file 1,2 depcit class assignments/ ratings,  -1\n  indicates files marked for deletion (if not excluded with -d).\n```\n\n## Flickr Crawler\n\nThe Flickr crawler requires an API key. FastClass looks for the key in an environment variable called `FLICKR_API_KEY`. 
Request one from the [Flickr API key application page](https://www.flickr.com/services/apps/create/apply/).\n\n`FLICKR_API_KEY=asdf1234asdf456 fcd -c FLICKR my_project.csv`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcwerner%2Ffastclass","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcwerner%2Ffastclass","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcwerner%2Ffastclass/lists"}