{"id":18851599,"url":"https://github.com/vijishmadhavan/parse-clip","last_synced_at":"2026-05-14T23:02:35.071Z","repository":{"id":107400526,"uuid":"471387004","full_name":"vijishmadhavan/PARSE-CLIP","owner":"vijishmadhavan","description":"A simple CLIP based project for combining images from multiple datasets. ","archived":false,"fork":false,"pushed_at":"2022-03-19T09:57:28.000Z","size":4325,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-10-09T13:31:58.345Z","etag":null,"topics":["clip","data","datacleaning","dataexploration","dataset","fastai","image","python"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vijishmadhavan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-03-18T13:52:06.000Z","updated_at":"2023-08-24T20:31:22.000Z","dependencies_parsed_at":null,"dependency_job_id":"a42ae079-3d96-44ca-b709-709e031f1dc6","html_url":"https://github.com/vijishmadhavan/PARSE-CLIP","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/vijishmadhavan/PARSE-CLIP","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vijishmadhavan%2FPARSE-CLIP","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vijishmadhavan%2FPARSE-CLIP/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vijishmadhavan%2FPARSE-CLIP/releases","manifests_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/vijishmadhavan%2FPARSE-CLIP/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vijishmadhavan","download_url":"https://codeload.github.com/vijishmadhavan/PARSE-CLIP/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vijishmadhavan%2FPARSE-CLIP/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279001484,"owners_count":26083102,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clip","data","datacleaning","dataexploration","dataset","fastai","image","python"],"created_at":"2024-11-08T03:35:23.610Z","updated_at":"2025-10-09T13:32:02.797Z","avatar_url":"https://github.com/vijishmadhavan.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PARSE-CLIP\n\nA simple CLIP based project for combining images from multiple datasets. 
This has been very helpful for me, and I hope it helps others as well.\n\n**Colab:** [\u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" align=\"center\"\u003e](https://colab.research.google.com/github/vijishmadhavan/PARSE-CLIP/blob/master/PARSE_CLIP.ipynb)\n\n![Combine](https://github.com/vijishmadhavan/PARSE-PIC/blob/master/Images/22.png)\n\n## What can PARSE-CLIP do?\n\n### - Combine Images from multiple datasets to create a new dataset.\n\n#### Search Query - Carrot\n\n![Data](https://github.com/vijishmadhavan/PARSE-CLIP/blob/master/Images/aaa.png)\n\n### - Search for a class and get the total number of matches.\n\n#### Search Query - Indian\n\nTotal - 363 Images\n\n![Search](https://github.com/vijishmadhavan/PARSE-PIC/blob/master/Images/download%20(2)-side.png)\n\n### - Get the search results and explore.\n\n#### Search Query - Red hair\n\n![explore](https://github.com/vijishmadhavan/PARSE-PIC/blob/master/Images/download%20(3)-side.png)\n\n### - Move Images to a new folder/drive and start training.\n\n### - Remove unwanted Images from the dataset.\n\n\n## Limitations\n\n- It might struggle with huge datasets.\n- The free version of Colab will be slow.\n- We should have some idea about the dataset; random searches won't work.\n- It has only been tried on Kaggle datasets.\n\n\n## Acknowledgements\n- [Beyond tags and entering the semantic search era on images with OpenAI CLIP](https://towardsdatascience.com/beyond-tags-and-entering-the-semantic-search-era-on-images-with-openai-clip-1f7d629a9978) by [Ramsri Goutham Golla](https://twitter.com/ramsri_goutham)\n- [OpenAI's CLIP](https://github.com/openai/CLIP)\n- [Natural Language Image 
Search](https://github.com/haltakov/natural-language-image-search)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvijishmadhavan%2Fparse-clip","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvijishmadhavan%2Fparse-clip","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvijishmadhavan%2Fparse-clip/lists"}