{"id":17543581,"url":"https://github.com/lopcode/peep","last_synced_at":"2025-03-29T05:34:04.224Z","repository":{"id":205849350,"uuid":"715229107","full_name":"lopcode/peep","owner":"lopcode","description":"A work-in-progress project to classify and tag species (and other) information automatically from free text input on social media websites 🦊👀🐰","archived":false,"fork":false,"pushed_at":"2024-02-18T12:20:40.000Z","size":77,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-26T23:03:20.638Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Kotlin","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lopcode.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-11-06T18:13:38.000Z","updated_at":"2023-11-15T21:25:25.000Z","dependencies_parsed_at":"2024-02-18T13:28:39.663Z","dependency_job_id":"e0313976-ac00-41c0-8134-d7b6ede193c3","html_url":"https://github.com/lopcode/peep","commit_stats":null,"previous_names":["lopcode/peep"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lopcode%2Fpeep","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lopcode%2Fpeep/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lopcode%2Fpeep/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lopcode%2Fpeep/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lopcode","download_url":"https://co
deload.github.com/lopcode/peep/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246145013,"owners_count":20730494,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-21T00:24:53.160Z","updated_at":"2025-03-29T05:34:04.205Z","avatar_url":"https://github.com/lopcode.png","language":"Kotlin","funding_links":[],"categories":[],"sub_categories":[],"readme":"# peep\n\npeep is a work-in-progress project to classify and tag species (and other) information automatically from free text\ninput on social media websites 🦊👀🐰.\n\nFor example, on [barq.app](https://barq.app) there's a free text field for species which could result in:\n* `Lop-eared bunny` -\u003e `species:rabbit`, `family:leporidae`, `order:lagomorpha`, `characterisation:lop-eared`\n* `Arctic fox` -\u003e `species:arctic fox`, `species:fox`, `species:vulpes lagopus`, `genus:vulpes`, `family:canidae`\n* `Blue sparkleyote` -\u003e `species:coyote`, `species:canis latrans`, `genus:canis`, `family:canidae`, `color:blue`, `color:sparkle`\n\nThis is useful for data analysis, including monitoring species popularity over time, whilst also letting users express\nthemselves freely.\n\nIf this sounds interesting to you, please star the repo - thanks ⭐️!\n\n## How does it work\n\nThe initial plan is to do this by:\n* Tidying/normalising input data\n* Using some form of natural language processing\n  * Probably with some cultural additions that tend not to appear in popular language\n* Nearest-matching to a configurable taxonomy, using an appropriate 
algorithm\n\nSome other goals / non-goals:\n* Sample an existing dataset to measure classification performance\n* Try combining different methods and benchmark each\n* Make an API\n  * Including batch classification\n* Not interested in ML/AI classification to begin with\n\n## Examples\n\nRun the project with Gradle and pass the file you want to parse as an argument, or omit the argument to use the provided default (`data.csv`):\n\n```\n🥕 carrot 🗂  ~/git/peep 🐙 main $ ./gradlew run                          \n\n\u003e Task :app:run\nhello, peep\nloading data from file: \"data.csv\"\nread 5 rows (not including a header)\ntop 10 uncategorised:\n  rabbit - 2\n  bunny - 1\n  bunny rabbit - 1\n  lop eared rabbit - 1\ntop 10, accounting for categories and their aliases:\n  rabbit - 5\nnormalised stats:\n  4 unique entries (reduction of 1 by normalisation)\n  with an uncategorised count of 5\n  and 1 resulting categories (reduction of 3 by categorisation)\n  with 5 categorisations (1.00 average categories per entry)\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flopcode%2Fpeep","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flopcode%2Fpeep","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flopcode%2Fpeep/lists"}