{"id":18524067,"url":"https://github.com/atharv-naik/tiger-population-estimation","last_synced_at":"2025-07-15T09:03:50.152Z","repository":{"id":180423004,"uuid":"600144479","full_name":"atharv-naik/tiger-population-estimation","owner":"atharv-naik","description":"From the tiger pugmarks data estimated the number of tigers using KNN clustering","archived":false,"fork":false,"pushed_at":"2024-10-01T16:37:12.000Z","size":8,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-05-14T19:13:40.566Z","etag":null,"topics":["estimation-algorithm","knn-clustering","machine-learning"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/atharv-naik.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-10T17:25:34.000Z","updated_at":"2023-02-10T17:44:37.000Z","dependencies_parsed_at":null,"dependency_job_id":"c876ef42-0b40-4f6a-b43a-186a4ffa40c2","html_url":"https://github.com/atharv-naik/tiger-population-estimation","commit_stats":null,"previous_names":["atharv-naik/tiger-population-estimation"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/atharv-naik/tiger-population-estimation","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atharv-naik%2Ftiger-population-estimation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atharv-naik%2Ftiger-population-estimation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/atharv-naik%2Ftiger-population-estimation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atharv-naik%2Ftiger-population-estimation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/atharv-naik","download_url":"https://codeload.github.com/atharv-naik/tiger-population-estimation/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atharv-naik%2Ftiger-population-estimation/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265424328,"owners_count":23762880,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["estimation-algorithm","knn-clustering","machine-learning"],"created_at":"2024-11-06T17:39:27.432Z","updated_at":"2025-07-15T09:03:49.891Z","avatar_url":"https://github.com/atharv-naik.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Tiger Population Estimation\nThis project is a part of ISI DataFest Integration 2023, which aims to predict the number of tigers in a given dataset using machine learning algorithms. The dataset contains information about different tiger sightings, including the location, soil type, and other environmental factors.\n\nTo predict the number of tigers, the project uses two main techniques: Supervised K-Nearest Neighbor (KNN) clustering and feature selection using the Logit function. Supervised KNN assigns each sample the majority label among its nearest neighbors in the labeled training data, so it depends on having labeled examples. 
It can achieve high accuracy in a wide variety of prediction-type problems. The Logit function, on the other hand, is a useful technique for predicting binary outcomes, such as whether a tiger is unique or not.\n\nThe project first loads the training dataset, which contains labeled data, and splits it into training and testing sets. It then trains the KNN model on the training set and tests its accuracy on the testing set. Once the model is trained, it is used to predict the number of tigers in a new dataset using the KNN algorithm.\n\nTo further improve the accuracy of the prediction, the project uses feature selection techniques to identify the most useful features in the dataset. Specifically, it applies the Logit function to the dataset and examines the regression summary to determine which features have the most significant impact on the outcome variable (i.e., whether a tiger is unique or not). These features are then used to refine the KNN model and improve its accuracy.\n\nOverall, this project demonstrates how machine learning algorithms can be used to predict the number of tigers in a given dataset. By combining KNN clustering and feature selection techniques, it achieves high accuracy in predicting the number of unique tigers in the dataset.\n\n## Getting Started\n\nTo get started with this project, you will need to clone the repository to your local machine and install the required libraries using pip. 
Download the dataset from the Kaggle page [here](https://www.kaggle.com/competitions/im-hard-to-spot/data) or, alternatively, use the Kaggle API*:\n```\nkaggle competitions download -c im-hard-to-spot\n```\n*You can follow the instructions in [this](https://github.com/Kaggle/kaggle-api) repo to set up the API.\n\n## Prerequisites\n\nYou will need the following libraries:\n  - pandas\n  - scikit-learn\n  - statsmodels\n  - matplotlib\n  - seaborn\n\nYou can install them using pip:\n```\npip install pandas scikit-learn statsmodels matplotlib seaborn\n```\nAlternatively, after cloning the repo, run:\n```\npip install -r requirements.txt\n```\n\n## Installing\n\nTo install this project, simply clone the repository to your local machine:\n```\ngit clone https://github.com/atharv-naik/tiger-population-estimation.git\n```\n\n## License\n\nThis project is licensed under the [MIT](LICENSE) license.\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/Arg-10/Tiger-Population-Estimation/blob/main/LICENSE)\n\n## Acknowledgments\n\n  - ISI DataFest Integration 2023\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatharv-naik%2Ftiger-population-estimation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fatharv-naik%2Ftiger-population-estimation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatharv-naik%2Ftiger-population-estimation/lists"}