https://github.com/shadowfaxx1/crop-data-analysis
Crop Prediction Using Machine Learning, a side project for the internship done for UCT Noida
- Host: GitHub
- URL: https://github.com/shadowfaxx1/crop-data-analysis
- Owner: shadowfaxx1
- License: mit
- Created: 2023-07-21T12:50:58.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-07-21T15:35:39.000Z (almost 2 years ago)
- Last Synced: 2025-01-26T08:41:42.890Z (4 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 12.2 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# 🔰 Crop Yield Prediction
This data science project aims to predict crop yields based on various factors related to the cost of cultivation and production for different crops in different states.
## Table of Contents
- [Introduction](#introduction)
- [Project Structure](#project-structure)
- [Data Sources](#data-sources)
- [Installation](#installation)
- [Usage](#usage)
- [Data Preprocessing](#data-preprocessing)
- [Exploratory Data Analysis (EDA)](#exploratory-data-analysis-eda)
- [Model Training and Evaluation](#model-training-and-evaluation)
- [Results](#results)
- [Future Improvements](#future-improvements)
- [Contributing](#contributing)
- [License](#license)
## Introduction
In this project, we use machine learning algorithms to predict crop yields based on different cost factors related to the cultivation and production of crops. The goal is to understand how the cost of cultivation impacts crop yields and make predictions for future yields.
## Project Structure
The project is structured as follows:
- `data/`: Contains the datasets used for training and evaluation.
- `notebooks/`: Jupyter notebooks for data preprocessing, exploratory data analysis, and model training.
- `models/`: Saved models and model evaluation results.
- `scripts/`: Helper scripts and functions used in the project.
- `README.md`: This file, providing an overview of the project.
## Data Sources
The data used in this project is sourced from multiple datasets:
1. `datafile4.csv`: Contains information on the cost of cultivation for different crops and states.
2. `datafile2.csv`: Provides data on crop production and yield over time.
3. `datafile.csv`: Contains additional information on crop growth over time.
4. `datafile3.csv`: Contains additional data.
5. `produce.csv`: Contains production data in ton/mm.
## Installation
To run this project locally, you will need to install the required libraries. You can do this using the following steps:
> 1. Clone the repository: `gh repo clone shadowfaxx1/upSkill-Campus`
> 2. Install the required Python libraries: `pip install -r requirements.txt`
## Usage
To use the project, follow these steps:
1. Navigate to the main executable file, change the dataset paths to match your local setup, and run the Jupyter notebooks in the specified order.
2. The notebooks cover data preprocessing, exploratory data analysis, and model training.
3. Evaluate the performance of different machine learning algorithms for crop yield prediction.
## Data Preprocessing
In the data preprocessing step, we handle missing values and clean the data to prepare it for analysis and model training.
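As a rough illustration of this step, the sketch below cleans a small table with pandas. The column names, the sample values, and the choice of median imputation are assumptions made for the example; the project's notebooks may handle missing values differently.

```python
import numpy as np
import pandas as pd


def clean_crop_data(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: tidy headers, drop duplicates, fill numeric gaps."""
    out = df.copy()
    # Strip stray whitespace from column names, a common CSV export issue.
    out.columns = [str(c).strip() for c in out.columns]
    # Remove exact duplicate rows.
    out = out.drop_duplicates()
    # Fill missing numeric values with the column median (an assumed strategy).
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())
    # Drop rows that still lack a value in a non-numeric column.
    return out.dropna().reset_index(drop=True)


# Hypothetical sample, not taken from the project's datasets.
raw = pd.DataFrame({
    " Crop ": ["Rice", "Wheat", "Maize", None],
    "Cost": [100.0, np.nan, 300.0, 250.0],
    "Yield": [10.0, 20.0, np.nan, 15.0],
})
cleaned = clean_crop_data(raw)
print(cleaned)
```

Median imputation is used here because it is less sensitive to outliers than the mean; dropping incomplete rows is a reasonable alternative when the dataset is large enough.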
## Exploratory Data Analysis (EDA)
EDA involves visualizing and analyzing the data to gain insights into the relationships between different variables and the target variable (crop yield).
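One simple, non-graphical form of this analysis is to rank features by their correlation with the target. The following is a minimal sketch with pandas; the column names (`cultivation_cost`, `production_cost`, `yield`) and the values are hypothetical and only illustrate the idea.

```python
import pandas as pd


def correlations_with_target(df: pd.DataFrame, target: str) -> pd.Series:
    """Pearson correlation of each numeric feature with the target column."""
    corr = df.select_dtypes(include="number").corr()[target].drop(target)
    # Order by absolute strength so the most related features come first.
    return corr.reindex(corr.abs().sort_values(ascending=False).index)


# Hypothetical sample, not taken from the project's datasets.
sample = pd.DataFrame({
    "cultivation_cost": [10.0, 20.0, 30.0, 40.0, 50.0],
    "production_cost": [5.0, 3.0, 6.0, 2.0, 4.0],
    "yield": [1.0, 2.1, 2.9, 4.2, 5.0],
})
ranked = correlations_with_target(sample, "yield")
print(ranked)
```

Correlation only captures linear relationships, so it complements rather than replaces plots such as scatter charts or histograms.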
## Model Training and Evaluation
We train different machine learning algorithms, including Decision Trees, Linear Regression, XGBoost, and Random Forest, to predict crop yields.
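A minimal sketch of such a training loop with scikit-learn is shown below. It runs on synthetic data generated for the example, not on the project's CSV files, and the hyperparameters are arbitrary assumptions. XGBoost is left out to avoid an extra dependency; its `XGBRegressor` class follows the same `fit`/`predict` interface and could be added to the dictionary in the same way.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for cost features and a yield target (assumed shapes).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 100.0, size=(200, 3))
y = 0.5 * X[:, 0] - 0.2 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0.0, 1.0, size=200)

# Hold out a quarter of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Fit every model on the same split and keep its test-set predictions.
predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)
```

Using one shared train/test split keeps the later comparison fair, since every model is scored on exactly the same held-out rows.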
Evaluation metrics such as Mean Squared Error (MSE) and Mean Absolute Error (MAE) are used to assess the performance of the models.
## Results
The results of the models are visualized and compared to understand the effectiveness of each algorithm in predicting crop yields.
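To make the comparison concrete, the sketch below computes MSE and MAE in plain Python and ranks models by MSE. The model names and numbers are made up for illustration and are not results from this project.

```python
def mse(y_true, y_pred):
    """Mean Squared Error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rank_models(y_true, predictions):
    """Return (name, mse, mae) tuples sorted from lowest to highest MSE."""
    rows = [
        (name, mse(y_true, y_pred), mae(y_true, y_pred))
        for name, y_pred in predictions.items()
    ]
    return sorted(rows, key=lambda row: row[1])


# Hypothetical true yields and predictions from two placeholder models.
y_true = [2.0, 4.0, 6.0, 8.0]
example_predictions = {
    "model_a": [2.0, 4.0, 6.0, 9.0],
    "model_b": [1.0, 5.0, 5.0, 9.0],
}
ranking = rank_models(y_true, example_predictions)
for name, model_mse, model_mae in ranking:
    print(f"{name}: MSE={model_mse:.2f}, MAE={model_mae:.2f}")
```

MSE penalizes large errors more heavily than MAE, so the two metrics can rank models differently when one model makes a few large mistakes.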
## Future Improvements
This project can be further improved in the following ways:
- Experiment with more advanced machine learning algorithms and hyperparameter tuning to improve prediction accuracy.
- Explore additional features or external datasets that could enhance model performance.
- Deploy the best-performing model as a web application for real-time predictions.
## Contributing
Contributions to this project are welcome. If you find any issues or have suggestions for improvements, please submit a pull request or open an issue.
## License
This project is licensed under the MIT License. See the `LICENSE` file for details.