{"id":25666479,"url":"https://github.com/parvvaresh/agricultural-products-classification","last_synced_at":"2025-10-16T18:55:58.839Z","repository":{"id":143030362,"uuid":"532810434","full_name":"parvvaresh/agricultural-products-classification","owner":"parvvaresh","description":"This pipeline is designed to classify agricultural products using satellite data from two satellites, SENTINEL-1 and SENTINEL-2","archived":false,"fork":false,"pushed_at":"2024-12-27T03:20:58.000Z","size":975,"stargazers_count":13,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-22T19:11:47.232Z","etag":null,"topics":["classification","docker","google-earth-engine","lda","ml","pca","satellite","standarization"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/parvvaresh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-05T08:29:18.000Z","updated_at":"2025-02-20T03:01:43.000Z","dependencies_parsed_at":null,"dependency_job_id":"e50bf985-8976-4b9c-b2fa-df5406860848","html_url":"https://github.com/parvvaresh/agricultural-products-classification","commit_stats":null,"previous_names":["parvvaresh/classification-of-satellite-images","parvvaresh/agricultural-products-classification"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/parvvaresh/agricultural-products-classification","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parvvaresh%2Fagricultural-products-classification","tags_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parvvaresh%2Fagricultural-products-classification/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parvvaresh%2Fagricultural-products-classification/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parvvaresh%2Fagricultural-products-classification/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/parvvaresh","download_url":"https://codeload.github.com/parvvaresh/agricultural-products-classification/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parvvaresh%2Fagricultural-products-classification/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279227424,"owners_count":26130286,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-16T02:00:06.019Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","docker","google-earth-engine","lda","ml","pca","satellite","standarization"],"created_at":"2025-02-24T08:30:24.333Z","updated_at":"2025-10-16T18:55:58.804Z","avatar_url":"https://github.com/parvvaresh.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Agricultural Products Classification Pipeline\n\n## Overview\nThis pipeline is designed to classify agricultural products using 
satellite data from **SENTINEL-1** and **SENTINEL-2**. The pipeline includes the following stages:\n\n1. **Data Standardization**: Different standardization techniques are applied to the data to make it suitable for model training.\n2. **Dimensionality Reduction**: PCA and LDA are applied to reduce the dimensionality of the feature space, with separate models for each satellite's data.\n3. **Model Training and Hyperparameter Optimization**: Various machine learning models are trained, and hyperparameter optimization is performed using grid search.\n\n---\n\n### Satellite Data Input:\n\nThe input dataset contains Earth observation data from **SENTINEL-1** and **SENTINEL-2** satellites, obtained via Google Earth Engine. The data includes various bands from both satellites, as well as additional values relevant for classification tasks.\n\n#### Example of input data:\n\n| **Sample** | **0_B1**  | **0_B2**  | **0_B3**  | **0_B4**  | **0_B5**  | **0_B6**  | **0_B7**  | **0_B8**  | **0_B8A** | **0_B9**  | **0_B11** | **0_B12** | **0_VV** |\n|------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|----------|\n| **Sample 1** | 0.050643478 | 0.071909783 | 0.108879348 | 0.140969565 | 0.156472826 | 0.172709783 | 0.185292391 | 0.180054348 | 0.195056522 | 0.205251087 | 0.195241304 | 0.1603 | -1 |\n| **Sample 2** | 0.051273684 | 0.07195 | 0.107911842 | 0.138413158 | 0.156592105 | 0.180571053 | 0.195072368 | 0.189626316 | 0.204071053 | 0.243975 | 0.199786842 | 0.161619737 | -1 |\n| **Sample 3** | 0.064336805 | 0.097296528 | 0.140022222 | 0.176558333 | 0.187975 | 0.19215 | 0.199796528 | 0.203748611 | 0.201070833 | 0.235688194 | 0.202470833 | -15.741307 | -1 |\n| **Sample 4** | 0.070949999 | 0.100846154 | 0.150261539 | 0.196115385 | 0.214473077 | 
0.219430769 | 0.227103846 | 0.226692308 | 0.230776923 | 0.23485 | 0.240280769 | 0.209653846 | -1 |\n| **Sample 5** | 0.071380468 | 0.101917188 | 0.151620313 | 0.198378125 | 0.213576563 | 0.215678125 | 0.222285156 | 0.224170313 | 0.224170313 | 0.235323438 | 0.235323438 | 0.208569531 | -1 |\n| **Sample 6** | 0.072846154 | 0.100773077 | 0.150984615 | 0.198823077 | 0.213915385 | 0.217265385 | 0.224673077 | 0.226946154 | 0.226946154 | 0.234361538 | 0.237073077 | 0.206880769 | -1 |\n| **Sample 7** | 0.067707143 | 0.103935714 | 0.152242857 | 0.200014286 | 0.209557143 | 0.213071429 | 0.221978571 | 0.229471429 | 0.223307143 | 0.232307143 | 0.232307143 | 0.205528571 | -1 |\n| **Sample 8** | 0.097139552 | 0.130318657 | 0.162661194 | 0.194323881 | 0.209510448 | 0.212884328 | 0.222468657 | 0.230838806 | 0.230782836 | 0.236003731 | 0.311174627 | 0.283676866 | -1 |\n| **Sample 9** | 0.070247222 | 0.097663194 | 0.129397222 | 0.159320833 | 0.171659722 | 0.17494375 | 0.183878472 | 0.192720833 | 0.193045833 | 0.276390278 | 0.256345833 | 0.249488889 | -1 |\n| **Sample 10** | 0.060408333 | 0.085986806 | 0.121355556 | 0.154906944 | 0.168461111 | 0.1728375 | 0.182507639 | 0.191263889 | 0.192247222 | 0.282597917 | 0.263926389 | 0.249488889 | -1 |\n\n\n\n- The **bands** from **SENTINEL-2** include: `B1`, `B2`, `B3`, `B4`, `B5`, `B6`, `B11`, `B12`, etc.\n- The **SENTINEL-1** data includes polarization bands such as `VV` and `VH`, with additional derived features such as `VV_1` and `VH_1`.\n- Each row represents a specific point in time for the satellite’s data, with `1_VV` marking the timestamp of the observation.\n\n\n---\n\n\n## Pipeline Steps\n\n### 1. 
Data Standardization\nThe following standardization methods are applied to the data:\n\n- **Original**: Raw data without scaling.\n- **Standard Scaled**: Standardization using mean and variance.\n- **MinMax Scaled**: Scales data to a specified range (usually [0,1]).\n- **MaxAbs Scaled**: Scales data to [-1, 1] based on the maximum absolute value.\n- **Robust Scaled**: Scales data using the median and interquartile range.\n- **Normalized**: Scales data to unit norm.\n\n### 2. Dimensionality Reduction\nTwo dimensionality reduction techniques are used:\n\n- **PCA (Principal Component Analysis)**: Reduces the feature space by projecting the data into a lower-dimensional space.\n- **LDA (Linear Discriminant Analysis)**: A classification-specific dimensionality reduction technique.\n\nNote: Each satellite’s data is processed separately due to different band spaces. For **SENTINEL-1**, the bands are:\n- `VH`, `VV`, `HH`, `VH_1`, `VV_1`\n\nFor **SENTINEL-2**, the bands are:\n- `B1`, `B2`, `B3`, `B4`, `B5`, `B6`, `B11`, `B12`, `B13`, `B14`, `B15`, `B16`, `NDVI`, `EVI`, `SAVI`\n\n### 3. Model Training \u0026 Hyperparameter Optimization\nThe following models are trained using grid search for hyperparameter optimization:\n\n1. **Decision Tree Classifier**\n2. **K-Nearest Neighbors (KNN)**\n3. **Logistic Regression**\n4. **Multilayer Perceptron (MLP)**\n5. **Naive Bayes**\n6. **Nearest Centroid**\n7. **Perceptron**\n8. **Random Forest**\n9. 
**Support Vector Machine (SVM)**\n\nEach model is optimized based on its hyperparameter grid.\n\n---\n\n## Model Hyperparameters\n\n| Model                  | Hyperparameters                                                                                                                                                       |\n|------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| **Decision Tree**       | `criterion`: ['gini', 'entropy'], `max_depth`: [None, 10, 20, 30, 40], `min_samples_split`: [2, 5, 10], `min_samples_leaf`: [1, 2, 4]                                  |\n| **K-Nearest Neighbors** | `n_neighbors`: [1, 3, 5, 7, 9, 11, 13], `weights`: ['uniform', 'distance'], `metric`: ['euclidean', 'manhattan', 'minkowski']                                      |\n| **Logistic Regression** | `penalty`: ['l1', 'l2'], `C`: [0.01, 0.1, 1, 10, 100], `solver`: ['liblinear', 'saga'], `max_iter`: [100, 200, 300, 500]                                           |\n| **MLP**                 | `hidden_layer_sizes`: [(50,), (100,), (50, 50), (100, 100)], `activation`: ['tanh', 'relu'], `solver`: ['sgd', 'adam'], `alpha`: [0.0001, 0.001, 0.01], `max_iter`: [100, 200, 300, 400, 500] |\n| **Naive Bayes**         | `var_smoothing`: [1e-9, 1e-8, 1e-7, 1e-6, 1e-5]                                                                                                                     |\n| **Nearest Centroid**    | `metric`: ['euclidean', 'manhattan'], `shrink_threshold`: [None, 0.1, 0.2, 0.5, 0.7, 0.8]                                                                      |\n| **Perceptron**          | `penalty`: ['l1', 'l2', 'elasticnet'], `alpha`: [0.0001, 0.001, 0.01, 0.1, 1], `max_iter`: [1000, 2000, 3000]                                                      |\n| **Random Forest**       | `n_estimators`: [50, 100, 200], `criterion`: ['gini', 
'entropy'], `max_depth`: [None, 10, 20, 30], `min_samples_split`: [2, 5, 10], `min_samples_leaf`: [1, 2, 4]   |\n| **SVM**                 | `C`: [0.1, 1, 10, 100, 1000], `kernel`: ['rbf'], `gamma`: [0.001, 0.01, 0.1, 1]                                                                                   |\n\n---\n\n## Requirements\n\n- Python 3.x\n- pandas\n- scikit-learn\n- numpy\n- matplotlib (for plotting and visualization)\n\n---\n\n## Setup\n\n1. **Clone the repository:**\n\n    ```bash\n    git clone https://github.com/parvvaresh/Classification-of-satellite-images.git\n    cd Classification-of-satellite-images\n    ```\n\n2. **Install dependencies:**\n\n    First, create a virtual environment (optional but recommended):\n\n    ```bash\n    python -m venv venv\n    source venv/bin/activate  # On Windows: venv\\Scripts\\activate\n    ```\n\n    Then, install the required Python libraries:\n\n    ```bash\n    pip install -r requirements.txt\n    ```\n\n## Usage\n\nThe pipeline is designed to handle satellite data, perform preprocessing, apply dimensionality reduction techniques, and train various models with optimized hyperparameters.\n\nTo use the pipeline, follow the steps below:\n\n1. Prepare your input data as a CSV file with the necessary features and target column.\n2. 
Modify the input data and parameters in the respective scripts to suit your specific agricultural classification problem.\n\n---\n\n## Example Usage\n\n```python\nimport pandas as pd\nfrom pre_process import pre_process\nfrom train_models import train_models\n\ndef classification(df: pd.DataFrame,\n                   class_column: str,\n                   path: str,\n                   name: str) -\u003e None:\n    \"\"\"\n    This function takes a DataFrame, preprocesses the data, \n    and trains models on the processed data.\n    \n    Parameters:\n    - df: pandas DataFrame containing the data to be classified.\n    - class_column: The column name containing the target variable.\n    - path: The path where the trained model and results will be saved.\n    - name: The name used to save the model and results.\n    \"\"\"\n    # Preprocess the data to separate features and target variable\n    x_data, y = pre_process(df, class_column)\n\n    # Train models using the processed data\n    train_models(x_data, y, path, name)\n\n# Example usage\npath_csv = \"/data.csv\"  # Path to your dataset\ndf = pd.read_csv(path_csv)  # Read the CSV into a DataFrame\n\n# Call the classification function\nclassification(df, \"ClassColumn\", \"/home/reza/data_test\", \"data_test\")\n```\n\n### Explanation:\n- `pre_process(df, class_column)` processes the data, separating the features (`x_data`) and target variable (`y`).\n- `train_models(x_data, y, path, name)` trains machine learning models and saves the trained models to the specified path (`path`) using the provided name (`name`).\n\n### Output:\nAfter training, the results will be saved into a CSV file containing the following information:\n- **Method**: The data standardization and dimensionality reduction method used.\n- **Model name**: The name of the model.\n- **Best hyperparameters**: The best hyperparameters found during grid search.\n- **Train accuracy**: Accuracy on the training dataset.\n- **Test accuracy**: Accuracy on 
the test dataset.\n- **Precision**, **Recall**, **F1 Score**, **Kappa**: Metrics for model evaluation.\n- **Confusion Matrix path**: Path to the confusion matrix plot.\n- **Runtime**: The time taken to train the model.\n- **Best model**: The best model with its parameters.\n\n---\n\n### Sample Result Entry:\n\n| method              | model                  | best_params                                                                                       | train_accuracy | test_accuracy | precision | recall | f1_score | kappa | confusion_matrix_path | runtime | best_model                                   |\n|---------------------|------------------------|--------------------------------------------------------------------------------------------------|----------------|----------------|-----------|--------|----------|-------|------------------------|---------|----------------------------------------------|\n| original-original   | KNeighborsClassifier    | `{'metric': 'euclidean', 'n_neighbors': 1, 'weights': 'uniform'}`                                | 1.0            | 0.925          | 0.909     | 0.925  | 0.913    | 0.903 | path_to_matrix.png      | 2.43    | KNeighborsClassifier(metric='euclidean', n_neighbors=1) |\n\n---\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparvvaresh%2Fagricultural-products-classification","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fparvvaresh%2Fagricultural-products-classification","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparvvaresh%2Fagricultural-products-classification/lists"}