https://github.com/ser-arthur/knn-machine-learning
Applies a k-Nearest-Neighbors (kNN) classification model to petrophysical well log data from the Lismore Field. Includes a full analysis workflow, classification notebook, and final report.
- Host: GitHub
- URL: https://github.com/ser-arthur/knn-machine-learning
- Owner: ser-arthur
- Created: 2025-04-08T13:29:44.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2025-04-08T14:48:38.000Z (2 months ago)
- Last Synced: 2025-04-08T15:45:46.193Z (2 months ago)
- Topics: knn-classification, machine-learning, petroleum-engineering, python, reservoir-characterization
- Language: Jupyter Notebook
- Homepage:
- Size: 670 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Thief Zone Classification using k-Nearest Neighbours (kNN)
This project applies the k-Nearest Neighbours (kNN) machine learning algorithm to classify a suspected thief zone in the Lismore Field using petrophysical data.
A **thief zone** is a high-permeability interval that may divert injected fluids away from productive reservoir zones due to its superior flow capacity. Accurately identifying such zones is critical in reservoir management and enhanced oil recovery.
The goal was to define clear classification criteria using geological insight and statistical thresholds, then apply a supervised kNN model to evaluate the test zone.

---
## Project Structure
```
├── KNN-thief-zone-classifier.ipynb # Main Jupyter Notebook
├── KNN Classification - Report.pdf # Full written report with workflow and interpretation
├── petrophysical_data.xlsx # Dataset with petrophysical properties
└── README.md # Project summary and guide
```

---
## Problem Overview
The goal was to:
- Define thief zone classification criteria based on geological insight and statistical analysis.
- Apply the KNN algorithm to classify a test zone using training data from known formations.
- Validate model accuracy using cross-validation techniques suited to small datasets.

---
## Methods Used
- **Data Cleaning:** Removed missing values, excluded the Zechstein zone, and standardized numerical inputs.
- **Exploratory Analysis:** Histograms, summary statistics, and a correlation matrix were used to define thief zone thresholds.
- **Labeling:** Thief zones were defined as intervals with high permeability (`KLOGH_arithmetic > 66 mD`) and a moderate-to-low net-to-gross ratio (NTG `< 0.75`), based on data percentiles and reservoir insights.
- **Modeling:** A KNN classifier (k=3) was trained using 8 petrophysical features.
- **Validation:** Leave-one-out cross-validation (LOOCV) and Stratified K-Fold were used to assess robustness despite class imbalance (only 2 thief zones in the dataset).
- **Interpretation:** The test zone was classified as a thief zone. A visualization of the classification was included for interpretability.

---
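The labeling and modeling steps listed under Methods Used can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the data values are synthetic, the column names `NTG` and `PHIF` are hypothetical (only `KLOGH_arithmetic` appears in this README), and three features stand in for the eight used in the project. It assumes pandas and scikit-learn are installed.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the zone-level table; values are illustrative only.
zones = pd.DataFrame({
    "KLOGH_arithmetic": [12.0, 150.0, 40.0, 90.0, 5.0, 70.0],  # permeability, mD
    "NTG": [0.90, 0.60, 0.80, 0.70, 0.95, 0.85],  # net-to-gross (hypothetical column name)
    "PHIF": [0.18, 0.24, 0.20, 0.22, 0.15, 0.21],  # porosity (hypothetical column name)
})

# Labeling rule from the README: high permeability AND moderate-to-low NTG.
zones["thief"] = (
    (zones["KLOGH_arithmetic"] > 66) & (zones["NTG"] < 0.75)
).astype(int)

# Standardize the inputs, then fit a k=3 nearest-neighbour classifier.
features = ["KLOGH_arithmetic", "NTG", "PHIF"]
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
model.fit(zones[features], zones["thief"])

# Classify a hypothetical test zone (1 = thief zone, 0 = not a thief zone).
test_zone = pd.DataFrame(
    {"KLOGH_arithmetic": [120.0], "NTG": [0.65], "PHIF": [0.23]}
)
prediction = int(model.predict(test_zone)[0])
print("Predicted class for the test zone:", prediction)
```

Wrapping the scaler and classifier in one pipeline keeps the standardization parameters tied to the training data, so the test zone is scaled with the same mean and standard deviation as the zones it is compared against.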
## Key Outcomes
- **Test Zone Prediction:** Classified as a thief zone by majority vote of 3 nearest neighbors.
- **Validation Accuracy:** ~77.8% (LOOCV) and ~77.5% (Stratified K-Fold).
- **Insights:** KNN proved flexible and effective, classifying on similarity across multiple features rather than on fixed thresholds.

---
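To make the "majority vote of 3 nearest neighbors" outcome concrete, here is a small from-scratch sketch using only the Python standard library. The helper `knn_predict` and the two-feature points are hypothetical and assume the features have already been standardized; the project itself may well rely on a library implementation instead.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, already-standardized two-feature points.
train_points = [(-1.0, 1.0), (-0.8, 0.9), (-1.2, 0.7), (1.5, -1.0), (1.2, -0.8)]
train_labels = ["non-thief", "non-thief", "non-thief", "thief", "thief"]

label = knn_predict(train_points, train_labels, query=(1.0, -0.9))
print(label)  # thief: two of the three nearest points carry that label
```

With an odd `k` and two classes, the vote cannot tie, which is one practical reason a value such as 3 is convenient for binary problems.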
## Learning Highlights
- KNN is useful in small, interpretable datasets where class boundaries are based on feature similarity.
- Combining domain knowledge with statistical thresholds supports better model inputs.
- Model validation methods like LOOCV are critical when data is limited.

---
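The two validation schemes mentioned above can be sketched with scikit-learn as follows. The data here is randomly generated, and the sample count (10 zones, 8 features, 2 positives) is an assumption chosen only to mimic a small, imbalanced dataset; the resulting scores say nothing about the project's reported accuracy.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: 10 hypothetical zones, 8 features, only 2 positive labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 8))
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
X[y == 1] += 2.0  # shift the positive class so it is separable in principle

# Scaling lives inside the pipeline, so it is re-fitted on each training fold.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))

# LOOCV: one fold per sample, each fold scored as 0 (wrong) or 1 (correct).
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())

# Stratified K-Fold: with 2 positives, 2 splits keeps one positive in each fold.
skf = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
skf_scores = cross_val_score(model, X, y, cv=skf)

print("LOOCV accuracy:", loo_scores.mean())
print("Stratified 2-fold accuracy:", skf_scores.mean())
```

With only two thief-zone samples, using more than two stratified folds would leave at least one fold without a positive example, which is why the sketch uses `n_splits=2`.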
## How to View
You can explore the notebook directly on GitHub, open it in Jupyter Notebook, or [view it via nbviewer](https://nbviewer.org/github/ser-arthur/knn-machine-learning/blob/main/knn-thief-zone-classifier.ipynb).
---
## References
- Altman, N. S. (1992). *An introduction to kernel and nearest-neighbour non-parametric regression*. The American Statistician.
- Kohavi, R. (1995). *A study of cross-validation and bootstrap for accuracy estimation and model selection*. IJCAI.
- Liu et al. (2021). *The Characteristics and Origins of Thief Zones in the Cretaceous Limestone Reservoirs...* Journal of Petroleum Science and Engineering, 201.