https://github.com/thomasjo/nemo

Detection and classification of microscopic foraminifera
climatology deep-learning foraminifera geology geoscience machine-learning oceanography research tensorflow

Host: GitHub
URL: https://github.com/thomasjo/nemo
Owner: thomasjo
Created: 2019-03-18T08:39:57.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2022-12-08T03:37:51.000Z (almost 3 years ago)
Last Synced: 2025-01-21T04:41:40.038Z (9 months ago)
Topics: climatology, deep-learning, foraminifera, geology, geoscience, machine-learning, oceanography, research, tensorflow
Language: Python
Homepage:
Size: 409 KB
Stars: 3
Watchers: 6
Forks: 0
Open Issues: 5
Metadata Files:
- Readme: README.md

README

# Project Nemo

Foraminifera (forams for short) classification via deep feature extraction.

## Image dataset

All models have been trained on a dataset of large, high-resolution images of
forams. The dataset has been produced by our research group, and will be made
publicly available in the near future. Each of the source images consists of
a single class of forams. From these images, patches of _224x224_ pixels are
extracted using combinations of Gaussian smoothing, binary image generation
via thresholding, and connected components. The first step removes the metallic
border present in all source images, and the second step extracts candidate
patches. Each patch that passes the defined selection criteria is extracted by
placing a _224x224_ crop at the centroid of the candidate region. The entire
process is automated in the `preprocess_data.py` script.
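
The candidate extraction described above can be sketched roughly as follows.
This is a minimal illustration and not the code in `preprocess_data.py`: the
function name `extract_patches`, the `sigma`, `threshold`, and `min_area`
values, and the assumption that forams appear brighter than the background
are all hypothetical. The border removal step is omitted, and the selection
criterion is reduced to a minimum region area.

```python
import numpy as np
from scipy import ndimage

PATCH_SIZE = 224  # patch side length stated in this README


def extract_patches(image, sigma=2.0, threshold=0.5, min_area=50,
                    patch_size=PATCH_SIZE):
    """Return square crops centred on connected regions of a 2-D image.

    sigma, threshold, and min_area are illustrative assumptions, not the
    values used by the project.
    """
    # 1. Gaussian smoothing to suppress pixel-level noise.
    smoothed = ndimage.gaussian_filter(image.astype(float), sigma=sigma)
    # 2. Binary image via a fixed threshold (assumes bright foreground).
    binary = smoothed > threshold
    # 3. Connected components: each label is one candidate region.
    labels, count = ndimage.label(binary)
    if count == 0:
        return []
    index = np.arange(1, count + 1)
    areas = np.bincount(labels.ravel(), minlength=count + 1)[1:]
    centroids = ndimage.center_of_mass(binary.astype(float), labels, index)

    half = patch_size // 2
    height, width = image.shape
    patches = []
    for area, (row, col) in zip(areas, centroids):
        if area < min_area:
            continue  # stand-in for the real selection criteria
        top = int(round(row)) - half
        left = int(round(col)) - half
        # Skip candidates whose crop would fall outside the image.
        if top < 0 or left < 0:
            continue
        if top + patch_size > height or left + patch_size > width:
            continue
        patches.append(image[top:top + patch_size, left:left + patch_size])
    return patches
```

Candidates too close to the image edge are dropped here for simplicity;
padding the image would be an alternative.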

Once the source images have been preprocessed by extracting patches, datasets
for training, validation, and testing are generated automatically by using the
`build_datasets.py` script.
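
As a rough illustration of what such a split could look like, the sketch
below shuffles a list of patch paths with a fixed seed and divides it into
three subsets. The function name `split_patches`, the 80/10/10 proportions,
and the seed are assumptions made for this example; the actual behaviour of
`build_datasets.py` may differ.

```python
import random


def split_patches(paths, val_fraction=0.1, test_fraction=0.1, seed=42):
    """Split patch paths into train/val/test lists (hypothetical helper)."""
    # Sort first so the result does not depend on directory listing order.
    paths = sorted(paths)
    # A dedicated Random instance keeps the shuffle reproducible.
    random.Random(seed).shuffle(paths)
    n_val = int(len(paths) * val_fraction)
    n_test = int(len(paths) * test_fraction)
    return {
        "val": paths[:n_val],
        "test": paths[n_val:n_val + n_test],
        "train": paths[n_val + n_test:],
    }
```

In practice the split would likely be done per class, so that each subset
keeps a similar class balance.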

### Caveat regarding `raw-halves` source images

The `raw-halves` source images are slightly different in nature, and require
that the `preprocess_data.py` script be invoked with `--border-threshold=50`.
Patches from this dataset must be manually copied to the `preprocessed` folder
built by the process outlined above. In the future, we should find a way to fully
automate this step as well.
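
Until that automation exists, the manual copy could be scripted along these
lines. This is only a sketch: the helper name `copy_patches`, the `*.png`
pattern, and the directory layout are assumptions, not part of the project.

```python
import shutil
from pathlib import Path


def copy_patches(source_dir, target_dir, pattern="*.png"):
    """Copy patch files into target_dir, keeping the subfolder structure.

    Existing files in target_dir are left untouched. Returns the list of
    newly written paths.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    copied = []
    for src in sorted(source_dir.rglob(pattern)):
        dst = target_dir / src.relative_to(source_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        if dst.exists():
            continue  # never overwrite patches that are already in place
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Here `source_dir` would be the output folder of the `raw-halves` run and
`target_dir` the `preprocessed` folder; both paths depend on the local setup.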
