Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/brain-facens/image-eda
Python package for working with Exploratory Data Analysis on Image data
https://github.com/brain-facens/image-eda
Last synced: 2 days ago
JSON representation
Python package for working with Exploratory Data Analysis on Image data
- Host: GitHub
- URL: https://github.com/brain-facens/image-eda
- Owner: brain-facens
- Created: 2020-07-21T10:58:10.000Z (over 4 years ago)
- Default Branch: dev
- Last Pushed: 2021-04-21T19:24:14.000Z (over 3 years ago)
- Last Synced: 2023-03-07T14:54:08.339Z (over 1 year ago)
- Language: Python
- Size: 310 KB
- Stars: 0
- Watchers: 5
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# image-eda
Python package for working with Exploratory Data Analysis on Image data.The image-eda package uses a pre-trained neural network to extract feature maps from a given number of datasets then apply a dimensionality reduction algorithm and finally, plot the transformed data for analysis.
Currently supporting PCA and t-SNE dimension reduction algorithms.
## Usage Guide:
The image-eda package works on the following steps:
### Create the parameters dictionary for the ImageEDA object
```python
data = {'dataset_name': ['dataset_name_01','dataset_name_02','dataset_name_xx'],
'annot_path': ['path/to/annotations_01.csv','path/to/annotations_02.csv','path/to/annotations_xx.csv'],
'image_path': ['path/to/images_01','path/to/images_02','path/to/images_xx'],
'dr_method': ['t-sne','pca','pca']} # t-sne or pca
```### Then run the ImageEDA class iterating through the dictionary parameters
```python
image_eda = []
for i in range(0,len(data['dataset_name'])):image_eda.append(ImageEDA(experiment_name=data['dataset_name'][i],
data_source=LocalCsvSource(data['annot_path'][i],data['image_path'][i], data['dataset_name'][i]),
dr_method=data['dr_method'][i],
batch_size=10))
print(image_eda[i])# Predict
image_eda[i].predict_feature_map()
print(data['dataset_name'][i],'predict done')# fit
image_eda[i].partial_fit()
print(data['dataset_name'][i],'fit done')# transform
image_eda[i].transform()
print(data['dataset_name'][i],'transform done')
```### Run the data visualization for each individual dataset
```python
for i in range(0,len(data['dataset_name'])):
image_eda[i].visualize('classes_config.txt')
```### Save the eda model output
```python
for i in range(0,len(data['dataset_name'])):
image_eda[i].save_output()
```### Load from previously analyzed data
```python
image_eda = ImageEDA("dataset_name", pickle_path="dataset_name_vgg16_dr_method_.pickle")
image_eda.visualize("classes_config.txt")
```## Running experiment with MLfLow
```sh
mlflow run . -e experiment --no-conda \
-P experiment_name="Experiment" \
-P model_name="vgg16"
-P dataset_name="dataset_name_01 dataset_name_02 dataset_name_xx" \
-P image_path="path/to/images_01 path/to/images_02 path/to/images_xx" \
-P annotation_path="path/to/annotations_01.csv path/to/annotations_02.csv path/to/annotations_xx.csv" \
-P dr_method="t-sne pca pca" \
-P batch_size=5 \
-P n_components=2 \
```