https://github.com/tymill/synthpred

# SynthPred.jl
[![Docs](https://img.shields.io/badge/docs-dev-blue.svg)](https://tymill.github.io/SynthPred/)

[![DOI](https://zenodo.org/badge/955290469.svg)](https://doi.org/10.5281/zenodo.15090892)

![GitHub all releases](https://img.shields.io/github/downloads/TyMill/SynthPred/total?label=📦%20Downloads&style=plastic)

**SynthPred.jl** is a Julia package for synthetic data analysis, advanced imputation (ARIMA, RNN), AutoML, and ensemble modeling.

---

## 🚀 Features

- 🔍 Descriptive statistics and missing data reporting
- 🧼 Simple and advanced imputation:
  - Mean, median, mode
  - Forward/backward fill
  - Gaussian distribution sampling
  - Time series-based: ARIMA
  - Sequence learning-based: RNN (Flux.jl)
- 🤖 AutoML for classification (MLJ.jl-based)
- ⚖️ Blending top-performing models via ensembling
- 📊 Predictions on new data
- 📑 JSON/CSV imputation reports
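
To make the simple imputation strategies listed above concrete, here is a minimal sketch in plain Julia. It is not SynthPred's implementation: the function names `impute_mean`, `impute_ffill`, and `impute_gaussian` are hypothetical and use only the `Statistics` and `Random` standard libraries.

```julia
using Statistics, Random

# Mean imputation: replace each `missing` with the mean of the observed values.
function impute_mean(v::AbstractVector)
    m = mean(skipmissing(v))
    return [ismissing(x) ? m : x for x in v]
end

# Forward fill: carry the last observed value forward.
# A leading `missing` stays missing because there is nothing to carry.
function impute_ffill(v::Vector)
    out = copy(v)
    for i in 2:length(out)
        if ismissing(out[i])
            out[i] = out[i - 1]
        end
    end
    return out
end

# Gaussian sampling: draw each replacement from a normal distribution
# parameterized by the mean and standard deviation of the observed values.
function impute_gaussian(v::AbstractVector; rng = Random.default_rng())
    obs = collect(skipmissing(v))
    mu, sigma = mean(obs), std(obs)
    return [ismissing(x) ? mu + sigma * randn(rng) : x for x in v]
end

v = [1.0, missing, 3.0, missing, 5.0]
impute_mean(v)    # [1.0, 3.0, 3.0, 3.0, 5.0]
impute_ffill(v)   # [1.0, 1.0, 3.0, 3.0, 5.0]
impute_gaussian(v; rng = MersenneTwister(42))  # observed values kept, gaps filled with random draws
```

Mean and fill strategies are deterministic, while Gaussian sampling preserves more of the column's variance at the cost of reproducibility unless a seeded `rng` is passed.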

---

## 📦 Installation

```julia
using Pkg
Pkg.add(url="https://github.com/TyMill/SynthPred.jl")
```

---

## 🧪 Quick Example

```julia
using SynthPred
using CSV, DataFrames

# Load training data
df = CSV.read("data/example.csv", DataFrame)

# Explore data
SynthPred.Exploration.describe_data(df)

# Impute missing values (e.g. RNN strategy)
df_clean, report = SynthPred.Imputer.impute_advanced(df, "rnn", threshold=0.1)
SynthPred.Imputer.save_imputation_report(report, "reports/imputation_report.json")

# Run AutoML pipeline
top_models, scores = SynthPred.AutoML.run_automl(df_clean, :target)
X = select(df_clean, Not(:target))
y = df_clean[:, :target]
ensemble = SynthPred.AutoML.blend_top_models(top_models, X, y)

# Predict on new data
Xnew = CSV.read("data/new_data.csv", DataFrame)
preds = SynthPred.AutoML.predict_ensemble(ensemble, Xnew)
println(preds)
```
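
This README does not say how `blend_top_models` combines its models. As an illustration of one common blending approach for classifiers, the sketch below takes a majority vote over per-model label predictions. The function `majority_vote` is hypothetical and is not part of SynthPred's API.

```julia
# Majority vote across several models' predicted labels.
# `predictions` holds one vector of labels per model, all of equal length.
function majority_vote(predictions::Vector{<:AbstractVector})
    n = length(first(predictions))
    return map(1:n) do i
        # Count how many models predicted each label for sample i.
        counts = Dict{Any, Int}()
        for p in predictions
            counts[p[i]] = get(counts, p[i], 0) + 1
        end
        # Keep the label with the highest count; ties resolve arbitrarily.
        best_label, best_count = nothing, 0
        for (label, c) in counts
            if c > best_count
                best_label, best_count = label, c
            end
        end
        best_label
    end
end

model_preds = [["a", "b", "a"],
               ["a", "b", "b"],
               ["b", "b", "a"]]
majority_vote(model_preds)  # ["a", "b", "a"]
```

Using an odd number of models avoids ties in binary classification; weighted or probability-averaging schemes are alternatives when models differ widely in accuracy.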

---

## 📚 Documentation

Full documentation is available at: [https://tymill.github.io/SynthPred.jl](https://tymill.github.io/SynthPred.jl)

---

## 🧪 Project Structure

```
SynthPred/
├── Project.toml
├── src/
│   ├── SynthPred.jl
│   ├── Exploration.jl
│   ├── Imputer.jl
│   └── AutoML.jl
├── data/
│   ├── example.csv
│   └── new_data.csv
├── reports/
│   └── imputation_report.json
├── docs/
│   └── src/index.md
├── test/
│   └── runtests.jl
└── main.jl
```

---

## 📌 Roadmap

- [x] Core modules: Exploration, Imputer, AutoML
- [x] ARIMA and RNN-based imputations
- [x] AutoML + model blending with MLJ.jl
- [x] Imputation reports (CSV/JSON)
- [x] Documentation (Documenter.jl + GitHub Pages)
- [ ] Exporting trained models (`JLD2`, `BSON`)
- [ ] Web GUI with Pluto.jl or Dash.jl
- [ ] Integration with JuliaHub and Zenodo DOI

---

## 🤝 Contributing

Pull requests are welcome! For major changes, please open an issue first to discuss your proposal.

---

## 📜 License

MIT License © 2025 Tymoteusz Miller

---

## 📬 Contact

📧 me@tymoteuszmiller.dev

---

Built with ❤️ in Julia for real-world ML and scientific discovery.