https://github.com/tymill/synthpred
A Julia package for synthetic data analysis, advanced imputation (ARIMA, RNN), AutoML, and ensemble modeling.
- Host: GitHub
- URL: https://github.com/tymill/synthpred
- Owner: TyMill
- License: mit
- Created: 2025-03-26T12:13:52.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-04-16T11:40:12.000Z (6 months ago)
- Last Synced: 2025-04-22T12:16:47.469Z (6 months ago)
- Topics: arima, automl, ensemble, flux, imputation, julia, machine-learning, synthetic-data, time-series
- Language: Julia
- Homepage:
- Size: 308 KB
- Stars: 2
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.bib
# SynthPred.jl
[Documentation](https://tymill.github.io/SynthPred/) | [DOI: 10.5281/zenodo.15090892](https://doi.org/10.5281/zenodo.15090892)

**SynthPred.jl** is a Julia package for synthetic data analysis, advanced imputation (ARIMA, RNN), AutoML, and ensemble modeling.

---
## Features
- Descriptive statistics and missing data reporting
- Simple and advanced imputation:
  - Mean, median, mode
  - Forward/backward fill
  - Gaussian distribution sampling
  - Time series-based: ARIMA
  - Sequence learning-based: RNN (Flux.jl)
- AutoML for classification (MLJ.jl-based)
- Blending top-performing models via ensembling
- Predictions on new data
- JSON/CSV imputation reports

---
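To make the simple imputation strategies listed above concrete, here is a minimal sketch in plain Julia. It is not SynthPred's implementation: the helper names `missing_fraction`, `impute_mean`, and `impute_ffill` are hypothetical and exist only for this illustration, and the package's own functions may behave differently.

```julia
using Statistics

# Hypothetical helpers for illustration; these are not SynthPred functions.

# Fraction of missing entries in a column.
missing_fraction(col) = count(ismissing, col) / length(col)

# Mean imputation: replace each missing entry with the mean of the observed values.
function impute_mean(col)
    m = mean(skipmissing(col))
    return [ismissing(x) ? m : x for x in col]
end

# Forward fill: carry the last observed value forward.
# Leading missing values stay missing because nothing precedes them.
function impute_ffill(col)
    out = Vector{Union{Missing, nonmissingtype(eltype(col))}}(undef, length(col))
    last_seen = missing
    for (i, x) in enumerate(col)
        if !ismissing(x)
            last_seen = x
        end
        out[i] = last_seen
    end
    return out
end

col = [1.0, missing, 3.0, missing, missing, 6.0]
println(missing_fraction(col))
println(impute_mean(col))
println(impute_ffill(col))
```

Mean imputation keeps the column average unchanged but flattens variance, while forward fill assumes neighbouring rows are ordered and related, which is why it is usually reserved for time-indexed data.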
## Installation
```julia
using Pkg
Pkg.add(url="https://github.com/TyMill/SynthPred.jl")
```

---
## Quick Example
```julia
using SynthPred
using CSV, DataFrames

# Load training data
df = CSV.read("data/example.csv", DataFrame)

# Explore data
SynthPred.Exploration.describe_data(df)

# Impute missing values (e.g. RNN strategy)
df_clean, report = SynthPred.Imputer.impute_advanced(df, "rnn", threshold=0.1)
SynthPred.Imputer.save_imputation_report(report, "reports/imputation_report.json")

# Run AutoML pipeline
top_models, scores = SynthPred.AutoML.run_automl(df_clean, :target)
X = select(df_clean, Not(:target))
y = df_clean[:, :target]
ensemble = SynthPred.AutoML.blend_top_models(top_models, X, y)

# Predict on new data
Xnew = CSV.read("data/new_data.csv", DataFrame)
preds = SynthPred.AutoML.predict_ensemble(ensemble, Xnew)
println(preds)
```

---
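The blending step in the example above can be pictured as soft voting, where the class probabilities predicted by several models are averaged before a label is picked. The sketch below illustrates that general technique in plain Julia. It is an assumption about what blending could look like, not a description of how `blend_top_models` or `predict_ensemble` work internally; `blend_probabilities` and `predict_labels` are hypothetical names, and the probability matrices are made-up inputs.

```julia
# Hypothetical illustration of soft voting; not SynthPred's implementation.
# Each matrix holds one model's predicted class probabilities:
# rows are observations, columns are classes.
function blend_probabilities(prob_matrices; weights = ones(length(prob_matrices)))
    w = weights ./ sum(weights)  # normalize so the weights sum to 1
    return sum(w[i] .* prob_matrices[i] for i in eachindex(prob_matrices))
end

# Pick the most probable class label for each row.
predict_labels(probs, labels) = [labels[argmax(row)] for row in eachrow(probs)]

# Made-up probabilities from two models for three observations.
model_a = [0.9 0.1; 0.4 0.6; 0.2 0.8]
model_b = [0.7 0.3; 0.5 0.5; 0.1 0.9]

blended = blend_probabilities([model_a, model_b])
println(predict_labels(blended, ["no", "yes"]))
```

Passing unequal `weights` lets a stronger model dominate the vote, for example by weighting each model by its validation score.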
## Documentation
Full documentation is available at: [https://tymill.github.io/SynthPred.jl](https://tymill.github.io/SynthPred.jl)

---
## Project Structure
```
SynthPred/
├── Project.toml
├── src/
│   ├── SynthPred.jl
│   ├── Exploration.jl
│   ├── Imputer.jl
│   └── AutoML.jl
├── data/
│   ├── example.csv
│   └── new_data.csv
├── reports/
│   └── imputation_report.json
├── docs/
│   └── src/index.md
├── test/
│   └── runtests.jl
└── main.jl
```

---
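The layout above pairs with the qualified calls used elsewhere in this README, such as `SynthPred.Exploration.describe_data`, which suggests one submodule per file under `src/`. The sketch below shows that nesting pattern in a single runnable file. It is an assumption about the package's organization: `SynthPredSketch` and the function bodies are hypothetical stand-ins, and in a real package the top-level file would typically pull in each file with `include("Exploration.jl")` rather than defining the submodules inline.

```julia
# Hypothetical sketch of a nested-module layout; not the actual package source.
module SynthPredSketch

module Exploration
# Stand-in with an invented signature: counts entries and missing entries.
describe_data(col) = (n = length(col), n_missing = count(ismissing, col))
end

module Imputer
# Stand-in: replace missing entries with a caller-supplied constant.
fill_constant(col, value) = [ismissing(x) ? value : x for x in col]
end

end # module SynthPredSketch

col = [1, missing, 3]
println(SynthPredSketch.Exploration.describe_data(col))
println(SynthPredSketch.Imputer.fill_constant(col, 0))
```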
## Roadmap
- [x] Core modules: Exploration, Imputer, AutoML
- [x] ARIMA and RNN-based imputations
- [x] AutoML + model blending with MLJ.jl
- [x] Imputation reports (CSV/JSON)
- [x] Documentation (Documenter.jl + GitHub Pages)
- [ ] Exporting trained models (`JLD2`, `BSON`)
- [ ] Web GUI with Pluto.jl or Dash.jl
- [ ] Integration with JuliaHub and Zenodo DOI

---
## Contributing
Pull requests are welcome! For major changes, please open an issue first to discuss your proposal.

---
## License
MIT License © 2025 Tymoteusz Miller

---
## Contact
me@tymoteuszmiller.dev

---
Built with ❤️ in Julia for real-world ML and scientific discovery.