Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with data-splitting

A curated list of projects in awesome lists tagged with data-splitting.

https://github.com/szcf-weiya/splitclustertest.jl

Julia package for the paper "FDR Control via Data Splitting for Testing-after-Clustering" (arXiv:2410.06451).

data-splitting fdr post-selection

Last synced: 16 Nov 2024

https://github.com/aarryasutar/credit_eda

This project focuses on cleaning and analyzing a loan application dataset to gain insights into the factors influencing loan defaults. Through systematic data cleaning, visualization, and merging with previous application data, it provides a robust foundation for further predictive modeling.

binning boxplot correlation-matrix data-cleaning data-splitting dataframe feature-engineering heatmap jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn

Last synced: 13 Nov 2024

https://github.com/lefteris-souflas/propensity-to-lapse-model-building-exercise

Analyzes customer churn using transaction data and builds an ML model to predict lapses. The dataset includes customer status, collection/redemption information, and program tenure. Includes a business presentation outlining the modeling approach, findings, and churn-reduction strategies.

cluster-analysis data-driven-decisions data-preprocessing data-splitting decision-tree feature-engineering gradient-boosting logistic-regression model-interpretation model-optimization model-selection-and-evaluation neural-network random-forest sas-visual-analytics support-vector-machine

Last synced: 13 Nov 2024

https://github.com/lefteris-souflas/spark-movies-analytics

Uses Apache Spark and PySpark to analyze a movie dataset. Tasks include data exploration, identifying top-rated movies, training a linear regression model, and experimenting with Airflow.

apache-airflow cross-validation dag data-splitting hyperparameter-tuning linear-regression model-evaluation one-hot-encoding pipeline pyspark pyspark-mllib pyspark-sql spark-session

Last synced: 13 Nov 2024

https://github.com/katiebristol/data_splitter

A basic Python script to split a .dat file into individual sample files.

converter data-converter data-splitting paleomagnetism python

Last synced: 24 Nov 2024