Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/chrise96/3D_Ground_Segmentation
A ground segmentation algorithm for 3D point clouds based on the work described in “Fast segmentation of 3D point clouds: a paradigm on LIDAR data for Autonomous Vehicle Applications”, D. Zermas, I. Izzat and N. Papanikolopoulos, 2017. Distinguishes road from non-road points by extracting the road surface with a plane-fit ground filter.
cpp extraction ground ground-segmentation lastools lidar non-ground point-cloud preprocessing road-surface
Last synced: 01 Jul 2024
![](https://github.com/chrise96.png)
https://github.com/Alyetama/Ray-Image
Fast image compression for large numbers of images with the Ray library
compression image-compression jpeg preprocessing python ray
Last synced: 27 Jun 2024
![](https://github.com/Alyetama.png)
https://github.com/niklaswais/gesp
court-decisions preprocessing web-scraping
Last synced: 26 Jun 2024
![](https://github.com/niklaswais.png)
https://github.com/sinaahmadi/ScriptNormalization
Script Normalization for Unconventional Writing of Perso-Arabic scripts (ACL2023)
acl2023 arabic azeri gilaki gorani kashmiri kurdish kurdish-language-processing kurmanji less-resource-languages mazanderani nlp persian preprocessing script-normalization sindhi sorani turkish urdu
Last synced: 23 Jun 2024
![](https://github.com/sinaahmadi.png)
https://github.com/nipreps/dmriprep
dMRIPrep is a robust and easy-to-use pipeline for preprocessing of diverse dMRI data. The transparent workflow dispenses with manual intervention, thereby ensuring the reproducibility of the results.
bids bids-apps diffusion-mri magnetic-resonance-imaging preprocessing
Last synced: 20 Jun 2024
![](https://github.com/nipreps.png)
https://github.com/bids-apps/HCPPipelines
A BIDS App for minimal preprocessing using the HCP Pipelines
anatomical-mri bids bidsapp functional-mri mri preprocessing
Last synced: 20 Jun 2024
![](https://github.com/bids-apps.png)
https://github.com/bids-apps/freesurfer
BIDS app wrapping recon-all from FreeSurfer
anatomical-mri bids bidsapp mri preprocessing
Last synced: 20 Jun 2024
![](https://github.com/bids-apps.png)
https://github.com/bids-apps/CPAC
BIDS Application for the Configurable Pipeline for the Analysis of Connectomes (C-PAC)
bids bidsapp mri preprocessing
Last synced: 20 Jun 2024
![](https://github.com/bids-apps.png)
https://github.com/bids-apps/afni_proc
Prototype AFNI BIDS App implementing participant-level preprocessing with afni_proc.py
Last synced: 20 Jun 2024
![](https://github.com/bids-apps.png)
https://github.com/madyankin/postcss-each
PostCSS plugin to iterate through values
css iteration postcss preprocessing
Last synced: 19 Jun 2024
![](https://github.com/madyankin.png)
https://github.com/wiz-craft/wiz-craft
A CLI-based dataset preprocessing tool for machine learning tasks. Features include data exploration, null value handling, one-hot encoding, and feature scaling; the modified dataset can then be downloaded.
cli cli-app dataset machine-learning preprocessing
Last synced: 16 Jun 2024
![](https://github.com/wiz-craft.png)
https://github.com/DataCanvasIO/HyperGBM
A full pipeline AutoML tool for tabular data
adversarial-validation automl catboost dask dask-distributed datacleaning distributed-training ensemble-learning fullpipeline gbm gpu-acceleration lightgbm preprocessing pseudo-labeling rapidsai semi-supervised-learning sklearn tabular-data xgboost
Last synced: 13 Jun 2024
![](https://github.com/DataCanvasIO.png)
https://github.com/maruedt/chemometrics
Python library for chemometric data analysis
chemometrics data-analysis ihm mcr mvda pca pls preprocessing python scikit-learn spectroscopy statistics
Last synced: 10 Jun 2024
![](https://github.com/maruedt.png)
https://github.com/qd-cae/awesome-CAE
A curated list of awesome CAE frameworks, libraries and software.
abaqus cae cfd collection fem libraries ls-dyna preprocessing scripting tools
Last synced: 10 Jun 2024
![](https://github.com/qd-cae.png)
https://github.com/kharchenkolab/dropEst
Pipeline for initial analysis of droplet-based single-cell RNA-seq data
pipeline preprocessing scrna-seq single-cell-rna-seq
Last synced: 09 Jun 2024
![](https://github.com/kharchenkolab.png)
https://github.com/mlr-org/mlr3pipelines
Dataflow Programming for Machine Learning in R
bagging data-science dataflow-programming ensemble-learning machine-learning mlr3 pipelines preprocessing r r-package stacking
Last synced: 04 Jun 2024
![](https://github.com/mlr-org.png)
https://github.com/TheAlgorithms/R
Collection of various algorithms implemented in R.
algorithm algorithms classification clustering data-mining datamanipulation education hacktoberfest learning machine-learning practice preprocessing r r-language r-programming regression
Last synced: 04 Jun 2024
![](https://github.com/TheAlgorithms.png)
https://github.com/winedarksea/AutoTS
Automated Time Series Forecasting
automl autots deep-learning feature-engineering forecasting machine-learning preprocessing time-series
Last synced: 31 May 2024
![](https://github.com/winedarksea.png)
https://github.com/Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
data-pipelines deep-learning document-image-analysis document-image-processing document-parser document-parsing docx donut information-retrieval langchain llm machine-learning ml natural-language-processing nlp ocr pdf pdf-to-json pdf-to-text preprocessing
Last synced: 31 May 2024
![](https://github.com/Unstructured-IO.png)
https://github.com/advaitsave/Introduction-to-Time-Series-forecasting-Python
Introduction to time series preprocessing and forecasting in Python using AR, MA, ARMA, ARIMA, SARIMA and Prophet model with forecast evaluation.
arima arma dickey-fuller forecast-evaluation forecasting preprocessing prophet-model python sarima seasonality series-forecasting-python series-preprocessing stationarity time-series time-series-forecasting
Last synced: 31 May 2024
![](https://github.com/advaitsave.png)
https://github.com/sunlabuiuc/PyHealth
A Deep Learning Python Toolkit for Healthcare Applications.
clinical-data clinical-research data-mining deep-learning electronic-health-record electronic-medical-record healthcare medical-code preprocessing
Last synced: 24 May 2024
![](https://github.com/sunlabuiuc.png)
https://github.com/LaurentDardenne/Template
Code generation by using text templates
conditional parsing-engine powershell powershell-module preprocessing preprocessor regex template template-engine transformations
Last synced: 22 May 2024
![](https://github.com/LaurentDardenne.png)
https://github.com/dongrixinyu/JioNLP
A Chinese NLP preprocessing and parsing package: accurate, efficient, and easy to use. www.jionlp.com
apache2 chinese natural-language-processing ner nlp nlp-parse preprocessing python time-parse time-parsing
Last synced: 14 May 2024
![](https://github.com/dongrixinyu.png)
https://github.com/ropensci/MODIStsp
An "R" package for automatic download and preprocessing of MODIS Land Products Time Series
gdal modis modis-data modis-land-products peer-reviewed preprocessing r r-package remote-sensing rstats satellite-imagery time-series
Last synced: 10 May 2024
![](https://github.com/ropensci.png)
https://github.com/jbusecke/xMIP
Analysis-ready CMIP6 data in Python the easy way with Pangeo tools.
analysis-ready-data climate-analysis climate-models cmip6 cmip6-data pangeo preprocessing xgcm
Last synced: 09 May 2024
![](https://github.com/jbusecke.png)
https://github.com/OpenGene/fastp
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
adapter bioinformatics duplication fastq filter filtering illumina merging ngs overlap polyg preprocessing qc quality quality-control sequencing splitting trimming umi
Last synced: 08 May 2024
![](https://github.com/OpenGene.png)
https://github.com/NVIDIA-Merlin/NVTabular
NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.
deep-learning feature-engineering feature-selection gpu machine-learning nvidia preprocessing recommendation-system recommender-system
Last synced: 07 May 2024
![](https://github.com/NVIDIA-Merlin.png)
https://github.com/AxeldeRomblay/MLBox
MLBox is a powerful Automated Machine Learning python library.
auto-ml automated-machine-learning automl classification data-science deep-learning distributed drift encoding kaggle keras lightgbm machine-learning optimization pipeline prediction preprocessing regression stacking xgboost
Last synced: 05 May 2024
![](https://github.com/AxeldeRomblay.png)
https://github.com/KinWaiCheuk/nnAudio
Audio processing using PyTorch 1D convolution networks
1d-convolution audio-processing cqt-spectrogram melspectrogram neural-network preprocessing pytorch spectrogram spectrogram-conversion-toolbox stft
Last synced: 28 Apr 2024
![](https://github.com/KinWaiCheuk.png)
https://github.com/nidhaloff/igel
A delightful machine learning tool that allows you to train, test, and use models without writing code
artificial-intelligence automation automl automl-experiments data-analysis data-science hacktoberfest hacktoberfest2021 machine-learning machine-learning-algorithms machine-learning-library machinelearning neural-network neural-networks preprocessing scikit-learn scikitlearn-machine-learning sklearn
Last synced: 23 Apr 2024
![](https://github.com/nidhaloff.png)
https://github.com/chrislemke/sk-transformers
A collection of pandas & scikit-learn compatible transformers for preprocessing and feature engineering 🛠
data-science feature-engineering feature-selection machine-learning pandas preprocessing python scikit-learn scikit-learn-pipelines scikit-learn-transformer
Last synced: 22 Apr 2024
![](https://github.com/chrislemke.png)
https://github.com/MaxHalford/xam
Personal data science and machine learning toolbox
data-science machine-learning preprocessing python stacking
Last synced: 20 Apr 2024
![](https://github.com/MaxHalford.png)
https://github.com/msamogh/nonechucks
Deal with bad samples in your dataset dynamically, use Transforms as Filters, and more!
data-cleaning data-pipeline data-preprocessing data-processing machine-learning preprocessing pytorch torch
Last synced: 19 Apr 2024
![](https://github.com/msamogh.png)
https://github.com/pytorch/torcharrow
High performance model preprocessing library on PyTorch
Last synced: 16 Apr 2024
![](https://github.com/pytorch.png)
https://github.com/infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
data-pipelines deep-learning document-parser document-understanding information-retrieval llm llmops machine-learning nlp ocr orchestration pdf-to-text preprocessing rag retrieval-augmented-generation table-structure-recognition
Last synced: 08 Apr 2024
![](https://github.com/infiniflow.png)
https://github.com/Neurita/pypes
Reusable neuroimaging pipelines using nipype
dti fmri ica neuroimaging nipype pet plotting preprocessing
Last synced: 01 Apr 2024
![](https://github.com/Neurita.png)
https://github.com/Davisy/Texthero-Python-Toolkit
Texthero is a simple Python toolkit for working with text-based datasets. It provides quick and effortless functionality to preprocess, represent, map into vectors, and visualize text data in just a couple of lines of code.
machine-learning natural-language-processing nlp preprocessing python
Last synced: 30 Mar 2024
![](https://github.com/Davisy.png)
https://github.com/raj-sutariya/indic-num2words
Python library for converting numbers to words for all Indian Languages.
indian-languages indic nlp preprocessing python speech-processing
Last synced: 27 Mar 2024
![](https://github.com/raj-sutariya.png)
https://github.com/ikegami-yukino/neologdn
Japanese text normalizer for mecab-neologd
japanese-language mecab-ipadic-neologd nlp preprocessing text-normalization
Last synced: 17 Mar 2024
![](https://github.com/ikegami-yukino.png)
https://github.com/lozuwa/impy
Impy is a Python3 library with features that help you in your computer vision tasks.
dataset exploratory-data-analysis machine-learning preprocessing raw-data statistics tidy-data
Last synced: 16 Mar 2024
![](https://github.com/lozuwa.png)