Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with preprocessing

A curated list of projects in awesome lists tagged with preprocessing.

https://github.com/dongrixinyu/JioNLP

A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com

apache2 chinese natural-language-processing ner nlp nlp-parse preprocessing python time-parse time-parsing

Last synced: 27 Oct 2024

https://github.com/opengene/fastp

An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)

adapter bioinformatics duplication fastq filter filtering illumina merging ngs overlap polyg preprocessing qc quality quality-control sequencing splitting trimming umi

Last synced: 17 Dec 2024

https://github.com/nvidia-merlin/nvtabular

NVTabular is a feature engineering and preprocessing library for tabular data, designed to quickly and easily manipulate terabyte-scale datasets used to train deep-learning-based recommender systems.

deep-learning feature-engineering feature-selection gpu machine-learning nvidia preprocessing recommendation-system recommender-system

Last synced: 17 Dec 2024

https://github.com/pytorch/torcharrow

High-performance model preprocessing library on PyTorch

preprocessing python pytorch

Last synced: 07 Oct 2024

https://github.com/msamogh/nonechucks

Deal with bad samples in your dataset dynamically, use Transforms as Filters, and more!

data-cleaning data-pipeline data-preprocessing data-processing machine-learning preprocessing pytorch torch

Last synced: 14 Nov 2024

https://github.com/maxhalford/xam

Personal data science and machine learning toolbox

data-science machine-learning preprocessing python stacking

Last synced: 17 Dec 2024

https://github.com/ikegami-yukino/jaconv

Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, and Zenkaku

character-converter japanese-kana japanese-language julius preprocessing pure-python text-processing transliteration

Last synced: 19 Dec 2024

https://github.com/advaitsave/Introduction-to-Time-Series-forecasting-Python

Introduction to time series preprocessing and forecasting in Python using AR, MA, ARMA, ARIMA, SARIMA and Prophet model with forecast evaluation.

arima arma dickey-fuller forecast-evaluation forecasting preprocessing prophet-model python sarima seasonality series-forecasting-python series-preprocessing stationarity time-series time-series-forecasting

Last synced: 30 Oct 2024

https://github.com/dunky11/voicesmith

[WIP] VoiceSmith makes training text-to-speech models easy.

dataset-manager delightfultts preprocessing speech-synthesis text-to-speech toolkit tts univnet voice-cloning

Last synced: 18 Dec 2024

https://github.com/jbusecke/xMIP

Analysis-ready CMIP6 data in Python the easy way with Pangeo tools.

analysis-ready-data climate-analysis climate-models cmip6 cmip6-data pangeo preprocessing xgcm

Last synced: 27 Nov 2024

https://github.com/ropensci/modistsp

An "R" package for automatic download and preprocessing of MODIS Land Products Time Series

gdal modis modis-data modis-land-products peer-reviewed preprocessing r r-package remote-sensing rstats satellite-imagery time-series

Last synced: 22 Dec 2024

https://github.com/githubharald/deslantimg

The deslanting algorithm sets text upright in images. Python, C++ and OpenCL implementations provided.

c-plus-plus gpu handwriting-recognition image-processing ocr opencl opencv preprocessing python

Last synced: 19 Nov 2024

https://github.com/chakki-works/chariot

Deliver the ready-to-train data to your NLP model.

keras natural-language-processing preprocessing python tensorflow

Last synced: 11 Nov 2024

https://github.com/lozuwa/impy

Impy is a Python3 library with features that help you in your computer vision tasks.

dataset exploratory-data-analysis machine-learning preprocessing raw-data statistics tidy-data

Last synced: 03 Nov 2024

https://github.com/chrise96/3D_Ground_Segmentation

A ground segmentation algorithm for 3D point clouds based on the work described in “Fast segmentation of 3D point clouds: a paradigm on LIDAR data for Autonomous Vehicle Applications”, D. Zermas, I. Izzat and N. Papanikolopoulos, 2017. Distinguishes between road and non-road points. Road surface extraction. Plane-fit ground filter.

cpp extraction ground ground-segmentation lastools lidar non-ground point-cloud preprocessing road-surface

Last synced: 27 Oct 2024

https://github.com/madyankin/postcss-each

PostCSS plugin to iterate through values

css iteration postcss preprocessing

Last synced: 11 Nov 2024

https://github.com/kharchenkolab/dropEst

Pipeline for initial analysis of droplet-based single-cell RNA-seq data

pipeline preprocessing scrna-seq single-cell-rna-seq

Last synced: 06 Nov 2024

https://github.com/nipreps/dmriprep

dMRIPrep is a robust and easy-to-use pipeline for preprocessing of diverse dMRI data. The transparent workflow dispenses with manual intervention, thereby ensuring the reproducibility of the results.

bids bids-apps diffusion-mri magnetic-resonance-imaging preprocessing

Last synced: 11 Nov 2024

https://github.com/elcorto/pwtools

pwtools is a Python package for pre- and postprocessing of atomistic calculations, mostly targeted to Quantum Espresso, CPMD, CP2K and LAMMPS. It is almost, but not quite, entirely unlike ASE, with some tools extending numpy/scipy. It has a set of powerful parsers and data types for storing calculation data.

ase cp2k cpmd kernel-regression kernel-ridge-regression lammps molecular-dynamics multivariate-regression parameter-sweep polynomial-regression postprocessing preprocessing python quantum-espresso quasi-harmonic-approximation radial-basis-function radial-distribution-function radial-pair-correlation-function sqlite

Last synced: 18 Dec 2024

https://github.com/takelab/podium

Podium: a framework-agnostic Python NLP library for data loading and preprocessing

data-loading datasets natural-language-processing nlp preprocessing python

Last synced: 08 Nov 2024

https://github.com/paulross/cpip

CPIP - a C/C++ preprocessor implemented in Python.

c c-plus-plus pre-processing pre-processor preprocessing preprocessor python

Last synced: 17 Dec 2024

https://github.com/bids-apps/freesurfer

BIDS app wrapping recon-all from FreeSurfer

anatomical-mri bids bidsapp mri preprocessing

Last synced: 11 Nov 2024

https://github.com/bids-apps/HCPPipelines

A BIDS App for minimal preprocessing using the HCP Pipelines

anatomical-mri bids bidsapp functional-mri mri preprocessing

Last synced: 11 Nov 2024

https://github.com/juliaml/mllabelutils.jl

Utility package for working with classification targets and label-encodings

classification julia machine-learning preprocessing

Last synced: 20 Nov 2024

https://github.com/intuition-dev/intuition

Intuition v1. CLI for Pug, CRUD and docs/blogs as staticGen, and much more.

component low-code markdown preprocessing pug seo static-site-generator web webapp

Last synced: 12 Oct 2024

https://github.com/fkie-cad/logprep

Log data preprocessing, generation, and shipping in Python

etl kafka log logdata loggenerator logshipper opensearch preprocessing python soar sre

Last synced: 16 Dec 2024

https://github.com/vasisouv/tweets-preprocessor

Repo containing the Twitter preprocessor module, developed by the AUTH OSWinds team

nltk preprocessing python spacy spacy-nlp twitter

Last synced: 15 Dec 2024

https://github.com/lucasrla/wsi-preprocessing

Simple library for preprocessing histopathological whole-slide images (WSI) into tiles (a.k.a. patches) towards deep learning

fastai histopathology libvips openslide pathology preprocessing pytorch pyvips whole-slide-imaging wsi

Last synced: 14 Oct 2024

https://github.com/nobodywasishere/vhdlproc

VHDLproc is a VHDL preprocessor

preprocessing python vhdl vhdl-preprocessor

Last synced: 20 Dec 2024

https://github.com/fitushar/brain-tissue-segmentation-using-deep-learning-pipeline-neuronet

This repository is for the MISA course final project, which was brain tissue segmentation. We adopt NeuroNet, a comprehensive brain image segmentation tool based on a novel multi-output CNN architecture, which has been trained and tuned using the IBSR18 dataset.

3d 3dfcn brain brain-tissue-segmentation cnn-architecture dice neuronet preprocessing registration segmentation

Last synced: 08 Nov 2024

https://github.com/akb89/pyfn

A Python module to process data for Frame Semantic Parsing

coling2018 frame-semantic-parsing framenet framenet-xml-data open-sesame pipeline preprocessing semafor

Last synced: 09 Nov 2024

https://github.com/banditml/faucetml

High speed mini-batch data reading & preprocessing from BigQuery.

bigquery feature-engineering features machine-learning ml preprocessing pytorch

Last synced: 03 Dec 2024

https://github.com/louisbrulenaudet/docutron

Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents.

cv2 detecron2 detection document legal legaltech legaltools llm machine-learning nlp ocr ocr-recognition preprocessing

Last synced: 23 Nov 2024

https://github.com/Neurita/pypes

Reusable neuroimaging pipelines using nipype

dti fmri ica neuroimaging nipype pet plotting preprocessing

Last synced: 12 Nov 2024

https://github.com/bids-apps/CPAC

BIDS Application for the Configurable Pipeline for the Analysis of Connectomes (C-PAC)

bids bidsapp mri preprocessing

Last synced: 11 Nov 2024

https://github.com/lucasrla/wsi-tile-cleanup

Image filters for digital pathology: detect pen marks, background, and artifacts. Use them for preprocessing towards deep learning

deep-learning fastai histopathology libvips otsu-threshold pathology preprocessing pytorch pyvips whole-slide-imaging wsi

Last synced: 14 Oct 2024

https://github.com/deepraj1729/tchatbot-api

A Flask REST API to serve trained ChatBots using Tensorflow Serving and Docker Containers

api-rest chatbot deep-learning flask flask-restful framwork keras nlp preprocessing requests tensorflow tf-serving

Last synced: 12 Nov 2024

https://github.com/yeonghyeon/preprocessing-method-for-stemi-detection

Official source code of "Preprocessing Method for Performance Enhancement in CNN-based STEMI Detection from 12-lead ECG"

cnn convolutional-neural-network ecg electrocardiogram enhancement highpass-filter improvement lead notch-filter preprocessing python qrs-complex stemi-detection voting

Last synced: 11 Nov 2024

https://github.com/marrow/dsl

A Pythonic DSL construction engine for import-time code translation.

cpython dsl preprocessing preprocessor pypy python python-2 python-3 text-processing

Last synced: 12 Nov 2024

https://github.com/evernext10/hand-gesture-recognition-machine-learning

Automatic method for recognizing hand gestures to categorize vowels and numbers in Colombian Sign Language, using neural networks (perceptrons), support vector machines, and k-nearest neighbors as classifiers.

artificial-intelligence colombian-sign-language colombian-signal-language f1-score feature-extraction gesture hand knearest-neighbor-classifier knn-classification knn-classifier lsc machine-learning machinelearning neural-network precision preprocessing recall recognition signal-processing support-vector-machines

Last synced: 09 Nov 2024

https://github.com/adobe-research/beacon-aug

Cross-library augmentation toolbox supporting 300 operators over 8 libraries + AI transforms

albumentation augly augmentation beacon conversion cross-platform deep-learning gan imgaug mmcv preprocessing transformations

Last synced: 14 Nov 2024

https://github.com/gianlucatruda/warfit-learn

A machine learning toolkit for reproducible research in anticoagulant dose estimation.

data-science iwpc pandas preprocessing python reproducible-research sklearn supervised-learning warfarin warfit-learn

Last synced: 10 Oct 2024

https://github.com/james77777778/keras-aug

A library that includes pure TF/Keras preprocessing and augmentation layers, providing support for various data types such as images, labels, bounding boxes, segmentation masks, and more.

augmentation keras keras-cv preprocessing tensorflow

Last synced: 13 Oct 2024

https://github.com/huangzhii/tsunami

R software for gene co-expression analysis

co-expression gene preprocessing

Last synced: 19 Nov 2024

https://github.com/miferreiro/bdpar

Big Data Preprocessing Architecture

custom-flow custom-pipes preprocessing r r6

Last synced: 08 Nov 2024

https://github.com/chrislemke/sk-transformers

A collection of pandas & scikit-learn compatible transformers for preprocessing and feature engineering 🛠

data-science feature-engineering feature-selection machine-learning pandas preprocessing python scikit-learn scikit-learn-pipelines scikit-learn-transformer

Last synced: 13 Oct 2024

https://github.com/bencardoen/datacurator.jl

A scalable Julia package to transparently validate and transform large biomedical datasets using human-readable recipes that are translated to machine-verifiable templates.

julia julia-package portable postprocessing preprocessing reproducible-research scalability

Last synced: 02 Nov 2024

https://github.com/boudinfl/semeval-2010-pre

Preprocessed SemEval-2010 benchmark dataset for keyphrase extraction

dataset information-retrieval keyphrase-extraction natural-language-processing preprocessing

Last synced: 02 Dec 2024

https://github.com/bcbi/preprocessmd.jl

Medically-informed data preprocessing for machine learning

julia machine-learning omop preprocessing

Last synced: 07 Dec 2024

https://github.com/aflah02/cleansetext

This is a simple library to help you clean your textual data

cleaning-data nlp preprocessing pypi text

Last synced: 20 Nov 2024

https://github.com/Davisy/Texthero-Python-Toolkit

Texthero is a simple Python toolkit for working with text-based datasets. It provides quick, effortless functionality to preprocess, represent, map into vectors, and visualize text data in just a couple of lines of code.

machine-learning natural-language-processing nlp preprocessing python

Last synced: 04 Nov 2024

https://github.com/fostroll/toxine

Tiny preprocessor for Russian text

natural-language-processing nlp preprocessing python

Last synced: 21 Dec 2024

https://github.com/nashory/loader-torch

A multi-threaded data loader module for Torch.

data dataloader loader module multi-thread preprocessing toolbox torch

Last synced: 19 Nov 2024

https://github.com/lucasrla/wsi-preprocessing-sos-workflow

A pipeline to preprocess whole-slide images (WSI) towards deep learning

deep-learning histopathology pathology preprocessing sos sos-workflow whole-slide-imaging

Last synced: 29 Nov 2024

https://github.com/pratikunterwegs/atlastools

Tools for pre-processing high-throughput animal tracking data.

animal-movement animal-tracking movement-ecology preprocessing

Last synced: 02 Dec 2024

https://github.com/krzmbrzl/armapreprocessortestcases

A collection of tests I have performed on the Arma preprocessor. Each test consists of an input file and an output file containing what the Arma preprocessor made of it.

arma arma3 preprocessing preprocessor test-cases

Last synced: 18 Dec 2024

https://github.com/fracpete/missing-values-imputation-weka-package

Weka package for missing values imputation and injection using various techniques.

filters java machine-learning plugin preprocessing weka

Last synced: 10 Dec 2024

https://github.com/ikegami-yukino/neologdn-java

Japanese text normalizer for mecab-neologd

java nlp preprocessing text-processing

Last synced: 18 Nov 2024

https://github.com/rom1504/tensorflow_captcha_solver

Captcha solver based on https://medium.com/@ageitgey/how-to-break-a-captcha-system-in-15-minutes-with-machine-learning-dbebb035a710

captcha-solving deep-learning preprocessing tensorflow vision

Last synced: 15 Dec 2024

https://github.com/pgcai/dorapy

Dorapy is a deep learning framework that focuses on data pre-processing.🛸

deep-learning deeplearning-framework preprocessing python

Last synced: 12 Nov 2024