Projects in Awesome Lists tagged with data-processing-pipelines
A curated list of projects in awesome lists tagged with data-processing-pipelines.
https://github.com/NVIDIA/NeMo-Curator
Scalable data preprocessing and curation toolkit for LLMs
data data-curation data-prep data-preparation data-processing data-processing-pipelines data-quality datacuration datarecipes deduplication fast-data-processing fine-tuning large-language-models large-scale-data-processing llm llm-data-quality llmapps python semantic-deduplication
Last synced: 29 Jul 2025
https://github.com/NVIDIA-NeMo/Curator
Scalable data preprocessing and curation toolkit for LLMs
data data-curation data-prep data-preparation data-processing data-processing-pipelines data-quality datacuration datarecipes deduplication fast-data-processing fine-tuning large-language-models large-scale-data-processing llm llm-data-quality llmapps python semantic-deduplication
Last synced: 20 Jul 2025
https://github.com/graphbookai/graphbook
The framework for AI-driven data pipelines. Build interactive, highly efficient data pipelines with PyTorch.
ai data-processing data-processing-pipelines data-science framework machine-learning ml pytorch research workflow
Last synced: 07 Sep 2025
https://github.com/tamasgal/thepipe
A simplistic, general purpose pipeline framework.
data-processing data-processing-pipelines data-science hacktoberfest pipelines provenance python
Last synced: 21 Mar 2025
https://github.com/mehanix/dhrw
🎢 IaaS visual editor to create & deploy data processing pipelines - python, rmq, react, meteorjs
computational-graph computational-graphs data-analysis data-engineering data-pipeline data-pipelines data-processing data-processing-and-analysis data-processing-pipelines data-processing-system data-science data-visualization docker-compose good-first-issue help-wanted meteorjs-application rabbitmq react-flow
Last synced: 04 Apr 2025