https://github.com/vincentzed/decon
`decon`, but with Python API bindings.
- Host: GitHub
- URL: https://github.com/vincentzed/decon
- Owner: vincentzed
- License: apache-2.0
- Fork: true (allenai/decon)
- Created: 2026-01-09T15:41:41.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2026-01-09T22:35:27.000Z (6 months ago)
- Last Synced: 2026-01-12T18:34:53.829Z (6 months ago)
- Topics: benchmark, data-pipeline, data-processing, data-science, datacomp, decontaminate, deduplication, evaluation, instruction-tuning, llm, llm-eval, llm-evaluation, llms, lm-evaluation, nlp, pretraining, synthetic-data
- Language: Rust
- Homepage: https://pypi.org/project/decontaminate/
- Size: 7.31 MB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0