https://github.com/bigscience-workshop/data_sourcing
This directory gathers the tools developed by the Data Sourcing Working Group
- Host: GitHub
- URL: https://github.com/bigscience-workshop/data_sourcing
- Owner: bigscience-workshop
- License: apache-2.0
- Created: 2021-09-28T19:05:52.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2021-10-25T19:16:46.000Z (over 4 years ago)
- Last Synced: 2025-09-09T16:09:25.060Z (6 months ago)
- Language: Python
- Size: 755 KB
- Stars: 31
- Watchers: 17
- Forks: 6
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# BigScience Data Sourcing Code
This directory gathers the tools developed by the Data Sourcing Working Group.
## First Sourcing Sprint: October 2021
- The code for the input form can be found in `sourcing_sprint/streamlit_form.py`.
- The code for the exploration tool can be found in `sourcing_sprint/streamlit_explore.py`.
- The resource entries can be found in `sourcing_sprint/resources` (one folder per language, one `.jsonl` file per resource).
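
The resource layout described above (one folder per language, one `.jsonl` file per resource) can be read with a few lines of standard-library Python. The following is a minimal sketch, not code from this repository: the `load_resources` helper is a hypothetical name, and it assumes each non-empty line of a `.jsonl` file is a standalone JSON value. It makes no assumption about which fields an entry contains.

```python
import json
from pathlib import Path


def load_resources(root):
    """Hypothetical helper: read every <language>/<resource>.jsonl under root.

    Returns a nested dict: {language_folder: {resource_name: [records]}}.
    Assumes each non-empty line is one JSON value (the JSON Lines convention).
    """
    resources = {}
    for jsonl_path in sorted(Path(root).glob("*/*.jsonl")):
        language = jsonl_path.parent.name  # folder name, one per language
        with jsonl_path.open(encoding="utf-8") as handle:
            # Skip blank lines so a trailing newline does not break parsing.
            records = [json.loads(line) for line in handle if line.strip()]
        resources.setdefault(language, {})[jsonl_path.stem] = records
    return resources
```

Calling `load_resources("sourcing_sprint/resources")` from the repository root would then give one dictionary key per language folder, which is convenient for counting entries per language or for feeding them into another tool.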