https://github.com/eleutherai/pile-explorer

For exploring the data and documenting its limitations

# Exploring the Pile
This repository contains code for exploring the Pile and documenting its limitations.

## Language Modeling Data Format

The data in the Pile is stored in the [lm_dataformat](https://github.com/leogao2/lm_dataformat) format, and this repository is designed to work on data stored in that format. For documentation, see the linked repository.
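
As a rough illustration of what working with this data can look like, the sketch below iterates over records once a shard has been decompressed. It assumes each record is a single JSON object per line with a `text` field and an optional `meta` field; the helper name `iter_documents` and the sample records are hypothetical and are not part of this repository or of `lm_dataformat`. It does not handle decompression, so for real archives the reader provided by `lm_dataformat` is likely the better choice.

```python
import io
import json
from typing import Dict, Iterable, Iterator, Tuple


def iter_documents(lines: Iterable[str]) -> Iterator[Tuple[str, Dict]]:
    """Yield (text, meta) pairs from already-decompressed JSON Lines records.

    Assumption: every non-blank line is a JSON object with a "text" key
    and, optionally, a "meta" key.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        yield record["text"], record.get("meta", {})


# Hypothetical sample standing in for a decompressed shard.
sample = io.StringIO(
    '{"text": "First document.", "meta": {"pile_set_name": "ExampleSet"}}\n'
    '{"text": "Second document."}\n'
)

docs = list(iter_documents(sample))
for text, meta in docs:
    print(len(text), meta)
```

Because the function accepts any iterable of strings, the same loop works on an open text file handle as well as on the in-memory sample used here.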