Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jack-development/redditfetch
RedditFetch is a robust tool for collecting and managing Reddit user data using Python and PRAW. It fetches posts and comments, assigns unique IDs, and structures the data seamlessly for easy access and analysis.
api dataset-generation praw pytorch reddit
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/jack-development/redditfetch
- Owner: Jack-Development
- License: mit
- Created: 2023-08-17T05:49:46.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-08-18T14:52:19.000Z (over 1 year ago)
- Last Synced: 2023-08-18T16:10:42.432Z (over 1 year ago)
- Topics: api, dataset-generation, praw, pytorch, reddit
- Language: Python
- Homepage:
- Size: 35.2 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# RedditFetch
RedditFetch is a tool for collecting and managing datasets from Reddit. It fetches posts and comments from specified Reddit users, assigns a unique ID to each user, and saves the data in a structured format.
Inspired by the vast amount of data available on Reddit and the need for a streamlined data collection process, this project was developed to provide a seamless experience for researchers and developers interested in Reddit data.
The initial implementation focuses on user-specific data collection, but the modular architecture of the codebase allows for potential expansion to other Reddit data types.
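The "assign unique IDs and save in a structured manner" step described above could look roughly like the sketch below. This is not taken from the RedditFetch codebase: the function names `assign_user_ids` and `save_user_data`, the use of random UUIDs, and the one-JSON-file-per-user layout are all assumptions made for illustration.

```python
import json
import uuid
from pathlib import Path


def assign_user_ids(usernames, existing=None):
    """Map each username to an opaque ID, reusing any IDs already assigned.

    Hypothetical helper: the real project may generate IDs differently.
    """
    mapping = dict(existing or {})
    for name in usernames:
        if name not in mapping:
            mapping[name] = uuid.uuid4().hex
    return mapping


def save_user_data(output_dir, user_id, posts, comments):
    """Write one JSON file per user, named after the assigned ID (assumed layout)."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / f"{user_id}.json"
    record = {"user_id": user_id, "posts": posts, "comments": comments}
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path
```

Passing a previously saved mapping as `existing` keeps IDs stable across runs, so a user fetched twice ends up in the same file.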
## Skills and Technologies Used
The project relies heavily on:
- Python
- PRAW (Python Reddit API Wrapper)
- JSON
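As a rough illustration of how these three pieces might fit together, the sketch below uses PRAW to pull a user's newest posts and comments into plain dictionaries that can be serialised as JSON. The helper names `fetch_user_activity` and `make_client`, the chosen fields, and the default limit are assumptions for this example, not the project's actual interface. The PRAW calls used (`Reddit.redditor`, `submissions.new`, `comments.new`) are part of PRAW's public API.

```python
def fetch_user_activity(reddit, username, limit=100):
    """Collect a user's newest posts and comments as plain dictionaries.

    `reddit` is expected to be a praw.Reddit instance (or any object
    exposing the same `redditor(...)` interface). Hypothetical helper.
    """
    redditor = reddit.redditor(username)
    posts = [
        {
            "id": s.id,
            "title": s.title,
            "selftext": s.selftext,
            "subreddit": s.subreddit.display_name,
            "created_utc": s.created_utc,
        }
        for s in redditor.submissions.new(limit=limit)
    ]
    comments = [
        {
            "id": c.id,
            "body": c.body,
            "subreddit": c.subreddit.display_name,
            "created_utc": c.created_utc,
        }
        for c in redditor.comments.new(limit=limit)
    ]
    return {"posts": posts, "comments": comments}


def make_client(client_id, client_secret, user_agent):
    """Build a read-only PRAW client from Reddit app credentials."""
    import praw  # imported lazily so fetch_user_activity can be tested without PRAW

    return praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )
```

With real credentials, something like `fetch_user_activity(make_client(...), "some_username", limit=50)` would return a dictionary that `json.dumps` can write out directly.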
## Getting Started
_Coming soon..._
A comprehensive guide on how to utilize this project will be available soon. The guide will detail the steps to set up the pipeline, fetch data, and manage the datasets effectively.
## Contributing
Contributions, issues, and feature requests are welcome. If you'd like to extend RedditFetch or have found a bug, please check the [issues page](https://github.com/Jack-Development/RedditDataPipeline/issues).
## License
This project is [MIT](https://choosealicense.com/licenses/mit/) licensed.