https://github.com/luminati-io/free-datasets

A collection of multiple free datasets across various domains. Each sample contains over 1,000 records, ideal for market analysis, machine learning, consumer insights, and more.

# Free-Datasets

## A collection of free sample datasets for various analyses

![Free datasets header](https://github.com/luminati-io/Free-datasets/blob/main/free-datasets.PNG)

This repository contains a collection of **free datasets** with thousands of records for use in data analysis, machine learning, and research. The datasets span multiple domains, from business to social media data. All of the datasets were collected with [our Web Scraper APIs](https://brightdata.com/products/web-scraper). Want custom datasets or large datasets from popular and hard-to-scrape domains? Check out our [Dataset Marketplace](https://brightdata.com/products/datasets).

## Some of the data points include:

- `company_name`: Name of the company or business
- `industry`: Industry the company belongs to
- `location`: Geographical location of the business
- `product_name`: Name of the product or service
- `price`: Price of the product or service
- `reviews`: Customer reviews or ratings
- `job_title`: Job title for employment data
- `job_location`: Location of the job
- `education_level`: Required education level for the job
- `skills`: Key skills required for the job
- `followers_count`: Number of social media followers
- `posts_count`: Number of social media posts
- `url`: Direct link to the source page
- `ratings`: Ratings of businesses, products, or services

These are only some of the available fields. The datasets are derived from various public sources and offer insights into multiple industries. You can download the full-size free datasets [here](https://brightdata.com/products/datasets/free).
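
Because the fields differ from one dataset to another, it can help to check which columns a file actually contains before building on it. The following is a minimal sketch using only the Python standard library; `summarize_columns` is a hypothetical helper, and the inline sample rows are made up for illustration and stand in for a downloaded file.

```python
import csv
import io


def summarize_columns(csv_file):
    """Return (row_count, {column: non_empty_count}) for an open CSV text stream."""
    reader = csv.DictReader(csv_file)
    counts = {name: 0 for name in (reader.fieldnames or [])}
    rows = 0
    for row in reader:
        rows += 1
        for name in counts:
            # Short rows yield None for missing cells, so fall back to "".
            if (row.get(name) or "").strip():
                counts[name] += 1
    return rows, counts


# Tiny inline sample (invented values), standing in for a real dataset file.
sample = io.StringIO(
    "company_name,industry,location\n"
    "Acme,Retail,Berlin\n"
    "Globex,,Austin\n"
)
rows, filled = summarize_columns(sample)
print(rows, filled)
```

To run it on a real file, pass an open handle instead, for example `open("amazon-products.csv", newline="", encoding="utf-8")`, assuming the file has been downloaded to the working directory.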

## Available Dataset File Formats:

- **CSV**, **JSON**, **NDJSON**, **Parquet**
- Optionally compressed to `.gz`
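
As a rough illustration of handling these formats, the sketch below reads CSV, JSON, and NDJSON files, with or without `.gz` compression, using only the Python standard library. `read_records` is a hypothetical helper, not part of any Bright Data tooling. It assumes UTF-8 text and that the file extension reflects the format. Parquet is left out because it needs a third-party library.

```python
import csv
import gzip
import json
from pathlib import Path


def read_records(path):
    """Yield one dict per record from a .csv, .json, or .ndjson file (optionally .gz)."""
    path = Path(path)
    suffixes = [s.lower() for s in path.suffixes]
    compressed = bool(suffixes) and suffixes[-1] == ".gz"
    if compressed:
        suffixes = suffixes[:-1]
    kind = suffixes[-1] if suffixes else ""
    opener = gzip.open if compressed else open
    with opener(path, "rt", encoding="utf-8", newline="") as handle:
        if kind == ".csv":
            yield from csv.DictReader(handle)
        elif kind == ".ndjson":
            # One JSON object per line; blank lines are skipped.
            for line in handle:
                if line.strip():
                    yield json.loads(line)
        elif kind == ".json":
            data = json.load(handle)
            # Accept either a top-level array or a single object.
            yield from (data if isinstance(data, list) else [data])
        else:
            raise ValueError(f"unsupported format: {path.name}")
```

Since the function is a generator, records are streamed one at a time, which keeps memory use low for the larger files. CSV values arrive as strings and need converting before numeric work.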

## Dataset Delivery Options:

- **Email**, **API download**, **Webhook**, **Amazon S3**, **Google Cloud Storage**, **Microsoft Azure**, **Snowflake**, **SFTP**

## Update Frequency:

- **Once**, **Daily**, **Weekly**, **Monthly**, **Quarterly**, or **Custom intervals**

## Data Enrichment:

- Additional data points can be added based on specific project needs.

## Some of the datasets in this repo include:

- `LinkedIn-company-info.csv`
- `Pinterest-posts.csv`
- `Pinterest-profiles.csv`
- `Slintel-6sense-company-information.csv`
- `Trustpilot-business-reviews.csv`
- `Yelp-businesses-reviews.csv`
- `Zoominfo-companies-information.csv`
- `airbnb-properties-information.csv`
- `amazon-products.csv`
- `crunchbase-companies-information.csv`
- `facebook-posts-by-profile.csv`
- `glassdoor-companies-reviews.csv`
- `google-maps-businesses.csv`
- `indeed-job-listings-information.csv`
- `lazada-products.csv`
- `linkedin-company-information.csv`
- `shein-products.csv`
- `shopee-products.csv`
- `target-products.csv`
- `tiktok-profiles.csv`
- `twitter-posts.csv`
- `walmart-products.csv`
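
To fetch one of these files programmatically, a direct download URL can be built from the file name. The sketch below is an assumption-laden illustration: it uses GitHub's `raw.githubusercontent.com` URL pattern, takes the `main` branch name from the header image link above, and assumes the file sits at the repository root, which is not confirmed here. `raw_url` is a hypothetical helper.

```python
from urllib.parse import quote

REPO = "luminati-io/free-datasets"  # owner/name of this repository
BRANCH = "main"  # assumption: same branch as the header image link


def raw_url(filename, repo=REPO, branch=BRANCH):
    """Build a raw download URL, assuming the file is at the repository root."""
    return f"https://raw.githubusercontent.com/{repo}/{branch}/{quote(filename)}"


print(raw_url("amazon-products.csv"))
```

The resulting URL can be passed to any HTTP client, such as `urllib.request.urlretrieve`. If a file lives in a subdirectory, include that path in `filename`.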

## Use Cases for Free Datasets:

### 1. Business Intelligence
Analyze market trends, company performance, or consumer behavior to gain insights and improve decision-making processes.

### 2. Machine Learning
Train and evaluate predictive machine learning models on real-world data from various domains.

### 3. Social Media Analysis
Analyze user behavior, social media sentiment, and engagement metrics for brands or individual users.

### 4. Job Market Insights
Study trends in job postings, skills required, and industry growth to help guide career planning and talent acquisition.
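
One simple way to start on this is to count how often each skill appears across job listings. The sketch below assumes the `skills` column holds a comma-separated string, which may not match the real files, so adjust the parsing if the column is stored differently. `top_skills` is a hypothetical helper and the sample rows are invented.

```python
import csv
import io
from collections import Counter


def top_skills(csv_file, column="skills", sep=",", n=3):
    """Count skill mentions across rows; assumes a delimiter-separated skills cell."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        for skill in (row.get(column) or "").split(sep):
            skill = skill.strip().lower()  # normalize case and whitespace
            if skill:
                counts[skill] += 1
    return counts.most_common(n)


# Invented sample rows standing in for a job-listings dataset.
sample = io.StringIO(
    "job_title,skills\n"
    'Data Analyst,"SQL, Python, Excel"\n'
    'Data Engineer,"Python, SQL, Spark"\n'
    'BI Developer,"SQL, Tableau"\n'
)
print(top_skills(sample, n=2))  # [('sql', 3), ('python', 2)]
```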

### 5. Customer Sentiment Analysis
Use product or service reviews to analyze consumer sentiment and identify areas of improvement.
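
As a first pass before any text-based sentiment modelling, numeric ratings can be averaged per business as a crude proxy for sentiment. The sketch below assumes columns named `company_name` and `ratings` with numeric rating values. The actual column names and formats vary by dataset. `average_rating` is a hypothetical helper and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict


def average_rating(csv_file, key="company_name", rating="ratings"):
    """Return {business: mean rating}, skipping rows with a missing or non-numeric rating."""
    totals = defaultdict(lambda: [0.0, 0])  # name -> [sum, count]
    for row in csv.DictReader(csv_file):
        name = (row.get(key) or "").strip()
        try:
            value = float(row.get(rating) or "")
        except ValueError:
            continue  # missing or non-numeric rating
        if not name:
            continue
        totals[name][0] += value
        totals[name][1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


# Invented sample rows standing in for a reviews dataset.
reviews = io.StringIO(
    "company_name,ratings\n"
    "Acme,4\n"
    "Acme,5\n"
    "Globex,2\n"
    "Globex,n/a\n"
)
print(average_rating(reviews))  # {'Acme': 4.5, 'Globex': 2.0}
```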

## Free Access for Researchers and NGOs

We provide **free access** to datasets for academic faculties, researchers, NGOs, and NPOs working on environmental or social causes. If you are a researcher or represent an NGO or NPO, submit your application [here](https://brightinitiative.com).