https://github.com/luminati-io/Crunchbase-dataset-samples

A sample of 1001 Crunchbase companies with key data points, extracted using the Bright Data API.

crunchbase crunchbase-api crunchbase-scraper data database datasets webscraper-api webscraping

# Crunchbase-dataset-samples

A sample dataset of 1001 Crunchbase companies

![Crunchbase dataset header](https://github.com/luminati-io/Crunchbase-dataset-samples/blob/main/crunchbase-datasets.PNG)

A Crunchbase dataset sample of 1001 companies, extracted using the Bright Data API.

Data points included in this free dataset:

* ```id```: Unique identifier for the company
* ```name```: Name of the company
* ```url```: URL or web address associated with the company
* ```cb_rank```: Crunchbase rank assigned to the company
* ```region```: Continent where the company's headquarters is located
* ```about```: Overview or description of the company
* ```industries```: Industries associated with the company
* ```operating_status```: Current operating status of the company
* ```company_type```: Type of company (e.g., private, public)
* ```social_media_links```: URLs of social media profiles associated with the company
* ```founded_date```: Date when the company was founded
* ```num_employees```: Number of employees in the company

Each record also contains additional fields beyond those listed above.
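As a rough illustration of working with the fields listed above, here is a minimal Python sketch that reads CSV rows and keeps only the documented columns. It assumes the sample is available as a CSV whose header row uses these field names; the in-memory sample row, its values, and the `read_companies` helper are hypothetical and only stand in for the real file.

```python
import csv
import io

# Column names taken from the list above; the real file may contain more.
DOCUMENTED_FIELDS = [
    "id", "name", "url", "cb_rank", "region", "about", "industries",
    "operating_status", "company_type", "social_media_links",
    "founded_date", "num_employees",
]

def read_companies(stream):
    """Yield one dict per CSV row, keeping only the documented fields."""
    reader = csv.DictReader(stream)
    for row in reader:
        # Columns missing from the file come back as None.
        yield {field: row.get(field) for field in DOCUMENTED_FIELDS}

# Tiny in-memory stand-in for the sample file (values are made up).
SAMPLE_CSV = "id,name,cb_rank,region,extra\nacme,Acme Inc,1200,NA,ignored\n"

companies = list(read_companies(io.StringIO(SAMPLE_CSV)))
print(companies[0]["name"])
```

To use it on a real download, pass an open file object instead of the `io.StringIO` stand-in, for example `open(path, newline="", encoding="utf-8")`, where `path` points to wherever you saved the CSV.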

This sample is a subset of the "Crunchbase Company Information (public data)"
dataset, which includes more than 3,200,000 companies.

Available dataset file formats: JSON, NDJSON, JSON Lines, CSV, or Parquet. Optionally, files can be compressed to .gz.
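For the line-delimited formats, a reader can be sketched as below. This is a minimal example, not Bright Data's tooling: it assumes one JSON object per line and, for compressed files, standard gzip encoding. The `iter_json_lines` helper and the two sample records are hypothetical.

```python
import gzip
import io
import json

def iter_json_lines(raw: bytes, compressed: bool = False):
    """Yield one parsed record per non-empty line of JSON Lines data."""
    if compressed:
        # Assumes standard gzip encoding for .gz deliveries.
        raw = gzip.decompress(raw)
    for line in io.StringIO(raw.decode("utf-8")):
        line = line.strip()
        if line:
            yield json.loads(line)

# Made-up records standing in for a real export.
payload = (
    b'{"id": "acme", "name": "Acme Inc"}\n'
    b'{"id": "globex", "name": "Globex"}\n'
)

plain = list(iter_json_lines(payload))
zipped = list(iter_json_lines(gzip.compress(payload), compressed=True))
print(len(plain), plain == zipped)
```

For large exports, streaming from disk with `gzip.open(path, "rt", encoding="utf-8")` avoids loading the whole file into memory; the in-memory version here just keeps the example self-contained.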

Dataset delivery type options: Email, API download, Webhook, Amazon S3, Google Cloud Storage, Google Cloud Pub/Sub, Microsoft Azure, Snowflake, SFTP.

Update frequency: once, daily, weekly, monthly, quarterly, or on a custom basis.

Data enrichment beyond the extracted data points is available on request.

[Get the full Crunchbase companies dataset](https://brightdata.com/products/datasets/crunchbase).

Additional Crunchbase subsets available:

* Crunchbase top ranked companies
* Crunchbase largest companies
* Crunchbase newest companies

![Crunchbase dataset visual](https://github.com/luminati-io/Crunchbase-dataset-samples/blob/main/crunchbase-datasets-image.PNG)

What are the use cases for Crunchbase datasets?

1. Competitive analysis

Utilize firmographic data to monitor company growth, pinpoint key organizations and professionals, track employee transitions, and enhance competitive intelligence and analysis with greater efficiency.

2. Market trends & growth

Assess company growth and industry trends to support data-driven decisions. Hedge funds, VCs, and financial firms can significantly boost their investment analysis using the Crunchbase dataset.

3. B2B company data

Enhance your lead generation and sales intelligence by integrating high-quality company and employee data into your CRM, enriched with a comprehensive Crunchbase dataset.

Free access to web scraping tools and datasets for academic researchers and NGOs

The Bright Initiative offers access to Bright Data's [Web Scraper APIs](https://brightdata.com/products/web-scraper) and [ready-to-use datasets](https://brightdata.com/products/datasets) to leading academic faculties and researchers, NGOs and NPOs promoting various environmental and social causes. You can submit an application [here](https://brightinitiative.com).