https://github.com/gamehunterkaan/companyenum

OSINT sweep on a company name — Flask dashboard that scrapes Craft.co, Trustpilot, CareerBliss, WHOIS, and web-tech scanners in parallel.

cybersecurity cybersecurity-tools osint osint-python osint-tool python python3

Last synced: 3 months ago

Host: GitHub
URL: https://github.com/gamehunterkaan/companyenum
Owner: GamehunterKaan
Created: 2023-06-27T19:47:00.000Z (about 3 years ago)
Default Branch: main
Last Pushed: 2026-04-11T17:09:06.000Z (3 months ago)
Last Synced: 2026-04-11T19:11:17.296Z (3 months ago)
Topics: cybersecurity, cybersecurity-tools, osint, osint-python, osint-tool, python, python3
Language: Python
Homepage: https://kaangultekin.net/projects/company-enum/
Size: 30.9 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# CompanyEnum

A Flask web app that runs a one-shot OSINT sweep on a company name and presents the findings in a live dashboard. You type a company name, watch each scraper report progress in real time, and get a tabbed report covering the company profile, financials, people, web-tech posture, and customer/employee reviews.

![CompanyEnum Input](images/companyenum-input.png)

## What it collects

| Tab | Sources | Data |
| ----------- | ------------------------------------ | ------------------------------------------------------------------------------------- |
| Summary | Craft.co, Google (website fallback) | Company name, website, HQ, founded date, description, sectors, competitors |
| Financials | Craft.co | Stock price, market cap, revenue |
| People | Craft.co | Executives with roles |
| Technology | securityheaders.com, SSL Shopper, Sucuri SiteCheck, whois.com | HTTP security-header grade, raw headers, TLS cert + SANs, Sucuri ratings and recommendations, WHOIS records |
| Ratings | Trustpilot, CareerBliss | Aggregate scores and recent reviews with star widgets |

## How it works

Submissions start a background thread that runs each scraper sequentially and writes per-step state into an in-memory job store. The browser gets redirected to a loading page that polls `/status/` every 500 ms and renders each step as pending → running → done / skipped / error. When the job finishes, the page auto-navigates to `/result/`.

Each scraper in [submodules/](submodules/) uses one of three strategies, chosen by how aggressively the target site blocks bots:

- **requests / BeautifulSoup** for sites that don't block (whois.com, SSL Shopper, Sucuri API).
- **cloudscraper** for Cloudflare-protected sites that still accept a good browser fingerprint (Craft.co, securityheaders.com).
- **Playwright + playwright-stealth** (headless Chromium) for sites whose anti-bot won't budge without a real browser (Trustpilot, CareerBliss).
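
As a rough sketch of how the three tiers above could be selected per target, the example below maps a URL's domain to a strategy and fetches the page accordingly. The table `STRATEGY_BY_DOMAIN`, the helpers `pick_strategy` and `fetch_html`, and the exact domain strings are assumptions for illustration. The real scrapers each live in their own module and may be wired differently. The playwright-stealth patching step is left out here because its API differs between releases.

```python
from urllib.parse import urlparse

# Hypothetical domain -> strategy table, following the grouping in the list above.
STRATEGY_BY_DOMAIN = {
    "whois.com": "requests",
    "sslshopper.com": "requests",
    "sucuri.net": "requests",
    "craft.co": "cloudscraper",
    "securityheaders.com": "cloudscraper",
    "trustpilot.com": "playwright",
    "careerbliss.com": "playwright",
}


def pick_strategy(url):
    """Return the fetch strategy for a URL, defaulting to plain requests."""
    host = (urlparse(url).hostname or "").lower()
    for domain, strategy in STRATEGY_BY_DOMAIN.items():
        if host == domain or host.endswith("." + domain):
            return strategy
    return "requests"


def fetch_html(url, timeout=20):
    """Fetch a page with the lightest tool expected to get past the target."""
    strategy = pick_strategy(url)
    if strategy == "requests":
        import requests  # imported lazily so unused tiers cost nothing

        return requests.get(url, timeout=timeout).text
    if strategy == "cloudscraper":
        import cloudscraper

        # create_scraper() returns a requests.Session subclass.
        return cloudscraper.create_scraper().get(url, timeout=timeout).text

    # Heaviest tier: drive a real headless Chromium.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, timeout=timeout * 1000)  # Playwright expects milliseconds
            return page.content()
        finally:
            browser.close()
```

Importing each library inside its branch keeps the cheap paths usable even when the Chromium binary is not installed.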

![CompanyEnum Summary](images/companyenum-summary.png)

## Project layout

```
main.py                   Flask app, routes, background job runner
requirements.txt          Python dependencies
submodules/
    craftco.py            Craft.co profile + executives scraper
    trustpilot.py         Trustpilot reviews (Playwright)
    careerbliss.py        CareerBliss reviews (Playwright)
    findwebsite.py        Website resolver (Craft.co field + Google fallback)
    securityheaders.py    securityheaders.com scanner
    sslhopper.py          SSL Shopper cert checker
    sucuri.py             Sucuri SiteCheck API client
    whoisquery.py         whois.com scraper
    compiledata.py        HTML rendering for each output tab
static/
    style.css             Input page (waves, gradient, search bar)
    loading.css           Loading page (progress bar + step list)
    output-style.css      Report page (cards, tags, stars, grade badge)
    script.js             Output page tab switching + scroll progress
templates/
    input.html            Search form with example chips
    loading.html          Live progress UI
    output.html           Tabbed report

## Installation

Requires Python 3.9+.

```bash
pip install -r requirements.txt
playwright install chromium
```

The second command downloads the headless Chromium binary Playwright needs for Trustpilot and CareerBliss.

## Running

```bash
python main.py
```

Then open `http://127.0.0.1:5000/` and enter a company name. Use the example chips on the input page for a quick test.

## Notes

- The in-memory job store is capped at 32 entries and evicts oldest-first.
- A single scan takes roughly 20-40 seconds depending on network and which scrapers stall.
- If Craft.co can't find the company, the website resolver falls back to a Google search. That search can fail silently when Google serves a CAPTCHA, in which case the Technology tab shows "Company website not found" placeholders while the other tabs still work.
- The scrapers target real HTML and API shapes that change without warning. Expect occasional breakage when a source restructures its page.
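
The 32-entry cap with oldest-first eviction mentioned in the first note can be illustrated with an `OrderedDict`. This is a sketch under assumptions: the class name `BoundedJobStore` and its methods are hypothetical, and the actual store in `main.py` may be implemented differently.

```python
import threading
from collections import OrderedDict


class BoundedJobStore:
    """In-memory job store that keeps at most `max_entries` jobs."""

    def __init__(self, max_entries=32):
        self._max = max_entries
        self._jobs = OrderedDict()  # insertion order doubles as age order
        self._lock = threading.Lock()

    def put(self, job_id, record):
        with self._lock:
            # Re-assigning an existing key keeps its original position,
            # so updating a job does not make it look newer.
            self._jobs[job_id] = record
            while len(self._jobs) > self._max:
                self._jobs.popitem(last=False)  # drop the oldest entry

    def get(self, job_id):
        with self._lock:
            return self._jobs.get(job_id)

    def __len__(self):
        with self._lock:
            return len(self._jobs)


if __name__ == "__main__":
    store = BoundedJobStore(max_entries=32)
    for i in range(40):
        store.put(f"job-{i}", {"n": i})
    print(len(store), store.get("job-0"), store.get("job-39"))
```

One consequence of a cap like this is that a result link for an evicted job can no longer be served, so a handler reading from the store should treat a missing job as expired rather than as an error.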
