https://github.com/zerocostautomation/wappalyzer-fingerprints



# Wappalyzer Fingerprints

A GitHub Actions workflow that automatically fetches, processes, and publishes the latest Wappalyzer fingerprints for technology detection.

## What is this?

This repository contains a GitHub Actions workflow that:

1. Automatically runs daily to fetch the latest Wappalyzer technology fingerprints
2. Processes and standardizes the data format
3. Creates GitHub Releases with the fingerprint data
4. Updates a "latest" tag for easy access

The data is sourced from:
- [HTTPArchive/wappalyzer](https://github.com/HTTPArchive/wappalyzer)
- [enthec/webappanalyzer](https://github.com/enthec/webappanalyzer)
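
The processing script itself is not reproduced in this README. As a rough sketch of what step 2 could involve, the snippet below assumes that an upstream file maps technology names to their definitions and that some pattern fields may hold either a single string or a list. The function name `flatten_technologies` and the `LIST_FIELDS` tuple are hypothetical and only illustrate the idea of producing a flat array.

```python
from typing import Any

# Assumption: fields that should always be lists in the published output.
LIST_FIELDS = ("html", "scripts", "scriptSrc", "implies")


def flatten_technologies(by_name: dict[str, dict[str, Any]]) -> list[dict[str, Any]]:
    """Turn a {name: definition} mapping into a flat, name-sorted list."""
    flat = []
    for name, definition in sorted(by_name.items()):
        # Copy the definition and store the mapping key as the "name" field.
        entry = {"name": name, **definition}
        for field in LIST_FIELDS:
            value = entry.get(field)
            if isinstance(value, str):
                # Wrap a lone string so consumers can always iterate a list.
                entry[field] = [value]
        flat.append(entry)
    return flat
```

A flat list with an explicit `name` field is easier to map onto typed models than a name-keyed object, which may be why the published `technologies.json` uses that shape.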

## Download the Latest Data

You can always download the latest fingerprints using these commands:

```bash
# Download the latest technologies data
wget https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json

# Download the latest categories
wget https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/categories.json

# Download the latest groups
wget https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/groups.json

# Download everything in a ZIP archive
wget https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/wappalyzer-fingerprints.zip
```

Or visit the [latest release page](https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest).

## Data Files

The following files are included in each release:

- `technologies.json`: Technology fingerprints for detection (flat array)
- `categories.json`: Technology categories
- `groups.json`: Category grouping information
- `README.md`: Documentation and statistics
- `wappalyzer-fingerprints.zip`: All files in a single ZIP archive
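
If you download the ZIP archive rather than the individual files, the JSON members can be read without unpacking to disk. This is a minimal sketch that assumes the archive stores the files under the names listed above; `load_from_zip` is a hypothetical helper, not part of this repository.

```python
import io
import json
import zipfile


def load_from_zip(zip_bytes: bytes) -> dict[str, object]:
    """Parse every .json member of the archive, keyed by member name."""
    loaded = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for member in archive.namelist():
            # Skip README.md and anything else that is not JSON.
            if member.endswith(".json"):
                loaded[member] = json.loads(archive.read(member))
    return loaded
```

The `zip_bytes` argument could be, for example, the `content` of a `requests.get(...)` response for the ZIP URL.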

## Data Models

Each file has a fixed structure. The JSON shapes are shown first, followed by equivalent models in TypeScript, Go, Python, and PHP.

### JSON Schema

#### Technologies

```json
[
  {
    "name": "technology_name",
    "cats": [1, 2],
    "description": "Technology description",
    "website": "https://example.com",
    "cpe": "cpe:/a:vendor:product:version",
    "icon": "icon_file.png",
    "cookies": {
      "cookie_name": "cookie_pattern"
    },
    "headers": {
      "header_name": "header_pattern"
    },
    "html": ["html_pattern1", "html_pattern2"],
    "scripts": ["script_pattern1", "script_pattern2"],
    "meta": {
      "meta_name": ["meta_pattern"]
    },
    "js": {
      "object.property": ""
    },
    "implies": ["other_technology"]
  }
]
```
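
To show how the `headers` field might be used for detection, here is a minimal sketch that checks a technology entry against a set of HTTP response headers. It assumes, as in upstream Wappalyzer data, that each pattern is a regular expression that may carry metadata after a `\;` separator (for example `\;version:\1`), and that an empty pattern means "the header only needs to be present". The helper names `compile_pattern` and `match_headers` are hypothetical.

```python
import re


def compile_pattern(pattern: str) -> re.Pattern:
    # Assumption: everything after "\;" is metadata, not part of the regex.
    regex = pattern.split("\\;", 1)[0]
    return re.compile(regex, re.IGNORECASE)


def match_headers(tech: dict, response_headers: dict[str, str]) -> bool:
    """Return True if any header pattern of `tech` matches the response."""
    # Header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    for name, pattern in (tech.get("headers") or {}).items():
        value = lowered.get(name.lower())
        # An empty pattern compiles to a regex that matches any value,
        # so it behaves as a presence check.
        if value is not None and compile_pattern(pattern).search(value):
            return True
    return False
```

The same approach could be extended to `cookies` and `meta`, which use the same name-to-pattern layout.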

#### Categories

```json
{
  "category_id": {
    "name": "Category Name",
    "priority": 1,
    "groups": ["group_id"]
  }
}
```

#### Groups

```json
{
  "group_id": {
    "name": "Group Name"
  }
}
```
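
The three files reference each other by ID: a technology lists category IDs in `cats`, and a category lists group IDs in `groups`. The sketch below resolves those IDs to names. Because JSON object keys are always strings while `cats` holds numbers, IDs are converted with `str()` before lookup. `describe_categories` is a hypothetical helper name.

```python
def describe_categories(tech: dict, categories: dict, groups: dict) -> list[tuple[str, list[str]]]:
    """Return (category name, [group names]) for each known category of `tech`."""
    result = []
    for cat_id in tech.get("cats", []):
        category = categories.get(str(cat_id))
        if category is None:
            # Unknown category ID: skip it rather than fail.
            continue
        group_names = [
            groups[str(group_id)]["name"]
            for group_id in (category.get("groups") or [])
            if str(group_id) in groups
        ]
        result.append((category["name"], group_names))
    return result
```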

### TypeScript Models

```typescript
// models.ts
export interface Technology {
  name: string;
  cats: number[];
  description?: string;
  website?: string;
  cpe?: string;
  icon?: string;
  cookies?: Record<string, string>;
  headers?: Record<string, string>;
  html?: string[];
  scripts?: string[];
  scriptSrc?: string[];
  meta?: Record<string, string[]>;
  js?: Record<string, any>;
  implies?: string[];
}

// technologies.json is a direct array of Technology objects
export type WappalyzerTechnologies = Technology[];

export interface Categories {
  [categoryId: string]: {
    name: string;
    priority: number;
    groups?: string[];
  };
}

export interface Groups {
  [groupId: string]: {
    name: string;
  };
}

// Usage example (in a separate file that imports from ./models)
import { WappalyzerTechnologies, Categories, Groups } from './models';

async function loadTechnologies(): Promise<WappalyzerTechnologies> {
  const response = await fetch('https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json');
  return response.json();
}
```

### Go Structs

```go
// models.go
package wappalyzer

import (
	"encoding/json"
	"net/http"
)

// technologies.json is a direct array of Technology objects
type Technologies []Technology

type Technology struct {
	Name        string              `json:"name"`
	Cats        []int               `json:"cats"`
	Description string              `json:"description,omitempty"`
	Website     string              `json:"website,omitempty"`
	CPE         string              `json:"cpe,omitempty"`
	Icon        string              `json:"icon,omitempty"`
	Cookies     map[string]string   `json:"cookies,omitempty"`
	Headers     map[string]string   `json:"headers,omitempty"`
	HTML        []string            `json:"html,omitempty"`
	Scripts     []string            `json:"scripts,omitempty"`
	ScriptSrc   []string            `json:"scriptSrc,omitempty"`
	Meta        map[string][]string `json:"meta,omitempty"`
	JS          map[string]any      `json:"js,omitempty"`
	Implies     []string            `json:"implies,omitempty"`
}

type Categories map[string]Category

type Category struct {
	Name     string   `json:"name"`
	Priority int      `json:"priority"`
	Groups   []string `json:"groups,omitempty"`
}

type Groups map[string]Group

type Group struct {
	Name string `json:"name"`
}

// Usage example
func LoadTechnologies() (Technologies, error) {
	resp, err := http.Get("https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var technologies Technologies
	err = json.NewDecoder(resp.Body).Decode(&technologies)
	return technologies, err
}
```

### Python Classes

```python
# models.py
from typing import Any, Dict, List, Optional

import requests
from pydantic import BaseModel


# Pydantic model
class Technology(BaseModel):
    name: str
    cats: List[int]
    description: Optional[str] = None
    website: Optional[str] = None
    cpe: Optional[str] = None
    icon: Optional[str] = None
    cookies: Optional[Dict[str, str]] = None
    headers: Optional[Dict[str, str]] = None
    html: Optional[List[str]] = None
    scripts: Optional[List[str]] = None
    scriptSrc: Optional[List[str]] = None
    meta: Optional[Dict[str, List[str]]] = None
    js: Optional[Dict[str, Any]] = None
    implies: Optional[List[str]] = None


# technologies.json is a direct array of Technology objects
WappalyzerTechnologies = List[Technology]


class Category(BaseModel):
    name: str
    priority: int
    groups: Optional[List[str]] = None


class Group(BaseModel):
    name: str


# Usage example
def load_technologies() -> WappalyzerTechnologies:
    response = requests.get("https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json")
    response.raise_for_status()  # fail early on HTTP errors
    data = response.json()
    return [Technology(**tech) for tech in data]
```

### PHP Classes

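```php
<?php
class Technology {
    public string $name;
    public array $cats;
    public ?string $description = null;
    public ?string $website = null;
    public ?string $cpe = null;
    public ?string $icon = null;
    public ?array $cookies = null;
    public ?array $headers = null;
    public ?array $html = null;
    public ?array $scripts = null;
    public ?array $scriptSrc = null;
    public ?array $meta = null;
    public ?array $js = null;
    public ?array $implies = null;
}

function loadTechnologies(): array {
    $json = file_get_contents("https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json");
    $data = json_decode($json, true);

    $technologies = [];
    foreach ($data as $techData) {
        $tech = new Technology();
        $tech->name = $techData['name'];
        $tech->cats = $techData['cats'];
        $tech->description = $techData['description'] ?? null;
        // populate other properties
        $technologies[] = $tech;
    }

    return $technologies;
}

class Category {
    public string $name;
    public int $priority;
    public ?array $groups = null;
}

class Group {
    public string $name;
}

function loadCategories(): array {
    $json = file_get_contents("https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/categories.json");
    return json_decode($json, true);
}
```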

## Usage Examples

### Python

```python
import requests

# Download technologies data
technologies_url = "https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json"
response = requests.get(technologies_url)
response.raise_for_status()  # fail early on HTTP errors
data = response.json()

# Print number of technologies
print(f"Total technologies: {len(data)}")

# Example: Find WordPress data
wordpress_tech = next((tech for tech in data if tech["name"] == "WordPress"), None)
if wordpress_tech:
    print(f"WordPress categories: {wordpress_tech['cats']}")
```

### JavaScript/Node.js

```javascript
// Requires Node.js 18+ for the built-in fetch, which follows the redirects
// that GitHub issues for release asset URLs (https.get does not).
const url = "https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json";

// Download technologies data
fetch(url)
  .then((response) => {
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}`);
    }
    return response.json();
  })
  .then((technologies) => {
    console.log(`Total technologies: ${technologies.length}`);

    // Example: Find React data
    const reactTech = technologies.find((tech) => tech.name === "React");
    if (reactTech) {
      console.log(`React categories: ${reactTech.cats.join(', ')}`);
    }
  })
  .catch((err) => {
    console.log("Error: " + err.message);
  });
```

### Go

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// technologies.json is a direct array of Technology objects
type Technology struct {
	Name string `json:"name"`
	Cats []int  `json:"cats"`
	// other fields omitted for brevity
}

func main() {
	// Download technologies data
	resp, err := http.Get("https://github.com/ZeroCostAutomation/wappalyzer-fingerprints/releases/latest/download/technologies.json")
	if err != nil {
		fmt.Printf("Error fetching data: %s\n", err)
		return
	}
	defer resp.Body.Close()

	// Read the body (io.ReadAll replaces the deprecated ioutil.ReadAll)
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Printf("Error reading response: %s\n", err)
		return
	}

	// Parse JSON
	var technologies []Technology
	if err := json.Unmarshal(data, &technologies); err != nil {
		fmt.Printf("Error parsing JSON: %s\n", err)
		return
	}

	fmt.Printf("Total technologies: %d\n", len(technologies))

	// Example: Find jQuery data
	for _, tech := range technologies {
		if tech.Name == "jQuery" {
			fmt.Printf("jQuery categories: %v\n", tech.Cats)
			break
		}
	}
}
```

## Running Locally

If you want to run the fingerprint update script locally:

1. Clone this repository
2. Install Python dependencies: `pip install requests`
3. Run the script: `python .github/scripts/fetch_fingerprints.py`
4. Find the output files in the `assets` directory
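
After a local run, it can be useful to confirm that the output looks as expected. The following sketch assumes the script writes `technologies.json`, `categories.json`, and `groups.json` into the output directory with the top-level types described under Data Models. `check_outputs` and the `EXPECTED` mapping are hypothetical and not part of the repository.

```python
import json
from pathlib import Path

# Assumed top-level JSON type of each output file.
EXPECTED = {
    "technologies.json": list,
    "categories.json": dict,
    "groups.json": dict,
}


def check_outputs(directory: str) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    for filename, expected_type in EXPECTED.items():
        path = Path(directory) / filename
        if not path.is_file():
            problems.append(f"{filename}: missing")
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{filename}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(data, expected_type):
            problems.append(
                f"{filename}: expected {expected_type.__name__}, got {type(data).__name__}"
            )
    return problems
```

Calling `check_outputs("assets")` from the repository root would then report any missing or malformed file.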

## Workflow Schedule

The GitHub workflow runs automatically:
- Daily at 01:00 UTC
- Manually via GitHub Actions workflow dispatch

## License

This data is derived from Wappalyzer, which is licensed under the MIT License. When using this data, please respect the original license terms.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.