https://github.com/lissy93/wapalyzer
🌐 Identify the technologies powering any website. This is a fork of the now deleted Wappalyzer project by @AliasIO and community.
- Host: GitHub
- URL: https://github.com/lissy93/wapalyzer
- Owner: Lissy93
- License: gpl-3.0
- Created: 2023-08-24T12:14:52.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2024-02-21T15:49:40.000Z (over 2 years ago)
- Last Synced: 2024-04-14T08:14:09.738Z (about 2 years ago)
- Language: JavaScript
- Homepage:
- Size: 70.7 MB
- Stars: 203
- Watchers: 7
- Forks: 35
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
# Wapalyzer

Identify the technologies powering any website

---
> This is a community fork of the now removed [wappalyzer](https://web.archive.org/web/20230821034415/https://20230821034415/github.com/wappalyzer/wappalyzer) project, initially developed by [@AliasIO](https://github.com/AliasIO).
>
> The original author maintains a hosted instance, available at [wappalyzer.com](https://www.wappalyzer.com/).
## Prerequisites
- [Git](https://git-scm.com)
- [Node.js](https://nodejs.org) version 14 or higher
- [Yarn](https://yarnpkg.com)
## Quick start
```sh
git clone https://github.com/lissy93/wapalyzer.git
cd wapalyzer
yarn install
yarn run link
```
## Usage
### Command line
```sh
node src/drivers/npm/cli.js https://example.com
```
### Chrome extension
* Go to `about:extensions`
* Enable 'Developer mode'
* Click 'Load unpacked'
* Select `src/drivers/webextension`
### Firefox extension
* Go to `about:debugging#/runtime/this-firefox`
* Click 'Load Temporary Add-on'
* Select `src/drivers/webextension/manifest.json`
## Specification
A long list of [regular expressions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions) is used to identify technologies on web pages. Wapalyzer inspects HTML code, as well as JavaScript variables, response headers and more.
Patterns (regular expressions) are kept in [`src/technologies/`](https://github.com/lissy93/wapalyzer/blob/master/src/technologies). The following is an example of an application fingerprint.
#### Example
```json
"Example": {
  "description": "A short description of the technology.",
  "cats": [
    "1"
  ],
  "cookies": {
    "cookie_name": "Example"
  },
  "dom": {
    "#example-id": {
      "exists": "",
      "attributes": {
        "class": "example-class"
      },
      "properties": {
        "example-property": ""
      },
      "text": "Example text content"
    }
  },
  "dns": {
    "MX": [
      "example\\.com"
    ]
  },
  "js": {
    "Example.method": ""
  },
  "excludes": "Example",
  "headers": {
    "X-Powered-By": "Example"
  },
  "html": "<link[^>]example\\.css",
  "text": "\\bexample\\b",
  "css": "\\.example-class",
  "robots": "Disallow: /unique-path/",
  "implies": "PHP\\;confidence:50",
  "requires": "WordPress",
  "requiresCategory": "Ecommerce",
  "meta": {
    "generator": "(?:Example|Another Example)"
  },
  "probe": {
    "/path": ""
  },
  "scriptSrc": "example-([0-9.]+)\\.js\\;confidence:50\\;version:\\1",
  "scripts": "function webpackJsonpCallback\\(data\\) {",
  "url": "example\\.com",
  "xhr": "example\\.com",
  "oss": true,
  "saas": true,
  "pricing": ["mid", "freemium", "recurring"],
  "website": "https://example.com"
}
```
## JSON fields
Find the JSON schema at [`schema.json`](https://github.com/lissy93/wapalyzer/blob/master/schema.json).
### Required properties
| Field | Type | Description | Example |
| --- | --- | --- | --- |
| `cats` | Array | One or more category IDs. | `[1, 6]` |
| `website` | String | URL of the application's website. | `"https://example.com"` |
### Optional properties
| Field | Type | Description | Example |
| --- | --- | --- | --- |
| `description` | String | A short description of the technology in British English (max. 250 characters). Write in a neutral, factual tone; not like an ad. | `"A short description."` |
| `icon` | String | Application icon filename. | `"WordPress.svg"` |
| `cpe` | String | CPE is a structured naming scheme for technologies. To check if a CPE is valid and exists (using v2.3), use the CPE search. | `"cpe:2.3:a:apache:http_server:*:*:*:*:*:*:*:*"` |
| `saas` | Boolean | The technology is offered as a Software-as-a-Service (SaaS), i.e. hosted or cloud-based. | `true` |
| `oss` | Boolean | The technology has an open-source license. | `true` |
| `pricing` | Array | Cost indicator (based on a typical plan or average monthly price) and available pricing models. For paid products only. One of: `low` (less than US $100 / mo), `mid` (between US $100 and $1,000 / mo), `high` (more than US $1,000 / mo). Plus any of: `freemium` (free plan available), `onetime` (one-time payments accepted), `recurring` (subscriptions available), `poa` (price on asking), `payg` (pay as you go, e.g. commissions or usage-based fees). | `["low", "freemium"]` |
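The required and optional field rules above can be checked mechanically. The following is a minimal sketch, not a replacement for `schema.json`: the function name `validateTechnology` and its error messages are hypothetical, and it covers only the rules stated in the tables (required `cats` and `website`, the 250-character description limit, and the allowed `pricing` values).

```javascript
// Allowed pricing values, as listed in the table above.
const TIERS = ['low', 'mid', 'high'];
const MODELS = ['freemium', 'onetime', 'recurring', 'poa', 'payg'];

// Hypothetical helper: returns a list of human-readable problems (empty if none).
function validateTechnology(name, tech) {
  const errors = [];

  if (!Array.isArray(tech.cats) || tech.cats.length === 0) {
    errors.push(`${name}: "cats" must be a non-empty array`);
  }
  if (typeof tech.website !== 'string') {
    errors.push(`${name}: "website" must be a string`);
  }
  if (typeof tech.description === 'string' && tech.description.length > 250) {
    errors.push(`${name}: "description" exceeds 250 characters`);
  }
  if (Array.isArray(tech.pricing)) {
    const unknown = tech.pricing.filter(
      (value) => !TIERS.includes(value) && !MODELS.includes(value)
    );
    if (unknown.length > 0) {
      errors.push(`${name}: unknown pricing value(s): ${unknown.join(', ')}`);
    }
    // The cost indicator is "one of" low, mid or high.
    if (tech.pricing.filter((value) => TIERS.includes(value)).length > 1) {
      errors.push(`${name}: "pricing" may contain only one cost indicator`);
    }
  }
  return errors;
}

console.log(
  validateTechnology('Example', {
    cats: [1, 6],
    website: 'https://example.com',
    pricing: ['low', 'freemium'],
  })
);
```

An empty result means the entry passed these checks; the schema may enforce further constraints not modelled here.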
### Implies, requires and excludes (optional)
| Field | Type | Description | Example |
| --- | --- | --- | --- |
| `implies` | String \| Array | The presence of one application can imply the presence of another, e.g. WordPress means PHP is also in use. | `"PHP"` |
| `requires` | String \| Array | Similar to `implies` but detection only runs if the required technology has been identified. Useful for themes for a specific CMS. | `"WordPress"` |
| `requiresCategory` | String \| Array | Similar to `requires`; detection only runs if a technology in the required category has been identified. | `"Ecommerce"` |
| `excludes` | String \| Array | Opposite of `implies`. The presence of one application can exclude the presence of another. | `"Apache"` |
### Patterns (optional)
| Field | Type | Description | Example |
| --- | --- | --- | --- |
| `cookies` | Object | Cookies. | `{ "cookie_name": "Cookie value" }` |
| `dom` | String \| Array \| Object | Uses a query selector to inspect element properties, attributes and text content. | `{ "#example-id": { "properties": { "example-prop": "" } } }` |
| `dns` | Object | DNS records: supports MX, TXT, SOA and NS (NPM driver only). | `{ "MX": "example\\.com" }` |
| `js` | Object | JavaScript properties (case sensitive). Avoid short property names to prevent matching minified code. | `{ "jQuery.fn.jquery": "" }` |
| `headers` | Object | HTTP response headers. | `{ "X-Powered-By": "^WordPress$" }` |
| `html` | String \| Array | HTML source code. Patterns must include an HTML opening tag to avoid matching plain text. For performance reasons, avoid `html` where possible and use `dom` instead. | `"<a [^>]*href=\"index.html"` |
| `text` | String \| Array | Matches plain text. Should only be used in very specific cases where other methods can't be used. | `"\\bexample\\b"` |
| `css` | String \| Array | CSS rules. Unavailable when a website enforces a same-origin policy. For performance reasons, only a portion of the available CSS rules are used to find matches. | `"\\.example-class"` |
| `probe` | Object | Request a URL to test for its existence or match text content (NPM driver only). | `{ "/path": "Example text" }` |
| `robots` | String \| Array | Robots.txt contents. | `"Disallow: /unique-path/"` |
| `url` | String \| Array | Full URL of the page. | `"^https?://.+\\.wordpress\\.com"` |
| `xhr` | String \| Array | Hostnames of XHR requests. | `"cdn\\.netlify\\.com"` |
| `meta` | Object | HTML meta tags, e.g. generator. | `{ "generator": "^WordPress$" }` |
| `scriptSrc` | String \| Array | URLs of JavaScript files included on the page. | `"jquery\\.js"` |
| `scripts` | String \| Array | JavaScript source code. Inspects inline and external scripts. For performance reasons, avoid `scripts` where possible and use `js` instead. | `"function webpackJsonpCallback\\(data\\) {"` |
## Patterns
Patterns are essentially JavaScript regular expressions written as strings, but with some additions.
### Quirks and pitfalls
- Because of the string format, the escape character itself must be escaped when using special characters such as the dot (`\\.`). Double quotes must be escaped only once (`\"`). Slashes do not need to be escaped (`/`).
- Flags are not supported. Regular expressions are treated as case-insensitive.
- Capture groups (`()`) are used for version detection. In other cases, use non-capturing groups (`(?:)`).
- Use start and end of string anchors (`^` and `$`) where possible for optimal performance.
- Short or generic patterns can cause applications to be identified incorrectly. Try to find unique strings to match.
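The escaping rule in the first point is easiest to see by decoding a pattern the way a JSON parser would. This is a small illustration using only standard `JSON.parse` and `RegExp`; the pattern and file names are made up for the example.

```javascript
// JSON text as it would appear in a technologies file. String.raw keeps
// the backslashes exactly as typed, so the text contains "\\." literally.
const jsonText = String.raw`{ "scriptSrc": "jquery-([0-9.]+)\\.js" }`;

// After parsing, the doubled backslash becomes a single one: jquery-([0-9.]+)\.js
const { scriptSrc } = JSON.parse(jsonText);

// Flags cannot be written in the pattern; matching is case-insensitive.
const regex = new RegExp(scriptSrc, 'i');
console.log(regex.test('/static/JQuery-3.6.0.js'));

// A single backslash before the dot is not a valid JSON escape sequence,
// so the file would fail to parse.
let parseError = null;
try {
  JSON.parse(String.raw`{ "scriptSrc": "jquery\.js" }`);
} catch (error) {
  parseError = error;
}
console.log(parseError instanceof SyntaxError);
```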
### Tags
Tags (a non-standard syntax) can be appended to patterns (and implies and excludes, separated by `\\;`) to store additional information.
| Tag | Description | Example |
| --- | --- | --- |
| `confidence` | Indicates a less reliable pattern that may cause false positives. The aim is to achieve a combined confidence of 100%. Defaults to 100% if not specified. | `"js": { "Mage": "\\;confidence:50" }` |
| `version` | Gets the version number from a pattern match using a special syntax. | `"scriptSrc": "jquery-([0-9.]+)\\.js\\;version:\\1"` |
### Version syntax
Application version information can be obtained from a pattern using a capture group. A condition can be evaluated using the ternary operator (`?:`).
| Example | Description |
| --- | --- |
| `\\1` | Returns the first match. |
| `\\1?a:` | Returns `a` if the first match contains a value, nothing otherwise. |
| `\\1?a:b` | Returns `a` if the first match contains a value, `b` otherwise. |
| `\\1?:b` | Returns nothing if the first match contains a value, `b` otherwise. |
| `foo\\1` | Returns `foo` with the first match appended. |
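The rows above can be reproduced with a short function. This is a minimal sketch of the described behaviour, not the project's implementation: `resolveVersion` is a hypothetical name, the template is assumed to be already decoded from JSON (so `\\1` in the file is `\1` in memory), and only single-digit group references are handled.

```javascript
// Hypothetical helper: apply a version template to a RegExp match array.
function resolveVersion(template, match) {
  return (
    template
      // Ternary form: \N?a:b picks a if group N has a value, b otherwise.
      .replace(/\\(\d)\?([^:]*):(.*)$/, (whole, index, whenSet, whenEmpty) =>
        match[index] ? whenSet : whenEmpty
      )
      // Plain back-references: \N is replaced by group N (or nothing).
      .replace(/\\(\d)/g, (whole, index) => match[index] || '')
  );
}

// A match with a captured version, and one where the group is empty.
const withVersion = /jquery-([0-9.]+)\.js/i.exec('jquery-3.6.0.js');
const withoutVersion = /jquery(?:-([0-9.]+))?\.js/i.exec('jquery.js');

for (const template of ['\\1', '\\1?a:', '\\1?a:b', '\\1?:b', 'foo\\1']) {
  console.log(
    template,
    JSON.stringify(resolveVersion(template, withVersion)),
    JSON.stringify(resolveVersion(template, withoutVersion))
  );
}
```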