Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nfriedly/node-unblocker
Web proxy for evading internet censorship, and general-purpose Node.js library for proxying and rewriting remote webpages
Last synced: 22 days ago
- Host: GitHub
- URL: https://github.com/nfriedly/node-unblocker
- Owner: nfriedly
- License: agpl-3.0
- Created: 2011-03-26T19:51:07.000Z (over 13 years ago)
- Default Branch: master
- Last Pushed: 2024-10-15T15:37:14.000Z (29 days ago)
- Last Synced: 2024-10-16T19:24:58.020Z (27 days ago)
- Language: JavaScript
- Homepage: https://www.npmjs.com/package/unblocker
- Size: 9.4 MB
- Stars: 458
- Watchers: 40
- Forks: 891
- Open Issues: 29
Metadata Files:
- Readme: readme.md
- Changelog: changelog.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
README
# unblocker
Unblocker was originally a web proxy for evading internet censorship, similar to CGIproxy / PHProxy / Glype but
written in node.js. It's since morphed into a general-purpose library for proxying and rewriting remote webpages.

All data is processed and relayed to the client on the fly without unnecessary buffering, making unblocker one of the
fastest web proxies available.

[![Node.js CI](https://github.com/nfriedly/node-unblocker/actions/workflows/node.js.yml/badge.svg)](https://github.com/nfriedly/node-unblocker/actions/workflows/node.js.yml)
[![npm-version](https://img.shields.io/npm/v/unblocker.svg)](https://www.npmjs.com/package/unblocker)

### The magic part
The script uses "pretty" urls which, besides looking pretty, allow links with relative paths
to just work without modification. (E.g. ``)

In addition to this, links that are relative to the root (E.g. ``)
can be handled without modification by checking the referrer and 307 redirecting them to the proper
location in the referring site. (Although the proxy does attempt to rewrite these links to avoid the redirect.)

Cookies are proxied by adjusting their path to include the proxy's URL, and a bit of extra work is done to ensure they
remain intact when switching protocols or subdomains.

### Limitations
Although the proxy works well for standard login forms and even most AJAX content, OAuth login forms and anything that uses
[postMessage](https://developer.mozilla.org/en-US/docs/Web/API/Window/postMessage) (Google, Facebook, etc.) are not
likely to work out of the box. This is not an insurmountable issue, but it's not one that I expect to have fixed in the
near term.

More advanced websites, such as Roblox, Discord, YouTube\*, Instagram, etc. do not currently work. At the moment, there is no timeframe for when these might be supported.

\* There is an example that detects YouTube video pages and [replaces them with a custom page that just streams the video](examples/youtube/).

Patches are welcome, including both general-purpose improvements to go into the main library, and site-specific
fixes to go in the examples folder.

## Running the website on your computer
See https://github.com/nfriedly/nodeunblocker.com
## Using unblocker as a library in your software

    npm install --save unblocker

Unblocker exports an [express](http://expressjs.com/)-compatible API, so using it in an express application is straightforward:

    var express = require('express');
    var Unblocker = require('unblocker');
    var app = express();
    var unblocker = new Unblocker({prefix: '/proxy/'});

    // this must be one of the first app.use() calls and must not be on a subdirectory to work properly
    app.use(unblocker);

    app.get('/', function(req, res) {
        //...
    });

    // the upgrade handler allows unblocker to proxy websockets
    app.listen(process.env.PORT || 8080).on('upgrade', unblocker.onUpgrade);

See [examples/express/server.js](examples/express/server.js) for a complete example.

Usage without express is similarly easy; see [examples/simple/server.js](examples/simple/server.js) for an example.
### Configuration
Unblocker supports the following configuration options; defaults are shown:
```js
{
    prefix: '/proxy/', // Path that the proxied URLs begin with. '/' is not recommended due to a few edge cases.
    host: null, // Host used in redirects (e.g. `example.com` or `localhost:8080`). Default behavior is to determine this from the request headers.
    requestMiddleware: [], // Array of functions that perform extra processing on client requests before they are sent to the remote server. API is detailed below.
    responseMiddleware: [], // Array of functions that perform extra processing on remote responses before they are sent back to the client. API is detailed below.
    standardMiddleware: true, // Allows you to disable all built-in middleware if you need to perform advanced customization of requests or responses.
    clientScripts: true, // Injects JavaScript to force things like WebSockets and XMLHttpRequest to go through the proxy.
    processContentTypes: [ // All built-in middleware that modifies the content of responses limits itself to these content-types.
        'text/html',
        'application/xml+xhtml',
        'application/xhtml+xml',
        'text/css'
    ],
    httpAgent: null, // Override the agent used to request http responses from the server. See https://nodejs.org/api/http.html#http_class_http_agent
    httpsAgent: null // Override the agent used to request https responses from the server. See https://nodejs.org/api/https.html#https_class_https_agent
}
```

Setting `process.env.NODE_ENV='production'` will enable more aggressive caching on the client scripts and potentially other optimizations in the future.
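
To see why a prefix-based layout lets path-relative links work unmodified, here is a minimal sketch using only Node's built-in `URL` class. It assumes a proxied address is simply the prefix followed by the full remote URL (for example `/proxy/http://example.com/`). The helper name `toProxyUrl` and the `localhost:8080` origin are hypothetical and not part of Unblocker's API.

```js
// Hypothetical helper: builds a proxied address by appending the full
// remote URL to the proxy prefix (assumed layout, not Unblocker's API).
function toProxyUrl(origin, prefix, remoteUrl) {
  return origin + prefix + remoteUrl;
}

var page = toProxyUrl('http://localhost:8080', '/proxy/', 'http://example.com/dir/page1.html');

// A path-relative link resolves inside the proxied site, as a browser would do it.
var relative = new URL('page2.html', page).href;

// A root-relative link loses the prefix. This is the case that needs the
// referrer check and redirect (or link rewriting) described earlier.
var rootRelative = new URL('/images/logo.png', page).href;

console.log(relative);     // http://localhost:8080/proxy/http://example.com/dir/page2.html
console.log(rootRelative); // http://localhost:8080/images/logo.png
```

The second result shows why `'/'` as a prefix, or root-relative links in general, need extra handling: standard URL resolution alone cannot keep them inside the proxied path.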
#### Custom Middleware
Unblocker "middleware" are small functions that allow you to inspect and modify requests and responses. The majority of Unblocker's internal logic is implemented as middleware, and it's possible to write custom middleware to augment or replace the built-in middleware.
Custom middleware should be a function that accepts a single `data` argument and runs synchronously.
To process request and response data, create a [Transform Stream](https://nodejs.org/api/stream.html#stream_class_stream_transform) to perform the processing in chunks and pipe through this stream. (Example below.)
To respond directly to a request, add a function to `config.requestMiddleware` that handles the `clientResponse` (a standard [http.ServerResponse](https://nodejs.org/api/http.html#http_class_http_serverresponse) when used directly, or an [Express Response](http://expressjs.com/en/4x/api.html#res) when used with Express). Once a response is sent, no further middleware will be executed for that request. (Example below.)
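
As a rough illustration of the chunked-processing approach, the sketch below pipes a stream through a `Transform` and swaps the result into a `data` object. It uses only Node's `stream` module. The function names `createReplaceStream` and `replaceGreeting` are hypothetical, and the `data` object is a hand-built stand-in for the one Unblocker would supply to middleware.

```js
var stream = require('stream');

// Builds a Transform that replaces every occurrence of `from` with `to`.
// Simplification: a match split across two chunks is not detected.
function createReplaceStream(from, to) {
  return new stream.Transform({
    decodeStrings: false,
    transform: function (chunk, encoding, next) {
      next(null, chunk.toString().split(from).join(to));
    }
  });
}

// Hypothetical middleware: pipes data.stream through the transform and
// replaces data.stream with the transformed stream.
function replaceGreeting(data) {
  if (data.contentType === 'text/html') {
    data.stream = data.stream.pipe(createReplaceStream('Hello', 'Goodbye'));
  }
}

// Stand-in for the `data` object, built from a plain Readable for demonstration.
var data = {
  contentType: 'text/html',
  stream: stream.Readable.from(['<p>Hello, ', 'world. Hello again.</p>'])
};
replaceGreeting(data);

var output = '';
data.stream.on('data', function (chunk) { output += chunk; });
data.stream.on('end', function () { console.log(output); });
```

Because the middleware only rewires streams and returns immediately, it stays synchronous even though the data itself flows through later.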
##### requestMiddleware
Data example:
```js
{
    url: 'http://example.com/',
    clientRequest: {request},
    clientResponse: {response},
    headers: {
        //...
    },
    stream: {ReadableStream of data for PUT/POST requests, empty stream for other types}
}
```

requestMiddleware may inspect the headers, url, etc. It can modify headers, pipe PUT/POST data through a transform stream, or respond to the request directly.
If you're using express, the request and response objects will have all of the usual express goodies. For example:

```js
// Reject any request that is not for English Wikipedia.
function validateRequest(data) {
    // Dots are escaped so that they only match a literal "."
    if (!data.url.match(/^https?:\/\/en\.wikipedia\.org\//)) {
        // Sending a response here stops further middleware from running.
        data.clientResponse.status(403).send('Wikipedia only.');
    }
}

var config = {
    requestMiddleware: [
        validateRequest
    ]
}
```

If any piece of middleware sends a response, no further middleware is run.
After all requestMiddleware has run, the request is forwarded to the remote server with the (potentially modified) url/headers/stream/etc.
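
Header modification can be sketched in the same style. The example below is an illustration under assumptions, not code from the library: the middleware name `setLanguage` is hypothetical, the header values are made up, and it assumes header names in `data.headers` are lowercase, as Node.js reports incoming request headers. The `data` object is a stand-in so the snippet runs on its own.

```js
// Hypothetical request middleware: adjusts the headers that will be sent to
// the remote server. Assumes lowercase header names (an assumption here).
function setLanguage(data) {
  data.headers['accept-language'] = 'en-US,en;q=0.9';
  delete data.headers['x-forwarded-for'];
}

var config = {
  requestMiddleware: [setLanguage]
};

// Stand-in `data` object for demonstration; in real use Unblocker supplies it.
var data = {
  url: 'http://example.com/',
  headers: { 'accept-language': 'fr', 'x-forwarded-for': '203.0.113.7' }
};

// Run each middleware in order against the stand-in object.
config.requestMiddleware.forEach(function (middleware) {
  middleware(data);
});

console.log(data.headers); // { 'accept-language': 'en-US,en;q=0.9' }
```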
##### responseMiddleware
responseMiddleware receives the same `data` object as the requestMiddleware, but the `headers` and `stream` fields are replaced with those of the remote server's response, and several new fields are added for the remote request and response:
Data example:
```js
{
    url: 'http://example.com/',
    clientRequest: {request},
    clientResponse: {response},
    remoteRequest: {request},
    remoteResponse: {response},
    contentType: 'text/html',
    headers: {
        //...
    },
    stream: {ReadableStream of response data}
}
```

For modifying content, create a new stream and then pipe `data.stream` to it and replace `data.stream` with it:
```js
var Transform = require('stream').Transform;

function injectScript(data) {
    if (data.contentType == 'text/html') {

        // https://nodejs.org/api/stream.html#stream_transform
        var myStream = new Transform({
            decodeStrings: false,
            transform: function(chunk, encoding, next) {
                chunk = chunk.toString().replace('