Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/itbdw/server-render-javascript
Prerender your javascript web page for better seo with PhantomJS. ⚠️ no es6 supported.
- Host: GitHub
- URL: https://github.com/itbdw/server-render-javascript
- Owner: itbdw
- License: mit
- Created: 2017-07-22T15:51:47.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2020-10-14T16:14:25.000Z (over 4 years ago)
- Last Synced: 2024-10-11T11:45:17.281Z (4 months ago)
- Topics: prerender, seo, server-rendering
- Language: JavaScript
- Homepage:
- Size: 22.5 KB
- Stars: 12
- Watchers: 3
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
:warning::warning::warning::warning::warning::warning::warning::warning::warning::warning:

# WARNING

This is a waste of your time! Do not try it.

:warning::warning::warning::warning::warning::warning::warning::warning::warning::warning:

# server-render-javascript
Prerender your JavaScript web page for better SEO with PhantomJS.

## Dependency

Required:

- `PhantomJS`, a headless browser that runs on the server.
- `Node.js`, to serve PhantomJS.

Suggested:

- `pm2`, to start the server, and much more.
## Install
> This assumes you are using an Ubuntu server with nginx as the web server.
1. NodeJS
```
curl -sL https://deb.nodesource.com/setup_6.x | sudo -E bash -
sudo apt-get install -y nodejs
```
2. PhantomJS
Go to [http://phantomjs.org/download.html](http://phantomjs.org/download.html), download the latest version, extract it, `cd` into the directory, and run `mv bin/phantomjs /usr/bin`. That is all!
3. Install and Run
Download this code to your server, say to `/var/server/spider`; the directory structure looks like this:

```
/var/server/spider/
    craw.js
    package.json
    spider.js
```

Anything in craw.js and spider.js can be modified for your actual needs.
```
cd /var/server/spider
npm install
PORT=3000 pm2 -f start spider.js
```

After starting, you can use `pm2 logs` to monitor logs, `pm2 list` to display services, and much more.
4. Proxy Request
This assumes you use nginx as the web server and run Node.js and nginx on the same server.
```
upstream spider {
    server localhost:3000;
}

server {
    ...

    set $is_spider 0;
    set $is_server_render 0;

    if ($http_user_agent ~ Baiduspider) {
        set $is_spider 1;
    }
    if ($http_user_agent ~ Googlebot) {
        set $is_spider 1;
    }
    if ($http_user_agent ~ ServerRender) {
        set $is_server_render 1;
    }

    set $is_spider_is_render $is_spider$is_server_render;

    location / {
        ...

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        if ($is_spider_is_render = 10) {
            proxy_pass http://spider;
        }

        ...
    }

    ...
}
```

After changing the nginx config, don't forget to reload nginx.
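The user-agent matching above can be mirrored in plain JavaScript, which is handy for unit-testing which requests get proxied. The regexes match the nginx config; the function name is mine:

```javascript
// Mirror of the nginx logic above: concatenating the two flags reproduces
// the $is_spider$is_server_render trick. Only "10" — a search-engine spider
// that is NOT our own PhantomJS render request — is proxied to the upstream,
// which prevents an infinite render loop.
function shouldProxyToSpider(userAgent) {
  const isSpider = /Baiduspider|Googlebot/.test(userAgent) ? 1 : 0;
  const isServerRender = /ServerRender/.test(userAgent) ? 1 : 0;
  return `${isSpider}${isServerRender}` === '10';
}
```

For example, a `Googlebot` user-agent is proxied, while the upstream's own `ServerRenderJavascript` request is not.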
You can make a request and check your nginx access log to verify that everything works:
`curl -A 'fake Googlebot by server-render-javascript' http://yourwebsite.com/abc`
You should get two lines in the nginx access log: one for your request with user-agent `fake Googlebot by server-render-javascript`, and one made by your upstream server with user-agent `ServerRenderJavascript` (if you have not changed the default user-agent in craw.js).

## Caution
* Watch out for the timeout setting in craw.js
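For context, a craw.js for PhantomJS typically looks something like the sketch below. This is a hypothetical reconstruction using the standard PhantomJS `webpage` and `system` modules, runnable only under the PhantomJS runtime (not Node.js); the 10-second timeout value is an assumption, not the repository's default.

```javascript
// Hypothetical craw.js sketch (runs under PhantomJS, not Node.js).
// Usage: phantomjs craw.js <url>
var system = require('system');          // PhantomJS module for CLI args
var page = require('webpage').create();  // PhantomJS headless page

// Identify ourselves so nginx can tell our render request from a real spider.
page.settings.userAgent = 'ServerRenderJavascript';

// The timeout the Caution above warns about: if the page never finishes
// loading, give up instead of hanging forever.
setTimeout(function () {
  phantom.exit(1);
}, 10000);

page.open(system.args[1], function (status) {
  if (status === 'success') {
    console.log(page.content); // print the rendered HTML to stdout
  }
  phantom.exit(status === 'success' ? 0 : 1);
});
```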
## What and how it works
Your website is rendered with JavaScript, but search engines (like Baidu, Sogou, and 360) do not like such pages and may not fully understand JavaScript-rendered content.

So why not run a browser on the server side? When a spider like Googlebot comes to your website, the web server proxies the request to an upstream server (a Node.js server, say). The upstream server handles the request with a headless browser, which fetches the page just like a human visiting with Safari or Chrome, and returns the rendered content.

The workflow looks like this:
```
GoogleBot => Web Server => NodeJS Server => Make A Request Again With Server Browser => Get Web Content And Return
```

In fact, Chrome also supports headless browsing since version 59. ~~But for now, PhantomJS is better and easier to use.~~ You can find a Chrome version at [https://github.com/itbdw/server-render-javascript-chrome](https://github.com/itbdw/server-render-javascript-chrome)
## Thanks
[Prerender For SEO](https://www.mxgw.info/t/phantomjs-prerender-for-seo.html)