https://github.com/nakabonne/staticcollector
Application to analyze static files of competing sites
- Host: GitHub
- URL: https://github.com/nakabonne/staticcollector
- Owner: nakabonne
- License: mit
- Created: 2017-08-26T05:05:27.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2017-11-05T11:53:30.000Z (over 8 years ago)
- Last Synced: 2025-09-12T06:56:05.707Z (9 months ago)
- Topics: crawler, go, golang
- Language: JavaScript
- Homepage:
- Size: 26.8 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## Overview
Application to analyze static files of competing sites.
It lets you:
- Track changes in the search ranking of competing websites
- Compare a page's HTML before and after its rank changes
- Register keywords
- Crawl the web
## Installation
```
$ go get github.com/ryonakao/StaticCollector
```
## Setup
### Set up MongoDB
Start the server
```
$ sudo mongod --dbpath /var/lib/mongodb --logpath /var/log/mongodb.log
```
Create the collection
```
$ mongo
> use web_crawler
> db.createCollection('static_files');
```
Insert temporary data
```
> db.static_files.insert({word_id:1, page_id:1, title:'tmp title', html:"", rank:2, target_day:ISODate("2017-08-24T04:54:00.697Z")});
```
### Set up MySQL
Start (or restart) the server
```
$ mysql.server restart
```
Create the database and tables
```
$ mysql -u root -p
mysql> CREATE DATABASE web_crawler;
mysql> use web_crawler
mysql> CREATE TABLE keywords (id int AUTO_INCREMENT PRIMARY KEY, word varchar(100) NOT NULL);
mysql> CREATE TABLE pages (id int AUTO_INCREMENT PRIMARY KEY, url varchar(300) UNIQUE NOT NULL);
```
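
The column definitions above imply length limits that an application may want to check before inserting rows. The following is a minimal sketch of such checks; `ValidateKeyword` and `ValidatePageURL` are hypothetical helpers, and rejecting empty strings is an extra assumption (`NOT NULL` alone still allows an empty string).

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

const (
	maxWordLen = 100 // keywords.word is varchar(100)
	maxURLLen  = 300 // pages.url is varchar(300)
)

// checkLen rejects empty values and values longer than max characters.
// MySQL counts varchar length in characters, so runes are counted, not bytes.
func checkLen(name, value string, max int) error {
	if value == "" {
		return errors.New(name + " must not be empty")
	}
	if utf8.RuneCountInString(value) > max {
		return fmt.Errorf("%s exceeds %d characters", name, max)
	}
	return nil
}

// ValidateKeyword checks a value destined for keywords.word.
func ValidateKeyword(word string) error {
	return checkLen("keyword", word, maxWordLen)
}

// ValidatePageURL checks a value destined for pages.url.
func ValidatePageURL(url string) error {
	return checkLen("url", url, maxURLLen)
}

func main() {
	fmt.Println(ValidateKeyword("golang crawler"))      // <nil>
	fmt.Println(ValidatePageURL("https://example.com")) // <nil>
	fmt.Println(ValidateKeyword(""))                    // keyword must not be empty
}
```

Uniqueness of `pages.url` is still best left to the `UNIQUE` constraint in the database.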
## License
The `StaticCollector` source code is available under the [MIT License](https://github.com/ryonakao/StaticCollector/blob/master/LICENSE).