https://github.com/muffinista/wayback_exe

code for twitter bot @wayback_exe

Host: GitHub
URL: https://github.com/muffinista/wayback_exe
Owner: muffinista
License: mit
Created: 2015-10-06T18:20:10.000Z (about 9 years ago)
Default Branch: main
Last Pushed: 2024-04-13T15:12:25.000Z (7 months ago)
Last Synced: 2024-08-02T05:10:19.256Z (3 months ago)
Language: JavaScript
Homepage:
Size: 1.3 MB
Stars: 48
Watchers: 5
Forks: 5
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

wayback_exe
===========

This is the code for [wayback_exe](https://twitter.com/wayback_exe), a
bot that renders images of old web pages from the Wayback Machine and
posts them to Twitter and Tumblr.
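
For context, archived pages in the Wayback Machine are addressed by a capture timestamp followed by the original URL. The helper below is only an illustrative sketch of that URL shape; `waybackUrl` is a hypothetical name and is not taken from this repository.

```javascript
// Hypothetical helper (not part of this project): build the address of an
// archived page from a capture timestamp and the original URL.
// Timestamps use the form YYYYMMDDhhmmss; shorter prefixes such as "1997"
// are also accepted by the archive, which resolves them to a nearby capture.
function waybackUrl(timestamp, originalUrl) {
  if (!/^\d{4,14}$/.test(timestamp)) {
    throw new Error('timestamp must be 4 to 14 digits: ' + timestamp);
  }
  return `https://web.archive.org/web/${timestamp}/${originalUrl}`;
}

console.log(waybackUrl('1997', 'http://www.cool-old-site.com/'));
```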

This is my first project in node.js and probably has all sorts of
awful issues that should be cleaned up. Sorry about that!

The bot is written in node and includes a few pieces:

- **scraper.js** -- this script indexes old web pages in the Wayback
Machine. It looks for URLs to scrape, queues them up in a Redis
store, and stores some details about the page itself in a MySQL
database.

- **bot.js** -- loads a random page from the MySQL database, generates a
screenshot of it via PhantomJS, and then fits that screenshot into
an image of an old browser via ImageMagick. Then it sends the image
to Twitter and Tumblr.

- **pages.js** -- code for adding pages to MySQL, and getting them out
later.

- **queue.js** -- code for managing the scraper queue with Redis.

- A bunch of random utility scripts/etc.
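
The way these pieces fit together can be sketched roughly as follows. This is an in-memory stand-in, assuming a simple "take a URL, store the page, queue its links" loop; the class and function names (`UrlQueue`, `PageStore`, `scrapeAll`) are made up for illustration, and the real code talks to Redis and MySQL and is asynchronous.

```javascript
// Hypothetical stand-in for queue.js: the real project keeps this in Redis.
class UrlQueue {
  constructor() {
    this.pending = [];     // URLs waiting to be scraped
    this.seen = new Set(); // every URL ever queued, so nothing repeats
  }
  add(url) {
    if (this.seen.has(url)) return false;
    this.seen.add(url);
    this.pending.push(url);
    return true;
  }
  next() {
    return this.pending.shift(); // undefined once the queue is empty
  }
}

// Hypothetical stand-in for pages.js: the real project keeps this in MySQL.
class PageStore {
  constructor() {
    this.pages = [];
  }
  add(page) {
    this.pages.push(page);
  }
  random() {
    if (this.pages.length === 0) return undefined;
    return this.pages[Math.floor(Math.random() * this.pages.length)];
  }
}

// Scrape until the queue is empty or `limit` pages have been processed.
// `fetchPage(url)` is supplied by the caller and must return
// { title, links }; it is synchronous here only to keep the sketch short.
function scrapeAll(queue, store, fetchPage, limit = 100) {
  let count = 0;
  let url;
  while (count < limit && (url = queue.next()) !== undefined) {
    const { title, links } = fetchPage(url);
    store.add({ url, title });                  // remember the page for the bot
    for (const link of links) queue.add(link);  // feed newly found URLs back in
    count++;
  }
  return count;
}
```

A bot run would then pick a page with `store.random()` and hand its URL to the screenshot step.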

Running the bot
---------------

If you want to run a version of this bot, these steps should get you
started:

- copy conf.json.example to conf.json
- create a Twitter account and authorize it, and put the credentials
in conf.json
- create a MySQL database with the script in setup.sql. Add the login
info to the config.
- set up Redis and add the credentials to the config
- run `npm install` to install dependencies
- add some interesting URLs to Redis like so:

```
nodejs add-url.js http://www.cool-old-site.com/
```

or load the contents of a JSON array of URLs into Redis:

```
nodejs add-urls.js urls.json > tmp && cat tmp | redis-cli --pipe
```
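
`redis-cli --pipe` reads commands written in the Redis protocol from stdin, so a script feeding it has to print commands in that format. The sketch below shows what such output could look like. It is not the contents of add-urls.js: the `LPUSH` command and the `example:queue` key are assumptions chosen for illustration, and the key the scraper actually reads may differ.

```javascript
// Encode one Redis command as a RESP array of bulk strings, the wire format
// that `redis-cli --pipe` consumes.
function encodeCommand(args) {
  let out = `*${args.length}\r\n`;
  for (const arg of args) {
    const s = String(arg);
    // Bulk string lengths are counted in bytes, not characters.
    out += `$${Buffer.byteLength(s)}\r\n${s}\r\n`;
  }
  return out;
}

// ASSUMPTION: both the key name and the use of LPUSH are hypothetical.
const QUEUE_KEY = 'example:queue';

// Turn an array of URLs into one protocol stream, one command per URL.
function urlsToPipe(urls) {
  return urls.map((url) => encodeCommand(['LPUSH', QUEUE_KEY, url])).join('');
}

process.stdout.write(urlsToPipe(['http://www.cool-old-site.com/']));
```

Saved as a standalone script, its output could be piped straight into `redis-cli --pipe` without a temporary file.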

Run the scraper! I'm lazy, and currently run it like this in a
terminal window:

```
echo "var s= require('./scraper.js'); s.loop()" | nodejs;
```

Let it run for a while. It will scrape pages, store the results in
MySQL, and queue more URLs to index in Redis. Eventually, you can send a
tweet like so:

```
node bot.js
```