https://github.com/cyphunk/jstor_archive

JSTOR_archive provides scripts to download and save in PDF/HTML form articles added to a free-user's shelf on the Journal Storage archive site JSTOR
https://github.com/cyphunk/jstor_archive

Last synced: 19 days ago
JSON representation

JSTOR_archive provides scripts to download and save in PDF/HTML form articles added to a free-user's shelf on the Journal Storage archive site JSTOR

Host: GitHub
URL: https://github.com/cyphunk/jstor_archive
Owner: cyphunk
Created: 2014-08-06T08:55:37.000Z (almost 12 years ago)
Default Branch: master
Last Pushed: 2014-08-07T08:59:57.000Z (almost 12 years ago)
Last Synced: 2025-03-03T02:22:22.493Z (over 1 year ago)
Language: Shell
Homepage:
Size: 160 KB
Stars: 4
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

Found it dificult to find scripts that help archive JSTOR content. This is
surprising considering their sordid history in relation to Aaron Swartz.

These scripts do not download PDF content of privileged JSTOR users. For
that look elsewhere. Instead these scripts allow for archiving of articles
provided freely on a limited bases to public unprivileged users.
The shell script simply monitors your browser cache and copies GIF's
that match the JSTOR preview size (configurable) to a directory and then
generates a pdf and html index when interrupted. The included javascript
user script can be loaded in the browser (via greasemonkey or other user
script management extensions) and will emulate a click on the "next"
button of a JSTOR article so as to fill up the browser cache with the
articles contents.

# Use

Defaults are configured for Firefox browsers on Linux systems or Chrome on OSX.
For other browsers or systems set the ``CACHE_DIR`` path appropriately.

1. ``git clone https://github.com/cyphunk/jstor_archive.git``
2. Install Javascript userscript via User Script manager (such as grease
monkey) or directly if your browser supports it by accessing the raw
[jstor_clicknext.user.js](https://github.com/cyphunk/jstor_archive/raw/master/jstor_clicknext.user.js).
3. ``cd jstor_archive`` and ``./cache_monitor.sh EXAMPLE_NAME`` where
``EXAMPLE_NAME`` is the directory to be created within the current directory
for storing GIF's. This name will also be used as the name for the PDF and
HTML indexes that will be created later.
4. If the monitored ``CACHE_DIR`` shown on execution is different from that of
your browser change the ``CACHE_DIR`` environment variable and execute again.
5. Open a JSTOR article in your browser.
6. When all pages of the article have been viewed, either by manually
clicking through each page or by letting the ``jstor_clicknext.user.js``
user script do it, press "``ctrl+c``" in the scripts.
PDF and HTML files will be created in the ``EXAMPLE_NAME`` directory.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/cyphunk/jstor_archive

Awesome Lists containing this project

README