https://github.com/jplusplus/janus

A basic tool to retrieve document metadata from a domain name

Host: GitHub
URL: https://github.com/jplusplus/janus
Owner: jplusplus
Created: 2013-06-13T13:00:57.000Z (about 13 years ago)
Default Branch: master
Last Pushed: 2013-07-09T17:21:05.000Z (almost 13 years ago)
Last Synced: 2024-04-14T04:55:28.363Z (about 2 years ago)
Language: JavaScript
Homepage:
Size: 775 KB
Stars: 3
Watchers: 5
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md


# Janus

## Extract metadata from PDFs, fast
Janus is a simple tool that extracts the metadata from every PDF file on a single domain. Type in a domain name, for instance "gov.uk", and get a list of all PDFs with their metadata (e.g. author, creation date and modification date). Metadata analysis is a great source of information for investigative journalists.
In the future, Janus will support other file types and take the analysis further by clustering metadata together (for example, grouping documents by the individuals who appear in their metadata).
It was developed by Journalism++'s [Pierre Bellon](http://twitter.com/toutenrab) and [Leo Wallentin](http://twitter.com/leo_wallentin), who was an embedded news nerd there in June 2013.
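
To give a feel for the kind of clustering described above, here is a minimal sketch that groups documents by the author found in their metadata. The record shape (`url`, `author`, `created`) and the sample values are assumptions made for illustration; they are not Janus's actual output format.

```javascript
// Hypothetical metadata records; field names and values are illustrative only.
const documents = [
  { url: 'http://example.org/a.pdf', author: 'Jane Doe', created: '2013-01-15' },
  { url: 'http://example.org/b.pdf', author: ' jane doe ', created: '2013-02-03' },
  { url: 'http://example.org/c.pdf', author: 'John Smith', created: '2013-03-21' },
  { url: 'http://example.org/d.pdf', author: '', created: '2013-04-02' }
];

// Group document URLs by author, ignoring case and surrounding whitespace.
// Documents without an author end up under the '(unknown)' key.
function groupByAuthor(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const key = (doc.author || '').trim().toLowerCase() || '(unknown)';
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(doc.url);
  }
  return groups;
}

const byAuthor = groupByAuthor(documents);
for (const [author, urls] of byAuthor) {
  console.log(author + ': ' + urls.length + ' document(s)');
}
```

Normalising the author string before grouping matters in practice, because the same person often appears with different capitalisation or stray spaces across files.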

## How to install it
- be sure to have Node.js installed on your computer
- get the sources
```git clone https://github.com/jplusplus/janus.git```
- install the dependencies

```
cd janus
npm install
```
- copy the configuration file template

```
cp config.template.json config.json
```
- then enter your Bing account key in `config.json`
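
A quick way to confirm that the key was actually filled in is to look for blank settings in `config.json`. The sketch below assumes the template leaves the key empty and that settings live at the top level of the file; it deliberately does not assume the name of any setting. `findUnsetKeys` is a hypothetical helper, not part of Janus.

```javascript
const fs = require('fs');

// Return the names of top-level settings whose value is missing or blank.
function findUnsetKeys(config) {
  return Object.keys(config).filter((key) => {
    const value = config[key];
    return value === null || value === undefined || String(value).trim() === '';
  });
}

// Only run the check when config.json exists in the current directory.
if (fs.existsSync('config.json')) {
  const config = JSON.parse(fs.readFileSync('config.json', 'utf8'));
  const unset = findUnsetKeys(config);
  if (unset.length > 0) {
    console.warn('config.json has empty settings: ' + unset.join(', '));
  }
}
```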

## Launch the application
You can launch it by executing ```coffee app.coffee```, but I recommend using nodemon:
```
npm install -g nodemon
nodemon app.coffee
```

## Troubleshooting
- I get an error when I run `npm install`
| You may have an older version of Node.js. Please make sure that node >= 9.4.1 is installed on your system.
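
To check the installed version programmatically, a small sketch like the following compares `process.versions.node` with the minimum stated above. `isAtLeast` is a hypothetical helper written for this example; it compares plain dotted version numbers and treats pre-release suffixes loosely.

```javascript
// Compare two dotted version strings numerically, part by part.
// Missing or non-numeric parts are treated as 0.
function isAtLeast(current, minimum) {
  const a = current.split('.').map(Number);
  const b = minimum.split('.').map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    if (x !== y) {
      return x > y;
    }
  }
  return true;
}

const minimum = '9.4.1'; // minimum version taken from the troubleshooting note
if (!isAtLeast(process.versions.node, minimum)) {
  console.warn('Node ' + process.versions.node + ' is older than ' + minimum);
}
```

Splitting on dots and comparing numbers avoids the classic mistake of comparing version strings alphabetically, where "10.0.0" would sort before "9.4.1".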

## TODO
- handle image search
- handle doc & docx search
