https://github.com/fredericvergnaud/extractify

Extract structured data online

chrome-extension scrapping structured-data

***Important***: to cite Extractify in a publication, please use the following: « Extractify. Frederic Vergnaud, Mines Paris, PSL University, Centre for the Sociology of Innovation, i3 CNRS, France, https://github.com/fredericvergnaud/extractify »

**Presentation**

Extractify is a free extension for Chromium, written in JavaScript (using the Atom editor), whose purpose is to scrape structured data from the web. It is designed in particular for collecting online comments and online conversations such as forum threads.

It allows you to:
1) Select structured information on a web page (such as tables with rows and columns), either by clicking directly on the page or manually by entering HTML tags and the related CSS code
2) Select the pagination of pages with the same structure and level
3) Repeat the process as many times as desired for lower levels
4) Scrape the whole selection
5) Finally, obtain a file in JSON format that can easily be imported into other software, [L@ME for example](https://github.com/fredericvergnaud/lame).
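
To make steps 1 and 5 more concrete, here is a minimal sketch of what a manual selection amounts to: a tag name and class names are combined into a CSS selector, each matching element becomes a row, and the result can be serialized as JSON. The helpers `buildSelector` and `extractRows`, and the selectors in the usage comment, are hypothetical and are not taken from Extractify's source; only `querySelectorAll`, `querySelector` and `textContent` are standard DOM APIs.

```javascript
// Hypothetical helpers for illustration; not taken from Extractify's source.

// Turn a tag name plus optional class names into a CSS selector,
// e.g. buildSelector("div", ["comment"]) gives "div.comment".
function buildSelector(tag, classNames = []) {
  return [tag, ...classNames].join(".");
}

// Build one object per "row" element and one field per "column" selector.
// `root` is anything exposing querySelectorAll, such as `document`.
function extractRows(root, rowSelector, columnSelectors) {
  return Array.from(root.querySelectorAll(rowSelector), (row) => {
    const record = {};
    for (const [name, selector] of Object.entries(columnSelectors)) {
      const cell = row.querySelector(selector);
      // A missing column is recorded as null rather than dropped.
      record[name] = cell ? cell.textContent.trim() : null;
    }
    return record;
  });
}

// Possible use in a browser console (the selectors are made up):
//   const rows = extractRows(document, buildSelector("tr", ["message"]),
//                            { author: "td.author", text: "td.text" });
//   console.log(JSON.stringify(rows, null, 2));
```

Class names containing special characters would need escaping (for example with `CSS.escape`) before being joined this way.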

What it does not allow: everything else!

**Manual installation for Chrome**

1. Press the green « **Clone or download** » button on this page to download the latest version
2. Unzip the downloaded archive
3. In Chrome's address bar, go to the extensions page by typing « **chrome://extensions/** »
4. Switch on « **Developer mode** » in the upper right corner
5. Finally, load the folder **extractify-master** as an « **unpacked extension** »
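
If loading fails at step 5, the usual cause is selecting a folder that does not have `manifest.json` at its top level, which Chrome requires for any unpacked extension. The Node.js sketch below checks for that before you load the folder. The function name `looksLikeUnpackedExtension` is hypothetical, and the check is deliberately rough: it uses strict JSON parsing, so it would reject a manifest containing comments even though Chrome tolerates them.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: returns true when `dir` contains a manifest.json
// with the string keys "name" and "version" that Chrome expects.
function looksLikeUnpackedExtension(dir) {
  const manifestPath = path.join(dir, "manifest.json");
  if (!fs.existsSync(manifestPath)) return false;
  try {
    const manifest = JSON.parse(fs.readFileSync(manifestPath, "utf8"));
    return typeof manifest.name === "string" && typeof manifest.version === "string";
  } catch {
    // Unreadable file or invalid JSON.
    return false;
  }
}

// Possible use, assuming the archive was unzipped in the current directory:
//   console.log(looksLikeUnpackedExtension("extractify-master"));
```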

**Usage**

[Go to the wiki](https://github.com/fredericvergnaud/extractify/wiki) to see how to use Extractify.

**Love it?** [Tell me](mailto:[email protected])!

**Found a bug?** Don’t be afraid to [open an issue](https://github.com/fredericvergnaud/extractify/issues/new).