https://github.com/raoul2000/bob-the-miner
Bob is a miner that will dig the web to extract precious data for you
- Host: GitHub
- URL: https://github.com/raoul2000/bob-the-miner
- Owner: raoul2000
- License: mit
- Created: 2018-05-11T11:10:49.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2024-07-31T09:32:34.000Z (10 months ago)
- Last Synced: 2025-01-29T03:43:43.540Z (4 months ago)
- Topics: webscraper
- Language: JavaScript
- Homepage: https://raoul2000.github.io/bob-the-miner/
- Size: 580 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
> What ? another web scraper ? ...yeah, that's Bob !
What about [reading the doc ?](https://raoul2000.github.io/bob-the-miner/)
# Install
To use Bob in your own project:
```
npm install bob-the-miner --save
```

If you want to contribute:
```
git clone https://github.com/raoul2000/bob-the-miner.git
cd bob-the-miner
npm install
```

# Test
First, start the local test server:
```
npm run server
```

Then open another shell and run the tests:
```
npm test
```

# Examples
- extract titles from the [nodejs](https://foundation.nodejs.org) news website
```
npm run nodejs-news
```

- extract headlines from the [New York Times](https://www.nytimes.com/) website
```
npm run nyt-headline
```

- extract the package list from the [NPM](https://www.npmjs.com) website
```
npm run npm-crawler
```

# Documentation
Documentation is based on [vuepress](https://vuepress.vuejs.org).
To view the documentation on a local server:
```
npm run docs:dev
```

> If the error `error:0308010C:digital envelope routines::unsupported` is reported during local dev server startup, first
> run `export NODE_OPTIONS=--openssl-legacy-provider`, then re-run the command (see [this thread](https://stackoverflow.com/questions/69692842/error-message-error0308010cdigital-envelope-routinesunsupported) for more).

To build the documentation:
```
npm run docs:build
```

Then copy the generated files from `docs/.vuepress/dist` to another (temporary) folder outside of the project, switch to the **gh-pages** branch and copy the files back. Then commit and push.
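
The manual publishing steps above could be scripted roughly as follows. This is only a sketch, not something shipped with the project: the function names (`stash_dist`, `restore_dist`, `publish_docs`) and the commit message are hypothetical, and it assumes the remote is called `origin` and that it is run from the project root after `npm run docs:build`.

```shell
#!/usr/bin/env bash
# Sketch of the manual gh-pages publishing steps (illustrative names only).
set -eu

DIST_DIR="docs/.vuepress/dist"   # build output folder named in this README

# Copy the generated site to a temporary folder outside the project
# and print that folder's path.
stash_dist() {
  local tmp
  tmp="$(mktemp -d)"
  cp -R "$DIST_DIR/." "$tmp/"
  printf '%s\n' "$tmp"
}

# Copy the stashed files back into the current working tree.
restore_dist() {
  cp -R "$1/." .
}

# Full sequence: stash, switch branch, copy back, commit, push.
# Assumption: the remote is named "origin".
publish_docs() {
  local tmp
  tmp="$(stash_dist)"
  git checkout gh-pages
  restore_dist "$tmp"
  # Stage only the files that were copied back, not other untracked files.
  (cd "$tmp" && find . -type f) | while IFS= read -r f; do
    git add -- "$f"
  done
  git commit -m "Update documentation"
  git push origin gh-pages
  rm -rf "$tmp"
}
```

After sourcing the script, calling `publish_docs` runs the whole sequence; switch back to your working branch (for example `git checkout master`) when it finishes. Staging only the copied files is a deliberate choice here, so that untracked folders left in the working tree are not committed to **gh-pages** by accident.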