https://github.com/promptapi/scraper-pkg

NPM package for Prompt API's Scraper API

![Node](https://img.shields.io/badge/node-14.9.0-green.svg)
[![npm version](https://badge.fury.io/js/%40promptapi%2Fscraper-pkg.svg)](https://badge.fury.io/js/%40promptapi%2Fscraper-pkg)

# Prompt API - Scraper - Node Package

`@promptapi/scraper-pkg` is a simple JavaScript wrapper for [scraper-api][scraper-api].

## Requirements

1. You need to sign up for [Prompt API][promptapi-signup].
1. You need to subscribe to [scraper-api][scraper-api]; the test drive is **free!!!**
1. You need to set the `PROMPTAPI_TOKEN` environment variable after subscribing.

Then install the package:

```bash
$ npm install @promptapi/scraper-pkg
```

Or install from the GitHub registry:

```bash
$ npm install @promptapi/scraper-pkg@0.1.6
```
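Since the package depends on the `PROMPTAPI_TOKEN` environment variable, it can help to check for it before making any calls. The following is a minimal sketch, not part of the package: `hasPromptApiToken` is a hypothetical helper name, and it only verifies that the variable is present and non-empty, not that the token is valid.

```javascript
// Hypothetical helper: not part of @promptapi/scraper-pkg.
// Returns true when PROMPTAPI_TOKEN is present and non-empty in `env`.
function hasPromptApiToken(env = process.env) {
  const token = env.PROMPTAPI_TOKEN;
  return typeof token === 'string' && token.trim().length > 0;
}

if (!hasPromptApiToken()) {
  console.warn('PROMPTAPI_TOKEN is not set; export it before using the package.');
}
```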

---

## Example Usage

Basic scrape feature:

```javascript
const promptapi = require('@promptapi/scraper-pkg')

// No extra URL parameters for the basic case
const params = {}

promptapi.scraper('https://pypi.org/classifiers/', params).then(result => {
  if (result.error) {
    console.log(result.error)
  } else {
    console.log(result.data); // your scraped data...
    console.log(result.headers);
    console.log(result.url);

    promptapi.save('/tmp/data.html', result.data) // save result
  }
})
```

Output:

// result.data
:
:
:

// result.headers
{ 'Content-Length': '322126', ...

// result.url
https://pypi.org/classifiers/

/tmp/data.html saved successfully, written 322126 bytes
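The example above checks `result.error` inside the `.then` callback. If you prefer `async`/`await` with exceptions, the same check can be wrapped once. This is a sketch under assumptions: `scrapeOrThrow` and `fakeScraper` are hypothetical names, and the wrapper only assumes the result shape shown above (`error`, or `data`, `headers`, and `url`). In real use you would pass `promptapi.scraper` as the first argument.

```javascript
// Hypothetical wrapper: not part of @promptapi/scraper-pkg.
// `scraperFn` is any function shaped like `promptapi.scraper`: it returns a
// promise resolving to an object with either `error` or `data`/`headers`/`url`.
async function scrapeOrThrow(scraperFn, url, params = {}) {
  const result = await scraperFn(url, params);
  if (result.error) {
    // Turn the error field into a rejected promise
    throw new Error(String(result.error));
  }
  return result;
}

// Stand-in scraper so the sketch runs without network access or a token.
const fakeScraper = async (url) => ({ data: '<html></html>', headers: {}, url });

scrapeOrThrow(fakeScraper, 'https://pypi.org/classifiers/')
  .then((result) => console.log(result.url))
  .catch((err) => console.error(err.message));
```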

You can pass URL parameters for extra operations. Valid parameters are:

- `auth_password`: HTTP Realm auth password
- `auth_username`: HTTP Realm auth username
- `cookie`: URL-encoded cookie header
- `country`: Two-character country code, for scraping from an IP address in a specific country
- `referer`: HTTP referer header
- `selector`: CSS-style selector path such as `a.btn div li`. When `selector` is
  set, the returned result is a collection of data and the saved file is
  in `.json` format.
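A misspelled parameter name is easy to miss, so one option is to filter a candidate object against the list above before sending it. This is a minimal sketch: `VALID_PARAMS` and `pickValidParams` are hypothetical names and not part of the package. How the API itself treats unknown parameters is not covered here.

```javascript
// Hypothetical helper: not part of @promptapi/scraper-pkg.
// Parameter names copied from the list above.
const VALID_PARAMS = ['auth_password', 'auth_username', 'cookie', 'country', 'referer', 'selector'];

// Splits `candidate` into recognized parameters and unrecognized key names.
function pickValidParams(candidate) {
  const accepted = {};
  const rejected = [];
  for (const [key, value] of Object.entries(candidate)) {
    if (VALID_PARAMS.includes(key)) accepted[key] = value;
    else rejected.push(key);
  }
  return { accepted, rejected };
}

const { accepted, rejected } = pickValidParams({ country: 'EE', selctor: 'ul li' });
console.log(accepted); // { country: 'EE' }
console.log(rejected); // [ 'selctor' ]
```

The sketch only guards key names. The package example that follows passes `country` and `selector` directly.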

```javascript
const promptapi = require('@promptapi/scraper-pkg')

// Scrape from an Estonian IP and extract only the matching elements
const params = {country: 'EE', selector: 'ul li button[data-clipboard-text]'}

promptapi.scraper('https://pypi.org/classifiers/', params).then(result => {
  if (result.error) {
    console.log(result.error)
  } else {
    console.log(result.data); // your scraped data...
    console.log(result.headers);
    console.log(result.url);

    promptapi.save('/tmp/data.json', result.data) // saved as JSON when selector is set
  }
})
```

Output:

// result.data
[ '\n Copy\n\n',
'\n Copy\n\n',
'\n Copy\n\n',
:
:
:

// result.headers
{ 'Content-Length': '322126', ...

// result.url
https://pypi.org/classifiers/

/tmp/data.json saved successfully, written 174182 bytes

If you have the `jq` tool installed:

```bash
$ cat /tmp/data.json | jq 'length'
736
```
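If `jq` is not available, the same count can be taken from Node. The following is a sketch that assumes the saved file holds a JSON array, as the `jq 'length'` result above suggests. `countSavedItems` is a hypothetical helper, not part of the package.

```javascript
const fs = require('fs');

// Hypothetical helper: reads a saved JSON file and returns the number of
// top-level items, assuming the file holds a JSON array.
function countSavedItems(filePath) {
  const items = JSON.parse(fs.readFileSync(filePath, 'utf8'));
  if (!Array.isArray(items)) {
    throw new TypeError(`${filePath} does not contain a JSON array`);
  }
  return items.length;
}
```

Calling `countSavedItems('/tmp/data.json')` on the file saved by the selector example should agree with the `jq` result.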

You can also add extra `X-` headers to your request. Read more about HTTP
headers at [Mozilla](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers)’s website.

```javascript
const promptapi = require('@promptapi/scraper-pkg')

const params = {}
const headers = {'X-Referer': 'https://www.google.com'}

// Headers go in as the third positional argument
promptapi.scraper('https://pypi.org/classifiers/', params, headers).then(result => {
  if (result.error) {
    console.log(result.error)
  } else {
    console.log(result.data); // your scraped data...
    console.log(result.headers);
    console.log(result.url);

    promptapi.save('/tmp/data.html', result.data) // save result
  }
})
```
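Because the text above refers specifically to `X-` headers, you may want to drop any other header names before passing the object along. This is a sketch only: `onlyXHeaders` is a hypothetical helper, and whether the package or the API ignores or rejects non-`X-` headers is an open assumption here.

```javascript
// Hypothetical helper: not part of @promptapi/scraper-pkg.
// Keeps only headers whose names start with "X-" (case-insensitive).
function onlyXHeaders(headers) {
  return Object.fromEntries(
    Object.entries(headers).filter(([name]) => /^x-/i.test(name))
  );
}

console.log(onlyXHeaders({ 'X-Referer': 'https://www.google.com', Accept: 'text/html' }));
// { 'X-Referer': 'https://www.google.com' }
```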

---

## Development

All you need is `node` and `npm`...

---

## License

This project is licensed under the MIT License.

---

## Contributor(s)

* [Prompt API](https://github.com/promptapi) - Creator, maintainer

---

## Contribute

All PRs are welcome!

1. `fork` (https://github.com/promptapi/scraper-pkg/fork)
1. Create your `branch` (`git checkout -b my-feature`)
1. `commit` your changes (`git commit -am 'Add awesome features...'`)
1. `push` your `branch` (`git push origin my-feature`)
1. Then create a new **Pull Request**!

This project is intended to be a safe, welcoming space for collaboration, and
contributors are expected to adhere to the [code of conduct][coc].

[promptapi-signup]: https://promptapi.com/#signup-form
[scraper-api]: https://promptapi.com/marketplace/description/scraper-api
[coc]: https://github.com/promptapi/scraper-pkg/blob/main/CODE_OF_CONDUCT.md