https://github.com/ztf666/web-scraper

A small page scraper, NO DYNAMIC SCRAPING tho :tired_face:

# 💩Scrapy💩

A small page scraper, still a work in progress.
No dynamic scraping: it only reads the HTML the server sends back.

This script uses:

- Cheerio
- JavaScript
- Axios

## How to use

- **Install and run**

```shell
npm install
```

```shell
npm run scrapy
```

- **Replace the URL with the site you want to scrape**

```javascript
axios.get("https://chouftv.ma/press");
```
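
For context, `axios.get` returns a promise, and the HTML arrives on the response's `data` property, which can then be handed to `cheerio.load`. Here is a minimal sketch of that flow. The `loadPage` helper is hypothetical (it is not taken from the script), and the optional `http` and `parser` parameters are an assumption added so the function can be tried with stubs instead of the network:

```javascript
// Hypothetical helper, not part of the original script.
// `http` and `parser` default to axios and cheerio, but they are only
// required when the caller does not pass replacements.
async function loadPage(url, http = require("axios"), parser = require("cheerio")) {
  const response = await http.get(url); // the HTML body is in response.data
  return parser.load(response.data);    // returns the `$` function used for selectors
}

// Usage (performs a real request):
// loadPage("https://chouftv.ma/press").then(($) => console.log($("title").text()));
```

If the request fails, the returned promise rejects, so a `.catch` on the call is probably worth adding in a real run.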

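- **Change the selectors to match the elements you want**

```javascript
// Visit every element with the class "description"
$(".description").each((index, element) => {
  // Text of the element's first child (used here as the title)
  const title = $(element).children().first().text();
  // href of the first <a> child; undefined if there is none
  const links = $(element).children("a").attr("href");
});
```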
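
The `href` values grabbed in the loop above can be relative (for example `/news/2`) or missing entirely. Here is a small sketch of one way to clean them up, using Node's built-in `URL` class. The helper names `toAbsoluteLink` and `toRecord` are hypothetical and not part of the script:

```javascript
// Hypothetical helpers, not part of the original script.

// Turn an href taken from the page into an absolute URL.
// Returns null when the element had no href or it cannot be parsed.
function toAbsoluteLink(href, baseUrl) {
  if (!href) return null;
  try {
    return new URL(href, baseUrl).href;
  } catch (err) {
    return null;
  }
}

// Build a plain record for one scraped item.
function toRecord(title, href, baseUrl) {
  return { title: title.trim(), link: toAbsoluteLink(href, baseUrl) };
}
```

Inside the `.each` callback this could be used as `results.push(toRecord(title, links, "https://chouftv.ma/press"))`, where `results` is an array declared before the loop.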

![Screenshot](scr/res.png)

The output looks weird because I ran it on a local news website.

- **Limitations**


This is a pretty rough scraper; I'm still learning.


It doesn't scrape links that haven't been loaded yet.

![Screenshot](scr/lm.png)

In the screenshot above, the button literally translates to "LOAD MORE".

I haven't figured out how to make it load more yet, so the script can't grab those links. It only grabs the latest news articles.

That's a blessing and a curse, because if clicked, the button would load EVERY ARTICLE WRITTEN since the deployment of the website...
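
One way to think about this trade-off is a loop with a hard cap on how many batches it fetches. The sketch below is hypothetical: how the site serves its "load more" batches is not known here, so `fetchPage` is a placeholder you would have to write yourself, and `collectLinks` and `maxPages` are illustrative names.

```javascript
// Hypothetical sketch, not part of the original script.
// `fetchPage(pageNumber)` is a placeholder: it should resolve to an array
// of links for that batch, or an empty array when nothing is left.
async function collectLinks(fetchPage, maxPages = 3) {
  const all = [];
  for (let page = 1; page <= maxPages; page++) {
    const links = await fetchPage(page);
    if (links.length === 0) break; // no more articles to load
    all.push(...links);
  }
  return all;
}
```

The cap keeps the run from pulling the whole archive, while still going past the first batch.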

## Contact

You can contact me at ZTF666@protonmail.ch



ZTF666

## License

**💩Scrapy💩** is released under the [MIT](LICENSE) License.



Made with 💘 by a 👨‍💻 on a 💻 | 2020 | ZTF666 - N.EA