https://github.com/mattrichmo/csv-url-crawl



CSV-URL-CRAWL

A Node.js app that parses URLs from a CSV file, then scrapes each page. It creates a new object for each page under each link, then recompiles those objects into a flattened CSV data export for each link.
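The flattening step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `flatten`, `csvEscape`, and `toCsv` helpers and the shape of the `link` object are assumptions made for the example.

```javascript
// Hypothetical helper: collapse a nested object into one level,
// joining keys with dots (array items are keyed by their index).
function flatten(value, prefix = "", out = {}) {
  for (const [key, val] of Object.entries(value)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (val !== null && typeof val === "object") {
      flatten(val, path, out);
    } else {
      out[path] = val;
    }
  }
  return out;
}

// Quote a field only when it contains a comma, quote, or line break.
function csvEscape(field) {
  const text = field === null || field === undefined ? "" : String(field);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build CSV text from flat rows; the header is the union of all row keys.
function toCsv(rows) {
  const headers = [...new Set(rows.flatMap((row) => Object.keys(row)))];
  const lines = [headers, ...rows.map((row) => headers.map((h) => row[h]))];
  return lines.map((line) => line.map(csvEscape).join(",")).join("\n");
}

// Assumed shape: one object per input link, one entry per scraped page.
const link = {
  url: "https://example.com",
  pages: [
    { url: "https://example.com", title: "Home" },
    { url: "https://example.com/about", title: "About, us" },
  ],
};

const csv = toCsv([flatten(link)]);
console.log(csv);
```

Dotted column names such as `pages.1.title` keep each page's fields distinguishable once everything sits in a single row.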

This code was built primarily to a client's specifications for an Upwork project.

# Important Variables
```
fileName = `links.csv` // set this to the name of your CSV file

Max_Depth = 1 // how deep to crawl from each link; 1 scrapes only the main page

```
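To illustrate how a depth limit like `Max_Depth` might bound the crawl, here is a minimal sketch. The `crawl` function, the injected `getLinks` callback, and the in-memory `site` map are all hypothetical; the example also assumes that a depth of 2 means the main page plus the pages it links to directly, which may differ from what `index.mjs` does.

```javascript
// Hypothetical depth-limited, breadth-first crawl.
// getLinks is an assumed async function: url -> array of URLs found on that page.
async function crawl(startUrl, maxDepth, getLinks) {
  const seen = new Set([startUrl]); // avoid visiting the same URL twice
  const pages = [];
  let frontier = [startUrl];

  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const url of frontier) {
      pages.push({ url, depth });
      if (depth === maxDepth) continue; // deepest level: record the page, do not follow its links
      for (const child of await getLinks(url)) {
        if (!seen.has(child)) {
          seen.add(child);
          next.push(child);
        }
      }
    }
    frontier = next;
  }
  return pages;
}

// Stand-in for real HTTP fetching and link extraction, so the sketch runs offline.
const site = { "/": ["/a", "/b"], "/a": ["/", "/c"], "/b": [], "/c": [] };
const getLinks = async (url) => site[url] ?? [];

crawl("/", 2, getLinks).then((pages) => console.log(pages));
```

With a depth of 1, `getLinks` is never called and only the starting page is recorded.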

To Run
```
npm install

```

Then

```
node index.mjs
```

The script currently saves three files:

- allLinksData.csv: a concatenation of all scraped page data into one object per link
- parentLinkData.csv: just the parent page data
- chiildLinksData.csv: just the child page data
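One way the scraped data could be partitioned into these three exports is sketched below. The `partitionLinkData` function, the input shape (`url`, `parent`, `children`), and the `parent_`/`childN_` column prefixes are assumptions for illustration only, not the project's actual schema.

```javascript
// Hypothetical input shape per link:
// { url, parent: { title, ... }, children: [{ url, title, ... }, ...] }
function partitionLinkData(links) {
  // One row per link holding only the parent page's fields.
  const parentRows = links.map((link) => ({ link: link.url, ...link.parent }));

  // One row per child page, tagged with the link it was found under.
  const childRows = links.flatMap((link) =>
    link.children.map((child) => ({ link: link.url, ...child }))
  );

  // One row per link, with parent and child fields merged under prefixed names.
  const allRows = links.map((link) => {
    const row = { link: link.url };
    for (const [key, val] of Object.entries(link.parent)) {
      row[`parent_${key}`] = val;
    }
    link.children.forEach((child, i) => {
      for (const [key, val] of Object.entries(child)) {
        row[`child${i + 1}_${key}`] = val;
      }
    });
    return row;
  });

  return { allRows, parentRows, childRows };
}

const sample = [
  {
    url: "https://example.com",
    parent: { title: "Home" },
    children: [
      { url: "https://example.com/a", title: "A" },
      { url: "https://example.com/b", title: "B" },
    ],
  },
];

const { allRows, parentRows, childRows } = partitionLinkData(sample);
console.log(allRows, parentRows, childRows);
```

Each of the three arrays could then be serialized to CSV text and written to its file, for example with `fs.writeFileSync`.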

© 2024 all rights reserved Matt Richmond