Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


https://github.com/gabrielahn/citieslover-public

Allows users to scrape select web pages and upload them to an S3 bucket.

Topics: newsletter, python, scraping

Last synced: about 2 months ago



README


# Cities Lover

Cities Lover 💚 is a place to find jobs 💼 and news 📰 for anyone interested in urbanism 🌆.

This script scrapes various websites and uploads the results to S3 buckets, which are then rendered at **[Cities Lover](https://www.gabrielhn.com/citieslover/)**.

![website](/citieslover.gif)
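The overall flow is: fetch each page, parse out the relevant items, and push a JSON dataset to an S3 bucket. The sketch below illustrates that flow with `requests`, BeautifulSoup, and `boto3`; the function names, CSS selector, URL, bucket, and key are placeholders for illustration, not the repository's actual code.

```python
# Minimal sketch of the scrape-and-upload flow described above.
# All names (functions, selector, URL, bucket, key) are illustrative assumptions.
import json

import boto3
import requests
from bs4 import BeautifulSoup


def scrape_job_listings(url: str) -> list[dict]:
    """Fetch a page and extract job postings (selector is a placeholder)."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    return [
        {"title": item.get_text(strip=True), "link": item.get("href")}
        for item in soup.select("a.job-listing")  # hypothetical selector
    ]


def upload_dataset(records: list[dict], bucket: str, key: str) -> None:
    """Serialize the scraped records as JSON and put them in an S3 bucket."""
    s3 = boto3.client("s3")
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(records).encode("utf-8"),
        ContentType="application/json",
    )


if __name__ == "__main__":
    jobs = scrape_job_listings("https://example.com/urbanism-jobs")  # placeholder URL
    upload_dataset(jobs, bucket="citieslover-data", key="jobs.json")  # placeholder bucket/key
```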

## Setup
```
python3 -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt
```

## Use
```
# Test scrape for a specific source
python -m cities_scrape_data test_source --source <source_name> [--response] [--threads <num_threads>]

# Test scrape by source type (e.g., jobs, podcasts, articles)
python -m cities_scrape_data test_by_type --type <source_type> [--response] [--threads <num_threads>]

# Test scraping all data sources
python -m cities_scrape_data test_all [--response] [--threads <num_threads>]

# Create datasets to be uploaded to the S3 Bucket
python -m cities_scrape_data create_datasets [--threads <num_threads>]

# Get websites info
python -m cities_scrape_data get_websites
```
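For reference, the subcommands and flags above could be wired up with `argparse` roughly as follows. This is only a sketch of a common CLI layout under the assumption that `cities_scrape_data` dispatches on a subcommand name; it is not the module's actual implementation.

```python
# Illustrative argparse layout matching the usage examples above.
# The real cities_scrape_data internals may differ.
import argparse


def main() -> None:
    parser = argparse.ArgumentParser(prog="cities_scrape_data")
    subparsers = parser.add_subparsers(dest="command", required=True)

    test_source = subparsers.add_parser("test_source", help="Test scrape a single source")
    test_source.add_argument("--source", required=True)

    test_by_type = subparsers.add_parser("test_by_type", help="Test scrape by source type")
    test_by_type.add_argument("--type", required=True)

    test_all = subparsers.add_parser("test_all", help="Test scrape every source")
    create_datasets = subparsers.add_parser("create_datasets", help="Build datasets for S3")
    subparsers.add_parser("get_websites", help="List configured websites")

    # Optional flags shown in the usage examples above
    for sub in (test_source, test_by_type, test_all):
        sub.add_argument("--response", action="store_true")
    for sub in (test_source, test_by_type, test_all, create_datasets):
        sub.add_argument("--threads", type=int, default=1)

    args = parser.parse_args()
    print(f"Parsed command: {args.command}")  # real dispatch would call the scrapers here


if __name__ == "__main__":
    main()
```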