https://github.com/desktopcleaner/naturemagazinescraper


# NatureMagazineScraper
Scrape open-access Nature articles and store them as txt files.

# Key Features
- Users can specify which year's articles to scrape and analyze
- Users can cap how many times each word is counted per article, so that one article repeating a word heavily does not inflate the totals

## scraper.py
Scrapes articles using Beautiful Soup and stores them as text files.
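
As an illustration of the extraction step, here is a minimal sketch rather than the actual contents of `scraper.py`. The function names `extract_article_text` and `save_article` are hypothetical, and the assumption that the article body sits in `<p>` tags inside an `<article>` element may not match the real page layout.

```python
from pathlib import Path

from bs4 import BeautifulSoup


def extract_article_text(html: str) -> str:
    """Return the paragraph text of an article page as plain text."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: the body text lives in <p> tags inside an <article> element.
    # Fall back to the whole document if no <article> element is present.
    container = soup.find("article") or soup
    paragraphs = [p.get_text(" ", strip=True) for p in container.find_all("p")]
    return "\n".join(p for p in paragraphs if p)


def save_article(html: str, out_path: Path) -> None:
    """Extract the article text and write it to a UTF-8 text file."""
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(extract_article_text(html), encoding="utf-8")
```

Fetching the HTML is left out on purpose, so the sketch works on any HTML string that has already been downloaded.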

## analyzer.py
Parses the scraped articles and sums up word counts.
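
The counting with a per-article cap could look roughly like the sketch below. It is not the code in `analyzer.py`: the names `count_words`, `total_counts`, and `max_per_article` are hypothetical, and the tokenization rule (lowercase ASCII letters with an optional inner apostrophe) is an assumption.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters, optionally with one apostrophe part.
WORD_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def count_words(text, max_per_article=None):
    """Count words in one article, optionally capping each word's count."""
    counts = Counter(WORD_PATTERN.findall(text.lower()))
    if max_per_article is not None:
        # Cap each word so a single repetitive article cannot dominate the totals.
        counts = Counter({w: min(c, max_per_article) for w, c in counts.items()})
    return counts


def total_counts(articles, max_per_article=None):
    """Sum the (optionally capped) word counts over an iterable of article texts."""
    total = Counter()
    for text in articles:
        total.update(count_words(text, max_per_article))
    return total
```

For example, with `max_per_article=2`, an article containing "cell" three times contributes only 2 to the total for "cell".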

## data_cleaner.py
Removes common words and other words that would bias the counts.
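
A minimal sketch of this filtering step follows, assuming it operates on a word-count mapping. The word lists and the name `clean_counts` are hypothetical placeholders, not the lists used in `data_cleaner.py`; here "biased" is read as words that appear often only because of where the text comes from.

```python
from collections import Counter

# Hypothetical examples of very common English words.
COMMON_WORDS = frozenset({"the", "a", "an", "and", "of", "in", "to", "is"})
# Hypothetical examples of source-specific words that would skew the counts.
BIASED_WORDS = frozenset({"nature", "article", "doi"})


def clean_counts(counts, extra_excluded=()):
    """Return a new Counter without common, biased, or extra excluded words."""
    excluded = COMMON_WORDS | BIASED_WORDS | set(extra_excluded)
    return Counter({w: c for w, c in counts.items() if w not in excluded})
```

Returning a new `Counter` leaves the raw counts untouched, so the same data can be filtered again with a different exclusion list.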