https://github.com/smhussain5/politico-nlp-python

Terminal application using natural language processing and web-scraping via Python/PyCharm

beautifulsoup natural-language-processing nltk pycharm python terminal-application web-scraping


# POLITICO NLP PYTHON

![Politico NLP Python GIF Demonstration](https://github.com/smhussain5/Politico-NLP-Python/blob/main/POLITICO_NLP_PYTHON.gif?raw=true)

## Problem 🤔

Utilizing multiple libraries to develop innovative solutions is key to being a competent software engineer. In today's fast-paced landscape, staying informed is important but difficult to do!

## Solution 💡

This terminal application uses Beautiful Soup to scrape the Politico website for the day's top stories. After collecting the article links, it uses natural language processing (NLP) libraries to summarize each article's text and calculate a polarity score to further inform the reader. The results are displayed in the terminal for convenient reading.
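
To make the polarity score concrete: sentiment libraries such as TextBlob (listed under Technologies Used) report polarity as a number from -1.0 (negative) to 1.0 (positive). The sketch below is not this application's code and does not call TextBlob. It is a toy, dependency-free stand-in that averages scores from a tiny hand-made word list, and the names `TOY_LEXICON`, `toy_polarity`, and `label` are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical toy lexicon: word -> score in [-1.0, 1.0].
# Real sentiment libraries ship far larger, curated lexicons or trained models.
TOY_LEXICON = {
    "good": 0.7,
    "great": 0.8,
    "win": 0.5,
    "bad": -0.7,
    "crisis": -0.6,
    "fail": -0.5,
}


def toy_polarity(text: str) -> float:
    """Average the lexicon scores of known words; 0.0 if none are found."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    if not scores:
        return 0.0
    return sum(scores) / len(scores)


def label(polarity: float) -> str:
    """Turn a numeric polarity into a coarse label for display."""
    if polarity > 0.1:
        return "positive"
    if polarity < -0.1:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    summary = "Lawmakers call the budget deal a great win despite a bad start."
    score = toy_polarity(summary)
    print(f"polarity={score:+.2f} ({label(score)})")
```

A score near zero does not necessarily mean the text is neutral; in an averaging scheme like this one, strongly positive and strongly negative words can cancel out.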

## Quick Start ⚡

If you have Docker installed, you can run this application on your own machine in just two steps:

Pull the image from Docker Hub:
```shell
docker pull smhussain5/politico-python
```
Then run the image as an interactive Docker container:
```shell
docker run --rm -it smhussain5/politico-python
```

## Technologies Used ⚙

- Beautiful Soup
- Newspaper3k
- NLTK
- PyCharm
- Python
- TextBlob

## Challenges 💢

This was a straightforward application, but it required careful organization to keep the code clean. Furthermore, the Newspaper3k library was unable to collect every article, and the NLP, in its current state, provides summaries that are only adequate.
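
One way to cope with articles that cannot be collected is to skip them and report what was skipped, rather than abort the whole run. The following is a minimal sketch of that pattern, not the project's actual code. It assumes a hypothetical `fetch_summary` callable that stands in for whatever download-and-summarize step the application uses; `summarize_all`, `fake_fetch`, and the sample URLs are likewise illustrative.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def summarize_all(
    urls: Iterable[str],
    fetch_summary: Callable[[str], str],
) -> Tuple[Dict[str, str], List[str]]:
    """Summarize each URL, collecting failures instead of raising."""
    summaries: Dict[str, str] = {}
    failed: List[str] = []
    for url in urls:
        try:
            summary = fetch_summary(url)
        except Exception:  # broad on purpose: any scraping error skips the article
            failed.append(url)
            continue
        if summary.strip():
            summaries[url] = summary
        else:
            # An empty summary is treated as a failed extraction.
            failed.append(url)
    return summaries, failed


if __name__ == "__main__":
    def fake_fetch(url: str) -> str:
        # Hypothetical stand-in for a real download-and-summarize call.
        if "broken" in url:
            raise ValueError("could not parse article")
        return f"Summary of {url}"

    ok, skipped = summarize_all(
        ["https://example.com/a", "https://example.com/broken"], fake_fetch
    )
    print(f"{len(ok)} summarized, {len(skipped)} skipped")
```

Returning the skipped URLs alongside the summaries lets the terminal output tell the reader which stories are missing instead of silently dropping them.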

## Insights 💭

In fewer than 100 lines of code, I was able to scrape Politico and use NLP techniques to summarize the scraped articles. This demonstrates the power of these Python libraries. Potential improvements include using more accurate NLP libraries and a web framework like Django for better presentation.

## Contact 📲

[![Static Badge](https://img.shields.io/badge/Send%20me%20an%20email-212121?style=flat-square&logo=gmail&logoColor=EA4335)](mailto:[email protected]?)

[![Static Badge](https://img.shields.io/badge/Connect_with_me_on_LinkedIn-212121?style=flat-square&logo=linkedin&logoColor=0A66C2)](https://www.linkedin.com/in/shabab-h)

[![Static Badge](https://img.shields.io/badge/Follow_me_on_Twitter-212121?style=flat-square&logo=twitter&logoColor=1D9BF0)](https://twitter.com/shussain_5)

[![Static Badge](https://img.shields.io/badge/Follow_me_on_GitHub-212121?style=flat-square&logo=github&logoColor=FAFAFA)](https://github.com/smhussain5)