https://github.com/infomiho/keynews

News scraping and key connected entities graph exploration

# Keynews

![](http://deviantpics.com/images/2018/04/19/image.png)

## Objective
Our objective was to let the end user explore the current news landscape and deduce connections between keyphrases. This involved downloading currently trending news stories from a multitude of news outlets.
We approached the objective by modelling the news as a connected graph in which the nodes represent keyphrases and the edges represent the connections between them.
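
As a rough illustration of the graph idea, here is a minimal sketch using only the Python standard library. It assumes each article has already been reduced to a list of keyphrases, and that two keyphrases are connected when they appear in the same article. The function name `build_cooccurrence_graph` and the sample data are hypothetical and not taken from the project code.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(article_keyphrases):
    """Return (nodes, edges) for keyphrases that share an article.

    nodes: Counter mapping keyphrase -> number of articles mentioning it
    edges: Counter mapping a sorted (phrase_a, phrase_b) pair -> number of
           articles in which both phrases appear
    """
    nodes = Counter()
    edges = Counter()
    for phrases in article_keyphrases:
        unique = sorted(set(phrases))  # ignore repeats within one article
        nodes.update(unique)
        edges.update(combinations(unique, 2))  # every unordered pair once
    return nodes, edges


# Hypothetical keyphrases for three articles (illustrative data only).
articles = [
    ["election", "parliament", "budget"],
    ["budget", "inflation"],
    ["election", "budget"],
]
nodes, edges = build_cooccurrence_graph(articles)
```

Storing each edge under a sorted pair keeps the graph undirected, so `("budget", "election")` and `("election", "budget")` are counted as the same connection.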

## How we solved it

We created a system that scraped article links from newsapi.org, downloaded the articles and finally processed them with our NLP pipeline. Our NLP pipeline consisted of:
- Named entity recognition
- Keyphrase extraction
- Keyphrase ranking using TF-IDF
- Text summarization using TextRank
- Finding keyphrases that co-occur in the same article
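
To make the ranking step more concrete, the sketch below scores candidate keyphrases with a plain TF-IDF formula (term frequency times the natural log of inverse document frequency). The project used Gensim for this step; this standard-library version is only an approximation of the idea and does not reproduce Gensim's exact weighting or normalisation. The function name `rank_keyphrases` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def rank_keyphrases(documents, top_n=3):
    """Rank each document's candidate phrases by TF-IDF, highest first."""
    n_docs = len(documents)
    # Document frequency: in how many documents each phrase occurs.
    doc_freq = Counter(phrase for doc in documents for phrase in set(doc))
    ranked = []
    for doc in documents:
        counts = Counter(doc)
        scores = {
            phrase: (count / len(doc)) * math.log(n_docs / doc_freq[phrase])
            for phrase, count in counts.items()
        }
        # Sort by score (descending), then alphabetically for stable output.
        best = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        ranked.append(best[:top_n])
    return ranked


# Hypothetical candidate phrases per article (illustrative data only).
docs = [
    ["budget", "budget", "election"],
    ["budget", "inflation"],
    ["election", "strike", "strike"],
]
ranked = rank_keyphrases(docs)
```

A phrase that occurs in every document gets a score of zero under this formula, which is the intended effect: phrases common to all articles say little about any single one.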

## Technologies used
- Python 3.6
- Flask
- SQLAlchemy
- APScheduler
  - Scheduled background article fetching and processing
- SpaCy
  - Named entity recognition
  - Parts-of-speech tagging
- Gensim
  - TF-IDF keyphrase ranking
  - TextRank text summarization
- Cytoscape.js / React
  - Graph visualisation
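
For the visualisation side, Cytoscape.js accepts a flat list of elements in which every item carries a `data` object, and items with `source` and `target` are treated as edges. The sketch below shows one way a Python backend could serialise a keyphrase graph into that shape. The function name `to_cytoscape_elements`, the `weight` field and the edge id format are assumptions made for illustration, not the project's actual API.

```python
import json


def to_cytoscape_elements(nodes, edges):
    """Convert node weights and weighted edges into Cytoscape.js elements."""
    elements = [
        # "weight" is a custom data field (an assumption), usable for styling.
        {"data": {"id": phrase, "weight": weight}}
        for phrase, weight in sorted(nodes.items())
    ]
    for (source, target), weight in sorted(edges.items()):
        elements.append({
            "data": {
                "id": f"{source}--{target}",  # hypothetical edge id scheme
                "source": source,
                "target": target,
                "weight": weight,
            }
        })
    return elements


# Hypothetical graph data (illustrative only).
nodes = {"budget": 3, "election": 2}
edges = {("budget", "election"): 2}
payload = json.dumps(to_cytoscape_elements(nodes, edges))
```

The resulting JSON string could be returned from a Flask route and passed to the `elements` option on the client.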

## People involved

- Mate Mijolović
- Ivan Dujmić
- Mihovil Ilakovac

Project done at [Takelab](http://takelab.fer.hr/) in 2017.