https://github.com/infomiho/keynews
News scraping and key connected entities graph exploration
python react
- Host: GitHub
- URL: https://github.com/infomiho/keynews
- Owner: infomiho
- Created: 2018-04-19T07:50:33.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2022-12-29T12:37:38.000Z (about 2 years ago)
- Last Synced: 2024-11-22T20:36:02.893Z (2 months ago)
- Topics: python, react
- Language: Python
- Size: 5.79 MB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 12
Metadata Files:
- Readme: README.md
# Keynews
![](http://deviantpics.com/images/2018/04/19/image.png)
## Objective
Our objective was to enable the end user to explore the current news space and deduce connections between key phrases. This included downloading currently trending news stories from a multitude of news outlets.
We approached this objective from a standpoint of a connected graph where the nodes represent the keyphrases and the edges represent the connections.

## How we solved it
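
The graph model described under Objective, with keyphrases as nodes and colocations as edges, could be represented roughly as follows. This is a minimal plain-Python sketch, not the project's actual code: the function names `build_keyphrase_graph` and `neighbours`, the choice to weight an edge by the number of articles sharing both keyphrases, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_keyphrase_graph(articles):
    """Build an undirected, weighted co-occurrence graph.

    `articles` is an iterable of keyphrase collections, one per article.
    Returns (nodes, edges): `nodes` is a set of keyphrases and `edges` maps
    an alphabetically sorted (phrase_a, phrase_b) pair to the number of
    articles that contain both phrases.
    """
    nodes = set()
    edges = Counter()
    for keyphrases in articles:
        unique = sorted(set(keyphrases))  # de-duplicate within one article
        nodes.update(unique)
        # Every unordered pair of keyphrases in the same article adds one
        # unit of weight to the edge between them.
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return nodes, edges


def neighbours(edges, phrase):
    """Return {neighbour: weight} for all keyphrases connected to `phrase`."""
    result = {}
    for (a, b), weight in edges.items():
        if a == phrase:
            result[b] = weight
        elif b == phrase:
            result[a] = weight
    return result


# Made-up sample data, purely for illustration.
sample = [
    ["election", "parliament", "budget"],
    ["budget", "inflation"],
    ["election", "budget"],
]
nodes, edges = build_keyphrase_graph(sample)
```

Calling `neighbours(edges, "budget")` on this sample lists every keyphrase that shares at least one article with "budget", together with the number of shared articles. A node-and-edge structure like this is also the kind of data a graph front end would need in order to draw the graph.
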
We created a system that scraped article links from newsapi.org, downloaded the articles and finally processed them with our NLP pipeline. Our NLP pipeline consisted of:
- Named entity recognition
- Keyphrase extraction
- Keyphrase ranking using TF-IDF
- Text summarization using TextRank
- Finding colocated keyphrases in articles

## Technologies used
- Python 3.6
- Flask
- SQLAlchemy
- APScheduler
  - Scheduled background article fetching and processing
- SpaCy
  - Named entity recognition
  - Parts-of-speech tagging
- Gensim
  - TF-IDF keyphrase ranking
  - TextRank text summarization
- Cytoscape.js / React
  - Graph visualisation

## People involved
- Mate Mijolović
- Ivan Dujmić
- Mihovil Ilakovac

Project done [@Takelab](http://takelab.fer.hr/) - 2017
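
For readers unfamiliar with the TF-IDF keyphrase ranking step listed in the pipeline, the idea can be sketched in plain Python. The project itself used Gensim for this step. The code below does not use Gensim's API and is not taken from the repository. The function name `rank_keyphrases`, the weighting variant (term frequency as `count / len(doc)`, inverse document frequency as `log(N / df)`), and the sample data are assumptions chosen for illustration.

```python
import math
from collections import Counter


def rank_keyphrases(documents, top_n=3):
    """Rank each document's keyphrases by a simple TF-IDF score.

    `documents` is a list of keyphrase lists (repeats allowed).
    Uses tf = count / len(doc) and idf = log(N / df); other weighting
    variants exist and can produce different rankings.
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each phrase appears.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    rankings = []
    for doc in documents:
        counts = Counter(doc)
        scores = {
            phrase: (count / len(doc)) * math.log(n_docs / doc_freq[phrase])
            for phrase, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        rankings.append(ranked[:top_n])
    return rankings


# Made-up sample data, purely for illustration.
docs = [
    ["budget", "budget", "election", "parliament"],
    ["budget", "inflation", "inflation"],
    ["election", "budget"],
]
rankings = rank_keyphrases(docs)
```

In this sample "budget" appears in every document, so its inverse document frequency is zero and it ranks last everywhere, while phrases specific to one article rise to the top.
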