https://github.com/onetail/crawler-with-kafka-docker

Homework project: a crawler with analysis
analysis crawler kafka-docker

# Crawler-with-kafka-docker

> Crawls Yahoo News and analyzes the articles.

* ```python main.py```

> ![run this process](https://i.imgur.com/VfJzvkr.jpg)

> Runs the crawler together with the Kafka producer and consumer.

* ```python main.py consumer```

> ![consumer get data](https://i.imgur.com/2Ym7611.jpg)

> Shows the data received by the consumer.
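
The two commands above could map onto a producer/consumer pair roughly as sketched below. This is not the repository's actual `main.py`: it assumes the `kafka-python` package, a kafka-docker broker reachable at `localhost:9092`, and a hypothetical topic named `news`. The `encode`, `decode`, `produce`, `consume` and `main` helpers are illustrative names only.

```python
import json

TOPIC = "news"             # hypothetical topic name
BROKER = "localhost:9092"  # assumed address of the kafka-docker broker


def encode(record):
    """Serialize a dict to UTF-8 JSON bytes for use as a message value."""
    return json.dumps(record, ensure_ascii=False).encode("utf-8")


def decode(payload):
    """Inverse of encode(): turn message bytes back into a dict."""
    return json.loads(payload.decode("utf-8"))


def produce(records):
    """Send each crawled record to the topic, then flush pending messages."""
    # Imported lazily so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=BROKER, value_serializer=encode)
    for record in records:
        producer.send(TOPIC, record)
    producer.flush()


def consume():
    """Print every message on the topic, starting from the earliest offset."""
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=BROKER,
        value_deserializer=decode,
        auto_offset_reset="earliest",
    )
    for message in consumer:
        print(message.value)


def main(argv):
    """Dispatch on the first argument: 'consumer' reads, anything else writes."""
    if len(argv) > 1 and argv[1] == "consumer":
        consume()
    else:
        # Placeholder record; a real run would pass the crawler's output here.
        produce([{"title": "placeholder headline", "url": "https://example.com"}])
```

Calling `main(sys.argv)` from a script's entry point would give the same two modes as `python main.py` and `python main.py consumer`.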

> This project uses kafka-docker as the message queue and a Python crawler to collect the data.
> The analysis uses cosine similarity to pick the top 5 news items.
> CSV content:
> ![csv content detail top 5 news](https://i.imgur.com/xNBr5We.jpg)
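
As an illustration of the cosine similarity step, here is a minimal standard-library sketch. It assumes plain term-frequency vectors and a ranking of headlines against a single reference text. The repository may tokenize, weight, or compare the articles differently, and `tokenize`, `cosine_similarity` and `top_k_similar` are hypothetical names.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def top_k_similar(query, documents, k=5):
    """Return the k (score, document) pairs most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(doc))), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

For example, `top_k_similar(reference_text, headlines)` returns at most five `(score, headline)` pairs, highest score first, which could then be written out with the `csv` module.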