https://github.com/rafaelmoraes003/tech-news

Analysis and manipulation of news data from a technology website obtained through data scraping using Python.
crawler data-scraping https mongodb parsel pymongo python web-scraping

Tech News

This project queries a website that publishes technology news. To do this, it uses data scraping, a technique for collecting data from online platforms. Programs that “scrape” the information capture the data from the content generated by the pages. After the scraping is finished, the data is saved in a database.
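As a rough illustration of the scraping step, here is a minimal sketch that pulls headlines out of an HTML snippet. It uses the standard library's `html.parser` as a stand-in so the example runs without extra dependencies; the project itself lists `parsel` among its topics. The `entry-title` class, the sample markup, and the names `TitleParser` and `scrape_titles` are assumptions made for this example, not taken from the project.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside every <h2 class="entry-title"> element.

    The tag and class name are hypothetical; a real site will differ.
    """

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "h2" and ("class", "entry-title") in attrs:
            self._in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        # Accumulate text only while inside a matching heading
        if self._in_title:
            self.titles[-1] += data


def scrape_titles(html):
    """Return the stripped headline texts found in the given HTML string."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return [title.strip() for title in parser.titles]


# Hypothetical markup standing in for a downloaded news listing page
SAMPLE_HTML = """
<article><h2 class="entry-title"><a href="/a">First headline</a></h2></article>
<article><h2 class="entry-title"><a href="/b">Second headline</a></h2></article>
"""

print(scrape_titles(SAMPLE_HTML))
```

In a real run the HTML would come from an HTTP request to the news site, and each extracted record would then be inserted into the database.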

With the data saved and structured, the program lets you search the news by title, date, tag, and category.
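The searches above could be expressed as MongoDB query filters. The sketch below only builds the filter dictionaries, so it runs without a database. The field names (`title`, `timestamp`, `tags`, `category`), the stored date format (`dd/mm/YYYY`), and the function names are assumptions for illustration; the project's actual schema may differ.

```python
import re
from datetime import datetime


def title_filter(title):
    """Case-insensitive substring match on the (assumed) `title` field."""
    # re.escape keeps user input from being interpreted as a regex
    return {"title": {"$regex": re.escape(title), "$options": "i"}}


def date_filter(iso_date):
    """Match news published on a given day, given as YYYY-MM-DD.

    strptime raises ValueError on malformed input. The stored format
    dd/mm/YYYY is an assumption.
    """
    parsed = datetime.strptime(iso_date, "%Y-%m-%d")
    return {"timestamp": parsed.strftime("%d/%m/%Y")}


def tag_filter(tag):
    """Case-insensitive exact match against any element of `tags`."""
    return {"tags": {"$regex": f"^{re.escape(tag)}$", "$options": "i"}}


def category_filter(category):
    """Case-insensitive exact match on the (assumed) `category` field."""
    return {"category": {"$regex": f"^{re.escape(category)}$", "$options": "i"}}


print(title_filter("python"))
print(date_filter("2023-04-05"))
```

Each dictionary could then be passed to pymongo's `Collection.find` to retrieve the matching documents.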

An interactive menu is available so that the user can run these operations more easily.

Technologies used

Python
MongoDB

How to use the application

Clone the application using the `git clone` command. After that, enter the project folder using the command `cd tech-news`.

How to run the application

1. Create the virtual environment for the project
- `python3 -m venv .venv && source .venv/bin/activate`

2. Install the dependencies
- `python3 -m pip install -r dev-requirements.txt`

Running the MongoDB database through Docker 🍃 🐳

In the root folder of the project, use the command `docker-compose up -d mongodb`.

Using the menu

1. In the terminal, use the command:
- `python3 -m tech_news.menu`

This command opens the menu, which offers several options for viewing the data collected by the scrape.
If this is your first time using the application, choose option `0` first to populate the database.