Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sekharsunkara6/twitter-web-scraping
Web scraping with Selenium and ProxyMesh, storing the data in MongoDB, showing a list on a webpage.
- Host: GitHub
- URL: https://github.com/sekharsunkara6/twitter-web-scraping
- Owner: SekharSunkara6
- Created: 2024-06-05T15:39:59.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-06-05T15:53:59.000Z (8 months ago)
- Last Synced: 2024-12-01T00:18:35.785Z (2 months ago)
- Topics: flask, gunicorn, mongodb, proxymesh, pymongo, python, selenium, webdriver
- Language: Python
- Homepage:
- Size: 75.2 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Twitter Web-Scraping App
This repository contains a Selenium script that reads the Twitter home page (on your local computer) and fetches the top 5 trending topics from the "What's Happening" section of the homepage.
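As a rough illustration of that approach (not the repository's actual script), the sketch below uses Selenium to open the home page and pull the first five trend labels. The CSS selector and the `scrape_trending_topics` name are assumptions for illustration; login handling and ProxyMesh configuration are omitted here and covered further down.

```python
# Minimal sketch, assuming Selenium 4+ with Chrome. Login handling and proxy
# configuration are omitted, and the [data-testid="trend"] selector is an
# assumption that may need updating as the site's markup changes.
from selenium import webdriver
from selenium.webdriver.common.by import By


def scrape_trending_topics(limit=5):
    """Open the home page and return up to `limit` trend labels."""
    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # run without opening a browser window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get("https://x.com/home")
        driver.implicitly_wait(10)  # wait for the "What's Happening" panel to render
        cells = driver.find_elements(By.CSS_SELECTOR, '[data-testid="trend"]')
        return [cell.text.split("\n")[0] for cell in cells[:limit]]
    finally:
        driver.quit()


if __name__ == "__main__":
    for topic in scrape_trending_topics():
        print(topic)
```

In practice the real script also has to authenticate first, which is why the credentials listed in the `.env` file below are required.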
## Technologies Used:

1. Selenium: Selenium is a popular tool for automating web browsers. In this project, it is used for web scraping to interact with the Twitter website and extract trending topics.
2. MongoDB: MongoDB is a NoSQL database program. It is used to store the scraped data (trending topics) for persistence; see the pymongo sketch after this list.
3. HTML and Jinja: HTML is used to structure the web pages, while Jinja is a templating engine for Python that is used with Flask to generate dynamic HTML content.
4. Python-Dotenv: Python-Dotenv is a Python library that manages environment variables stored in a .env file. It is used in this project to securely store sensitive information such as credentials.
5. Flask: Flask is a micro web framework written in Python. It is used to build the web application and define routes for handling HTTP requests.
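To make the MongoDB persistence step concrete, here is a minimal pymongo sketch. The database and collection names and the document shape are illustrative assumptions; only the `MONGODB_URI` environment variable comes from the project's own `.env` description below.

```python
# Minimal sketch: store one document per scrape run. The database/collection
# names ("twitter_scraping", "trends") are placeholders, not taken from the repo.
import os
from datetime import datetime, timezone

from pymongo import MongoClient


def save_trends(trends):
    """Insert the scraped trend list into MongoDB and return the new document id."""
    client = MongoClient(os.environ["MONGODB_URI"])
    try:
        collection = client["twitter_scraping"]["trends"]
        result = collection.insert_one(
            {"trends": trends, "scraped_at": datetime.now(timezone.utc)}
        )
        return result.inserted_id
    finally:
        client.close()


if __name__ == "__main__":
    print(save_trends(["#ExampleTrend1", "#ExampleTrend2"]))
```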
## Features:

1. Web Scraping: Utilizes Selenium and BeautifulSoup for web scraping to extract trending topics from Twitter's homepage.
2. MongoDB Integration: Stores scraped data in a MongoDB database for persistence.
3. Flask Web Interface: Provides a simple web interface for triggering the scraping process and displaying the scraped data; a minimal route sketch follows this list.
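As a sketch of how such a Flask interface might be wired (illustrative only, not the repository's `Web.py`): one route renders the page and another triggers a scrape. It assumes a `scrape_trending_topics` function exported by the scraper module listed below and an `index.html` template that loops over a `trends` variable.

```python
# Illustrative Flask wiring. It assumes twitter_scrapping.py exposes a
# scrape_trending_topics() function and that templates/index.html renders a
# `trends` list; the repository's actual routes may differ.
from flask import Flask, redirect, render_template, url_for

from twitter_scrapping import scrape_trending_topics  # assumed function name

app = Flask(__name__)
latest_trends = []  # in-memory stand-in; the real app persists results in MongoDB


@app.route("/")
def index():
    return render_template("index.html", trends=latest_trends)


@app.route("/scrape", methods=["POST"])
def scrape():
    global latest_trends
    latest_trends = scrape_trending_topics()
    return redirect(url_for("index"))


if __name__ == "__main__":
    app.run(port=5000)
```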
## Files Included:

1. Web.py: Flask application script containing routes for triggering the scraping process and displaying scraped data.
2. twitter_scrapping.py: Python script for scraping Twitter's homepage and storing trending topics in MongoDB.
3. requirements.txt: File containing a list of Python dependencies required for running the project.
4. index.html: HTML template file for rendering the web interface.
5. .env: Environment variable file containing sensitive information like credentials.
### Installation

Clone the repository to your local machine:

```bash
git clone https://github.com/SekharSunkara6/Twitter-Web-Scraping.git
```
Navigate to the project directory:

```bash
cd Twitter-Web-Scraping
```
Install the required dependencies using pip:

```bash
pip install -r requirements.txt
```
Set up your environment variables by creating a .env file in the project directory and adding the necessary credentials:

```makefile
proxymesh_user=your_proxymesh_username
proxymesh_pass=your_proxymesh_password
X_EMAIL=your_twitter_email
X_PASS=your_twitter_password
X_USER=your_twitter_username
MONGODB_URI=your_mongodb_uri
```
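Once the `.env` file exists, python-dotenv can load these values at runtime. The sketch below is illustrative: the ProxyMesh host and port are placeholders to be replaced with the endpoint from your own account, and how the proxy is actually attached to Selenium is left to the project's scraper script.

```python
# Illustrative use of python-dotenv with the variables defined above. The
# ProxyMesh endpoint is a placeholder; substitute the host/port from your plan.
import os

from dotenv import load_dotenv

load_dotenv()  # loads key=value pairs from .env into os.environ

proxy_user = os.getenv("proxymesh_user")
proxy_pass = os.getenv("proxymesh_pass")
proxy_host = "us-ca.proxymesh.com:31280"  # example endpoint only

# proxy_url would then be passed to the scraper's Selenium/proxy setup.
proxy_url = f"http://{proxy_user}:{proxy_pass}@{proxy_host}"
mongodb_uri = os.getenv("MONGODB_URI")

print("Proxy credentials loaded:", bool(proxy_user and proxy_pass))
print("MongoDB URI set:", mongodb_uri is not None)
```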
### Usage

Run the Flask application:

```bash
python Web.py
```
Open your web browser and go to http://localhost:5000 to access the web interface. Click the "Scrape Data" button to trigger the scraping process.
Once the scraping is complete, the trending topics will be displayed on the webpage.
## Live Hosted App
![Project picture](./image/image.png)