Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nisch-mhrzn/scraping
This project scrapes data from Wikipedia about the largest U.S. companies by revenue using Python's requests and BeautifulSoup libraries.
beautifulsoup python requests webscrapping
- Host: GitHub
- URL: https://github.com/nisch-mhrzn/scraping
- Owner: nisch-mhrzn
- Created: 2024-12-04T12:37:57.000Z (29 days ago)
- Default Branch: main
- Last Pushed: 2024-12-04T14:37:57.000Z (29 days ago)
- Last Synced: 2024-12-04T15:34:37.570Z (29 days ago)
- Topics: beautifulsoup, python, requests, webscrapping
- Language: Jupyter Notebook
- Homepage:
- Size: 43 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Web Scraping Project: Largest Companies in the United States by Revenue

## Overview
This project involves scraping data from the Wikipedia page listing the largest companies in the United States by revenue. The data can be used for analysis, visualization, or other purposes.

## Requirements
To run this project, you'll need the following Python libraries:
- `requests`
- `beautifulsoup4`

You can install these libraries using pip:
```bash
pip install requests beautifulsoup4
```

## Usage
1. **Clone the repository** (if applicable):
```bash
git clone https://github.com/nisch-mhrzn/Scraping.git
cd Scraping
```
2. **Data Extraction**:
The script fetches the HTML content from the specified Wikipedia URL and parses it to extract relevant information about the largest companies.

## Code Explanation
The main components of the script include:

- **Fetching the page**:
```python
import requests

url = 'https://en.wikipedia.org/wiki/List_of_largest_companies_in_the_United_States_by_revenue'
page = requests.get(url)
```

- **Parsing the HTML**:
```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(page.text, 'html.parser')
```
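The README stops at the parsing step; the table itself still has to be located and flattened into rows. A minimal sketch of that follow-on step, assuming the ranking is the first `wikitable` on the page and using `pandas` purely for convenience (neither detail is taken from the repository's notebook), might look like this:

```python
import requests
from bs4 import BeautifulSoup
import pandas as pd  # assumed here for tabular output; not listed in the README's requirements

url = 'https://en.wikipedia.org/wiki/List_of_largest_companies_in_the_United_States_by_revenue'
page = requests.get(url)
page.raise_for_status()  # fail early if Wikipedia is unreachable

soup = BeautifulSoup(page.text, 'html.parser')

# Assumption: the revenue ranking is the first wikitable on the page.
table = soup.find('table', class_='wikitable')

# Column names come from the header row; each following row is one company record.
headers = [th.get_text(strip=True) for th in table.find('tr').find_all('th')]
rows = []
for tr in table.find_all('tr')[1:]:
    cells = [td.get_text(strip=True) for td in tr.find_all('td')]
    if cells:
        rows.append(cells)

df = pd.DataFrame(rows)
if rows and len(headers) == len(rows[0]):
    df.columns = headers  # only label columns when the header and data widths agree

df.to_csv('largest_us_companies.csv', index=False)  # optional: persist for later analysis
```

Reading the column labels from the table's own header row keeps the output in sync with Wikipedia even if the column order changes, and guarding the `df.columns` assignment avoids a shape mismatch if the page layout differs from this assumption.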