https://github.com/srijitpatra99/puppeteer
A Node.js API that scrapes ICC Cricket Team Rankings and updates Firebase Realtime Database every 24 hours. 🏏📊
https://github.com/srijitpatra99/puppeteer
expressjs firebase-realtime-database node-cron node-js puppeteer
Last synced: 3 months ago
JSON representation
A Node.js API that scrapes ICC Cricket Team Rankings and updates Firebase Realtime Database every 24 hours. 🏏📊
- Host: GitHub
- URL: https://github.com/srijitpatra99/puppeteer
- Owner: srijitpatra99
- Created: 2022-02-19T06:43:10.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2025-02-23T08:31:15.000Z (4 months ago)
- Last Synced: 2025-02-23T09:25:17.242Z (4 months ago)
- Topics: expressjs, firebase-realtime-database, node-cron, node-js, puppeteer
- Language: JavaScript
- Homepage:
- Size: 30.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# 🏏 Web Scraping with Node.js
A **Node.js API** that scrapes the [Cricket Points Table](https://www.icc-cricket.com/rankings/mens/team-rankings/odi) and stores the **team rankings** and **last updated info** in a **Firebase Realtime Database**. This API continuously checks for changes **every midnight** and automatically updates the database to ensure real-time accuracy.
---
## 🚀 Features
✅ Scrapes **ICC Cricket Team Rankings** (ODI) automatically
✅ Stores scraped data in **Firebase Realtime Database**
✅ Runs every **24 hours** to keep data updated
✅ Uses **Node.js, Cheerio, and Firebase Admin SDK**---
## 📂 Project Structure
```
📦 Web-Scraping-NodeJS
┣ 📜 package.json
┣ 📜 app.js
┣ 📜 firebase-config.json <-- Your Firebase credentials file
┣ 📜 README.md
┗ 📂 node_modules
```---
## 🛠️ Setup & Installation
### 1️⃣ Prerequisites
Ensure you have the following installed:
- **Node.js** (v14+ recommended)
- **NPM** (Node Package Manager)
- A **Firebase Realtime Database** account### 2️⃣ Clone the Repository
```sh
git clone https://github.com/yourusername/web-scraping-nodejs.git
cd web-scraping-nodejs
```### 3️⃣ Install Dependencies
```sh
npm install
```### 4️⃣ Set Up Firebase
- Go to [Firebase Console](https://console.firebase.google.com/)
- Create a **Firebase Realtime Database**
- Generate your **Service Account Key** (JSON file)
- Place the JSON file in the root folder and rename it to **firebase-config.json**### 5️⃣ Run the Scraper
```sh
node app.js
```🎯 The API will now fetch and store data in **Firebase** every **24 hours** automatically.
---
## 📌 Usage
Once running, the scraper will:
1. Fetch the latest **Cricket Team Rankings** data
2. Store it in **Firebase Realtime Database**
3. Automatically update the database every **midnight**---
## 📷 Example Data
```json
{
"last_updated": "2025-02-22T00:00:00Z",
"rankings": [
{
"rank": 1,
"team": "India",
"matches": 25,
"points": 3200,
"rating": 128
},
{
"rank": 2,
"team": "Australia",
"matches": 23,
"points": 2900,
"rating": 126
}
]
}
```---
## 🛠 Technologies Used
🔹 **Node.js** – Backend runtime environment
🔹 **Cheerio.js** – Scraping and parsing HTML
🔹 **Firebase Realtime Database** – Data storage
🔹 **Cron Jobs** – Automating scheduled tasks---
## 📜 License
This project is **open-source** and available under the **MIT License**.
---
🔥 **Enjoy Scraping!** 🚀