An open API service indexing awesome lists of open source software.

https://github.com/sarthak-0-sach/amazon_webscraper_application

A Next.js and Bright Data-powered e-commerce product scraping site. Get notified on price drops and stock status. Automate with cron jobs.
https://github.com/sarthak-0-sach/amazon_webscraper_application

bright-data cheerio headless-ui mongodb nextjs nodemailer responsive tailwind-css web-scraper

Last synced: 6 months ago
JSON representation

A Next.js and Bright Data-powered e-commerce product scraping site. Get notified on price drops and stock status. Automate with cron jobs.

Awesome Lists containing this project

README

          


webscraping
nextjs
tailwindcss
mongodb

## πŸ“‹ Table of Contents

1. πŸ€– [Introduction](#introduction)
2. βš™οΈ [Tech Stack](#tech-stack)
3. πŸ”‹ [Features](#features)
4. 🀸 [Quick Start](#quick-start)
5. πŸ•ΈοΈ [Snippets](#snippets)
6. πŸ”— [Links](#links)

## πŸ€– Introduction

Developed using Next.js and Bright Data's webunlocker, this e-commerce product scraping site is designed to assist users in making informed decisions. It notifies users when a product drops in price and helps competitors by alerting them when the product is out of stock, all managed through cron jobs.

## βš™οΈ Tech Stack

- Next.js
- Bright Data
- Cheerio
- Nodemailer
- MongoDB
- Headless UI
- Tailwind CSS

## πŸ”‹ Features

πŸ‘‰ **Header with Carousel**: Visually appealing header with a carousel showcasing key features and benefits

πŸ‘‰ **Product Scraping**: A search bar allowing users to input Amazon product links for scraping.

πŸ‘‰ **Scraped Projects**: Displays the details of products scraped so far, offering insights into tracked items.

πŸ‘‰ **Scraped Product Details**: Showcase the product image, title, pricing, details, and other relevant information scraped from the original website

πŸ‘‰ **Track Option**: Modal for users to provide email addresses and opt-in for tracking.

πŸ‘‰ **Email Notifications**: Send emails product alert emails for various scenarios, e.g., back in stock alerts or lowest price notifications.

πŸ‘‰ **Automated Cron Jobs**: Utilize cron jobs to automate periodic scraping, ensuring data is up-to-date.

and many more, including code architecture and reusability

## 🀸 Quick Start

Follow these steps to set up the project locally on your machine.

**Prerequisites**

Make sure you have the following installed on your machine:

- [Git](https://git-scm.com/)
- [Node.js](https://nodejs.org/en)
- [npm](https://www.npmjs.com/) (Node Package Manager)

**Cloning the Repository**

```bash
git clone https://github.com/SartHak-0-Sach/Amazon_WebScraper_application
cd Amazon_WebScraper_application
```

**Installation**

Install the project dependencies using npm:

```bash
npm install
```

**Set Up Environment Variables**

Create a new file named `.env` in the root of your project and add the following content:

```env
#SCRAPER
BRIGHT_DATA_USERNAME=
BRIGHT_DATA_PASSWORD=

#DB
MONGODB_URI=

#OUTLOOK
EMAIL_USER=
EMAIL_PASS=
```

Replace the placeholder values with your actual credentials. You can obtain these credentials by signing up on these specific websites from [BrightData](https://brightdata.com/), [MongoDB](https://www.mongodb.com/), and [Node Mailer](https://nodemailer.com/)

**Running the Project**

```bash
npm run dev
```

Open [http://localhost:3000](http://localhost:3000) in your browser to view the project.

## πŸ•ΈοΈ Snippets

cron.route.ts

```typescript
import { NextResponse } from "next/server";

import { getLowestPrice, getHighestPrice, getAveragePrice, getEmailNotifType } from "@/lib/utils";
import { connectToDB } from "@/lib/mongoose";
import Product from "@/lib/models/product.model";
import { scrapeAmazonProduct } from "@/lib/scraper";
import { generateEmailBody, sendEmail } from "@/lib/nodemailer";

export const maxDuration = 300; // This function can run for a maximum of 300 seconds
export const dynamic = "force-dynamic";
export const revalidate = 0;

export async function GET(request: Request) {
try {
connectToDB();

const products = await Product.find({});

if (!products) throw new Error("No product fetched");

// ======================== 1 SCRAPE LATEST PRODUCT DETAILS & UPDATE DB
const updatedProducts = await Promise.all(
products.map(async (currentProduct) => {
// Scrape product
const scrapedProduct = await scrapeAmazonProduct(currentProduct.url);

if (!scrapedProduct) return;

const updatedPriceHistory = [
...currentProduct.priceHistory,
{
price: scrapedProduct.currentPrice,
},
];

const product = {
...scrapedProduct,
priceHistory: updatedPriceHistory,
lowestPrice: getLowestPrice(updatedPriceHistory),
highestPrice: getHighestPrice(updatedPriceHistory),
averagePrice: getAveragePrice(updatedPriceHistory),
};

// Update Products in DB
const updatedProduct = await Product.findOneAndUpdate(
{
url: product.url,
},
product
);

// ======================== 2 CHECK EACH PRODUCT'S STATUS & SEND EMAIL ACCORDINGLY
const emailNotifType = getEmailNotifType(
scrapedProduct,
currentProduct
);

if (emailNotifType && updatedProduct.users.length > 0) {
const productInfo = {
title: updatedProduct.title,
url: updatedProduct.url,
};
// Construct emailContent
const emailContent = await generateEmailBody(productInfo, emailNotifType);
// Get array of user emails
const userEmails = updatedProduct.users.map((user: any) => user.email);
// Send email notification
await sendEmail(emailContent, userEmails);
}

return updatedProduct;
})
);

return NextResponse.json({
message: "Ok",
data: updatedProducts,
});
} catch (error: any) {
throw new Error(`Failed to get all products: ${error.message}`);
}
}
```

generateEmailBody.ts

```typescript
export async function generateEmailBody(
product: EmailProductInfo,
type: NotificationType
) {
const THRESHOLD_PERCENTAGE = 40;
// Shorten the product title
const shortenedTitle =
product.title.length > 20
? `${product.title.substring(0, 20)}...`
: product.title;

let subject = "";
let body = "";

switch (type) {
case Notification.WELCOME:
subject = `Welcome to Price Tracking for ${shortenedTitle}`;
body = `


Welcome to PriceWise πŸš€


You are now tracking ${product.title}.


Here's an example of how you'll receive updates:



${product.title} is back in stock!


We're excited to let you know that ${product.title} is now back in stock.


Don't miss out - buy it now!


Product Image

Stay tuned for more updates on ${product.title} and other products you're tracking.



`;
break;

case Notification.CHANGE_OF_STOCK:
subject = `${shortenedTitle} is now back in stock!`;
body = `


Hey, ${product.title} is now restocked! Grab yours before they run out again!


See the product here.



`;
break;

case Notification.LOWEST_PRICE:
subject = `Lowest Price Alert for ${shortenedTitle}`;
body = `


Hey, ${product.title} has reached its lowest price ever!!


Grab the product here now.



`;
break;

case Notification.THRESHOLD_MET:
subject = `Discount Alert for ${shortenedTitle}`;
body = `


Hey, ${product.title} is now available at a discount more than ${THRESHOLD_PERCENTAGE}%!


Grab it right away from here.



`;
break;

default:
throw new Error("Invalid notification type.");
}

return { subject, body };
}
```

globals.css

```css
@tailwind base;
@tailwind components;
@tailwind utilities;

* {
margin: 0;
padding: 0;
box-sizing: border-box;
scroll-behavior: smooth;
}

@layer base {
body {
@apply font-inter;
}
}

@layer utilities {
.btn {
@apply py-4 px-4 bg-secondary hover:bg-opacity-70 rounded-[30px] text-white text-lg font-semibold;
}

.head-text {
@apply mt-4 text-6xl leading-[72px] font-bold tracking-[-1.2px] text-gray-900;
}

.section-text {
@apply text-secondary text-[32px] font-semibold;
}

.small-text {
@apply flex gap-2 text-sm font-medium text-primary;
}

.paragraph-text {
@apply text-xl leading-[30px] text-gray-600;
}

.hero-carousel {
@apply relative sm:px-10 py-5 sm:pt-20 pb-5 max-w-[560px] h-[700px] w-full bg-[#F2F4F7] rounded-[30px] sm:mx-auto;
}

.carousel {
@apply flex flex-col-reverse h-[700px];
}

.carousel .control-dots {
@apply static !important;
}

.carousel .control-dots .dot {
@apply w-[10px] h-[10px] bg-[#D9D9D9] rounded-full bottom-0 !important;
}

.carousel .control-dots .dot.selected {
@apply bg-[#475467] !important;
}

.trending-section {
@apply flex flex-col gap-10 px-6 md:px-20 py-24;
}

/* PRODUCT DETAILS PAGE STYLES */
.product-container {
@apply flex flex-col gap-16 flex-wrap px-6 md:px-20 py-24;
}

.product-image {
@apply flex-grow xl:max-w-[50%] max-w-full py-16 border border-[#CDDBFF] rounded-[17px];
}

.product-info {
@apply flex items-center flex-wrap gap-10 py-6 border-y border-y-[#E4E4E4];
}

.product-hearts {
@apply flex items-center gap-2 px-3 py-2 bg-[#FFF0F0] rounded-10;
}

.product-stars {
@apply flex items-center gap-2 px-3 py-2 bg-[#FBF3EA] rounded-[27px];
}

.product-reviews {
@apply flex items-center gap-2 px-3 py-2 bg-white-200 rounded-[27px];
}

/* MODAL */
.dialog-container {
@apply fixed inset-0 z-10 overflow-y-auto bg-black bg-opacity-60;
}

.dialog-content {
@apply p-6 bg-white inline-block w-full max-w-md my-8 overflow-hidden text-left align-middle transition-all transform shadow-xl rounded-2xl;
}

.dialog-head_text {
@apply text-secondary text-lg leading-[24px] font-semibold mt-4;
}

.dialog-input_container {
@apply px-5 py-3 mt-3 flex items-center gap-2 border border-gray-300 rounded-[27px];
}

.dialog-input {
@apply flex-1 pl-1 border-none text-gray-500 text-base focus:outline-none border border-gray-300 rounded-[27px] shadow-xs;
}

.dialog-btn {
@apply px-5 py-3 text-white text-base font-semibold border border-secondary bg-secondary rounded-lg mt-8;
}

/* NAVBAR */
.nav {
@apply flex justify-between items-center px-6 md:px-20 py-4;
}

.nav-logo {
@apply font-spaceGrotesk text-[21px] text-secondary font-bold;
}

/* PRICE INFO */
.price-info_card {
@apply flex-1 min-w-[200px] flex flex-col gap-2 border-l-[3px] rounded-10 bg-white-100 px-5 py-4;
}

/* PRODUCT CARD */
.product-card {
@apply sm:w-[292px] sm:max-w-[292px] w-full flex-1 flex flex-col gap-4 rounded-md;
}

.product-card_img-container {
@apply flex-1 relative flex flex-col gap-5 p-4 rounded-md;
}

.product-card_img {
@apply max-h-[250px] object-contain w-full h-full bg-transparent;
}

.product-title {
@apply text-secondary text-xl leading-6 font-semibold truncate;
}

/* SEARCHBAR INPUT */
.searchbar-input {
@apply flex-1 min-w-[200px] w-full p-3 border border-gray-300 rounded-lg shadow-xs text-base text-gray-500 focus:outline-none;
}

.searchbar-btn {
@apply bg-gray-900 border border-gray-900 rounded-lg shadow-xs px-5 py-3 text-white text-base font-semibold hover:opacity-90 disabled:pointer-events-none disabled:cursor-not-allowed disabled:opacity-40;
}
}
```

index.scraper.ts

```typescript
"use server"

import axios from 'axios';
import * as cheerio from 'cheerio';
import { extractCurrency, extractDescription, extractPrice } from '../utils';

export async function scrapeAmazonProduct(url: string) {
if(!url) return;

// BrightData proxy configuration
const username = String(process.env.BRIGHT_DATA_USERNAME);
const password = String(process.env.BRIGHT_DATA_PASSWORD);
const port = 22225;
const session_id = (1000000 * Math.random()) | 0;

const options = {
auth: {
username: `${username}-session-${session_id}`,
password,
},
host: 'brd.superproxy.io',
port,
rejectUnauthorized: false,
}

try {
// Fetch the product page
const response = await axios.get(url, options);
const $ = cheerio.load(response.data);

// Extract the product title
const title = $('#productTitle').text().trim();
const currentPrice = extractPrice(
$('.priceToPay span.a-price-whole'),
$('.a.size.base.a-color-price'),
$('.a-button-selected .a-color-base'),
);

const originalPrice = extractPrice(
$('#priceblock_ourprice'),
$('.a-price.a-text-price span.a-offscreen'),
$('#listPrice'),
$('#priceblock_dealprice'),
$('.a-size-base.a-color-price')
);

const outOfStock = $('#availability span').text().trim().toLowerCase() === 'currently unavailable';

const images =
$('#imgBlkFront').attr('data-a-dynamic-image') ||
$('#landingImage').attr('data-a-dynamic-image') ||
'{}'

const imageUrls = Object.keys(JSON.parse(images));

const currency = extractCurrency($('.a-price-symbol'))
const discountRate = $('.savingsPercentage').text().replace(/[-%]/g, "");

const description = extractDescription($)

// Construct data object with scraped information
const data = {
url,
currency: currency || '$',
image: imageUrls[0],
title,
currentPrice: Number(currentPrice) || Number(originalPrice),
originalPrice: Number(originalPrice) || Number(currentPrice),
priceHistory: [],
discountRate: Number(discountRate),
category: 'category',
reviewsCount:100,
stars: 4.5,
isOutOfStock: outOfStock,
description,
lowestPrice: Number(currentPrice) || Number(originalPrice),
highestPrice: Number(originalPrice) || Number(currentPrice),
averagePrice: Number(currentPrice) || Number(originalPrice),
}

return data;
} catch (error: any) {
console.log(error);
}
}
```

next.config.js

```javascript
/** @type {import('next').NextConfig} */
const nextConfig = {
experimental: {
serverActions: true,
serverComponentsExternalPackages: ['mongoose']
},
images: {
domains: ['m.media-amazon.com']
}
}

module.exports = nextConfig
```

tailwind.config.ts

```typescript
/** @type {import('tailwindcss').Config} */
module.exports = {
content: [
"./pages/**/*.{js,ts,jsx,tsx,mdx}",
"./components/**/*.{js,ts,jsx,tsx,mdx}",
"./app/**/*.{js,ts,jsx,tsx,mdx}",
],
theme: {
extend: {
colors: {
primary: {
DEFAULT: "#E43030",
"orange": "#D48D3B",
"green": "#3E9242"
},
secondary: "#282828",
"gray-200": "#EAECF0",
"gray-300": "D0D5DD",
"gray-500": "#667085",
"gray-600": "#475467",
"gray-700": "#344054",
"gray-900": "#101828",
"white-100": "#F4F4F4",
"white-200": "#EDF0F8",
"black-100": "#3D4258",
"neutral-black": "#23263B",
},
boxShadow: {
xs: "0px 1px 2px 0px rgba(16, 24, 40, 0.05)",
},
maxWidth: {
"10xl": '1440px'
},
fontFamily: {
inter: ['Inter', 'sans-serif'],
spaceGrotesk: ['Space Grotesk', 'sans-serif'],
},
borderRadius: {
10: "10px"
}
},
},
plugins: [],
};
```

types.ts

```typescript
export type PriceHistoryItem = {
price: number;
};

export type User = {
email: string;
};

export type Product = {
_id?: string;
url: string;
currency: string;
image: string;
title: string;
currentPrice: number;
originalPrice: number;
priceHistory: PriceHistoryItem[] | [];
highestPrice: number;
lowestPrice: number;
averagePrice: number;
discountRate: number;
description: string;
category: string;
reviewsCount: number;
stars: number;
isOutOfStock: Boolean;
users?: User[];
};

export type NotificationType =
| "WELCOME"
| "CHANGE_OF_STOCK"
| "LOWEST_PRICE"
| "THRESHOLD_MET";

export type EmailContent = {
subject: string;
body: string;
};

export type EmailProductInfo = {
title: string;
url: string;
};
```

utils.ts

```typescript
import { PriceHistoryItem, Product } from "@/types";

const Notification = {
WELCOME: 'WELCOME',
CHANGE_OF_STOCK: 'CHANGE_OF_STOCK',
LOWEST_PRICE: 'LOWEST_PRICE',
THRESHOLD_MET: 'THRESHOLD_MET',
}

const THRESHOLD_PERCENTAGE = 40;

// Extracts and returns the price from a list of possible elements.
export function extractPrice(...elements: any) {
for (const element of elements) {
const priceText = element.text().trim();

if(priceText) {
const cleanPrice = priceText.replace(/[^\d.]/g, '');

let firstPrice;

if (cleanPrice) {
firstPrice = cleanPrice.match(/\d+\.\d{2}/)?.[0];
}

return firstPrice || cleanPrice;
}
}

return '';
}

// Extracts and returns the currency symbol from an element.
export function extractCurrency(element: any) {
const currencyText = element.text().trim().slice(0, 1);
return currencyText ? currencyText : "";
}

// Extracts description from two possible elements from amazon
export function extractDescription($: any) {
// these are possible elements holding description of the product
const selectors = [
".a-unordered-list .a-list-item",
".a-expander-content p",
// Add more selectors here if needed
];

for (const selector of selectors) {
const elements = $(selector);
if (elements.length > 0) {
const textContent = elements
.map((_: any, element: any) => $(element).text().trim())
.get()
.join("\n");
return textContent;
}
}

// If no matching elements were found, return an empty string
return "";
}

export function getHighestPrice(priceList: PriceHistoryItem[]) {
let highestPrice = priceList[0];

for (let i = 0; i < priceList.length; i++) {
if (priceList[i].price > highestPrice.price) {
highestPrice = priceList[i];
}
}

return highestPrice.price;
}

export function getLowestPrice(priceList: PriceHistoryItem[]) {
let lowestPrice = priceList[0];

for (let i = 0; i < priceList.length; i++) {
if (priceList[i].price < lowestPrice.price) {
lowestPrice = priceList[i];
}
}

return lowestPrice.price;
}

export function getAveragePrice(priceList: PriceHistoryItem[]) {
const sumOfPrices = priceList.reduce((acc, curr) => acc + curr.price, 0);
const averagePrice = sumOfPrices / priceList.length || 0;

return averagePrice;
}

export const getEmailNotifType = (
scrapedProduct: Product,
currentProduct: Product
) => {
const lowestPrice = getLowestPrice(currentProduct.priceHistory);

if (scrapedProduct.currentPrice < lowestPrice) {
return Notification.LOWEST_PRICE as keyof typeof Notification;
}
if (!scrapedProduct.isOutOfStock && currentProduct.isOutOfStock) {
return Notification.CHANGE_OF_STOCK as keyof typeof Notification;
}
if (scrapedProduct.discountRate >= THRESHOLD_PERCENTAGE) {
return Notification.THRESHOLD_MET as keyof typeof Notification;
}

return null;
};

export const formatNumber = (num: number = 0) => {
return num.toLocaleString(undefined, {
minimumFractionDigits: 0,
maximumFractionDigits: 0,
});
};
```

## πŸ”— Links

Assets used in the project are [here](https://drive.google.com/file/d/1v6h993BgYX6axBoIXFbZ9HQAgqbR4PSH/view?usp=sharing)

## Continued development

I aim to enhance my skills in 3D graphics and animation, and further integrate advanced interactivity features in web projects to create engaging user experiences.

## Author

Sarthak Sachdev
- Website - [Sarthak Sachdev](https://itsmesarthak.netlify.app/)
- LeetCode - [@sarthak_sachdev](https://leetcode.com/u/sarthak_sachdev/)
- Twitter - [@sarthak_sach69](https://www.twitter.com/sarthak_sach69)

## Acknowledgments

Special thanks to the vast knowledge base available on YouTube and Stack Overflow, where I learned many of the concepts integrated in this project and got my doubts cleared.

## Got feedback for me?

I love receiving feedback! I am always looking to improve my code and take up new innovative ideas to work upon. So if you have anything you'd like to mention, please email 'hi' at saarsaach30[at]gmail[dot]com.

If you liked this project, make sure to spread the word and share it with all your friends.

**Happy Coding!** βœ¨πŸŽ–οΈ