Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/oxylabs/best-buy-price-tracker
A tutorial for building a scalable price tracker with Python and Oxylabs Best Buy Scraper API to get price change alerts and historical data.
change-monitoring price-tracker price-tracking-system scraper-api web-scraper web-scraping
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/oxylabs/best-buy-price-tracker
- Owner: oxylabs
- Created: 2024-01-11T13:17:02.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-01-11T15:41:28.000Z (12 months ago)
- Last Synced: 2024-04-21T02:04:38.740Z (9 months ago)
- Topics: change-monitoring, price-tracker, price-tracking-system, scraper-api, web-scraper, web-scraping
- Language: Python
- Homepage: https://oxylabs.io/products/scraper-api/ecommerce/bestbuy
- Size: 24.4 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Best Buy Price Tracker
[![Oxylabs promo code](https://user-images.githubusercontent.com/129506779/250792357-8289e25e-9c36-4dc0-a5e2-2706db797bb5.png)](https://oxylabs.go2cloud.org/aff_c?offer_id=7&aff_id=877&url_id=112)
This repository shows how to build a scalable price tracker for Best Buy, one of the largest e-commerce websites for electronics.
The tutorial uses Python and Oxylabs’ [Best Buy API](https://oxylabs.io/products/scraper-api/ecommerce/bestbuy) (a part of Web Scraper API). You can get a **1-week free trial** by registering on the [dashboard](https://dashboard.oxylabs.io/).
For visualizations and in-depth explanations, see our [blog post](https://oxylabs.io/blog/best-buy-price-tracker).
## 1. Installing prerequisite libraries
```bash
pip install requests
pip install pandas
pip install matplotlib
```
## 2. Making the initial request
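The request in this step uses placeholder credentials written directly into the script. As a minimal sketch, they could instead be read from environment variables. The names `OXYLABS_USERNAME` and `OXYLABS_PASSWORD` and the helper `load_credentials` are assumptions made for this example, not something the API requires.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from environment variables.

    The variable names are hypothetical; pick whatever fits your setup.
    """
    username = env.get("OXYLABS_USERNAME")
    password = env.get("OXYLABS_PASSWORD")
    if not username or not password:
        raise RuntimeError(
            "Set OXYLABS_USERNAME and OXYLABS_PASSWORD before running the tracker."
        )
    return username, password
```

With this in place, `USERNAME, PASSWORD = load_credentials()` can stand in for the two hard-coded assignments in the request code.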
```python
import requests

USERNAME = "username"
PASSWORD = "password"

# Structure payload.
payload = {
    'source': 'universal',
    'url': "https://www.bestbuy.com/site/samsung-galaxy-z-flip4-128gb-unlocked-graphite/6512618.p?skuId=6512618&intl=nosplash",
    'geo_location': 'United States',
    'parse': True,
}

# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=(USERNAME, PASSWORD),
    json=payload,
)

print(response.json())
```
## 3. Creating the core of the tracker
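The tracker only needs three fields from the parsed response: the product title, the price, and the currency. The sketch below pulls them out of a hand-written sample dictionary. The sample values are made up, and its shape is an assumption based on the keys this tutorial reads (`results[0]["content"]`, then `title` and `price`). The helper name `extract_product` is hypothetical.

```python
def extract_product(response_json):
    """Pick title, price, and currency out of a parsed API response."""
    content = response_json["results"][0]["content"]
    return {
        "title": content["title"],
        "price": content["price"]["price"],
        "currency": content["price"]["currency"],
    }


# Made-up sample in the assumed response shape; a real response has more fields.
sample_response = {
    "results": [
        {
            "content": {
                "title": "Example Phone 128GB",
                "price": {"price": 499.99, "currency": "USD"},
            }
        }
    ]
}

product = extract_product(sample_response)
print(product)
```

The resulting dictionary is the unit the rest of the tracker stores per product and per day.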
Create a function that reads the historical price tracker data from a JSON file.
```python
import os

import pandas as pd


def read_past_data(filepath):
    results = {}

    # Create an empty file on the first run so later steps can write to it.
    if not os.path.isfile(filepath):
        open(filepath, 'a').close()

    # Load earlier results only if the file already holds data.
    if os.stat(filepath).st_size != 0:
        results_df = pd.read_json(filepath, convert_axes=False)
        results = results_df.to_dict()

    return results
```
With the historical price data loaded, write a function that takes the past price tracker data and adds today's price to it.
```python
from datetime import date


def add_todays_prices(results, tracked_product_links):
    today = date.today()

    for link in tracked_product_links:
        # get_product() wraps the Scraper API request; it is defined in the combined script.
        product = get_product(link)

        if product["title"] not in results:
            results[product["title"]] = {}

        # Store today's price under a readable date key, e.g. "11 January, 2024".
        results[product["title"]][today.strftime("%d %B, %Y")] = {
            "price": product["price"],
            "currency": product["currency"],
        }

    return results
```
With today's prices added, save the results back to the file you started from, which completes the loop.
```python
import pandas as pd


def save_results(results, filepath):
    # Columns are product titles, rows are dates.
    df = pd.DataFrame.from_dict(results)
    df.to_json(filepath)
```
Finally, move the connection to the Scraper API into a separate function and combine everything you have so far.
```python
import os
from datetime import date

import pandas as pd
import requests


def get_product(link):
    USERNAME = "username"
    PASSWORD = "password"

    # Structure payload.
    payload = {
        'source': 'universal',
        'url': link,
        'geo_location': 'United States',
        'parse': True,
    }

    # Get response.
    response = requests.request(
        'POST',
        'https://realtime.oxylabs.io/v1/queries',
        auth=(USERNAME, PASSWORD),
        json=payload,
    )
    # Fail early on HTTP errors such as wrong credentials.
    response.raise_for_status()
    response_json = response.json()

    content = response_json["results"][0]["content"]
    product = {
        "title": content["title"],
        "price": content["price"]["price"],
        "currency": content["price"]["currency"]
    }
    return product


def read_past_data(filepath):
    results = {}

    if not os.path.isfile(filepath):
        open(filepath, 'a').close()

    if os.stat(filepath).st_size != 0:
        results_df = pd.read_json(filepath, convert_axes=False)
        results = results_df.to_dict()

    return results


def save_results(results, filepath):
    df = pd.DataFrame.from_dict(results)
    df.to_json(filepath)


def add_todays_prices(results, tracked_product_links):
    today = date.today()

    for link in tracked_product_links:
        product = get_product(link)

        if product["title"] not in results:
            results[product["title"]] = {}

        results[product["title"]][today.strftime("%d %B, %Y")] = {
            "price": product["price"],
            "currency": product["currency"],
        }

    return results


def main():
    results_file = "data.json"

    tracked_product_links = [
        "https://www.bestbuy.com/site/samsung-galaxy-z-flip4-128gb-unlocked-graphite/6512618.p?skuId=6512618&intl=nosplash",
        "https://www.bestbuy.com/site/samsung-galaxy-z-flip5-256gb-unlocked-graphite/6548838.p?skuId=6548838"
    ]

    past_results = read_past_data(results_file)
    updated_results = add_todays_prices(past_results, tracked_product_links)
    save_results(updated_results, results_file)


if __name__ == "__main__":
    main()
```
## 4. Plotting price history
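The tracker stores dates as strings such as `11 January, 2024`, which do not sort chronologically as text. Below is a minimal sketch of turning one product's history into date-ordered lists before plotting. The helper name `to_sorted_series` and the sample prices are assumptions for illustration.

```python
from datetime import datetime

DATE_FORMAT = "%d %B, %Y"  # Same format the tracker uses for its date keys.


def to_sorted_series(history):
    """Return (dates, prices) for one product, ordered by calendar date."""
    parsed = [
        (datetime.strptime(day, DATE_FORMAT).date(), entry["price"])
        for day, entry in history.items()
    ]
    parsed.sort(key=lambda pair: pair[0])
    dates = [day for day, _ in parsed]
    prices = [price for _, price in parsed]
    return dates, prices


# Made-up history, deliberately out of order.
sample_history = {
    "02 February, 2024": {"price": 450.0, "currency": "USD"},
    "11 January, 2024": {"price": 499.99, "currency": "USD"},
}
dates, prices = to_sorted_series(sample_history)
```

Matplotlib accepts `datetime.date` values on the x-axis, so the returned lists can be passed to `plt.plot` directly. The plotting function in this section uses the stored strings as they are.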
```python
import matplotlib.pyplot as plt


def plot_history_chart(results):
    for product in results:
        dates = []
        prices = []

        for entry_date in results[product]:
            dates.append(entry_date)
            prices.append(results[product][entry_date]["price"])

        # Draw one line per product.
        plt.plot(dates, prices, label=product)

    plt.xlabel("Date")
    plt.ylabel("Price")
    plt.title("Product prices over time")
    plt.legend()
    plt.show()
```
## 5. Creating price drop alerts
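An alert comes down to comparing two prices from a product's history. The sketch below compares the two most recent recorded dates instead of strictly today and yesterday, which is a design assumption that tolerates days when the tracker did not run. The function name `latest_price_change` and the sample data are hypothetical.

```python
from datetime import datetime

DATE_FORMAT = "%d %B, %Y"  # Same format the tracker uses for its date keys.


def latest_price_change(history):
    """Return the latest price minus the previous one, or None with under two entries."""
    if len(history) < 2:
        return None
    ordered = sorted(history, key=lambda day: datetime.strptime(day, DATE_FORMAT))
    previous_day, latest_day = ordered[-2], ordered[-1]
    return history[latest_day]["price"] - history[previous_day]["price"]


# Made-up history with a gap of one day between entries.
sample_history = {
    "09 January, 2024": {"price": 500.0, "currency": "USD"},
    "11 January, 2024": {"price": 450.0, "currency": "USD"},
}
change = latest_price_change(sample_history)
if change is not None and change < 0:
    print(f"Price dropped by {abs(change)}")
```

The function in this section keeps the stricter rule of comparing today's price with yesterday's.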
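```python
from datetime import date, timedelta


def check_for_pricedrop(results):
    today_date = date.today()
    today = today_date.strftime("%d %B, %Y")
    yesterday = (today_date - timedelta(days=1)).strftime("%d %B, %Y")

    for product in results:
        history = results[product]

        # Skip products that lack an entry for either day, e.g. on the first run.
        if not isinstance(history.get(today), dict) or not isinstance(history.get(yesterday), dict):
            continue

        change = history[today]["price"] - history[yesterday]["price"]
        if change < 0:
            print(f'Price for {product} has dropped by {abs(change)}!')
```
## 6. The final code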
```python
import os
from datetime import date, timedelta

import matplotlib.pyplot as plt
import pandas as pd
import requests


def get_product(link):
    USERNAME = "username"
    PASSWORD = "password"

    # Structure payload.
    payload = {
        'source': 'universal',
        'url': link,
        'geo_location': 'United States',
        'parse': True,
    }

    # Get response.
    response = requests.request(
        'POST',
        'https://realtime.oxylabs.io/v1/queries',
        auth=(USERNAME, PASSWORD),
        json=payload,
    )
    # Fail early on HTTP errors such as wrong credentials.
    response.raise_for_status()
    response_json = response.json()

    content = response_json["results"][0]["content"]
    product = {
        "title": content["title"],
        "price": content["price"]["price"],
        "currency": content["price"]["currency"]
    }
    return product


def read_past_data(filepath):
    results = {}

    # Create an empty file on the first run.
    if not os.path.isfile(filepath):
        open(filepath, 'a').close()

    # Load earlier results only if the file already holds data.
    if os.stat(filepath).st_size != 0:
        results_df = pd.read_json(filepath, convert_axes=False)
        results = results_df.to_dict()

    return results


def save_results(results, filepath):
    df = pd.DataFrame.from_dict(results)
    df.to_json(filepath)


def add_todays_prices(results, tracked_product_links):
    today = date.today()

    for link in tracked_product_links:
        product = get_product(link)

        if product["title"] not in results:
            results[product["title"]] = {}

        results[product["title"]][today.strftime("%d %B, %Y")] = {
            "price": product["price"],
            "currency": product["currency"],
        }

    return results


def plot_history_chart(results):
    for product in results:
        dates = []
        prices = []

        for entry_date in results[product]:
            dates.append(entry_date)
            prices.append(results[product][entry_date]["price"])

        # Draw one line per product.
        plt.plot(dates, prices, label=product)

    plt.xlabel("Date")
    plt.ylabel("Price")
    plt.title("Product prices over time")
    plt.legend()
    plt.show()


def check_for_pricedrop(results):
    today_date = date.today()
    today = today_date.strftime("%d %B, %Y")
    yesterday = (today_date - timedelta(days=1)).strftime("%d %B, %Y")

    for product in results:
        history = results[product]

        # Skip products that lack an entry for either day, e.g. on the first run.
        if not isinstance(history.get(today), dict) or not isinstance(history.get(yesterday), dict):
            continue

        change = history[today]["price"] - history[yesterday]["price"]
        if change < 0:
            print(f'Price for {product} has dropped by {abs(change)}!')


def main():
    results_file = "data.json"

    tracked_product_links = [
        "https://www.bestbuy.com/site/samsung-galaxy-z-flip4-128gb-unlocked-graphite/6512618.p?skuId=6512618&intl=nosplash",
        "https://www.bestbuy.com/site/samsung-galaxy-z-flip5-256gb-unlocked-graphite/6548838.p?skuId=6548838"
    ]

    past_results = read_past_data(results_file)
    updated_results = add_todays_prices(past_results, tracked_product_links)
    plot_history_chart(updated_results)
    check_for_pricedrop(updated_results)
    save_results(updated_results, results_file)


if __name__ == "__main__":
    main()
```
## Wrapping up
For all of the API parameters, see our [documentation](https://developers.oxylabs.io/scraper-apis/web-scraper-api/best-buy).
If you need assistance, don't hesitate to contact us at [email protected].