An open API service indexing awesome lists of open source software.

https://github.com/maron09/tiktok-data-miner

This is a script that uses Selenium and Beautiful Soup to mine Data from a Tiktik User's account
https://github.com/maron09/tiktok-data-miner

Last synced: 3 months ago
JSON representation

This is a script that uses Selenium and Beautiful Soup to mine Data from a Tiktik User's account

Awesome Lists containing this project

README

        

# TikTok-Data-Miner
**TikTok Data Scraper**

The TikTok Data Scraper is a Python script that allows you to extract data from TikTok user profiles. It retrieves information such as video views, likes, comments, shares, music, and other details from the specified TikTok user's profile page.

**Prerequisites**

Python 3.x

selenium library

webdriver_manager library

fake_useragent library

beautifulsoup4 library

requests library

xlsxwriter library

**Setup**

Install the required libraries:

selenium: A library for automating web browsers.

webdriver_manager: A library for managing web driver executables.

fake_useragent: A library for generating random user agents.

beautifulsoup4: A library for parsing HTML and XML documents.

requests: A library for making HTTP requests.

xlsxwriter: A library for creating Excel files.

**Import the necessary libraries:**

selenium: Import the WebDriver module for browser automation.

webdriver_manager: Import the ChromeDriverManager for managing the Chrome web driver.

fake_useragent: Import the UserAgent for generating random user agents.

beautifulsoup4: Import the BeautifulSoup module for parsing HTML documents.

requests: Import the requests module for making HTTP requests.

xlsxwriter: Import the xlsxwriter module for creating Excel files.

**Usage**

Initialize the web driver:

Set up the Chrome driver options, such as window size, user agent, and headless mode.
Create a WebDriver instance using webdriver.Chrome and specify the path to the Chrome driver executable.
Specify the TikTok profile URL:

Set the driver.get method with the TikTok profile URL you want to scrape.
Scroll to the end of the page:

Use the driver.execute_script method to scroll the page to the bottom multiple times, allowing all videos to load.
Parse the HTML and extract video information:

Use BeautifulSoup to parse the HTML source of the page.
Find the relevant HTML elements that contain the video information you want to extract.
Extract the desired data, such as video views, likes, comments, shares, and music.
Visit each video page and scrape additional details:

For each video, extract the URL and visit the video page using driver.get.
Parse the HTML of the video page using BeautifulSoup.
Extract additional details, such as likes, comments, shares, and music, from the video page.
Store the scraped data:

Store the extracted data in a data structure, such as a list or dictionary.
Optionally, save the data to a file, such as an Excel spreadsheet, using the xlsxwriter library.
Close the web driver:

Use the driver.quit() method to close the web driver and release system resources.

**Limitations**

The script relies on the structure of the TikTok website, so any changes to the HTML structure or class names may break the scraping process.
The script does not handle exceptions or errors gracefully. Additional error handling and validation should be implemented for a more robust solution.

**Disclaimer**

Please be aware of the legal implications of web scraping. Make sure to review and comply with TikTok's terms of service and respect the privacy of users.
Use this script responsibly and in accordance with TikTok's terms of service and any applicable laws and regulations.