https://github.com/drawrowfly/tiktok-scraper

TikTok Scraper. Download video posts, collect user/trend/hashtag/music feed metadata, sign URLs, and more.
Host: GitHub
URL: https://github.com/drawrowfly/tiktok-scraper
Owner: drawrowfly
Created: 2019-10-23T07:47:40.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2023-05-19T05:56:45.000Z (about 3 years ago)
Last Synced: 2025-05-06T08:50:23.478Z (about 1 year ago)
Language: TypeScript
Homepage:
Size: 31.3 MB
Stars: 4,674
Watchers: 143
Forks: 831
Open Issues: 88
Metadata Files:
- Readme: README.md

# TikTok Scraper & Downloader

![NPM](https://img.shields.io/npm/l/tiktok-scraper.svg?style=for-the-badge) ![npm](https://img.shields.io/npm/v/tiktok-scraper.svg?style=for-the-badge) ![Codacy grade](https://img.shields.io/codacy/grade/b3ef17f5a8504600931abfa60ac01006.svg?style=for-the-badge) ![CI](https://img.shields.io/github/workflow/status/drawrowfly/tiktok-scraper/CI?style=for-the-badge)

Scrape and download useful information from TikTok.

## No login or password is required

This is not an official API. It is a scraper that uses the TikTok Web API to collect media and related metadata.

---

## Content

- [Features](#features)

- [To Do](#to-do)

- [Contribution](#contribution)

- [Installation](#installation)

- [Usage](#usage)

	- [In Terminal](#in-terminal)

	    - [Terminal Examples](https://github.com/drawrowfly/tiktok-scraper/tree/master/examples/CLI/Examples.md)

	    - [Manage Download History](https://github.com/drawrowfly/tiktok-scraper/tree/master/examples/CLI/DownloadHistory.md)

	    - [Scrape and Download in Batch](https://github.com/drawrowfly/tiktok-scraper/tree/master/examples/CLI/BatchDownload.md)

	    - [Output File Example](#output-file-example)

	- [Docker](#docker)

	    - [Build](#build)

	    - [Run](#run)

	- [Module](#module)

	    - [Methods](#methods)

	    - [Options](#options)

	    - [Use with Promises](#promise)

	    - [Use with Events](#event)

	    - [How to get/set session value](#get-set-session)

	    - [How to access/download video](#download-video)

	    - [Output Example](#json-output-example)

	        - [Video Feed Methods](#video-feed)

	        - [getUserProfileInfo](#getUserProfileInfo)

	        - [getHashtagInfo](#getHashtagInfo)

	        - [getVideoMeta](#getVideoMeta)

	        - [getMusicInfo](#getMusicInfo)

## Important notes

- As of right now it is NOT possible to download video without the watermark

## Features

-   Download **unlimited** post metadata from the User, Hashtag, Trends, or Music-Id pages

-   Save post metadata to the JSON/CSV files

-   Download media **with and without the watermark** and save to the ZIP file

-   Download single video without the watermark from the CLI

-   Sign URL to make custom request to the TikTok API

-   Extract metadata from the User, Hashtag and Single Video pages

-   **Save previous progress and download only new videos that weren't downloaded before**. This feature only works from the CLI and only if **download** flag is on.

-   **View and manage previously downloaded posts history in the CLI**

-   Scrape and download user, hashtag, music feeds and single videos specified in the file in batch mode

## To Do

-   [x] CLI: save progress to avoid downloading same videos

-   [x] **Rewrite everything in TypeScript**

-   [x] Improve proxy support

-   [x] Add tests

-   [x] Download video without the watermark

-   [x] Indicate in the output file(csv/json) if the video was downloaded or not

-   [x] Build and run from Docker

-   [x] CLI: Scrape and download in batch

-   [x] CLI: Load proxies from a file

-   [x] CLI: Optional ZIP

-   [x] Renew API

-   [x] Set WebHook URL (CLI)

-   [x] Add new method to collect music metadata

-   [ ] Add Manual Pagination

-   [ ] Improve documentation

-   [ ] Download audio files

-   [ ] Web interface

## Contribution

-   Don't forget about tests

```sh

yarn test

```

```sh

yarn build

```

## Installation

tiktok-scraper requires [Node.js](https://nodejs.org/) v10+ to run.

**Install from NPM**

```sh

npm i -g tiktok-scraper

```

**Install from YARN**

```sh

yarn global add tiktok-scraper

```

## Usage

### In Terminal

```sh

$ tiktok-scraper --help

Usage: tiktok-scraper  [options]

Commands:

  tiktok-scraper user [id]     Scrape videos from username. Enter only username

  tiktok-scraper hashtag [id]  Scrape videos from hashtag. Enter hashtag without #

  tiktok-scraper trend         Scrape posts from current trends

  tiktok-scraper music [id]    Scrape posts from a music id number

  tiktok-scraper video [id]    Download single video without the watermark

  tiktok-scraper history       View previous download history

  tiktok-scraper from-file [file] [async]  Scrape users, hashtags, music, videos mentioned

                                in a file. 1 value per 1 line

Options:

  --version            Show version number                             [boolean]

  --session            Set session cookie value. Sometimes session can be

                       helpful when scraping data from any method  [default: ""]

  --session-file       Set path to the file with list of active sessions. One

                       session per line!                           [default: ""]

  --timeout            Set timeout between requests. Timeout is in Milliseconds:

                       1000 mls = 1 s                               [default: 0]

  --number, -n         Number of posts to scrape. If you will set 0 then all

                       posts will be scraped                        [default: 0]

  --since              Scrape no posts published before this date (timestamp).

                       If set to 0 the filter is deactived          [default: 0]

  --proxy, -p          Set single proxy                            [default: ""]

  --proxy-file         Use proxies from a file. Scraper will use random proxies

                       from the file per each request. 1 line 1 proxy.

                                                                   [default: ""]

  --download, -d       Download video posts to the folder with the name input

                       [id]                           [boolean] [default: false]

  --asyncDownload, -a  Number of concurrent downloads               [default: 5]

  --hd                 Download video in HD. Video size will be x5-x10 times

                       larger and this will affect scraper execution speed. This

                       option only works in combination with -w flag

                                                      [boolean] [default: false]

  --zip, -z            ZIP all downloaded video posts [boolean] [default: false]

  --filepath           File path to save all output files.

      [default: "/Users/karl.wint/Documents/projects/javascript/tiktok-scraper"]

  --filetype, -t       Type of the output file where post information will be

                       saved. 'all' - save information about all posts to the`

                       'json' and 'csv'

                               [choices: "csv", "json", "all", ""] [default: ""]

  --filename, -f       Set custom filename for the output files    [default: ""]

  --noWaterMark, -w    Download video without the watermark. NOTE: With the

                       recent update you only need to use this option if you are

                       scraping Hashtag Feed. User/Trend/Music feeds will have

                       this url by default            [boolean] [default: false]

  --store, -s          Scraper will save the progress in the OS TMP or Custom

                       folder and in the future usage will only download new

                       videos avoiding duplicates     [boolean] [default: false]

  --historypath        Set custom path where history file/files will be stored

                   [default: "/var/folders/d5/fyh1_f2926q7c65g7skc0qh80000gn/T"]

  --remove, -r         Delete the history record by entering "TYPE:INPUT" or

                       "all" to clean all the history. For example: user:bob

                                                                   [default: ""]

  --webHookUrl         Set webhook url to receive scraper result as HTTP

                       requests. For example to your own API       [default: ""]

  --method             Receive data to your webhook url as POST or GET request

                                      [choices: "GET", "POST"] [default: "POST"]

  --help               Show help                                       [boolean]

Examples:

  tiktok-scraper user USERNAME -d -n 100 --session sid_tt=dae32131231

  tiktok-scraper trend -d -n 100 --session sid_tt=dae32131231

  tiktok-scraper hashtag HASHTAG_NAME -d -n 100 --session sid_tt=dae32131231

  tiktok-scraper music MUSIC_ID -d -n 50 --session sid_tt=dae32131231

  tiktok-scraper video https://www.tiktok.com/@tiktok/video/6807491984882765062 -d

  tiktok-scraper history

  tiktok-scraper history -r user:bob

  tiktok-scraper history -r all

  tiktok-scraper from-file BATCH_FILE ASYNC_TASKS -d

```

- [Terminal Examples](https://github.com/drawrowfly/tiktok-scraper/tree/master/examples/CLI/Examples.md)

- [Manage Download History](https://github.com/drawrowfly/tiktok-scraper/tree/master/examples/CLI/DownloadHistory.md)

- [Scrape and Download in Batch](https://github.com/drawrowfly/tiktok-scraper/tree/master/examples/CLI/BatchDownload.md)

### Output File Example

![Demo](https://i.imgur.com/6gIbBzo.png)

## Docker

When running in Docker you cannot use --filepath and --historypath. Instead, use -v to mount a volume (**the host path where all files will be saved**).

##### Build

```sh

docker build . -t tiktok-scraper

```

##### Run

**Example 1:**

All files, including the history file, will be saved in the directory (\$pwd) from which you run Docker

```sh

docker run -v $(pwd):/usr/app/files tiktok-scraper user tiktok -d -n 5 -s

```

**Example 2:**

All files, including the history file, will be saved in /User/blah/downloads

```sh

docker run -v /User/blah/downloads:/usr/app/files tiktok-scraper user tiktok -d -n 5 -s

```

## Module

### Methods

```javascript
.user(id, options) // Scrape posts from a specific user (Promise)
.hashtag(id, options) // Scrape posts from the hashtag section (Promise)
.trend('', options) // Scrape posts from the trends section (Promise)
.music(id, options) // Scrape posts by music id (Promise)

.userEvent(id, options) // Scrape posts from a specific user (Event)
.hashtagEvent(id, options) // Scrape posts from the hashtag section (Event)
.trendEvent('', options) // Scrape posts from the trends section (Event)
.musicEvent(id, options) // Scrape posts by music id (Event)

.getUserProfileInfo('USERNAME', options) // Get user profile information
.getHashtagInfo('HASHTAG', options) // Get hashtag information
.signUrl('URL', options) // Get signature for the request
.getVideoMeta('WEB_VIDEO_URL', options) // Get video meta info, including video url without the watermark
.getMusicInfo('https://www.tiktok.com/music/original-sound-6801885499343571718', options) // Get music metadata
```

### Options

```javascript
const options = {
    // Number of posts to scrape: {int default: 20}
    number: 50,

    // Scrape posts published since this date: {int default: 0}
    since: 0,

    // Set session: {string[] default: ['']}
    // An authenticated session cookie value is required to scrape the user/trending/music/hashtag feed
    // You can list any number of sessions; each request selects a random session from the list
    sessionList: ['sid_tt=21312213'],

    // Set proxy: {string[] | string default: ''}
    // http proxy: 127.0.0.1:8080
    // socks proxy: socks5://127.0.0.1:8080
    // You can pass proxies as an array and the scraper will randomly select a proxy from the array to execute the requests
    proxy: '',

    // Set to {true} to search by user id: {boolean default: false}
    by_user_id: false,

    // How many posts should be downloaded asynchronously. Only if {download: true}: {int default: 5}
    asyncDownload: 5,

    // How many posts should be scraped asynchronously: {int default: 3}
    // This option applies only to the music and hashtag types
    // For other types it is always 1, because each TikTok API response provides the "maxCursor" value
    // that is required to send the next request
    asyncScraping: 3,

    // File path where all files will be saved: {string default: 'CURRENT_DIR'}
    filepath: `CURRENT_DIR`,

    // Custom file name for the output files: {string default: ''}
    fileName: '',

    // Post information can be saved to CSV or JSON files: {string default: 'na'}
    // 'csv' to save in csv
    // 'json' to save in json
    // 'all' to save in json and csv
    // 'na' to skip this step
    filetype: `na`,

    // Set custom headers: user-agent, cookie, etc.
    // NOTE: When you parse a video feed or single video metadata, the result includes the {headers} object
    // that was used to extract the information. To access and download a video through the received {videoUrl} value, you need to use the same headers
    headers: {
        'user-agent': "BLAH",
        referer: 'https://www.tiktok.com/',
        cookie: `tt_webid_v2=68dssds`,
    },

    // Download video without the watermark: {boolean default: false}
    // Set to true to download without the watermark
    // This option will affect the execution speed
    noWaterMark: false,

    // Create link to HD video: {boolean default: false}
    // This option will only work if {noWaterMark} is set to {true}
    hdVideo: false,

    // verifyFp is used to verify the request and avoid captcha
    // When you use a proxy, there is a high chance that the request will be
    // blocked with captcha
    // You can set your own verifyFp value; otherwise the default (hardcoded) value is used
    verifyFp: '',

    // Switch main host to the TikTok test endpoint
    // When your requests are blocked by captcha, you can try the TikTok test endpoints
    useTestEndpoints: false
};
```

Don't forget to check the **examples** folder
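The constraints mentioned in the option comments can be checked before a scrape starts. The following is a minimal sketch, not part of tiktok-scraper: `checkOptions` is a hypothetical helper, and the rules it applies are taken only from the comments above (`filetype` accepts `csv`, `json`, `all` or `na`; `hdVideo` works only together with `noWaterMark`; `sessionList` is an array).

```javascript
// Hypothetical helper (not part of tiktok-scraper): returns a list of
// human-readable problems found in an options object.
const ALLOWED_FILETYPES = ['csv', 'json', 'all', 'na'];

function checkOptions(options = {}) {
    const problems = [];

    // filetype is documented as 'csv', 'json', 'all' or 'na'.
    if (options.filetype !== undefined && !ALLOWED_FILETYPES.includes(options.filetype)) {
        problems.push(`filetype must be one of: ${ALLOWED_FILETYPES.join(', ')}`);
    }

    // hdVideo is documented to work only together with noWaterMark.
    if (options.hdVideo && !options.noWaterMark) {
        problems.push('hdVideo has no effect unless noWaterMark is true');
    }

    // sessionList is documented as an array of cookie strings.
    if (options.sessionList !== undefined && !Array.isArray(options.sessionList)) {
        problems.push('sessionList must be an array of strings');
    }

    return problems;
}

// Prints the problems found in a deliberately invalid options object.
console.log(checkOptions({ filetype: 'xml', hdVideo: true }));
```

An empty array means none of these documented constraints is violated; it says nothing about whether TikTok will accept the request.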

### Promise

```javascript

const TikTokScraper = require('tiktok-scraper');

// User feed by username

(async () => {

    try {

        const posts = await TikTokScraper.user('USERNAME', {

            number: 100,

            sessionList: ['sid_tt=58ba9e34431774703d3c34e60d584475;']

        });

        console.log(posts);

    } catch (error) {

        console.log(error);

    }

})();

// User feed by user id

// Some TikTok user ids are larger than Number.MAX_SAFE_INTEGER, so pass the user id as a string

(async () => {

    try {

        const posts = await TikTokScraper.user(`USER_ID`, {

            number: 100,

            by_user_id: true,

            sessionList: ['sid_tt=58ba9e34431774703d3c34e60d584475;']

        });

        console.log(posts);

    } catch (error) {

        console.log(error);

    }

})();

// Trending feed

(async () => {

    try {

        const posts = await TikTokScraper.trend('', {

            number: 100,

            sessionList: ['sid_tt=58ba9e34431774703d3c34e60d584475;']

        });

        console.log(posts);

    } catch (error) {

        console.log(error);

    }

})();

// Hashtag feed

(async () => {

    try {

        const posts = await TikTokScraper.hashtag('HASHTAG', {

            number: 100,

            sessionList: ['sid_tt=58ba9e34431774703d3c34e60d584475;']

        });

        console.log(posts);

    } catch (error) {

        console.log(error);

    }

})();

// Get single user profile information: Number of followers and etc

// input - USERNAME

// options - not required

(async () => {

    try {

        const user = await TikTokScraper.getUserProfileInfo('USERNAME', options);

        console.log(user);

    } catch (error) {

        console.log(error);

    }

})();

// Get single hashtag information: Number of views and etc

// input - HASHTAG NAME

// options - not required

(async () => {

    try {

        const hashtag = await TikTokScraper.getHashtagInfo('HASHTAG', options);

        console.log(hashtag);

    } catch (error) {

        console.log(error);

    }

})();

// Get single video metadata

// input - WEB_VIDEO_URL

// For example: https://www.tiktok.com/@tiktok/video/6807491984882765062

// options - not required

(async () => {

    try {

        const videoMeta = await TikTokScraper.getVideoMeta('https://www.tiktok.com/@tiktok/video/6807491984882765062', options);

        console.log(videoMeta);

    } catch (error) {

        console.log(error);

    }

})();

```
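The note about `MAX_SAFE_INTEGER` in the user id example can be verified with plain JavaScript. This sketch is independent of tiktok-scraper: `isSafeAsNumber` is a hypothetical helper, and the two ids are sample values taken from the output examples in this README.

```javascript
// Hypothetical helper: true when the id lies in the range that JavaScript
// numbers can represent exactly.
function isSafeAsNumber(id) {
    return Number.isSafeInteger(Number(id));
}

const shortId = '107955';
const longId = '6812221792183403526';

console.log(isSafeAsNumber(shortId)); // true
console.log(isSafeAsNumber(longId)); // false: larger than Number.MAX_SAFE_INTEGER

// Keeping the id as a string (or converting it to a BigInt) preserves every digit.
console.log(BigInt(longId).toString() === longId); // true
```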

### Event

```javascript

const TikTokScraper = require('tiktok-scraper');

const users = TikTokScraper.userEvent("tiktok", { number: 30 });

users.on('data', json => {

    //data in JSON format

});

users.on('done', () => {

    //completed

});

users.on('error', error => {

    //error message

});

users.scrape();

const hashtag = TikTokScraper.hashtagEvent("summer", { number: 250, proxy: 'socks5://1.1.1.1:90' });

hashtag.on('data', json => {

    //data in JSON format

});

hashtag.on('done', () => {

    //completed

});

hashtag.on('error', error => {

    //error message

});

hashtag.scrape();

```
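The `data`, `done` and `error` events can be wrapped in a Promise when the caller wants the whole result at once. This is a minimal sketch under assumptions: `collectPosts` and `createFakeScraper` are hypothetical helpers, and the `EventEmitter` below is a stand-in that only imitates the `on`/`scrape` interface shown above, so the snippet runs without network access.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: gathers every 'data' payload until 'done' fires.
function collectPosts(scraper) {
    return new Promise((resolve, reject) => {
        const posts = [];
        scraper.on('data', (post) => posts.push(post));
        scraper.on('done', () => resolve(posts));
        scraper.on('error', reject);
        scraper.scrape();
    });
}

// Stand-in for an object such as TikTokScraper.userEvent(...):
// emits two fake posts and then 'done'.
function createFakeScraper() {
    const fake = new EventEmitter();
    fake.scrape = () => {
        fake.emit('data', { id: '1' });
        fake.emit('data', { id: '2' });
        fake.emit('done');
    };
    return fake;
}

collectPosts(createFakeScraper()).then((posts) => {
    console.log(`collected ${posts.length} posts`);
});
```

With the real module, the object returned by `TikTokScraper.userEvent('tiktok', { number: 30 })` would take the place of the fake scraper, assuming it emits events as in the example above.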

### Get Set Session

**NOT REQUIRED**

**A common problem is TikTok blacklisting your IP or proxy. In that case, try setting a session; it increases the chances of success.**

Get the session:

- Open https://www.tiktok.com/ in any browser

- Log in to your account

- Right-click -> Inspect -> Network

- Refresh the page -> select any request made to TikTok -> go to the Request Headers section -> Cookies

- Find the **sid_tt** value in the cookies. It usually looks like this: **sid_tt=521kkadkasdaskdj4j213j12j312;**

- **sid_tt=521kkadkasdaskdj4j213j12j312;** is your authenticated session cookie value, which should be used to scrape the user/hashtag/music/trending feed

Set the session:

- **CLI**:

    -  Set single session by using option **--session**. For example **--session sid_tt=521kkadkasdaskdj4j213j12j312;**

    -  Set path to the file with the list of sessions by using option **--session-file**. For example **--session-file /var/bob/sessionList.txt**

        - Example content /var/bob/sessionList.txt:

        ```

        sid_tt=521kkadkasdaskdj4j213j12j312;

        sid_tt=521kkadkasdaskdj4j213j12j312;

        sid_tt=521kkadkasdaskdj4j213j12j312;

        sid_tt=521kkadkasdaskdj4j213j12j312;

        ```

- In the **MODULE**, set the session through the sessionList option. For example **sessionList:["sid_tt=521kkadkasdaskdj4j213j12j312;", "sid_tt=12312312312312;"]**
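When the module is used instead of the CLI, a session file in the same one-session-per-line format can be turned into a `sessionList` array. A minimal sketch, assuming a plain UTF-8 text file; `loadSessionList` is a hypothetical helper and not part of tiktok-scraper.

```javascript
const fs = require('fs');

// Hypothetical helper: reads one session cookie per line and skips blank lines.
function loadSessionList(filePath) {
    return fs
        .readFileSync(filePath, 'utf8')
        .split(/\r?\n/)
        .map((line) => line.trim())
        .filter((line) => line.length > 0);
}

// Usage (the path is the example path from the list above):
// const options = { sessionList: loadSessionList('/var/bob/sessionList.txt') };
```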

### Download Video

**This part is related to the MODULE usage (NOT THE CLI)**

The **{videoUrl}** value is bound to the cookie value **{tt_webid_v2}**, which can contain **any value**

#### Method 1: default headers

When you extract videos from a user, hashtag, music or trending feed, or from a single video, the response contains a **headers** object alongside the video metadata. It holds the parameters that were used to extract the data. **To access or download a video through its {videoUrl} value, you must send the same {headers} values**.

```json
{
    "headers": {
        "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.80 Safari/537.36",
        "referer": "https://www.tiktok.com/",
        "cookie": "tt_webid_v2=689854141086886123"
    }
}
```

#### Method 2: custom headers

You can pass your own headers with the **{options}**.

```javascript
const TikTokScraper = require('tiktok-scraper');

const headers = {
    'user-agent': 'BOB',
    referer: 'https://www.tiktok.com/',
    cookie: 'tt_webid_v2=BOB',
};

// Pass the same custom headers to whichever method you call
TikTokScraper.getVideoMeta('WEB_VIDEO_URL', { headers });
TikTokScraper.user('USERNAME', { headers });
TikTokScraper.hashtag('HASHTAG', { headers });
TikTokScraper.trend('', { headers });
TikTokScraper.music('MUSIC_ID', { headers });

// Afterwards you can access the video through the {videoUrl} value by using the same custom headers
```
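Reusing the headers for the actual download can be sketched with Node's built-in `https` module. This is an illustration under assumptions, not the library's own download code: `listDownloads` and `downloadVideo` are hypothetical helpers, the result is assumed to have the `headers` and `collector` fields described in this README, and redirects are not handled.

```javascript
const fs = require('fs');
const https = require('https');

// Hypothetical helper: pairs every video URL with the headers the scraper used.
function listDownloads(result) {
    return result.collector.map((post) => ({
        id: post.id,
        videoUrl: post.videoUrl,
        headers: result.headers,
    }));
}

// Hypothetical helper: streams one video to disk, sending the same headers.
function downloadVideo({ videoUrl, headers }, destPath) {
    return new Promise((resolve, reject) => {
        https
            .get(videoUrl, { headers }, (res) => {
                if (res.statusCode !== 200) {
                    res.resume(); // discard the body so the socket is released
                    reject(new Error(`Unexpected status code: ${res.statusCode}`));
                    return;
                }
                const file = fs.createWriteStream(destPath);
                res.pipe(file);
                file.on('finish', () => resolve(destPath));
                file.on('error', reject);
            })
            .on('error', reject);
    });
}
```

Each entry returned by `listDownloads` can be passed to `downloadVideo` together with a target file name of your choice.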

### Json Output Example

##### Video Feed

Example output for the methods: **user, hashtag, trend, music, userEvent, hashtagEvent, musicEvent, trendEvent**

```javascript

{

    headers: {

        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.80 Safari/537.36',

        referer: 'https://www.tiktok.com/',

        cookie: 'tt_webid_v2=689854141086886123'

    },

    collector:[{

        id: 'VIDEO_ID',

        text: 'CAPTION',

        createTime: '1583870600',

        authorMeta:{

            id: 'USER ID',

            name: 'USERNAME',

            following: 195,

            fans: 43500,

            heart: '1093998',

            video: 3,

            digg: 95,

            verified: false,

            private: false,

            signature: 'USER BIO',

            avatar:'AVATAR_URL'

        },

        musicMeta:{

            musicId: '6808098113188120838',

            musicName: 'blah blah',

            musicAuthor: 'blah',

            musicOriginal: true,

            playUrl: 'SOUND/MUSIC_URL',

        },

        covers:{

            default: 'COVER_URL',

            origin: 'COVER_URL',

            dynamic: 'COVER_URL'

        },

        imageUrl:'IMAGE_URL',

        videoUrl:'VIDEO_URL',

        videoUrlNoWaterMark:'VIDEO_URL_WITHOUT_THE_WATERMARK',

        videoMeta: { width: 480, height: 864, ratio: 14, duration: 14 },

        diggCount: 2104,

        shareCount: 1,

        playCount: 9007,

        commentCount: 50,

        mentions: ['@bob', '@sam', '@bob_again', '@and_sam_again'],

        hashtags:

        [{

            id: '69573911',

            name: 'PlayWithLife',

            title: 'HASHTAG_TITLE',

            cover: [Array]

        }...],

        downloaded: true

    }...],

    //If {filetype} and {download} options are enabled then:

    zip: '/{CURRENT_PATH}/user_1552963581094.zip',

    json: '/{CURRENT_PATH}/user_1552963581094.json',

    csv: '/{CURRENT_PATH}/user_1552963581094.csv'

}

```
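A result in this shape can be post-processed with ordinary array methods. The sketch below is illustrative only: `summarizeFeed` is a hypothetical helper, and it assumes that `createTime` is a Unix timestamp in seconds stored as a string, as the sample value suggests. The sample numbers are copied from the output examples in this README.

```javascript
// Hypothetical helper: counts posts created at or after `sinceSeconds`,
// totals their play counts and finds the most-liked one.
function summarizeFeed(result, sinceSeconds = 0) {
    const recent = result.collector.filter((post) => Number(post.createTime) >= sinceSeconds);
    const totalPlays = recent.reduce((sum, post) => sum + post.playCount, 0);
    const mostLiked = recent.reduce(
        (best, post) => (best === null || post.diggCount > best.diggCount ? post : best),
        null
    );
    return { posts: recent.length, totalPlays, mostLikedId: mostLiked ? mostLiked.id : null };
}

const sample = {
    collector: [
        { id: 'a', createTime: '1583870600', playCount: 9007, diggCount: 2104 },
        { id: 'b', createTime: '1584992742', playCount: 614678, diggCount: 49292 },
    ],
};

// Only post 'b' was created after the cut-off used here.
console.log(summarizeFeed(sample, 1584000000));
```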

##### getUserProfileInfo

```javascript

{

    secUid: 'MS4wLjABAAAAv7iSuuXDJGDvJkmH_vz1qkDZYo1apxgzaxdBSeIuPiM',

    userId: '107955',

    isSecret: false,

    uniqueId: 'tiktok',

    nickName: 'TikTok',

    signature: 'Make Your Day',

    covers: ['COVER_URL'],

    coversMedium: ['COVER_URL'],

    following: 490,

    fans: 38040567,

    heart: '211522962',

    video: 93,

    verified: true,

    digg: 29,

}

```

##### getHashtagInfo

```javascript

{

    challengeId: '4231',

    challengeName: 'love',

    text: '',

    covers: [],

    coversMedium: [],

    posts: 66904972,

    views: '194557706433',

    isCommerce: false,

    splitTitle: ''

}

```
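In this output `views` is a string while `posts` is a number. One way to do arithmetic on both without assuming that every counter fits in a JavaScript number is `BigInt`. A small sketch with a hypothetical helper, `averageViewsPerPost`, using the sample values above:

```javascript
// Hypothetical helper: integer average of views per post, computed with BigInt.
function averageViewsPerPost(hashtagInfo) {
    const posts = BigInt(hashtagInfo.posts);
    if (posts === 0n) {
        return 0n; // avoid dividing by zero for an unused hashtag
    }
    return BigInt(hashtagInfo.views) / posts;
}

const info = { challengeName: 'love', posts: 66904972, views: '194557706433' };
console.log(averageViewsPerPost(info).toString()); // 2907
```

BigInt division truncates toward zero, so the result is a whole number of views.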

##### getVideoMeta

```javascript

{

    headers: {

        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.80 Safari/537.36',

        referer: 'https://www.tiktok.com/',

        cookie: 'tt_webid_v2=689854141086886123'

    },

    collector:[{

        id: '6807491984882765062',

        text: 'We’re kicking off the #happyathome live stream series today at 5pm PT!',

        createTime: '1584992742',

        authorMeta: { id: '6812221792183403526', name: 'blah' },

        musicMeta:{

            musicId: '6822233276137213677',

            musicName: 'blah',

            musicAuthor: 'blah'

        },

        imageUrl: 'IMAGE_URL',

        videoUrl: 'VIDEO_URL',

        videoUrlNoWaterMark: 'VIDEO_URL_WITHOUT_THE_WATERMARK',

        videoMeta: { width: 480, height: 864, ratio: 14, duration: 14 },

        covers:{

            default: 'COVER_URL',

            origin: 'COVER_URL'

        },

        diggCount: 49292,

        shareCount: 339,

        playCount: 614678,

        commentCount: 4023,

        downloaded: false,

        hashtags: [],

    }]

}

```

##### getMusicInfo

```javascript

{

    music: {

        id: '6882925279036066566',

        title: 'doja x calabria',

        playUrl: 'dfdfdfdf',

        coverThumb:

            'dfdfdf',

        coverMedium:

            'dfdfdf',

        coverLarge:

            'fdfdf',

        authorName: 'bryce',

        original: true,

        playToken:

            'ffdfdf',

        keyToken: 'dfdfdfd',

        audioURLWithcookie: false,

        private: false,

        duration: 46,

        album: '',

    },

    author: {

        id: '6835300004094166021',

        uniqueId: 'mashupsbybryce',

        nickname: 'bryce',

        avatarThumb:

            'dfdfd',

        avatarMedium:

            'dfdfdf',

        avatarLarger:

            'dfdfdf',

        signature: 'hi ily :)\n70k sounds cool tbh\n👇follow my soundcloud & insta👇',

        verified: false,

        secUid: 'MS4wLjABAAAA1_5bjLAamayD4rv3q49qJGa_7dZ5jzExTO0ozOybqIwwhw5TAg_iM25lkO94DM3K',

        secret: false,

        ftc: false,

        relation: 0,

        openFavorite: false,

        commentSetting: 0,

        duetSetting: 0,

        stitchSetting: 0,

        privateAccount: false,

    },

    stats: { videoCount: 361700 },

    shareMeta: {

        title: 'bryceyouloser | ♬ doja x calabria | on TikTok',

        desc: '361.0k videos - Watch awesome short ' + 'videos created with ♬ doja x calabria',

    },

};

```



---

## License

**MIT**

**Free Software**