https://github.com/bigdata5911/patreon-video-scrapper

This is scrapper to get post metadata without downloading and download external video file and embedded youtube video with given post url from patreon
https://github.com/bigdata5911/patreon-video-scrapper

patreon patreon-api patreon-client patreon-scraper scraper youtube-downloader yt-dl

Last synced: 7 months ago
JSON representation

This is scrapper to get post metadata without downloading and download external video file and embedded youtube video with given post url from patreon

Host: GitHub
URL: https://github.com/bigdata5911/patreon-video-scrapper
Owner: BigData5911
Created: 2025-01-08T19:47:09.000Z (9 months ago)
Default Branch: main
Last Pushed: 2025-01-21T21:29:23.000Z (9 months ago)
Last Synced: 2025-01-21T22:29:25.185Z (9 months ago)
Topics: patreon, patreon-api, patreon-client, patreon-scraper, scraper, youtube-downloader, yt-dl
Language: TypeScript
Homepage:
Size: 84 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

          
##  Prequisites

- [How to obtain patreon cookies](https://github.com/patrickkfkan/patreon-dl/wiki/How-to-obtain-Cookie)

- [Youtube external download provider](https://github.com/patrickkfkan/patreon-dl/blob/master/docs/Library.md#external-downloaders)

## Technical Stack

- **Runtime**: Node.js

- **Language**: TypeScript

- **Library**:

  - patreon-dl

  - youtube-dl-exec

  - FFMPEG

## Environment Setup

```bash

PATREON_COOKIE='your_patreon_cookie'

YOUTBUE_COOKIE_JSON_PATH='path/to/yt-credentials.json' 

PATREON_OUTPUT_DIR='./output'

```

## Use case

- Download patreon posts

```javascript

// [supported url](https://github.com/patrickkfkan/patreon-dl/blob/master/README.md#url)

const postUrl = 'https://www.patreon.com/c/kobeon/posts';

// fetching posts from patreon and save it to json file

getPosts(postUrl).then(posts => {

    // save posts to a json file

    saveJsonToFile(posts, 'posts.json');

}).catch(console.error);

// Download the post from patreon

downloadFromPatreon(postUrl, "output_dir");

```

- Get patreon posts' metadata without downloading with given post ids

```javascript

import { loadJsonFile, saveJsonToFile } from './utils/fileUtils';

import { getPostMetadata } from './utils/patreonUtils';

import { sleep } from './utils/sleepUtils'

async function main() {

    try {

        const postMetadataList = [];

        const failedPostIds = [];

        

        const postIds = loadJsonFile('postIds.json');

        const totalCount = postIds.length;

        // Use Promise.allSettled to handle all requests concurrently

        const promises = postIds.map(async (postId: string, index: number) => {

            console.log(`Processing ${index + 1}/${totalCount} ${postId}...`);

            try {

                const metadata = await getPostMetadata(postId);

                // sleep 3 seconds to avoid rate limit

                await sleep(1000)

                return { postId, success: true, metadata };

            } catch (error) {

                console.error(`Failed to process ${postId}:`, error);

                return { success: false, postId };

            }

        });

        // Wait for all promises to settle

        const results = await Promise.allSettled(promises);

        // Process the results

        for (const result of results) {

            if (result.status === 'fulfilled') {

                console.log(result.value.postId)

                postMetadataList.push(result.value.metadata);

            } else {

                failedPostIds.push(result.reason.postId);

            }

        }

        // Save json file

        saveJsonToFile(postMetadataList, 'postMetadataList.json');

        saveJsonToFile(failedPostIds, 'failedPostIds.json');

    } catch (err) {

        console.error('Error:', err);

    }

}

// Execute the main function

main();

```

- Extract youtube video url from retrieved post metadata list

```javascript

// Extracting youtube video url from retrieved post metadata

const postList = loadJsonFile("postMetadataList.json")

// Assuming postList is defined and is an array of post objects

const youtubeVideoUrlList = [];

const oddPostList = [];

// Iterate over each post in the postList

for (const post of postList) {

    const attributes = post.data?.attributes; // Accessing attributes directly

    const embed = attributes?.embed; // Accessing embed directly

    // Check if the embed provider is YouTube and push the URL to the list

    if (embed?.provider === "YouTube") {

        youtubeVideoUrlList.push(embed.url); // Accessing URL directly

    } else {

        oddPostList.push(post);

    }

}

// Log the collected YouTube video URLs

console.log('YouTube Video URLs:', youtubeVideoUrlList);

// Save json file

saveJsonToFile(youtubeVideoUrlList, "embedded_youtube_video_url_list.json")

saveJsonToFile(oddPostList, "odd_post_list.json")

```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/bigdata5911/patreon-video-scrapper

Awesome Lists containing this project

README