
https://github.com/khanhtc1202/animu-crawling-system

animu crawling system

anime crawling media-server rss-listener


Host: GitHub
URL: https://github.com/khanhtc1202/animu-crawling-system
Owner: khanhtc1202
Created: 2018-01-12T09:50:33.000Z (about 7 years ago)
Default Branch: master
Last Pushed: 2018-06-02T07:29:57.000Z (over 6 years ago)
Last Synced: 2024-10-28T07:31:28.359Z (3 months ago)
Topics: anime, crawling, media-server, rss-listener
Language: JavaScript
Homepage:
Size: 343 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# Animu Crawling System

## System features

1. Watches RSS feeds for new animu videos

2. Automatically downloads animu videos via torrent

3. Serves the downloaded media in the browser

## System structure

```bash
.
├── README.md
├── dummy_files
├── pacman
├── logs
├── resources
├── scripts
├── server
└── media
```

1. `pacman`       => RSS listener; configure the RSS list in `resources/feeds.json`

2. `logs`         => RSS listener logs

3. `resources`    => stores the RSS list

4. `scripts`      => system scripts (such as the download script)

5. `server`       => media server

6. `media`        => stores the downloaded videos

7. `dummy_files`  => store *

## How to run

### Required packages

#### For the `pacman` RSS listener

This service runs on Python 2.7 and uses the following libraries:

1. requests (install via `pip`)

2. python-bs4 (install via a package manager, such as `apt`)

3. logging (part of the Python standard library, so no separate install is needed)

Installing Python libraries with an outdated version of `pip` can cause security problems. Update `pip` to the latest version to avoid them.

#### For the `server` media service

This service runs on Node.js. Run `npm install` the first time you run the project to install its dependencies.

To use the `npm start` command, install the [nodemon](https://www.npmjs.com/package/nodemon) package globally on your host.

#### For the `scripts` that download videos via torrent

Install the `aria2` package with your package manager, such as `apt`. More information is available [here](https://aria2.github.io/).
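As a rough illustration of what a download script in `scripts` might do, the Python sketch below builds an `aria2c` command line for one .torrent URL. The helper names `build_aria2_command` and `download`, the default `media` target directory, and the choice of `--seed-time=0` are assumptions made for this example; they are not taken from the repository's actual script.

```python
import subprocess


def build_aria2_command(torrent_url, media_dir="media"):
    """Return the argv list for downloading one torrent with aria2c.

    Hypothetical helper: the real download script may use other options.
    """
    return [
        "aria2c",
        "--dir=" + media_dir,  # directory where downloaded files are saved
        "--seed-time=0",       # do not keep seeding after the download ends
        torrent_url,
    ]


def download(torrent_url, media_dir="media"):
    """Run aria2c and return its exit code (0 means success)."""
    return subprocess.call(build_aria2_command(torrent_url, media_dir))
```

Keeping the command construction separate from the call makes it easy to log or inspect the exact command before running it.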

### Add a new RSS feed

The RSS list is stored in `resources/feeds.json`.

Sample config:

```json
{
    "title": "FEED LIST",
    "data": [
        {
            "team": "fuyu",
            "rss": "https://www.fuyufs.com/episode/feed",
            "anchor": {
                "tag": "a",
                "css_selector": {"data-key": "quality_720p_torrent"}
            }
        },
        {
            "team": "HorribleSubs",
            "rss": "https://nyaa.si/?page=rss&u=HorribleSubs",
            "anchor": false
        }
    ]
}
```

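For each feed, if the `link` field in the RSS XML does not contain the URL of a .torrent file, you must specify the `anchor` field in the feed config so that it points at the tag holding the .torrent download link. When `link` already points at the .torrent file, set `anchor` to `false`, as in the `HorribleSubs` entry of the sample.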
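The `anchor` rule can be sketched in Python as follows. This is a minimal illustration, not the listener's actual code: it uses Python 3 and the standard-library `html.parser` instead of the Python 2.7 and bs4 setup the project lists, and the names `AnchorFinder` and `resolve_torrent_url` are hypothetical. It assumes the page behind an item's `link` has already been fetched into `page_html`.

```python
from html.parser import HTMLParser


class AnchorFinder(HTMLParser):
    """Records the href of the first tag matching the configured anchor."""

    def __init__(self, tag, wanted_attrs):
        super().__init__()
        self.tag = tag
        self.wanted_attrs = wanted_attrs
        self.href = None

    def handle_starttag(self, tag, attrs):
        # Skip once a match is found, or when the tag name differs.
        if self.href is not None or tag != self.tag:
            return
        found = dict(attrs)
        if all(found.get(k) == v for k, v in self.wanted_attrs.items()):
            self.href = found.get("href")


def resolve_torrent_url(feed_config, item_link, page_html=""):
    """Return the .torrent URL for one RSS item, or None if not found."""
    anchor = feed_config.get("anchor")
    if not anchor:
        # "anchor": false means the item's link is the .torrent URL itself.
        return item_link
    finder = AnchorFinder(anchor["tag"], anchor["css_selector"])
    finder.feed(page_html)
    return finder.href
```

With the `fuyu` entry from the sample config, the function returns the `href` of the first `<a>` tag whose `data-key` attribute equals `quality_720p_torrent`.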

### Run system

Start the three services one by one.

> Pacman service - RSS listener

```bash
$ cd pacman
$ python main.py
```

> Media service

If you installed `nodemon`:

```bash
$ cd server
$ npm start
```

Otherwise:

```bash
$ cd server
$ node app.js
```

> Monitoring service (optional)

```bash
$ cd scripts
$ watch -n 10 "./status.sh"
```

### Endpoint list

In the URLs below, `xxx.yyy` is your host.

> http://xxx.yyy:3000/

=> Greeting endpoint

> http://xxx.yyy:3000/videos

=> List of downloaded videos

> http://xxx.yyy:3000/logs

=> Crawling service log

> http://xxx.yyy:3000/monitor

=> Crawling system status (for checking hard disk capacity, system status, etc)
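To check quickly that the media server is up, a small Python 3 script along these lines could request each endpoint and report the result. The function names `endpoint_urls` and `check_endpoints` are hypothetical; the port and paths come from the list above, and nothing is assumed about the response bodies.

```python
from urllib.request import urlopen

# Paths taken from the endpoint list above.
ENDPOINTS = ("/", "/videos", "/logs", "/monitor")


def endpoint_urls(host, port=3000):
    """Build the full URL of every endpoint for the given host."""
    return ["http://{}:{}{}".format(host, port, path) for path in ENDPOINTS]


def check_endpoints(host, port=3000, timeout=5):
    """Map each endpoint URL to its HTTP status, or to an error message."""
    results = {}
    for url in endpoint_urls(host, port):
        try:
            with urlopen(url, timeout=timeout) as response:
                results[url] = response.status
        except OSError as error:  # URLError and HTTPError subclass OSError
            results[url] = str(error)
    return results


if __name__ == "__main__":
    for url, status in check_endpoints("localhost").items():
        print(url, "=>", status)
```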