Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lowlighter/rakun
๐ฆ A parser to extract metadata from anime torrent filename, like source, codecs, resolution, series, audio, etc.
https://github.com/lowlighter/rakun
anime-torrents parser torrent torrent-management
Last synced: 3 months ago
JSON representation
๐ฆ A parser to extract metadata from anime torrent filename, like source, codecs, resolution, series, audio, etc.
- Host: GitHub
- URL: https://github.com/lowlighter/rakun
- Owner: lowlighter
- License: mit
- Created: 2020-05-15T21:05:14.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2023-01-27T03:53:05.000Z (almost 2 years ago)
- Last Synced: 2024-10-30T01:38:13.057Z (3 months ago)
- Topics: anime-torrents, parser, torrent, torrent-management
- Language: TypeScript
- Homepage:
- Size: 599 KB
- Stars: 21
- Watchers: 1
- Forks: 3
- Open Issues: 6
-
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md
Awesome Lists containing this project
README
# ๐ฆ Rakun - An Anime torrent name parser
![Onee-chan](https://github.com/lowlighter/rakun/workflows/Onee-chan/badge.svg) ![Tests passed](https://badges.lecoq.io/rakun.tests) ![Coverage](https://img.shields.io/endpoint?url=https%3A%2F%2Fbadges.lecoq.io%2Frakun%2Fcoverage) ![Code quality](https://img.shields.io/codacy/grade/114dbe0608b14b6a8e92163a6dffc9b4/master?labelColor=2F3338&link=https://app.codacy.com/manual/simon.lecoq/rakun) ![Version](https://img.shields.io/github/package-json/v/lowlighter/rakun?labelColor=2F3338) ![License](https://img.shields.io/github/license/lowlighter/rakun?labelColor=2F3338)
A parser to extract informations and metadata from anime torrents filename.
With this, it'll be easier to script anime torrents related stuff !## ๐ป Usage
### ๐ฆ [Website](https://rakun.app)
If you want to use Rakun online, you can visit the website by [@Wamy-Dev](https://github.com/Wamy-Dev/Rakun.app) at [rakun.app](https://rakun.app).
### ๐ฆ Installation
You can either use default npm registry (`https://registry.npmjs.org`) or the GitHub package registry (`https://npm.pkg.github.com`).
```bash
npm install @lowlighter/rakun --save
```### โช With the following code...
```typescript
import rakun from "rakun"
console.log(rakun.parse("[Team246] Ghost in the shell Stand alone complex S01 E10-E15 [BDREMUX 1080P MULTi DTSHDMA 5.1][VOSTFR]"))
```### โฉ You'll get the following output :
```typescript
{
filename:"[Team246] Ghost in the shell Stand alone complex S01 E10-E15 [BDREMUX 1080P MULTi DTSHDMA 5.1][VOSTFR]",
subber:"Team246",
name:"Ghost in the shell Stand alone complex",
season:"1",
episode:"10-15",
resolution:"1080p",
audio:"multi vo",
subtitles:"fr",
source:"bluray",
codecs:"audio_5_1 dts_hdma",
meta:"remux",
}
```### ๐ What's rakun performances ?
We try to gather a lot of different torrents name format from all over the net to make **rakun** more efficient and reliable.
You can check [tests cases](https://github.com/lowlighter/rakun/tree/master/tests/cases) to see what kind of formatting is currently supported.Below is an excerpt of tests cases which may help you to check **rakun**'s capabilities :
```text
- [Leopard-Raws] Shingeki no Kyojin Season 3 Part.2 - 09 END RAW (NHKG 1280x720 x264 AAC).mp4
- [Team246] Ghost in the shell Stand alone complex S01 E10-E15 [BDREMUX 1080P MULTi DTSHDMA 5.1][VOSTFR]
- [U3-Web] PSYCHO-PASS 3 - FIRST INSPECTOR (2020) [Movie][AMZN WEB-DL(v) 1080p AVC AAC DDP SRT][Multi-Subs] (PSYCHO-PASS ใตใคใณใใน / ๅฟ้ๅคๅฎ / ๅฟ็ๆธฌ้่ )
- [Kaerizaki-Fansub]_One-Piece_Stampede_Film_14_[VERSION-LIGHT][VOSTFR][BLU-RAY][FHD_1920x1080].mp4
- Serial Experiments Lain (1998) 1080p.H264.Ac3.Ita.Flac.Jap.Eng.Sub.Ita.Eng [v2] [21GB] [stress]
- [T.H.X&A.I.R.nesSub][Shinsekai_Yori][BDRIP][Vol.1-9ๅ จ][1920x1080_10bpp][AVC_FLAC][MKV]
- [BDRip1080p.x264.AC3.ITA.ENG.JAP.].Fullmetal.Alchemist.Brotherhood + OAV Bluray RIP
- [AnimeRG] Spirited Away (2001) (Sen to Chihiro no Kamikakushi) [720p BD 10bit] [Dual Audio] [Multi-Language Subtitles] [JRR] [Studio Ghibli].mkv
- Death Note 1-37 [480p] [EN SUB]
- Dragon.Ball.Z.Movie.14.Battle.of.Gods.2013.EXTENDED.DUAL.AUDiO.SUB.PL.1080p.BluRay.REMUX.AVC.TrueHD.5.1-SeBoLeX
- [Erai-raws] Berserk 2017 - 12 [v2][END][720p][Multiple Subtitle][10A073DC].mkv
- [BDMV] Made in Abyss / ใกใคใใคใณใขใใน [BD-BOX I+II][JP]
- Naruto.SD.Rock.Lee.no.Seishun.Full-Power.Ninden.S01.FRENCH.DVDRiP.x264-Kazelle
- Evangelion 1.11 You Are (Not) Alone 2007 Multi 1080p Blu-ray Remux AVC DTS-HD MA 6.1 VFF 5.1 H264 [Team246]
```### ๐ Extracted informations
Below is the descriptor of the possible extreacted torrent informations.
Multiple values properties (codecs, audio, subtitles, etc.) are sorted to keep consistancy.```typescript
/** Torrent informations. */
interface TorrentInfos {
//Original filename
readonly filename:string
//Cleaned name
readonly name:string
//Hash
readonly hash?:string
//File extension
readonly extension?:string
//Resolution
readonly resolution?:string
//Source (Bluray, DVD, etc.)
readonly source?:string
//Codecs
readonly codecs?:string
//Audio language
readonly audio?:string
//Subtitles language
readonly subtitles?:string
//Subber group (or translation group)
readonly subber?:string
//Website of subber
readonly website?:string
//Content producer/distributor
readonly distributor?:string
//Torrent other metadata (remux, repack, etc.)
readonly meta?:string
//Movie
readonly movie?:string
//Season
readonly season?:string
//Part
readonly part?:string
//Episode (or episode range)
readonly episode?:string
}
```## ๐ง Under the hood
### ๐งฌ ATNP's logic
Parsing follows the following workflow :
* An input `filename` is given, with possible parser `options`
* Pre-processors are executed
* Main processor iterates through the [collection of regexs](https://github.com/lowlighter/rakun/tree/master/src/regexs) to extract informations from `filename`
* At each iteration, extracted informations can be either removed from `filename` through [cleaners](https://github.com/lowlighter/rakun/blob/master/src/regexs/cleaners.ts) are applied to remove remnants, or keept for next iteration for closely related informations (like season, parts and episode)
* Post-processors are executed
* The remaining part of `filename` is considered as the cleaned title
* Extracted informations are returned#### ๐ฐ Additional informations
Regexs are ordered to match first what can be accurately extracted and removed early (like hash, extension, website, etc.) to ease the remaining extraction.
Except for cleaners and specials regexs, they should have named capture groups. (if capture group are needed). It allows easier understanding of regexs, but it is also the way that the parser register data.
The main processor is a list of descriptors with the following properties :
```typescript
{
key?:string, // Property to set (e.g. audio, episode, meta, etc..)
collection?:RegExp[], // Regex collection to use
get?:"key"|"value", // If "key", will return capture group's name ; if "value", will return capture group's value
mode?:"append"|"replace"|"skip", // If "append", add value (or create) to property ; if "replace", replace property ; if "skip", skip if property already defined
clean?:boolean, // If true, cleaners will be applied at the end of the iteration
cleaners?:RegExp[] // Additional cleaners to apply
}
```While the aim is to reach 100% accuracy, note that this objective is nearly impossible since there are too much outliers in naming conventions across various websites.
### ๐ Code quality
To ensure a quality library, code is required to pass **Onee-chan**'s judgement and fulfill a parsing accuracy of at least **85%** of defined [test cases](https://github.com/lowlighter/rakun/tree/master/tests/cases).
Pull requests may not be merged if they do not reach this standard, unless they are adding revelant tests which may reveal missed matches in current builds. Although edges cases should be integrated to challenge **rakun**, simpler cases should also be added to not biases **Onee-chan**'s evaluation.
### ๐ช Contributing
Want to contribute ? Sugoรฏ ! Here what you can help with :
#### Open a pull request to
- Add new features for [rakun](https://github.com/lowlighter/rakun/tree/master/src)
- Complete [regexs collection](https://github.com/lowlighter/rakun/tree/master/src/regexs)
- Define new [tests cases](https://github.com/lowlighter/rakun/tree/master/tests/cases)#### Open an issue to
- Reports bugs or unsupported format that should be
- Ask help### ๐งพ License
**rakun** is licensed under the [MIT License](https://github.com/lowlighter/rakun/blob/master/LICENSE).