{"id":47188432,"url":"https://github.com/rensetsu/db.trakt.anitrakt","last_synced_at":"2026-03-13T10:11:45.729Z","repository":{"id":175698396,"uuid":"604165573","full_name":"rensetsu/db.trakt.anitrakt","owner":"rensetsu","description":"Parse AniTrakt's Show and Movie table to get basic Trakt \u003c-\u003e MAL ID mapping","archived":false,"fork":false,"pushed_at":"2026-02-27T05:36:45.000Z","size":2300,"stargazers_count":7,"open_issues_count":4,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-27T12:16:36.640Z","etag":null,"topics":["anime","myanimelist","trakt","unofficial-parser"],"latest_commit_sha":null,"homepage":"https://anitrakt.huere.net/","language":"JSON","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rensetsu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-02-20T13:26:02.000Z","updated_at":"2026-02-27T05:36:48.000Z","dependencies_parsed_at":"2025-11-30T05:00:23.254Z","dependency_job_id":null,"html_url":"https://github.com/rensetsu/db.trakt.anitrakt","commit_stats":null,"previous_names":["ryuuganime/anitrakt-indexparser","rensetsu/anitrakt-indexparser","rensetsu/db.trakt.anitrakt"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rensetsu/db.trakt.anitrakt","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rensetsu%2Fdb.trakt.anitrakt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rensetsu%2Fdb.trakt.anitrakt/
tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rensetsu%2Fdb.trakt.anitrakt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rensetsu%2Fdb.trakt.anitrakt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rensetsu","download_url":"https://codeload.github.com/rensetsu/db.trakt.anitrakt/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rensetsu%2Fdb.trakt.anitrakt/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30465036,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-13T06:34:02.089Z","status":"ssl_error","status_checked_at":"2026-03-13T06:33:49.182Z","response_time":60,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anime","myanimelist","trakt","unofficial-parser"],"created_at":"2026-03-13T10:11:45.124Z","updated_at":"2026-03-13T10:11:45.704Z","avatar_url":"https://github.com/rensetsu.png","language":"JSON","funding_links":[],"categories":[],"sub_categories":[],"readme":"# db.trakt.anitrakt\n\n[![GitHub Repo stars](https://img.shields.io/github/stars/rensetsu/db.trakt.anitrakt?style=social)](https://github.com/rensetsu/db.trakt.anitrakt)\n[![GitHub Repo 
forks](https://img.shields.io/github/forks/rensetsu/db.trakt.anitrakt?style=social)](https://github.com/rensetsu/db.trakt.anitrakt/fork)\n\nTable data scraped from [AniTrakt by Huere](https://anitrakt.huere.net/) to\nget anime mappings between [MyAnimeList](https://myanimelist.net) and [Trakt](https://trakt.tv).\n\n\u003e [!WARNING]\n\u003e\n\u003e **THIS REPO IS NOT OFFICIALLY SUPPORTED BY HUERE, MAL, or TRAKT.**\n\nIf you use any content from this repo in your project and find bugs or want\nto submit a suggestion, please [open an issue](https://github.com/rensetsu/db.trakt.anitrakt/issues).\n\n\u003e [!NOTE]\n\u003e\n\u003e **Extended Database Available**\n\u003e\n\u003e For a more comprehensive dataset with richer metadata, please use\n\u003e the [Extended Database](https://github.com/rensetsu/db.trakt.extended-anitrakt)\n\u003e repo instead. The extended database includes release years, external IDs\n\u003e (TMDB, TVDB, IMDb), and handles issues like `guessed_slug`. This repository\n\u003e should primarily be used if you only need the basic mapping between MyAnimeList\n\u003e and Trakt IDs.\n\n## Features\n\n- **Intelligent Filtering**: Configurable ignore rules with support for AND/OR logic  \n- **Data Overwriting**: Manual overrides for specific entries via overwrite files\n- **Error Handling**: Robust error handling with custom exception hierarchy\n- **Modular Architecture**: Clean, maintainable code structure with separated concerns\n\n## Data Structure\n\n| Key Name | Type | Description |\n| --- | --- | --- |\n| `title` | `string` | The title of the anime |\n| `mal_id` | `int` | MyAnimeList ID of the anime |\n| `trakt_id` | `int` | Trakt ID of the show/movie |\n| `guessed_slug` | `string \\| null` | Guessed slug of the anime, see [comments](#guessed-slug) for additional context |\n| `type` | `Enum[\"shows\", \"movies\"]` | Type of the anime |\n| `season` | `int` | Season number of the anime, only for `type == \"shows\"` |\n\n### Examples\n\n\u003e 
[!NOTE]\n\u003e\n\u003e Final result does not contain comments, it's just for additional context in\n\u003e this README.\n\n#### Shows\n\n```jsonc\n[\n  // Example of a show \"Shingeki no Kyojin\", both season 1 and 2\n  {\n    \"title\": \"Shingeki no Kyojin\",\n    \"mal_id\": 16498,\n    \"trakt_id\": 1420,\n    \"guessed_slug\": \"attack-on-titan\",\n    \"type\": \"shows\",\n    \"season\": 1\n  },\n  {\n    \"title\": \"Shingeki no Kyojin Season 2\",\n    \"mal_id\": 25777,\n    \"trakt_id\": 1420,\n    \"guessed_slug\": \"attack-on-titan\",\n    \"type\": \"shows\",\n    \"season\": 2\n  }\n]\n```\n\nTo construct a link, you can use the following format:\n\n```text\nhttps://trakt.tv/{type}/{guessed_slug}/seasons/{season}\n```\n\n#### Movies\n\n```jsonc\n[\n  // Example of a movie \"Kimi no Na wa.\"\n  {\n    \"title\": \"Kimi no Na wa.\",\n    \"mal_id\": 32281,\n    \"trakt_id\": 1402,\n    // Guessed slug won't work for movies, see additional comment\n    \"guessed_slug\": \"your-name\",\n    \"type\": \"movies\"\n  }\n]\n```\n\nTo construct a link, you can use the following format:\n\n```text\nhttps://trakt.tv/{type}/{guessed_slug}-{year, see additional comment}\n```\n\n## Configuration Files\n\n### Ignore Rules (`db/ignore_movies.json` \u0026 `db/ignore_tv.json`)\n\nThe parser supports intelligent filtering through ignore rule files. 
These\nrules allow you to exclude specific items from the final dataset based on\nvarious criteria.\n\n#### Structure\n\n```jsonc\n[\n  {\n    \"source\": \"remote|local|all\",\n    \"type\": \"OR|AND|ANY|ALL\", \n    \"conditions\": [\n      {\n        \"field_name\": \"value_to_match\"\n      }\n    ],\n    \"description\": \"Human-readable description of the rule\"\n  }\n]\n```\n\n#### Source Types\n\n- **`remote`**: Applied to items from AniTrakt before overwrite merging\n- **`local`**: Applied only to overwrite items (after merging)\n- **`all`**: Applied to all remaining items after overwrite processing\n\n#### Rule Types\n\n- **`OR`/`ANY`**: Match if **any** condition is true\n- **`AND`/`ALL`**: Match if **all** conditions are true\n\n#### Supported Fields\n\nYou can create conditions based on any field in the data structure:\n\n- `title` - Exact title match\n- `mal_id` - MyAnimeList ID (supports `null` for missing IDs)\n- `trakt_id` - Trakt ID (supports `null` for missing IDs)\n- `guessed_slug` - Generated slug\n- `season` - Season number (shows only)\n- `type` - Media type (\"movies\" or \"shows\")\n\nIf multiple fields exist inside one condition object, they are combined as\n`AND`.\n\n#### Example Ignore Rules\n\n```jsonc\n[\n  {\n    \"source\": \"all\",\n    \"type\": \"ANY\",\n    \"conditions\": [\n      { \"mal_id\": 0 },\n      { \"mal_id\": null },\n      { \"trakt_id\": 0 },\n      { \"trakt_id\": null }\n    ],\n    \"description\": \"Ignore items with invalid IDs\"\n  },\n  {\n    \"source\": \"remote\", \n    \"type\": \"ANY\",\n    \"conditions\": [\n      { \"mal_id\": 50532 },\n      { \"mal_id\": 986 },\n      { \"mal_id\": 12231 },\n      { \"mal_id\": 32051 },\n      { \"mal_id\": 2020 },\n      { \"mal_id\": 31704 },\n      { \"mal_id\": 28285 }\n    ],\n    \"description\": \"Special/OVA titles found in TV show entries\"\n  },\n  {\n    \"source\": \"all\",\n    \"type\": \"AND\",\n    \"conditions\": [\n      { \"type\": \"movies\" 
},\n      { \"guessed_slug\": null }\n    ],\n    \"description\": \"Remove movies without valid slugs\"\n  }\n]\n```\n\n### Overwrite Files (`db/overwrite_movies.json` \u0026 `db/overwrite_tv.json`)\n\nThese files contain manual additions or corrections to the scraped data. Items\nin overwrite files take precedence and can only be filtered by `\"source\": \"local\"`\nignore rules. They are fully protected from \"remote\" and \"all\" source filtering.\n\n#### Use Cases\n\n- Add missing entries not found in the AniTrakt database\n- Fix incorrect mappings or metadata\n- Override titles with better translations\n- Add custom entries for special cases\n\n#### Example Overwrite\n\n```jsonc\n[\n  {\n    \"title\": \"Nijiyon Animation 2\",\n    \"mal_id\": 57623,\n    \"trakt_id\": 198874,\n    \"guessed_slug\": \"nijiyon-animation\", \n    \"season\": 2,\n    \"type\": \"shows\"\n  },\n  {\n    \"title\": \"Ameku Takao no Suiri Karte\",\n    \"mal_id\": 58600,\n    \"trakt_id\": 233930,\n    \"guessed_slug\": \"ameku-m-d-doctor-detective\",\n    \"season\": 1,\n    \"type\": \"shows\"\n  }\n]\n```\n\n## Processing Pipeline\n\nThe parser follows this sequence to ensure data integrity and clear override precedence:\n\n1. **Fetch \u0026 Parse**: Scrape data from the AniTrakt website\n2. **Remote Filtering**: Apply ignore rules with `\"source\": \"remote\"` to remote data\n3. **Overwrite Processing**: Merge/replace items from overwrite files (takes precedence)\n4. **Final Filtering**: Apply ignore rules with `\"source\": \"all\"` to remaining remote items\n5. **Local Filtering**: Apply ignore rules with `\"source\": \"local\"` to overwrite items only\n6. 
**Sorting \u0026 Output**: Merge results and save to JSON files sorted by MAL ID\n\n### Data Flow Diagram\n\n```mermaid\ngraph TD\n    A[AniTrakt Website] --\u003e B[HTML Parser]\n    B --\u003e C[Remote Filtering]\n    C --\u003e D[Overwrite Processing]\n    D --\u003e E[Overwrite Items]\n    D --\u003e F[Remote Items]\n    F --\u003e G[All Source Filtering]\n    E --\u003e H[Local Source Filtering]\n    G --\u003e I[Combine Results]\n    H --\u003e I\n    I --\u003e J[Sort by MAL ID]\n    J --\u003e K[Save to JSON]\n```\n\n## Usage\n\n### Prerequisites\n\n```bash\npip install requests beautifulsoup4\n```\n\n### Running the Parser\n\n```bash\npython main.py\n```\n\nThe parser will automatically:\n- Fetch the latest data from AniTrakt\n- Apply all configured filters and overwrites\n- Generate sorted JSON output files\n- Create a timestamp file for tracking updates\n\n### Output Files\n\n| File | Description |\n|------|-------------|\n| `db/movies.json` | Movie mappings (sorted by MAL ID) |\n| `db/tv.json` | TV show mappings (sorted by MAL ID) |\n| `updated.txt` | Last successful update timestamp (UTC) |\n| `movies.html` | Cached HTML from AniTrakt movies page |\n| `shows.html` | Cached HTML from AniTrakt shows page |\n\n### Configuration Files (Optional)\n\n| File | Purpose |\n|------|---------|\n| `db/ignore_movies.json` | Ignore rules for movies |\n| `db/ignore_tv.json` | Ignore rules for TV shows |\n| `db/overwrite_movies.json` | Manual overrides for movies |\n| `db/overwrite_tv.json` | Manual overrides for TV shows |\n\n## Architecture\n\nThe refactored codebase follows a modular architecture:\n\n- **`FileManager`**: Handles all file I/O operations with UTF-8 support\n- **`FilterEngine`**: Processes ignore rules with AND/OR logic\n- **`DataManager`**: Manages data merging and overwriting\n- **`HTMLParser`**: Scrapes and parses AniTrakt website data\n- **`AniTraktParser`**: Main orchestrator coordinating all components\n- **`TextUtils`**: Text 
processing utilities for slugification\n- **Custom Exceptions**: Proper error handling hierarchy\n\n## Guessed Slug\n\n### Recommendation\n\nFor the most reliable and complete data, including accurate slugs, release\nyears, and other metadata, it is **highly recommended** to use the\n**[Extended repo](https://github.com/rensetsu/db.trakt.extended-anitrakt)**.\nThe extended database programmatically fetches the correct information directly\nfrom the Trakt.tv API, resolving the limitations described below.\n\nThis repository is best suited for users who only require the basic mapping\nbetween MyAnimeList and Trakt IDs.\n\n### `guessed_slug` Limitations\n\nIf you choose to use this repository, please be aware of the following\nlimitations regarding the `guessed_slug` field:\n\n* **Based on English Titles**: \\\n  Slugs are generated from the presumed English title of the anime. This can\n  lead to inaccuracies if the title on Trakt.tv differs.\n\n* **Movies Require the Year**: \\\n  The `guessed_slug` for movies is incomplete. Trakt.tv requires the release\n  year to be appended to the slug (e.g., `your-name-2016`). This information is\n  not included in this database.\n\n* **Potential for Mismatches**: \\\n  While generally effective for TV shows, a `guessed_slug` might not work for\n  shows with similar names on Trakt as well.\n\n* **Non-alphabetical Titles**: \\\n  Titles that are purely numerical or symbols have a `null` value for\n  `guessed_slug` to prevent conflicts with Trakt's numeric ID system.\n\nIn cases where the `guessed_slug` is incorrect, you can always fall back to\nusing the `trakt_id` to fetch the correct information directly from the\nTrakt.tv API.\n\n## Contributing\n\nWe welcome contributions! Here's how to get started:\n\n1. **Fork the repository**\n2. **Create your feature branch**: `git checkout -b feature/amazing-feature`\n3. **Configure ignore rules or overwrite files** as needed\n4. **Test your changes**: `python main.py`\n5. 
**Commit your changes**: `git commit -m 'Add amazing feature'`\n6. **Push to the branch**: `git push origin feature/amazing-feature`\n7. **Open a Pull Request**\n\n### Contribution Guidelines\n\n- Ensure all new features include appropriate logging\n- Test ignore rules and overwrite files thoroughly\n- Update documentation for any new configuration options\n- Follow the existing code style and architecture patterns\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frensetsu%2Fdb.trakt.anitrakt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frensetsu%2Fdb.trakt.anitrakt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frensetsu%2Fdb.trakt.anitrakt/lists"}