{"id":18793320,"url":"https://github.com/deflix-tv/imdb2meta","last_synced_at":"2026-04-28T21:33:07.467Z","repository":{"id":55673921,"uuid":"314904344","full_name":"Deflix-tv/imdb2meta","owner":"Deflix-tv","description":"A service for getting movie and TV show metadata for an IMDb ID via HTTP or gRPC","archived":false,"fork":false,"pushed_at":"2021-01-16T19:37:26.000Z","size":74,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-12-29T15:44:32.990Z","etag":null,"topics":["go","golang","grpc","http","imdb","imdb-dataset","metadata","web-service"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Deflix-tv.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-11-21T21:12:46.000Z","updated_at":"2022-11-27T10:41:50.000Z","dependencies_parsed_at":"2022-08-15T06:10:37.811Z","dependency_job_id":null,"html_url":"https://github.com/Deflix-tv/imdb2meta","commit_stats":null,"previous_names":["doingodswork/imdb2meta"],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Deflix-tv%2Fimdb2meta","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Deflix-tv%2Fimdb2meta/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Deflix-tv%2Fimdb2meta/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Deflix-tv%2Fimdb2meta/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Deflix-tv","download_url":"https://codeload.github.com/D
eflix-tv/imdb2meta/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239718380,"owners_count":19685725,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["go","golang","grpc","http","imdb","imdb-dataset","metadata","web-service"],"created_at":"2024-11-07T21:24:24.200Z","updated_at":"2025-12-28T22:30:14.962Z","avatar_url":"https://github.com/Deflix-tv.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# imdb2meta\n\nA service for getting movie and TV show metadata for an IMDb ID via HTTP or gRPC, using the official IMDb datasets\n\n## Content\n\n- [Content](#content)\n- [Usage](#usage)\n  1. [Import data](#1-import-data)\n  2. [Run service](#2-run-service)\n  3. [Query service](#3-query-service)\n- [Protocol buffer generation](#protocol-buffer-generation)\n- [⚠ Warning](#⚠-warning)\n\n## Usage\n\nFirst you need to import the data of the IMDb dataset into a database, then you need to start the web service, which is backed by the database, and finally you can query it via HTTP or gRPC.\n\n### 1. Import data\n\nFirst you need to import the data of the IMDb dataset into a database. We support [BadgerDB](https://github.com/dgraph-io/badger) and [bbolt](https://github.com/etcd-io/bbolt).\n\nSteps:\n\n1. Download the `title.basics.tsv.gz` dataset from \u003chttps://datasets.imdbws.com\u003e\n   - For more info about IMDb datasets see \u003chttps://www.imdb.com/interfaces/\u003e\n   - \u003e ⚠ Warning: `IMDb.com, Inc` is the copyright owner of the data in the IMDb datasets. 
You may only use the data for personal and non-commercial use. For more info see [\"Can I use IMDb data in my software?\"](https://help.imdb.com/article/imdb/general-information/can-i-use-imdb-data-in-my-software/G5JTRESSHJBBHTGX) and their [copyright/conditions of use](https://www.imdb.com/conditions) statement.\n\n2. Extract the TSV file somewhere\n3. Run the import tool with the appropriate CLI arguments\n   - Example: `imdb2meta-import -tsvPath \"/home/john/Downloads/data.tsv\" -badgerPath \"/home/john/imdb2meta/badger\"`\n\n\u003e Note: The import takes a while (and much longer with bbolt than with BadgerDB), the process requires a lot of memory, and the final DB size is fairly big.  \n\u003e With a 6-core, 12-thread CPU and a mid-range SSD, an import of all data (7351639 rows as of 2020-11-21) into BadgerDB takes 4 minutes and up to 1.03 GB of memory, and the final DB size is 1.29 GB.  \n\u003e When skipping TV episodes and storing only the minimal metadata, it takes 1 minute and 5 seconds and up to 530 MB of memory, and the final DB size is 314 MB.\n\nCLI reference:\n\n```text\nUsage of imdb2meta-import:\n  -badgerPath string\n        Path to the directory with the BadgerDB files\n  -boltPath string\n        Path to the bbolt DB file\n  -limit int\n        Limit the number of rows to process (excluding the header row)\n  -minimal\n        Only store minimal metadata (ID, type, title, release/start year)\n  -skipEpisodes\n        Skip storing individual TV episodes\n  -skipMisc\n        Skip title types like \"videoGame\", \"audiobook\" and \"radioSeries\"\n  -tsvPath string\n        Path to the \"data.tsv\" file that's inside the \"title.basics.tsv.gz\" archive\n```\n\n### 2. 
Run service\n\nAfter importing the data you can start the web service.\n\nExample: `imdb2meta-service -badgerPath \"/home/john/imdb2meta/badger\"`\n\nCLI reference:\n\n```text\nUsage of imdb2meta-service:\n  -badgerPath string\n        Path to the directory with the BadgerDB files\n  -bindAddr string\n        Local interface address to bind to. \"localhost\" only allows access from the local host. \"0.0.0.0\" binds to all network interfaces. (default \"localhost\")\n  -boltPath string\n        Path to the bbolt DB file\n  -grpcPort int\n        Port to listen on for gRPC requests (default 8081)\n  -httpPort int\n        Port to listen on for HTTP requests (default 8080)\n```\n\n#### Docker\n\nYou can also run the service as a Docker container.\n\n1. Update the image: `docker pull doingodswork/imdb2meta-service`\n2. Start the container: `docker run --name imdb2meta -v /path/to/badger:/data -p 8080:8080 -p 8081:8081 doingodswork/imdb2meta-service -badgerPath \"/data\"`\n   - \u003e Note: `Ctrl-C` only detaches from the container. It doesn't stop it.\n   - When detached, you can attach again with `docker attach imdb2meta`\n3. To stop the container: `docker stop imdb2meta`\n4. To start the (still existing) container again: `docker start imdb2meta`\n\n### 3. 
Query service\n\nAfter starting the web service you can query it via HTTP or gRPC:\n\n#### HTTP\n\nExample request: `curl \"http://localhost:8080/meta/tt1254207\"`\n\nExample response:\n\n```json\n{\n    \"id\": \"tt1254207\",\n    \"titleType\": \"SHORT\",\n    \"primaryTitle\": \"Big Buck Bunny\",\n    \"startYear\": 2008,\n    \"runtime\": 10,\n    \"genres\": [\n        \"Animation\",\n        \"Comedy\",\n        \"Short\"\n    ]\n}\n```\n\n#### gRPC\n\nExample request (using [grpcurl](https://github.com/fullstorydev/grpcurl)): `grpcurl -plaintext -d '{\"id\":\"tt1254207\"}' localhost:8081 imdb2meta.MetaFetcher/Get`  \n(In Windows/PowerShell you have to use `'{\\\"id\\\":\\\"tt1254207\\\"}'`)\n\nExample response:\n\n```json\n{\n    \"id\": \"tt1254207\",\n    \"titleType\": \"SHORT\",\n    \"primaryTitle\": \"Big Buck Bunny\",\n    \"startYear\": 2008,\n    \"runtime\": 10,\n    \"genres\": [\n        \"Animation\",\n        \"Comedy\",\n        \"Short\"\n    ]\n}\n```\n\n## Protocol buffer generation\n\nTo re-generate the `meta.pb.go` file from the `meta.proto` file, run: `protoc -I=\"./protos\" --go_out=./pb --go_opt=paths=source_relative meta.proto`\n\nTo re-generate the `service.pb.go` and `service_grpc.pb.go` files from the `service.proto` file, run: `protoc -I=\"./protos\" --go_out=./pb --go_opt=paths=source_relative --go-grpc_out=./pb --go-grpc_opt=paths=source_relative service.proto`\n\n## ⚠ Warning\n\n`IMDb.com, Inc` is the copyright owner of the data in the IMDb datasets. You may only use the data for personal and non-commercial use. 
For more info see [\"Can I use IMDb data in my software?\"](https://help.imdb.com/article/imdb/general-information/can-i-use-imdb-data-in-my-software/G5JTRESSHJBBHTGX) and their [copyright/conditions of use](https://www.imdb.com/conditions) statement.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeflix-tv%2Fimdb2meta","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeflix-tv%2Fimdb2meta","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeflix-tv%2Fimdb2meta/lists"}