{"id":13856811,"url":"https://github.com/lujiaying/MovieTaster-Open","last_synced_at":"2025-07-13T19:32:53.475Z","repository":{"id":48969711,"uuid":"100246761","full_name":"lujiaying/MovieTaster-Open","owner":"lujiaying","description":"A practical movie recommend project based on Item2vec.","archived":false,"fork":false,"pushed_at":"2020-10-12T12:20:41.000Z","size":15680,"stargazers_count":280,"open_issues_count":2,"forks_count":91,"subscribers_count":12,"default_branch":"master","last_synced_at":"2024-08-06T03:02:17.111Z","etag":null,"topics":["deep-learning","item2vec","word2vec"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lujiaying.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-08-14T08:45:56.000Z","updated_at":"2024-07-21T16:11:30.000Z","dependencies_parsed_at":"2022-08-27T19:51:47.352Z","dependency_job_id":null,"html_url":"https://github.com/lujiaying/MovieTaster-Open","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lujiaying%2FMovieTaster-Open","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lujiaying%2FMovieTaster-Open/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lujiaying%2FMovieTaster-Open/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lujiaying%2FMovieTaster-Open/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lujiaying","download_url":"https://codeload.github.com/lujiaying/MovieTaster-Open/ta
r.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225912302,"owners_count":17544142,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","item2vec","word2vec"],"created_at":"2024-08-05T03:01:14.250Z","updated_at":"2024-11-22T14:30:54.067Z","avatar_url":"https://github.com/lujiaying.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# MovieTaster-Open\n\nA movie recommendation project based on Item2vec.\n\nReferences: \n- Barkan, Oren, and Noam Koenigstein. \"Item2vec: neural item embedding for collaborative filtering.\" Machine Learning for Signal Processing (MLSP), 2016 IEEE 26th International Workshop on. IEEE, 2016.\n- JayveeHe, https://github.com/JayveeHe/MusicTaster. Github.\n\n[Demo\u003e](https://movietaster.leanapp.cn/movies/)\n\n\u003cimg src=\"/recommend_multiple.jpg\" /\u003e\n\nFor more details on this project, please refer to the [blog\u003e](https://lujiaying.github.io/posts/2017/08/MovieTaster/) or [zhihu\u003e](https://zhuanlan.zhihu.com/p/28491088)\n\n## Project Structure\n- datas: stores the corpus\n- models: stores the item vectors for movies\n- utils: a collection of utility functions\n\n\n## Usage\n\n0. Compile [fasttext](https://github.com/facebookresearch/fastText) in the root of the project\n\n1. Process the corpus; the data file is generated under ```./datas/```.\n\n```\n$ cd datas \u0026\u0026 tar -xzvf corpus.tar.gz\n$ cd ..\n$ python utils/process.py\n```\n\n2. Train the model by running fasttext. 
Please refer to ```./run.sh``` for other configuration.\n\n```\n$ ./fasttext skipgram -input ./datas/doulist_0804_09.movie_id -output ./models/fasttext_model_0804_09_skipgram -minCount 5 -epoch 50 -neg 100\n```\n\n3. Here's a result from the default configuration.\n\n```\n$ python utils/eval.py\nmovie_name_id_dict is 130206 from [./datas/doulist_0804_09.json]\nmovie_name_id_dict is 130206 from [./datas/doulist_0804_09.json]\nmovie_id_name_dict is 130206 from [./datas/doulist_0804_09.json]\n[2017-08-18 07:52:53,579][pid:22831] eval.topk_like: INFO: [1210]小时代 top 5 likes:\n[2017-08-18 07:52:54,149][pid:22831] eval.topk_like: INFO: [2396]小时代2：青木时代 0.820323\n[2017-08-18 07:52:54,149][pid:22831] eval.topk_like: INFO: [3387]分手合约 0.606063\n[2017-08-18 07:52:54,150][pid:22831] eval.topk_like: INFO: [4087]不二神探 0.604207\n[2017-08-18 07:52:54,150][pid:22831] eval.topk_like: INFO: [3839]天台爱情 0.601879\n[2017-08-18 07:52:54,150][pid:22831] eval.topk_like: INFO: [821]致我们终将逝去的青春 0.600836\n[2017-08-18 07:52:54,150][pid:22831] eval.topk_like: INFO: [144]倩女幽魂 top 5 likes:\n[2017-08-18 07:52:54,690][pid:22831] eval.topk_like: INFO: [2719]倩女幽魂3：道道道 0.685621\n[2017-08-18 07:52:54,690][pid:22831] eval.topk_like: INFO: [1812]倩女幽魂2：人间道 0.681594\n[2017-08-18 07:52:54,691][pid:22831] eval.topk_like: INFO: [466]胭脂扣 0.678032\n[2017-08-18 07:52:54,691][pid:22831] eval.topk_like: INFO: [261]青蛇 0.671541\n[2017-08-18 07:52:54,692][pid:22831] eval.topk_like: INFO: [156]东邪西毒 0.664057\n[2017-08-18 07:52:54,692][pid:22831] eval.topk_like: INFO: [2602]悟空传 top 5 likes:\n[2017-08-18 07:52:55,253][pid:22831] eval.topk_like: INFO: [5648]闪光少女 0.671337\n[2017-08-18 07:52:55,253][pid:22831] eval.topk_like: INFO: [2189]绣春刀II：修罗战场 0.646861\n[2017-08-18 07:52:55,254][pid:22831] eval.topk_like: INFO: [8297]逆时营救 0.634753\n[2017-08-18 07:52:55,254][pid:22831] eval.topk_like: INFO: [12571]京城81号Ⅱ 0.625549\n[2017-08-18 07:52:55,254][pid:22831] eval.topk_like: INFO: [10545]父子雄兵 0.623032\n[2017-08-18 
07:52:55,254][pid:22831] eval.topk_like: INFO: [56]美国往事 top 5 likes:\n[2017-08-18 07:52:55,789][pid:22831] eval.topk_like: INFO: [18]天堂电影院 0.756449\n[2017-08-18 07:52:55,790][pid:22831] eval.topk_like: INFO: [14]辛德勒的名单 0.737502\n[2017-08-18 07:52:55,790][pid:22831] eval.topk_like: INFO: [17]教父 0.735216\n[2017-08-18 07:52:55,790][pid:22831] eval.topk_like: INFO: [69]闻香识女人 0.734119\n[2017-08-18 07:52:55,790][pid:22831] eval.topk_like: INFO: [42]西西里的美丽传说 0.732334\n[2017-08-18 07:52:55,791][pid:22831] eval.topk_like: INFO: [2644]战狼2 top 5 likes:\n[2017-08-18 07:52:56,346][pid:22831] eval.topk_like: INFO: [1456]大护法 0.612700\n[2017-08-18 07:52:56,347][pid:22831] eval.topk_like: INFO: [2107]战狼 0.580637\n[2017-08-18 07:52:56,347][pid:22831] eval.topk_like: INFO: [2602]悟空传 0.580365\n[2017-08-18 07:52:56,347][pid:22831] eval.topk_like: INFO: [9941]建军大业 0.575422\n[2017-08-18 07:52:56,347][pid:22831] eval.topk_like: INFO: [19040]阿唐奇遇 0.573613\n```\n\n------------------------\n\n# MovieTaster-Open\n\n使用Item2Vec做电影推荐\n\n参考: \n- Barkan, Oren, and Noam Koenigstein. \"Item2vec: neural item embedding for collaborative filtering.\" Machine Learning for Signal Processing (MLSP), 2016 IEEE 26th International Workshop on. IEEE, 2016.\n- JayveeHe, https://github.com/JayveeHe/MusicTaster. Github.\n\n[Demo\u003e](https://movietaster.leanapp.cn/movies/)\n\n\u003cimg src=\"/recommend_multiple.jpg\" /\u003e\n\n[原理详情请参考\u003e](https://lujiaying.github.io/posts/2017/08/MovieTaster/)\n\n## 目录结构\n- datas:  存放语料\n- models: 存放电影向量\n- utils: 工具类，包括生成推荐电影列表等\n\n\n## 使用说明\n\n0. 在根目录下载并编译[fasttext](https://github.com/facebookresearch/fastText)\n\n1. 处理语料。结束后将在```./datas/```下生成fasttext所需要的格式\n\n```\n$ cd datas \u0026\u0026 tar -xzvf corpus.tar.gz\n$ cd ..\n$ python utils/process.py\n```\n\n2. 参考```./run.sh```中生成词向量的参数配置，任选一个。\n\n```\n$ ./fasttext skipgram -input ./datas/doulist_0804_09.movie_id -output ./models/fasttext_model_0804_09_skipgram -minCount 5 -epoch 50 -neg 100\n```\n\n3. 
查看模型效果\n\n```\n$ python utils/eval.py\nmovie_name_id_dict is 130206 from [./datas/doulist_0804_09.json]\nmovie_name_id_dict is 130206 from [./datas/doulist_0804_09.json]\nmovie_id_name_dict is 130206 from [./datas/doulist_0804_09.json]\n[2017-08-18 07:52:53,579][pid:22831] eval.topk_like: INFO: [1210]小时代 top 5 likes:\n[2017-08-18 07:52:54,149][pid:22831] eval.topk_like: INFO: [2396]小时代2：青木时代 0.820323\n[2017-08-18 07:52:54,149][pid:22831] eval.topk_like: INFO: [3387]分手合约 0.606063\n[2017-08-18 07:52:54,150][pid:22831] eval.topk_like: INFO: [4087]不二神探 0.604207\n[2017-08-18 07:52:54,150][pid:22831] eval.topk_like: INFO: [3839]天台爱情 0.601879\n[2017-08-18 07:52:54,150][pid:22831] eval.topk_like: INFO: [821]致我们终将逝去的青春 0.600836\n[2017-08-18 07:52:54,150][pid:22831] eval.topk_like: INFO: [144]倩女幽魂 top 5 likes:\n[2017-08-18 07:52:54,690][pid:22831] eval.topk_like: INFO: [2719]倩女幽魂3：道道道 0.685621\n[2017-08-18 07:52:54,690][pid:22831] eval.topk_like: INFO: [1812]倩女幽魂2：人间道 0.681594\n[2017-08-18 07:52:54,691][pid:22831] eval.topk_like: INFO: [466]胭脂扣 0.678032\n[2017-08-18 07:52:54,691][pid:22831] eval.topk_like: INFO: [261]青蛇 0.671541\n[2017-08-18 07:52:54,692][pid:22831] eval.topk_like: INFO: [156]东邪西毒 0.664057\n[2017-08-18 07:52:54,692][pid:22831] eval.topk_like: INFO: [2602]悟空传 top 5 likes:\n[2017-08-18 07:52:55,253][pid:22831] eval.topk_like: INFO: [5648]闪光少女 0.671337\n[2017-08-18 07:52:55,253][pid:22831] eval.topk_like: INFO: [2189]绣春刀II：修罗战场 0.646861\n[2017-08-18 07:52:55,254][pid:22831] eval.topk_like: INFO: [8297]逆时营救 0.634753\n[2017-08-18 07:52:55,254][pid:22831] eval.topk_like: INFO: [12571]京城81号Ⅱ 0.625549\n[2017-08-18 07:52:55,254][pid:22831] eval.topk_like: INFO: [10545]父子雄兵 0.623032\n[2017-08-18 07:52:55,254][pid:22831] eval.topk_like: INFO: [56]美国往事 top 5 likes:\n[2017-08-18 07:52:55,789][pid:22831] eval.topk_like: INFO: [18]天堂电影院 0.756449\n[2017-08-18 07:52:55,790][pid:22831] eval.topk_like: INFO: [14]辛德勒的名单 0.737502\n[2017-08-18 07:52:55,790][pid:22831] 
eval.topk_like: INFO: [17]教父 0.735216\n[2017-08-18 07:52:55,790][pid:22831] eval.topk_like: INFO: [69]闻香识女人 0.734119\n[2017-08-18 07:52:55,790][pid:22831] eval.topk_like: INFO: [42]西西里的美丽传说 0.732334\n[2017-08-18 07:52:55,791][pid:22831] eval.topk_like: INFO: [2644]战狼2 top 5 likes:\n[2017-08-18 07:52:56,346][pid:22831] eval.topk_like: INFO: [1456]大护法 0.612700\n[2017-08-18 07:52:56,347][pid:22831] eval.topk_like: INFO: [2107]战狼 0.580637\n[2017-08-18 07:52:56,347][pid:22831] eval.topk_like: INFO: [2602]悟空传 0.580365\n[2017-08-18 07:52:56,347][pid:22831] eval.topk_like: INFO: [9941]建军大业 0.575422\n[2017-08-18 07:52:56,347][pid:22831] eval.topk_like: INFO: [19040]阿唐奇遇 0.573613\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flujiaying%2FMovieTaster-Open","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flujiaying%2FMovieTaster-Open","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flujiaying%2FMovieTaster-Open/lists"}