{"id":15642943,"url":"https://github.com/logicjake/tuling-video-click-top3","last_synced_at":"2025-04-30T11:44:59.250Z","repository":{"id":104610803,"uuid":"239765647","full_name":"LogicJake/tuling-video-click-top3","owner":"LogicJake","description":"图灵联邦视频点击预测大赛线上第三-【ctr, embedding, 穿越特征】","archived":false,"fork":false,"pushed_at":"2020-03-04T06:25:48.000Z","size":45,"stargazers_count":61,"open_issues_count":0,"forks_count":18,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-10-29T16:52:25.890Z","etag":null,"topics":["ctr","embedding"],"latest_commit_sha":null,"homepage":"https://www.logicjake.xyz/2020/02/10/%E5%9B%BE%E7%81%B5%E8%81%94%E9%82%A6%E8%A7%86%E9%A2%91%E7%82%B9%E5%87%BB%E9%A2%84%E6%B5%8B%E5%A4%A7%E8%B5%9B-%E8%B5%9B%E5%90%8E%E6%80%BB%E7%BB%93/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LogicJake.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-02-11T13:14:57.000Z","updated_at":"2023-11-27T05:05:53.000Z","dependencies_parsed_at":"2023-06-09T15:30:31.850Z","dependency_job_id":null,"html_url":"https://github.com/LogicJake/tuling-video-click-top3","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LogicJake%2Ftuling-video-click-top3","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LogicJake%2Ftuling-video-click-top3/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LogicJake%2Ftuling-video-click-
top3/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LogicJake%2Ftuling-video-click-top3/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LogicJake","download_url":"https://codeload.github.com/LogicJake/tuling-video-click-top3/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223779728,"owners_count":17201287,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ctr","embedding"],"created_at":"2024-10-03T11:58:15.609Z","updated_at":"2024-11-09T03:15:35.934Z","avatar_url":"https://github.com/LogicJake.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# tuling-video-click-top3\nThird place on the online leaderboard of the Turing Topia video click prediction competition\n\n# 2020-TURING-TOPIA-Video-Click-SINGLE-LightGBM-top3\n===============================================================================================================     \n**Third place on the online leaderboard of the Turing Topia video click prediction competition (single LightGBM model)**\n## Organizer: Turing Topia\n## Track: 2020 Video Click Prediction Competition\n\n**Track link**: https://www.turingtopia.com/competitionnew/detail/e4880352b6ef4f9f8f28e8f98498dbc4/sketch       \n**Schedule**: *2019.11.11-2020.03.09*  \n**Participants**: [第一次打比赛](https://github.com/LogicJake), [郑](https://github.com/jackhuntcn), [小兔子乖乖](https://github.com/PandasCute), [Freak](https://github.com/BovenPeng/), [luweihai](https://github.com/luweihai)     \n**Solution write-up**: [link](https://www.logicjake.xyz/2020/02/10/%E5%9B%BE%E7%81%B5%E8%81%94%E9%82%A6%E8%A7%86%E9%A2%91%E7%82%B9%E5%87%BB%E9%A2%84%E6%B5%8B%E5%A4%A7%E8%B5%9B-%E8%B5%9B%E5%90%8E%E6%80%BB%E7%BB%93/)      
\n**Baidu Netdisk download link**: To guard against data loss, the dataset is also available at: https://pan.baidu.com/s/1YPtg4QyiAdhRAMoxjis_Gw  Password: 0a3r       \n## 1. Data description  \n**train.csv**\n\n| Field | Name | Data type | Description |\n|:-------:|:-------:|:-------:|:-------:|\n|id|User ID|VARCHAR2(50)|Index of the record in the dataset, from 1 to 11376681.|\n|target|Clicked|VARCHAR2(50)|Whether the user clicked the video: 1 means clicked, 0 means not clicked.|\n|timestamp|Modification timestamp|VARCHAR2(50)|Timestamp at which the user clicked the video; NULL if the video was not clicked.|\n|deviceid|Device ID|VARCHAR2(50)|The user's device ID.|\n|newsid|Video ID|VARCHAR2(50)|The video's ID.|\n|guid|Registration ID|VARCHAR2(50)|The user's registration ID.|\n|pos|Recommendation position|VARCHAR2(50)|Position at which the video was recommended.|\n|app_version|App version|VARCHAR2(50)|App version.|\n|device_vendor|Device vendor|VARCHAR2(50)|Device vendor.|\n|netmodel|Network type|VARCHAR2(50)|Network type.|\n|osversion|OS version|VARCHAR2(50)|Operating system version.|\n|lng|Longitude|VARCHAR2(50)|Longitude.|\n|lat|Latitude|VARCHAR2(50)|Latitude.|\n|device_version|Device version|VARCHAR2(50)|Device version.|\n|ts|Exposure timestamp|Timestamp|Timestamp at which the video was shown to the user.|\n\n**test.csv**\n\n| Field | Name | Data type | Description |\n|:-------:|:-------:|:-------:|:-------:|\n|id|User ID|VARCHAR2(50)|test_1 to test_3653592.|\n|deviceid|Device ID|VARCHAR2(50)|The user's device ID.|\n|newsid|Video ID|VARCHAR2(50)|The video's ID.|\n|guid|Registration ID|VARCHAR2(50)|The user's registration ID.|\n|pos|Recommendation position|VARCHAR2(50)|Position at which the video was recommended.|\n|app_version|App version|VARCHAR2(50)|App version.|\n|device_vendor|Device vendor|VARCHAR2(50)|Device vendor.|\n|netmodel|Network type|VARCHAR2(50)|Network type.|\n|osversion|OS version|VARCHAR2(50)|Operating system version.|\n|lng|Longitude|VARCHAR2(50)|Longitude.|\n|lat|Latitude|VARCHAR2(50)|Latitude.|\n|device_version|Device version|VARCHAR2(50)|Device version.|\n|ts|Exposure timestamp|Timestamp|Timestamp at which the video was shown to the user.|\n\n**app.csv**\n\n| Field | Name | Data type | Description |\n|:-------:|:-------:|:-------:|:-------:|\n|id|User ID|VARCHAR2(50)|test_1 to test_3653592.|\n|**deviceid**|Device ID|VARCHAR2(50)|The user's device ID.|\n|applist|App list|VARCHAR2(50)|Apps installed by the user; app names have been anonymized as app_1, app_2, and so on.|\n\n**user.csv**\n\n| Field | Name | Data type | Description 
|\n|:-------:|:-------:|:-------:|:-------:|\n|id|User ID|VARCHAR2(50)|test_1 to test_3653592.|\n|deviceid|Device ID|VARCHAR2(50)|The user's device ID.|\n|guid|Registration ID|VARCHAR2(50)|The user's registration ID.|\n|outertag|User profile|VARCHAR2(50)|User-profile tags separated by vertical bars; the number after each colon indicates how well the tag fits the user, and a higher score means a better fit.|\n|tag|User profile|VARCHAR2(50)|Same as outertag.|\n|level|User level|VARCHAR2(50)|User level.|\n|personidentification|Low-quality flag|VARCHAR2(50)|1 means a low-quality user, 0 means a normal user.|\n|followscore|Apprentice score|VARCHAR2(50)|Apprentice score (friend score).|\n|personalscore|Personal score|VARCHAR2(50)|Personal score.|\n|gender|Gender|VARCHAR2(50)|Gender.|\n\n## 2. Environment and dependencies \n  - python3\n  - scikit-learn\n  - gensim\n  - Ubuntu   \n  - LightGBM\n  - notebook \n## 3. How to run the code  \nRun notebooks 1, 2, 3, and 4 in order: \n\u003e 1 feature.ipynb: feature engineering   \n\u003e 2 fold_model.ipynb     \n\u003e 3 offline_model.ipynb: offline model  \n\u003e 4 online_model.ipynb: online model \n\n## 4. Feature engineering      \n - **Raw features**     \n - **Leakage (time-travel) features**   \n - **Statistical features** \n - **Embedding features**     \n## 5. Model training   \nSingle model; final preliminary-round leaderboard score: 0.83695, third place online.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flogicjake%2Ftuling-video-click-top3","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flogicjake%2Ftuling-video-click-top3","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flogicjake%2Ftuling-video-click-top3/lists"}