{"id":15661695,"url":"https://github.com/shusentang/bdc2019","last_synced_at":"2026-03-07T08:34:23.149Z","repository":{"id":108596879,"uuid":"203328545","full_name":"ShusenTang/BDC2019","owner":"ShusenTang","description":"2019中国高校计算机大赛——大数据挑战赛 第三名解决方案","archived":false,"fork":false,"pushed_at":"2020-02-16T14:48:33.000Z","size":4515,"stargazers_count":123,"open_issues_count":1,"forks_count":26,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-07-20T19:41:45.355Z","etag":null,"topics":["competition","data-mining","deep-learning","feature-engineering","machine-learning"],"latest_commit_sha":null,"homepage":"https://www.kesci.com/home/competition/5cc51043f71088002c5b8840/content","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ShusenTang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-08-20T08:00:34.000Z","updated_at":"2025-03-16T03:32:44.000Z","dependencies_parsed_at":null,"dependency_job_id":"303a22e6-7329-4e31-bb88-6873a5256df8","html_url":"https://github.com/ShusenTang/BDC2019","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ShusenTang/BDC2019","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ShusenTang%2FBDC2019","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ShusenTang%2FBDC2019/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ShusenTang%2FBDC2019/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repos
itories/ShusenTang%2FBDC2019/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ShusenTang","download_url":"https://codeload.github.com/ShusenTang/BDC2019/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ShusenTang%2FBDC2019/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30209954,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-07T05:23:27.321Z","status":"ssl_error","status_checked_at":"2026-03-07T05:00:17.256Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["competition","data-mining","deep-learning","feature-engineering","machine-learning"],"created_at":"2024-10-03T13:29:01.230Z","updated_at":"2026-03-07T08:34:23.124Z","avatar_url":"https://github.com/ShusenTang.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\u003cdiv align=center\u003e\n\u003cimg src=\"background.png\" alt=\"background\"/\u003e\n\u003c/div\u003e\n\n[2019 China Collegiate Computing Contest: Big Data Challenge](https://www.kesci.com/home/competition/5cc51043f71088002c5b8840/content)\n\nSolution by team 鸡你太美 (3rd place in both the preliminary and second rounds), including all code, documentation, and the final defense slides.\n\n## Task description:\nAn important task in search is predicting the click-through rate of a doc under a query from the query and the title. In this contest, teams predict the click-through rate of specified docs from **anonymized** data. Results are evaluated and ranked on online evaluation data using the specified metric, and the team with the best score wins.\n\n### Task categories:\n* Short-text matching\n* Click-through rate prediction\n\n## 
Data description:\n\n`train_data.sample` is the official sample of the training data. The data is split into columns with `,` as the delimiter and is in CSV format without a header row. The columns are:\n\n|Column|Type|Example|\n|---|---|---|\n|query_id|int|3|\n|query|hash string, terms separated by spaces|1 9 117|\n|query_title_id|unique identifier of the title under the query|2|\n|title|hash string, terms separated by spaces|3 9 120|\n|label|int, one of {0, 1}|0|\n\n\u003e **Note: the sample file `train_data.sample` is provided only to help understand the task and get the code running. Because it contains only 20,000 rows, features constructed from it carry little meaning (severe data leakage).**\n\n\n## Other solutions\n* [1st place - Progressing](https://www.kesci.com/home/project/5d9ef1fc037db3002d3f75a3)\n* [2nd place - 蜗牛本牛](https://github.com/srtianxia/BDC2019_Rank2)\n* [4th place - Fah](https://github.com/ZanyFun9/2019BDC_solution_4th)\n* [5th place - 拯救菜鸟](https://github.com/LiuYaKu/2019-rank5)\n* [9th place - bestfitting](https://github.com/tinySean/bdc2019-rank9th)\n* [11th place - lili](https://github.com/harrylyx/2019BigDataChallenge)\n* [12th place - 福建大三本](https://github.com/leadert/BDC2019-Rank12th-lgb-esim)\n* [15th place - 改革春风吹满地](https://github.com/P01son6415/MatchModels)\n\n------------------\n\nIf you find this useful, please give it a star :-D\n\n**Finally, thanks to my two teammates @Han and @hcccccccc**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshusentang%2Fbdc2019","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshusentang%2Fbdc2019","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshusentang%2Fbdc2019/lists"}