{"id":13535137,"url":"https://github.com/FakerYFX/Bert-Pytorch-Chinese-TextClassification","last_synced_at":"2025-04-02T00:32:38.749Z","repository":{"id":172144015,"uuid":"164197035","full_name":"FakerYFX/Bert-Pytorch-Chinese-TextClassification","owner":"FakerYFX","description":"Pytorch Bert Finetune in Chinese Text Classification","archived":false,"fork":false,"pushed_at":"2024-04-11T14:10:38.000Z","size":24,"stargazers_count":211,"open_issues_count":4,"forks_count":35,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-08-11T16:09:17.709Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FakerYFX.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-01-05T08:37:07.000Z","updated_at":"2024-07-24T03:53:22.000Z","dependencies_parsed_at":"2024-01-14T02:36:55.903Z","dependency_job_id":"60443d6a-44e8-421a-aaec-e090cce86653","html_url":"https://github.com/FakerYFX/Bert-Pytorch-Chinese-TextClassification","commit_stats":null,"previous_names":["xieyufei1993/bert-pytorch-chinese-textclassification","fakeryfx/bert-pytorch-chinese-textclassification"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FakerYFX%2FBert-Pytorch-Chinese-TextClassification","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FakerYFX%2FBert-Pytorch-Chinese-TextClassification/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FakerYFX%2FBe
rt-Pytorch-Chinese-TextClassification/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FakerYFX%2FBert-Pytorch-Chinese-TextClassification/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FakerYFX","download_url":"https://codeload.github.com/FakerYFX/Bert-Pytorch-Chinese-TextClassification/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":222788514,"owners_count":17037777,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T08:00:50.172Z","updated_at":"2024-11-02T23:30:42.537Z","avatar_url":"https://github.com/FakerYFX.png","language":"Python","readme":"# Bert-Pytorch-Chinese-TextClassification\nPytorch Bert Finetune in Chinese Text Classification\n\n### Step 1\n\nDownload the pretrained TensorFlow model: [chinese_L-12_H-768_A-12](https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip)\n\n### Step 2\n\nConvert the pretrained TensorFlow model to PyTorch\n\n```shell\ncd convert_tf_to_pytorch\n```\n\n```shell\nexport BERT_BASE_DIR=/workspace/mnt/group/ocr/xieyufei/bert-tf-chinese/chinese_L-12_H-768_A-12\n\npython3 convert_tf_checkpoint_to_pytorch.py \\\n  --tf_checkpoint_path $BERT_BASE_DIR/bert_model.ckpt \\\n  --bert_config_file $BERT_BASE_DIR/bert_config.json \\\n  --pytorch_dump_path $BERT_BASE_DIR/pytorch_model.bin\n```\n\n### Step 3\n\nDownload the Chinese News dataset: [Train](https://pan.baidu.com/s/15rkzx-YRbP5XRNeapzYWLw) for 50k and [Dev](https://pan.baidu.com/s/1HuYTacgAQFqGAJ8FYXNqOw) for 
5k\n\n### Step 4\n\nJust Train and Test\n\n```shell\ncd src\n```\n\n```shell\nexport GLUE_DIR=/workspace/mnt/group/ocr/xieyufei/bert-tf-chinese/glue_data\nexport BERT_BASE_DIR=/workspace/mnt/group/ocr/xieyufei/bert-tf-chinese/chinese_L-12_H-768_A-12/\nexport BERT_PYTORCH_DIR=/workspace/mnt/group/ocr/xieyufei/bert-tf-chinese/chinese_L-12_H-768_A-12/\n\npython3 run_classifier_word.py \\\n  --task_name NEWS \\\n  --do_train \\\n  --do_eval \\\n  --data_dir $GLUE_DIR/SouGou/ \\\n  --vocab_file $BERT_BASE_DIR/vocab.txt \\\n  --bert_config_file $BERT_BASE_DIR/bert_config.json \\\n  --init_checkpoint $BERT_PYTORCH_DIR/pytorch_model.bin \\\n  --max_seq_length 256 \\\n  --train_batch_size 24 \\\n  --learning_rate 2e-5 \\\n  --num_train_epochs 50.0 \\\n  --output_dir ./newsAll_output/ \\\n  --local_rank 3\n```\n\nResults after 1 epoch:\n\n```\neval_accuracy = 0.9742\neval_loss = 0.10202122390270234\nglobal_step = 2084\nloss = 0.15899521649851786\n```\n\n\n\n","funding_links":[],"categories":["BERT classification task:"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFakerYFX%2FBert-Pytorch-Chinese-TextClassification","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FFakerYFX%2FBert-Pytorch-Chinese-TextClassification","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFakerYFX%2FBert-Pytorch-Chinese-TextClassification/lists"}