{"id":15521385,"url":"https://github.com/julesbelveze/bert-sequence-classifier","last_synced_at":"2026-02-16T14:31:14.935Z","repository":{"id":53145803,"uuid":"311122270","full_name":"JulesBelveze/BERT-sequence-classifier","owner":"JulesBelveze","description":"🤗  Dockerized BERT-Multi-Label-Classifier Inferer 🤗","archived":false,"fork":false,"pushed_at":"2021-08-30T10:35:06.000Z","size":59902,"stargazers_count":4,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-07-28T11:03:09.745Z","etag":null,"topics":["api","bert","classification","distilbert","docker","huggingface","inference","multi-label-classification","roberta","toxicity","transformers"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JulesBelveze.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-11-08T17:58:14.000Z","updated_at":"2022-07-07T05:36:41.000Z","dependencies_parsed_at":"2022-09-11T21:53:41.478Z","dependency_job_id":null,"html_url":"https://github.com/JulesBelveze/BERT-sequence-classifier","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/JulesBelveze/BERT-sequence-classifier","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JulesBelveze%2FBERT-sequence-classifier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JulesBelveze%2FBERT-sequence-classifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JulesBelveze%2FBERT-sequence-classifier/releases","manifests_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/JulesBelveze%2FBERT-sequence-classifier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JulesBelveze","download_url":"https://codeload.github.com/JulesBelveze/BERT-sequence-classifier/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JulesBelveze%2FBERT-sequence-classifier/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29510134,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-16T09:05:14.864Z","status":"ssl_error","status_checked_at":"2026-02-16T08:55:59.364Z","response_time":115,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","bert","classification","distilbert","docker","huggingface","inference","multi-label-classification","roberta","toxicity","transformers"],"created_at":"2024-10-02T10:34:23.185Z","updated_at":"2026-02-16T14:31:14.910Z","avatar_url":"https://github.com/JulesBelveze.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🤗 BERT-Multi-Label-Classifier / Dockerized Inferer 🤗\nRepository to fine-tune a BERT-base multi-label/multi-class classifier, based on _HuggingFace_ library. 
The repository includes a _Flask_ API wrapper for inference.\n\n## Table of contents\n* [Installation](#installation)\n* [Organisation of files](#organisation-of-files)\n* [Datasets](#datasets)\n* [Models](#models)\n  * [Multi-label-classifier](#multi-label-classifier)\n  * [Multi-class-classifier](#multi-class-classifier)\n* [Inference](#inference)\n* [TODO](#todo)\n\n## Installation\nTo clone the repository, run the following command:\n```\ngit clone https://github.com/JulesBelveze/BERT-multi-label-classifier.git\n```\nThe repository uses _Poetry_ as a package manager (see full documentation [here](https://python-poetry.org/docs/#installation)). To install the required packages, run the following commands:\n```\npython3 -m venv .venv/bert-mlc\nsource .venv/bert-mlc/bin/activate\npoetry install\n```\nThis repo uses neptune.ai to manage experiments. We invite you to look at their [documentation](https://docs.neptune.ai/index.html) if needed.\n\n## Organisation of files\n* `models/`: folder containing custom models\n* `utils/`: folder containing utility functions\n* `main.py`: main file to run\n* `train.py`: file containing the training procedure\n* `eval.py`: file containing the evaluation procedure\n* `app.py`: file containing the _Flask_ app\n* `inferer.py`: file containing the model inferer\n* `poetry.lock`: _Poetry_ file\n* `pyproject.toml`: _Poetry_ file\n* `requirements_inference.txt`: required packages for inference\n* `Dockerfile`: file to run the API as a docker image\n\n## Datasets\n* **multi-class:** you can download it [here](https://raw.githubusercontent.com/susanli2016/NLP-with-Python/master/data/title_conference.csv)\n* **multi-label:** [Toxic Comment Classification Challenge | Kaggle](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/data)\n\n## Models\nWe provide customisation of four different models: BERT, RoBERTa, XLM-RoBERTa and DistilBERT.\n### 1. 
Multi-label-classifier\nThe model is an adaptation of the `BertForSequenceClassification` model of [HuggingFace](https://huggingface.co/transformers/model_doc/bert.html#bertforsequenceclassification) to handle multi-label classification. The key change is to the loss function.\n### 2. Multi-class-classifier\nThe model is essentially an MLP on top of a BERT model. Once again, the custom model provided extends the `BertForSequenceClassification` model of [HuggingFace](https://huggingface.co/transformers/model_doc/bert.html#bertforsequenceclassification) to integrate the class weights in the loss function.\n## Inference\nThe inferer only supports single-input inference. It handles all the processing steps required to feed the text into the classification model.\nIt can be used in the following way:\n```\nmodel_infer = ModelInferer(config=config, checkpoint_path=checkpoint_path, quantize=True)\nmodel_infer.predict(\"I hate you from more than you can imagine\")\n```\nWe also provide a Flask API that wraps the inferer, as well as a Dockerfile to containerize the app for production use.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjulesbelveze%2Fbert-sequence-classifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjulesbelveze%2Fbert-sequence-classifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjulesbelveze%2Fbert-sequence-classifier/lists"}