{"id":19731522,"url":"https://github.com/shunk031/allennlp-shiba-model","last_synced_at":"2025-04-30T02:31:17.774Z","repository":{"id":49146110,"uuid":"380465202","full_name":"shunk031/allennlp-shiba-model","owner":"shunk031","description":"AllenNLP integration for Shiba: Japanese CANINE model","archived":false,"fork":false,"pushed_at":"2021-06-26T17:39:16.000Z","size":83,"stargazers_count":12,"open_issues_count":0,"forks_count":2,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-04-29T13:41:50.403Z","etag":null,"topics":["allennlp","canine","machine-learning","transformers","transformers-library"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shunk031.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-06-26T09:37:58.000Z","updated_at":"2022-02-16T07:24:32.000Z","dependencies_parsed_at":"2022-08-25T16:02:15.740Z","dependency_job_id":null,"html_url":"https://github.com/shunk031/allennlp-shiba-model","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shunk031%2Fallennlp-shiba-model","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shunk031%2Fallennlp-shiba-model/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shunk031%2Fallennlp-shiba-model/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shunk031%2Fallennlp-shiba-model/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shunk031","download_url":"ht
tps://codeload.github.com/shunk031/allennlp-shiba-model/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224194114,"owners_count":17271420,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["allennlp","canine","machine-learning","transformers","transformers-library"],"created_at":"2024-11-12T00:21:29.066Z","updated_at":"2024-11-12T00:21:29.574Z","avatar_url":"https://github.com/shunk031.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AllenNLP Integration for [Shiba](https://github.com/octanove/shiba)\n\n[![CI](https://github.com/shunk031/allennlp-shiba-model/actions/workflows/ci.yml/badge.svg)](https://github.com/shunk031/allennlp-shiba-model/actions/workflows/ci.yml)\n[![Release](https://github.com/shunk031/allennlp-shiba-model/actions/workflows/release.yml/badge.svg)](https://github.com/shunk031/allennlp-shiba-model/actions/workflows/release.yml)\n![Python](https://img.shields.io/badge/python-3.7%20%7C%203.8-blue?logo=python)\n[![PyPI](https://img.shields.io/pypi/v/allennlp-shiba.svg)](https://pypi.org/project/allennlp-shiba/)\n\n`allennlp-shiba-model` is a Python library that provides AllenNLP integration for [shiba-model](https://pypi.org/project/shiba-model/).\n\n\u003e SHIBA is an approximate reimplementation of CANINE [[1]](https://github.com/octanove/shiba#1) in raw Pytorch, pretrained on the Japanese wikipedia corpus using random span masking. 
If you are unfamiliar with CANINE, you can think of it as a very efficient (approximately 4x as efficient) character-level BERT model. Of course, the name SHIBA comes from the identically named Japanese canine.\n\n## Installation\n\nInstalling the library and dependencies is simple using `pip`.\n\n```shell\npip install allennlp-shiba\n```\n\n## Example\n\nThis library lets users specify the Shiba tokenizer, token indexer, and token embedder in a jsonnet config file. Here is an example of such a config file:\n\n```json\n{\n    \"dataset_reader\": {\n        \"tokenizer\": {\n            \"type\": \"shiba\",\n        },\n        \"token_indexers\": {\n            \"tokens\": {\n                \"type\": \"shiba\",\n            }\n        },\n    },\n    \"model\": {\n        \"shiba_embedder\": {\n            \"type\": \"basic\",\n            \"token_embedders\": {\n                \"shiba\": {\n                    \"type\": \"shiba\",\n                    \"eval_model\": true,\n                }\n            }\n\n        }\n    }\n}\n```\n\n\n## Reference\n\n- Joshua Tanner and Masato Hagiwara (2021). [SHIBA: Japanese CANINE model](https://github.com/octanove/shiba). GitHub repository, GitHub.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshunk031%2Fallennlp-shiba-model","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshunk031%2Fallennlp-shiba-model","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshunk031%2Fallennlp-shiba-model/lists"}