{"id":18347480,"url":"https://github.com/katanabana/nihotip","last_synced_at":"2026-05-08T13:32:35.691Z","repository":{"id":224333169,"uuid":"753608245","full_name":"katanabana/Nihotip","owner":"katanabana","description":"Nihotip is a web app that lets users explore Japanese text through interactive tokenization and detailed insights. Built with React and Python, it offers a dynamic way to analyze words and symbols with tooltips for deeper understanding.","archived":false,"fork":false,"pushed_at":"2024-09-26T05:24:28.000Z","size":46805,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-15T14:47:05.792Z","etag":null,"topics":["japanese","japanese-characters","japanese-language","japanese-learning","jmdictfurigana","language","mecab","nlp","python","react","sudachipy","text-analysis","text-tokenization","tokenization","tooltips","wanakana","webapp"],"latest_commit_sha":null,"homepage":"https://nihotip.netlify.app","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/katanabana.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-06T13:11:38.000Z","updated_at":"2024-09-26T05:24:32.000Z","dependencies_parsed_at":null,"dependency_job_id":"5dff421f-9106-47b0-80c1-c4af29ff81da","html_url":"https://github.com/katanabana/Nihotip","commit_stats":null,"previous_names":["katanabana/nihotip"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/katanabana%2FNihotip","ta
gs_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/katanabana%2FNihotip/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/katanabana%2FNihotip/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/katanabana%2FNihotip/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/katanabana","download_url":"https://codeload.github.com/katanabana/Nihotip/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248127725,"owners_count":21052270,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["japanese","japanese-characters","japanese-language","japanese-learning","jmdictfurigana","language","mecab","nlp","python","react","sudachipy","text-analysis","text-tokenization","tokenization","tooltips","wanakana","webapp"],"created_at":"2024-11-05T21:14:10.896Z","updated_at":"2026-05-08T13:32:30.644Z","avatar_url":"https://github.com/katanabana.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Nihotip\n\nNihotip is a web application designed to help users explore the intricacies of the Japanese language through a dynamic\nand interactive interface. 
With a React frontend and a Python backend, Nihotip provides a convenient way to tokenize\nJapanese text and delve into detailed information about words, symbols, and their respective properties via tooltips.\nNihotip offers a robust solution for analyzing Japanese text at multiple levels of granularity.\n\n![Demo](demo.gif)\n\nURL of the published version: https://nihotip.netlify.app\n\n## ✨ Features\n\n- **Japanese Text Tokenization:**\n  Input Japanese text and have it automatically tokenized into words and symbols.\n\n- **Detailed Word and Symbol Insights:**\n  Hover over words or symbols to access detailed tooltips that explain the structure, readings, and associated\n  properties of each token.\n\n- **Level-based Token Breakdown:**\n  Nihotip organizes tokenized text into multiple hierarchical levels for easy navigation (the features shown for each\n  level of token are listed in parentheses):\n    - text\n        - not a Japanese word\n            - punctuation\n            - space\n            - line break\n            - string of non-Japanese characters\n        - Japanese word (part of speech)\n            - **part by reading**\n                - one or multiple kanji (kana reading -\u003e **part by reading**)\n                - digraph\n                    - **big kana without tenten**\n                    - **big kana with tenten**\n                    - small kana (_respective_ **big kana**)\n                - **kana without tenten** (romaji, association)\n                - **kana with tenten** (_respective_ **kana without tenten**)\n\n- **part by reading:**\n\n  _Parts are obtained by cutting the reading of the word. They make it possible to determine the kana reading for each\n  kanji. A part consists of multiple characters if the reading of a kanji along with the characters surrounding it can't\n  be cut. For example, the part \"大人\" of the word \"大人買い\" uses a special reading \"おとな\" that can't be cut. 
That's why the \"おとな\"\n  reading applies to the whole part._\n\n- **syllable:**\n\n    - single kana\n    - digraph\n    - kana with \"っ\", \"ッ\" or \"ー\"\n    - single kanji\n\n- **Tooltip insights:**\n  Show how readings map to individual characters and provide additional details like romaji and kana associations.\n\n## 🛠️ Getting Started\n\nTo run the application locally, follow these steps:\n\n1. Clone the repository and navigate into the project directory.\n\n2. **Set up Environment Variables:**\n\n   Create `.env` files in the respective directories with the following content:\n\n   - **client/.env**\n\n     Create a file named `client/.env` and add:\n     ```plaintext\n     REACT_APP_BACKEND_URL=http://localhost:3001\n     ```\n\n   - **server/.env**\n\n     Create a file named `server/.env` and add:\n     ```plaintext\n     PORT=3001\n     HOST=localhost\n     FRONTEND_URL=http://localhost:3000\n     ```\n\n3. Open two terminal windows and run each of the following sets of commands in its own terminal:\n\n   ```bash\n   # Start the frontend (React)\n   cd client\n   npm install\n   npm start\n   ```\n\n   ```bash\n   # Start the backend (Python)\n   cd server\n   pip install -r requirements.txt\n   python main.py\n   ```\n\n4. Open your browser and visit `http://localhost:3000` to start interacting with Nihotip.\n\n## 🚀 Upcoming Features\n\n- **Multilingual Tooltips:**\n  Add the option to choose the language for tooltips to enhance accessibility for non-Japanese speakers.\n\n- **Word Normalization:**\n  Implement word normalization for more accurate tokenization results.\n\n- **Notes for Ambiguous Words:**\n  Provide detailed notes for words that belong to multiple parts of speech or have different interpretations based on\n  context.\n\n## 🤝 Contributing\n\nWe welcome contributions! 
If you'd like to contribute to Nihotip, feel free to submit issues or pull requests on the\nGitHub repository.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkatanabana%2Fnihotip","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkatanabana%2Fnihotip","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkatanabana%2Fnihotip/lists"}