{"id":32004454,"url":"https://github.com/jchunk-io/jchunk","last_synced_at":"2026-03-10T11:04:10.460Z","repository":{"id":249721711,"uuid":"832343000","full_name":"jchunk-io/jchunk","owner":"jchunk-io","description":"JChunk is a lightweight and flexible library designed to provide multiple strategies for text chunking within Java applications","archived":false,"fork":false,"pushed_at":"2026-02-21T21:40:18.000Z","size":81868,"stargazers_count":17,"open_issues_count":11,"forks_count":4,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-02-22T02:41:52.783Z","etag":null,"topics":["chunk","chunking","etl-pipeline","java","rag","text-splitter","text-splitting"],"latest_commit_sha":null,"homepage":"https://docs.jchunk.io/","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jchunk-io.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-07-22T20:38:29.000Z","updated_at":"2026-02-21T21:39:15.000Z","dependencies_parsed_at":"2024-11-24T18:23:11.529Z","dependency_job_id":"656858d1-5d2e-4a81-9a72-3764195548b1","html_url":"https://github.com/jchunk-io/jchunk","commit_stats":null,"previous_names":["pablosanchi/text-chunker-module","jchunk-io/jchunk"],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/jchunk-io/jchunk","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jchunk-io%2Fjchunk","tags_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/jchunk-io%2Fjchunk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jchunk-io%2Fjchunk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jchunk-io%2Fjchunk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jchunk-io","download_url":"https://codeload.github.com/jchunk-io/jchunk/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jchunk-io%2Fjchunk/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30331644,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-10T05:25:20.737Z","status":"ssl_error","status_checked_at":"2026-03-10T05:25:17.430Z","response_time":106,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chunk","chunking","etl-pipeline","java","rag","text-splitter","text-splitting"],"created_at":"2025-10-15T17:39:15.424Z","updated_at":"2026-03-10T11:04:10.451Z","avatar_url":"https://github.com/jchunk-io.png","language":"Java","funding_links":[],"categories":["人工智能"],"sub_categories":["Spring Cloud框架"],"readme":"# JChunk\n\n[![GitHub Actions Status](https://img.shields.io/github/actions/workflow/status/jchunk-io/jchunk/build.yml?branch=main\u0026logo=GitHub\u0026style=for-the-badge)](.)\n[![Apache 2.0 
License](https://img.shields.io/github/license/jchunk-io/jchunk?style=for-the-badge\u0026logo=apache\u0026color=brightgreen)](.)\n\n## A Java Library for Text Chunking\n\nJChunk is a simple library that provides several text-splitting strategies, which are essential for RAG applications.\n\n## Docs\n\n[JChunk Website](https://jchunk-io.github.io/jchunk/)\n\n## Installing\n\n### Fixed Chunker\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003eio.jchunk\u003c/groupId\u003e\n    \u003cartifactId\u003ejchunk-fixed\u003c/artifactId\u003e\n    \u003cversion\u003e${jchunk.version}\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n```groovy\nimplementation(\"io.jchunk:jchunk-fixed:${JCHUNK_VERSION}\")\n```\n\n### Recursive Chunker\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003eio.jchunk\u003c/groupId\u003e\n    \u003cartifactId\u003ejchunk-recursive-character\u003c/artifactId\u003e\n    \u003cversion\u003e${jchunk.version}\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n```groovy\nimplementation(\"io.jchunk:jchunk-recursive-character:${JCHUNK_VERSION}\")\n```\n\n### Semantic Chunker\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003eio.jchunk\u003c/groupId\u003e\n    \u003cartifactId\u003ejchunk-semantic\u003c/artifactId\u003e\n    \u003cversion\u003e${jchunk.version}\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n```groovy\nimplementation(\"io.jchunk:jchunk-semantic:${JCHUNK_VERSION}\")\n```\n\n## Building\n\nTo build and run the tests\n\n```sh\n./mvnw clean verify -Dgpg.skip=true\n```\n\nTo reformat using the java-format plugin\n\n```sh\n./mvnw spotless:apply\n```\n\nTo check the Javadoc using the javadoc:javadoc goal\n\n```sh\n./mvnw javadoc:javadoc -Pjavadoc\n```\n\n## Contributing\n\nPlease read [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct and the process for submitting pull requests to 
us.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjchunk-io%2Fjchunk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjchunk-io%2Fjchunk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjchunk-io%2Fjchunk/lists"}