{"id":30208675,"url":"https://github.com/tsdata/ranx-k","last_synced_at":"2026-04-13T01:35:12.024Z","repository":{"id":308092812,"uuid":"1031536867","full_name":"tsdata/ranx-k","owner":"tsdata","description":"Korean-optimized RAG evaluation toolkit with Kiwi tokenizer, ROUGE metrics,  and IR evaluation for retrieval systems (Hit@K, NDCG@K, MRR, etc.)","archived":false,"fork":false,"pushed_at":"2025-08-20T00:06:59.000Z","size":992,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-13T01:34:50.229Z","etag":null,"topics":["evaluation","hit-rate","kiwi","korean","langchain","map","mrr","ndcg","nlp","rag","ranx","retrieval","rouge"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tsdata.png","metadata":{"files":{"readme":"README.ko.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-08-04T00:20:39.000Z","updated_at":"2025-11-04T06:26:07.000Z","dependencies_parsed_at":"2025-08-10T12:01:46.623Z","dependency_job_id":null,"html_url":"https://github.com/tsdata/ranx-k","commit_stats":null,"previous_names":["tsdata/rank-k"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/tsdata/ranx-k","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tsdata%2Franx-k","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tsdata%2Franx-k/tags","releases_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/tsdata%2Franx-k/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tsdata%2Franx-k/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tsdata","download_url":"https://codeload.github.com/tsdata/ranx-k/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tsdata%2Franx-k/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31736723,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-12T22:19:12.206Z","status":"ssl_error","status_checked_at":"2026-04-12T22:18:33.088Z","response_time":58,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["evaluation","hit-rate","kiwi","korean","langchain","map","mrr","ndcg","nlp","rag","ranx","retrieval","rouge"],"created_at":"2025-08-13T18:01:55.684Z","updated_at":"2026-04-13T01:35:12.006Z","avatar_url":"https://github.com/tsdata.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ranx-k: Korean-Optimized ranx IR Evaluation Toolkit 🇰🇷\n\n[![PyPI version](https://badge.fury.io/py/ranx-k.svg)](https://badge.fury.io/py/ranx-k)\n[![Python version](https://img.shields.io/pypi/pyversions/ranx-k.svg)](https://pypi.org/project/ranx-k/)\n[![License: 
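MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n**[English](README.md) | [한국어](README.ko.md)**\n\n**ranx-k** is an information retrieval (IR) evaluation toolkit optimized for Korean. It extends the existing ranx library with support for the Kiwi tokenizer and Korean embeddings, so you can evaluate the performance of RAG (Retrieval-Augmented Generation) systems accurately.\n\n## 🚀 Key Features\n\n- **Korean-specific**: Accurate tokenization with the Kiwi morphological analyzer\n- **Built on ranx**: Supports proven IR evaluation metrics (Hit@K, NDCG@K, MRR, MAP@K, and more)\n- **LangChain compatible**: Supports the standard LangChain retriever interface\n- **Multiple evaluation methods**: Evaluation based on ROUGE, embedding similarity, and semantic similarity\n- **Graded relevance support**: Uses similarity scores as relevance grades for NDCG calculation\n- **Configurable ROUGE types**: Choose among ROUGE-1, ROUGE-2, and ROUGE-L\n- **Strict threshold enforcement**: Documents below the similarity threshold are correctly treated as retrieval failures\n- **Retrieval order preservation**: Accurate evaluation of reranking systems (v0.0.16+)\n- **Practical design**: Supports staged evaluation from prototype to production\n- **High performance**: 30 to 80% higher Korean evaluation accuracy than existing methods\n- **Bilingual output**: English-Korean side-by-side output for international accessibility\n\n## 📦 Installation\n\n```bash\npip install ranx-k\n```\n\nOr install with the development extras:\n\n```bash\npip install \"ranx-k[dev]\"\n```\n\n## 🔗 Retriever Compatibility\n\nranx-k supports the **LangChain retriever interface**:\n\n```python\nfrom typing import List\n\nfrom langchain.schema import Document\n\n# A retriever must implement the invoke() method\nclass YourRetriever:\n    def invoke(self, query: str) -\u003e List[Document]:\n        # Return a list of Document objects (each needs a page_content attribute)\n        pass\n\n# Example of a LangChain Document\ndoc = Document(page_content=\"Text content\")\n```\n\n\u003e **Note**: LangChain is distributed under the MIT License. 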
See the [documentation](docs/ko/quickstart.md#langchain-license) for details.\n\n## 🔧 Quick Start\n\n### Basic Usage\n\n```python\nfrom ranx_k.evaluation import simple_kiwi_rouge_evaluation\n\n# Simple Kiwi ROUGE evaluation\nresults = simple_kiwi_rouge_evaluation(\n    retriever=your_retriever,\n    questions=your_questions,\n    reference_contexts=your_reference_contexts,\n    k=5\n)\n\nprint(f\"ROUGE-1: {results['kiwi_rouge1@5']:.3f}\")\nprint(f\"ROUGE-2: {results['kiwi_rouge2@5']:.3f}\")\nprint(f\"ROUGE-L: {results['kiwi_rougeL@5']:.3f}\")\n```\n\n### Enhanced Evaluation (Rouge Score + Kiwi)\n\n```python\nfrom ranx_k.evaluation import rouge_kiwi_enhanced_evaluation\n\n# Proven rouge_score library + Kiwi tokenizer\nresults = rouge_kiwi_enhanced_evaluation(\n    retriever=your_retriever,\n    questions=your_questions,\n    reference_contexts=your_reference_contexts,\n    k=5,\n    tokenize_method='morphs',  # 'morphs' or 'nouns'\n    use_stopwords=True\n)\n```\n\n### ranx Evaluation Based on Semantic Similarity\n\n```python\nfrom ranx_k.evaluation import evaluate_with_ranx_similarity\n\n# Reference-based evaluation (recommended for accurate recall)\nresults = evaluate_with_ranx_similarity(\n    retriever=your_retriever,\n    questions=your_questions,\n    reference_contexts=your_reference_contexts,\n    k=5,\n    method='embedding',\n    similarity_threshold=0.6,\n    use_graded_relevance=False,        # Binary relevance (default)\n    evaluation_mode='reference_based'  # Evaluate against all reference documents\n)\n\nprint(f\"Hit@5: {results['hit_rate@5']:.3f}\")\nprint(f\"NDCG@5: {results['ndcg@5']:.3f}\")\nprint(f\"MRR: {results['mrr']:.3f}\")\nprint(f\"MAP@5: {results['map@5']:.3f}\")\n```\n\n#### Using Other Embedding Models\n\n```python\n# OpenAI embedding model (API key required)\nresults = evaluate_with_ranx_similarity(\n    retriever=your_retriever,\n    questions=your_questions,\n    reference_contexts=your_reference_contexts,\n    k=5,\n    method='openai',\n    similarity_threshold=0.7,\n    embedding_model=\"text-embedding-3-small\"\n)\n\n# Recent BGE-M3 model (strong on Korean)\nresults = evaluate_with_ranx_similarity(\n    retriever=your_retriever,\n    
questions=your_questions,\n    reference_contexts=your_reference_contexts,\n    k=5,\n    method='embedding',\n    similarity_threshold=0.6,\n    embedding_model=\"BAAI/bge-m3\"\n)\n\n# Korean-specific Kiwi ROUGE method with a configurable ROUGE type\nresults = evaluate_with_ranx_similarity(\n    retriever=your_retriever,\n    questions=your_questions,\n    reference_contexts=your_reference_contexts,\n    k=5,\n    method='kiwi_rouge',\n    similarity_threshold=0.3,  # A lower threshold is recommended for Kiwi ROUGE\n    rouge_type='rougeL',      # Choose 'rouge1', 'rouge2', or 'rougeL'\n    tokenize_method='morphs', # Choose 'morphs' or 'nouns'\n    use_stopwords=True        # Toggle stopword filtering\n)\n```\n\n### Comprehensive Evaluation\n\n```python\nfrom ranx_k.evaluation import comprehensive_evaluation_comparison\n\n# Compare all evaluation methods\ncomparison = comprehensive_evaluation_comparison(\n    retriever=your_retriever,\n    questions=your_questions,\n    reference_contexts=your_reference_contexts,\n    k=5\n)\n```\n\n## 📊 Evaluation Methods\n\n### 1. Kiwi ROUGE Evaluation\n- **Strengths**: Fast, intuitive to interpret\n- **Use cases**: Prototyping, quick feedback\n\n### 2. Enhanced ROUGE (Rouge Score + Kiwi)\n- **Strengths**: Proven library, stability\n- **Use cases**: Production environments, evaluations where reliability matters\n\n### 3. 
Semantic-Similarity-Based ranx\n- **Strengths**: Traditional IR metrics, semantic similarity\n- **Use cases**: Research, benchmarking, detailed analysis\n\n## 🎯 Performance Improvement Example\n\n```python\n# Existing method (English tokenizer)\nbasic_rouge1 = 0.234\n\n# ranx-k (Kiwi tokenizer)\nranxk_rouge1 = 0.421  # +79.9% improvement!\n```\n\n## 📊 Recommended Embedding Models\n\n| Model | Use case | Threshold | Notes |\n|------|------|--------|------|\n| `paraphrase-multilingual-MiniLM-L12-v2` | Default | 0.6 | Fast, lightweight |\n| `text-embedding-3-small` (OpenAI) | Accuracy | 0.7 | High accuracy, cost-effective |\n| `BAAI/bge-m3` | Korean | 0.6 | Recent, strong multilingual performance |\n| `text-embedding-3-large` (OpenAI) | Premium | 0.8 | Best performance |\n\n## 📈 Score Interpretation Guide\n\n| Score range | Rating | Recommended action |\n|-----------|------|-----------|\n| 0.7 and above | 🟢 Very good | Keep the current settings |\n| 0.5 to 0.7 | 🟡 Good | Consider minor adjustments |\n| 0.3 to 0.5 | 🟠 Fair | Needs improvement |\n| Below 0.3 | 🔴 Poor | Needs major revision |\n\n## 🔍 Advanced Usage\n\n### Graded Relevance Mode\n\n```python\n# Graded relevance mode: use similarity scores as relevance grades\nresults = evaluate_with_ranx_similarity(\n    retriever=your_retriever,\n    questions=questions,\n    reference_contexts=references,\n    method='embedding',\n    similarity_threshold=0.6,\n    use_graded_relevance=True   # Use similarity scores as relevance grades\n)\n\nprint(f\"NDCG@5: {results['ndcg@5']:.3f}\")\n```\n\n\u003e **Note on graded relevance**: The `use_graded_relevance` parameter mainly affects the NDCG (Normalized Discounted Cumulative Gain) calculation. Other metrics such as Hit@K, MRR, and MAP treat relevance as binary in the ranx library. 
Use graded relevance when you need to distinguish quality differences in document relevance.\n\n### Custom Embedding Models\n\n```python\n# Use a custom embedding model\nresults = evaluate_with_ranx_similarity(\n    retriever=your_retriever,\n    questions=questions,\n    reference_contexts=references,\n    method='embedding',\n    embedding_model=\"your-custom-model-name\",\n    similarity_threshold=0.6\n)\n```\n\n### Configurable ROUGE Types\n\n```python\n# Compare different ROUGE metrics\nfor rouge_type in ['rouge1', 'rouge2', 'rougeL']:\n    results = evaluate_with_ranx_similarity(\n        retriever=your_retriever,\n        questions=questions,\n        reference_contexts=references,\n        method='kiwi_rouge',\n        rouge_type=rouge_type,\n        tokenize_method='morphs',\n        similarity_threshold=0.3\n    )\n    print(f\"{rouge_type.upper()}: Hit@5 = {results['hit_rate@5']:.3f}\")\n```\n\n### Threshold Sensitivity Analysis\n\n```python\n# Analyze how different thresholds affect the evaluation\nthresholds = [0.3, 0.5, 0.7]\nfor threshold in thresholds:\n    results = evaluate_with_ranx_similarity(\n        retriever=your_retriever,\n        questions=questions,\n        reference_contexts=references,\n        similarity_threshold=threshold\n    )\n    print(f\"Threshold {threshold}: Hit@5={results['hit_rate@5']:.3f}, NDCG@5={results['ndcg@5']:.3f}\")\n```\n\n## 📚 Examples\n\n- [Basic tokenizer example](examples/basic_tokenizer.py)\n- [BGE-M3 evaluation example](examples/bge_m3_evaluation.py)\n- [Embedding model comparison](examples/embedding_models_comparison.py)\n- [Comprehensive comparison](examples/comprehensive_comparison.py)\n\n## 🤝 Contributing\n\nContributions are welcome! Feel free to submit issues and pull requests.\n\n## 📄 License\n\nThis project is distributed under the MIT License. 
See the [LICENSE](LICENSE) file for details.\n\n## 🙏 Acknowledgments\n\n- Built on [ranx](https://github.com/AmenRa/ranx) by Elias Bassani\n- Korean morphological analysis via [Kiwi](https://github.com/bab2min/kiwipiepy)\n- Embedding support via [sentence-transformers](https://github.com/UKPLab/sentence-transformers)\n\n## 📞 Support\n\n- 🐛 Issue tracker: Please submit issues on GitHub\n- 📧 Email: ontofinance@gmail.com\n\n---\n\n**ranx-k** - A tool for accurate and easy Korean RAG evaluation!","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftsdata%2Franx-k","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftsdata%2Franx-k","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftsdata%2Franx-k/lists"}