{"id":34051109,"url":"https://github.com/yangfa-zhang/lunax","last_synced_at":"2025-12-14T01:02:39.449Z","repository":{"id":292630207,"uuid":"976986381","full_name":"yangfa-zhang/lunax","owner":"yangfa-zhang","description":"Lunax is a machine learning framework specifically designed for the processing and analysis of tabular data.","archived":false,"fork":false,"pushed_at":"2025-06-17T03:13:57.000Z","size":27008,"stargazers_count":12,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-27T16:46:21.167Z","etag":null,"topics":["data-analysis","data-science","lunax","machine-learning","tabular-data"],"latest_commit_sha":null,"homepage":"https://lunax-doc.readthedocs.io/en/latest/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yangfa-zhang.png","metadata":{"files":{"readme":"README.CN.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-03T06:50:37.000Z","updated_at":"2025-10-21T02:15:03.000Z","dependencies_parsed_at":"2025-06-16T12:36:23.069Z","dependency_job_id":null,"html_url":"https://github.com/yangfa-zhang/lunax","commit_stats":null,"previous_names":["yangfa-zhang/luna","yangfa-zhang/lunax"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/yangfa-zhang/lunax","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yangfa-zhang%2Flunax","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yangfa-zhang%2Flunax/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yangfa-zhang%2Flunax/r
eleases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yangfa-zhang%2Flunax/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yangfa-zhang","download_url":"https://codeload.github.com/yangfa-zhang/lunax/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yangfa-zhang%2Flunax/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":27714239,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-13T02:00:09.769Z","response_time":147,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-science","lunax","machine-learning","tabular-data"],"created_at":"2025-12-14T01:02:36.751Z","updated_at":"2025-12-14T01:02:39.437Z","avatar_url":"https://github.com/yangfa-zhang.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Python version](https://img.shields.io/badge/python-3.9%20%7C%203.10%20%7C%203.11%20%7C%203.12-blue)](https://pypi.org/project/lunax/)\n### \n[中文](README.CN.md) | [EN](README.md)\n### \n\n\u003cdiv\u003e\n\n\u003ca href=\"./imgs/luna3.jpg\"\u003e\u003cimg src=\"./imgs/luna3.jpg\" width=\"50\" align=\"left\" /\u003e\u003c/a\u003e``lunax`` is a machine learning framework for processing and analyzing tabular data. 
The name lunax comes from the adorable cat 🐱 in the picture: **luna**, the most popular cat at South China University of Technology. See the [API documentation](https://lunax-doc.readthedocs.io/en/latest/) for more details. **⭐️ If you like it, please give it a star! ⭐️**\n\u003c/div\u003e\n\n---\n\n### Installation\n```bash\nconda create -n your_env_name python=3.11\nconda activate your_env_name\npip install lunax\n```\n\n### Features\n- Data loading and preprocessing\n- Exploratory data analysis (EDA)\n- Automated machine learning modeling\n- Model evaluation and interpretation\n- Ensemble learning\n- Feature importance analysis\n- Object-oriented design with a unified interface that is easy to extend\n- Unit tests with pytest to safeguard code quality\n\n\n### Quick Start\n#### Data loading and preprocessing\n```Python\nfrom lunax.data_processing import *\ndf_train = load_data('train.csv') # or: df_train = load_data('train.parquet')\ntarget = 'target_column_name'\ndf_train = preprocess_data(df_train,target) # preprocessing: missing-value handling, feature encoding, feature scaling\nX_train, X_val, y_train, y_val = split_data(df_train, target)\n```\n#### EDA\n```Python\nfrom lunax.viz import numeric_eda, categoric_eda\ndf_test = load_data('test.csv') # test set, compared against the training set below\nnumeric_eda([df_train,df_test],['train','test'],target=target) # numeric feature analysis\ncategoric_eda([df_train,df_test],['train','test'],target=target) # categorical feature analysis\n```\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003cimg src=\"./imgs/eda.png\" width=\"600\"/\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003cimg src=\"./imgs/eda2.png\" width=\"600\"/\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n#### Automated machine learning modeling\n```Python\nfrom lunax.models import xgb_clf # or xgb_reg, lgbm_reg, lgbm_clf, cat_reg, cat_clf\nfrom lunax.hyper_opt import OptunaTuner\ntuner = OptunaTuner(n_trials=10,model_class=\"XGBClassifier\") # hyperparameter optimization; n_trials is the number of trials\n# or \"XGBRegressor\", \"LGBMRegressor\", \"LGBMClassifier\", \"CatRegressor\", \"CatClassifier\"\nresults = tuner.optimize(X_train, y_train, X_val, y_val)\nbest_params = results['best_params']\nmodel = xgb_clf(best_params)\nmodel.fit(X_train, y_train)\n```\n#### Model evaluation and interpretation\n```Python\nmodel.evaluate(X_val, y_val)\n```\n```text\n[lunax]\u003e label information:\n+---------+---------+\n|   label |   count |\n+=========+=========+\n|       1 |     319 |\n+---------+---------+\n|       0 |     119 |\n+---------+---------+\n[lunax]\u003e model evaluation 
results:\n+-----------+------------+-------------+----------+------+\n| metrics   |   accuracy |   precision |   recall |   f1 |\n+===========+============+=============+==========+======+\n| values    |       0.73 |        0.53 |     0.73 | 0.61 |\n+-----------+------------+-------------+----------+------+\n```\n\n#### Ensemble learning\n```Python\nfrom lunax.models import xgb_clf, lgbm_clf, cat_clf\nfrom lunax.ensembles import HillClimbingEnsemble\nmodel1 = xgb_clf()\nmodel2 = lgbm_clf()\nmodel3 = cat_clf()\n# use a separate loop variable so the tuned `model` is not overwritten\nfor base_model in [model1, model2, model3]:\n    base_model.fit(X_train, y_train)\nensemble = HillClimbingEnsemble(\n    models=[model1, model2, model3],\n    metric=['auc'],\n    maximize=True\n)\nbest_weights = ensemble.fit(X_val, y_val)\npredictions = ensemble.predict(df_test)\n```\n#### Feature importance analysis\n```Python\nfrom lunax.xai import TreeExplainer\nexplainer = TreeExplainer(model)\nexplainer.plot_summary(X_val)\nimportance = explainer.get_feature_importance(X_val)\n```\n\n```text\n[lunax]\u003e Clear blue/red separation indicates a highly influential feature.\n```\n\u003cimg src=\"./imgs/shap.png\" width=\"600\" /\u003e\n\n```text\n[lunax]\u003e Feature Importance Ranking:\n+----+---------------+---------------------+\n|    |    Feature    |     Importance      |\n+----+---------------+---------------------+\n| 1  |     cloud     | 2.3085615634918213  |\n| 2  |   sunshine    | 0.6377484202384949  |\n| 3  |   dewpoint    | 0.5257667899131775  |\n| 4  |   humidity    | 0.4827548861503601  |\n| 5  |   windspeed   | 0.40086665749549866 |\n| 6  |      id       | 0.38620123267173767 |\n| 7  |   pressure    | 0.3780971169471741  |\n| 8  |    mintemp    | 0.32988569140434265 |\n| 9  |      day      | 0.30587586760520935 |\n| 10 |    maxtemp    | 0.26082852482795715 |\n| 11 | winddirection | 0.23236176371574402 |\n| 12 |  temparature  | 0.17218443751335144 |\n+----+---------------+---------------------+\n```\n\n#### Prediction\n```Python\ndf_test = load_data('test.csv')\ndf_test = preprocess_data(df_test,target) # apply the same preprocessing to the test set\ny_pred = model.predict(df_test)\n# 
y_pred_proba = model.predict_proba(df_test)\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyangfa-zhang%2Flunax","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyangfa-zhang%2Flunax","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyangfa-zhang%2Flunax/lists"}