{"id":31756553,"url":"https://github.com/lixin97/wirelessmathlm","last_synced_at":"2025-10-09T19:19:17.570Z","repository":{"id":317026256,"uuid":"1065219265","full_name":"LiXin97/WirelessMathLM","owner":"LiXin97","description":"WirelessMathLM:Teaching Mathematical Reasoning for LLMs in Wireless Communications with Reinforcement Learning - Official repository for WirelessMathLM paper","archived":false,"fork":false,"pushed_at":"2025-09-28T09:30:58.000Z","size":1074,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-09-28T11:39:07.793Z","etag":null,"topics":["datasets","large-language-models","machine-learning","mathematical-reasoning","mathmatics","reinforcement-learning","wireless","wireless-communication"],"latest_commit_sha":null,"homepage":"http://lixin.ai/WirelessMathLM/","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LiXin97.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-27T09:36:27.000Z","updated_at":"2025-09-28T09:31:02.000Z","dependencies_parsed_at":"2025-09-28T11:39:09.275Z","dependency_job_id":"8c672e98-55e0-421d-8fc6-361c4c8b0fc8","html_url":"https://github.com/LiXin97/WirelessMathLM","commit_stats":null,"previous_names":["lixin97/wirelessmathlm"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/LiXin97/WirelessMathLM","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/LiXin97%2FWirelessMathLM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiXin97%2FWirelessMathLM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiXin97%2FWirelessMathLM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiXin97%2FWirelessMathLM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LiXin97","download_url":"https://codeload.github.com/LiXin97/WirelessMathLM/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiXin97%2FWirelessMathLM/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279001981,"owners_count":26083243,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["datasets","large-language-models","machine-learning","mathematical-reasoning","mathmatics","reinforcement-learning","wireless","wireless-communication"],"created_at":"2025-10-09T19:19:15.821Z","updated_at":"2025-10-09T19:19:17.559Z","avatar_url":"https://github.com/LiXin97.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"# WirelessMathLM: Teaching Mathematical Reasoning for LLMs in Wireless Communications with Reinforcement 
Learning\n\n[![Website](https://img.shields.io/badge/Website-Live-blue)](https://lixin.ai/WirelessMathLM)\n[![arXiv](https://img.shields.io/badge/arXiv-Coming%20Soon-red)](https://arxiv.org/)\n[![Code](https://img.shields.io/badge/Code-Coming%20Soon-green)](https://github.com/)\n\n\u003e **Authors:** [Xin Li](https://lixin.ai/), [Mengbing Liu](https://liumengbing.com/), [Yiyang Zhu](https://scholar.google.com/citations?user=LWh42_8AAAAJ), Wenhe Zhang, [Li Wei](https://scholar.google.com.sg/citations?user=zdSz9-gAAAAJ), [Jiancheng An](https://scholar.google.com/citations?user=QbTi47kAAAAJ), [Chau Yuen](https://blogs.ntu.edu.sg/chau-yuen/)\n\u003e **Affiliation:** Nanyang Technological University\n\n## 📖 Abstract\n\nLarge language models (LLMs) excel at general mathematical reasoning but fail catastrophically on specialized technical mathematics. In wireless communications, where problems require precise manipulation of information-theoretic bounds, optimization constraints, and signal processing formulations, even state-of-the-art models struggle to achieve competent performance.\n\nWe present **WirelessMathLM**, demonstrating that compact models (0.5B–7B parameters) can match or exceed much larger models through domain-specific reinforcement learning with verifiable rewards. 
Our key insight is that wireless mathematics problems possess a unique property, verifiable correctness, that enables effective reinforcement learning without human feedback.\n\n## 🎯 Key Contributions\n\n- **WirelessMathBench-XL**: A comprehensive benchmark of 4,027 problems from 970 papers in wireless communications\n- **Domain-specific RL**: Group Relative Policy Optimization (GRPO) with binary verification rewards, training directly from base checkpoints without supervised warm-start\n- **Efficient Performance**: Our 7B model achieves 39.5% accuracy, approaching GPT-4o (40.4%) while using ~100× fewer parameters than DeepSeek-R1 (671B, 57.4%)\n- **Transfer Learning**: Positive transfer to general mathematics benchmarks (+8.4 points average across MATH, Minerva-Math, OlympiadBench, AMC, and AIME)\n\n## 📊 Results Overview\n\n### Model Performance on WirelessMathBench-XL\n\n| Model | Parameters | Accuracy |\n|-------|------------|----------|\n| **WirelessMathLM-7B** | 7B | **39.5%** |\n| GPT-4o | ~1.8T | 40.4% |\n| DeepSeek-R1 | 671B | 57.4% |\n\n### GRPO Training Impact\n\nGRPO training improves performance at every model scale, roughly doubling it at 3B and nearly doubling it at 7B:\n- **0.5B**: +11% improvement\n- **3B**: +103% improvement\n- **7B**: +81% improvement\n\n\n## 📋 Dataset: WirelessMathBench-XL\n\nWirelessMathBench-XL contains **4,027 mathematical problems** extracted from **970 research papers** in wireless communications, covering:\n\n- Information theory and channel capacity\n- Signal processing and beamforming\n- Optimization in wireless networks\n- MIMO systems and spatial diversity\n- Resource allocation and scheduling\n- Network coding and cooperative communications\n\n## 🔬 Methodology\n\n### Group Relative Policy Optimization (GRPO)\n\nOur approach uses GRPO with binary verification rewards:\n\n1. **No Supervised Fine-tuning**: Train directly from base model checkpoints\n2. **Verifiable Rewards**: Leverage the mathematical nature of wireless problems for automatic verification\n3. 
**Domain-specific Training**: Focus specifically on wireless communications mathematics\n4. **Efficient Scaling**: Achieve strong performance with compact models\n\n### Training Pipeline\n\n```\nBase Model → GRPO Training → WirelessMathLM\n    ↑              ↑              ↓\nQwen2.5    Binary Rewards   Wireless Math\n                              Expertise\n```\n\n## 📈 Transfer Learning Results\n\nOur models show positive transfer to general mathematics:\n\n| Benchmark | Improvement |\n|-----------|-------------|\n| MATH | +8.2 points |\n| Minerva-Math | +7.9 points |\n| OlympiadBench | +9.1 points |\n| AMC | +8.7 points |\n| AIME | +8.5 points |\n| **Average** | **+8.4 points** |\n\n## 📚 Citation\n\n```bibtex\n@article{li2025wirelessmathlm,\n  title={WirelessMathLM: Teaching Mathematical Reasoning for LLMs in Wireless Communications with Reinforcement Learning},\n  author={Li, Xin and Liu, Mengbing and Zhu, Yiyang and Zhang, Wenhe and Wei, Li and An, Jiancheng and Yuen, Chau},\n  journal={arXiv preprint},\n  year={2025}\n}\n```\n\n## 🔗 Resources\n\n- **Paper**: Coming soon on arXiv\n- **Code**: Will be released upon publication\n- **Website**: [Project Homepage](website/index.html)\n- **Overview**: [WirelessMathLM-Overview.pdf](arXiv_WirelessMathLM/WirelessMathLM-Overview.pdf)\n\n## 📧 Contact\n\nFor questions or collaborations, please contact:\n- **Xin Li**: [xin019@ntu.edu.sg](mailto:xin019@ntu.edu.sg)\n\n---\n\n**Nanyang Technological University** | **Project Maxwell**","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flixin97%2Fwirelessmathlm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flixin97%2Fwirelessmathlm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flixin97%2Fwirelessmathlm/lists"}